Feeds
The Drop Times: Heading to Barcelona? Stop by LagoonCon Barcelona 2024!
Gary Benson: Too many git branches?
Do you have too many git branches on the go at once? Here is the command to list them in order of last modification:
git for-each-ref --sort=-committerdate refs/headsmandclu: Barcelona Update: Starshot Events Recipe Track
With DrupalCon Barcelona fast approaching I thought it was time to share some more updates on the progress of the events recipe for Drupal CMS a.k.a. the Starshot initiative.
mandclu Sep 20, 2024 - 9:27am TagsReal Python: The Real Python Podcast – Episode #221: Thriving as a Developer With ADHD
What are strategies for being a productive developer with ADHD? How can you help your team members with ADHD to succeed and complete projects? This week on the show, we speak with Chris Ferdinandi about his website and podcast "ADHD For the Win!"
[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
PyCharm: What’s New in PyCharm 2024.2.2!
PyCharm 2024.2.2 is here with many key updates, including Python support improvements, new Django features, and enhancements to the Data View tool window!
Visit our What’s New page for more details on all these features and to explore many others. You can download the latest version from our download page or update your current version through our free Toolbox App.
Download PyCharm 2024.2.2 PyCharm 2024.2.2 highlights Django enhancements PRO New code completion suggestionsWhen working with models, PyCharm now offers field completion suggestions in a variety of cases, such as Model.save(update_fields[…]), Model.refresh_from_db(fields=[…]), Model.clean_fields(exclude=[…]), and so on.
Quick-fix to create a method for an unresolved ViewSetIf a ViewSet has an unresolved reference, PyCharm suggests a quick-fix to introduce the missing method. Use Alt + Enter to call it.
What’s new! Data View PROYou can now look at n-dimensional NumPy arrays in the Data View tool window. Define the array you would like to inspect, along with a specific dimension or slice, in a special field at the bottom of the tool window, and PyCharm will display a table with the results.
Python support improvements Support for default types for type parameters (PEP 696)Improve typing with PyCharm’s support for the Python 3.13 ability to define the default types for type parameters. The IDE now incorporates default types for type parameters both for old-style and new-style generic classes, functions, and type aliases, and it takes them into account in type inference.
Pattern matching: Foldable match statementsTo improve the readability of code with large pattern-matching statements, you can now use folding for entire match statements or for separate cases inside them.
Download PyCharm 2024.2.2Visit our What’s New page to learn about other useful features included in this release, or read the release notes for the full breakdown, including more details on the features mentioned here.
If you encounter any problems, please report them in our issue tracker so we can address them promptly.
Connect with us on X (formerly Twitter) to share your thoughts on PyCharm 2024.2.2!
The Drop Times: 'Local Association Stand' at DrupalCon Barcelona 2024
Promet Source: Get to Know Provus®EDU: Our Higher Ed Drupal Distro
Qt Gradle Plugin 1.0 Released
Qt Gradle Plugin 1.0 (QtGP) build tool has been released. You can include it in your Android builds from Maven Central.
Web Review, Week 2024-38
Let’s go for my web review for the week 2024-38.
Is Tor still safe to use?Tags: tech, tor, privacy
The quick answer is yes. The longer answer is that more effort is still required to ensure the network has enough diversity of nodes to stay healthy.
https://blog.torproject.org/tor-is-still-safe/
Tags: tech, ai, machine-learning, gpt, business, economics, criticism
This is a very harsh and bleak view on the current generative AI craze. Clearly it survives on some sort of weird faith that things will magically improve. Some decision makers clearly run fully on said faith and lost all kind of realistic view of the situation. They are just very disconnected from the user’s needs.
There’s even a funny quote in there: “Generative AI must seem kind of magical when your entire life is either being in a meeting or reading an email”.
When this bubble bursts, it’s hard to predict what the fallout will be on the tech industry… for sure it won’t be pretty. It also begs the question: what is this industry going to do next? There’s clearly no plan after generative AI.
https://www.wheresyoured.at/subprimeai/
Tags: tech, ai, machine-learning, gpt, politics, ecology
Need to illustrate how much the current AI arm race is an ecological and social problem? Here is a very pathological case. This is what you get when you let the tycoons behind this completely unchecked.
https://www.npr.org/2024/09/11/nx-s1-5088134/elon-musk-ai-xai-supercomputer-memphis-pollution
Tags: tech, oracle, surveillance
People are gasping in horror with Larry Ellison’s latest claims… but really they should realize he’s not dreaming big. All of that is already here in one form or another. Maybe it was time to protest years ago?
https://www.404media.co/larry-ellisons-ai-powered-surveillance-dystopia-is-already-here/
Tags: tech, security, war, battery
Or why we should all be concerned and condemn the latest pager and walkie-talkie attacks. They clearly opened a Pandora’s box, it’d be surprising not to see more of those from various organizations. The funds and efforts required make it affordable enough.
https://www.bunniestudios.com/blog/2024/turning-everyday-gadgets-into-bombs-is-a-bad-idea/
Tags: tech, javascript, trademark, law, oracle
This is a good initiative. It makes no sense for Oracle to still cling onto JavaScript has a trademark.
Tags: tech, c++
Interesting proposal for a superset of C++ bringing a safe subset. Could it be a way to improve C++ use for the coming decade?
Tags: tech, asynchronous, multithreading, io
Or why going through an event loop might be more work initially but will make some things easier longer term. Nice way to frame how threads are bringing some opaque state.
https://utcc.utoronto.ca/~cks/space/blog/tech/ThreadsAsyncIOAndCancellation
Tags: tech, linux, kernel, realtime
Definitely good news if you have to maintain a real-time Linux system for industrial use. No more patches to carry over.
Tags: tech, kernel, rust
An interesting endeavor to create you own OS using another language than one of the usual ones.
Tags: tech, databases, sqlite, backup
Wish to use SQLite in production? You better have a good backup strategy. This article explains the main available options.
https://oldmoe.blog/2024/04/30/backup-strategies-for-sqlite-in-production/
Tags: tech, shell, scripting
Shell scripts deserve to be well designed like this indeed.
https://nochlin.com/blog/6-techniques-i-use-to-create-a-great-user-experience-for-shell-scripts
Tags: tech, shader, vulkan, directx
This is good news. DirectX being the other big graphics API if it adopts SPIR-V as interchange format it’ll open the way to more shader reuses.
https://devblogs.microsoft.com/directx/directx-adopting-spir-v/
Tags: tech, web, frontend, webgpu, 2d, graphics
Looks like an interesting tool to have in the box for 2D effects on the web.
Tags: tech, gui, html, web, frontend, complexity
A good list to design HTML forms. The bar is indeed high and there’s value in simplicity.
https://daverupert.com/2024/09/good-forms/
Tags: tech, web, frontend, css
This is indeed an interesting new CSS selector. Opens the door to doing more in a declarative way and with less Javascript.
https://www.joshwcomeau.com/css/has/
Tags: tech, management, metrics
We should definitely be more wary of metrics indeed. They help for a while, but at some point you’ll necessarily get unfortunately burnt by them. The only fallback is “good judgement”… do what you can with this.
https://buttondown.com/hillelwayne/archive/goodharts-law-in-software-engineering/
Tags: tech, team, management
Nice tricks to help the team jell. I should try this more.
https://brittonbroderick.com/2024/08/18/building-aggressively-helpful-teams/
Bye for now!
Akademy 2024 (in Würzburg!)
This year Akademy was in Würzburg (shock! horror!). I think its not too far fetched to say that we pulled it successfully.
How it came to beDuring last years Akademy in Thessaloniki the idea came up given the high density of KDE people in the area to hold Akademy in Würzburg. On top we had the perfect venue in mind: two lecture halls for talks, BoF rooms and a ample common area for people to hack and socialize. I had such thoughts in the back of my mind for a while but did not share or go through with them.
However what Akademy does is it makes you talk to people and Tobias convinced it me that we should do it and more people even agreed that Akademy in Würzburg would be a good idea! (Of course some of these encouragers this does not mean work organizing Akademy like it would mean for us…) So on the second-to-last day of Akademy I sat down and wrote an email to my former university professor to enquire about the possibility of doing Akademy in the building that we envisioned.
And just like that, after some informal talks during Akademy and a meeting at the University and sending in a proposal after the Call for Hosts later you end up having weekly meetings to talk about and plan the next installment of Akademy.
How it did goThere were some problems but I think all in all this year Akademy went very well. I heard so much praise and positive feedback it felt a bit surreal at first (as did Sunday evening when all the talks were done). The most stressful situation for me happened on Sunday afternoon when I retrieved my charging phone from the team room to discover that the social event could not go ahead as planned and trying to manage that with the rest of the team. I think the resulting evening was very nice and chill and you could feel my relieve. I heard there was even an afterparty afterwards at the bar where we had held the welcome event.
Of course during the event there are always minor issues that need dealing with and looking out for those, dealing with them and trying to make sure everything is running smoothly (and the stress coming these things) meant that I couldn’t (and was not in the right state of mind) to focus much on the talks. When the BoF days started this was bit better but only on Thursday after the day trip I felt in a ‘conference mood’ and was able to focus fully on the BoFs that I was attending and sit down and hack a bit. If you want to learn more about the actual conference many people have blogged about it on the planet or read the report on the Akademy website.
In the end I think it’s fair to say that Akademy was a success. Made possible by KDE e.V. (go donate!), the sponsors, all the people of the Akademy team local and non-local, all the Volunteers on short and on long notice an last but not least all the awesome attendees - there would be no Akademy without any people. Thank you to you all! I am excited to learn about where next years Akademy will be (you can do it as well - it’s not hard and you get an awesome award on top) and looking forward to attending it as ‘normal Attendee’ and meeting everyone again there (if not earlier).
Talk Python to Me: #477: Awesome Text Tricks with NLP and spaCy
eiriksm.dev: - I think I said “wait that’s all?” out loud!
From time to time I get some really good and motivating feedback on the product I have built, violinist.io. And I want to start this post, which will also have a huge feature announcement, by mentioning a couple of them:
It was wonderfully painless (...) I don’t think I’ve ever experienced a faster setup of a CI tool — I think I said “wait that’s all?” out loud!
overall did the trick of what I was looking for and was very very fast
In other words, easy to set up, and fast results.
Well, today I want to share another product update that will make it theoretically even faster to set up and get results. But first allow me to provide a bit of background on why this new feature came to be.
One recurring question we get is regarding two avenues of a similar aspect:
- Does our code have to make its way to violinist.io infrastructure for us to be able to use the service?
- Can violinist.io access our Self Hosted GitLab which is locked down with a required VPN connection
They might differ a bit in wording or actual focus, but they usually boil down to one of these. And from time to time we find a compromise to both of these questions together, and the person contacting us turn into happy customers. But from time to time these questions also become the actual blocker for them to start using violinist.io. But now, at least, we have an alternative that covers both of these use cases: Self hosting violinist.io runners!
And here is why I am mentioning the feedback regarding quick onboarding in the context of this product announcement. You can literally start an update check with one docker command:
docker run \ --pull=always \ -e "LICENCE_KEY=my_key" \ -e "PROJECT_URL=https://github.com/user/repo" \ -e "USER_TOKEN=ghp_jYgGb_1npvkiHTdnM" \ ghcr.io/violinist-dev/update-check-runner:8.3-multi-composer-2This will run the same update job as if it were running on violinist.io, only using your own computer!
Of course, this in itself is not super useful. Avoiding running commands on your computer is the whole point of using an automatic update service like violinist.io, but now you can do cool stuff like:
- Run the same update jobs as violinist.io without any code entering any third party infrastructure
- Run jobs your CI infrastructure of choice. GitHub Actions, CircleCI, Bitbucket pipelines, Self Hosted GitLab, your totally locked down VPN protected GitLab instance that has a totally locked down Jenkins server. And so on
- Decide your own intervals for running them, probably inside said CI infrastructure. Daily jobs? Weekly jobs? Hourly jobs? Not a problem.
- Compose CI workflows that can do all your repositories in a matrix, all on the same schedule, if useful?
- Expose a webhook to trigger jobs, and run them when new items appear in the Drupal Security Announcements RSS feed
- And a million other things probably? You decide!
If this sounds useful to you, or your organization, please don’t hesitate to reach out for a free trial. In fact, in the name of smooth onboarding, here is an absolutely free trial for you already without reaching out (as long as you are reading and using this within 2 months of this blog post): A totally free license key, valid for all repositories for 2 months from today (valid until 2024-11-19T19:20:19+01:00):
hc1NTsMwFATgHqXyCZ6f_-2V_RyvkLgAmzRYYLVqqiaqQKhSz9CrIA6T2xAWbGE50nwz9-Vr-Xz0Ajy7tPHQjm2anx7aUI9Dpdc67H8D8-g_Ji-sZ5u_m5v6dmrndxaa50YgSJDchZXq_-lzP_cs9J7_fPEV1HZu-2m7O4wv29M4zSzsPA_X63LLWKJb1w1SykKlRJgFoXRFR25AaZ6M48KgJKFJAdlM0kEU2bpkEYwCRBdBpqisQuIl6yyh01QKqNIZqUvXRcgchS0KxWohko3JKecgfQMNow all you need is a repository and a PAT (Personal Access Token), and you are off on your new automatic update adventure. For a bit more documentation than this sparse promotional blog post, please visit the container repository.
Lastly, there are so much more I want to share and address about this. For example the aspect of open source in all of this, the differences between this and violinist.io (the SaaS), the licencing and pricing aspect. But those are all blog posts on their own. For now, I hope you will try it out if it’s useful, and that you want to connect should you have any questions or concerns. Here in the comments, or by reaching out.
Let’s close up this blog post with an animated gif of "runners".
Wingware: Wing Python IDE Version 10.0.6 - September 20, 2024
Wing 10.0.6 adds support for Python 3.13 and fixes some issues with AI development, code refactoring, and unit testing with pytest.
See the change log for details.
Download Wing 10 Now: Wing Pro | Wing Personal | Wing 101 | Compare Products
What's New in Wing 10
AI Assisted Development
Wing Pro 10 takes advantage of recent advances in the capabilities of generative AI to provide powerful AI assisted development, including AI code suggestion, AI driven code refactoring, description-driven development, and AI chat. You can ask Wing to use AI to (1) implement missing code at the current input position, (2) refactor, enhance, or extend existing code by describing the changes that you want to make, (3) write new code from a description of its functionality and design, or (4) chat in order to work through understanding and making changes to code.
Examples of requests you can make include:
"Add a docstring to this method" "Create unit tests for class SearchEngine" "Add a phone number field to the Person class" "Clean up this code" "Convert this into a Python generator" "Create an RPC server that exposes all the public methods in class BuildingManager" "Change this method to wait asynchronously for data and return the result with a callback" "Rewrite this threaded code to instead run asynchronously"Yes, really!
Your role changes to one of directing an intelligent assistant capable of completing a wide range of programming tasks in relatively short periods of time. Instead of typing out code by hand every step of the way, you are essentially directing someone else to work through the details of manageable steps in the software development process.
Support for Python 3.12, 3.13, and ARM64 LinuxWing 10 adds support for Python 3.12 and 3.13, including (1) faster debugging with PEP 669 low impact monitoring API, (2) PEP 695 parameterized classes, functions and methods, (3) PEP 695 type statements, and (4) PEP 701 style f-strings.
Wing 10 also adds support for running Wing on ARM64 Linux systems.
Poetry Package ManagementWing Pro 10 adds support for Poetry package management in the New Project dialog and the Packages tool in the Tools menu. Poetry is an easy-to-use cross-platform dependency and package manager for Python, similar to pipenv.
Ruff Code Warnings & ReformattingWing Pro 10 adds support for Ruff as an external code checker in the Code Warnings tool, accessed from the Tools menu. Ruff can also be used as a code reformatter in the Source > Reformatting menu group. Ruff is an incredibly fast Python code checker that can replace or supplement flake8, pylint, pep8, and mypy.
Try Wing 10 Now!
Wing 10 is a ground-breaking new release in Wingware's Python IDE product line. Find out how Wing 10 can turbocharge your Python development by trying it today.
Downloads: Wing Pro | Wing Personal | Wing 101 | Compare Products
See Upgrading for details on upgrading from Wing 9 and earlier, and Migrating from Older Versions for a list of compatibility notes.
Matt Layman: Docker Go, JS, Static Files - Building SaaS #203
Release GCompris 4.2
Today we are releasing GCompris version 4.2.
It contains bug fixes and graphics improvements on multiple activities.
This version adds translation for Latvian.
It is fully translated in the following languages:
- Arabic
- Bulgarian
- Breton
- Catalan
- Catalan (Valencian)
- Greek
- UK English
- Esperanto
- Spanish
- Basque
- French
- Galician
- Croatian
- Hungarian
- Indonesian
- Italian
- Lithuanian
- Latvian
- Malayalam
- Dutch
- Norwegian Nynorsk
- Polish
- Brazilian Portuguese
- Romanian
- Russian
- Slovenian
- Albanian
- Swedish
- Swahili
- Turkish
- Ukrainian
It is also partially translated in the following languages:
- Azerbaijani (97%)
- Belarusian (87%)
- Czech (97%)
- German (96%)
- Estonian (96%)
- Finnish (95%)
- Hebrew (96%)
- Macedonian (90%)
- Portuguese (96%)
- Slovak (84%)
- Chinese Traditional (96%)
You can find packages of this new version for GNU/Linux, Windows, Android, Raspberry Pi and macOS on the download page. This update will also be available soon in the Android Play store, the F-Droid repository and the Windows store.
Thank you all,
Timothée & Johnny
Okteta got “Best Application” 2024 Akademy Award
The jury of this year’s KDE Akademy Awards, being by tradition representatives of last year’s winners, has selected the hex editor Okteta in the category “Best Application”. Thanks to them for this appreciation, even more for a niche application
Though, appreciation for what, as there are no details? The last new feature was added in 2019, with the 17th patch release since just done. So, for a reliable program with no need to relearn the UI every year and proudly close to zero open actual bug reports? Then the port to Qt6/KF6, while started in 2022, might be only completed in 2025… if ever. So rather, is this an end-of-life award for an aged 16 years old program?
Looking BackTriggered by the event some reflection on the past development, if only for the author himself, to update the memories how one got here and what it brings for the future. Which turned into a longer text than anticipated
2003-2006: Years of Initial Need for a WidgetThe Okteta project was born in 2003 , the known first code traces date back to May 13th, 2003. The first related code was imported into KDE’s code repository in August 15th, 2003, by the commit message:
Initial import of KHexEdit2, featuring a widget, a kpart and an app.Most important is the widget...
Hopefully it will be usable enough ready for KDE 3.2...
The name “KHexEdit2” was chosen as the project was a re-implementation of KHexEdit, the hex editor part of KDE since KDE 1.1. Because those times I was trying to create a viewer for executables and libraries (project name Binspekt, stalled soon), for which I wanted a widget for displaying bytes. KHexEdit seemed to have no code that could be cleanly ripped out and be reused, so work started to code such a widget from scratch and respectively also consumers of it, to add more reason & motivation.
The formal request on September 29th that year to then move the project’s code from the “kdenonbeta” area into release-covered areas had this optimistic line in it:
Finally there will be an app, build around the ReadWritePart. In 2004.
Turned out, life did not agree to that plan, thus 2004 passed without any such app. And so did 2005.
Still, back in February 2004 as part of KDE 3.2 the first elements of the still named KHexEdit2 project made a first release. Though with a bit of complexity. ((Note that at this time KDE was still also the name of the released bundle product, composed of the so called modules kdelibs, kdebase, kdeutils, etc. Where kdelibs held all the public libraries, kdebase the basic desktop components, and so on.)) As adding a complete implementation of a hexeditor widget to the official kdelibs for just a few potential users was declared unbalanced, instead just some header files with KHE namespaced abstract interface classes were added, with an inline utility method to dynamically load any plugin implementing them. So not increasing the actual runtime and installation size of kdelibs. And the kdeutils module got to provide such a plugin by the name KBytesEdit. This then was implemented by the hex editor widget library from the KHexEdit2 project, also in the kdeutils module, whose own API and headers were kept private. To confuse everyone this library was yet named libkhexedit, even if the actual KHexEdit program did not use it. Because the spirit of the naming was on the level of widgets and classes, not program names, and there the 2 postfix made no sense. Consumers of this construct became at least the utility app for Palm Pilot handhelds KPilot and the debugger plugin of the IDE KDevelop.
In March 2005 KDE 3.4 then included the KHexEdit2 KPart (read-only) as well, also located in the kdeutils module and also implemented using the private libkhexedit library. Making some people unhappy as it registered (like its Okteta successor still does today) as handler for the MIME type application/octet-stream, so popping up as fallback KPart where no other handler was found,. And seeing raw bytes over nothing has been partially perceived as “broken”
2006-2008: Second Try on a Program, as Sample and with new Dedicated Name2006 arrived, and with the author there was still some ever developing curiosity about the feasibility of writing viewer & editor programs using some higher-level reusable & exchangeable components. Now a byte array is a pretty simple data structure to use for a sample implementation of such a concept. And here we had a widget implementation for byte array viewing and editing fully under our control. And a certain level of skills with C++ acquired at the time. This was just too tempting to not give it an own try, for fun and experience. So a June blog post “Fun with KHexEdit” also mentioned a program again and introduced the name designed meanwhile:
I am tackling the construction of a successor to KHexEdit again, projectname “Okteta”.
Later that year a first visual snippet was shared, showing how KHexEdit’s UI served as initial template for Okteta, also to potentially help the transfer of existing users:
November 2006: first published screenshot of Okteta in some pre-Alpha stateTo avoid duplicated efforts and to increase pressure to deliver, two weeks later on November 27th an optimistic email was sent to KDE’s great eternal to-newer-API porting worker Laurent Montel, as it was the time to port KDE software to Qt4/KDELibs4:
Hi Laurent,
please don't spend too much effort at the old program KHexEdit, I am quite far
on the way to write a successor, called Okteta. Concerning feature
compatibility, so far I implemented around 60 % of the features of KHexEdit,
and hope to do the last 40 % until at least January. Yes, no code yet in SVN
(besides the library), but that will change in three weeks, promised.
[...]
This time life agreed mainly to the plan. Though the promise on the code in SVN was delivered on only with almost a year delay. A first dump of the program code was committed into KDE’s code repository on October 23rd, 2007, by the commit message:
Uploading the Okteta program into KDE's playground, so the code isn't lost, after growing slowly only on my hdd for more than a year. Okteta is a planned successor to KHexEdit, yet misses all of it's functionality. With it's modular architecture, based on the co-developed lib kakao, it should soon offer more than would could be done with the monolithic KHexEdit. I hope.As can be read, this first copy also was featuring a first draft for the own before mentioned higher-level component system, initially named “Kakao”, later in 2009 to be renamed to “Kasten”. That first name was made ad-hoc inspired by a drink on the table (in learned safe distance from the keyboard), to soon find it used similarly by some certain bigger IT company, even for a somehow related subject, thus a new name was by the time designed less ad-hoc.
And so some months later in April 2008 Okteta entered the “kdereview” phase, to proceed after two weeks into the kdeutils module. In time for KDE 4.1, so premiering its release as part of that in July 2008. Okteta here also took over to provide the KBytesEdit plugin for the kdelibs KHE interfaces as well as the KPart, which before had resided in subdirectories of the KHexEdit program sources. KHexEdit itself stayed unported to Qt4/KDELibs4, so Okteta as planned did not run into duplicated efforts and rivalry (or, it avoided competition, for good and bad).
July 2008: Okteta’s first release, as version 0.1, part of KDE 4.1 2008-2012: Years of Features FlowWith the foundations laid and releases established as part of KDE releases, the next years saw a number of features added, initially even each KDE version:
January 2009: new features in Okteta 0.2, part of KDE 4.2 August 2009: new features in Okteta 0.3, part of KDE 4.3 February 2010: new features in Okteta 0.4, part of KDE SC 4.4 July 2011: new features in Okteta 0.7, part of KDE Applications 4.7 August 2012: new features in Okteta 0.9, part of KDE Applications 4.9 2010-Present: Sharing Functionality in Rich Public LibrariesFrom the very begin of the project on the byte array viewing & editing feature was embeddable by 3rd-party software using the abstract KHE interfaces in kdelibs, or then the KPart at least for viewing, Though this allowed only little control & features due to a limited API.
Starting with Okteta 0.4 in February 2010, the two sets of underlying libraries, Kasten and Okteta ones, used to implement the Okteta program, the Okteta KPart and the KHE KBytesEdit plugin, are provided with stable public API.
The lower-Qt-level Okteta GUI library also started to be accompanied by a Qt Designer plugin, to allow easy use of the two provided classes of widgets also in Qt’s UI files.
February 2010: new Okteta widgets plugin for Qt Designer, part of Okteta 0.4In February 2010 during a week-long Kate-KDevelop development meeting in Berlin the intended flexibility of the new public libraries proofed itself by enabling to create a plugin for KDevelop to integrate hex viewing & edting and all the Okteta tools in just those few days, for some nice satisfaction. The plugin was officially released first with KDevelop 4.1 in October 2010 and later also ported to the Qt5/KF5 version of KDevelop. For the current Qt6/KF6-based version of KDevelop the plugin is excluded from the build for now, given the current lack of a released Qt6/KF6-based version of the Okteta and Kasten libraries.
October 2010: Okteta plugin for KDevelop, first released with KDevelop 4.1 2012-Present: Switching from Features to Architecture, from Bundled to Stand-AloneThe port to Qt5/KF5 happened without any issues and was completed for version 0.15, released as part of KDE Applications 14.12. During the transformation of KDELibs4 to KDE Frameworks 5 the KHE interfaces also got dropped there, due to Okteta meanwhile directly providing public libraries. So this ported version of Okteta also no longer provided the KBytesEdit plugin, but otherwise as before the public libraries and the KPart, next to the program itself.
After KDE Applications 17.12, as for a while there was no feature development and only occasional work on the design of the Okteta & Kasten libraries happened, the Okteta project switched to a stand-alone release schedule. A 0.25 version branch was created and patch version releases only done when there were user-relevant changes like bug fixes or bigger translation improvements.
Then 2019 brought the first version and for now also latest to also provide at least one new feature to users, for which a 0.26 version branch was created. This version has meanwhile got 17 patch releases, with bug fixes, translation improvements and other adjustments. And after 5 years of such polishing is the one which now received the “Best Application” 2024 Akademy Award
March 2019: new features in Okteta 0.26, released on own schedule 2022-Present: Preparing for Qt6 & KF6Okteta’s code base has been mostly quickly updated to any API deprecations, also as part of a “zero build warnings” strategy. So the approach taken for both Qt & KF libraries to strive for source-backward-compatible C++ API in their both new major version 6 made the initial port of Okteta to Qt6 & KF6(-Alpha) a matter of less than a day in May 2022. Now, only if one ignores one of the tools.
May 2022: preview of Okteta port to Qt6/KF6The Structures tool, first developed by Alexander Richardson in 2010 for Okteta 0.4, was in 2011 for Okteta 0.7 extended by him to also support dynamic structure definitions, using JavaScript expressions. For this QtScript has been used as engine. Four years later, in July 2015 though Qt 5.5 declared QtScript as deprecated. The officially recommended substitute QJSEngine turned out to not allow the dynamic translation of JavaScript properties and methods as relied on by the Structures tool for the copy-avoiding mapping of the data blob interpretation into the JavaScript scene (beware, for what the author understands so far). So it could not be used as drop-in replacement.
As finding a suitable and more future-proof JavaScript engine or exploring a possible reimplementation using the QJSEngine is a complex task and also needs bigger chunks of time & focussing, it had been postponed. Year after year. And thus now nine years later in 2024 there is still not even a plan. And Qt6 no longer now provides QtScript.
Just dropping the Structures tool is not a real option. It is a great feature, which also got some users. So a plan is needed and work to be done by someone. As of now my own, surely radical idea is to rewrite the whole Structures tool from scratch, still for the Qt5/KF5 version of Okteta. This should lead to fully wrapping the brain around this complex feature, instead of seeing to indirectly explore it by trying to understand all the details of the current elaborated implementation with the risk to misinterpret some intentions. Starting from scratch might also allow to finally share all the code used for the data formats in the separate Decoding Table tool, and perhaps even to introduce a more generic approach on the data formats supported in the main mass display besides currently byte values and 8-bit charset mapping. Idea, Should, Trying, Starting, Might, Perhaps… any words of confidence, please?
For now the initial Qt6/KF6 port is maintained by a single commit containing the complete dump of the “it builds, starts and does not crash on simple usage” changes, in a work branch continuously rebased to the latest current Qt5/KF5-based development branch. This commit would then at the time of the real Qt6/KF6 port be properly split into the different aspects of porting. For now it serves to hold the door open while still on the other side.
Looking Forward, by Looking Back Some MoreFor sure the initial goal with the Okteta program to do something for fun and experience has been largely achieved The current challenge with the needed replacement of the used JavaScript engine promises more experience, though no fun initially to me at least.
And some feedback as well even a KDE Akademy Award now hint the created and publicly shared program also served other people for their serious and less serious needs. Possibly even some desperate Faust-like persons, “So that they may perceive the bytes which hold their doc[ument] together in its inmost fold.” (and even tweak them for better as needed, owning their world or document). Though no contracts done, and thus no souls here owned.
But as before, Okteta actually is just a sample implementation of the actual interest pursued here, exploring the feasibility of writing programs by higher-level reusable & exchangeable components, ideally also allowing random end users to mesh those components themselves into tailored solutions for certain needs. So if development has stalled as it has, both on the components concept but also the hex editing features, how to increase motivation again to set resources aside, and for which part?
Position in the Hex Editor Solution SpaceWhen it comes to the Free Software solution space for hex editor needs, next to Okteta there is currently coverage starting from simple ones like GHex over wxHexEditor, which serves needs beyond Okteta by support for paged loading of big files and also the working memory, though sadly unmaintained currently, to the newer yet most impressive and very powerful ImHex (by what the web pages show, never tried).
So would people suffer if Okteta is gone for current platforms, at least for a while?
Open Component Systems vs. Closed Monolithic BlobsNow the author, while being curious, never got around to actually study existing solutions for the concept of higher-level component system or even deploy them in projects, only ever saw some theoretic surfaces. And is fully aware of the own experiment turning into something serious rather being a pipe dream. Even more when after soon two decades the initially created TODO list is not even 10 % done, this won’t work out this single human’ life So the following is more like the wish-wash of a hobbyist bird watcher, while also having some chicken in the backyard to which things are compared. Or alike.
It seems composable systems with complex interfaces are not the dominating species in the Free Software ecosystems. The Linux kernel outpaced any microkernel systems, e.g. Gnu Hurd is yet to be spotted outside OS zoos. The Eclipse Rich Client Platform, whose concepts were one of the original by-headlines inspirations for this project, seems to have maxed out some time ago as well. at least in the mainstream through the author’s bubble space. StarOffice^WOpenOffice^WLibreOffice has the UNO component system, but how many Add-Ons flourish on it? Then GnuStep would have enabled to spread the component concepts of OpenStep, but little has be seen? The later GnuStep-related, indeed thriving for stars impressive project Étoilé seemed to be overloaded with related ideas, but sadly never lift off. Then the GNOME project even had a reference to component systems in its initial full name “GNU Network Object Model Environment”, but its respective Bonobo framework based on CORBA faded away rather soon.
Also KDE started initially with implementations around the idea of components. To quote the KDE 1.0 announcement:
In view of these circumstances the KDE Project has developed a first rate compound document application framework, implementing the latest advances in framework technology and thus positioning itself in direct competition to such popular development frameworks as for example Mircrosoft’s MFC/COM/ActiveX technology. KDE’s KOM/OpenParts compound document technology enables developers to quickly create first rate applications implementing cutting edge technology. KDE’s KOM/OpenParts technology is build upon the open industry standards such as the object request broker standard CORBA 2.0.
KOM/OpenParts was then in KDE 2.0 replaced by KParts. Actually the presence of such technology development was the deciding factor to go for KDE when the author those days got into “Linux” and had to choose between GnuStep, Enlightenment, GNOME and KDE. These days though KDE is run with claims like “All About Apps”. The generic KServices system got destructed for KF6. The possibly latest KPart (a Markdown Viewer) was written years ago by this author, and the once KDE-central KPart-driven program Konqueror is only a shadow of its former self. KOffice & Calligra as component-oriented office suites also died or stalled close to extinction. Generic plugins like the KParts are not even listed on apps.kde.org or elsewhere anymore, also no longer mentioned as concept in KDE Gear release announcements. Similar specific plugins like the Plasma applets, they are also not listed separately, but only as part of the respective, in the example, Plasma products.
Additionally packaging formats like Flatpak or Snap are discussion-less embraced and promoted, which push in the direction of isolated and frozen software programs. Even today does Flatpak’s metadata system appstream. also otherwise discussion-less embraced by KDE, not have a concept of generic plugins, so KParts cannot be described properly.
In such an environment a component system would be limited to predefined fixed component sets in libraries, from which applications would then provide a setup and offer that to users. A bit like being able to shop as consumer at the kitchen equipment store only preset exhibition rooms, instead of meshing up items from different providers into ones’ own tailored meal preparations “app”. Surely it is in the interest of the dominating providers who then will see to bundle only their items, and then add bloat as well as only making half the items good. As consumer I desire to have the choice over pre-made bundles vs. self-assembled ones. Like there are times for All-inclusive vacations and times for self-organized ones. So with current KDE but also the larger current Free Software “desktop” scene as real world development environment working on and thinking about component systems has the author feel at odds.
So maybe the experiment with Kasten as a higher-level component system could also stop here. Perhaps some research instead could be done why such systems failed in comparison. Like, was it due to the inflexibility presented by fixed published interfaces, where on new feature needs implementations cannot simply do temporary shortcuts and adaptions where needed to be quick back to the market? Could it be due to the possible need for more abstract and generic thinking with component systems, where the majority of developers working for the market might prefer to think more concrete and case-by-case? Then, might there still be a middle-ground, where any advantages of high-level component systems are the deciding factor in the competition?
Next Release Scheduled: October 10th, version 0.26.18As described already for the early stages, there always have been ideas and plans.. and delays… and also doubts… and then things happened. Locally there are lots of notes with ideas, and a number of code drafts and sketches stacked by the years. And at least in the near future it seems there are still time windows, electricity, a laptop, and enough human capabilities to carry on and tinkering over this stack.
The Okteta (& Kasten) project for now is alive, just lurking around in front of the next evolution step to take. Which might find it new ground. Or extinction anyway. And while it is lurking, it gets a tad more feathers polished, by another bug fix release already scheduled for next month
mark.ie: My LocalGov Drupal contributions for week-ending September 20th, 2024
Do one thing, do it well - this week I spent most of my time creating a live preview module for microsites.
Programiz: Getting Started with Python
PyCharm: How to Use FastAPI for Machine Learning
This is a guest post from Cheuk Ting Ho, a data scientist who contributes to multiple open-source libraries, such as pandas, Polars, and Jupyter Notebook.
FastAPI provides a quick way to build a backend service with Python. With a few decorators, you can turn your Python function into an API application.
It is widely used by many companies including Microsoft, Uber, and Netflix. According to the Python Developers Survey, FastAPI usage has grown from 21% in 2021 to 29% in 2023. For data scientists, it’s the second most popular framework, with 31% using it.
In this blog post, we will cover the basics of FastAPI for data scientists who may want to build a quick prototype for their project.
What is FastAPI?FastAPI is a popular web framework for building APIs with Python, based on standard Python type hints. It is intuitive and easy to use, and it can provide a production-ready application in a short period of time. It is fully compatible with OpenAPI and JSON Schema.
Why use FastAPI for machine learning?Most teams working on machine learning projects consist of data scientists whose domains and professions lie on the statistics side of things. They may not have experience developing software or applications to ship their machine learning projects. FastAPI enables data scientists to easily create APIs for the following projects:
Deploying prediction models
The data science team may have trained a model for the prediction of the sales demand in a warehouse. To make it useful, they have to provide an API interface so other parts of the stock management system can use this new prediction functionality.
Suggestion engines
One of the very common uses of machine learning is as a system that provides suggestions based on the users’ choices. For example, if someone puts certain products in their shopping cart, more items can be suggested to that user. Such an e-commerce system requires an API call to the suggestion engine that takes input parameters.
Dynamic dashboards and reporting systems
Sometimes, reports for data science projects need to be presented as dashboards so users can inspect the results themselves. One possible approach is to have the data model provide an API. Frontend developers can use this API to create applications that allow users to interact with the data.
Advantages of using FastAPICompared to other Python web frameworks, FastAPI is simple yet fully functional. Mainly using decorators and type hints, it allows you to build a web application without the complexity of building a whole ORM (object-relational mapping) model and with the flexibility of using any database, including any SQL and NoSQL databases. FastAPI also provides automatic documentation generation, support for additional information and validation for query parameters, and good async support.
Fast development
Creating API calls in FastAPI is as easy as adding decorators in the Python code. Little to no backend experience is needed for anyone who wants to turn a Python function into an application that will respond to API calls.
Fast documentation
FastAPI provides automatic interactive API documentation using Swagger UI, which is an industry standard. No extra effort is required to build clear documentation with API call examples. This creates an advantage for busy data science teams who may not have the energy and expertise to write technical specifications and documentation.
Easy testing
Writing tests is one of the most important steps in software development, but it can also be one of the most tedious, especially when the time of the data science team is valuable. Testing FastAPI is made simple thanks to Starlette and HTTPX. Most of the time no monkey patching is needed and tests are easy to write and understand.
Fast deployment
FastAPI comes with a CLI tool that can bridge development and deployment smoothly. It allows you to switch between development mode and production mode easily. Once development is completed, the code can be easily deployed using a Docker container with images that have Python prebuilt.
How to use FastAPI for a machine learning projectIn this example, we will turn a classification prediction model that uses the Nearest Neighbors algorithm to predict the species of various penguins based on their bill and flipper length into a backend application. We will provide an API that takes parameters from the query parameters of a URL and gives back the prediction. This shows how a prototype can be made quickly by any data scientist with no backend development experience.
We will use a simple `KNeighborsClassifier` on the penguin data set as an example. Details of how to build the model will be omitted, but feel free to check out the relevant notebook here. In the following tutorial, we will focus on the usage of FastAPI and explain some fundamental concepts. We will be building a prototype to do so.
1. Start a FastAPI project with PyCharmIn this blog post, we will be using PyCharm Professional 2024.1. The best way to start using FastAPI is to create a FastAPI project with PyCharm. When you click New Project in PyCharm, you will be presented with a large selection of projects to choose from. Select the FastAPI tab:
From here, you can put in the name of your project and take advantage of other options such as initializing Git and the virtual environment that you want to use.
After doing so, you will see the basic structure of a FastAPI project set up for you.
There is also a `test_main.http` file set up for you to quickly test all the endpoints.
Next, set up our environment dependency with `requirements.txt` by selecting Sync Python Requirements under PyCarm’s Tool menu.
Then you can select the `requirements.txt` file to be used.
You can copy and use this `requirements.txt` file. We will be using pandas and scikit-learn for the machine learning part of the project. Also, add the `penguins.csv` file to your project directory.
Arrange your machine learning code in the `main.py` file. We will start with a script that trains our model:
import pandas as pd from sklearn.model_selection import train_test_split from sklearn import preprocessing from sklearn.neighbors import KNeighborsClassifier from sklearn.pipeline import Pipeline from sklearn.preprocessing import StandardScaler data = pd.read_csv('penguins.csv') data = data.dropna() le = preprocessing.LabelEncoder() X = data[["bill_length_mm", "flipper_length_mm"]] le.fit(data["species"]) y = le.transform(data["species"]) X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0) clf = Pipeline( steps=[("scaler", StandardScaler()), ("knn", KNeighborsClassifier(n_neighbors=11))] ) clf.set_params().fit(X_train, y_train)We can place the above code after `app = FastAPI()`. All of it will be run when we start the application.
However, there is a better way to run the start-up code we used to set up our model. We will cover that in a later part of the blog post.
Next we will look at how to add our model to FastAPI functionality. As a first step, we will add a response to the root of the URL and just simply return a message about our model in JSON format. Change the code in `async def root():` from “Hello world” to our message like this:
@app.get("/") async def root(): return { "Name": "Penguins Prediction", "description": "This is a penguins prediction model based on the bill length and flipper length of the bird.", }Now, test our application. First, we will start our application, which is easy in PyCharm. Just press the arrow button () next to your project name at the top.
If you are using the default settings, your application will run on http://127.0.0.1:8000. You can double-check that by looking at the prompt from the Run window.
Once the process has started, let’s go to `test_main.http` and press the first arrow button () next to `GET`. From the HTTP Client in the Services window, you will see the response message that we put in.
The response JSON file is also saved for future inspection.
Next, we would like to let users make predictions by providing query parameters in the URL. Let’s add the code below after the `root` function.
@app.get("/predict/") async def predict(bill_length_mm: float = 0.0, flipper_length_mm: float = 0.0): param = { "bill_length_mm": bill_length_mm, "flipper_length_mm": flipper_length_mm } if bill_length_mm <=0.0 or flipper_length_mm <=0.0: return { "parameters": param, "error message": "Invalid input values", } else: result = clf.predict([[bill_length_mm, flipper_length_mm]]) return { "parameters": param, "result": le.inverse_transform(result)[0], }Here we set the default value of the `bill_length_mm` and `flipper_length_mm` to be 0 if the user didn’t input a value. We also add a check to see if either of the values is 0 and return an error message instead of trying to predict which penguin the input refers to.
If the inputs are not 0, we will use the model to make a prediction and use the encoder to do an inverse transformation to get the label of the predicted target, i.e. the name of the penguin species.
This is not the only way you can verify inputs. You can also consider using Pydantic for input verification.
If you are using the same version of FastAPI as stated in `requirements.txt`, FastAPI automatically refreshes the service and applies changes on save. Now put in a new URL in `test_main.http` to test (separated from the URL before with ###):
### GET http://127.0.0.1:8000/predict/?bill_length_mm=40.3&flipper_length_mm=195 Accept: application/jsonPress the arrow button () next to our new URL and see the output.
Next you can try a URL with one or both of the parameters removed to see the error message:
Last, let’s look at how we can set up our model with FastAPI lifespan events. The advantage of doing that is we can make sure no request will be accepted while the model is still being set up and the memory used will be cleaned up afterward. To do that, we will use an `asynccontextmanager`. Before `app = FastAPI()` we will add:
from contextlib import asynccontextmanager ml_models = {} @asynccontextmanager async def lifespan(app: FastAPI): # Set up the ML model here yield # Clean up the models and release resources ml_models.clear()Now we will move the import of pandas and scikit-learn to be alongside the other imports. We will also move our setup code inside the `lifespan` function, setting the machine learning model and LabelEncoder inside `ml_models` like this:
from fastapi import FastAPI from contextlib import asynccontextmanager import pandas as pd from sklearn.model_selection import train_test_split from sklearn import preprocessing from sklearn.neighbors import KNeighborsClassifier from sklearn.pipeline import Pipeline from sklearn.preprocessing import StandardScaler ml_models = {} @asynccontextmanager async def lifespan(app: FastAPI): # Set up the ML model here data = pd.read_csv('penguins.csv') data = data.dropna() le = preprocessing.LabelEncoder() X = data[["bill_length_mm", "flipper_length_mm"]] le.fit(data["species"]) y = le.transform(data["species"]) X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0) clf = Pipeline( steps=[("scaler", StandardScaler()), ("knn", KNeighborsClassifier(n_neighbors=11))] ) clf.set_params().fit(X_train, y_train) ml_models["clf"] = clf ml_models["le"] = le yield # Clean up the models and release resources ml_models.clear()After that we will add the `lifespan=lifespan` parameter in `app = FastAPI()`:
app = FastAPI(lifespan=lifespan)Now save and test again. Everything should work and we should see the same result as before.
Afterthought: When to train the model?From our example, you may wonder when the model is trained. Since `clf` is trained at the beginning, i.e. when the service is launched, you may wonder why we do not train the model every time someone makes a prediction.
We do not want the model to be trained every time someone makes a call, because it costs way more resources to re-train everything. Additionally, it may cause race conditions since our FastAPI application is working concurrently. This is especially the case if we use live data that changes all the time.
Technically, we can set up an API to collect data and re-train the model (which we will demonstrate in the next example). Other options would be to schedule a re-train at a certain time when a certain amount of new data has been collected or to let a super user upload new data and trigger the re-training.
So far, we are aiming to build a prototype that runs locally. Check out this article on deploying a FastAPI project on a cloud service for more information.
What is concurrency?To put it simply, concurrency is like when you are cooking in the kitchen, and while waiting for the water to boil, you go ahead and chop the vegetables. Since, in the web service world, the server is talking to many terminals, and the communication between the server and the terminals is slower than most internal applications, so the server will not talk to and serve the terminals one by one. Instead, it will talk to and serve many of them at the same time while fulfilling their requests. You may want to check out this explanation in the FastAPI documentation.
In Python, this is achieved by using async code. In our FastAPI code, the use of `async def` instead of `def` is obvious evidence that FastAPI is working concurrently. There are other keywords used in Python async code, like `await` and `asyncio.get_event_loop`, but we won’t be able to cover them in this blog post.
How to use FastAPI for an image classification projectTo discover more FastAPI functionality, we will add an image classification model based on the MNIST example in Keras to our application as well (we are using the TensorFlow backend). If you installed the `requirements.txt` provided, you should have Keras and Pillow installed for image processing and building a convolutional neural network (CNN).
1. RefactoringBefore we start, let’s refactor our code. To make the code more organized, we will put the model setup for the penguins prediction in a function:
def penguins_pipeline(): data = pd.read_csv('penguins.csv') data = data.dropna() le = preprocessing.LabelEncoder() X = data[["bill_length_mm", "flipper_length_mm"]] le.fit(data["species"]) y = le.transform(data["species"]) X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0) clf = Pipeline( steps=[("scaler", StandardScaler()), ("knn", KNeighborsClassifier(n_neighbors=11))] ) clf.set_params().fit(X_train, y_train) return clf, leThen we rewrite the lifespan function. With full-line code completion in PyCharm, it is very easy:
2. Set up a CNN model for MNIST predictionIn similar fashion as the penguin prediction model, we create a function for MNIST prediction (and we will store the meta parameters globally):
# MNIST model meta parameters num_classes = 10 input_shape = (28, 28, 1) batch_size = 128 epochs = 15 def mnist_pipeline(): # Load the data and split it between train and test sets (x_train, y_train), _ = keras.datasets.mnist.load_data() # Scale images to the [0, 1] range x_train = x_train.astype("float32") / 255 # Make sure images have shape (28, 28, 1) x_train = np.expand_dims(x_train, -1) # convert class vectors to binary class matrices y_train = keras.utils.to_categorical(y_train, num_classes) model = keras.Sequential( [ keras.Input(shape=input_shape), layers.Conv2D(32, kernel_size=(3, 3), activation="relu"), layers.MaxPooling2D(pool_size=(2, 2)), layers.Conv2D(64, kernel_size=(3, 3), activation="relu"), layers.MaxPooling2D(pool_size=(2, 2)), layers.Flatten(), layers.Dropout(0.5), layers.Dense(num_classes, activation="softmax"), ] ) model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"]) model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1) return modelThen add the model setup in the lifespan function:
ml_models["cnn"] = mnist_pipeline()Note that since this is added, every time you make changes to `main.py` and save, the model will be trained again. It can take a bit of time. So in development you may want to use a dummy model that requires no training time at all or a pre-trained model instead. After training, the CNN model will be ready to go.
3. Set up a POST endpoint for uploading an image file for predictionTo set up an endpoint that takes an upload file, we have to use UploadFile in FastAPI:
@app.post("/predict-image/") async def predicct_upload_file(file: UploadFile): img = await file.read() # process image for prediction img = Image.open(BytesIO(img)).convert('L') img = np.array(img).astype("float32") / 255 img = np.expand_dims(img, (0, -1)) # predict the result result = ml_models["cnn"].predict(img).argmax(axis=-1)[0] return {"filename": file.filename, "result": str(result)}Please note that this is a POST endpoint (so far we have only set up GET endpoints).
Don’t forget to import `UploadFile` from `fastapi`:
from fastapi import FastAPI, UploadFileAnd `Image` from Pillow. We are also using `BytesIO` from the `io` module:
from PIL import Image from io import BytesIOTo test this using the PyCharm HTTP Client with a test image file, we will make use of the `multipart/form-data` encoding. You can check out the HTTP request syntax here. This is what you will put in the `test_in.http` file:
### POST http://127.0.0.1:8000/predict-image/ HTTP/1.1 Content-Type: multipart/form-data; boundary=boundary --boundary Content-Disposition: form-data; name="file"; filename="test_img0.png" < ./test_img0.png --boundary– 4. Add an API to collect data and trigger retrainingNow, here comes the retraining. We set up a POST endpoint like above to accept a zip file which contains training images and labels. The zip file will then be processed and the training data will be prepared. After that we will fit the CNN model again:
@app.post("/upload-images/") async def retrain_upload_file(file: UploadFile): img_files = [] labels_file = None train_img = None with ZipFile(BytesIO(await file.read()), 'r') as zfile: for fname in zfile.namelist(): if fname[-4:] == '.txt' and fname[:2] != '__': labels_file = fname elif fname[-4:] == '.png': img_files.append(fname) if len(img_files) == 0: return {"error": "No training images (png files) found."} else: for fname in sorted(img_files): with zfile.open(fname) as img_file: img = img_file.read() # process image img = Image.open(BytesIO(img)).convert('L') img = np.array(img).astype("float32") / 255 img = np.expand_dims(img, (0, -1)) if train_img is None: train_img = img else: train_img = np.vstack((train_img, img)) if labels_file is None: return {"error": "No training labels file (txt file) found."} else: with zfile.open(labels_file) as labels: labels_data = labels.read() labels_data = labels_data.decode("utf-8").split() labels_data = np.array(labels_data).astype("int") labels_data = keras.utils.to_categorical(labels_data, num_classes) # retrain model ml_models["cnn"].fit(train_img, labels_data, batch_size=batch_size, epochs=epochs, validation_split=0.1) return {"message": "Model trained successfully."}Remember to import `ZipFile`:
from zipfile import ZipFileIf we now try the endpoint with this zip file of 1000 retraining images and labels, you will see that it takes a moment for the response to come, as the training is taking a while:
POST http://127.0.0.1:8000/upload-images/ HTTP/1.1 Content-Type: multipart/form-data; boundary=boundary --boundary Content-Disposition: form-data; name="file"; filename="training_data.zip" < ./retrain_img.zip --boundary--Imagine the zip files contain more training data or you’re retraining a more complicated model. The user would then have to wait for a long time and it would seem like things are not working for them.
5. Retrain the model with BackgroundTasksA better way to handle retraining is, after receiving the training data, we process it and check if the data is in the right format, then give a response saying that the retraining has restarted and train the model in `BackgroundTasks`. Here is how to do it. First, we will add `BackgroundTasks` to our `upload-images` endpoint:
@app.post("/upload-image/") async def retrain_upload_file(file: UploadFile, background_tasks: BackgroundTasks): ...Remember to import it from `fastapi`:
from fastapi import FastAPI, UploadFile, BackgroundTasksThen, we will put the fitting of the model into the `background_tasks`:
# retrain model background_tasks.add_task( ml_models["cnn"].fit, train_img, labels_data, batch_size=batch_size, epochs=epochs, validation_split=0.1 )Also, we will update the message in the response:
return {"message": "Data received successfully, model training has started."}Now test the endpoint again. You will see that the response has arrived much quicker, and if you look at the Run window, you’ll see that the training is running after the response has arrived.
At this point, more functionality can be added, for example, an option to notify the user later (e.g. via email) when the training is finished or track the training progress in a dashboard when a full application is built.
FastAPI provides an easy way to convert your data science project into a working application in several easy steps. It is perfect for data science teams that want to provide an application prototype for their machine learning model which can be further developed into a professional web application if needed.
PyCharm Professional is the Python IDE that allows you to develop FastAPI applications more easily with a preconfigured project for FastAPI, coding assistance, tailored run/debug configurations, and the Endpoints tool window for managing API endpoints efficiently.
Get a free trial of PyCharm ProfessionalIn this blog post, we showed the process of providing a simple API for a pre-trained prediction model. To learn more about FastAPI, I would suggest checking out the official FastAPI documentation. If you’re choosing between different frameworks, explore how FastAPI differs from Django.
About the author Cheuk Ting HoCheuk has been a Data Scientist at various companies – a job that demands high numerical and programming skills, especially in Python. Following her passion for the tech community, Cheuk has been a Developer Advocate for three years. She also contributes to multiple open-source libraries like Hypothesis, Pytest, pandas, Polars, PyO3, Jupyter Notebook, and Django. Cheuk is currently a consultant and trainer at CMD Limes.
Implementing an Audio Mixer, Part 2
In Part 1, we covered PCM audio and superimposing waveforms, and developed an algorithm to combine an arbitrary number of audio streams into one.
Now we need to use these ideas to finish a full implementation using Qt Multimedia.
Using Qt Multimedia for Audio Device AccessSo what do we need? Well, we want to use a single QAudioOutput, which we pass an audio device and a supported audio format.
We can get those like this:
const QAudioDeviceInfo &device = QAudioDeviceInfo::defaultOutputDevice(); const QAudioFormat &format = device.preferredFormat();Let’s construct our QAudioOutput object using the device and format:
static QAudioOutput audioOutput(device, format);Now, to use it to write data, we have to call start on the audio output, passing in a QIODevice *.
Normally we would use the QIODevice subclass QBuffer for a single audio buffer. But in this case, we want our own subclass of QIODevice, so we can combine multiple buffers into one IO device.
We’ll call our subclass MixerStream. This is where we will do our bufferCombine, and keep our member list of streams mStreams.
We will also need another stream type for mStreams. For now let’s just call it DecoderStream, forward declare it, and worry about its implementation later.
One thing that’s good to know at this point is DecoderStream objects will get the data buffers we need by decoding audio data from a file. Because of this, we’ll need to keep our audio format from above to as a data member mFormat. Then we can pass it to decoders when they need it.
Implementing MixerStreamSince we are subclassing QIODevice, we need to provide reimplementations for these two protected virtual functions:
virtual qint64 QIODevice::readData(char *data, qint64 maxSize); virtual qint64 QIODevice::writeData(const char *data, qint64 maxSize);We also want to provide a way to open new streams that we’ll add to mStreams, given a filename. We’ll call this function openStream. We can also allow looping a stream multiple times, so let’s add a parameter for that and give it a default value of 1.
Additionally, we’ll need a user-defined destructor to delete any pointers in the list that might remain if the MixerStream is abruptly destructed.
// mixerstream.h #pragma once #include <QAudioFormat> #include <QAudioOutput> #include <QIODevice> class DecodeStream; class MixerStream : public QIODevice { Q_OBJECT public: explicit MixerStream(const QAudioFormat &format); ~MixerStream(); void openStream(const QString &fileName, int loops = 1); protected: qint64 readData(char *data, qint64 maxSize) override; qint64 writeData(const char *data, qint64 maxSize) override; private: QAudioFormat mFormat; QList<DecodeStream *> mStreams; };Notice that combineSamples isn’t in the header. It’s a pretty basic function that doesn’t require any members, so we can just implement it as a free function.
Let’s put it in a header mixer.h and wrap it in a namespace:
// mixer.h #pragma once #include <QtGlobal> #include <limits> namespace Mixer { inline qint16 combineSamples(qint32 samp1, qint32 samp2) { const auto sum = samp1 + samp2; if (std::numeric_limits<qint16>::max() < sum) return std::numeric_limits<qint16>::max(); if (std::numeric_limits<qint16>::min() > sum) return std::numeric_limits<qint16>::min(); return sum; } } // namespace MixerThere are some very basic things we can get out of the way quickly in the MixerStream cpp file. Recall that we must implement these member functions:
explicit MixerStream(const QAudioFormat &format); ~MixerStream(); void openStream(const QString &fileName, int loops = 1); qint64 readData(char *data, qint64 maxSize) override; qint64 writeData(const char *data, qint64 maxSize) override;The constructor is very simple:
MixerStream::MixerStream(const QAudioFormat &format) : mFormat(format) { setOpenMode(QIODevice::ReadOnly); }Here we use setOpenMode to automatically open our device in read-only mode, so we don’t have to call open() directly from outside the class.
Also, since it’s going to be read-only, our reimplementation of QIODevice::writeData will do nothing:
qint64 MixerStream::writeData([[maybe_unused]] const char *data, [[maybe_unused]] qint64 maxSize) { Q_ASSERT_X(false, "writeData", "not implemented"); return 0; }The custom destructor we need is also quite simple:
MixerStream::~MixerStream() { while (!mStreams.empty()) delete mStreams.takeLast(); }readData will be almost exactly the same as the implementation we did earlier, but returning qint64. The return value is meant to be the amount of data written, which in our case is just the maxSize argument given to it, as we write fixed-size buffers.
Additionally, we should call qAsConst (or std::as_const) on mStreams in the range-for to avoid detaching the Qt container. For more on qAsConst and range-based for loops, see Jesper Pederson’s blog post on the topic.
qint64 MixerStream::readData(char *data, qint64 maxSize) { memset(data, 0, maxSize); constexpr qint16 bitDepth = sizeof(qint16); const qint16 numSamples = maxSize / bitDepth; for (auto *stream : qAsConst(mStreams)) { auto *cursor = reinterpret_cast<qint16 *>(data); qint16 sample; for (int i = 0; i < numSamples; ++i, ++cursor) if (stream->read(reinterpret_cast<char *>(&sample), bitDepth)) *cursor = Mixer::combineSamples(sample, *cursor); } return maxSize; }That only leaves us with openStream. This one will require us to discuss DecodeStream and its interface.
The function should construct a new DecodeStream on the heap, which will need a file name and format. DecodeStream, as implied by its name, needs to decode audio files to PCM data. We’ll use a QAudioDecoder within DecodeStream to accomplish this, and for that, we need to pass mFormat to the constructor. We also need to pass loops to the constructor, as each stream can have a different number of loops.
Now our constructor call will look like this:
DecodeStream(fileName, mFormat, loops);We can then use operator<< to add it to mStreams.
Finally, we need to remove it from the list when it’s done. We’ll give it a Qt signal, finished, and connect it to a lambda expression that will remove the stream from the list and delete the pointer.
Our completed openStream function now looks like this:
void MixerStream::openStream(const QString &fileName, int loops) { auto *decodeStream = new DecodeStream(fileName, mFormat, loops); mStreams << decodeStream; connect(decodeStream, &DecodeStream::finished, this, [this, decodeStream]() { mStreams.removeAll(decodeStream); decodeStream->deleteLater(); }); }Recall from earlier that we call read on a stream, which takes a char * to which the read data will be copied and a qint64 representing the size of the data.
This is a QIODevice function, which will internally call readData. Thus, DecoderStream also needs to be a QIODevice.
Getting PCM Data for DecodeStreamIn DecodeStream, we need readData to spit out PCM data, so we need to decode our audio file to get its contents in PCM format. In Qt Multimedia, we use a QAudioDecoder for this. We pass it an audio format to decode to, and a source device, in this case a QFile file handle for our audio file.
When a QAudioDecoder‘s start method is called, it will begin decoding the source file in a non-blocking manner, emitting a signal bufferReady when a full buffer of decoded PCM data is available.
On that signal, we can call the decoder’s read method, which gives us a QAudioBuffer. To store in a data member in DecodeStream, we use a QByteArray, which we can interact with using QBuffers to get a QIODevice interface for reading and writing. This is the ideal way to work with buffers of bytes to read or write in Qt.
We’ll make two QBuffers: one for writing data to the QByteArray (we’ll call it mInputBuffer), and one for reading from the QByteArray (we’ll call it mOutputBuffer). The reason for using two buffers rather than one read/write buffer is so the read and write positions can be independent. Otherwise, we will encounter more stuttering.
So when we get the bufferReady signal, we’ll want to do something like this:
const QAudioBuffer buffer = mDecoder.read(); mInputBuf.write(buffer.data<char>(), buffer.byteCount());We’ll also need to have some sort of state enum. The reason for this is that when we are finished with the stream and emit finished(), we remove and delete the stream from a connected lambda expression, but read might still be called before that has completed. Thus, we want to only read from the buffer when the state is Playing.
Let’s update mixer.h to put the enum in namespace Mixer:
#pragma once #include <QtGlobal> #include <limits> namespace Mixer { enum State { Playing, Stopped }; inline qint16 combineSamples(qint32 samp1, qint32 samp2) { const auto sum = samp1 + samp2; if (std::numeric_limits<qint16>::max() < sum) return std::numeric_limits<qint16>::max(); if (std::numeric_limits<qint16>::min() > sum) return std::numeric_limits<qint16>::min(); return sum; } } // namespace Mixer Implementing DecodeStreamNow that we understand all the data members we need to use, let’s see what our header for DecodeStream looks like:
// decodestream.h #pragma once #include "mixer.h" #include <QAudioDecoder> #include <QBuffer> #include <QFile> class DecodeStream : public QIODevice { Q_OBJECT public: explicit DecodeStream(const QString &fileName, const QAudioFormat &format, int loops); protected: qint64 readData(char *data, qint64 maxSize) override; qint64 writeData(const char *data, qint64 maxSize) override; signals: void finished(); private: QFile mSourceFile; QByteArray mData; QBuffer mInputBuf; QBuffer mOutputBuf; QAudioDecoder mDecoder; QAudioFormat mFormat; Mixer::State mState; int mLoopsLeft; };In the constructor, we’ll initialize our private members, open the DecodeStream in read-only (like we did earlier), make sure we open the QFile and QBuffers successfully, and finally set up our QAudioDecoder.
DecodeStream::DecodeStream(const QString &fileName, const QAudioFormat &format, int loops) : mSourceFile(fileName) , mInputBuf(&mData) , mOutputBuf(&mData) , mFormat(format) , mState(Mixer::Playing) , mLoopsLeft(loops) { setOpenMode(QIODevice::ReadOnly); const bool valid = mSourceFile.open(QIODevice::ReadOnly) && mOutputBuf.open(QIODevice::ReadOnly) && mInputBuf.open(QIODevice::WriteOnly); Q_ASSERT(valid); mDecoder.setAudioFormat(mFormat); mDecoder.setSourceDevice(&mSourceFile); mDecoder.start(); connect(&mDecoder, &QAudioDecoder::bufferReady, this, [this]() { const QAudioBuffer buffer = mDecoder.read(); mInputBuf.write(buffer.data<char>(), buffer.byteCount()); }); }Once again, our QIODevice subclass is read-only, so our writeData method looks like this:
qint64 DecodeStream::writeData([[maybe_unused]] const char *data, [[maybe_unused]] qint64 maxSize) { Q_ASSERT_X(false, "writeData", "not implemented"); return 0; }Which leaves us with the last part of the implementation, DecodeStream‘s readData function.
We zero out the char * with memset to avoid any noise if there are areas that are not overwritten. Then we simply read from the QByteArray into the char * if mState is Mixer::Playing.
We check to see if we finished reading the file with QBuffer::atEnd(), and if we are, we decrement the loops remaining. If it’s zero now, that was the last (or only) loop, so we set mState to stopped, and emit finished(). Either way we seek back to position 0. Now if there are loops left, it starts reading from the beginning again.
qint64 DecodeStream::readData(char *data, qint64 maxSize) { memset(data, 0, maxSize); if (mState == Mixer::Playing) { mOutputBuf.read(data, maxSize); if (mOutputBuf.size() && mOutputBuf.atEnd()) { if (--mLoopsLeft == 0) { mState = Mixer::Stopped; emit finished(); } mOutputBuf.seek(0); } } return maxSize; }Now that we’ve implemented DecodeStream, we can actually use MixerStream to play two audio files at the same time!
Using MixerStreamHere’s an example snippet that shows how MixerStream can be used to route two simultaneous audio streams into one system mixer channel:
const auto &device = QAudioDeviceInfo::defaultOutputDevice(); const auto &format = device.preferredFormat(); auto mixerStream = std::make_unique<MixerStream>(format); auto *audioOutput = new QAudioOutput(device, format); audioOutput->setVolume(0.5); audioOutput->start(mixerStream.get()); mixerStream->openStream(QStringLiteral("/path/to/some/sound.wav")); mixerStream->openStream(QStringLiteral("/path/to/something/else.mp3"), 3); Final RemarksThe code in this series of posts is largely a reimplementation of Lova Widmark’s project QtMixer. Huge thanks to her for a great and lightweight implementation. Check the project out if you want to use something like this for a GPL-compliant project (and don’t mind that it uses qmake).
About KDAB
If you like this article and want to read similar material, consider subscribing via our RSS feed.
Subscribe to KDAB TV for similar informative short video content.
KDAB provides market leading software consulting and development services and training in Qt, C++ and 3D/OpenGL. Contact us.
The post Implementing an Audio Mixer, Part 2 appeared first on KDAB.