Feeds

The Drop Times: 'Local Association Stand' at DrupalCon Barcelona 2024

Planet Drupal - Fri, 2024-09-20 06:40
For the first time, fifteen European local Drupal associations are joining forces to create the 'Local Association Stand' at DrupalCon Barcelona 2024. This collaborative space will serve as a hub for discussion, networking, and support for event organizers and community members. With a cosy seating area, the stand aims to connect visitors with local initiators and strengthen the bonds within the Drupal community. Key sessions include insights into setting up local association sites, organizing Drupal events, and the future of in-person DrupalCamps in the post-COVID era.
Categories: FLOSS Project Planets

Promet Source: Get to Know Provus®EDU: Our Higher Ed Drupal Distro

Planet Drupal - Fri, 2024-09-20 06:07
Takeaway: As a Drupal distribution for colleges and universities, Provus®EDU is the ultimate easy website builder that empowers your team to create engaging online experiences without needing extensive technical expertise.
Categories: FLOSS Project Planets

Qt Gradle Plugin 1.0 Released

Planet KDE - Fri, 2024-09-20 06:00

Qt Gradle Plugin 1.0 (QtGP) build tool has been released. You can include it in your Android builds from Maven Central.

Categories: FLOSS Project Planets

Web Review, Week 2024-38

Planet KDE - Fri, 2024-09-20 05:55

Let’s go for my web review for the week 2024-38.

Is Tor still safe to use?

Tags: tech, tor, privacy

The quick answer is yes. The longer answer is that more effort is still required to ensure the network has enough diversity of nodes to stay healthy.

https://blog.torproject.org/tor-is-still-safe/


The Subprime AI Crisis

Tags: tech, ai, machine-learning, gpt, business, economics, criticism

This is a very harsh and bleak view on the current generative AI craze. Clearly it survives on some sort of weird faith that things will magically improve. Some decision makers clearly run fully on said faith and lost all kind of realistic view of the situation. They are just very disconnected from the user’s needs.

There’s even a funny quote in there: “Generative AI must seem kind of magical when your entire life is either being in a meeting or reading an email”.

When this bubble bursts, it’s hard to predict what the fallout will be on the tech industry… for sure it won’t be pretty. It also begs the question: what is this industry going to do next? There’s clearly no plan after generative AI.

https://www.wheresyoured.at/subprimeai/


Elon Musk’s xAI supercomputer stirs turmoil over smog in Memphis

Tags: tech, ai, machine-learning, gpt, politics, ecology

Need to illustrate how much the current AI arm race is an ecological and social problem? Here is a very pathological case. This is what you get when you let the tycoons behind this completely unchecked.

https://www.npr.org/2024/09/11/nx-s1-5088134/elon-musk-ai-xai-supercomputer-memphis-pollution


Larry Ellison’s AI-Powered Surveillance Dystopia Is Already Here

Tags: tech, oracle, surveillance

People are gasping in horror with Larry Ellison’s latest claims… but really they should realize he’s not dreaming big. All of that is already here in one form or another. Maybe it was time to protest years ago?

https://www.404media.co/larry-ellisons-ai-powered-surveillance-dystopia-is-already-here/


Turning Everyday Gadgets into Bombs is a Bad Idea

Tags: tech, security, war, battery

Or why we should all be concerned and condemn the latest pager and walkie-talkie attacks. They clearly opened a Pandora’s box, it’d be surprising not to see more of those from various organizations. The funds and efforts required make it affordable enough.

https://www.bunniestudios.com/blog/2024/turning-everyday-gadgets-into-bombs-is-a-bad-idea/


Oracle, it’s time to free JavaScript™

Tags: tech, javascript, trademark, law, oracle

This is a good initiative. It makes no sense for Oracle to still cling onto JavaScript has a trademark.

https://javascript.tm/


Safe C++

Tags: tech, c++

Interesting proposal for a superset of C++ bringing a safe subset. Could it be a way to improve C++ use for the coming decade?

https://safecpp.org/


Threads, asynchronous IO, and cancellation

Tags: tech, asynchronous, multithreading, io

Or why going through an event loop might be more work initially but will make some things easier longer term. Nice way to frame how threads are bringing some opaque state.

https://utcc.utoronto.ca/~cks/space/blog/tech/ThreadsAsyncIOAndCancellation


Real-time Linux is officially part of the kernel after decades of debate

Tags: tech, linux, kernel, realtime

Definitely good news if you have to maintain a real-time Linux system for industrial use. No more patches to carry over.

https://arstechnica.com/gadgets/2024/09/real-time-linux-is-officially-part-of-the-kernel-after-decades-of-debate/


Writing an OS in Rust

Tags: tech, kernel, rust

An interesting endeavor to create you own OS using another language than one of the usual ones.

https://os.phil-opp.com/


Backup strategies for SQLite in production

Tags: tech, databases, sqlite, backup

Wish to use SQLite in production? You better have a good backup strategy. This article explains the main available options.

https://oldmoe.blog/2024/04/30/backup-strategies-for-sqlite-in-production/


6 Techniques I Use to Create a Great User Experience for Shell Scripts

Tags: tech, shell, scripting

Shell scripts deserve to be well designed like this indeed.

https://nochlin.com/blog/6-techniques-i-use-to-create-a-great-user-experience-for-shell-scripts


DirectX Adopting SPIR-V as the Interchange Format of the Future

Tags: tech, shader, vulkan, directx

This is good news. DirectX being the other big graphics API if it adopts SPIR-V as interchange format it’ll open the way to more shader reuses.

https://devblogs.microsoft.com/directx/directx-adopting-spir-v/


PixiJS | The HTML5 Creation Engine

Tags: tech, web, frontend, webgpu, 2d, graphics

Looks like an interesting tool to have in the box for 2D effects on the web.

https://pixijs.com/


Good forms

Tags: tech, gui, html, web, frontend, complexity

A good list to design HTML forms. The bar is indeed high and there’s value in simplicity.

https://daverupert.com/2024/09/good-forms/


The Undeniable Utility Of CSS :has

Tags: tech, web, frontend, css

This is indeed an interesting new CSS selector. Opens the door to doing more in a declarative way and with less Javascript.

https://www.joshwcomeau.com/css/has/


Goodhart’s Law in Software Engineering

Tags: tech, management, metrics

We should definitely be more wary of metrics indeed. They help for a while, but at some point you’ll necessarily get unfortunately burnt by them. The only fallback is “good judgement”… do what you can with this.

https://buttondown.com/hillelwayne/archive/goodharts-law-in-software-engineering/


Building Aggressively Helpful Teams

Tags: tech, team, management

Nice tricks to help the team jell. I should try this more.

https://brittonbroderick.com/2024/08/18/building-aggressively-helpful-teams/


Bye for now!

Categories: FLOSS Project Planets

Akademy 2024 (in Würzburg!)

Planet KDE - Fri, 2024-09-20 05:00

This year Akademy was in Würzburg (shock! horror!). I think its not too far fetched to say that we pulled it successfully.

How it came to be

During last years Akademy in Thessaloniki the idea came up given the high density of KDE people in the area to hold Akademy in Würzburg. On top we had the perfect venue in mind: two lecture halls for talks, BoF rooms and a ample common area for people to hack and socialize. I had such thoughts in the back of my mind for a while but did not share or go through with them.

However what Akademy does is it makes you talk to people and Tobias convinced it me that we should do it and more people even agreed that Akademy in Würzburg would be a good idea! (Of course some of these encouragers this does not mean work organizing Akademy like it would mean for us…) So on the second-to-last day of Akademy I sat down and wrote an email to my former university professor to enquire about the possibility of doing Akademy in the building that we envisioned.

And just like that, after some informal talks during Akademy and a meeting at the University and sending in a proposal after the Call for Hosts later you end up having weekly meetings to talk about and plan the next installment of Akademy.

How it did go

There were some problems but I think all in all this year Akademy went very well. I heard so much praise and positive feedback it felt a bit surreal at first (as did Sunday evening when all the talks were done). The most stressful situation for me happened on Sunday afternoon when I retrieved my charging phone from the team room to discover that the social event could not go ahead as planned and trying to manage that with the rest of the team. I think the resulting evening was very nice and chill and you could feel my relieve. I heard there was even an afterparty afterwards at the bar where we had held the welcome event.

Of course during the event there are always minor issues that need dealing with and looking out for those, dealing with them and trying to make sure everything is running smoothly (and the stress coming these things) meant that I couldn’t (and was not in the right state of mind) to focus much on the talks. When the BoF days started this was bit better but only on Thursday after the day trip I felt in a ‘conference mood’ and was able to focus fully on the BoFs that I was attending and sit down and hack a bit. If you want to learn more about the actual conference many people have blogged about it on the planet or read the report on the Akademy website.

In the end I think it’s fair to say that Akademy was a success. Made possible by KDE e.V. (go donate!), the sponsors, all the people of the Akademy team local and non-local, all the Volunteers on short and on long notice an last but not least all the awesome attendees - there would be no Akademy without any people. Thank you to you all! I am excited to learn about where next years Akademy will be (you can do it as well - it’s not hard and you get an awesome award on top) and looking forward to attending it as ‘normal Attendee’ and meeting everyone again there (if not earlier).

Categories: FLOSS Project Planets

Talk Python to Me: #477: Awesome Text Tricks with NLP and spaCy

Planet Python - Fri, 2024-09-20 04:00
Do you have text that you want to process automatically? Maybe you want to pull out key products or topics of conversation? Maybe you want to get the sentiment? The possibilities are many with this week's topic: NLP with spaCy and Python. Our guest, Vincent D. Warmerdam, has worked on spaCy and other tools at Explosion AI and he's here to give us his tips and tricks for working with text from Python.<br/> <br/> <strong>Episode sponsors</strong><br/> <br/> <a href='https://talkpython.fm/posit'>Posit</a><br> <a href='https://talkpython.fm/training'>Talk Python Courses</a><br/> <br/> <strong>Links from the show</strong><br/> <br/> <div><b>Course: Getting Started with NLP and spaCy</b>: <a href="https://training.talkpython.fm/courses/getting-started-with-spacy" target="_blank" >talkpython.fm</a><br/> <br/> <b>Vincent on X</b>: <a href="https://x.com/fishnets88?featured_on=talkpython" target="_blank" >@fishnets88</a><br/> <b>Vincent on Mastodon</b>: <a href="https://fosstodon.org/@koaning" target="_blank" >@koaning</a><br/> <br/> <b>Programmable Keyboards on CalmCode</b>: <a href="https://www.youtube.com/playlist?list=PLGj5nRqy15j93TD0iReqfLL9lU1lZFEs6" target="_blank" >youtube.com</a><br/> <b>Sample Space Podcast</b>: <a href="https://www.youtube.com/playlist?list=PLSIzlWDI17bRULf7X_55ab7THqA9TJPxd" target="_blank" >youtube.com</a><br/> <br/> <b>spaCy</b>: <a href="https://spacy.io?featured_on=talkpython" target="_blank" >spacy.io</a><br/> <b>Course: Build An Audio AI App</b>: <a href="https://training.talkpython.fm/courses/build-an-audio-ai-app-with-python-and-assemblyai" target="_blank" >talkpython.fm</a><br/> <b>Lemma example</b>: <a href="https://github.com/talkpython/audio-ai-with-assemblyai-course/blob/d5418ec26b4e8d9e524f2dedfc5c93c6edc817b8/code/02_feature_search/src/services/search_service.py#L80" target="_blank" >github.com</a><br/> <b>Code for spaCy course</b>: <a href="https://github.com/talkpython/nlp-with-python-and-spacy-course" target="_blank" >github.com</a><br/> <br/> <b>Python Bytes transcripts</b>: <a href="https://github.com/mikeckennedy/python_bytes_show_notes?featured_on=talkpython" target="_blank" >github.com</a><br/> <b>scikit-lego</b>: <a href="https://github.com/koaning/scikit-lego?featured_on=talkpython" target="_blank" >github.com</a><br/> <b>Projects that import "this"</b>: <a href="https://calmcode.io/til/python-import-this?featured_on=talkpython" target="_blank" >calmcode.io</a><br/> <b>Watch this episode on YouTube</b>: <a href="https://www.youtube.com/watch?v=AE7FPjaVNN0" target="_blank" >youtube.com</a><br/> <b>Episode transcripts</b>: <a href="https://talkpython.fm/episodes/transcript/477/awesome-text-tricks-with-nlp-and-spacy" target="_blank" >talkpython.fm</a><br/> <br/> <b>--- Stay in touch with us ---</b><br/> <b>Subscribe to us on YouTube</b>: <a href="https://talkpython.fm/youtube" target="_blank" >youtube.com</a><br/> <b>Follow Talk Python on Mastodon</b>: <a href="https://fosstodon.org/web/@talkpython" target="_blank" ><i class="fa-brands fa-mastodon"></i>talkpython</a><br/> <b>Follow Michael on Mastodon</b>: <a href="https://fosstodon.org/web/@mkennedy" target="_blank" ><i class="fa-brands fa-mastodon"></i>mkennedy</a><br/></div>
Categories: FLOSS Project Planets

eiriksm.dev: - I think I said “wait that’s all?” out loud!

Planet Drupal - Fri, 2024-09-20 02:19

From time to time I get some really good and motivating feedback on the product I have built, violinist.io. And I want to start this post, which will also have a huge feature announcement, by mentioning a couple of them:

It was wonderfully painless (...) I don’t think I’ve ever experienced a faster setup of a CI tool — I think I said “wait that’s all?” out loud!

overall did the trick of what I was looking for and was very very fast

In other words, easy to set up, and fast results.

Well, today I want to share another product update that will make it theoretically even faster to set up and get results. But first allow me to provide a bit of background on why this new feature came to be.

One recurring question we get is regarding two avenues of a similar aspect:

  • Does our code have to make its way to violinist.io infrastructure for us to be able to use the service?
  • Can violinist.io access our Self Hosted GitLab which is locked down with a required VPN connection

They might differ a bit in wording or actual focus, but they usually boil down to one of these. And from time to time we find a compromise to both of these questions together, and the person contacting us turn into happy customers. But from time to time these questions also become the actual blocker for them to start using violinist.io. But now, at least, we have an alternative that covers both of these use cases: Self hosting violinist.io runners!

And here is why I am mentioning the feedback regarding quick onboarding in the context of this product announcement. You can literally start an update check with one docker command:

docker run \   --pull=always \   -e "LICENCE_KEY=my_key" \   -e "PROJECT_URL=https://github.com/user/repo" \   -e "USER_TOKEN=ghp_jYgGb_1npvkiHTdnM" \   ghcr.io/violinist-dev/update-check-runner:8.3-multi-composer-2

This will run the same update job as if it were running on violinist.io, only using your own computer!

Of course, this in itself is not super useful. Avoiding running commands on your computer is the whole point of using an automatic update service like violinist.io, but now you can do cool stuff like:

  • Run the same update jobs as violinist.io without any code entering any third party infrastructure
  • Run jobs your CI infrastructure of choice. GitHub Actions, CircleCI, Bitbucket pipelines, Self Hosted GitLab, your totally locked down VPN protected GitLab instance that has a totally locked down Jenkins server. And so on
  • Decide your own intervals for running them, probably inside said CI infrastructure. Daily jobs? Weekly jobs? Hourly jobs? Not a problem.
  • Compose CI workflows that can do all your repositories in a matrix, all on the same schedule, if useful?
  • Expose a webhook to trigger jobs, and run them when new items appear in the Drupal Security Announcements  RSS feed
  • And a million other things probably? You decide!

If this sounds useful to you, or your organization, please don’t hesitate to reach out for a free trial. In fact, in the name of smooth onboarding, here is an absolutely free trial for you already without reaching out (as long as you are reading and using this within 2 months of this blog post): A totally free license key, valid for all repositories for 2 months from today (valid until 2024-11-19T19:20:19+01:00):

hc1NTsMwFATgHqXyCZ6f_-2V_RyvkLgAmzRYYLVqqiaqQKhSz9CrIA6T2xAWbGE50nwz9-Vr-Xz0Ajy7tPHQjm2anx7aUI9Dpdc67H8D8-g_Ji-sZ5u_m5v6dmrndxaa50YgSJDchZXq_-lzP_cs9J7_fPEV1HZu-2m7O4wv29M4zSzsPA_X63LLWKJb1w1SykKlRJgFoXRFR25AaZ6M48KgJKFJAdlM0kEU2bpkEYwCRBdBpqisQuIl6yyh01QKqNIZqUvXRcgchS0KxWohko3JKecgfQM

Now all you need is a repository and a PAT (Personal Access Token), and you are off on your new automatic update adventure. For a bit more documentation than this sparse promotional blog post, please visit the container repository.

Lastly, there are so much more I want to share and address about this. For example the aspect of open source in all of this, the differences between this and violinist.io (the SaaS), the licencing and pricing aspect. But those are all blog posts on their own. For now, I hope you will try it out if it’s useful, and that you want to connect should you have any questions or concerns. Here in the comments, or by reaching out.

Let’s close up this blog post with an animated gif of "runners".
 

Categories: FLOSS Project Planets

Wingware: Wing Python IDE Version 10.0.6 - September 20, 2024

Planet Python - Thu, 2024-09-19 21:00

Wing 10.0.6 adds support for Python 3.13 and fixes some issues with AI development, code refactoring, and unit testing with pytest.

See the change log for details.

Download Wing 10 Now: Wing Pro | Wing Personal | Wing 101 | Compare Products


What's New in Wing 10

AI Assisted Development

Wing Pro 10 takes advantage of recent advances in the capabilities of generative AI to provide powerful AI assisted development, including AI code suggestion, AI driven code refactoring, description-driven development, and AI chat. You can ask Wing to use AI to (1) implement missing code at the current input position, (2) refactor, enhance, or extend existing code by describing the changes that you want to make, (3) write new code from a description of its functionality and design, or (4) chat in order to work through understanding and making changes to code.

Examples of requests you can make include:

"Add a docstring to this method" "Create unit tests for class SearchEngine" "Add a phone number field to the Person class" "Clean up this code" "Convert this into a Python generator" "Create an RPC server that exposes all the public methods in class BuildingManager" "Change this method to wait asynchronously for data and return the result with a callback" "Rewrite this threaded code to instead run asynchronously"

Yes, really!

Your role changes to one of directing an intelligent assistant capable of completing a wide range of programming tasks in relatively short periods of time. Instead of typing out code by hand every step of the way, you are essentially directing someone else to work through the details of manageable steps in the software development process.

Read More

Support for Python 3.12, 3.13, and ARM64 Linux

Wing 10 adds support for Python 3.12 and 3.13, including (1) faster debugging with PEP 669 low impact monitoring API, (2) PEP 695 parameterized classes, functions and methods, (3) PEP 695 type statements, and (4) PEP 701 style f-strings.

Wing 10 also adds support for running Wing on ARM64 Linux systems.

Poetry Package Management

Wing Pro 10 adds support for Poetry package management in the New Project dialog and the Packages tool in the Tools menu. Poetry is an easy-to-use cross-platform dependency and package manager for Python, similar to pipenv.

Ruff Code Warnings & Reformatting

Wing Pro 10 adds support for Ruff as an external code checker in the Code Warnings tool, accessed from the Tools menu. Ruff can also be used as a code reformatter in the Source > Reformatting menu group. Ruff is an incredibly fast Python code checker that can replace or supplement flake8, pylint, pep8, and mypy.


Try Wing 10 Now!

Wing 10 is a ground-breaking new release in Wingware's Python IDE product line. Find out how Wing 10 can turbocharge your Python development by trying it today.

Downloads: Wing Pro | Wing Personal | Wing 101 | Compare Products

See Upgrading for details on upgrading from Wing 9 and earlier, and Migrating from Older Versions for a list of compatibility notes.

Categories: FLOSS Project Planets

Matt Layman: Docker Go, JS, Static Files - Building SaaS #203

Planet Python - Thu, 2024-09-19 20:00
In this episode, we continued on the cloud migration path. We need to build a Docker container with all the necessary static files. Some of these come from Go via Hugo for a content blog. Some comes from JavaScript via Tailwind for CSS. Some come from Python via Sphinx for documentation. All need to built into the same image. That’s what we covered on this stream.
Categories: FLOSS Project Planets

Release GCompris 4.2

Planet KDE - Thu, 2024-09-19 18:00

Today we are releasing GCompris version 4.2.

It contains bug fixes and graphics improvements on multiple activities.

This version adds translation for Latvian.

It is fully translated in the following languages:

  • Arabic
  • Bulgarian
  • Breton
  • Catalan
  • Catalan (Valencian)
  • Greek
  • UK English
  • Esperanto
  • Spanish
  • Basque
  • French
  • Galician
  • Croatian
  • Hungarian
  • Indonesian
  • Italian
  • Lithuanian
  • Latvian
  • Malayalam
  • Dutch
  • Norwegian Nynorsk
  • Polish
  • Brazilian Portuguese
  • Romanian
  • Russian
  • Slovenian
  • Albanian
  • Swedish
  • Swahili
  • Turkish
  • Ukrainian

It is also partially translated in the following languages:

  • Azerbaijani (97%)
  • Belarusian (87%)
  • Czech (97%)
  • German (96%)
  • Estonian (96%)
  • Finnish (95%)
  • Hebrew (96%)
  • Macedonian (90%)
  • Portuguese (96%)
  • Slovak (84%)
  • Chinese Traditional (96%)

You can find packages of this new version for GNU/Linux, Windows, Android, Raspberry Pi and macOS on the download page. This update will also be available soon in the Android Play store, the F-Droid repository and the Windows store.

Thank you all,
Timothée & Johnny

Categories: FLOSS Project Planets

Okteta got “Best Application” 2024 Akademy Award

Planet KDE - Thu, 2024-09-19 13:08

The jury of this year’s KDE Akademy Awards, being by tradition representatives of last year’s winners, has selected the hex editor Okteta in the category “Best Application”. Thanks to them for this appreciation, even more for a niche application

Though, appreciation for what, as there are no details? The last new feature was added in 2019, with the 17th patch release since just done. So, for a reliable program with no need to relearn the UI every year and proudly close to zero open actual bug reports? Then the port to Qt6/KF6, while started in 2022, might be only completed in 2025… if ever. So rather, is this an end-of-life award for an aged 16 years old program?

Looking Back

Triggered by the event some reflection on the past development, if only for the author himself, to update the memories how one got here and what it brings for the future. Which turned into a longer text than anticipated

2003-2006: Years of Initial Need for a Widget

The Okteta project was born in 2003 , the known first code traces date back to May 13th, 2003. The first related code was imported into KDE’s code repository in August 15th, 2003, by the commit message:

Initial import of KHexEdit2, featuring a widget, a kpart and an app.
Most important is the widget...
Hopefully it will be usable enough ready for KDE 3.2...

The name “KHexEdit2” was chosen as the project was a re-implementation of KHexEdit, the hex editor part of KDE since KDE 1.1. Because those times I was trying to create a viewer for executables and libraries (project name Binspekt, stalled soon), for which I wanted a widget for displaying bytes. KHexEdit seemed to have no code that could be cleanly ripped out and be reused, so work started to code such a widget from scratch and respectively also consumers of it, to add more reason & motivation.

The formal request on September 29th that year to then move the project’s code from the “kdenonbeta” area into release-covered areas had this optimistic line in it:

Finally there will be an app, build around the ReadWritePart. In 2004.

Turned out, life did not agree to that plan, thus 2004 passed without any such app. And so did 2005.

Still, back in February 2004 as part of KDE 3.2 the first elements of the still named KHexEdit2 project made a first release. Though with a bit of complexity. ((Note that at this time KDE was still also the name of the released bundle product, composed of the so called modules kdelibs, kdebase, kdeutils, etc. Where kdelibs held all the public libraries, kdebase the basic desktop components, and so on.)) As adding a complete implementation of a hexeditor widget to the official kdelibs for just a few potential users was declared unbalanced, instead just some header files with KHE namespaced abstract interface classes were added, with an inline utility method to dynamically load any plugin implementing them. So not increasing the actual runtime and installation size of kdelibs. And the kdeutils module got to provide such a plugin by the name KBytesEdit. This then was implemented by the hex editor widget library from the KHexEdit2 project, also in the kdeutils module, whose own API and headers were kept private. To confuse everyone this library was yet named libkhexedit, even if the actual KHexEdit program did not use it. Because the spirit of the naming was on the level of widgets and classes, not program names, and there the 2 postfix made no sense. Consumers of this construct became at least the utility app for Palm Pilot handhelds KPilot and the debugger plugin of the IDE KDevelop.

In March 2005 KDE 3.4 then included the KHexEdit2 KPart (read-only) as well, also located in the kdeutils module and also implemented using the private libkhexedit library. Making some people unhappy as it registered (like its Okteta successor still does today) as handler for the MIME type application/octet-stream, so popping up as fallback KPart where no other handler was found,. And seeing raw bytes over nothing has been partially perceived as “broken”

2006-2008: Second Try on a Program, as Sample and with new Dedicated Name

2006 arrived, and with the author there was still some ever developing curiosity about the feasibility of writing viewer & editor programs using some higher-level reusable & exchangeable components. Now a byte array is a pretty simple data structure to use for a sample implementation of such a concept. And here we had a widget implementation for byte array viewing and editing fully under our control. And a certain level of skills with C++ acquired at the time. This was just too tempting to not give it an own try, for fun and experience. So a June blog post “Fun with KHexEdit” also mentioned a program again and introduced the name designed meanwhile:

I am tackling the construction of a successor to KHexEdit again, projectname “Okteta”.

Later that year a first visual snippet was shared, showing how KHexEdit’s UI served as initial template for Okteta, also to potentially help the transfer of existing users:

November 2006: first published screenshot of Okteta in some pre-Alpha state

To avoid duplicated efforts and to increase pressure to deliver, two weeks later on November 27th an optimistic email was sent to KDE’s great eternal to-newer-API porting worker Laurent Montel, as it was the time to port KDE software to Qt4/KDELibs4:

Hi Laurent,

please don't spend too much effort at the old program KHexEdit, I am quite far
on the way to write a successor, called Okteta. Concerning feature
compatibility, so far I implemented around 60 % of the features of KHexEdit,
and hope to do the last 40 % until at least January. Yes, no code yet in SVN
(besides the library), but that will change in three weeks, promised.
[...]

This time life agreed mainly to the plan. Though the promise on the code in SVN was delivered on only with almost a year delay. A first dump of the program code was committed into KDE’s code repository on October 23rd, 2007, by the commit message:

Uploading the Okteta program into KDE's playground, so the code isn't lost, after growing slowly only on my hdd for more than a year. Okteta is a planned successor to KHexEdit, yet misses all of it's functionality. With it's modular architecture, based on the co-developed lib kakao, it should soon offer more than would could be done with the monolithic KHexEdit. I hope.

As can be read, this first copy also was featuring a first draft for the own before mentioned higher-level component system, initially named “Kakao”, later in 2009 to be renamed to “Kasten”. That first name was made ad-hoc inspired by a drink on the table (in learned safe distance from the keyboard), to soon find it used similarly by some certain bigger IT company, even for a somehow related subject, thus a new name was by the time designed less ad-hoc.

And so some months later in April 2008 Okteta entered the “kdereview” phase, to proceed after two weeks into the kdeutils module. In time for KDE 4.1, so premiering its release as part of that in July 2008. Okteta here also took over to provide the KBytesEdit plugin for the kdelibs KHE interfaces as well as the KPart, which before had resided in subdirectories of the KHexEdit program sources. KHexEdit itself stayed unported to Qt4/KDELibs4, so Okteta as planned did not run into duplicated efforts and rivalry (or, it avoided competition, for good and bad).

July 2008: Okteta’s first release, as version 0.1, part of KDE 4.1 2008-2012: Years of Features Flow

With the foundations laid and releases established as part of KDE releases, the next years saw a number of features added, initially even each KDE version:

January 2009: new features in Okteta 0.2, part of KDE 4.2 August 2009: new features in Okteta 0.3, part of KDE 4.3 February 2010: new features in Okteta 0.4, part of KDE SC 4.4 July 2011: new features in Okteta 0.7, part of KDE Applications 4.7 August 2012: new features in Okteta 0.9, part of KDE Applications 4.9 2010-Present: Sharing Functionality in Rich Public Libraries

From the very begin of the project on the byte array viewing & editing feature was embeddable by 3rd-party software using the abstract KHE interfaces in kdelibs, or then the KPart at least for viewing, Though this allowed only little control & features due to a limited API.

Starting with Okteta 0.4 in February 2010, the two sets of underlying libraries, Kasten and Okteta ones, used to implement the Okteta program, the Okteta KPart and the KHE KBytesEdit plugin, are provided with stable public API.

The lower-Qt-level Okteta GUI library also started to be accompanied by a Qt Designer plugin, to allow easy use of the two provided classes of widgets also in Qt’s UI files.

February 2010: new Okteta widgets plugin for Qt Designer, part of Okteta 0.4

In February 2010 during a week-long Kate-KDevelop development meeting in Berlin the intended flexibility of the new public libraries proofed itself by enabling to create a plugin for KDevelop to integrate hex viewing & edting and all the Okteta tools in just those few days, for some nice satisfaction. The plugin was officially released first with KDevelop 4.1 in October 2010 and later also ported to the Qt5/KF5 version of KDevelop. For the current Qt6/KF6-based version of KDevelop the plugin is excluded from the build for now, given the current lack of a released Qt6/KF6-based version of the Okteta and Kasten libraries.

October 2010: Okteta plugin for KDevelop, first released with KDevelop 4.1 2012-Present: Switching from Features to Architecture, from Bundled to Stand-Alone

The port to Qt5/KF5 happened without any issues and was completed for version 0.15, released as part of KDE Applications 14.12. During the transformation of KDELibs4 to KDE Frameworks 5 the KHE interfaces also got dropped there, due to Okteta meanwhile directly providing public libraries. So this ported version of Okteta also no longer provided the KBytesEdit plugin, but otherwise as before the public libraries and the KPart, next to the program itself.

After KDE Applications 17.12, as for a while there was no feature development and only occasional work on the design of the Okteta & Kasten libraries happened, the Okteta project switched to a stand-alone release schedule. A 0.25 version branch was created and patch version releases only done when there were user-relevant changes like bug fixes or bigger translation improvements.

Then 2019 brought the first version and for now also latest to also provide at least one new feature to users, for which a 0.26 version branch was created. This version has meanwhile got 17 patch releases, with bug fixes, translation improvements and other adjustments. And after 5 years of such polishing is the one which now received the “Best Application” 2024 Akademy Award

March 2019: new features in Okteta 0.26, released on own schedule 2022-Present: Preparing for Qt6 & KF6

Okteta’s code base has been mostly quickly updated to any API deprecations, also as part of a “zero build warnings” strategy. So the approach taken for both Qt & KF libraries to strive for source-backward-compatible C++ API in their both new major version 6 made the initial port of Okteta to Qt6 & KF6(-Alpha) a matter of less than a day in May 2022. Now, only if one ignores one of the tools.

May 2022: preview of Okteta port to Qt6/KF6

The Structures tool, first developed by Alexander Richardson in 2010 for Okteta 0.4, was in 2011 for Okteta 0.7 extended by him to also support dynamic structure definitions, using JavaScript expressions. For this QtScript has been used as engine. Four years later, in July 2015 though Qt 5.5 declared QtScript as deprecated. The officially recommended substitute QJSEngine turned out to not allow the dynamic translation of JavaScript properties and methods as relied on by the Structures tool for the copy-avoiding mapping of the data blob interpretation into the JavaScript scene (beware, for what the author understands so far). So it could not be used as drop-in replacement.

As finding a suitable and more future-proof JavaScript engine or exploring a possible reimplementation using the QJSEngine is a complex task and also needs bigger chunks of time & focussing, it had been postponed. Year after year. And thus now nine years later in 2024 there is still not even a plan. And Qt6 no longer now provides QtScript.

Just dropping the Structures tool is not a real option. It is a great feature, which also got some users. So a plan is needed and work to be done by someone. As of now my own, surely radical idea is to rewrite the whole Structures tool from scratch, still for the Qt5/KF5 version of Okteta. This should lead to fully wrapping the brain around this complex feature, instead of seeing to indirectly explore it by trying to understand all the details of the current elaborated implementation with the risk to misinterpret some intentions. Starting from scratch might also allow to finally share all the code used for the data formats in the separate Decoding Table tool, and perhaps even to introduce a more generic approach on the data formats supported in the main mass display besides currently byte values and 8-bit charset mapping. Idea, Should, Trying, Starting, Might, Perhaps… any words of confidence, please?

For now the initial Qt6/KF6 port is maintained by a single commit containing the complete dump of the “it builds, starts and does not crash on simple usage” changes, in a work branch continuously rebased to the latest current Qt5/KF5-based development branch. This commit would then at the time of the real Qt6/KF6 port be properly split into the different aspects of porting. For now it serves to hold the door open while still on the other side.

Looking Forward, by Looking Back Some More

For sure the initial goal with the Okteta program to do something for fun and experience has been largely achieved The current challenge with the needed replacement of the used JavaScript engine promises more experience, though no fun initially to me at least.

And some feedback as well even a KDE Akademy Award now hint the created and publicly shared program also served other people for their serious and less serious needs. Possibly even some desperate Faust-like persons, “So that they may perceive the bytes which hold their doc[ument] together in its inmost fold.” (and even tweak them for better as needed, owning their world or document). Though no contracts done, and thus no souls here owned.

But as before, Okteta actually is just a sample implementation of the actual interest pursued here, exploring the feasibility of writing programs by higher-level reusable & exchangeable components, ideally also allowing random end users to mesh those components themselves into tailored solutions for certain needs. So if development has stalled as it has, both on the components concept but also the hex editing features, how to increase motivation again to set resources aside, and for which part?

Position in the Hex Editor Solution Space

When it comes to the Free Software solution space for hex editor needs, next to Okteta there is currently coverage starting from simple ones like GHex over wxHexEditor, which serves needs beyond Okteta by support for paged loading of big files and also the working memory, though sadly unmaintained currently, to the newer yet most impressive and very powerful ImHex (by what the web pages show, never tried).

So would people suffer if Okteta is gone for current platforms, at least for a while?

Open Component Systems vs. Closed Monolithic Blobs

Now the author, while being curious, never got around to actually study existing solutions for the concept of higher-level component system or even deploy them in projects, only ever saw some theoretic surfaces. And is fully aware of the own experiment turning into something serious rather being a pipe dream. Even more when after soon two decades the initially created TODO list is not even 10 % done, this won’t work out this single human’ life So the following is more like the wish-wash of a hobbyist bird watcher, while also having some chicken in the backyard to which things are compared. Or alike.

It seems composable systems with complex interfaces are not the dominating species in the Free Software ecosystems. The Linux kernel outpaced any microkernel systems, e.g. Gnu Hurd is yet to be spotted outside OS zoos. The Eclipse Rich Client Platform, whose concepts were one of the original by-headlines inspirations for this project, seems to have maxed out some time ago as well. at least in the mainstream through the author’s bubble space. StarOffice^WOpenOffice^WLibreOffice has the UNO component system, but how many Add-Ons flourish on it? Then GnuStep would have enabled to spread the component concepts of OpenStep, but little has be seen? The later GnuStep-related, indeed thriving for stars impressive project Étoilé seemed to be overloaded with related ideas, but sadly never lift off. Then the GNOME project even had a reference to component systems in its initial full name “GNU Network Object Model Environment”, but its respective Bonobo framework based on CORBA faded away rather soon.

Also KDE started initially with implementations around the idea of components. To quote the KDE 1.0 announcement:

In view of these circumstances the KDE Project has developed a first rate compound document application framework, implementing the latest advances in framework technology and thus positioning itself in direct competition to such popular development frameworks as for example Mircrosoft’s MFC/COM/ActiveX technology. KDE’s KOM/OpenParts compound document technology enables developers to quickly create first rate applications implementing cutting edge technology. KDE’s KOM/OpenParts technology is build upon the open industry standards such as the object request broker standard CORBA 2.0.

KOM/OpenParts was then in KDE 2.0 replaced by KParts. Actually the presence of such technology development was the deciding factor to go for KDE when the author those days got into “Linux” and had to choose between GnuStep, Enlightenment, GNOME and KDE. These days though KDE is run with claims like “All About Apps”. The generic KServices system got destructed for KF6. The possibly latest KPart (a Markdown Viewer) was written years ago by this author, and the once KDE-central KPart-driven program Konqueror is only a shadow of its former self. KOffice & Calligra as component-oriented office suites also died or stalled close to extinction. Generic plugins like the KParts are not even listed on apps.kde.org or elsewhere anymore, also no longer mentioned as concept in KDE Gear release announcements. Similar specific plugins like the Plasma applets, they are also not listed separately, but only as part of the respective, in the example, Plasma products.

Additionally packaging formats like Flatpak or Snap are discussion-less embraced and promoted, which push in the direction of isolated and frozen software programs. Even today does Flatpak’s metadata system appstream. also otherwise discussion-less embraced by KDE, not have a concept of generic plugins, so KParts cannot be described properly.

In such an environment a component system would be limited to predefined fixed component sets in libraries, from which applications would then provide a setup and offer that to users. A bit like being able to shop as consumer at the kitchen equipment store only preset exhibition rooms, instead of meshing up items from different providers into ones’ own tailored meal preparations “app”. Surely it is in the interest of the dominating providers who then will see to bundle only their items, and then add bloat as well as only making half the items good. As consumer I desire to have the choice over pre-made bundles vs. self-assembled ones. Like there are times for All-inclusive vacations and times for self-organized ones. So with current KDE but also the larger current Free Software “desktop” scene as real world development environment working on and thinking about component systems has the author feel at odds.

So maybe the experiment with Kasten as a higher-level component system could also stop here. Perhaps some research instead could be done why such systems failed in comparison. Like, was it due to the inflexibility presented by fixed published interfaces, where on new feature needs implementations cannot simply do temporary shortcuts and adaptions where needed to be quick back to the market? Could it be due to the possible need for more abstract and generic thinking with component systems, where the majority of developers working for the market might prefer to think more concrete and case-by-case? Then, might there still be a middle-ground, where any advantages of high-level component systems are the deciding factor in the competition?

Next Release Scheduled: October 10th, version 0.26.18

As described already for the early stages, there always have been ideas and plans.. and delays… and also doubts… and then things happened. Locally there are lots of notes with ideas, and a number of code drafts and sketches stacked by the years. And at least in the near future it seems there are still time windows, electricity, a laptop, and enough human capabilities to carry on and tinkering over this stack.

The Okteta (& Kasten) project for now is alive, just lurking around in front of the next evolution step to take. Which might find it new ground. Or extinction anyway. And while it is lurking, it gets a tad more feathers polished, by another bug fix release already scheduled for next month

Categories: FLOSS Project Planets

mark.ie: My LocalGov Drupal contributions for week-ending September 20th, 2024

Planet Drupal - Thu, 2024-09-19 12:00

Do one thing, do it well - this week I spent most of my time creating a live preview module for microsites.

Categories: FLOSS Project Planets

Programiz: Getting Started with Python

Planet Python - Thu, 2024-09-19 07:15
In this tutorial, you will learn to write your first Python program.
Categories: FLOSS Project Planets

PyCharm: How to Use FastAPI for Machine Learning

Planet Python - Thu, 2024-09-19 04:59

This is a guest post from Cheuk Ting Ho, a data scientist who contributes to multiple open-source libraries, such as pandas, Polars, and Jupyter Notebook.

FastAPI provides a quick way to build a backend service with Python. With a few decorators, you can turn your Python function into an API application.

It is widely used by many companies including Microsoft, Uber, and Netflix. According to the Python Developers Survey, FastAPI usage has grown from 21% in 2021 to 29% in 2023. For data scientists, it’s the second most popular framework, with 31% using it.

In this blog post, we will cover the basics of FastAPI for data scientists who may want to build a quick prototype for their project. 

What is FastAPI?

FastAPI is a popular web framework for building APIs with Python, based on standard Python type hints. It is intuitive and easy to use, and it can provide a production-ready application in a short period of time. It is fully compatible with OpenAPI and JSON Schema.

Why use FastAPI for machine learning?

Most teams working on machine learning projects consist of data scientists whose domains and professions lie on the statistics side of things. They may not have experience developing software or applications to ship their machine learning projects. FastAPI enables data scientists to easily create APIs for the following projects:

Deploying prediction models

The data science team may have trained a model for the prediction of the sales demand in a warehouse. To make it useful, they have to provide an API interface so other parts of the stock management system can use this new prediction functionality.

Suggestion engines

One of the very common uses of machine learning is as a system that provides suggestions based on the users’ choices. For example, if someone puts certain products in their shopping cart, more items can be suggested to that user. Such an e-commerce system requires an API call to the suggestion engine that takes input parameters.

Dynamic dashboards and reporting systems

Sometimes, reports for data science projects need to be presented as dashboards so users can inspect the results themselves. One possible approach is to have the data model provide an API. Frontend developers can use this API to create applications that allow users to interact with the data.

Advantages of using FastAPI

Compared to other Python web frameworks, FastAPI is simple yet fully functional. Mainly using decorators and type hints, it allows you to build a web application without the complexity of building a whole ORM (object-relational mapping) model and with the flexibility of using any database, including any SQL and NoSQL databases. FastAPI also provides automatic documentation generation, support for additional information and validation for query parameters, and good async support.

Fast development

Creating API calls in FastAPI is as easy as adding decorators in the Python code. Little to no backend experience is needed for anyone who wants to turn a Python function into an application that will respond to API calls.

Fast documentation

FastAPI provides automatic interactive API documentation using Swagger UI, which is an industry standard. No extra effort is required to build clear documentation with API call examples. This creates an advantage for busy data science teams who may not have the energy and expertise to write technical specifications and documentation.

Easy testing

Writing tests is one of the most important steps in software development, but it can also be one of the most tedious, especially when the time of the data science team is valuable. Testing FastAPI is made simple thanks to Starlette and HTTPX. Most of the time no monkey patching is needed and tests are easy to write and understand.

Fast deployment

FastAPI comes with a CLI tool that can bridge development and deployment smoothly. It allows you to switch between development mode and production mode easily. Once development is completed, the code can be easily deployed using a Docker container with images that have Python prebuilt.

How to use FastAPI for a machine learning project

In this example, we will turn a classification prediction model that uses the Nearest Neighbors algorithm to predict the species of various penguins based on their bill and flipper length into a backend application. We will provide an API that takes parameters from the query parameters of a URL and gives back the prediction. This shows how a prototype can be made quickly by any data scientist with no backend development experience.

We will use a simple `KNeighborsClassifier` on the penguin data set as an example. Details of how to build the model will be omitted, but feel free to check out the relevant notebook here. In the following tutorial, we will focus on the usage of FastAPI and explain some fundamental concepts. We will be building a prototype to do so. 

1. Start a FastAPI project with PyCharm

In this blog post, we will be using PyCharm Professional 2024.1. The best way to start using FastAPI is to create a FastAPI project with PyCharm. When you click New Project in PyCharm, you will be presented with a large selection of projects to choose from. Select the FastAPI tab:



From here, you can put in the name of your project and take advantage of other options such as initializing Git and the virtual environment that you want to use.

After doing so, you will see the basic structure of a FastAPI project set up for you.



There is also a `test_main.http` file set up for you to quickly test all the endpoints.


2. Set up environment dependencies

Next, set up our environment dependency with `requirements.txt` by selecting ​​Sync Python Requirements under PyCarm’s Tool menu.



Then you can select the `requirements.txt` file to be used.



You can copy and use this `requirements.txt` file. We will be using pandas and scikit-learn for the machine learning part of the project. Also, add the `penguins.csv` file to your project directory.

3. Set up your machine learning model

Arrange your machine learning code in the `main.py` file. We will start with a script that trains our model:

import pandas as pd from sklearn.model_selection import train_test_split from sklearn import preprocessing from sklearn.neighbors import KNeighborsClassifier from sklearn.pipeline import Pipeline from sklearn.preprocessing import StandardScaler data = pd.read_csv('penguins.csv') data = data.dropna() le = preprocessing.LabelEncoder() X = data[["bill_length_mm", "flipper_length_mm"]] le.fit(data["species"]) y = le.transform(data["species"]) X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0) clf = Pipeline( steps=[("scaler", StandardScaler()), ("knn", KNeighborsClassifier(n_neighbors=11))] ) clf.set_params().fit(X_train, y_train)

We can place the above code after `app = FastAPI()`. All of it will be run when we start the application.



However, there is a better way to run the start-up code we used to set up our model. We will cover that in a later part of the blog post.

4. Request a response

Next we will look at how to add our model to FastAPI functionality. As a first step, we will add a response to the root of the URL and just simply return a message about our model in JSON format. Change the code in `async def root():` from “Hello world” to our message like this:

@app.get("/") async def root(): return { "Name": "Penguins Prediction", "description": "This is a penguins prediction model based on the bill length and flipper length of the bird.", }

Now, test our application. First, we will start our application, which is easy in PyCharm. Just press the arrow button () next to your project name at the top.



If you are using the default settings, your application will run on http://127.0.0.1:8000. You can double-check that by looking at the prompt from the Run window.

Once the process has started, let’s go to `test_main.http` and press the first arrow button () next to `GET`. From the HTTP Client in the Services window, you will see the response message that we put in.



The response JSON file is also saved for future inspection.

5. Request with query parameters

Next, we would like to let users make predictions by providing query parameters in the URL. Let’s add the code below after the `root` function.

@app.get("/predict/") async def predict(bill_length_mm: float = 0.0, flipper_length_mm: float = 0.0): param = { "bill_length_mm": bill_length_mm, "flipper_length_mm": flipper_length_mm } if bill_length_mm <=0.0 or flipper_length_mm <=0.0: return { "parameters": param, "error message": "Invalid input values", } else: result = clf.predict([[bill_length_mm, flipper_length_mm]]) return { "parameters": param, "result": le.inverse_transform(result)[0], }

Here we set the default value of the `bill_length_mm` and `flipper_length_mm` to be 0 if the user didn’t input a value. We also add a check to see if either of the values is 0 and return an error message instead of trying to predict which penguin the input refers to.

If the inputs are not 0, we will use the model to make a prediction and use the encoder to do an inverse transformation to get the label of the predicted target, i.e. the name of the penguin species.

This is not the only way you can verify inputs. You can also consider using Pydantic for input verification.

If you are using the same version of FastAPI as stated in `requirements.txt`, FastAPI automatically refreshes the service and applies changes on save. Now put in a new URL in `test_main.http` to test (separated from the URL before with ###):

### GET http://127.0.0.1:8000/predict/?bill_length_mm=40.3&flipper_length_mm=195 Accept: application/json

Press the arrow button () next to our new URL and see the output.



Next you can try a URL with one or both of the parameters removed to see the error message:

### GET http://127.0.0.1:8000/predict/?bill_length_mm=40.3 Accept: application/json


6. Set up a machine learning model with lifespan events

Last, let’s look at how we can set up our model with FastAPI lifespan events. The advantage of doing that is we can make sure no request will be accepted while the model is still being set up and the memory used will be cleaned up afterward. To do that, we will use an `asynccontextmanager`. Before `app = FastAPI()` we will add:

from contextlib import asynccontextmanager ml_models = {} @asynccontextmanager async def lifespan(app: FastAPI): # Set up the ML model here yield # Clean up the models and release resources ml_models.clear()

Now we will move the import of pandas and scikit-learn to be alongside the other imports. We will also move our setup code inside the `lifespan` function, setting the machine learning model and LabelEncoder inside `ml_models` like this:

from fastapi import FastAPI from contextlib import asynccontextmanager import pandas as pd from sklearn.model_selection import train_test_split from sklearn import preprocessing from sklearn.neighbors import KNeighborsClassifier from sklearn.pipeline import Pipeline from sklearn.preprocessing import StandardScaler ml_models = {} @asynccontextmanager async def lifespan(app: FastAPI): # Set up the ML model here data = pd.read_csv('penguins.csv') data = data.dropna() le = preprocessing.LabelEncoder() X = data[["bill_length_mm", "flipper_length_mm"]] le.fit(data["species"]) y = le.transform(data["species"]) X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0) clf = Pipeline( steps=[("scaler", StandardScaler()), ("knn", KNeighborsClassifier(n_neighbors=11))] ) clf.set_params().fit(X_train, y_train) ml_models["clf"] = clf ml_models["le"] = le yield # Clean up the models and release resources ml_models.clear()

After that we will add the `lifespan=lifespan` parameter in `app = FastAPI()`:

app = FastAPI(lifespan=lifespan)

Now save and test again. Everything should work and we should see the same result as before.

Afterthought: When to train the model?

From our example, you may wonder when the model is trained. Since `clf` is trained at the beginning, i.e. when the service is launched, you may wonder why we do not train the model every time someone makes a prediction.

We do not want the model to be trained every time someone makes a call, because it costs way more resources to re-train everything. Additionally, it may cause race conditions since our FastAPI application is working concurrently. This is especially the case if we use live data that changes all the time.

Technically, we can set up an API to collect data and re-train the model (which we will demonstrate in the next example). Other options would be to schedule a re-train at a certain time when a certain amount of new data has been collected or to let a super user upload new data and trigger the re-training.

So far, we are aiming to build a prototype that runs locally. Check out this article on deploying a FastAPI project on a cloud service for more information.

What is concurrency?

To put it simply, concurrency is like when you are cooking in the kitchen, and while waiting for the water to boil, you go ahead and chop the vegetables. Since, in the web service world, the server is talking to many terminals, and the communication between the server and the terminals is slower than most internal applications, so the server will not talk to and serve the terminals one by one. Instead, it will talk to and serve many of them at the same time while fulfilling their requests. You may want to check out this explanation in the FastAPI documentation.

In Python, this is achieved by using async code. In our FastAPI code, the use of `async def` instead of `def` is obvious evidence that FastAPI is working concurrently. There are other keywords used in Python async code, like `await` and `asyncio.get_event_loop`, but we won’t be able to cover them in this blog post. 

How to use FastAPI for an image classification project

To discover more FastAPI functionality, we will add an image classification model based on the MNIST example in Keras to our application as well (we are using the TensorFlow backend). If you installed the `requirements.txt` provided, you should have Keras and Pillow installed for image processing and building a convolutional neural network (CNN).

1. Refactoring

Before we start, let’s refactor our code. To make the code more organized, we will put the model setup for the penguins prediction in a function:

def penguins_pipeline(): data = pd.read_csv('penguins.csv') data = data.dropna() le = preprocessing.LabelEncoder() X = data[["bill_length_mm", "flipper_length_mm"]] le.fit(data["species"]) y = le.transform(data["species"]) X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0) clf = Pipeline( steps=[("scaler", StandardScaler()), ("knn", KNeighborsClassifier(n_neighbors=11))] ) clf.set_params().fit(X_train, y_train) return clf, le

Then we rewrite the lifespan function. With full-line code completion in PyCharm, it is very easy:

2. Set up a CNN model for MNIST prediction

In similar fashion as the penguin prediction model, we create a function for MNIST prediction (and we will store the meta parameters globally):

# MNIST model meta parameters num_classes = 10 input_shape = (28, 28, 1) batch_size = 128 epochs = 15 def mnist_pipeline(): # Load the data and split it between train and test sets (x_train, y_train), _ = keras.datasets.mnist.load_data() # Scale images to the [0, 1] range x_train = x_train.astype("float32") / 255 # Make sure images have shape (28, 28, 1) x_train = np.expand_dims(x_train, -1) # convert class vectors to binary class matrices y_train = keras.utils.to_categorical(y_train, num_classes) model = keras.Sequential( [ keras.Input(shape=input_shape), layers.Conv2D(32, kernel_size=(3, 3), activation="relu"), layers.MaxPooling2D(pool_size=(2, 2)), layers.Conv2D(64, kernel_size=(3, 3), activation="relu"), layers.MaxPooling2D(pool_size=(2, 2)), layers.Flatten(), layers.Dropout(0.5), layers.Dense(num_classes, activation="softmax"), ] ) model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"]) model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1) return model

Then add the model setup in the lifespan function:

ml_models["cnn"] = mnist_pipeline()

Note that since this is added, every time you make changes to `main.py` and save, the model will be trained again. It can take a bit of time. So in development you may want to use a dummy model that requires no training time at all or a pre-trained model instead. After training, the CNN model will be ready to go.

3. Set up a POST endpoint for uploading an image file for prediction

To set up an endpoint that takes an upload file, we have to use UploadFile in FastAPI:

@app.post("/predict-image/") async def predicct_upload_file(file: UploadFile): img = await file.read() # process image for prediction img = Image.open(BytesIO(img)).convert('L') img = np.array(img).astype("float32") / 255 img = np.expand_dims(img, (0, -1)) # predict the result result = ml_models["cnn"].predict(img).argmax(axis=-1)[0] return {"filename": file.filename, "result": str(result)}

Please note that this is a POST endpoint (so far we have only set up GET endpoints).

Don’t forget to import `UploadFile` from `fastapi`:

from fastapi import FastAPI, UploadFile

And `Image` from Pillow. We are also using `BytesIO` from the `io` module:

from PIL import Image from io import BytesIO

To test this using the PyCharm HTTP Client with a test image file, we will make use of the `multipart/form-data` encoding. You can check out the HTTP request syntax here. This is what you will put in the `test_in.http` file:

### POST http://127.0.0.1:8000/predict-image/ HTTP/1.1 Content-Type: multipart/form-data; boundary=boundary --boundary Content-Disposition: form-data; name="file"; filename="test_img0.png" < ./test_img0.png --boundary– 4. Add an API to collect data and trigger retraining

Now, here comes the retraining. We set up a POST endpoint like above to accept a zip file which contains training images and labels. The zip file will then be processed and the training data will be prepared. After that we will fit the CNN model again:

@app.post("/upload-images/") async def retrain_upload_file(file: UploadFile): img_files = [] labels_file = None train_img = None with ZipFile(BytesIO(await file.read()), 'r') as zfile: for fname in zfile.namelist(): if fname[-4:] == '.txt' and fname[:2] != '__': labels_file = fname elif fname[-4:] == '.png': img_files.append(fname) if len(img_files) == 0: return {"error": "No training images (png files) found."} else: for fname in sorted(img_files): with zfile.open(fname) as img_file: img = img_file.read() # process image img = Image.open(BytesIO(img)).convert('L') img = np.array(img).astype("float32") / 255 img = np.expand_dims(img, (0, -1)) if train_img is None: train_img = img else: train_img = np.vstack((train_img, img)) if labels_file is None: return {"error": "No training labels file (txt file) found."} else: with zfile.open(labels_file) as labels: labels_data = labels.read() labels_data = labels_data.decode("utf-8").split() labels_data = np.array(labels_data).astype("int") labels_data = keras.utils.to_categorical(labels_data, num_classes) # retrain model ml_models["cnn"].fit(train_img, labels_data, batch_size=batch_size, epochs=epochs, validation_split=0.1) return {"message": "Model trained successfully."}

Remember to import `ZipFile`:

from zipfile import ZipFile

If we now try the endpoint with this zip file of 1000 retraining images and labels, you will see that it takes a moment for the response to come, as the training is taking a while:

POST http://127.0.0.1:8000/upload-images/ HTTP/1.1 Content-Type: multipart/form-data; boundary=boundary --boundary Content-Disposition: form-data; name="file"; filename="training_data.zip" < ./retrain_img.zip --boundary--

Imagine the zip files contain more training data or you’re retraining a more complicated model. The user would then have to wait for a long time and it would seem like things are not working for them.

5. Retrain the model with BackgroundTasks

A better way to handle retraining is, after receiving the training data, we process it and check if the data is in the right format, then give a response saying that the retraining has restarted and train the model in `BackgroundTasks`. Here is how to do it. First, we will add `BackgroundTasks` to our `upload-images` endpoint:

@app.post("/upload-image/") async def retrain_upload_file(file: UploadFile, background_tasks: BackgroundTasks): ...

Remember to import it from `fastapi`:

from fastapi import FastAPI, UploadFile, BackgroundTasks

Then, we will put the fitting of the model into the `background_tasks`:

# retrain model background_tasks.add_task( ml_models["cnn"].fit, train_img, labels_data, batch_size=batch_size, epochs=epochs, validation_split=0.1 )

Also, we will update the message in the response:

return {"message": "Data received successfully, model training has started."}

Now test the endpoint again. You will see that the response has arrived much quicker, and if you look at the Run window, you’ll see that the training is running after the response has arrived.

At this point, more functionality can be added, for example, an option to notify the user later (e.g. via email) when the training is finished or track the training progress in a dashboard when a full application is built.

Develop ML FastAPI applications with PyCharm

FastAPI provides an easy way to convert your data science project into a working application in several easy steps. It is perfect for data science teams that want to provide an application prototype for their machine learning model which can be further developed into a professional web application if needed. 

PyCharm Professional is the Python IDE that allows you to develop FastAPI applications more easily with a preconfigured project for FastAPI, coding assistance, tailored run/debug configurations, and the Endpoints tool window for managing API endpoints efficiently.

Get a free trial of PyCharm Professional

In this blog post, we showed the process of providing a simple API for a pre-trained prediction model. To learn more about FastAPI, I would suggest checking out the official FastAPI documentation. If you’re choosing between different frameworks, explore how FastAPI differs from Django.

About the author Cheuk Ting Ho

Cheuk has been a Data Scientist at various companies – a job that demands high numerical and programming skills, especially in Python. Following her passion for the tech community, Cheuk has been a Developer Advocate for three years. She also contributes to multiple open-source libraries like Hypothesis, Pytest, pandas, Polars, PyO3, Jupyter Notebook, and Django. Cheuk is currently a consultant and trainer at CMD Limes.

Categories: FLOSS Project Planets

Implementing an Audio Mixer, Part 2

Planet KDE - Thu, 2024-09-19 03:15
Recap

In Part 1, we covered PCM audio and superimposing waveforms, and developed an algorithm to combine an arbitrary number of audio streams into one.

Now we need to use these ideas to finish a full implementation using Qt Multimedia.

Using Qt Multimedia for Audio Device Access

So what do we need? Well, we want to use a single QAudioOutput, which we pass an audio device and a supported audio format.

We can get those like this:

const QAudioDeviceInfo &device = QAudioDeviceInfo::defaultOutputDevice(); const QAudioFormat &format = device.preferredFormat();

Let’s construct our QAudioOutput object using the device and format:

static QAudioOutput audioOutput(device, format);

Now, to use it to write data, we have to call start on the audio output, passing in a QIODevice *.

Normally we would use the QIODevice subclass QBuffer for a single audio buffer. But in this case, we want our own subclass of QIODevice, so we can combine multiple buffers into one IO device.

We’ll call our subclass MixerStream. This is where we will do our bufferCombine, and keep our member list of streams mStreams.

We will also need another stream type for mStreams. For now let’s just call it DecoderStream, forward declare it, and worry about its implementation later.

One thing that’s good to know at this point is DecoderStream objects will get the data buffers we need by decoding audio data from a file. Because of this, we’ll need to keep our audio format from above to as a data member mFormat. Then we can pass it to decoders when they need it.

Implementing MixerStream

Since we are subclassing QIODevice, we need to provide reimplementations for these two protected virtual functions:

virtual qint64 QIODevice::readData(char *data, qint64 maxSize); virtual qint64 QIODevice::writeData(const char *data, qint64 maxSize);

We also want to provide a way to open new streams that we’ll add to mStreams, given a filename. We’ll call this function openStream. We can also allow looping a stream multiple times, so let’s add a parameter for that and give it a default value of 1.

Additionally, we’ll need a user-defined destructor to delete any pointers in the list that might remain if the MixerStream is abruptly destructed.

// mixerstream.h #pragma once #include <QAudioFormat> #include <QAudioOutput> #include <QIODevice> class DecodeStream; class MixerStream : public QIODevice { Q_OBJECT public: explicit MixerStream(const QAudioFormat &format); ~MixerStream(); void openStream(const QString &fileName, int loops = 1); protected: qint64 readData(char *data, qint64 maxSize) override; qint64 writeData(const char *data, qint64 maxSize) override; private: QAudioFormat mFormat; QList<DecodeStream *> mStreams; };

Notice that combineSamples isn’t in the header. It’s a pretty basic function that doesn’t require any members, so we can just implement it as a free function.

Let’s put it in a header mixer.h and wrap it in a namespace:

// mixer.h #pragma once #include <QtGlobal> #include <limits> namespace Mixer { inline qint16 combineSamples(qint32 samp1, qint32 samp2) { const auto sum = samp1 + samp2; if (std::numeric_limits<qint16>::max() < sum) return std::numeric_limits<qint16>::max(); if (std::numeric_limits<qint16>::min() > sum) return std::numeric_limits<qint16>::min(); return sum; } } // namespace Mixer

There are some very basic things we can get out of the way quickly in the MixerStream cpp file. Recall that we must implement these member functions:

explicit MixerStream(const QAudioFormat &format); ~MixerStream(); void openStream(const QString &fileName, int loops = 1); qint64 readData(char *data, qint64 maxSize) override; qint64 writeData(const char *data, qint64 maxSize) override;

The constructor is very simple:

MixerStream::MixerStream(const QAudioFormat &format) : mFormat(format) { setOpenMode(QIODevice::ReadOnly); }

Here we use setOpenMode to automatically open our device in read-only mode, so we don’t have to call open() directly from outside the class.

Also, since it’s going to be read-only, our reimplementation of QIODevice::writeData will do nothing:

qint64 MixerStream::writeData([[maybe_unused]] const char *data, [[maybe_unused]] qint64 maxSize) { Q_ASSERT_X(false, "writeData", "not implemented"); return 0; }

The custom destructor we need is also quite simple:

MixerStream::~MixerStream() { while (!mStreams.empty()) delete mStreams.takeLast(); }

readData will be almost exactly the same as the implementation we did earlier, but returning qint64. The return value is meant to be the amount of data written, which in our case is just the maxSize argument given to it, as we write fixed-size buffers.

Additionally, we should call qAsConst (or std::as_const) on mStreams in the range-for to avoid detaching the Qt container. For more on qAsConst and range-based for loops, see Jesper Pederson’s blog post on the topic.

qint64 MixerStream::readData(char *data, qint64 maxSize) { memset(data, 0, maxSize); constexpr qint16 bitDepth = sizeof(qint16); const qint16 numSamples = maxSize / bitDepth; for (auto *stream : qAsConst(mStreams)) { auto *cursor = reinterpret_cast<qint16 *>(data); qint16 sample; for (int i = 0; i < numSamples; ++i, ++cursor) if (stream->read(reinterpret_cast<char *>(&sample), bitDepth)) *cursor = Mixer::combineSamples(sample, *cursor); } return maxSize; }

That only leaves us with openStream. This one will require us to discuss DecodeStream and its interface.

The function should construct a new DecodeStream on the heap, which will need a file name and format. DecodeStream, as implied by its name, needs to decode audio files to PCM data. We’ll use a QAudioDecoder within DecodeStream to accomplish this, and for that, we need to pass mFormat to the constructor. We also need to pass loops to the constructor, as each stream can have a different number of loops.

Now our constructor call will look like this:

DecodeStream(fileName, mFormat, loops);

We can then use operator<< to add it to mStreams.

Finally, we need to remove it from the list when it’s done. We’ll give it a Qt signal, finished, and connect it to a lambda expression that will remove the stream from the list and delete the pointer.

Our completed openStream function now looks like this:

void MixerStream::openStream(const QString &fileName, int loops) { auto *decodeStream = new DecodeStream(fileName, mFormat, loops); mStreams << decodeStream; connect(decodeStream, &DecodeStream::finished, this, [this, decodeStream]() { mStreams.removeAll(decodeStream); decodeStream->deleteLater(); }); }

Recall from earlier that we call read on a stream, which takes a char * to which the read data will be copied and a qint64 representing the size of the data.

This is a QIODevice function, which will internally call readData. Thus, DecoderStream also needs to be a QIODevice.

Getting PCM Data for DecodeStream

In DecodeStream, we need readData to spit out PCM data, so we need to decode our audio file to get its contents in PCM format. In Qt Multimedia, we use a QAudioDecoder for this. We pass it an audio format to decode to, and a source device, in this case a QFile file handle for our audio file.

When a QAudioDecoder‘s start method is called, it will begin decoding the source file in a non-blocking manner, emitting a signal bufferReady when a full buffer of decoded PCM data is available.

On that signal, we can call the decoder’s read method, which gives us a QAudioBuffer. To store in a data member in DecodeStream, we use a QByteArray, which we can interact with using QBuffers to get a QIODevice interface for reading and writing. This is the ideal way to work with buffers of bytes to read or write in Qt.

We’ll make two QBuffers: one for writing data to the QByteArray (we’ll call it mInputBuffer), and one for reading from the QByteArray (we’ll call it mOutputBuffer). The reason for using two buffers rather than one read/write buffer is so the read and write positions can be independent. Otherwise, we will encounter more stuttering.

So when we get the bufferReady signal, we’ll want to do something like this:

const QAudioBuffer buffer = mDecoder.read(); mInputBuf.write(buffer.data<char>(), buffer.byteCount());

We’ll also need to have some sort of state enum. The reason for this is that when we are finished with the stream and emit finished(), we remove and delete the stream from a connected lambda expression, but read might still be called before that has completed. Thus, we want to only read from the buffer when the state is Playing.

Let’s update mixer.h to put the enum in namespace Mixer:

#pragma once #include <QtGlobal> #include <limits> namespace Mixer { enum State { Playing, Stopped }; inline qint16 combineSamples(qint32 samp1, qint32 samp2) { const auto sum = samp1 + samp2; if (std::numeric_limits<qint16>::max() < sum) return std::numeric_limits<qint16>::max(); if (std::numeric_limits<qint16>::min() > sum) return std::numeric_limits<qint16>::min(); return sum; } } // namespace Mixer Implementing DecodeStream

Now that we understand all the data members we need to use, let’s see what our header for DecodeStream looks like:

// decodestream.h #pragma once #include "mixer.h" #include <QAudioDecoder> #include <QBuffer> #include <QFile> class DecodeStream : public QIODevice { Q_OBJECT public: explicit DecodeStream(const QString &fileName, const QAudioFormat &format, int loops); protected: qint64 readData(char *data, qint64 maxSize) override; qint64 writeData(const char *data, qint64 maxSize) override; signals: void finished(); private: QFile mSourceFile; QByteArray mData; QBuffer mInputBuf; QBuffer mOutputBuf; QAudioDecoder mDecoder; QAudioFormat mFormat; Mixer::State mState; int mLoopsLeft; };

In the constructor, we’ll initialize our private members, open the DecodeStream in read-only (like we did earlier), make sure we open the QFile and QBuffers successfully, and finally set up our QAudioDecoder.

DecodeStream::DecodeStream(const QString &fileName, const QAudioFormat &format, int loops) : mSourceFile(fileName) , mInputBuf(&mData) , mOutputBuf(&mData) , mFormat(format) , mState(Mixer::Playing) , mLoopsLeft(loops) { setOpenMode(QIODevice::ReadOnly); const bool valid = mSourceFile.open(QIODevice::ReadOnly) && mOutputBuf.open(QIODevice::ReadOnly) && mInputBuf.open(QIODevice::WriteOnly); Q_ASSERT(valid); mDecoder.setAudioFormat(mFormat); mDecoder.setSourceDevice(&mSourceFile); mDecoder.start(); connect(&mDecoder, &QAudioDecoder::bufferReady, this, [this]() { const QAudioBuffer buffer = mDecoder.read(); mInputBuf.write(buffer.data<char>(), buffer.byteCount()); }); }

Once again, our QIODevice subclass is read-only, so our writeData method looks like this:

qint64 DecodeStream::writeData([[maybe_unused]] const char *data, [[maybe_unused]] qint64 maxSize) { Q_ASSERT_X(false, "writeData", "not implemented"); return 0; }

Which leaves us with the last part of the implementation, DecodeStream‘s readData function.

We zero out the char * with memset to avoid any noise if there are areas that are not overwritten. Then we simply read from the QByteArray into the char * if mState is Mixer::Playing.

We check to see if we finished reading the file with QBuffer::atEnd(), and if we are, we decrement the loops remaining. If it’s zero now, that was the last (or only) loop, so we set mState to stopped, and emit finished(). Either way we seek back to position 0. Now if there are loops left, it starts reading from the beginning again.

qint64 DecodeStream::readData(char *data, qint64 maxSize) { memset(data, 0, maxSize); if (mState == Mixer::Playing) { mOutputBuf.read(data, maxSize); if (mOutputBuf.size() && mOutputBuf.atEnd()) { if (--mLoopsLeft == 0) { mState = Mixer::Stopped; emit finished(); } mOutputBuf.seek(0); } } return maxSize; }

Now that we’ve implemented DecodeStream, we can actually use MixerStream to play two audio files at the same time!

Using MixerStream

Here’s an example snippet that shows how MixerStream can be used to route two simultaneous audio streams into one system mixer channel:

const auto &device = QAudioDeviceInfo::defaultOutputDevice(); const auto &format = device.preferredFormat(); auto mixerStream = std::make_unique<MixerStream>(format); auto *audioOutput = new QAudioOutput(device, format); audioOutput->setVolume(0.5); audioOutput->start(mixerStream.get()); mixerStream->openStream(QStringLiteral("/path/to/some/sound.wav")); mixerStream->openStream(QStringLiteral("/path/to/something/else.mp3"), 3); Final Remarks

The code in this series of posts is largely a reimplementation of Lova Widmark’s project QtMixer. Huge thanks to her for a great and lightweight implementation. Check the project out if you want to use something like this for a GPL-compliant project (and don’t mind that it uses qmake).

About KDAB

If you like this article and want to read similar material, consider subscribing via our RSS feed.

Subscribe to KDAB TV for similar informative short video content.

KDAB provides market leading software consulting and development services and training in Qt, C++ and 3D/OpenGL. Contact us.

The post Implementing an Audio Mixer, Part 2 appeared first on KDAB.

Categories: FLOSS Project Planets

Vasudev Kamath: Note to Self: Enabling Unified Kernel Image on Debian

Planet Debian - Thu, 2024-09-19 02:10

Note

These steps may not work on your system if you are using the default Debian installation. This guide assumes that your system is using systemd-boot as the bootloader, which is explained in the post linked below.

A unified kernel image (UKI) is a single executable that can be booted directly from UEFI firmware or automatically sourced by bootloaders with little or no configuration. It combines a UEFI boot stub program like systemd-stub(7), a Linux kernel image, an initrd, and additional resources into a single UEFI PE file.

systemd-boot already provides a hook for kernel installation via /etc/kernel/postinst.d/zz-systemd-boot. We just need a couple of additional configurations to generate the UKI image.

Installation and Configuration
  1. Install the systemd-ukify package:

    sudo apt-get install systemd-ukify
  2. Create the following configuration in /etc/kernel/install.conf:

    layout=uki initrd_generator=dracut uki_generator=ukify

    This configuration specifies how to generate the UKI image for the installed kernel and which generator to use.

  3. Define the kernel command line for the UKI image. Create /etc/kernel/uki.conf with the following content:

    [UKI] Cmdline=@/etc/kernel/cmdline
Generating the UKI Image

To apply these changes, regenerate the UKI image for the currently running kernel:

sudo dpkg-reconfigure linux-image-$(uname -r) Verification

Use the bootctl list command to verify the presence of a "Type #2" entry for the current kernel. The output should look similar to this:

bootctl list type: Boot Loader Specification Type #2 (.efi) title: Debian GNU/Linux trixie/sid (2d0080583f1a4127ac0b073b1a9d3e61-6.10.9-amd64.efi) (default) (selected) id: 2d0080583f1a4127ac0b073b1a9d3e61-6.10.9-amd64.efi source: /boot/efi/EFI/Linux/2d0080583f1a4127ac0b073b1a9d3e61-6.10.9-amd64.efi sort-key: debian linux: /boot/efi/EFI/Linux/2d0080583f1a4127ac0b073b1a9d3e61-6.10.9-amd64.efi options: systemd.gpt_auto=no quiet root=LABEL=root_disk ro systemd.machine_id=2d0080583f1a4127ac0b073b1a9d3e61 type: Boot Loader Specification Type #2 (.efi) title: Debian GNU/Linux trixie/sid (2d0080583f1a4127ac0b073b1a9d3e61-6.10.7-amd64.efi) id: 2d0080583f1a4127ac0b073b1a9d3e61-6.10.7-amd64.efi source: /boot/efi/EFI/Linux/2d0080583f1a4127ac0b073b1a9d3e61-6.10.7-amd64.efi sort-key: debian linux: /boot/efi/EFI/Linux/2d0080583f1a4127ac0b073b1a9d3e61-6.10.7-amd64.efi options: systemd.gpt_auto=no quiet root=LABEL=root_disk ro systemd.machine_id=2d0080583f1a4127ac0b073b1a9d3e61 type: Automatic title: Reboot Into Firmware Interface id: auto-reboot-to-firmware-setup source: /sys/firmware/efi/efivars/LoaderEntries-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f Cleanup and Reboot

Once the "Type #2" entries are generated, remove any "Type #1" entries using the bootctl unlink command. After this, reboot your system to boot from the UKI-based image.

Future Considerations

The primary use case for a UKI image is secure boot. Signing the UKI image can also be configured in the settings above, but this guide does not cover that process as it requires setting up secure boot on your system.

Categories: FLOSS Project Planets

Stack Abuse: Securing Your Email Sending With Python: Authentication and Encryption

Planet Python - Wed, 2024-09-18 22:29

Email encryption and authentication are modern security techniques that you can use to protect your emails and their content from unauthorized access.

Everyone, from individuals to business owners, uses emails for official communication, which may contain sensitive information. Therefore, securing emails is important, especially when cyberattacks like phishing, smishing, etc. are soaring high.

In this article, I'll discuss how to send emails in Python securely using email encryption and authentication.

Setting Up Your Python Environment

Before you start creating the code for sending emails, set up your Python environment first with the configurations and libraries you'll need.

You can send emails in Python using:

  • Simple Mail Transfer Protocol (SMTP): This application-level protocol simplifies the process since Python offers an in-built library or module (smtplib) for sending emails. It's suitable for businesses of all sizes as well as individuals to automate secure email sending in Python. We're using the Gmail SMTP service in this article.

  • An email API: You can leverage a third-party API like Mailtrap Python SDK, SendGrid, Gmail API, etc., to dispatch emails in Python. This method offers more features and high email delivery speeds, although it requires some investment.

In this tutorial, we're opting for the first choice - sending emails in Python using SMTP, facilitated by the smtplib library. This library uses the RFC 821 protocol and interacts with SMTP and mail servers to streamline email dispatch from your applications. Additionally, you should install packages to enable Python email encryption, authentication, and formatting.

Step 1: Install Python

Install the Python programming language on your computer (Windows, macOS, Linux, etc.). You can visit the official Python website and download and install it from there.

If you've already installed it, run this code to verify it:

python --version

Step 2: Install Necessary Modules and Libraries
  • smtplib: This handles SMTP communications. Use the code below to import 'smtplib' and connect with your email server:

    import smtplib
  • email module: This provides classes (Subject, To, From, etc.) to construct and parse emails. It also facilitates email encoding and decoding with Multipurpose Internet Mail Extensions (MIME).

  • MIMEText: It's used for formatting your emails and supports sending emails with text and attachments like images, videos, etc. Import this using the code below:

    import MIMEText
  • MIMEMultipart: Use this library to add attachments and text sections separately in your email.

    import MIMEMultipart
  • ssl: It provides Secure Sockets Layer (SSL) encryption.

Step 3: Create a Gmail Account

To send emails using the Gmail SMTP email service, I recommend creating a test account to develop the code. Delete the account once you've tested the code.

The reason is, you'll need to modify the security settings of your Gmail account to enable access from the Python code for sending emails. This might expose the login details, compromising security. In addition, it will flood your account with too many test emails.

So, instead of using your own Gmail account, create a new one for creating and testing the code. Here's how to do this:

  • Create a fresh Gmail account
  • Set up your app password:
    Google Account > Security > Turn on 2-Step Verification > Security > Set up an App Password
    Next, define a name for the app password and click on "Generate". You'll get a 16-digit password after following some instructions on the screen. Store the password safely.

Use this password while sending emails in Python. Here, we're using Gmail SMTP, but if you want to use another mail service provider, follow the same process. Alternatively, contact your company's IT team to seek support in accessing your SMTP server.

Email Authentication With Python

Email authentication is a security mechanism that verifies the sender's identity, ensuring the emails from a domain are legitimate. If you have no email authentication mechanism in place, your emails might land in spam folders, or malicious actors can spoof or intercept them. This could affect your email delivery rates and the sender's reputation.

This is the reason you must enable Python email authentication mechanisms and protocols, such as:

  • SMTP authentication: If you're sending emails using an SMTP server like Gmail SMTP, you can use this method of authentication. It verifies the sender's authenticity when sending emails via a specific mail server.

  • SPF: Stands for Sender Policy Framework and checks whether the IP address of the sending server is among

  • DKIM: Stands for DomainKeys Identified Mail and is used to add a digital signature to emails to ensure no one can alter the email's content while it's in transmission. The receiver's server will then verify the digital signature. Thus, all your emails and their content stay secure and unaltered.

  • DMARC: Stands for Domain-based Message Authentication, Reporting, and Conformance. DMARC instructs mail servers what to do if an email fails authentication. In addition, it provides reports upon detecting any suspicious activities on your domain.

How to Implement Email Authentication in Python

To authenticate your email in Python using SMTP, the smtplib library is useful. Here's how Python SMTP security works:

import smtplib server = smtplib.SMTP('smtp.domain1.com', 587) server.starttls() # Start TLS for secure connection server.login('my_email@domain1.com', 'my_password') message = "Subject: Test Email." server.sendmail('my_email@domain1.com', 'receiver@domain2.com', message) server.quit()

Implementing email authentication will add an additional layer of security to your emails and protect them from attackers or from being marked as spam.

Encrypting Emails With Python

Encrypting emails enables you to protect your email's content so that only authorized senders and receivers can access or view the content. Encrypting emails with Python is done using encryption techniques to encode the email message and transform it into a secure and unreadable format (also known as ciphertext).

This way, email encryption secures the message from unauthorized access or attackers even if they intercept the email.

Here are different types of email encryption:

  • SSL: This stands for Secure Sockets Layer, one of the most popular and widely used encryption protocols. SSL ensures email confidentiality by encrypting data transmitted between the mail server and the client.

  • TLS: This stands for Transport Layer Security and is a common email encryption protocol today. Many consider it a great alternative to SSL. It encrypts the connection between an email client and the mail server to prevent anyone from intercepting the email during its transmission.

  • E2EE: This stands for end-to-end encryption, ensuring only the intended recipient with valid credentials can decrypt the email content and read it. It aims to prevent email interception and secure the message.

How to Implement Email Encryption in Python

If your mail server requires SSL encryption, here's how to send an email in Python:

import smtplib import ssl context = ssl.create_default_context() server = smtplib.SMTP_SSL('smtp.domain1.com', 465, context=context) # This is for SSL connections, requiring port number 465 server.login('my_email@domain1.com', 'my_password') message = "Subject: SSL Encrypted Email." server.sendmail('my_email@domain1.com', 'receiver@domain2.com', message) server.quit()

For TLS connections, you'll need the smtplib library:

import smtplib server = smtplib.SMTP('smtp.domain1.com', 587) # TLS requires 587 port number server.starttls() # Start TLS encryption server.login('my_email@domain1.com', 'my_password') message = "Subject: TLS Encrypted Email." server.sendmail('my_email@domain1.com', 'receiver@domain2.com', message) server.quit()

For end-to-end encryption, you'll need more advanced libraries or tools such as GnuPG, OpenSSL, Signal Protocol, and more.

Combining Authentication and Encryption

Email Security with Python requires both encryption and authentication. This ensures that mail servers find the email legitimate and it stays safe from cyber attackers and unauthorized access during transmission. For email encryption, you can use either SSL or TLS and combine it with SMTP authentication to establish a robust email connection.

Now that you know how to enable email encryption and authentication in your emails, let's examine some complete code examples to understand how you can send secure emails in Python using Gmail SMTP and email encryption (SSL).

Code Examples

1. Sending a Plain Text Email import smtplib from email.mime.text import MIMEText subject = "Plain Text Email" body = "This is a plain text email using Gmail SMTP and SSL." sender = "sender1@gmail.com" receivers = ["receiver1@gmail.com", "receiver2@gmail.com"] password = "my_password" def send_email(subject, body, sender, receivers, password): msg = MIMEText(body) msg['Subject'] = subject msg['From'] = sender msg['To'] = ', '.join(receivers) with smtplib.SMTP_SSL('smtp.gmail.com', 465) as smtp_server: smtp_server.login(sender, password) smtp_server.sendmail(sender, receivers, msg.as_string()) print("The plain text email is sent successfully!") send_email(subject, body, sender, receivers, password)

Explanation:

  • sender: This contains the sender's address.
  • receivers: This contains email addresses of receiver 1 and receiver 2.
  • msg: This is the content of the email.
  • sendmail(): This is the SMTP object's instance method. It takes three parameters - sender, receiver, and msg and sends the message.
  • with: This is a context manager that is used to properly close an SMTP connection once an email is sent.
  • MIMEText: This holds only plain text.
2. Sending an Email with Attachments

To send an email in Python with attachments securely, you will need some additional libraries like MIMEBase and encoders. Here's the code for this case:

import smtplib from email import encoders from email.mime.base import MIMEBase from email.mime.multipart import MIMEMultipart from email.mime.text import MIMEText sender = "sender1@gmail.com" password = "my_password" receiver = "receiver1@gmail.com" subject = "Email with Attachments" body = "This is an email with attachments created in Python using Gmail SMTP and SSL." with open("attachment.txt", "rb") as attachment: part = MIMEBase("application", "octet-stream") # Adding the attachment to the email part.set_payload(attachment.read()) encoders.encode_base64(part) part.add_header( "Content-Disposition", # The header indicates that the file name is an attachment. f"attachment; filename='attachment.txt'", ) message = MIMEMultipart() message['Subject'] = subject message['From'] = sender message['To'] = receiver html_part = MIMEText(body) message.attach(html_part) # To attach the file message.attach(part) with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server: server.login(sender, password) server.sendmail(sender, receiver, message.as_string())

Explanation:

  • MIMEMultipart: This library allows you to add text and attachments both to an email separately.
  • 'rb': It represents binary mode for the attachment to be opened and the content to be read.
  • MIMEBase: This object is applicable to any file type.
  • Encode and Base64: The file will be encoded in Base64 for safe email sending.
Sending an HTML Email in Python

To send an HTML email in Python using Gmail SMTP, you need a class - MIMEText.

Here's the full code for Python send HTML email:

import smtplib from email.mime.text import MIMEText sender = "sender1@gmail.com" password = "my_password" receiver = "receiver1@gmail.com" subject = "HTML Email in Python" body = """ <html> <body> <p>HTML email created in Python with SSL and Gmail SMTP.</p> </body> </html> """ message = MIMEText(body, 'html') # To attach the HTML content to the email message['Subject'] = subject message['From'] = sender message['To'] = receiver with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server: server.login(sender, password) server.sendmail(sender, receiver, message.as_string()) Testing Your Email With Authentication and Encryption

Testing your emails before sending them to the recipients is important. It enables you to discover any issues or bugs in sending emails or with the formatting, content, etc.

Thus, always test your emails on a staging server before delivering them to your target recipients, especially when sending emails in bulk. Testing emails provide the following advantages:

  • Ensures the email sending functionality is working fine
  • Emails have proper formatting and no broken links or attachments
  • Prevents flooding the recipient's inbox with a large number of test emails
  • Enhances email deliverability and reduces spam rates
  • Ensures the email and its contents stay protected from attacks and unauthorized access

To test this combined setup of sending emails in Python with authentication and encryption enabled, use an email testing server like Mailtrap Email Testing. This will capture all the SMTP traffic from the staging environment, and detect and debug your emails before sending them. It will also analyze the email content, validate CSS/HTML, and provide a spam score so you can improve your email sending.

To get started:

  • Open Mailtrap Email Testing
  • Go to 'My Inbox'
  • Click on 'Show Credentials' to get your test credentials - login and password details

Here's the Full Code Example for Testing Your Emails:

import smtplib from socket import gaierror port = 2525 # Define the SMTP server separately smtp_server = "sandbox.smtp.mailtrap.io" login = "xyz123" # Paste your Mailtrap login details password = "abc$$" # Paste your Mailtrap password sender = "test_sender@test.com" receiver = "test_receiver@example.com" message = f"""\ Subject: Hello There! To: {receiver} From: {sender} This is a test email.""" try: with smtplib.SMTP(smtp_server, port) as server: # Use Mailtrap-generated credentials for port, server name, login, and password server.login(login, password) server.sendmail(sender, receiver, message) print('Sent') except (gaierror, ConnectionRefusedError): # In case of errors print('Unable to connect to the server.') except smtplib.SMTPServerDisconnected: print('Server connection failed!') except smtplib.SMTPException as e: print('SMTP error: ' + str(e))

If there's no error, you should see this message in the receiver's inbox:

This is a test email. Best Practices for Secure Email Sending

Consider the below Python email best practices for secure email sending:

  • Protect data: Take appropriate security measures to protect your sensitive data such as SMTP credentials, API keys, etc. Store them in a secure, private place like config files or environment variables, ensuring no one can access them publicly.

  • Encryption and authentication: Always use email encryption and authentication so that only authorized individuals can access your emails and their content.

    For authentication, you can use advanced methods like API keys, two-factor authentication, single sign-on (SSO), etc. Similarly, use advanced encryption techniques like SSL, TLS, E2EE, etc.

  • Error handling: Manage network issues, authentication errors, and other issues by handling errors effectively using except/try blocks in your code.

  • Rate-Limiting: Maintain high email deliverability by rate-limiting the email sending functionality to prevent exceeding your service limits.

  • Validate Emails: Validate email addresses from your list and remove invalid ones to enhance email deliverability and prevent your domain from getting marked as spam. You can use an email validation tool to do this.

  • Educate: Keep your team updated with secure email practices and cybersecurity risks. Monitor your spam score and email deliverability rates, and work to improve them.

Wrapping Up

Secure email sending with Python using advanced email encryption methods like SSL, TLS, and end-to-end encryption, as well as authentication protocols and techniques such as SPF, DMARC, 2FA, and API keys.

By combining these security measures, you can protect your confidential email information, improve email deliverability, and maintain trust with your target recipients. In this way, only individuals with appropriate credentials can access it. This will help prevent unauthorized access, data breaches, and other cybersecurity attacks.

Categories: FLOSS Project Planets

The Python Show: 47 - Python Projects of 2024

Planet Python - Wed, 2024-09-18 20:37

I’ve been working on lots of projects this year. Here are the ones I highlighted in this episode:

The Python Show is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

Categories: FLOSS Project Planets

Oliver Davies' daily list: The two ways of writing PHP code

Planet Drupal - Wed, 2024-09-18 20:00

Something that came up in my discussion with Dave Liddament for the Beyond Blocks podcast was that there seem to be two ways of writing PHP code.

One is writing strict code by enabling strict typing, using parameter and return types, and leveraging tools like PHPStan at a high level to analyze code.

The other is no not use types and to use a more "duck typing" approach.

The term "visual debt" came from a video discussing the pros and cons of these approaches.

The same can be said for JavaScript and TypeScript, but PHP can do both and gives the Developer the choice of how they write their code.

I prefer writing strict code and for my code to be as explicit as possible, but I appreciate not everyone does and I like that PHP caters for both.

How do you write your PHP code?

Categories: FLOSS Project Planets

Armin Ronacher: Accidental Spending: A Case For an Open Source Tax?

Planet Python - Wed, 2024-09-18 20:00

Both last week at London tech leaders and this week at the Open Source Summit in Vienna I engaged in various discussions about pledging money to Open Source. At Sentry we have been funding our Open Source dependencies for a few years now and we're trying to encourage others to do the same.

It’s not an easy ask, of course. One quite memorable point raised was what I would call “accidental spending”. The story goes like this: an engineering team spins up a bunch of Kubernetes machines. As the fleet grows in scale some inefficiencies creep in. To troubleshoot or optimize, additional services such as load balancers, firewalls, cloud provider log services, etc. are provisioned with minimal discussion. Initially none of that was part of the plan, but ever so slightly for every computing resource, some extra stuff is paid on top creating largely hidden costs. Ideally all of that pays off (after all, hopefully by debugging quicker you reduce that downtime, by having that load balancer you can auto scale and save on unused computing resources etc.). But often, the payoff feels abstract and are hard to quantify.

I call those purchases “accidental” because they are proportional to the deployed infrastructure and largely acting like a tax on top of everything. Only after a while does the scale of that line item become apparent. On the other hand intentionally purchasing a third party system is a very intentional act. It's very deliberate, requiring conversations and more scrutiny is placed for putting a credit card into a new service. Companies providing services understand this and are positioning themselves accordingly. Their play could be to make the case that that their third party solution is better, cheaper etc.

Open Source funding could be seen through both of these lenses. Today, in many ways, pledging money to Open Source is a very intentional decision. It requires discussions, persuasion and justification. The purpose and the pay-off is not entirely clear. Companies are not used to the idea of funding Open Source and they don't have a strong model to reason about these investments. Likewise many Open Source projects themselves also don't have a good way of dealing with money and might lack the governance to handle funds effectively. After all many of these projects are run by individuals and not formal organizations.

Companies are unlikely to fund something without understanding the return on investment. One better understood idea is to turn that one “random person in Nebraska” maintaining a critical dependency into a well-organized team with good op-sec. But for that to happen, funding needs to scale from pennies to dollars, making it really worthwhile.

My colleague Chad Whitacre floated an idea: what if platforms like AWS or GitHub started splitting the check? By adding a line-item to the invoices of their customers to support Open Source finding. It would turn giving to Open Source into more of a tax like thing. That might leverage the general willingness to just pile up on things to do good things. If we all pay 3% on top of our Cloud or SaaS bills to give to Open Source this would quickly add up.

While I’m intrigued by the idea, I also have my doubts that this would work. It goes back to the problem mentioned earlier that some Open Source projects just have no governance or are not even ready to receive money. How much value you put on a dependency is also very individual. Just because an NPM package has a lot of downloads does not necessarily mean it's critical to the mission of the company. rrweb is a good example for us at Sentry. It sits at the core of our session replay product but since we we vendor a pinned fork, you would not see rrweb in your dependency tree. We also value that package more than some algorithm would be able to determine about how important that package is to us.

So the challenge with the tax — as appealing as it is — is that it might make the “purchase decision” of funding Open Source easier, but it would probably make the distribution problem much worse. Deliberate, intentional funding is key. At least for the moment.

Still, it’s worth considering. The “what if” is a powerful idea. Using a restaurant analogy, the “open-source tax” is like the mandatory VAT or health surcharge on your bill: no choice is involved. Another model could be more like the tip suggestions on a receipt offering a choice but also guidance on what’s appropriate to contribute.

The current model we propose with our upcoming Open Source Pledge is to suggest like a tip what you should give in relation to your developer work force. Take the average number of full time engineers you have over a year, multiply this by 2000. That is the amount in US dollars you should give to your Open Source dependencies.

That sounds like a significant amount! But let's put this in relation for a typical developer you employ: that's less than a fifth of what you would pay for FICA (Federal Insurance Contributions Act in the US) in the US. That's less than the communal tax you would pay in Austria. I'm sure you can think of similar payroll taxes in your country.

I believe that after step one of recognizing there is a funding problem follows an obvious step two: having a baseline funding amount that stands in relation to your business (you own or are a part of) of what the amount should be. Using the size of the development team as a metric offers an objective and quantifiable starting point. The beauty in my mind of the developer count in particular is that it's somewhat independently observable from both the outside and inside [1]. The latter is important! It creates a baseline for people within a company to start a conversation about Open Source funding.

If you have feedback on this, particular the pledge I invite you mail me or to leave a comment on the Pledge's issue tracker.

[1]There is an analogy to historical taxation here. For instance the Window Tax was taxation based on the number of Windows in a building. That made enforcement easy because you could count them from street level. The downside of taht was obviously the unintended consequences that this caused. Something to always keep in mind!
Categories: FLOSS Project Planets

Pages