Feeds

ThinkDrop Consulting: Presenting "Self-hosted DDEV on GitHub Actions" by Jon Pugh at DrupalGovCon 2024

Planet Drupal - Thu, 2024-08-08 07:18
Presenting "Self-hosted DDEV on GitHub Actions" by Jon Pugh at DrupalGovCon 2024 Jon Pugh Thu, 08/08/2024 - 07:18

Next week I'm headed to DrupalGovCon 2024 to present on a brand new technique I created for hosting fast and reliable preview/test environments using DDEV and GitHub Actions.

DDEV supports what they call "Casual Hosting". With the right config tweaks, you can run multiple DDEV sites on a single server and serve them on the public internet.

Categories: FLOSS Project Planets

The Drop Times: A Closer Look at FFW's Transition to JAKALA

Planet Drupal - Thu, 2024-08-08 06:13
JAKALA, the leading data and AI-driven portfolio company of Ardian Buyout, has acquired FFW, a key player in digital experience solutions. This acquisition, the largest non-publicly traded digital agency deal in Europe for 2023, expands JAKALA's global footprint and service offerings, pushing its workforce to over 3,000 professionals and turnover beyond €500 million. The merger aims to enhance client experiences through combined expertise in data, technology, and AI.
Categories: FLOSS Project Planets

Akademy 2024 Call for Volunteers

Planet KDE - Thu, 2024-08-08 05:15

Akademy needs you! Volunteering is a great way to make new friends and Akademy wouldn't be possible without us all pitching in to make it happen. Find a task or two that sounds fun and sign yourself up! All you need to do is add yourself to a timeslot on the wiki page.

Categories: FLOSS Project Planets

Smartbees: Drupal vs. Adobe Experience Manager: Platforms Comparison

Planet Drupal - Thu, 2024-08-08 04:26

Drupal and AEM are two popular content management systems used by many companies. They offer advanced features for creating, editing, and publishing content but differ in many ways. In this article, we will compare Drupal to AEM to help you find the right CMS for your needs.

Categories: FLOSS Project Planets

Louis-Philippe Véronneau: A Selection of DebConf24 Talks

Planet Debian - Thu, 2024-08-08 00:00

DebConf24 is now over! I'm very happy I was able to attend this year. If you haven't had time to look at the schedule yet, here is a selection of talks I liked.

What happens if I delete setup.py?: a live demo of upgrading to PEP-518 Python packaging

A great talk by Weezel showcasing how easy it is to migrate existing Python projects to PEP-518.

This is the kind of thing I've been doing a lot when packaging upstream projects that still use setup.py. I encourage you to send this kind of patch upstream, as it makes everyone's life much easier.

Debian on Chromebooks: What's New and What's Next?

A talk by Alper Nebi Yasak, who has done great work on running Debian and the Debian Installer on Chromebooks.

With Chromebooks being very popular machines in schools, it's nice to see people working on a path to liberate them.

Sequoia PGP, sq, gpg-from-sq, v6 OpenPGP, and Debian

I had the chance to see Justus' talk on Sequoia — an OpenPGP implementation in Rust — at DebConf22 in Kosovo. Back then, the conclusion was that sq wasn't ready for production yet.

Well it seems it now is! This in-depth talk goes through the history of the project and its goals. There is also a very good section on the current OpenPGP/LibrePGP schism.

Chameleon - the easy way to try out Sequoia - OpenPGP written in Rust

A very short talk by Holger on Chameleon, a tool to make migration to Sequoia easier.

TL;DW: apt install gpg-from-sq

Protecting OpenPGP keyservers from certificate flooding

Although I used to enjoy signing people's OpenPGP keys, I completely gave up on this practice around 2019 when dkg's key was flooded with bogus certifications and have been refusing to do so since.

In this talk, Gunnar talks about his PhD work on fixing this issue and making sure we can eventually restore this important function on keyservers.

Bits from the DPL

Bits from the DPL! A DebConf classic.

Linux live patching in Debian

Having to reboot servers after kernel upgrades is a hassle, especially with machines that have encrypted disk drives.

Although kernel live patching in Debian is still a work in progress, it is encouraging to see people trying to fix this issue.

"I use Debian BTW": fzf, tmux, zoxide and friends

A fun talk by Samuel Henrique on little changes and tricks one can make to their setup to make life easier.

Ideas to Move Debian Installer Forward

Another in-depth talk by Alper, this time on the Debian Installer and his ideas to try to make it better. I learned a lot about the d-i internals!

Lightning Talks

Lightning talks are always fun to watch! This year, the following talks happened:

  1. Customizing your Linux icons
  2. A Free Speech tracker by SFLC.IN
  3. Desktop computing is irrelevant
  4. An introduction to wcurl
  5. Aliasing in dpkg
  6. A DebConf art space
  7. Tiny Tapeout, Fomu, PiCI
  8. Data processing and visualisation in the shell

Is there a role for Debian in the post-open source era?

As an economist, I've been interested in Copyright and business models in the Free Software ecosystem for a while. In this talk, Hatta-san and Bruce Perens discuss the idea of alternative licences that are not DFSG-free, like Post-Open.

Categories: FLOSS Project Planets

KIO Thumbnailer Support

Planet KDE - Wed, 2024-08-07 20:00

The KIO Framework has gained support for de-facto standard, cross-desktop thumbnail generators. This means that we now have support for thumbnails from third-party applications! On Linux systems, many applications that produce some kind of output, such as a 3D file or text document, ship a thumbnailer file that tells file managers how to create thumbnails of their files. One specific example I've used here in the images is STL files, for which we don't have our own KDE-specific thumbnailer plugin.

These thumbnailer files are currently used by Nautilus and Thunar, so we felt like we were missing out and wanted to join the party! :)

Thumbnailer files

Thumbnailer files are simple text files that tell the system what program we should run to generate a thumbnail. You can check what thumbnailers you have installed by running ls /usr/share/thumbnailers

For example, the STL thumbnailer file looks like this:

[Thumbnailer Entry]
TryExec=stl-thumb
Exec=xvfb-run --auto-servernum -w 0 stl-thumb -f png -s %s %i %o
MimeType=model/stl;model/x.stl-ascii;model/x.stl-binary;application/sla;

It tells the software running the thumbnailer what commands to use to generate the thumbnail, and what mimetypes it supports.
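
To make the mechanics concrete, here is a rough Python sketch of how a consumer could parse such a file and run it. This is only an illustration of the placeholder convention as I understand it (%s for the requested size, %i for the input file, %o for the output thumbnail), not the code KIO actually uses, and the function name and paths are made up:

import configparser
import shlex
import subprocess

def run_thumbnailer(thumbnailer_path, input_file, output_png, size=256):
    """Parse a *.thumbnailer file and run its Exec command for one input file."""
    # configparser is close enough to the desktop-entry syntax for this sketch;
    # interpolation is disabled so the literal % placeholders survive.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(thumbnailer_path)
    entry = parser["Thumbnailer Entry"]

    # Expand the placeholders: %s = size, %i = input path, %o = output path.
    command = [
        arg.replace("%s", str(size)).replace("%i", input_file).replace("%o", output_png)
        for arg in shlex.split(entry["Exec"])
    ]
    subprocess.run(command, check=True)

# Hypothetical usage:
# run_thumbnailer("/usr/share/thumbnailers/stl-thumb.thumbnailer",
#                 "model.stl", "/tmp/model-thumb.png", size=256)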

KDE Thumbnailer Plugins

On the KDE side, we have used plugins for KIO that reside in the kio-extras repository. They work just fine for our use case in KDE apps, but nobody should need to write a KIO-specific plugin for their application.

The changes to KIO

You can check the merge request for more in-depth details, but here's a summary of how I made it work side-by-side with our plugin system:

We always try the KIO plugins first when possible, since we know for sure they work. This avoids any possible regressions and oddities, and keeps the change as unintrusive as possible. When we encounter a mimetype that is not supported by our plugins, like STL files, we use a thumbnailer file instead.

This also means it's transparent to users: they do not have to worry about which one they have installed.

Why make support for thumbnailer files then?

As mentioned earlier, no application should need to create a plugin for KIO just to make their thumbnails show up in our applications.

Thumbnailer files offer other benefits too, such as easing future transitions (like from KF6 to KF7), working nicely with sandboxing, and being distributable in Flatpak bundles.

I am also working on moving our own plugins into thumbnailers, so we get the benefits from that too.

How can I test it out?

Currently it's only in the master branch of KIO, so if you really want to try it out, you will have to set up a KDE Plasma development environment: https://develop.kde.org/docs/getting-started/building/kdesrc-build-setup/

Once inside the development environment, open Dolphin and enable the thumbnailers from the preview settings.

Any help testing it would be very welcome! :) Let me know of any possible improvements and bugs!

Categories: FLOSS Project Planets

Update from the board of directors

Open Source Initiative - Wed, 2024-08-07 12:13

The Chair of the Board of the OSI has acknowledged the resignation offered by the Secretary of the Board, Aeva Black. The Chair and the entire Board would like to thank Black for their invaluable contribution to the success of OSI, as well as of the entire Open Source community, and for their service as a board member and officer of the Initiative.

Categories: FLOSS Research

Obey the Testing Goat: Progress on the Third Edition of the Book!

Planet Python - Wed, 2024-08-07 11:13

In lieu of a formal announcement about the Third Edition, how about a progress update?

Core technology updates: Django + Python

Embarrassment-Driven Development

One of the main motivations for a third edition was that the 2e is based on Django 1.11, which dropped out of support back in 2017, and that's been a big turnoff for readers for a while, and quite embarrassing really.

So, the plan is to upgrade to Django 5.x, and progress is good -- I've already updated most of the core chapters to Django 4.2, and upgraded Python to 3.12 while I was at it. Django 5 is next, and I'm hoping/assuming it will be a smaller leap than 1->4 was, so that won't be far behind.

New Deployment Technologies: Docker + Ansible

I've always been proud that the book includes several chapters on how to actually deploy our app to production, and make the app live on the actual public Internet. But the deployment process from the first and second editions--broadly speaking, SSH in to your server, hack about to figure out how to get your app deployed manually, and then automate what you did with glorified shellscripts, aka Fabric--was starting to look less and less like what modern deployment looks like, or my experience of it at least.

I uhmmed and ahhed about it for a while, but in the end I decided to go with a deployment process that looks like this:

  • Package up our app into a Docker container, and use our tests to confirm it really works
  • Use Ansible to automate pushing that container onto a server and running it.

Check out the latest version of the deployment chapters here:

I think I like how it's turned out: a lot of the fiddliness and debugging of deployment/production-readiness can now happen locally (in Docker containers on your own machine), so I think that tightens and speeds up the feedback loop a fair bit.

JavaScript

The JavaScript chapter was another head-scratcher. I wanted to move away from QUnit and include some more modern/ES6 syntax. In the end, I decided to go with Jasmine, which is old but still popular, and to keep the browser-based test runner. That's a bit of an unconventional choice, but it does mean we can avoid the whole Node.js and node_modules learning curve.

Aside from that, I've wound down the "JavaScript is such a nightmare" jokes, because they're really not fair any more, and were probably never that funny besides.

Check out the new version here:

Some changes of emphasis

The other main changes to the book are going to be around how I talk about some of the tradeoffs involved in the use of mocking, and unit vs integration vs functional/e2e tests. I think the first and second editions were perhaps a little too opinionated on this front (I still cringe to think how defensive I was when I first wrote the Hot Lava chapter, sorry CaseY!!), and my thinking has evolved a lot since I wrote my second book with Bob.

That's still very much on the drawing board though, so you'll have to watch this space for updates on that front.

Anyways, all the latest versions of the 3e chapters are live here on the site, and also as an Early Release on O'Reilly Learning, so do dive in and let me know what you think!

Categories: FLOSS Project Planets

Real Python: Asynchronous Iterators and Iterables in Python

Planet Python - Wed, 2024-08-07 10:00

When you write asynchronous code in Python, you’ll likely need to create asynchronous iterators and iterables at some point. Asynchronous iterators are what Python uses to control async for loops, while asynchronous iterables are objects that you can iterate over using async for loops.

Both tools allow you to iterate over awaitable objects without blocking your code. This way, you can perform different tasks asynchronously.

In this tutorial, you’ll:

  • Learn what async iterators and iterables are in Python
  • Create async generator expressions and generator iterators
  • Code async iterators and iterables with the .__aiter__() and .__anext__() methods
  • Use async iterators in async loops and comprehensions

To get the most out of this tutorial, you should know the basics of Python’s iterators and iterables. You should also know about Python’s asynchronous features and tools.

Get Your Code: Click here to download the free sample code that you’ll use to learn about asynchronous iterators and iterables in Python.

Take the Quiz: Test your knowledge with our interactive “Asynchronous Iterators and Iterables in Python” quiz. You’ll receive a score upon completion to help you track your learning progress.

Getting to Know Async Iterators and Iterables in Python

Iterators and iterables are fundamental components in Python. You’ll use them in almost all your programs where you iterate over data streams using a for loop. Iterators power and control the iteration process, while iterables typically hold data that you want to iterate over.

Python iterators implement the iterator design pattern, which allows you to traverse a container and access its elements. To implement this pattern, iterators need the .__iter__() and .__next__() special methods. Similarly, iterables are typically data containers that implement the .__iter__() method.
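
As a quick refresher (a generic sketch, not code taken from this tutorial), a class-based iterator only needs those two methods:

class Countdown:
    """A regular (synchronous) iterator: implements .__iter__() and .__next__()."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # The iterator returns itself.
        return self

    def __next__(self):
        if self.current <= 0:
            # Signal the end of the iteration.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

for number in Countdown(3):
    print(number)  # 3, 2, 1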

Note: To dive deeper into iterators and iterables, check out the Iterators and Iterables in Python: Run Efficient Iterations tutorial.

Python has extended the concept of iterators and iterables to asynchronous programming with the asyncio module and the async and await keywords. In this scenario, asynchronous iterators drive the asynchronous iteration process, mainly powered by async for loops and comprehensions.

Note: In this tutorial, you won’t dive into the intricacies of Python’s asynchronous programming. So, you should be familiar with the related concepts. If you’re not, then you can check out the following tutorials:

In these tutorials, you’ll gain the required background to prepare for exploring asynchronous iterators and iterables in more depth.

In the following sections, you’ll briefly examine the concepts of asynchronous iterators and iterables in Python.

Async Iterators

Python’s documentation defines asynchronous iterators, or async iterators for short, as the following:

An object that implements the .__aiter__() and .__anext__() [special] methods. .__anext__() must return an awaitable object. [An] async for [loop] resolves the awaitables returned by an asynchronous iterator’s .__anext__() method until it raises a StopAsyncIteration exception. (Source)

Similar to regular iterators that must implement .__iter__() and .__next__(), async iterators must implement .__aiter__() and .__anext__(). In regular iterators, the .__iter__() method usually returns the iterator itself. This is also true for async iterators.

To continue with this parallelism, in regular iterators, the .__next__() method must return the next object for the iteration. In async iterators, the .__anext__() method must return the next object, which must be awaitable.

Python defines awaitable objects as described in the quote below:

An object that can be used in an await expression. [It] can be a coroutine or an object with an .__await__() method. (Source)

In practice, a quick way to make an awaitable object in Python is to call an asynchronous function. You define this type of function with the async def keyword construct. This call creates a coroutine object.
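
For example (a generic sketch; the function names are made up rather than taken from the tutorial):

import asyncio

async def fetch_number():
    await asyncio.sleep(0.1)  # Simulate an asynchronous operation.
    return 42

async def main():
    awaitable = fetch_number()  # Calling the coroutine function creates a coroutine object.
    result = await awaitable    # Awaiting it suspends main() until the result is ready.
    print(result)               # 42

asyncio.run(main())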

Note: You can also create awaitable objects by implementing the .__await__() special method in a custom class. This method must return an iterator that yields control back to the event loop until the awaited result is ready. This topic is beyond the scope of this tutorial.

When the data stream runs out of data, the .__anext__() method must raise a StopAsyncIteration exception to end the asynchronous iteration process.

Here’s an example of an async iterator that allows iterating over a range of numbers asynchronously:
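
The article's own code isn't reproduced in this excerpt, but a minimal sketch could look like the following (the class name AsyncRange and the small sleep are illustrative choices, not necessarily what the full article uses):

import asyncio

class AsyncRange:
    """Asynchronously yield integers from start up to (but not including) end."""

    def __init__(self, start, end):
        self.current = start
        self.end = end

    def __aiter__(self):
        # As with regular iterators, return the iterator object itself.
        return self

    async def __anext__(self):
        if self.current >= self.end:
            # Signal the end of the asynchronous iteration.
            raise StopAsyncIteration
        await asyncio.sleep(0.1)  # Simulate an asynchronous operation per item.
        value = self.current
        self.current += 1
        return value

async def main():
    async for number in AsyncRange(0, 5):
        print(number)  # 0, 1, 2, 3, 4

asyncio.run(main())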

Read the full article at https://realpython.com/python-async-iterators/ »

Categories: FLOSS Project Planets

The Drop Times: Introducing Dresktop: Multi-Platform Tool for Drupal Project Management

Planet Drupal - Wed, 2024-08-07 09:57
Jose Daniel Estrada's new application, Dresktop, simplifies Drupal project management with multi-platform support, Docker integration, and robust tools for deployment and updates. Available on macOS, Windows (soon), and Linux, Dresktop is a free, open-source solution for developers.
Categories: FLOSS Project Planets

Lullabot: Drupal 11: What’s New and What’s Next

Planet Drupal - Wed, 2024-08-07 09:51

Drupal 11 was released on August 2nd, 2024. A lot has changed since Drupal 10, and a lot of preliminary work has been done that will help with Drupal Starshot. Besides improvements made to CKEditor integration, a revamp of the Field UI, performance improvements, and revisions for taxonomy terms, several big changes have been included to help the ambitious site builder.

Categories: FLOSS Project Planets

Tag1 Consulting: Migrating Your Data from D7 to D10: Migrating taxonomy vocabularies and D7 field collections into D10 paragraphs

Planet Drupal - Wed, 2024-08-07 09:50

In the previous article, we began migrating configuration from the Drupal 7 example site to our Drupal 10 instance, specifically content types. In today's article, we will continue with two more D7 entities: taxonomy vocabularies and field collections. The latter will be imported as Paragraphs in Drupal 10. Along the way, we will review the content model and the migration plan. This will help us determine what parts of the migration should be automated and what can be performed manually.

Categories: FLOSS Project Planets

Python Software Foundation: Security Developer-in-Residence role extended thanks to Alpha-Omega

Planet Python - Wed, 2024-08-07 09:30

We are excited to announce the continuation of Seth Larson’s work in the Security Developer-in-Residence role through the end of 2024 thanks to continued support from Alpha-Omega. (This six month extension is intended to align the renewal period for this role with the calendar year going forward).

The first year of the Security Developer-in-Residence initiative has been a success, seeing multiple improvements to the Python ecosystem's security posture. These improvements include authorizing the PSF as a CVE Numbering Authority, migrating the CPython release process to an isolated hosted build platform, and generating comprehensive Software Bill-of-Materials documents for CPython artifacts.

Open source software security continues to evolve. This year saw new regulations for software security like the EU Cyber Resilience Act (CRA), as well as evolving threats to open source like the xz-utils backdoor.

The PSF is looking forward to continuing our investment in the security of the Python ecosystem and everyone who depends on Python software. For the remainder of 2024, priorities for the Security Developer-in-Residence role include:

  • Formalization of the Python Security Response Team (PSRT) and processes for handling vulnerability reports and fixes.
  • Developing a strategy for Software Bill-of-Materials documents and Python packages.
  • Completing the migration of the CPython release process and generation of SBOM documents for the macOS installer.
  • Continued engagement with the Python community promoting security best-practices and standards.

For updates on these and other projects, check out Seth’s blog.

The PSF is a non-profit whose mission is to promote, protect, and advance the Python programming language, and to support and facilitate the growth of a diverse and international community of Python programmers. The PSF supports the Python community using corporate sponsorships, grants, and donations. Are you interested in sponsoring or donating to the PSF so it can continue supporting Python and its community? Check out our sponsorship program, donate directly here, or contact our team!

Categories: FLOSS Project Planets

Drupal life hack's: Extended Review of Backward Compatibility Questions When Upgrading to Drupal 11

Planet Drupal - Wed, 2024-08-07 09:11
Categories: FLOSS Project Planets

Drupal Starshot blog: Introducing Drupal Starshot's product strategy

Planet Drupal - Wed, 2024-08-07 09:10

This blog has been re-posted and edited with permission from Dries Buytaert's blog.

Drupal Starshot aims to attract mid-market marketers by offering out-of-the-box marketing best practices, user-friendly tools, and AI-driven site building features, all while maintaining the many advantages of Drupal Core.

I'm excited to share the first version of Drupal Starshot's product strategy, a document that aims to guide the development and marketing of Drupal Starshot. To read it, download the full Drupal Starshot strategy document as a PDF (8 MB).

This strategy document is the result of a collaborative effort among the Drupal Starshot leadership team, the Drupal Starshot Advisory Council, and the Drupal Core Committers. We also tested it with marketers who provided feedback and validation.

Drupal Starshot and Drupal Core

Drupal Starshot is the temporary name for an initiative that extends the capabilities of Drupal Core. Drupal Starshot aims to broaden Drupal's appeal to marketers and a wider range of project budgets. Our ultimate goal is to increase Drupal's adoption, solidify Drupal's position as a leading CMS, and champion an Open Web.

For more context, please watch my DrupalCon Portland keynote.

It's important to note that Drupal Starshot and Drupal Core will have separate, yet complementary, product strategies. Drupal Starshot will focus on empowering marketers and expanding Drupal's presence in the mid-market, while Drupal Core will prioritize the needs of developers and more technical users. I'll write more about the Drupal Core product strategy in a future blog post once we have finalized it. Together, these two strategies will form a comprehensive vision for Drupal as a product.

Why a product strategy?

By defining our goals, target audience and necessary features, we can more effectively guide contributors and ensure that everyone is working towards a common vision. This product strategy will serve as a foundation for our development roadmap, our marketing efforts, enabling Drupal Certified Partners, and more.

Drupal Starshot product strategy TL;DR

For the detailed product strategy, please read the full Drupal Starshot strategy document (8 MB, PDF). Below is a summary.

Drupal Starshot aims to be the gold standard for marketers that want to build great digital experiences.

We'd like to expand Drupal's reach by focusing on two strategic shifts:

  1. Prioritizing Drupal for content creators, marketers, web managers, and web designers so they can independently build websites. A key goal is to empower these marketing professionals to build and manage their websites independently without relying on developers or having to use the command line or an IDE.
  2. Extending Drupal's presence in the mid-market segment, targeting projects with total budgets between $30,000 and $120,000 USD (€25,000 to €100,000).

Drupal Starshot will differentiate itself from competitors by providing:

  1. A thoughtfully designed platform for marketers, balancing ease of use with flexibility. It includes smart defaults, best practices for common marketing tasks, marketing-focused editorial tools, and helpful learning resources.
  2. A growth-oriented approach. Start simple with Drupal Starshot's user-friendly tools, and unlock advanced features as your site grows or you gain expertise. With sophisticated content modeling, efficient content reuse across channels, and robust integrations with other leading marketing technologies, ambitious marketers won't face the limitations of other CMSs and will have the flexibility to scale their site as needed.
  3. AI-assisted site building tools to simplify complex tasks, making Drupal accessible to a wider range of users.
  4. Drupal's existing competitive advantages such as extensibility, scalability, security, accessibility, multilingual support, and more.

What about ambitious site builders?

In the past, we used the term ambitious site builders to describe Drupal's target audience. Although this term doesn't appear in the product strategy document, it remains relevant.

While the strategy document is publicly available, it is primarily an internal guide. It outlines our plans but doesn't dictate our marketing language. Our product strategy's language purposely aligns with terms used by our target users, based on persona research and interviews.

To me, "ambitious site builders" includes all Drupal users, from those working with Drupal Core (more technically skilled) to those working with Drupal Starshot (less technical). Both groups are ambitious, with Drupal Starshot specifically targeting "ambitious marketers" or "ambitious no-code developers".

Give feedback

The product strategy is a living document, and we value input. We invite you to share your thoughts, suggestions, and questions in the product strategy feedback issue within the Drupal Starshot issue queue.

Get involved

There are many opportunities to get involved with Drupal Starshot, whether you're a marketer, developer, designer, writer, project manager, or simply passionate about the future of Drupal. To learn more about how you can contribute to Drupal Starshot, visit https://drupal.org/starshot.

Thank you

I'd like to thank the Drupal Starshot leadership team, the Drupal Starshot Advisory Council, and the Drupal Core Committers for their input on the strategy. I'm also grateful for the marketers who provided feedback on our strategy, helping us refine our approach.

Categories: FLOSS Project Planets

Django Weblog: Django 5.1 released

Planet Python - Wed, 2024-08-07 09:00

The Django team is happy to announce the release of Django 5.1.

The release notes showcase a kaleidoscope of improvements. A few highlights are:

  • Easier guardrails for authentication: the new and shiny LoginRequiredMiddleware, when added to MIDDLEWARE, enforces authentication for all views by default (a minimal settings sketch follows this list).
  • A more inclusive framework: Django 5.1 includes several accessibility enhancements, such as improved screen reader support in the admin interface, more semantic HTML elements, and better association of help text and labels with form fieldsets.
  • The second oldest ticket fixed in this release provides the long-awaited querystring template tag, which greatly simplifies the handling of query strings when building URLs in templates.
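
As a sketch of the first bullet, the setup below assumes a typical Django 5.1 project; the login_not_required decorator used to opt a single view back out is also part of Django 5.1's contrib.auth, but treat the exact layout here as illustrative rather than an excerpt from the release notes:

# settings.py -- a minimal sketch assuming a standard middleware stack
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.middleware.common.CommonMiddleware",
    "django.middleware.csrf.CsrfViewMiddleware",
    "django.contrib.auth.middleware.AuthenticationMiddleware",
    # New in 5.1: with this line, every view requires an authenticated user by default.
    "django.contrib.auth.middleware.LoginRequiredMiddleware",
    "django.contrib.messages.middleware.MessageMiddleware",
]

# views.py -- opting one public view back out
from django.contrib.auth.decorators import login_not_required
from django.http import HttpResponse

@login_not_required
def landing_page(request):
    return HttpResponse("No login required here.")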

(If you are curious about the oldest ticket fixed in this release, check out Ticket #10743.)

You can get Django 5.1 from our downloads page or from the Python Package Index. The PGP key ID used for this release is Natalia Bidart: 2EE82A8D9470983E.

With the release of Django 5.1, Django 5.0 has reached the end of mainstream support. The final minor bug fix release, 5.0.8, was issued yesterday. Django 5.0 will receive security and data loss fixes until April 2025. All users are encouraged to upgrade before then to continue receiving fixes for security issues.

See the downloads page for a table of supported versions and the future release schedule.

Categories: FLOSS Project Planets

Jean-Pierre Lorre: Voices of the Open Source AI Definition

Open Source Initiative - Wed, 2024-08-07 08:43

The Open Source Initiative (OSI) is running a blog series to introduce some of the people who have been actively involved in the Open Source AI Definition (OSAID) co-design process. The co-design methodology allows for the integration of diverging perspectives into one just, cohesive and feasible standard. Support and contribution from a significant and broad group of stakeholders is imperative to the Open Source process and is proven to bring diverse issues to light, deliver swift outputs and garner community buy-in.

This series features the voices of the volunteers who have helped shape and are shaping the Definition.

Meet Jean-Pierre Lorre

What’s your background related to Open Source and AI?

I’ve been using Open Source technologies since the very beginning of my career and have been directly involved in Open Source projects for around 20 years.

I graduated in artificial intelligence engineering in 1985. Since then I have worked in a number of applied AI research structures in fields such as medical image processing, industrial plant supervision, speech recognition and natural language processing. My knowledge covers both symbolic AI methods and techniques and deep learning.

I currently lead a team of around fifteen AI researchers at LINAGORA. LINAGORA is an Open Source company.

What motivated you to join this co-design process to define Open Source AI?

The team I lead is heavily involved in the development of LLM generative models, which we want to distribute under an open license. I realized that the term Open Source AI was not defined and that the definition we had at LINAGORA was not the same as the one adopted by our competitors.

As the OSI is the leading organization for defining Open Source and there was a project underway to define the term Open Source AI, I decided to join it.

Can you describe your experience participating in this process? What did you most enjoy about it and what were some of the challenges you faced?

I participated in two ways: firstly, to provide input for the definition currently being drafted; and secondly, to evaluate LLM models with regard to the definition (I contributed to Bloom, Falcon and Mistral).

For the first item, my main difficulty was keeping up with the meandering discussions, which were very active. I didn’t manage to do so completely, but I was able to appreciate the summaries provided from time to time, which enabled me to follow the overall thread.

The second difficulty concerns the evaluation of the models: the aim of the exercise was to evaluate the consistency of OSAID version 0.8 on models that currently claim to be “Open Source.” Implementing the definition involves looking for information that is sometimes non-existent and sometimes difficult to find. 

Why do you think AI should be Open Source?

Artificial intelligence models are expected to play a very important role in our professional lives, but also in our everyday lives. In this respect, the need for transparency is essential to enable people to check the properties of the models. They must also be accessible to as many people as possible, to avoid widening the inequalities between those who have the means to develop them and those who will remain on the sidelines of this innovation. Similarly, they might be adapted for different uses without the need for authorization.

The Open Source approach makes it possible to create a community such as the one created by LINAGORA, OpenLLM-Europe. This is a way for small players to come together to build the critical mass needed not only to develop models but also to disseminate them. Such an approach, which may be compared to that associated with the digital commons, is a guarantee of sovereignty because it allows knowledge and governance to be shared.

In short, they are the fruit of work based on data collected from as many people as possible, so they must remain accessible to as wide an audience as possible.

What do you think is the role of data in Open Source AI?

Data provides the basis for training models. It is therefore the pool of information from which the knowledge displayed by the model, and the applications deduced from it, will be drawn. In the case of an open model, disseminating as many elements as possible to qualify this data is a means of transparency that facilitates the study of the model's properties; indeed, this data is likely to include biases related to culture, gender, ethnic origin, skin color, and so on. It also makes it easier to modify the model and its outputs.

Has your personal definition of Open Source AI changed along the way? What new perspectives or ideas did you encounter while participating in the co-design process?

Yes, we initially thought that the provision of training data was a sine qua non condition for the design of truly Open Source models. Our basic assumption was that the model may be seen as a work derived from the data and that therefore the license assigned to the data, in particular the non-commercial nature, had an impact on the license of the model. As the discussions progressed, we realized that this condition was very restrictive and severely limited the possibility of developing models.

Our current analysis is that the condition defined in version 0.8 of the OSAID is sufficient to provide the necessary guarantees of transparency for the four freedoms and in particular the freedom to study the model underlying access to data. With regard to the data, it stipulates that "sufficiently detailed information about the data used to train the system, so that a skilled person can recreate a substantially equivalent system using the same or similar data" must be provided. Even if we can agree that this condition seems difficult to satisfy without providing the data sets, other avenues may be envisaged, in particular the provision of synthetic data. This information should make it possible to carry out almost all studies of the model.

What do you think the primary benefit will be once there is a clear definition of Open Source AI?

Having such a definition with clear, implementable rules will provide model suppliers with a concrete framework for producing models that comply with the ethics of the Open Source movement.

A collateral effect will be to help sort out the “wheat from the chaff.” In particular, to detect attempts at “Open Source washing.” This definition is therefore a structuring element for a company such as LINAGORA, which wants to build a sustainable business model around the provision of value-added AI services.

It should also be noted that such a definition is necessary for regulations such as the European AI Act, which defines exceptions for Open Source generative models. Such legislative construction cannot rest on a fuzzy basis.

What do you think are the next steps for the community involved in Open Source AI?

The next steps that need to be addressed by the community concern firstly the definition of a certification process that will formalize the conformity of a model; this process may be accompanied by tools to automate it.

In a second phase, it may also be useful to provide templates of AI models that comply with the definition, as well as best practice guides, which would help model designers.

How to get involved

The OSAID co-design process is open to everyone interested in collaborating. There are many ways to get involved:

  • Join the working groups: be part of a team to evaluate various models against the OSAID.
  • Join the forum: support and comment on the drafts, record your approval or concerns to new and existing threads.
  • Comment on the latest draft: provide feedback on the latest draft document directly.
  • Follow the weekly recaps: subscribe to our newsletter and blog to be kept up-to-date.
  • Join the town hall meetings: participate in the online public town hall meetings to learn more and ask questions.
  • Join the workshops and scheduled conferences: meet the OSI and other participants at in-person events around the world.
Categories: FLOSS Research

Jamie McClelland: Who ate my RAM?

Planet Debian - Wed, 2024-08-07 08:27

One of our newest servers, with a hefty 256GB of RAM, recently began killing processes via the oomkiller.

According to free, only half of the RAM was in use (125GB). About 4GB was free, with the remainder used by the file cache.

I’m used to seeing unexpected “free RAM” numbers like this and have been assured that the kernel is simply not wasting RAM. If it’s not needed, use it to cache files to save on disk I/O. That makes sense.

However… why is the oomkiller being called instead of flushing the file cache?

I came up with all kinds of amazing and wrong theories: maybe the RAM is fragmented (is that even a thing?!?), maybe there is a spike in RAM and the kernel can’t flush the cache quickly enough (I really don’t think that’s a thing). Maybe our kvm-manager has a weird bug (nope, but that didn’t stop me from opening a spurious bug report).

I learned lots of cool things, like that the oomkiller report includes a table of the memory in use by each process (via the rss column) - and you have to multiply that number by 4096 because it’s in 4K pages.
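
As a tiny worked example (my own helper, not part of the kernel's report), the conversion is just a multiplication by the 4 KiB page size; the rss value below is picked purely for illustration:

PAGE_SIZE = 4096  # the oomkiller's rss column is counted in 4K pages

def rss_pages_to_gib(rss_pages):
    """Convert an rss value from the oomkiller table into GiB."""
    return rss_pages * PAGE_SIZE / 1024**3

print(f"{rss_pages_to_gib(32_768_000):.1f} GiB")  # -> 125.0 GiB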

That’s how I discovered that the oomkiller was killing off processes with only half the memory in use.

I also learned that lsof sometimes lists the same open file multiple times, which made me think a bunch of files were being opened repeatedly causing a memory problem, but really it amounted to nothing.

The last thing I learned, courtesy of an askubuntu post, is that the /dev filesystem is allocated by default exactly half the RAM on the system. What a coincidence! That is exactly how much RAM is usable on the server.

And, on the server in question, that filesystem is full. What?!? Normally, that filesystem should be using 0 bytes because it’s not a real filesystem. But in our case a process created a 127GB file there - it was only stopped because the file system filled up.

Categories: FLOSS Project Planets
