Feeds

Mario Hernandez: How to build a card component

Planet Drupal - Sat, 2024-06-15 22:24

One of the most popular components on any website I've worked on is the "Card" component. Depending on the website, the card component is used to group various pieces of content into a container which can be used throughout your website.

Read the full article on how to build a card comonent.

Categories: FLOSS Project Planets

Mario Hernandez: Flexible Headings with Twig

Planet Drupal - Sat, 2024-06-15 22:24

Proper use of headings h1-h6 in your project presents many advantages incuding semantic markup, better SEO ranking and better accesibility.

Updated April 3, 2020

Building websites using the component based approach presents all kinds of advantages over the traditional page building approach. Today I’m going to show how to create what would normally be an Atom if we use the atomic design approach for building components. We are going to take this simple component to a whole new level by providing a way to dynamically controlling how it is rendered.

The heading component

Headings are normally used for page or section titles and are a big part of making your website SEO friendly. As simple as this may sound, headings need to be carefully planned. A typical heading would look like this:

<h1>This is a Heading 1</h1>

The idea of components is that they are reusable, but how can we possibly turn what already looks like a bare bones component into one that provides options and flexibility? What if we wanted to use a h2 or h3? or what if the title field is a link to another page? Then the heading component would probably not work because we have no way of changing the heading level from h1 to any other level or add a URL. Let's improve the heading component so we make it more dynamic.

Enter Twig and JSON

Twig offers many advantages over plain HTML and today we will use some logic to transform the static heading component into a more dynamic one.

Let’s start by creating a simple JSON object which we will use as data for Twig to consume. We will build some logic around this data to make the heading component more dynamic. This is typically how I build components on projects I work on.

  1. In your project, typically within the components/patterns directory create a new folder called heading
  2. Inside the heading folder create a new file called heading.json
  3. Inside the new file paste the code snippet below
{ "heading": { "heading_level": "", "modifier": "", "title": "This is the best heading I've seen!", "url": "" } }

So we created a simple JSON object with 4 keys: heading_level, modifier, title, and url.

  • The heading_level is something we can use to change the headings from say, h1 to h2 or h3 if we need to.
  • The modifier key allows us to pass a modifier CSS class when we make use of this component. The modifier class will make it possible for us to style the heading differently than other headings, if needed.
  • The title key is the title's string of text that will become the title of a page or a component.
  • ... and finally, the url key, if present, will allow us to wrap the title in an <a> tag, to make it a link.
  1. Inside the heading folder create a new file called heading.twig
  2. Inside the new file paste the code snippet below
<h{{ heading.heading_level|default('2') }} class="heading{{ heading.modifier ? ' ' ~ heading.modifier }}"> {% if heading.url %} <a href="{{ heading.url }}" class="heading__link"> {{ heading.title }} </a> {% else %} {{ heading.title }} {% endif %} </h{{ heading.heading_level|default('2') }}>

Wow! What's all this? 😮

Let's break things down to explain what's happening here since the twig code has changed significantly:

  • First we make use of heading.heading_level to complete the number part of the heading. If a value is not provided for heading_level in the JSON file, we are setting a default of 2. This will ensue that by default we will have a <h2> as the title, much better than <h1> as we saw before. This value can be changed every time the heading isused. The same approach is taken to close the heading tag at the last line of code.
  • Also, in addition to adding a class of heading, we check whether there is a value for the modifier key in JSON. If there is, we pass it to the heading as a CSS class. If no value is provided nothing will be added.
  • In the next line line, we check whether a URL was provided in the JSON file, and if so, we wrap the Flexible Headings with Twig variable in a <a> tag to turn the title into a link. The href value for the link is ``. If no URL is provided in the JSON file, we simply print the value of Flexible Headings with Twig as plain text.
Now what?

Well, our heading component is ready but unfortunately the component on its own does not do any good. The best way to take advantage of our super smart component is to start using it within other components.

Putting the heading component to use

As previously indicated, the idea of components is so they can be reusable which eliminates code duplication. Now that we have the heading component ready, we can reuse it in other templates by taking advantage of twig’s include statements. That will look like this:

<article class="card"> {% include '@components/heading/heading.twig' with { "heading": heading } only %} </article>

The example above shows how we can reuse the heading component in the card component by using a Twig’s include statement.

NOTE: For this to work, the same data structure for the heading needs to exist in the card’s JSON file. Or, you could also alter the heading's values in twig, like this:

<article class="card"> {% include '@components/heading/heading.twig' with { "heading": { "heading_level": 3, "modifier": 'card__title', "title": "This is a super flexible and smart heading", "url": "https://mariohernandez.io" } } only %} </article>

You noticed the part @components? this is only an example of a namespace. If you are not familiar with the component libraries Drupal module, it allows you to create namespaces for your theme which you can use to nest or include components as we see above.

End result

The heading component we built above would look like this when it is rendered:

<h3 class="heading card__title"> <a href="https://mariohernandez.io" class="heading__link"> This is a super flexible and smart heading </a> </h3> In closing

The main goal of this post is to bring light on how important it is to build components that are not restricted and can be used throughout the site in a way that does not feel like you are repeating yourself.

Additional Resources:

Managing heading levels in design systems.

Categories: FLOSS Project Planets

Mario Hernandez: Getting started with Gatsby

Planet Drupal - Sat, 2024-06-15 22:24

As many developers, when I hear the words "static website" I immediate think of creating flat HTML pages and editing them by hand. Times have changed. As you will see, Static Site Generators (SSG), offer some of the most advanced features and make use of latest technologies available on the web.

Static Site Generators are nothing new. If you search for SSG you will find many. One of the most popular ones is Jekyll, which I have personally worked with and it's a really good one. However, this post focusing on Gatsby. Probably one of the hottest system for creating static sites.

What is Gatsby?

Gatsby's primarily objective is to build static sites, but as you will learn, that's just the tip of the iceberg.

Gatsby is a blazing-fast static site generator for React.

How does Gatsby work?

While other SSGs use templating languages like Mustache, Handlebars, among others, Gatsby uses React. This not only allows for building modern component-driven websites, it also provides an incredible fast page rendering. Like mind-blowing fast.

Extending Gatsby

One of the most powerful features of Gatsby is its growing number of "Plugins". Plugins are the building blocks of Gatsby. They allow you to implement new features and functionality by running a couple of commands and making some configuration changes. Anything from adding Sass to your React project, creating a blog, configuring Google Anaylitics and many many more.

Plugins are contributed code kindly provided by the generous Open Source community which totally rocks. Anyone is able to write plugins and make them available to the world to consume and use.

Check out their Plugins page for a full list of ways you can take your static site to the next level.

Editorial Process

So we are building static sites and you may be wondering How do I create content for my site? There are several ways in which you can create a content editign workflow for your site. Probably the easiest way is to use static Markdown files. Markdown is a lightweight markup language with plain text formatting syntax. It is designed so that it can be converted to HTML and many other formats. Markdown is often used to format readme files, for writing messages in online discussion forums, and to create rich text using a plain text editor. This blog is using markdown. Since I am the only creating content I don't need a fancy administrative interface to crate content.

Markdown is only one of the ways you can create content for your static sites. Others include more advanced methods such as plugging in Gatsby with your Content Management System (CMS) of choice. This includes Wordpress, Drupal, Netlify, ContentaCMS, Contenful, and others. This means if you currently use any of those CMSs, you can continue to use them to retain a familiar workflow while moving your front-end workflow to a simpler and easier to manage process. This method is usually referred to as decoupled or headless, as your back-end is independent of your front-end.

Quering Data

As previously mentioned, Gatsby with the power of React create the perfect system for building robust, flexible and super fast static sites. However, there is a third component that takes that power to a whole new level, and that is GraphQL.

GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools.

Deploying Gatsby

Hosting for a Gatsby site can be done anywhere where React apps can run. Nowadays that's pretty much anywhere. However, before investing in an expensive and highly complicated hosting environment, take a look at some of the simpler and less expensive options on this page.
You will see that for a basic website, you can use several of the free options such as github pages, netlify and others which already include advanced continuous integration workflows. For more advanced sites where a CMS may be involved, you can also find options for deployment that will simplify your DevOps process.

My own blog is running out of a github repo that automatically get deployed when I push new updates. This is happening via Netlify which to me is probably the easiest way I have ever deployed a website.

In closing

Don't worry if you are a little skeptical about static site generators. I was too. However, I gave Gatsby a try and I see myself building more gatsby sites in the future. Before Gatsby I worked with Jekyll which is also a great static site generator, but what sets Gatsby apart is its seamless integration with React and GraphQL. The combination of those 3 provides endless posibilities in your web building process. Check it out.

Categories: FLOSS Project Planets

Mario Hernandez: Running a training workshop

Planet Drupal - Sat, 2024-06-15 22:24

Update 1-10-19
I wrote an extended version of this post at Mediacurrent's blog, check it out.

As long as I can remember I've enjoyed public speaking. This doesn't mean I am good at it, it simply means I enjoy it. School events, class president, my jobs, etc., they all taught me great lessons about public speaking. So when I started as a developer, sharing my knowledge with others at conferences or meetups came pretty natural.

I'd like to clarify, that after years of doing talks and other methods of public speaking, I am still terrified. I get nervous, my hands sweat, my legs shake, and my voice gets weird. Basically what I am trying to say is that I'm not an expert by any means, but I overcome the phobia of public speaking by doing it frequently.

For many years I have speaking at conferences, but in the past few years I started conducting longer workshops. I first started doing online workshops, which have their pros and cons. While they don't put you face to face with your audience, it also does not give you a good sense for how effective your training is because you can't see people reactions. For this reason I prefer to do face-to-face training.
As part of my job I conduct periodic Front-End training workshops for clients and recently I started conducting all-day training workshops at conferences. I really enjoy it and I'ld like to share some of the lessons learned.

Picking a topic

Ideally you want to pick a topic you feel 100% confident about. I have learned that people attending your training or talks welcome any information you can share no matter how simple or elementary it may feel to you. Don't ever think what you know may not be of interest to others because you would be wrong.

Lately I have been challenging myself a little more when picking a topic to train about. While is good to know the topic well, it is also extremely rewarding to pick a topic you'd like to learn more about. This may feel contrary to what I said ealier but hear me out. When you decide to train on a topic, you will spend a lot of time preparing, training, testing and reharsing. This is exactly how you learn a new skill. I can't tell you how many times I come out of training I did knowing more about the topic than before and also learning from people who attended the training. If you want to learn a new skill, teaching others about it could be the best way for you to learn it.

Preparing for the training

Everyone has their own style for teaching or doing a presentation. Some people like to use slides and screenshots, others show recordings of their project or code. My personal preference is to build a working prototype. This to me presents many advantages, but it also means you will spend more time getting ready.
My training workshops usually include very little slides because the majority of the training will be spent writing actual code and building the prototype during the training.

Here's my typical process for preparing for a training workshop:

  • Identify a prototype that serves the purpose of the training. If I am teaching a workshop about component based development I would normally pick something that involves the different aspects of component based development (attoms, nested components, reusable components, etc.)

  • Build the prototype upfront to ensure you have a working model to demo and go by.

  • Once prototype is built, create a public repo so you can share the working prototype

  • Write step-by-step instructions to building the prototype. Normally I would break the prototype down into small components, atoms.

  • Test, test and test. You want to make sure yoru audience will not run insto unexpected issues while following your instructions. For this reason you need to make sure you test your instructions. Ask a friend or colleague to go through each of your excercises to ensure thing work as expected.

  • Provide a pre-training evaluation. A quick set of questions that will give you an idea of people's skills level as well as environment (Linux, Widows, OSX). This will help you plan ahead of time.

  • Build a simple slide deck for introductions and agenda purposes. Mainly I move away from slides as soon as introduction and agenda is done. The rest of the training is all hands on.

Communicate with your audience ahead of time

As you will learn, one thing that can really kill a lot of the time during training is assisting people with their local environment setup. I have conducted training workshops where I've spent half the time helping people with their environment. For this reason, nowadays I communicate with the people ahead of time to ensure everyone's local environment is ready to go.

I normally make myself available once or twice in an evening through a google hangout to assit anyone who may need help. I also provide detailed instructions on how to get their local environment ready. This could save you a lot of time during training. In addition, for those who did spend the time on getting their environment ready, it's not fair that they have to be held back because someone did not make an effort to setup their environment.
I make myself available ahead of training but if someone is still having issues because of neglect, I don't hold the rest of the class back. I try to help them but at some point I move on.

During training

If possible, get help from someone who is also well-versed with the topic so they can assist you help people who may get stuck. Nothing is more frustrating that havign to break the flow of the training to help people who get stuck. Having someone else help you with this allows you to continue with the training and not have everyone loose momentum.

Finally

Enjoy yourself. Make sure you and your audience have fun. If you show excitement in what you are doing people will get excited as well.

Categories: FLOSS Project Planets

GNU Taler news: GNU Taler plugin for Adobe Commerce (Magento) now available

GNU Planet! - Sat, 2024-06-15 18:00
This project implemented the GNU Taler payment system in Adobe Commerce (formerly Magento). An extension was developed that can now be included in all Adobe Commerce online shops.
Categories: FLOSS Project Planets

This week in KDE: Final Plasma 6.1 polishing and new features for 6.2

Planet KDE - Sat, 2024-06-15 01:11

Plasma 6.1 is due to be released in three days, and lots of attention went into final release readiness activities: QA, bug-fixing, performance profiling, auto-testing, stuff like that. Boring but important! And happily, reviews of the 6.1 beta are, like, really good. So we want to make sure that the final release doesn’t disappoint!

In addition, we’re hard at work on Plasma 6.2, which is now beginning to accumulate features. Major areas of focus are some of the remaining Wayland pain points, including tablet and artist workflows. You can see some progress on that already:

New Features

Added an additional option for how to map the area of your drawing tablet to the area of your screen. It can be useful to have multiple options here, since drawing tablets are as diverse as screens, and their sizes and aspect ratios are unlikely to be identical (Joshua Goins, Plasma 6.2.0. Link):

Added a test mode feature for drawing tablets so you can test your settings (Joshua Goins, Plasma 6.2.0. Link):

Ported the Weather widget to the new API offered by the NOAA weather provider, which unlocked night forecasts in addition to the existing day forecasts (Ismael Asensio, Plasma 6.2.0. Link 1 and link 2)

This is new, so please excuse the UI glitches that are visible here. They’ll be cleaned up before the final release of Plasma 6.2 in four months. UI Improvements

You can now wake up a sleeping screen using a stylus (David Redondo, Plasma 6.1.1. Link)

KWin’s Morphing Popups effect has been deleted. While it added a measure of fanciness, unfortunately the way it worked introduced unfixable visual glitches. After multiple attempts to fix them, with a heavy heart we had to admit defeat. Removing the effect fixed six open Bugzilla tickets, one of which had 20 (!) duplicates. Stability beats fanciness. (David Edmundson, Plasma 6.2.0. Link)

Bug Fixes

Scrolling in Elisa’s lyrics view once again works with a clicky scroll wheel mouse (Jack Hill, Elisa 24.05.1. 24.05.1)

Fixed that annoying bug that affected people not using Systemd (or Plasma’s Systemd integration) which would make some but not all Plasma settings fail to get saved properly (David Edmundson, Plasma 6.1.0. Link)

Fixed a bug that could cause Plasma’s “Unify Outputs” screen action to actually disable certain screens instead of making them mirror the existing one (Xaver Hugl, Plasma 6.1.0. Link)

Quickly adapted to the NOAA weather provider’s continued API changes by implementing a quick fix to keep it working for the Weather widget in Plasma 6.1. (Ismael Asensio, Plasma 6.1.0. Link)

Fixed multiple issues causing System Monitor widgets displaying certain types of days to be visually squished when displayed on Plasma panels (Arjen Hiemstra, Plasma 6.1.0. Link)

Fixed a visual glitch that caused flickering when canceling a quick-tile action initiated by dragging a window (Xaver Hugl, Plasma 6.1.0. Link)

When you change the system volume using the slider in Plasma’s Audio widget, the maximum volume in overdrive mode is once again 150%, not 153% (Ismael Asensio, Plasma 6.1.0. Link)

Fixed an issue that could cause XWayland-using apps to freeze for a short period of time after slowly resizing their windows, especially when using screen scaling (Vlad Zahorodnii, Plasma 6.1.1. Link)

In System Monitor pie charts (both in the app and the widgets), text in the center no longer sometimes overflows when it’s very long (Arjen Hiemstra, Plasma 6.1.1. Link)

Other bug information of note:

Performance & Technical

Improved the memory efficiency of Plasma notifications that display images (Fushan Wen, Plasma 6.2. Link)

Automation & Systematization

Added a test to ensure that Plasma panel focus works as expected (Niccolò Venerandi, link)

Added a test to ensure that images in Plasma notifications appear properly (Fushan Wen, link)

Added a test for some more combinations of settings in Plasma’s Clipboard widget, since there are rather a lot of them (Fushan Wen, link)

Added a test to ensure that the camera usage monitor is working (Fushan Wen, link)

Added a test to make sure the keyboard brightness is settable as expected (Fushan Wen, link)

…And Everything Else

This blog only covers the tip of the iceberg! If you’re hungry for more, check out https://planet.kde.org, where you can find more news from other KDE contributors.

How You Can Help

The KDE organization has become important in the world, and your time and labor have helped to bring it there! But as we grow, it’s going to be equally important that this stream of labor be made sustainable, which primarily means paying for it. Right now the vast majority of KDE runs on labor not paid for by KDE e.V. (the nonprofit foundation behind KDE, of which I am a board member), and that’s a problem. We’ve taken steps to change this with paid technical contractors — but those steps are small due to growing but still limited financial resources. If you’d like to help change that, consider donating today!

Otherwise, visit https://community.kde.org/Get_Involved to discover other ways to be part of a project that really matters. Each contributor makes a huge difference in KDE; you are not a number or a cog in a machine! You don’t have to already be a programmer, either. I wasn’t when I got started. Try it, you’ll like it! We don’t bite!

Categories: FLOSS Project Planets

GNU Taler news: Real-time GNU Taler auditor

GNU Planet! - Fri, 2024-06-14 18:00
This bachelor thesis implements puts it's focus on the GNU Taler auditor. Cedric Zwahlen and Nicola Eigel made it real-time and added single page application.
Categories: FLOSS Project Planets

GSOC Week 1 Week 2

Planet KDE - Fri, 2024-06-14 14:37
This is the first blog post of my GSOC journey. I will be sharing my works and experiences here. Stay tuned for more updates. In this blog, I’ll be sharing my experiences of the first two weeks of GSOC, what are the works I did, what are challenges I faced and how did I overcome them ( Did I really overcome them :P ). On my first week I tried to understand the codebase of discover first, via doing small changes.
Categories: FLOSS Project Planets

KDE Gear 24.08 release schedule

Planet KDE - Fri, 2024-06-14 13:38

 

This is the release schedule the release team agreed on

  https://community.kde.org/Schedules/KDE_Gear_24.08_Schedule

Dependency freeze is in around 4 weeks (July 18) and feature freeze one
after that. Get your stuff ready!
 

Categories: FLOSS Project Planets

mark.ie: My Drupal Core Contributions for week-ending June 14th, 2024

Planet Drupal - Fri, 2024-06-14 12:00

Here's what I've been working on for my Drupal contributions this week. Thanks to Code Enigma for sponsoring the time to work on these.

Categories: FLOSS Project Planets

The Drop Times: Shawn Perritt on Reintroducing One of the Pioneers of the Internet to the World

Planet Drupal - Fri, 2024-06-14 10:36
Discover the transformative journey of Drupal's rebranding in an exclusive interview with Shawn Perritt, Brand and Creative Director at Acquia. During DrupalCon Portland, Shawn unveiled the refreshed Drupal brand, highlighting the collaborative efforts to modernize its identity while preserving its core values. In conversation with Alka Elizabeth, Sub-editor at The DropTimes, Shawn shares insights into the strategic thinking behind the rebrand, the unique challenges faced, and the aspirations driving this significant change. Join us for an in-depth look at how Drupal is poised to reintroduce itself to the world.
Categories: FLOSS Project Planets

Explaining the concept of Data information

Open Source Initiative - Fri, 2024-06-14 09:53

There seems to be some confusion caused by the concept of Data information included in the draft v0.0.8 of the Open Source AI Definition. Some readers may have seen the original dataset included in the list of optional components and quickly jumped to the wrong conclusions. This post clarifies how the draft arrived at its current state, the design principles behind the Data information concept and the constraints (legal and technical) it operates under.

The objective of the Open Source AI Definition

The objective of the Open Source AI Definition is to replicate in the context of artificial intelligence (AI) the principles of autonomy, transparency, frictionless reuse, and collaborative improvement for end users and developers of AI systems. These are described in the preamble.

Following the preamble is the definition of Open Source AI, an adaptation of the definition of Free Software (also known as “the four freedoms”) to AI nomenclature. The preamble and the four freedoms have been co-designed over several meetings and public discussions, online and in-person, and have not recently received significant comments. 

The Free Software definition specifies that a precondition to the freedom to study and modify a program is to have access to the source code. Source code is defined as “the preferred form of the program for making changes in.” Draft v0.0.8 contains a description of what’s necessary to enjoy the freedoms to study and modify an AI system. This new section titled Preferred form to make modifications to machine-learning systems has generated a heated debate. 

What is the preferred form to make modifications

The concept of “preferred form to make modifications” focuses on machine learning systems because these systems require data and training to produce a working system. Other AI systems are more easily classifiable as software and don’t require a special definition. 

The system analysis phase of the co-design process revealed that studying and modifying machine learning systems requires data, code for training and inference and model parameters. For the parameters, there’s no ambiguity: an Open Source AI must make them available under terms that respect the Open Source principles (no field-of-use restrictions, no discrimination against people, etc). For the data and code requirements, the text in the “preferred form to make modifications” section is longer and harder to parse, generating some confusion. 

The intent of the code and data requirements is to  ensure that end users, deployers and developers of an Open Source AI system have all the tools and instructions to recreate that AI system from scratch, to satisfy the freedoms to study and modify the system. At a high-level view, it makes sense to suggest that training datasets should be mandatorily released with permissive licenses in order to be Open Source AI.

However on close examination, it became clear that sharing the original datasets is full of traps. It actually puts Open Source at a disadvantage compared to opaque and proprietary AI systems.

The issue with data

Data is not software: The legal landscape for data is much wider than copyright. Aggregating large datasets and distributing them internationally is an endless nightmare that includes privacy laws, copyright, sui-generis rights, patents, secrets and more. Without diving deeper into legal issues, let’s focus on practical examples to clarify why the distribution of the training dataset is not spelled out as a requirement in the concept of Data information.

  • The Pile, the open dataset used to train the very open Pythia models, was taken down after an alleged copyright infringement, currently being litigated in the United States. However, the Pile appears to be legal to share in Japan. It’s also unclear whether it can be legally shared in the European Union. 
  • DOLMA, the open dataset used to train the very open OLMo models, was initially released with a restrictive license. It later switched to a permissive one. On further inspection, DOLMA appears to suffer from the same legal uncertainties of the Pile, however the Allen Institute has not been sued yet.
  • Training techniques that preserve privacy like federated learning don’t create datasets. 

All these cases show that requiring the original datasets creates vagueness and uncertainty in applying the Open Source AI Definition:

  • If a dataset is only legal in Japan, is that AI Open Source only in Japan?
  • If a dataset is initially legally available but later retracted, does the AI go from being Open Source to not?
    • If so, what happens to the applications that use such AI?
  • If no dataset is created, then will any AI trained with such techniques ever be Open Source?

Additionally, there are reasons to believe that OpenAI, Anthropic and other proprietary systems have been trained on the same questionable data inside The Pile and DOLMA: Proving that’s the case is a lot harder and expensive though. This is clearly a disincentive to be open and transparent on the data sources, adding a burden to the organizations that try to do the right thing.

The solution to these questions, draft v0.0.8 contains the concept of Data information, coupled with code requirements to obtain the expected result: for end users, developers and deployers of AI systems to be able to reproduce an Open Source AI.

Understanding the concept of Data Information

Data information, in the draft Open Source AI Definition, is defined as: 

Sufficiently detailed information about the data used to train the system, so that a skilled person can recreate a substantially equivalent system using the same or similar data.

Read that from the end: The intention of Data information is to allow developers to recreate a substantially equivalent system using the same or similar data. That means that an Open Source AI must disclose all the ingredients, where they’ve been bought and all the instructions to prepare the dish.  

This is a solution that came out of the co-design process, where reviewers didn’t rank the training datasets as high as they ranked the training code and data transparency requirements. 

Data information and the code requirements also address all of the questions around the legality of distributing data and datasets, or their absence.

If a dataset is only legal in Japan or becomes illegal later, one should still be able to recreate a dataset suitable to train an equivalent system replacing the illegal or unavailable pieces with similar ones.

AI systems trained with federated learning (where a dataset isn’t created) can still be Open Source AI if all instructions and code are released so that a new training with different data can generate an equivalent system.

The Data information concept also solves an example (raised on the forum) of an AI system trained on data licensed directly from Reddit. In this case, if the original developers released enough information to allow another AI developer to recreate a substantially equivalent system with Reddit data taken from an existing dataset, like CommonCrawl, it would be considered Open Source AI.

The proposed alternatives

While generally well received, draft v0.0.8 has been criticized by a few people on the forum for putting the training dataset in the “optional requirements”. Some suggestions and pushback we’ve received:

  • Require the use of synthetic data when the training dataset cannot be legally shared: This technique may work in some corner cases, if the technology evolves to be reliable enough. It’s expensive and untested at scale.
  • Classify as Open Source AI systems where all their components are “open source”: This approach is not rooted in the longstanding practice of the GNU project to accept system library exceptions and other compromises in exchange for more Open Source tools.
  • Datasets built by crawling the internet are the equivalent of theft, they shouldn’t be allowed  at all, let alone allowed in Open Source AI: This pushback ignores the reality that large data aggregators already have acquired legally the rights to accumulate that same data (through scraping and terms of use) and are trading it, exclusively capturing the economic value of what should be in the commons. Read Towards a Books Data Commons for AI Training for more details. There is no general agreement that text and data mining is equivalent to theft.

These demands and suggestions are hard to accept. We need an Open Source AI Definition that can effectively guide users and developers to make the right choice. We need one that doesn’t put developers of Open Source AI at a disadvantage compared to proprietary ones. We need a Definition that contains positive examples from the start so we can practically demonstrate positive qualities to policymakers. 

The discussion about data, how to generate incentives to create datasets that can be distributed internationally, safely, preserving privacy, is extremely complex. It can be addressed separately from the Open Source AI Definition. In collaboration with Open Future Foundation and others, OSI is designing a series of conferences to tackle the data governance issue. We’ll make an announcement soon.

Have your say now

The concept of Data information and code requirements is hard to grasp at first. But the preliminary results of the validation phase confirm that the draft v0.0.8 works as expected: Pythia and OLMo both would be Open Source AI, while Falcon, Grok, Llama, Mistral would not (even if they used OSD-compatible licenses) because they don’t share Data information. BLOOM and StarCoder would fail because of field-of-use restrictions in their models.

Data information can be improved but it’s better than other solutions proposed so far. As we get closer to the release of the stable version of the Open Source AI Definition, we need to hear from you: If you support this concept please comment on the forum today. If you don’t support it, please try to propose an alternative that at least covers the practical examples of Pile, DOLMA and federated learning above. Help the community move the conversation forward.

Continue the conversation in the forum

Categories: FLOSS Research

Web Review, Week 2024-24

Planet KDE - Fri, 2024-06-14 09:14

Let’s go for my web review for the week 2024-24.

Microsoft Will Switch Off Recall by Default After Security Backlash

Tags: tech, microsoft, privacy

Unsurprisingly they had to adjust under the pressure. The most blatant issues might be gone, it is still a bad idea at its core.

https://www.wired.com/story/microsoft-recall-off-default-security-concerns/


AI chatbots are intruding into online communities where people are trying to connect with other humans

Tags: tech, ai, machine-learning, gpt, criticism, ethics

Chatbots can be useful in some cases… but definitely not when people expect to connect with other humans.

https://theconversation.com/ai-chatbots-are-intruding-into-online-communities-where-people-are-trying-to-connect-with-other-humans-229473


Malicious VSCode extensions with millions of installs discovered

Tags: tech, vscode, security, ide

How trustworthy are the extensions you get in your editor or IDE? I’d expect most marketplaces to not be well harmed against such attacks.

https://www.bleepingcomputer.com/news/security/malicious-vscode-extensions-with-millions-of-installs-discovered/


HTTP/3 needs us (and other people) to make firewall changes

Tags: tech, http, quic, firewall

Good reminder that firewalls need to be adjusted for proper HTTP/3 support.

https://utcc.utoronto.ca/~cks/space/blog/sysadmin/HTTP3AndOurFirewalls


HTTP/3 in curl mid 2024 | daniel.haxx.se

Tags: tech, http, quic

Interesting status report about HTTP/3 support in curl. Shows quite well the various alternatives and how special HTTP/3 can be.

https://daniel.haxx.se/blog/2024/06/10/http-3-in-curl-mid-2024/


What is PID 0? · blog.dave.tf

Tags: tech, unix, linux, kernel, system, processes

Interesting deep dive in where the PIDs seen in user space come from. And also yes, there is something matching PID 0 which can be traced back to early UNIX systems.

https://blog.dave.tf/post/linux-pid0/


Scan HTML faster with SIMD instructions: Chrome edition – Daniel Lemire’s blog

Tags: tech, cpu, performance, SIMD

SIMD keeps providing interesting performance boosts for parsing work loads.

https://lemire.me/blog/2024/06/08/scan-html-faster-with-simd-instructions-chrome-edition/


Rolling your own fast matrix multiplication: loop order and vectorization – Daniel Lemire’s blog

Tags: tech, c++, compiler, performance, matrix

The ordering used for matrix multiplications definitely matters.

https://lemire.me/blog/2024/06/13/rolling-your-own-fast-matrix-multiplication-loop-order-and-vectorization/


You’ll regret using natural keys

Tags: tech, databases, design

Good advice on designing your database tables. The comments are good too, they allow to complete the picture.

https://blog.ploeh.dk/2024/06/03/youll-regret-using-natural-keys/


Brain dump – Pagination for database objects

Tags: tech, backend, databases

The right and wrong approaches for paginating results coming from a database.

https://www.n16f.net/blog/pagination-for-database-objects/


Optimal SQLite settings for Django

Tags: tech, django, databases, sqlite

Little and to the point reference on safer SQLite use. I should check if some of this would apply or is used by Akonadi as well.

https://gcollazo.com/optimal-sqlite-settings-for-django/


the Gilbert–Johnson–Keerthi algorithm explained as simply as possible

Tags: tech, geometry, mathematics, algorithm

Need to know if two shapes overlap? Good explanation of an elegant algorithm to do it.

https://computerwebsite.net/writing/gjk


Feynman’s Razor - by Defender of the Basic

Tags: tech, documentation, communication, gui

Nice reminder that even though we try to make things simpler to understand to people, there is a point where we can go too far.

https://defenderofthebasic.substack.com/p/feynmans-razor


Foreword for Fuzz Testing Book

Tags: tech, fuzzing, tests, history

Ever wondered where fuzz testing is coming from? This is an important bit of history.

https://pages.cs.wisc.edu/~bart/fuzz/Foreword1.html


Post-Architecture: An Open Approach to Software Engineering

Tags: tech, software, architecture

Indeed this is not for any environment and projects. So take it with a grain of salt. That said, I think this piece has a core truth to it which is more general. Software architectures shouldn’t be considered as something fixed as soon as they are planned, they need to be validated through use and to be prepared to evolve over time as needed.

https://arendjr.nl/blog/2024/06/post-architecture/


Bye for now!

Categories: FLOSS Project Planets

EuroPython: Humble Data workshop for beginners - Pythonistas and data scientists

Planet Python - Fri, 2024-06-14 08:38

Among the many wonderful workshops at EuroPython this year, we are pleased to announce we will be running the Humble Data workshop in person on Tuesday 9th July 2023, at the Prague Congress Centre (PCC). This is following successful deliveries of this workshop at PyCon US, Ghana, Namibia, Africa, Germany, Italy, PyData Global and of course, EuroPython 2022 and 2023!

How is the workshop?

Curious about the event? Read on.

Humble Data workshops are designed to get those from underrepresented groups started in both Python and data science, in an inclusive, laid-back and empathic environment. The workshops are designed to help people with zero experience with coding to learn some of the most fundamental operations in Python, and in turn, use these to get started with reading, transforming and visualizing data.

Humble Data at EuroPython 2023

The workshop will happen on 9th July 2024 for 6 hours, from 09:30 to 17:00 at the Prague Congress Centre (PCC), Room Club C

As part of the workshop, participants will work through a series of approachable tutorials with the help of a mentor. For 3 hours (breaks included) we will have teams that will work together with mentors to do plenty of exercises, quizzes and games, to go from Zero to Hero in Python data science. All that participants will need to bring is a laptop with internet access - we will help them get started with the rest!

Let us help you get started on your Python data science journey. You can read more about the workshop here.

How can I get involved?

Like the idea? Join us as a mentor or mentee!

If you’re new to coding or data science, and want to learn more in a supportive environment, apply to join us at the Humble Data workshop by filling in this form (attendees) or this form (mentors). Participation is free for anyone with a EuroPython Conference Ticket or Combined Ticket. Please note that you will need a conference ticket to participate in this workshop - we thank you for your understanding!

If you’re interested in mentoring, we would love to have your help! It is no issue if you&aposre not the most experienced programmer or data scientist: rather, we are looking for people who are respectful, patient, friendly, curious, and able to explain technical concepts in a way that is approachable for beginners. In return, you will receive the eternal gratitude of the organisers and attendees, the chance to meet people outside of your bubble, and in turn show that you don’t need to fit a certain mold to “look like” a developer or data scientist.

Our wonderful Humble Data mentor at EuroPython 2023

Finally, if you know anyone attending EuroPython this year who you think would like to either attend or mentor Humble Data, please encourage them to apply!

If you’re interested in attending as a mentee, please fill in this form, or as a mentor, please fill in this form, by July 1st, 2023.

We can’t wait to see you all in Prague this July!

Categories: FLOSS Project Planets

Real Python: The Real Python Podcast – Episode #208: Detecting Outliers in Your Data With Python

Planet Python - Fri, 2024-06-14 08:00

How do you find the most interesting or suspicious points within your data? What libraries and techniques can you use to detect these anomalies with Python? This week on the show, we speak with author Brett Kennedy about his book "Outlier Detection in Python."

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

mark.ie: Setting up a local development environment with DDEV to contribute to Drupal core

Planet Drupal - Fri, 2024-06-14 07:42

Contributing to Drupal core is a little different to contributing to a contrib module. This blog post was written during my Drupal core contribution time, sponsored by Code Enigma.

Categories: FLOSS Project Planets

Qt Creator 14 Beta released

Planet KDE - Fri, 2024-06-14 06:21

We are happy to announce the release of Qt Creator 14 Beta!

Categories: FLOSS Project Planets

Python Software Foundation: The Python Language Summit 2024: Python's security model after the xz-utils backdoor

Planet Python - Fri, 2024-06-14 06:05

Pablo Galindo Salgado describing the xz-utils backdoor
(Photo credit: Hugo van Kemenade)
 

The backdoor of the popular compression project xz-utils was discovered on Friday, March 29th 2024, by Andres Freund. Andres is an engineer at Microsoft who noticed performance issues with SSH while contributing to the Postgres project. Andres wasn't looking for security issues, but after digging into the problem further had discovered an attempt to subvert SSH logins across multiple Linux distros.

This was a social engineering attack to gain elevated access to a project, also known as an "insider threat". An account named "Jia Tan" had begun contributing to the xz-utils project soon after the original maintainer had announced on the mailing list that they were struggling with maintenance of the project. Through the use of multiple sock-puppet accounts pressuring the maintainer and over a year of high-quality contributions, eventually Jia Tan was made a release manager for the project.

"Jia Tan may have a bigger role in the project in the future. He has been helping a lot off-list and is practically a co-maintainer already. :-)"

— xz-utils maintainer, Lasse Collin

Over time a series of small subversive changes were made to the project all culminating in a tainted release artifact that put the backdoor in motion. Luckily for all of us, Andres discovered the attack before the new version was deployed more widely.

How is Python similar to xz-utils?

Pablo Galindo Salgado, Steering Council member and the release manager for Python 3.10 and 3.11, brought this topic to the Language Summit to discuss what could be done to improve Python's security model in the wake of the xz-utils backdoor.

Pablo noted the similarities shared between CPython and xz-utils, referencing the previous Language Summit's talk on core developer burnout, the number of modules in the standard library that have one or zero maintainers, the high ratio of maintainers to source code, and the use of autotools for configuration. Autotools was used by Jia Tan as part of the backdoor, specifically to obscure the changes to tainted release artifacts.

Pablo confirmed along with many nods of agreement that indeed, CPython could be vulnerable to a contributor or core developer getting secretly malicious changes merged into the project.

"Could this happen in CPython? Yes!" -- Pablo

For multiple reasons like being able to fix bugs and single-maintainer modules, CPython doesn't require reviewers on the pull requests of core developers. This can lead to "unilateral action", meaning that a change is introduced into CPython without the review of someone besides the author. Other situations like release managers backporting fixes to other branches without review are common.

There was also an emphasis on "binary files", like wheels, images, certificates, and test data that is checked into the CPython repository. Today some of this data doesn't have a known "upstream" or source where it was generated from making introspection difficult. Part of the xz-utils backdoor utilized binary test data in order to smuggle code into the release artifacts without being reviewed by other developers.

So what can be done?

There aren't any silver bullets when it comes to social engineering and insider threats. Barry Warsaw and Carol Willing both emphasized the importance having an action plan in advance for what to do if something similar to the xz-utils backdoor were to happen in order to promptly fix the issue and alert the community.

Thomas Wouters asked the group whether the xz-utils backdoor was a serious enough event to force a new workflow to be adopted by core developers. Thomas noted that mandatory review of all pull requests had been discussed previously and wasn't adopted at the time, but also wasn't discussed as a security issue like it is today. There's been a hesitance to break peoples' workflows or make it impossible to get bugs fixed. This change would also require a cultural change to make asking for code reviews more common amongst core developers to be effective.

Carol Willing concurred, noting that almost every other project she's contributing to requires reviews for all pull requests.

Guido van Rossum was less convinced that having additional review would help much for security. Guido was more concerned about who is given "commit bit" (write access) in the first place, asking for a higher bar such as whether someone had met the person in real life, at a conference, or over a video call.

Mariatta agreed with verifying identities of core developers, including requiring updates to reconfirm the identities of individuals noting that this is commonplace for employment. Mariatta noted that the contributions being done by CPython core developers is of equal or more importance than any individuals' employment.

Some doubt was thrown on verifying identities, especially via video call, as it's now not unheard of for someone being interviewed for employment over a video call to be different from the person who shows up on the first day of work.

Hugo van Kemenade remarked on removing inactive core developers, noting that it's already documented in the CPython developer guide that inactive or unreachable core developers can be removed with or without notice. There was agreement within the group that this should be done more actively to reduce the chances that unattended privileged accounts are resurrected by malicious actors.

There was some discussion about removing modules from the standard library, especially modules which are not used or have no maintainers. Toshio Kuratomi cautioned that moving modules out of the standard library only pushes the problem outwards to one or more projects on PyPI. Łukasz Langa concurred on this point referencing specifically the "chunk" module removed via PEP 594 and feeling unsure whether the alternative project on PyPI should be recommended to users given the author not being reachable.

Overall it was clear there is more discussion and work to be done in this rapidly changing area.

Categories: FLOSS Project Planets

Python Software Foundation: The Python Language Summit 2024

Planet Python - Fri, 2024-06-14 05:27

The Python Language Summit occurs every year just before PyCon US begins, this year occurring on May 15th, 2024 in Pittsburgh, Pennsylvania. The summit is attended by core developers, triagers, and Python implementation maintainers for a full day of talks and discussions on the future direction of Python.

This years summit included talks on the C API, free-threading, the security model of Python post-xz, and Python on mobile platforms.

This year's summit was attended by around 45 people and was covered by Seth Larson.

Attendees of the Python Language Summit 2024
(Photo credit: Kushal Das)

 

 

 

 

 

Categories: FLOSS Project Planets

Python Software Foundation: The Python Language Summit 2024: PyREPL -- New default REPL written in Python

Planet Python - Fri, 2024-06-14 05:26

Lysandros showing the mistake we've all made, no longer a problem in the new REPL
(Photo credit: Hugo van Kemenade)

One of the headline features of Python 3.13 is the new interactive interpreter, sometimes known as a "REPL" (Read-Evaluate-Print-Loop) which was contributed by Pablo Galindo Salgado, Łukasz Langa, and Lysandros Nikolaou and based on the PyPy project's own interactive interpreter, PyREPL. Pablo, Łukasz, and Lysandros all were at the Language Summit 2024 to present about this new feature coming to Python.

Why does Python need a new interpreter?

Python already has an interactive interpreter, so why do we need a new one? Lysandros explained that the existing interpreter is "deeply tangled" to Python's tokenizer which means adding new features or making changes is extremely difficult.

To lend further color to this point, Lysandros dug into how the tokenizer had changed since Python was first developed. Lysandros noted that "for the first 12 years [of Python], Guido was the only one who touched the tokenizer" and only later after the parser was replaced did anyone else meaningfully contribute to the tokenizer.

Terse example code for Python's tokenizer

Meanwhile, there are other REPLs for Python that "have many new features that [Python's] interpreter doesn't have that users have grown to expect", Lysandros explained. Some basic features that were listed as examples included lack of color support meaning no syntax highlighting, the ergonomics issues around exit versus exit(), no support for multi-line editing and buffer history, and poor ergonomics around pasting code into the interpreter.

Why PyREPL?

"We've settled on starting our solution around PyREPL", Pablo explained, "our reasoning being that maintaining terminal applications is hard. Starting from scratch would have a much higher risk for users". Pablo also noted that "most people who would interact with the REPL wouldn't test in betas", because Python pre-releases are generally used for running automated tests in continuous integration and not interactively tested manually.

Pablo explained that there are many different terminals and platforms which are all sources of behaviors and bugs that are hard to get right the first time. "[PyREPL] provided us with a solid base that we know works and we can start modifying".

Tasteful modern art or bug in the REPL?

Another major contributing factor was that PyREPL is written in Python. Pablo emphasized that "now people that want to start contributing to the REPL can actually contribute because it's written in Python".

Finally, Pablo pointed out that because the implementation is now partially shared between CPython and PyPy that both implementations can benefit from bug fixes to the shared parts of the codebase. Support for Chinese characters in the REPL was fixed in CPython and is being contributed back to PyPy.

Łukasz noted that adopting PyREPL wasn't a straightforward copy-paste job, there were multiple ideas in PyPy's PyREPL that don't make sense for CPython. Notably, PyPy is written to also support Python 2, so the code was simplified to only handle Python 3 code. PyREPL for PyPy also came with support for PyGame which wasn't necessary for CPython.

Type hints and strict type checking using mypy were also added to PyREPL, making the PyREPL module the first in the Python standard library to be type-checked on pull requests. Adding type hints to the code immediately found bugs which were fixed and reported back to PyPy.

What are the new features in 3.13?

Pablo gave a demonstration of the new features of PyREPL, including:

  • Colored prompts
  • F1 for help, F3 for bracketed paste
  • Multi-line editing and history
  • Better support for pasting blocks of code
     

Below are some recreated highlights from the demo. Pasting code samples into the old REPL that contain multiple newlines would often result in SyntaxErrors due to multiple newlines in a row resulting in that statement being evaluated. Multi-line editing also helps modifying code all in one place rather than having to piece a snippet together line-by-line, modifying what you want as you go:

Demo of multi-line paste in Python 3.13 

And the "exit versus exit()" paper-cut has been bothering Python users for long enough. This error was especially taunting because the REPL clearly knows what your intent is with it's helpful message to "Use exit() to exit":

"exit" without parenthesis just works, finally!
Windows and terminals

Support is already available for Unix consoles (Linux and macOS) in Python 3.13.0-beta1 and the standout feature request so far for PyREPL has been Windows support. Windows was left out because "historically the console on Windows was way different than Unix consoles". Łukasz continued, saying that "they don't intend to support right now" offering a "yes, but..." for users asking for Windows support.

Windows has two consoles today, cmd.exe of yore and the new "Windows Terminal" which supports many of the same features as Unix consoles including VT100 escape codes. The team's plan is to support the new Windows Terminal, and "to use our sprints here in Pittsburgh to finish". Windows support will also require removing CPython dependencies on the curses and readline libraries.

What's next for PyREPL?

The team already has plans cooking up for what to add to the REPL in Python 3.14. Łukasz commented that "syntax highlighting is an obvious idea to tackle". Łukasz also referenced an idea from Tania Allard for accessibility improvements similar to those in IPython.

Łukasz reiterated that the goal isn't to make an "uber REPL" or "replace IPython", but instead to make a REPL that core developers can use while testing development branches (where dependencies aren't working yet).

Łukasz continued that core developers aren't the only ones that these improvements benefit: "many teachers are using straight-up Python, IDLE, or the terminal because the computers they're using don't allow them to install anything else."

Given the applause from the room during the demos, it's safe to say that this work has been received well. There were only concerns about platform support and rollout for the new REPL.

Gregory Smith informed the team that functionality that requires a "Function" key (ie F1, F2, etc) must also be supported without Function keys due to some computers lacking them, like Chromebooks.

Carol Willing was concerned about releasing PyREPL without support for Windows Terminal, especially from a teaching perspective, describing that potential outcome as "painful". Carol wanted clear documentation on how to get the new REPL on Windows. "Positioning [the new REPL] for teaching without clear Windows instructions is a recipe for disaster".

Pablo assured that the team wants to add support for Windows Terminal in time for the first 3.13 release candidate. Pablo could not make guarantees due to a lack of Windows expertise among the three, saying "the reason I'm not saying 100% is because none of us are Windows experts. We understand what needs to be done... but we need some help."

Łukasz named Steve Dower, the Windows release expert for Python, who is "very motivated to help us get Windows Terminal support during sprints". Łukasz reiterated they're "not 100%, but we are very motivated to get it done".

Gregory Smith shared Carol's concern and framed the problem as one of communication strategy, proposing to "not promise too much until it works completely on Windows". By Python 3.14 the flashy features like syntax highlighting would have landed and the team would have a better understanding of what's needed for Windows. The team can revise the 3.13 "What's New in Python" depending on what gets implemented in the 3.13 timeline.

Ned Deily sought to clarify what the default experience would be for users of 3.13. Pablo said that "on Windows right now you will get the [same REPL] that you got before" and "on Linux and macOS, if your terminal supports the features which most of them do, you get the enhanced experience". "What we want in the sprints is to make Windows support the new one, if we get feature parity, then [Windows] will also get the new [REPL]".

Carol also asked to document how to opt-out of the new REPL in the case that support wasn't added in time for 3.13 to avoid differences between educational material and what students were seeing in their terminal. Kushal Das confirmed that differences across platforms is a source of problems for students, saying that "if all [students] have the same experience it's much better than just improving only macOS and Linux" to avoid students feeling bad just due to their operating system.

Pablo said that the opt-out mechanism was already in place with an environment variable and will discuss other opt-out mechanisms if needed for educators.

Emily Morehouse, speaking as a Steering Council member added that the Steering Council has requested an informational PEP on the new REPL. "Hearing concerns about how [the new REPL] might be rolled out... it sounds like we might need something that's more compatible and an easier rollout", leaving the final discussions to the 3.13 release manager, Thomas Wouters. Carol replied that she believes "we could do it in documentation".

Categories: FLOSS Project Planets

Pages