Feeds

PyCoder’s Weekly: Issue #648 (Sept. 24, 2024)

Planet Python - Tue, 2024-09-24 15:30

#648 – SEPTEMBER 24, 2024

Python 3.13 Preview: Free Threading and a JIT Compiler

Get a sneak peek at the upcoming features in Python 3.13 aimed at enhancing performance. In this tutorial, you’ll make a custom Python build with Docker to enable free threading and an experimental JIT compiler. Along the way, you’ll learn how these features affect the language’s ecosystem.
REAL PYTHON

Let’s Build and Optimize a Rust Extension for Python

Python code too slow? You can quickly create a Rust extension to speed it up. This post shows you how to re-implement some Python code as Rust, connect the Rust to Python, and optimize the Rust for even further performance improvements.
ITAMAR TURNER-TRAURING

Join John Hammond & Daniel Miessler at DevSecCon 2024

Snyk is thrilled to announce DevSecCon 2024, Developing AI Trust, Oct 8-9, a FREE virtual summit designed for DevOps, developer, and security pros of all levels. Hear from industry pros John Hammond & Daniel Miessler for practical tips on a DevSecOps approach to secure development. Save your spot →
SNYK.IO sponsor

Document Intended Usage Through Tests With doctest

This post covers Python’s doctest which allows you to write tests within your code’s docstrings. This does two things: it gives you tests, but it also documents how your code can be used.
JUHA-MATTI SANTALA
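
As a rough illustration of the idea in the blurb above (not taken from the linked post), a docstring can carry interactive examples that doctest checks. The function name slugify and its behavior are hypothetical, chosen only for this sketch.

```python
def slugify(title):
    """Turn a title into a lowercase, hyphen-separated slug.

    The examples below are both documentation and tests:

    >>> slugify("Hello World")
    'hello-world'
    >>> slugify("  Python   3.13  ")
    'python-3.13'
    """
    # split() with no arguments collapses runs of whitespace and trims the ends.
    return "-".join(title.lower().split())
```

Saved in a file, the examples can be checked with `python -m doctest -v` followed by the file name.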

Quiz: Python 3.13: Free-Threading and a JIT Compiler

REAL PYTHON

Quiz: Lists vs Tuples in Python

REAL PYTHON

Articles & Tutorials

Customizing VS Code Through Color Themes

A well-designed coding environment enhances your focus and productivity and makes coding sessions more enjoyable. In this Code Conversation, your instructor Philipp Acsany will guide you step-by-step through the process of finding, installing, and adjusting color themes in VS Code.
REAL PYTHON course

Thriving as a Developer With ADHD

What are strategies for being a productive developer with ADHD? How can you help your team members with ADHD to succeed and complete projects? This week on the show, we speak with Chris Ferdinandi about his website and podcast “ADHD For the Win!”
REAL PYTHON podcast

Build AI Apps with More Flexibility for the Edge

Intel makes it easy to develop AI applications on open frameworks and libraries. Seamlessly integrate into an open ecosystem and build AI solutions with more flexibility and choice. Explore the possibilities available with Intel’s OpenVINO toolkit →
INTEL CORPORATION sponsor

Things I’ve Learned Serving on the Board of the PSF

Simon has been on the Python Software Foundation Board for two years now and has recently returned from a board retreat. This post talks about what he has learned along the way, including just what does the PSF do, PyPI, PyCons, and more.
SIMON WILLISON

Goodhart’s Law in Software Engineering

Goodhart’s law states: “When a measure becomes a target, it ceases to be a good measure.” Whether that’s test coverage, cyclomatic complexity, or code performance, all metrics are proxies and proxies can be gamed.
HILLEL WAYNE

AI-Extracted Asian Building Footprints

This post covers how Mark did a deep dive on a large dataset covering building footprints in Asia. He uses a variety of tools including duckdb and the multiprocessing module in Python.
MARK LITWINTSCHIK

Why We Wrote a New Form Library for Django

“Django comes with a form library, and yet we wrote a total replacement library… Django forms were fundamentally not usable for what we wanted to do.”
KODARE.NET

Case-Insensitive String Class

This article shows how you can create a case-insensitive string class using some basic meta programming with the dunder method __new__.
RODRIGO GIRÃO SERRÃO
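
A minimal sketch of one way such a class could look; it is not necessarily the approach the linked article takes. The class name CaseInsensitiveStr and the _folded attribute are assumptions made for illustration.

```python
class CaseInsensitiveStr(str):
    """A str subclass that compares and hashes without regard to case."""

    def __new__(cls, value=""):
        # str is immutable, so the instance has to be built in __new__.
        instance = super().__new__(cls, value)
        # Cache the case-folded text once; equality and hashing rely on it.
        instance._folded = str(value).casefold()
        return instance

    def __eq__(self, other):
        if isinstance(other, str):
            return self._folded == other.casefold()
        return NotImplemented

    def __ne__(self, other):
        # str defines its own __ne__, so it must be overridden as well.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Equal objects must hash equally, so hash the folded text.
        return hash(self._folded)


name = CaseInsensitiveStr("Hello")
print(name == "hELLO")
```

The original casing is preserved when the object is printed; only comparisons and hashing ignore case.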

7 Ways to Use Jupyter Notebooks Inside PyCharm

Discover seven ways you can use Jupyter notebooks in PyCharm to explore and work with your data more quickly and effectively.
HELEN SCOTT

Unified Python Packaging With uv

Talk Python interviews Charlie Marsh, the maintainer of ruff, and they talk about uv and other projects at Astral.
KENNEDY & MARSH podcast

It’s Time to Stop Using Python 3.8

Python 3.8 will stop getting security updates in November 2024. You really should upgrade!
ITAMAR TURNER-TRAURING

Projects & Code

LightAPI: Lightweight API Framework

GITHUB.COM/IKLOBATO

peepdb: CLI Tool to View Database Tables

GITHUB.COM/EVANGELOSMEKLIS

dante: Zero-Setup Document Store for Python

GITHUB.COM/SENKO

cookiecutter-uv: A Modern Template for Python Projects

GITHUB.COM/FPGMAAS • Shared by Florian Maas

Django Content Settings: Advanced Admin Editable Settings

DJANGO-CONTENT-SETTINGS.READTHEDOCS.IO • Shared by oduvan

Events

Weekly Real Python Office Hours Q&A (Virtual)

September 25, 2024
REALPYTHON.COM

PyData Paris 2024

September 25 to September 27, 2024
PYDATA.ORG

PiterPy 2024

September 26 to September 27, 2024
PITERPY.COM

PyConf Mini Davao 2024

September 26 to September 27, 2024
DURIANPY.ORG

PyCon JP 2024

September 27 to September 30, 2024
PYCON.JP

PyCon Niger 2024

September 28 to September 30, 2024
PYCON.ORG

PyConZA 2024

October 3 to October 5, 2024
PYCON.ORG

PyCon ES 2024

October 4 to October 6, 2024
PYCON.ORG

Happy Pythoning!
This was PyCoder’s Weekly Issue #648.


Categories: FLOSS Project Planets

Dirk Eddelbuettel: RcppFastAD 0.0.4 on CRAN: Updated Again

Planet Debian - Tue, 2024-09-24 13:15

A new release 0.0.4 of the RcppFastAD package by James Yang and myself is now on CRAN.

RcppFastAD wraps the FastAD header-only C++ library by James which provides a C++ implementation of both forward and reverse mode of automatic differentiation. It offers an easy-to-use header library (which we wrapped here) that is both lightweight and performant. With a little bit of Rcpp glue, it is also easy to use from R in simple C++ applications. This release updates the quick fix in release 0.0.3 from a good week ago. James took a good look and properly disambiguated the statement that led clang to complain, so we are back to compiling as C++17 under all compilers which makes for a slightly wider reach.

The NEWS file for this release follows.

Changes in version 0.0.4 (2024-09-24)
  • The package now properly addresses a clang warning on empty variadic macro arguments and is back to C++17 (James in #10)

Courtesy of my CRANberries, there is also a diffstat report for the most recent release. More information is available at the repository or the package page.

If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: FLOSS Project Planets

The Drop Times: Lupus Decoupled Drupal: Bridging Drupal’s Backend Strength with Frontend Freedom

Planet Drupal - Tue, 2024-09-24 12:00
In an article for The DropTimes, Sinduri Guntupalli explores how Lupus Decoupled Drupal merges the power of Drupal's backend with modern frontend frameworks like Vue.js and Nuxt. The platform offers a flexible, API-driven architecture with custom elements, caching optimizations, and diverse deployment options, providing an efficient solution for both developers and content editors working on complex web projects.
Categories: FLOSS Project Planets

Drupal life hack's: Invoked Controllers as a Service

Planet Drupal - Tue, 2024-09-24 11:25
Categories: FLOSS Project Planets

Drupal life hack's: Practical Use Cases of Tagged Services in Drupal

Planet Drupal - Tue, 2024-09-24 10:13
Categories: FLOSS Project Planets

Talking Drupal: Talking Drupal #468 - Drupal AI

Planet Drupal - Tue, 2024-09-24 10:00

Today we are talking about Artificial Intelligence (AI), how to integrate it with Drupal, and what the future might look like with guest Jamie Abrahams. We’ll also cover AI SEO Analyzer as our module of the week.

For show notes visit: www.talkingDrupal.com/468

Topics
  • What is AI
  • What is Drupal AI
  • How is it different from other AI modules
  • How do people use AI in Drupal
  • How does Drupal AI make AI easier to integrate in Drupal
  • What is RAG
  • How has Drupal AI evolved from AI Interpolator
  • What does the future of AI look like
Guests

Jamie Abrahams - freelygive.io yautja_cetanu

Hosts

Nic Laflin - nLighteneddevelopment.com nicxvan
John Picozzi - epam.com johnpicozzi
Martin Anderson-Clutz - mandclu.com mandclu

MOTW Correspondent

Martin Anderson-Clutz - mandclu.com mandclu

  • Brief description:
    • Have you ever wanted an AI-based tool to give your Drupal site’s editors feedback on the SEO readiness of their content? There’s a module for that.
  • Module name/project name:
    • AI SEO Analyzer
  • Brief history
    • How old: created in Aug 2024 by Juhani Väätäjä (j-vee)
    • Versions available: 1.0.0-beta1, which supports Drupal 10.3 and 11
  • Maintainership
    • Actively maintained
    • Number of open issues: none
  • Usage stats:
    • 2 sites
  • Module features and usage
    • Once you enable this module along with the AI module, you can select the default provider, and optionally modify the default prompt that will be used to generate the report
    • With that done, editors (or anyone with the new “view seo reports” permission) will see an “Analyze SEO” tab on nodes throughout the site.
    • Generated reports are stored in the database, for ongoing reference
    • The reports are also revision-specific, so you could run reports on both a published node and a draft revision
    • There’s a separate “create seo reports” permission needed to generate reports. Within the form an editor can modify the default prompt, for example to get suggestions on optimizing for a specific topic, or to add or remove areas from the generated report.
    • By default the report will include areas like topic authority and depth, detailed content analysis, and even technical considerations like mobile responsiveness and accessibility. It’s able to do the latter by generating the full HTML markup of the node, and passing that to the AI provider for analysis
    • It feels like it was just yesterday that the AI module had its first release, so I think it’s great to see that there are community-created additions like this one already evolving as part of Drupal’s AI ecosystem
Categories: FLOSS Project Planets

Real Python: Advanced Python import Techniques

Planet Python - Tue, 2024-09-24 10:00

In Python, you use the import keyword to make code in one module available in another. Imports in Python are important for structuring your code effectively. Using imports properly will make you more productive, allowing you to reuse code while keeping your projects maintainable.

This video course provides a comprehensive overview of Python’s import statement and how it works. The import system is powerful, and this course will teach you how to harness this power. While you’ll cover many of the concepts behind Python’s import system, this video course is mostly example driven, so you’ll learn from the numerous code examples shared throughout.

In this video course, you’ll learn how to:

  • Use modules, packages, and namespace packages
  • Manage namespaces and avoid shadowing
  • Avoid circular imports
  • Import modules dynamically at runtime
  • Customize Python’s import system
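
As a small illustration of the "import modules dynamically at runtime" item above (a sketch, not material from the course), the standard library's importlib.import_module() accepts a module name as a string. The helper load_attribute and its "module:attribute" path format are hypothetical conventions invented for this example.

```python
import importlib


def load_attribute(dotted_path):
    """Import a module at runtime and optionally return one attribute from it.

    dotted_path is a hypothetical "package.module:attribute" string;
    without the ":attribute" part, the module itself is returned.
    """
    module_name, _, attribute_name = dotted_path.partition(":")
    # import_module() resolves the name at runtime, like an import statement.
    module = importlib.import_module(module_name)
    return getattr(module, attribute_name) if attribute_name else module


dumps = load_attribute("json:dumps")
print(dumps({"issue": 648}))  # prints {"issue": 648}
```

This pattern is common in plugin systems, where the name of the module to load comes from configuration rather than from source code.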


Categories: FLOSS Project Planets

PyCharm: PyCharm vs. Jupyter Notebook

Planet Python - Tue, 2024-09-24 08:52

Jupyter notebooks are an important tool for data scientists, providing an easy option for conducting experiments and presenting results. According to our Developer Ecosystem Survey 2023, at least 35% of data professionals use Jupyter notebooks. Furthermore, over 40% of these users spend more than 20% of their working time using these resources.

There are several implementations of notebook technology available to data professionals. First, we’ll look at the well-known Jupyter Notebook platforms by Project Jupyter. For the purposes of this article, we’ll refer to the Project Jupyter implementations of notebooks as “vanilla Jupyter” in order to avoid confusion, since there are several other implementations of the tool.

While vanilla Jupyter notebooks can be sufficient for some tasks, there are other cases where it would be better to rely on another tool for working with data. In this article, we’ll outline the key differences between PyCharm Professional and vanilla Jupyter when it comes to data science applications.

What is Jupyter Notebook?

Jupyter Notebook is an open-source platform that allows users to create and share code, visualizations, and text. It’s primarily used for data analysis and scientific research. Although JupyterLab offers some plugins and tools, its capabilities and user experience are significantly more limited than PyCharm’s.

What is PyCharm Professional?

PyCharm is a comprehensive integrated development environment (IDE) that supports a wide range of technologies out of the box, offering deep integration between them. In addition to enhanced support for Jupyter notebooks, PyCharm Professional also provides superior database support, Python script editing, and GitHub integration, as well as support for AI Assistant, Hugging Face, dbt-Core, and much more. 

Feature comparison: PyCharm Pro vs. Jupyter

Language support

While Jupyter notebooks claim to support over 40 programming languages, their usage is limited to the .ipynb format, which makes working with traditional file extensions like .py, .sql, and others less convenient. On the other hand, while PyCharm offers support for fewer languages – Python, JavaScript and TypeScript, SQL, and R (via plugin), along with several markup languages like HTML and CSS – the support is much more comprehensive.

Often, Jupyter notebooks and Python scripts serve different purposes. Notebooks are typically used for prototyping and experimentation, while Python scripts are more suitable for production. In PyCharm Professional, you can work with both of these formats and it’s easy to convert .ipynb files into .py files. See the video below for more information.
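
To make the relationship between the two formats concrete, here is a minimal sketch that pulls the code cells out of a notebook using only the standard library. It is not how PyCharm performs the conversion, and the function name notebook_to_script is hypothetical. It assumes the notebook follows the nbformat 4 JSON layout, where each cell has a "cell_type" and a "source". For real projects, Project Jupyter's nbconvert tool handles many more cases.

```python
import json


def notebook_to_script(notebook_json):
    """Return the code cells of a notebook (nbformat 4 JSON text) as one script."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        # Markdown and raw cells are skipped; only code cells are kept.
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # "source" may be a single string or a list of line strings.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source.rstrip("\n"))
    # Separate cells with a blank line and end the script with a newline.
    return "\n\n".join(chunks) + "\n"


# A tiny in-memory notebook with one markdown cell and one code cell.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None, "source": ["x = 1\n", "print(x)"]},
    ],
})
print(notebook_to_script(example))
```

The sketch drops outputs and markdown entirely, which is usually what you want when moving prototype code toward a production script.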

The smartest code completion

If you’ve ever written code in PyCharm Professional, you’ll have definitely noticed its code completion capabilities. In fact, the IDE offers several different types of code completion. In addition to the standard JetBrains completion, which provides suggestions based on an extensive understanding of your project and libraries, there’s also runtime completion, which can suggest names of data objects like columns in a pandas or Polars DataFrame, and ML-powered, full-line completion that suggests entire lines of code based on the current file. Additionally, you can enhance these capabilities with LLM-powered tools such as JetBrains AI Assistant, GitHub Copilot, Amazon CodeWhisperer, and others.

In contrast, code completion in Jupyter notebooks is limited. Vanilla Jupyter notebooks lack awareness of your project’s context, there’s no local ML-based code completion, and there’s no runtime code completion for database objects.

Code quality features and debugger

PyCharm offers several tools to enhance your code quality, including smart refactorings, quick-fixes, and AI Assistant – none of which are available in vanilla Jupyter notebooks. 

If you’ve made a mistake in your code, PyCharm Professional will suggest several actions to fix it. These become visible when you click on the lightbulb icon.

PyCharm Professional also inspects code on the file and project level. To see all of the issues you have in your current file, you can click on the icon in the top right-hand corner.

While vanilla Jupyter notebooks can highlight any issues after a code cell has been executed (as seen below), they don’t have features that allow you to analyze your entire file or project.

PyCharm provides a comprehensive and advanced debugging environment for both Python scripts and Jupyter notebooks. This debugger allows you to step into your code, running through the execution steps line by line, and pinpointing exactly where an error was made. If you’ve never used the debugger in PyCharm, you can learn how to debug a Jupyter notebook in PyCharm with the help of this blog by Dr. Jodie Burchell. In contrast, vanilla Jupyter offers basic debugging tools such as cell-by-cell execution and interactive %debug commands.

Refactorings

Web-based Jupyter notebooks lack refactoring capabilities. If you need to rename a variable, introduce a constant, or perform any other operation, you have to do it manually, cell by cell. In PyCharm Professional, you can access the Refactoring menu via Control + T and use it to make changes in your file faster. You can find more information about refactorings in PyCharm in the video.

Other code-related features

If you forget how to work with a library in vanilla Jupyter notebooks, you need to open another tab in a browser to look up the documentation, taking you out of your development environment and programming flow.

In PyCharm Professional, you can get information about a function or library you’re currently using right in the IDE by hovering over the code.

If you have a subscription to AI Assistant you can also use it for troubleshooting, such as asking it to explain code and runtime errors, as well as finding potential problems with your code before you run it.

Working with tables

DataFrames are one of the most important data formats for the majority of data professionals. In vanilla Jupyter notebooks, if you print a pandas or Polars DataFrame, you’ll see a static output with a limited number of columns and rows shown. Since the DataFrame outputs in Jupyter notebooks are static, this makes it difficult to explore your data without writing additional code.

In PyCharm Professional, you can use interactive tables that allow you to easily view, navigate, sort, and filter data. You can create charts and access essential data insights, including descriptive statistics and missing values – all without writing a single line of code. 

What’s more, the interactive tables are designed to give you a lot of information about your data, including:

  • Data type symbols in the column headers
  • The size of your DataFrame (in our case, 2390 rows and 82 columns)
  • Descriptive statistics, missing values, and more

If you want to get more information about how interactive tables work in PyCharm, check out the documentation.

Versioning and GitHub integration

In PyCharm Professional, you have several version control options, including Git. 

With PyCharm’s GitHub integration, you can see and revert your changes with the help of the IDE’s built-in visual diff tool. This enables you to compare changes between different commits of your notebooks. You can find an in-depth overview of the functionality in this tutorial.

Another incredibly useful feature is the local history, which automatically saves a version history of your changes. This means that if you haven’t committed something, and you need to roll back to an earlier version, you can do so with the click of a button.  

In vanilla Jupyter notebooks, you have to rely on the CLI git tool. In addition, Git is the only way of versioning your work, meaning there is no way to revert changes if you haven’t committed them.

Navigation

When you work on your project in Jupyter Notebook, you always need to navigate either within a given file or the whole project. On the other hand, the navigation functionality in PyCharm is significantly richer.

Beyond the Structure view that is also present in JupyterLab, you can also find some additional features to navigate your project in our IDEs. For example, double pressing Shift will help you find anything in your project or settings.

In addition to that, you can find the specific source of code in your project using PyCharm Professional’s Find Usages, Go to Implementation, Go to Declaration, and other useful features. 

Check out this blog post for more information about navigation in Jupyter notebooks in PyCharm.

Visualizations

In addition to libraries that are available in vanilla Jupyter notebooks, PyCharm also provides further visualization possibilities with the help of interactive tables. This means you don’t have to remember or type boilerplate code to create graphs. 

How to choose between PyCharm and vanilla Jupyter notebooks

Vanilla Jupyter notebooks are a lightweight tool. If you need to do some fast experiments, it makes sense to use this implementation. 

On the other hand, PyCharm Professional is a feature-rich IDE that simplifies the process of working with Jupyter notebooks. If you need to work with complex projects with a medium or large codebase, or you want to add a significant boost to productivity, PyCharm Professional is likely to be more suitable, allowing you to complete your data project more smoothly and quickly.

Get started with PyCharm Professional

PyCharm Professional is a data science IDE that supports Python, rich databases, Jupyter, Git, Conda, and other technologies right out of the box. Work on projects located in local or remote development environments. Whether you’re developing data pipelines, prototyping machine learning models, or analyzing data, PyCharm equips you with all the tools you need.

The best way to understand the difference between tools is to try them for yourself. We strongly recommend downloading PyCharm and testing it in your real-life projects. 

Download PyCharm Professional and get an extended 60-day trial by using the promo code “PyCharmNotebooks”. The free subscription is available for individual users only.



Categories: FLOSS Project Planets

Plasma Browser Integration 2.0

Planet KDE - Tue, 2024-09-24 07:19

I’m pleased to announce the immediate availability of Plasma Browser Integration version 2.0 on the Chrome Web Store and Microsoft Edge Add-ons page. This release updates the extension to Manifest Version 3 which will be required by Chrome soon. The major version bump reflects the amount of work it has taken to achieve this port.

Konqi surfing the world wide web

Plasma Browser Integration bridges the gap between your browser and the Plasma desktop. It lets you share links, find browser tabs and visited websites in KRunner, monitor download progress in the notification center, and control music and video playback anytime from within Plasma, or even from your phone using KDE Connect!

Despite the version number, there aren’t many user-facing changes. This release comes with the usual translation updates, however. Since this release doesn’t bring any advantage to Firefox users over the previous 1.9.1, it will not be provided on the Mozilla add-ons store.

We have taken the opportunity of reworking the extension manifest to make the “history” permission mandatory. When the browser history KRunner module was originally added, the permission was optional as we feared it might scare users away when presented with a scary permission warning when updating the extension.

(also see the Changelog Page on our Community Wiki)

Categories: FLOSS Project Planets

The Drop Times: Drupal CMS Expected by 15 Jan, XB Further Away in 2025: A Quick and Dirty Summary of Driesnote

Planet Drupal - Tue, 2024-09-24 07:08
Get ready to witness a transformative era for Drupal! In his 40th State of Drupal address, founder Dries Buytaert announced groundbreaking plans that are set to redefine the platform. With the targeted launch of Drupal CMS 1.0 on January 15, 2025, coinciding with Drupal's 25th anniversary, the platform is gearing up to become the gold standard for no-code website building. From the ambitious Starshot Project aiming to revolutionize site-building for non-developers, to the introduction of the Experience Builder (XB) that brings React into the fold, Drupal is embracing innovation like never before. Plus, with a strong commitment to responsible AI integration and a complete overhaul of its documentation, Drupal is positioning itself at the forefront of web development. Dive into how these exciting developments will shape the future of Drupal and what they mean for the global community!
Categories: FLOSS Project Planets

Data Transparency in Open Source AI: Protecting Sensitive Datasets

Open Source Initiative - Tue, 2024-09-24 06:49

The Open Source Initiative (OSI) is running a blog series to introduce some of the people who have been actively involved in the Open Source AI Definition (OSAID) co-design process. The co-design methodology allows for the integration of diverging perspectives into one just, cohesive and feasible standard. Support and contribution from a significant and broad group of stakeholders is imperative to the Open Source process and is proven to bring diverse issues to light, deliver swift outputs and garner community buy-in.

This series features the voices of the volunteers who have helped shape and are shaping the Definition.

Meet Tarunima Prabhakar

I am the research lead and co-founder at Tattle, a civic tech organization that builds citizen-centric tools and datasets to respond to inaccurate and harmful content. My broad research interests are in the intersection of technology, policy and global development. Prior to starting Tattle, I worked as a research fellow at the Center for Long-Term Cybersecurity at UC Berkeley, studying the deployment of behavioral credit scoring algorithms towards financial inclusion goals in the global majority. I’ve also been fortunate to work on award-winning ICTD and data-driven development projects with stellar non-profits. My career working in low-resource environments has turned me into an ardent advocate for Open Source development and citizen science movements.

Protecting Sensitive Datasets

I recently gave a lightning talk at IndiaFOSS about Uli, a project to co-design solutions to online gendered abuse in Indian languages. As a part of this project, we’re building and maintaining datasets that are useful for machine learning models that detect abuse. The talk highlighted the care that must be taken when choosing a license for sensitive data, and why open datasets in Open Source AI should be carefully considered.

With the Uli project, we created a dataset annotated by gender rights activists and researchers who speak Hindi, Tamil and Indian English. Then, we fine-tuned Twitter’s XLM-RoBERTa model to detect gender abuse, which we deployed as a browser plugin. When activated, the Uli plugin would redact abusive tweets from a person’s feed. Another dataset we created was of slur words in the three languages that might be used to target people. Such a list is useful not only for the Uli plugin (these words are redacted from web pages if the plugin is installed) but also for any platform needing to moderate conversations in these languages. At the time of the launch of the plugin, we chose to license the two datasets under an Open Data License (ODL). The model is hosted on Hugging Face and the code is available on GitHub.

As we have continued to maintain and grow Uli, we have reconsidered how we license the data. When thinking about how to license this data, several factors come into play. First, annotating a dataset on abuse is labor-intensive and mentally exhausting, and the expert annotators should be fairly compensated for their expertise. Second, when these datasets are used by platforms for abuse detection, it creates a potential loophole—if abusive users realize the list of flagged words is public, they can change their language to evade moderation.

These concerns have led us to think carefully about how to license the data. On one end of the spectrum, we could continue to make everything open, regardless of commercial use. On the other end, we could keep all the data closed. We’ve historically operated as an Open Source organization, and every decision we make about data access impacts how we license our machine learning models as well. We are trying to find a happy medium that lets us balance the numerous concerns: recognition of effort and effectiveness of the data on one hand, and transparency, adaptability and extensibility on the other.

As we’ve thought about different strategies for data licensing, we haven’t been sure what that would mean for the license of the machine learning models. And that’s partly because we don’t have a clear definition for what “Open Source AI” really means. 

It is for this reason that we’ve closely followed the Open Source Initiative’s (OSI) process for converging on a definition for Open Source AI. OSI has been grappling with the definition of “Open Source AI” as it pertains to the four freedoms: the freedom to use, study, modify, and share. Over the past year, the OSI has been iterating on a definition for Open Source AI, and they’ve reached a point where they propose the following:

  • Open weights: The model weights and parameters should be open.
  • Open source code: The source code used to train the system should be open.
  • Open data or transparent data: Either the dataset should be open, or there should be enough detailed information for someone to recreate the dataset.

It’s important to note that the dataset doesn’t necessarily have to be open. The departure from a stance of maximally open dataset accounts for the complexity in the collection and management of data driving real world ML applications. While frontier models need to deal with copyright and privacy concerns, many smaller projects like ours worry about the uneven power dynamics between those creating the data and the entities using it. In our specific case, opening data also reduces its efficacy.

But having struggled with papers that describe research or data without sharing the dataset itself, I also recognize that ‘enough detailed information’ might not be information enough to repeat, adapt or extend another group’s work. In the end, the question becomes: how much information about the dataset is enough to consider the model “open?” It’s a fine line, and not everyone is comfortable with OSI’s stance on this issue. For our project in particular, we are considering the option of staggered data release: older data is released under an open data license, while the newest data requires users to request access.

If you have strong opinions on this process, I encourage you to visit the OSI website and leave feedback. The OSI process is influential, and your input on open weights, open code, and their specifications around data openness could shape the future of Open Source AI.

You can learn more about the participatory process behind the Uli dataset here, and about Uli and Tattle on their respective websites. 

How to get involved

The OSAID co-design process is open to everyone interested in collaborating. There are many ways to get involved:

  • Join the forum: share your comment on the drafts.
  • Leave a comment on the latest draft: provide precise feedback on the text of the latest draft.
  • Follow the weekly recaps: subscribe to our monthly newsletter and blog to be kept up-to-date.
  • Join the town hall meetings: we’re increasing the frequency to weekly meetings where you can learn more, ask questions and share your thoughts.
  • Join the workshops and scheduled conferences: meet the OSI and other participants at in-person events around the world.
Categories: FLOSS Research

The Drop Times: Winners of the 2024 Women in Drupal Awards Announced at DrupalCon Barcelona

Planet Drupal - Tue, 2024-09-24 06:38
The 2024 Women in Drupal Awards, sponsored by JAKALA, have recognized Esmeralda Braad-Tijhoff, Pamela Barone, and Alla Petrovska for their exceptional contributions in the Define, Build, and Scale categories, respectively, at DrupalCon Barcelona.
Categories: FLOSS Project Planets

Python Software Foundation: Service Awards given by the PSF: what are they and how they differ

Planet Python - Tue, 2024-09-24 05:00

Do you know someone in the Python community who inspires you and whose contributions to the community are outstanding? Other than saying thank you (definitely do this too!), you can also nominate them to receive recognition given by the PSF. In this blog post, we will explain what each of the awards is and how they differ. We hope this will encourage you to nominate your favorite inspirational community member to receive an award!

PSF Community Service Awards

The most straightforward way to acknowledge someone’s volunteer effort serving the Python community is to nominate them for the PSF Community Service Awards (CSA). The awardee will receive:


  • A cash award of $599 USD

  • Free registration at all future PyCon US events


Recipients need not be PSF members and can receive multiple awards if they have continuous outstanding contributions. Other than individuals, there are also small organizational groups (e.g. PyCon JP Association 2021) who can receive the CSA award.


The PSF Board reviews nominations quarterly. CSA recipients will be recognized at PyCon US every year.

 

CSA Award Winners 

The PSF Community Service Awards are all about the wonderful and dedicated folks in our community, and we had to take this opportunity to show some of their faces! You can find all of the inspiring PSF CSA recipients on our CSA webpage.  

 

CSA Recipients (left to right, top): Jessica Upani, Mariatta Wijaya, Abigail Mesrenyame Dogbe, Lais Carvalho, Mason Egger
CSA Recipients (left to right, bottom): Kojo Idrissa, Tereza Iofciu, Jessica Greene, Carol Willing, Vicky Twomey-Lee

PyCon JP Association CSA Recipients (left to right): Takayuki Shimizukawa, Shunsuke Yoshida, Jonas Obrist, Manabu Terada, Takanori Suzu

PSF Distinguished Service Awards

As the highest award that the PSF bestows, the Distinguished Service Award (DSA) is a step up from the CSA described above. Recipients of a DSA need to have made significant, sustained, and exemplary contributions with an exceptionally positive impact on the Python community. Recognition will take the form of an award certificate plus a cash award of $5000 USD. As of the writing of this blog post, there are only 7 awardees of the DSA in history.


Naomi Ceder is the latest Distinguished Service Award recipient; she received the award in 2022.

PSF Fellow Membership

Although it is also a form of recognition, the PSF Fellow Membership is different from the awards above, and its level of recognition is not comparable with any of them. Fellows are members who have been nominated for their extraordinary efforts and impact upon Python, the community, and the broader Python ecosystem. Fellows are nominated from the broader community and if they meet Fellow criteria, they are elevated by a vote of the Fellows Working Group. PSF Fellows are lifetime voting members of the PSF. That means Fellows are eligible to vote in PSF elections and are subject to the membership rules and Bylaws of the PSF.

Nominate someone!

We hope this makes the types of recognition given by the PSF clear, and gives you confidence in nominating folks in the Python community who you think should be recognized with a CSA, a DSA, or as a PSF Fellow. We also hope that this will inspire you to become a Python community member who receives a service award!


Categories: FLOSS Project Planets

Specbee: A practical guide to Personalization with user personas (sample campaigns included!)

Planet Drupal - Tue, 2024-09-24 04:46
You know that moment when Google finishes your search with exactly what you were thinking? Or when you want to just Netflix and chill after a long day but you’re recommended the perfect show that completely blows your mind? That’s personalization in action! But how does this magic happen? It begins with truly understanding your audience by building user personas. From there, you segment them based on key behaviors and preferences, and finally, you deliver exactly what they’re looking for, right when they need it.

In this article, we’ll discuss how you can offer personalized content to your audience using some simple-to-implement techniques.

How to create User Personas

You probably know your audience, but do you really understand who they are and what they prefer? That’s where user personas come in.

It's possible to perform a UX research project to identify specific "Personas," and identify the needs, desires, and habits of each. Take a look at this decent write-up on Personas. From this article, here's an example of what a final "Persona" might look like:

In this example, each Persona is given a name, background, brand preferences, etc. You may have already done this at some point. It can be a helpful practice, but it takes time and a lot of corporate buy-in.

So if you think that’s not a practical choice at the moment, we’d suggest you identify your core target audiences and use technology to give your web visitors a personalized experience. For the sake of this overview, let's use the term "Segment" instead of "Persona".

Creating a personalized website experience

There are 3 basic steps to successfully manage personalization. Each comes down to answering a question:

  • Defining segments - What type of people are you targeting?
  • Connecting users to segments - Is this visitor one of those types of people?
  • Providing personalized content for each segment - What do we want to show these targets?

1. Defining Segments

At a high level, a "Segment" is simply a user type. And those can be identified in any way that you prefer. For instance, let’s imagine your company sells shoes. You might identify Segments like:

  • Casual Shoewear Enthusiast
  • Sports and Fitness Buff
  • Fashion-Conscious Shopper
  • Parents Shopping for Kids
  • Professional or Workwear Buyer
  • Seasonal Shopper
  • Discount Shopper
  • Luxury Buyer
  • Orthopedic Footwear Seeker

Chances are, you already have Segments in place. They could be from existing email list groups or sales teams managing clients by size, product, or region. These Segments can start simple ("newsletter subscribers") and grow as your efforts grow.

Task 1: Identify and define Segments that you would like to target.

2. Connect users to Segments

Once target Segments are identified, you need to determine which website visitors belong to which Segments. This is where the tech comes in. You can identify users based on their actions. These actions can include these and more:

  • Logging in
  • Filling out a form
  • Participating in a survey
  • Accessing a specific landing page
  • Following an email newsletter link
  • Clicking a Google/LinkedIn/Facebook ad

Any defined action that you track will allow you to identify the visitor's Segment and set a Cookie to maintain that identification.

Ads and marketing links will have parameters in their URLs. We can discuss more specifics here, but here's a nice general overview.

Once a visitor's Segment is identified you can present personalized content.

Task 2: Determine the specific actions that categorize web visitors into User Segments.

3. Provide personalized content for each Segment

Now that we know the User Segments (Task 1), and we know how to identify the website visitor as a member of a specific Segment (Task 2), we provide the personalized content.

Here's an example of 3 visitors (and 4 visits):

  • User A receives an email newsletter with a link related to your marketing sales campaign. They click the link and when they go to the homepage they see a marketing page banner related to the campaign - "Returning Customers New Arrivals 50% Off."
  • User B visits from a Google ad for a New Year's campaign. They click the ad, and the marketing page banner reads "2025 New Year Flash Sale".
  • User C visits your website by directly typing in the URL like a caveman would. They see the homepage with a generic marketing page banner. They fill out a form to get more information.
  • User C returns two weeks later and the homepage displays with a marketing page banner that reads, "Welcome Back User C!"

Task 3: Determine content that should be seen for each Segment.

Sample campaigns for a shoe retailer

Going with the same example of a company that retails shoes, let’s dive into some personalized campaigns you can create for your target audience. Completing the three tasks above might give us the following Campaign outlines for site personalization:

1. Sneaker Sale Campaign

Segment: Sneaker Enthusiasts

User Identification: Visitor is a member of the Segment if they took any of the following actions:

  • Clicked a Google ad for sneakers.
  • Clicked a link in an email campaign featuring sneakers.
  • Visited the "Sneakers" category page on the website.

Personalized Content:

  • Homepage banner showcasing a "Sneaker Sale: Up to 50% Off" promotion.
  • Personalized CTA offering "Exclusive Sneaker Offers" to encourage sign-ups for further deals.

2. Coastal Cities Campaign

Segment: Coastal City Customers

User Identification: Visitor is a member of the Segment if they took any of the following actions:

  • Clicked a link in an email campaign targeting customers in coastal cities (e.g., Miami, Los Angeles, New York).
  • Filled out a form with a coastal city location.
  • Visited the site from an IP address associated with coastal regions.

Personalized Content:

  • Homepage banner promoting "Summer-Ready Footwear for Coastal Living", featuring sandals, lightweight sneakers, and water-resistant shoes.
  • A "Coastal City Style Guide" featured at the top of the blog page, showing trending footwear options for beach and city life.
  • CTA below the banner offering "Exclusive Deals on Summer Styles for Coastal Shoppers", with a localized discount code for users in coastal regions.

Creating a Campaign statement might help keep things organized and aid in communication, internally and externally.

Final thoughts

While there are many ways to personalize your content, what we’ve covered today represents some of the simplest yet most effective methods. Personalization can change how you connect with your customers. The key is knowing your audience and delivering the right message at the right time. If this got you excited, know that you are not on your own here! Specbee can work with you on each of these steps, and we always recommend starting small. Talk to one of our experts today to find out how we can help.
Categories: FLOSS Project Planets

Vasudev Kamath: Note to Self: Enabling Secure Boot with UKI on Debian

Planet Debian - Tue, 2024-09-24 02:00

Note

This post is a continuation of my previous article on enabling the Unified Kernel Image (UKI) on Debian.

In this guide, we'll implement Secure Boot by taking full control of the device, removing preinstalled keys, and installing our own. For a comprehensive overview of the benefits and process, refer to this excellent post from rodsbooks.

Key Components

To implement Secure Boot, we need three essential keys:

  1. Platform Key (PK): The top-level key in Secure Boot, typically provided by the motherboard manufacturer. We'll replace the vendor-supplied PK with our own for complete control.
  2. Key Exchange Key (KEK): Used to sign updates for the Signatures Database and Forbidden Signatures Database.
  3. Database Key (DB): Used to sign or verify binaries (bootloaders, boot managers, shells, drivers, etc.).

There's also a Forbidden Signature Key (dbx), which is the opposite of the DB key. We won't be generating this key in this guide.

Preparing for Key Enrollment

Before enrolling our keys, we need to put the device in Secure Boot Setup Mode. Verify the status using the bootctl status command. You should see output similar to the following image:
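As a cross-check alongside bootctl status, the same state can be read directly from the firmware variables. The sketch below assumes efivarfs is mounted at /sys/firmware/efi/efivars and uses the standard EFI global variable GUID; the helper name efi_var_value is made up for illustration. A SetupMode value of 1 would indicate the firmware is in Setup Mode.

```shell
#!/bin/sh
# Sketch: read Secure Boot state straight from efivarfs.
# Assumes efivarfs is mounted at /sys/firmware/efi/efivars.

# Print the first data byte of an efivarfs file. Each file starts with a
# 4-byte attribute header, so the value begins at offset 4.
efi_var_value() {
    od -An -tu1 -j4 -N1 "$1" | tr -d ' \n'
}

guid=8be4df61-93ca-11d2-aa0d-00e098032b8c   # EFI global variable GUID
vars=/sys/firmware/efi/efivars

for name in SetupMode SecureBoot; do
    file="$vars/$name-$guid"
    if [ -r "$file" ]; then
        echo "$name=$(efi_var_value "$file")"
    else
        # Not booted via UEFI, or efivarfs is not mounted.
        echo "$name=unavailable"
    fi
done
```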

Generating Keys

Follow these instructions from the Arch Wiki to generate the keys manually. You'll need the efitools and openssl packages. I recommend using rsa:2048 as the key size for better compatibility with older firmware.
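For orientation, here is a minimal sketch of what the manual key generation roughly looks like, following the general pattern of the Arch Wiki instructions. The output directory, the certificate subject names, and the 10-year validity are illustrative assumptions, not values from this post. The efitools steps run only if those tools are installed.

```shell
#!/bin/sh
# Sketch: generate PK, KEK and db key pairs, then build signed .auth files.
# OUT_DIR and the "Example ..." subject names are illustrative assumptions.
set -eu

OUT_DIR="${OUT_DIR:-./secureboot-keys}"
mkdir -p "$OUT_DIR"

# One self-signed RSA-2048 certificate and unencrypted private key per role.
for name in PK KEK db; do
    openssl req -newkey rsa:2048 -nodes -keyout "$OUT_DIR/$name.key" \
        -new -x509 -sha256 -days 3650 \
        -subj "/CN=Example $name/" -out "$OUT_DIR/$name.crt" 2>/dev/null
done

# Convert certificates to EFI signature lists and sign them (needs efitools).
# PK signs itself and the KEK; the KEK signs db.
if command -v cert-to-efi-sig-list >/dev/null 2>&1 &&
   command -v sign-efi-sig-list >/dev/null 2>&1 &&
   command -v uuidgen >/dev/null 2>&1; then
    guid=$(uuidgen --random)
    for name in PK KEK db; do
        cert-to-efi-sig-list -g "$guid" "$OUT_DIR/$name.crt" "$OUT_DIR/$name.esl"
    done
    sign-efi-sig-list -g "$guid" -k "$OUT_DIR/PK.key" -c "$OUT_DIR/PK.crt" \
        PK "$OUT_DIR/PK.esl" "$OUT_DIR/PK.auth"
    sign-efi-sig-list -g "$guid" -k "$OUT_DIR/PK.key" -c "$OUT_DIR/PK.crt" \
        KEK "$OUT_DIR/KEK.esl" "$OUT_DIR/KEK.auth"
    sign-efi-sig-list -g "$guid" -k "$OUT_DIR/KEK.key" -c "$OUT_DIR/KEK.crt" \
        db "$OUT_DIR/db.esl" "$OUT_DIR/db.auth"
else
    echo "efitools or uuidgen not found; skipping .auth generation" >&2
fi
```

Keep the .key files private. Only the resulting .auth files need to be placed where systemd-boot can find them.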

After generating the keys, copy all .auth files to the /efi/loader/keys/<hostname>/ folder. For example:

❯ sudo ls /efi/loader/keys/chamunda
db.auth KEK.auth PK.auth

Signing the Bootloader

Sign the systemd-boot bootloader with your new keys:

sbsign --key <path-to db.key> --cert <path-to db.crt> \
  /usr/lib/systemd/boot/efi/systemd-bootx64.efi

Install the signed bootloader using bootctl install. The output should resemble this:

Note

If you encounter warnings about mount options, update your fstab with the `umask=0077` option for the EFI partition.

Verify the signature using sbverify:
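A signature check can be done with sbverify, which ships alongside sbsign in the sbsigntool package. The wrapper below is a hypothetical convenience helper (the name verify_efi and its exit codes are assumptions). It only adds existence checks around the real sbverify --cert call.

```shell
#!/bin/sh
# Hypothetical helper: check an EFI binary against a certificate with sbverify.
# Returns 2 if an input file is missing, 3 if sbverify is not installed,
# otherwise the exit status of sbverify itself.
verify_efi() {
    cert="$1"
    image="$2"
    if [ ! -f "$cert" ] || [ ! -f "$image" ]; then
        echo "missing certificate or image" >&2
        return 2
    fi
    if ! command -v sbverify >/dev/null 2>&1; then
        echo "sbverify not found (install sbsigntool)" >&2
        return 3
    fi
    sbverify --cert "$cert" "$image"
}
```

When sbsign is run without --output, it writes the signed binary next to the input with a .signed suffix, so a typical call would be verify_efi /path/to/db.crt /usr/lib/systemd/boot/efi/systemd-bootx64.efi.signed.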

Configuring UKI for Secure Boot

Update the /etc/kernel/uki.conf file with your key paths:

SecureBootPrivateKey=/path/to/db.key
SecureBootCertificate=/path/to/db.crt

Signing the UKI Image

On Debian, use dpkg-reconfigure to sign the UKI image for each kernel:

sudo dpkg-reconfigure linux-image-$(uname -r)
# Repeat for other kernel versions if necessary

You should see output similar to this:

sudo dpkg-reconfigure linux-image-$(uname -r)
/etc/kernel/postinst.d/dracut:
dracut: Generating /boot/initrd.img-6.10.9-amd64
Updating kernel version 6.10.9-amd64 in systemd-boot...
Signing unsigned original image
Using config file: /etc/kernel/uki.conf
+ sbverify --list /boot/vmlinuz-6.10.9-amd64
+ sbsign --key /home/vasudeva.sk/Documents/personal/secureboot/db.key --cert /home/vasudeva.sk/Documents/personal/secureboot/db.crt /tmp/ukicc7vcxhy --output /tmp/kernel-install.staging.QLeGLn/uki.efi
Wrote signed /tmp/kernel-install.staging.QLeGLn/uki.efi
/etc/kernel/postinst.d/zz-systemd-boot:
Installing kernel version 6.10.9-amd64 in systemd-boot...
Signing unsigned original image
Using config file: /etc/kernel/uki.conf
+ sbverify --list /boot/vmlinuz-6.10.9-amd64
+ sbsign --key /home/vasudeva.sk/Documents/personal/secureboot/db.key --cert /home/vasudeva.sk/Documents/personal/secureboot/db.crt /tmp/ukit7r1hzep --output /tmp/kernel-install.staging.dWVt5s/uki.efi
Wrote signed /tmp/kernel-install.staging.dWVt5s/uki.efi

Enrolling Keys in Firmware

Use systemd-boot to enroll your keys:

systemctl reboot --boot-loader-menu=0

Select the enroll option with your hostname in the systemd-boot menu.

After key enrollment, the system will reboot into the newly signed kernel. Verify with bootctl:

Dealing with Lockdown Mode

Secure Boot enables lockdown mode on distro-shipped kernels, which restricts certain features like kprobes/BPF and DKMS drivers. To avoid this, consider compiling the upstream kernel directly, which doesn't enable lockdown mode by default.

As Linus Torvalds has stated, "there is no reason to tie Secure Boot to lockdown LSM." You can read more about Torvalds' opinion on UEFI tied with lockdown.
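To see whether lockdown is actually active on a running kernel, the lockdown LSM exposes its state in securityfs. The sketch below assumes securityfs is mounted at /sys/kernel/security; the helper name active_lockdown_mode is made up for illustration. The file lists the available modes with the active one in square brackets.

```shell
#!/bin/sh
# Sketch: report the kernel's current lockdown mode.

# Extract the active mode (the bracketed word) from the lockdown status text,
# e.g. "none [integrity] confidentiality".
active_lockdown_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

lockdown_file=/sys/kernel/security/lockdown
if [ -r "$lockdown_file" ]; then
    echo "lockdown mode: $(active_lockdown_mode "$(cat "$lockdown_file")")"
else
    echo "lockdown status not exposed by this kernel"
fi
```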

Next Steps

One thing that remains is automating the signing of systemd-boot on upgrade, which is currently a manual process. I'm exploring dpkg triggers for achieving this, and if I succeed, I will write a new post with details.

Acknowledgments

Special thanks to my anonymous colleague who provided invaluable assistance throughout this process.

Categories: FLOSS Project Planets

parallel @ Savannah: GNU Parallel 20240922 ('Gold Apollo AR924') released

GNU Planet! - Mon, 2024-09-23 16:49

GNU Parallel 20240922 ('Gold Apollo AR924') has been released. It is available for download at: lbry://@GnuParallel:4

Quote of the month:

  Recently executed a flawless live data migration of ~2.4pb using GNU parallel for scale and bash scripts.
    -- @mechanicker@twitter Dhruva

New in this release:

  • --fast disables a lot of functionality to speed up running jobs.
  • Bug fixes and man page updates.

News about GNU Parallel:


GNU Parallel - For people who live life in the parallel lane.

If you like GNU Parallel record a video testimonial: Say who you are, what you use GNU Parallel for, how it helps you, and what you like most about it. Include a command that uses GNU Parallel if you feel like it.

About GNU Parallel

GNU Parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU Parallel can then split the input and pipe it into commands in parallel.

If you use xargs and tee today you will find GNU Parallel very easy to use as GNU Parallel is written to have the same options as xargs. If you write loops in shell, you will find GNU Parallel may be able to replace most of the loops and make them run faster by running several jobs in parallel. GNU Parallel can even replace nested loops.
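As a small illustration of replacing a shell loop, the sketch below compresses a few throwaway log files. The file names and the temporary directory are made up for the example. It uses GNU Parallel when it is installed and falls back to the equivalent sequential loop otherwise.

```shell
#!/bin/sh
# Sketch: the same job written as a GNU Parallel call and as a plain loop.
work_dir=$(mktemp -d)
for i in 1 2 3; do
    echo "line $i" > "$work_dir/file$i.log"
done

# Check specifically for GNU Parallel; other tools also install a
# "parallel" binary with a different syntax.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # One gzip job per file, several running at once.
    parallel gzip ::: "$work_dir"/*.log
else
    # Sequential equivalent.
    for f in "$work_dir"/*.log; do
        gzip "$f"
    done
fi

ls "$work_dir"
```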

GNU Parallel makes sure output from the commands is the same output as you would get had you run the commands sequentially. This makes it possible to use output from GNU Parallel as input for other programs.

For example you can run this to convert all jpeg files into png and gif files and have a progress bar:

  parallel --bar convert {1} {1.}.{2} ::: *.jpg ::: png gif

Or you can generate big, medium, and small thumbnails of all jpeg files in sub dirs:

  find . -name '*.jpg' |
    parallel convert -geometry {2} {1} {1//}/thumb{2}_{1/} :::: - ::: 50 100 200

You can find more about GNU Parallel at: http://www.gnu.org/s/parallel/

You can install GNU Parallel in just 10 seconds with:

    $ (wget -O - pi.dk/3 || lynx -source pi.dk/3 || curl pi.dk/3/ || \
       fetch -o - http://pi.dk/3 ) > install.sh
    $ sha1sum install.sh | grep 883c667e01eed62f975ad28b6d50e22a
    12345678 883c667e 01eed62f 975ad28b 6d50e22a
    $ md5sum install.sh | grep cc21b4c943fd03e93ae1ae49e28573c0
    cc21b4c9 43fd03e9 3ae1ae49 e28573c0
    $ sha512sum install.sh | grep ec113b49a54e705f86d51e784ebced224fdff3f52
    79945d9d 250b42a4 2067bb00 99da012e c113b49a 54e705f8 6d51e784 ebced224
    fdff3f52 ca588d64 e75f6033 61bd543f d631f592 2f87ceb2 ab034149 6df84a35
    $ bash install.sh

Watch the intro video on http://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Walk through the tutorial (man parallel_tutorial). Your command line will love you for it.

When using programs that use GNU Parallel to process data for publication please cite:

O. Tange (2018): GNU Parallel 2018, March 2018, https://doi.org/10.5281/zenodo.1146014.

If you like GNU Parallel:

  • Give a demo at your local user group/team/colleagues
  • Post the intro videos on Reddit/Diaspora*/forums/blogs/Identi.ca/Google+/Twitter/Facebook/Linkedin/mailing lists
  • Request or build a package for your favourite distribution (if it is not already there)

  • Invite me for your next conference

If you use programs that use GNU Parallel for research:

  • Please cite GNU Parallel in your publications (use --citation)

If GNU Parallel saves you money:

About GNU SQL

GNU sql aims to give a simple, unified interface for accessing databases through all the different databases' command line clients. So far the focus has been on giving a common way to specify login information (protocol, username, password, hostname, and port number), size (database and table size), and running queries.

The database is addressed using a DBURL. If commands are left out you will get that database's interactive shell.

When using GNU SQL for a publication please cite:

O. Tange (2011): GNU SQL - A Command Line Tool for Accessing Different Databases Using DBURLs, ;login: The USENIX Magazine, April 2011:29-32.

About GNU Niceload

GNU niceload slows down a program when the computer load average (or other system activity) is above a certain limit. When the limit is reached the program will be suspended for some time. If the limit is a soft limit the program will be allowed to run for short amounts of time before being suspended again. If the limit is a hard limit the program will only be allowed to run when the system is below the limit.

Categories: FLOSS Project Planets

MidCamp - Midwest Drupal Camp: Join us to help plan MidCamp 2025

Planet Drupal - Mon, 2024-09-23 15:24

Please join us for our first MidCamp 2025 planning meeting!

Why come?

Because we value giving back to the Drupal community and this is one way you can do that.

What should I expect?

That's mostly up to you -- there are a lot of roles and skillsets needed to put on a conference like MidCamp. Regardless of what you do day-to-day, you can find a fit. Everything from 

What if I don't live in Chicago?

That's OK! The planning of things is done remotely. A good portion of the planning team doesn't live in or near Chicago. People join because they care about Drupal and want to help make MidCamp happen.

How to join?

We'll share the Zoom link via Meetup. We also welcome you to join the #midcamp-organizers channel on our Slack team: https://mid.camp/slack

Categories: FLOSS Project Planets

mark.ie: Need to hire Drupal developers? I can help you

Planet Drupal - Mon, 2024-09-23 15:00

Today I launched a new service, matching available Drupal developers with recruiters and agencies that are hiring.

Categories: FLOSS Project Planets

Jonathan McDowell: The (lack of a) return-to-office conspiracy

Planet Debian - Mon, 2024-09-23 13:31

During COVID companies suddenly found themselves able to offer remote working where it hadn’t previously been on offer. That’s changed over the past 2 or so years, with most places I’m aware of moving back from a fully remote situation to either some sort of hybrid, or even full time office attendance. For example last week Amazon announced a full return to office, having already pulled remote-hired workers in for 3 days a week.

I’ve seen a lot of folk stating they’ll never work in an office again, and that RTO is insanity. Despite being lucky enough to work fully remotely (for a role I’d been approached about before, but was never prepared to relocate for), I feel the objections from those who are pro-remote often fail to consider the nuances involved. So let’s talk about some of the reasons why companies might want to enforce some sort of RTO.

Real estate value

Let’s clear this one up first. It’s not about real estate value, for most companies. City planners and real estate investors might care, but even if your average company owned their building they’d close it in an instant all other things being equal. An unoccupied building costs a lot less to maintain. And plenty of companies rent and would save money even if there’s a substantial exit fee.

Occupancy levels

That said, once you have anyone in the building the equation changes. If you’re having to provide power, heating, internet, security/front desk staff etc, you want to make sure you’re getting your money’s worth. There’s no point heating a building that can seat 100 for only 10 people present. One option is to downsize the building, but that leads to not being able to assign everyone a desk, for example. No one I know likes hot desking. There are also scheduling problems about ensuring there are enough desks for everyone who might turn up on a certain day, and you’ve ruled out the option of company/office wide events.

Coexistence builds relationships

As a remote worker I wish it wasn’t true that most people find it easier to form relationships in person, but it is. Some of this can be worked on with specific “teambuilding” style events, rather than in office working, but I know plenty of folk who hate those as much as they hate the idea of being in the office. I am lucky in that I work with a bunch of folk who are terminally online, so it’s much easier to have those casual conversations even being remote, but I also accept I miss out on some things because I’m just not in the office regularly enough. You might not care about this (“I just need to put my head down and code, not talk to people”), but don’t discount it as a valid reason why companies might want their workers to be in the office. This often matters even more for folk at the start of their career, where having a bunch of experienced folk around to help them learn and figure things out ends up working much better in person (my first job offered to let me go mostly remote when I moved to Norwich, but I said no as I knew I wasn’t ready for it yet).

Coexistence allows for unexpected interactions

People hate the phrase “water cooler chat”, and I get that, but it covers the idea of casual conversations that just won’t happen the same way when people are remote. I experienced this while running Black Cat; every time Simon and I met up in person we had a bunch of useful conversations even though we were on IRC together normally, and had a VoIP setup that meant we regularly talked too. Equally when I was at Nebulon there were conversations I overheard in the office where I was able to correct a misconception or provide extra context. Some of this can be replicated with the right online chat culture, but I’ve found many places end up with folk taking conversations to DMs, or they happen in “private” channels. It happens more naturally in an office environment.

It’s easier for bad managers to manage bad performers

Again, this falls into the category of things that shouldn’t be true, but are. Remote working has increased the ability for people who want to slack off to do so without being easily detected. Ideally what you want is that these folk, if they fail to perform, are then performance managed out of the organisation. That’s hard though; workers (rightly) have a bunch of rights (I’m writing from a UK perspective) around the procedure that needs to be followed. Managers need organisational support in this to make sure they get it right (and folk are given a chance to improve), and that support is often lacking.

Summary

Look, I get there are strong reasons why offering remote is a great thing from the company perspective, but what I’ve tried to outline here is that a return-to-office mandate can have some compelling reasons behind it too. Some of those might be things that wouldn’t exist in an ideal world, but unfortunately fixing them is a bigger issue than just changing where folk work from. Not acknowledging that just makes any reaction against office work seem ill-informed, to me.

Categories: FLOSS Project Planets
