FLOSS Project Planets

Acquia Developer Portal Blog: Drupal 11 Preparation Checklist

Planet Drupal - Wed, 2024-07-24 13:15

Drupal 11 is here! But don’t panic, the new Drupal 10 support model means you are not under pressure to upgrade. Drupal 10 will continue to be supported until mid-to-late 2026. But as we know, it’s best to be prepared and understand the upgrade process when that time comes for your organization.

Similar to the upgrade from Drupal 9 to Drupal 10, the latest version of Drupal 10 - Drupal 10.3.1 - defines all the deprecated code for Drupal 11. Also like previous modern Drupal major version upgrades, there is a recommended set of areas to focus on in order to get your applications upgraded as cleanly as possible.

Categories: FLOSS Project Planets

Acquia Developer Portal Blog: The Power of Drupal 11

Planet Drupal - Wed, 2024-07-24 13:15

Welcome, Drupal 11! At a previous DrupalCon Portland, Drupal’s creator Dries Buytaert introduced the goal of making Drupal the preferred tool for ambitious site builders on the open web. Recently, Dries shared an updated plan for Drupal 11, which has three major focus areas:

Categories: FLOSS Project Planets

PyPy: Abstract interpretation in the Toy Optimizer

Planet Python - Wed, 2024-07-24 10:48

This is a cross-post from Max Bernstein from his excellent blog, where he writes about programming languages, compilers, optimizations, and virtual machines. He's looking for a (dynamic language runtime or compiler related) job too.

CF Bolz-Tereick wrote some excellent posts in which they introduce a small IR and optimizer and extend it with allocation removal. We also did a live stream together in which we did some more heap optimizations.

In this blog post, I'm going to write a small abstract interpreter for the Toy IR and then show how we can use it to do some simple optimizations. It assumes that you are familiar with the little IR, which I have reproduced unchanged in a GitHub Gist.

Abstract interpretation is a general framework for efficiently computing properties that must be true for all possible executions of a program. It's a widely used approach both in compiler optimizations as well as offline static analysis for finding bugs. I'm writing this post to pave the way for CF's next post on proving abstract interpreters correct for range analysis and known bits analysis inside PyPy.

Before we begin, I want to note a couple of things:

  • The Toy IR is in SSA form, which means that every variable is defined exactly once. This means that abstract properties of each variable are easy to track.
  • The Toy IR represents a linear trace without control flow, meaning we won't talk about meet/join or fixpoints. They only make sense if the IR has a notion of conditional branches or back edges (loops).

Alright, let's get started.

Welcome to abstract interpretation

Abstract interpretation means a couple different things to different people. There's rigorous mathematical formalism thanks to Patrick and Radhia Cousot, our favorite power couple, and there's also sketchy hand-wavy stuff like what will follow in this post. In the end, all people are trying to do is reason about program behavior without running it.

In particular, abstract interpretation is an over-approximation of the behavior of a program. Correctly implemented abstract interpreters never lie, but they might be a little bit pessimistic. This is because instead of using real values and running the program---which would produce a concrete result and some real-world behavior---we "run" the program with a parallel universe of abstract values. This abstract run gives us information about all possible runs of the program.1

Abstract values always represent sets of concrete values. Instead of literally storing a set (in the world of integers, for example, it could get pretty big...there are a lot of integers), we group them into a finite number of named subsets.2

Let's learn a little about abstract interpretation with an example program and example abstract domain. Here's the example program:

v0 = 1
v1 = 2
v2 = add(v0, v1)

And our abstract domain is "is the number positive" (where "positive" means nonnegative, but I wanted to keep the words distinct):

         top
        /   \
positive     negative
        \   /
        bottom

The special top value means "I don't know" and the special bottom value means "empty set" or "unreachable". The positive and negative values represent the sets of all positive and negative numbers, respectively.

We initialize all the variables v0, v1, and v2 to bottom and then walk our IR, updating our knowledge as we go.

# here
v0:bottom = 1
v1:bottom = 2
v2:bottom = add(v0, v1)

In order to do that, we have to have transfer functions for each operation. For constants, the transfer function is easy: determine if the constant is positive or negative. For other operations, we have to define a function that takes the abstract values of the operands and returns the abstract value of the result.

In order to be correct, transfer functions for operations have to be compatible with the behavior of their corresponding concrete implementations. You can think of them as having an implicit universal quantifier forall in front of them.

Let's step through the constants at least:

v0:positive = 1
v1:positive = 2
# here
v2:bottom = add(v0, v1)

Now we need to figure out the transfer function for add. It's kind of tricky right now because we haven't specified our abstract domain very well. I keep saying "numbers", but what kinds of numbers? Integers? Real numbers? Floating point? Some kind of fixed-width bit vector (int8, uint32, ...) like an actual machine "integer"?

For this post, I am going to use the mathematical definition of integer, which means that the values are not bounded in size and therefore do not overflow. Actual hardware memory constraints aside, this is kind of like a Python int.

So let's look at what happens when we add two abstract numbers:

add      | top    | positive | negative | bottom
---------|--------|----------|----------|-------
top      | top    | top      | top      | bottom
positive | top    | positive | top      | bottom
negative | top    | top      | negative | bottom
bottom   | bottom | bottom   | bottom   | bottom

As an example, let's try to add two numbers a and b, where a is positive and b is negative. We don't know anything about their values other than their signs. They could be 5 and -3, where the result is 2, or they could be 1 and -100, where the result is -99. This is why we can't say anything about the result of this operation and have to return top.

The short of this table is that we only really know the result of an addition if both operands are positive or both operands are negative. Thankfully, in this example, both operands are known positive. So we can learn something about v2:

v0:positive = 1
v1:positive = 2
v2:positive = add(v0, v1)
# here

This may not seem useful in isolation, but analyzing more complex programs even with this simple domain may be able to remove checks such as if (v2 < 0) { ... }.
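To make the table concrete, here is a minimal sketch of this transfer function in Python, along with a randomized spot-check of the soundness condition described above. The Sign class and all names are mine, for illustration only; they mirror the Parity class that appears later in this post.

import random

class Sign:
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.name

TOP = Sign("top")
POSITIVE = Sign("positive")   # "positive" means >= 0 here, as in the post
NEGATIVE = Sign("negative")
BOTTOM = Sign("bottom")

def sign_add(a, b):
    # Transcription of the table: we only know the result's sign when
    # both operands have the same known sign.
    if a is BOTTOM or b is BOTTOM:
        return BOTTOM
    if a is b and a is not TOP:
        return a
    return TOP

def member(n, s):
    # Concretization: is the concrete number n in the set of numbers
    # that the abstract value s stands for?
    if s is TOP:
        return True
    if s is POSITIVE:
        return n >= 0
    if s is NEGATIVE:
        return n < 0
    return False  # BOTTOM stands for the empty set

# Soundness spot-check: the concrete sum must always land in the set
# described by the abstract result.
for _ in range(1000):
    n, m = random.randint(-100, 100), random.randint(-100, 100)
    a = POSITIVE if n >= 0 else NEGATIVE
    b = POSITIVE if m >= 0 else NEGATIVE
    assert member(n + m, sign_add(a, b))

A passing check is evidence, not a proof---that is what CF's follow-up post on proving abstract interpreters correct is about.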

Let's take a look at another example using a sample absval (absolute value) IR operation:

v0 = getarg(0)
v1 = getarg(1)
v2 = absval(v0)
v3 = absval(v1)
v4 = add(v2, v3)
v5 = absval(v4)

Even though we have no constant/concrete values, we can still learn something about the states of values throughout the program. Since we know that absval always returns a positive number, we learn that v2, v3, and v4 are all positive. This means that we can optimize out the absval operation on v5:

v0:top = getarg(0)
v1:top = getarg(1)
v2:positive = absval(v0)
v3:positive = absval(v1)
v4:positive = add(v2, v3)
v5:positive = v4

Other interesting lattices include:

  • Constants (where the middle row is pretty wide)
  • Range analysis (bounds on min and max of a number)
  • Known bits (using a bitvector representation of a number, which bits are always 0 or 1)

For the rest of this blog post, we are going to do a very limited version of "known bits", called parity. This analysis only tracks the least significant bit of a number, which indicates if it is even or odd.

Parity

The lattice is pretty similar to the positive/negative lattice:

       top
      /   \
   even     odd
      \   /
     bottom

Let's define a data structure to represent this in Python code:

class Parity:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name

And instantiate the members of the lattice:

TOP = Parity("top") EVEN = Parity("even") ODD = Parity("odd") BOTTOM = Parity("bottom")

Now let's write a forward flow analysis of a basic block using this lattice. We'll do that by assuming that a method on Parity is defined for each IR operation. For example, Parity.add, Parity.lshift, etc.

def analyze(block: Block) -> None:
    parity = {v: BOTTOM for v in block}

    def parity_of(value):
        if isinstance(value, Constant):
            return Parity.const(value)
        return parity[value]

    for op in block:
        transfer = getattr(Parity, op.name)
        args = [parity_of(arg.find()) for arg in op.args]
        parity[op] = transfer(*args)

For every operation, we compute the abstract value---the parity---of the arguments and then call the corresponding method on Parity to get the abstract result.

We need to special case Constants due to a quirk of how the Toy IR is constructed: the constants don't appear in the instruction stream and instead are free-floating.

Let's start by looking at the abstraction function for concrete values---constants:

class Parity:
    # ...
    @staticmethod
    def const(value):
        if value.value % 2 == 0:
            return EVEN
        else:
            return ODD

Seems reasonable enough. Let's pause on operations for a moment and consider an example program:

v0 = getarg(0)
v1 = getarg(1)
v2 = lshift(v0, 1)
v3 = lshift(v1, 1)
v4 = add(v2, v3)
v5 = dummy(v4)

This function (which is admittedly a little contrived) takes two inputs, shifts each left by one bit, and adds the results. It then passes the sum into a dummy function, which you can think of as "return" or "escape".

To do some abstract interpretation on this program, we'll need to implement the transfer functions for lshift and add (dummy will just always return TOP). We'll start with add. Remember that adding two even numbers returns an even number, adding two odd numbers returns an even number, and mixing even and odd returns an odd number.

class Parity:
    # ...
    def add(self, other):
        if self is BOTTOM or other is BOTTOM:
            return BOTTOM
        if self is TOP or other is TOP:
            return TOP
        if self is EVEN and other is EVEN:
            return EVEN
        if self is ODD and other is ODD:
            return EVEN
        return ODD

We also need to fill in the other cases where the operands are top or bottom. In this case, they are both "contagious"; if either operand is bottom, the result is as well. If neither is bottom but either operand is top, the result is as well.

Now let's look at lshift. Shifting any number left by a non-zero number of bits will always result in an even number, but we need to be careful about the zero case! Shifting by zero doesn't change the number at all. Unfortunately, since our lattice has no notion of zero, we have to over-approximate here:

class Parity:
    # ...
    def lshift(self, other):
        # self << other
        if other is ODD:
            return EVEN
        return TOP

This means that we will miss some opportunities to optimize, but it's a tradeoff that's just part of the game. (We could also add more elements to our lattice, but that's a topic for another day.)

Now, if we run our abstract interpretation, we'll collect some interesting properties about the program. If we temporarily hack on the internals of bb_to_str, we can print out parity information alongside the IR operations:

v0:top = getarg(0)
v1:top = getarg(1)
v2:even = lshift(v0, 1)
v3:even = lshift(v1, 1)
v4:even = add(v2, v3)
v5:top = dummy(v4)

This is pretty awesome, because we can see that v4, the result of the addition, is always even. Maybe we can do something with that information.

Optimization

One way that a program might check if a number is odd is by checking the least significant bit. This is a common pattern in C code, where you might see code like y = x & 1. Let's introduce a bitand IR operation that acts like the & operator in C/Python. Here is an example of its use in our program:

v0 = getarg(0)
v1 = getarg(1)
v2 = lshift(v0, 1)
v3 = lshift(v1, 1)
v4 = add(v2, v3)
v5 = bitand(v4, 1)  # new!
v6 = dummy(v5)

We'll hold off on implementing the transfer function for it---that's left as an exercise for the reader---and instead do something different.

Instead, we'll see if we can optimize operations of the form bitand(X, 1). If we statically know the parity as a result of abstract interpretation, we can replace the bitand with a constant 0 or 1.

We'll first modify the analyze function (and rename it) to return a new Block containing optimized instructions:

def simplify(block: Block) -> Block:
    parity = {v: BOTTOM for v in block}

    def parity_of(value):
        if isinstance(value, Constant):
            return Parity.const(value)
        return parity[value]

    result = Block()
    for op in block:
        # TODO: Optimize op
        # Emit
        result.append(op)
        # Analyze
        transfer = getattr(Parity, op.name)
        args = [parity_of(arg.find()) for arg in op.args]
        parity[op] = transfer(*args)
    return result

We're approaching this the way that PyPy does things under the hood, which is all in roughly a single pass. It tries to optimize an instruction away, and if it can't, it copies it into the new block.

Now let's add in the bitand optimization. It's mostly some gross-looking pattern matching that checks if the right hand side of a bitwise and operation is 1 (TODO: the left hand side, too). CF had some neat ideas on how to make this more ergonomic, which I might save for later.3

Then, if we know the parity, optimize the bitand into a constant.

def simplify(block: Block) -> Block:
    parity = {v: BOTTOM for v in block}

    def parity_of(value):
        if isinstance(value, Constant):
            return Parity.const(value)
        return parity[value]

    result = Block()
    for op in block:
        # Try to simplify
        if isinstance(op, Operation) and op.name == "bitand":
            arg = op.arg(0)
            mask = op.arg(1)
            if isinstance(mask, Constant) and mask.value == 1:
                if parity_of(arg) is EVEN:
                    op.make_equal_to(Constant(0))
                    continue
                elif parity_of(arg) is ODD:
                    op.make_equal_to(Constant(1))
                    continue
        # Emit
        result.append(op)
        # Analyze
        transfer = getattr(Parity, op.name)
        args = [parity_of(arg.find()) for arg in op.args]
        parity[op] = transfer(*args)
    return result

Remember: because we use union-find to rewrite instructions in the optimizer (make_equal_to), later uses of the same instruction get the new optimized version "for free" (find).
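If you haven't read the earlier Toy Optimizer posts, here is a rough sketch of the union-find forwarding idea, with names matching the ones used above; the real Value/Operation classes in the Gist differ in their details.

class Value:
    def __init__(self):
        self.forwarded = None

    def find(self):
        # Chase the forwarding pointers to the current representative.
        op = self
        while op.forwarded is not None:
            op = op.forwarded
        return op

    def make_equal_to(self, value):
        # Point the representative at `value`; every later find() on
        # this value (or any alias of it) now answers `value`.
        self.find().forwarded = value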

Let's see how it works on our IR:

v0 = getarg(0)
v1 = getarg(1)
v2 = lshift(v0, 1)
v3 = lshift(v1, 1)
v4 = add(v2, v3)
v6 = dummy(0)

Hey, neat! bitand disappeared and the argument to dummy is now the constant 0 because we know the lowest bit.
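By the way, if you want to check your answer to the bitand exercise from earlier, here is one possible transfer function over the parity lattice (my sketch, not from the original post). The key fact is that the lowest bit of a & b is the logical AND of the lowest bits:

class Parity:
    # ...
    def bitand(self, other):
        if self is BOTTOM or other is BOTTOM:
            return BOTTOM
        if self is EVEN or other is EVEN:
            return EVEN  # a zero lowest bit on either side wins
        if self is ODD and other is ODD:
            return ODD
        return TOP       # odd/top or top/top: the lowest bit is unknown

Note that when the rewrite doesn't fire (for example, if the argument's parity is top), simplify falls through to the analysis step and looks up getattr(Parity, "bitand"), so a method like this is needed for those programs.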

Wrapping up

Hopefully you have gained a little bit of an intuitive understanding of abstract interpretation. Last year, being able to write some code made me more comfortable with the math. Now being more comfortable with the math is helping me write the code. It's a nice upward spiral.

The two abstract domains we used in this post are simple and not very useful in practice, but it's possible to get very far using slightly more complicated abstract domains. Common domains include: constant propagation, type inference, range analysis, effect inference, liveness, etc. For example, here is a sample lattice for constant propagation:

"-inf"; bottom -> "-2"; bottom -> "-1"; bottom -> 0; bottom -> 1; bottom -> 2; bottom -> "+inf"; "-inf" -> negative; "-2" -> negative; "-1" -> negative; 0 -> top; 1 -> nonnegative; 2 -> nonnegative; "+inf" -> nonnegative; negative -> nonzero; nonnegative -> nonzero; nonzero->top; {rank=same; "-inf"; "-2"; "-1"; 0; 1; 2; "+inf"} {rank=same; nonnegative; negative;} } -->

It has multiple levels to indicate more and less precision. For example, you might learn that a variable is either 1 or 2 and be able to encode that as nonnegative instead of just going straight to top.
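Although the linear traces in this post never need it, the "either 1 or 2" example is exactly what a join (least upper bound) computes. Here is a hedged sketch of a join on a simplified version of that lattice; the names and the string encoding are mine:

def generalize(value):
    # Map a concrete constant to its nearest named superset, following
    # the edges of the diagram above.
    if value < 0:
        return "negative"
    if value > 0:
        return "nonnegative"
    return "top"  # 0 sits directly below top in this lattice

def join(a, b):
    # Least upper bound of two lattice elements (constants are ints,
    # named sets are strings).
    if a == b:
        return a
    if a == "bottom":
        return b
    if b == "bottom":
        return a
    if isinstance(a, int):
        a = generalize(a)
    if isinstance(b, int):
        b = generalize(b)
    if a == b:
        return a
    if {a, b} <= {"negative", "nonnegative", "nonzero"}:
        return "nonzero"
    return "top"

print(join(1, 2))   # nonnegative
print(join(-1, 1))  # nonzero
print(join(0, 1))   # top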

Check out some real-world abstract interpretation in open source projects:

If you have some readable examples, please share them so I can add them here.

Acknowledgements

Thank you to CF Bolz-Tereick for the toy optimizer and helping edit this post!

  1. In the words of abstract interpretation researchers Vincent Laviron and Francesco Logozzo in their paper Refining Abstract Interpretation-based Static Analyses with Hints (APLAS 2009):

    The three main elements of an abstract interpretation are: (i) the abstract elements ("which properties am I interested in?"); (ii) the abstract transfer functions ("which is the abstract semantics of basic statements?"); and (iii) the abstract operations ("how do I combine the abstract elements?").

    We don't have any of these "abstract operations" in this post because there's no control flow, but you can read about them elsewhere! 

  2. These abstract values are arranged in a lattice, a mathematical structure with several properties; the most important are that it has a top, a bottom, a partial order, and a meet operation, and that values can only move in one direction on the lattice.

    Using abstract values from a lattice promises two things:

    • The analysis will terminate
    • The analysis will be correct for any run of the program, not just one sample run

  3. Something about __match_args__ and @property... 

Categories: FLOSS Project Planets

FSF Events: Free Software Directory meeting on IRC: Friday, July 26, starting at 12:00 EDT (16:00 UTC)

GNU Planet! - Wed, 2024-07-24 10:45
Join the FSF and friends on Friday, July 26 from 12:00 to 15:00 EDT (16:00 to 19:00 UTC) to help improve the Free Software Directory.
Categories: FLOSS Project Planets

Tag1 Consulting: Drupal Workspaces: A Game-changer for Site Wide Content Staging

Planet Drupal - Wed, 2024-07-24 10:02

Join us as Andrei Mateescu demonstrates the Workspaces module's powerful capabilities for enterprise-level Drupal sites. Discover how the module allows preview and management of extensive content changes and integrates with core functionalities like translations and Layout Builder. Although currently labeled experimental, Workspaces is already in use in production environments and will become a stable part of Drupal Core.

Categories: FLOSS Project Planets

Real Python: Hugging Face Transformers: Leverage Open-Source AI in Python

Planet Python - Wed, 2024-07-24 10:00

Transformers is a powerful Python library created by Hugging Face that allows you to download, manipulate, and run thousands of pretrained, open-source AI models. These models cover multiple tasks across modalities like natural language processing, computer vision, audio, and multimodal learning. Using pretrained open-source models can reduce costs, save the time needed to train models from scratch, and give you more control over the models you deploy.

In this tutorial, you’ll learn how to:

  • Navigate the Hugging Face ecosystem
  • Download, run, and manipulate models with Transformers
  • Speed up model inference with GPUs

Throughout this tutorial, you’ll gain a conceptual understanding of Hugging Face’s AI offerings and learn how to work with the Transformers library through hands-on examples. When you finish, you’ll have the knowledge and tools you need to start using models for your own use cases. Before starting, you’ll benefit from having an intermediate understanding of Python and popular deep learning libraries like pytorch and tensorflow.

Get Your Code: Click here to download the free sample code that shows you how to use Hugging Face Transformers to leverage open-source AI in Python.

Take the Quiz: Test your knowledge with our interactive “Hugging Face Transformers” quiz. You’ll receive a score upon completion to help you track your learning progress:

Interactive Quiz

Hugging Face Transformers

In this quiz, you'll test your understanding of the Hugging Face Transformers library. This library is a popular choice for working with transformer models in natural language processing tasks, computer vision, and other machine learning applications.

The Hugging Face Ecosystem

Before using Transformers, you’ll want to have a solid understanding of the Hugging Face ecosystem. In this first section, you’ll briefly explore everything that Hugging Face offers with a particular emphasis on model cards.

Exploring Hugging Face

Hugging Face is a hub for state-of-the-art AI models. It’s primarily known for its wide range of open-source transformer-based models that excel in natural language processing (NLP), computer vision, and audio tasks. The platform offers several resources and services that cater to developers, researchers, businesses, and anyone interested in exploring AI models for their own use cases.

There’s a lot you can do with Hugging Face, but the primary offerings can be broken down into a few categories:

  • Models: Hugging Face hosts a vast repository of pretrained AI models that are readily accessible and highly customizable. This repository is called the Model Hub, and it hosts models covering a wide range of tasks, including text classification, text generation, translation, summarization, speech recognition, image classification, and more. The platform is community-driven and allows users to contribute their own models, which facilitates a diverse and ever-growing selection.

  • Datasets: Hugging Face has a library of thousands of datasets that you can use to train, benchmark, and enhance your models. These range from small-scale benchmarks to massive, real-world datasets that encompass a variety of domains, such as text, image, and audio data. Like the Model Hub, 🤗 Datasets supports community contributions and provides the tools you need to search, download, and use data in your machine learning projects.

  • Spaces: Spaces allows you to deploy and share machine learning applications directly on the Hugging Face website. This service supports a variety of frameworks and interfaces, including Streamlit, Gradio, and Jupyter notebooks. It is particularly useful for showcasing model capabilities, hosting interactive demos, or for educational purposes, as it allows you to interact with models in real time.

  • Paid offerings: Hugging Face also offers several paid services for enterprises and advanced users. These include the Pro Account, the Enterprise Hub, and Inference Endpoints. These solutions offer private model hosting, advanced collaboration tools, and dedicated support to help organizations scale their AI operations effectively.

These resources empower you to accelerate your AI projects and encourage collaboration and innovation within the community. Whether you’re a novice looking to experiment with pretrained models, or an enterprise seeking robust AI solutions, Hugging Face offers tools and platforms that cater to a wide range of needs.

This tutorial focuses on Transformers, a Python library that lets you run just about any model in the Model Hub. Before using transformers, you’ll need to understand what model cards are, and that’s what you’ll do next.

Understanding Model Cards

Model cards are the core components of the Model Hub, and you’ll need to understand how to search and read them to use models in Transformers. Model cards are nothing more than files that accompany each model to provide useful information. You can search for the model card you’re looking for on the Models page:

Hugging Face Models page

On the left side of the Models page, you can search for model cards based on the task you’re interested in. For example, if you’re interested in zero-shot text classification, you can click the Zero-Shot Classification button under the Natural Language Processing section:

Hugging Face Models page filtered for zero-shot text classification models

In this search, you can see 266 different models for zero-shot text classification, a paradigm where language models assign labels to text without explicit training or seeing any labeled examples. In the upper-right corner, you can sort the search results based on model likes, downloads, creation dates, updated dates, and popularity trends.
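As a quick taste of what these models do before we look at model cards, here's a minimal sketch that runs one of them through the Transformers pipeline API; the input text and candidate labels are invented for the example:

from transformers import pipeline

# Download (on first use) and wrap a zero-shot classification model.
classifier = pipeline(
    "zero-shot-classification",
    model="facebook/bart-large-mnli",
)

result = classifier(
    "The new update made the app much faster and easier to use.",
    candidate_labels=["praise", "complaint", "question"],
)
print(result["labels"][0])  # the highest-scoring label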

Each model card button tells you the model’s task, when it was last updated, and how many downloads and likes it has. When you click a model card button, say the one for the facebook/bart-large-mnli model, the model card will open and display all of the model’s information:

A Hugging Face model card

Even though a model card can display just about anything, Hugging Face has outlined the information that a good model card should provide. This includes detailed information about the model, its uses and limitations, the training parameters and experiment details, the dataset used to train the model, and the model’s evaluation performance.

A high-quality model card also includes metadata such as the model’s license, references to the training data, and links to research papers that describe the model in detail. In some model cards, you’ll also get to tinker with a deployed instance of the model via the Inference API. You can see an example of this in the facebook/bart-large-mnli model card:

Tinker with Hugging Face models using the Inference API

Read the full article at https://realpython.com/huggingface-transformers/ »


Categories: FLOSS Project Planets

The Drop Times: 5 Basic Rules to Keep your Website Dependencies Secure

Planet Drupal - Wed, 2024-07-24 08:31
In web security, maintaining a secure Drupal site involves more than just core updates—it's about managing a complex web of dependencies. In this article, Grzegorz Pietrzak explores the critical steps every Drupal site maintainer should take to safeguard their sites against potential vulnerabilities. From keeping Composer and dependencies up-to-date to leveraging automated tools, discover practical strategies to fortify your Drupal site against modern threats. Stay ahead of security risks with these essential tips and insights.
Categories: FLOSS Project Planets

Kentaro Hayashi: apt-upgrade-canary - PoC apt JSON hook use case

Planet Debian - Wed, 2024-07-24 08:03

apt-upgrade-canary is a helper program to alert when upgrading packages via apt.

If the upgrade includes packages affected by a critical or serious bug, it shows a warning in the terminal.

You can then stay on the current version of the package by canceling the upgrade.

This program aims to help you avoid installing problematic packages during manual upgrades.

Usually, you should delegate package upgrades to unattended-upgrades and apt-listbugs.

apt-listbugs kindly warns you if there are known bugs.

In most cases that is safe enough.

But there is one exception: apt-listbugs cannot always keep up with the very latest critical or serious bugs in a timely manner.

apt-upgrade-canary checks the UDD mirror database (queries are cached so that redundant ones are not sent too frequently).

Getting started
  1. Install the required packages:

     $ sudo apt install -y make ruby-pg ruby-json ruby-term-ansicolor

  2. Clone https://salsa.debian.org/kenhys/apt-upgrade-canary.

  3. Run sudo make install.

Use case in action

If no serious bugs have been reported against the target packages, apt-upgrade-canary reports no error.

apt-upgrade-canary: no problem

If something weird is going on, it reports the matching bugs.

apt-upgrade-canary: serious case

So you can stay on the current version.


Categories: FLOSS Project Planets

Real Python: Quiz: Logging in Python

Planet Python - Wed, 2024-07-24 08:00

In this quiz, you’ll test your understanding of Python’s logging module.

Logging is a very useful tool in a programmer’s toolbox. It can help you develop a better understanding of the flow of a program and discover scenarios that you might not have thought of while developing.

Logs provide developers with an extra set of eyes that are constantly looking at the flow that an application is going through.

They can store information, like which user or IP accessed the application. If an error occurs, then they can provide more insights than a stack trace by telling you what the state of the program was before it arrived at the line of code where the error occurred.
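As a quick refresher before you take the quiz, here's a small sketch of that kind of logging; the function and the messages are invented for the example:

import logging

# Configure the root logger once, near program startup.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger(__name__)

def divide(a, b):
    logger.info("dividing %r by %r", a, b)  # record state before the risky step
    try:
        return a / b
    except ZeroDivisionError:
        # exception() logs at ERROR level and appends the traceback.
        logger.exception("division failed")
        raise

divide(10, 2)
try:
    divide(1, 0)
except ZeroDivisionError:
    pass  # already logged above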


Categories: FLOSS Project Planets

Real Python: Quiz: Python Protocols: Leveraging Structural Subtyping

Planet Python - Wed, 2024-07-24 08:00

Test your understanding of how to create and use Python protocols while providing type hints for your functions, variables, classes, and methods.

Take this quiz after reading our Python Protocols: Leveraging Structural Subtyping tutorial.


Categories: FLOSS Project Planets

GSoC '24 Progress: Week 7 and 8

Planet KDE - Wed, 2024-07-24 08:00
Multiple Subtitle Track

I continued to refine the feature proposed in my previous blog. We can now add new layers directly on the timeline by simply dragging the existing subtitle out of the bottom border of the subtitle track. Adding, moving, and deleting subtitles work as before, now with layer support.

I also added an indicator to the header of the subtitle track. It looks like this:

Besides setting a style to a specific subtitle event, I also plan to add the feature of setting different default styles for different subtitle layers. This will allow us to easily apply a consistent style to groups of subtitles within each layer.

Improved Subtitle Manager

Layer management is now integrated into the subtitle manager, giving it a fresh new look.

The duplicate and delete operations now work for layers as well.

Automatic Conversion of .srt Subtitle

To better test and develop the style feature, I switched the subtitle storage format to .ass. With the help of my mentor, we can now automatically convert the .srt files from old projects to .ass files while keeping the original .srt file.

There are still some minor issues with style conversion, such as incorrect font sizes. However, I believe it’s time to shift my focus to the styling widget and address these bugs later. The next two weeks will be dedicated to style management, which is the most important part of the project, so stay tuned!

Categories: FLOSS Project Planets

Real Python: Quiz: Hugging Face Transformers

Planet Python - Wed, 2024-07-24 08:00

In this quiz, you’ll test your understanding of Hugging Face Transformers. This library is a popular choice for working with transformer models in natural language processing tasks, computer vision, and other machine learning applications.


Categories: FLOSS Project Planets

Django Weblog: Django 5.1 release candidate 1 released

Planet Python - Wed, 2024-07-24 06:51

Django 5.1 release candidate 1 is the final opportunity for you to try out a kaleidoscope of improvements before Django 5.1 is released.

The release candidate stage marks the string freeze and the call for translators to submit translations. Provided no major bugs are discovered that can't be solved in the next two weeks, Django 5.1 will be released on or around August 7. Any delays will be communicated on the Django forum.

Please use this opportunity to help find and fix bugs (which should be reported to the issue tracker). You can grab a copy of the release candidate package from our downloads page or on PyPI.
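For example, one way to grab the release candidate from PyPI (pip only selects pre-releases when asked, so the --pre flag is needed):

$ python -m pip install --pre Django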

The PGP key ID used for this release is Natalia Bidart: 2EE82A8D9470983E

Categories: FLOSS Project Planets

LN Webworks: 10 Best Low-code Platforms for Building Applications in 2024

Planet Drupal - Wed, 2024-07-24 02:42

The prime objective of any business is to work efficiently, with high speed and agility, in application development. This helps a company work effectively and allows it to explore an application's extensive features. Low-code works the same way: these platforms are becoming more popular due to their greater speed and flexibility in application development. In this piece, we are going to talk about the best low-code platforms that are making significant changes in the app development world.

Categories: FLOSS Project Planets

Gunnar Wolf: DebConf24 Continuous Key-Signing Party

Planet Debian - Tue, 2024-07-23 23:25
🎉🥳🤡🎂🍥 Yay, party! 🎉🥳🤡🎂🍥 🎉🥳🤡🎂🍥 Yay, crypto! 🎉🥳🤡🎂🍥

DebCamp has started, and in a couple of days, we will fully be in DebConf24 mode!

As most of you know, an important part that binds Debian together is our cryptographic identity assurance, and that is in good measure tightened by the Continuous Key-Signing Parties we hold at DebConfs and other Debian and Free Software gatherings.

As I have done during (most of) the past DebConfs, I have prepared a set of pseudo-social maps to help you find where you are in the OpenPGP mesh of our conference. Naturally, Web-of-Trust maps should be user-centered, so find your own at:

https://people.debian.org/~gwolf/dc24_ksp/

The list is now final and it will not receive any modifications (I asked for them some days ago); if your name still appears on the list and you don’t want to be linked to the DC24 KSP in the future, tell me and I’ll remove it from future versions of the list (but it is part of the final DC24 file, as its checksum is already final)

Speaking of which!

If you are to be a part of the keysigning, get the final DC24 file and, on a device you trust, check its SHA256 by running:

$ sha256sum dc24_fprs.txt
11daadc0e435cb32f734307b091905d4236cdf82e3b84f43cde80ef1816370a5  dc24_fprs.txt

Make sure the resulting number matches the one I’m presenting. If it doesn’t, ensure your copy of the file is not corrupted (i.e. download again). If it still does not match, notify me immediately.

Does any of the above confuse you? Please come to (or at least, follow the stream for) my session on DebConf opening day, Continuous Key-Signing Party introduction, 10:30 Korean time; I will do my best to explain the details to you.

PS- I will soon provide a simple, short PDF that will probably be mass-printed at FrontDesk so that you can easily track your KSP progress.

Categories: FLOSS Project Planets

Dirk Eddelbuettel: RQuantLib 0.4.23 on CRAN: Updates

Planet Debian - Tue, 2024-07-23 22:48

A new minor release 0.4.23 of RQuantLib just arrived at CRAN earlier today, and will be uploaded to Debian in due course.

QuantLib is a rather comprehensive free/open-source library for quantitative finance. RQuantLib connects (some parts of) it to the R environment and language, and has been part of CRAN for more than twenty-two years (!!) as it was one of the first packages I uploaded.

This release of RQuantLib updates to QuantLib version 1.35, released this morning. It accommodates some removals following earlier deprecations, and also updates most of the C++ code to a more readable and compact form of creating shared pointers via make_shared() along with auto.

Changes in RQuantLib version 0.4.23 (2024-07-23)
  • Adjustments for QuantLib 1.35 and removal of deprecated code (in utility functions and dividend case of vanilla options)

  • Adjustments for new changes in QuantLib 1.35

  • Refactoring most C++ files making more use of both auto and make_shared to simplify and shorten expressions

Courtesy of my CRANberries, there is also a diffstat report for this release. As always, more detailed information is on the RQuantLib page. Questions, comments etc should go to the rquantlib-devel mailing list. Issue tickets can be filed at the GitHub repo.

If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: FLOSS Project Planets

Dirk Eddelbuettel: qlcal 0.0.12 on CRAN: Calendar Updates

Planet Debian - Tue, 2024-07-23 20:45

The twelfth release of the qlcal package arrived at CRAN today.

qlcal delivers the calendaring parts of QuantLib. It is provided (for the R package) as a set of included files, so the package is self-contained and does not depend on an external QuantLib library (which can be demanding to build). qlcal covers over sixty country / market calendars and can compute holiday lists, their complement (i.e., business day lists), and much more. Examples are in the README at the repository, the package page, and of course at the CRAN package page.

This release synchronizes qlcal with the QuantLib release 1.35 (made today) and contains more updates to 2024 calendars.

Changes in version 0.0.12 (2024-07-22)
  • Synchronized with QuantLib 1.35 released today

  • Calendar updates for Chile, India, United States, Brazil

Courtesy of my CRANberries, there is a diffstat report for this release. See the project page and package documentation for more details, and more examples. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: FLOSS Project Planets

scikit-learn: Interview with Adam Li, scikit-learn Team Member

Planet Python - Tue, 2024-07-23 20:00
Author: Reshama Shaikh, Adam Li

BIO: Adam is currently a Postdoctoral Research Scientist at Columbia University in the Causal Artificial Intelligence Lab, directed by Dr. Elias Bareinboim. He is an NSF-funded Computing Innovation Research Fellow. He did his PhD in biomedical engineering, specializing in computational neuroscience and machine learning at Johns Hopkins University working with Dr. Sridevi V. Sarma in the Neuromedical Control Systems group. He also jointly obtained a MS in Applied Mathematics and Statistics with a focus in statistical learning theory, optimization and matrix analysis. He was fortunate to be a NSF-GRFP fellow, Whitaker International Fellow, Chateaubriand Fellow and ARCS Chapter Scholar during his time at JHU. Adam officially joined the scikit-learn team as a maintainer in July 2024.

Link to scikit-learn contributions (issues, pull requests):

  1. Tell us about yourself.

    I currently live in New York City, where I work on theoretical and applied AI research through the lens of causal inference, statistical modeling, dynamical systems, and signal processing. My current research is focused on telling a causal story, specifically in the case where one has multiple distributions of data from the same causal system. For example, one may have access to brain recordings from monkeys and humans. Given these heterogeneous datasets, I am interested in answering: what causal relationships can we learn? This is known as the causal discovery problem, where, given data, one attempts to learn what causes what. Another problem that I work on that is highly relevant to generative AI is the problem of causal representation learning. Here, I develop theory and train deep neural networks to understand causality among latent factors. Specifically, we demonstrate how to leverage multiple datasets and a causal neural network to generate data that is causally realistic. This can enable more robust data generation from general latent variable models.

  2. How did you first become involved in open source and scikit-learn?

    I first got involved in open source as a user. I was making the switch from Matlab to Python and started using packages like numpy and scipy pretty regularly. In my PhD research, I dealt with a lot of electrophysiological data (i.e. EEG brain recordings). I was writing hundreds of lines of code to load and preprocess data, and it was always changing based on different constraints. That was when I discovered MNE-BIDS, a Python package within the MNE framework for reading and writing brain recording data in a structured format. This changed my life because now my preprocessing and data loading code was a few lines of code that adhered to an open standard tested by thousands of researchers. I realized the value of open source, and began contributing in my spare time.

  3. We would love to learn of your open source journey.

    I first started contributing to open source in the MNE organization. This package implements data structures for the processing and analysis of neural recording data (e.g. MEG, EEG, iEEG data). I contributed over 70 pull requests to the MNE-BIDS package, and subsequently was invited to be a maintainer for MNE-BIDS and MNE-Python. Later on, I participated in Google Summer of Code to port the connectivity submodule within MNE-Python to a new package, known as MNE-Connectivity. I added new data structures and algorithms to improve feature development for connectivity algorithms on neural recording data. Later on, I also worked with a team on porting a neural network architecture from Matlab to the MNE framework to automatically classify ICA-derived components. This became known as MNE-ICALabel. These experiences taught me how to work in the large, asynchronous team environment that is common in OSS. They also taught me how to begin contributing to an OSS project. This led me to scikit-learn.

    I first got involved in scikit-learn as a user who was heavily interested in its decision tree models (random forests, randomized trees). I was interested in contributing a new oblique decision tree model that was a generalization of the existing random forest model. However, the code was not easily added to scikit-learn, and currently the decision to include it is inconclusive. Throughout this process, I learned about the challenges and intricacies of maintaining an OSS project as large as scikit-learn. It is not trivial to simply add new features to a large OSS project, because code comes with a maintenance cost and should fit with the current internal design. At this point in time, there were very few maintainers who were able to maintain the tree submodule, and as such new features are included conservatively.

    I was eager to improve the project to enable more exciting features for the community, so I began contributing to scikit-learn, starting with smaller issues such as documentation improvements or minor bug fixes to get acquainted with the codebase. I also refactored various Cython code to begin upgrading the codebase, especially in the tree submodule. Throughout this process, I identified other projects the maintainer team was working on, and contributed there as well. For example, I added metadata routing to a variety of different functions and estimators in scikit-learn. I also began reviewing PRs for the tree submodule and metadata routing, where I had knowledge. I also added missing-value support for extremely randomized tree models (called ExtraTrees in scikit-learn). This allows users to pass data that contains missing values (encoded as np.nan) to ExtraTrees. Around this time, I was invited to join the maintainer team of scikit-learn. More recently, I have taken on the project of adding categorical data support to the decision tree models, which will make random forests and extremely randomized tree models more performant and capable of handling real-world settings, where categorical data is common.

  4. To which OSS projects and communities do you contribute?

    I currently primarily contribute to scikit-learn, PyWhy (a community for causal inference in Python), and also develop my own OSS project: treeple. Treeple is an exciting package that implements different decision tree models beyond those offered in scikit-learn with an efficient Cython implementation stemming from the scikit-learn tree internals.

  5. What do you find alluring about OSS?

    OSS is so exciting because of the impact it has. Everyone from private projects to other OSS projects will use OSS. Any fixes to documentation, performance improvements, or new features will potentially impact the workflows of potentially millions of people. This is what makes contributing to OSS so exciting. Moreover, this impact ensures that best practices are usually carried out in these projects, and it’s a great playground to learn from the best, while giving back to the larger community.

  6. What pain points do you observe in community-led OSS?

    Right now, community-led OSS moves very slowly in most places. This is for a number of very good reasons: i) not releasing buggy features that may impact millions of people, and ii) backwards compatibility. One of the challenges of maintaining a high-quality OSS project is that you would like to satisfy your users, who may all utilize different components of the project from different versions. As such, many community-led OSS projects take a conservative approach when implementing new features and new ideas. However, there may be many exciting improvements that are already known to the community but still lack an OSS implementation.

    I think this can be partially solved by increased funding for OSS, so OSS maintainers and developers are able to dedicate more time to maintaining and improving the projects. In addition, I think this can be improved if more developers in the community contribute to said OSS projects. I hope that I have convinced you though that contributing to OSS is impactful and highly educational.

  7. If we discuss how far OS has evolved in 10 years, what would you like to see happen?

    I think more interoperability and integrated workflows for projects will make projects that utilize OSS more streamlined and efficient. For example, right now there are different array libraries (e.g. numpy, cupy, xarray, pytorch, etc.), which all support some manner of a n-dimensional array, but with a slightly different API. This makes it very painful to transition across different libraries that use different arrays. In addition, there are multiple dataframe libraries, such as pandas and polars, and this problem of API consistency also arises there.

    Some progress has been made on the Array API front to allow different array libraries to serve as backends given a common API. This will enable GPU acceleration for free without a single code change, which is great! This will be exciting because users will eventually only have to write code in a single way, and can then leverage any array/dataframe library that has different advantages and disadvantages based on the user's use case.

  8. What are your hobbies, outside of work and open source?

    I enjoy running, trying new restaurants and bars, cooking and reading. I’m currently training for a half-marathon, where my goal is to run under 8 minutes per mile. I’m also trying to perfect a salad with an asian-themed dressing. In a past life, I was a bboy (breakdancer) for ten years until I stopped in graduate school because I got busy (and old).

Categories: FLOSS Project Planets
