Feeds

Ian Ozsvald: What I’ve been up to since 2022

Planet Python - Thu, 2024-05-30 08:25

This has been terribly quiet since July 2022, oops. It turns out that having an infant totally sucks your time! In the meantime I’ve continued to build up:

  • Training courses – I’ve just listed my new Fast Pandas course plus the existing Successful Data Science Projects and Software Engineering for Data Scientists with runs of all three for July and September
  • My NotANumber newsletter goes out every month or so; it carries Python data science jobs and covers my strategic work, the RebelAI leadership community and High Performance Python book updates
  • RebelAI – my private data science leadership community (there’s no web presence, just get in touch) for “excellent data scientists turned leaders” – this is having a set of very nice impacts for members
  • High Performance Python O’Reilly book – we’re working on the 3rd edition
  • PyDataLondon 2024 has a great schedule and if you’re coming – do find me and say hi!
Ian is a Chief Interim Data Scientist via his Mor Consulting. Sign up for Data Science tutorials in London and to hear about his data science thoughts and jobs. He lives in London, is walked by his high energy Springer Spaniel and is a consumer of fine coffees.
Categories: FLOSS Project Planets

Salsa Digital: A guide to design systems and their benefits

Planet Drupal - Thu, 2024-05-30 08:00
Why build a design system? Designing and building a new website is often costly and time-consuming. Designers come up with a visual masterpiece, and then frontend developers have to build it, which can be easier said than done, especially if the designer is working in isolation and not taking into consideration how difficult it might be to actually build their designs.
Categories: FLOSS Project Planets

EuroPython: How EuroPython Proposals Are Selected: An Inside Look

Planet Python - Thu, 2024-05-30 06:24

With the number of Python-related conferences around the world, many people might wonder how the selection process is configured and performed. For the largest and oldest European Python conference, EuroPython, we wanted to share how this process works.

The Programme team for each EuroPython conference changes every year. There are mechanisms in place to carry on some of the processes and traditions when dealing with proposals for the next event, although you can still find some differences each year. These differences are not large enough to make a significant impact.

The 2024 Process

In this post, we highlight how the 2024 process was conducted and your role in it, whether as a submitter or potential contributor in future versions.

Opening the Call for Proposals

This year, the Call for Proposals (CfP) configuration was based on the 2023 version, with minor modifications. For example, two new tracks were added to enable more people to categorise their proposals:

  • PyData: Research & Applications
  • PyData: LLMs

This change was motivated by the popularity of these topics at other global conferences. In addition, other tracks were merged or removed to keep the number manageable.

For many people, having a configuration with both an Abstract and a Description is confusing. Not everyone knows what to write in each field. To address this, we decided to be clearer: we dropped the Description field and asked explicitly for an Outline section. The intention was that to submit a proposal, one would require an Abstract and an Outline for their session.

Reviewers from the Community

We opened a form to get help from the community for reviewing the talks and decided to accept most (if not all) of them to nurture our review results as much as possible. We had more than 20 people helping with reviewing proposals on Pretalx.

We created a few “track groups” containing related categories that the reviewers could choose from. This way, there was no pressure to have an opinion on a topic one might not be familiar with.

We had an average of six reviews per proposal, which greatly helped us make a final decision.

Community Voting

Another way to receive input is through Community Voting, which allows participants who have attended any of the EuroPythons since 2012 to vote on the upcoming programme.

Using a separate simple web application, people participated by voting for the proposals they wanted to see at EuroPython 2024. We were fortunate to have enough people voting to get a good estimate of preferences.

Fun fact: Around 11 people were able to review all of the nearly 640 proposals we received this year.

We are very grateful to everyone who participated!

My reaction when I saw a good proposal to be voted at EP 2024.

Programme Committee

This year the programme committee was mostly formed by a group of new people to the conference, helped by a few people familiar with the process from last year. In total, around 11 people were actively participating.

Like most Programme teams, we did our best to get people from different areas to have a more diverse general mindset, including skills from Data, Core, DevOps, and Web technologies.

It was important for us to have local people on the team, and we are very happy to have had two members from the local Czech community helping, while the rest were spread across Europe.

Selection Process

Based on the reviewers’ results from Pretalx and Community Voting, we generated a master sheet that was used to perform the selection process.

Track by track, the Programme team went through each proposal that had good Pretalx and Community Voting results and voted (again) for the talks they believed were good material for the conference.

During the selection process, we felt that we did not have enough expertise in a specific area. Therefore, we are very thankful that we could add four more members to the selection team to remedy that.

After three calls, each lasting around 2 hours, the Programme team had the first batch of accepted proposals. The speakers for these proposals were notified as soon as the decision was made. Following a similar process, we did the same for the second (and final) batch of accepted and rejected proposals.

To ensure the acceptance of proposals from most tracks and topics, many plots and statistical analyses were created to visualise the ratio of accepted proposals to submitted ones, the variety of topics, and the diversity of speakers.

Plots from pretalx visualising Proposals by Submission date, Session type, Track & State

Even though it sounds cliché, there were many good proposals we couldn’t accept immediately, since the high volume and quality of proposals made it challenging to make instant decisions. We kept debating whether to place them on the waiting list.

Ultimately, we created another category for proposals that "could be accepted" allowing us to manage and organise high-quality proposals that required further deliberation.

Programme team trying to figure which talk to choose from the waiting list

What about sponsored talks?

Each year, the conference offers sponsors with certain packages the perk of hosting a sponsored talk, meaning that some of the talk slots had to be saved for that purpose. Slots not taken were filled by proposals on the waiting list.

Is selecting the talks the end of the story?

No. After proposals are accepted/confirmed, special requirements emerge, mainly about "I’m sorry, I cannot be at the conference, can I do it online?" Which, in our opinion, is unfortunate news—not because we don’t like it, but because we have learned that remote talks are not as popular with attendees.

Even though there are some special cases that we fully understand, we noticed a few cases not being convincing enough. In those cases, we had to encourage people to give up their slot for other in-person proposals. This is a tricky process as we are limited in the total amount of remote talks possible, the specific reasons for the change, and the overall scenario for the conference.

What is needed to get accepted?

Most rejected proposals are rejected because they have a weak abstract.

We have tried many means to encourage people to ask questions and seek feedback about their proposals, and we have hosted calls providing the details of good proposals. Still, every year we get proposals whose abstracts are poorly structured or incomplete.

For us, a good abstract contains the following:

  • Context of your talk or problem
  • Definition of the problem
  • Why is it important to find a solution to that problem?
  • What will be discussed and what will attendees learn?
  • Previous requirements or additional comments on your talk

You can also imagine a proposal like an elevator pitch. You need to describe it in a way that’s striking and motivates people to attend.

Don’t Forget About the Outline!

This year, we introduced an “outline” field for you to paste the outline of your talk, including the time you will spend on each item. This is essential to get an idea of how much you will be talking about each topic. (Hint: add up the expected times.)

The outline might sound like an obvious topic to you, but many people failed to provide a detailed one. Some even copied the abstract here, so you might understand the importance of this field as well.

Why Does It Feel Like the Same People Are Speakers Every Year?

The main reason for this is that those people followed the proper abstract structure and provided a descriptive outline. Having experience being rejected certainly helps. So we hope that after giving you detailed selection process standards, you know how to crack the selection process.

What about AI?

We discussed a few proposals that “felt AI written” and even used external tools to assess them. In the end, we didn’t have a strict ruling against people using Artificial Intelligence tools to improve their proposals.

When a proposal felt like it was AI-generated, we looked deeper into the proposal and the speaker’s background, for example by analysing the speaker’s bio and checking whether the person was giving talks somewhere else and, most importantly, whether the “speaker” was a real person.

Independently of how the Programme team feels towards AI tools, we cannot completely ignore how these tools are helping some people with structure and grammar, as well as overall assisting them in their writing process. This might change in the future, but currently, we have not written regulations against the usage of AI tools.

The 2025 Process and Final Words

As described before, the team and process can change a bit next year, but we expect the same critical aspects of a good abstract and outline to be essential to the process.

We encourage you to ask for feedback, participate in sessions teaching how to write good proposals, and participate in our Speaker’s Mentorship programme. These can truly help you to get accepted into the conference.

Having said all this, each conference has a different selection process. Maybe the reason your proposal was not selected is due to a better proposal on the same topic, or too many similar proposals in the same track, or your proposal just did not fit this year’s Zeitgeist (i.e. Community Voting).

Please don’t be discouraged! We highly recommend you keep working on your proposal, tweak it, write a new one, and most importantly, try again.

Submitting Several Proposals Doesn’t Help!

We value quality over quantity and will compare your proposals against each other. This is extra work and might even give you less of a chance because of a split vote between your proposals. So submitting more than 10 proposals to get accepted is the wrong approach.

The Call for Proposals will likely be open earlier next year. We hope you can follow the recommendations in this post and get your proposal accepted for EuroPython 2025.

And remember: Don’t be afraid to ask for feedback!

Thanks for reading! This community post is written by Cristián on behalf of EuroPython 2024 Programme team
Categories: FLOSS Project Planets

KDGpu 0.5.0 is here!

Planet KDE - Thu, 2024-05-30 03:30

Since we first announced it last year, our Vulkan wrapper KDGpu has been busy evolving to meet customer needs and our own. Our last post announced the public release of v0.1.0, and version 0.5.0 is available today. It’s never been easier to interact with modern graphics technologies, enabling you to focus on the big picture instead of hassling with the intricacies and nuances of Vulkan.

The PBR example in the new KDGpu Examples repository.

Wider device support

KDGpu now supports a wider array of devices, such as older versions of Android. For some context, additional features in Vulkan are supported by extensions. If said features become part of the “core” specification, they are automatically included in Vulkan 1.2, 1.3 and so on. In the past, KDGpu required the device to fully support Vulkan 1.2, which limited what devices you could target. In newer KDGpu versions (>0.4.6) it will now run on certain 1.1 devices (like the Meta Quest) as long as the required extensions are supported.

A KDGpu example running natively on an Android device.

We also added native examples for Android, which can be run straight from Android Studio! There’s also better iOS support alongside a native Apple example.

The KDGpu Hello Triangle example running in the iOS simulator.

External memory and images support

When writing applications using KDGpu, you will inevitably have to interface with other APIs or libraries that don’t support it, or that don’t even support Vulkan. For example, you might generate an image using Vulkan graphics and then need to pass it to CUDA for further processing. With KDGpu it’s now possible to grab texture and buffer objects and get their external memory handles:

    const TextureOptions textureOptions = {
        .type = TextureType::TextureType2D,
        .format = Format::R8G8B8A8_SNORM,
        .extent = { 512, 512, 1 },
        .mipLevels = 1,
        .usage = TextureUsageFlagBits::SampledBit,
        .memoryUsage = MemoryUsage::GpuOnly,
        .externalMemoryHandleType = ExternalMemoryHandleTypeFlagBits::OpaqueFD,
    };
    Texture t = device.createTexture(textureOptions);
    const MemoryHandle externalHandleOrFD = t.externalMemoryHandle();

Additionally, we have added methods to adopt existing VkImages as native KDGpu objects to better support libraries like OpenXR.

Easy & fast XR

OpenXR is the leading API used for writing cross-platform VR/AR experiences. Like Vulkan, code directly using OpenXR tends to be verbose and requires a lot of setup. To alleviate this, KDGpu now includes an optional library called KDXr. It wraps OpenXR, and it even easily integrates into KDGpu. It takes care of initialization, has the C++ classes you expect and can make it painless to integrate XR functionality into your application including support for XR compositor layers, head tracking, input handling and haptic feedback.

For example, to set up a projection view you subclass the ProjectionLayer type:

    class ProjectionLayer : public XrProjectionLayer
    {
    public:

And implement the required methods like renderView() to start rendering into each eye:

    void ProjectionLayer::renderView()
    {
        m_fence.wait();
        m_fence.reset();

        // Update the scene data once per frame
        if (m_currentViewIndex == 0) {
            updateTransformUbo();
        }

        // Update the per-view camera matrices
        updateViewUbo();

        auto commandRecorder = m_device->createCommandRecorder();

        // Set up the render pass using the current color and depth texture views
        m_opaquePassOptions.colorAttachments[0].view = m_colorSwapchains[m_currentViewIndex].textureViews[m_currentColorImageIndex];
        m_opaquePassOptions.depthStencilAttachment.view = m_depthSwapchains[m_currentViewIndex].textureViews[m_currentDepthImageIndex];
        auto opaquePass = commandRecorder.beginRenderPass(m_opaquePassOptions);

        // Do the rest of your rendering commands to this pass...

And add this layer to the compositor, in our examples this is abstracted away for you:

    // Create a projection layer to render the 3D scene
    const XrProjectionLayerOptions projectionLayerOptions = {
        .device = &m_device,
        .queue = &m_queue,
        .session = &m_session,
        .colorSwapchainFormat = m_colorSwapchainFormat,
        .depthSwapchainFormat = m_depthSwapchainFormat,
        .samples = m_samples.get()
    };
    m_projectionLayer = createCompositorLayer<ProjectionLayer>(projectionLayerOptions);
    m_projectionLayer->setReferenceSpace(m_referenceSpace);

You can view the complete example here. In this new release, we’re continuing to work on multiview support! KDXr supports multiview out of the box (see the example layer code) and you can check out the multiview example.

More in-depth examples are now available

The examples sitting in our main repository are no more than small tests, which don’t show the true benefits of using KDGpu in large graphical applications. So, in addition to our previous examples, we now have a dedicated KDGpu Examples repository!

Screenshot from our N-Body Compute example.

And more!

There are also small improvements such as being able to request custom extensions and ignore specific validation layer warnings. Check out the changelog on GitHub for a full list of what’s been changed.

Let us know what you think about the improvements we’ve made, and what could be useful for you in the future!

About KDAB

If you like this article and want to read similar material, consider subscribing via our RSS feed.

Subscribe to KDAB TV for similar informative short video content.

KDAB provides market leading software consulting and development services and training in Qt, C++ and 3D/OpenGL. Contact us.

The post KDGpu 0.5.0 is here! appeared first on KDAB.

Categories: FLOSS Project Planets

April/May in KDE Itinerary

Planet KDE - Thu, 2024-05-30 02:15

Since the last summary of what happened around KDE Itinerary two months ago, we shipped Transitous support, integrated a new import staging area, enabled creating entries from OSM elements and much more.

New Features

Transitous support

The 24.05 releases shipped with Transitous support enabled by default for the first time. Transitous is a community-run free and open public transport routing service.

Since its start at FOSDEM 2024 just a few months ago, Transitous has grown to consume almost 800 GTFS feeds with base schedule information and 185 GTFS-RT feeds with realtime updates, covering 37 countries on 5 continents.

To support this a lot of work is happening both by the people taking care of the operational side of this as well as the team developing the MOTIS routing engine to improve the performance and scalability.

Unlike vendor-operated or otherwise proprietary services, Transitous allows us to expand public transport routing coverage ourselves, at least where GTFS feeds are publicly available. See the Transitous contributor documentation on how to help with that; many major systems in Asia are still missing, for example.

New import staging area

When importing from the system calendar, Itinerary showed a list of detected elements and allowed you to select which ones to actually import. This “staging area” for imported data has been generalized and is now available for all possible import scenarios.

Import staging area showing an entire trip.

This allows you to review what the travel document extractor has found and provides greater control over what to import. It is also an important step towards the longer term plan of associating every element with a trip and providing more manual control over trip grouping.

Import from OSM URLs

Hotels or restaurants can be imported from OSM data, by pasting or dropping the link to the corresponding OSM element into Itinerary. That will result in the corresponding edit page being shown with all data present in OSM already pre-filled so you typically only have to enter dates and times.

With the next version the same will also work for a number of event venue types.

Manual ticket barcode entry

It’s now possible to manually add barcodes to reservations or tickets/passes that don’t have one yet, from within the corresponding edit page. For reservations it’s also possible to associate them with an existing pass or flat-rate ticket.

This can be useful when manually entering data that uses document or barcode formats that Itinerary doesn’t recognize automatically.

Barcode editing control in KDE Itinerary.

Infrastructure Work

CI/CD updates

We updated the build infrastructure for all of KDE’s Android apps to Qt 6.7 and NDK r26. Due to an API and ABI break in Qt’s JNI API (i.e. something very central to Android integration), this unfortunately needed a lot more effort and changes than usual.

NDK r26 brings a much newer STL, which unblocked some KDE Framework changes, as well as allowed us to update to newer versions of the Quotient Matrix library.

Accessibility and UI testing

As mentioned in my report of KDE’s accessibility, automation and sustainability sprint the date and time input controls used by Itinerary can now be interacted with using assistive tools such as screen readers.

And that’s not only helpful for users relying on such tools, our UI testing tools use the same interface to control the application under test. Thanks to this Itinerary now has a first set of automated UI tests which are run as part of the automated builds.

Indoor routing

Indoor routing for the train station maps has been moving closer to becoming ready for integration by removing one major obstacle, its memory consumption.

The previous implementation, processing an entire train station at once, could end up temporarily needing 500MB of memory for creating the navigation mesh. That’s not a big deal on a laptop or desktop, but on a phone that is a bit much.

We now split larger areas into tiles and compute the navigation mesh for each of those separately. That has only minimal impact on the time it takes to do that, but decreases the peak memory consumption by a factor of 10 to 20.

MapCSS eval expressions

Another seemingly small but very powerful change in the indoor map renderer is support for MapCSS eval() expressions. This allows styling properties to not only be fixed values or values of OSM tags but can be complex expressions depending on other style properties or OSM tag values.

Complex road labels styled with MapCSS eval() expressions (right) compared to the previous result.

Fixes & Improvements

Travel document extractor
  • New or improved travel document extractors for CFR, Eurostar, Indico, IRTC, Lufthansa, Motel One, SNCB, SNCF, Ticketportal, Trenitalia, VDV eTicket and VR.
  • Initial generic support for railway tickets in IATA BCBP and PkPass formats.
  • Fixed invalid departure times for flights from airports with unknown timezones.
  • Added support for Base64 encoded ERA SSB ticket barcodes.
  • Improved handling of binary barcode content in Apple Wallet passes.
  • Fixed start/end time checks for restaurant reservations.

All of this has been made possible thanks to your travel document donations!

Public transport data
  • New occupancy indicator that no longer solely relies on color.
  • Support per-coach occupancy information on trains (only available in some ÖBB trains so far).
Train coach occupancy information.
  • Support for out-of-service train coaches in the coach layout view.
  • Updated public transport coverage metadata from the Transport API repository, which should result in more appropriate results for some regions.
  • Fixed filtering of pointlessly short foot paths from journey query results sometimes having no effect.
  • Discard non-WGS84 coordinates in EFA responses. This fixes some bizarre and physically impossible routing instructions in Baden-Württemberg.
Indoor map
  • Render node-based indoor columns.
  • Unified styling for all corridor types.
  • Handle one more OSM tagging variant for toilets.
  • Improved detection of the current holiday region for interpreting opening hours.
  • Handle more OSM tagging variants when doing floor level expansion.
Itinerary app
  • Show train coach layout actions also without any seat reservation or ticket.
  • Fixed overly long headers of ferry reservations.
  • Fixed some mistranslations due to missing translation contexts (in some languages the translation of “departure” depends on the mode of transportation).
  • Fixed layout issues for waiting sections on a public transport journey.
  • Allow editing the ticket owner name as well.
  • Support for program membership passes and tickets in the Apple Wallet pass format.
How you can help

Feedback and travel document samples are very much welcome, as are all other forms of contributions. Feel free to join us in the KDE Itinerary Matrix channel.

Categories: FLOSS Project Planets

Matt Layman: About, FAQ, and Home Page - Building SaaS with Python and Django #192

Planet Python - Wed, 2024-05-29 20:00
In this episode, we worked on some core pages to round out the JourneyInbox user interface. This led us to work updating UI layout, writing copy, and doing other fundamentals for making templated pages.
Categories: FLOSS Project Planets

Python⇒Speed: Let’s optimize! Running 15× faster with a situation-specific algorithm

Planet Python - Wed, 2024-05-29 20:00

Let’s speed up some software! Our motivation: we have an image, a photo of some text from a book. We want to turn it into a 1-bit image, with just black and white, extracting the text so we can easily read it.

We’ll use an example image from scikit-image, an excellent image processing library:

    from skimage.data import page
    import numpy as np

    IMAGE = page()
    assert IMAGE.dtype == np.uint8

Here’s what it looks like (it’s licensed under this license):

Median-based local thresholding

The task we’re trying to do—turning darker areas into black, and lighter areas into white—is called thresholding. Since the image is different in different regions, with some darker and some lighter, we’ll get the best results if we use local thresholding, where the threshold is calculated from the pixel’s neighborhood.

Simplifying somewhat, for each pixel in the image we will:

  1. Calculate the median of the surrounding neighborhood.
  2. Subtract a magic constant from the calculated median to calculate our local threshold.
  3. If the pixel’s value is bigger than the threshold, the result is white, otherwise it’s black.
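The three steps above can be sketched for a single pixel in plain NumPy. This is only an illustration under assumptions: `threshold_pixel` is a hypothetical helper name, the tiny image is made up, and the neighborhood is simply clipped at the image borders.

```python
import numpy as np


def threshold_pixel(img, i, j, radius, offset):
    """Apply the three steps to one pixel; returns 255 (white) or 0 (black)."""
    # Step 1: median of the surrounding neighborhood, clipped at the borders.
    neighborhood = img[max(i - radius, 0):i + radius + 1,
                       max(j - radius, 0):j + radius + 1]
    median = np.median(neighborhood)
    # Step 2: subtract the "magic constant" to get the local threshold.
    threshold = median - offset
    # Step 3: compare the pixel against the threshold.
    return 255 if img[i, j] > threshold else 0


# A made-up 3x3 image: a dark pixel surrounded by light ones.
img = np.array([[200, 200, 200],
                [200, 100, 200],
                [200, 200, 200]], dtype=np.uint8)

center = threshold_pixel(img, 1, 1, radius=1, offset=10)  # dark pixel
corner = threshold_pixel(img, 0, 0, radius=1, offset=10)  # light pixel
print(center, corner)
```

The dark center pixel falls below its local threshold and becomes black, while the light corner pixel stays white.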

scikit-image includes an implementation of this algorithm. Here’s how we use it:

    from skimage.filters import threshold_local

    def skimage_median_local_threshold(img, neighborhood_size, offset):
        threshold = threshold_local(
            img, block_size=neighborhood_size, method="median", offset=offset
        )
        result = (img > threshold).astype(np.uint8)
        result *= 255
        return result

    # The neighborhood size and offset value were determined "empirically", i.e.
    # they're manually tuning the algorithm to work well with our specific
    # example image.
    SKIMAGE_RESULT = skimage_median_local_threshold(IMAGE, 11, 10)

And here’s what the results look like:

Let’s see if we can make this faster!

Step 1. Reimplement our own version

We’re going to be using the Numba compiler, which lets us compile Python code to machine code at runtime. Here’s an initial implementation of the algorithm; it’s not quite identical to the original, for example the way edge pixels are handled, but it’s close enough for our purposes:

    from numba import jit

    @jit
    def median_local_threshold1(img, neighborhood_size, offset):
        # Neighborhood size must be an odd number:
        assert neighborhood_size % 2 == 1
        radius = (neighborhood_size - 1) // 2
        result = np.empty(img.shape, dtype=np.uint8)

        # For every pixel:
        for i in range(img.shape[0]):
            # Calculate the Y borders of the neighborhood:
            min_y = max(i - radius, 0)
            max_y = min(i + radius + 1, img.shape[0])
            for j in range(img.shape[1]):
                # Calculate the X borders of the neighborhood:
                min_x = max(j - radius, 0)
                max_x = min(j + radius + 1, img.shape[1])
                # Calculate the median:
                median = np.median(img[min_y:max_y, min_x:max_x])
                # Set the image to black or white, depending how it relates to
                # the threshold:
                if img[i, j] > median - offset:
                    # White:
                    result[i, j] = 255
                else:
                    # Black:
                    result[i, j] = 0
        return result

    NUMBA_RESULT1 = median_local_threshold1(IMAGE, 11, 10)

Here’s the resulting image; it looks similar enough for our purposes:

Now we can compare the performance of the two implementations:

    Code                                              Elapsed milliseconds
    skimage_median_local_threshold(IMAGE, 11, 10)     76
    median_local_threshold1(IMAGE, 11, 10)            87

It’s slower. But that’s OK, we’re just getting started.

Step 2: A faster implementation of the median algorithm

Calculating a median is pretty expensive, and we’re doing it for every single pixel, so let’s see if we can speed it up.

The median implementation Numba provides is likely to be fairly generic, since it needs to work in a wide variety of circumstances. We can hypothesize that it’s not optimized for our particular case. And even if it is, having our own implementation will allow for a second round of optimization, as we’ll see in the next step.

We’re going to implement a histogram-based median, based on the fact we’re using 8-bit images that only have a limited range of potential values. The median is the value where 50% of the pixels’ values are smaller, and 50% are bigger.

Here’s the basic algorithm for a histogram-based median:

  • Each pixel’s value will go into a different bucket in the histogram; since we know our image is 8-bit, we only need 256 buckets.
  • Then, we add up the size of each bucket in the histogram, from smallest to largest, until we hit 50% of the pixels we inspected.

    @jit
    def median_local_threshold2(img, neighborhood_size, offset):
        assert neighborhood_size % 2 == 1
        radius = (neighborhood_size - 1) // 2
        result = np.empty(img.shape, dtype=np.uint8)

        # 😎 A histogram with a bucket for each of the 8-bit values possible in
        # the image. We allocate this once and reuse it.
        histogram = np.empty((256,), dtype=np.uint32)

        for i in range(img.shape[0]):
            min_y = max(i - radius, 0)
            max_y = min(i + radius + 1, img.shape[0])
            for j in range(img.shape[1]):
                min_x = max(j - radius, 0)
                max_x = min(j + radius + 1, img.shape[1])

                # Reset the histogram to zero:
                histogram[:] = 0
                # Populate the histogram, counting how many of each value are in
                # the neighborhood we're inspecting:
                neighborhood = img[min_y:max_y, min_x:max_x].ravel()
                for k in range(len(neighborhood)):
                    histogram[neighborhood[k]] += 1

                # Use the histogram to find the median; keep adding buckets until
                # we've hit 50% of the pixels. The corresponding bucket is the
                # median.
                half_neighborhood_size = len(neighborhood) // 2
                for l in range(256):
                    half_neighborhood_size -= histogram[l]
                    if half_neighborhood_size < 0:
                        break
                median = l

                if img[i, j] > median - offset:
                    result[i, j] = 255
                else:
                    result[i, j] = 0
        return result

    NUMBA_RESULT2 = median_local_threshold2(IMAGE, 11, 10)
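To see the histogram idea in isolation, here is a minimal sketch in plain Python and NumPy, without Numba. `histogram_median` is a hypothetical helper name and the input arrays are made up; the bucket-scanning logic mirrors the loop above.

```python
import numpy as np


def histogram_median(values):
    """Median of a 1-D uint8 array, found by scanning a 256-bucket histogram."""
    histogram = np.zeros(256, dtype=np.int64)
    for v in values:
        histogram[v] += 1
    # Walk the buckets from smallest to largest until we pass 50% of the values.
    remaining = len(values) // 2
    for bucket in range(256):
        remaining -= histogram[bucket]
        if remaining < 0:
            return bucket
    raise ValueError("cannot take the median of an empty array")


odd = np.array([9, 3, 7, 1, 5], dtype=np.uint8)
even = np.array([4, 1, 3, 2], dtype=np.uint8)

print(histogram_median(odd), np.median(odd))
print(histogram_median(even), np.median(even))
```

For an odd number of values this matches `np.median`. For an even number it returns the upper of the two middle values instead of their average, which is one of the small ways such an implementation can differ from a generic median.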

Here’s the resulting image:

And here’s the performance of our new implementation:

    Code                                       Elapsed milliseconds
    median_local_threshold1(IMAGE, 11, 10)     86
    median_local_threshold2(IMAGE, 11, 10)     18

That’s better!

Step 3: Stop recalculating the histogram from scratch

Our algorithm uses a rolling neighborhood or window over the image, calculating the median for a window around each pixel. And the neighborhood for one pixel has a significant overlap with the neighborhood of the next pixel. For example, let’s say we’re looking at a neighborhood size of 3. We might calculate the median of this area:

    ......
    .\\\..
    .\\\..
    .\\\..
    ......
    ......

And then when we process the next pixel we’ll calculate the median of this area:

    ......
    ..///.
    ..///.
    ..///.
    ......
    ......

If we superimpose them, we can see there’s an overlap, the X:

    ......
    .\XX/.
    .\XX/.
    .\XX/.
    ......
    ......

Given the histogram for the first pixel, if we remove the values marked with \ and add the ones marked with /, we’ve calculated the exact histogram for the second pixel. So for a 3×3 neighborhood, instead of processing 3 columns we process 2, a minor improvement. For an 11×11 neighborhood, we will go from processing 11 columns to 2 columns, a much more significant improvement.
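The remove-one-side, add-the-other update can be checked on a simplified one-dimensional case. This is a sketch under assumptions: it uses a single made-up row and a window of 3 values rather than a full 2-D neighborhood, and `window_histogram` is a hypothetical helper used only to recompute the histogram from scratch for comparison.

```python
import numpy as np


def window_histogram(row, start, stop):
    """Histogram of row[start:stop], recomputed from scratch."""
    hist = np.zeros(256, dtype=np.int64)
    for v in row[start:stop]:
        hist[v] += 1
    return hist


row = np.array([10, 20, 20, 30, 40, 10, 50], dtype=np.uint8)
width = 3

# Start with the histogram of the first window, row[0:3].
hist = window_histogram(row, 0, width)

matches = []
for start in range(1, len(row) - width + 1):
    hist[row[start - 1]] -= 1          # value leaving on the left
    hist[row[start + width - 1]] += 1  # value entering on the right
    expected = window_histogram(row, start, start + width)
    matches.append(bool(np.array_equal(hist, expected)))

print(matches)
```

Each slide touches two values regardless of the window width, while recomputing touches `width` values; in two dimensions the same argument applies to whole columns.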

Here’s what the code looks like:

@jit
def median_local_threshold3(img, neighborhood_size, offset):
    assert neighborhood_size % 2 == 1
    radius = (neighborhood_size - 1) // 2
    result = np.empty(img.shape, dtype=np.uint8)
    histogram = np.empty((256,), dtype=np.uint32)

    for i in range(img.shape[0]):
        min_y = max(i - radius, 0)
        max_y = min(i + radius + 1, img.shape[0])

        # Populate histogram as if we started one pixel to the left:
        histogram[:] = 0
        initial_neighborhood = img[min_y:max_y, 0:radius].ravel()
        for k in range(len(initial_neighborhood)):
            histogram[initial_neighborhood[k]] += 1

        for j in range(img.shape[1]):
            min_x = max(j - radius, 0)
            max_x = min(j + radius + 1, img.shape[1])

            # 😎 Instead of recalculating histogram from scratch, re-use the
            # previous pixel's histogram.

            # Subtract left-most column we don't want anymore:
            if min_x > 0:
                for y in range(min_y, max_y):
                    histogram[img[y, min_x - 1]] -= 1

            # Add new right-most column, as long as the window's right edge
            # (column j + radius) is still inside the image:
            if j + radius < img.shape[1]:
                for y in range(min_y, max_y):
                    histogram[img[y, max_x - 1]] += 1

            # Find the median from the updated histogram:
            half_neighborhood_size = ((max_y - min_y) * (max_x - min_x)) // 2
            for l in range(256):
                half_neighborhood_size -= histogram[l]
                if half_neighborhood_size < 0:
                    break
            median = l

            if img[i, j] > median - offset:
                result[i, j] = 255
            else:
                result[i, j] = 0
    return result

NUMBA_RESULT3 = median_local_threshold3(IMAGE, 11, 10)

Here’s the resulting image:

And here’s the performance of our latest code:

Code                                      Elapsed microseconds
median_local_threshold2(IMAGE, 11, 10)    17,066
median_local_threshold3(IMAGE, 11, 10)    6,386

Step 4: Adaptive heuristics

Notice that a median’s definition is symmetrical:

  1. The first value that is smaller than the highest 50% values.
  2. Or, the first value that is larger than the lowest 50% values. We used this definition in our code above, adding up buckets from the smallest to the largest.

Depending on the distribution of values, one approach to adding up buckets to find the median may be faster than the other. For example, given a 0-255 range, if the median is going to be 10 we want to start from the smallest bucket to minimize additions. But if the median is going to be 200, we want to start from the largest bucket.
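To make the two scan directions concrete, here is a minimal sketch in plain Python and NumPy; the function name `median_from_histogram` and the sample values are invented for illustration. It walks the buckets from either end until half of the values have been passed, the same stopping rule as the loops in the code above.

```python
import numpy as np


def median_from_histogram(histogram, count, from_top=False):
    """Walk the buckets from one end until half of the values are passed."""
    remaining = count // 2
    buckets = range(255, -1, -1) if from_top else range(256)
    for value in buckets:
        remaining -= histogram[value]
        if remaining < 0:
            return value
    raise ValueError("histogram holds fewer than `count` values")


# Nine sample values, so there is a single middle value (9).
values = np.array([3, 200, 7, 7, 250, 9, 12, 7, 180], dtype=np.uint8)
histogram = np.bincount(values, minlength=256)

from_bottom = median_from_histogram(histogram, len(values))
from_top = median_from_histogram(histogram, len(values), from_top=True)
assert from_bottom == from_top == np.median(values)
```

With an odd number of values both directions land on the same bucket. With an even number there are two middle values, and with this stopping rule the upward scan returns the higher one while the downward scan returns the lower one, so results can differ slightly wherever the neighborhood has an even number of pixels.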

So which side should we start from? One reasonable heuristic is to look at the previous median we calculated, which most of the time will be quite similar to the new median. If the previous median was small, start from the smallest buckets; if it was large, start from the largest buckets.

@jit
def median_local_threshold4(img, neighborhood_size, offset):
    assert neighborhood_size % 2 == 1
    radius = (neighborhood_size - 1) // 2
    result = np.empty(img.shape, dtype=np.uint8)
    histogram = np.empty((256,), dtype=np.uint32)
    median = 0

    for i in range(img.shape[0]):
        min_y = max(i - radius, 0)
        max_y = min(i + radius + 1, img.shape[0])

        histogram[:] = 0
        initial_neighborhood = img[min_y:max_y, 0:radius].ravel()
        for k in range(len(initial_neighborhood)):
            histogram[initial_neighborhood[k]] += 1

        for j in range(img.shape[1]):
            min_x = max(j - radius, 0)
            max_x = min(j + radius + 1, img.shape[1])

            if min_x > 0:
                for y in range(min_y, max_y):
                    histogram[img[y, min_x - 1]] -= 1

            # Only add a column while the window's right edge (column
            # j + radius) is still inside the image:
            if j + radius < img.shape[1]:
                for y in range(min_y, max_y):
                    histogram[img[y, max_x - 1]] += 1

            half_neighborhood_size = ((max_y - min_y) * (max_x - min_x)) // 2

            # 😎 Find the median from the updated histogram, choosing
            # the starting side based on the previous median; we can go from
            # the leftmost bucket to the rightmost bucket, or in reverse:
            the_range = range(256) if median < 127 else range(255, -1, -1)
            for l in the_range:
                half_neighborhood_size -= histogram[l]
                if half_neighborhood_size < 0:
                    median = l
                    break

            if img[i, j] > median - offset:
                result[i, j] = 255
            else:
                result[i, j] = 0
    return result

NUMBA_RESULT4 = median_local_threshold4(IMAGE, 11, 10)

The end result is 25% faster. Since the heuristic is tied to the image contents, the performance impact will depend on the image.

Code                                      Elapsed microseconds
median_local_threshold3(IMAGE, 11, 10)    6,381
median_local_threshold4(IMAGE, 11, 10)    4,920

The big picture

Here’s a performance comparison of all the versions of the code:

Code                                             Elapsed microseconds
skimage_median_local_threshold(IMAGE, 11, 10)    76,213
median_local_threshold1(IMAGE, 11, 10)           86,494
median_local_threshold2(IMAGE, 11, 10)           17,145
median_local_threshold3(IMAGE, 11, 10)           6,398
median_local_threshold4(IMAGE, 11, 10)           4,925

Let’s go over the steps we went through:

  1. Switch to a compiled language: this gives us more control.
  2. Reimplement the algorithm taking advantage of constrained requirements: our median only needed to handle uint8, so a histogram was a reasonable solution.
  3. Reuse previous calculations to prevent repetition: our histogram for the neighborhood of a pixel is quite similar to that of the previous pixel. This means we can reuse some of the calculations.
  4. Adaptively tweak the algorithm at runtime: as we run on an actual image, we use what we’ve learned up to this point to hopefully run faster later on. The decision from which side of the histogram to start is arbitrary in general. But in this specific algorithm, the overlapping pixel neighborhoods mean we can make a reasonable guess.

This process demonstrates part of why generic libraries may be slower than custom code you write for your particular use case and your particular data.

Next steps

What else can you do to speed up this algorithm? Here are some ideas:

  • There may be a faster alternative to histogram-based medians.
  • We’re not fully taking advantage of histogram overlap; there’s also overlap between rows.
  • The cumulative sum in the histogram doesn’t benefit from instruction-level parallelism or SIMD. It’s possible that using one of those would result in faster results even if it uses more instructions.
  • So far the code has only used a single CPU. Given each row is calculated independently, parallelism would probably work well if done in horizontal stripes, probably taller than one pixel so as to maximize utilization of memory caches.
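The last idea, splitting the work into horizontal stripes, can be sketched as follows. This is only an illustration of the decomposition, under several assumptions: `threshold_rows` is a slow stand-in written for this example (it uses `np.median`, not the histogram code above), `threshold_in_stripes` and the stripe height are made up, and ordinary Python threads will not run this pure-Python loop any faster. Real gains would need compiled workers that can actually run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def threshold_rows(img, radius, offset, row_start, row_end):
    """Slow reference threshold for rows [row_start, row_end) only."""
    height, width = img.shape
    out = np.empty((row_end - row_start, width), dtype=np.uint8)
    for i in range(row_start, row_end):
        y0, y1 = max(i - radius, 0), min(i + radius + 1, height)
        for j in range(width):
            x0, x1 = max(j - radius, 0), min(j + radius + 1, width)
            median = np.median(img[y0:y1, x0:x1])
            out[i - row_start, j] = 255 if img[i, j] > median - offset else 0
    return out


def threshold_in_stripes(img, radius, offset, stripe_height=4, workers=4):
    """Split the rows into stripes and hand each stripe to a worker."""
    height = img.shape[0]
    bounds = [(start, min(start + stripe_height, height))
              for start in range(0, height, stripe_height)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Every worker reads the shared image but writes only its own stripe.
        parts = pool.map(lambda b: threshold_rows(img, radius, offset, *b), bounds)
        return np.vstack(list(parts))


rng = np.random.default_rng(0)
sample = rng.integers(0, 256, size=(12, 10), dtype=np.uint8)
whole = threshold_rows(sample, 2, 10, 0, sample.shape[0])
striped = threshold_in_stripes(sample, 2, 10)
assert np.array_equal(whole, striped)
```

Because each output row depends only on the read-only input image, the stripes need no coordination; they only have to be stitched back together in order.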

Want to learn more about optimizing compiled code for Python data processing? This article is an extract from a book I’m working on; test readers are currently going through initial drafts. Aimed at Python developers, data scientists, and scientists, the book covers topics like instruction-level parallelism, memory caches, and other performance optimization techniques. Learn more and sign up to get updates here.

Categories: FLOSS Project Planets

Matthew Palmer: GitHub's Missing Tab

Planet Debian - Wed, 2024-05-29 20:00

Visit any GitHub project page, and the first thing you see is something that looks like this:

“Code”, that’s fairly innocuous, and it’s what we came here for. The “Issues” and “Pull Requests” tabs, with their count of open issues, might give us some sense of “how active” the project is, or perhaps “how maintained”. Useful information for the casual visitor, undoubtedly.

However, there’s another user community that visits this page on the regular, and these same tabs mean something very different to them.

I’m talking about the maintainers (or, more commonly, maintainer, singular). When they see those tabs, all they see is work. The “Code” tab is irrelevant to them – they already have the code, and know it possibly better than they know their significant other(s) (if any). “Issues” and “Pull Requests” are just things that have to be done.

I know for myself, at least, that it is demoralising to look at a repository page and see nothing but work. I’d be surprised if it didn’t contribute in some small way to maintainers just noping the fudge out.

A Modest Proposal

So, here’s my thought. What if instead of the repo tabs looking like the above, they instead looked like this:

My conception of this is that it would, essentially, be a kind of “yearbook”, that people who used and liked the software could scribble their thoughts on. With some fairly straightforward affordances elsewhere to encourage its use, it could be a powerful way to show maintainers that they are, in fact, valued and appreciated.

There are a number of software packages I’ve used recently, that I’d really like to say a general “thanks, this is awesome!” to. However, I’m not about to make the Issues tab look even scarier by creating an “issue” to say thanks, and digging up an email address is often surprisingly difficult, and wouldn’t be a public show of my gratitude, which I believe is a valuable part of the interaction.

You Can’t Pay Your Rent With Kudos

Absolutely you cannot. A means of expressing appreciation in no way replaces the pressing need to figure out a way to allow open source developers to pay their rent. Conversely, however, the need to pay open source developers doesn’t remove the need to also show those people that their work is appreciated and valued by many people around the world.

Anyway, who knows a senior exec at GitHub? I’ve got an idea I’d like to run past them…

Categories: FLOSS Project Planets

Anarcat: Playing with fonts again

Planet Python - Wed, 2024-05-29 17:38

I am getting increasingly frustrated by Fira Mono's lack of italic support so I am looking at alternative fonts again.

Commit Mono

This time I seem to be settling on either Commit Mono or Space Mono. For now I'm using Commit Mono because it's a little more compressed than Fira and does have an italic version. I don't like how Space Mono's parentheses (()) are "squarish", they feel visually ambiguous with the square brackets ([]), a big no-no for my primary use case (code).

So here I am using a new font, again. It required changing a bunch of configuration files in my home directory (which is in a private repository, sorry) and Emacs configuration (thankfully that's public!).

One gotcha is I realized I didn't actually have a global font configuration in Emacs, as some Faces define their own font family, which overrides the frame defaults.

This is what it looks like, before:

Fira Mono

After:

Commit Mono

(Notice how those screenshots are not sharp? I'm surprised too. The originals look sharp on my display, I suspect this is something to do with the Wayland transition. I've tried with both grim and flameshot, for what it's worth.)

They are pretty similar! Commit Mono feels a bit more vertically compressed, maybe too much so, actually -- the line height feels too low. But it's heavily customizable so that's something that's relatively easy to fix, if it's really a problem. Its weight is also a little heavier and wider than Fira which I find a little distracting right now, but maybe I'll get used to it.

All characters seem properly distinguishable, although, if I'd really want to nitpick I'd say the © and ® are too different, with the latter (REGISTERED SIGN) being way too small, basically unreadable here. Since I see this sign approximately never, it probably doesn't matter at all.

I like how the ampersand (&) is more traditional, although I'll miss the exotic one Fira produced... I like how the back quotes (`, GRAVE ACCENT) drop down low, nicely aligned with the apostrophe. As I mentioned before, I like how the bar on the "f" aligns with the tops of the other letters, something in Fira Mono that really annoys me now that I've noticed it (it's not aligned!).

A UTF-8 test file

Here's the test sheet I've made up to test various characters. I could have sworn I had a good one like this lying around somewhere but couldn't find it so here it is, I guess.

US keyboard coverage:

abcdefghijklmnopqrstuvwxyz`1234567890-=[]\;',./
ABCDEFGHIJKLMNOPQRSTUVWXYZ~!@#$%^&*()_+{}|:"<>?

latin1 coverage: ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿
EURO SIGN, TRADE MARK SIGN: €™

ambiguity test:

e¢coC0ODQ iI71lL!|¦
b6G&0B83  [](){}/\.…·•
zs$S52Z%  ´`'"‘’“”«»

all characters in a sentence, uppercase:

the quick fox jumps over the lazy dog
THE QUICK FOX JUMPS OVER THE LAZY DOG

same, in french:

voix ambiguë d'un cœur qui, au zéphyr, préfère les jattes de kiwis.
VOIX AMBIGUË D'UN CŒUR QUI, AU ZÉPHYR, PRÉFÈRE LES JATTES DE KIWIS.

Ligatures test:

-<< -< -<- <-- <--- <<- <- -> ->> --> ---> ->- >- >>-
=<< =< =<= <== <=== <<= <= => =>> ==> ===> =>= >= >>=
<-> <--> <---> <----> <=> <==> <===> <====> :: ::: __
<~~ </ </> /> ~~> == != /= ~= <> === !== !=== =/= =!=
<: := *= *+ <* <*> *> <| <|> |> <. <.> .> +* =* =: :>
(* *) /* */ [| |] {| |} ++ +++ \/ /\ |- -| <!-- <!---

Box drawing alignment tests:                                          █
  ╔══╦══╗  ┌──┬──┐  ╭──┬──╮  ╭──┬──╮  ┏━━┳━━┓  ┎┒┏┑   ╷  ╻ ┏┯┓ ┌┰┐    ▉ ╱╲╱╲╳╳╳
  ║┌─╨─┐║  │╔═╧═╗│  │╒═╪═╕│  │╓─╁─╖│  ┃┌─╂─┐┃  ┗╃╄┙  ╶┼╴╺╋╸┠┼┨ ┝╋┥    ▊ ╲╱╲╱╳╳╳
  ║│╲ ╱│║  │║   ║│  ││ │ ││  │║ ┃ ║│  ┃│ ╿ │┃  ┍╅╆┓   ╵  ╹ ┗┷┛ └┸┘    ▋ ╱╲╱╲╳╳╳
  ╠╡ ╳ ╞╣  ├╢   ╟┤  ├┼─┼─┼┤  ├╫─╂─╫┤  ┣┿╾┼╼┿┫  ┕┛┖┚     ┌┄┄┐ ╎ ┏┅┅┓ ┋ ▌ ╲╱╲╱╳╳╳
  ║│╱ ╲│║  │║   ║│  ││ │ ││  │║ ┃ ║│  ┃│ ╽ │┃  ░░▒▒▓▓██ ┊  ┆ ╎ ╏  ┇ ┋ ▍
  ║└─╥─┘║  │╚═╤═╝│  │╘═╪═╛│  │╙─╀─╜│  ┃└─╂─┘┃  ░░▒▒▓▓██ ┊  ┆ ╎ ╏  ┇ ┋ ▎
  ╚══╩══╝  └──┴──┘  ╰──┴──╯  ╰──┴──╯  ┗━━┻━━┛           └╌╌┘ ╎ ┗╍╍┛ ┋ ▏▁▂▃▄▅▆▇█

Dashes alignment test:

HYPHEN-MINUS, MINUS SIGN, EN, EM DASH, HORIZONTAL BAR, LOW LINE
--------------------------------------------------
−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
––––––––––––––––––––––––––––––––––––––––––––––––––
——————————————————————————————————————————————————
――――――――――――――――――――――――――――――――――――――――――――――――――
__________________________________________________

So there you have it, I got completely nerd sniped by typography again. Now I can go back to writing a too-long proposal again.

Sources and inspiration for the above:

  • the unicode(1) command, to lookup individual characters to disambiguate, for example, - (U+002D HYPHEN-MINUS, the minus sign next to zero on US keyboards) and − (U+2212 MINUS SIGN, a math symbol)

  • searchable list of characters and their names - roughly equivalent to the unicode(1) command, but in one page, amazingly the /usr/share/unicode database doesn't have any one file like this

  • bits/UTF-8-Unicode-Test-Documents - full list of UTF-8 characters

  • UTF-8 encoded plain text file - nice examples of edge cases, curly quotes example and box drawing alignment test which, incidentally, showed me I needed specific faces customisation in Emacs to get the Markdown code areas to display properly, also the idea of comparing various dashes

  • sample sentences in many languages - unused, "Sentences that contain all letters commonly used in a language"

  • UTF-8 sampler - unused, similar
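For readers without the unicode(1) command at hand, Python's standard `unicodedata` module offers the same kind of lookup. A minimal sketch, naming the six characters used in the dashes alignment test above (the variable name `dashes` is just for this example):

```python
import unicodedata

# The six characters from the dashes alignment test, in the same order.
dashes = "-\u2212\u2013\u2014\u2015_"

for ch in dashes:
    # Print the code point and the official Unicode character name.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

This makes it easy to tell apart characters that look nearly identical in a given font, such as U+002D and U+2212.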

Other fonts

In my previous blog post about fonts, I had a list of alternative fonts, but it seems people are not digging through this, so I figured I would redo the list here to preempt "but have you tried Jetbrains mono" kind of comments.

My requirements are:

  • no ligatures: yes, in the previous post, I wanted ligatures but I have changed my mind. after testing this, I find them distracting, confusing, and they often break the monospace nature of the display
  • monospace: this is to display code
  • italics: often used when writing Markdown, where I do make use of italics... Emacs falls back to underlining text when lacking italics which is hard to read
  • free-ish, ultimately should be packaged in Debian

Here is the list of alternatives I have considered in the past and why I'm not using them:

  • agave: recommended by tarzeau, not sure I like the lowercase a, a bit too exotic, packaged as fonts-agave

  • Cascadia code: optional ligatures, multilingual, not liking the alignment, ambiguous parenthesis (look too much like square brackets), new default for Windows Terminal and Visual Studio, packaged as fonts-cascadia-code

  • Fira Code: ligatures, was using Fira Mono from which it is derived, lacking italics except for forks, interestingly, Fira Code succeeds the alignment test but Fira Mono fails to show the X signs properly! packaged as fonts-firacode

  • Hack: no ligatures, very similar to Fira, italics, good alternative, fails the X test in box alignment, packaged as fonts-hack

  • Hermit: no ligatures, smaller, alignment issues in box drawing and dashes, packaged as fonts-hermit somehow part of cool-retro-term

  • IBM Plex: irritating website, replaces Helvetica as the IBM corporate font, no ligatures by default, italics, proportional alternatives, serifs and sans, multiple languages, partial failure in box alignment test (X signs), fancy curly braces contrast perhaps too much with the rest of the font, packaged in Debian as fonts-ibm-plex

  • Intel One Mono: nice legibility, no ligatures, alignment issues in box drawing, not packaged in Debian

  • Iosevka: optional ligatures, italics, multilingual, good legibility, has a proportional option, serifs and sans, line height issue in box drawing, fails dash test, not in Debian

  • Jetbrains Mono: (mandatory?) ligatures, good coverage, originally rumored to be not DFSG-free (Debian Free Software Guidelines) but ultimately packaged in Debian as fonts-jetbrains-mono

  • Monoid: optional ligatures, feels much "thinner" than Jetbrains, not liking alignment or spacing on that one, ambiguous 2Z, problems rendering box drawing, packaged as fonts-monoid

  • Mononoki: no ligatures, looks good, good alternative, suggested by the Debian fonts team as part of fonts-recommended, problems rendering box drawing, em dash bigger than en dash, packaged as fonts-mononoki

  • Source Code Pro: italics, looks good, but dash metrics look whacky, not in Debian

  • spleen: bitmap font, old school, spacing issue in box drawing test, packaged as fonts-spleen

  • sudo: personal project, no ligatures, zero originally not dotted, relied on metrics for legibility, spacing issue in box drawing, not in Debian

So, if I get tired of Commit Mono, I will probably try, in order:

  1. Hack
  2. Jetbrains Mono
  3. IBM Plex Mono

Iosevka, Mononoki and Intel One Mono are also good options, but have alignment problems. Iosevka is particularly disappointing as the EM DASH metrics are just completely wrong (much too wide).

This was tested using the Programming fonts site which has all the above fonts, which cannot be said of Font Squirrel or Google Fonts, amazingly. Other such tools:

Categories: FLOSS Project Planets

Antoine Beaupré: 2024-05-29-playing-with-fonts-again

Planet Debian - Wed, 2024-05-29 17:38


I am getting increasingly frustrated by Fira Mono's lack of italic support so I am looking at alternative fonts again.

This time I seem to be settling on either Commit Mono or Space Mono. For now I'm using Commit Mono because it's a little more compressed than Fira and does have an italic version. I don't like how Space Mono's parentheses (()) are "squarish", they feel visually ambiguous with the square brackets ([]), a big no-no for my primary use case (code).

So here I am using a new font, again. It required changing a bunch of configuration files in my home directory (which is in a private repository, sorry) and Emacs configuration (thankfully that's public!).

One gotcha is I realized I didn't actually have a global font configuration in Emacs, as some Faces define their own font family, which overrides the frame defaults.

This is what it looks like, before:

Fira Mono

After:

Commit Mono

(Notice how those screenshots are not sharp? I'm surprised too. The originals look sharp on my display, I suspect this is something to do with the Wayland transition. I've tried with both grim and flameshot, for what it's worth.)

They are pretty similar! Commit Mono feels a bit more vertically compressed, maybe too much so, actually -- the line height feels too low. But it's heavily customizable so that's something that's relatively easy to fix, if it's really a problem. Its weight is also a little heavier and wider than Fira which I find a little distracting right now, but maybe I'll get used to it.

All characters seem properly distinguishable, although, if I'd really want to nitpick I'd say the © and ® are too different, with the latter (REGISTERED SIGN) being way too small, basically unreadable here. Since I see this sign approximately never, it probably doesn't matter at all.

I like how the ampersand (&) is more traditional, although I'll miss the exotic one Fira produced... I like how the back quotes (`, GRAVE ACCENT) drop down low, nicely aligned with the apostrophe. As I mentioned before, I like how the bar on the "f" aligns with the tops of the other letters, something in Fira Mono that really annoys me now that I've noticed it (it's not aligned!).

Here's the test sheet I've made up to test various characters. I could have sworn I had a good one like this lying around somewhere but couldn't find it so here it is, I guess.

ASCII test

abcdefghijklmnopqrstuvwxyz1234567890-=
ABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()_+

ambiguous characters

&iIL7l1!|[](){}/\oO0DQ8B;:,./?~`'"$

all characters in a sentence, uppercase

the quick fox jumps over the lazy dog
THE QUICK FOX JUMPS OVER THE LAZY DOG

same, in french

voix ambiguë d'un cœur qui, au zéphyr, préfère les jattes de kiwis.
VOIX AMBIGUË D'UN CŒUR QUI, AU ZÉPHYR, PRÉFÈRE LES JATTES DE KIWIS.

Box drawing alignment tests:                                          █
                                                                      ▉
  ╔══╦══╗  ┌──┬──┐  ╭──┬──╮  ╭──┬──╮  ┏━━┳━━┓  ┎┒┏┑   ╷  ╻ ┏┯┓ ┌┰┐    ▊ ╱╲╱╲╳╳╳
  ║┌─╨─┐║  │╔═╧═╗│  │╒═╪═╕│  │╓─╁─╖│  ┃┌─╂─┐┃  ┗╃╄┙  ╶┼╴╺╋╸┠┼┨ ┝╋┥    ▋ ╲╱╲╱╳╳╳
  ║│╲ ╱│║  │║   ║│  ││ │ ││  │║ ┃ ║│  ┃│ ╿ │┃  ┍╅╆┓   ╵  ╹ ┗┷┛ └┸┘    ▌ ╱╲╱╲╳╳╳
  ╠╡ ╳ ╞╣  ├╢   ╟┤  ├┼─┼─┼┤  ├╫─╂─╫┤  ┣┿╾┼╼┿┫  ┕┛┖┚     ┌┄┄┐ ╎ ┏┅┅┓ ┋ ▍ ╲╱╲╱╳╳╳
  ║│╱ ╲│║  │║   ║│  ││ │ ││  │║ ┃ ║│  ┃│ ╽ │┃  ░░▒▒▓▓██ ┊  ┆ ╎ ╏  ┇ ┋ ▎
  ║└─╥─┘║  │╚═╤═╝│  │╘═╪═╛│  │╙─╀─╜│  ┃└─╂─┘┃  ░░▒▒▓▓██ ┊  ┆ ╎ ╏  ┇ ┋ ▏
  ╚══╩══╝  └──┴──┘  ╰──┴──╯  ╰──┴──╯  ┗━━┻━━┛           └╌╌┘ ╎ ┗╍╍┛ ┋  ▁▂▃▄▅▆▇█

MIDDLE DOT, BULLET, HORIZONTAL ELLIPSIS: ·•…
curly ‘single’ and “double” quotes
ACUTE ACCENT, GRAVE ACCENT: ´`
EURO SIGN: €
unicode A1-BF: ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿

HYPHEN-MINUS, MINUS SIGN, EN, EM DASH, HORIZONTAL BAR, LOW LINE
--------------------------------------------------
−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
––––––––––––––––––––––––––––––––––––––––––––––––––
——————————————————————————————————————————————————
――――――――――――――――――――――――――――――――――――――――――――――――――
__________________________________________________

So there you have it, I got completely nerd sniped by typography again. Now I can go back to writing a too-long proposal again.

Sources and inspiration for the above:

  • the unicode(1) command, to lookup individual characters to disambiguate, for example, - (U+002D HYPHEN-MINUS, the minus sign next to zero on US keyboards) and − (U+2212 MINUS SIGN, a math symbol)

  • searchable list of characters and their names - roughly equivalent to the unicode(1) command, but in one page, amazingly the /usr/share/unicode database doesn't have any one file like this

  • bits/UTF-8-Unicode-Test-Documents - full list of UTF-8 characters

  • UTF-8 encoded plain text file - nice examples of edge cases, curly quotes example and box drawing alignment test which, incidentally, showed me I needed specific faces customisation in Emacs to get the Markdown code areas to display properly, also the idea of comparing various dashes

  • sample sentences in many languages - unused, "Sentences that contain all letters commonly used in a language"

  • UTF-8 sampler - unused, similar

Categories: FLOSS Project Planets

Mike Driscoll: Episode 42 – Harlequin – The SQL IDE for Your Terminal

Planet Python - Wed, 2024-05-29 17:06

This episode focuses on the Harlequin application, a Python SQL IDE for your terminal written using the amazing Textual package.

I was honored to have Ted Conbeer, the creator of Harlequin, on the show to discuss his creation and the other things he does with Python.

Specifically, we focused on the following topics:

  • Favorite Python packages
  • Origins of Harlequin
  • Why program for the terminal versus a GUI
  • Lessons learned in creating the tool
  • Asyncio
  • and more!
Links

The post Episode 42 – Harlequin – The SQL IDE for Your Terminal appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

Django Weblog: Django Enhancement Proposal 14: Background Workers

Planet Python - Wed, 2024-05-29 15:04

As of today, DEP-14 has been approved 🛫

The DEP was written and stewarded by Jake Howard. A very enthusiastic community has been active with feedback and encouragement, while the Django Steering Council gave the final inputs before its formal acceptance. The implementation of DEP-14 is expected to be a major leap forward for the “batteries included” philosophy of Django.

Whilst Django is a web framework, there's more to web applications than just the request-response lifecycle. Sending emails, communicating with external services or running complex actions should all be done outside the request-response cycle.

Django doesn't have a first-party solution for long-running tasks; however, the ecosystem is filled with incredibly popular frameworks, all of which interact with Django in slightly different ways. Other frameworks such as Laravel have background workers built-in, allowing them to push tasks into the background to be processed at a later date, without requiring the end user to wait for them to occur.

Library maintainers must implement support for any possible task backend separately, should they wish to offload functionality to the background. This includes smaller libraries, but also larger meta-frameworks with their own package ecosystem such as Wagtail.

This proposal sets out to provide an interface and base implementation for long-running background tasks in Django.

Future work

The DEP will now move on to the Implementation phase before being merged into Django itself.

If you would like to help or try it out, go have a look at django-tasks, a separate reference implementation by Jake Howard, the author of the DEP.

Jake will also be speaking about the DEP in his talk at DjangoCon Europe 2024 in Vigo next week.

Categories: FLOSS Project Planets

Théodore 'nod_' Biadala: Sponsored Drupal Contribution

Planet Drupal - Wed, 2024-05-29 15:00

Back in March I started to look at sponsors for the time I’m spending working on the Drupal core issue queue. It’s been a few months and I wanted to go back on all the sponsored commits I made as a Frontend Framework Manager, to show how the sponsorships helped Drupal for the past few months.

The sponsorship offer is simple: you send me a fixed monthly fee of 2500€, and I share the issue credit of every Drupal core commit that I make. I’m very thankful to Palantir.net and OPTASY who are sponsoring me. Thanks to them I was able to increase the number of commits I can make to Drupal core. In the last 3 months I committed 61 issues (worth 610 weighted issue credits) and the more sponsors I have, the more time I can spend reviewing and committing issues.

  1. Differentiate visually dragging with and without hierarchy A nice improvement for editors working a lot with lists and trees
  2. Sticky table header is not sticky if --drupal-displace-offset-top is not defined
  3. [jQuery 4] ajax.js and jquery.form.js use deprecated function $.parseJSON() Preparing for the next release of jQuery 4 with some cleanup.
  4. cspell check is broken in commit-code-check.sh Sometimes we break the CI and it needs to be fixed 🤷
  5. CKEditor admin toolbar config buttons using ::before to add content: have invalid screen reader text It takes dedication to land those accessibility fixes, kudos to our accessibility contributors.
  6. Linking in CKEditor 5: URLs with top-level domain but without protocol should get a protocol added automatically
  7. #states disable property has stopped working for submit button and other elements Sometimes when we clean-up code, we clean too much and break some other parts of the code
  8. Setting width for sticky-header is broken
  9. Negotiate max width/height of oEmbed assets more intelligently
  10. States API doesn't work with multiple select fields This was a 13-year-old issue! It _always_ feels good to close an issue that old.
  11. Add deprecation/bc support for library-overrides when files are moved Making sure backwards compatibility is working and useful
  12. Remove default event from collpased nav-tabs button
  13. [DrupalHtmlEngine] HTML-reserved characters (>, <, &) in <script> and <style> tag are converted to HTML entities It happens that we fix things for use cases that stretch the reasonable (like having whole script tags in a WYSIWYG field)
  14. Olivero: Show content preview checkbox is not center aligned with the layout builder buttons. Even a minor issue of a misaligned text by a few pixels is worth fixing
  15. Drupal.theme.progressBar() does not escape output correctly
  16. filter_autop should ignore twig.config debug html comments Making sure Developer experience doesn't impact regular users
  17. tablePositionSticky should not be called on a non-array variable
  18. CKEditor 5 table cell vertical align "middle" doesn't work
  19. Move system/base component CSS to respective libraries where they exist A surprising performance improvement. There are still some low hanging fruits to improve the default frontend performance of Drupal
  20. Remove country setting from the installer When you don't need a piece of data, just don't collect it
  21. Media Library widget display doesn't return to first page on applying filters
  22. Deprecate and remove the AJAX replace method That was a leftover D7 era deprecation
  23. Claro should use libraries-extend for views_ui.css Even in core it happens that we don't use the right way to do something
  24. Removal :tabbable usage in dialog.js Some more jQuery 4 preparation
  25. Close icon is ovrlapping the title text in modal in claro Yes, typos can make it in the commit log
  26. Convert Olivero's teaser into a single directory component Slowly but surely we're adding Single directory components to Drupal core
  27. Refactor (if feasible) uses of the jQuery animate function to use Vanilla/native More CSS awesomeness making JavaScript code disappear
  28. [11.x] Update to jQuery 4.0.x beta Drupal staying on the bleeding edge of frontend development :)
  29. Refactor some uses of the jQuery parents function to use vanillaJS Did a small post earlier about this, CSS is really very good
  30. [regression] Uncaught TypeError: Cannot read properties of null (reading 'style') (toolbar.js)
  31. JSDoc for ajax command "changed" is incorrect There was a bunch of documentation fixes around this time
  32. menu_heading_id variable is not set in menu-region--footer.html.twig
  33. Add @file documentation to navigation.html.twig layout template
  34. Add @file documentation to menu-region--footer.html.twig template
  35. Views UI action buttons create janky layout shift on page load Polishing the loading of pages with heavy JS usage is important to show we care about UX
  36. Remove bottom radius on hover state of expanded sub menu item
  37. Setting empty URL when making embedded media a link in CKEditor5 causes JS errors
  38. Update color of submenu title text
  39. Collapsed nav-tabs status not exposed to screen reader There is a good number of accessibility fixes after this one, always nice to commit
  40. Incorrect padding on child menu items
  41. Claro: Form labels that are disabled have too low color contrast
  42. Claro should not hardcode decimal list style type for <ol>
  43. Some of string comparisons should use String.prototype.startsWith() or String.prototype.endsWith() Removing regular expressions as much as we can is a noble goal
  44. Location of "Skip to Main" link below admin toolbar in Claro is problematic for screen magnifier users
  45. Focus states on mobile second level navigation items can get cut off in Olivero
  46. Regression: Shortcuts menu flickers when the page is loaded Those toolbar flickering issues are tricky. Thankfully the new navigation module code is simpler than the existing Toolbar code, so it's much much easier to deal with
  47. escapeAdmin.js functionality should be removed(it not used anymore) Removed the feature that removed the overlay… for now, escapeAdmin will be back one way or another
  48. Navigation module offsets the Olivero skip link element
  49. Umami page.tpl.php breaks block placeholders Sometimes themes can break really nice Drupal features (like bigpipe)
  50. Claro CSS for dropbutton items adds large gap of white space
  51. Replace dialogContentResize jQuery event with CustomEvent Those events issues are really exciting, we're moving away slowly from jQuery for event management
  52. Umami views should use responsive grid Another case of core not using the awesome features we provide, not anymore :)
  53. Claro highlighted row not communicated to keyboard users
  54. Fix overflow visibility for wrapper content in navigation CSS
  55. Claro details component does not have the right class
  56. Make drupal.tableheader only use CSS for sticky table headers I will always welcome CSS-removing-JS patches
  57. Mismatch between implementation and description for Drupal.Message.prototype.remove().
  58. "Skip to main content" link skips over content that is essential to the page, banner role should be for global content
  59. Add pdureau as a co-maintainer for the Theme API with a focus on SDC Adding new maintainers is too rare. In this case the community is better for having him around
  60. Choose an icon for the Announcements link
  61. Remove deprecated moved_files entries in core

Many of these issues are maintenance-focused: the work isn’t shiny or exciting, but it needs to be done. Sponsoring big initiatives like Starshot is exciting, but let’s not forget the unexciting day-to-day work that keeps things running. If you’re interested in supporting my work on Drupal core and keeping the frontend fixes coming, consider sponsoring me.


PyCharm: PyCharm 2024.1.2: What’s New!

Planet Python - Wed, 2024-05-29 14:54

PyCharm 2024.1.2 is here with features designed to enhance your productivity and streamline your development workflow. This update includes support for DRF viewsets and routers in the Endpoints tool window, code assistance for TypedDict and Unpack, and improved debugger performance when handling large collections.

You can download the latest version from our download page or update your current version through our free Toolbox App.

For more details, please visit our What’s New page.

Download PyCharm 2024.1.2

Key features

Support for DRF viewsets and routers in the Endpoints tool window

When working with the Django REST Framework in PyCharm, not only can you specify function-based or class-based views in the path, but you can now also specify viewsets and see the results in the Endpoints tool window. Additionally, you can map HTTP methods to viewset methods, and PyCharm will display the HTTP methods next to the relevant route, including for custom methods. Routes without @action decorators are now displayed with the related viewset methods.

Code assistance for TypedDict and Unpack

PEP 692 made it possible to add type information for keyword arguments of different types by using TypedDict and Unpack. PyCharm allows you to use this feature confidently by providing parameter info, type checking, and code completion.
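
To make the PEP 692 pattern concrete, here is a minimal sketch. The names `MovieOptions` and `describe_movie` are hypothetical and only serve as illustration; they do not come from PyCharm or the PEP. The import fallback assumes `typing_extensions` is installed on interpreters older than Python 3.11.

```python
try:
    from typing import TypedDict, Unpack  # available in typing on Python 3.11+
except ImportError:  # older interpreters (assumes typing_extensions is installed)
    from typing_extensions import TypedDict, Unpack


class MovieOptions(TypedDict, total=False):
    """Hypothetical keyword arguments, each with its own type."""

    title: str
    year: int
    rating: float


def describe_movie(**kwargs: Unpack[MovieOptions]) -> str:
    """Build a short description from the typed keyword arguments."""
    title = kwargs.get("title", "Untitled")
    year = kwargs.get("year")
    suffix = f" ({year})" if year is not None else ""
    return f"{title}{suffix}"


print(describe_movie(title="Brazil", year=1985))
```

With annotations like these, a type checker should be able to flag a call such as `describe_movie(year="1985")`, because `year` is declared as an `int`. The annotation has no effect at runtime.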

Improved debugger performance for large collections

PyCharm’s debugger now offers a smoother experience, even when very large collections are involved. You can now work on your data science projects without having to put up with high CPU loads and UI freezes.


Be sure to check out our release notes to learn all of the details and ensure you don’t miss out on any new features.

We appreciate your support as we work to improve your PyCharm experience. Please report any bugs via our issue tracker so we can resolve them promptly. Connect with us on X (formerly Twitter) to share your feedback on PyCharm 2024.1.2!


Tag1 Consulting: Migrating Your Data from Drupal 7 to Drupal 10: Drupal Entities Overview

Planet Drupal - Wed, 2024-05-29 10:40
Today, we will take a step back from reviewing the Migrate API. Instead, we will have an overview of content and configuration entities in Drupal 10. This is important for two reasons.
mauricio, Wed, 05/29/2024 - 07:40

The Python Show: 42 - Harlequin - The SQL IDE for Your Terminal

Planet Python - Wed, 2024-05-29 10:14

This episode focuses on the Harlequin application, a Python SQL IDE for your terminal written using the amazing Textual package.

I was honored to have Ted Conbeer, the creator of Harlequin, on the show to discuss his creation and the other things he does with Python.

Specifically, we focused on the following topics:

  • Favorite Python packages

  • Origins of Harlequin

  • Why program for the terminal versus a GUI

  • Lessons learned in creating the tool

  • Asyncio

  • and more!


Drupal Association blog: Introducing the Local Associations Initiative: Empowering Drupal Communities Worldwide

Planet Drupal - Wed, 2024-05-29 10:00

We are thrilled to announce the launch of our new initiative led by Programs Manager, Joi Garrett. This program is designed to support the success of Drupal Local Associations by engaging directly with community leaders who work to promote the Drupal project in their global regions.

Connecting Communities

The heart of the Local Associations Initiative lies in fostering meaningful connections. We recognize the efforts of local leaders and the unique challenges they face. By hosting a series of virtual meetings, we aim to create a platform for leaders to share their experiences, successes, and challenges. These sessions will not only provide valuable insight into the state of various local associations but also help to strengthen our global community.

Identifying and Addressing Common Needs

Understanding the diverse needs of our local associations is crucial. Through open dialogue in our virtual meetings, we will identify common needs and prioritize them. We hope that, by facilitating a collaborative environment, the Drupal Association can support efforts to address the most pressing issues faced by community leaders. The Drupal Association is committed to finding solutions that drive success.

Join Us on This Journey

We invite local association leaders to participate in this initiative and attend the virtual meetings. Your insights and contributions are invaluable as we work together to strengthen our global Drupal Community. Stay tuned for announcements about the upcoming virtual meetings. Through this initiative, we aim to foster a collaborative environment where our global community feels more connected and supported. Once we have concluded the meetings, we will discuss the findings and future plans during DrupalCon Barcelona 2024. 

We have been collecting contact information of Local Association leaders for the past few months. If you would like to be included, please fill out the following form.

Thank you to our local leaders for being an integral part of our community. We look forward to collaborating with you to make this initiative a success!

Continent      Expected Start
Europe         April (working with Network of European Drupal Associations)
Asia           June
Australia      July
North America  July
South America  August
Africa         August

Real Python: What Are CRUD Operations?

Planet Python - Wed, 2024-05-29 10:00

CRUD operations are at the heart of nearly every application you interact with. As a developer, you usually want to create data, read or retrieve data, update data, and delete data. Whether you access a database or interact with a REST API, only when all four operations are present are you able to make a complete data roundtrip in your app.

Creating, reading, updating, and deleting are so vital in software development that these methods are widely referred to as CRUD. Understanding CRUD will give you an actionable blueprint when you build applications and help you understand how the applications you use work behind the scenes. So, what exactly does CRUD mean?

Get Your Code: Click here to download the free sample code that you’ll use to learn about CRUD operations in Python.

Take the Quiz: Test your knowledge with our interactive “What Are CRUD Operations?” quiz. You’ll receive a score upon completion to help you track your learning progress:

Interactive Quiz

What Are CRUD Operations?

In this quiz, you'll revisit the key concepts and techniques related to CRUD operations. These operations are fundamental to any system that interacts with a database, and understanding them is crucial for effective data management.

In Short: CRUD Stands for Create, Read, Update, and Delete

CRUD operations are the cornerstone of application functionality, touching every aspect of how apps store, retrieve, and manage data. Here’s a brief overview of the four CRUD operations:

  • Create: This is about adding new entries to your database. But it’s also applicable to other types of persistent storage, such as files or networked services. When you perform a create operation, you’re initiating a journey for a new piece of data within your system.
  • Read: Through reading, you retrieve or view existing database entries. This operation is as basic as checking your email or reloading a website. Every piece of information you get has been received from a database, thanks to the read operation.
  • Update: Updating allows you to modify the details of data already in the database. For example, when you update a profile picture or edit a chat message, there’s an update operation at work, ensuring your new data is stored in the database.
  • Delete: Deleting removes existing entries from the database. Whether you’re closing an account or removing a post, delete operations ensure that unwanted or unnecessary data can be properly discarded.

CRUD operations describe the steps that data takes from creation to deletion, regardless of what programming language you use. Every time you interact with an application, you’re likely engaging in one of the four CRUD operations.

Why Are CRUD Operations Essential?

Whether you’re working on a basic task list app or a complex e-commerce platform, CRUD operations offer a universal language for designing and manipulating data models. Knowing about CRUD as a user helps you understand what’s happening behind the curtains. As a developer, understanding CRUD provides you with a structured framework for storing data in your application with persistence:

In computer science, persistence refers to the characteristic of state of a system that outlives (persists more than) the process that created it. This is achieved in practice by storing the state as data in computer data storage. (Source)

So even when a program crashes or a user disconnects, the data is safe and can be retrieved later. This also means that the order of the operations is important. You can only read, update, or delete items that were previously created.

It’s good practice to implement each CRUD operation separately in your applications. For example, when you retrieve items, you shouldn’t update them at the same time.

Note: An exception to this rule may be when you update a “last time retrieved” value after a read operation. Although the user performs a read CRUD operation to retrieve data, you may want to trigger an update operation in the back end to keep track of a user’s retrievals. This can be handy if you want to show the last visited posts to the user.

While CRUD describes a concept that’s independent of specific programming languages, one could argue that CRUD operations are strongly connected to SQL commands and HTTP methods.

What Are CRUD Operations in SQL?

The idea of CRUD is strongly connected with databases. That’s why it’s no surprise that CRUD operations correspond almost one-to-one with SQL commands:

CRUD Operation  SQL Command
Create          INSERT
Read            SELECT
Update          UPDATE
Delete          DELETE

When you create data, you’re using the INSERT command to add new records to a table. After creation, you may read data using SELECT. With a SELECT query, you’re asking the database to retrieve the specific pieces of information you need, whether it’s a single value, a set of records, or complex relationships between data points.

The update operation corresponds to the UPDATE command in SQL, which allows you to modify data. It lets you edit or change an existing item.

Lastly, the delete operation relates to the DELETE command. This is the digital equivalent of shredding a confidential document. With DELETE, you permanently remove an item from the database.

Writing CRUD Operations in Raw SQL

CRUD operations describe actions. That’s why it’s a good idea to pull up your sleeves and write some code to explore how CRUD operations translate into raw SQL commands.

In the examples below, you’ll use Python’s built-in sqlite3 package. SQLite is a convenient SQL library to try things out, as you’ll work with a single SQLite database file.

You’ll name the database birds.db. As the name suggests, you’ll use the database to store the names of birds you like. To keep the example small, you’ll only keep track of the bird names and give them an ID as a unique identifier.
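
As a rough preview, the four operations might look like the sketch below. It is not the article's own code: the table name `bird`, its columns, and the sample bird names are assumptions made for illustration, and it connects to an in-memory database so that it runs without creating a file. Passing "birds.db" to `sqlite3.connect()` instead would persist the data.

```python
import sqlite3

# ":memory:" keeps the sketch self-contained; use "birds.db" to persist to a file.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()

# Hypothetical schema: an ID as the unique identifier, plus the bird's name.
cursor.execute("CREATE TABLE bird (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Create: INSERT adds new records to the table.
cursor.execute("INSERT INTO bird (name) VALUES (?)", ("Robin",))
cursor.execute("INSERT INTO bird (name) VALUES (?)", ("Blackbird",))
connection.commit()

# Read: SELECT retrieves existing records.
rows_after_create = cursor.execute(
    "SELECT id, name FROM bird ORDER BY id"
).fetchall()

# Update: UPDATE modifies an existing record.
cursor.execute("UPDATE bird SET name = ? WHERE id = ?", ("Common Blackbird", 2))

# Delete: DELETE removes an existing record.
cursor.execute("DELETE FROM bird WHERE id = ?", (1,))
connection.commit()

rows_after_changes = cursor.execute(
    "SELECT id, name FROM bird ORDER BY id"
).fetchall()

print(rows_after_create)
print(rows_after_changes)
connection.close()
```

The `?` placeholders let sqlite3 bind the values itself, which avoids building SQL strings by hand.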

Read the full article at https://realpython.com/crud-operations/ »


Evolving Web: Starshot Initiative: Blast-Off for Drupal Beginners

Planet Drupal - Wed, 2024-05-29 09:41

When I first got into web development after graduating from university, I got excited about how easy it was to build a website. A bit of knowledge goes a long way. I learned HTML, CSS, Ruby on Rails, and PHP, and tried all kinds of platforms and content management systems. Sometimes it felt hard and sometimes easy. But it was those “aha” moments when I could prototype a site in an afternoon that really got me excited. Which is why I think Drupal Starshot is such a promising initiative.

The goal of Starshot is to create an out-of-the-box version of Drupal that fast-tracks people through building and launching a website. It’ll attract new users, on-board them quickly, and give them an instant feeling of empowerment.

Starshot will show new users that, yes, you can build enterprise websites with Drupal — but you can also use it to launch a campaign, create an event website, or evolve a digital presence for your startup business. 

The Initiative was announced at DrupalCon Portland 2024. Already, more than 386 individuals have pledged support, and dozens of companies have expressed interest too. Starshot is shaping up to be one of the most exciting large-scale initiatives in Drupal’s 23-year history.

“Without a doubt, Starshot will be the largest change to Drupal since the foundational rewrite to modern object-oriented PHP that occurred with Drupal 8.” 

– Mike Herschel, Senior Front-End Developer at Agileana, former elected member of the Drupal Association Board of Directors

Is Starshot a Rewrite of Drupal?

No, Starshot isn’t a rewrite. It is built on Drupal core and includes many features from recent Drupal initiatives, including:

Drupal Starshot will be available alongside traditional Drupal Core on the Drupal.org download page later this year. Selecting Starshot will install the necessary features for various website use cases, simplifying the process for new users. Traditional Drupal Core will continue to be available for more custom sites. 

Watch the Driesnote from DrupalCon Portland 2024 where Starshot was revealed:

How Will Starshot Enhance the Open Web?

Starshot will increase Drupal usage and bring the power of open source to even more people by lowering the barrier to entry. It’s a great opportunity to package a version of Drupal that’s more attractive, user-friendly, and engaging for newcomers. This initiative will bring long-overdue improvements to usability, while maintaining Drupal’s strengths: content architecture, security, and multilingual support.

A key component of Starshot's development is its focus on the "Builder" persona—users who aspire to create amazing websites with Drupal but may not have extensive web development experience. By leveraging user research with Builders, Starshot will tailor its features and the editing interface around their needs. This means that folks with limited technical expertise can harness the power of Drupal to bring their creative ideas to life.

Starshot can pre-package Drupal with default configurations and pre-configured modules, leveraging the flexible Recipes initiative, and using the new Experience Builder to enhance page creation. Starshot should significantly reduce development and maintenance costs for simpler web builds. Plus, it will set good UX standards for Drupal websites across the board.

“Let’s reach for the stars and bring the open web to all.”

– Dries Buytaert, Creator and Project Lead of Drupal

How Can I Contribute?

The first releases of Starshot should be available before the end of 2024. The initiative needs lots of support to make this a reality!

Make your pledge on the Starshot landing page or reach out to us with the details of your expertise and availability. We’ll be happy to connect you with a meaningful way to contribute.

Want to help with marketing efforts related to Starshot? Go to the Drupal Marketing page or reach out to me directly.

Take Your Drupal Skills to New Heights

I’ve trained countless people in Drupal and web development over the years — always with the aim of empowering them and inspiring the same passion I have for building websites. If you’re looking to enhance your knowledge and gain valuable tools, you’ve come to the right place. My team and I offer course packages and custom training for site builders, content editors, UX/UI designers, front-end and back-end developers, and entire teams.

Learn more about Drupal training with Evolving Web.  


LN Webworks: How to Improve Drupal SEO with the Help of a Global Module

Planet Drupal - Wed, 2024-05-29 08:06

Duplicate content causes the most trouble for content writers. Few things are harder than having to start over and make corrections after spending hours writing something, only to discover that your content is plagiarized.

This not only creates a lot of extra work but may also hurt your website’s SEO. Below, we’ll discuss the main problems caused by duplicate content and how the Drupal SEO module might help resolve them.
