FLOSS Project Planets

Russ Allbery: Review: A Prayer for the Crown-Shy

Planet Debian - Sun, 2022-08-21 00:08

Review: A Prayer for the Crown-Shy, by Becky Chambers

Series: Monk & Robot #2
Publisher: Tordotcom
Copyright: 2022
ISBN: 1-250-23624-X
Format: Kindle
Pages: 151

A Prayer for the Crown-Shy is the second novella in the Monk & Robot series and a direct sequel to A Psalm for the Wild-Built. Don't start here.

I would call this the continuing adventures of Sibling Dex and Mosscap the robot, except adventures is entirely the wrong term for stories with so little risk or danger. The continuing tour? The continuing philosophical musings? Whatever one calls it, it's a slow exploration of Dex's world, this time with Mosscap alongside. Humans are about to have their first contact with a robot since the Awakening.

If you're expecting that to involve any conflict, well, you've misunderstood the sort of story that this is. Mosscap causes a sensation, certainly, but a very polite and calm one, and almost devoid of suspicion or fear. There is one village where they get a slightly chilly reception, but even that is at most a quiet disapproval for well-understood reasons. This world is more utopian than post-scarcity, in that old sense of utopian in which human nature has clearly been rewritten to make the utopia work.

I have to admit I'm struggling with this series. It's calm and happy and charming and occasionally beautiful in its descriptions. Dex continues to be a great character, with enough minor frustration, occasional irritation, and inner complications to make me want to keep reading about them. But it's one thing to have one character in a story who is simply a nice person at a bone-deep level, particularly given that Dex chose religious orders and to some extent has being a nice person as their vocation. It's another matter entirely when apparently everyone in the society is equally nice, and the only conflicts come from misunderstandings, respectful disagreements of opinion, and the occasional minor personality conflict.

Realism has long been the primary criticism of Chambers's work, but in her Wayfarers series the problems were mostly in the technology and its perpetual motion machines. Human civilization in the Exodus Fleet was a little too calm and nice given its traumatic past (and, well, humans), but there were enough conflicts, suspicions, and poor decisions for me to recognize it as human society. It was arguably a bit too chastened, meek, and devoid of shit-stirring demagogues, but it was at least in contact with human society as I recognize it.

I don't recognize Panga as humanity. I realize this is to some degree the point of this series: to present a human society in which nearly all of the problems of anger and conflict have been solved, and to ask what would come after, given all of that space. And I'm sure that one purpose of this type of story is to be, as I saw someone describe it, hugfic: the fictional equivalent of a warm hug from a dear friend, safe and supportive and comforting. Maybe it says bad, or at least interesting, things about my cynicism that I don't understand a society that's this nice. But that's where I'm stuck.

If there were other dramatic elements to focus on, I might not mind it as much, but the other pole of the story apart from the world tour is Mosscap's philosophical musings, and I'm afraid I'm already a bit tired of them. Mosscap is earnest and thoughtful and sincere, but they're curious about Philosophy 101 material and it's becoming frustrating to see Mosscap and Dex meander through these discussions without attempting to apply any theoretical framework whatsoever. Dex is a monk, who supposedly has a scholarship tradition from which to draw, and yet appears to approach all philosophical questions with nothing more than gut feeling, common sense, and random whim. Mosscap is asking very basic meaning-of-life sorts of questions, the kind of thing that humans have been writing and arguing about from before we started keeping records and which are at the center of any religious philosophy. I find it frustrating that someone supposedly educated in a religious tradition can't bring more philosophical firepower to these discussions.

It doesn't help that this entry in the series reinforces the revelation that Mosscap's own belief system is weirdly unsustainable to such a degree that it's staggering that any robots still exist. If I squint, I can see some interesting questions raised by the robot attitude towards their continued existence (although most of them feel profoundly depressing to me), but I was completely unable to connect their philosophy in any believable way with their origins and the stated history of the world. I don't understand how this world got here, and apparently I'm not able to let that go.

This all sounds very negative, and yet I did enjoy this novella. Chambers is great at description of places that I'd love to visit, and there is something calm and peaceful about spending some time in a society this devoid of conflict. I also really like Dex, even more so after seeing their family, and I'm at least somewhat invested in their life decisions. I can see why people like these novellas. But if I'm going to read a series that's centered on questions of ethics and philosophy, I would like it to have more intellectual heft than we've gotten so far.

For what it's worth, I'm seeing a bit of a pattern where people who bounced off the Wayfarers books like this series much better, whereas people who loved the Wayfarers books are not enjoying these quite as much. I'm in the latter camp, so if you didn't like Chambers's earlier work, maybe you'll find this more congenial? There's a lot less found family here, for one thing; I love found family stories, but they're not to everyone's taste.

If you liked A Psalm for the Wild-Built, you will probably also like A Prayer for the Crown-Shy; it's more of the same thing in both style and story. If you found the first story frustratingly unbelievable or needing more philosophical depth, I'm afraid this is unlikely to be an improvement. It does have some lovely scenes, though, and is stuffed full of sheer delight in both the wild world and in happy communities of people.

Rating: 7 out of 10

Categories: FLOSS Project Planets

Iustin Pop: Note to self: Don't forget Qemu's discard option

Planet Debian - Sat, 2022-08-20 20:00

This is just a short note to myself, and to anyone who might run VMs via home-grown scripts (or systemd units). I expect modern VM managers to do this automatically, but for myself, I have just a few hacked together scripts.

By default, QEMU (at least as of version 7.0) does not honour/pass discard requests from block devices to the underlying storage. This is a sane default (like LVM’s default setting), but with long-lived VMs it can lead to lots of wasted disk space. I keep my VMs on SSDs, where space is limited for me, so savings here are important.

Older Debian versions did not trim automatically, but nowadays they do (which is why this is worth enabling for all VMs), so all you need is to pass:

  • discard=unmap to activate the pass-through.
  • optionally, detect-zeroes=unmap, but I don’t know how useful this is, as in, how often zeroes are written.

And the next trim should save lots of disk space. It doesn’t matter much whether you use raw or qcow2; both will know to unmap the unused disk, leading to less disk space used. This part seems safe to me security-wise, as long as you trust the host. If you have pass-through to the actual hardware, it will also do a proper discard at the SSD level (with the potential security issues stemming from that). I’m happy with the freed-up disk space 🙂
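For anyone with similarly hand-rolled scripts, here is a sketch of where the option goes on a QEMU command line. The image path, memory size and other flags are placeholders, not my actual setup:

```shell
# Sketch only: pass discard (trim) requests through to the backing file.
# discard=unmap enables the pass-through; detect-zeroes=unmap additionally
# turns writes of zeroes into unmap operations.
qemu-system-x86_64 \
  -enable-kvm -m 2048 \
  -drive file=/path/to/vm.qcow2,if=virtio,discard=unmap,detect-zeroes=unmap
```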

Note: If you have (like I do) Windows VMs as well, using paravirt block devices, make sure the driver is recent enough.

One interesting behaviour from Windows: it looks like the default cluster size is quite high (64K), which with many small files will lead to significant overhead. But either I misunderstand, or Windows actually knows how to unmap the unused part of a cluster (although it takes a while). So in the end, the backing file for the VM (19G) is smaller than the “disk used” as reported in Windows (23-24G), but higher than “size on disk” for all the files (17.2G). Seems legit, and it still boots 😛 Most Linux file systems have much smaller block sizes (usually 4K), so this is not a problem for them.

Categories: FLOSS Project Planets

John Ludhi/nbshare.io: PySpark Replace Values In DataFrames

Planet Python - Sat, 2022-08-20 15:38
PySpark Replace Values In DataFrames Using regexp_replace(), translate() and Overlay() Functions

regexp_replace(), translate(), and overlay() functions can be used to replace values in PySpark Dataframes.

First, we load the required libraries:

In [1]:

from pyspark.sql import SparkSession
from pyspark.sql.functions import (col, regexp_replace, translate, overlay, when, expr)

In [25]:

# initializing spark session instance
spark = SparkSession.builder.appName('snippets').getOrCreate()

Then load our initial records:

In [3]:

columns = ["Full_Name", "Salary", "Last_Name_Pattern", "Last_Name_Replacement"]
data = [('Sam A Smith', '1,000.01', 'Sm', 'Griffi'),
        ('Alex Wesley Jones', '120,000.89', 'Jo', 'Ba'),
        ('Steve Paul Jobs', '5,000.90', 'Jo', 'Bo')]

In [4]:

# converting data to rdds
rdd = spark.sparkContext.parallelize(data)

In [5]:

# Then creating a dataframe from our rdd variable
dfFromRDD2 = spark.createDataFrame(rdd).toDF(*columns)

In [6]:

# visualizing current data before manipulation
dfFromRDD2.show()

+-----------------+----------+-----------------+---------------------+
|        Full_Name|    Salary|Last_Name_Pattern|Last_Name_Replacement|
+-----------------+----------+-----------------+---------------------+
|      Sam A Smith|  1,000.01|               Sm|               Griffi|
|Alex Wesley Jones|120,000.89|               Jo|                   Ba|
|  Steve Paul Jobs|  5,000.90|               Jo|                   Bo|
+-----------------+----------+-----------------+---------------------+

PySpark regexp_replace()

regexp_replace: we will use regexp_replace(col_name, pattern, new_value) to replace character(s) in a string column that match the pattern with the new_value.

1) Here we are replacing the characters 'Jo' in the Full_Name with 'Ba'

In [7]:

# here we update the column called 'Full_Name' by replacing some characters in the name that fit the criteria
modified_dfFromRDD2 = dfFromRDD2.withColumn("Full_Name", regexp_replace('Full_Name', 'Jo', 'Ba'))

In [8]:

# visualizing the modified dataframe. We see that only the last two names are updated as those meet our criteria
modified_dfFromRDD2.show()

+-----------------+----------+-----------------+---------------------+
|        Full_Name|    Salary|Last_Name_Pattern|Last_Name_Replacement|
+-----------------+----------+-----------------+---------------------+
|      Sam A Smith|  1,000.01|               Sm|               Griffi|
|Alex Wesley Banes|120,000.89|               Jo|                   Ba|
|  Steve Paul Babs|  5,000.90|               Jo|                   Bo|
+-----------------+----------+-----------------+---------------------+

2) In the above example, we see that only two values (Jones, Jobs) are replaced, but not Smith. We can use the when function to replace column values conditionally.

In [9]:

# Here we update the column called 'Full_Name' by replacing some characters in the name that fit the criteria,
# based on the conditions
modified_dfFromRDD3 = dfFromRDD2.withColumn(
    "Full_Name",
    when(col('Full_Name').endswith('th'), regexp_replace('Full_Name', 'Smith', 'Griffith'))
    .otherwise(regexp_replace('Full_Name', 'Jo', 'Ba')))

In [10]:

# visualizing the modified dataframe we see how all the column values are updated based on the conditions provided
modified_dfFromRDD3.show()

+-----------------+----------+-----------------+---------------------+
|        Full_Name|    Salary|Last_Name_Pattern|Last_Name_Replacement|
+-----------------+----------+-----------------+---------------------+
|   Sam A Griffith|  1,000.01|               Sm|               Griffi|
|Alex Wesley Banes|120,000.89|               Jo|                   Ba|
|  Steve Paul Babs|  5,000.90|               Jo|                   Bo|
+-----------------+----------+-----------------+---------------------+

3) We can also use a regex to replace characters. As an example, we are changing the decimal digits in the Salary column to '00'.

In [11]:

modified_dfFromRDD4 = dfFromRDD2.withColumn("Salary", regexp_replace('Salary', '\\.\d\d$', '.00 \\$'))

In [12]:

# visualizing the modified dataframe, we see how the Salary column is updated
modified_dfFromRDD4.show(truncate=False)

+-----------------+------------+-----------------+---------------------+
|Full_Name        |Salary      |Last_Name_Pattern|Last_Name_Replacement|
+-----------------+------------+-----------------+---------------------+
|Sam A Smith      |1,000.00 $  |Sm               |Griffi               |
|Alex Wesley Jones|120,000.00 $|Jo               |Ba                   |
|Steve Paul Jobs  |5,000.00 $  |Jo               |Bo                   |
+-----------------+------------+-----------------+---------------------+

4) Now we will use another regex example to replace a variable number of characters that match the regex pattern. Here we replace all lowercase characters in the Full_Name column with '--'.

In [13]:

# Replace only the lowercase characters in the Full_Name with --
modified_dfFromRDD5 = dfFromRDD2.withColumn("Full_Name", regexp_replace('Full_Name', '[a-z]+', '--'))

In [14]:

# visualizing the modified data frame. We see that all the lowercase characters are replaced.
# The uppercase characters are same as they were before
modified_dfFromRDD5.show()

+-----------+----------+-----------------+---------------------+
|  Full_Name|    Salary|Last_Name_Pattern|Last_Name_Replacement|
+-----------+----------+-----------------+---------------------+
|  S-- A S--|  1,000.01|               Sm|               Griffi|
|A-- W-- J--|120,000.89|               Jo|                   Ba|
|S-- P-- J--|  5,000.90|               Jo|                   Bo|
+-----------+----------+-----------------+---------------------+

5) We can also use regexp_replace with expr to replace a column's values by matching a pattern from a second column and substituting values from a third column, i.e. regexp_replace(col1, col2, col3). Here we are going to replace the characters in column 1 that match the pattern in column 2 with characters from column 3.

In [15]:

# Here we update the column called 'Full_Name' by replacing some characters in the 'Full_Name' that match the values
# in 'Last_Name_Pattern' with characters in 'Last_Name_Replacement'
modified_dfFromRDD6 = modified_dfFromRDD2.withColumn(
    "Full_Name", expr("regexp_replace(Full_Name, Last_Name_Pattern, Last_Name_Replacement)"))

In [16]:

# visualizing the modified dataframe.
# The Full_Name column has been updated with some characters from Last_Name_Replacement
modified_dfFromRDD6.show()

+-----------------+----------+-----------------+---------------------+
|        Full_Name|    Salary|Last_Name_Pattern|Last_Name_Replacement|
+-----------------+----------+-----------------+---------------------+
|  Sam A Griffiith|  1,000.01|               Sm|               Griffi|
|Alex Wesley Banes|120,000.89|               Jo|                   Ba|
|  Steve Paul Babs|  5,000.90|               Jo|                   Bo|
+-----------------+----------+-----------------+---------------------+

PySpark translate()

translate(): This function is used to do character-by-character replacement of column values.

In [17]:

# here we update the column called 'Full_Name' by replacing the lowercase characters in the following way:
# each 'a' is replaced by 0, 'b' by 1, 'c' by 2, ... 'i' by 8 and 'j' by 9
alphabets = 'abcdefghij'
digits = '0123456789'
modified_dfFromRDD7 = dfFromRDD2.withColumn("Full_Name", translate('Full_Name', alphabets, digits))

In [18]:

# visualizing the modified dataframe, we see the replacement has been done character by character
modified_dfFromRDD7.show(truncate=False)

+-----------------+----------+-----------------+---------------------+
|Full_Name        |Salary    |Last_Name_Pattern|Last_Name_Replacement|
+-----------------+----------+-----------------+---------------------+
|S0m A Sm8t7      |1,000.01  |Sm               |Griffi               |
|Al4x W4sl4y Jon4s|120,000.89|Jo               |Ba                   |
|St4v4 P0ul Jo1s  |5,000.90  |Jo               |Bo                   |
+-----------------+----------+-----------------+---------------------+

PySpark overlay()

overlay(src_col, replace_col, src_start_pos, src_char_len <default -1>): This function replaces values in the src_col column with values from replace_col. The replacement starts at src_start_pos and replaces src_char_len characters (by default, it replaces a number of characters equal to the length of replace_col).

In [19]:

# Here the first two characters are replaced by the replacement string in the Last_Name_Replacement column
modified_dfFromRDD8 = dfFromRDD2.select(
    'Full_Name', overlay("Full_Name", "Last_Name_Replacement", 1, 2).alias("FullName_Overlayed"))

In [20]:

# Visualizing the modified dataframe
modified_dfFromRDD8.show()

+-----------------+------------------+
|        Full_Name|FullName_Overlayed|
+-----------------+------------------+
|      Sam A Smith|   Griffim A Smith|
|Alex Wesley Jones| Baex Wesley Jones|
|  Steve Paul Jobs|   Boeve Paul Jobs|
+-----------------+------------------+

In [21]:

# Here we replace characters starting from position 5 (1-indexed) and replace a number of characters equal to the
# length of the replacement string
modified_dfFromRDD9 = dfFromRDD2.select(
    'Full_Name', overlay("Full_Name", "Last_Name_Replacement", 5).alias("FullName_Overlayed"))

In [22]:

# Visualizing the modified dataframe
modified_dfFromRDD9.show()

+-----------------+------------------+
|        Full_Name|FullName_Overlayed|
+-----------------+------------------+
|      Sam A Smith|       Sam Griffih|
|Alex Wesley Jones| AlexBaesley Jones|
|  Steve Paul Jobs|   StevBoPaul Jobs|
+-----------------+------------------+

In [23]:

spark.stop()
Categories: FLOSS Project Planets

Stack Abuse: Object Detection and Instance Segmentation in Python with Detectron2

Planet Python - Sat, 2022-08-20 06:30
Introduction

Object detection is a large field in computer vision, and one of the more important applications of computer vision "in the wild". On one end, it can be used to build autonomous systems that navigate agents through environments - be it robots performing tasks or self-driving cars, but this requires intersection with other fields. However, anomaly detection (such as defective products on a line), locating objects within images, facial detection and various other applications of object detection can be done without intersecting other fields.

Advice This short guide is based on a small part of a much larger lesson on object detection belonging to our "Practical Deep Learning for Computer Vision with Python" course.

Object detection isn't as standardized as image classification, mainly because most of the new developments are typically done by individual researchers, maintainers and developers, rather than large libraries and frameworks. It's difficult to package the necessary utility scripts in a framework like TensorFlow or PyTorch and maintain the API guidelines that guided the development so far.

This makes object detection somewhat more complex, typically more verbose (but not always), and less approachable than image classification. One of the major benefits of being in an ecosystem is that it spares you from having to search for useful information on good practices, tools and approaches to use. With object detection, most people have to do far more research on the landscape of the field to get a good grip.

Meta AI's Detectron2 - Instance Segmentation and Object Detection

Detectron2 is Meta AI (formerly FAIR - Facebook AI Research)'s open source object detection, segmentation and pose estimation package - all in one. Given an input image, it can return the labels, bounding boxes, confidence scores, masks and skeletons of objects. This is well-represented on the repository's page:

It's meant to be used as a library on top of which you can build research projects. It offers a model zoo with most implementations relying on Mask R-CNN and R-CNNs in general, alongside RetinaNet. They also have pretty decent documentation. Let's run an example inference script!

First, let's install the dependencies:

$ pip install pyyaml==5.1
$ pip install 'git+https://github.com/facebookresearch/detectron2.git'

Next, we'll import the Detectron2 utilities - this is where framework-domain knowledge comes into play. You can construct a detector using the DefaultPredictor class, by passing in a configuration object that sets it up. The Visualizer offers support for visualizing results. MetadataCatalog and DatasetCatalog belong to Detectron2's data API and offer information on built-in datasets as well as their metadata.

Let's import the classes and functions we'll be using:

import torch, detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog

Using requests, we'll download an image and save it to our local drive:

import cv2
import matplotlib.pyplot as plt
import requests

response = requests.get('http://images.cocodataset.org/val2017/000000439715.jpg')
open("input.jpg", "wb").write(response.content)

im = cv2.imread("./input.jpg")
fig, ax = plt.subplots(figsize=(18, 8))
ax.imshow(cv2.cvtColor(im, cv2.COLOR_BGR2RGB))

This results in:

Now, we load the configuration, enact changes if need be (the models run on GPU by default, so if you don't have a GPU, you'll want to set the device to 'cpu' in the config):

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
# If you don't have a GPU and CUDA enabled, the next line is required
# cfg.MODEL.DEVICE = "cpu"

Here, we specify which model we'd like to run from the model_zoo. We've imported an instance segmentation model, based on the Mask R-CNN architecture, and with a ResNet50 backbone. Depending on what you'd like to achieve (keypoint detection, instance segmentation, panoptic segmentation or object detection), you'll load in the appropriate model.

Finally, we can construct a predictor with this cfg and run it on the inputs! The Visualizer class is used to draw predictions on the image (in this case, segmented instances, classes and bounding boxes):

predictor = DefaultPredictor(cfg)
outputs = predictor(im)

v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
fig, ax = plt.subplots(figsize=(18, 8))
ax.imshow(out.get_image()[:, :, ::-1])

Finally, this results in:

Going Further - Practical Deep Learning for Computer Vision

Your inquisitive nature makes you want to go further? We recommend checking out our Course: "Practical Deep Learning for Computer Vision with Python".

Another Computer Vision Course?

We won't be doing classification of MNIST digits or MNIST fashion. They served their part a long time ago. Too many learning resources are focusing on basic datasets and basic architectures before letting advanced black-box architectures shoulder the burden of performance.

We want to focus on demystification, practicality, understanding, intuition and real projects. Want to learn how you can make a difference? We'll take you on a ride from the way our brains process images to writing a research-grade deep learning classifier for breast cancer to deep learning networks that "hallucinate", teaching you the principles and theory through practical work, equipping you with the know-how and tools to become an expert at applying deep learning to solve computer vision problems.

What's inside?
  • The first principles of vision and how computers can be taught to "see"
  • Different tasks and applications of computer vision
  • The tools of the trade that will make your work easier
  • Finding, creating and utilizing datasets for computer vision
  • The theory and application of Convolutional Neural Networks
  • Handling domain shift, co-occurrence, and other biases in datasets
  • Transfer Learning and utilizing others' training time and computational resources for your benefit
  • Building and training a state-of-the-art breast cancer classifier
  • How to apply a healthy dose of skepticism to mainstream ideas and understand the implications of widely adopted techniques
  • Visualizing a ConvNet's "concept space" using t-SNE and PCA
  • Case studies of how companies use computer vision techniques to achieve better results
  • Proper model evaluation, latent space visualization and identifying the model's attention
  • Performing domain research, processing your own datasets and establishing model tests
  • Cutting-edge architectures, the progression of ideas, what makes them unique and how to implement them
  • KerasCV - a WIP library for creating state of the art pipelines and models
  • How to parse and read papers and implement them yourself
  • Selecting models depending on your application
  • Creating an end-to-end machine learning pipeline
  • Landscape and intuition on object detection with Faster R-CNNs, RetinaNets, SSDs and YOLO
  • Instance and semantic segmentation
  • Real-Time Object Recognition with YOLOv5
  • Training YOLOv5 Object Detectors
  • Working with Transformers using KerasNLP (industry-strength WIP library)
  • Integrating Transformers with ConvNets to generate captions of images
  • DeepDream
Conclusion

Instance segmentation goes one step beyond semantic segmentation, and notes the qualitative difference between individual instances of a class (person 1, person 2, etc...) rather than just whether they belong to one. In a way - it's pixel-level classification.

In this short guide, we've taken a quick look at how Detectron2 makes instance segmentation and object detection easy and accessible through their API, using a Mask R-CNN.

Categories: FLOSS Project Planets

This week in KDE: Dolphin Selection Mode

Planet KDE - Sat, 2022-08-20 01:01

Today something very cool landed: Dolphin now has a dedicated “Selection Mode” you can optionally use to make the process of selecting items easier with a touchscreen or when using the default single-click setting! It even shows a toolbar of contextually-relevant actions you can perform on the selected items! When using a mouse and keyboard, you can quickly enter and exit it by pressing the spacebar, pressing-and-holding on an item in the view, or using an item in the menu. It’s completely optional, so if you like selecting files the old fashioned way, you don’t have to use it at all. Big thanks to Felix Ernst, who implemented this feature for Dolphin 22.12!

…But that’s not all! Read on for more goodies!

Other New Features

Elisa now has a Full Screen mode (me: Nate Graham, Elisa 22.12. Link):

You can now change the way the system formats addresses, names, and phone numbers (Akseli Lahtinen, Plasma 5.26. Link):

When using a horizontal panel, Kickoff can now be configured to display text and/or remove the icon (Denys Madureira, Plasma 5.26. Link)

Kate now lets you customize the font that the document will be printed in, right there in the Print dialog (Christoph Cullmann, Frameworks 5.98. Link)

File thumbnailers are now capable of generating preview images for .arw RAW image files (Mirco Miranda, Frameworks 5.98. Link)

User Interface Improvements

Elisa’s “Artist” view now displays a grid of the artist’s albums, rather than a sea of nondescript identical icons (Stefan Vukanović, Elisa 22.12. Link):

When you enter shuffle mode in Elisa, the currently-playing song is now always the first one in the shuffled set of songs (Dmitry Kolesnikov, Elisa 22.12. Link)

When setting properties for KWin rules, the sheet containing the list of properties now stays open until explicitly dismissed (Ismael Asensio, Plasma 5.26. Link)

You can now launch executable files from file searches in Kicker, Kickoff, Overview etc; you’ll now see the standard “Open or execute?” dialog as you would expect (me: Nate Graham, Plasma 5.26. Link)

The “Get New [thing]” windows now support animated GIFs used as images, so for example, you can now preview the effects of the fancy “Burn My Windows” KWin effects that were added recently (Alexander Lohnau, Frameworks 5.98. Link)

Significant Bugfixes

(This is a curated list of e.g. HI and VHI priority bugs, Wayland showstoppers, major regressions, longstanding issues etc.)

The circular timeout indicator for Plasma notifications is now fully visible no matter what your screen DPI and scale factor (Eugene Popov, Plasma 5.24.7. Link)

Launchers other than Kickoff are once again capable of searching for files (Alexander Lohnau, Plasma 5.25.5)

Touch scrolling once again works in Kickoff (Noah Davis, Plasma 5.25.5. Link)

Global shortcuts are now capable of launching apps that define command-line arguments in their .desktop files’ Exec= keys (Nicolas Fella, Frameworks 5.98. Link)

Kirigami-based apps and views that use the common FormLayout component will no longer sometimes randomly freeze with certain combinations of fonts, font sizes, window sizes, and content sizes (Connor Carney, Frameworks 5.98. Link)

Other bug-related information of interest:

Changes not in KDE that affect KDE

That Qt bug that causes vertical scrollbars to disappear in QtQuick-based apps has been fixed upstream, and we’ll soon be backporting it to our KDE Qt Patch Collection (Mitch Curtis, a Qt version coming to you soon. Link)

…And everything else

This blog only covers the tip of the iceberg! If you’re hungry for more, check out https://planet.kde.org, where you can find more news from other KDE contributors.

How You Can Help

If you’re a developer, check out our 15-Minute Bug Initiative. Working on these issues makes a big difference quickly! Otherwise, have a look at https://community.kde.org/Get_Involved to discover ways to be part of a project that really matters. Each contributor makes a huge difference in KDE; you are not a number or a cog in a machine! You don’t have to already be a programmer, either. I wasn’t when I got started. Try it, you’ll like it! We don’t bite!

Finally, consider making a tax-deductible donation to the KDE e.V. foundation.

Categories: FLOSS Project Planets

Emmanuel Kasper: Everything markdown with pandoc

Planet Debian - Fri, 2022-08-19 15:02

Using a markdown file, this style sheet, and this simple command,

pandoc couronne.md --standalone --css styling.css \
  --to html5 --table-of-contents > couronne.html

I feel I will never need a word processor again. It produces this nice looking document without pain.

Categories: FLOSS Project Planets

Mike Driscoll: Python 101 - Debugging Your Code with pdb (Video)

Planet Python - Fri, 2022-08-19 14:33

Learn how to debug your Python programs using Python's built-in debugger, pdb with Mike Driscoll

In this tutorial, you will learn the following:

  • Starting pdb in the REPL
  • Starting pdb on the Command Line
  • Stepping Through Code
  • Adding Breakpoints in pdb
  • Creating a Breakpoint with set_trace()
  • Using the built-in breakpoint() Function - Getting Help

This video is based on a chapter from the book, Python 101 by Mike Driscoll

Related Articles

The post Python 101 - Debugging Your Code with pdb (Video) appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

Python for Beginners: Check For Superset in Python

Planet Python - Fri, 2022-08-19 09:00

In Python, we use sets to store unique immutable objects. In this article, we will discuss what a superset of a set is. We will also discuss ways to check for a superset in Python.

What is a Superset?

A superset of a set is another set that contains all the elements of the given set. In other words, If we have a set A and set B, and each element of set B belongs to set A, then set A is said to be a superset of set B.

Let us consider an example where we are given three sets A, B, and C as follows.

A={1,2,3,4,5,6,7,8}

B={2,4,6,8}

C={0,1,2,3,4}

Here, you can observe that all the elements in set B are present in set A. Hence, set A is a superset of set B. On the other hand, all the elements of set C do not belong to set A. Hence, set A is not a superset of set C.

You can observe that a superset always has at least as many elements as the original set. Now, let us describe a step-by-step algorithm to check for a superset in Python.

Suggested Reading: Chat Application in Python

How to Check For Superset in Python?

Consider that we are given two sets A and B. Now, we have to check if set B is a superset of set A. For this, we will traverse all the elements of set A and check whether they are present in set B. If there exists an element in set A that doesn't belong to set B, we will say that set B is not a superset of set A. Otherwise, set B is a superset of set A.

To implement this approach in Python, we will use a for loop and a flag variable isSuperset. We will initialize the isSuperset variable to True, denoting that set B is a superset of set A. Then we will traverse set A using a for loop, checking for each element whether it is present in set B.

If we find any element in set A that isn't present in set B, we will assign False to isSuperset, showing that set B is not a superset of set A.

If we do not find any element in set A that falls outside set B, the isSuperset variable will still hold the value True, showing that set B is a superset of set A. The entire logic to check for a superset can be implemented in Python as follows.

def checkSuperset(set1, set2):
    isSuperset = True
    for element in set2:
        if element not in set1:
            isSuperset = False
            break
    return isSuperset

A = {1, 2, 3, 4, 5, 6, 7, 8}
B = {2, 4, 6, 8}
C = {0, 1, 2, 3, 4}
print("Set {} is: {}".format("A", A))
print("Set {} is: {}".format("B", B))
print("Set {} is: {}".format("C", C))
print("Set A is superset of B :", checkSuperset(A, B))
print("Set A is superset of C :", checkSuperset(A, C))
print("Set B is superset of C :", checkSuperset(B, C))

Output:

Set A is: {1, 2, 3, 4, 5, 6, 7, 8}
Set B is: {8, 2, 4, 6}
Set C is: {0, 1, 2, 3, 4}
Set A is superset of B : True
Set A is superset of C : False
Set B is superset of C : False

Check For Superset Using issuperset() Method

We can also use the issuperset() method to check for a superset in Python. The issuperset() method, when invoked on a set A, accepts a set B as an input argument and returns True if set A is a superset of B. Otherwise, it returns False.

You can use the issuperset() method to check for a superset in Python as follows.

A = {1, 2, 3, 4, 5, 6, 7, 8}
B = {2, 4, 6, 8}
C = {0, 1, 2, 3, 4}
print("Set {} is: {}".format("A", A))
print("Set {} is: {}".format("B", B))
print("Set {} is: {}".format("C", C))
print("Set A is superset of B :", A.issuperset(B))
print("Set A is superset of C :", A.issuperset(C))
print("Set B is superset of C :", B.issuperset(C))

Output:

Set A is: {1, 2, 3, 4, 5, 6, 7, 8}
Set B is: {8, 2, 4, 6}
Set C is: {0, 1, 2, 3, 4}
Set A is superset of B : True
Set A is superset of C : False
Set B is superset of C : False

Conclusion

In this article, we have discussed two ways to check for a superset in Python. To learn more about sets, you can read this article on set comprehension in Python. You might also like this article on list comprehension in Python.

The post Check For Superset in Python appeared first on PythonForBeginners.com.

Categories: FLOSS Project Planets

Clangd config

Planet KDE - Fri, 2022-08-19 09:00
Clangd config

Since Kate got LSP support some time ago (thanks to all the developers who made/make this possible, it is a great addition), I've been using it a lot; as you'd expect with any tool, it has some default behaviours that you'd want to disable, and some non-default ones that you want to make use of. Below are some of the config tweaks I've collected over time.

First, there are two ways to change the clangd config options:

  • creating a ~/.config/clangd/config.yaml text file, which will affect all projects
  • creating a .clangd in a specific project directory
Note that clangd searches for a .clangd file in the current directory, then its parent directory, and so on. So, you can have one .clangd file in the parent directory of all your KDE Frameworks checkouts and it'll affect all of them.

Now the config options:

  • Mark unused includes. This seems to only work with .h header includes, but not with ForwardingHeaders (e.g. #include <QString> or #include <KIO/Stat>):
    Diagnostics:
      UnusedIncludes: Strict
    So while it doesn't work 100% for KDE code, which uses ForwardingHeaders a lot, it is still useful to have, since you can remove some unused includes as you go along. Of course be careful, as it could mark something as "unused" while it is actually needed for a different code path, e.g. only needed on FreeBSD or Windows.
  • Disable/suppress showing "Diagnostics" for system headers (if you're a developer for system libraries, you'll have those libraries' source code open in your editor, right? If I have KDE Frameworks code open in my editor, I want to see issues about that code at this moment, not the system headers, please):
    ---
    If:
      PathMatch: /usr/include/.*
    Diagnostics:
      Suppress: "*"
  • If I understand correctly, the --- is used to separate sections in the config file, so that means I only want to suppress Diagnostics for this PathMatch, not everywhere else.
  • Add/Remove compile flags that are passed to the compiler clangd uses (by default it uses clang, obviously):
    CompileFlags:
      Add: [-Wno-gnu-zero-variadic-macro-arguments]
      Remove: [-fdiagnostics-urls=always]
    For example, I usually compile with gcc and add the -fdiagnostics-urls=always flag, but clang knows nothing about it and keeps complaining, e.g. in the LSP diagnostics tab in Kate, so this is one way of removing that noisy warning. For -Wno-gnu-zero-variadic-macro-arguments see this post.
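Putting these fragments together, a complete .clangd (or ~/.config/clangd/config.yaml) combining the tweaks above could look like the sketch below; it is assembled from the snippets in this post, so adjust the paths and flags to your own setup:

```yaml
# First section: applies everywhere
Diagnostics:
  UnusedIncludes: Strict
CompileFlags:
  Add: [-Wno-gnu-zero-variadic-macro-arguments]
  Remove: [-fdiagnostics-urls=always]
---
# Second section: only for system headers, silence all diagnostics there
If:
  PathMatch: /usr/include/.*
Diagnostics:
  Suppress: "*"
```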

Categories: FLOSS Project Planets

Web Review, Week 2022-33

Planet KDE - Fri, 2022-08-19 08:47

Let’s go for my web review for the week 2022-33.

Open-source rival for OpenAI’s DALL-E runs on your graphics card

Tags: tech, ai, machine-learning

There’s clearly a whole lot of such image generation systems emerging lately. It was only a question of time until an open source one appeared. What’s interesting with this one is that it can run locally; you don’t need to query a server.

https://mixed-news.com/en/open-source-rival-for-openais-dall-e-runs-on-your-graphics-card/?amp=1


A Python-compatible statically typed language

Tags: tech, python, type-systems

Very early times but this could become interesting. Maybe worth keeping an eye on.

https://github.com/erg-lang/erg


The Unreasonable Effectiveness of Makefiles

Tags: tech, unix, tools

Big shout out to make, one of my favorite Unix tools. I like some of the ideas for improvements listed here.

https://matt-rickard.com/the-unreasonable-effectiveness-of-makefiles


SSH tips and tricks | Carlos Becker

Tags: tech, ssh

Neat little list of tips, indeed some are useful and I didn’t know about them.

https://carlosbecker.dev/posts/ssh-tips-and-tricks/


Code Smells Catalog

Tags: tech, programming, smells, craftsmanship

Very nice catalog! Looks like a useful reference.

https://luzkan.github.io/smells/


KPIs, Velocity, and Other Destructive Metrics

Tags: tech, metrics, business, product-management, productivity

Very good point of view about metrics and their use. We’re unfortunately very often measuring the wrong things or using them the wrong way.

https://holub.com/kpis-velocity-and-other-destructive-metrics/


Agile Projects Have Become Waterfall Projects With Sprints | by Ben “The Hosk” Hosking | Aug, 2022 | ITNEXT

Tags: tech, agile, project-management

This is unfortunately very much true. It was only a matter of time, I guess. The “grass is greener” effect is indeed the most likely reason.

https://itnext.io/agile-projects-have-become-waterfall-projects-with-sprints-536141801856


Taking notes in interviews - Jacob Kaplan-Moss

Tags: hr, interviews

OK, that’s an interesting approach to the note taking during interviews. I’m a bit far from that which is fine… and still that gives me ideas for improvements.

https://jacobian.org/2022/aug/12/interview-notes/


Bye for now!

Categories: FLOSS Project Planets

Droptica: How to Customize Taxonomy Terms Pages? Taxonomy Views Integrator Drupal Module

Planet Drupal - Fri, 2022-08-19 08:05

The standard method of managing Drupal's display mode may be too limited in some cases. If you want to efficiently create differentiated views used when listing terms or content related to the terms, you can try Layout Builder. What if, for example, the graphical interface, complexity, or genericity disqualifies this tool among the considered solutions? The Taxonomy Views Integrator (TVI) module may help here.

The Taxonomy Views Integrator Drupal module – general information

The functionality provided by Taxonomy Views Integrator allows you to create multiple views that can become part of the display modes of entire vocabularies or individual terms.

The module was created on 11 September 2009. The latest update to Drupal 7 took place on 20 September 2016, and to Drupal 8 and 9 – on 10 June 2021.

Popularity of the TVI module

As of the day of writing this article, about 25 thousand sites are using this module. The Drupal 7 version of the module is losing popularity, most likely due to the overall migration to Drupal 9. The Drupal 8 and 9 versions of the module enjoy a steady increase in the number of installations. Across all available versions, the module has recently maintained steady popularity.

 

Creators of the module

The main creator of the project is Derek Webb (derekwebb1). Kevin Quillen (kevinquillen), Michael O'Hara (mikeohara), Rich Gerdes (richgerdes), and Edouard Cunibil (DuaelFr) are also singled out on the list of maintainers.

Installation

The module doesn't require any external libraries. The TVI dependencies are limited to the Views and Taxonomy modules only, and both are available in the Drupal core.

We recommend using Composer for the installation.

$ composer require drupal/tvi

What's the Taxonomy Views Integrator module for?

You can overwrite the terms in all vocabularies by using only the Views and Term modules. Taxonomy Views Integrator integrates vocabularies and terms with views in a more accessible way. By enabling this module, you can create a view in which you'll overwrite the list of terms in a specific vocabulary, additionally having full control over how the view is presented. You can just as easily attach a different view to a specific term. Thanks to this freedom, you can easily create, for example, differently looking subpages on a specific topic, which will be defined on the basis of terms.

Permissions

The module provides new permissions. Their list depends on the number of available vocabularies.

Administer taxonomy views integrator

This permission allows for managing global module settings and editing its settings in all vocabularies and terms. It's recommended that you only grant this permission to trusted roles.

Define the view override for terms in {VOCABULARY_NAME}

Granting this permission will allow you to overwrite the view used for specific terms in a given vocabulary.

Define the view override for the vocabulary {VOCABULARY_NAME}

This permission will allow you to override the view for an entire vocabulary.

Using the Taxonomy Views Integrator module

After enabling the module and granting permissions, it's time to familiarize yourself with its global settings. There are only two. The first one is Don't display a view by default; if selected, the default Taxonomy term view in the Page display mode won’t be used. The second one is Use global view override. If selected, it will allow you to override the default global view and choose the display mode.

 

You can also select views for vocabularies and terms. In both cases, the configuration form looks the same. You can choose:

  • a flag that allows you to enable overwriting,
  • a list of views from which you can select the view you are interested in and the display mode from this view,
  • the Child terms will use these settings option. If selected, all children of the vocabulary or term will use the same view in the same display mode,
  • and the Pass all arguments to views option. If selected, it'll ensure that the view will receive all arguments given in the path, such as the term identifier.

 

The TVI configuration form has been integrated with the vocabulary and term editing forms. Therefore, it's available under the address

/admin/structure/taxonomy/manage/{vocabulary_machine_name}

for vocabularies and

taxonomy/term/{term_id}/edit

for terms.

Taxonomy Views Integrator Drupal module - summary

If your website requires more freedom in presenting terms and vocabularies, the Drupal Taxonomy Views Integrator module will certainly meet your expectations. The module is easy to use and requires only the basic knowledge of creating views. Full freedom also means the ability to create custom styling for the created views, which may require a more in-depth knowledge of Drupal. If your vision for your web page doesn't match the standard look of the views and integration with TVI alone isn't enough, our team dealing with developing Drupal websites will be happy to help you.

Categories: FLOSS Project Planets

Real Python: The Real Python Podcast – Episode #122: Configuring a Coding Environment on Windows &amp; Using TOML With Python

Planet Python - Fri, 2022-08-19 08:00

Have you attempted to set up a Python development environment on Windows before? Would it be helpful to have an easy-to-follow guide to get you started? This week on the show, Christopher Trudeau is here, bringing another batch of PyCoder's Weekly articles and projects.


Categories: FLOSS Project Planets

Stack Abuse: RetinaNet Object Detection with PyTorch and torchvision

Planet Python - Fri, 2022-08-19 06:30
Introduction

Object detection is a large field in computer vision, and one of the more important applications of computer vision "in the wild". On one end, it can be used to build autonomous systems that navigate agents through environments - be it robots performing tasks or self-driving cars, but this requires intersection with other fields. However, anomaly detection (such as defective products on a line), locating objects within images, facial detection and various other applications of object detection can be done without intersecting other fields.

Advice: This short guide is based on a small part of a much larger lesson on object detection belonging to our "Practical Deep Learning for Computer Vision with Python" course.

Object detection isn't as standardized as image classification, mainly because most of the new developments are typically done by individual researchers, maintainers and developers, rather than large libraries and frameworks. It's difficult to package the necessary utility scripts in a framework like TensorFlow or PyTorch and maintain the API guidelines that guided the development so far.

This makes object detection somewhat more complex, typically more verbose (but not always), and less approachable than image classification. One of the major benefits of being in an ecosystem is that it saves you from searching for useful information on good practices, tools and approaches to use. With object detection, most people have to do far more research on the landscape of the field to get a good grip.

Object Detection with PyTorch/TorchVision's RetinaNet

torchvision is PyTorch's Computer Vision project, and aims to make the development of PyTorch-based CV models easier, by providing transformation and augmentation scripts, a model zoo with pre-trained weights, datasets and utilities that can be useful for a practitioner.

While still in beta and very much experimental - torchvision offers a relatively simple Object Detection API with a few models to choose from:

  • Faster R-CNN
  • RetinaNet
  • FCOS (Fully convolutional RetinaNet)
  • SSD (VGG16 backbone... yikes)
  • SSDLite (MobileNetV3 backbone)

While the API isn't as polished or simple as some other third-party APIs, it's a very decent starting point for those who'd still prefer the safety of being in an ecosystem they're familiar with. Before going forward, make sure you install PyTorch and Torchvision:

$ pip install torch torchvision

Let's load in some of the utility functions, such as read_image(), draw_bounding_boxes() and to_pil_image() to make it easier to read, draw on and output images, followed by importing RetinaNet and its pre-trained weights (MS COCO):

from torchvision.io.image import read_image
from torchvision.utils import draw_bounding_boxes
from torchvision.transforms.functional import to_pil_image
from torchvision.models.detection import retinanet_resnet50_fpn_v2, RetinaNet_ResNet50_FPN_V2_Weights
import matplotlib.pyplot as plt

RetinaNet uses a ResNet50 backbone and a Feature Pyramid Network (FPN) on top of it. While the name of the class is verbose, it's indicative of the architecture. Let's fetch an image using the requests library and save it as a file on our local drive:

import requests

response = requests.get('https://i.ytimg.com/vi/q71MCWAEfL8/maxresdefault.jpg')
open("obj_det.jpeg", "wb").write(response.content)
img = read_image("obj_det.jpeg")

With an image in place - we can instantiate our model and weights:

weights = RetinaNet_ResNet50_FPN_V2_Weights.DEFAULT
model = retinanet_resnet50_fpn_v2(weights=weights, score_thresh=0.35)
# Put the model in inference mode
model.eval()
# Get the transforms for the model's weights
preprocess = weights.transforms()

The score_thresh argument defines the confidence threshold at which a detection is kept: we won't classify an object as belonging to a class if the model is less than 35% confident in the detection.

Let's preprocess the image using the transforms from our weights, create a batch and run inference:

batch = [preprocess(img)]
prediction = model(batch)[0]

That's it, our prediction dictionary holds the inferred object classes and locations! Now, the results aren't very useful for us in this form - we'll want to extract the labels with respect to the metadata from the weights and draw bounding boxes, which can be done via draw_bounding_boxes():

labels = [weights.meta["categories"][i] for i in prediction["labels"]]
box = draw_bounding_boxes(img, boxes=prediction["boxes"],
                          labels=labels,
                          colors="cyan",
                          width=2, font_size=30, font='Arial')
im = to_pil_image(box.detach())

fig, ax = plt.subplots(figsize=(16, 12))
ax.imshow(im)
plt.show()

This results in:

RetinaNet actually classified the person peeking behind the car! That's a pretty difficult classification.
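Beyond boxes and labels, the prediction dictionary also carries per-detection confidences under the "scores" key, so you can post-process detections yourself. The sketch below is self-contained and uses stand-in lists shaped like torchvision's detection output (real predictions are tensors, and the category ids here are made up):

```python
# Stand-in for a torchvision detection output; real models return
# "boxes", "labels" and "scores" tensors, one entry per detection.
prediction = {
    "labels": [1, 3],        # hypothetical category ids
    "scores": [0.92, 0.47],  # model confidence per detection
}
categories = {1: "person", 3: "car"}  # made-up id -> name mapping

# Keep only detections at or above a confidence threshold,
# mirroring what score_thresh does inside the model
threshold = 0.5
kept = [
    (categories[label], score)
    for label, score in zip(prediction["labels"], prediction["scores"])
    if score >= threshold
]
print(kept)  # [('person', 0.92)]
```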

You can switch out RetinaNet to an FCOS (fully convolutional RetinaNet) by replacing retinanet_resnet50_fpn_v2 with fcos_resnet50_fpn, and use the FCOS_ResNet50_FPN_Weights weights:

from torchvision.io.image import read_image
from torchvision.utils import draw_bounding_boxes
from torchvision.transforms.functional import to_pil_image
from torchvision.models.detection import fcos_resnet50_fpn, FCOS_ResNet50_FPN_Weights
import matplotlib.pyplot as plt
import requests

response = requests.get('https://i.ytimg.com/vi/q71MCWAEfL8/maxresdefault.jpg')
open("obj_det.jpeg", "wb").write(response.content)
img = read_image("obj_det.jpeg")

weights = FCOS_ResNet50_FPN_Weights.DEFAULT
model = fcos_resnet50_fpn(weights=weights, score_thresh=0.35)
model.eval()
preprocess = weights.transforms()

batch = [preprocess(img)]
prediction = model(batch)[0]

labels = [weights.meta["categories"][i] for i in prediction["labels"]]
box = draw_bounding_boxes(img, boxes=prediction["boxes"],
                          labels=labels,
                          colors="cyan",
                          width=2, font_size=30, font='Arial')
im = to_pil_image(box.detach())

fig, ax = plt.subplots(figsize=(16, 12))
ax.imshow(im)
plt.show()

Going Further - Practical Deep Learning for Computer Vision

Your inquisitive nature makes you want to go further? We recommend checking out our Course: "Practical Deep Learning for Computer Vision with Python".

Another Computer Vision Course?

We won't be doing classification of MNIST digits or MNIST fashion. They served their part a long time ago. Too many learning resources are focusing on basic datasets and basic architectures before letting advanced black-box architectures shoulder the burden of performance.

We want to focus on demystification, practicality, understanding, intuition and real projects. Want to learn how you can make a difference? We'll take you on a ride from the way our brains process images to writing a research-grade deep learning classifier for breast cancer to deep learning networks that "hallucinate", teaching you the principles and theory through practical work, equipping you with the know-how and tools to become an expert at applying deep learning to solve computer vision.

What's inside?
  • The first principles of vision and how computers can be taught to "see"
  • Different tasks and applications of computer vision
  • The tools of the trade that will make your work easier
  • Finding, creating and utilizing datasets for computer vision
  • The theory and application of Convolutional Neural Networks
  • Handling domain shift, co-occurrence, and other biases in datasets
  • Transfer Learning and utilizing others' training time and computational resources for your benefit
  • Building and training a state-of-the-art breast cancer classifier
  • How to apply a healthy dose of skepticism to mainstream ideas and understand the implications of widely adopted techniques
  • Visualizing a ConvNet's "concept space" using t-SNE and PCA
  • Case studies of how companies use computer vision techniques to achieve better results
  • Proper model evaluation, latent space visualization and identifying the model's attention
  • Performing domain research, processing your own datasets and establishing model tests
  • Cutting-edge architectures, the progression of ideas, what makes them unique and how to implement them
  • KerasCV - a WIP library for creating state of the art pipelines and models
  • How to parse and read papers and implement them yourself
  • Selecting models depending on your application
  • Creating an end-to-end machine learning pipeline
  • Landscape and intuition on object detection with Faster R-CNNs, RetinaNets, SSDs and YOLO
  • Instance and semantic segmentation
  • Real-Time Object Recognition with YOLOv5
  • Training YOLOv5 Object Detectors
  • Working with Transformers using KerasNLP (industry-strength WIP library)
  • Integrating Transformers with ConvNets to generate captions of images
  • DeepDream
Conclusion

Object Detection is an important field of Computer Vision, and one that's unfortunately less approachable than it should be.

In this short guide, we've taken a look at how torchvision, PyTorch's Computer Vision package, makes it easier to perform object detection on images, using RetinaNet.

Categories: FLOSS Project Planets

KDE neon: Jammy Porting Update

Planet KDE - Fri, 2022-08-19 06:14

Jammy porting is happening at full pace. Almost all the packages are now compiled, which leaves ISOs to be built and the upgrade to be tested.

As with any software or engineering project there’s not much point in putting a deadline on it, it’ll be ready when it’s ready. As a moving target the builds are often two steps forward and one step back cos suddenly there’s a new KDE Gear that needs built.

But we’ll be with you soon.

Categories: FLOSS Project Planets

New in Qt 6.4: FrameAnimation

Planet KDE - Fri, 2022-08-19 04:31

In this blog post we try to solve the classical "Mouse chasing Mouse" problem. Don't know it? No problem, nobody does. But if you are interested in Qt Quick, smooth animations, and what's new in Qt 6.4 (Beta 3 was just released!), please continue reading and you'll find out!

Categories: FLOSS Project Planets

eGenix.com: eGenix Antispam Bot for Telegram 0.4.0 GA

Planet Python - Fri, 2022-08-19 04:00
Introduction

eGenix has long been running a local user group meeting in Düsseldorf called Python Meeting Düsseldorf and we are using a Telegram group for most of our communication.

In the early days, the group worked well and we only had few spammers joining it, which we could well handle manually.

More recently, this has changed dramatically. We are seeing between 2 and 5 spam signups per day, often at night. Furthermore, the signup accounts are not always easy to spot as spammers, since they often come with profile images, descriptions, etc.

With the bot, we now have a more flexible way of dealing with the problem.

Please see our project page for details and download links.

Features
  • Low impact mode of operation: the bot tries to keep noise in the group to a minimum
  • Several challenge mechanisms to choose from, more can be added as needed
  • Flexible and easy to use configuration
  • Only needs a few MB of RAM, so can easily be put into a container or run on a Raspberry Pi
  • Can handle quite a bit of load due to the async implementation
  • Works with Python 3.9+
  • MIT open source licensed
News

The 0.4.0 release fixes a few bugs and adds more features:

  • Added new challenge MathMultiplyChallenge
  • Made the MathAddChallenge and MathMultiplyChallenge a little more difficult

It has been battle-tested in production for several months already and is proving to be a really useful tool to help with Telegram group administration.

More Information

For more information on the eGenix.com Python products, licensing and download instructions, please write to sales@egenix.com.

Enjoy !

Marc-Andre Lemburg, eGenix.com

Categories: FLOSS Project Planets

John Ludhi/nbshare.io: PySpark Substr and Substring

Planet Python - Thu, 2022-08-18 21:38
PySpark Substr and Substring

substring(col_name, pos, len) - The substring starts at pos and is of length len when str is of String type, or returns the slice of the byte array that starts at pos and is of length len when str is of Binary type.
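Note that, unlike Python's 0-based slicing, pos is 1-based, and a negative pos counts back from the end of the string. As a mental model, here is a plain-Python sketch of those semantics (the helper name is made up for illustration; this is not PySpark code):

```python
def substring_1based(s, pos, length):
    # SQL-style substring: pos is 1-based; a negative pos
    # counts back from the end of the string.
    start = pos - 1 if pos > 0 else len(s) + pos
    return s[start:start + length]

print(substring_1based("John A Smith", 1, 4))   # 'John'
print(substring_1based("John A Smith", -5, 5))  # 'Smith'
```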

First we load the important libraries:

from pyspark.sql import SparkSession
from pyspark.sql.functions import (col, substring)

# initializing spark session instance
spark = SparkSession.builder.appName('snippets').getOrCreate()

Let us load our initial records.

columns = ["Full_Name","Salary"]
data = [("John A Smith", 1000),
        ("Alex Wesley Jones", 120000),
        ("Jane Tom James", 5000)]

# converting data to rdds
rdd = spark.sparkContext.parallelize(data)

# Then creating a dataframe from our rdd variable
dfFromRDD2 = spark.createDataFrame(rdd).toDF(*columns)

# visualizing current data before manipulation
dfFromRDD2.show()

+-----------------+------+
|        Full_Name|Salary|
+-----------------+------+
|     John A Smith|  1000|
|Alex Wesley Jones|120000|
|   Jane Tom James|  5000|
+-----------------+------+

PySpark substring

1) Here we are taking a substring for the first name from the Full_Name Column. The Full_Name contains first name, middle name and last name. We are adding a new column for the substring called First_Name

# here we add a new column called 'First_Name' and use substring() to get a partial string from the 'Full_Name' column
modified_dfFromRDD2 = dfFromRDD2.withColumn("First_Name", substring('Full_Name', 1, 4))

# visualizing the modified dataframe
modified_dfFromRDD2.show()

+-----------------+------+----------+
|        Full_Name|Salary|First_Name|
+-----------------+------+----------+
|     John A Smith|  1000|      John|
|Alex Wesley Jones|120000|      Alex|
|   Jane Tom James|  5000|      Jane|
+-----------------+------+----------+

2) We can also get a substring with select and alias to achieve the same result as above

modified_dfFromRDD3 = dfFromRDD2.select("Full_Name", 'Salary', substring('Full_Name', 1, 4).alias('First_Name'))

# visualizing the modified dataframe after executing the above.
# As you can see, it is exactly the same as the previous output.
modified_dfFromRDD3.show()

+-----------------+------+----------+
|        Full_Name|Salary|First_Name|
+-----------------+------+----------+
|     John A Smith|  1000|      John|
|Alex Wesley Jones|120000|      Alex|
|   Jane Tom James|  5000|      Jane|
+-----------------+------+----------+

3) We can also use substring with selectExpr to get a substring of 'Full_Name' column. selectExpr takes SQL expression(s) in a string to execute. This way we can run SQL-like expressions without creating views.

modified_dfFromRDD4 = dfFromRDD2.selectExpr("Full_Name", 'Salary', 'substring(Full_Name, 1, 4) as First_Name')

# visualizing the modified dataframe after executing the above.
# As you can see, it is exactly the same as the previous output.
modified_dfFromRDD4.show()

+-----------------+------+----------+
|        Full_Name|Salary|First_Name|
+-----------------+------+----------+
|     John A Smith|  1000|      John|
|Alex Wesley Jones|120000|      Alex|
|   Jane Tom James|  5000|      Jane|
+-----------------+------+----------+

4) Here we are going to use substr function of the Column data type to obtain the substring from the 'Full_Name' column and create a new column called 'First_Name'

modified_dfFromRDD5 = dfFromRDD2.withColumn("First_Name", col('Full_Name').substr(1, 4))

# visualizing the modified dataframe yields the same output as seen for all previous examples.
modified_dfFromRDD5.show()

+-----------------+------+----------+
|        Full_Name|Salary|First_Name|
+-----------------+------+----------+
|     John A Smith|  1000|      John|
|Alex Wesley Jones|120000|      Alex|
|   Jane Tom James|  5000|      Jane|
+-----------------+------+----------+

5) Let us consider now a example of substring when the indices are beyond the length of column. In that case, the substring() function only returns characters that fall in the bounds i.e (start, start+len). This can be seen in the example below

# In this example we are going to get the four characters of the Full_Name column starting from position 14.
# As can be seen in the output, 4 or fewer characters are returned depending on the string length
modified_dfFromRDD6 = dfFromRDD2.withColumn("Last_Name", substring('Full_Name', 14, 4))

modified_dfFromRDD6.show()

+-----------------+------+---------+
|        Full_Name|Salary|Last_Name|
+-----------------+------+---------+
|     John A Smith|  1000|         |
|Alex Wesley Jones|120000|     ones|
|   Jane Tom James|  5000|        s|
+-----------------+------+---------+

The above method produces the wrong last name. We can fix it with the following approach.

6) Another example of substring when we want to get the characters relative to end of the string. In this example, we are going to extract the last name from the Full_Name column.

# In this example we are going to get the five characters of the Full_Name column relative to the end of the string.
# As can be seen in the output, the last 5 characters are returned
modified_dfFromRDD7 = dfFromRDD2.withColumn("Last_Name", substring('Full_Name', -5, 5))

modified_dfFromRDD7.show()

+-----------------+------+---------+
|        Full_Name|Salary|Last_Name|
+-----------------+------+---------+
|     John A Smith|  1000|    Smith|
|Alex Wesley Jones|120000|    Jones|
|   Jane Tom James|  5000|    James|
+-----------------+------+---------+

Note that the above approach works only if the last name in every row has the same number of characters. If the last names vary in length, the solution is not as simple: we need the index at which the last name starts, as well as the length of 'Full_Name'. If you are curious, I have provided the solution below without the explanation.

from pyspark.sql import SparkSession
from pyspark.sql.functions import (col, substring, lit, substring_index, length)

Let us create an example with last names having variable character length.

columns = ["Full_Name","Salary"]
data = [("John A Smith", 1000),
        ("Alex Wesley leeper", 120000),
        ("Jane Tom kinderman", 5000)]
rdd = spark.sparkContext.parallelize(data)
dfFromRDD2 = spark.createDataFrame(rdd).toDF(*columns)
dfFromRDD2.show()

+------------------+------+
|         Full_Name|Salary|
+------------------+------+
|      John A Smith|  1000|
|Alex Wesley leeper|120000|
|Jane Tom kinderman|  5000|
+------------------+------+

Pyspark substr

dfFromRDD2.withColumn('Last_Name',
                      col("Full_Name").substr(
                          (length('Full_Name') - length(substring_index('Full_Name', " ", -1))),
                          length('Full_Name'))).show()

+------------------+------+----------+
|         Full_Name|Salary| Last_Name|
+------------------+------+----------+
|      John A Smith|  1000|     Smith|
|Alex Wesley leeper|120000|    leeper|
|Jane Tom kinderman|  5000| kinderman|
+------------------+------+----------+

spark.stop()
Categories: FLOSS Project Planets

Dirk Eddelbuettel: RcppArmadillo 0.11.2.3.1 on CRAN: Double Update

Planet Debian - Thu, 2022-08-18 20:51

Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has a syntax deliberately close to Matlab, and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language–and is widely used by (currently) 1005 other packages on CRAN (as celebrated in this blog post on passing 1000 packages from just four days ago), downloaded nearly 26 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 488 times according to Google Scholar.

This release brings together two distinct changes. First, it updates the release from upstream 11.2.0 (and CRAN 0.11.2.0.0 released a few weeks ago) to the now-current 11.2.3 release by Conrad (given that more than four weeks have passed, we do not surpass CRAN's desired cadence of 'releases no more than once a month'). The changeset includes a few small refinements (see below); it also includes a deprecation for initialization, for which I will need to reach out to a few packages for whom this triggers a deprecation warning. And speaking of deprecation, the other reason for this release is the desire by the Matrix package to phase out a few older conversions (or casts in C/C++ lingo), which we accommodated.

The full set of changes (since the last CRAN release 0.11.2.0.0) follows.

Changes in RcppArmadillo version 0.11.2.3.1 (2022-08-16)
  • Accommodate upcoming Matrix 1.4-2 deprecation for conversion (Dirk in #387)

  • CRAN release with small upstream changes in Armadillo 11.2.1,2,3 made since the last CRAN release 0.11.2.0.0 (Dirk in #383, #384 and #386)

  • Undefine arma_deprecated warning as it affects a number of CRAN packages which will likely need a small transition

Changes in RcppArmadillo version 0.11.2.3.0 (2022-07-12) (GitHub Only)
  • Upgraded to Armadillo release 11.2.3 (Classic Roast)

    • fix Cube::insert_slices() to accept Cube::slice() as input
Changes in RcppArmadillo version 0.11.2.2.0 (2022-07-04) (GitHub Only)
  • Upgraded to Armadillo release 11.2.2 (Classic Roast)

    • fix incorrect and/or slow convergence in single-threaded versions of kmeans(), gmm_diag::learn(), gmm_full::learn()
Changes in RcppArmadillo version 0.11.2.1.0 (2022-06-28) (GitHub Only)
  • Upgraded to Armadillo release 11.2.1 (Classic Roast)

    • old style matrix initialisation via the << operator will now emit a compile-time deprecation warning

    • use of the old and inactive ARMA_DONT_PRINT_ERRORS option will now emit a compile-time deprecation warning

    • the option ARMA_WARN_LEVEL can be used instead

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: FLOSS Project Planets

Reproducible Builds (diffoscope): diffoscope 221 released

Planet Debian - Thu, 2022-08-18 20:00

The diffoscope maintainers are pleased to announce the release of diffoscope version 221. This version includes the following changes:

* Don't crash if we can open a PDF file with PyPDF but cannot parse the annotations within. (Closes: reproducible-builds/diffoscope#311)
* Depend on the dedicated xxd package, not vim-common.
* Update external_tools.py to reflect xxd/vim-common change.

You can find out more by visiting the project homepage.

Categories: FLOSS Project Planets

Lullabot: Lullabot Podcast: Drupal Automatic Updates—The Update

Planet Drupal - Thu, 2022-08-18 12:51

Keeping a Drupal site up-to-date can be tricky and time-consuming. Host Matt Kleve sits down with three people in the Drupal community who have been working to make that process easier and faster.

It's been in progress for a while, but now you might be able to start using Automatic Updates on your Drupal site.

Categories: FLOSS Project Planets
