Feeds

Python Bindings for KDE Frameworks: GSoC Midterm Review

Planet KDE - Mon, 2024-07-15 10:40

Week 7 of my 14-week GSoC project has finished, which means that it’s time for a midterm update! Many things have happened since the last update.

I added support for KI18n, KGuiAddons, KNotifications, KUnitConversion and KXMLGui. That was faster than expected. I also created a small unit conversion demo using KUnitConversion (it’s written in Python).

We have decided to move on to upstreaming the bindings to their corresponding repositories. For that, I set up a development environment and I’m now adding the code to each library.

Categories: FLOSS Project Planets

Real Python: Split Your Dataset With scikit-learn's train_test_split()

Planet Python - Mon, 2024-07-15 10:00

One of the key aspects of supervised machine learning is model evaluation and validation. When you evaluate the predictive performance of your model, it’s essential that the process be unbiased. Using train_test_split() from the data science library scikit-learn, you can split your dataset into subsets that minimize the potential for bias in your evaluation and validation process.

In this tutorial, you’ll learn:

  • Why you need to split your dataset in supervised machine learning
  • Which subsets of the dataset you need for an unbiased evaluation of your model
  • How to use train_test_split() to split your data
  • How to combine train_test_split() with prediction methods

In addition, you’ll get information on related tools from sklearn.model_selection.

The Importance of Data Splitting

Supervised machine learning is about creating models that precisely map the given inputs to the given outputs. Inputs are also called independent variables or predictors, while outputs may be referred to as dependent variables or responses.

How you measure the precision of your model depends on the type of problem you’re trying to solve. In regression analysis, you typically use the coefficient of determination, root mean square error, mean absolute error, or similar quantities. For classification problems, you often apply accuracy, precision, recall, F1 score, and related indicators.
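As a rough illustration of the quantities named above, here is a minimal standard-library sketch. The function names `regression_metrics` and `classification_metrics` are made up for this example and are not part of scikit-learn, which ships its own implementations in sklearn.metrics.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, rmse, mae) for two equal-length sequences of numbers."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot          # coefficient of determination
    rmse = math.sqrt(ss_res / n)      # root mean square error
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae


def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, f1) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return accuracy, precision, recall, f1


print(regression_metrics([1, 2, 3, 4], [1, 2, 3, 6]))
print(classification_metrics([1, 1, 0, 0], [1, 0, 1, 0]))
```

Which of these numbers counts as "good" still depends on the field, as the next paragraph notes.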

The acceptable numeric values that measure precision vary from field to field. You can find detailed explanations from Statistics By Jim, Quora, and many other resources.

What’s most important to understand is that you usually need unbiased evaluation to properly use these measures, assess the predictive performance of your model, and validate the model.

This means that you can’t evaluate the predictive performance of a model with the same data you used for training. You need to evaluate the model with fresh data that the model hasn’t seen before. You can accomplish that by splitting your dataset before you use it.

Training, Validation, and Test Sets

Splitting your dataset is essential for an unbiased evaluation of prediction performance. In most cases, it’s enough to split your dataset randomly into three subsets:

  1. The training set is applied to train or fit your model. For example, you use the training set to find the optimal weights, or coefficients, for linear regression, logistic regression, or neural networks.

  2. The validation set is used for unbiased model evaluation during hyperparameter tuning. For example, when you want to find the optimal number of neurons in a neural network or the best kernel for a support vector machine, you experiment with different values. For each considered setting of hyperparameters, you fit the model with the training set and assess its performance with the validation set.

  3. The test set is needed for an unbiased evaluation of the final model. You shouldn’t use it for fitting or validation.

In less complex cases, when you don’t have to tune hyperparameters, it’s okay to work with only the training and test sets.
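One possible way to get the three subsets is to call train_test_split() twice. The sketch below assumes scikit-learn and NumPy are installed; the synthetic arrays and the 60/20/20 proportions are illustrative choices, not recommendations from the tutorial.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 100 samples with 2 features each, and one label per sample
X = np.arange(200).reshape(100, 2)
y = np.arange(100)

# Step 1: hold out 20% of the data as the test set
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Step 2: split the remaining 80 samples into training (60) and validation (20)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0
)

print(len(X_train), len(X_val), len(X_test))
```

Fixing random_state makes the split reproducible, which helps when you compare hyperparameter settings against the same validation set.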

Underfitting and Overfitting

Splitting a dataset might also be important for detecting if your model suffers from one of two very common problems, called underfitting and overfitting:

  1. Underfitting is usually the consequence of a model being unable to encapsulate the relations among data. For example, this can happen when trying to represent nonlinear relations with a linear model. Underfitted models will likely have poor performance with both training and test sets.

  2. Overfitting usually takes place when a model has an excessively complex structure and learns both the existing relations among data and noise. Such models often have bad generalization capabilities. Although they work well with training data, they usually yield poor performance with unseen test data.

You can find a more detailed explanation of underfitting and overfitting in Linear Regression in Python.
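To make the two problems more concrete, here is a small sketch that fits polynomials of different degrees to synthetic data and compares the error on training and held-out points. It assumes NumPy is installed; the quadratic data, the noise level, and the degrees 1, 2, and 9 are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a quadratic relation plus a little noise
x = rng.uniform(-1, 1, size=40)
y = x**2 + rng.normal(scale=0.1, size=40)

# Simple split: first half for training, second half for testing
x_train, y_train = x[:20], y[:20]
x_test, y_test = x[20:], y[20:]


def mse(coeffs, xs, ys):
    """Mean squared error of a fitted polynomial on the given points."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))


results = {}
for degree in (1, 2, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (
        mse(coeffs, x_train, y_train),
        mse(coeffs, x_test, y_test),
    )
    print(f"degree={degree}: train MSE={results[degree][0]:.4f}, "
          f"test MSE={results[degree][1]:.4f}")
```

The linear model cannot represent the curve, so its error stays high on both sets. The degree 9 model always fits the training points at least as closely as the quadratic one, and a gap between its training and test error is the usual sign of overfitting.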

Prerequisites for Using train_test_split()

Now that you understand the need to split a dataset in order to perform unbiased model evaluation and identify underfitting or overfitting, you’re ready to learn how to split your own datasets.

You’ll use version 1.5.0 of scikit-learn, or sklearn. It has many packages for data science and machine learning, but for this tutorial, you’ll focus on the model_selection package, specifically on the function train_test_split().

Note: While this tutorial is tested with this specific version of scikit-learn, the features that you’ll use are core to the library and should work equivalently in other versions of scikit-learn as well.

You can install sklearn with pip:

Read the full article at https://realpython.com/train-test-split-python-data/ »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Week 7 recap - vector approach

Planet KDE - Mon, 2024-07-15 08:30
Currently, I have three separate ideas. The first idea is that in kis_tool_freehand, where stroke initialization and stroke end exist, we populate a vector, filter out extra points, and then take out corner pixels before passing it down. The problem I am ...
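The excerpt is cut off, so the following is only a guess at what such a vector pass could look like, written in Python rather than the C++ of the actual code base. The helper names `drop_duplicates` and `remove_corner_pixels` are hypothetical, and the corner test is the common "pixel-perfect" rule for L-shaped corners, not necessarily what the author implemented.

```python
def drop_duplicates(points):
    """Remove consecutive duplicate points from a pixel path."""
    result = []
    for point in points:
        if not result or result[-1] != point:
            result.append(point)
    return result


def remove_corner_pixels(points):
    """Drop the middle pixel of each L-shaped corner in a pixel path."""
    if len(points) < 3:
        return list(points)
    result = [points[0]]
    for i in range(1, len(points) - 1):
        prev, cur, nxt = result[-1], points[i], points[i + 1]
        touches_prev = prev[0] == cur[0] or prev[1] == cur[1]
        touches_next = nxt[0] == cur[0] or nxt[1] == cur[1]
        diagonal = prev[0] != nxt[0] and prev[1] != nxt[1]
        if touches_prev and touches_next and diagonal:
            continue  # cur is the corner of an "L": skip it
        result.append(cur)
    result.append(points[-1])
    return result


path = [(0, 0), (0, 0), (1, 0), (1, 1), (2, 1), (2, 2)]
print(remove_corner_pixels(drop_duplicates(path)))
```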
Categories: FLOSS Project Planets

Real Python: Quiz: How to Use Generators and yield in Python

Planet Python - Mon, 2024-07-15 08:00

In this quiz, you’ll test your understanding of Python generators.

Generators and the Python yield statement can help you when you’re working with large datasets that might overwhelm your machine’s memory. Another use case is when you have a complex function that needs to maintain an internal state every time it’s called.

When you understand Python generators, then you’ll be able to work with large datasets in a more Pythonic fashion, create generator functions and expressions, and apply your knowledge towards building efficient data pipelines.
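For readers who want a quick reminder before taking the quiz, here is a minimal sketch of both use cases mentioned above: lazy iteration over data and a function that keeps internal state between calls. The names `squares` and `running_average` are invented for this example.

```python
def squares(limit):
    """Yield squares one at a time instead of building a full list."""
    for n in range(limit):
        yield n * n


# A generator expression chained onto a generator function: a tiny pipeline
even_squares = (s for s in squares(10) if s % 2 == 0)
total = sum(even_squares)
print(total)


def running_average():
    """Keep state between calls: receive values via send(), yield the mean."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)               # advance to the first yield
print(avg.send(10))
print(avg.send(20))
```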

Categories: FLOSS Project Planets

Real Python: Quiz: How to Write Beautiful Python Code With PEP 8

Planet Python - Mon, 2024-07-15 08:00

In this quiz, you’ll test your understanding of how to write beautiful Python code with PEP 8.

By working through this quiz, you’ll revisit the key guidelines laid out in PEP 8 and how to set up your development environment to write PEP 8 compliant Python code.

Categories: FLOSS Project Planets

Steinar H. Gunderson: Pull requests via git push

Planet Debian - Mon, 2024-07-15 07:15

This project inspired me to investigate whether git.sesse.net could start accepting patches in a way that had less friction than email, and didn't depend on custom SSH-facing code written by others. And it seems it really can! The thought was to simply allow git push from anyone, but that git push doesn't actually push anything; it just creates a pull request (by email). It was much simpler than I'd thought. First make an empty hooks directory with this pre-receive hook (make sure it is readable by your web server, and marked as executable):

#! /bin/bash
set -e
read oldsha newsha refname
git send-email --to=steinar+git@gunderson.no --suppress-cc=all --subject-prefix="git-anon-push PATCH" --quiet $oldsha..$newsha
echo ''
echo 'Thank you for your contribution! The patch has been sent by email and will be examined for inclusion.'
echo 'The push will now exit with an error. No commits have actually been pushed.'
exit 1

Now we can activate this hook and anonymous push in each project (I already run git-http-backend on the server for pulling, and it supports pushing just fine if you tell it to), and give www-data write permissions to store the pushed objects temporarily:

git config core.hooksPath /srv/git.sesse.net/hooks
git config http.receivepack true
sudo chgrp -R www-data .
chmod -R g+w .

And now any attempts to git push will send me patch emails that I can review and optionally include!

It's not perfect. For instance, it doesn't support multipush, and if you try to push to a branch that doesn't exist already, it will error out since $oldsha is all-zeros. And the From: header is always www-data (but I didn't want to expose myself to all sorts of weird injection attacks by trying to parse the committer email). And of course, there's no spam control, but if you want to spam me with email, then you could just like… send email?
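The first two limitations come from how a pre-receive hook gets its input: Git writes one "old new refname" line per updated ref to the hook's standard input, and uses an all-zeros object name for a ref that is being created or deleted. The sketch below is not part of the setup described here; it only shows, in Python, how a hook might parse every line and tell those cases apart. The function names are hypothetical.

```python
def parse_updates(stdin_text):
    """Parse pre-receive input into (old, new, refname) tuples."""
    updates = []
    for line in stdin_text.splitlines():
        if not line.strip():
            continue
        old, new, refname = line.split(maxsplit=2)
        updates.append((old, new, refname))
    return updates


def is_null(sha):
    """True for the all-zeros object name Git uses for 'no commit'."""
    return set(sha) == {"0"}


def classify(old, new):
    """Return 'create', 'delete', or 'update' for one ref update."""
    if is_null(old):
        return "create"
    if is_null(new):
        return "delete"
    return "update"


def patch_ranges(stdin_text):
    """Return old..new ranges for plain updates; skip creates and deletes."""
    return [
        f"{old}..{new}"
        for old, new, _ in parse_updates(stdin_text)
        if classify(old, new) == "update"
    ]


example = (
    "1111111111111111111111111111111111111111 "
    "2222222222222222222222222222222222222222 refs/heads/master\n"
    "0000000000000000000000000000000000000000 "
    "3333333333333333333333333333333333333333 refs/heads/new-branch\n"
)
print(patch_ranges(example))
```

A real hook would read sys.stdin and decide separately what to send for a newly created branch; that part is left out on purpose.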

(I have backups, in case someone discovers some sort of evil security hole.)

Categories: FLOSS Project Planets

Thomas Lange: FAIme adds Korean language support

Planet Debian - Mon, 2024-07-15 07:01

In two weeks, DebConf24, the Debian conference, starts in Busan, South Korea. Therefore I've added support for the Korean language to the web service of FAI:

https://fai-project.org/FAIme/

Another new feature of the FAIme service will be announced at DebConf24 in August.

Categories: FLOSS Project Planets

Qt Creator 14 - CMake Update

Planet KDE - Mon, 2024-07-15 07:00

Here are the new CMake features and fixes in Qt Creator 14:

Categories: FLOSS Project Planets

Kushal Das: Disable this Firefox preference to save privacy

Planet Python - Mon, 2024-07-15 04:55

If you are on the latest Firefox 128 (which is available on Fedora 40), you should uncheck the following preference to disable Privacy-Preserving Attribution. Firefox added this experimental feature and turned it on by default for everyone, which should not be the case.

You can find it in the preferences window.

Categories: FLOSS Project Planets

Zato Blog: Network packet brokers and automation in Python

Planet Python - Mon, 2024-07-15 00:43
2024-07-15, by Dariusz Suchojad

Packet brokers are crucial for network engineers, providing a clear, detailed view of network traffic, aiding in efficient issue identification and resolution.

But what is a network packet broker (NPB) really? Why are they needed? And how do you automate one in Python?

➤ Read this article about network packet brokers and their automation in Python to find out more.

Categories: FLOSS Project Planets

FSF Events: Free Software Directory meeting on IRC: Friday, July 19, starting at 12:00 EDT (16:00 UTC)

GNU Planet! - Mon, 2024-07-15 00:00
Join the FSF and friends on Friday, July 19 from 12:00 to 15:00 EDT (16:00 to 19:00 UTC) to help improve the Free Software Directory.
Categories: FLOSS Project Planets

Russ Allbery: podlators v6.0.2

Planet Debian - Sun, 2024-07-14 15:53

podlators contains the Perl modules and scripts used to convert Perl's documentation language, POD, to text and manual pages.

This is another small bug fix release that is part of iterating on getting the new podlators incorporated into Perl core. The bug fixed in this release was another build system bug I introduced in recent refactorings, this time breaking the realclean target so that some generated scripts were not removed. Thanks to James E Keenan for the report.

You can get the latest version from CPAN or from the podlators distribution page.

Categories: FLOSS Project Planets

Russ Allbery: DocKnot 8.0.1

Planet Debian - Sun, 2024-07-14 15:38

DocKnot is my static web site generator, with some additional features for managing software releases.

This release fixes some bugs in the newly-added conversion of text to HTML that were due to my still-incomplete refactoring of that code. It still uses some global variables, and they were leaking between different documents and breaking the formatting. It also fixes consistency problems with how the style parameter in *.spin files was interpreted, and fixes some incorrect docknot update-spin behavior.

You can get the latest version from CPAN or from the DocKnot distribution page.

Categories: FLOSS Project Planets

Real Python: Quiz: How to Flatten a List of Lists in Python

Planet Python - Sun, 2024-07-14 08:00

In this quiz, you’ll test your understanding of how to flatten a list in Python.

You’ll write code and answer questions to revisit the concept of converting a multidimensional list, such as a matrix, into a one-dimensional list.
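As a quick refresher, here are two common ways to flatten one level of nesting. The `matrix` value is just sample data, and neither approach handles deeper or irregular nesting.

```python
from itertools import chain

matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Nested list comprehension: outer loop over rows, inner loop over items
flat_comprehension = [item for row in matrix for item in row]

# itertools.chain.from_iterable lazily concatenates the rows
flat_chain = list(chain.from_iterable(matrix))

print(flat_comprehension)
print(flat_chain)
```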

Categories: FLOSS Project Planets

Real Python: Quiz: Python Type Checking

Planet Python - Sun, 2024-07-14 08:00

In this quiz, you’ll test your understanding of Python Type Checking.

By working through this quiz, you’ll revisit type annotations and type hints, adding static types to code, running a static type checker, and enforcing types at runtime.

Categories: FLOSS Project Planets

Ravi Dwivedi: Kenya Visa Process

Planet Debian - Sun, 2024-07-14 06:54

Prior to arrival in Kenya, you need to apply for an Electronic Travel Authorization (eTA) on their website by uploading all the required documents. This system has been in place since January 2024, when the country abolished its visa system and introduced the eTA portal. The required documents will depend on the purpose of your visit, which in my case was to attend a conference.

Here is the list of documents I submitted for my eTA:

  • Scanned copy of my passport

  • Photograph with white background

  • Flight tickets (reservation)

  • Hotel bookings (reservation)

  • Invitation letter from the conference

  • Yellow Fever vaccination certificate (optional)

  • Job contract (optional)

“Reservation” means I didn’t book the flights and hotels, but rather reserved them. Additionally, “optional” means that those documents were not mandatory to submit, but I submitted them in the “Other Documents” section in order to support my application. After submitting the eTA application, I had to make a payment of around 35 US Dollars (approximately 3000 Indian Rupees).

It took 40 hours for me to receive an email from Kenya stating that my eTA had been approved, along with an attached PDF, making this one of my smoothest experiences of obtaining travel documents for a country :). An eTA is technically not a visa, but I put the word “visa” in the title due to familiarity with the term.

Categories: FLOSS Project Planets

digiKam 8.4.0 is released

Planet KDE - Sat, 2024-07-13 20:00
Dear digiKam fans and users, after five months of active maintenance and lengthy bug triage, the digiKam team is proud to present version 8.4.0 of its open source digital photo manager. Long-standing bugs present in older versions have been fixed, and we spent a lot of time contacting users to validate changes in pre-release builds and confirm fixes before deploying the program in production. The application internationalization has also been updated.
Categories: FLOSS Project Planets

gnuastro @ Savannah: Gnuastro 0.23 released

GNU Planet! - Sat, 2024-07-13 19:01

The 23rd release of GNU Astronomy Utilities (Gnuastro) is now available. See the full announcement for all the new features in this release and the many bugs that have been found and fixed: https://lists.gnu.org/archive/html/info-gnuastro/2024-07/msg00001.html

Categories: FLOSS Project Planets

Anuradha Weeraman: Windows of Opportunity: Microsoft's Open Source Renaissance

Planet Debian - Sat, 2024-07-13 09:40

Twenty years ago, it was easy to dislike Microsoft. It was the quintessential evil MegaCorp that was quick to squash competition, often ruthlessly, but in some cases slowly through a more insidious process of embracing, extending, and exterminating anything that got in the way. This was the signature personality of Ballmer-era Microsoft that also inspired and united the software freedom fighting forces that came together to safeguard things that mattered to them and were at risk.

I remember the era when the Novell, SCO, and Microsoft saga cast fear, uncertainty, and doubt on the future of open Unix and Linux and on what would happen to the operating systems that we loved if the suits of Redmond prevailed. Looking back, I'm glad that the arc of this story has bent towards justice, and I shudder at the possibilities had it worked out differently.

Looking at today's Microsoft, I'm amazed at how much a leader with the right vision can change the trajectory of a company, to the point that even an old-school software freedom advocate such as me admires and even applauds the strides it has taken in the last 10 or so years, which have dramatically shifted the perception of Microsoft. The personality of the Satya-era Microsoft is one to behold. While it will take more time to win back the trust, we see the tides changing and the positivity is important for the entire industry.

For Microsoft, it was TypeScript and VS Code that helped change the narrative internally, which led to its internal resurgence and acceptance of open source. Its acquisition of GitHub propelled it forward within the community overnight. Its contributions to the Linux kernel and other major software projects have also been consequential in changing how the public perceives it.

Trust takes a while to claw back and is very easy to breach. This time, however, Microsoft seems to understand this dynamic more than it did 20 years ago. All it took was the right leadership.

Categories: FLOSS Project Planets
