FLOSS Project Planets

Molly de Blanc: Digital Self

Planet Debian - Mon, 2020-10-26 15:02

When we talk about the digital self, we are talking about the self as it exists within digital spaces. This holds differently for different people, as some of us prefer to live within a pseudonymous or anonymous identity online, divested from our physical selves, while others consider the digital a more holistic identity that extends from the physical.

Your digital self is gestalt, in that it exists across whatever mediums, web sites, and services you use. These bits are pieced together to form a whole picture of what it means to be you, or some aspect of you. This may be carefully curated, or it may be an emergent property of who you are.

Just as your physical self has rights, so too does your digital self. Or, perhaps, it would be more accurate to say that your rights extend to your digital self. I do not personally consider that there is a separation between these selves when it comes to rights, as both are aspects of you and you have rights. I am explicitly not going to list what these rights are, because I have my own ideas about them and yours may differ. Instead, I will briefly talk about consent.

I think it is essential that we genuinely consent to how others interact with us to maintain the sanctity of our selves. Consent is necessary to the protection and expression of our rights, as it ensures we are able to rely on our rights and creates a space where we are able to express our rights in comfort and safety. We may classically think of consent as it relates to sex and sexual consent: only we have the right to determine what happens to our bodies; no one else has the right to that determination. We are able to give sexual consent, and we are able to revoke it. Sexual consent, in order to be in good faith, must be requested and given from a place of openness and transparency. For this, we discuss with our partners the things about ourselves that may impact their decision to consent: we are sober; we are not ill; we are using (or not) protection as we agree is appropriate; we are making this decision because it is a thing we desire, rather than a thing we feel we ought to do or are being forced to do; as well as other topics.

These things also all hold true for technology and the digital spaces in which we reside. Our digital autonomy is not the only thing at stake when we look at digital consent. The ways we interact in digital spaces impact our whole selves, and exploitation of our consent likewise impacts our whole selves. Private information appearing online can have material consequences — it can directly lead to safety issues, like stalking or threats, and it can lead to a loss of psychic safety and have a chilling effect. These are in addition to the threats posed to digital safety and well being. Consent must be actively sought, what one is consenting to must be transparent, and the potential consequences must be known and understood.

In order to protect and empower the digital self, to treat everyone justly and with respect, we must hold the digital self to be as sacrosanct as other aspects of the self and treat it accordingly.

Categories: FLOSS Project Planets

NumFOCUS: TARDIS Joins NumFOCUS as a Sponsored Project

Planet Python - Mon, 2020-10-26 14:13

NumFOCUS is pleased to announce the newest addition to our fiscally sponsored projects: TARDIS. TARDIS is an open-source, Monte Carlo-based radiation transport simulator for supernovae ejecta. TARDIS simulates photons traveling through the outer layers of an exploded star, including relevant physics like atomic interactions between the photons and the expanding gas. The TARDIS collaboration […]

The post TARDIS Joins NumFOCUS as a Sponsored Project appeared first on NumFOCUS.

Categories: FLOSS Project Planets

Reuven Lerner: Last chance: Weekly Python Exercise B3 starts tomorrow!

Planet Python - Mon, 2020-10-26 11:01

Want to improve your Python skills? Looking for a way to practice on a regular basis, backed up by a community of learners?

Look no more: A new advanced-level cohort of Weekly Python Exercise is starting tomorrow! If you’ve been using Python for at least a year, then this course will open your eyes to new techniques, and help to strengthen existing ones.

Here’s how it works:

  • Every Tuesday, you’re sent a new problem via e-mail, along with “pytest” tests
  • On the following Monday, you get the solution, with a detailed explanation.
  • In between, you can chat in our private forum about your solution (and theirs).
  • Once a month, I do free, live office hours, answering your Python questions.

But wait, there’s more: As of this cohort (B3), every solution will not only be written up in e-mail, but will also be answered in a screencast! I hope that this will help you to understand the solutions better than in pure text.

Questions? Comments? Wondering about discounts? Just contact me at @reuvenmlerner on Twitter, or send me e-mail at reuven@lerner.co.il.

But don’t hesitate; I won’t be offering this cohort again until 2021…

Click here for more info about Weekly Python Exercise

The post Last chance: Weekly Python Exercise B3 starts tomorrow! appeared first on Reuven Lerner.

Categories: FLOSS Project Planets

Test and Code: 136: Wearable Technology - Sophy Wong

Planet Python - Mon, 2020-10-26 10:15

Wearable technology is not just smart consumer devices like watches and activity trackers.

Wearable tech also includes one-off projects by designers, makers, and hackers, and there are more and more people producing tutorials on how to get started. Wearable tech is also a great way to get both kids and adults excited about coding, electronics, and, in general, engineering skills.

Sophy Wong is a designer who makes really cool stuff using code, technology, costuming, soldering, and even jewelry techniques to get tech onto the human body.

Sophy joins the show to answer my many questions about getting started safely with wearable tech.

Some of the questions and topics:

  • Can I wash my clothing if I've added tech to it?
  • Is there any danger in wearing technology or building wearable tech?
  • Are there actual wires, cables, or conductive thread in the fabric and textiles of some wearable tech projects?
  • What's a good starter project? Especially if I want to do a wearable tech project with my kids?
  • Dealing with stretch with clothing and non-bendy electronics.
  • Some questions around the Sophy Wong and HackSpace "Wearable Tech Projects" book.
  • How did you get into wearable tech?
  • Do you have a favorite project?
  • Can I get into wearable tech if I don't know how to code or solder?
  • Are these projects accessible to people with limited budgets?
  • Making projects so you can reuse the expensive bits on multiple projects.

Special Guest: Sophy Wong.

Sponsored By:

  • monday.com: Creating a monday.com app can help thousands of people and win you prizes. Maybe even a Tesla or a MacBook. (https://monday.com/testandcode)
  • PyCharm Professional: Try PyCharm Pro for 4 months and learn how PyCharm will save you time. Promo Code: TESTANDCODE20 (https://testandcode.com/pycharm)

Support Test & Code : Python Testing for Software Engineering (https://www.patreon.com/testpodcast)

Links:

  • sophywong.com (https://sophywong.com/)
  • Wearable Tech Projects book: The wearable technology book (https://hackspace.raspberrypi.org/articles/wearable-tech-projects)
  • costumes: The dress is on this page, as well as the Ghostbuster pack and costume. (https://sophywong.com/costumes)
  • spacesuit (https://sophywong.com/spacesuit)
  • Music video with Sophy's space suit (https://www.youtube.com/watch?v=T3iNIylOZF0&t=4s)
  • Kobakant tutorials (https://www.kobakant.at/DIY/)
Categories: FLOSS Project Planets

Real Python: Python Modulo in Practice: How to Use the % Operator

Planet Python - Mon, 2020-10-26 10:00

Python supports a wide range of arithmetic operators that you can use when working with numbers in your code. One of these operators is the modulo operator (%), which returns the remainder of dividing two numbers.

In this tutorial, you’ll learn:

  • How modulo works in mathematics
  • How to use the Python modulo operator with different numeric types
  • How Python calculates the results of a modulo operation
  • How to override .__mod__() in your classes to use them with the modulo operator (a small sketch of this follows the list below)
  • How to use the Python modulo operator to solve real-world problems
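
As a small taste of the second-to-last bullet above, here is a minimal, hypothetical sketch (my own illustration, not from the tutorial) of a class that defines .__mod__() so its instances work with the % operator:

class Angle:
    """Toy class whose instances support the % operator."""
    def __init__(self, degrees):
        self.degrees = degrees

    def __mod__(self, other):
        # Delegate to the built-in numeric modulo
        return Angle(self.degrees % other)

print((Angle(370) % 360).degrees)  # prints 10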

The Python modulo operator can sometimes be overlooked. But having a good understanding of this operator will give you an invaluable tool in your Python tool belt.

Free Bonus: Click here to get a Python Cheat Sheet and learn the basics of Python 3, like working with data types, dictionaries, lists, and Python functions.

Modulo in Mathematics

The term modulo comes from a branch of mathematics called modular arithmetic. Modular arithmetic deals with integer arithmetic on a circular number line that has a fixed set of numbers. All arithmetic operations performed on this number line will wrap around when they reach a certain number called the modulus.

A classic example of modulo in modular arithmetic is the twelve-hour clock. A twelve-hour clock has a fixed set of values, from 1 to 12. When counting on a twelve-hour clock, you count up to the modulus 12 and then wrap back to 1. A twelve-hour clock can be classified as “modulo 12,” sometimes shortened to “mod 12.”

The modulo operator is used when you want to compare a number with the modulus and get the equivalent number constrained to the range of the modulus.

For example, say you want to determine what time it would be nine hours after 8:00 a.m. On a twelve-hour clock, you can’t simply add 9 to 8 because you would get 17. You need to take the result, 17, and use mod to get its equivalent value in a twelve-hour context:

8 o'clock + 9 = 17 o'clock
17 mod 12 = 5

17 mod 12 returns 5. This means that nine hours past 8:00 a.m. is 5:00 p.m. You determined this by taking the number 17 and applying it to a mod 12 context.
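
In Python, this clock arithmetic is a one-liner with the % operator (a quick illustration, not part of the original article):

>>> (8 + 9) % 12
5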

Now, if you think about it, 17 and 5 are equivalent in a mod 12 context. If you were to look at the hour hand at 5:00 and 17:00, it would be in the same position. Modular arithmetic has an equation to describe this relationship:

a ≡ b (mod n)

This equation reads “a and b are congruent modulo n.” This means that a and b are equivalent in mod n as they have the same remainder when divided by n. In the above equation, n is the modulus for both a and b. Using the values 17 and 5 from before, the equation would look like this:

17 ≡ 5 (mod 12)

This reads “17 and 5 are congruent modulo 12.” 17 and 5 have the same remainder, 5, when divided by 12. So in mod 12, the numbers 17 and 5 are equivalent.

You can confirm this using division:

17 / 12 = 1 R 5
5 / 12 = 0 R 5

Both of the operations have the same remainder, 5, so they’re equivalent modulo 12.
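
You can verify the same congruence directly with Python's % operator (again, just an illustrative aside):

>>> 17 % 12
5
>>> 5 % 12
5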

Now, this may seem like a lot of math for a Python operator, but having this knowledge will prepare you to use the modulo operator in the examples later in this tutorial. In the next section, you’ll look at the basics of using the Python modulo operator with the numeric types int and float.

Python Modulo Operator Basics

The modulo operator, like the other arithmetic operators, can be used with the numeric types int and float. As you’ll see later on, it can also be used with other types like math.fmod(), decimal.Decimal, and your own classes.

Modulo Operator With int

Most of the time you’ll use the modulo operator with integers. The modulo operator, when used with two positive integers, will return the remainder of standard Euclidean division:
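
The article's own code examples continue behind the link below; as a quick, illustrative sketch (not taken from the article) of the remainder returned for two positive integers:

>>> 15 % 4
3
>>> 20 % 7
6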

Read the full article at https://realpython.com/python-modulo-operator/ »


Categories: FLOSS Project Planets

Stack Abuse: Seaborn Scatter Plot - Tutorial and Examples

Planet Python - Mon, 2020-10-26 09:30
Introduction

Seaborn is one of the most widely used data visualization libraries in Python, as an extension to Matplotlib. It offers a simple, intuitive, yet highly customizable API for data visualization.

In this tutorial, we'll take a look at how to plot a scatter plot in Seaborn. We'll cover simple scatter plots, multiple scatter plots with FacetGrid as well as 3D scatter plots.

Import Data

We'll use the World Happiness dataset, and compare the Happiness Score against varying features to see what influences perceived happiness in the world:

import pandas as pd

df = pd.read_csv('worldHappiness2016.csv')

Plot a Scatter Plot in Seaborn

Now, with the dataset loaded, let's import PyPlot, which we'll use to show the graph, as well as Seaborn. We'll plot the Happiness Score against the country's Economy (GDP per Capita):

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = pd.read_csv('worldHappiness2016.csv')

sns.scatterplot(data = df, x = "Economy (GDP per Capita)", y = "Happiness Score")
plt.show()

Seaborn makes it really easy to plot basic graphs like scatter plots. We don't need to fiddle with the Figure object, Axes instances or set anything up, although, we can if we want to. Here, we've supplied the df as the data argument, and provided the features we want to visualize as the x and y arguments.

These have to match the data present in the dataset and the default labels will be their names. We'll customize this in a later section.

Now, if we run this code, we're greeted with:

Here, there's a strong positive correlation between the economy (GDP per capita) and the perceived happiness of the inhabitants of a country/region.

Plotting Multiple Scatter Plots in Seaborn with FacetGrid

If you'd like to compare more than one variable against another, such as - the average life expectancy, as well as the happiness score against the economy, or any variation of this, there's no need to create a 3D plot for this.

While 2D plots that visualize correlations between more than two variables exist, some of them aren't fully beginner friendly.

Seaborn allows us to construct a FacetGrid object, which we can use to facet the data and construct multiple, related plots, one next to the other.

Let's take a look at how to do that:

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.read_csv('worldHappiness2016.csv')

grid = sns.FacetGrid(df, col = "Region", hue = "Region", col_wrap=5)
grid.map(sns.scatterplot, "Economy (GDP per Capita)", "Health (Life Expectancy)")
grid.add_legend()

plt.show()

Here, we've created a FacetGrid, passing our data (df) to it. By specifying the col argument as "Region", we've told Seaborn that we'd like to facet the data into regions and plot a scatter plot for each region in the dataset.

We've also assigned the hue to depend on the region, so each region has a different color. Finally, we've set the col_wrap argument to 5 so that the entire figure isn't too wide - it breaks on every 5 columns into a new row.

To this grid object, we map() our arguments. Specifically, we specified a sns.scatterplot as the type of plot we'd like, as well as the x and y variables we want to plot in these scatter plots.

This results in 10 different scatter plots, each with the related x and y data, separated by region.

We've also added a legend in the end, to help identify the colors.

Plotting a 3D Scatter Plot in Seaborn

Seaborn doesn't come with any built-in 3D functionality, unfortunately. It's an extension of Matplotlib and relies on it for the heavy lifting in 3D. Though, we can style the 3D Matplotlib plot, using Seaborn.

Let's set the style using Seaborn, and visualize a 3D scatter plot between happiness, economy and health:

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from mpl_toolkits.mplot3d import Axes3D

df = pd.read_csv('2016.csv')

sns.set(style = "darkgrid")

fig = plt.figure()
ax = fig.add_subplot(111, projection = '3d')

x = df['Happiness Score']
y = df['Economy (GDP per Capita)']
z = df['Health (Life Expectancy)']

ax.set_xlabel("Happiness")
ax.set_ylabel("Economy")
ax.set_zlabel("Health")

ax.scatter(x, y, z)

plt.show()

Running this code results in an interactive 3D visualization that we can pan and inspect in three-dimensional space, styled as a Seaborn plot:

Customizing Scatter Plots in Seaborn

Using Seaborn, it's easy to customize various elements of the plots you make. For example, you can set the hue and size of each marker on a scatter plot.

Let's change some of the options and see how the plot looks when altered:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = pd.read_csv('2016.csv')

sns.scatterplot(data = df, x = "Economy (GDP per Capita)", y = "Happiness Score", hue = "Region", size = "Freedom")
plt.show()

Here, we've set the hue to Region which means that data from different regions will have different colors. Also, we've set the size to be proportional to the Freedom feature. The higher the freedom factor is, the larger the dots are:

Or you can set a fixed size for all markers, as well as a color:

sns.scatterplot(data = df, x = "Economy (GDP per Capita)", y = "Happiness Score", color = "red", s = 5)

Conclusion

In this tutorial, we've gone over several ways to plot a scatter plot using Seaborn and Python.

If you're interested in Data Visualization and don't know where to start, make sure to check out our book on Data Visualization in Python.

Data Visualization in Python, a book for beginner to intermediate Python developers, will guide you through simple data manipulation with Pandas, cover core plotting libraries like Matplotlib and Seaborn, and show you how to take advantage of declarative and experimental libraries like Altair.

Data Visualization in Python: Understand your data better with visualizations! With over 275 pages, you'll learn the ins and outs of visualizing data in Python with popular libraries like Matplotlib, Seaborn, Bokeh, and more.
Categories: FLOSS Project Planets

Stack Abuse: What Does if __name__ == "__main__": Do in Python?

Planet Python - Mon, 2020-10-26 08:17
Introduction

It's common to see if __name__ == "__main__" in Python scripts we find online, or one of the many we write ourselves.

Why do we use that if-statement when running our Python programs? In this article, we explain the mechanics behind its usage, the advantages, and where it can be used.

The __name__ Attribute and the __main__ Scope

The __name__ attribute comes by default as one of the names in the current local scope. The Python interpreter automatically adds this value when we are running a Python script or importing our code as a module.

Try out the following command on your Python interpreter. You may find out that __name__ belongs to the list of attributes in dir():

dir()

In Python, __name__ is a special variable that holds the name of the current class, module, or script from which it gets invoked.

Create a new folder called name_scripts so we can write a few scripts to understand how this all works. In that folder create a new file, script1.py with the following code:

print(f'The __name__ from script1 is "{__name__}"')

Running this script directly prints: The __name__ from script1 is "__main__". That's a curveball! We'd expect the name to be script1, like our file. What does the output __main__ mean?

By default, when a script is executed, the interpreter reads the script and assigns the string __main__ to the __name__ keyword.

It gets even more interesting when the above script gets imported to another script. Consider a Python file named script2.py with the following code:

import script1  # The print statement gets executed upon import

print(f'The __name__ from script2 is "{__name__}"')

As you can see, when script2.py is executed, the import prints script1 as the value of __name__, denoting the name of the imported script. The final print statement is in the scope of script2, and when it gets executed, the output gets printed as __main__.

Now that we understand how Python uses the __name__ scope and when it gives it a value of "__main__", let's look at why we check for its value before executing code.

if __name__ == "__main__" in Action

We use the if-statement to run blocks of code only if our program is the main program executed. This allows our program to be executable by itself, but friendly to other Python modules that may want to import some functionality without having to run the code.

Consider the following Python programs:

a) script3.py contains a function called add() which gets invoked only from the main context.

def add(a, b):
    return a+b

if __name__ == "__main__":
    print(add(2, 3))

Here's the output when script3.py gets invoked directly: it prints 5.

As the script was executed directly, the __name__ keyword is assigned to __main__, and the block of code under the if __name__ == "__main__" condition is executed.

b) Here's what happens when this snippet is imported from script4.py:

import script3

print(f"{script3.__name__}")

The block under if __name__ == "__main__" from script3.py did not execute, as expected. This happened because the __name__ keyword is now assigned the name of the imported script: script3. This can be verified by the print statement given, which prints the assigned value of the __name__ keyword.

How Does __name__ == "__main__" Help in Development?

Here are some use cases for using that if-statement when creating your script:

  • Testing is a good practice which helps not only catch bugs but ensure your code behaves as required. Test files have to import the functions or objects they test. In these cases, we typically don't want the script being run as the main module.
  • You're creating a library but would like to include a demo or other special run-time cases for users. By using this if-statement, the Python modules that use your code as a library are unaffected.
Creating a __main__.py File for Modules

The point of having the if __name__ == "__main__" block is to get the piece of code under the condition to get executed when the script is in the __main__ scope. While creating packages in Python, however, it's better if the code to be executed under the __main__ context is written in a separate file.

Let's consider the following example - a package for performing calculations. The file tree structure for such a scenario can be visualized as:

calc                # --> Root directory
├── __main__.py
├── script1.py
├── script2.py
├── script3.py
├── script4.py
└── src             # --> Sub-directory
    ├── add.py
    └── sub.py

The tree structure contains calc as the root directory and a sub-directory known as src. The __main__.py under the calc directory contains the following content:

from src.add import add
from src.sub import sub

a, b = input("Enter two numbers separated by commas: ").split(',')
a, b = int(a), int(b)
print(f"The sum is: {add(a, b)}")
print(f"The difference is: {sub(a, b)}")

The add.py contains:

def add(a, b):
    return a+b

And sub.py contains:

def sub(a, b):
    return a-b

From right outside the calc directory, the script can be executed and the logic inside the __main__.py gets executed by invoking:

python3 calc

This structure also gives a cleaner look to the workspace: the directories are clearly organized, and the entry point is defined inside a separate file called __main__.py.

Conclusion

The if __name__ == "__main__" check runs blocks of code only when our Python script is being executed directly by a user. This is powerful, as it allows our code to have different behavior when it's being executed as a program instead of being imported as a module.

When writing large modules, we can opt for the more structured approach of having a __main__.py file to run a module. For a stand-alone script, including the if __name__ == "__main__" is a simpler method to separate the API from the program.

Categories: FLOSS Project Planets

1xINTERNET blog: Three nominations for the German and Austrian Splash Awards

Planet Drupal - Mon, 2020-10-26 08:00
This year 1xINTERNET has three nominations for the German and Austrian Splash Awards taking place next Thursday, 29th of October. The event will be virtual this year and is free for everyone.
Categories: FLOSS Project Planets

Malayalam fonts: Beyond Latin font metrics

Planet KDE - Mon, 2020-10-26 06:21

This year's annual international conference organized by the TeX Users Group — TUG2020 — was held completely online due to the raging pandemic. At TUG2020, I presented a talk on some important Malayalam typeface design factors and considerations.

The idea behind the talk and its articulation originated with K.H. Hussain, designer of well-known fonts such as Rachana, Meera, Meera Inimai, TNJoy etc. In a number of discussions that ensued, this idea was developed and later presented at TUG2020.

The opening keynote of TUG2020 was delivered by Steve Matteson, about the design of the Noto fonts. He mentioned that Noto was originally envisaged to be developed as a single font containing all Unicode scripts; but that was changed due to a couple of reasons: (1) the huge size of the resulting font, and (2) the design of many South/South-East Asian characters does not fit well within its Latin font metrics.

This second point set up the stage nicely for my talk, in which we argued that a paradigm shift from established Latin font metrics is necessary in designing and choosing font metrics for Indic scripts, in particular with Malayalam as a case study.

Indic scripts have abundant conjunct characters (basic characters combined to form a distinct shape). The same characters may join ‘horizontally’ (e.g. ത്സ/thsa) or ‘vertically/stacked’ (e.g. സ്ത/stha); and the Malayalam script in particular has plenty of stacked conjuncts, even in contrast with other Indic scripts. This peculiarity also makes the glyph design of fonts challenging — to balance aesthetics, legibility/readability and leading/line spacing. Specifically, following the usual x-height/cap-height/ascender/descender metrics used in Latin fonts puts a lot of constraints on the design of stacked conjuncts. We propose to break away from this conventional metrics and adopt different proportions for the above- and below-base glyphs (even if they are the same characters, e.g. സ in the double conjunct സ്സ), still conforming to the aesthetics of the script yet managing the legibility and leading.

Fig. 1: Malayalam stacked conjuncts beyond conventional Latin font metrics.

Details of this study, argument and proposal can be found in the slides of the presentation available at the program details as well as the recorded talk now available on TUG YouTube channel.

TUG2020 presentation.

The conference paper, edited by Barbara Beeton and Karl Berry, will be published in the next issue of the TUGboat journal.

Categories: FLOSS Project Planets

Graphics in Qt 6.0: QRhi, Qt Quick, Qt Quick 3D

Planet KDE - Mon, 2020-10-26 05:00

Last year we had a three part blog series about Qt's new approach to working with 3D graphics APIs and shading languages: part 1, part 2, part 3. For Qt Quick, an early, opt-in preview of the new rendering architecture was shipped in Qt 5.14, with some improvements in Qt 5.15. With the release of Qt 6.0 upcoming, let's see what has happened since Qt 5.15. It will not be possible to cover every detail of the graphics stack improvements for Qt Quick here, let alone dive into the vast amount of Qt Quick 3D features, many of which are new or improved in Qt 6.0. Rather, the aim is just to give an overview of what can be expected from the graphics stack perspective when Qt 6.0 ships later this year.

Note that the documentation links refer to the Qt 6 snapshot documentation. This allows seeing the latest C++ and QML API pages, including all changed and new functions, but the content is also not final. These links may also break later on.

Categories: FLOSS Project Planets

Reproducible Builds: Second Reproducible Builds IRC meeting

Planet Debian - Mon, 2020-10-26 04:32

After the success of our previous IRC meeting, we are having our second IRC meeting today, Monday 26th October, at 18:00 UTC:

  • 11:00am San Francisco
  • 2:00pm New York
  • 6:00pm London
  • 7:00pm Paris/Berlin
  • 11:30pm Delhi
  • 2:00am Beijing (+1 day)

Please join us on the #reproducible-builds channel on irc.oftc.net — an agenda is available. As mentioned in our previous meeting announcement, due to the unprecedented events in 2020, there will be no in-person Reproducible Builds event this year, but we plan to run these IRC meetings every fortnight.

Categories: FLOSS Project Planets

Kushal Das: Running SecureDrop inside of podman containers on Fedora 33

Planet Python - Mon, 2020-10-26 01:46

Last week, while setting up a Fedora 33 system, I thought of running the SecureDrop development container there, but using podman instead of the Docker setup we have.

I tried to make minimal changes to our existing scripts. Added a ~/bin/docker file, with podman $@ inside (and the sha-bang line).

Next, I provided the proper label for SELinux:

sudo chcon -Rt container_file_t securedrop

The SecureDrop container runs as the normal user inside of the Docker container. I can not do the same here, as the filesystem gets mounted as root and I can not write to it. So, I had to modify one line in the bash script, and also disabled another function call which deletes the /dev/random file inside of the container.

diff --git a/securedrop/bin/dev-shell b/securedrop/bin/dev-shell
index ef424bc01..37215b551 100755
--- a/securedrop/bin/dev-shell
+++ b/securedrop/bin/dev-shell
@@ -72,7 +72,7 @@ function docker_run() {
         -e LANG=C.UTF-8 \
         -e PAGE_LAYOUT_LOCALES \
         -e PATH \
-        --user "${USER:-root}" \
+        --user root \
         --volume "${TOPLEVEL}:${TOPLEVEL}" \
         --workdir "${TOPLEVEL}/securedrop" \
         --name "${SD_CONTAINER}" \
diff --git a/securedrop/bin/run b/securedrop/bin/run
index e82cc6320..0c11aa8db 100755
--- a/securedrop/bin/run
+++ b/securedrop/bin/run
@@ -9,7 +9,7 @@ cd "${REPOROOT}/securedrop"
 source "${BASH_SOURCE%/*}/dev-deps"
 
 run_redis &
-urandom
+#urandom
 run_sass --watch &
 maybe_create_config_py
 reset_demo

This time I felt that build time for verifying each cached layer is much longer than what it used to be for podman. Maybe I am just mistaken. The SecureDrop web application is working very fine inside.

Package build containers

We also use containers to build Debian packages. And those molecule scenarios were failing, as the ansible.posix.synchronize module could not sync to a podman container. I asked if there is any way to do that, and by the time I woke up, Adam Miller had a branch that fixed the issue. I directly used the same in my virtual environment. The package build was successful. Then, the testinfra tests failed as they could not create the temporary directory inside of the container. I already opened an issue for the same.

Categories: FLOSS Project Planets

Mike Driscoll: PyDev of the Week: William Horton

Planet Python - Mon, 2020-10-26 01:05

This week we welcome William Horton (@hortonhearsafoo) as our PyDev of the Week! William is a Senior Software Engineer at Compass and has spoken at several local Python conferences. He is a contributor to PyTorch and fastai.

Let’s spend some time getting to know William better!

Can you tell us a little about yourself (hobbies, education, etc):

A little about myself: people might be surprised about my educational background–I didn’t study computer science. I have a bachelors in the social sciences. So by the time I finished undergrad, the most programming I had done was probably doing regressions in Stata to finish my thesis. I decided against grad school, and instead signed up for a coding bootcamp (App Academy) in NYC. The day I’m writing this, September 28, is actually 5 years to the day that I started at App Academy.

Since then I’ve worked at a few different startups in NYC, across various industries: first investment banking, then online pharmacy, and now real estate. I’m currently a senior engineer on the AI Services team at Compass, working on machine learning solutions for our real estate agents and consumers.

I like to spend my free time on a few different hobbies. I’m a competitive powerlifter, so I like to get into the gym a few times a week (although with the pandemic in NYC I didn’t lift for six months or so). I’ve actually found powerlifting to be a pretty common hobby among software engineers. Every time someone new joined my gym, it seemed like they came from a different startup. I love to play basketball. And I’m passionate about music: I’ve been a singer almost my whole life, and most recently was performing with an a cappella group in NYC. And in the last year I’ve picked up the guitar, after not touching it since I was a teenager, and that has been very fulfilling.

Why did you start using Python?

I definitely didn’t start out down the road to Python development–my coding bootcamp was focused on Ruby and Rails, and they also taught us JavaScript and React. I got my first job mostly because I knew React, and at the time it was pretty new. But the company I joined also had a fairly large data processing component written in Python, and there were only a few engineers, so eventually I was pitching in on that part as well. By the time I was looking for my second job, I knew I wanted to do more Python, so I found a full-stack role that was a React frontend and a Python backend (in Flask).

But I think the real turning point for me was when I discovered the fast.ai course in the fall of 2017. I had taken a few machine learning courses online, including the Andrew Ng Coursera course, and it was a topic that I found interesting. But the fast.ai course just really sucked me in–the way that Jeremy Howard presented the material just gripped me in a certain way, and made me want to find out more. I loved his pitch: if you know some Python, and you have high school level math, you can get hands-on with machine learning, and start to grow your skills.

So by the time I was looking for a job in 2018, I knew I wanted to do something closer to data and machine learning. I joined Compass for a backend data role, on a growing team that was handling all of the real estate listing data we had coming in from different sources. That gave me the chance to learn some important tools: I set up the first Airflow instance at Compass, and worked on our PySpark code. And then when the machine learning team started up, I was able to contribute to the first project, and eventually join the team full-time.

What other programming languages do you know and which is your favorite?

I know Ruby from my coding bootcamp, JavaScript from my previous two jobs, and I’ve done a small amount of programming in Go as well. Out of those I’d probably say JavaScript is my favorite.

What projects are you working on now?

My main project right now is working on Likely to Sell Recommendations at Compass. We use historical data to learn a model of which properties are likely to sell, and then connect that with the addresses that agents have put into their contacts lists in the Compass CRM. It’s a Python-powered project all the way through: the model is scikit-learn, we use PySpark for processing the data, and the API is a Python GRPC service. We have a blog post on the Compass Medium page that has more information if people are interested in learning more.

The other project I’m really excited about is our Machine Learning Pipelines project, which we’re building on top of the open-source platform Kubeflow. It’s a way to define and run machine learning workflows on top of Kubernetes, which allows you to get some big benefits in terms of leveraging distributed computing, parallelization, and resource management. We’re already using it for the Likely to Sell project I mentioned above, and it’s allowed us to iterate and experiment more quickly. I had the chance to present a poster about Kubeflow Pipelines at SciPy 2020, and I also have a (virtual) talk on the topic at SciPy Japan (Oct. 30-Nov. 2)

Which Python libraries are your favorite (core or 3rd party)?

It’s hard to pick, there are a lot of great libraries out there! But to name a few: for the work I do professionally, I think that Jupyter Notebook, pandas, and scikit-learn are just essential. Really great libraries that have been around a while, and have stood the test of time. And I also have to shout out pytorch and fastai for fostering my interest in deep learning and machine learning in general, which is what started me down the road to my current role.

How did you get into giving talks at Python conferences?

I would say a few things contributed to it: my own curiosity, the support of the community, and also, admittedly, just luck. It all started because I signed up for a meetup that was a PyGotham talk brainstorm session hosted at Dropbox NYC. They took us through some exercises, and we all shared some ideas, and I took my best one and submitted to the PyGotham 2018 CFP. But I got rejected.

However, the PyOhio CFP was around the same time, and I saw a Tweet that was encouraging people to submit to that one too, so I sent the same proposal to PyOhio. And I got in! I was pretty excited to make the trip, but also very nervous to do my talk. PyOhio did offer speaker coaching to first-time speakers, so I’m thankful to them for that. I ended up having a great time giving the talk, and enjoyed the chance to meet some people and see some of the other talks. So I decided I wanted to do it again.

Setting up at PyColorado

And then came…more rejections. I think I sent that same talk to two more conferences at the end of 2018 and got rejected. I decided to come up with some fresh material for 2019, and I’d say PyTexas 2019 is when I really hit my stride. I gave a talk that I was really proud of “CUDA in your Python”, but I also started meeting more people in the community, and that really contributed a lot to my conference experience.

Do you have any tips for people who would like to give technical talks?

The first thing I’d say is: put yourself out there. I’m a perfectionist by nature, so it’s really hard for me to actually hit the submit button on a CFP (even now, when I’ve had talks accepted). But at the end of the day, some reviewers are going to like your proposal, and some aren’t, so if you want to give the talk, you just have to play the numbers game, submit to a few places, and hope for the best.

The other thing I’d stress is that you don’t have to be the world’s expert on something to give a talk about it. It can be intimidating starting out when you see speakers who are the authors of libraries, or who have ten years more experience than you, or who work at a big-name company. But I would tell people starting out: all you have to do is create a 25-minute experience where people enjoy the presentation and learn something from it that they didn’t know before. A lot of people coming to conferences, especially the regional Python conferences, are early on in their learning process, so there’s a lot of value in just presenting your own take on some intro-level material.

Thanks for doing the interview, William!

The post PyDev of the Week: William Horton appeared first on The Mouse Vs. The Python.

Categories: FLOSS Project Planets

PreviousNext: Join us at the DrupalGov 2020 Code Sprint

Planet Drupal - Mon, 2020-10-26 00:28

This year DrupalGov is virtual. The PreviousNext team is sponsoring and helping to run the DrupalGov 2020 Sprint Day on Wednesday 4 November, and there are a few things you can do now to hit the ground running on the day.

by kim.pepper / 26 October 2020

This year the DrupalGov sprint will be virtual

We’ll start the day with a brief Zoom meeting to introduce the organisers, and outline how the day will run.

We’ll use #australia-nz Drupal Slack as the main communication channel, with ad hoc Zoom or Meet video calls for those who want to dive deeper into a topic.

For the majority of the day, we’ll be using Slack threads to keep track of sprint topics and reduce the noise in the main channel.

Join us on Slack

If you haven’t already done so, now is a great time to sign up and join the Australian / New Zealand Drupal community in Slack. Instructions for how to join are here: https://www.drupal.org/slack

Let us know about your experience

Please fill in the following survey to let us know about your experience with Drupal, and the areas you’re interested in Sprinting on. This will help us better prepare for the day.

https://www.surveymonkey.com/r/2DPWDPL

How to contribute

Sprint day is not just for developers! Contribution comes in many forms. If you’re interested in the different ways you can contribute to this amazing project, see the list of contributor tasks: https://www.drupal.org/contributor-tasks

Tagging issues to work on

If you want to see what might be an interesting issue to work on, head over to the Drupal.org Issue Queue and look for issues tagged with 'DrupalGov 2020'. These are issues that others have tagged.

You can also tag an issue yourself to be added to the list.

Set Up a Development Environment

There is more than one way to shear a sheep, and there is also more than one way to set up a local development environment for working on Drupal.

If you don't already have a local development environment setup, we recommend using Docker Compose for local development - follow the instructions for installing Docker Compose on OSX, Windows and Linux.

Once you've setup Docker compose, you need to setup a folder containing your docker-compose.yml and a clone of Drupal core. The instructions for that vary depending on your operating system, we have instructions below for OSX, Windows and Linux, although please note the Windows version is untested.

Mac OSX

mkdir -p ~/dev/drupal
cd ~/dev/drupal
wget https://gist.githubusercontent.com/larowlan/9ba2c569fd52e8ac12aee962cc9319c9/raw/e69795e7219c9c73eb8d8d171c31277eeb5bcbaa/docker-compose.yml
git clone --branch 8.9.x https://git.drupalcode.org/project/drupal.git app
docker-compose up -d
docker-compose run -w /data/app app composer install

Windows

git clone --branch 8.9.x https://git.drupalcode.org/project/drupal.git app
docker-compose up -d
docker-compose run -w /data/app app composer install

Linux

mkdir -p ~/dev/drupal # or wherever you want to put the folder
cd ~/dev/drupal
wget https://gist.githubusercontent.com/larowlan/63a0f6efacee71b483af3a2184178dd0/raw/248dff13557efa533c0ca297d39c87cd3eb348fe/docker-compose.yml
git clone --branch 8.9.x https://git.drupalcode.org/project/drupal.git app
docker-compose up -d
docker-compose exec app /bin/bash -c "cd /data/app && composer install"

If you have any issues, join us on Drupal slack in the #australia-nz channel beforehand and we'll be happy to answer any questions you might have.

Install dreditor browser extension

Dreditor is a browser extension that makes it easier to review patches on Drupal.org. It's a must for anyone contributing to Drupal.

There are versions for Firefox and Chrome.

Find Issues to Work On

If you want to see what might be an interesting issue to work on, head over to the Drupal.org Issue Queue and look for issues tagged with 'DrupalGov 2020'. These are issues that others have tagged.

You can also tag an issue yourself to be added to the list.

Being face-to-face with fellow contributors is a great opportunity to have discussions and put forward ideas. Don't feel like you need to come away from the day having completed lines and lines of code.

Code of conduct

To provide a safe and inclusive environment, the sprint day will abide by the DrupalSouth Code of Conduct: https://drupalsouth.org/code-of-conduct

We look forward to seeing you all there!

Tagged Code Sprint, DrupalSouth, DrupalGov 2020
Categories: FLOSS Project Planets

Marco d'Itri: RPKI validation with FORT Validator

Planet Debian - Sun, 2020-10-25 20:25

This article documents how to install FORT Validator (an RPKI relying party software which also implements the RPKI to Router protocol in a single daemon) on Debian 10 to provide RPKI validation to routers. If you are using testing or unstable then you can just skip the part about apt pinnings.

The packages in bullseye (Debian testing) can be installed as is on Debian stable with no need to rebuild them, by configuring an appropriate pinning for apt:

cat <<END > /etc/apt/sources.list.d/bullseye.list
deb http://deb.debian.org/debian/ bullseye main
END

cat <<END > /etc/apt/preferences.d/pin-rpki
# by default do not install anything from bullseye
Package: *
Pin: release bullseye
Pin-Priority: 100

Package: fort-validator rpki-trust-anchors
Pin: release bullseye
Pin-Priority: 990
END

apt update

Before starting, make sure that curl (or wget) and the web PKI certificates are installed:

apt install curl ca-certificates

If you already know about the legal issues related to the ARIN TAL then you may instruct the package to automatically install it. If you skip this step then you will be asked at installation time about it, either way is fine.

echo 'rpki-trust-anchors rpki-trust-anchors/get_arin_tal boolean true' \
  | debconf-set-selections

Install the package as usual:

apt install fort-validator

You may also install rpki-client and gortr on Debian 10, or maybe cfrpki and gortr. I have also tried packaging Routinator 3000 for Debian, but this effort is currently on hold because the Rust ecosystem is broken and hostile to the good packaging practices of Linux distributions.

Categories: FLOSS Project Planets

Marco d'Itri: RPKI validation with OpenBSD's rpki-client and Cloudflare's gortr

Planet Debian - Sun, 2020-10-25 20:22

This article documents how to install rpki-client (an RPKI relying party software, the actual validator) and gortr (which implements the RPKI to Router protocol) on Debian 10 to provide RPKI validation to routers. If you are using testing or unstable then you can just skip the part about apt pinnings.

The packages in bullseye (Debian testing) can be installed as is on Debian stable with no need to rebuild them, by configuring an appropriate pinning for apt:

cat <<END > /etc/apt/sources.list.d/bullseye.list
deb http://deb.debian.org/debian/ bullseye main
END

cat <<END > /etc/apt/preferences.d/pin-rpki
# by default do not install anything from bullseye
Package: *
Pin: release bullseye
Pin-Priority: 100

Package: gortr rpki-client rpki-trust-anchors
Pin: release bullseye
Pin-Priority: 990
END

apt update

Before starting, make sure that curl (or wget) and the web PKI certificates are installed:

apt install curl ca-certificates

If you already know about the legal issues related to the ARIN TAL then you may instruct the package to automatically install it. If you skip this step then you will be asked at installation time about it, either way is fine.

echo 'rpki-trust-anchors rpki-trust-anchors/get_arin_tal boolean true' \
  | debconf-set-selections

Install the packages as usual:

apt install rpki-client gortr

And then configure rpki-client to generate its output in the JSON format needed by gortr:

echo 'OPTIONS=-j' > /etc/default/rpki-client

You may manually start the service unit to immediately generate the data instead of waiting for the next timer run:

systemctl start rpki-client &

gortr too needs to be configured to use the JSON data generated by rpki-client:

echo 'GORTR_ARGS=-bind :323 -verify=false -checktime=false -cache /var/lib/rpki-client/json' > /etc/default/gortr

And then it needs to be restarted to use the new configuration:

systemctl restart gortr

You may also install FORT Validator on Debian 10, or maybe cfrpki with gortr. I have also tried packaging Routinator 3000 for Debian, but this effort is currently on hold because the Rust ecosystem is broken and hostile to the packaging practices of Linux distributions.

Categories: FLOSS Project Planets

Zero-with-Dot (Oleg Żero): Multi-Layer Perceptron & Backpropagation - Implemented from scratch

Planet Python - Sun, 2020-10-25 19:00
Introduction

Writing a custom implementation of a popular algorithm can be compared to playing a musical standard. For as long as the code reflects upon the equations, the functionality remains unchanged. It is, indeed, just like playing from notes. However, it lets you master your tools and practice your ability to hear and think.

In this post, we are going to re-play the classic Multi-Layer Perceptron. Most importantly, we will play the solo called backpropagation, which is, indeed, one of the machine-learning standards.

As usual, we are going to show how the math translates into code. In other words, we will take the notes (equations) and play them using bare-bone numpy.

FYI: Feel free to check another “implemented from scratch” article on Hidden Markov Models here.

Overture - A Dense Layer

Data

Let our (most generic) data be described as pairs of question-answer examples (x_i, y_i), where X is a matrix of feature vectors, y is a known matrix of labels and i refers to an index of a particular data example. Here, by n we understand the number of features and m is the number of examples, so X is of the shape (m, n). Also, we assume that the labels take values y ∈ {0, 1}, thus posing a binary classification problem for us. Here, it is important to mention that the approach won't be much different if y was a multi-class or a continuous variable (regression).

To generate the mock for the data, we use the sklearn’s make_classification function.

import numpy as np
from sklearn.datasets import make_classification

np.random.seed(42)
X, y = make_classification(n_samples=10, n_features=4,
                           n_classes=2, n_clusters_per_class=1)
y_true = y.reshape(-1, 1)

Note that we do not split the data into the training and test datasets, as our goal would be to construct the network. Therefore, if the model overfits it would be a perfect sign!

At this stage, we adopt the convention that axis=0 shall refer to the examples (of which there are m), while axis=1 will be reserved for the features (of which there are n). Naturally, for binary classification y_true has the shape (m, 1).

A single perceptron

Figure 1. shows the concept of a single perceptron for the sake of showing the notation.

Figure 1. A single perceptron.

The coefficients w_ij are known as weights and we can group them in a matrix W. The indices denote that we map an input feature j to an output feature i. We also introduce a bias term b_i, and therefore we can calculate z using a single vector-matrix multiplication operation, starting from:

z_i = Σ_j w_ij · x_j + b_i,

that can be written in a compact form:

z = W · x + b

This way we also don't discriminate between the weight terms w_ij and bias terms b_i. Everything becomes W, once a constant 1 is prepended to the input vector (which is what the code below does with np.hstack).

Next, the output z of the inner product is fed to a non-linear function σ(z), known as the activation function.

The strength of neural networks lies in the “daisy-chaining” of layers of these perceptrons. Therefore, we can describe the whole network with a non-linear transformation that uses these two equations combined. Recursively, we can write that for each layer l

a^(l) = σ(W^(l) · a^(l-1)),

with a^(0) = x being the input and a^(L) = ŷ being the output.

Figure 2. shows an example architecture of a multi-layer perceptron.

Figure 2. A multi-layer perceptron, where `L = 3`. In the case of a regression problem, the output would not be applied to an activation function. Apart from that, note that every activation function needs to be non-linear. Otherwise, the whole network would collapse to linear transformation itself thus failing to serve its purpose.

The code

Having the equations written down, let’s wrap them up with some code.

First of all, as the layer formula is recursive, it makes total sense to discriminate a layer to be an entity. Therefore, we make it a class:

class DenseLayer:
    def __init__(self, n_units, input_size=None, name=None):
        self.n_units = n_units
        self.input_size = input_size
        self.W = None
        self.name = name

    def __repr__(self):
        return f"Dense['{self.name}'] in:{self.input_size} + 1, out:{self.n_units}"

    def init_weights(self):
        self.W = np.random.randn(self.n_units, self.input_size + 1)

    @property
    def shape(self):
        return self.W.shape

    def __call__(self, X):
        m_examples = X.shape[0]
        X_extended = np.hstack([np.ones((m_examples, 1)), X])
        Z = X_extended @ self.W.T
        A = 1 / (1 + np.exp(-Z))
        return A

OK. Let’s break it down…

First of all, all we need to know about a layer is the number of units and the input size. However, as the input size will be dictated by either the data matrix or the size of the preceding layer, we will leave this parameter as optional. This is also the reason, why we leave the weights’ initialization step aside.

Secondly, both the __repr__ method as well as the self.name attribute serves no other purpose but to help us to debug this thing. Similarly, the shape property is nothing, but a utility.

The real magic happens within the __call__ method that implements the very math we discussed so far. To have the calculations vectorized, we prepare the layer to accept whole arrays of data. Naturally, we associate the example count with the 0th axis, and the features’ count with the 1st axis. Once the layer accepts it, it extends the array with a whole column of 1’s to account for the bias term b. Next, we perform the inner product, ensuring that the output dimension will match (m_examples, n_units). Finally, we apply a particular type of a non-linear transformation known as the sigmoid (for now): σ(z) = 1 / (1 + e^(-z)).

Forward propagation

Our “dense” layer (a layer perceptrons) is now defined. It is time to combine several of these layers into a model.

class SequentialModel:
    def __init__(self, layers, lr=0.01):
        self.lr = lr
        input_size = layers[0].n_units
        layers[0].init_weights()

        for layer in layers[1:]:
            layer.input_size = input_size
            input_size = layer.n_units
            layer.init_weights()

        self.layers = layers

    def __repr__(self):
        return f"SequentialModel n_layer: {len(self.layers)}"

    def forward(self, X):
        out = self.layers[0](X)
        for layer in self.layers[1:]:
            out = layer(out)
        return out

    @staticmethod
    def cost(y_pred, y_true):
        cost = -y_true * np.log(y_pred) \
            - (1 - y_true) * np.log(1 - y_pred)
        return cost.mean()

The whole constructor of this class is all about making sure that all layers are initialized and “size-compatible”. The real computations happen in the .forward() method and the only reason for the method to be called this way (not __call__) is so that we can create twin method .backward once we move on to discussing the backpropagation.

Next, the .cost method implements the so-called binary cross-entropy equation, which fits our particular case:

C = -\frac{1}{m} \sum_{k=1}^{m} \left[ y_k \log \hat{y}_k + (1 - y_k) \log(1 - \hat{y}_k) \right]

If we went down with a multi-class classification or regression, this equation would have to be replaced.
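As a quick sanity check of the cost formula itself, the same expression can be evaluated by hand on made-up toy vectors (again assuming numpy is imported as np); the values below are purely illustrative:

y_true_demo = np.array([1.0, 0.0, 1.0])
y_pred_demo = np.array([0.9, 0.2, 0.7])
bce = (-y_true_demo * np.log(y_pred_demo)
       - (1 - y_true_demo) * np.log(1 - y_pred_demo)).mean()
print(round(bce, 4))  # ~0.2284, small because the predictions are close to the targets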

Finally, the __repr__ method exists only for the sake of debugging.

Instantiation

These two classes are enough to make the first predictions!

np.random.seed(42)

model = SequentialModel([
    DenseLayer(6, input_size=X.shape[1], name='input'),
    DenseLayer(4, name='1st hidden'),
    DenseLayer(3, name='2nd hidden'),
    DenseLayer(1, name='output')
])

y_pred = model.forward(X)
model.cost(y_pred, y_true)
>>> 0.8305111958397594

That’s it! If everything is wired up correctly, we should see a cost value like the one above.

The only problem is, of course, that the result is completely random. After all, the network needs to be trained. Therefore, to make it trainable will be our next goal.

Back-propagation

There exist multiple ways to train a neural net. One of them is to use the so-called normal equation, which solves for the optimal weights directly in closed form.

Another option is to use an optimization algorithm such as gradient descent, an iterative process that updates the weights in such a way that the cost function associated with the problem is progressively minimized:

W \leftarrow W - \alpha \nabla_W C

To get to the point where we can find the optimal weights, we need to be able to calculate the gradient of C. Only then can we expect to alter W in a way that leads to an improvement. Consequently, we need to incorporate this logic into our model.
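In numpy terms, a single gradient-descent step is nothing more than the following sketch; W and dW here are random stand-ins, purely for illustration:

lr = 0.01
W = np.random.randn(3, 4)
dW = np.random.randn(3, 4)   # stand-in for the true gradient of the cost with respect to W
W -= lr * dW                 # step against the gradient; with the true gradient this lowers the cost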

Let’s take a look at the equations, first.

Dependencies and the chain-rule

Thinking of the cost C as a function of the predictions \hat{y}, we know it has implicit dependencies on every weight w_{ij}^{(l)}. To get the gradient, we need to resolve all the derivatives of C with respect to every possible weight. Since the dependency is dictated by the architecture of the network, we can resolve these dependencies by applying the so-called chain rule. In other words, to get the derivatives \partial C / \partial w_{ij}^{(l)}, we need to trace back “what depends on what”.

Here is how it is done.

Let’s start with calculating the derivative of the cost function with respect to some weight w_{ij}^{(l)}. To make the approach generic, irrespective of whether our problem is a classification or a regression problem, we can estimate the fractional error that the network commits at every layer l as a squared difference between the activation values a^{(l)} and the respective local target values t^{(l)}, which is what the network would achieve at those nodes if it were perfect. Therefore,

E^{(l)} = \frac{1}{2} \sum_i \bigl( a_i^{(l)} - t_i^{(l)} \bigr)^2

Now, to find a given weight’s contribution to the overall error at a given layer l, we calculate \partial E^{(l)} / \partial w_{ij}^{(l)}. Knowing that, when going “forward”, E^{(l)} depends on w_{ij}^{(l)} through the activation function f, whose argument z_i^{(l)} in turn depends on contributions from the previous layer, we apply the chain rule:

\frac{\partial E^{(l)}}{\partial w_{ij}^{(l)}} = \frac{\partial E^{(l)}}{\partial a_i^{(l)}} \cdot \frac{\partial a_i^{(l)}}{\partial z_i^{(l)}} \cdot \frac{\partial z_i^{(l)}}{\partial w_{ij}^{(l)}}

Partial error function derivative

Now, let’s break it down. The first term gives the following contribution:

\frac{\partial E^{(l)}}{\partial a_i^{(l)}} = a_i^{(l)} - t_i^{(l)}

Activation function derivative

Next, the middle term depends on the actual choice of the activation function. For the sigmoid function, we have:

\frac{\partial a_i^{(l)}}{\partial z_i^{(l)}} = \sigma\bigl(z_i^{(l)}\bigr)\Bigl(1 - \sigma\bigl(z_i^{(l)}\bigr)\Bigr)

which we can rewrite as a_i^{(l)} \bigl(1 - a_i^{(l)}\bigr).

Similarly, for other popular activation functions, we have:

  • tanh: \partial a / \partial z = 1 - \tanh^2(z) = 1 - a^2,
  • ReLU: \partial a / \partial z = 1 for z > 0 and 0 otherwise.

Note that although the ReLU function is not differentiable at z = 0, we can still use this “quasi-derivative”, since we will only ever be interested in its value at a given point.
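If you want to convince yourself that these derivatives are right, a quick finite-difference check does the job for the sigmoid (a throwaway snippet, assuming numpy is imported as np):

z = np.linspace(-2.0, 2.0, 5)
sig = 1 / (1 + np.exp(-z))
eps = 1e-6
numeric = (1 / (1 + np.exp(-(z + eps))) - sig) / eps   # (f(z + eps) - f(z)) / eps
analytic = sig * (1 - sig)                             # the closed-form a * (1 - a)
print(np.allclose(numeric, analytic, atol=1e-5))       # True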

Previous layer

Finally, the dependency on the previous layer. We know that z_i^{(l)} = \sum_j w_{ij}^{(l)} a_j^{(l-1)} (with a_0^{(l-1)} = 1 accounting for the bias). Hence, the derivative is simply:

\frac{\partial z_i^{(l)}}{\partial w_{ij}^{(l)}} = a_j^{(l-1)}

Unless we consider the first layer, in which case a^{(0)} = x (the input data itself), we can expect a further dependency of this activation on other weights, and the chain rule continues. However, at the level of a single layer, we are only interested in the contribution of each of its nodes to the global cost function, which is the optimization target.

Collecting the results together, the contribution to each layer becomes:

\frac{\partial E^{(l)}}{\partial w_{ij}^{(l)}} = \delta_i^{(l)} \, a_j^{(l-1)}, \qquad \text{with} \quad \delta_i^{(l)} = \bigl( a_i^{(l)} - t_i^{(l)} \bigr) \, f'\bigl(z_i^{(l)}\bigr)

where the \delta’s can be thought of as indicators of how far we deviate from the optimal performance.

This quantity we need to pass on “backward” through the layers, starting from the last layer L, where (for our sigmoid output and binary cross-entropy cost):

\delta^{(L)} = \hat{y} - y

Then, going recursively:

\delta^{(l)} = \Bigl( \bigl(W^{(l+1)}\bigr)^{\top} \delta^{(l+1)} \Bigr) \odot f'\bigl(z^{(l)}\bigr)

which, component-wise, may be expressed as \delta_i^{(l)} = \sum_k w_{ki}^{(l+1)} \delta_k^{(l+1)} f'\bigl(z_i^{(l)}\bigr), until we reach the input layer, where a^{(0)} = x, as there is no “shift” between the data and the data itself.
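Before wiring this into the layers, a quick shape check with made-up toy arrays (sizes chosen arbitrarily) shows what a single recursive step looks like under our 0th-axis-for-examples convention:

m, n_l, n_next = 4, 3, 2                   # 4 examples; layer l has 3 units, layer l+1 has 2
delta_next = np.random.randn(m, n_next)    # delta arriving from layer l+1
W_next = np.random.randn(n_next, n_l + 1)  # weights of layer l+1 (+1 column for the bias)
a = np.random.rand(m, n_l)                 # cached sigmoid activations of layer l
delta = (delta_next @ W_next)[:, 1:] * (a * (1 - a))   # drop the bias column, apply f'(z)
print(delta.shape)                         # (4, 3): one delta per unit of layer l, per example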

Implementation of the backpropagation

This is all we need! Looking carefully at the equations above, we can note three things:

  • It provides us with an exact recipe for defining how much we need to alter each weight in the network.
  • It is recursive (just defined “backward”), hence we can re-use our “layered” approach to compute it.
  • It requires the activations a^{(l)} at every layer. However, since these quantities are calculated when propagating forward, we can cache them.

In addition to that, since we know how to handle different activation functions, let us also incorporate them into our model.

Different activation functions

Probably the cleanest way to account for the different activation functions would be to organize them as a separate library. However, we are not developing a framework here. Therefore, we will limit ourselves to updating the DenseLayer class:

class DenseLayer:
    def __init__(self, n_units, input_size=None, activation='sigmoid', name=None):
        self.n_units = n_units
        self.input_size = input_size
        self.W = None
        self.name = name
        self.A = None  # here we will cache the activation values
        self.fn, self.df = self._select_activation_fn(activation)

    def _select_activation_fn(self, activation):
        if activation == 'sigmoid':
            fn = lambda x: 1 / (1 + np.exp(-x))
            df = lambda x: x * (1 - x)
        elif activation == 'tanh':
            fn = lambda x: np.tanh(x)
            df = lambda x: 1 - x ** 2
        elif activation == 'relu':
            fn = lambda x: np.where(x < 0.0, 0.0, x)
            df = lambda x: np.where(x < 0.0, 0.0, 1.0)
        else:
            raise NotImplementedError(f"Unsupported activation: {activation}")
        return fn, df

    def __call__(self, X):
        ...  # bias extension and inner product, as before, producing Z
        A = self.fn(Z)
        self.A = A
        return A

With caching and different activation functions being supported, we can move on to defining the .backprop method.

Passing on delta

Again, let’s update the class:

class DenseLayer:
    ...  # as before

    def backprop(self, delta, a):
        da = self.df(a)  # the derivative of the activation fn
        return (delta @ self.W)[:, 1:] * da

Here is where \delta is being passed down through the layers. Observe that we “trim” the resulting matrix by dropping its first column, which relates to the bias terms.

Updating weights

The updating of the weights W needs to happen at the level of the model itself. Having the right behavior at the level of a layer, we are ready to add this functionality to the SequentialModel class.

class SequentialModel:
    ...  # as before

    def _extend(self, vec):
        return np.hstack([np.ones((vec.shape[0], 1)), vec])

    def backward(self, X, y_pred, y_true):
        n_layers = len(self.layers)
        delta = y_pred - y_true
        a = y_pred
        dWs = {}

        for i in range(-1, -len(self.layers), -1):
            a = self.layers[i - 1].A
            dWs[i] = delta.T @ self._extend(a)
            delta = self.layers[i].backprop(delta, a)

        dWs[-n_layers] = delta.T @ self._extend(X)

        for k, dW in dWs.items():
            self.layers[k].W -= self.lr * dW

The loop index runs backward across the layers, getting each layer to compute its delta and feeding it to the next (that is, previous) one. The matrices of derivatives (dW) are collected and used to update the weights at the end. Again, the ._extend() method is there purely for convenience.

Finally, note the differences in shapes between the formulae we derived and their actual implementation. This is, again, a consequence of adopting the convention where the 0th axis is used for the example count.

Time for training!

We have finally arrived at the point where we can take advantage of the methods we created and begin to train the network. All we need to do is to instantiate the model and call its .forward() and .backward() methods alternately until we reach convergence.

np.random.seed(42)
N = X.shape[1]

model = SequentialModel([
    DenseLayer(6, activation='sigmoid', input_size=N, name='input'),
    DenseLayer(4, activation='tanh', name='1st hidden'),
    DenseLayer(3, activation='relu', name='2nd hidden'),
    DenseLayer(1, activation='sigmoid', name='output')
])

for e in range(100):
    y_pred = model.forward(X)
    model.backward(X, y_pred, y_true)
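If you would like to watch the cost actually decrease, a slight variation of the same loop (reusing the model, X and y_true from above) prints it every few epochs:

for e in range(100):
    y_pred = model.forward(X)
    if e % 10 == 0:
        print(f"epoch {e:3d}  cost: {model.cost(y_pred, y_true):.4f}")
    model.backward(X, y_pred, y_true)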

The table below presents the result:

example   predicted   true
0         0.041729    0.0
1         0.042288    0.0
2         0.951919    1.0
3         0.953927    1.0
4         0.814978    1.0
5         0.036855    0.0
6         0.228409    0.0
7         0.953930    1.0
8         0.050531    0.0
9         0.953687    1.0

Conclusion

It seems our network has indeed learned something. More importantly, we have shown how the mathematical equations themselves can suggest reasonable ways to implement them. Indeed, if you look carefully, you may notice that this implementation is quite “Keras-like”.

This is not a coincidence. The reason is the very nature of the equations we used. Since backpropagation is built on derivatives, we follow the chain rule in the code as well, which applies to other types of layers too, and it is the reason why TensorFlow or PyTorch organize the computations as graphs using a symbolic-math approach. Although the intention of this work was never to compete against these well-established frameworks, it hopefully makes them more intuitive.

Coming back to the beginning, we hope that you liked the “melody” of these equations played using bare-bone numpy. If you did, feel encouraged to read our previous “implemented from scratch” article on Hidden Markov Models here.

Categories: FLOSS Project Planets

ext4 (and FUSE) on FreeBSD

Planet KDE - Sun, 2020-10-25 19:00

FreeBSD has a FUSE kernel module (Filesystems in User Space, I think), which allows it to use other filesystems – in user space – than it would normally do. Today it saved my bacon.

I do a lot of development work on a FreeBSD machine, with Linux as the target platform: that’s what you get (punishment?) for writing Linux installers, I guess. I have a handful of development and test VMs, all in VirtualBox, all with a ZFS volume (a reserved chunk of disk) as virtual disk. This normally gives me a lot of freedom in what I do with my VM’s HDDs: I can manipulate them easily from the host system. For testing purposes, that’s usually either zeroing them out or putting some partition table on them beforehand.

For whatever reason, today VirtualBox was giving me no end of trouble: as I boot each Linux VM, it gets a ton of I/O errors reading the disk, then ends up wedged somewhere in what looks like Plymouth in the guest, and then VBox tells me there was an error and gives up on the VM. It’s not physical I/O errors, since I can read all the data from the ZFS volume with dd, but how can I reach the data?

Enter FUSE, along with the port fusefs-ext2. Getting the software up-and-running (for an ad-hoc need-data-now session) took two steps. For good measure, I also installed e2fsprogs, which allows me to debug ext2 (and three, and four) filesystems from the host system as well.

# pkg install fusefs-ext2 e2fsprogs
# kldload fusefs

Voila!

My ZFS volumes are regular “disk” devices for all intents and purposes: GEOM (the disk subsystem) recognizes that they are GPT or MBR partitioned and puts per-partition (“slice” in BSD jargon) files under /dev/zvol as needed. So from the FreeBSD host I can do:

# fdisk /dev/zvol/zippy/scratch-2-medium
# fsck.ext4 /dev/zvol/zippy/scratch-2-mediump1
# fuse-ext2 /dev/zvol/zippy/scratch-2-mediump1 /mnt/tmp

To (respectively) double-check that the disk contains what I expect (fdisk tells me it’s a GPT disk, which I should have guessed from the p1 partition naming), fsck the filesystem (now that VirtualBox has flipped out over it), and mount it read-only (to get at the data I need).

To give this a teensy bit of a KDE spin, there’s also a port fusefs-smbnetfs which exposes Samba to FUSE, and which can then be used from Dolphin to FUSE-mount network shares – if I were a more avid Dolphin user, and less of a now-satisfied-I-can-get-my-data-from-the-command-line user, I might go looking if FUSE-mounting zvols in general can be done from Dolphin.

Categories: FLOSS Project Planets

Cutelyst 2.13 and ASql 0.19 released

Planet KDE - Sun, 2020-10-25 16:08

Cutelyst the C++/Qt Web Framework and ASql the ASync SQL library for Qt applications got new versions.

Thanks to the work on ASql Cutelyst got some significant performance improvements on async requests, as well as a new class called ASync, which automatically detaches the current request from the processing chain, and attaches later on when it goes out of scope.

With the ASync class, you capture it in your ASql lambda; once the query result arrives and the lambda is freed, the ASync object goes out of scope and the processing of the action chain continues.

KDAB’s fix for a 10-year-old bug drew my attention back to a note in the QTcpSocket documentation:

Note: TCP sockets cannot be opened in QIODevice::Unbuffered mode.

This is actually wrong according to the source code. Because of that note, Cutelyst has always used buffered mode, so hopefully this new version will be a bit faster and consume less memory. It’s important to notice that once the kernel signals it is going to block, QTcpSocket writes get buffered anyway.

Now, talking about ASql, you might notice the jump in release versions; this is because I’ve been experimenting with some changes and didn’t want to write a post for each new feature.

ASql is now at the closest point to the API I’d like it to have. Unfortunately, one of my goals, which was to be able to hand its AResult object to Grantlee/Cutelee and iterate over its records just once, didn’t work out: the call to QVariant::canConvert with QVariantList must succeed and the class must be castable to QAssociativeIterable or QSequentialIterable, which I didn’t manage to get working in a way I like.

But AResult has hash() and hashes() methods that convert the data to a format that can be used in templating.

On the plus side, I added iterators (if you have experience with iterators, please review my code, as this was the first time I wrote this kind of code) that also work in ranged for loops, and they have faster type-conversion methods: instead of converting the data to a QVariant and then using QVariant to get the data out of it, one can just call toInt(), which gets the int straight from the result set without checking each time whether it’s an int.

Added AMigrations, an awesome database maintenance class that is both simple and helpful for maintaining database schemas.

  • ACache class to cache special queries
  • Support for data inside QJsonValue
  • Single Row mode (the lambda gets called once per result)
  • Prepared Queries
  • Scoped Transaction class
  • Notifications – this is my favorite PostgreSQL feature; it’s hard to imagine that some other big databases lack such a useful feature.

Oh, and Cutelyst’s results on TechEmpower already got better thanks to ASql; hoping to see even better results when I update the tests to 2.13, which has Unbuffered and faster async handling.

https://github.com/cutelyst/asql/releases/tag/v0.19.0

https://github.com/cutelyst/cutelyst/releases/tag/v2.13.0

Categories: FLOSS Project Planets

"CodersLegacy": Datacamp Review

Planet Python - Sun, 2020-10-25 03:05

This article is a review on coding tutorial site, Datacamp.

Datacamp is a very well known online learning platform for programmers. It aims to teach a variety of different languages and topics through the use of videos, text and exercises.

In this review we’ll be attempting to cover everything about Datacamp, from its format to its user complaints to its good points. Whether Datacamp is worth the time and money will be clear to you by the end of this review.

Format

When introducing a new concept, Datacamp will first introduce and explain it through the use of a short video. In the video the instructor of that particular course will explain the concept thoroughly.

Next will be a bunch of practice exercises and challenges for you to solve. The purpose of these is to test what you learnt in the video and to further strengthen your understanding through practice.

The final step in Datacamp’s learning process is “projects”. These are full, proper projects which you are assigned to create and which really test your abilities and knowledge of the subject matter.

Complaints

We’ll begin our review by getting the complaints regarding Datacamp out of the way first. We’ve compiled the below information after going through Datacamp’s courses ourselves and through dozens of different reviews.

Course content

The most common complaint was probably that some of the courses weren’t challenging enough and that the content was rather forgettable. Both of these complaints are actually linked. The more challenging a course is, the more effort you put in and the better your understanding develops through repeated trial and error. An easier course appeals to a wider range of people (gentler learning curve), but ultimately leaves a lighter impact.

The reason for the courses not being challenging appears to be the fact that in the practice exercises and challenges, a significant portion of the code was already pre-written. The person doing the exercise only had to complete a few lines of code throughout the program.

However, this issue seems to have been somewhat remedied with the introduction of Datacamp Projects (will be discussed later).

Advanced Content

While Datacamp does offer some courses on advanced concepts, most of the positive reviews for Datacamp were for its introductory courses, the R language, and Data Science.

There were several reviews which stated that Datacamp’s advanced concepts weren’t very well explained. This is something that’s rather expected of sites like Datacamp. While there are undoubtedly going to be some good advanced courses on Datacamp, there are better places for learning advanced concepts (discussed at the end of the article).

Keep in mind however, that everyone has their own unique way of learning. The style in which Datacamp teaches will not appeal to everyone, so some criticism is expected. What one person may criticize, is something another might be praising. There’s no way to find out for sure until you try it for yourself.

On another note, the number of positive reviews for Datacamp significantly outnumbered the negative ones. (We went through hundreds of user reviews for this article) So that can be taken as a good sign.

Compliments & Praise

Here we’ve compiled everything good about Datacamp after careful examination of the site and user reviews.

You get the choice to start off with one of three data science languages: Python, SQL and R. There are other options available of course, but these three are the main ones that you’re recommended to start out with.

Built-in IDE

Datacamp has its own built-in browser IDE where you can practice and run your code. The benefit of this is that you don’t require your own local installation, though I strongly recommend you set one up. The built-in IDE is clean and has features like auto-complete that you don’t often see in browser IDEs.

Course Previews

All of Datacamp’s standard courses have the 1st chapter available for free. It’s not much, but it lasts long enough for you to get a feel of how Datacamp teaches. If it suits you and you want more, you can go ahead and purchase it.

Certificates

The certificate doesn’t have much actual value in the real world, but it’s still a mark of your efforts which definitely counts for something. While you can’t rely on it to get you a job, it might come in handy in other places.

A good starting point

Many people who have gone through Datacamp praise its courses for being a great starting point for the respective field. You’ll need to learn a lot more than what Datacamp teaches you in its courses to land an actual job (in that field), but Datacamp provides a solid foundation from which you can move forward and learn the necessary skills.

Projects

As we mentioned earlier, Datacamp has these projects which are based on real-life situations. Learning the programming theory is not enough. Until you learn how to actually apply it in real-life scenarios, that knowledge is almost useless.

Even completing exercises and challenges is not enough. Exercises only really test your knowledge of the course content. On the other hand, projects test your ability to use what you’ve learnt in the course.

Projects were actually a later addition to Datacamp that helped improve the practical skills of Datacamp users. The previous exercises alone were just not enough.

Career tracks and Skill tracks

Datacamp is a site with hundreds of courses on a wide variety of subjects. It’s easy for someone new to become overwhelmed with the number of options available.

Luckily, Datacamp has created Career and Skill tracks which compile together all the relevant courses for a certain Career or Skill. For instance, let’s say you want to pursue a career in Data Science, and you don’t know which courses to pick. If you go to the Data Scientist Career Track, then it will have the necessary courses that you’ll need to become a Data Scientist.

Pricing

Datacamp has a lot of different subscriptions available and usually has discounts and offers running, so we won’t discuss exact pricing here. What I can tell you though is that, compared to many other platforms, it’s pretty fairly priced and offers a lot of value.

The benefit of buying a subscription for Datacamp is that you get access to all of their courses and projects. That’s a lot of value, compared to other places where you spend the same amount of money for a single course.

Datacamp offers both monthly and yearly subscriptions, though I recommend you get the yearly one, as it’s usually on discount and its “per month” cost is lower.

I also happened to note that the majority of positive reviews of Datacamp were directed towards its R language and Data Science courses. If you were planning on learning one of these at Datacamp, I recommend you do so.

Popular Courses

Here are some of Datacamp’s most popular courses that we’ve picked out for you in this review.

Introduction to R: One of the languages for which Datacamp is most known for, “Introduction to R” will teach you everything you need to begin Data analysis with R.

Introduction to Python: An intro to one of the most popular languages of today. Teaches you all the Python basics you need to know before moving on to more advanced concepts.

Introduction to Data Science in Python: Enter the world of Data Science with this Python course. Data Science is a rising field, with major importance in Data analysis, Machine Learning and AI. This course will help you begin your journey with Data Science.

Introduction to Data Visualization in Python: Data visualization is an important part of being a Data Scientist. Data is next to useless if you can’t display it in a way that makes sense. You can also find an R version of this course on Datacamp.

Datacamp Review Conclusion

While it’s not perfect, Datacamp has been a great starting point for many programmers out there and will continue to be in the future. It’s professional and simple, which will appeal to a wide range of people, especially casual programmers or beginners.

After observing many of its reviews, I believe that Datacamp is not really suited for hard-core programmers looking to learn advanced topics. Judging from its reviews, Datacamp’s introductory courses are the most popular. If you’re looking to enter a new field, Datacamp is a solid option that you won’t regret.

If you’re convinced that Datacamp is the right option for you, begin your Datacamp journey now!

For people looking to learn more advanced content, I advise you to either pick up a good book, or use one of the Datacamp alternatives described in the below section. If you’re interested in books, you can refer to this article which has some good recommendations for Python programmers.

For people who have begun or are going to begin their journey with Datacamp, don’t rely on it completely. Be sure to supplement your learning from other sources as well, both during and after using Datacamp.

Alternatives

Regardless of whether you choose to go with Datacamp or not, you should also consider some other great alternatives of which we’ve written a short review below.

Coursera:

Coursera is one of the world’s largest online learning platforms. You can find all kinds of courses here, ranging from simple certifications to complete degrees.

The courses here at Coursera are generally much longer, have a lot more video content and their practice content and exercises are tougher. The result of all this is that their certificates are more valuable and the courses make a bigger impact on you. The only downside is that the courses are more expensive.

Another site very much like Coursera and highly reputed is Udemy. If you can’t find the course you’re looking for on one site, try the other.

Codecademy:

Another site rather similar to Datacamp in nature. Unlike Datacamp, it doesn’t focus on a particular field (like Data Science). Its courses cover a massive number of different languages and topics, giving you a ton of options to pick from.

The biggest selling point of Codecademy is that its standard courses are all free of charge. It does, however, have a Pro section with paid courses, which are naturally of higher quality than the standard ones.

This marks the end of the Datacamp Review article. Any suggestions or contributions for CodersLegacy are more than welcome. Questions regarding the article content can be asked in the comments section below.

The post Datacamp Review appeared first on CodersLegacy.

Categories: FLOSS Project Planets
