FLOSS Project Planets

Savas Labs: Docker and the Drupal Pattern Lab Starter Theme

Planet Drupal - Sun, 2017-02-19 19:00

How to build a Docker Pattern Lab image for local Drupal development with the Pattern Lab Starter theme and/or with other common front-end applications such as npm, Gulp, and Bower. Continue reading…

Categories: FLOSS Project Planets

Gregor Herrmann: RC bugs 2016/52-2017/07

Planet Debian - Sun, 2017-02-19 17:19

debian is in deep freeze for the upcoming stretch release. still, I haven't dived into fixing "general" release-critical bugs yet; so far I mostly kept to working on bugs in the debian perl group:

  • #834912 – src:libfile-tee-perl: "libfile-tee-perl: FTBFS randomly (Failed 1/2 test programs)"
    add patch from ntyni (pkg-perl)
  • #845167 – src:lemonldap-ng: "lemonldap-ng: FTBFS randomly (failing tests)"
    upload package prepared by xavier with disabled tests (pkg-perl)
  • #849362 – libstring-diff-perl: "libstring-diff-perl: FTBFS: test failures with new libyaml-perl"
    add patch from ntyni (pkg-perl)
  • #851033 – src:jabref: "jabref: FTBFS: Could not find org.postgresql:postgresql:9.4.1210."
    update maven.rules
  • #851347 – libjson-validator-perl: "libjson-validator-perl: uses deprecated Mojo::Util::slurp, makes libswagger2-perl FTBFS"
    upload new upstream release (pkg-perl)
  • #852853 – src:libwww-curl-perl: "libwww-curl-perl: FTBFS (Cannot find curl.h)"
    add patch for multiarch curl (pkg-perl)
  • #852879 – src:license-reconcile: "license-reconcile: FTBFS: dh_auto_test: perl Build test --verbose 1 returned exit code 255"
    update tests (pkg-perl)
  • #852889 – src:liblatex-driver-perl: "liblatex-driver-perl: FTBFS: Test failures"
    add missing build dependency (pkg-perl)
  • #854859 – lemonldap-ng-doc: "lemonldap-ng-doc: unhandled symlink to directory conversion: /usr/share/doc/lemonldap-ng-doc/pages/documentation/current"
    help with dpkg-maintscript-helper, upload on xavier's behalf (pkg-perl)

thanks to the release team for pro-actively unblocking the packages with fixes which were uploaded after the start of the freeze!

Categories: FLOSS Project Planets

Bhishan Bhandari: Raising and Handling Exceptions in Python – Python Programming Essentials

Planet Python - Sun, 2017-02-19 08:31
Brief Introduction

Any unexpected event that occurs during the execution of a program is known as an exception. Like everything else in Python, exceptions are objects: either an instance of the Exception class or an instance of a class derived from the base class Exception. Exceptions may occur due to logical errors in […]
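
The excerpt is cut short; as a minimal sketch of the topic (my example, not the author's):

class InsufficientFunds(Exception):
    """Raised when a withdrawal exceeds the available balance."""

def withdraw(balance, amount):
    if amount > balance:
        raise InsufficientFunds('balance is only %s' % balance)
    return balance - amount

try:
    withdraw(100, 250)
except InsufficientFunds as error:
    print('rejected:', error)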
Categories: FLOSS Project Planets

Import Python: Import Python Weekly Issue 112 - Python Programming Videos By MIT, mypy static type checker and more

Planet Python - Sun, 2017-02-19 06:44
Worthy Read
Introduction to Computer Science and Programming in Python. Video series from MIT, intended for students with little or no programming experience. It aims to provide students with an understanding of the role computation can play in solving problems and to help students, regardless of their major, feel justifiably confident of their ability to write small programs that allow them to accomplish useful goals. The class uses the Python 3.5 programming language.
video
Whitepaper 3 Ways Our Dev Teams Create Velocity with Multi-System Integrations
sponsor
Python repository moves to GitHub. Python core developer Brett talks about the history of the decision to move Python to GitHub.
core-python
memoryview. memoryview is a special type that can be used to work with data stored in other data structures.
core python
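
A quick illustration (mine, not the newsletter's):

data = bytearray(b'hello')
view = memoryview(data)
view[0] = ord('H')            # writes through to the underlying bytearray
print(data)                   # bytearray(b'Hello')
print(view[1:4].tobytes())    # b'ell' -- slicing the view does not copy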
A Python-esque Type System for Python: Duck Typing Statically I think the mypy static type checker is a fantastic initiative, and absolutely love it. My one complaint is that it relies a little too much on subclassing for determining compatibility. This post discusses nominal vs. structural subtyping, duck typing and how it relates to structural subtyping, subtyping in mypy, and using abstract base classes in lieu of a structural subtyping system.
mypy
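
As a tiny taste of the ABC angle discussed in the post (my illustration): collections.abc.Sized recognises any class that defines __len__, with no subclassing required.

from collections.abc import Sized

class Bucket:                 # deliberately does not inherit from Sized
    def __len__(self) -> int:
        return 0

print(isinstance(Bucket(), Sized))   # True: the check is structural, via __subclasshook__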
Regular Expressions Are Nothing to Fear
regex
Extreme IO performance with parallel Apache Parquet in Python In this post, I show how Parquet can encode very large datasets in a small file footprint, and how we can achieve data throughput significantly exceeding disk IO bandwidth by exploiting parallelism (multithreading).
parquet, IO
Hire Development Experts. Toptal hand-matches leading companies with experts in software, web, and mobile app development. Let us match you with on-demand developers for your next project.
sponsor
Two Easy Ways to Use Scikit Learn and Dask This post describes two simple ways to use Dask to parallelize Scikit-Learn operations either on a single computer or across a cluster.
scikit-learn
Learning AI if You Suck at Math — P4 — Tensors Illustrated (with Cats!) This is the 4th part in the series.
tensorflow
Getting Started With Kafka. In this guide we will configure a basic Kafka instance in an Ubuntu environment and write a very basic Python producer and consumer.
kafka
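
As a taste of what the result looks like, here is a bare-bones producer and consumer, assuming the kafka-python package and a broker on localhost:9092 (a sketch, not code from the guide):

from kafka import KafkaProducer, KafkaConsumer

# publish a single message to a topic
producer = KafkaProducer(bootstrap_servers='localhost:9092')
producer.send('test-topic', b'hello kafka')
producer.flush()

# read messages from the beginning of the topic
consumer = KafkaConsumer('test-topic',
                         bootstrap_servers='localhost:9092',
                         auto_offset_reset='earliest')
for message in consumer:
    print(message.value)
    break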
Predict gender with voice and speech data A beginner’s guide to implementing classification algorithms in Python
machine learning, classification
Ergonomica. Ergonomica is a Python-based console language, integrating modules such as os, shutil, and subprocess into a fast, easy-to-use environment. It allows for functional programming tools and operations as well as data types that would otherwise require obscure grep or sed commands.
shell
Deep Learning with Keras on Google Compute Engine. Inception, a model developed by Google, is a deep CNN. Against the ImageNet dataset (a common dataset for measuring image recognition performance) it achieved a top-5 error of 3.47%. In this tutorial, you’ll use the pre-trained Inception model to provide predictions on images uploaded to a web server.
deep learning, keras
Django Weekly Issue 26 Django round up for this week.
django

Projects
cpython - 5349 Stars, 303 Forks. The default implementation of the Python programming language is now on GitHub.
Bella - 494 Stars, 53 Forks. A pure Python, post-exploitation, data mining and remote administration tool for macOS.
PyTorch-Mini-Tutorials - 112 Stars, 12 Forks. Minimal tutorials for PyTorch.
mog - 38 Stars, 2 Forks. A different take on the UNIX tool cat.
pyowl - 31 Stars, 6 Forks. Ordered Weighted L1 regularization for classification and regression in Python.
QuoraDQBaseline - 10 Stars, 5 Forks. Baseline solution to the Quora Duplicate Question dataset.
http_heartbeat_proxy - 2 Stars, 0 Forks. A simple proxy that makes a service heartbeat-able.
Categories: FLOSS Project Planets

libsigsegv @ Savannah: libsigsegv 2.11 is released

GNU Planet! - Sat, 2017-02-18 17:33

libsigsegv version 2.11 is released.

New in this release:

  • Added support for catching stack overflow on Linux/SPARC.
  • Provide a correct value for SIGSTKSZ on 64-bit AIX and on HP-UX. The one defined by these systems is too small.
  • Updated build infrastructure.
  • Compilation now requires the <stdint.h> include file. Platforms which don't have this include file (such as IRIX) are no longer supported.
  • NOTE: Support for Cygwin and native Windows is currently not up-to-date.

Download: https://haible.de/bruno/gnu/libsigsegv-2.11.tar.gz

Categories: FLOSS Project Planets

Steve Kemp: Apologies for the blog-churn.

Planet Debian - Sat, 2017-02-18 17:00

I've been tweaking my blog a little over the past few days, getting ready for a new release of the chronicle blog compiler (github).

During the course of that I rewrote all the posts to have 100% lower-case file-paths. Redirection pages have been auto-generated for each page that was previously mixed-case, but unfortunately that meant the RSS feed updated unnecessarily:

  • If it used to contain:
    • https://example.com/Some_Page.html
  • It would have been updated to contain
    • https://example.com/some_page.html

That triggered a lot of spamming, as the URLs would have shown up as being new/unread/distinct.
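
chronicle itself is Perl, but the redirect-stub idea is simple enough to sketch in a few lines of Python (paths and URLs here are hypothetical):

import os

def write_redirect_stub(old_path, new_url):
    # a stub page at the old mixed-case path that forwards browsers,
    # with a plain link as a fallback
    html = ('<!DOCTYPE html><html><head>'
            '<meta http-equiv="refresh" content="0; url={0}">'
            '</head><body><a href="{0}">moved</a></body></html>').format(new_url)
    os.makedirs(os.path.dirname(old_path), exist_ok=True)
    with open(old_path, 'w') as f:
        f.write(html)

write_redirect_stub('output/Some_Page.html',
                    'https://example.com/some_page.html')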

Categories: FLOSS Project Planets

Jamal Moir: Become a Lord of the Cells and Speed up Your Jupyter Notebook Workflow

Planet Python - Sat, 2017-02-18 12:04

Everyone loves a good Jupyter Notebook. Jupyter Notebooks are an insanely convenient environment to rapidly prototype Python scripts and delve into Data Science. They speed up the time from writing code to actually executing it and you can visually see the output for each section you write. I make heavy use of Jupyter Notebooks in my […]

The post Become a Lord of the Cells and Speed up Your Jupyter Notebook Workflow appeared first on Data Dependence.

Categories: FLOSS Project Planets

agoradesign: Drupal's great little helpers: Random utility class

Planet Drupal - Sat, 2017-02-18 07:50
Drupal's API has a huge number of very useful utility classes and functions, especially in Drupal 8. Although the API docs are great, it's rather impossible to always find every little feature. Today I want to show you the Random utility class, which I had nearly overlooked and found rather by accident.
Categories: FLOSS Project Planets

Nicola Iarocci: Python Workload pulled off Visual Studio 2017 RC3

Planet Python - Sat, 2017-02-18 04:48

So how do you install the awesome Python Development Tools on the latest Visual Studio 2017 RC? That might seem a stupid question considering that the Data Science and Python Development workload has been available with every Release Candidate so far. You simply select the workload during the installation and you’re done, right? Not quite.

I found out the hard way this morning as I wanted to install VS 2017 RC3 on my development machine and, to my surprise, I could not find Python Development anywhere on the workloads window (which itself is a huge improvement over the VS 2015 install experience, by the way). Easy, I thought, they moved it to some secondary “optional workloads” tab, but a quick scan did not reveal any of that.

Concerned now, I turned to the Oracle of All Things only to find that the Python Workload has been pulled off the Visual Studio 2017 RC3 (January 2017). It was actually reported in the release notes:

Removed the Data Science and Python Development workloads as some of the components weren’t meeting the release requirements, such as translation to non-English languages. They will be available soon as separate downloads.

When I glanced over them I (and probably you too) did not notice this little paragraph. But wait, it’s even worse than you would expect:

Upgrading to current version will remove any previously installed Python and Data Science workloads/components.

That’s right. If you upgrade to RC3 you win a wipe-out of your Python environment. Further research revealed an open ticket on GitHub. Apparently they are working on a way to install the Python and Data Science workloads on top of an existing VS 2017 install, but I would not hold my breath on it:

Thanks everyone for the support and understanding. It’s still not clear to us how we’re going to be releasing Python support, but the plan is definitely to have something when VS 2017 releases next month.

Since the official VS 2017 release is planned early next month it is very likely that we will just have to wait until then. In the meantime, you better have a VS 2015 sitting side by side with your brand new, mutilated, Visual Studio 2017. Or you switch to Visual Studio Code, which offers fantastic support for Python.

Or you fallback to good ole trusted Vim, like I did.

join the newsletter to get an email alert when a new post surfaces on this site. if you want to get in touch, i am @nicolaiarocci on twitter.

Categories: FLOSS Project Planets

Full Stack Python: The Full Stack Python Blog

Planet Python - Sat, 2017-02-18 00:00

Full Stack Python began way back in December 2012 when I started writing the initial deployment, server, operating system, web server and WSGI server pages. Since then, the pages have expanded out into a boatload of other areas including subjects outside the deployment topics I originally started the site to explain.

Frequently though I wanted to write a Python walkthrough that was not a good fit for the page format I use for each topic. Many of those walkthroughs became Twilio blog posts but not all of them were quite the right fit on there. I'll still be writing plenty more Twilio tutorials, but this Full Stack Python blog is the spot for technical posts that fall outside the Twilio domain.

Let me know what you think and what tutorials you'd like to see in the future. Hit me up on Twitter @fullstackpython or @mattmakai.

Categories: FLOSS Project Planets

Philip Semanchuk: Pandas Surprise

Planet Python - Fri, 2017-02-17 22:20
Summary

Part of learning how to use any tool is exploring its strengths and weaknesses. I’m just starting to use the Python library Pandas, and my naïve use of it exposed a weakness that surprised me.

Background

Thanks to bradleypjohnson for sharing this Lucky Charms photo under CC BY 2.0.

I have a long list of objects, each with the properties “color” and “shape”. I want to count the frequency of each color/shape combination. A sample of what I’m trying to achieve could be represented in a grid like this –

        circle  square  star
blue         8      41    18
orange       5      33    25
red         53      64    58

At first I implemented this with a dictionary of collections.Counter instances where the top level dictionary is keyed by shape, like so –

import collections

SHAPES = ('square', 'circle', 'star', )

frequencies = {shape: collections.Counter() for shape in SHAPES}

Then I counted my frequencies using the code below. (For simplicity, assume that my objects are simple 2-tuples of (shape, color)).

for shape, color in all_my_objects:
    frequencies[shape][color] += 1

So far, so good.

Enter the Pandas

This looked to me like a perfect opportunity to use a Pandas DataFrame which would nicely support the operations I wanted to do after tallying the frequencies, like adding a column to represent the total number (sum) of instances of each color.

It was especially easy to try out a DataFrame because my counting loop (for ... in all_my_objects) wouldn’t change, only the definition of frequencies. (Note that the code below requires I know in advance all the possible colors I can expect to see, which the Dict + Counter version does not. This isn’t a problem for me in my real-world application.)

import pandas as pd

frequencies = pd.DataFrame(columns=SHAPES, index=COLORS, data=0, dtype='int')

for shape, color in all_my_objects:
    frequencies[shape][color] += 1

It Works, But…

Both versions of the code get the job done, but using the DataFrame as a frequency counter turned out to be astonishingly slow. A DataFrame is simply not optimized for repeatedly accessing individual cells as I do above.

How Slow is it?

To isolate the effect pandas was having on performance, I used Python’s timeit module to benchmark some simpler variations on this code. In the version of Python I’m using (3.6), the default number of iterations for each timeit test is 1 million.

First, I timed how long it takes to increment a simple variable, just to get a baseline.

Second, I timed how long it takes to increment a variable stored inside a collections.Counter inside a dict. This mimics the first version of my code (above) for a frequency counter. It’s more complex than the simple variable version because Python has to resolve two hash table references (one inside the dict, and one inside the Counter). I expected this to be slower, and it was.

Third, I timed how long it takes to increment one cell inside a 2×2 NumPy array. Since Pandas is built atop NumPy, this gives an idea of how the DataFrame’s backing store performs without Pandas involved.

Fourth, I timed how long it takes to increment one cell inside a 2×2 Pandas DataStore. This is what I had used in my real code.

Raw Benchmark Results

Here’s what timeit showed me. Sorry for the cramped formatting.

$ python
Python 3.6.0 (v3.6.0:41df79263a11, Dec 22 2016, 17:23:13)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.timeit('data += 1', setup='data=0')
0.09242476700455882
>>> timeit.timeit('data[0][0]+=1', setup='from collections import Counter;data={0:Counter()}')
0.6838196019816678
>>> timeit.timeit('data[0][0]+=1', setup='import numpy as np;data=np.zeros((2,2))')
0.8909121589967981
>>> timeit.timeit('data[0][0]+=1', setup='import pandas as pd;data=pd.DataFrame(data=[[0,0],[0,0]],dtype="int")')
157.56428507200326

Benchmark Results Summary

Here’s a summary of the results from above (decimals truncated at 3 digits). The rightmost column shows the results normalized so the fastest method (incrementing a simple variable) equals 1.

                   Actual (seconds)   Normalized (fastest = 1)
Simple variable               0.092                          1
Dict + Counter                0.683                      7.398
NumPy 2D array                0.890                      9.639
Pandas DataFrame            157.564                   1704.784

As you can see, resolving the index references in the middle two cases (Dict + Counter in one case, NumPy array indices in the other) slows things down, which should come as no surprise. The NumPy array is a little slower than the Dict + Counter.

The DataFrame, however, is roughly 175 – 230 times slower than either of those two methods. Ouch!

I can’t really even give you a graph of all four of these methods together because the time consumed by the DataFrame throws the chart scale out of whack.

Here’s a bar chart of the first three methods –

Here’s a bar chart of all four –

Why Is My DataFrame Access So Slow?

One of the nice features of DataFrames is that they support dictionary-like labels for rows and columns. For instance, if I define my frequencies to look like this –

>>> SHAPES = ('square', 'circle', 'star', )
>>> COLORS = ('red', 'blue', 'orange')
>>> pd.DataFrame(columns=SHAPES, index=COLORS, data=0, dtype='int')
        square  circle  star
red          0       0     0
blue         0       0     0
orange       0       0     0
>>>

Then frequencies['square']['orange'] is a valid reference.

Not only that, DataFrames support a variety of indexing and slicing options including –

  • A single label, e.g. 5 or 'a'
  • A list or array of labels ['a', 'b', 'c']
  • A slice object with labels 'a':'f'
  • A boolean array
  • A callable function with one argument

Here are those techniques applied in order to the frequencies DataFrame so you can see how they work –

>>> frequencies['star']
red       0
blue      0
orange    0
Name: star, dtype: int64
>>> frequencies[['square', 'star']]
        square  star
red          0     0
blue         0     0
orange       0     0
>>> frequencies['red':'blue']
      square  circle  star
red        0       0     0
blue       0       0     0
>>> frequencies[[True, False, True]]
        square  circle  star
red          0       0     0
orange       0       0     0
>>> frequencies[lambda x: 'star']
red       0
blue      0
orange    0
Name: star, dtype: int64

This flexibility has a price. Slicing (which is what is invoked by the square brackets) calls an object’s __getitem__() method. The parameter to __getitem__() is whatever was inside the square brackets. A DataFrame’s __getitem__() has to figure out what the passed parameter represents. Determining whether the parameter is a label reference, a callable, a boolean array, or something else takes time.

If you look at the DataFrame’s __getitem__() implementation, you can see all the code that has to execute to resolve a reference. (I linked to the version of the code that was current when I wrote this in February of 2017. By the time you read this, the actual implementation may differ.) Not only does __getitem__() have a lot to do, but because I’m accessing a cell (rather than a whole row or column), there’s two slice operations, so __getitem__() gets invoked twice each time I increment my counter.
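
To see the two invocations concretely (my illustration, not from the post):

col = frequencies['star']   # first __getitem__ call: returns the whole column as a Series
count = col['red']          # second __getitem__ call: indexes into that Series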

This explains why the DataFrame is so much slower than the other methods. The dictionary and Counter both only support key lookup in a hash table, and a NumPy array has far fewer slicing options than a DataFrame, so its __getitem__() implementation can be much simpler.

Better DataFrame Indexing?

DataFrames support a few methods that exist explicitly to support “fast” getting and setting of scalars. Those methods are .at() (for label lookups) and .iat() (for integer-based index lookups). It also provides get_value() and set_value(), but those methods are deprecated in the version I have (0.19.2).

“Fast” is how the Pandas documentation describes these methods. Let’s use timeit to get some hard data. I’ll try at() and iat(); I’ll also try get_value()/set_value() even though they’re deprecated.

>>> timeit.timeit("data.at['red','square']+=1",setup="import pandas as pd;data=pd.DataFrame(columns=('square','circle','star'),index=('red','blue','orange'),data=0,dtype='int')")
36.33179204000044
>>> timeit.timeit('data.iat[0,0]+=1',setup='import pandas as pd;data=pd.DataFrame(data=[[0,0],[0,0]],dtype="int")')
42.01523362501757
>>> timeit.timeit('data.set_value(0,0,data.get_value(0,0)+1)',setup='import pandas as pd;data=pd.DataFrame(data=[[0,0],[0,0]],dtype="int")')
15.050199927005451
>>>

These methods are better, but they’re still pretty bad. Let’s put those numbers in context by comparing them to other techniques. This time, for normalized results, I’m going to use my Dict + Counter method as the baseline of 1 and compare all other methods to that. The row “DataFrame (naïve)” refers to naïve slicing, like frequencies[0][0].

                      Actual (seconds)   Normalized (Dict + Counter = 1)
Dict + Counter                   0.683                               1
DataFrame (get/set)             15.050                          22.009
DataFrame (at)                  36.331                          53.130
DataFrame (iat)                 42.015                          61.441
DataFrame (naïve)              157.564                         230.417
NumPy 2D array                   0.890                           1.302

The best I can do with a DataFrame uses deprecated methods, and is still over 20 times slower than the Dict + Counter. If I use non-deprecated methods, it’s over 50 times slower.

Workaround

I like label-based access to my frequency counters, I like the way I can manipulate data in a DataFrame (not shown here, but it’s useful in my real-world code), and I like speed. I don’t necessarily need blazing fast speed, I just don’t want slow.

I can have my cake and eat it too by combining methods. I do my counting with the Dict + Counter method, and use the result as initialization data to a DataFrame constructor.

SHAPES = ('square', 'circle', 'star', )

frequencies = {shape: collections.Counter() for shape in SHAPES}

for shape, color in all_my_objects:
    frequencies[shape][color] += 1

frequencies = pd.DataFrame(data=frequencies)

The frequencies DataFrame now looks something like this –

        circle  square  star
blue         8      41    18
orange       5      33    25
red         53      64    58

The rows and columns appear in essentially random order; they’re ordered by whatever order Python returns the dict keys during DataFrame initialization. Getting them in a specific order is left as an exercise for the reader.

There’s one more detail to be aware of. If a particular (shape, color) combination doesn’t appear in my data, it will be represented by NaN in the DataFrame. They’re easy to set to 0 with frequencies.fillna(0).
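
One caveat (my note, not the original post’s): fillna() returns a new DataFrame rather than modifying in place, and the NaNs will already have forced the affected columns to a float dtype, so the full cleanup looks something like this –

frequencies = frequencies.fillna(0).astype(int)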

Conclusion

What I was trying to do with Pandas – unfortunately, the very first thing I ever tried to do with it – didn’t play to its strengths. It didn’t break my code, but it slowed it down by a factor of ~1700. Since I had thousands of items to process, the difference was hard to overlook!

Pandas looks great for some things, and I expect I’ll continue using it. This was just a bump in the road, albeit an interesting one.

Categories: FLOSS Project Planets

Dirk Eddelbuettel: RPushbullet 0.3.1

Planet Debian - Fri, 2017-02-17 21:17

A new release 0.3.1 of the RPushbullet package, following the recent 0.3.0 release, is now on CRAN. RPushbullet is interfacing the neat Pushbullet service for inter-device messaging, communication, and more. It lets you easily send alerts to your browser, phone, tablet, ... -- or all at once.

This release owes once again a lot to Seth Wenchel who helped to update and extend a number of features. We fixed one more small bug stemming from the RJSONIO to jsonlite transition, and added a few more helpers. We also enabled Travis testing and with it covr-based coverage analysis using pretty much the same setup I described in this recent blog post.

Changes in version 0.3.1 (2017-02-17)
  • The target device designation was corrected (#39).

  • Three new (unexported) helper functions test the validity of the api key, device and channel (Seth in #41).

  • The summary method for the pbDevices class was corrected (Seth in #43).

  • New helper functions pbValidateConf, pbGetUser, pbGetChannelInfo were added (Seth in #44 closing #40).

  • New classes pbUser and pbChannelInfo were added (Seth in #44).

  • Travis CI tests (and covr coverage analysis) are now enabled via an encrypted config file (#45).

Courtesy of CRANberries, there is also a diffstat report for this release.

More details about the package are at the RPushbullet webpage and the RPushbullet GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: FLOSS Project Planets

Ingo Juergensmann: Migrating from Owncloud 7 on Debian to Nextcloud 11

Planet Debian - Fri, 2017-02-17 18:19

These days I got a mail from my hosting provider stating that my Owncloud instance is insecure, because the online scan from scan.nextcloud.com had mailed them. However the scan seemed quite bogus: it reported some issues that were listed as already solved in Debian's changelog file. But unfortunately the last entry in the changelog was on January 5th, 2016. So, there has been more than a whole year without security updates for Owncloud in Debian stable.

In a discussion with the Nextcloud team I complained a little bit that the scan/check is not appropriate. The Nextcloud team replied very helpfully with additional information, such as two bug reports in Debian to clarify that the Owncloud package will most likely be removed in the next release: #816376 and #822681.

So, as there is no nextcloud package in Debian unstable as of now, there was no way around manually upgrading & migrating to Nextcloud. This went fairly well:

ownCloud 7 -> ownCloud 8.0 -> ownCloud 8.1 -> ownCloud 8.2 -> ownCloud 9.0 -> ownCloud 9.1 -> Nextcloud 10 -> Nextcloud 11

There were some smaller caveats:

  1. When migrating from OC 9.0 to OC 9.1 you need to migrate your addressbooks and calendars as described in the OC 9.0 Release Notes
  2. When migrating from OC 9.1 to Nextcloud 10, the OC 9.1 version number is higher than the Nextcloud upgrade script expects, so it warns that you can't downgrade your installation. The fix was simply to change the OC version in config.php
  3. The Documents App of OC 7 is no longer available in Nextcloud 11 and is replaced by the Collabora App, which is way more complex to set up

The installation and setup of the Docker image for collabora/code was the main issue, because I wanted to be able to edit documents in my cloud. For some reason Nextcloud couldn't connect to my docker installation. After some web searches I found "Can't connect to Collabora Online" which led me to the next entry in the Nextcloud support forum. But in the end it was this posting that finally made it work for me. So, in short I needed to add...

DOCKER_OPTS="--storage-driver=devicemapper"

to /etc/default/docker.

So, in the end everything worked out well and my cloud instance is secure again. :-)

UPDATE 2017-02-18 10:52:
Sadly, with that working Collabora Online container from Docker I now face an issue with zombie loolforkit processes inside the container.

Kategorie: Debian | Tags: Debian, Software, Cloud, Server
Categories: FLOSS Project Planets

Caktus Consulting Group: Caktus Attends Wagtail CMS Sprint in Reykjavik

Planet Python - Fri, 2017-02-17 18:15

Caktus CEO Tobias McNulty and Sales Engineer David Ray recently had the opportunity to attend a development sprint for the Wagtail Content Management System (CMS) in Reykjavik, Iceland. The two-day software development sprint attracted 15 attendees hailing from a total of 5 countries across North America and Europe.

Wagtail was originally built for the Royal College of Art by UK firm Torchbox and is now one of the fastest-growing open source CMSs available. Being longtime champions of the Django framework, we’re also thrilled that Wagtail is Django-based. This makes Wagtail a natural fit for content-heavy sites that might still benefit from the customization made possible through the CMS’ Django roots.

The team worked on a wide variety of projects, including caching optimizations, an improved content model, a new React-based page explorer, the integration of a new rich-text editor (Draft.js), performance enhancements, other new features, and bug fixes.

Team Wagtail Bakery stole the show with a brand-new demo site that’s visually appealing and better demonstrates the level of customization afforded by the Wagtail CMS. The new demo site, which is still in development as of the time of this post, can be found at wagtail/bakerydemo on GitHub.

After the sprint was over, our hosts at Overcast Software were kind enough to take us on a personalized tour of the countryside around Reykjavik. We left Iceland with significant progress on a number of pull requests on Wagtail, new friends, and a new appreciation for the country's magical landscapes.

We were thrilled to attend and are delighted to be a part of the growing Wagtail community. If you're interested in participating in the next Wagtail sprint, it is not far away. Wagtail Space is taking place in Arnhem, The Netherlands March 21st-25th and is being organized to accommodate both local and remote sprinters. We hope to connect with you then!

Categories: FLOSS Project Planets

Drupal Association blog: Drupal Association membership campaign: February 20 to March 8

Planet Drupal - Fri, 2017-02-17 13:51

Drupal.org is home of the Drupal project and the Drupal community. It has been continuously operating since 2001. The Engineering Team— along with amazing community webmasters— keeps Drupal.org alive and well. As we launch the first membership campaign of 2017, our story is all about this small and productive team.

Join us as we celebrate all that the engineering team has accomplished: from helping grow Drupal adoption to enabling contribution, and from improving infrastructure to making development faster. The team does a lot of good for the community, the project, and Drupal.org.

Check out some of their accomplishments and if you aren't yet a Drupal Association member, join us! Help us continue the work needed to make Drupal.org better, every day.

Share these stories with others - now until our membership drive ends on March 8.


Thank you for supporting our work!

Categories: FLOSS Project Planets

David MacIver: Rigging elections with integer linear programming

Planet Python - Fri, 2017-02-17 12:54

No, this isn’t a post about politics, sorry, it’s just a post about voting theory.

As you might have noticed (and if not, this is the post announcing it), I have a book out! It’s called Voting by Example and is about the complexities of voting and why different voting systems might be interesting. You should go buy it.

But this isn’t just an ad post for the book, it’s about something else: How I came up with the examples in this book.

At its core is the following example. We have a student election where four candidates are running for class president. Each student casts a vote ranking the four candidates in order of most to least preferred.

The votes are as follows:

  • 32% of students voted Alex, Kim, Charlie, Pat
  • 27% of students voted Charlie, Pat, Kim, Alex
  • 25% of students voted Pat, Charlie, Kim, Alex
  • 16% of students voted Kim, Pat, Alex, Charlie

The point of this example is that each of four different ranked voting systems gives a different answer.

It’s also constructed to have a minimal number of voting blocs, though it’s only minimal amongst a slightly more specific set of elections than just those satisfying the above conditions.

Roughly what this means is:

  • Alex has the most first choice votes
  • When compared pairwise with any other candidate, the majority prefer Charlie
  • Pat wins a complicated procedure where candidates iteratively drop out depending on which of the remaining candidates is the favourite of the fewest voters
  • If you give people a score of 3 for each voter who ranks them first, two for each who ranks them second, and one for each who ranks them third, then Kim has the highest score.

If you want to know more than that I recommend the relevant Wikipedia links above, which are all very approachable.

The significance of this is that each of these is a popular (either in practice or amongst electoral theorists) way of saying who should be the winner. So the situation ends up being rather complex to decide.

But this isn’t a post about the significance. This is a post about how I constructed the example.

Historically I’ve generally created example elections with Hypothesis or a similar approach – randomly fuzzing until you get a result you want – but that wasn’t really going to work very well here due to the rather complicated set of constraints that are hard to satisfy by accident.

So I instead turned to my old friend, integer linear programming.

The idea of integer linear programming (ILP) is that we have a number of variables which are forced to take integer values. We can then impose linear constraints between them, and give a linear objective function to optimise for. For example we might have three variables \(v_1, v_2, v_3\) and the following constraints:

  • \(v_1, v_2, v_3 \geq 0\)
  • \(v_1 + 2 v_2 + 3 v_3 \leq 5\)

And then try to maximise \(v_1 + 2 v_2 + 4 v_3\).

Given a problem like this we can feed it to an ILP solver and it will spit out a solution. If we do that, it will tell us that an optimal solution for this problem is \(v_1 = 2, v_2 = 0, v_3 = 1\).
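
To make that concrete, here is the same toy problem written out with Pulp, the Python ILP interface used later in this post (my sketch, not from the original):

from pulp import LpInteger, LpMaximize, LpProblem, LpVariable, value

problem = LpProblem("toy-example", LpMaximize)
v1 = LpVariable("v1", lowBound=0, cat=LpInteger)
v2 = LpVariable("v2", lowBound=0, cat=LpInteger)
v3 = LpVariable("v3", lowBound=0, cat=LpInteger)

problem += v1 + 2 * v2 + 4 * v3        # the objective to maximise
problem += v1 + 2 * v2 + 3 * v3 <= 5   # the single constraint

problem.solve()
print(value(v1), value(v2), value(v3))  # prints an optimum such as: 2 0 1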

A lot of problems can be nicely turned into ILP problems, and there are some fairly good open source ILP solvers, so it’s often a nice way to solve complex combinatorial problems.

So how do we turn our election rigging problem into an integer linear program?

The key idea is this: 4! isn’t very large. It’s only 24. 24 is tiny from an ILP point of view.

This means that there’s no problem with creating a variable for each of the possible votes that someone could cast, representing the number of people who cast that vote. That is, we create integer variables \(v_p\) indexed by permutations with the constraints:

  • \(\sum v_p = 100\)
  • \(v_p \geq 0\)

Then the value of e.g. \(v_{(0, 1, 2, 3)}\) is the number of people who cast the exact vote \((0, 1, 2, 3)\) (or, by name, Alex, Charlie, Pat, Kim).

We then use a trick to get nicer examples, which is that we try to minimise the number of non-zero votes. The idea is to create variables which, when minimised, just look like markers that say whether a vote is non-zero or not.

So we create supplementary 0/1 valued integer variables \(u_p\) with the constraints that \(v_p \leq 100 u_p\), and set the objective to minimise \(\sum u_p\). Then \(u_p\) will be set to \(0\) wherever it can be, and the only places where it can’t are where \(v_p\) is non-zero. Thus this minimises the number of voting blocs.

So that’s how we create our basic ILP problem, but right now it will just stick 100 votes on some arbitrary possible ballot. How do we then express the voting conditions?

Well, lets start with the plurality and Borda scores. These are pretty easy, because they constitute just calculating a score for each candidate for each permutation and adding up the scores. This means that the scores are just a linear function of the variables, which is exactly what an ILP is built on.

Victory is then just a simple matter of one candidate’s score exceeding another. You need to set some epsilon for the gap (linear programming can’t express \(<\), only \(\leq\)), but that’s OK – the scores are just integers, so we can just insist on a gap of \(1\).

The following code captures all of the above using Pulp, which is a very pleasant-to-use Python interface to a variety of ILP solvers:
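
(The embedded listing didn't survive aggregation, so what follows is a sketch reconstructed from the description in this post; the exact variable names and the return format are my guesses, not the original code.)

from itertools import permutations
from pulp import LpInteger, LpMinimize, LpProblem, LpVariable, lpSum

def build_election(n_candidates, n_voters, additive_scores,
                   condorcet_winner=None, irv_dropout_order=None):
    candidates = list(range(n_candidates))
    problem = LpProblem("rig-the-election", LpMinimize)

    # One integer variable per possible ballot (permutation), counting
    # how many voters cast exactly that ballot.
    variables = [
        (p, LpVariable("v_" + "_".join(map(str, p)), lowBound=0, cat=LpInteger))
        for p in permutations(candidates)
    ]
    problem += lpSum(v for _, v in variables) == n_voters

    # 0/1 markers forced to 1 wherever a bloc is non-empty; minimising
    # their sum minimises the number of distinct voting blocs.
    markers = []
    for p, v in variables:
        u = LpVariable("u_" + "_".join(map(str, p)), cat="Binary")
        problem += v <= n_voters * u
        markers.append(u)
    problem += lpSum(markers)

    # Each additive scoring rule must produce the requested finishing
    # order, with a gap of at least 1 between neighbouring candidates.
    for score, order in additive_scores:
        totals = {
            c: lpSum(score(p, c) * v for p, v in variables)
            for c in candidates
        }
        for winner, runner_up in zip(order, order[1:]):
            problem += totals[winner] >= totals[runner_up] + 1

    # (the condorcet_winner and irv_dropout_order constraints shown
    # later in the post slot in here)

    problem.solve()
    return [(p, int(v.value())) for p, v in variables if v.value()]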

The idea is that the additive_scores parameter takes a list of scoring functions and a partial list of winners given those functions and returns an election producing those orders.

So if we run this asking for the plurality winners to be (0, 1, 2, 3) and the Borda winners to be (3, 2, 1, 0) we get the following:

>>> build_election(4, 100, [
...     (lambda p, c: int(p[0] == c), (0, 1, 2, 3)),
...     (lambda p, c: 3 - p.index(c), (3, 2, 1, 0))])
[((0, 3, 1, 2), 35), ((1, 2, 3, 0), 34), ((2, 3, 0, 1), 31)]

So this is already looking quite close to our starting example.

Creating a Condorcet winner is similarly easy: whether the majority prefers a candidate to another is again just an additive score. So we just need to add the requisite \(N - 1\) constraints that our desired Condorcet candidate wins.

if condorcet_winner is not None:
    victories = {
        (i, j): lpSum(
            v for p, v in variables if p.index(i) > p.index(j)
        )
        for i in candidates
        for j in candidates
    }
    for c in candidates:
        if c != condorcet_winner:
            problem.addConstraint(
                victories[(condorcet_winner, c)] >=
                victories[(c, condorcet_winner)] + 1
            )

If we run this to force the Condorcet winner to be \(1\) we now get the following:

>>> build_election(4, 100, [
...     (lambda p, c: int(p[0] == c), (0, 1, 2, 3)),
...     (lambda p, c: 3 - p.index(c), (3, 2, 1, 0))],
...     condorcet_winner=1,
... )
[((0, 3, 1, 2), 28), ((1, 2, 3, 0), 27), ((2, 1, 3, 0), 24), ((3, 2, 0, 1), 21)]

This is pretty close to the desired result. We just need to figure out how to set the IRV winner.

This is a bit more fiddly because IRV isn’t a simple additive procedure, so we can’t simply set up scores for who wins it.

But where it is a simple additive procedure is to determine who drops out given who has already dropped out, because that’s simply a matter of calculating a modified plurality score with some of the candidates ignored.

So what we can do is specify the exact dropout order: This means we know who has dropped out at any point, so we can calculate the scores for who should drop out next and add the appropriate constraints.

The following code achieves this:

if irv_dropout_order is not None:
    remaining_candidates = set(candidates)
    for i in irv_dropout_order:
        if len(remaining_candidates) <= 1:
            break
        assert i in remaining_candidates
        allocations = {j: [] for j in remaining_candidates}
        for p, v in variables:
            for c in p:
                if c in remaining_candidates:
                    allocations[c].append(v)
                    break
        loser_allocations = sum(allocations.pop(i))
        remaining_candidates.remove(i)
        for vs in allocations.values():
            problem.addConstraint(loser_allocations + 1 <= sum(vs))

And running this we get the following:

>>> build_election(4, 100, [
...     (lambda p, c: int(p[0] == c), (0, 1, 2, 3)),
...     (lambda p, c: 3 - p.index(c), (3, 2, 1, 0))
... ], condorcet_winner=1, irv_dropout_order=(3, 1, 0, 2))
[((0, 3, 1, 2), 31), ((1, 2, 3, 0), 27), ((2, 1, 3, 0), 25), ((3, 2, 0, 1), 17)]

This isn’t quite the same example in the book (the first bloc has one fewer vote which the last bloc got in this), because the book example had a bit more code for optimizing it into a canonical form, but that code isn’t very interesting so we’ll skip it.

Here’s the full code:

I’m constantly surprised how nice integer linear programming ends up being for constructing examples. I knew I could do the Borda and plurality scores – that’s in fact the example that motivated me to try this out at all – but although I find it obvious in retrospect that you can also fix the Condorcet winner, it definitely wasn’t obvious a priori. The fact that it’s easy to also calculate the IRV dropout order was genuinely surprising.

This is also a nice example of how much fun small n programming can be. This isn’t just an O(n!) solution – it generates a problem of size O(n!) and then feeds it to a solver for an NP-hard problem! In principle that should be catastrophic. In practice, n is 4, so who cares? (n=4 is about the limit for this too – it just about works for n=5, and for n=6 it doesn’t really.)

I also find it somewhat surprising how much freedom there is to get different results from different voting systems. I was motivated to do some of this by The ultimate of chaos resulting from weighted voting systems, so I knew there was a fair bit of freedom, but I was somewhat expecting that to be a pathology of weighted voting systems, so I’m still a little surprised. I guess there’s just quite a lot of room to play around with in a 24-dimensional simplex.

Which brings me to my final point: When you’re trying to understand what is possible in a domain, this sort of example generation focused programming is very useful for exploring the limits of the possible and building your intuition. I feel like this isn’t a thing people do enough, and I think we should all do more of it.

I also think you should buy my book.

(Or support my writing on Patreon)

Categories: FLOSS Project Planets

KStars 2.7.4 for Windows is released!

Planet KDE - Fri, 2017-02-17 12:09
Glad to announce the release of KStars v2.7.4 for Windows 64bit. This version is built against a more recent Qt (5.8) and the latest KF5 frameworks for Windows, bringing more features and stability.


This release brings in many bug fixes, enhancements for limited-resource devices, and improvements, especially to KStars' premier astrophotography tool: Ekos. Windows users will be glad to learn that they can now use an offline astrometry solver in Windows, thanks to the efforts behind the ANSVR local Astrometry.net solver. ANSVR mimics the astrometry.net online server on your local computer, so the internet is not required for any astrometry queries.

After installing the ANSVR server and downloading the appropriate index files for your setup, you can simply change the API URL to use the ANSVR server as illustrated below:



In the Ekos align module, keep the solver type set to Online so that it uses the local ANSVR server for all astrometry queries. Then you can use the align module as you normally would. This release also features the Ekos Polar Alignment Assistant tool, a very easy-to-use, spot-on tool to polar align your mount.

Clear skies!
Categories: FLOSS Project Planets

Editing files as root

Planet KDE - Fri, 2017-02-17 12:09

For years I have told people not to start Kate as root to edit files. The normal response I got was “but I have to edit this file”. The problem with starting GUI applications as root is that X11 is extremely insecure, and it’s considerably easy for another application to attack them.

An application like Kate depends on libraries such as Qt. Qt itself disallows running as a setuid app:

Qt is not an appropriate solution for setuid programs due to its large attack surface.

If Qt is not an appropriate solution for setuid programs, it's also not an appropriate solution for GUI applications running as root. And Qt is just one of the dependencies of graphical applications. There is obviously also xcb, Xlib, OpenGL, xkbcommon, etc. etc.

So how can another application attack an application running as root? A year ago I implemented a simple proof-of-concept attack against Dolphin. The attack waits for Dolphin to be started as root. As soon as it starts, it uses the XTest extension to fake input, enable the embedded konsole window, and type into it.

This is just one example. The elephant in the room is string handling, though. Every X11 window has many window properties and every process can write to them. We just have to accept that string handling is complex and can easily trigger a crash.

Luckily there is no need to run the editor as root just to edit a file. There is a neat tool called sudoedit, which does the magic of starting the editor as your normal user and takes care of storing the file as root when you save.
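
For example (assuming your Kate supports the --block flag, which makes it wait until the file is closed so sudoedit can copy it back):

# sudoedit copies the file to a temporary location, starts the editor as
# your normal user, and writes the file back as root once you are done
SUDO_EDITOR="kate --block" sudoedit /etc/hosts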

Today I pushed a change for Kate and KWrite which no longer allows them to be run as root. Instead it educates the user about how to do the same with sudoedit.

Now I understand that this will break the workflow for some users. But with a minor adjustment to your workflow you get the same result. In fact it will be better, because the Kate you start is able to pick up your configured styling. And it will also work on Wayland. And most importantly it will be secure.

I am also aware that if you run a malicious application you are already owned. I think that we should protect users nevertheless.

Categories: FLOSS Project Planets

Gábor Hojtsy: Improving Drupal 8's usability and the impact

Planet Drupal - Fri, 2017-02-17 11:09

We started regular Drupal usability meetings twice a week almost a year ago in March 2016. That is a long time and we succeeded in supporting many key initiatives in this time, including reviews on new media handling and library functionality, feedback on workflow user experience, outside-in editing and place block functionality. We helped set scope for the changes required to inline form errors on its way to stability. Those are all supporting existing teams working on their respective features where user interfaces are involved.

However, we also started to look at some Drupal components and whether we can gradually improve them. One of the biggest tasks we took on was redesigning the status page, where Drupal's system information is presented and errors and warnings are printed for site owners to resolve. While that looks like a huge monster issue, Roy Scholten in fact posted a breakdown of how the process itself went. If we were to start a fresh issue (which we should have), the process would be much easier to follow and would be more visible. The result is quite remarkable.

Categories: FLOSS Project Planets