FLOSS Project Planets

Dirk Eddelbuettel: RcppMsgPack 0.2.0

Planet Debian - Wed, 2017-09-13 21:28

A new and much enhanced version of RcppMsgPack arrived on CRAN a couple of days ago. It came together following this email to the r-package-devel list which made it apparent that Travers Ching had been working on MessagePack converters for R which required the very headers I had for use from, inter alia, the RcppRedis package.

So we joined our packages. I updated the headers in RcppMsgPack to the current upstream version 2.1.5 of MessagePack, and Travers added his helper functions, which allow direct packing / unpacking of MessagePack objects at the R level, as well as tests and a draft vignette. Very exciting, and great to have a coauthor!

So now RcppMsgPack provides R with both the MessagePack header files, for use via C++ (or C, if you must) from packages such as RcppRedis, and direct conversion routines at the R prompt.

MessagePack itself is an efficient binary serialization format. It lets you exchange data among multiple languages like JSON. But it is faster and smaller. Small integers are encoded into a single byte, and typical short strings require only one extra byte in addition to the strings themselves.
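
To make the size claim concrete, here is a minimal sketch in Python of the two cases mentioned above, following the MessagePack format specification: integers from 0 to 127 are stored as a single byte ("positive fixint"), and strings of up to 31 UTF-8 bytes get a one-byte header ("fixstr"). The function names are made up for this illustration; they are not part of RcppMsgPack or of any MessagePack library.

```python
def pack_small_int(n: int) -> bytes:
    """Encode 0..127 as a MessagePack positive fixint (one byte)."""
    if not 0 <= n <= 0x7F:
        raise ValueError("only 0..127 fit in a positive fixint")
    return bytes([n])


def pack_short_str(s: str) -> bytes:
    """Encode a string of up to 31 UTF-8 bytes as a MessagePack fixstr."""
    raw = s.encode("utf-8")
    if len(raw) > 31:
        raise ValueError("fixstr holds at most 31 bytes")
    # The header byte is 0b101xxxxx, where xxxxx is the byte length.
    return bytes([0xA0 | len(raw)]) + raw


if __name__ == "__main__":
    print(pack_small_int(42).hex())       # one byte in total
    print(pack_short_str("hello").hex())  # one header byte plus the string
```

Longer strings and larger integers use wider headers, which this sketch deliberately leaves out.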

Changes in version 0.2.0 (2017-09-07)
  • Added support for building on Windows

  • Upgraded to MsgPack 2.1.5 (#3)

  • New R functions to manipulate MsgPack objects: msgpack_format, msgpack_map, msgpack_pack, msgpack_simplify, msgpack_unpack (#4)

  • New R functions also available as msgpackFormat, msgpackMap, msgpackPack, msgpackSimplify, msgpackUnpack (#4)

  • New vignette (#4)

  • New tests (#4)

Courtesy of CRANberries, there is also a diffstat report for this release.

More information may be on the RcppMsgPack page. Issues and bugreports should go to the GitHub issue tracker.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: FLOSS Project Planets

Dirk Eddelbuettel: RcppRedis 0.1.8

Planet Debian - Wed, 2017-09-13 21:26

A new minor release of RcppRedis arrived on CRAN last week, following the release 0.2.0 of RcppMsgPack which brought the MsgPack headers forward to release 2.1.5. This required a minor and rather trivial change in the code. When the optional RcppMsgPack package is used, we now require this version 0.2.0 or later.

We made a few internal updates to the package as well.

Changes in version 0.1.8 (2017-09-08)
  • A new file init.c was added with calls to R_registerRoutines() and R_useDynamicSymbols()

  • Symbol registration is enabled in useDynLib

  • Travis CI was updated to using run.sh

  • The (optional MessagePack) code was updated for MsgPack 2.*

Courtesy of CRANberries, there is also a diffstat report for this release. More information is on the RcppRedis page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.


Aaron Morton: Phantom Consistency Mechanisms

Planet Apache - Wed, 2017-09-13 20:00

In this blog post we will take a look at consistency mechanisms in Apache Cassandra. There are three reasonably well documented features serving this purpose:

  • Read repair gives the option to sync data on read requests.
  • Hinted handoff is a buffering mechanism for situations when nodes are temporarily unavailable.
  • Anti-entropy repair (or simply just repair) is a process of synchronizing data across the board.

What is far less known, and what we will explore in detail in this post, is a fourth mechanism Apache Cassandra uses to ensure data consistency. We are going to see Cassandra perform another flavour of read repair, but in a far sneakier way.

Setting things up

In order to see this sneaky repair happening, we need to orchestrate a few things. Let’s just blaze through some initial setup using Cassandra Cluster Manager (ccm - available on github).

    # create a cluster of 2x3 nodes
    ccm create sneaky-repair -v 2.1.15
    ccm updateconf 'num_tokens: 32'
    ccm populate --vnodes -n 3:3

    # start nodes in one DC only
    ccm node1 start --wait-for-binary-proto
    ccm node2 start --wait-for-binary-proto
    ccm node3 start --wait-for-binary-proto

    # create table and keyspace
    ccm node1 cqlsh -e "CREATE KEYSPACE sneaky WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};"
    ccm node1 cqlsh -e "CREATE TABLE sneaky.repair (k TEXT PRIMARY KEY , v TEXT);"

    # insert some data
    ccm node1 cqlsh -e "INSERT INTO sneaky.repair (k, v) VALUES ('firstKey', 'firstValue');"

The familiar situation

At this point, we have a cluster up and running. Suddenly, “the requirements change” and we need to expand the cluster by adding one more data center. So we will do just that and observe what happens to the consistency of our data.

Before we proceed, we need to ensure some determinism and turn off Cassandra’s known consistency mechanisms (we will not be disabling anti-entropy repair as that process must be initiated by an operator anyway):

    # disable hinted handoff
    ccm node1 nodetool disablehandoff
    ccm node2 nodetool disablehandoff
    ccm node3 nodetool disablehandoff

    # disable read repairs
    ccm node1 cqlsh -e "ALTER TABLE sneaky.repair WITH read_repair_chance = 0.0 AND dclocal_read_repair_chance = 0.0"

Now we expand the cluster:

    # start nodes
    ccm node4 start --wait-for-binary-proto
    ccm node5 start --wait-for-binary-proto
    ccm node6 start --wait-for-binary-proto

    # alter keyspace
    ccm node1 cqlsh -e "ALTER KEYSPACE sneaky WITH replication ={'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2':3 };"

With these commands, we have effectively added a new DC into the cluster. From this point, Cassandra can start using the new DC to serve client requests. However, there is a catch. We have not populated the new nodes with data. Typically, we would do a nodetool rebuild. For this blog post we will skip that, because this situation allows some sneakiness to be observed.

Sneakiness: blocking read repairs

Without any data being put on the new nodes, we can expect no data to be actually readable from the new DC. We will go to one of the new nodes (node4) and do a read request with LOCAL_QUORUM consistency to ensure only the new DC participates in the request. After the read request we will also check the read repair statistics from nodetool, but we will set that information aside for later:

    ccm node4 cqlsh -e "CONSISTENCY LOCAL_QUORUM; SELECT * FROM sneaky.repair WHERE k ='firstKey';"
    ccm node4 nodetool netstats | grep -A 3 "Read Repair"

     k | v
    ---+---

    (0 rows)

No rows are returned as expected. Now, let’s do another read request (again from node4), this time involving at least one replica from the old DC thanks to QUORUM consistency:

    ccm node4 cqlsh -e "CONSISTENCY QUORUM; SELECT * FROM sneaky.repair WHERE k ='firstKey';"
    ccm node4 nodetool netstats | grep -A 3 "Read Repair"

     k        | v
    ----------+------------
     firstKey | firstValue

    (1 rows)

We now got a hit! This is quite unexpected because we did not run rebuild or repair meanwhile and hinted handoff and read repairs have been disabled. How come Cassandra went ahead and fixed our data anyway?

In order to shed some light onto this issue, let’s examine the nodetool netstats output from before. We should see something like this:

    # after first SELECT using LOCAL_QUORUM
    ccm node4 nodetool netstats | grep -A 3 "Read Repair"
    Read Repair Statistics:
    Attempted: 0
    Mismatch (Blocking): 0
    Mismatch (Background): 0

    # after second SELECT using QUORUM
    ccm node4 nodetool netstats | grep -A 3 "Read Repair"
    Read Repair Statistics:
    Attempted: 0
    Mismatch (Blocking): 1
    Mismatch (Background): 0

    # after third SELECT using LOCAL_QUORUM
    ccm node4 nodetool netstats | grep -A 3 "Read Repair"
    Read Repair Statistics:
    Attempted: 0
    Mismatch (Blocking): 1
    Mismatch (Background): 0

From this output we can tell that:

  • No read repairs happened (Attempted is 0).
  • One blocking read repair actually did happen (Mismatch (Blocking) is 1).
  • No background read repair happened (Mismatch (Background) is 0).

It turns out there are two read repairs that can happen:

  • A blocking read repair happens when a query can not complete with desired consistency level without actually repairing the data. read_repair_chance has no impact on this.
  • A background read repair happens in situations when a query succeeds but inconsistencies are found. This happens with read_repair_chance probability.
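
The difference between the two kinds of read repair can be sketched with a toy model in Python. This is only an illustration of the behaviour described above, not Cassandra's actual read path: the Replica class, the coordinator_read function and the way replicas are picked are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    # key -> (write timestamp, value); a missing key means "no data here"
    data: dict = field(default_factory=dict)


def coordinator_read(key, contacted, others, read_repair_chance, rng):
    """Toy read path: returns (value, stats) and repairs replicas in place."""
    stats = {"blocking": 0, "background": 0}
    answers = [r.data.get(key) for r in contacted]
    newest = max((a for a in answers if a is not None), default=None)

    if any(a != newest for a in answers):
        # The replicas needed for the consistency level disagree, so the
        # query cannot be answered without repairing them first.
        stats["blocking"] += 1
        for r in contacted:
            r.data[key] = newest
    elif rng() < read_repair_chance:
        # The query itself succeeded; with some probability the remaining
        # replicas are compared as well and fixed out of band.
        everyone = contacted + others
        all_answers = [r.data.get(key) for r in everyone]
        latest = max((a for a in all_answers if a is not None), default=None)
        if any(a != latest for a in all_answers):
            stats["background"] += 1
            for r in everyone:
                r.data[key] = latest

    return (newest[1] if newest else None), stats


if __name__ == "__main__":
    import random

    dc1 = [Replica(f"node{i}", {"firstKey": (1, "firstValue")}) for i in (1, 2, 3)]
    dc2 = [Replica(f"node{i}") for i in (4, 5, 6)]

    # LOCAL_QUORUM from the new DC: two empty replicas agree, nothing to repair.
    print(coordinator_read("firstKey", dc2[:2], dc1 + dc2[2:], 0.0, random.random))
    # QUORUM (4 of 6): old and new replicas disagree, so a blocking repair runs.
    print(coordinator_read("firstKey", dc2[:2] + dc1[:2], [dc1[2], dc2[2]], 0.0, random.random))
    # LOCAL_QUORUM again: the repaired replicas now return the row.
    print(coordinator_read("firstKey", dc2[:2], dc1 + dc2[2:], 0.0, random.random))
```

Even with read_repair_chance set to 0.0, the second read repairs the two contacted replicas in the new DC, which is the same pattern the netstats counters show above.
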

The take-away

To sum things up, it is not possible to entirely disable read repairs and Cassandra will sometimes try to fix inconsistent data for us. While this is pretty convenient, it also has some inconvenient implications. The best way to avoid any surprises is to keep the data consistent by running regular repairs.

In situations featuring non-negligible amounts of inconsistent data this sneakiness can cause a lot of unexpected load on the nodes, as well as the cross-DC network links. Having to do cross-DC reads can also introduce additional latency. Read-heavy workloads and workloads with large partitions are particularly susceptible to problems caused by blocking read repair.

One situation that guarantees a lot of inconsistent data is adding a new data center to the cluster. In that case, LOCAL_QUORUM is necessary to avoid blocking read repairs until a rebuild or a full repair is done. Using LOCAL_QUORUM is doubly important when a cluster grows beyond a single data center for the first time: with only one data center, QUORUM and LOCAL_QUORUM have virtually the same semantics, and it is easy to forget which one is actually in use.
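
The arithmetic behind this is short enough to write down. The sketch below uses the usual majority rule (half of the replicas, rounded down, plus one); the function names are hypothetical and the replication settings are the ones from the example keyspace.

```python
def quorum(replication: dict) -> int:
    """Replicas needed for QUORUM: a majority of all replicas in all DCs."""
    return sum(replication.values()) // 2 + 1


def local_quorum(replication: dict, local_dc: str) -> int:
    """Replicas needed for LOCAL_QUORUM: a majority within one DC."""
    return replication[local_dc] // 2 + 1


if __name__ == "__main__":
    before = {"dc1": 3}
    after = {"dc1": 3, "dc2": 3}
    print(quorum(before), local_quorum(before, "dc1"))  # 2 2
    print(quorum(after), local_quorum(after, "dc2"))    # 4 2
```

With a single data center both levels need two replicas. After the expansion QUORUM needs four, and since each data center only holds three, every QUORUM read has to cross to the other data center.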


Justin Mason: Links for 2017-09-13

Planet Apache - Wed, 2017-09-13 19:58

Accessibility improvements in Randa

Planet KDE - Wed, 2017-09-13 17:38

Accessibility in KDE and Qt is constantly improving. Sometimes a change of scenery helps with focus and boosts productivity. Mix that with a bunch of great people and good things will start happening. It has been possible to create accessible Qt applications on Linux for a while, but of course not everything will just work out of the box untested. A while back Mario asked me to join this year’s Randa Meeting where KDE people discuss and fix issues. It turns out that was a great idea. I haven’t been able to focus much on applications and user experience lately, but with this backdrop it works.

Upon arrival I sat down with Marco and David of Plasma fame and we quickly got Orca (the screen reader on Linux) working on their laptops. We discussed what is wrong and where improvements would be most needed for blind users to use the Plasma desktop, we got to work and had the first fixes even before lunch, wow. Adding a few accessibility hints and poking hard at keyboard navigation got us much further.

This means that KRunner – Plasma Desktop’s app launcher – is now much more accessible, a great enabler. There’s more work going into the menu and panel, hopefully we’ll see much improved keyboard usability for these by the end of the week. It’s good to see how Qt accessibility works nicely with Qt Quick.

In the afternoon, we had a round of introductions and talks. For accessibility some key points were:
– Don’t do custom stuff if you can avoid it. This includes colors and fonts, but also focus handling.
– Try running your application with keyboard only. And then mouse only.
– Make sure that focus handling actually works, then test with a screen reader.
– Oh, and focus handling. Check the order of your tab focus chain, either in Qt Designer or by running your Qt Quick application. (While I’ve lately become a big fan of the Qt Quick Designer, I don’t think it can detect all corner cases of tab key handling yet.)
– I should write and talk more about FocusScope, one of our best and most confusing friends in Qt Quick.

I sat down to poke at the systemsettings module, making it easier to debug and a bit more reliable. This morning I sat down with Ade to poke at why Calamares (the installer framework with the highest squid content ever) is not accessible. When running it as a regular user, the results were actually all in all quite OK. But it usually (for historical reasons) gets launched as root. While that may be fixed eventually, it was worth investigating what the issue really is, because, well, it should work. After a bit of poking, comparing DBus messages and then a break (Yoga and Acrobatics), we spotted that an old bug hadn’t actually been really fixed, just almost. Applications run as root would connect to the right DBus bus (AT-SPI2 runs its own bus), but we just failed to actually initialize things properly. Turns out that in this slightly different code path, we’d emit a signal in a constructor, before it was connected to the receiver… oops. The fix is making its way into Qt as we speak, so everything is looking and sounding good.

[Note: the blog post was written yesterday, but I never got around to publishing it, so everything is off by a day, just shift it back by one day. Then you can imagine how I will sit down the next evening for a bit of late night blogging and pancakes (that’s about now).]

Help the Randa Meeting and other sprints!

The post Accessibility improvements in Randa appeared first on Qt Blog.


Last week development in Elisa

Planet KDE - Wed, 2017-09-13 16:29

I have decided to try to publish a blog post, short or not so short, each week in which some development happens in the Elisa Git repository. I am inspired, amongst others, by the current posts about the development of Kube.

I have updated the wiki page about Elisa to include build instructions for Elisa. Please have a look and improve them if you can.

The following items have been pushed:

  • A fix for a memory leak when modifying the paths to be indexed by the Elisa files indexer;
  • Do not display the disc number in play list when the track is from an album with a single disc.

I am still working on the notifications and a small progress has been made for the integration of visualizations when playing music.


Python Anywhere: The PythonAnywhere newsletter, September 2017

Planet Python - Wed, 2017-09-13 13:59

Gosh, and we were doing so well. After managing a record seven of our "monthly" newsletters back in 2016, it's mid-September and we haven't sent a single one so far this year :-( Well, better late than never! Let's see what's been going on.

The PythonAnywhere API

Our API is now in public beta! Just go to the "API token" tab on the "Account" page to generate a token and get started.

You can do lots with it already:

  • Create, reload and reconfigure websites: our very own Harry has written a neat script that allows you to create a completely new Django website, with a virtualenv, using it.
  • Get links to share files from your PythonAnywhere file storage with other people
  • List your consoles, and close them.

We're planning to add API support for creating, modifying and deleting scheduled tasks very soon.

Full documentation is here. We'd love your feedback and any suggestions about what we need to add to it. Just drop us a line at support@pythonanywhere.com.
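
As a rough idea of what a call looks like, here is a small Python sketch that prepares a request to list your consoles. The URL layout and the "Token" authorization header are assumptions for the sake of the example, so please check them against the documentation; the helper name, username and token are placeholders.

```python
import urllib.request


def build_consoles_request(username: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a request that lists a user's consoles."""
    # Assumed URL layout and header format; confirm both in the API docs.
    url = f"https://www.pythonanywhere.com/api/v0/user/{username}/consoles/"
    return urllib.request.Request(url, headers={"Authorization": f"Token {token}"})


if __name__ == "__main__":
    req = build_consoles_request("yourusername", "your-api-token")
    print(req.get_method(), req.full_url)
    # To actually call the API, uncomment the lines below:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode())
```

Keeping the request construction separate from the network call makes it easy to check the URL and headers before sending anything.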

Other nifty new stuff, part 1

You might have noticed something new in that description of the API calls. You might have asked yourself "what's all this about sharing files? I don't remember anything about that."

You're quite right -- it's a new thing, you can now generate a sharing link for any file from inside the PythonAnywhere editor. Send the link to someone else, and they'll get a page allowing them to copy it into their own account. Let us know if you find it useful :-)

Other nifty stuff, part 2

Of course, no Python developer worth their salt would ever consider using an old version of the language. In particular, we definitely don't have any bits of Python 2.7 lurking in our codebase. Definitely not. Nope.

Anyway, adding Python 3.6 support was super-high priority for us -- and it went live earlier on this year.

One important thing -- it's only supported in our "dangermouse" system image. If your account was created in the last year, you're already using dangermouse, so you'll already have it. But if your account is older, and you haven't switched over yet, maybe it's time? Just drop us a line.

The inside scoop from the blog and the forums

Some new help pages

A couple of new pages from our ever-expanding collection:

New modules

Although you can install Python packages on PythonAnywhere yourself, we like to make sure that we have plenty of batteries included.

We haven't installed any new system modules for Python 2.7, 3.3, 3.4 or 3.5 recently -- but we have installed everything we thought might be useful as part of our Python 3.6 install :-)

New whitelisted sites

Paying PythonAnywhere customers get unrestricted Internet access, but if you're a free PythonAnywhere user, you may have hit problems when writing code that tries to access sites elsewhere on the Internet. We have to restrict you to sites on a whitelist to stop hackers from creating dummy accounts to hide their identities when breaking into other people's websites.

But we really do encourage you to suggest new sites that should be on the whitelist. Our rule is, if it's got an official public API, which means that the site's owners are encouraging automated access to their server, then we'll whitelist it. Just drop us a line with a link to the API docs.

We've added too many sites since our last newsletter to list them all -- but please keep them coming!

That's all for now

That's all we've got this time around. We have some big new features in the pipeline, so stay tuned! Maybe we'll even get our next newsletter out in October :-)


FSF Blogs: Only a short time left to pre-order the Talos II; pre-orders end September 15th

GNU Planet! - Wed, 2017-09-13 13:42

We wrote previously about why you should support the Talos II from Raptor Engineering. The pre-order period for the Talos II is almost over. Making a pre-order will help them to launch this much-needed system. The goal for the folks at Raptor Engineering has always been to gain Respects Your Freedom certification. We certified a lot of new devices this year, and if we want to keep seeing those numbers increase, then it is critical that we support projects like this. As we said in our last post:

The unfortunate reality is that x86 computers come encumbered with built-in low-level backdoors like the Intel Management Engine, as well as proprietary boot firmware. This means that users can't gain full control over their computers, even if they install a free operating system.

While people are currently working to overcome the Intel Management Engine problem, each new generation of Intel CPUs is a new problem. Even if the community succeeds fully with one generation, it has to start over with the next one. This is precisely why the Talos II is important. As we said previously:

For the future of free computing, we need to build and support systems that do not come with such malware pre-installed, and the Power9-based Talos II promises to be a great example of just such a system. Devices like this are the future of computing that Respects Your Freedom.

You should help make the Talos II a success by making a pre-order by September 15th. The FSF Licensing & Compliance Lab will have to do another evaluation once it is actually produced to be sure it meets our certification standards, but we have high hopes. Here is what you can do to help:



Deeson: The slow but timely death of user 1

Planet Drupal - Wed, 2017-09-13 12:26

Change is hard, but sometimes it's also for the better.

All platforms have their issues, and Drupal is no different. These quirks, known as Drupalisms, can be the source of many WTF moments for developers as the code or functionality does not work in a way they expected.

As Drupal leaves the island of doing things in its own way, one of the stowaways still onboard is user 1.

User 1 is the first Drupal user on a Drupal site with the user ID number of 1. User 1 is hardcoded to have all permissions; their access cannot be controlled through the administration interface. User 1 has all the site keys and has to be dealt with uniquely in code.

It’s time for us to kill user 1. 

In its place, all users will be treated in the same way using the standard roles and permissions model.
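
To illustrate the difference, here is a small language-neutral sketch in Python. It is not Drupal code: the role names, the permission strings and both functions are hypothetical, and only show a hardcoded bypass next to a check that relies on roles alone.

```python
from dataclasses import dataclass, field

# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS = {
    "editor": {"edit content"},
    "admin": {"edit content", "administer site"},
}


@dataclass
class User:
    uid: int
    roles: set = field(default_factory=set)


def has_permission(user: User, permission: str) -> bool:
    """Role-based check: every user is treated the same way."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in user.roles)


def has_permission_with_uid1(user: User, permission: str) -> bool:
    """The special case: user 1 bypasses roles and permissions entirely."""
    if user.uid == 1:
        return True
    return has_permission(user, permission)


if __name__ == "__main__":
    first_user = User(uid=1)
    print(has_permission_with_uid1(first_user, "administer site"))  # True
    print(has_permission(first_user, "administer site"))            # False
    first_user.roles.add("admin")
    print(has_permission(first_user, "administer site"))            # True
```

In the second model the first account only has what its roles grant, so taking the admin role away from it actually removes the access.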

Key benefits

There are several benefits, some of them rather major:

Security improvement: Once a site has been built or has proper roles defined, you can take away the admin role from all users. This ensures there are no accounts that put your entire website at risk should they be compromised.

Code stability: I had to fix a few dozen tests because they relied on user 1 being special. Those tests were not functioning properly, meaning they were not actually covering the code they should have. Removing the UID1 Drupalism will ensure our tests have to run with the right permissions defined.

Consistency: What good is an access layer if there is a special exception that can bypass everything? An example of this being a downside is a bunch of administrative local tasks (tabs) or actions ("+"-icon links) being put behind sensible access checks, only to have all gazillion of them clutter the UI for user 1 because he has god-mode haxx turned on.

Reducing the number of Drupalisms: We need to distinguish between Drupalisms that define what Drupal is and those that negatively characterize Drupal by needlessly increasing its learning curve. The special case of UID1 belongs to the latter category. There are very few systems that still have god-mode accounts. And for good reason (see above items). So let's destroy yet another barrier for outside devs to join our project.


The issue to remove user 1 has been around since 2009, so the concept isn’t new. I resurrected the issue earlier this year and it seems to be building momentum now.

If this is something that interests you, then please head over to the issue queue, read the discussions and try out the patch: https://www.drupal.org/node/540008

Let’s get this into Drupal 8.5.x!

Interested in joining our team? Deeson is hiring!


Kushal Das: Network isolation using NetVMs and VPN in Qubes

Planet Python - Wed, 2017-09-13 12:25

In this post, I am going to talk about the isolation of network for different domains using VPN on Qubes. The following shows the default network configuration in Qubes.

The network hardware is attached to a special domain called sys-net. This is the only domain which directly talks to the outside network. Then a domain named sys-firewall connects to sys-net and all other VMs use sys-firewall to access the outside network. These kinds of special domains are also known as NetVM as they can provide network access to other VMs.

Creating new NetVMs for VPN

The easiest way is to clone the existing sys-net domain to a new domain. In my case, I have created two different domains, mynetwork and vpn2 as new NetVMs in dom0.

    $ qvm-clone sys-net mynetwork
    $ qvm-clone sys-net vpn2

As the next step, I opened the settings for these VMs and marked sys-net as the NetVM for both. I have also installed the openvpn package in the templateVM so that both new NetVMs can find it.
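
The resulting chain of NetVMs can be pictured with a few lines of Python. This is only a sketch of the layout described here and does not use any Qubes API; the work domains "work" and "personal" are hypothetical names, while sys-net, sys-firewall, mynetwork and vpn2 come from the text.

```python
def network_path(vm: str, netvm_of: dict) -> list:
    """Follow NetVM links from a domain out to the one that owns the hardware."""
    path = [vm]
    while path[-1] in netvm_of:
        nxt = netvm_of[path[-1]]
        if nxt in path:
            raise ValueError(f"NetVM loop detected at {nxt}")
        path.append(nxt)
    return path


if __name__ == "__main__":
    netvm_of = {
        "sys-firewall": "sys-net",
        "mynetwork": "sys-net",
        "vpn2": "sys-net",
        "work": "mynetwork",   # hypothetical work domain
        "personal": "vpn2",    # hypothetical work domain
    }
    for vm in ("work", "personal"):
        print(" -> ".join(network_path(vm, netvm_of)))
```

Each work domain reaches the outside network only through its own VPN NetVM, which in turn goes through sys-net.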

Setting up openvpn

I am not running openvpn as a proper service, as I want to switch between the different VPN services I have access to. That also means a bit of manual work to set up the right /etc/resolv.conf file in the NetVMs and in any corresponding VMs which access the network through them.

$ sudo /usr/sbin/openvpn --config connection_service_name.ovpn

So, the final network right now looks like the following diagram. The domains (where I am doing actual work) are connected into different VPN services.


gnuastro @ Savannah: Gnuastro 0.4 released

GNU Planet! - Wed, 2017-09-13 12:07

I am happy to announce that the fourth release of Gnuastro is now available.

GNU Astronomy Utilities (Gnuastro) is an official GNU package consisting of various command-line programs and library functions for the manipulation and analysis of astronomical data. All the programs share the same basic command-line user interface for the comfort of both the users and developers. For the full list of Gnuastro's library and programs please see the links below, respectively:


The emphasis in this release has mainly been on features to improve the user experience of Gnuastro's programs. The full list of major new/changed features in this release can be seen in the NEWS file and is also appended to this announcement below [*].

Here are the compressed sources for this release:
http://ftp.gnu.org/gnu/gnuastro/gnuastro-0.4.tar.gz (4.4MB)
http://ftp.gnu.org/gnu/gnuastro/gnuastro-0.4.tar.lz (3.0MB)

Here are the GPG detached signatures[**]:

Use a mirror for higher download bandwidth (may need a day or two to sync):

Here are the MD5 and SHA1 checksums:
a5d68d008ee5de9197907a35b3002988 gnuastro-0.4.tar.gz
9b79efe278645c1510444bd42e48b83f gnuastro-0.4.tar.lz
c6113658a119a9de785b04f4baceb3f7e6560360 gnuastro-0.4.tar.gz
69317d10d13ac72fdaa627a03ed77a4e307d4cb7 gnuastro-0.4.tar.lz
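
If you prefer to script the comparison, the following Python sketch computes both digests of a downloaded tarball with the standard hashlib module and compares them to the values listed above. The helper names are made up for this example, and the file is assumed to sit in the current directory.

```python
import hashlib
import os


def file_digest(path: str, algorithm: str) -> str:
    """Return the hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def checksums_match(path: str, expected_md5: str, expected_sha1: str) -> bool:
    """True only if both the MD5 and the SHA1 digest match."""
    return (file_digest(path, "md5") == expected_md5.lower()
            and file_digest(path, "sha1") == expected_sha1.lower())


if __name__ == "__main__":
    tarball = "gnuastro-0.4.tar.gz"  # assumed to be in the current directory
    if os.path.exists(tarball):
        ok = checksums_match(
            tarball,
            "a5d68d008ee5de9197907a35b3002988",
            "c6113658a119a9de785b04f4baceb3f7e6560360",
        )
        print("checksums match" if ok else "CHECKSUM MISMATCH")
    else:
        print(f"{tarball} not found, download it first")
```

A matching checksum only shows the download is intact; the GPG signature is what ties the file to the release.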

I am very grateful to Vladimir Markelov for contributions to the code of this release and (in alphabetical order) to Marjan Akbari, Fernando Buitrago, Adrian Bunk, Antonio Diaz Diaz, Mosè Giordano, Stephen Hamer, Raúl Infante Sainz, Aurélien Jarno, Alan Lefor, Guillaume Mahler, William Pence, Ole Streicher, Ignacio Trujillo and David Valls-Gabaud for their great suggestions, help and bug reports that made this release possible.

Gnuastro 0.4 tarball was bootstrapped (built) with the following tools:

  • Texinfo 6.4
  • Autoconf 2.69
  • Automake 1.15.1
  • Libtool 2.4.6
  • Help2man 1.47.4
  • Gnulib v0.1-1593-g9d3e8e18d
  • Autoconf Archives v2017.03.21-138-g37a7575

Note that these are not installation dependencies, for those, please see


Mohammad Akhlaghi,
Postdoctoral research fellow,
Centre de Recherche Astrophysique de Lyon (CRAL),
Observatoire de Lyon. 9, Avenue Charles André,
Saint Genis Laval (69230), France.

NEWS file for this release

New features
  • All programs: `.fit' is now a recognized FITS file suffix.
  • All programs: ASCII text files (tables) created with CRLF line terminators (for example text files created in MS Windows) are now also readable as input when necessary.
  • Arithmetic: now has a new `--globalhdu' (`-g') option which can be used once for all the input images.
  • MakeNoise: with the new `--sigma' (`-s') option, it is now possible to directly request the noise sigma or standard deviation. When this option is called, the `--background', `--zeropoint' and other option values will be ignored.
  • MakeProfiles: the new `--kernel' option can make a kernel image without the need to define a catalog. With this option, a catalog (or accompanying background image) must not be given.
  • MakeProfiles: the new `--pc', `--cunit' and `--ctype' options can be used to specify the PC matrix, CUNIT and CTYPE world coordinate system keywords of the output FITS file.
  • MakeProfiles: the new `distance' profile will save the radial distance of each pixel. This may be used to define your own profiles that are not currently supported in MakeProfiles.
  • MakeProfiles: with the new `--mcolisbrightness' ("mcol-is-brightness") option, the `--mcol' values of the catalog will be interpreted as total brightness (sum of pixel values), not magnitude.
  • NoiseChisel: with the new `--dilatengb' option, it is now possible to identify the connectivity of the final dilation.
  • Library: Functions that read data from an ASCII text file (`gal_txt_table_info', `gal_txt_table_read', `gal_txt_image_read') now also operate on files with CRLF line terminators.
Changed features
  • Crop: The new `--center' option is now used to define the center of a single crop. Hence the old `--ra', `--dec', `--xc', `--yc' have been removed. This new option can take multiple values (one value for each dimension). Fractions are also acceptable.
  • Crop: The new `--width' option is now used to define the width of a single crop. Hence the old `--iwidth', `--wwidth' were removed. The units to interpret the value to the option are specified by the `--mode' option. With the new `--width' option it is also possible to define a non-square crop (different widths along each dimension). In WCS mode, its units are no longer arcseconds but are the same units of the WCS (degrees for angles). `--width' can also accept fractions. So to set a width of 5 arcseconds, you can give it a value of `5/3600' for the angular dimensions.
  • Crop: The new `--coordcol' option is now used to determine the catalog columns that define coordinates. Hence the old `--racol', `--deccol', `--xcol', and `--ycol' have been removed. This new option can be called multiple times and the order of its calling will be used for the column containing the center in the respective dimension (in FITS format).
  • MakeNoise: the old `--stdadd' (`-s') option has been renamed to `--instrumental' (`-i') to be more clear.
  • MakeProfiles: The new `--naxis' and `--shift' options can take multiple values for each dimension (separated by a comma). This replaces the old `--naxis1', `--naxis2' and `--xshift' and `--yshift' options.
  • MakeProfiles: The new `--ccol' option can take the center coordinate columns of the catalog (in multiple calls) and the new `--mode' option is used to identify what standard to interpret them in (image or WCS). Together, these replace the old `--xcol', `--ycol', `--racol' and `--deccol'.
  • MakeProfiles: The new `--crpix', `--crval' and `--cdelt' options now accept multiple values separated by a comma. So they replace the old `--crpix1', `--crpix2', `--crval1', `--crval2' and `--resolution' options.
  • `gal_data_free_contents': when the input `gal_data_t' is a tile, its `array' element will not be freed. This enables safe usage of this function (and thus `gal_data_free') on tiles without worrying about the memory block associated with the tile.
  • `gal_box_bound_ellipse' is the new name for the old `gal_box_ellipse_in_box' (to be more clear and avoid repetition of the term `box'). The input position angle is now also in degrees, not radians.
  • `gal_box_overlap' now works on data of any dimensionality and thus also needs the number of dimensions (elements in each input array).
  • `gal_box_border_from_center' now accepts an array of coordinates as one argument and the number of dimensions as another. This allows it to work on any dimensionality.
  • `gal_fits_img_info' now also returns the name and units of the dataset (if they aren't NULL). So it takes two extra arguments.
  • `gal_wcs_pixel_scale' now replaces the old `gal_wcs_pixel_scale_deg', since it doesn't only apply to degrees. The pixel scale units are defined by the units of the WCS.
  • `GAL_TILE_PARSE_OPERATE' (only when `OTHER' is given) can now parse and operate on different datasets independent of the size of their allocated blocks of memory (the tile sizes of `IN' and `OTHER' have to be identical, but not their allocated blocks of memory). Until now, it was necessary for the two blocks to have the same size; this is no longer the case.
Bug fixes
  • MakeProfiles long options on 32bit big endian systems (bug #51341).
  • Pure rotation around pixel coordinate (0,0) (bug #51353).
  • NoiseChisel segfault when no usable region for sky clumps (bug #51372).
  • Pixel scale measurement when dimension scale isn't equal or doesn't decrease (bug #51385).
  • Improper types for function code in MakeProfiles (bug #51467).
  • Crashes on 32-bit and big-endian systems (bug #51476).
  • Warp's align matrix when second dimension must be reversed (bug #51536).
  • Reading BZERO for unsigned 64-bit integers (bug #51555).
  • Arithmetic with one file and no operators (bug #51559).
  • NoiseChisel segfault when detection contains no clumps (bug #51906).
Checking integrity

Use a .sig file to verify that the corresponding file (without the .sig suffix) is intact. First, be sure to download both the .sig file and the corresponding tarball. Then, run a command like this:

If that command fails because you don't have the required public key, then run this command to import it:

and rerun the 'gpg --verify' command.

Categories: FLOSS Project Planets

Discovering South America – Qt Con Brazil

Planet KDE - Wed, 2017-09-13 11:52

A few weeks ago I attended QtCon Brasil, an event organised by Brazilian members of the KDE Community who wanted to have an outreach event to the local technology community about Qt and beyond. It was great.

It’s always refreshing to get out of your own circles to meet new people and hear what they are up to. For me, it was more notable than ever! Different culture, different people, different backgrounds, different hemisphere!

We had a variety of presentations, from the mandatory KDE Frameworks talk by Filipe to some PyQt experience by Eliakin.

And a lot more, although I didn’t understand everything, since my limited knowledge of the language consists of mapping it to Spanish or Catalan.

We got to hear about many projects in the region doing really cool stuff with Qt. From drug research and development to Point of Sale devices.
Those of us in the Free Software world are not always exposed to the large amount of development happening right before us with the same technologies. It is fundamental to keep having such events, where we learn how people create software, even if it’s in closed environments.

I got to present Kirigami. It’s a very important project for KDE and I was happy to introduce it to the audience. My impression is that the presentation was well received, and I believe that this wider community sees the value in convergence and portability as we do. Starting to deliver applications that are useful in a variety of scenarios will shed new light on how we use our computing systems.

Here you can find my slides and the examples I used.


Shirish Agarwal: Android, Android marketplace and gaming addiction.

Planet Debian - Wed, 2017-09-13 10:44

This will be a longish piece, so please bear with me and enjoy some tea, coffee, beer or anything stronger that you desire while reading below.


Stack Abuse: Differences Between .pyc, .pyd, and .pyo Python Files

Planet Python - Wed, 2017-09-13 10:21

In this article we go over the Python file types .pyc, .pyo and .pyd, and how they're used to store bytecode that will be imported by other Python programs.

You might have worked with .py files when writing Python code, but you may also want to know what these other file types do and where they come into use. To understand these, we will look at how Python transforms the code you write into instructions the machine can execute.

Bytecode and the Python Virtual Machine

Python ships with an interpreter that can be used as a REPL (read-eval-print loop), interactively, on the command line. Alternatively, you can invoke Python with scripts of Python code. In both cases, the interpreter parses your input and then compiles it into bytecode (lower-level instructions), which is then executed by a "Pythonic representation" of the computer. This Pythonic representation is called the Python virtual machine.

However, it differs enough from other virtual machines like the Java virtual machine or the Erlang virtual machine that it deserves its own study. The virtual machine, in turn, interfaces with the operating system and actual hardware to execute native machine instructions.

The critical thing to keep in mind when you see the .pyc and .pyo file types is that these are files created by the Python interpreter when it transforms code into compiled bytecode; .pyd files, as we will see, are compiled libraries rather than bytecode. Compilation of Python source into bytecode is a necessary intermediate step in the process of translating instructions from human-readable source code into machine instructions that your operating system can execute.
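To make the compilation step more concrete, here is a minimal sketch that uses only the standard library. The one-line source string is a made-up example; the built-in compile() produces a code object, dis.dis() lists its bytecode instructions, and exec() hands the code object to the virtual machine.

```python
import dis

# A tiny piece of source code to compile (a made-up example).
source = "x = 2 * 3 + 1"

# compile() turns source text into a code object that holds bytecode.
code_obj = compile(source, "<example>", "exec")

# The raw bytecode is stored as an immutable bytes string.
raw_bytecode = code_obj.co_code
print(type(raw_bytecode).__name__, len(raw_bytecode))

# dis.dis() prints a human-readable listing of the bytecode instructions.
dis.dis(code_obj)

# exec() asks the virtual machine to run the compiled code object.
namespace = {}
exec(code_obj, namespace)
print(namespace["x"])
```

The exact instructions printed by dis.dis() vary between Python versions, which is one reason bytecode files are tied to the interpreter version that wrote them.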

In the rest of this article, with this quick background on the Python virtual machine and Python bytecode in place, we'll take a look at each file type in isolation.

The .pyc File Type

We consider first the .pyc file type. Files of type .pyc are automatically generated by the interpreter when you import a module, which speeds up future importing of that module. These files are therefore only created from a .py file if it is imported by another .py file or module.

Here is an example Python module which we want to import. This module calculates factorials.

    # math_helpers.py

    # a function that computes the nth factorial, e.g. factorial(2)
    def factorial(n):
        if n == 0:
            return 1
        else:
            return n * factorial(n - 1)

    # a main function that uses our factorial function defined above
    def main():
        print("I am the factorial helper")
        print("you can call factorial(number) where number is any integer")
        print("for example, calling factorial(5) gives the result:")
        print(factorial(5))

    # this runs when the script is called from the command line
    if __name__ == '__main__':
        main()

Now, when you just run this module from the command line, using python math_helpers.py, no .pyc files get created.

Let's now import this in another module, as shown below. We are importing the factorial function from the math_helpers.py file and using it to compute the factorial of 6.

    # computations.py

    # import from the math_helpers module
    from math_helpers import factorial

    # a function that makes use of the imported function
    def main():
        print("Python can compute things easily from the REPL")
        print("for example, just write : 4 * 5")
        print("and you get: 20.")
        print("Computing things is easier when you use helpers")
        print("Here we use the factorial helper to find the factorial of 6")
        print(factorial(6))

    # this runs when the script is called from the command line
    if __name__ == '__main__':
        main()

We can run this script by invoking python computations.py at the terminal. Not only do we get the result of 6 factorial, i.e. 720, but we also notice that the interpreter automatically creates a math_helpers.pyc file. (On Python 3.2 and later, this file is written to a __pycache__ directory next to the source, with a name such as math_helpers.cpython-36.pyc.) This happens because the computations module imports the math_helpers module. To speed up the loading of the imported module in the future, the interpreter creates a bytecode file of the module.

When the source code file is updated, the .pyc file is updated as well. This happens whenever the update time for the source code differs from that of the bytecode file and ensures that the bytecode is up to date.
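The caching behaviour described above can also be triggered by hand. The following is a rough sketch, assuming Python 3 and a throwaway module written to a temporary directory (its contents are hypothetical). It uses py_compile.compile() to write the bytecode file, importlib.util.cache_from_source() to ask where the import system would look for it, and importlib.util.MAGIC_NUMBER to check the version marker at the start of the file.

```python
import importlib.util
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Write a throwaway module (hypothetical content) to a temp directory.
    source_path = os.path.join(tmp, "math_helpers.py")
    with open(source_path, "w") as handle:
        handle.write("def double(n):\n    return n * 2\n")

    # Where the import system would cache the bytecode for this source file.
    expected_path = importlib.util.cache_from_source(source_path)

    # Compile explicitly instead of waiting for an import to do it.
    pyc_path = py_compile.compile(source_path, doraise=True)

    # Every .pyc file starts with a magic number identifying the bytecode version.
    with open(pyc_path, "rb") as handle:
        magic = handle.read(4)

    pyc_name = os.path.basename(pyc_path)
    same_location = os.path.abspath(pyc_path) == os.path.abspath(expected_path)
    magic_matches = magic == importlib.util.MAGIC_NUMBER

print(pyc_name)
print(same_location, magic_matches)
```

Because the magic number changes between Python releases, a bytecode file written by one version is not reused by another; the interpreter recompiles from source instead.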

Note that using .pyc files only speeds up the loading of your program, not the actual execution of it. What this means is that you can improve startup time by writing your main program in a module that gets imported by another, smaller module. To get performance improvements more generally, however, you'll need to look into techniques like algorithm optimization and algorithmic analysis.

Because .pyc files are platform independent, they can be shared across machines of different architectures. However, if developers have different clock times on their systems, checking in the .pyc files into source control can create timestamps that are effectively in the future for others' time readings. As such, updates to source code no longer trigger changes in the bytecode. This can be a nasty bug to discover. The best way to avoid it is to add .pyc files to the ignore list in your version control system.

The .pyo File Type

The .pyo file type is also created by the interpreter when a module is imported. However, the .pyo file results from running the interpreter when optimization settings are enabled.

The optimizer is enabled by adding the "-O" flag when we invoke the Python interpreter. Here is a code example to illustrate the use of optimization. First, we have a module that defines a lambda. In Python, a lambda is just like a function, but is defined more succinctly.

    # lambdas.py

    # a lambda that returns double whatever number we pass it
    g = lambda x: x * 2

If you remember from the previous example, we will need to import this module to make use of it. In the following code listing, we import lambdas.py and make use of the g lambda.

    # using_lambdas.py

    # import the lambdas module
    import lambdas

    # a main function in which we compute the double of 7
    def main():
        print(lambdas.g(7))

    # this executes when the module is invoked as a script at the command line
    if __name__ == '__main__':
        main()

Now we come to the critical part of this example. Instead of invoking Python normally as in the last example, we will make use of optimization here. Having the optimizer enabled creates smaller bytecode files than when not using the optimizer.

To run this example using the optimizer, invoke the command:

$ python -O using_lambdas.py

Not only do we get the correct result of doubling 7, i.e. 14, as output at the command line, but we also see that a new bytecode file is automatically created for us. This file is based on the importation of lambdas.py in the invocation of using_lambdas.py. Because we had the optimizer enabled, a .pyo bytecode file is created. In this case, it is named lambdas.pyo. (Python 3.5 and later no longer write .pyo files; optimized bytecode is instead stored in a .pyc file with an opt- tag in its name, such as lambdas.cpython-36.opt-1.pyc.)

The optimizer, which doesn't do a whole lot, removes assert statements from your bytecode. The result won't be noticeable in most cases, but there may be times when you need it.
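The effect on assert statements can be observed without restarting the interpreter, because the built-in compile() accepts an optimize argument that mirrors the -O levels. This is a small sketch; the source string and the run() helper are made up for illustration.

```python
# Source with an assertion that always fails (a made-up example).
source = "assert False, 'assertions are active'\nresult = 'reached'"

def run(optimize):
    """Compile and run the source at the given optimization level."""
    namespace = {}
    try:
        exec(compile(source, "<example>", "exec", optimize=optimize), namespace)
    except AssertionError:
        return "AssertionError"
    return namespace["result"]

# Level 0 keeps assert statements; level 1 (like -O) strips them.
normal = run(0)
optimized = run(1)
print(normal)
print(optimized)
```

At level 0 the failing assertion stops the code, while at level 1 the assertion is gone from the bytecode and execution continues to the next line.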

Also note that, since a .pyo bytecode file is created, it substitutes for the .pyc file that would have been created without optimization. When the source code file is updated, the .pyo file is updated whenever the update time for the source code differs from that of the bytecode file.

The .pyd File Type

The .pyd file type, in contrast to the preceding two, is platform-specific to the Windows class of operating systems. It may thus be commonly encountered on personal and enterprise editions of Windows 10, 8, 7 and others.

In the Windows ecosystem, a .pyd file is a library file containing compiled code that other Python applications can import and call like any other Python module. In order to make this library available to other Python programs, it is packaged as a dynamic link library.

Dynamic link libraries (DLLs) are Windows code libraries that are linked to calling programs at run time. The main advantage of linking to libraries at run time like the DLLs is that it facilitates code reuse, modular architectures and faster program startup. As a result, DLLs provide a lot of functionality around the Windows operating systems.

A .pyd file is a dynamic link library that contains a Python module, or set of modules, to be called by other Python code. To create a .pyd file, you need to create a module named, for example, example.pyd. In this module, you will need to create a function named PyInit_example(). When programs call this library, they need to invoke import example, and the PyInit_example() function will run.
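Which file suffixes the import system treats as source, bytecode, or compiled extension modules can be inspected from Python itself. The sketch below relies on the suffix lists in importlib.machinery; the classify() helper is hypothetical and only illustrates how those lists might be used.

```python
import importlib.machinery
import sys

# Suffix lists used by the import system of the running interpreter.
print(sys.platform)
print(importlib.machinery.SOURCE_SUFFIXES)
print(importlib.machinery.BYTECODE_SUFFIXES)
print(importlib.machinery.EXTENSION_SUFFIXES)

def classify(filename):
    """Label a file name using the running interpreter's own suffix lists."""
    if filename.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES)):
        return "extension module"
    if filename.endswith(tuple(importlib.machinery.BYTECODE_SUFFIXES)):
        return "bytecode"
    if filename.endswith(tuple(importlib.machinery.SOURCE_SUFFIXES)):
        return "source"
    return "unknown"

print(classify("math_helpers.py"))
print(classify("math_helpers.pyc"))
print(classify("example.pyd"))
```

On Windows the extension suffix list includes .pyd, so the last call reports an extension module there; on other platforms the list typically contains .so variants instead and the same name is reported as unknown.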

For more information on creating your own Python .pyd files, check out this article.

Differences Between These File Types

While some similarities exist between these file types, there are also some big differences. For example, while the .pyc and .pyo files are similar in that they contain Python bytecode, they differ in that the .pyo files are more compact thanks to the optimizations made by the interpreter.

The third file type, the .pyd, differs from the previous two by being a dynamically-linked library to be used on the Windows operating system. The other two file types can be used on any operating system, not just Windows.

Each of these file types, however, involves code that is called and used by other Python programs.


In this article we described how each special file type, .pyc, .pyo, and .pyd, is used by the Python virtual machine for re-using code. Each file, as we saw, has its own special purposes and use-cases, whether it be to speed up module loading, speed up execution, or facilitate code re-use on certain operating systems.


FSF Events: Donald Robertson, III - "History of control: The past and future of Digital Restrictions Management" (SeaGL, Seattle, WA)

GNU Planet! - Wed, 2017-09-13 09:46

FSF licensing & compliance manager Donald Robertson, III, will be speaking at SeaGL (2017-10-06–07).

The talk will give an overview of the history of Digital Restrictions Management (DRM), from codes and passwords for ancient video games to remote attestation spyware and beyond. It will provide the listener with a perspective on the true purpose of DRM, which throughout its history has always been control over the user. While nominally DRM has something to do with copyright, in each step throughout its story we find again and again that domination of the user is its only 'success' and ultimate goal. This fact becomes glaringly obvious as we move to the era of DRM enforcement of laws, where governments try to control citizens by rewriting digital reality. The talk will conclude with what we are doing at the FSF to fight back, and what you can do to help.

Don's speech will be nontechnical, admission is gratis, and the public is encouraged to attend.

Location: Room 3178, Seattle Central College, Seattle, WA

Please fill out our contact form, so that we can contact you about future events in and around Seattle.


FSF Events: Molly de Blanc - "A division of labor: Attempting to measure free software" (SeaGL, Seattle, WA)

GNU Planet! - Wed, 2017-09-13 09:35

FSF campaigns manager Molly de Blanc will be speaking at SeaGL (2017-10-06–07).

We like to think that diversity has increased over time--contributors have stuck around as they age, students are excited to get started, initiatives are making space for people of color, trans* individuals, women, and other groups underrepresented in free software. We like to think we are doing better at recognizing the wide range of contributions and that more people are getting involved from all spheres of skill type, level, and experience. But is this true? Molly de Blanc, a free software activist with a fondness for numbers and data, analyzed the results from four community surveys from 2003, 2013, 2016, and 2017 (as well as other bits of data around the internet). With fourteen (incomplete) years of community data, she'll attempt to quantify the ways the makeup of free software has changed, where we're not doing as well as we'd like, and how we can do better.

We'll be asking (and answering) questions like:

  •   Is more non-coding work being done by women?
  •   Are people coding and also doing other things?
  •   Who is not coding? Who is doing nothing "technical?"
  •   Are we doing a good job trying to understand our community?

    Molly's speech will be nontechnical, admission is gratis, and the public is encouraged to attend.

    Location: Room 3183, Seattle Central College, Seattle, WA

    Please fill out our contact form, so that we can contact you about future events in and around Seattle.


    Shopkick Tech Blog: Shopkick at PyBay 2017

    Planet Python - Wed, 2017-09-13 08:00

    This year a bunch of us devs attended the second annual PyBay conference, proudly sponsored by, you guessed it, Shopkick! From the panel to the keynote to manning our booth, the speakers, attendees, and co-sponsors made PyBay 2017 a blast.

    This year, PyBay accepted two talks from Shopkick engineers, and we figured we should shed some extra light on the bits of Shopkick in each of them.

    The Packaging Gradient

    Up first, The Packaging Gradient, presented by principal engineer Mahmoud Hashemi (aka yours truly), dove deep into the interconnected matrix of technologies used for shipping software. It highlights the big differences between shipping libraries and applications, as well as the finer gradations within each of those.

    At Shopkick we have hundreds of internal libraries, and at the end of the day we ship dozens of Python server applications. The talk touches on container-based packaging and deployment systems, like the ones Shopkick has been using since 2011. The talk even describes a bit about how we ship hardware, as part of manufacturing the beacons used for presence detection inside of retailers.

    For more information, check out the blog post The Packaging Gradient is based on, or shoot me an email.

    Best Practices in Legacy Codebases

    For our second talk, Moving Towards Best Practices in Legacy Codebases, frameworks engineering duo Kurt Rose and Moshe Zadka bring their combined 35+ years of Python experience to bear on a nuanced-yet-practical approach to wrangling huge codebases.

    Shopkick has always been a startup, with all that entails. Years of fast-paced development and experimentation can leave quite a bit of technical debt in its wake. Now, having committed to paying off that debt, how can we successfully upgrade our codebase while minimizing business impact? This talk covers what's worked for us so far.

    Conferring conclusions

    All in all, this year's PyBay managed to outdo 2016 by a healthy margin. Polling the six of us who attended, reviews are unanimous: tutorials were a fantastic resource, and the mix of talks was just right. Some favorites were Sandy Ryza's talk on solving NP-hard problems, Paul Ganssle's talk on timezone complications, and of course, the lightning talks.

    In a repeat of last year's conference, we're talking to a couple of lead developers about hanging out with Shopkick on a longer-term basis. If you're in the Bay/Toronto and are looking to step up your development game, give us a shout!

    In any case, we couldn't be happier to attend such a great regional conference.

    Big thanks to Grace Law, SF Python, and the whole PyBay team who made it all possible. See you next year!


    Dries Buytaert: Who sponsors Drupal development? (2016-2017 edition)

    Planet Drupal - Wed, 2017-09-13 07:46

    Last year, Matthew Tift and I examined Drupal.org's commit data to understand who develops Drupal, how much of that work is sponsored, and where that sponsorship comes from. We published our analysis in a blog post called "Who Sponsors Drupal Development?". A year later, I wanted to present an update. This year’s report will also cover additional data, including gender and geographical diversity, and project sponsorship.

    Understanding how an open-source project works is important because it establishes a benchmark for project health and scalability. Scaling an open-source project is a difficult task. As an open-source project’s rate of adoption grows, the number of people that benefit from the project also increases. Often the open-source project also becomes more complex as it expands, which means that the economic reward of helping to improve the project decreases.

    A recent article on the Bitcoin and Ethereum contributor communities illustrates this disparity perfectly. Ethereum and Bitcoin have market capitalizations valued at $30 billion and $70 billion, respectively. However, both projects have fewer than 40 meaningful contributors, and contribution isn’t growing despite the rising popularity of cryptocurrency.

    According to Bitcoin's GitHub data, Bitcoin has fewer than 40 active contributors.

    According to Ethereum's GitHub data, Ethereum has fewer than 20 active contributors.

    Drupal, by comparison, has a diverse community of contributors. In the 12-month period between July 1, 2016 and June 30, 2017, we saw code contributions on Drupal.org from 7,240 different individuals and 889 different companies. This does not mean that Drupal is exempt from the challenges of scaling an open-source project. We hope that this report provides transparency about Drupal project development and gives more individuals and organizations an incentive to contribute. We will also highlight areas where our community can and should do better.

    What is the Drupal.org credit system?

    In the spring of 2015, after proposing ideas for giving credit and discussing various approaches at length, Drupal.org added the ability for people to attribute their work to an organization or customer in the Drupal.org issue queues. Maintainers of Drupal modules, themes and distributions can award issue credits to people who help resolve issues with code, translations, documentation, design and more.

    A screenshot of an issue comment on Drupal.org. You can see that jamadar worked on this patch as a volunteer, but also as part of his day job working for TATA Consultancy Services on behalf of their customer, Pfizer.

    Credits are a powerful motivator for both individuals and organizations. Accumulating credits provides individuals with a way to showcase their expertise. Organizations can utilize credits to help recruit developers or to increase their visibility in the Drupal.org marketplace.

    While the benefits are evident, it is important to note a few of the limitations in Drupal.org’s current credit system:

    • Contributing to issues on Drupal.org is not the only way to contribute. Other activities, such as sponsoring events, promoting Drupal, and providing help and mentorship, are important to the long-term health of the Drupal project. Many of these activities are not currently captured by the credit system. For this post, we chose to only look at code contributions.
    • We acknowledge that parts of Drupal are developed on GitHub and therefore aren't fully credited on Drupal.org. The actual number of contributions and contributors could be significantly higher than what we report.
    • Even when development is done on Drupal.org, the credit system is not used consistently; because using the credit system is optional, a lot of code committed on Drupal.org has no or incomplete contribution credits.
    • Not all code credits are the same. We currently don't have a way to account for the complexity and quality of contributions; one person might have worked several weeks for just one credit, while another person might receive a credit for ten minutes of work. In the future, we should consider issuing credit data in conjunction with issue priority, patch size, etc. We can also reduce the need for trivial credits by automating patch rerolls and automating coding style fixes.
    Who is working on Drupal?

    For our analysis we looked at all the issues that were marked "closed" or "fixed" in the 12-month period from July 1, 2016 to June 30, 2017. What we learned is that there were 23,238 issues marked "closed" or "fixed", a 22% increase from the 19,095 issues in the 2015-2016 period. Those 23,238 issues had 42,449 issue credits, a 30% increase from the 32,711 issue credits recorded in the previous year. Issue credits against Drupal core remained roughly the same year over year, meaning almost all of this growth came from increased activity in contributed projects. This is no surprise. Drupal development is cyclical, and during this period of the Drupal 8 development cycle, most of the Drupal community has been focused on porting modules from Drupal 7 to Drupal 8. Of the 42,449 issue credits reported this year, 20% (8,619 credits) were for Drupal core, while 80% (33,830 credits) went to contributed themes, modules and distributions.
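The percentages in the paragraph above follow directly from the raw counts it quotes. As a quick check, the sketch below recomputes them; the pct_change() helper and variable names are made up for illustration, while the numbers are the ones reported above.

```python
# Counts quoted in the paragraph above.
issues_prev, issues_now = 19095, 23238
credits_prev, credits_now = 32711, 42449
core_credits, contrib_credits = 8619, 33830

def pct_change(old, new):
    """Year-over-year change as a percentage of the old value."""
    return (new - old) / old * 100

issue_growth = pct_change(issues_prev, issues_now)
credit_growth = pct_change(credits_prev, credits_now)
core_share = core_credits / credits_now * 100
contrib_share = contrib_credits / credits_now * 100

print(f"issues closed or fixed: +{issue_growth:.1f}%")
print(f"issue credits:          +{credit_growth:.1f}%")
print(f"core share:             {core_share:.1f}%")
print(f"contributed share:      {contrib_share:.1f}%")
```

Rounded to whole numbers, these reproduce the 22%, 30%, 20% and 80% figures, and the core and contributed credits add up to the 42,449 total.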

    Compared to the previous year, we also saw an increase in both the number of people contributing and the number of organizations contributing. Drupal.org received code contributions from 7,240 different individuals and 889 different organizations.

    The number of individual contributors is up 28% year over year and the number of organizations contributing is up 26% year over year.

    While the number of individual contributors rose, a relatively small number of individuals still do the majority of the work. Approximately 47% of individual contributors received just one credit. Meanwhile, the top 30 contributors (the top 0.4%) account for over 17% of the total credits, indicating that these individuals put an incredible amount of time and effort in developing Drupal and its contributed projects:

    Rank  Username            Issues
    1     jrockowitz          537
    2     dawehner            421
    3     RenatoG             408
    4     bojanz              351
    5     Berdir              335
    6     mglaman             334
    7     Wim Leers           332
    8     alexpott            329
    9     DamienMcKenna       245
    10    jhodgdon            242
    11    drunken monkey      238
    12    naveenvalecha       196
    13    Munavijayalakshmi   192
    14    borisson_           191
    15    yongt9412           189
    16    klausi              185
    17    Sam152              184
    18    miro_dietiker       182
    19    Pavan B S           180
    20    ajay_reddy          176
    21    phenaproxima        172
    22    sanchiz             162
    23    slashrsm            161
    24    jhedstrom           155
    25    xjm                 151
    26    catch               147
    27    larowlan            145
    28    rakesh.gectcr       141
    29    benjy               139
    30    dhruveshdtripathi   138

    Out of the top 30 contributors featured, 19 were also recognized as top contributors in our 2015-2016 report. These Drupalists’ dedication and continued contribution to the project has been crucial to Drupal’s development. It’s also exciting to see 11 new names on the list. This mobility is a testament to the community’s evolution and growth.

    Next, we looked at both the gender and geographic diversity of Drupal.org code contributors. While these are only two examples of diversity, this is the only available data that contributors can choose to share on their Drupal.org profiles. The reported data shows that only 6% of the recorded contributions were made by contributors that identify as female, which indicates a steep gender gap. Like in most open-source projects, the gender imbalance in Drupal is profound and underscores the need to continue fostering diversity and inclusion in our community.

    The gender representation behind the issue credits.

    When measuring geographic diversity, we saw individual contributors from 6 different continents and 116 different countries:

    The top 20 countries from which contributions originate. The data is compiled by aggregating the countries of all individual contributors behind each commit. Note that the geographical location of contributors doesn't always correspond with the origin of their sponsorship. Wim Leers, for example, works from Belgium, but his funding comes from Acquia, which has the majority of its customers in North America.

    How much of the work is sponsored?

    Drupal is used by more than one million websites. The vast majority of the individuals and organizations behind these Drupal websites never participate in the development of the project. They might use the software as it is or might not feel the need to help drive its development. We have to provide more incentive for these individuals and organizations to contribute back to the project.

    Issue credits can be marked as "volunteer" and "sponsored" simultaneously (shown in jamadar's screenshot near the top of this post). This could be the case when a contributor does the minimum required work to satisfy the customer's need, in addition to using their spare time to add extra functionality.

    While Drupal started out as a 100% volunteer-driven project, today the majority of the code on Drupal.org is sponsored by organizations. Only 11% of the commit credits that we examined in 2016-2017 were "purely volunteer" credits (4,498 credits), in stark contrast to the 46% that were "purely sponsored". In other words, there were four times as many "purely sponsored" credits as "purely volunteer" credits.
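The "four times as many" comparison can be checked from the figures quoted above. This short sketch uses the 42,449 total issue credits reported earlier in the post; only the percentage is given for "purely sponsored" credits, so the ratio is computed from the rounded percentages rather than from raw counts.

```python
# Figures quoted in this post.
total_credits = 42449
volunteer_only_credits = 4498
sponsored_only_pct = 46  # reported as a percentage only

# Share of all credits that were "purely volunteer".
volunteer_only_pct = volunteer_only_credits / total_credits * 100

# Ratio of "purely sponsored" to "purely volunteer", using rounded shares.
ratio = sponsored_only_pct / round(volunteer_only_pct)

print(f"purely volunteer share: {volunteer_only_pct:.1f}%")
print(f"sponsored-to-volunteer ratio: {ratio:.1f}")
```

The volunteer share rounds to 11%, and the ratio comes out slightly above four.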

    A few comparisons with the 2015-2016 data:

    • The credit system is used more. Between July 1, 2015 and June 30, 2016, 37% of all credits had no attribution, while in the more recent period between July 1, 2016 and June 30, 2017, only 28% of credits lacked attribution. More people have become aware of the credit system, the attribution options, and their benefits.
    • Sponsored credits are growing faster than volunteer credits. Both "purely volunteer" and "purely sponsored" credits grew, but "purely sponsored" credits grew faster. There are two reasons why this could be the case: (1) more contributions are sponsored and (2) organizations are more likely to use the credit system compared to volunteers.

    No data is perfect, but it feels safe to conclude that most of the work on Drupal is sponsored. At the same time, the data shows that volunteer contribution remains very important to Drupal. Maybe most importantly, while the number of volunteers and sponsors has grown year over year in absolute terms, sponsored contributions appear to be growing faster than volunteer contributions. This is consistent with how open source projects grow and scale.

    Who is sponsoring the work?

    Now that we have established that most of the work on Drupal is sponsored, we want to study which organizations contribute to Drupal. While 889 different organizations contributed to Drupal, approximately 50% of them received four credits or fewer. The top 30 organizations (roughly the top 3%) account for about 48% of the total credits, which implies that the top 30 companies play a crucial role in the health of the Drupal project. The graph below shows the top 30 organizations and the number of credits they received between July 1, 2016 and June 30, 2017:

    While not immediately obvious from the graph above, different types of companies are active in Drupal's ecosystem:

    • Traditional Drupal businesses: Small-to-medium-sized professional services companies that make money primarily using Drupal. They typically employ fewer than 100 employees, and because they specialize in Drupal, many of these professional services companies contribute frequently and are a huge part of our community. Examples are Chapter Three (shown on graph) and Lullabot (shown on graph).
    • Digital marketing agencies: Larger full-service agencies that have marketing-led practices using a variety of tools, typically including Drupal, Adobe Experience Manager, Sitecore, WordPress, etc. They tend to be larger, with the larger agencies employing thousands of people. Examples are Wunderman and Mirum.
    • System integrators: Larger companies that specialize in bringing together different technologies into one solution. Example system integrators are Accenture, TATA Consultancy Services, Capgemini and CI&T.
    • Technology and infrastructure companies: Examples are Acquia (shown on graph), Lingotek, BlackMesh, Rackspace, Pantheon and Platform.sh.
    • End-users: Examples are Pfizer (shown on graph) or NBCUniversal.

    A few observations:

    • Almost all of the sponsors in the top 30 are traditional Drupal businesses. Companies like MD Systems (12 employees), Valuebound (34 employees), Chapter Three (27 employees), Commerce Guys (7 employees) and PreviousNext (20 employees) are, despite their size, critical to Drupal's success.

      It's worth highlighting MD Systems, which ranks second in the list of the top 30 contributing organizations, and is the number-one contributor among traditional Drupal businesses. What distinguishes MD Systems from most others is that it has embedded contribution into its corporate philosophy. For every commercial project, MD Systems invests 20% of that project’s value back into Drupal. They believe that using commercial projects as the foundation for community contribution leads to more meaningful and healthier contributions for Drupal and a lower total cost of ownership for their customers. This is different from other organizations, where employees are allotted a number of hours per month to contribute outside of customer-facing projects. There is no denying that MD Systems has had a tremendous impact on the Drupal community with contributions that are both frequent and impactful.

    • Compared to these traditional Drupal businesses, Acquia has nearly 800 employees and several full-time Drupal contributors. Acquia’s Office of the CTO (OCTO) works to resolve some of the most complex issues on Drupal.org, many of which are not recognized by the credit system (e.g. release management, communication, sprint organizing, and project coordination). However, I believe that Acquia should contribute even more due to our comparative size.
    • No digital marketing agencies show up in the top 30, though some of them are starting to contribute. It is exciting that an increasing number of digital marketing agencies are delivering beautiful experiences using Drupal. As a community, we need to work to ensure that each of these firms is contributing back to the project with the same commitment that we see from firms like Chapter Three, MD Systems or CI&T.
    • The only system integrator in the top 30 is CI&T, which ranked 6th with 664 credits. As far as system integrators are concerned, CI&T is a smaller player with approximately 2,500 employees. However, we do see various system integrators outside of the top 30, including Globant, Capgemini, Sapient and TATA Consultancy Services. Each of these system integrators reported 30 to 70 credits in the past year. Finally, Wipro began contributing this year with 2 credits. We expect, or hope, to see system integrators contribute more and more, especially given the number of Drupal developers they employ. Many have sizable Drupal practices with hundreds of Drupal developers, yet contributing to open source is relatively new and often not well-understood.
    • Infrastructure and software companies play an important role in our community, yet only Acquia appears in the top 30. While Acquia has a professional services division, 75% of the contributions come from the product organization (including the Office of the CTO and the Acquia Lightning team). Other contributing infrastructure companies include Pantheon and Platform.sh, which are both venture-backed platform-as-a-service companies that originated from the Drupal community. Pantheon has 17 credits and Platform.sh has 47 credits. Amazee Labs, which is building an infrastructure business, reported 51 credits. Rackspace is a public company hosting thousands of Drupal sites; it has 48 credits. Lingotek offers cloud-based translation management software and has 94 credits.
    • We saw two end-users in the top 30 corporate sponsors: Pfizer (251 credits, up from 158 credits the year before) and the German company bio.logis (212 credits). Other notable customers outside of the top 30 were Workday, Wolters Kluwer, Burda Media, University of Colorado Boulder, YMCA and OpenY, CARD.com and NBCUniversal.

    Sponsored code contributions to Drupal.org from technology and infrastructure companies. The chart does not reflect sponsored code contributions on GitHub, Drupal event sponsorship, and the many forms of value that these companies add to Drupal and other open-source communities.

    We can conclude that technology and infrastructure companies, digital marketing agencies, system integrators and end-users are not meaningfully contributing code to Drupal.org today. How can we explain this disparity in comparison to traditional Drupal businesses who contribute the most? We believe the biggest reasons are:

    1. Drupal's strategic importance. A variety of the traditional Drupal agencies have been involved with Drupal for 10 years and almost entirely depend on Drupal to support their business. Given both their expertise and dependence on Drupal, they are most likely to look after Drupal's development and well-being. These organizations are typically recognized as Drupal experts and are sought out by organizations that want to build a Drupal website. Contrast this with most of the digital marketing agencies and system integrators, which are sized to work with a diversified portfolio of content management platforms and have only recently started working with Drupal and open source. They deliver digital marketing solutions and aren't necessarily sought out for their Drupal expertise. As their Drupal practices grow in size and importance, this could change. In fact, contributing to Drupal can help grow their Drupal business because it helps their name stand out as Drupal experts and gives them a competitive edge with their customers.
    2. The level of experience with Drupal and open source. Drupal aside, many organizations have little or no experience with open source, so it is important that we motivate and teach them to contribute.
    3. Legal reservations. We recognize that some organizations are not legally permitted to contribute, let alone attribute their customers. We hope that will change as open source continues to get adopted.
    4. Tools and process barriers. Drupal contribution still involves a patch-based workflow on Drupal.org's unique issue queue system. This presents a fairly steep learning curve to most developers, who primarily work with more modern and common tools such as GitHub. Getting the code change proposal uploaded is just the first step; getting code changes accepted into an upstream Drupal project — especially Drupal core — is hard work. Peer reviews, gates such as automated testing and documentation, required sign-offs from maintainers and committers, knowledge of best practices and other community norms are a few of the challenges a contributor must face to get code accepted into Drupal.

    Consequently, this data shows that the Drupal community can do more to entice companies to contribute code to Drupal.org. The Drupal community has a long tradition of encouraging organizations to share code rather than keep it behind firewalls. While the spirit of the Drupal project cannot be reduced to any single ideology (not every organization can or will share their code), we would like to see organizations continue to prioritize collaboration over individual ownership. Our aim is not to criticize those who do not contribute, but rather to help foster an environment worthy of contribution. Given the vast number of Drupal users, we believe continuing to encourage organizations and end-users to contribute could be a big opportunity.

    There are substantial benefits and business drivers for organizations that contribute: (1) it improves their ability to sell and win deals and (2) it improves their ability to hire. Companies that contribute to Drupal tend to promote their contributions in RFPs and sales pitches. Contributing to Drupal also results in being recognized as a great place to work for Drupal experts.

    The uneasy alliance with corporate contributions

    As mentioned above, when community-driven open-source projects grow, there is a bigger need for organizations to help drive their development. It almost always creates an uneasy alliance between volunteers and corporations.

    This theory played out in the Linux community well before it played out in the Drupal community. The Linux project is 25 years old and has seen a steady increase in the number of corporate contributors for roughly 20 years. While Linux companies like Red Hat and SUSE rank high on the contribution list, so do non-Linux-centric companies such as Samsung, Intel, Oracle and Google. All of these corporate contributors are (or were) using Linux as an integral part of their business.

    The number of organizations that contribute to Drupal (889, which includes corporations) is more than four times the number of organizations that sponsor development of the Linux kernel. This is significant because Linux is considered "one of the largest cooperative software projects ever attempted". In fairness, Linux has a different ecosystem than Drupal. The Linux business ecosystem has various large organizations (Red Hat, Google, Intel, IBM and SUSE) for whom Linux is very strategic. As a result, many of them employ dozens of full-time Linux contributors and invest millions of dollars in Linux each year.

    What projects have sponsors?

    In total, the Drupal community worked on 3,183 different projects (modules, themes and distributions) in the 12-month period between July 1, 2016 and June 30, 2017. To understand where the organizations sponsoring Drupal put their money, I’ve listed the top 20 most sponsored projects:

    Rank | Project | Issues
    1 | Drupal core | 4745
    2 | Drupal Commerce (distribution) | 526
    3 | Webform | 361
    4 | Open Y (distribution) | 324
    5 | Paragraphs | 231
    6 | Inmail | 223
    7 | User guide | 218
    8 | JSON API | 204
    9 | Paragraphs collection | 200
    10 | Entity browser | 196
    11 | Diff | 190
    12 | Group | 170
    13 | Metatag | 157
    14 | Facets | 155
    15 | Commerce Point of Sale (PoS) | 147
    16 | Search API | 143
    17 | Open Social (distribution) | 133
    18 | Drupal voor Gemeenten (distribution) | 131
    19 | Solr Search | 122
    20 | Geolocation field | 118

    Who is sponsoring the top 30 contributors?

    Rank | Username | Issues | Volunteer | Sponsored | Not specified | Sponsors
    1 | jrockowitz | 537 | 88% | 45% | 9% | The Big Blue House (239), Kennesaw State University (6), Memorial Sloan Kettering Cancer Center (4)
    2 | dawehner | 421 | 67% | 83% | 5% | Chapter Three (328), Tag1 Consulting (19), Drupal Association (12), Acquia (5), Comm-press (1)
    3 | RenatoG | 408 | 0% | 100% | 0% | CI&T (408)
    4 | bojanz | 351 | 0% | 95% | 5% | Commerce Guys (335), Adapt A/S (38), Bluespark (2)
    5 | Berdir | 335 | 0% | 93% | 7% | MD Systems (310), Acquia (7)
    6 | mglaman | 334 | 3% | 97% | 1% | Commerce Guys (319), Thinkbean, LLC (48), LivePerson, Inc (46), Bluespark (22), Universal Music Group (16), Gaggle.net, Inc. (3), Bluehorn Digital (1)
    7 | Wim Leers | 332 | 14% | 87% | 2% | Acquia (290)
    8 | alexpott | 329 | 7% | 99% | 1% | Chapter Three (326), TES Global (1)
    9 | DamienMcKenna | 245 | 2% | 95% | 4% | Mediacurrent (232)
    10 | jhodgdon | 242 | 0% | 1% | 99% | Drupal Association (2), Poplar ProductivityWare (2)
    11 | drunken monkey | 238 | 95% | 11% | 1% | Acquia (17), Vizala (8), Wunder Group (1), Sunlime IT Services GmbH (1)
    12 | naveenvalecha | 196 | 74% | 55% | 1% | Acquia (152), Google Summer of Code (7), QED42 (1)
    13 | Munavijayalakshmi | 192 | 0% | 100% | 0% | Valuebound (192)
    14 | borisson_ | 191 | 66% | 39% | 22% | Dazzle (70), Acquia (6)
    15 | yongt9412 | 189 | 0% | 97% | 3% | MD Systems (183), Acquia (6)
    16 | klausi | 185 | 9% | 61% | 32% | epiqo (112)
    17 | Sam152 | 184 | 59% | 92% | 7% | PreviousNext (168), amaysim Australia Ltd. (5), Code Drop (2)
    18 | miro_dietiker | 182 | 0% | 99% | 1% | MD Systems (181)
    19 | Pavan B S | 180 | 0% | 98% | 2% | Valuebound (177)
    20 | ajay_reddy | 176 | 100% | 99% | 0% | Valuebound (180), Drupal Bangalore Community (154)
    21 | phenaproxima | 172 | 0% | 99% | 1% | Acquia (170)
    22 | sanchiz | 162 | 0% | 99% | 1% | Drupal Ukraine Community (107), Vinzon (101), FFW (60), Open Y (52)
    23 | slashrsm | 161 | 6% | 95% | 3% | MD Systems (153), Acquia (47)
    24 | jhedstrom | 155 | 4% | 92% | 4% | Phase2 (143), Workday, Inc. (134), Memorial Sloan Kettering Cancer Center (1)
    25 | xjm | 151 | 0% | 91% | 9% | Acquia (137)
    26 | catch | 147 | 3% | 83% | 16% | Third and Grove (116), Tag1 Consulting (6)
    27 | larowlan | 145 | 12% | 92% | 7% | PreviousNext (133), University of Technology, Sydney (30), amaysim Australia Ltd. (6), Australian Competition and Consumer Commission (ACCC) (1), Department of Justice & Regulation, Victoria (1)
    28 | rakesh.gectcr | 141 | 100% | 91% | 0% | Valuebound (128)
    29 | benjy | 139 | 0% | 94% | 6% | PreviousNext (129), Brisbane City Council (8), Code Drop (1)
    30 | dhruveshdtripathi | 138 | 15% | 100% | 0% | DevsAdda (138), OpenSense Labs (44)

    We observe that the top 30 contributors are sponsored by 46 organizations. This kind of diversity is aligned with our desire not to see Drupal controlled by a single organization. These top contributors and organizations are from many different parts of the world and work with customers large and small. Nonetheless, we will continue to benefit from more diversity.

    Evolving the credit system

    Like Drupal itself, the credit system on Drupal.org is an evolving tool. Ultimately, the credit system will only be useful when the community uses it, understands its shortcomings, and suggests constructive improvements. In highlighting the organizations that sponsor the development of code on Drupal.org, we hope to elicit responses that help evolve the credit system into something that incentivizes business to sponsor more work and enables more people to participate in our community, learn from others, teach newcomers and make positive contributions. Drupal is a positive force for change and we wish to use the credit system to highlight (at least some of) the work of our diverse community, which includes volunteers, companies, nonprofits, governments, schools, universities, individuals, and other groups.

    One of the challenges with the existing credit system is that it has no way of "weighting" contributions. A typo fix counts just as much as giving multiple detailed technical reviews on a critical core issue. This appears to have the effect of incentivizing organizations' employees to work on "lower-hanging fruit issues", because this bumps their companies' names in the rankings. One way to help address this might be to adjust the credit ranking algorithm to consider things such as issue priority, patch size, and so on. This could help incentivize companies to work on larger and more important problems and save coding standards improvements for new contributor sprints. Implementing a scoring system that ranks the complexity of an issue would also allow us to develop more accurate reports of contributed work.
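
    To make the idea of weighting a little more concrete, here is a minimal Python sketch of what a weighted ranking might look like. It is only an illustration: the priority weights, the patch-size cap, and the names Credit, credit_weight and rank_organizations are all hypothetical, and nothing here describes how Drupal.org's credit system works today.

```python
from dataclasses import dataclass

# Hypothetical weights per issue priority. The priority names mirror the
# Drupal.org issue queue, but the numbers are assumptions for illustration.
PRIORITY_WEIGHTS = {"minor": 0.5, "normal": 1.0, "major": 2.0, "critical": 3.0}


@dataclass(frozen=True)
class Credit:
    """One issue credit attributed to a sponsoring organization."""
    organization: str
    priority: str       # issue priority, e.g. "normal" or "critical"
    lines_changed: int  # rough proxy for patch size


def credit_weight(credit, size_cap=500):
    """Return a weighted score for a single credit.

    The size factor grows from 1.0 to 2.0 and is capped at size_cap lines,
    so that very large mechanical patches cannot dominate the ranking.
    """
    priority_factor = PRIORITY_WEIGHTS.get(credit.priority, 1.0)
    size_factor = 1.0 + min(credit.lines_changed, size_cap) / size_cap
    return priority_factor * size_factor


def rank_organizations(credits):
    """Sum weighted credits per organization, highest total first."""
    totals = {}
    for credit in credits:
        totals[credit.organization] = (
            totals.get(credit.organization, 0.0) + credit_weight(credit)
        )
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

    Under a scheme like this, an organization with three one-line fixes on minor issues would rank below an organization with a single 250-line patch on a critical issue, whereas today's unweighted count would rank them the other way around. Choosing weights that the community considers fair is the hard part, and it is not something this sketch attempts to settle.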


    Our data confirms Drupal is a vibrant community full of contributors who are constantly evolving and improving the software. While we have amazing geographic diversity, we need greater gender diversity. Our analysis of the Drupal.org credit data concludes that most contributions to Drupal are sponsored. At the same time, the data shows that volunteer contribution remains very important to Drupal.

    As a community, we need to understand that a healthy open-source ecosystem includes more than traditional Drupal businesses that contribute the most. For example, we don't see a lot of contribution from the larger digital marketing agencies, system integrators, technology companies, or end-users of Drupal — we believe that might come as these organizations build out their Drupal practices and Drupal becomes more strategic for them.

    To grow and sustain Drupal, we should support those that contribute to Drupal and find ways to get those that are not contributing involved in our community. We invite you to help us continue to strengthen our ecosystem.

    Special thanks to Tim Lehnen and Neil Drumm from the Drupal Association for providing us with the Drupal.org credit system data and for supporting us during our research. I would also like to extend a special thanks to Matthew Tift for helping to lay the foundation for this research, collaborating on last year’s blog post, and for reviewing this year’s edition. Finally, thanks to Angie Byron, Gábor Hojtsy, Jess (xjm), Preston So, Ted Bowman, Wim Leers and Gigi Anderson for providing feedback during the writing process.
