KnackForge: How to update Drupal 8 core?

Planet Drupal - Sat, 2018-03-24 00:01

Let's see how to update your Drupal site between 8.x.x minor and patch versions. For example, from 8.1.2 to 8.1.3, or from 8.3.5 to 8.4.0. I hope this will help you.

  • If you are upgrading to Drupal version x.y.z:

      x -> the major version number
      y -> the minor version number
      z -> the patch version number

Sat, 03/24/2018 - 10:31
Categories: FLOSS Project Planets

Interview with Baukje Jagersma

Planet KDE - 2 hours 4 min ago

Could you tell us something about yourself?

Hey! My name is Baukje Jagersma, I’m 22 years old and live in the Netherlands. I studied game design and recently started doing freelance, to try and make a living out of something I enjoy doing!

Do you paint professionally, as a hobby artist, or both?

Both: I’ve always enjoyed creating my own stories and worlds with drawing and recently started doing freelance work as well.

What genre(s) do you work in?

Most if not all of my work has something to do with fantasy. To me that’s the best part of drawing, you can create things that don’t exist and make them look believable. Besides that I mostly work as an illustrator and concept artist.

Whose work inspires you most — who are your role models as an artist?

There are a lot of sources where I get inspiration from, art in games for example, movies or art sites.

A few artists that are really worth mentioning would be Grzegorz Rutkowski, Ruan Jia and Piotr Jablonski.

How and when did you get to try digital painting for the first time?

Probably when I first discovered Deviantart. I was already familiar with GIMP, which I used to create photo-manipulations with. But seeing all the amazingly talented artists on there made me want to try out digital painting for myself.

What makes you choose digital over traditional painting?

I feel like traditional has more limitations and can get messy. In digital you can easily pick any color you like, or undo something that doesn’t work. For me it just works a lot faster.

How did you find out about Krita?

Somewhere around 2013-2014 when an artist posted his Krita art on a GIMP forum.

What was your first impression?

I really didn’t know where to start, haha! There were just so many more options than I was used to in GIMP, especially with all the individual brush engines. It really took me a while to get comfortable with the program.

What do you love about Krita?

Now I’ve just grown to love the multiple brush engines! The wrap-around mode, animation tool, brush smoothing options, symmetry options, assistant tool and the different layer and mask options are probably the key features that I love about it. It’s a program that just has so much to offer which makes it a lot of fun to explore with!

What do you think needs improvement in Krita? Is there anything that really annoys you?

Probably the only thing that really bugs me is the text tool, which seems to have a few weird issues right now. I’d also love to see the possibility to import and use vector layers and an alternative to the pattern brush option to make it work less repetitive (something similar to Photoshop’s dual brush perhaps).

What sets Krita apart from the other tools that you use?

Kinda mentioned it earlier already, it has a lot to offer which makes it fun to explore with! Besides that it’s available to everyone and works just as well as any other ‘professional’ digital painting program.

If you had to pick one favourite of all your work done in Krita so far, what would it be, and why?

Probably one of my few non-illustrative works. I really wanted to try out the animation tool so I decided to try out a run cycle. I had little knowledge of animating beforehand, but I like how the animation and design turned out in the end.

What techniques and brushes did you use in it?

I made a few different style concepts beforehand, from which I chose a design that I later used as a reference. I first made a sketch version of the animation which I then refined and colored. I actually made a little video about it which I posted on YouTube.

Where can people see more of your work?

Deviantart: https://baukjespirit.deviantart.com/
Artstation: https://www.artstation.com/baukjespirit
Instagram: https://www.instagram.com/baukjespirit/
Twitter: https://twitter.com/BaukjeJagersma
Youtube: https://www.youtube.com/user/baukjespirit

Anything else you’d like to share?

I’d like to thank the Krita team for developing this amazing program and making it available to everyone! I’m very excited to see how Krita will further develop in the future!


Amit Saha: Linux System Mining with Python

Planet Python - 4 hours 5 min ago

In this article, we will explore the Python programming language as a tool to retrieve various information about a system running Linux. Let's get started.

Which Python?

When I refer to Python, I am referring to CPython 2 (2.7 to be exact). I will mention it explicitly when the same code won't work with CPython 3 (3.3) and provide the alternative code, explaining the differences. Just to make sure that you have CPython installed, type python or python3 from the terminal and you should see the Python prompt displayed in your terminal.


Please note that all the programs have #!/usr/bin/env python as their first line, meaning that we want the Python interpreter to execute these scripts. Hence, if you make your script executable using chmod +x your-script.py, you can execute it using ./your-script.py (which is what you will see in this article).

Exploring the platform module

The platform module in the standard library has a number of functions which allow us to retrieve various system information. Let us start the Python interpreter and explore some of them, starting with the platform.uname() function:

    >>> import platform
    >>> platform.uname()
    ('Linux', 'fedora.echorand', '3.7.4-204.fc18.x86_64', '#1 SMP Wed Jan 23 16:44:29 UTC 2013', 'x86_64')

If you are familiar with the uname command on Linux, you will recognize that this function is an interface of sorts to this command. On Python 2, it returns a tuple consisting of the system type (or kernel type), hostname, release, version, machine hardware and processor information. You can access individual attributes using indices, like so:

    >>> platform.uname()[0]
    'Linux'

On Python 3, the function returns a named tuple:

    >>> platform.uname()
    uname_result(system='Linux', node='fedora.echorand', release='3.7.4-204.fc18.x86_64', version='#1 SMP Wed Jan 23 16:44:29 UTC 2013', machine='x86_64', processor='x86_64')

Since the returned result is a named tuple, this makes it easy to refer to individual attributes by name rather than having to remember the indices, like so:

    >>> platform.uname().system
    'Linux'

The platform module also has direct interfaces to some of the above attributes, like so:

    >>> platform.system()
    'Linux'
    >>> platform.release()
    '3.7.4-204.fc18.x86_64'

The linux_distribution() function returns details about the Linux distribution you are on. For example, on a Fedora 18 system, this command returns the following information:

    >>> platform.linux_distribution()
    ('Fedora', '18', 'Spherical Cow')

The result is returned as a tuple consisting of the distribution name, version and the code name. The distributions supported by your particular Python version can be obtained by printing the value of the _supported_dists attribute:

    >>> platform._supported_dists
    ('SuSE', 'debian', 'fedora', 'redhat', 'centos', 'mandrake', 'mandriva', 'rocks', 'slackware', 'yellowdog', 'gentoo', 'UnitedLinux', 'turbolinux')

If your Linux distribution is not one of these (or a derivative of one of these), then you will likely not see any useful information from the above function call.
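One caveat for newer interpreters: linux_distribution() was deprecated in Python 3.5 and removed in Python 3.8. A common alternative is to read /etc/os-release directly. The following is only a sketch under the assumption that your distribution ships that file; parse_os_release() is a hypothetical helper name, not part of the platform module.

```python
from __future__ import print_function


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict (hypothetical helper)."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and anything that is not KEY=value
        if not line or line.startswith('#') or '=' not in line:
            continue
        key, value = line.split('=', 1)
        # Values may be wrapped in single or double quotes
        info[key] = value.strip().strip('"\'')
    return info


if __name__ == '__main__':
    # Assumption: the distribution provides /etc/os-release
    try:
        with open('/etc/os-release') as f:
            info = parse_os_release(f.read())
        print(info.get('NAME'), info.get('VERSION_ID'))
    except IOError:
        print('/etc/os-release not found')
```

Keeping the parsing separate from the file access makes it easy to try the helper on a sample string first.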

The final function from the platform module that we will look at is architecture(). When you call it without any arguments, it returns a tuple consisting of the bit architecture and the executable format of the Python executable, like so:

    >>> platform.architecture()
    ('64bit', 'ELF')

On a 32-bit Linux system, you would see:

    >>> platform.architecture()
    ('32bit', 'ELF')

You will get similar results if you specify any other executable on the system, like so:

    >>> platform.architecture(executable='/usr/bin/ls')
    ('64bit', 'ELF')

You are encouraged to explore the other functions of the platform module which, among other things, allow you to find the Python version you are running. If you are keen to know how this module retrieves this information, the Lib/platform.py file in the Python source directory is where you should look.
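As a starting point for that exploration, here is a minimal sketch that collects a few more values from the platform module. The python_summary() helper and its key names are made up for illustration; the platform functions it calls are part of the standard library on both Python 2.7 and Python 3.

```python
from __future__ import print_function
import platform


def python_summary():
    """Collect a few interpreter and machine details from the platform module."""
    return {
        'python_version': platform.python_version(),
        'implementation': platform.python_implementation(),
        'machine': platform.machine(),
        'node': platform.node(),
    }


if __name__ == '__main__':
    for key, value in sorted(python_summary().items()):
        print('{0}: {1}'.format(key, value))
```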

The os and sys modules are also of interest for retrieving certain system attributes such as the native byte order. Next, we move beyond the Python standard library modules to explore some generic approaches to accessing the information on a Linux system made available via the proc and sysfs file systems. Note that the information made available via these file systems varies between hardware architectures, so keep that in mind while reading this article and when writing scripts which attempt to retrieve information from these files.
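Before leaving the standard library, here is a short sketch of the os and sys attributes mentioned above. The native_details() helper is a hypothetical name used only for this example; os.uname() is available on Unix-like systems such as Linux.

```python
from __future__ import print_function
import os
import sys


def native_details():
    """Return a few system attributes exposed by the os and sys modules."""
    return {
        # 'little' or 'big', depending on the native byte order
        'byteorder': sys.byteorder,
        # Platform identifier of the running interpreter
        'platform': sys.platform,
        # Kernel release, the same value platform.release() reports
        'release': os.uname()[2],
    }


if __name__ == '__main__':
    for key, value in sorted(native_details().items()):
        print('{0}: {1}'.format(key, value))
```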

CPU Information

The file /proc/cpuinfo contains information about the processing units on your system. For example, here is a Python version of what the Linux command cat /proc/cpuinfo would do:

    #! /usr/bin/env python
    """ print out the /proc/cpuinfo
        file
    """

    from __future__ import print_function

    with open('/proc/cpuinfo') as f:
        for line in f:
            print(line.rstrip('\n'))

When you execute this program using either Python 2 or Python 3, you should see all the contents of /proc/cpuinfo dumped on your screen. (In the above program, the rstrip() method removes the trailing newline character from the end of each line.)

The next code listing uses the startswith() string method to display the models of your processing units:

    #! /usr/bin/env python
    """ Print the model of your
        processing units
    """

    from __future__ import print_function

    with open('/proc/cpuinfo') as f:
        for line in f:
            # Ignore the blank line separating the information between
            # details about two processing units
            if line.strip():
                if line.rstrip('\n').startswith('model name'):
                    model_name = line.rstrip('\n').split(':')[1]
                    print(model_name)

When you run this program, you should see the model names of each of your processing units. For example, here is what I see on my computer:

    Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
    Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
    Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
    Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz

We have so far seen a couple of ways to find the architecture of the computer system we are on. To be technically correct, both those approaches actually report the architecture of the kernel your system is running. So, if your computer is actually a 64-bit computer, but is running a 32-bit kernel, then the above methods will report it as having a 32-bit architecture. To find the true architecture of the computer you can look for the lm flag in the list of flags in /proc/cpuinfo. The lm flag stands for long mode and is only present on computers with a 64-bit architecture. The next program shows how you can do this:

    #! /usr/bin/env python
    """ Find the real bit architecture
    """

    from __future__ import print_function

    with open('/proc/cpuinfo') as f:
        for line in f:
            # Ignore the blank line separating the information between
            # details about two processing units
            if line.strip():
                if line.rstrip('\n').startswith('flags') \
                        or line.rstrip('\n').startswith('Features'):
                    if 'lm' in line.rstrip('\n').split():
                        print('64-bit')
                    else:
                        print('32-bit')

As we have seen so far, it is possible to read /proc/cpuinfo and use simple text processing techniques to extract the data we are looking for. To make it friendlier for other programs to use this data, it is perhaps a better idea to make the contents of /proc/cpuinfo available as a standard data structure, such as a dictionary. The idea is simple: if you look at the contents of this file, you will find that for each processing unit there are a number of key, value pairs (in an earlier example, we printed the model name of the processor; there, model name was a key). The information about different processing units is separated by a blank line. It is simple to build a dictionary structure which has one key per processing unit. For each of these keys, the value is all the information about the corresponding processing unit present in the file /proc/cpuinfo. The next listing shows how you can do so.

    #!/usr/bin/env python
    """
    /proc/cpuinfo as a Python dict
    """
    from __future__ import print_function
    from collections import OrderedDict

    def cpuinfo():
        ''' Return the information in /proc/cpuinfo
        as a dictionary in the following format:
        cpu_info['proc0']={...}
        cpu_info['proc1']={...}
        '''

        cpuinfo=OrderedDict()
        procinfo=OrderedDict()

        nprocs = 0
        with open('/proc/cpuinfo') as f:
            for line in f:
                if not line.strip():
                    # end of one processor
                    cpuinfo['proc%s' % nprocs] = procinfo
                    nprocs=nprocs+1
                    # Reset
                    procinfo=OrderedDict()
                else:
                    if len(line.split(':')) == 2:
                        procinfo[line.split(':')[0].strip()] = line.split(':')[1].strip()
                    else:
                        procinfo[line.split(':')[0].strip()] = ''

        return cpuinfo

    if __name__=='__main__':
        cpuinfo = cpuinfo()
        for processor in cpuinfo.keys():
            print(cpuinfo[processor]['model name'])

This code uses an OrderedDict (ordered dictionary) instead of a regular dictionary so that the keys and values are stored in the order in which they are found in the file. Hence, the data for the first processing unit is followed by the data for the second processing unit and so on. If you call this function, it returns a dictionary whose keys are the processing units (proc0, proc1 and so on). You can then use these keys to sift for the information you are looking for (as demonstrated in the if __name__=='__main__' block). The above program, when run, will once again print the model name of each processing unit (as indicated by the statement print(cpuinfo[processor]['model name'])):

    Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
    Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
    Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
    Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz

Memory Information

Similar to /proc/cpuinfo, the file /proc/meminfo contains information about the main memory on your computer. The next program creates a dictionary from the contents of this file and dumps it.

    #!/usr/bin/env python

    from __future__ import print_function
    from collections import OrderedDict

    def meminfo():
        ''' Return the information in /proc/meminfo
        as a dictionary '''
        meminfo=OrderedDict()

        with open('/proc/meminfo') as f:
            for line in f:
                meminfo[line.split(':')[0]] = line.split(':')[1].strip()
        return meminfo

    if __name__=='__main__':
        #print(meminfo())

        meminfo = meminfo()
        print('Total memory: {0}'.format(meminfo['MemTotal']))
        print('Free memory: {0}'.format(meminfo['MemFree']))

As before, you can access any specific piece of information by using its name as a key (shown in the if __name__=='__main__' block). When you execute the program, you should see output similar to the following:

    Total memory: 7897012 kB
    Free memory: 249508 kB

Network Statistics

Next, we explore the network devices on our computer system. We will retrieve the network interfaces on the system and the bytes sent and received by them since the last reboot. The /proc/net/dev file makes this information available. If you examine the contents of this file, you will notice that the first two lines contain header information: the first column is the network interface name, and the second and third column groups hold the receive and transmit statistics (such as total bytes, number of packets, errors, etc.). Our interest here is to extract the total data sent and received by the different network devices. The next listing shows how we can extract this information from /proc/net/dev:

    #!/usr/bin/env python
    from __future__ import print_function
    from collections import namedtuple

    def netdevs():
        ''' RX and TX bytes for each of the network devices '''

        with open('/proc/net/dev') as f:
            net_dump = f.readlines()

        device_data={}
        data = namedtuple('data',['rx','tx'])
        for line in net_dump[2:]:
            line = line.split(':')
            if line[0].strip() != 'lo':
                device_data[line[0].strip()] = data(float(line[1].split()[0])/(1024.0*1024.0),
                                                    float(line[1].split()[8])/(1024.0*1024.0))

        return device_data

    if __name__=='__main__':

        netdevs = netdevs()
        for dev in netdevs.keys():
            print('{0}: {1} MiB {2} MiB'.format(dev, netdevs[dev].rx, netdevs[dev].tx))

When you run the above program, the output should display your network devices along with the total received and transmitted data in MiB since your last reboot, as shown below:

    em1: 0.0 MiB 0.0 MiB
    wlan0: 2651.40951061 MiB 183.173976898 MiB

You could probably couple this with a persistent data storage mechanism to write your own data usage monitoring program.
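A usage monitor of that kind would compare counters taken at two points in time. The sketch below covers only that comparison step and leaves persistent storage out; parse_net_dev() and usage_delta() are hypothetical helper names, and the one-second sampling interval is an arbitrary choice for illustration.

```python
from __future__ import print_function
import time


def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev contents."""
    counters = {}
    # The first two lines are headers
    for line in text.splitlines()[2:]:
        if ':' not in line:
            continue
        name, fields = line.split(':', 1)
        fields = fields.split()
        # Field 0 is received bytes, field 8 is transmitted bytes
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def usage_delta(before, after):
    """Bytes received and transmitted per interface between two samples."""
    delta = {}
    for name, (rx, tx) in after.items():
        if name in before:
            delta[name] = (rx - before[name][0], tx - before[name][1])
    return delta


if __name__ == '__main__':
    try:
        with open('/proc/net/dev') as f:
            first = parse_net_dev(f.read())
        time.sleep(1)  # arbitrary interval between the two samples
        with open('/proc/net/dev') as f:
            second = parse_net_dev(f.read())
        for dev, (rx, tx) in sorted(usage_delta(first, second).items()):
            print('{0}: {1} bytes received, {2} bytes sent'.format(dev, rx, tx))
    except IOError:
        print('/proc/net/dev is not available')
```

Since the kernel counters start again from zero after a reboot, a real monitor would also need to detect that case before subtracting.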


Processes

The /proc directory also contains a directory for each running process. The directory names are the same as the process IDs of these processes. Hence, if you scan /proc for all directories which have digits as their names, you will have a list of the process IDs of all the currently running processes. The function process_list() in the next listing returns such a list. The length of this list is therefore the total number of processes running on the system, as you will see when you execute the program.

    #!/usr/bin/env python
    """
    List of all process IDs currently active
    """

    from __future__ import print_function
    import os

    def process_list():

        pids = []
        for subdir in os.listdir('/proc'):
            if subdir.isdigit():
                pids.append(subdir)

        return pids

    if __name__=='__main__':

        pids = process_list()
        print('Total number of running processes:: {0}'.format(len(pids)))

The above program when executed will show an output similar to:

Total number of running processes:: 229

Each of the process directories contains a number of other files and directories which hold various information about the process, such as the command that invoked it, the shared libraries it is using, and more.
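For instance, the invoking command of a process is stored in /proc/<pid>/cmdline, with the individual arguments separated by NUL bytes. The following sketch reads it for the current process; split_cmdline() and read_cmdline() are hypothetical helper names introduced for this example.

```python
from __future__ import print_function
import os


def split_cmdline(raw):
    """Split the NUL-separated contents of /proc/<pid>/cmdline into a list."""
    return [arg.decode('utf-8', 'replace') for arg in raw.split(b'\0') if arg]


def read_cmdline(pid):
    """Return the command line of a process, or [] if it cannot be read."""
    try:
        with open('/proc/{0}/cmdline'.format(pid), 'rb') as f:
            return split_cmdline(f.read())
    except IOError:
        # The process may have exited, or we may lack permission
        return []


if __name__ == '__main__':
    print(read_cmdline(os.getpid()))
```

An empty list is also what you get for kernel threads, whose cmdline file is empty.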

The next listing, readproc.py, is a generic reader for files and directories under /proc:

    #!/usr/bin/env python
    """
    Python interface to the /proc file system.
    Although this can be used as a replacement for cat /proc/... on the
    command line, it's really aimed to be an interface to /proc for other
    Python programs.

    As long as the object you are looking for exists in /proc and is
    readable (you have permission and, if you are reading a file, its
    contents are alphanumeric), this program will find it. If it's a
    directory, it will return a list of all the files in that directory
    (and its sub-dirs) which you can then read using the same function.

    Example usage:

    Read /proc/cpuinfo:

    $ ./readproc.py proc.cpuinfo

    Read /proc/meminfo:

    $ ./readproc.py proc.meminfo

    Read /proc/cmdline:

    $ ./readproc.py proc.cmdline

    Read /proc/1/cmdline:

    $ ./readproc.py proc.1.cmdline

    Read /proc/net/dev:

    $ ./readproc.py proc.net.dev

    Comments/Suggestions:

    Amit Saha <@echorand>
    <http://echorand.me>
    """

    from __future__ import print_function
    import os
    import sys
    import re

    def toitem(path):
        """ Convert /foo/bar to foo.bar """
        path = path.lstrip('/').replace('/','.')
        return path

    def todir(item):
        """ Convert foo.bar to /foo/bar"""
        # TODO: breaks if there is a directory whose name is foo.bar (for
        # eg. conf.d/), but we don't have to worry as long as we are using
        # this for reading /proc
        return '/' + item.replace('.','/')

    def readproc(item):
        """
        Resolves proc.foo.bar items to /proc/foo/bar and returns the
        appropriate data.
        1. If it's a file, simply return the lines in this file as a list
        2. If it's a directory, return the files in this directory in the
        proc.foo.bar style as a list, so that this function can then be
        called to retrieve the contents
        """
        item = todir(item)

        if not os.path.exists(item):
            return 'Non-existent object'

        if os.path.isfile(item):
            # It's a little tricky here. We don't want to read huge binary
            # files and return the contents. We will probably not need it
            # in the usual case.
            # Utilities like 'file' on Linux and the Python interface to
            # libmagic are useless when it comes to files in /proc for
            # detecting the mime type, since these are not on-disk
            # files.
            # Searching, I found this solution which seems to be a
            # reasonable assumption. If we find a '\0' in the first 1024
            # bytes of a file, we declare it as binary and return an
            # empty string. However, some of the files in /proc which
            # contain text may also contain the null byte as a
            # constituent character.
            # Hence, I use a regular expression that matches against any
            # combination of alphanumeric characters.
            # If any of these conditions suffice, we read the file's contents

            pattern = re.compile(r'\w*')
            try:
                with open(item) as f:
                    chunk = f.read(1024)
                    if '\0' not in chunk or pattern.match(chunk) is not None:
                        f.seek(0)
                        data = f.readlines()
                        return data
                    else:
                        return '{0} is binary'.format(item)
            except IOError:
                return 'Error reading object'

        if os.path.isdir(item):
            data = []
            for dir_path, dir_name, files in os.walk(item):
                for file in files:
                    data.append(toitem(os.path.join(dir_path, file)))
            return data

    if __name__=='__main__':

        if len(sys.argv)>1:
            data = readproc(sys.argv[1])
        else:
            data = readproc('proc')

        if isinstance(data, list):
            for line in data:
                print(line)
        else:
            print(data)

Block devices

The next program lists all the block devices by reading from the sysfs virtual file system. The block devices on your system can be found in the /sys/block directory. Thus, you may have directories such as /sys/block/sda, /sys/block/sdb and so on. To find all such devices, we perform a scan of the /sys/block directory using a simple regular expression to express the block devices we are interested in finding.

    #!/usr/bin/env python
    """
    Read block device data from sysfs
    """

    from __future__ import print_function
    import glob
    import re
    import os

    # Add any other device pattern to read from
    dev_pattern = ['sd.*','mmcblk*']

    def size(device):
        nr_sectors = open(device+'/size').read().rstrip('\n')
        sect_size = open(device+'/queue/hw_sector_size').read().rstrip('\n')

        # The sect_size is in bytes, so we convert it to GiB and then send it back
        return (float(nr_sectors)*float(sect_size))/(1024.0*1024.0*1024.0)

    def detect_devs():
        for device in glob.glob('/sys/block/*'):
            for pattern in dev_pattern:
                if re.compile(pattern).match(os.path.basename(device)):
                    print('Device:: {0}, Size:: {1} GiB'.format(device, size(device)))

    if __name__=='__main__':
        detect_devs()

If you run this program, you will see output similar to the following:

    Device:: /sys/block/sda, Size:: 465.761741638 GiB
    Device:: /sys/block/mmcblk0, Size:: 3.70703125 GiB

When I ran the program, I had an SD memory card plugged in as well, and hence you can see that the program detects it. You can extend this program to recognize other block devices (such as virtual hard disks) as well.
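One way to make such an extension is to add another pattern to the list. The sketch below assumes that virtual (virtio) disks show up as vda, vdb and so on, which is common but not universal; DEV_PATTERNS and is_block_device_of_interest() are hypothetical names used only for this example.

```python
from __future__ import print_function
import re

# 'vd.*' is added on the assumption that virtio disks appear as vda, vdb, ...
DEV_PATTERNS = ['sd.*', 'mmcblk.*', 'vd.*']


def is_block_device_of_interest(name, patterns=DEV_PATTERNS):
    """Return True if a /sys/block entry name matches any of the patterns."""
    return any(re.match(pattern, name) for pattern in patterns)


if __name__ == '__main__':
    for name in ['sda', 'mmcblk0', 'vda', 'loop0', 'sr0']:
        print(name, is_block_device_of_interest(name))
```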

Building command line utilities

One ubiquitous feature of Linux command line utilities is that they allow the user to specify command line arguments to customise the default behavior of the program. The argparse module allows your program to have an interface similar to built-in Linux utilities. The next listing shows a program which retrieves all the users on your system and prints their login shells (using the pwd standard library module):

    #!/usr/bin/env python
    """
    Print all the users and their login shells
    """

    from __future__ import print_function
    import pwd

    # Get the users from /etc/passwd
    def getusers():
        users = pwd.getpwall()
        for user in users:
            print('{0}:{1}'.format(user.pw_name, user.pw_shell))

    if __name__=='__main__':
        getusers()

When you run the program above, it will print all the users on your system and their login shells.

Now, let us say that you want the program user to be able to choose whether he or she wants to see the system users (like daemon, apache). We will see a first use of the argparse module to implement this feature by extending the previous listing as follows.

    #!/usr/bin/env python
    """
    Utility to play around with users and passwords on a Linux system
    """

    from __future__ import print_function
    import pwd
    import argparse
    import os

    def read_login_defs():

        uid_min = None
        uid_max = None

        if os.path.exists('/etc/login.defs'):
            with open('/etc/login.defs') as f:
                login_data = f.readlines()

            for line in login_data:
                if line.startswith('UID_MIN'):
                    uid_min = int(line.split()[1].strip())

                if line.startswith('UID_MAX'):
                    uid_max = int(line.split()[1].strip())

        return uid_min, uid_max

    # Get the users from /etc/passwd
    def getusers(no_system=False):

        uid_min, uid_max = read_login_defs()

        if uid_min is None:
            uid_min = 1000
        if uid_max is None:
            uid_max = 60000

        users = pwd.getpwall()
        for user in users:
            if no_system:
                if user.pw_uid >= uid_min and user.pw_uid <= uid_max:
                    print('{0}:{1}'.format(user.pw_name, user.pw_shell))
            else:
                print('{0}:{1}'.format(user.pw_name, user.pw_shell))

    if __name__=='__main__':

        parser = argparse.ArgumentParser(description='User/Password Utility')

        parser.add_argument('--no-system', action='store_true',dest='no_system',
                            default = False, help='Specify to omit system users')

        args = parser.parse_args()
        getusers(args.no_system)

On executing the above program with the --help option, you will see a nice help message with the available options (and what they do):

    $ ./getusers.py --help
    usage: getusers.py [-h] [--no-system]

    User/Password Utility

    optional arguments:
      -h, --help   show this help message and exit
      --no-system  Specify to omit system users

An example invocation of the above program is as follows:

    $ ./getusers.py --no-system
    gene:/bin/bash

When you pass an invalid parameter, the program complains:

    $ ./getusers.py --param
    usage: getusers.py [-h] [--no-system]
    getusers.py: error: unrecognized arguments: --param

Let us try to understand in brief how we used argparse in the above program. The statement: parser = argparse.ArgumentParser(description='User/Password Utility') creates a new ArgumentParser object with an optional description of what this program does.

Then, we add the arguments that we want the program to recognize using the add_argument() method in the next statement: parser.add_argument('--no-system', action='store_true', dest='no_system', default = False, help='Specify to omit system users'). The first argument to this method is the name of the option that the program user will supply while invoking the program. The next parameter, action='store_true', indicates that this is a boolean option; that is, its presence or absence affects the program behavior in some way. The dest parameter specifies the name of the attribute through which the value of this option will be available to the program. If this option is not supplied by the user, the value is False, as indicated by the parameter default = False. The last parameter is the help message that the program displays about this option. Finally, the arguments are parsed using the parse_args() method: args = parser.parse_args(). Once the parsing is done, the values of the options supplied by the user can be retrieved using the syntax args.option_dest, where option_dest is the dest name that you specified while setting up the arguments. The statement getusers(args.no_system) calls the getusers() function with the value of no_system supplied by the user.
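To see the effect of action='store_true' and dest in isolation, you can hand parse_args() an explicit list of arguments instead of letting it read the real command line. This is a small sketch; the build_parser() wrapper is a hypothetical helper added only to keep the example self-contained.

```python
from __future__ import print_function
import argparse


def build_parser():
    """Create the same parser as in the listing above."""
    parser = argparse.ArgumentParser(description='User/Password Utility')
    parser.add_argument('--no-system', action='store_true', dest='no_system',
                        default=False, help='Specify to omit system users')
    return parser


if __name__ == '__main__':
    parser = build_parser()
    # Passing a list to parse_args() stands in for the command line
    print(parser.parse_args([]).no_system)               # option absent
    print(parser.parse_args(['--no-system']).no_system)  # option present
```

The first call prints False (the default) and the second prints True.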

The next program shows how you can specify options which allow the user to pass non-boolean preferences to your program. This program is a rewrite of the earlier network statistics listing, with the additional option to specify the network device you may be interested in.

    #!/usr/bin/env python
    from __future__ import print_function
    from collections import namedtuple
    import argparse

    def netdevs(iface=None):
        ''' RX and TX bytes for each of the network devices '''

        with open('/proc/net/dev') as f:
            net_dump = f.readlines()

        device_data={}
        data = namedtuple('data',['rx','tx'])
        for line in net_dump[2:]:
            line = line.split(':')
            if not iface:
                if line[0].strip() != 'lo':
                    device_data[line[0].strip()] = data(float(line[1].split()[0])/(1024.0*1024.0),
                                                        float(line[1].split()[8])/(1024.0*1024.0))
            else:
                if line[0].strip() == iface:
                    device_data[line[0].strip()] = data(float(line[1].split()[0])/(1024.0*1024.0),
                                                        float(line[1].split()[8])/(1024.0*1024.0))
        return device_data

    if __name__=='__main__':

        parser = argparse.ArgumentParser(description='Network Interface Usage Monitor')
        parser.add_argument('-i','--interface', dest='iface',
                            help='Network interface')

        args = parser.parse_args()

        netdevs = netdevs(iface = args.iface)
        for dev in netdevs.keys():
            print('{0}: {1} MiB {2} MiB'.format(dev, netdevs[dev].rx, netdevs[dev].tx))

When you execute the program without any arguments, it behaves exactly as the earlier version. However, you can also specify the network device you may be interested in. For example:

    $ ./net_devs_2.py

    em1: 0.0 MiB 0.0 MiB
    wlan0: 146.099492073 MiB 12.9737148285 MiB
    virbr1: 0.0 MiB 0.0 MiB
    virbr1-nic: 0.0 MiB 0.0 MiB

    $ ./net_devs_2.py --help
    usage: net_devs_2.py [-h] [-i IFACE]

    Network Interface Usage Monitor

    optional arguments:
      -h, --help            show this help message and exit
      -i IFACE, --interface IFACE
                            Network interface

    $ ./net_devs_2.py -i wlan0
    wlan0: 146.100307465 MiB 12.9777050018 MiB

System-wide availability of your scripts

With the help of this article, you may have been able to write one or more useful scripts for yourself which you want to use every day like any other Linux command. The easiest way to do this is to make the script executable and set up a BASH alias for it. You could also remove the .py extension and place the file in a standard location such as /usr/local/sbin.

Other useful standard library modules

Besides the standard library modules we have already looked at in this article, there are a number of other standard modules which may be useful: subprocess, ConfigParser (renamed configparser in Python 3), readline and curses.
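As a taste of the first of these, the subprocess module lets a script run an existing command and capture its output. The run() wrapper below is a hypothetical helper written for this sketch; subprocess.check_output() itself is available on both Python 2.7 and Python 3.

```python
from __future__ import print_function
import subprocess


def run(cmd):
    """Run a command given as a list of arguments and return its output as text."""
    output = subprocess.check_output(cmd)
    # check_output() returns bytes on Python 3, so decode before use
    return output.decode('utf-8', 'replace').rstrip('\n')


if __name__ == '__main__':
    # Equivalent to running `uname -r` in a shell
    print(run(['uname', '-r']))
```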

What next?

At this stage, depending on your own experience with Python and with exploring Linux internals, you may follow one of the following paths. If you have been writing a lot of shell scripts or command pipelines to explore various Linux internals, take a look at Python. If you wanted an easier way to write your own utility scripts for performing various tasks, take a look at Python. Lastly, if you have been using Python for programming of other kinds on Linux, have fun using Python for exploring Linux internals.


Shirish Agarwal: PrimeZ270-p, Intel i7400 review and Debian – 1

Planet Debian - 4 hours 42 min ago

This is going to be a biggish one as well.

This is a continuation of my last blog post.

Before diving into installation, I had been reading Matthew Garrett’s work for quite a while. Thankfully most of his blog posts do get mirrored on planet.debian.org, hence it is easy to get some idea as to what needs to be done, although I have told him (I think I even shared it here) that he should somehow make his site more easily navigable. Trying to find posts on either ‘GPT’ or ‘UEFI’ and to have those posts in ascending or descending order date-wise is not possible; at least I couldn’t find a way to do it.

The closest I could come is using ‘$keyword’ site:https://mjg59.dreamwidth.org/ via a search-engine and going through the entries shared therein. This doesn’t mean I don’t value his contribution. It is, in fact, the opposite. AFAIK he was one of the first people who drew the community’s attention when UEFI came in and only Microsoft Windows could be booted on such machines, nothing else.

I may be wrong but AFAIK he was the first one to talk about having a shim and was part of getting people to be part of the shim process.

While I’m sure Matthew’s understanding may have evolved significantly from what he had shared before, it was two specific blog posts that I had to re-read before trying to install MS-Windows and then a Debian GNU/Linux system on it.

I went over to a friend’s house, who had Windows 7 running at his end, used diskpart and did the change to GPT following a Windows TechNet article.

I had to go the GPT way as I understood that MS-Windows takes all four primary partitions for itself, leaving nothing for any other operating system to use.

I did the conversion to GPT and tried to have MS-Windows 10 as my current motherboard and all future motherboards from Intel Gen7/Gen8 onwards do not support anything less than Windows 10. I did see an unofficial patch floating on github somewhere but now have lost the reference to it. I had read some of the bug-reports of the repo. which seemed to suggest it was still a work in progress.

Now this is where it starts becoming a bit… let’s say interesting.

Now a friend/client of mine offered me a job to review MS-Windows 10, with his product keys of course. I was a bit hesitant as it had been a long time since I had worked with MS-Windows and didn’t know if I could do it or not, the other was a suspicion that I might like it too much. While I did review it, I found –

a. It is one heck of a bloatware – I had thought MS-Windows would have learned by now but no, they still have to learn that adware and bloatware aren’t solutions. I still can’t get my head wrapped around how 4.1 GB of an MS-Windows ISO gets extracted to 20 GB and I still have to install shit-loads of third-party tools to actually get anything done. Just amazed (and not in a good way).

Just to share as an example I still had to get something like Revo Uninstaller as MS-Windows even till date hasn’t learned to uninstall programs cleanly and needs a tool like that to clean the registry and other places to remove the titbits left along the way.

Edit/Update – It still doesn’t have Fall Creators Update which is still supposed to be another 4 GB+ iso which god only knows how much space that will take.

b. It’s still not gold – With all the hoopla around MS-Windows 10 that I had been hearing and seeing ads, I was under the impression that MS-Windows had turned gold i.e. it had a release like Debian would have ‘buster’ something around next year probably around or after 2019 Debconf is held. Windows 10 Microsoft would be released around July 2018, so it’s still a few months off.

c. I had read an insightful article a few years ago by a junior Microsoft employee sharing/emphasizing why MS cannot do GNU/Linux volunteer/bazaar type of development. To put it in not so many words, it came down to the cultural differences in the way the two communities operate. In GNU/Linux one more patch, one more pull request will be encouraged, and it may be integrated in that point release or, if it can’t, it would be in the next point release (unless it changes something much more core/fundamental which needs more in-depth review). MS-Windows, on the other hand, actively discourages that sort of behavior as it means more time for integration and testing, and from the sound of it MS still doesn’t do Continuous Integration (CI), regression testing etc. as is more and more common in many GNU/Linux projects.

I wish I could have shared the article but don’t have the link anymore. @Lazyweb, if you would be so kind as to help find that article. The developer had shared some sort of ssh credentials or something to prove who he was, which he later had to remove (probably) because the consequences to him for sharing that insight were not worth it, although the writings seemed to be valid.

There were many more quibbles but shared the above ones. For e.g. copying files from hdd to usb disks doesn’t tell how much time it takes, while in Debian I’ve come to see time taken for any operation as guaranteed.

Before starting on to the main issue, some info. before-hand although I don’t know how relevant or not that info. might be –

Prime Z270-P uses EFI 2.60 by American Megatrends –

/home/shirish> sudo dmesg | grep -i efi
[sudo] password for shirish:
[ 0.000000] efi: EFI v2.60 by American Megatrends

I can share more info. if needed later.
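Besides grepping dmesg, another quick check for whether the running system was booted through UEFI is the /sys/firmware/efi directory, which the Linux kernel exposes only on EFI boots. A minimal Python sketch of that check follows; the function name booted_via_uefi and its sysfs_path parameter are my own, purely for illustration.

```python
import os


def booted_via_uefi(sysfs_path="/sys/firmware/efi"):
    """Return True if the EFI directory exists under sysfs.

    The default path is the standard Linux location; the parameter exists
    only so the function can be exercised against another directory.
    """
    return os.path.isdir(sysfs_path)


if __name__ == "__main__":
    print("UEFI boot" if booted_via_uefi() else "legacy BIOS boot (or not Linux)")
```

On a machine like the one above, where dmesg reports an EFI version, this would be expected to report a UEFI boot.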

Now as I understood/interpreted info. found on the web and by experience, Microsoft makes quite a few more partitions than necessary to get MS-Windows installed.

This is how it stacks up/shows up –

> sudo fdisk -l
Disk /dev/sda: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: xxxxxxxxxxxxxxxxxxxxxxxxxxx

Device Start End Sectors Size Type
/dev/sda1 34 262177 262144 128M Microsoft reserved
/dev/sda2 264192 1185791 921600 450M Windows recovery environment
/dev/sda3 1185792 1390591 204800 100M EFI System
/dev/sda4 1390592 3718037503 3716646912 1.7T Microsoft basic data
/dev/sda5 3718037504 3718232063 194560 95M Linux filesystem
/dev/sda6 3718232064 5280731135 1562499072 745.1G Linux filesystem
/dev/sda7 5280731136 7761199103 2480467968 1.2T Linux filesystem
/dev/sda8 7761199104 7814035455 52836352 25.2G Linux swap
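The Size column in that listing follows directly from the sector counts and the 512-byte logical sector size fdisk reports. As a small worked example in Python (the helper name sectors_to_mib is made up for this sketch), the smaller partitions can be recomputed like this:

```python
SECTOR_BYTES = 512  # logical sector size reported by fdisk above


def sectors_to_mib(sectors, sector_bytes=SECTOR_BYTES):
    """Convert a sector count to mebibytes (MiB)."""
    return sectors * sector_bytes / (1024 * 1024)


# Sector counts copied from the fdisk listing above.
partitions = {
    "/dev/sda1": 262144,
    "/dev/sda2": 921600,
    "/dev/sda3": 204800,
    "/dev/sda5": 194560,
}

for device, sectors in partitions.items():
    print(device, sectors_to_mib(sectors), "MiB")
```

This gives 128, 450, 100 and 95 MiB respectively, which matches what fdisk prints for the EFI System partition (sda3) and the small Linux partition (sda5).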

I had made 2 GB for /boot in MS-Windows installer as I had thought it would take only some space and leave the rest for Debian GNU/Linux’s /boot to put its kernel entries, tools to check memory and whatever else I wanted to have on /boot/debian but for some reason I have not yet understood, that didn’t work out as I expected it to be.

Device Start End Sectors Size Type
/dev/sda1 34 262177 262144 128M Microsoft reserved
/dev/sda2 264192 1185791 921600 450M Windows recovery environment
/dev/sda3 1185792 1390591 204800 100M EFI System
/dev/sda4 1390592 3718037503 3716646912 1.7T Microsoft basic data

As seen in the above, the first four primary partitions are taken by MS-Windows itself. I just wish I had understood how to use GPT disklabels properly so I could figure out things better, but for reasons I don’t fully understand the EFI partition is a lowly 100 MB, which I suspect is where /boot is, when I asked it to be 2 GB. Is that UEFI’s doing, Microsoft’s doing or something which is a default bit, dunno. Having the EFI partition smaller hampers the way I want to do things, as will be clear in a short while from now.

After I installed MS-Windows, I installed Debian GNU/Linux using the net install method.

The following is what I had put on piece of paper as what partitions would be for GNU/Linux –

/boot – 512 MB (should be enough to accommodate a couple of kernel versions, memory checking and any other tools I might need in the future).

/ – 700 GB – well admittedly that looks insane a bit but I do like to play with new programs/binaries as and when possible and don’t want to run out of space as and when I forget to clean it up.

[off-topic, wishlist] One tool I would like to have (and dunno if it’s there) is an ability to know when I installed a package, how many times I have used it, how frequently, and the ability to add small notes or a description to the package. Many a time I have seen that the package description is either too vague or doesn’t focus on the practical usefulness of a package to me.

An easy example to share what I mean would be the apt package –

aptitude show apt
Package: apt
Version: 1.6~alpha6
Essential: yes
State: installed
Automatically installed: no
Priority: required
Section: admin
Maintainer: APT Development Team
Architecture: amd64
Uncompressed Size: 3,840 k
Depends: adduser, gpgv | gpgv2 | gpgv1, debian-archive-keyring, libapt-pkg5.0 (>= 1.6~alpha6), libc6 (>= 2.15), libgcc1 (>= 1:3.0), libgnutls30 (>= 3.5.6), libseccomp2 (>=1.0.1), libstdc++6 (>= 5.2)
Recommends: ca-certificates
Suggests: apt-doc, aptitude | synaptic | wajig, dpkg-dev (>= 1.17.2), gnupg | gnupg2 | gnupg1, powermgmt-base, python-apt
Breaks: apt-transport-https (< 1.5~alpha4~), apt-utils (< 1.3~exp2~), aptitude (< 0.8.10)
Replaces: apt-transport-https (< 1.5~alpha4~), apt-utils (< 1.3~exp2~)
Provides: apt-transport-https (= 1.6~alpha6)
Description: commandline package manager
This package provides commandline tools for searching and managing as well as querying information about packages as a low-level access to all features of the libapt-pkg library.

These include:
* apt-get for retrieval of packages and information about them from authenticated sources and for installation, upgrade and removal of packages together with their dependencies
* apt-cache for querying available information about installed as well as installable packages
* apt-cdrom to use removable media as a source for packages
* apt-config as an interface to the configuration settings
* apt-key as an interface to manage authentication keys

Now while I love all the various tools that the apt package has, I do have special fondness for $apt-cache rdepends $package

as it gives another overview of a package or library or shared library that I may be interested in and which other packages are in its orbit.

Over period of time it becomes easy/easier to forget packages that you don’t use day-to-day hence having something like such a tool would be a god-send where you can put personal notes about packages. Another could be reminders of tickets posted upstream or something on those lines. I don’t know of any tool/package which does something on those lines. [/off-topic, wishlist]

/home – 1.2 TB

swap – 25.2 GB

Admit I got a bit overboard on swap space but as and when I get more memory at least should have swap 1:1 right. I am not sure if the old rules would still apply or not.

Then I used Debian buster alpha 2 netinstall iso

https://cdimage.debian.org/cdimage/buster_di_alpha2/amd64/iso-cd/debian-buster-DI-alpha2-amd64-netinst.iso and put it on the usb stick. I did use the sha1sum to ensure that the netinstall iso was the same as the original one https://cdimage.debian.org/cdimage/buster_di_alpha2/amd64/iso-cd/SHA1SUMS

After that simply doing a dd if of was enough to copy the net install to the usb stick.
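For anyone who prefers to script the checksum step rather than eyeball the sha1sum output, here is a rough Python sketch. It assumes the SHA1SUMS file uses the usual "<hex digest>  <filename>" layout, one entry per line; the function names are my own.

```python
import hashlib


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_sha1(sums_text, filename):
    """Look up filename in SHA1SUMS-style text; return its digest or None."""
    for line in sums_text.splitlines():
        parts = line.split()
        # Binary-mode entries may carry a leading '*' before the file name.
        if len(parts) == 2 and parts[1].lstrip("*") == filename:
            return parts[0]
    return None


# Usage sketch (paths are placeholders):
#   sums = open("SHA1SUMS").read()
#   name = "debian-buster-DI-alpha2-amd64-netinst.iso"
#   print(sha1_of_file(name) == expected_sha1(sums, name))
```

Reading in chunks keeps memory use flat even for larger images; only when the two digests match is it worth writing the image to the stick with dd.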

I did have some issues with the installation which I’ll share in the next post, but the most critical issue was that I had to again make a /boot, and even though I made /boot a separate partition and gave 1 GB to it during the partitioning step, I got only 100 MB and I have no idea why it is like that.

/dev/sda5 3718037504 3718232063 194560 95M Linux filesystem

> df -h /boot
Filesystem Size Used Avail Use% Mounted on
/dev/sda5 88M 68M 14M 84% /boot

home/shirish> ls -lh /boot
total 55M
-rw-r--r-- 1 root root 193K Dec 22 19:42 config-4.14.0-2-amd64
-rw-r--r-- 1 root root 193K Jan 15 01:15 config-4.14.0-3-amd64
drwx------ 3 root root 1.0K Jan 1 1970 efi
drwxr-xr-x 5 root root 1.0K Jan 20 10:40 grub
-rw-r--r-- 1 root root 19M Jan 17 10:40 initrd.img-4.14.0-2-amd64
-rw-r--r-- 1 root root 21M Jan 20 10:40 initrd.img-4.14.0-3-amd64
drwx------ 2 root root 12K Jan 1 17:49 lost+found
-rw-r--r-- 1 root root 2.9M Dec 22 19:42 System.map-4.14.0-2-amd64
-rw-r--r-- 1 root root 2.9M Jan 15 01:15 System.map-4.14.0-3-amd64
-rw-r--r-- 1 root root 4.4M Dec 22 19:42 vmlinuz-4.14.0-2-amd64
-rw-r--r-- 1 root root 4.7M Jan 15 01:15 vmlinuz-4.14.0-3-amd64

root@debian:/boot/efi/EFI# ls -lh
total 3.0K
drwx------ 2 root root 1.0K Dec 31 21:38 Boot
drwx------ 2 root root 1.0K Dec 31 19:23 debian
drwx------ 4 root root 1.0K Dec 31 21:32 Microsoft

I would be the first to say I don’t really understand this EFI business.

The only thing I do understand is that it’s good that, even without an OS, it becomes easier to see all the components and, if you change/add any, which would or would not work. In BIOS, getting info on components was iffy at best.

There have been other issues with EFI which I may take in another blog post but for now I would be happy if somebody can share –

how to have a big /boot/ so it’s not a small partition for debian boot. I don’t see any value in having a bigger /boot for MS-Windows unless there is a way to also get grub2 pointer/header added in MS-Windows bootloader. Will share the reasons for it in the next blog post.

I am open to reinstalling both MS-Windows and Debian from scratch although that would happen when debian-buster-alpha3 arrives. Any answer to the above would give me something to try the solution and share if I get the desired result.

Looking forward to answers.

Categories: FLOSS Project Planets

Louis-Philippe Véronneau: French Gender-Neutral Translation for Roundcube

Planet Debian - 5 hours 5 min ago

Here's a quick blog post to tell the world I'm now doing a French gender-neutral translation for Roundcube.

A while ago, someone wrote on the Riseup translation list to complain about the current fr_FR translation. French is indeed a very gendered language and it is commonplace in radical spaces to use gender-neutral terminologies.

So yeah, here it is: https://github.com/baldurmen/roundcube_fr_FEM

I haven't tested the UI integration yet, but I'll do that once the Riseup folks integrate it to their Roundcube instance.

Categories: FLOSS Project Planets

PreviousNext: Managing Composer Github access with Personal Access Tokens

Planet Drupal - Sun, 2018-01-21 22:20

All PreviousNext Drupal 8 projects are now managed using Composer. This is a powerful tool, and allows our projects to define both public and private modules or libraries, and their dependencies, and bring them all together.


However, if you require public or private modules which are hosted on GitHub you may run into the API Rate Limits. In order to overcome this, it is recommended to add a GitHub personal access token to your composer configuration.


In this blog post, I'll show how you can do this in a secure and manageable way.

by Kim Pepper / 22 January 2018

It's common practice when you encounter a Drupal project to see the following snippet in a composer.json file:

"config": { "github-oauth": { "github.com": "XXXXXXXXXXXXXXXXXXXXXX" } },

What this means is, everyone is sharing a single account's personal access token. While this may be convenient, it's also a major security risk should the token accidentally be made public, or a team member leaves the organisation, and still has read/write access to your repositories.

A better approach is to have each team member configure their own personal access token locally. This ensures that individuals can only access repositories they have read permissions for, and once they leave your organisation they can no longer access any private dependencies.
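To spot projects that still carry a shared token, a small check can be run over composer.json. The following Python sketch is only an illustration (the function name find_committed_tokens is hypothetical); it looks for non-empty entries under config → github-oauth, the structure shown in the snippet above.

```python
import json


def find_committed_tokens(composer_json_text):
    """Return the hosts that have a github-oauth token set in composer.json text."""
    data = json.loads(composer_json_text)
    oauth = data.get("config", {}).get("github-oauth", {})
    # Only report hosts whose token value is actually filled in.
    return sorted(host for host, token in oauth.items() if token)


example = '{"config": {"github-oauth": {"github.com": "XXXXXXXXXXXXXXXXXXXXXX"}}}'
print(find_committed_tokens(example))
```

A non-empty result means a token is committed with the project and should be moved to each developer's local configuration.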

Step 1: Create a personal access token

Go to https://github.com/settings/tokens and generate a new token.

You will need to specify all repo scopes.

Finally, hit Generate Token to create the token.

Copy this, as we'll need it in the next step.

Step 2: Configure Composer to use your personal access token

Run the following from the command line:

composer config -g github-oauth.github.com XXXXXXXXXXXXXXXXXXXXXXX

You're all set! From now on, composer will use your own individual personal access token which is stored in $HOME/.composer/auth.json

What about Automated Testing Environments?

Fortunately, composer also accepts an environment variable COMPOSER_AUTH with a JSON-formatted string as an argument. For example:


You can simply set this environment variable in your CI Environment (e.g. CircleCI, TravisCI, Jenkins) and have a personal access token specific to the CI environment.
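As a sketch of what that variable might look like, the Python snippet below builds the JSON string, which uses the same structure as auth.json, and places it in an environment mapping. The helper name composer_auth_env and the CI_GITHUB_TOKEN variable in the usage comment are assumptions made for this example, not part of Composer.

```python
import json
import os


def composer_auth_env(token, base_env=None):
    """Return a copy of the environment with COMPOSER_AUTH set for github.com."""
    env = dict(os.environ if base_env is None else base_env)
    env["COMPOSER_AUTH"] = json.dumps({"github-oauth": {"github.com": token}})
    return env


# Usage sketch, assuming composer is on PATH and the CI system exposes the
# token in a (hypothetical) CI_GITHUB_TOKEN variable:
#   import subprocess
#   subprocess.run(["composer", "install"],
#                  env=composer_auth_env(os.environ["CI_GITHUB_TOKEN"]),
#                  check=True)
```

Keeping the token in the CI system's secret store and injecting it this way means it never has to be written into the repository.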


By using Personal Access Tokens, you can now safely remove any tokens from the project's composer.json file, removing the risk that they get exposed. You also know that once you remove access for any ex-team members, they are no longer able to access your organisation's repos using a token. Finally, in the event of a token being compromised, you have reduced the attack surface, and can more easily identify which user's token was used.


Tagged Composer, Security, Drupal Security
Categories: FLOSS Project Planets

Techiediaries - Django: Handling CORS in Django REST Framework

Planet Python - Sun, 2018-01-21 19:00

If you are building applications with Django and modern front-end/JavaScript technologies such as Angular, React or Vue, chances are that you are using two development servers: one for the back-end (running at port 8000) and another (Webpack) for your front-end application.

When sending HTTP requests from your front-end application, using the browser's fetch API, the Axios client or the jQuery $.ajax() method (a wrapper for the JavaScript XHR interface), to your back-end API built with Django REST framework the web browser will throw an error related to the Same Origin Policy.

Cross Origin Resource Sharing or CORS allows client applications to interface with APIs hosted on different domains by enabling modern web browsers to bypass the Same Origin Policy which is enforced by default.

CORS enables you to add a set of headers that tell the web browser if it's allowed to send/receive requests from domains other than the one serving the page.

You can enable CORS in Django REST framework by using a custom middleware or, better yet, by using the django-cors-headers package.

Using a Custom Middleware

First create a Django application:

python manage.py startapp app

Next you need to add a middleware file app/cors.py:

class CorsMiddleware(object):
    def process_response(self, req, resp):
        # Allow any origin to read responses from this back-end
        resp["Access-Control-Allow-Origin"] = "*"
        return resp

This will add an Access-Control-Allow-Origin: * header to every Django response, but before that you need to add it to the list of middleware classes:

MIDDLEWARE_CLASSES = (
    #...
    'app.cors.CorsMiddleware',
)

That's it, you have now enabled CORS in your Django backend. You can configure this middleware to add more fine-grained options or you can use the well-tested package django-cors-headers, which works great with Django REST framework.
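Note that process_response together with MIDDLEWARE_CLASSES is the older middleware style. On Django 1.10 and later, where the MIDDLEWARE setting is used, middleware is written as a callable that wraps get_response. Here is a sketch of the same wildcard idea in that style; the class name is illustrative and this is not the django-cors-headers implementation.

```python
class CorsMiddleware:
    """New-style middleware sketch: adds a permissive CORS header to each response."""

    def __init__(self, get_response):
        # Django passes in the next handler in the chain.
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)
        # Header assignment uses the same item syntax as the example above.
        response["Access-Control-Allow-Origin"] = "*"
        return response
```

It would be registered by its dotted path in the MIDDLEWARE list. The wildcard is only sensible for development or genuinely public APIs.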

Using django-cors-headers

Start by installing django-cors-headers using pip

pip install django-cors-headers

You need to add it to your project settings.py file:

INSTALLED_APPS = (
    ##...
    'corsheaders',
)

Next you need to add corsheaders.middleware.CorsMiddleware middleware to the middleware classes in settings.py

MIDDLEWARE_CLASSES = (
    'corsheaders.middleware.CorsMiddleware',
    'django.middleware.common.BrokenLinkEmailsMiddleware',
    'django.middleware.common.CommonMiddleware',
    #...
)

You can then either enable CORS for all domains by adding the following setting:

CORS_ORIGIN_ALLOW_ALL = True


Or only enable CORS for specified domains:

CORS_ORIGIN_ALLOW_ALL = False
CORS_ORIGIN_WHITELIST = (
    'http://localhost:8000',
)

You can find more configuration options from the docs.
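To make the difference between the two modes concrete, here is a framework-free Python sketch of the decision such settings drive. It is an illustration of the idea only, with a made-up function name, and not how django-cors-headers is actually implemented.

```python
def cors_headers_for(origin, whitelist, allow_all=False):
    """Return the CORS response headers to send for a request's Origin header."""
    if allow_all:
        return {"Access-Control-Allow-Origin": "*"}
    if origin in whitelist:
        # Echo the specific origin back; Vary tells caches the response
        # depends on the Origin request header.
        return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    # Unknown origin: send no CORS headers, so the browser blocks the read.
    return {}


whitelist = ("http://localhost:8000",)
print(cors_headers_for("http://localhost:8000", whitelist))
print(cors_headers_for("http://evil.example", whitelist))
```

With a whitelist, only listed origins get the header back, which is why it is the safer choice outside of local development.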


In this tutorial we have seen how to enable CORS headers in your Django REST framework back-end using a custom CORS middleware or the django-cors-headers package.

Categories: FLOSS Project Planets

Dirk Eddelbuettel: #15: Tidyverse and data.table, sitting side by side ... (Part 1)

Planet Debian - Sun, 2018-01-21 17:40

Welcome to the fifteenth post in the rarely rational R rambling series, or R4 for short. There are two posts I have been meaning to get out for a bit, and hope to get to shortly---but in the meantime we are going to start something else.

Another longer-running idea I had was to present some simple application cases with (one or more) side-by-side code comparisons. Why? Well at times it feels like R, and the R community, are being split. You're either with one (increasingly "religious" in their defense of their deemed-superior approach) side, or the other. And that is of course utter nonsense. It's all R after all.

Programming, just like other fields using engineering methods and thinking, is about making choices, and trading off between certain aspects. A simple example is the fairly well-known trade-off between memory use and speed: think e.g. of a hash map allowing for faster lookup at the cost of some more memory. Generally speaking, solutions are rarely limited to just one way, or just one approach. So it pays off to know your tools, and choose wisely among all available options. Having choices is having options, and those tend to have non-negative premiums to take advantage of. Locking yourself into one and just one paradigm can never be better.

In that spirit, I want to (eventually) show a few simple comparisons of code being done two distinct ways.

One obvious first candidate for this is the gunsales repository with some R code which backs an earlier NY Times article. I got involved for a similar reason, and updated the code from its initial form. Then again, this project also helped motivate what we did later with the x13binary package which permits automated installation of the X13-ARIMA-SEATS binary to support Christoph's excellent seasonal CRAN package (and website) for which we now have a forthcoming JSS paper. But the actual code example is not that interesting / a bit further off the mainstream because of the more specialised seasonal ARIMA modeling.

But then this week I found a much simpler and shorter example, and quickly converted its code. The code comes from the inaugural datascience 1 lesson at the Crosstab, a fabulous site by G. Elliot Morris (who may be the highest-energy undergrad I have come across lately) focused on political polling, forecasts, and election outcomes. Lesson 1 is a simple introduction, and averages some polls of the 2016 US Presidential Election.

Complete Code using Approach "TV"

Elliot does a fine job walking the reader through his code so I will be brief and simply quote it in one piece:

## Getting the polls

library(readr)
polls_2016 <- read_tsv(url("http://elections.huffingtonpost.com/pollster/api/v2/questions/16-US-Pres-GE%20TrumpvClinton/poll-responses-clean.tsv"))

## Wrangling the polls

library(dplyr)
polls_2016 <- polls_2016 %>%
    filter(sample_subpopulation %in% c("Adults","Likely Voters","Registered Voters"))
library(lubridate)
polls_2016 <- polls_2016 %>%
    mutate(end_date = ymd(end_date))
polls_2016 <- polls_2016 %>%
    right_join(data.frame(end_date = seq.Date(min(polls_2016$end_date),
                                              max(polls_2016$end_date), by="days")))

## Average the polls

polls_2016 <- polls_2016 %>%
    group_by(end_date) %>%
    summarise(Clinton = mean(Clinton),
              Trump = mean(Trump))

library(zoo)
rolling_average <- polls_2016 %>%
    mutate(Clinton.Margin = Clinton-Trump,
           Clinton.Avg = rollapply(Clinton.Margin,width=14,
                                   FUN=function(x){mean(x, na.rm=TRUE)},
                                   by=1, partial=TRUE, fill=NA, align="right"))

library(ggplot2)
ggplot(rolling_average)+
    geom_line(aes(x=end_date,y=Clinton.Avg),col="blue") +
    geom_point(aes(x=end_date,y=Clinton.Margin))

It uses five packages to i) read some data off them interwebs, ii) filter / subset / modify it, iii) do a right (outer) join with itself, iv) average per-day polls first and then create rolling averages over 14 days, before v) plotting. Several standard verbs are used: filter(), mutate(), right_join(), group_by(), and summarise(). One non-verse function is rollapply() which comes from zoo, a popular package for time-series data.
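For readers less familiar with zoo, the rolling step can be paraphrased outside R. The Python sketch below is my own restatement of what a right-aligned, partial-window rolling mean that skips missing values computes; it is not zoo's implementation, and the function name is made up.

```python
import math


def rolling_mean_right(values, width=14):
    """Right-aligned rolling mean over up to `width` trailing values.

    Early positions use a shorter (partial) window, and missing values
    (None or NaN) are skipped, similar in spirit to partial=TRUE with
    mean(x, na.rm=TRUE) in the R code above.
    """
    result = []
    for i in range(len(values)):
        window = values[max(0, i - width + 1): i + 1]
        present = [v for v in window if v is not None and not math.isnan(v)]
        result.append(sum(present) / len(present) if present else math.nan)
    return result


print(rolling_mean_right([1.0, 2.0, None, 4.0], width=2))
```

Days without any poll therefore do not break the average: they simply contribute nothing to their window.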

Complete Code using Approach "DT"

As I will show below, we can do the same with fewer packages as data.table covers the reading, slicing/dicing and time conversion. We still need zoo for its rollapply() and of course the same plotting code:

## Getting the polls

library(data.table)
pollsDT <- fread("http://elections.huffingtonpost.com/pollster/api/v2/questions/16-US-Pres-GE%20TrumpvClinton/poll-responses-clean.tsv")

## Wrangling the polls

pollsDT <- pollsDT[sample_subpopulation %in% c("Adults","Likely Voters","Registered Voters"), ]
pollsDT[, end_date := as.IDate(end_date)]
pollsDT <- pollsDT[ data.table(end_date = seq(min(pollsDT[,end_date]),
                                              max(pollsDT[,end_date]), by="days")), on="end_date"]

## Average the polls

library(zoo)
pollsDT <- pollsDT[, .(Clinton=mean(Clinton), Trump=mean(Trump)), by=end_date]
pollsDT[, Clinton.Margin := Clinton-Trump]
pollsDT[, Clinton.Avg := rollapply(Clinton.Margin, width=14,
                                   FUN=function(x){mean(x, na.rm=TRUE)},
                                   by=1, partial=TRUE, fill=NA, align="right")]

library(ggplot2)
ggplot(pollsDT) +
    geom_line(aes(x=end_date,y=Clinton.Avg),col="blue") +
    geom_point(aes(x=end_date,y=Clinton.Margin))

This uses several of the components of data.table which are often called [i, j, by=...]. Rows are selected (i), columns are either modified (via := assignment) or summarised (via =), and grouping is undertaken by by=.... The outer join is done by having a data.table object indexed by another, and is pretty standard too. That allows us to do all transformations in three lines. We then create per-day averages by grouping by day, compute the margin and construct its rolling average as before. The resulting chart is, unsurprisingly, the same.
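The join-then-group step is the part that tends to puzzle newcomers: joining against a complete sequence of days guarantees one entry per calendar day, with a missing value where no poll ended. A language-neutral paraphrase in Python, using made-up sample data and a hypothetical function name, might look like this:

```python
from datetime import date, timedelta


def daily_means(rows):
    """rows: iterable of (day, value). Return [(day, mean or None)] for every
    day between the first and last observed day, inclusive."""
    by_day = {}
    for day, value in rows:
        by_day.setdefault(day, []).append(value)
    first, last = min(by_day), max(by_day)
    result = []
    day = first
    while day <= last:
        values = by_day.get(day)
        # Days with no observation survive as None, like NA rows after the join.
        result.append((day, sum(values) / len(values) if values else None))
        day += timedelta(days=1)
    return result


sample = [(date(2016, 1, 1), 2.0), (date(2016, 1, 1), 4.0), (date(2016, 1, 3), 5.0)]
for day, mean in daily_means(sample):
    print(day, mean)
```

The gap day (2 January in the sample) is kept with a missing value, which is exactly what lets the later rolling average run over a regular daily grid.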

Benchmark Reading

We can look at how the two approaches do on getting data read into our session. For simplicity, we will read a local file to keep the (fixed) download aspect out of it:

R> url <- "http://elections.huffingtonpost.com/pollster/api/v2/questions/16-US-Pres-GE%20TrumpvClinton/poll-responses-clean.tsv"
R> file <- "/tmp/poll-responses-clean.tsv"
R> download.file(url, destfile=file, quiet=TRUE)
R> res <- microbenchmark(tidy=suppressMessages(readr::read_tsv(file)),
+                       dt=data.table::fread(file, showProgress=FALSE))
R> res
Unit: milliseconds
 expr     min      lq    mean  median      uq      max neval
 tidy 6.67777 6.83458 7.13434 6.98484 7.25831  9.27452   100
   dt 1.98890 2.04457 2.37916 2.08261 2.14040 28.86885   100
R>

That is a clear relative difference, though the absolute amount of time is not that relevant for such a small (demo) dataset.

Benchmark Processing

We can also look at the processing part:

R> rdin <- suppressMessages(readr::read_tsv(file))
R> dtin <- data.table::fread(file, showProgress=FALSE)
R>
R> library(dplyr)
R> library(lubridate)
R> library(zoo)
R>
R> transformTV <- function(polls_2016=rdin) {
+     polls_2016 <- polls_2016 %>%
+         filter(sample_subpopulation %in% c("Adults","Likely Voters","Registered Voters"))
+     polls_2016 <- polls_2016 %>%
+         mutate(end_date = ymd(end_date))
+     polls_2016 <- polls_2016 %>%
+         right_join(data.frame(end_date = seq.Date(min(polls_2016$end_date),
+                                                   max(polls_2016$end_date), by="days")))
+     polls_2016 <- polls_2016 %>%
+         group_by(end_date) %>%
+         summarise(Clinton = mean(Clinton),
+                   Trump = mean(Trump))
+
+     rolling_average <- polls_2016 %>%
+         mutate(Clinton.Margin = Clinton-Trump,
+                Clinton.Avg = rollapply(Clinton.Margin,width=14,
+                                        FUN=function(x){mean(x, na.rm=TRUE)},
+                                        by=1, partial=TRUE, fill=NA, align="right"))
+ }
R>
R> transformDT <- function(dtin) {
+     pollsDT <- copy(dtin) ## extra work to protect from reference semantics for benchmark
+     pollsDT <- pollsDT[sample_subpopulation %in% c("Adults","Likely Voters","Registered Voters"), ]
+     pollsDT[, end_date := as.IDate(end_date)]
+     pollsDT <- pollsDT[ data.table(end_date = seq(min(pollsDT[,end_date]),
+                                                   max(pollsDT[,end_date]), by="days")), on="end_date"]
+     pollsDT <- pollsDT[, .(Clinton=mean(Clinton), Trump=mean(Trump)),
+                        by=end_date][, Clinton.Margin := Clinton-Trump]
+     pollsDT[, Clinton.Avg := rollapply(Clinton.Margin, width=14,
+                                        FUN=function(x){mean(x, na.rm=TRUE)},
+                                        by=1, partial=TRUE, fill=NA, align="right")]
+ }
R>
R> res <- microbenchmark(tidy=suppressMessages(transformTV(rdin)),
+                       dt=transformDT(dtin))
R> res
Unit: milliseconds
 expr      min       lq     mean   median       uq      max neval
 tidy 12.54723 13.18643 15.29676 13.73418 14.71008 104.5754   100
   dt  7.66842  8.02404  8.60915  8.29984  8.72071  17.7818   100
R>

Not quite a factor of two on the small data set, but again a clear advantage. data.table has a reputation for doing really well for large datasets; here we see that it is also faster for small datasets.
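To put numbers on "a clear relative difference" and "not quite a factor of two", the median timings from the two benchmark tables can be divided directly. A tiny Python calculation, with the figures copied from the output above:

```python
# Median timings in milliseconds, copied from the two microbenchmark tables.
median_ms = {
    "reading": {"tidy": 6.98484, "dt": 2.08261},
    "processing": {"tidy": 13.73418, "dt": 8.29984},
}


def speedup(timings):
    """Ratio of the tidy median to the data.table median."""
    return timings["tidy"] / timings["dt"]


for step, timings in median_ms.items():
    print(f"{step}: {speedup(timings):.2f}x")
```

That works out to roughly 3.35x for reading and 1.65x for processing on this small dataset.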


Stripping the reading, as well as the plotting both of which are about the same, we can compare the essential data operations.


We found a simple task solved using code and packages from an increasingly popular sub-culture within R, and contrasted it with a second approach. We find the second approach to i) have fewer dependencies, ii) need less code, and iii) run faster.

Now, undoubtedly the former approach will have its staunch defenders (and that is all good and well, after all choice is good and even thirty years later some still debate vi versus emacs endlessly) but I thought it instructive to at least be able to make an informed comparison.


My thanks to G. Elliot Morris for a fine example, and of course a fine blog and (if somewhat hyperactive) Twitter account.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: FLOSS Project Planets

Sandipan Dey: Recursive Graphics, Bilinear Interpolation and Image Transformation in Python

Planet Python - Sun, 2018-01-21 15:52
The following problem appeared in an assignment in the Princeton course COS 126 . The problem description is taken from the course itself. Recursive Graphics Write a program that plots a Sierpinski triangle, as illustrated below. Then develop a program that plots a recursive patterns of your own design. Part 1.  The Sierpinski triangle is an example of … Continue reading Recursive Graphics, Bilinear Interpolation and Image Transformation in Python
Categories: FLOSS Project Planets

fluffy.pro. Drupal Developer's blog: Monolog: namespaced logger?

Planet Drupal - Sun, 2018-01-21 14:47
Using the monolog library and the monolog-cascade extension you can't configure "namespaced" loggers. What does that mean? Imagine you have tons of classes and you need to log information from them into a log file. There is nothing special in this. Just define loggers with the needed handler(s) and instantiate them directly in the place where you want to use them, with the help of monolog-cascade. It means that in your monolog-cascade config file you have to define the needed loggers in advance and you have to reference them by their names. But what if you need an additional logger (with absolutely different handlers/processors) for some of the classes? Will you go through all the classes and change logger names where you instantiate them? I think it doesn't look like a good idea when a small requirement (for instance, changing the log file name for records from a bunch of classes) leads to edits in application code. It's something that must be configurable and that's why I decided to write a tiny library called monolog-cascade-namespaced.
Read more »
Categories: FLOSS Project Planets

Artem Golubin: Understanding internals of Python classes

Planet Python - Sun, 2018-01-21 10:14

The goal of this series is to describe internals and general concepts behind the class object in Python 3.6. In this part, I will explain how Python stores and looks up attributes. I assume that you already have a basic understanding of object-oriented concepts in Python.

Let's start with a simple class:

class Vehicle:
    kind = 'car'

    def __init__(self, manufacturer, model):
        self.manufacturer = manufacturer
        self.model_name = model

    @property
    def name(self):
        return "%s %s" %
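As a rough illustration of the kind of storage and lookup the series is about, the sketch below uses a separate, hypothetical Point class (not the Vehicle class from the post) to show that instance attributes live in the instance's __dict__ while class attributes live on the class and are found through it.

```python
class Point:
    dimensions = 2  # class attribute, stored on the class

    def __init__(self, x, y):
        self.x = x  # instance attributes, stored per object
        self.y = y


p = Point(1, 2)
print(p.__dict__)                      # the instance namespace
print("dimensions" in p.__dict__)      # not stored on the instance...
print("dimensions" in Point.__dict__)  # ...but found on the class

# Assigning through the instance creates an instance attribute that
# shadows the class attribute without changing it.
p.dimensions = 3
print(p.dimensions, Point.dimensions)
```

Reading p.dimensions first checks the instance namespace and then falls back to the class, which is why the shadowing assignment leaves Point.dimensions untouched.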
Categories: FLOSS Project Planets

Python Does What?!: None on the left

Planet Python - Sun, 2018-01-21 05:00
A natural default, None is probably the most commonly assigned value in Python. But what happens if you move it to the left side of that equation?

In Python 2:
>>> None = 2
File "<stdin>", line 1
SyntaxError: cannot assign to None
This is similar to what happens when you assign to a literal:
>>> 1 = 2
File "<stdin>", line 1
SyntaxError: can't assign to literal
In Python 3 this walk on the wild side will get you a slightly different error:
>>> None = 1
File "<stdin>", line 1
SyntaxError: can't assign to keyword
None has graduated from useful snowflake to full-blown keyword!
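The exact wording of these messages varies between interpreter versions, so here is a small sketch for checking what your own Python says. It compiles each snippet instead of executing it; the helper name assignment_error is made up for this example.

```python
def assignment_error(source):
    """Return the SyntaxError message raised when compiling source, or None."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc.msg
    return None


for snippet in ("None = 1", "1 = 2", "x = 1"):
    print(repr(snippet), "->", assignment_error(snippet))
```

Only the last snippet compiles cleanly; the first two are rejected at compile time, before any code runs.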
Categories: FLOSS Project Planets

Import Python: #159: How to speed up Python application startup time, Hunting Memory Leaks and more

Planet Python - Sun, 2018-01-21 04:34
Worthy Read
How to speed up Python application startup time? Python 3.7 has a new feature that shows the time spent importing modules. This feature is enabled with the -X importtime option or the PYTHONPROFILEIMPORTTIME environment variable.
processing time
Using textual analysis to quantify a cast of characters If you’ve ever worked on a text and wished you could get a list of characters or see how many times each character was mentioned, this is the tutorial for you.
Hunting for memory leaks in asyncio applications. Sailing into the last two weeks of 2017, which I fully intended to spend experimenting with various eggnog recipes, I was alerted by our DevOps team that our asyncio app was consuming 10GB of memory. That is approximately 100 times more than it should!
memory leaks, async
The Industry’s Fastest eSignature API Integration Embed docs directly on your website with a few lines of code. Test the API for free.
DjangoCon JP 2018 DjangoCon JP is a conference for the Django Web framework in Japan. If you're a seasoned Django pro or just starting, DjangoCon JP is for you. Our goal is for attendees to meet, talk, share tips, discover new ways to use Django, and, most importantly, have FUN.
The flat success path If you want to write clear and easy to understand software, make sure it has a single success path. A 'single success path' means a few things. First, it means that any given function/method/procedure should have a single clear purpose.
Normalizing Flows Tutorial, Part 1: Distributions and Determinants This series is written for an audience with a rudimentary understanding of linear algebra, probability, neural networks, and TensorFlow. Knowledge of recent advances in Deep Learning, generative models will be helpful in understanding the motivations and context underlying these techniques, but they are not necessary.
A GPU ready Docker container for OpenAI Gym Development with TensorFlow So, you want to write an agent, competing in the OpenAI Gym, you want to use Keras or TensorFlow or something similar and you don’t want everything installed on your workstation? You have come to the right place!
docker, tensorflow
Check your balance on Coinbase using Python Even though Coinbase has a mobile application so you’re able to check your balance on the go, I prefer using their API instead so I can setup custom alerts not available on their platform.
Using bower to manage static files with Django Sharing a way to manage libraries like bootstrap, jquery with bower without using any external app.
Automatic model selection: H2O AutoML In this post, we will use H2O AutoML for auto model selection and tuning. This is an easy way to get a good tuned model with minimal effort on the model selection and parameter tuning side.
Logistic regression in Python sklearn

SimpleCoin - 209 Stars, 20 Fork Just a really simple, insecure and incomplete implementation of a blockchain for a cryptocurrency made in Python as educational material. In other words, a simple Bitcoin clone.
languagecrunch - 136 Stars, 8 Fork LanguageCrunch NLP server docker image.
howtopython.org - 86 Stars, 16 Fork A (book, website) that describes how to Python, from scratch.
unimatrix - 83 Stars, 4 Fork Python script to simulate the display from "The Matrix" in terminal. Uses half-width katakana unicode characters by default, but can use custom character sets. Accepts keyboard controls while running. Based on CMatrix.
spacy-lookup - 32 Stars, 1 Fork Named Entity Recognition based on dictionaries.
simpledb - 14 Stars, 0 Fork miniature redis-like server implemented in Python.
python-bigone - 10 Stars, 1 Fork BigONE Exchange API python implementation for automated trading.
django-multiple-user-types-example - 10 Stars, 1 Fork Django Quiz Application
spotify-lyrics-cli - 9 Stars, 0 Fork Automatically get lyrics for the song currently playing in Spotify from command line.
aws-security-checks - 7 Stars, 0 Fork AWS Security Checks.
auditor - 5 Stars, 0 Fork Script for tracking file system changes.
sgqlc - 5 Stars, 1 Fork Simple GraphQL Client.
pyfakers - 4 Stars, 0 Fork py-fake-rs: a fake data generator for python, backed by fake-rs in rust.
shellson - 3 Stars, 1 Fork JSON command line parser.
django-qsessions - 3 Stars, 0 Fork Extends Django's cached_db session backend.

Techiediaries - Django: Handling CORS in Express 4

Planet Python - Sat, 2018-01-20 19:00

CORS stands for Cross-Origin Resource Sharing. It allows modern web browsers to send AJAX requests to, and receive HTTP responses for resources from, domains other than the one serving the client-side application.

Have you ever developed an application that makes XHR requests to a cross-domain origin and seen an error like the following in your browser console?

XMLHttpRequest cannot load XXX. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin XXX is therefore not allowed access. The response had HTTP status code 500.

Your web browser is simply informing you that your web server is not sending back the headers that allow CORS, i.e. Access-Control-Allow-Origin and Access-Control-Allow-Methods.

So in this tutorial you'll learn how to enable CORS in your Express 4 server so that your front-end application can make requests that the Same Origin Policy enforced by modern web browsers would otherwise block. This is particularly useful when you are developing your application locally, since in many cases you'll have two development servers (front-end and back-end) running on different ports, or if you want to enable resource sharing between different domains/hosts.

How to enable CORS in Express 4

There are many ways to enable CORS in Express.

If you are developing your application locally and want a quick way to enable CORS, you can simply use a middleware with a few lines of code:

var express = require('express');
var bodyParser = require('body-parser'); // needed for the bodyParser calls below
var server = express();

server.use(bodyParser.urlencoded({extended: true}))
server.use(bodyParser.json())

// Add the CORS headers to every response
server.use(function(req, res, next) {
  res.header("Access-Control-Allow-Origin", "*");
  res.header("Access-Control-Allow-Headers", "Origin, X-Requested-With, Content-Type, Accept");
  next();
});

server.get('/endpoint', function (req, res, next) {
  res.json({msg: 'This is CORS-enabled for all origins!'})
})

server.listen(3000, () => {
  console.log('Listening at http://localhost:3000')
})

The wildcard * allows resources to be accessed from any origin.

That's it: you can now send requests from any origin without running into Same Origin Policy problems.

You can also use fine-grained options without dealing with the CORS HTTP header names directly by using the cors module from npm.

Using the CORS Module

Head over to your terminal and install:

npm install --save cors

You can then use it as a middleware:

var express = require('express');
var bodyParser = require('body-parser');
var cors = require('cors');
var server = express();

// Register cors() before the routes so that its headers are added to their responses
server.use(cors());
server.use(bodyParser.urlencoded({extended: true}))
server.use(bodyParser.json())

server.get('/endpoint', function (req, res, next) {
  res.json({msg: 'This is CORS-enabled for all origins!'})
})

server.listen(3000, () => {
  console.log('Listening at http://localhost:3000')
})

This is equivalent to our previous example and allows resources to be accessed from any origin by adding the Access-Control-Allow-Origin: * header to all responses.

Controlling Allowed Hosts

In production you don't want to allow CORS access for all origins. If you need to allow cross-origin requests from specific host(s), you can add the following code:

server.use(cors({ origin: 'https://techiediaries.com' }));

This will allow https://techiediaries.com to send cross-origin requests to your Express server without the Same Origin Policy getting in the way.

You can also enable CORS for a single Express route:

server.get('/endpoint', cors(), function (req, res, next) {
  res.json({msg: 'This has CORS-enabled for only this route: /endpoint'})
})

Allowing Dynamic/Multiple Origins

If you want to allow multiple origins, pass a function for origin (instead of a string) that dynamically sets the CORS header depending on the origin making the request, together with a whitelist that contains the origins to allow.
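Before the Express version, it may help to see the decision itself in isolation. The sketch below is plain Python rather than Express code, and `cors_origin_header` is a hypothetical helper written only for this illustration. It assumes that an allowed origin is echoed back as the value of Access-Control-Allow-Origin and that anything else gets no header at all.

```python
# Origins that are allowed to make cross-origin requests (same list as in the text).
WHITELIST = ['http://techiediaries.com', 'http://othersite.com']


def cors_origin_header(origin, whitelist):
    """Return the Access-Control-Allow-Origin value for `origin`, or None if it is not allowed.

    Assumption for this sketch: an allowed origin is echoed back verbatim.
    """
    if origin in whitelist:
        return origin
    return None


for candidate in ('http://othersite.com', 'http://unknown.example'):
    print(candidate, '->', cors_origin_header(candidate, WHITELIST))
```

The Express code that follows expresses the same check through the callback that the cors middleware passes to the origin function.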

var express = require('express')
var cors = require('cors')
var server = express()

var whitelist = ['http://techiediaries.com', 'http://othersite.com']
var options = {
  origin: function (origin, callback) {
    if (whitelist.indexOf(origin) !== -1) {
      callback(null, true)
    } else {
      callback(new Error('Not allowed by CORS'))
    }
  }
}

server.use(cors(options))

// The route path needs its leading slash
server.get('/endpoint', function (req, res, next) {
  res.json({msg: 'This has CORS enabled'})
})

server.listen(3000, () => {
  console.log('Listening at http://localhost:3000')
})

Conclusion

In this tutorial we have seen some useful options for adding CORS headers to a web application developed with Node.js and Express 4. This is particularly useful during development when the front-end and back-end are separate apps, or when you want to share resources (via API requests) across many domains.


Russ Allbery: New year haul

Planet Debian - Sat, 2018-01-20 18:08

Some newly acquired books. This is a pretty wide variety of impulse purchases, filled with the optimism of a new year with more reading time.

Libba Bray — Beauty Queens (sff)
Sarah Gailey — River of Teeth (sff)
Seanan McGuire — Down Among the Sticks and Bones (sff)
Alexandra Pierce & Mimi Mondal (ed.) — Luminescent Threads (nonfiction anthology)
Karen Marie Moning — Darkfever (sff)
Nnedi Okorafor — Binti (sff)
Malka Older — Infomocracy (sff)
Brett Slatkin — Effective Python (nonfiction)
Zeynep Tufekci — Twitter and Tear Gas (nonfiction)
Martha Wells — All Systems Red (sff)
Helen S. Wright — A Matter of Oaths (sff)
J.Y. Yang — Waiting on a Bright Moon (sff)

Several of these are novellas that were on sale over the holidays; the rest came from a combination of reviews and random on-line book discussions.

The year hasn't been great for reading time so far, but I do have a couple of things ready to review and a third that I'm nearly done with, which is not a horrible start.


Shirish Agarwal: PC desktop build, Intel, spectre issues etc.

Planet Debian - Sat, 2018-01-20 17:05

This is going to be a longish one.

I have been using desktop computers for around a couple of decades now. My first two systems were an Intel Pentium III and then a Pentium Dual-core, the first one on a Kobian/Mercury motherboard. The motherboards were actually called Mercury, a brand which was later sold to Kobian, which kept the brand name. The motherboards and the CPUs used to be cheap. One could set up a decentish low-end system with a display for around INR 40k/-, which seemed decent given that, as a country, we had just come out of the non-alignment movement and had also chosen to come out of our isolationist tendencies (in technology and otherwise). Most middle-class families got their first taste of computers after Y2K. There were quite a few Y2K incomes, which prompted the Government to lower duties further.

One of the highlights when satellite TV arrived in 1991 was CNN (probably CNN International) showing the coming down of the Berlin Wall. There were many of us who were completely ignorant of world politics or of what was happening in other parts of the world.

Computer systems at those times were considered a luxury item and duties were sky-high ( between 1992-2001). The launch of Mars Pathfinder, its subsequent successful landing on the Martian surface also catapulted people’s imagination about PCs and micro-processors.

I can still recall the excitement that was among young people of my age first seeing the liftoff from Cape Canaveral and then later the processed images of Spirits cameras showing images of a desolate desert-type land. We also witnessed the beginnings of ‘International Space Station‘ (ISS) .

Me and few of my friends had drunk lot of Carl Sagan and many other sci-fi coolaids/stories. Star Trek, the movies and the universal values held/shared by them was a major influence to all our lives.

People came to know about citizen science and distributed science projects, and the Y2K fear turned out to be unfounded. All these factors, and probably a few more, prompted the Government of India to reduce duties on motherboards, processors and components, as well as to take computers off the restricted list, which led to competition and finally to the common man being able to dream of owning a system sooner rather than later. Y2K also kick-started the Indian software industry, which is the bread and butter of many middle-class men and women who are in the service industry using technology directly or indirectly.

In 2002 I bought my first system, an Intel Pentium III with an i810 chipset (integrated graphics) and 256 MB of SDRAM, running on a Mercury board, which was supposed to be sufficient for the tasks it was being used for: some light gaming, some web-mail, watching movies, etc. I don't remember the code-name, partly because the code-names are/were really weird and partly because it is just too long ago. I remember using Windows '98 and trying to install one of the early GNU/Linux variants on that machine. If memory serves right, you had to flick a jumper (like a switch) to use the extended memory.

I do not remember exactly what happened, but I think somewhere within a year or two of that time-frame Mercury India filed for bankruptcy and the name and manufacturing were sold to Kobian. After Kobian took over the ownership, it said it would honor neither the 3/5 year warranty nor even repairs on the motherboards Mercury had sold. This created a lot of ill will against the company and relegated it to the bottom of the pile for both experienced and new system-builders. Also, Mercury motherboards weren't known to have a long life, although the one I had gave me quite a decent life.

The next machine I purchased (around 2009/2010) was a Pentium Dual-core, an LGA Willamette, which had out-of-order execution; the Meltdown bug which is making news nowadays has history this far back. I think I bought it in 45nm, which was a huge jump from the previous version, although still secure in the mATX package. Again the board was from Mercury (Intel 845 chipset, DDR2 2 GB RAM, and SATA had come to stay).

So meltdown has been in existence for 10-12 odd years and is in everything which either uses Intel or ARM processors.

As you can probably make-out most systems came stretched out 2-3 years later than when they were launched in American or/and European markets. Also business or tourism travel was neither so easy, smooth or transparent as is today. All of which added to delay in getting new products in India.

Sadly, the Indian market is similar to other countries where Intel is used in more than 90% machines. I know of few institutions (though pretty much rare) who insisted and got AMD solutions.

That was the time when Gigabyte came onto the scene, which formed the basis of the Wolfdale-3M 45nm system. It was in the same price range as the earlier models and offered a teeny tiny bit of additional graphics performance. To the best of my knowledge, it was perhaps the first budget motherboard to be offered with solid state capacitors. The mobo-processor bundle used to be in the range of INR 7/8k excluding RAM, cabinet etc. I had a Philips 17″ CRT display which ran a good decade or so, so I just had to get the new cabinet, motherboard, CPU and RAM and was good to go.

Few months later at a hardware exhibition held in the city I was invited to an Asus party which was just putting a toe-hold in the Indian market. I went to the do, enjoyed myself. They had a small competition where they asked some questions and asked if people had queries. To my surprise, I found that most people who were there were hardware vendors and for one reason or the other they chose to remain silent. Hence I got an AMD Asus board. This is different from winning another Gigabyte motherboard which I also won in the same year in another competition as well in the same time-frame. Both were mid-range motherboards (ATX build).

As I had just bought a Gigabyte (mATX) motherboard and had made the build, I had to give both the motherboards away, one to a friend and one to my uncle and both were pleased with the AMD-based mobos which they somehow paired with AMD processors. At that time AMD had one-upped Intel in both graphics and even bare computing especially at the middle level and they were striving to push into new markets.

Apart from the initial system bought, most of my systems when being changed were in the INR 20-25k/- budget including all and any accessories I bought later.

The only really expensive parts I purchased have been an external HDD (1 TB WD Passport) and then a Viewsonic 17″ LCD, which together set me back by around INR 10k/-. Both seem to give me adequate performance (both have outlived their warranty years), with the monitor being used almost 24×7 over 6 years or so, of course under GNU/Linux, specifically Debian. Both have been extremely good value for the money.

As I had been exposed to both the motherboards, I had been following those and other motherboards as well. What was and has been interesting to observe is that Asus later focused more on the high-end gaming market, while Gigabyte continued to dilute its energy across both mid-range and high-end motherboards.

Cut to 2017 and had seen quite a few reports –




All of which points to the fact that Asus had cornered a large percentage of the market, specifically the gaming market. There are no formal numbers, though, as both Asus and Gigabyte choose to release only APAC numbers rather than a country-wise split, which would have made for some interesting reading.

Just so that people do not presume anything, there are about 4-5 motherboard vendors in the Indian market. There is Asus at the top (I believe), followed by Gigabyte, with Intel at a distant 3rd place (because it's too expensive). There are also pockets of Asrock and MSI, and I know of people who follow them religiously, although their mobos are supposed to be somewhat more expensive than the two above. Asus and Gigabyte do try to fight it out with each other, but I believe each has its core competency, with Asus being used by heavy gamers and overclockers more than Gigabyte.

Anyway, come October 2017 my main desktop died and I was left, as they say, up the creek without a paddle. I didn't even have Net access for about 3 weeks due to BSNL's or PMC's foolishness, and then later small riots broke out due to the Koregaon Bhima conflict.

This led to a situation where I had to buy/build a system with oldish/half knowledge. I was open to having an AMD system but both datacare and even Rashi peripherals, Pune both of whom used to deal in AMD systems shared they had stopped dealing in AMD stuff sometime back. While datacare had AMD mobos, getting processors were an issue. Both the vendors are near to my home so if I buy from them getting support becomes an non-issue. I could have gone out of my way to get an AMD processor but getting support could have been an issue as would have had to travel and I do not know the vendors enough. Hence fell back to the Intel platform.

I asked around quite a few PC retailers and distributors around and found the Asus Prime Z270-P was the only mid-range motherboard available at that time. I did come to know a bit later of other motherboards in the z270 series but most vendors didn’t/don’t stock them as there is capital, interest and stock cost.

History – Historically, there has also been huge time lag in getting motherboards, processors etc. between worldwide announcements, and then announcements of sale in India and actually getting hands-on to the newest motherboards and processors as seen above. This had led to quite a bit of frustration to many a users. I have known of many a soul visiting Lamington Road, Mumbai to get the latest motherboard, processor. Even to-date this system flourishes as Mumbai has an International Airport and there is always a demand and people willing to pay a premium for the newest processor/motherboard even before any reviews are in.

I was highly surprised to know recently that Prime Z370-P motherboards are already selling (just 3 months late) with the Intel 8th generation processors although these are still as samples rather than a torrent some of the other motherboard-combo might be.

In the end I bought an Intel I7400 chip and an Asus Prime Z270-P motherboard with 8 GB of 2400 MHz Corsair RAM, a 4 TB WD Green (5400) HDD and a Circle 545 cabinet (with the almost criminal 400 Watts SMPS). Later I came to know that it's not really even 400 Watts, but around 20-25% less. The whole package cost me north of INR 50k/-, and I still need to spend on a better SMPS (probably a Corsair or Coolermaster 80 600/650 SMPS) and a few accessories to complete the system.

I will be changing the PSU most probably next week.

Disclosure – The neatness you see is not me. I was unsure if I would be able to put the heatsink on the CPU properly as that is the most sensitive part while building a system. A bent pin on the CPU could play havoc as well as void the warranty on the CPU or motherboard or both. The new thing I saw were the knobs that can be seen on the heatsink fan is something which I hadn’t seen before. The vendor did the fixing of the processor on the mobo for me as well as tied up the remaining power cables without asking for which I am and was grateful and would definitely provide him with more business as and when I need components.

Future – While it's ok for now, I'm still using a pretty old 2 speaker setup which I hope to upgrade to either a 2.1 or 3.1 speaker setup. I also hope to have the full 64 GB of 2400 MHz Kingston Razor/G.Skill/Corsair memory and an M.2 512 GB SSD.

If I do get the Taiwan Debconf bursary I do hope to buy some or all of the above plus a Samsung or some other Android/Replicant/Librem smartphone. I have been also looking for a vastly simplified smartphone for my mum with big letters and everything but that has been a failure to find in the Indian market. Of course this all depends if I do get the bursary and even after the bursary if Global warranty and currency exchange works out in my favor vis-a-vis what I would have to pay in India.

Apart from above, Taiwan is supposed to be a pretty good source to get graphic novels, manga comics, lots of RPG games for very cheap prices with covers and hand-drawn material etc. All of this is based upon few friend’s anecdotal experiences so dunno if all of that would still hold true if I manage to be there.

There are also quite a few chip foundries and maybe during debconf could have visit to one of them if possible. It would be rewarding if the visit was to any 45nm or lower chip foundry as India is still stuck at 65nm range till date.

I would be sharing about my experience about the board, the CPU, the expectations I had from the Intel chip and the somewhat disappointing experience of using Debian on the new board in the next post, not necessarily Debian’s fault but the free software ecosystem being at fault here.

Feel free to point out any mistakes you find, grammatical or otherwise. The blog post has been in the works for over a couple of weeks, so it's possible for mistakes to have crept in.


Dirk Eddelbuettel: Rcpp 0.12.15: Numerous tweaks and enhancements

Planet Debian - Sat, 2018-01-20 16:53

The fifteenth release in the 0.12.* series of Rcpp landed on CRAN today after just a few days of gestation in incoming/.

This release follows the 0.12.0 release from July 2015, the 0.12.1 release in September 2015, the 0.12.2 release in November 2015, the 0.12.3 release in January 2016, the 0.12.4 release in March 2016, the 0.12.5 release in May 2016, the 0.12.6 release in July 2016, the 0.12.7 release in September 2016, the 0.12.8 release in November 2016, the 0.12.9 release in January 2017, the 0.12.10 release in March 2017, the 0.12.11 release in May 2017, the 0.12.12 release in July 2017, the 0.12.13 release in late September 2017, and the 0.12.14 release in November 2017, making it the nineteenth release at the steady and predictable bi-monthly release frequency.

Rcpp has become the most popular way of enhancing GNU R with C or C++ code. As of today, 1288 packages on CRAN depend on Rcpp for making analytical code go faster and further, along with another 91 in BioConductor.

This release contains a pretty large number of pull requests by a wide variety of authors. Most of these pull requests are very focused on a particular issue at hand. One was larger and ambitious with some forward-looking code for R 3.5.0; however this backfired a little on Windows and is currently "parked" behind a #define. Full details are below.

Changes in Rcpp version 0.12.15 (2018-01-16)
  • Changes in Rcpp API:

    • Calls from exception handling to Rf_warning() now correctly set an initial format string (Dirk in #777 fixing #776).

    • The 'new' Date and Datetime vectors now have is_na methods too. (Dirk in #783 fixing #781).

    • Protect more temporary SEXP objects produced by wrap (Kevin in #784).

    • Use public R APIs for new_env (Kevin in #785).

    • Evaluation of R code is now safer when compiled against R 3.5 (you also need to explicitly define RCPP_PROTECTED_EVAL before including Rcpp.h). Longjumps of all kinds (condition catching, returns, restarts, debugger exit) are appropriately detected and handled, e.g. the C++ stack unwinds correctly (Lionel in #789). [ Committed but subsequently disabled in release 0.12.15 ]

    • The new function Rcpp_fast_eval() can be used for performance-sensitive evaluation of R code. Unlike Rcpp_eval(), it does not try to catch errors with tryEval in order to avoid the catching overhead. While this is safe thanks to the stack unwinding protection, this also means that R errors are not transformed to an Rcpp::exception. If you are relying on error rethrowing, you have to use the slower Rcpp_eval(). On old R versions Rcpp_fast_eval() falls back to Rcpp_eval() so it is safe to use against any versions of R (Lionel in #789). [ Committed but subsequently disabled in release 0.12.15 ]

    • Overly-clever checks for NA have been removed (Kevin in #790).

    • The included tinyformat has been updated to the current version, Rcpp-specific changes are now more isolated (Kirill in #791).

    • Overly picky fall-through warnings by gcc-7 regarding switch statements are now pre-empted (Kirill in #792).

    • Permit compilation on ANDROID (Kenny Bell in #796).

    • Improve support for NVCC, the CUDA compiler (Iñaki Ucar in #798 addressing #797).

    • Speed up tests for NA and NaN (Kirill and Dirk in #799 and #800).

    • Rearrange stack unwind test code, keep test disabled for now (Lionel in #801).

    • Further condition away protect unwind behind #define (Dirk in #802).

  • Changes in Rcpp Attributes:

    • Addressed a missing Rcpp namespace prefix when generating a C++ interface (James Balamuta in #779).
  • Changes in Rcpp Documentation:

    • The Rcpp FAQ now shows Rcpp::Rcpp.plugin.maker() and not the outdated ::: use applicable to non-exported functions.

Thanks to CRANberries, you can also look at a diff to the previous release. As always, details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads page, the browseable doxygen docs and zip files of doxygen output for the standard formats. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.


Python Data: Local Interpretable Model-agnostic Explanations – LIME in Python

Planet Python - Sat, 2018-01-20 14:57

When working with classification and/or regression techniques, it's always good to have the ability to ‘explain’ what your model is doing. Using Local Interpretable Model-agnostic Explanations (LIME), you now have the ability to quickly provide visual explanations of your model(s).

It's quite easy to throw numbers or content into an algorithm and get a result that looks good. We can test for accuracy and feel confident that the classifier and/or model is ‘good’…but can we describe what the model is actually doing to other users? A good data scientist spends some of their time making sure they have reasonable explanations for what the model is doing and why the results are what they are.

There’s always been a focus on ‘trust’ in any type of modeling methodology, but with machine learning and deep learning, many people feel like the black-box approach taken with these methods isn’t as trustworthy as other methods. This topic was addressed in a paper titled “Why Should I Trust You?”: Explaining the Predictions of Any Classifier, which proposes the concept of Local Interpretable Model-agnostic Explanations (LIME). According to the paper, LIME is ‘an algorithm that can explain the predictions of any classifier or regressor in a faithful way, by approximating it locally with an interpretable model.’
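To make ‘approximating it locally with an interpretable model’ a little more concrete, here is a deliberately tiny, one-dimensional sketch of the idea. It is not how the lime library is implemented: the names `black_box` and `local_linear_fit`, the Gaussian proximity kernel and the evenly spaced perturbations are all assumptions chosen to keep the example short and deterministic. The sketch perturbs the input around one point, weights each perturbed sample by how close it is to that point, and fits a weighted straight line whose slope is the ‘explanation’.

```python
import math


def black_box(x):
    """Stand-in for an opaque model; here simply x squared."""
    return x ** 2


def local_linear_fit(f, x0, width=0.5, steps=50):
    """Fit a proximity-weighted straight line to f around x0; return (slope, intercept)."""
    # Evenly spaced perturbations on both sides of x0.
    offsets = [width * k / steps for k in range(-steps, steps + 1)]
    xs = [x0 + d for d in offsets]
    ys = [f(x) for x in xs]
    # Samples closer to x0 get a larger weight.
    ws = [math.exp(-(d / width) ** 2) for d in offsets]

    total = sum(ws)
    mean_x = sum(w * x for w, x in zip(ws, xs)) / total
    mean_y = sum(w * y for w, y in zip(ws, ys)) / total
    # Weighted least squares for a single feature.
    cov = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = local_linear_fit(black_box, 3.0)
print("local slope near x = 3:", round(slope, 3))
```

Near x = 3 the fitted slope matches the local rate of change of the black box, which is what an explanation of a single prediction is meant to capture; far away from that point the same line would be a poor description of the model.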

I’ve used the LIME approach a few times in recent projects and really like the idea. It breaks down the modeling / classification techniques and output into a form that can be easily described to non-technical people.  That said, LIME isn’t a replacement for doing your job as a data scientist, but it is another tool to add to your toolbox.

To implement LIME in Python, I use this LIME library written and released by one of the authors of the above paper.

I thought it might be good to provide a quick run-through of how to use this library. For this post, I’m going to mimic the “Using lime for regression” notebook the authors provide, but I’m going to provide a little more explanation.

The full notebook is available in my repo here.

Getting started with Local Interpretable Model-agnostic Explanations (LIME)

Before you get started, you’ll need to install Lime.

pip install lime

Next, let’s import our required libraries.

from sklearn.datasets import load_boston
import sklearn.ensemble
import numpy as np
from sklearn.model_selection import train_test_split
import lime
import lime.lime_tabular

Let’s load the sklearn dataset called ‘boston’. This dataset contains house prices and is often used for machine learning regression examples.

boston = load_boston()

Before we do much else, let’s take a look at the description of the dataset to get familiar with it.  You can do this by running the following command:

print(boston['DESCR'])

The output is:

Boston House Prices dataset
===========================

Notes
------
Data Set Characteristics:

    :Number of Instances: 506

    :Number of Attributes: 13 numeric/categorical predictive

    :Median Value (attribute 14) is usually the target

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
        - B        1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
        - LSTAT    % lower status of the population
        - MEDV     Median value of owner-occupied homes in $1000's

    :Missing Attribute Values: None

    :Creator: Harrison, D. and Rubinfeld, D.L.

This is a copy of UCI ML housing dataset.
http://archive.ics.uci.edu/ml/datasets/Housing

This dataset was taken from the StatLib library which is maintained at Carnegie Mellon University.

The Boston house-price data of Harrison, D. and Rubinfeld, D.L. 'Hedonic
prices and the demand for clean air', J. Environ. Economics & Management,
vol.5, 81-102, 1978. Used in Belsley, Kuh & Welsch, 'Regression diagnostics
...', Wiley, 1980. N.B. Various transformations are used in the table on
pages 244-261 of the latter.

The Boston house-price data has been used in many machine learning papers that address regression
problems.

**References**

   - Belsley, Kuh & Welsch, 'Regression diagnostics: Identifying Influential Data and Sources of Collinearity', Wiley, 1980. 244-261.
   - Quinlan,R. (1993). Combining Instance-Based and Model-Based Learning. In Proceedings on the Tenth International Conference of Machine Learning, 236-243, University of Massachusetts, Amherst. Morgan Kaufmann.
   - many more! (see http://archive.ics.uci.edu/ml/datasets/Housing)

Now that we have our data loaded, we want to build a regression model to forecast Boston housing prices. We’ll use a random forest for this, following the example by the authors.

First, we’ll set up the random forest model and then create our training and test data using the train_test_split function from sklearn. Then, we’ll fit the model to the training data.

rf = sklearn.ensemble.RandomForestRegressor(n_estimators=1000)
train, test, labels_train, labels_test = train_test_split(boston.data, boston.target, train_size=0.80)
rf.fit(train, labels_train)
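
As a rough picture of what the split does, the sketch below shuffles row indices and cuts them 80/20 in plain Python. The helper name `split_indices` and the fixed seed are made up for illustration, and rounding the training size down is an assumption about how the library behaves, not something this post states.

```python
import math
import random


def split_indices(n_rows, train_size=0.80, seed=0):
    """Shuffle row indices and cut them into train and test parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_train = math.floor(train_size * n_rows)  # assumed rounding rule
    return indices[:n_train], indices[n_train:]


# The dataset description above lists 506 instances.
train_idx, test_idx = split_indices(506)
print(len(train_idx), len(test_idx))
```

Under that assumption, 506 rows leave 404 for training and 102 for testing, so picking test row 100 later on stays within range.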

Now that we have a Random Forest Regressor trained, we can check some of the accuracy measures.

print('Random Forest MSError', np.mean((rf.predict(test) - labels_test) ** 2))

The MSError is 10.45. Now, let’s look at the MSError when predicting the mean.

print('MSError when predicting the mean', np.mean((labels_train.mean() - labels_test) ** 2))

From this, we get 80.09.

Without really knowing the dataset, it’s hard to say whether these numbers are good or bad. Since we are really most interested in looking at the LIME approach, we’ll move along and assume these are decent errors.
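
One way to put the two numbers in context is to ask what fraction of the baseline error the model removes. The sketch below does that arithmetic with the values quoted above. The helper names are my own, and the ratio is only similar in spirit to R² (the baseline here is the training mean, not the test mean).

```python
def mean_squared_error(predictions, targets):
    """Average squared difference between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def error_reduction(model_mse, baseline_mse):
    """Fraction of the baseline error that the model removes."""
    return 1.0 - model_mse / baseline_mse


# Tiny made-up example to show the MSE calculation itself: (1 + 4) / 2.
print(mean_squared_error([2.0, 4.0], [1.0, 6.0]))

# Values reported in this post: 10.45 for the forest, 80.09 for the mean.
reduction = error_reduction(10.45, 80.09)
print(round(reduction, 2))
```

That works out to roughly 0.87, meaning the forest removes about 87% of the error you would get by always predicting the mean.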

To implement LIME, we need to get the categorical features from our data and then build an ‘explainer’. This is done with the following commands:

categorical_features = np.argwhere( np.array([len(set(boston.data[:,x])) for x in range(boston.data.shape[1])]) <= 10).flatten()
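
To make that one-liner easier to read, here is a numpy-free restatement on a small made-up table: a column counts as categorical when it holds at most 10 distinct values. The function name and the toy rows are hypothetical.

```python
def categorical_columns(rows, max_distinct=10):
    """Return indices of columns with at most `max_distinct` distinct values."""
    n_columns = len(rows[0])
    return [
        col
        for col in range(n_columns)
        if len({row[col] for row in rows}) <= max_distinct
    ]


# Toy table: column 0 looks continuous, column 1 is a 0/1 flag,
# column 2 takes one of three values.
rows = [[i * 0.37, i % 2, i % 3] for i in range(50)]
print(categorical_columns(rows))
```

On the Boston data, this rule would pick up columns such as CHAS, the 0/1 Charles River flag described above.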

and the explainer:

explainer = lime.lime_tabular.LimeTabularExplainer(train, feature_names=boston.feature_names, class_names=['price'], categorical_features=categorical_features, verbose=True, mode='regression')

Now, we can grab one of our test values and check out our prediction(s). Here, we’ll grab the 100th test value and check the prediction and see what the explainer has to say about it.

i = 100
exp = explainer.explain_instance(test[i], rf.predict, num_features=5)
exp.show_in_notebook(show_table=True)

LIME Explainer for regression

So…what does this tell us?

It tells us that the prediction for the test row at index 100 is 21.16, with “RAD=24” making the largest positive contribution and the other features shown pulling the prediction down.
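
The idea behind such an explanation can be sketched in one dimension: sample points near the instance, weight them by closeness, and fit a weighted straight line to the black-box predictions. The code below is a simplified illustration and not the lime library's implementation. The function names, the kernel width, and the stand-in model `black_box` are all assumptions made for the example.

```python
import math


def black_box(x):
    """Stand-in for a trained model's predict function (hypothetical)."""
    return x ** 2


def local_linear_explanation(predict, x0, radius=1.0, n_samples=21, kernel_width=0.5):
    """Fit a proximity-weighted line to `predict` around x0; return (slope, intercept)."""
    # Evenly spaced, symmetric perturbations of the instance being explained.
    step = 2 * radius / (n_samples - 1)
    xs = [x0 - radius + k * step for k in range(n_samples)]
    ys = [predict(x) for x in xs]
    # Points closer to x0 get larger weights (exponential kernel).
    ws = [math.exp(-((x - x0) ** 2) / kernel_width ** 2) for x in xs]

    total = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / total
    y_mean = sum(w * y for w, y in zip(ws, ys)) / total
    # Closed-form weighted least squares for a single feature.
    cov = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs))
    slope = cov / var
    return slope, y_mean - slope * x_mean


slope, intercept = local_linear_explanation(black_box, x0=3.0)
print(round(slope, 3))
```

For the stand-in model x², the fitted slope at x = 3 comes out close to 6, which is the local rate of change there. The signed weights in the explainer's chart play a similar role, one per feature.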

For regression, this isn’t quite as interesting (although it is useful). The LIME approach shows much more benefit (at least to me) when performing classification.

As an example, if you are trying to classify plants as edible or poisonous, LIME’s explanation is much more useful. Here’s an example from the authors.

LIME explanation of edible vs poisonous

Take a look at LIME when you have some time. It’s a good library to add to your toolkit, especially if you are doing a lot of classification work. It makes it much easier to ‘explain’ what the model is doing.

The post Local Interpretable Model-agnostic Explanations – LIME in Python appeared first on Python Data.


This week in Usability and Productivity, part 2

Planet KDE - Sat, 2018-01-20 12:39

This is your weekly status update for the KDE community’s progress in the Usability and Productivity initiative. KDE contributors have been busy, and here’s a sampling of features, improvements, and bugfixes relevant to the initiative that KDE developers landed over the past week-and-a-half (subsequent reports will be weekly, but I wrote the first one in the middle of a week):

  • KIO file copy speed (e.g. in Dolphin) is now 4.5x faster (KDE bug 384561)
  • Fixed a layout glitch in Open & Save file picker dialogs (KDE bug 352776)
  • KMail gained the ability to badge its Task Manager app icon with the count of unread emails (KDE Phabricator revision D9841)
  • Notification badges on Task Manager app icons now show up in Task Manager tooltips, too (KDE Phabricator revision D9825) and look better for huge numbers (KDE Phabricator revision D9827)
  • The Audio Volume widget now looks good with Dark themes (KDE bug 388766)
  • KSysGuard’s CPU column now has a pretty little CPU use graph in the background for each process (KDE Phabricator revision D9689)
  • Every KDE app’s “Settings > Configure [app]” menu item now has a universally consistent keyboard shortcut: Ctrl+Shift+Comma (KDE Phabricator revision D8296)
  • The PDF thumbnailer is able to generate thumbnails in Dolphin for more types of PDFs (KDE bug 388288)
  • Dates are no longer formatted like numbers (i.e. as “2,018”) in some places in Dolphin (KDE Phabricator revision D9887)
  • The menu you get when right-clicking on KAddressBook’s icon now includes a “New Contact…” item (KDE Phabricator revision D9926)
  • Dolphin’s main view now properly regains focus after you close the inline terminal pane (KDE bug 298467)
  • Window titlebar buttons now show tooltips (KDE bug 383040)
  • Plasma’s notifications no longer leak memory when created (KDE bug 389132)
  • Baloo indexer now actually excludes directories that are marked as excluded from indexing (KDE bug 362226)
  • A whole class of app crashes caused by typing or deleting characters in search fields using Qt QML components (such as in Discover and System Settings) was traced to a Qt bug, and KDE developer Aleix Pol has submitted a patch!

There’s also been a ton of work in Discover, particularly in Snap and Flatpak support. You can read about it here.

If any of this seems cool or useful, consider helping us out by becoming a KDE contributor yourself. If donating time isn’t your cup of tea, we always appreciate donations of money, too. We can’t do this without your support!


Matt Glaman: Drupal, Coffee, Burgers - A small world afterall

Planet Drupal - Sat, 2018-01-20 11:00
mglaman Sat, 01/20/2018 - 10:00

This year I joined a coffee exchange for some members of the Drupal community. I had known there was one floating around, but finally got signed up. Over the past two years, I have gotten more and more into coffee, becoming a coffee snob about roasts and learning brewing techniques. Last week we were paired up and sent out some roasts.