FLOSS Project Planets

FSF Events: Richard Stallman - "Should We Have More Surveillance Than the Soviet Union?" (Grenoble, France)

GNU Planet! - Tue, 2016-09-27 09:44

This speech by Richard Stallman will be nontechnical, admission is free, and everyone is invited to attend.

Location: Auditorium, 1st floor, Grenoble Ecole de Management, 12 rue Pierre Semard, 38000 Grenoble, France

Please fill out our contact form, so that we can contact you about future events in and around Grenoble.

Categories: FLOSS Project Planets

KDE neon Korean Developer Edition (... and future CJK Edition?)

Planet KDE - Tue, 2016-09-27 08:41
KDE Project:

While not advertised on the KDE neon main page just yet (and it won't be for a while), we've recently begun doing regular builds of a special Korean Edition of neon's Developer Edition tracking the stable branch of KDE's code repositories. The Korean Edition pre-selects the Korean language and locale at boot, packs all the Korean translations we have, and comes with a Korean input method already set up.

Joseon-era Hangeul metal type

Why a Korean Edition?

Among many other locations around the planet, the local community in Korea is planning to put on a KDE 20th Anniversary birthday party in Seoul on October 14th. The KDE neon Korean Developer Edition was directly created on request for this event, to be made available to attendees.

That said - this is actually something we've been wanting to do for a while, and it's not just about Korean.

None of the bits that make up the new image are new per se; KDE has supported Korean for a long time, both with foundational localization engineering and regular maintenance activity. And as of the Plasma 5.6 release, our Input Method Panel is finally bundled with the core desktop code and gets automatically added to the panel on first logon in a locale that typically requires an input method.

Yet it's pretty hard to keep all of this working well, as it requires tight integration and testing across an entire stack, with some parts of the whole living upstream or downstream of KDE.org. For example: After we attempted to make the Plasma panel smarter by making it auto-add the Input Method Panel depending on locale, we couldn't actually be sure it was working as desired by our users, as it takes time for distros to get around to tuning their dependency profiles and for feedback from their users to loop back up to us. It's a very long cycle, with too many opportunities to lose focus or domain knowledge to turnover along the way.

This is where KDE neon comes in: as a fully-integrated product, we can now prove out and demo the intended distro experience there. We can make sure things stay in working order, even before additional work hits our other distro partners.

Right now we're kicking things off with the Korean Edition, but given enough time, interest and testers (please get in touch!), we'd like to build it out into a full CJK Edition, with translations and input support pre-installed for our Chinese and Japanese users as well. (As another precursor to this, the decision to switch to Noto Sans CJK as Plasma's default typeface last year was very much made with the global audience in mind.)

Ok, but where do I get it?

Here! Do keep in mind it's alpha. ☺

Categories: FLOSS Project Planets

CTI Digital: DrupalCon Dublin 2016

Planet Drupal - Tue, 2016-09-27 08:36

As always, CTI Digital are sponsors of DrupalCon. Following the popularity of our huge Tag Cloud floor covering at Barcelona, celebrating those who made Drupal 8 contributions, this time we've created a larger-than-life version of Snakes & Ladders.

Categories: FLOSS Project Planets

Third & Grove: SRI relaunches with new site design and content architecture to drive content marketing and lead generation

Planet Drupal - Tue, 2016-09-27 08:33
By tony, Tue, 09/27/2016 - 08:33
Categories: FLOSS Project Planets

FSF Events: Richard Stallman - "Reclaim Your Freedom with Free (Libre) Software" (Web Summit, Lisbon, Portugal)

GNU Planet! - Tue, 2016-09-27 08:25

Richard Stallman will be speaking at Web Summit (2016-11-07–10). His speech will be nontechnical, admission to the speech (not the conference) is gratis and requires no registration, and the public is encouraged to attend.


Location: Code Stage, Feira Internacional de Lisboa and Meo Arena, R. Bojador, 1998-010 Santa Maria dos Olivais

Please fill out our contact form, so that we can contact you about future events in and around Lisbon.

Categories: FLOSS Project Planets

Continuum Analytics News: Continuum Analytics Joins Forces with IBM to Bring Open Data Science to the Enterprise

Planet Python - Tue, 2016-09-27 08:18
News Tuesday, September 27, 2016

Optimized Python experience empowers data scientists to develop advanced open source analytics on Spark   
AUSTIN, TEXAS—September 27, 2016—Continuum Analytics, the creator and driving force behind Anaconda, the leading Open Data Science platform powered by Python, today announced an alliance with IBM to advance open source analytics for the enterprise. Data scientists and data engineers in open source communities can now embrace Python and R to develop analytic and machine learning models in the Spark environment through its integration with IBM's Project DataWorks. 

Combining the power of IBM's Project DataWorks with Anaconda enables organizations to build high-performance Python and R data science models and visualization applications required to compete in today’s data-driven economy. The companies will collaborate on several open source initiatives including enhancements to Apache Spark that fully leverage Jupyter Notebooks with Apache Spark – benefiting the entire data science community.

“Our strategic relationship with Continuum Analytics empowers Project DataWorks users with full access to the Anaconda platform to streamline and help accelerate the development of advanced machine learning models and next-generation analytics apps,” said Ritika Gunnar, vice president, IBM Analytics. “This allows data science professionals to utilize the tools they are most comfortable with in an environment that reinforces collaboration with colleagues of different skillsets.”

By collaborating to bring about the best Spark experience for Open Data Science in IBM's Project DataWorks, enterprises are able to easily connect their data, analytics and compute with innovative machine learning to accelerate and deploy their data science solutions. 

“We welcome IBM to the growing family of industry titans that recognize Anaconda as the defacto Open Data Science platform for enterprises,” said Michele Chambers, EVP of Anaconda Business & CMO at Continuum Analytics. “As the next generation moves from machine learning to artificial intelligence, cloud-based solutions are key to help companies adopt and develop agile solutions––IBM recognizes that. We’re thrilled to be one of the driving forces powering the future of machine learning and artificial intelligence in the Spark environment.”

IBM's Project DataWorks is the industry's first cloud-based data and analytics platform that integrates all types of data to enable AI-powered decision making. With it, companies are able to realize the full promise of data by enabling data professionals to collaborate and build cognitive solutions by combining IBM data and analytics services and a growing ecosystem of data and analytics partners, all delivered on Apache Spark. Project DataWorks is designed to allow for faster development and deployment of data and analytics solutions with self-service user experiences to help accelerate business value.

To learn more, join Bob Picciano, SVP of IBM Analytics, and Travis Oliphant, CEO of Continuum Analytics, at the IBM DataFirst Launch Event on Sept 27, 2016, at the Hudson Mercantile Building in NYC. The event is also available on livestream.

About Continuum Analytics
Continuum Analytics is the creator and driving force behind Anaconda, the leading Open Data Science platform powered by Python. We put superpowers into the hands of people who are changing the world. 

With more than 3M downloads and growing, Anaconda is trusted by the world’s leading businesses across industries––financial services, government, health & life sciences, technology, retail & CPG, oil & gas––to solve the world’s most challenging problems. Anaconda does this by helping everyone in the data science team discover, analyze and collaborate by connecting their curiosity and experience with data. With Anaconda, teams manage their Open Data Science environments without any hassles to harness the power of the latest open source analytic and technology innovations. 

Our community loves Anaconda because it empowers the entire data science team––data scientists, developers, DevOps, architects and business analysts––to connect the dots in their data and accelerate the time-to-value that is required in today’s world. To ensure our customers are successful, we offer comprehensive support, training and professional services. 

Continuum Analytics' founders and developers have created and contributed to some of the most popular Open Data Science technologies, including NumPy, SciPy, Matplotlib, Pandas, Jupyter/IPython, Bokeh, Numba and many others. Continuum Analytics is venture-backed by General Catalyst and BuildGroup. 

To learn more, visit http://www.continuum.io.

Media Contact:
Jill Rosenthal

Categories: FLOSS Project Planets

Kirigami 1.1

Planet KDE - Tue, 2016-09-27 08:07

Today the first feature update of Kirigami has been released.
We have a lot of bug fixes and some cool new features:

  • The Menu class features changes and fixes which give greater control over the actions triggered by submenus and leaf nodes in the menu tree. Submenus now know which entry is their parent, and allow the submenu's view to be reset when the application needs it.
  • The OverlaySheet now also allows embedding ListView and GridView instances.
  • The Drawer width is now standardized, so all applications look coherent with one another, and the title now elides if it doesn't fit. We also introduced the GlobalDrawer.bannerClicked signal to let applications react to banner interaction.
  • SwipeListItem has been polished to make sure its contents can fit in the available space, and we introduced the Separator component.
  • Desktop Kirigami applications now support the "quit" shortcut (such as Ctrl+Q).

Plasma 5.8 will depend on Kirigami 1.1, so if you are planning to write a Kirigami-based application, it will work by default and integrate nicely into the Plasma 5.8 desktop.

Plasma 5.8 also gains a big new Kirigami 1.1 user: Discover, the application for searching for and installing new software, has a brand new user interface based upon Kirigami.

This is probably the last feature release based upon QtQuickControls 1; a QtQuickControls 2 version is on the way, currently at an experimental stage. The port will have much simpler code (and a smaller memory footprint), but that is an entry for another day.

Categories: FLOSS Project Planets

Open source C++ execution trace framework

Planet KDE - Tue, 2016-09-27 07:59

At froglogic, we’re big fans of open source software. A large part of our engineering (and management!) staff contributed or contributes to open source projects, and everyone visiting our offices for a job interview certainly gets a big +1 in case she can show off some open source work! We also use a lot of open source software for our daily work, ranging from obvious projects like Git or the Linux kernel to individual libraries serving very specific purposes; the Acknowledgements Chapter of the Squish manual gives an impression of how tall the giants are upon whose shoulders we’re standing.

Over the last couple of years we have contributed various bug fixes and improvements back to different projects we're using, but we'd like to step things up a little bit. Hence, we have now open-sourced an internally developed C++ framework called 'TraceTool' and made it available under the LGPL v3 license on our GitHub account:


TraceTool started out as a project we did a few years ago, but it has since grown into a full-fledged, efficient and configurable framework which we use internally to aid debugging our software.

TraceTool GUI highlighting trace statements generated by an interesting function.

The bird's-eye view is that the source code of the application or library is augmented with additional statements which generate output reflecting the current state of the application. By default, these statements don't do anything though: the idea is that they are compiled into the application (and shipped in release builds) but only get activated when a special configuration file is detected at runtime. That configuration file describes

  • for which C++ classes, functions or files to enable tracing
  • what format to use for the trace output
  • whether to log stack traces when individual trace points are hit
  • and much more
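To make the idea concrete, here is a minimal sketch of this general pattern, written in Python purely for brevity. To be clear, TraceTool itself is C++, and its actual API, class names, and configuration format differ from this toy version; the `TraceLog` class and the `tracetool.cfg` file name below are invented for illustration only.

```python
import os

class TraceLog:
    """Toy illustration of configuration-activated tracing:
    trace points are always compiled in, but only produce
    output when a configuration file is found at runtime."""

    def __init__(self, config_path):
        self.config_path = config_path
        self.messages = []

    def trace(self, message):
        # No-op unless the configuration file is present,
        # so shipping these calls in release builds is cheap.
        if os.path.exists(self.config_path):
            self.messages.append(message)

log = TraceLog('tracetool.cfg')  # hypothetical file name
log.trace('entering main()')     # silent if tracetool.cfg is absent
```

The point of the pattern is that the trace calls cost almost nothing when dormant, so they can stay in shipped binaries and be switched on at a customer site just by dropping in a configuration file.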

In addition to the library, the framework provides a GUI as well as various command line tools to view and process the generated traces.

The workflow which inspired this project is that once a user reports a defect of some sort, they get sent a configuration file to be stored somewhere, and that configuration file enables just the right amount of debug output for the developer to understand what's going on.

We welcome everyone to hop over to the TraceTool repository page, have a look, give it a try – and see whether this might be useful to you. If you feel like it, don’t hesitate to contribute back by submitting a Pull Request so that we can improve things for everyone.

Categories: FLOSS Project Planets

Attiks: Drupal and polymer

Planet Drupal - Tue, 2016-09-27 07:42
Drupal and polymer - 27-09-2016 - Peter Droogmans

What if there were polymer elements for custom Drupal functionality?

Read more
Categories: FLOSS Project Planets

Python Insider: Python Core Development Sprint 2016: 3.6 and beyond!

Planet Python - Tue, 2016-09-27 07:03

From September 5th to the 9th a group of Python core developers gathered for a sprint hosted at Instagram and sponsored by Instagram, Microsoft, and the Python Software Foundation. The goal was to spend a week working towards the Python 3.6b1 release, just in time for the Python 3.6 feature freeze on Monday, September 12, 2016. The inspiration for this sprint was the Need for Speed sprint held in Iceland a decade ago, where many performance improvements were made to Python 2.5. How time flies!

By any measurement, the sprint was extremely successful. All participants left feeling accomplished both in the work they did and in the discussions they held. Being in the same room encouraged many discussions related to various PEPs, and many design decisions were made. There was also the camaraderie of working in the same room together — typically most of us only see each other at the annual PyCon US, where there are other distractions that prevent getting to simply spend time together. (This includes the Python development sprints at PyCon, where the focus is more on helping newcomers contribute to Python — that’s why this sprint was not public.)

From a quantitative perspective, the sprint was the most productive week for Python ever! According to the graphs from the GitHub mirror of CPython, the week of September 4th saw more commits than the preceding 7 weeks combined! And in terms of issues, the number of open issues dropped by 62 with a total of 166 issues closed.

A large portion of the work performed during the sprint week revolved around various PEPs that had either not yet been accepted or were not yet fully implemented. In the end, 12 PEPs were either implemented from scratch or had their work completed during the week, out of Python 3.6’s total of 16 PEPs:

  1. Preserving the order of **kwargs in a function (PEP 468)
  2. Add a private version to dict (PEP 509)
  3. Underscores in Numeric Literals (PEP 515)
  4. Adding a file system path protocol (PEP 519)
  5. Preserving Class Attribute Definition Order (PEP 520)
  6. Adding a frame evaluation API to CPython (PEP 523)
  7. Make os.urandom() blocking on Linux, add os.getrandom() (PEP 524)
  8. Asynchronous Generators (PEP 525)
  9. Syntax for Variable Annotations (PEP 526)
  10. Change Windows console encoding to UTF-8 (PEP 528)
  11. Change Windows filesystem encoding to UTF-8 (PEP 529)
  12. Asynchronous Comprehensions (PEP 530)
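A few of these features are easy to try out directly in the interpreter (the snippet below assumes Python 3.6 or later):

```python
# PEP 515: underscores in numeric literals improve readability
amount = 1_000_000
assert amount == 1000000

# PEP 526: dedicated syntax for variable annotations
count: int = 0

# PEP 468: **kwargs now preserves the order in which keyword
# arguments were passed at the call site
def f(**kwargs):
    return list(kwargs)

assert f(b=1, a=2) == ['b', 'a']
```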

Some large projects were also worked on that are not represented as PEPs. For instance, Python 3.6 now contains support for DTrace and SystemTap. This will give people more tools to introspect and monitor Python. See the HOWTO for usage instructions and examples showing some of the new possibilities.

CPython also gained a more memory efficient dictionary implementation at the sprint. The new implementation shrinks memory usage of dictionaries by about 25% and also preserves insertion order, without speed penalties. Based on a proposal by Raymond Hettinger, the patch was written by INADA Naoki prior to the sprint but it was reviewed and heavily discussed at the sprint, as changing the underlying implementation of dictionaries can have profound implications on Python itself. In the end, the patch was accepted, directly allowing for PEP 468 to be accepted and simplifying PEP 520.
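The order-preserving behaviour is easy to observe. (Note that in 3.6 this was an implementation detail of CPython; it only became a language guarantee in Python 3.7.)

```python
# With the compact dict implementation, iteration order
# matches insertion order.
d = {}
d['b'] = 1
d['a'] = 2
d['c'] = 3
assert list(d) == ['b', 'a', 'c']
```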

Work was also done on the Gilectomy (see the presentation on the topic from PyCon US for more background info on the project). Progress was made such that Python would run without any reference counting turned on (i.e. Python turned into a huge memory leak). Work then was started on trying the latest design on how to turn reference counting back on in a way that would allow Python to scale with the number of threads and CPU cores used. There’s still a long road ahead before the Gilectomy will be ready to merge though, and we even jokingly considered branding the result as Python 4.

Much of the work done during the sprint led not only to improvements in the language and library, but to better performance as well. A quick performance comparison between Python 3.5.2+ and 3.6b1+ under OS X shows that 3.6 is generally faster, with double-digit speed improvements not uncommon. Similar benchmarking under Windows 10 has been reported to show similar performance gains.

A huge thanks goes out to the participants of the sprint! They are listed below in alphabetical order, along with thanks to the organizations that helped finance their attendance. Many of them traveled to attend and gave up the US Labor Day holiday with their families to participate. In the end, we had participants from 3 countries on 2 continents. (We actually invited more people from other countries and continents, but not everybody invited could attend.)

  • Brett Cannon (Microsoft)
  • Ned Deily
  • Steve Dower (Microsoft)
  • Larry Hastings
  • Raymond Hettinger
  • Senthil Kumaran
  • Łukasz Langa (Instagram)
  • R. David Murray
  • Benjamin Peterson
  • Davin Potts
  • Lisa Roach (Cisco)
  • Guido van Rossum
  • Yury Selivanov (magic.io)
  • Gregory P. Smith (Google)
  • Eric Snow (Canonical)
  • Victor Stinner (Red Hat)
  • Zachary Ware

Special thanks to Łukasz for making the event happen and to Larry for designing the logo.

Categories: FLOSS Project Planets

Ben Hutchings: Debian LTS work, August 2016

Planet Debian - Tue, 2016-09-27 06:41

I was assigned 14.75 hours of work by Freexian's Debian LTS initiative and carried over 0.7 from last month. I worked a total of 14 hours, carrying over 1.45 hours.

I finished preparing and finally uploaded an update for linux (3.2.81-2). This took longer than expected due to the difficulty of reproducing CVE-2016-5696 and verifying the backported fix. I also released an upstream stable update (3.2.82) which will go into the next update in wheezy LTS.

I discussed a few other security updates and issues on the debian-lts mailing list.

Categories: FLOSS Project Planets

Amazee Labs: A day at DrupalCon with Corina

Planet Drupal - Tue, 2016-09-27 06:21
A day at DrupalCon with Corina

It's this time of the year again. DrupalCon Dublin is finally here. Even though it's my fourth Con already, it's always a special event.

Corina Schmid Tue, 09/27/2016 - 12:21

The conference officially only starts on Tuesday, but for me Monday is the more important day. It's the day of setting up our booth and making sure everything arrived, and in the evening there was our Amazee Story Telling Tour around Dublin. They say a picture says more than a thousand words, so here are some impressions of my DrupalCon Monday:

Once the booth was set up we enjoyed the welcome reception. By the way, if you want to see the finished booth make sure you come by and meet us at the exhibit hall; we're at booth 700.

To finish off the day we ventured out on our Story Telling Tour learning about Irish Tales.


Categories: FLOSS Project Planets

The Digital Cat: Python Mocks: a gentle introduction - Part 2

Planet Python - Tue, 2016-09-27 05:00

In the first post I introduced you to Python mocks, objects that can imitate other objects and work as placeholders, replacing external systems during unit testing. I described the basic behaviour of mock objects, the return_value and side_effect attributes, and the assert_called_with() method.
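As a quick refresher on those basics (using only the standard unittest.mock API; the method and argument names are arbitrary):

```python
from unittest.mock import Mock

# return_value: the mock returns a canned result when called
m = Mock()
m.some_method.return_value = 42
assert m.some_method('an argument') == 42
m.some_method.assert_called_with('an argument')

# side_effect: the mock raises (or computes) something instead
broken = Mock(side_effect=ValueError("boom"))
try:
    broken()
except ValueError:
    pass  # the configured exception was raised
```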

In this post I will briefly review the remaining assert_* methods and some interesting attributes that allow you to check the calls received by the mock object. Then I will introduce and exemplify patching, which is a very important topic in testing.

Other assertions and attributes

The official documentation of the mock library lists many other assertions, namely assert_called_once_with(), assert_any_call(), assert_has_calls(), and assert_not_called(). If you grasped how assert_called_with() works, you will have no trouble understanding how the others behave. Be sure to check the documentation for a full description of what mock objects can assert about their history after being used by your code.
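For instance (the call arguments below are arbitrary):

```python
from unittest.mock import Mock, call

m = Mock()
m('first')
m('second')

# Passes if the call happened at any point in the mock's history
m.assert_any_call('first')

# Checks for a sequence of calls
m.assert_has_calls([call('first'), call('second')])

# assert_called_once_with requires exactly one matching call
once = Mock()
once(1, x=2)
once.assert_called_once_with(1, x=2)

# assert_not_called raises if the mock was ever called
untouched = Mock()
untouched.assert_not_called()
```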

Together with those methods, mock objects also provide some useful attributes, two of which have already been reviewed in the first post. The remaining attributes are, as expected, mostly related to calls: called, call_count, call_args, call_args_list, method_calls, and mock_calls. While these too are very well described in the official documentation, I want to point out the method_calls and mock_calls attributes, which store the detailed list of methods called on the mock, and the call_args_list attribute, which lists the parameters of every call.

Do not forget that methods called on a mock object are mocks themselves, so you may first access the main mock object to get information about the called methods, and then access those methods to get the arguments they received.
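A short example of this two-level inspection (the method names here are arbitrary):

```python
from unittest.mock import Mock, call

m = Mock()
m.open('file.txt')
m.close()

# The main mock records which methods were called...
assert m.method_calls == [call.open('file.txt'), call.close()]

# ...and each method, being a mock itself, records its own calls
assert m.open.called
assert m.open.call_count == 1
assert m.open.call_args_list == [call('file.txt')]
```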


Patching

Mocks are very simple to introduce in your tests whenever your objects accept classes or instances from outside. In that case, as described, you just have to instantiate the Mock class and pass the resulting object to your system. However, when the external classes instantiated by your library are hardcoded, this simple trick does not work, and you have no way to pass a mock object instead of the real one.

This is exactly the case addressed by patching. Patching, in a testing framework, means to replace a globally reachable object with a mock, thus achieving the target of having the code run unmodified, while part of it has been hot swapped, that is, replaced at run time.

A warm-up example

Let us start with a very simple example. Patching can be complex to grasp at the beginning, so it is better to learn it with trivial code. If you do not have it yet, create the testing environment mockplayground with the instructions given in the previous post.

I want to develop a simple class that returns information about a given file. The class shall be instantiated with the filename, which can be a relative path.

For the sake of brevity I will not show you every step of the TDD development of the class. Remember that TDD requires you to write a test and then implement the code, but sometimes this can be too fine-grained, so do not apply the TDD rules without thinking.

The tests for the initialization of the class are

    from fileinfo import FileInfo

    def test_init():
        filename = 'somefile.ext'
        fi = FileInfo(filename)
        assert fi.filename == filename

    def test_init_relative_path():
        filename = 'somefile.ext'
        relative_path = '../{}'.format(filename)
        fi = FileInfo(relative_path)
        assert fi.filename == filename

You can put them into the tests/test_fileinfo.py file. The code that makes the tests pass could be something like

    import os

    class FileInfo:
        def __init__(self, path):
            self.original_path = path
            self.filename = os.path.basename(path)

Up to now I didn't introduce any new feature. Now I want the get_info() function to return a tuple with the file name, the original path the class was instantiated with, and the absolute path of the file.

You immediately realise that you have an issue in writing the test. There is no way to easily test something as "the absolute path", since the outcome of the function called in the test is supposed to vary with the path of the test itself. Let us write part of the test

    def test_get_info():
        filename = 'somefile.ext'
        original_path = '../{}'.format(filename)
        fi = FileInfo(original_path)
        assert fi.get_info() == (filename, original_path, '???')

where the '???' string highlights that I cannot put something sensible to test the absolute path of the file.

Patching is the way to solve this problem. You know that the function will use some code to get the absolute path of the file. So, in the scope of the test only, you can replace that code with different code and perform the test. Since the replacement code has a known outcome, writing the test is now possible.

Patching, thus, means to inform Python that in some scope you want a globally accessible module/object replaced by a mock. Let's see how we can use it in our example

    from unittest.mock import patch

    [...]

    def test_get_info():
        filename = 'somefile.ext'
        original_path = '../{}'.format(filename)
        with patch('os.path.abspath') as abspath_mock:
            test_abspath = 'some/abs/path'
            abspath_mock.return_value = test_abspath
            fi = FileInfo(original_path)
            assert fi.get_info() == (filename, original_path, test_abspath)

Remember that if you are using Python 2 you installed the mock module with pip, so your import statement becomes from mock import patch.

You clearly see the context in which the patching happens, as it is enclosed in a with statement. Inside this statement the function os.path.abspath will be replaced by a mock created by patch and called abspath_mock. We can now give the mock a return_value, as we did with standard mocks in the first post, and run the test.

The code that makes the test pass is

    class FileInfo:
        [...]
        def get_info(self):
            return self.filename, self.original_path, os.path.abspath(self.filename)

Obviously to write the test you have to know that you are going to use the os.path.abspath function, so patching is somehow a "less pure" practice in TDD. In pure OOP/TDD you are only concerned with the external behaviour of the object, and not with its internal structure. This example, however, shows that you have to cope with some real world issues, and patching is a clean way to do it.
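One detail worth stressing is that the replacement only lives inside the with block: patch puts the original object back as soon as the block exits. A minimal demonstration:

```python
import os.path
from unittest.mock import patch

real_abspath = os.path.abspath

with patch('os.path.abspath') as abspath_mock:
    abspath_mock.return_value = '/fake/path'
    # Inside the block the mock answers instead of the real function
    assert os.path.abspath('anything') == '/fake/path'

# Outside the block the original function has been restored
assert os.path.abspath is real_abspath
```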

The patching decorator

The patch function we imported from the unittest.mock module is very powerful, and can be used as a function decorator as well. When used in this fashion, you need to change the decorated function to accept a mock as its last argument.

    @patch('os.path.abspath')
    def test_get_info(abspath_mock):
        filename = 'somefile.ext'
        original_path = '../{}'.format(filename)
        test_abspath = 'some/abs/path'
        abspath_mock.return_value = test_abspath
        fi = FileInfo(original_path)
        assert fi.get_info() == (filename, original_path, test_abspath)

As you can see the patch decorator works like a big with statement for the whole function. Obviously in this way you replace the target function os.path.abspath in the scope of the whole function. It is then up to you to decide if you need to use patch as a decorator or in a with block.

Multiple patches

We can also patch more than one object. Say for example that we want to change the above test to check that the outcome of the FileInfo.get_info() method also contains the size of the file. To get the size of a file in Python we may use the os.path.getsize() function, which returns the size of the file in bytes.

So now we have to patch os.path.getsize as well, and this can be done with another patch decorator.

    @patch('os.path.getsize')
    @patch('os.path.abspath')
    def test_get_info(abspath_mock, getsize_mock):
        filename = 'somefile.ext'
        original_path = '../{}'.format(filename)
        test_abspath = 'some/abs/path'
        abspath_mock.return_value = test_abspath
        test_size = 1234
        getsize_mock.return_value = test_size
        fi = FileInfo(original_path)
        assert fi.get_info() == (filename, original_path, test_abspath, test_size)

Please notice that the decorator which is nearest to the function is applied first. Always remember that the decorator syntax with @ is a shortcut to replace the function with the output of the decorator, so two decorators result in

    @decorator1
    @decorator2
    def myfunction():
        pass

which is a shortcut for

    def myfunction():
        pass

    myfunction = decorator1(decorator2(myfunction))

This explains why, in the test code, the function receives first abspath_mock and then getsize_mock. The first decorator applied to the function is the patch of os.path.abspath, which appends the mock that we call abspath_mock. Then the patch of os.path.getsize is applied and this appends its own mock.

The code that makes the test pass is

    class FileInfo:
        [...]
        def get_info(self):
            return self.filename, self.original_path, \
                os.path.abspath(self.filename), os.path.getsize(self.filename)

We can write the above test using two with statements as well

    def test_get_info():
        filename = 'somefile.ext'
        original_path = '../{}'.format(filename)
        with patch('os.path.abspath') as abspath_mock:
            test_abspath = 'some/abs/path'
            abspath_mock.return_value = test_abspath
            with patch('os.path.getsize') as getsize_mock:
                test_size = 1234
                getsize_mock.return_value = test_size
                fi = FileInfo(original_path)
                assert fi.get_info() == (filename, original_path, test_abspath, test_size)

Using more than one with statement, however, makes the code difficult to read, in my opinion, so in general I prefer to avoid nested with trees unless I really need to limit the scope of the patching.

Patching immutable objects

The most widespread version of Python is CPython, which is written, as the name suggests, in C. Part of the standard library is also written in C, while the rest is written in Python itself.

The objects (classes, modules, functions, etc.) that are implemented in C are shared between interpreters, which is something that you can do by embedding the Python interpreter in a C program, for example. This requires those objects to be immutable, so that you cannot alter them at runtime from a single interpreter.

For an example of this immutability just check the following code

    >>> a = 1
    >>> a.conjugate = 5
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'int' object attribute 'conjugate' is read-only

Here I'm trying to replace a method with an integer, which is pointless, but nevertheless shows the issue we are facing.

What has this immutability to do with patching? What patch actually does is temporarily replace an attribute of an object (a method of a class, a class of a module, etc.), so if that object is immutable the patching action fails.

A typical example of this problem is the datetime module, which is also one of the best candidates for patching, since the output of time functions is by definition time-varying.

Let me show the problem with a simple class that logs operations. The class is the following (you can put it into a file called logger.py)

import datetime


class Logger:
    def __init__(self):
        self.messages = []

    def log(self, message):
        self.messages.append((datetime.datetime.now(), message))

This is pretty simple, but testing this code is problematic, because the log() method produces results that depend on the actual execution time.

If we try to write a test patching datetime.datetime.now we have a bitter surprise. This is the test code, that you can put in tests/test_logger.py

from unittest.mock import patch

from logger import Logger


def test_init():
    l = Logger()
    assert l.messages == []


@patch('datetime.datetime.now')
def test_log(mock_now):
    test_now = 123
    test_message = "A test message"
    mock_now.return_value = test_now

    l = Logger()
    l.log(test_message)

    assert l.messages == [(test_now, test_message)]

and the execution of pytest returns a TypeError: can't set attributes of built-in/extension type 'datetime.datetime', which is exactly a problem of immutability.

There are several ways to address this problem, but all of them leverage the fact that, when you import or subclass an immutable object, what you get is a "copy" of it that is now mutable.

The easiest example in this case is the datetime module itself. In the test_log function we tried to patch the datetime.datetime.now object directly, affecting the built-in datetime module. The file logger.py, however, does import datetime, so the latter becomes a local symbol in the logger module. This is exactly the key to our patching. Let us change the code to

@patch('logger.datetime.datetime')
def test_log(mock_datetime):
    test_now = 123
    test_message = "A test message"
    mock_datetime.now.return_value = test_now

    l = Logger()
    l.log(test_message)

    assert l.messages == [(test_now, test_message)]

As you can see, running the test now shows that the patching works. What we did was to patch logger.datetime.datetime instead of datetime.datetime.now. Two things changed, then, in our test. First, we are patching the module imported in the logger.py file and not the module provided globally by the Python interpreter. Second, we have to patch the whole module, because this is what is imported by the logger.py file. If you try to patch logger.datetime.datetime.now you will find that it is still immutable.

Another possible solution to this problem is to create a function that invokes the immutable object and returns its value. This wrapper function can be easily patched, because it is a plain function in our own module and thus is not immutable. This solution, however, requires changing the source code to allow testing, which is far from desirable. Obviously it is better to introduce a small change in the code and have it tested than to leave it untested, but whenever possible I avoid solutions that introduce code which wouldn't be required without tests.

Final words

In this second part of this small series on Python testing we reviewed the patching mechanism and ran through some of its subtleties. Patching is a really effective technique, and patch-based tests can be found in many different packages. Take your time to become confident with mocks and patching, since they will be among your main tools while working with Python and any other object-oriented language.

As always, I strongly recommend finding some time to read the official documentation of the mock library.


Feel free to use the blog Google+ page to comment the post. The GitHub issues page is the best place to submit corrections

Categories: FLOSS Project Planets

Semaphore Community: Mocks and Monkeypatching in Python

Planet Python - Tue, 2016-09-27 03:31

This article is brought with ❤ to you by Semaphore.

Post originally published on http://krzysztofzuraw.com/. Republished with author's permission.


In this post I will look into the essential part of testing — mocks.

First of all, what I want to accomplish here is to give you basic examples of how to mock data using two tools — mock and pytest monkeypatch.

Why bother mocking?

Some parts of our application may depend on other libraries or objects. To isolate the behaviour of those parts, we need to substitute their external dependencies. This is where mocking comes in: we mock an external API to check certain behaviours, such as proper return values, that we previously defined.

Mocking function

Let’s say we have a module called function.py:

def square(value):
    return value ** 2


def cube(value):
    return value ** 3


def main(value):
    return square(value) + cube(value)

Then let’s see how these functions are mocked using the mock library:

try:
    import mock
except ImportError:
    from unittest import mock

import unittest

from function import square, main


class TestNotMockedFunction(unittest.TestCase):

    @mock.patch('__main__.square', return_value=1)
    def test_function(self, mocked_square):
        # you need to patch in the exact place where the mocked function is called
        self.assertEquals(square(5), 1)

    @mock.patch('function.square')
    @mock.patch('function.cube')
    def test_main_function(self, mocked_cube, mocked_square):
        # note: the decorator closest to the function provides the first argument,
        # so mocked_cube comes first here
        # underlying functions are mocks, so calling main(5) will return a mock
        mocked_square.return_value = 1
        mocked_cube.return_value = 0
        self.assertEquals(main(5), 1)
        mocked_square.assert_called_once_with(5)
        mocked_cube.assert_called_once_with(5)


if __name__ == '__main__':
    unittest.main()

What is happening here? Lines 1-4 are for making this code compatible between Python 2 and 3. In Python 3, mock is part of the standard library, whereas in Python 2 you need to install it by pip install mock.

In line 13, I patched the square function. You have to remember to patch it in the same place you use it. For instance, I’m calling square(5) in the test itself so I need to patch it in __main__. This is the case if I’m running this by using python tests/test_function.py. If I’m using pytest for that, I need to patch it as test_function.square.

In lines 18-19, I patch the square and cube functions in their module because they are used in the main function. The last two asserts come from the mock library, and are there to make sure that mock was called with proper values.
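These assertion helpers are methods of the mock itself; a tiny standalone sketch (not taken from the article's code) shows how they behave:

```python
from unittest.mock import Mock

mocked = Mock(return_value=1)
mocked(5)

mocked.assert_called_once_with(5)       # passes silently

try:
    mocked.assert_called_once_with(6)   # wrong arguments
except AssertionError:
    print("mock was not called with 6")
```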

The same can be accomplished using monkeypatching for py.test:

from function import square, main


def test_function(monkeypatch):
    monkeypatch.setattr("test_function_pytest.square", lambda x: 1)
    assert square(5) == 1


def test_main_function(monkeypatch):
    monkeypatch.setattr('function.square', lambda x: 1)
    monkeypatch.setattr('function.cube', lambda x: 0)
    assert main(5) == 1

As you can see, I’m using monkeypatch.setattr for setting up a return value for given functions. I still need to monkeypatch it in proper places — test_function_pytest and function.

Mocking classes

I have a module called square:

import math


class Square(object):
    def __init__(self, radius):
        self.radius = radius

    def calculate_area(self):
        return math.sqrt(self.radius) * math.pi

and mocks using standard lib:

try:
    import mock
except ImportError:
    from unittest import mock

import unittest

from square import Square


class TestClass(unittest.TestCase):

    @mock.patch('__main__.Square')  # depends on where it is run from
    def test_mocking_instance(self, mocked_instance):
        mocked_instance = mocked_instance.return_value
        mocked_instance.calculate_area.return_value = 1

        sq = Square(100)
        self.assertEquals(sq.calculate_area(), 1)

    def test_mocking_classes(self):
        sq = Square
        sq.calculate_area = mock.MagicMock(return_value=1)
        self.assertEquals(sq.calculate_area(), 1)

    @mock.patch.object(Square, 'calculate_area')
    def test_mocking_class_methods(self, mocked_method):
        mocked_method.return_value = 20
        self.assertEquals(Square.calculate_area(), 20)


if __name__ == '__main__':
    unittest.main()

At line 13, I patch the class Square. Lines 15 and 16 show the mocking of an instance: mocked_instance is a mock object which returns another mock by default, and on that returned mock I set calculate_area to return 1. In line 23, I'm using MagicMock, which is a normal mock class, except that it also preconfigures the magic methods of the given object. Lastly, I use patch.object to mock the method in the Square class.
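The difference between Mock and MagicMock can be seen in isolation; this is a rough standalone sketch, not part of the article's repository:

```python
from unittest.mock import Mock, MagicMock

magic = MagicMock()
print(len(magic))        # magic methods come preconfigured, len() returns 0

plain = Mock()
try:
    len(plain)
except TypeError:
    print("a plain Mock does not support len()")
```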

The same using pytest:

try:
    from mock import MagicMock
except ImportError:
    from unittest.mock import MagicMock

from square import Square


def test_mocking_class_methods(monkeypatch):
    monkeypatch.setattr('test_class_pytest.Square.calculate_area', lambda: 1)
    assert Square.calculate_area() == 1


def test_mocking_classes(monkeypatch):
    monkeypatch.setattr('test_class_pytest.Square', MagicMock(Square))
    sq = Square
    sq.calculate_area.return_value = 1
    assert sq.calculate_area() == 1

The issue here is with test_mocking_class_methods, which works well in Python 3, but not in Python 2.

All examples can be found in this repo.

If you have any questions and comments, feel free to leave them in the section below.



Categories: FLOSS Project Planets

Adding API dox generation to the build by CMake macros

Planet KDE - Tue, 2016-09-27 00:48

Online documentations of APIs are a fine thing: by their urls they provide some shared universal location to refer to and often also are kept up-to-date to latest improvements. And they serve as index trap for search engine robots, so more people have the chance to find out about some product.

But the world is sometimes not an online heaven:

  • “my internet is gone!”
  • documentation webserver is down
  • documentation webserver is slooo….oo…ow
  • documentation webserver no longer has documentation for that older version

So having also access to offline documentation as alternative can make some developers using your library happy. If you are using Qt in your projects, you yourself might even prefer to study the offline Qt API documentation installed on your local systems, either via the Qt Assistant program or as integrated in IDEs like Qt Creator or KDevelop. Even better, the offline documentation comes together with the library itself (e.g. via packages on Linux or *BSD distributions), so the documentation is always matching the version of the library.

Being a good developer you have already added lots of nice API documentation in your code. Now how to turn this into a separate offline documentation manual for the consumers of your library, without too much effort?
If your project uses CMake for the build system and your target audience is using documentation viewers like Qt Assistant, Qt Creator or KDevelop, read on for a possible solution.

Deploying doxygen & qhelpgenerator

Very possibly you are already using doxygen to generate e.g. an HTML version of the API dox for your library project. And you might know doxygen can output to more formats than HTML. One of them is the format “Qt Compressed Help” (QCH). Which is a single-file format the above mentioned documentation viewers support well. QCH support by doxygen even exists since 2009.

So, with doxygen & qhelpgenerator installed and a respective doxygen config file prepared, the build script of your library just needs to additionally call doxygen as part of another target, and there is the API documentation manual as part of your project build, ready to be deployed and packaged with your library.

A CMake macro for convenience

To save everyone from duplicating the needed build setup code, like checking that the required tools are installed, and from having to learn all the details of the doxygen config file, it makes sense to write a CMake macro that does the dull work and to distribute it for reuse.

Before doing that though, let’s consider another aspect when it comes to documenting libraries: including documentation for code constructs used from external libraries.

Linking between offline API dox manuals

Especially for C++ libraries, where subclassing across classes from different libraries is common, one wants to have direct links into the documentation of external libraries, so the reader has transparent access to any info.

The Qt help system has an abstract addressing system which works independently from the actual location of the QCH files e.g. in the filesystem. These addresses use the scheme qthelp: and a domain namespace for identifying a given manual (e.g. org.qt-project.qtcore). Additionally there is the concept of a virtual folder which is also set as root directory within a manual (e.g. qtcore). So the address of the document for the class QString would be qthelp://org.qt-project.qtcore/qtcore/qstring.html, and the documentation viewer would resolve it to the manuals registered by the matching metadata.
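For illustration, the address from the example above can be assembled from its three parts (a plain Python sketch, not part of any Qt API):

```python
namespace = "org.qt-project.qtcore"   # domain namespace identifying the manual
virtual_folder = "qtcore"             # root directory within the manual
page = "qstring.html"                 # document for the class QString

url = "qthelp://{}/{}/{}".format(namespace, virtual_folder, page)
print(url)  # → qthelp://org.qt-project.qtcore/qtcore/qstring.html
```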

Doxygen supports the creation of such links into external QCH files. For a given manual of an external library, the references are created based on the domain name, the virtual folder name and a so-called tag file for that library. Such a tag file contains all kind of metadata about classes, methods and more that is needed to document imported structures and to create the references.

The QCH files for the Qt libraries come with such tag files. But what about your own library? What if somebody else should be able to create documentation for their code with links into the documentation of your library?
Doxygen helps here as well: during the creation of the API dox manual it optionally also creates a respective tag file.

Extending CMake Config files with API dox info

So for generating the references into some external library documentation, the location of its tag file, its name of the domain and its name of the virtual folder need to be known and explicitly added to the doxygen config file. Which is extra work, error-prone due to typos and might miss changes.

But it can be automated. All the information is known to the builder of that external library. And with the builder storing the information in standardized CMake variables with the external library’s CMake config files, it could also be automatically processed into the needed doxygen config entries in the build of your own library.

So a simple find_package(Bar) would bring in all the needed data, and passing Bar as name of the external library should be enough to let the to-be-written CMake macro derive the names of the variables with that data.

So with these theoretic considerations finally some real code as draft:

Patch uploaded for your review and comments

As possible extension of the Extra-CMake-Modules (ECM) the review request https://phabricator.kde.org/D2854 proposes 2 macros for making it simple and easy to add to the build of own libraries the generation of API dox manuals with links into external API dox manuals.

The call of the macro for generating the tag and qch files would e.g. be like this:
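The macro call itself is missing from this copy of the post; judging from the installed file names below and the ecm_generate_qch() name used later in the text, it would look roughly like the following hypothetical sketch (the argument names are my assumption, consult the actual review D2854 for the real signature):

```cmake
# hypothetical sketch, not the reviewed API
ecm_generate_qch(
    Foo
    SOURCE_DIRS ${CMAKE_SOURCE_DIR}/src
)
```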


This would create and install the files ${KDE_INSTALL_FULL_DATADIR}/qch/Foo.qch and ${KDE_INSTALL_FULL_DATADIR}/qch/Foo.tags.

The call for generating the CMake Config file would be:

ecm_generate_package_apidox_file(
    ${CMAKE_CURRENT_BINARY_DIR}/FooConfigApiDox.cmake
    NAMES Foo
)

This generates the file FooConfigApiDox.cmake, whose content is setting the respective variables with the API dox metadata:

set(Foo_APIDOX_TAGSFILE "/usr/share/docs/Foo.tags")
set(Foo_APIDOX_QHP_NAMESPACE "org.kde.Foo")
set(Foo_APIDOX_QHP_NAMESPACE_VERSIONED "org.kde.Foo.210")
set(Foo_APIDOX_QHP_VIRTUALFOLDER "Foo")

This file would then be included in the FooConfig.cmake file

# ...
include("${CMAKE_CURRENT_LIST_DIR}/FooApiDox.cmake")
# ...

and installed together in the same directory.

Tightening and extending integration

In the current draft, for defining what source files doxygen should include for generating the API documentation, the macro ecm_generate_qch() takes simply a list of directories via the SOURCE_DIRS argument list, which then together with some wildcard patterns for typical C and C++ files is written to the doxygen config file.
Though being part of the build system, there might be more specific info available what files should be used, e.g. by using the PUBLIC_HEADER property of a library target. Anyone experienced with the usage of that or similar CMake features are invited to propose how to integrate this into the API of the proposed new macro.

Currently there is no standard installation directory for QCH files. So documentation viewers supporting QCH files need to be manually pointed to any new installed QCH file (exception are the QCH files for Qt, whose location can be queried by qmake -query QT_INSTALL_DOCS).
Instead it would be nice those viewers pick up such files automatically. How could that be achieved?

It would be also good if this or some similar API dox metadata system could be upstreamed into CMake itself, so there is a reliable standard for this, helping everyone with the creation of cross-project integrated API manuals. At least the metadata variable naming patterns would need to get wide recognition to get really useful. Anyone interested in pushing this upstream?

For proper documentation of public and non-public API perhaps the doxygen development could also pick up this old thread about missing proper support for marking classes & methods as internal. Or is there meanwhile a good solution for this?

On the product side, what manual formats beside QCH should be supported?
On the tool side, what other API documentation tools should be supported (perhaps something clang-based)?

Comments welcome here or on the phabricator page.

Categories: FLOSS Project Planets

Experienced Django: Shallow Dive into Django ORM

Planet Python - Mon, 2016-09-26 21:57
A Closer Look at the Django ORM and Many-To-Many Relationships

In the last post I worked some on the data model for the KidsTasks app and discovered that a many-to-many relationship would not allow multiple copies of the same task to exist in a given schedule. Further reading showed me, without much explanation, that using a “through” parameter on the relationship definition fixed that. In this post I want to take a closer look at what’s going on in that django model magic.

Django Shell

As part of my research for this topic, I was led to a quick description of the Django shell, which is great for testing out ideas and playing with the models you're developing. I found a good description here (which also gives a look at filters and QuerySets).

Additionally, I’ll note for anyone wanting to play along at home, that the following sequence of commands was quite helpful to have handy when testing different models.

$ rm tasks/migrations db.sqlite3 -rf
$ ./manage.py makemigrations tasks
$ ./manage.py migrate
$ ./manage.py shell
Python 3.4.3 (default, Oct 14 2015, 20:33:09)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)

Many To Many without an Intermediate Class

I’ll start by examining what happened with my original model design where a DayOfWeekSchedule had a ManyToMany relationship with Task.

Simple Solution Code

The simplified model I’ll use here looks like this.

class Task(models.Model):
    name = models.CharField(max_length=256)
    required = models.BooleanField()

    def __str__(self):
        return self.name


class DayOfWeekSchedule(models.Model):
    tasks = models.ManyToManyField(Task)
    name = models.CharField(max_length=20)

    def __str__(self):
        return self.name

Note that the ManyToMany field directly accesses the Task class. (Also note that I retained the __str__ methods to make the shell output more meaningful.)


In the shell experiment shown in the listing below, I set up a few Tasks and a couple of DayOfWeekSchedules and then add "first task" and "second task" to one of the schedules. Once this is done, I attempt to add "first task" to the schedule again and we see that it does not have the desired effect.
>>> # import our models
>>> from tasks.models import Task, DayOfWeekSchedule
>>>
>>> # populate our database with some simple tasks and schedules
>>> Task.objects.create(name="first task", required=False)
<Task: first task>
>>> Task.objects.create(name="second task", required=True)
<Task: second task>
>>> Task.objects.create(name="third task", required=False)
<Task: third task>
>>> DayOfWeekSchedule.objects.create(name="sched1")
<DayOfWeekSchedule: sched1>
>>> DayOfWeekSchedule.objects.create(name="sched2")
<DayOfWeekSchedule: sched2>
>>> Task.objects.all()
<QuerySet [<Task: first task>, <Task: second task>, <Task: third task>]>
>>> DayOfWeekSchedule.objects.all()
<QuerySet [<DayOfWeekSchedule: sched1>, <DayOfWeekSchedule: sched2>]>
>>>
>>> # add a task to a schedule
>>> s = DayOfWeekSchedule.objects.get(name='sched2')
>>> t = Task.objects.get(name='first task')
>>> s.tasks.add(t)
>>> s.tasks.all()
<QuerySet [<Task: first task>]>
>>>
>>> # add other task to that schedule
>>> t = Task.objects.get(name='second task')
>>> s.tasks.add(t)
>>> s.tasks.all()
<QuerySet [<Task: first task>, <Task: second task>]>
>>>
>>> # attempt to add the first task to the schedule again
>>> s = DayOfWeekSchedule.objects.get(name='sched2')
>>> t = Task.objects.get(name='first task')
>>> s.tasks.add(t)
>>> s.tasks.all()
<QuerySet [<Task: first task>, <Task: second task>]>

Note that at the end, we still only have a single copy of “first task” in the schedule.

Many To Many with an Intermediate Class

Now we’ll retry the experiment with the “through=” intermediate class specified in the ManyToMany relationship.

Not-Quite-As-Simple Solution Code

The model code for this is quite similar.  Note the addition of the “through=” option and of the DayTask class.

from django.db import models


class Task(models.Model):
    name = models.CharField(max_length=256)
    required = models.BooleanField()

    def __str__(self):
        return self.name


class DayOfWeekSchedule(models.Model):
    tasks = models.ManyToManyField(Task, through='DayTask')
    name = models.CharField(max_length=20)

    def __str__(self):
        return self.name


class DayTask(models.Model):
    task = models.ForeignKey(Task)
    schedule = models.ForeignKey(DayOfWeekSchedule)

Experiment #2

This script is as close as possible to the first one. The only difference is the extra steps we need to take to add the ManyToMany relationship. We need to manually create the DayTask object, initializing it with the Task and Schedule objects, and then save it. While this is slightly more cumbersome in the code, it does produce the desired results; two copies of "first task" are present in the schedule at the end.

>>> # import our models
>>> from tasks.models import Task, DayOfWeekSchedule, DayTask
>>>
>>> # populate our database with some simple tasks and schedules
>>> Task.objects.create(name="first task", required=False)
<Task: first task>
>>> Task.objects.create(name="second task", required=True)
<Task: second task>
>>> Task.objects.create(name="third task", required=False)
<Task: third task>
>>> DayOfWeekSchedule.objects.create(name="sched1")
<DayOfWeekSchedule: sched1>
>>> DayOfWeekSchedule.objects.create(name="sched2")
<DayOfWeekSchedule: sched2>
>>> Task.objects.all()
<QuerySet [<Task: first task>, <Task: second task>, <Task: third task>]>
>>> DayOfWeekSchedule.objects.all()
<QuerySet [<DayOfWeekSchedule: sched1>, <DayOfWeekSchedule: sched2>]>
>>>
>>> # add a task to a schedule
>>> s = DayOfWeekSchedule.objects.get(name='sched2')
>>> t = Task.objects.get(name='first task')
>>> # cannot simply add directly, must create intermediate object, see
>>> # https://docs.djangoproject.com/en/1.9/topics/db/models/#extra-fields-on-many-to-many-relationships
>>> # s.tasks.add(t)
>>> d1 = DayTask(task=t, schedule=s)
>>> d1.save()
>>> s.tasks.all()
<QuerySet [<Task: first task>]>
>>>
>>> # add other task to that schedule
>>> t = Task.objects.get(name='second task')
>>> dt2 = DayTask(task=t, schedule=s)
>>> dt2.save()
>>> # s.tasks.add(t)
>>> s.tasks.all()
<QuerySet [<Task: first task>, <Task: second task>]>
>>>
>>> # attempt to add the first task to the schedule again
>>> s = DayOfWeekSchedule.objects.get(name='sched2')
>>> t = Task.objects.get(name='first task')
>>> dt3 = DayTask(task=t, schedule=s)
>>> dt3.save()
>>> s.tasks.all()
<QuerySet [<Task: first task>, <Task: second task>, <Task: first task>]>

But…Why?

The short answer is that I’m not entirely sure why the intermediate class is needed to allow multiple instances.  It’s fairly clear that it is tied to how the Django code manages those relationships.  Evidence confirming that can be seen in the migration script generated for each of the models.

The first model generates these operations:

operations = [
    migrations.CreateModel(
        name='DayOfWeekSchedule',
        fields=[
            ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ('name', models.CharField(max_length=20)),
        ],
    ),
    migrations.CreateModel(
        name='Task',
        fields=[
            ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ('name', models.CharField(max_length=256)),
            ('required', models.BooleanField()),
        ],
    ),
    migrations.AddField(
        model_name='dayofweekschedule',
        name='tasks',
        field=models.ManyToManyField(to='tasks.Task'),
    ),
]

Notice the final AddField call which adds “tasks” to the “dayofweekschedule” model directly.

The second model (shown above) generates a slightly different set of migration operations:

operations = [
    migrations.CreateModel(
        name='DayOfWeekSchedule',
        fields=[
            ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ('name', models.CharField(max_length=20)),
        ],
    ),
    migrations.CreateModel(
        name='DayTask',
        fields=[
            ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ('schedule', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='tasks.DayOfWeekSchedule')),
        ],
    ),
    migrations.CreateModel(
        name='Task',
        fields=[
            ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ('name', models.CharField(max_length=256)),
            ('required', models.BooleanField()),
        ],
    ),
    migrations.AddField(
        model_name='daytask',
        name='task',
        field=models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='tasks.Task'),
    ),
    migrations.AddField(
        model_name='dayofweekschedule',
        name='tasks',
        field=models.ManyToManyField(through='tasks.DayTask', to='tasks.Task'),
    ),
]

This time it adds task to the daytask and dayofweekschedule classes.  I have to admit here that I really wanted this to show the DayTask object being used in the DayOfWeekSchedule class as a proxy, but that’s not the case.

Examining the databases generated by these two models showed no significant differences there, either.

A Quick Look at the Source

One of the beauties of working with open source software is the ability to dive in and see for yourself what’s going on.  Looking at the Django source, you can find the code that adds a relationship in django/db/models/fields/related_descriptors.py (at line 918 in the version I checked out).

def add(self, *objs):
    ... stuff deleted ...
    self._add_items(self.source_field_name, self.target_field_name, *objs)

(actually _add_items can be called twice, once for a forward and once for a reverse relationship).  Looking at _add_items (line 1041 in my copy), we see after building the list of new_ids to insert, this chunk of code:

db = router.db_for_write(self.through, instance=self.instance)
vals = (self.through._default_manager.using(db)
        .values_list(target_field_name, flat=True)
        .filter(**{
            source_field_name: self.related_val[0],
            '%s__in' % target_field_name: new_ids,
        }))
new_ids = new_ids - set(vals)

which I suspect provides the difference. This code gets the list of current values in the relation table and removes that set from the set of new_ids. I believe that the filter here will respond differently if we have an intermediate class defined. NOTE: I did not run this code live to test this theory, so if I'm wrong, feel free to point out how and where in the comments.
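The effect of that set subtraction is easy to see in plain Python (toy ids, not real Django data):

```python
# ids already stored in the relation table for this schedule
existing_ids = {1, 2}

# ids we are asking to add; 1 is already present
new_ids = {1, 3}

# the filter in _add_items effectively computes this difference,
# so only genuinely new rows get inserted
to_insert = new_ids - existing_ids
print(to_insert)  # → {3}
```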

Even if this is not quite correct, after walking through some code, I’m satisfied that the intermediate class definitely causes some different behavior internally in Django.

Next time I’ll jump back into the KidsTasks code.

Thanks for reading!

Categories: FLOSS Project Planets

Daniel Bader: Python Code Review: Unplugged – Episode 2

Planet Python - Mon, 2016-09-26 20:00
Python Code Review: Unplugged – Episode 2

This is the second episode of my video code review series where I record myself giving feedback and refactoring a reader’s Python code.

The response to the first Code Review: Unplugged video was super positive. I got a ton of emails and comments on YouTube saying that the video worked well as a teaching tool and that I should do more of them.

And so I did just that 😃. Milton sent me a link to his Python 3 project on GitHub and I recorded another code review based on his code. You can watch it below:

Milton is on the right track with his Python journey. I liked how he split his web scraper program into functions that each handle a different phase: fetching the HTML, parsing it, and generating the output file.

The main thing that this code base could benefit from would be consistent formatting. Making the formatting as regular and consistent as possible really helps with keeping the “mental overhead” low when you’re working on the code or handing it off to someone else.

And the beautiful thing is that there’s an easy fix for this, too. I demo a tool called Flake8 in the video. Flake8 is a code linter and code style checker – and it’s great for making sure your code has consistent formatting and avoids common pitfalls or anti-patterns.

You can even integrate Flake8 into your editing environment so that it checks your code as you write it.

(Shameless plug: The book I’m working on has a whole chapter on integrating Flake8 into the Sublime Text editor. Check it out if you’d like to learn how to set up a Python development environment just like the one I’m using in the video).

Besides formatting, the video also covers things like writing a great GitHub README, how to name functions and modules, and the use of constants to simplify your Python code. So be sure to watch the whole thing when you get the chance.

Again, I left the video completely unedited. That’s why I’m calling this series Code Review: Unplugged. It’s definitely not a polished tutorial or course. But based on the feedback I got so far that seems to be part of the appeal.

Links & Resources:

One more quick tip for you: You can turn these videos into a fun Python exercise for yourself. Just pause the video before I dig into the code and do your own code review first. Spend 10 to 20 minutes taking notes and refactoring the code and then continue with the video to compare your solution with mine. Let me know how this worked out! 😊

Categories: FLOSS Project Planets

Kees Cook: security things in Linux v4.3

Planet Debian - Mon, 2016-09-26 18:54

When I gave my State of the Kernel Self-Protection Project presentation at the 2016 Linux Security Summit, I included some slides covering some quick bullet points on things I found of interest in recent Linux kernel releases. Since there wasn’t a lot of time to talk about them all, I figured I’d make some short blog posts here about the stuff I was paying attention to, along with links to more information. This certainly isn’t everything security-related or generally of interest, but they’re the things I thought needed to be pointed out. If there’s something security-related you think I should cover from v4.3, please mention it in the comments. I’m sure I haven’t caught everything. :)

A note on timing and context: the momentum for starting the Kernel Self Protection Project got rolling well before it was officially announced on November 5th last year. To that end, I included stuff from v4.3 (which was developed in the months leading up to November) under the umbrella of the project, since the goals of KSPP aren’t unique to the project nor must the goals be met by people that are explicitly participating in it. Additionally, not everything I think worth mentioning here technically falls under the “kernel self-protection” ideal anyway — some things are just really interesting userspace-facing features.

So, to that end, here are things I found interesting in v4.3:

CONFIG_CPU_SW_DOMAIN_PAN

Russell King implemented this feature for ARM which provides emulated segregation of user-space memory when running in kernel mode, by using the ARM Domain access control feature. This is similar to a combination of Privileged eXecute Never (PXN, in later ARMv7 CPUs) and Privileged Access Never (PAN, coming in future ARMv8.1 CPUs): the kernel cannot execute user-space memory, and cannot read/write user-space memory unless it was explicitly prepared to do so. This stops a huge set of common kernel exploitation methods, where either a malicious executable payload has been built in user-space memory and the kernel was redirected to run it, or where malicious data structures have been built in user-space memory and the kernel was tricked into dereferencing the memory, ultimately leading to a redirection of execution flow.

This raises the bar for attackers since they can no longer trivially build code or structures in user-space where they control the memory layout, locations, etc. Instead, an attacker must find areas in kernel memory that are writable (and in the case of code, executable), where they can discover the location as well. For an attacker, there are vastly fewer places where this is possible in kernel memory as opposed to user-space memory. And as we continue to reduce the attack surface of the kernel, these opportunities will continue to shrink.

While hardware support for this kind of segregation exists in s390 (natively separate memory spaces), ARM (PXN and PAN as mentioned above), and very recent x86 (SMEP since Ivy-Bridge, SMAP since Skylake), ARM is the first upstream architecture to provide this emulation for existing hardware. Everyone running ARMv7 CPUs with this kernel feature enabled suddenly gains the protection. Similar emulation protections (PAX_MEMORY_UDEREF) have been available in PaX/Grsecurity for a while, and I’m delighted to see a form of this land in upstream finally.

To test this kernel protection, the ACCESS_USERSPACE and EXEC_USERSPACE triggers for lkdtm have existed since Linux v3.13, when they were introduced in anticipation of the x86 SMEP and SMAP features.

Ambient Capabilities

Andy Lutomirski (with Christoph Lameter and Serge Hallyn) implemented a way for processes to pass capabilities across exec() in a sensible manner. Until Ambient Capabilities, any capabilities available to a process would only be passed to a child process if the new executable was correctly marked with filesystem capability bits. This turns out to be a real headache for anyone trying to build an even marginally complex “least privilege” execution environment. The case that Chrome OS ran into was having a network service daemon responsible for calling out to helper tools that would perform various networking operations. Keeping the daemon not running as root and retaining the needed capabilities in children required conflicting or crazy filesystem capabilities organized across all the binaries in the expected tree of privileged processes. (For example you may need to set filesystem capabilities on bash!) By being able to explicitly pass capabilities at runtime (instead of based on filesystem markings), this becomes much easier.

For more details, the commit message is well-written, almost twice as long as the code changes, and contains a test case. If that isn’t enough, there is a self-test available in tools/testing/selftests/capabilities/ too.

PowerPC and Tile support for seccomp filter

Michael Ellerman added support for seccomp to PowerPC, and Chris Metcalf added support to Tile. As the seccomp maintainer, I get excited when an architecture adds support, so here we are with two. Also included were updates to the seccomp self-tests (in tools/testing/selftests/seccomp), to help make sure everything continues working correctly.

That’s it for v4.3. If I missed stuff you found interesting, please let me know! I’m going to try to get more per-version posts out in time to catch up to v4.8, which appears to be tentatively scheduled for release this coming weekend.

© 2016, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

Categories: FLOSS Project Planets

Reproducible builds folks: Reproducible Builds: week 74 in Stretch cycle

Planet Debian - Mon, 2016-09-26 17:25

Here is what happened in the Reproducible Builds effort between Sunday September 18 and Saturday September 24 2016:


We intend to participate in Outreachy Round 13 and look forward to enthusiastic new applicants who want to contribute to reproducible builds. We're offering four different areas to work on:

  • Improving test and debugging tools.
  • Improving the reproducibility of Debian packages.
  • Improving Debian infrastructure.
  • Helping collaboration across distributions.
Reproducible Builds World summit #2

We are planning a similar event to our Athens 2015 summit and expect to reveal more information soon. If you haven't been contacted yet but would like to attend, please contact holger.

Toolchain development and fixes

Mattia uploaded dpkg/ to our experimental repository and covered the details of the upload in a mailing list post.

The most important change is the incorporation of improvements made by Guillem Jover (dpkg maintainer) to the .buildinfo generator. This is also in the hope that it will speed up the upstream merge.

One of the other relevant changes from before is that .buildinfo files generated from binary-only builds will no longer include the hash of the .dsc file in Checksums-Sha256 as documented in the specification.

Even if it was considered important to include a checksum of the source package in .buildinfo, storing it that way breaks other assumptions (eg. that Checksums-Sha256 contains only files that are part of a single upload, whereas the .dsc might not be part of that upload), so we look forward to another solution for storing the source checksum in .buildinfo.

Bugs filed

Reviews of unreproducible packages

250 package reviews have been added, 4 have been updated and 4 have been removed in this week, adding to our knowledge about identified issues.

4 issue types have been added:

3 issue types have been updated:

Weekly QA work

FTBFS bugs have been reported by:

  • Chris Lamb (11)
  • Santiago Vila (2)
Documentation updates

h01ger created a new Jenkins job so that every commit pushed to the master branch for the website will update reproducible-builds.org.

diffoscope development

strip-nondeterminism development

reprotest development

tests.reproducible-builds.org
  • The full rebuild of all packages in unstable (for all tested archs) with the new build path variation has been completed. As a result, we are now down to ~75% reproducible packages in unstable, whereas testing (where we don't vary the build path) is still at ~90%. IRC notifications for unstable have been enabled again. (Holger)
  • Made the notes job robust against bad data (see #833695 and #833738). (Holger)
  • Set up profitbricks-build7 running stretch, as testing reproducible builds of F-Droid needs a newer version of vagrant in order to run vagrant VMs with kvm on kvm. (Holger)
  • The misbehaving 'opi2a' armhf node has been replaced with a Jetson-TK1 board kindly donated by NVidia, built around a quad-core NVIDIA Tegra K1 (Cortex-A15). (vagrant and Holger)

This week's edition was written by Chris Lamb, Holger Levsen and Mattia Rizzolo and reviewed by a bunch of Reproducible Builds folks on IRC.

Categories: FLOSS Project Planets

Emoji restyling

Planet KDE - Mon, 2016-09-26 17:02

(really short post)
I've started restyling the emoji and am trying to finish them.

Hope you like it.

Categories: FLOSS Project Planets