KnackForge: How to update Drupal 8 core?

Planet Drupal - Sat, 2018-03-24 01:01

Let's see how to update your Drupal site between 8.x.x minor and patch versions. For example, from 8.1.2 to 8.1.3, or from 8.3.5 to 8.4.0. I hope this will help you.

  • If you are upgrading to Drupal version x.y.z

           x -> is known as the major version number

           y -> is known as the minor version number

           z -> is known as the patch version number.
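As a sketch, the three components can be read off a version string like this (`split_version` is an illustrative helper, not part of Drupal):

```python
def split_version(version):
    """Split a Drupal-style x.y.z version string into (major, minor, patch)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch

split_version("8.1.3")  # updating 8.1.2 -> 8.1.3 changes only the patch number
split_version("8.4.0")  # 8.3.5 -> 8.4.0 changes the minor version
```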

Categories: FLOSS Project Planets

A small Update

Planet KDE - Fri, 2017-07-21 17:00

I planned on writing about the Present extension this week, but I’ll postpone this since I’m currently deeply absorbed in finding the last rough edges of a first patch I can show off. I then hope to get some feedback on it from other developers on the xorg-devel mailing list.

Another reason is that I have stalled my work on the Present extension for now to first get my Xwayland code working. My mentor Daniel recommended this, since the approach I pursued in my work on Present might be more difficult than I first assessed. At least, according to Daniel, it is similar to what other, far more experienced developers tried in the past and weren’t able to do. My idea was to make Present flip per CRTC only, but this would clash with Pixmaps being linked to the whole screen only. There are no Pixmaps only for CRTCs in X.

On the other hand, when accepting the restriction of only being able to flip one window at a time, my code already works quite well. The flipping is smooth and, at least in a short test, also improved the frame rate. But the main problem I had, and to some degree still have, is that stopping the flipping can fail. The reason seems to be that the Present extension always sets the Screen Pixmap on flips. But when I test my work with KWin, it drives Xwayland in rootless mode, i.e. without a Screen Pixmap and only the Window Pixmaps. I’m currently looking into how to circumvent this in Xwayland. I think it’s possible, but I need to look very carefully at how to change the process in order not to forget necessary cleanups on the flipped Pixmaps. I hope, though, that I’m able to solve these issues this weekend and then get some feedback on the xorg-devel mailing list.

As always you can find my latest work on my working branch on GitHub.

Categories: FLOSS Project Planets

Drupal Commerce: See what’s new in Drupal Commerce 2.0-rc1

Planet Drupal - Fri, 2017-07-21 16:34

Eight months ago we launched the first beta version of Commerce 2.x for Drupal 8. Since then we’ve made 304 code commits by 58 contributors, and we've seen dozens of attractive, high-performing sites go live. We entered the release candidate phase this month with the packaging of Commerce 2.0-rc1 (release notes), the final part of our long and fruitful journey to a full 2.0.

Introducing a new Promotions UI:

Some of the most exciting updates this summer center around our promotions system. This work represents a huge leap forward from Commerce 1.x, as we've made promotions first-class citizens in core. They power a variety of discount types and coupons, and now that they are in core we can ensure the systems are designed to look and work well on both the front end and back end.

Read on to learn more about what's new in promotions, payment, taxes, and more...

Categories: FLOSS Project Planets

Sandipan Dey: SIR Epidemic model for influenza A (H1N1): Modeling the outbreak of the pandemic in Kolkata, West Bengal, India in 2010 (Simulation in Python & R)

Planet Python - Fri, 2017-07-21 13:47
This appeared as a project in the edX course DelftX: MathMod1x Mathematical Modelling Basics, and the project report can be found here. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Summary: this report simulates the spread of the pandemic influenza A (H1N1) that had an outbreak in Kolkata, West Bengal, India in 2010.
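The core of such a simulation is integrating the SIR compartment equations. The following is a minimal sketch using forward-Euler steps; the transmission rate, recovery rate, and population below are illustrative values, not the report's fitted parameters:

```python
def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Integrate dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I,
    dR/dt = gamma*I with Euler steps; return the final (S, I, R)."""
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    for _ in range(int(days / dt)):
        new_infections = beta * s * i / n * dt  # S -> I flow in this step
        new_recoveries = gamma * i * dt         # I -> R flow in this step
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
    return s, i, r

# One infected person in a population of 10000, illustrative beta/gamma
s, i, r = simulate_sir(s0=9999, i0=1, r0=0, beta=0.3, gamma=0.1, days=200)
```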
Categories: FLOSS Project Planets

Hook 42: Hook 42 goes to Washington!

Planet Drupal - Fri, 2017-07-21 12:38

Hook 42 is expanding our enterprise Drupal services to the public sector. It’s only logical that our next trek is to Drupal GovCon!

We are bringing some of our colorful San Francisco Bay Area love to DC. We will be sharing our knowledge about planning and managing migrations, as well as core site building layout technologies. The most exciting part of the conference will be meeting up with our east coast Drupal community and government friends in person.

Categories: FLOSS Project Planets

Assume Good Faith

Open Source Initiative - Fri, 2017-07-21 10:03

You feel slighted by a comment on a mailing list, or a forum post has failed to be moderated live. How should you react?

A recent exchange on a user forum caught my eye, one that’s typical of many user interactions with open source communities. Someone with a technical question had apparently received the answer they needed and, to help others in the same situation, had posted a summary of the resolution, complete with sample code. When they came back later, the summary was gone.

I’ve no idea why this happened. It may have been a system issue, or an administrative error, or the user himself may have accidentally deleted it without realising. It’s even remotely possible an intentionally malicious act took place. Without more information there is no way to know. For the self-aware mind, responding to this situation is a matter of choice.

So how did the user in question respond? Well, he decided the only possible explanation was malicious deletion. He posted an angry demand that his account be deleted and assumed more malice when this was “ignored” (after 3 days, including a weekend, having posted the demand at the end of a comment in a user forum…).

No matter how you look at it, I don’t think that was a very smart choice. Given he was posting in a busy user forum managed by volunteers, and that this was his first post, the chance any action would be intentionally directed at him is vanishingly small. He would have been far smarter to put his ego on hold and take a lesson from the driving principle of Wikipedia: “Assume Good Faith”.

That’s great advice in all communities of volunteer strangers. The things that happen usually have a good explanation, and the motivations of volunteers are almost always to make things better. So when a thing happens, or something is said, that you don’t understand or can’t explain, the best assumption to make is that it has happened “in good faith”. More significantly, when you think you do understand the motivation, override your instinct and choose to respond as if it was good faith!

How to assume good faith

The chain of assumptions to make if you assume good faith might go like this:

  • The thing was probably your mistake.
  • If it wasn’t, then it was an unintentional defect.
  • If it wasn’t, then it was the act of someone with more information than you acting correctly.
  • If it wasn’t, then the person was acting in the belief they were correct but with imperfect information.
  • If not, then they were inexperienced.
  • But whatever the explanation, in the unlikely event you ever find out, don’t assume it was an act of intentional malice!

Maybe you can’t let any of the assumptions in that chain go. Maybe the person really is an idiot; it happens. All the same, an angry response is still not helpful, either to you or to the Community. Open source communities only thrive when a critical mass of participants choose to assume good faith in every interaction. The assumption is very occasionally misplaced, but even when that’s so it’s almost always better to respond by assuming good faith anyway.

That doesn’t mean it’s wrong to apply corrections. Good intentions do not guarantee good actions. But correcting a thing done in good faith has a distinct character of good faith itself. The original choice to act is welcomed and valued. The explanation of the flaw in the act is good-natured, clear, never patronising. The correction is generous. The whole thing is warm and seeks to build the confidence of the contributor.

It’s a lesson the detail-oriented among us need to remember (I include myself in that). The overwhelming majority of community actions are intended well. Treating them as such — even when they are wrong — will grow individuals and the community with them.

This article originally appeared on "Meshed Insights" and was made possible by Patreon patrons.
Image credit: CC0 / Public Domain via Max Pixel

Categories: FLOSS Research

Anubavam Blog: Dependency injection and Service Containers

Planet Drupal - Fri, 2017-07-21 07:43

A dependency is an object that can be used (a service). An injection is the passing of a dependency to a dependent object (a client) that would use it. The service is made part of the client's state. Passing the service to the client, rather than allowing the client to build or find the service, is the fundamental requirement of the pattern. Dependency injection is an advanced software design pattern, and applying it will increase flexibility. Once you wrap your head around this pattern, you will be unstoppable.

A practical example of accessing services in objects using dependency injection

For the following example, let's assume we are creating an application that needs service A, where A depends on B and B in turn depends on C; the container pulls in those dependencies for us and injects whichever services we require.

  • Application needs A so:
  • Application gets A from the Container, so:
  • Container creates C
  • Container creates B and gives it C
  • Container creates A and gives it B
  • Application calls A
  • A calls B
  • B does something
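The resolution chain above can be sketched in a few lines. This is a hand-rolled illustration in Python (the `Container` class and the names A, B, C mirror the bullet list; none of this is a real framework's API):

```python
class C:
    def do_something(self):
        return "C did something"

class B:
    def __init__(self, c):
        self.c = c
    def do_something(self):
        return self.c.do_something()

class A:
    def __init__(self, b):
        self.b = b
    def run(self):
        return self.b.do_something()

class Container:
    """Builds each service, wiring in its dependencies first."""
    def get_a(self):
        c = C()          # Container creates C
        b = B(c)         # Container creates B and gives it C
        return A(b)      # Container creates A and gives it B

app_service = Container().get_a()  # Application gets A from the Container
app_service.run()                  # Application calls A; A calls B; B does something
```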

Types of Dependency Injection

There are different types of Dependency Injection:

  • Constructor injection
  • Method injection
  • Setter and property injection
  • PHP callable injection

Constructor Injection

The DI container supports constructor injection with the help of type hints (with type hinting we can specify the expected data type) on constructor parameters. The type hints tell the container which classes or interfaces the new object depends on when it is asked to create that object. The container will try to get instances of the dependent classes or interfaces and then inject them into the new object through the constructor.
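A minimal sketch of this idea in Python, using `inspect` to read the constructor's type hints (the container logic and all class names here are hypothetical, not any framework's actual API):

```python
import inspect

class Mailer:
    def send(self, to, body):
        return f"sent {body!r} to {to}"

class NewsletterService:
    def __init__(self, mailer: Mailer):  # the type hint tells the container what to inject
        self.mailer = mailer

def build(cls):
    """Naive container: instantiate each type-hinted constructor dependency."""
    params = inspect.signature(cls.__init__).parameters
    deps = [p.annotation() for name, p in params.items()
            if name != "self" and p.annotation is not inspect.Parameter.empty]
    return cls(*deps)

service = build(NewsletterService)
service.mailer.send("a@b.c", "hi")  # "sent 'hi' to a@b.c"
```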

Method Injection 

In constructor injection we saw that the dependent class uses the same concrete class for its entire lifetime. If instead we need to pass a separate concrete class on each invocation of a method, we have to pass the dependency into the method itself.
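Sketched in Python, with hypothetical names: the dependency is passed per call, so each invocation can use a different concrete class.

```python
import json

class CsvFormatter:
    def render(self, rows):
        return "\n".join(",".join(map(str, row)) for row in rows)

class JsonFormatter:
    def render(self, rows):
        return json.dumps(rows)

class Report:
    def export(self, rows, formatter):
        # the formatter is injected into the method, not the constructor
        return formatter.render(rows)

report = Report()
report.export([[1, 2]], CsvFormatter())   # "1,2"
report.export([[1, 2]], JsonFormatter())  # "[[1, 2]]"
```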

Setter & Property Injection

So far we have discussed two scenarios: with constructor injection the dependent class uses one concrete class for its entire lifetime, while method injection lets us pass the concrete object into the action method itself. But what if the responsibility for selecting the concrete class and the invocation of the method lie in separate places? In such cases we need property injection.
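A Python sketch of that split, with hypothetical names: the concrete class is chosen in one place, and the method that uses it runs in another.

```python
class SmtpMailer:
    def send(self, to):
        return f"smtp mail to {to}"

class OrderProcessor:
    def __init__(self):
        self.mailer = None       # dependency not known at construction time

    def process(self, order_id):
        # by the time this runs, some other code has set self.mailer
        return self.mailer.send(f"customer-{order_id}")

processor = OrderProcessor()
processor.mailer = SmtpMailer()  # property injection, done at a separate place
processor.process(42)            # "smtp mail to customer-42"
```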

PHP Callable Injection

The container will use a registered PHP callable to build new instances of a class. Each time yii\di\Container::get() is called, the corresponding callable is invoked. The callable is responsible for resolving the dependencies and injecting them appropriately into the newly created objects.
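The same idea in a Python sketch, loosely mirroring the yii\di\Container behaviour described above: the container stores one factory callable per name and invokes it on every get(). The `Container` class here is hypothetical, not Yii's actual API.

```python
class Container:
    def __init__(self):
        self._factories = {}

    def register(self, name, factory):
        self._factories[name] = factory

    def get(self, name):
        # the callable resolves its own dependencies via the container
        return self._factories[name](self)

class Connection:
    def __init__(self, dsn):
        self.dsn = dsn

container = Container()
container.register("db", lambda c: Connection(dsn="sqlite://demo"))
db = container.get("db")  # the registered callable is invoked on each get()
```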

Dependency Injection: Advantages & Disadvantages

Advantages:

  • Reduces the dependency of objects on each other in the application.
  • Makes unit testing easier.
  • Promotes loose coupling.
  • Promotes re-usability of code or objects in different applications.
  • Promotes logical abstraction of components.

Disadvantages:

  • DI increases complexity, usually by increasing the number of classes since responsibilities are separated more, which is not always beneficial.
  • Code becomes coupled to the dependency injection framework.
  • It takes time to learn.
  • If misunderstood, it can lead to more harm than good.

Dependency injection is a very simple concept: it decouples your code and makes it easier to read. By injecting dependencies into objects we can isolate their purpose and easily swap them with others.

The service container is basically there to manage some classes. It keeps track of what a certain service needs before it gets instantiated, instantiates it for you, and all you have to do is ask the container for that service. Using it the right way will save time and frustration, and Drupal developers will even make it easier for the layman.


admin — Fri, 07/21/2017 - 07:43 — Drupal developer, Drupal Application Development
Categories: FLOSS Project Planets

Michal Čihař: Making Weblate more secure and robust

Planet Debian - Fri, 2017-07-21 06:00

Having a publicly running web application always brings challenges in terms of security and, more generally, in handling untrusted data. Security-wise Weblate has always been quite good (mostly thanks to using Django, which comes with built-in protection against many vulnerabilities), but there were always things to improve in input validation or possible information leaks.

When Weblate joined HackerOne (see our first month experience with it), I was hoping to get some security-driven code review, but apparently most people there are focused on black box testing. I can certainly understand that - it's easier to conduct and you need much less knowledge of the tested website to perform it.

One big area where reports against Weblate came in was authentication. Originally we were mostly relying on the default authentication pipeline coming with Python Social Auth, but that showed some possible security implications, and we ended up heavily customizing the authentication pipeline to avoid several risks. Some patches were submitted back and some issues reported, but we've still diverged quite a lot in this area.

The second area where scanning was apparently performed, but almost no reports came in, was input validation. Thanks to the excellent XSS protection in Django nothing was really found. On the other hand, this triggered several internal server errors on our side. At this point I was really happy to have Rollbar configured to track all errors happening in production. Thanks to having all such errors properly recorded and grouped, it was really easy to go through them and fix them in our codebase.

Most of the related fixes have landed in Weblate 2.14 and 2.15, but obviously this is ongoing effort to make Weblate better with every release.

Filed under: Debian English SUSE Weblate

Categories: FLOSS Project Planets

CiviCRM Blog: CiviProxy and CiviMcRestFace sprint in Bonn

Planet Drupal - Fri, 2017-07-21 04:40

CiviCooP, Systopia and Palasthotel have been working together on CiviProxy and CiviMcRestFace. This blog is a round-up of what we have achieved in the last couple of days. The first thing we achieved is that we had fun and a very good work atmosphere. We put in long days and made lots of progress.

What are CiviProxy and CiviMcRestFace?

CiviProxy is a script that acts as an application firewall for CiviCRM. It can be used to put your CiviCRM in a secure network. CiviProxy is the gatekeeper to which external systems, such as your website, connect (for example, when a user signs a petition on your website and the website submits this data to your CiviCRM). CiviProxy will make sure the call comes from the right place (IP address) and only does what it is allowed to do.

CiviMcRestFace (CiviMRF) is a framework to be used in other systems (such as your external website) to connect to CiviCRM. The framework itself is divided into three parts: the abstract core (CMS/system independent), the core implementation (e.g. a Drupal 7 implementation), and lastly the module that does the actual submission (for example the cmrf_webform module, which provides the functionality to submit a webform to CiviCRM).

What we have achieved:

  • Completed the documentation on CiviProxy: https://docs.civicrm.org/civiproxy/en/latest
  • Got a working Drupal 7 module with CiviMcRestFace:
    • Completed screens for setting up connection profiles (you can also provide the connection credentials through your own module with an API, so that you can store them somewhere outside the database)
    • Completed the screen for the call log (a call is a submission to CiviCRM through CiviMcRestFace)
    • Added functionality to queue calls and run them in the background, and added functionality to retry failed calls
    • Added a basic webform integration module to submit a webform to the CiviCRM API
    • Added a rules integration module so that you can perform additional actions when a call succeeds or fails. A likely use case: when a call fails, you want to send the data by e-mail to the CiviCRM administrator so that he or she can enter the data manually.
    • Added an example module so you can see how you could use the cmrf_core module in your Drupal projects
    • Code: https://github.com/CiviMRF/cmrf_core/tree/7.x-dev
  • Got a start on the Drupal 8 module for CiviMcRestFace: https://github.com/CiviMRF/cmrf_core/tree/8.x-dev


Tools, API, Architecture, CiviCRM, Community, Documentation, Drupal, Drupal 8, Sprints
Categories: FLOSS Project Planets

The Digital Cat: Refactoring with tests in Python: a practical example

Planet Python - Fri, 2017-07-21 04:30

This post contains a step-by-step example of a refactoring session guided by tests. When dealing with untested or legacy code, refactoring is dangerous, and tests can help us do it the right way, minimizing the amount of bugs we introduce, and possibly completely avoiding them.

Refactoring is not easy. It requires a double effort to understand code that others wrote, or that we wrote in the past, and moving around parts of it, simplifying it, in one word improving it, is by no means something for the faint-hearted. Like programming, refactoring has its rules and best practices, but it can be described as a mixture of technique, intuition, experience, and risk.

Programming, after all, is craftsmanship.

The starting point

The simple use case I will use for this post is that of a service API that we can access, and that produces data in JSON format, namely a list of elements like the one shown here

{
    "age": 20,
    "surname": "Frazier",
    "name": "John",
    "salary": "£28943"
}

Once we convert this to a Python data structure we obtain a list of dictionaries, where 'age' is an integer, and the remaining fields are strings.
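This conversion step can be sketched with json.loads (the payload below is a one-element sample in the shape the post describes):

```python
import json

# The service returns JSON text; json.loads turns it into the list of
# dictionaries the class works on: 'age' is an int, the rest are strings.
payload = '[{"age": 20, "surname": "Frazier", "name": "John", "salary": "£28943"}]'
data = json.loads(payload)
```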

Someone then wrote a class that computes some statistics on the input data. This class, called DataStats, provides a single method stats(), whose inputs are the data returned by the service (in JSON format), and two integers called iage and isalary. Those, according to the short documentation of the class, are the initial age and the initial salary used to compute the average yearly increase of the salary on the whole dataset.

The code is the following

import math
import json


class DataStats:

    def stats(self, data, iage, isalary):
        # iage and isalary are the starting age and salary used to
        # compute the average yearly increase of salary.

        # Compute average yearly increase
        average_age_increase = math.floor(
            sum([e['age'] for e in data])/len(data)) - iage
        average_salary_increase = math.floor(
            sum([int(e['salary'][1:]) for e in data])/len(data)) - isalary

        yearly_avg_increase = math.floor(
            average_salary_increase/average_age_increase)

        # Compute max salary
        salaries = [int(e['salary'][1:]) for e in data]
        threshold = '£' + str(max(salaries))
        max_salary = [e for e in data if e['salary'] == threshold]

        # Compute min salary
        salaries = [int(d['salary'][1:]) for d in data]
        min_salary = [e for e in data
                      if e['salary'] == '£{}'.format(str(min(salaries)))]

        return json.dumps({
            'avg_age': math.floor(sum([e['age'] for e in data])/len(data)),
            'avg_salary': math.floor(sum(
                [int(e['salary'][1:]) for e in data])/len(data)),
            'avg_yearly_increase': yearly_avg_increase,
            'max_salary': max_salary,
            'min_salary': min_salary
        })

The goal

It is fairly easy, even for the untrained eye, to spot some issues in the previous class. A list of the most striking ones is

  • The class exposes a single method and has no __init__(), thus the same functionality could be provided by a single function.
  • The stats() method is too big, and performs too many tasks. This makes debugging very difficult, as there is a single inextricable piece of code that does everything.
  • There is a lot of code duplication, or at least several lines that are very similar. Most notably the two operations '£' + str(max(salaries)) and '£{}'.format(str(min(salaries))), the two different lines starting with salaries =, and the several list comprehensions.

So, since we are going to use this code in some part of our Amazing New Project™, we want to possibly fix these issues.

The class, however, is working perfectly. It has been used in production for many years and there are no known bugs, so our operation has to be a refactoring, which means that we want to write something better, preserving the behaviour of the previous object.

The path

In this post I want to show you how you can safely refactor such a class using tests. This is different from TDD, but the two are closely related. The class we have has not been created using TDD, as there are no tests, but we can use tests to ensure its behaviour is preserved. This should therefore be called Test Driven Refactoring (TDR).

The idea behind TDR is pretty simple. First, we have to write a test that checks the behaviour of some code, possibly a small part with a clearly defined scope and output. This is a posthumous (or late) unit test, and it simulates what the author of the code should have provided (cough cough, it was you some months ago...).

Once you have your unit test you can go and modify the code, knowing that the behaviour of the resulting object will be the same as that of the previous one. As you can easily understand, the effectiveness of this methodology depends strongly on the quality of the tests themselves, possibly more than when developing with TDD, and this is why refactoring is hard.


Two remarks before we start our first refactoring. The first is that such a class could easily be refactored into some functional code. As you will be able to infer from the final result, there is no real reason to keep an object-oriented approach for this code. I decided to go that way, however, as it gave me the chance to show a design pattern called wrapper, and the refactoring technique that leverages it.

The second remark is that in pure TDD it is strongly advised not to test internal methods, that is those methods that do not form the public API of the object. In general, we identify such methods in Python by prefixing their name with an underscore, and the reason not to test them is that TDD wants you to shape objects according to the object-oriented programming methodology, which considers objects as behaviours and not as structures. Thus, we are only interested in testing public methods.

It is also true, however, that sometimes even though we do not want to make a method public, that method contains some complex logic that we want to test. So, in my opinion the TDD advice should sound like "Test internal methods only when they contain some non-trivial logic".

When it comes to refactoring, however, we are somehow deconstructing a previously existing structure, and usually we end up creating a lot of private methods to help extracting and generalising parts of the code. My advice in this case is to test those methods, as this gives you a higher degree of confidence in what you are doing. With experience you will then learn which tests are required and which are not.

Setup of the testing environment

Clone this repository and create a virtual environment. Activate it and install the required packages with

pip install -r requirements.txt

The repository already contains a configuration file for pytest and you should customise it to avoid entering your virtual environment directory. Go and fix the norecursedirs parameter in that file, adding the name of the virtual environment you just created; I usually name my virtual environments with a venv prefix, and this is why that variable contains the entry venv*.

At this point you should be able to run pytest -svv in the parent directory of the repository (the one that contains pytest.ini), and obtain a result similar to the following

========================== test session starts ==========================
platform linux -- Python 3.5.3, pytest-3.1.2, py-1.4.34, pluggy-0.4.0
cachedir: .cache
rootdir: datastats, inifile: pytest.ini
plugins: cov-2.5.1
collected 0 items

====================== no tests ran in 0.00 seconds ======================

The given repository contains two branches: master, which is the one you are on and contains the initial setup, and develop, which points to the last step of the whole refactoring process. Every step of this post contains a reference to the commit that contains the changes introduced in that section.

Step 1 - Testing the endpoints

Commit: 27a1d8c

When you start refactoring a system, regardless of the size, you have to test the endpoints. This means that you consider the system as a black box (i.e. you do not know what is inside) and just check the external behaviour. In this case we can write a test that initialises the class and runs the stats() method with some test data, possibly real data, and checks the output. Obviously we will write the test with the actual output returned by the method, so this test is automatically passing.

Querying the server we get the following data

test_data = [ { "id": 1, "name": "Laith", "surname": "Simmons", "age": 68, "salary": "£27888" }, { "id": 2, "name": "Mikayla", "surname": "Henry", "age": 49, "salary": "£67137" }, { "id": 3, "name": "Garth", "surname": "Fields", "age": 70, "salary": "£70472" } ]

and calling the stats() method with that output, with iage set to 20, and isalary set to 20000, we get the following JSON result

{
    'avg_age': 62,
    'avg_salary': 55165,
    'avg_yearly_increase': 837,
    'max_salary': [{
        "id": 3,
        "name": "Garth",
        "surname": "Fields",
        "age": 70,
        "salary": "£70472"
    }],
    'min_salary': [{
        "id": 1,
        "name": "Laith",
        "surname": "Simmons",
        "age": 68,
        "salary": "£27888"
    }]
}

Caveat: I'm using a single very short set of real data, namely a list of 3 dictionaries. In a real case I would test the black box with many different use cases, to ensure I am not just checking some corner case.

The test is the following

import json

from datastats.datastats import DataStats


def test_json():
    test_data = [
        {
            "id": 1,
            "name": "Laith",
            "surname": "Simmons",
            "age": 68,
            "salary": "£27888"
        },
        {
            "id": 2,
            "name": "Mikayla",
            "surname": "Henry",
            "age": 49,
            "salary": "£67137"
        },
        {
            "id": 3,
            "name": "Garth",
            "surname": "Fields",
            "age": 70,
            "salary": "£70472"
        }
    ]

    ds = DataStats()

    assert ds.stats(test_data, 20, 20000) == json.dumps(
        {
            'avg_age': 62,
            'avg_salary': 55165,
            'avg_yearly_increase': 837,
            'max_salary': [{
                "id": 3,
                "name": "Garth",
                "surname": "Fields",
                "age": 70,
                "salary": "£70472"
            }],
            'min_salary': [{
                "id": 1,
                "name": "Laith",
                "surname": "Simmons",
                "age": 68,
                "salary": "£27888"
            }]
        }
    )

As said before, this test is obviously passing, having been artificially constructed from a real execution of the code.

Well, this test is very important! Now we know that if we change something inside the code, altering the behaviour of the class, at least one test will fail.

Step 2 - Getting rid of the JSON format

Commit: 65e2997

The method returns its output in JSON format, and looking at the class it is pretty evident that the conversion is done by json.dumps().

The structure of the code is the following

class DataStats:
    def stats(self, data, iage, isalary):
        [code_part_1]

        return json.dumps({
            [code_part_2]
        })

Where obviously code_part_2 depends on code_part_1. The first refactoring, then, will follow this procedure

  1. We write a test called test__stats() for a _stats() method that is supposed to return the data as a Python structure. We can infer this latter manually from the JSON or running json.loads() from a Python shell. The test fails.
  2. We duplicate the code of the stats() method that produces the data, putting it in the new _stats() method. The test passes.
class DataStats:
    def _stats(parameters):
        [code_part_1]

        return [code_part_2]

    def stats(self, data, iage, isalary):
        [code_part_1]

        return json.dumps({
            [code_part_2]
        })
  3. We remove the duplicated code in stats(), replacing it with a call to _stats()
class DataStats:
    def _stats(parameters):
        [code_part_1]

        return [code_part_2]

    def stats(self, data, iage, isalary):
        return json.dumps(
            self._stats(data, iage, isalary)
        )

At this point we could refactor the initial test test_json() that we wrote, but this is an advanced consideration, and I'll leave it for some later notes.

So now the code of our class looks like this

class DataStats:

    def _stats(self, data, iage, isalary):
        # iage and isalary are the starting age and salary used to
        # compute the average yearly increase of salary.

        # Compute average yearly increase
        average_age_increase = math.floor(
            sum([e['age'] for e in data])/len(data)) - iage
        average_salary_increase = math.floor(
            sum([int(e['salary'][1:]) for e in data])/len(data)) - isalary

        yearly_avg_increase = math.floor(
            average_salary_increase/average_age_increase)

        # Compute max salary
        salaries = [int(e['salary'][1:]) for e in data]
        threshold = '£' + str(max(salaries))
        max_salary = [e for e in data if e['salary'] == threshold]

        # Compute min salary
        salaries = [int(d['salary'][1:]) for d in data]
        min_salary = [e for e in data
                      if e['salary'] == '£{}'.format(str(min(salaries)))]

        return {
            'avg_age': math.floor(sum([e['age'] for e in data])/len(data)),
            'avg_salary': math.floor(sum(
                [int(e['salary'][1:]) for e in data])/len(data)),
            'avg_yearly_increase': yearly_avg_increase,
            'max_salary': max_salary,
            'min_salary': min_salary
        }

    def stats(self, data, iage, isalary):
        return json.dumps(
            self._stats(data, iage, isalary)
        )

and we have two tests that check the correctness of it.

Step 3 - Refactoring the tests

Commit: d619017

It is pretty clear that the test_data list of dictionaries is bound to be used in every test we will perform, so it is high time we moved that to a global variable. There is no point now in using a fixture, as the test data is just static data.

We could also move the output data to a global variable, but the upcoming tests are not using the whole output dictionary any more, so we can postpone the decision.

The test suite now looks like

import json

from datastats.datastats import DataStats


test_data = [
    {
        "id": 1,
        "name": "Laith",
        "surname": "Simmons",
        "age": 68,
        "salary": "£27888"
    },
    {
        "id": 2,
        "name": "Mikayla",
        "surname": "Henry",
        "age": 49,
        "salary": "£67137"
    },
    {
        "id": 3,
        "name": "Garth",
        "surname": "Fields",
        "age": 70,
        "salary": "£70472"
    }
]


def test_json():
    ds = DataStats()

    assert ds.stats(test_data, 20, 20000) == json.dumps(
        {
            'avg_age': 62,
            'avg_salary': 55165,
            'avg_yearly_increase': 837,
            'max_salary': [{
                "id": 3,
                "name": "Garth",
                "surname": "Fields",
                "age": 70,
                "salary": "£70472"
            }],
            'min_salary': [{
                "id": 1,
                "name": "Laith",
                "surname": "Simmons",
                "age": 68,
                "salary": "£27888"
            }]
        }
    )


def test__stats():
    ds = DataStats()

    assert ds._stats(test_data, 20, 20000) == {
        'avg_age': 62,
        'avg_salary': 55165,
        'avg_yearly_increase': 837,
        'max_salary': [{
            "id": 3,
            "name": "Garth",
            "surname": "Fields",
            "age": 70,
            "salary": "£70472"
        }],
        'min_salary': [{
            "id": 1,
            "name": "Laith",
            "surname": "Simmons",
            "age": 68,
            "salary": "£27888"
        }]
    }

Step 4 - Isolate the average age algorithm

Commit: 9db1803

Isolating independent features is a key target of software design. Thus, our refactoring shall aim to disentangle the code dividing it into small separated functions.

The output dictionary contains five keys, and each of them corresponds to a value computed either on the fly (for avg_age and avg_salary) or by the method's code (for avg_yearly_increase, max_salary, and min_salary). We can start replacing the code that computes the value of each key with dedicated methods, trying to isolate the algorithms.

To isolate some code, the first thing to do is to duplicate it, putting it into a dedicated method. As we are refactoring with tests, the first thing is to write a test for this method.

def test__avg_age():
    ds = DataStats()

    assert ds._avg_age(test_data) == 62

We know that the method's output shall be 62 as that is the value we have in the output data of the original stats() method. Please note that there is no need to pass iage and isalary as they are not used in the refactored code.

The test fails, so we can dutifully go and duplicate the code we use to compute 'avg_age'

def _avg_age(self, data):
    return math.floor(sum([e['age'] for e in data])/len(data))

and once the test passes we can replace the duplicated code in _stats() with a call to _avg_age()

return {
    'avg_age': self._avg_age(data),
    'avg_salary': math.floor(sum(
        [int(e['salary'][1:]) for e in data])/len(data)),
    'avg_yearly_increase': yearly_avg_increase,
    'max_salary': max_salary,
    'min_salary': min_salary
}

After that, check that no test is failing. Well done! We isolated the first feature, and our refactoring has already produced three tests.

Step 5 - Isolate the average salary algorithm

Commit: 4122201

The avg_salary key works exactly like the avg_age, with different code. Thus, the refactoring process is the same as before, and the result should be a new test__avg_salary() test

def test__avg_salary():
    ds = DataStats()

    assert ds._avg_salary(test_data) == 55165

a new _avg_salary() method

def _avg_salary(self, data):
    return math.floor(sum([int(e['salary'][1:]) for e in data])/len(data))

and a new version of the final return value

return {
    'avg_age': self._avg_age(data),
    'avg_salary': self._avg_salary(data),
    'avg_yearly_increase': yearly_avg_increase,
    'max_salary': max_salary,
    'min_salary': min_salary
}

Step 6 - Isolate the average yearly increase algorithm

Commit: 4005145

The remaining three keys are computed with algorithms that, being longer than one line, couldn't be squeezed directly into the definition of the dictionary. The refactoring process, however, does not really change; as before, we first test a helper method, then we define it by duplicating the code, and finally we call the helper, removing the code duplication.

For the average yearly increase of the salary we have a new test

def test__avg_yearly_increase():
    ds = DataStats()

    assert ds._avg_yearly_increase(test_data, 20, 20000) == 837

a new method that passes the test

def _avg_yearly_increase(self, data, iage, isalary):
    # iage and isalary are the starting age and salary used to
    # compute the average yearly increase of salary.

    # Compute average yearly increase
    average_age_increase = math.floor(
        sum([e['age'] for e in data])/len(data)) - iage
    average_salary_increase = math.floor(
        sum([int(e['salary'][1:]) for e in data])/len(data)) - isalary

    return math.floor(average_salary_increase/average_age_increase)

and a new version of the _stats() method

def _stats(self, data, iage, isalary):
    # Compute max salary
    salaries = [int(e['salary'][1:]) for e in data]
    threshold = '£' + str(max(salaries))

    max_salary = [e for e in data if e['salary'] == threshold]

    # Compute min salary
    salaries = [int(d['salary'][1:]) for d in data]
    min_salary = [e for e in data
                  if e['salary'] == '£{}'.format(str(min(salaries)))]

    return {
        'avg_age': self._avg_age(data),
        'avg_salary': self._avg_salary(data),
        'avg_yearly_increase': self._avg_yearly_increase(
            data, iage, isalary),
        'max_salary': max_salary,
        'min_salary': min_salary
    }

Please note that we are not solving any code duplication except the ones we introduce while refactoring. The first achievement we should aim for is to completely isolate independent features.

Step 7 - Isolate max and min salary algorithms

Commit: 17b2413

When refactoring we shall always do one thing at a time, but for the sake of conciseness, I'll show here the result of two refactoring steps at once. I recommend that the reader perform them as independent steps, as I did when I wrote the code that I am posting below.

The new tests are

def test__max_salary():
    ds = DataStats()

    assert ds._max_salary(test_data) == [{
        "id": 3,
        "name": "Garth",
        "surname": "Fields",
        "age": 70,
        "salary": "£70472"
    }]


def test__min_salary():
    ds = DataStats()

    assert ds._min_salary(test_data) == [{
        "id": 1,
        "name": "Laith",
        "surname": "Simmons",
        "age": 68,
        "salary": "£27888"
    }]

The new methods in the DataStats class are

def _max_salary(self, data):
    # Compute max salary
    salaries = [int(e['salary'][1:]) for e in data]
    threshold = '£' + str(max(salaries))

    return [e for e in data if e['salary'] == threshold]


def _min_salary(self, data):
    # Compute min salary
    salaries = [int(d['salary'][1:]) for d in data]

    return [e for e in data
            if e['salary'] == '£{}'.format(str(min(salaries)))]

and the _stats() method is now really tiny

def _stats(self, data, iage, isalary):
    return {
        'avg_age': self._avg_age(data),
        'avg_salary': self._avg_salary(data),
        'avg_yearly_increase': self._avg_yearly_increase(
            data, iage, isalary),
        'max_salary': self._max_salary(data),
        'min_salary': self._min_salary(data)
    }

Step 8 - Reducing code duplication

Commit: b559a5c

Now that we have the main tests in place we can start changing the code of the various helper methods. These are now small enough to allow us to change the code without further tests. While this is true in this case, in general there is no definition of what "small enough" means, just as there is no real definition of what a "unit test" is. Generally speaking, you should be confident that the change you are making is covered by the tests you have. Were this not the case, you'd better add one or more tests until you feel confident enough.

The two methods _max_salary() and _min_salary() share a great deal of code, even though the second one is more concise

def _max_salary(self, data):
    # Compute max salary
    salaries = [int(e['salary'][1:]) for e in data]
    threshold = '£' + str(max(salaries))

    return [e for e in data if e['salary'] == threshold]


def _min_salary(self, data):
    # Compute min salary
    salaries = [int(d['salary'][1:]) for d in data]

    return [e for e in data
            if e['salary'] == '£{}'.format(str(min(salaries)))]

I'll start by making the threshold variable explicit in the second function. As soon as I change something, I run the tests to check that the external behaviour did not change.

def _max_salary(self, data):
    # Compute max salary
    salaries = [int(e['salary'][1:]) for e in data]
    threshold = '£' + str(max(salaries))

    return [e for e in data if e['salary'] == threshold]


def _min_salary(self, data):
    # Compute min salary
    salaries = [int(d['salary'][1:]) for d in data]
    threshold = '£{}'.format(str(min(salaries)))

    return [e for e in data if e['salary'] == threshold]

Now, it is pretty evident that the two functions are the same but for the min() and max() functions. They still use different variable names and different code to format the threshold, so my first action is to even them out, copying the code of _min_salary() to _max_salary() and changing min() to max()

def _max_salary(self, data):
    # Compute max salary
    salaries = [int(d['salary'][1:]) for d in data]
    threshold = '£{}'.format(str(max(salaries)))

    return [e for e in data if e['salary'] == threshold]


def _min_salary(self, data):
    # Compute min salary
    salaries = [int(d['salary'][1:]) for d in data]
    threshold = '£{}'.format(str(min(salaries)))

    return [e for e in data if e['salary'] == threshold]

Now I can create another helper called _select_salary() that duplicates that code and accepts a function, used instead of min() or max(). As I did before, first I duplicate the code, and then remove the duplication by calling the new function.

After a few passes, the code looks like this

def _select_salary(self, data, func):
    salaries = [int(d['salary'][1:]) for d in data]
    threshold = '£{}'.format(str(func(salaries)))

    return [e for e in data if e['salary'] == threshold]


def _max_salary(self, data):
    return self._select_salary(data, max)


def _min_salary(self, data):
    return self._select_salary(data, min)

I noticed then a code duplication between _avg_salary() and _select_salary():

def _avg_salary(self, data):
    return math.floor(sum([int(e['salary'][1:]) for e in data])/len(data))


def _select_salary(self, data, func):
    salaries = [int(d['salary'][1:]) for d in data]

and decided to extract the common algorithm in a method called _salaries(). As before, I write the test first

def test_salaries():
    ds = DataStats()

    assert ds._salaries(test_data) == [27888, 67137, 70472]

then I implement the method

def _salaries(self, data):
    return [int(d['salary'][1:]) for d in data]

and eventually I replace the duplicated code with a call to the new method

def _salaries(self, data):
    return [int(d['salary'][1:]) for d in data]


def _select_salary(self, data, func):
    threshold = '£{}'.format(str(func(self._salaries(data))))

    return [e for e in data if e['salary'] == threshold]

While doing this I noticed that _avg_yearly_increase() contains the same code, and fixed it there as well.

def _avg_yearly_increase(self, data, iage, isalary):
    # iage and isalary are the starting age and salary used to
    # compute the average yearly increase of salary.

    # Compute average yearly increase
    average_age_increase = math.floor(
        sum([e['age'] for e in data])/len(data)) - iage
    average_salary_increase = math.floor(
        sum(self._salaries(data))/len(data)) - isalary

    return math.floor(average_salary_increase/average_age_increase)

It would be useful at this point to store the input data inside the class and to use it as self.data instead of passing it around to all the class's methods. This however would break the class's API, as currently DataStats is initialised without any data. Later I will show how to introduce changes that potentially break the API, and briefly discuss the issue. For the moment, however, I'll keep changing the class without modifying the external interface.

It looks like age has the same code duplication issues as salary, so with the same procedure I introduce the _ages() method and change the _avg_age() and _avg_yearly_increase() methods accordingly.
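The post does not show this passage explicitly; a plausible sketch of the resulting helper and the updated _avg_age(), following the same pattern used for _salaries(), could be

```python
import math


# Hypothetical reconstruction (not shown in the original post): the
# helper mirrors _salaries(), and _avg_age() is rewritten to use it.
class DataStats:

    def _ages(self, data):
        return [d['age'] for d in data]

    def _avg_age(self, data):
        return math.floor(sum(self._ages(data))/len(data))
```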

Speaking of _avg_yearly_increase(), the code of that method contains the code of the _avg_age() and _avg_salary() methods, so it is worth replacing it with two calls. As I am moving code between existing methods, I do not need further tests.

def _avg_yearly_increase(self, data, iage, isalary):
    # iage and isalary are the starting age and salary used to
    # compute the average yearly increase of salary.

    # Compute average yearly increase
    average_age_increase = self._avg_age(data) - iage
    average_salary_increase = self._avg_salary(data) - isalary

    return math.floor(average_salary_increase/average_age_increase)

Step 9 - Advanced refactoring

Commit: cc0b0a1

The initial class didn't have any __init__() method, and was thus missing the encapsulation part of the object-oriented paradigm. There was no reason to keep the class, as the stats() method could have easily been extracted and provided as a plain function.

This is much more evident now that we have refactored the method, because we have 10 methods that accept data as a parameter. It would be nice to load the input data into the class at instantiation time, and then access it as self.data. This would greatly improve the readability of the class, and also justify its existence.

If we introduce an __init__() method that requires a parameter, however, we will change the class's API, breaking compatibility with every other piece of code that imports and uses it. Since we want to keep the API stable, we have to devise a way to provide both the advantages of a new, clean class and of a stable API. This is not always perfectly achievable, but in this case the Adapter design pattern (also known as Wrapper) can perfectly solve the issue.
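As a generic illustration of the pattern (the names here are mine, not from the post), an adapter preserves an old interface by wrapping the new class:

```python
class NewAPI:
    """The clean class we want: data is loaded at instantiation time."""
    def __init__(self, data):
        self.data = data

    def total(self):
        return sum(self.data)


class OldAPIAdapter:
    """The adapter: stateless like the old class, it receives the data
    on every call and delegates to the new class under the hood."""
    def total(self, data):
        return NewAPI(data).total()
```

Callers written against the old interface keep working unchanged, while all the real logic lives in the new class.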

The goal is to change the current class to match the new API, and then build a class that wraps the first one and provides the old API. The strategy is not that different from what we did previously, only this time we will deal with classes instead of methods. With a stupendous effort of my imagination I named the new class NewDataStats. Sorry, but sometimes you just have to get the job done.

The first thing, as happens very often with refactoring, is to duplicate the code, and when we insert new code we need to have tests that justify it. The tests will be the same as before, as the new class shall provide the same functionality as the previous one, so I just create a new file, called test_newdatastats.py, and start putting there the first test, test_init().

import json

from datastats.datastats import NewDataStats


test_data = [
    {
        "id": 1,
        "name": "Laith",
        "surname": "Simmons",
        "age": 68,
        "salary": "£27888"
    },
    {
        "id": 2,
        "name": "Mikayla",
        "surname": "Henry",
        "age": 49,
        "salary": "£67137"
    },
    {
        "id": 3,
        "name": "Garth",
        "surname": "Fields",
        "age": 70,
        "salary": "£70472"
    }
]


def test_init():
    ds = NewDataStats(test_data)

    assert ds.data == test_data

This test doesn't pass, and the code that implements the class is very simple

class NewDataStats:

    def __init__(self, data):
        self.data = data

Now I can start an iterative process:

  1. I will copy one of the tests of DataStats and adapt it to NewDataStats
  2. I will copy some code from DataStats to NewDataStats, adapting it to the new API and making it pass the test.

At this point iteratively removing methods from DataStats and replacing them with a call to NewDataStats would be overkill. I'll show you in the next section why, and what we can do to avoid that.

An example of the resulting tests for NewDataStats is the following

def test_ages():
    ds = NewDataStats(test_data)

    assert ds._ages() == [68, 49, 70]

and the code that passes the test is

def _ages(self):
    return [d['age'] for d in self.data]

Once finished, I noticed that, since methods like _ages() no longer require an input parameter, I can convert them to properties, changing the tests accordingly.

@property
def _ages(self):
    return [d['age'] for d in self.data]
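The post mentions changing the tests accordingly but does not show them; presumably the test simply drops the call parentheses, something like this sketch (the class is repeated here only to make the example self-contained):

```python
class NewDataStats:
    def __init__(self, data):
        self.data = data

    @property
    def _ages(self):
        return [d['age'] for d in self.data]


def test_ages():
    ds = NewDataStats([{'age': 68}, {'age': 49}, {'age': 70}])

    # _ages is now a property: accessed as an attribute, not called.
    assert ds._ages == [68, 49, 70]
```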

It is time to replace the methods of DataStats with calls to NewDataStats. We could do it method by method, but actually the only thing that we really need to replace is stats(). So the new code is

def stats(self, data, iage, isalary):
    nds = NewDataStats(data)
    return nds.stats(iage, isalary)

And since all the other methods are no longer used, we can safely delete them, checking that the tests do not fail. Speaking of tests, removing methods will make many tests of DataStats fail, so we need to remove them.

class DataStats:

    def stats(self, data, iage, isalary):
        nds = NewDataStats(data)
        return nds.stats(iage, isalary)

Final words

I hope this little tour of a refactoring session wasn't too trivial, and that it helped you grasp the basic concepts of this technique. If you are interested in the subject I'd strongly recommend the classic book by Martin Fowler, "Refactoring: Improving the Design of Existing Code", which is a collection of refactoring patterns. The reference language is Java, but the concepts are easily adapted to Python.


Feel free to use the blog Google+ page to comment on the post. Feel free to reach me on Twitter if you have questions. The GitHub issues page is the best place to submit corrections.

Categories: FLOSS Project Planets

AtCore officially moved to KDE Extragear

Planet KDE - Thu, 2017-07-20 21:35

It's with all the joy in my heart that I share with you this amazing news: AtCore was officially moved today to KDE Extragear by my favorite sysadmin Ben Cooksley, after more than a month on KDE Review. This is the first huge milestone that we have achieved in these 11 months of team work made [...]

Categories: FLOSS Project Planets

Justin Mason: Links for 2017-07-20

Planet Apache - Thu, 2017-07-20 19:58
Categories: FLOSS Project Planets

Ryan Szrama: Connecting with the Community at Drupal Camp Asheville 2017

Planet Drupal - Thu, 2017-07-20 18:30

Last weekend I had the pleasure of attending Drupal Camp Asheville 2017 ('twas my fourth year in a row : ). I absolutely love this event and encourage you to consider putting it on your list of Drupal events to hit next year. The Asheville area is a beautiful (and delicious) place to spend the weekend, but the bigger draw for me is the people involved:

(Check out all the pictures in the album on Flickr.)

Drupal Camp Asheville is always well organized (seriously, it's in the best of the best small conferences for venue, amenities, and content) and attended by a solid blend of seasoned Drupal users / contributors and newcomers. I live only an hour away, so I get to interact with my Drupal friends from Blue Oak Interactive, New Valley Media, and Kanopi on occasion, but then on Camp weekend I also get to see a regular mix of folks from Mediacurrent, Code Journeymen, Lullabot, Palantir, FFW, CivicActions, end users like NOAA, and more.

This year we got to hear from Adam Bergstein as the keynote speaker. Unfortunately, that "we" didn't include me at first, as I managed to roll up right after Adam spoke ... but his keynote is on YouTube already thanks to Kevin Thull! I encourage you to give it a listen to hear how Adam's experience learning to work against his own "winning strategy" as a developer (that of a honey badger ; ) helped him gain empathy for his fellow team members and find purpose in collaborative problem solving to make the world a better place.

I gave a presentation of Drupal Commerce 2.x focusing on how we've improved the out of the box experience since Commerce 1.x. This was fun to deliver, because we really have added quite a bit more functionality along with a better customer experience in the core of Commerce 2.x itself. These improvements continued all the way up to our first release candidate tagged earlier this month, which included new promotions, coupons, and payment capabilities.

Many folks were surprised by how far along Commerce 2.x is, but now that Bojan has decompressed from the RC1 sprint, I expect we'll start to share more about the new goodies on the Drupal Commerce blog. (If you're so inclined, you can subscribe to our newsletter to get bi-weekly news / updates as well.)

Lastly, I loved just hanging out and catching up with friends at the venue and at the afterparty. I played several rounds of a very fun competitive card game in development by Ken Rickard (follow him to find out when his Kickstarter launches!). I also enjoyed several rounds of pool with other Drupallers in the evening and closed out the night with cocktails at Imperial Life, one of my favorite cocktail bars in Asheville. I treasure these kinds of social interactions with people I otherwise only see as usernames and Twitter handles online.

Can't wait to do it again next year!

Categories: FLOSS Project Planets

Reuven Lerner: Globbing and Python’s “subprocess” module

Planet Python - Thu, 2017-07-20 15:12

Python’s “subprocess” module makes it really easy to invoke an external program and grab its output. For example, you can say

import subprocess

print(subprocess.check_output('ls'))

and the output is then

$ ./blog.py
b'blog.py\nblog.py~\ndictslice.py\ndictslice.py~\nhexnums.txt\nnums.txt\npeanut-butter.jpg\nregexp\nshowfile.py\nsieve.py\ntest.py\ntestintern.py\n'

subprocess.check_output returns a bytestring with the filenames on my desktop. To deal with them in a more serious way, and to have the ASCII 10 characters actually function as newlines, I need to invoke the “decode” method, which results in a string:

output = subprocess.check_output('ls').decode('utf-8')
print(output)

This is great, until I want to pass one or more arguments to my “ls” command.  My first attempt might look like this:

output = subprocess.check_output('ls -l').decode('utf-8')
print(output)

But I get the following output:

$ ./blog.py
Traceback (most recent call last):
  File "./blog.py", line 5, in <module>
    output = subprocess.check_output('ls -l').decode('utf-8')
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 403, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 707, in __init__
    restore_signals, start_new_session)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 1333, in _execute_child
    raise child_exception_type(errno_num, err_msg)
FileNotFoundError: [Errno 2] No such file or directory: 'ls -l'

The most important part of this error message is the final line, in which the system complains that I cannot find the program “ls -l”. That’s right — it thought that the command + option was a single program name, and failed to find that program.

Now, before you go and complain that this doesn’t make any sense, remember that filenames may contain space characters. And that there’s no difference between a “command” and any other file, except for the way that it’s interpreted by the operating system. It might be a bit weird to have a command whose name contains a space, but that’s a matter of convention, not technology.

Remember, though, that when a Python program is invoked, we can look at sys.argv, a list of the user’s arguments. Always, sys.argv[0] is the program’s name itself. We can thus see an analog here, in that when we invoke another program, we also need to pass that program’s name as the first element of a list, and the arguments as subsequent list elements.
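This convention is easy to see from inside Python itself. A quick illustration (this snippet is mine, not from the original post): run a one-liner that prints its own sys.argv. With "-c", the program-name slot argv[0] holds '-c', and the remaining arguments follow as separate list elements, the same shape as the list we pass to check_output.

```python
import subprocess
import sys

# Invoke a tiny Python script and capture what it sees in sys.argv.
output = subprocess.check_output(
    [sys.executable, '-c', 'import sys; print(sys.argv)', 'foo', '-l']
).decode('utf-8')
print(output)  # ['-c', 'foo', '-l']
```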

In other words, we can do this:

output = subprocess.check_output(['ls', '-l']).decode('utf-8')
print(output)

and indeed, we get the following:

$ ./blog.py
total 88
-rwxr-xr-x 1 reuven 501 126 Jul 20 21:43 blog.py
-rwxr-xr-x 1 reuven 501 24 Jul 20 21:31 blog.py~
-rwxr-xr-x 1 reuven 501 401 Jul 17 13:43 dictslice.py
-rwxr-xr-x 1 reuven 501 397 Jun 8 14:47 dictslice.py~
-rw-r--r-- 1 reuven 501 54 Jul 16 11:11 hexnums.txt
-rw-r--r-- 1 reuven 501 20 Jun 25 22:24 nums.txt
-rw-rw-rw- 1 reuven 501 51011 Jul 3 13:51 peanut-butter.jpg
drwxr-xr-x 6 reuven 501 204 Oct 31 2016 regexp
-rwxr-xr-x 1 reuven 501 1669 May 28 03:03 showfile.py
-rwxr-xr-x 1 reuven 501 143 May 19 02:37 sieve.py
-rw-r--r-- 1 reuven 501 0 May 28 09:15 test.py
-rwxr-xr-x 1 reuven 501 72 May 18 22:18 testintern.py

So far, so good.  Notice that check_output can thus get either a string or a list as its first argument.  If we pass a list, we can pass additional arguments, as well:

output = subprocess.check_output(['ls', '-l', '-F']).decode('utf-8')
print(output)

As a result of adding the “-F’ flag, we now get a file-type indicator at the end of every filename:

$ ls -l -F
total 80
-rwxr-xr-x 1 reuven 501 137 Jul 20 21:44 blog.py*
-rwxr-xr-x 1 reuven 501 401 Jul 17 13:43 dictslice.py*
-rw-r--r-- 1 reuven 501 54 Jul 16 11:11 hexnums.txt
-rw-r--r-- 1 reuven 501 20 Jun 25 22:24 nums.txt
-rw-rw-rw- 1 reuven 501 51011 Jul 3 13:51 peanut-butter.jpg
drwxr-xr-x 6 reuven 501 204 Oct 31 2016 regexp/
-rwxr-xr-x 1 reuven 501 1669 May 28 03:03 showfile.py*
-rwxr-xr-x 1 reuven 501 143 May 19 02:37 sieve.py*
-rw-r--r-- 1 reuven 501 0 May 28 09:15 test.py
-rwxr-xr-x 1 reuven 501 72 May 18 22:18 testintern.py*

It’s at this point that we might naturally ask: What if I want to get a file listing of one of my Python programs? I can pass a filename as an argument, right?  Of course:

output = subprocess.check_output(['ls', '-l', '-F', 'sieve.py']).decode('utf-8')
print(output)

And the output is:

-rwxr-xr-x 1 reuven 501 143 May 19 02:37 sieve.py*


Now, what if I want to list all of the Python programs in this directory?  Given that this is a natural and everyday thing we do on the command line, I give it a shot:

output = subprocess.check_output(['ls', '-l', '-F', '*.py']).decode('utf-8')
print(output)

And the output is:

$ ./blog.py
ls: cannot access '*.py': No such file or directory
Traceback (most recent call last):
  File "./blog.py", line 5, in <module>
    output = subprocess.check_output(['ls', '-l', '-F', '*.py']).decode('utf-8')
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ls', '-l', '-F', '*.py']' returned non-zero exit status 2.

Oh, no!  Python thought that I was trying to find the literal file named “*.py”, which clearly doesn’t exist.

It’s here that we discover that when Python connects to external programs, it does so on its own, without making use of the Unix shell’s expansion capabilities. Such expansion, which is often known as “globbing,” is available via the Python “glob” module in the standard library.  We could use that to get a list of files, but it seems weird that when I invoke a command-line program, I can’t rely on it to expand the argument.

But wait: Maybe there is a way to do this!  Many functions in the “subprocess” module, including check_output, have a “shell” parameter whose default value is “False”. But if I set it to “True”, then a Unix shell is invoked between Python and the command we’re running. The shell will surely expand our star, and let us list all of the Python programs in the current directory, right?

Let’s see:

output = subprocess.check_output(['ls', '-l', '-F', '*.py'],
                                 shell=True).decode('utf-8')
print(output)

And the results:

$ ./blog.py
blog.py
blog.py~
dictslice.py
dictslice.py~
hexnums.txt
nums.txt
peanut-butter.jpg
regexp
showfile.py
sieve.py
test.py
testintern.py

Hmm. We didn’t get an error.  But we also didn’t get what we wanted.  This is mighty strange.

The solution, it turns out, is to pass everything — command and arguments, including the *.py — as a single string, and not as a list. When you’re invoking commands with shell=True, you’re basically telling Python that the shell should break apart your arguments and expand them.  If you pass a list to the shell, then the parsing is done the wrong number of times, and in the wrong places, and you get the sort of mess I showed above.  And indeed, with shell=True and a string as the first argument, subprocess.check_output does the right thing:

output = subprocess.check_output('ls -l -F *.py', shell=True).decode('utf-8')
print(output)

And the output from our program is:

$ ./blog.py
-rwxr-xr-x 1 reuven 501 141 Jul 20 22:03 blog.py*
-rwxr-xr-x 1 reuven 501 401 Jul 17 13:43 dictslice.py*
-rwxr-xr-x 1 reuven 501 1669 May 28 03:03 showfile.py*
-rwxr-xr-x 1 reuven 501 143 May 19 02:37 sieve.py*
-rw-r--r-- 1 reuven 501 0 May 28 09:15 test.py
-rwxr-xr-x 1 reuven 501 72 May 18 22:18 testintern.py*

The bottom line is that you can get globbing to work when invoking commands via subprocess.check_output. But you need to know what’s going on behind the scenes, and what shell=True does (and doesn’t) do, to make it work.
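An alternative worth sketching (my own suggestion, not from the post) is to avoid shell=True entirely by combining the glob module with the list form:

```python
import glob
import subprocess

# Let Python expand the pattern, then pass each match as its own list
# element; no shell is involved, so there are no quoting pitfalls.
matches = glob.glob('*.py')
args = ['ls', '-l', '-F'] + matches

if matches:  # with no operands, "ls" would list the whole directory
    output = subprocess.check_output(args).decode('utf-8')
    print(output)
```

This keeps the expansion explicit and testable in Python, at the cost of one extra line.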

The post Globbing and Python’s “subprocess” module appeared first on Lerner Consulting Blog.

Categories: FLOSS Project Planets

Dries Buytaert: Arsenal using Drupal

Planet Drupal - Thu, 2017-07-20 15:11

As a Belgian sports fan, I will always be loyal to the Belgium National Football Team. However, I am willing to extend my allegiance to Arsenal F.C. because they recently launched their new site in Drupal 8! As one of the most successful teams of England's Premier League, Arsenal has been lacing up for over 130 years. On the new Drupal 8 site, Arsenal fans can access news, club history, ticket services, and live match results. This is also a great example of collaboration, with two Drupal companies working together: Inviqa in the U.K. and Phase2 in the US. If you want to see Drupal 8 on Arsenal's roster, check out https://www.arsenal.com!

Categories: FLOSS Project Planets

Kate is now translated also on Windows!

Planet KDE - Thu, 2017-07-20 15:04

This release includes all the feature-enhancements the Linux version has received (frameworks announcements for 5.36.0)

– Actually working spell-checking.
– Possibility to switch interface language.

EDIT: Adding an extra fall-back UI language does not work properly yet.

Kate in Swedish

Grab it now at download.kde.org:  Kate-setup-17.04.3-KF5.36-32bit or Kate-setup-17.04.3-KF5.36-64bit

Categories: FLOSS Project Planets

Mediacurrent: Memory Management with Migrations in Drupal 8

Planet Drupal - Thu, 2017-07-20 12:32

When dealing with a site migration that has hundreds of thousands of nodes with larger than usual field values, you might notice some performance issues.

In one instance recently I had to write a migration for nodes that had multiple fields of huge JSON strings and parse them.  The migration itself was solid, but I kept running into memory usage warnings that would stop the migration in its tracks.

Sometime during the migration, I would see these messages:

Categories: FLOSS Project Planets

Colm O hEigeartaigh: Securing Apache Hive - part I

Planet Apache - Thu, 2017-07-20 12:30
This is the first post in a series of articles on securing Apache Hive. In this article we will look at installing Apache Hive and doing some queries on data stored in HDFS. We will not consider any security requirements in this post, but the test deployment will be used by future posts in this series on authenticating and authorizing access to Hive.

1) Install and configure Apache Hadoop

The first step is to install and configure Apache Hadoop. Please follow section 1 of this earlier tutorial for information on how to do this. In addition, we need to configure two extra properties in 'etc/hadoop/core-site.xml':
  • hadoop.proxyuser.$user.groups: *
  • hadoop.proxyuser.$user.hosts: localhost
where "$user" above should be replaced with the user that is going to run the hive server below. As we are not using authentication in this tutorial, this allows the $user to impersonate the "anonymous" user, who will connect to Hive via beeline and run some queries.

Once HDFS has started, we need to create some directories for use by Apache Hive, and change the permissions appropriately:
  • bin/hadoop fs -mkdir -p /user/hive/warehouse /tmp
  • bin/hadoop fs -chmod g+w /user/hive/warehouse /tmp
  • bin/hadoop fs -mkdir /data
The "/data" directory will hold a file which represents the output of a map-reduce job. For the purposes of this tutorial, we will use a sample output of the canonical "Word Count" map-reduce job on some text. The file consists of two columns separated by a tab character, where the left column is the word, and the right column is the total count associated with that word in the original document.
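For reference, the two-column, tab-separated format just described is trivial to inspect outside Hadoop; a small sketch using in-memory sample data (the real file comes from the map-reduce job):

```python
# Each line is: word<TAB>count, matching the "words" table schema
# created later in this tutorial.
sample = "Dare\t3\nthe\t41\n"

rows = [line.split('\t') for line in sample.splitlines()]
counts = {word: int(count) for word, count in rows}
print(counts)  # {'Dare': 3, 'the': 41}
```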

I've uploaded such a sample output here. Download it and upload it to the HDFS data directory:
  • bin/hadoop fs -put output.txt /data
2) Install and configure Apache Hive

Now we will install and configure Apache Hive. Download and extract Apache Hive (2.1.1 was used for the purposes of this tutorial). Set the "HADOOP_HOME" environment variable to point to the Apache Hadoop installation directory above. Now we will configure the metastore and start Hiveserver2:
  • bin/schematool -dbType derby -initSchema
  • bin/hiveserver2
In a separate window, we will start beeline to connect to the hive server, where $user is the user who is running Hadoop (necessary as we are going to create some data in HDFS, and otherwise wouldn't have the correct permissions):
  • bin/beeline -u jdbc:hive2://localhost:10000 -n $user
Once connected, create a Hive table called "words" and load the map-reduce output data into it:
  • CREATE TABLE words (word STRING, count INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;
  • LOAD DATA INPATH '/data/output.txt' INTO TABLE words;
Now we can run some queries on the data as the anonymous user. Log out of beeline, reconnect without the "-n" flag, and run a query:
  • bin/beeline -u jdbc:hive2://localhost:10000
  • select * from words where word == 'Dare';
Categories: FLOSS Project Planets

Palantir: Fine Arts Museums of San Francisco

Planet Drupal - Thu, 2017-07-20 12:00
Creating a Modular, Interactive Learning Tool for Art Exhibits

Creating a Modular, Interactive Learning Tool on Drupal 8

  • A media-rich, modular template built on Drupal 8
  • Innovative feature components to create context with a variety of content
  • Expansion of existing brand elements to create cohesive but unique identities

We want to make your project a success.

Let's Chat.

Our Client

The Fine Arts Museums of San Francisco (FAMSF) is the largest public arts institution in the city of San Francisco and one of the largest art museums in the state of California. With an annual combined attendance of 1,442,200 people for the two museums (the Legion of Honor and the de Young), FAMSF sought a way to expand the experience for attendees beyond the reach of the physical exhibits themselves and to deepen visitors’ engagement with the art. From this goal, the idea of ‘Digital Stories’ was born.

The Challenge of ‘Digital Stories’

FAMSF had an interesting challenge:

  • They wanted to create engaging, interactive websites for each future exhibition. 
  • They wanted each exhibit website to be unique – not employing the same template over and over. 
  • They wanted these websites to serve as educational tools for their exhibits over the course of many years. 
  • They required a platform that museum staff could use to author without starting from scratch. 
  • They needed to create these websites for their two museums — the Legion of Honor and the de Young — with different branding appropriate to each museum and to their respective exhibitions. 

In short, they required a tool that allowed them to “spin up” unique, interactive educational microsites for multiple exhibits, across two museums, for several years.

FAMSF had seen various treatments of text, images, audio, and video on the web that they felt could inspire interactive features for their content, features that, when combined, could provide a larger learning experience for visitors. Those treatments included an expansive use of the standards in HTML5 and CSS3, along with a series of exciting JavaScript libraries that expand interactions further than what is offered through HTML and CSS.

The problem also required more than a front-end solution to create the interactions. It needed to be built on a content management system — Drupal 8 in this case — that could support their content editors, providing them with a tool where they could simply upload and arrange content to produce amazing, dynamic exhibit sites.

The Solution Understanding the brands

In order to create an adaptable template for the museums, we needed to first understand the two brands. The Legion of Honor displays a European collection that spans ancient cultures to early modernism. The exhibits are seen as authoritative manifestos. The de Young, on the other hand, houses diverse collections including works from Africa, Oceania, Mesoamerica, and American Art. The exhibits are challenging and exploratory, and invite visitors to think about art in new and different ways. The framework for the microsites needed to be flexible enough to convey either brand effectively.

Understanding the content

The FAMSF project was unique in that it didn't call for the typical content strategy we do for websites. Because this project was more interaction and feature driven, our content strategy was focused on the different elements of the stories to be told, and how we could showcase those elements to create an expansive experience for visitors. For example, users should be able to zoom in on a painting to get a closer look, or be able to click on a question and see the answer display.

Creating the right interactive features

With so many different possible elements, it was important to narrow down the interactions and feature components that they needed. These components needed to match their content and also have the ability to be executed in a tight timeline.

For overall presentation treatment, we introduced a flexible content area where individual sections could be introduced as revealable “card sections”. Within each card section, a site administrator can first choose to add hero content for that section which could include either background images or background video, plus various options for placement and style of animated headers.

Next within the card section, a series of “Section Layout Components” were available, such as single column or two columns side-by-side that they could choose from. Within the column sections they could place modular content components that included media (video, images, audio) and text.

Menu features for the Early Monet site, one of the Digital Stories for the Legion of Honor Museum.

We used a custom implementation of several JavaScript and jQuery libraries to achieve the card-reveal effect (pagePiling.js) and animated CSS3 transitions as slides are revealed, using a suite of CSS3 animation effects, particularly for Hero sections of slides. Additionally, implementation of a JavaScript library (lazyloadxt) for lazy-loading of images was critical for the success of the desired media-rich pages in order to optimize performance. All were coded to work on modern mobile and desktop browsers, so that every experience would be rich, no matter the type of device it was displayed on.

Many interactive components went through a process of discovery and iteration achieved through team collaboration, taking into account the strategic needs of each component to increase user engagement, along with content requirements of the component, look, feel and interactivity. Components as well as the general treatment were presented as proof-of-concept, where additional client feedback was taken into account. Most interactivity on individual components was done by creating custom jQuery behaviors and CSS3 animation for each component. This often included animated transitional effects to help reveal more pieces of content as users look more closely.

Collapsible content displayed on the “Summer of Love” site, one of the Digital Stories for the de Young Museum.

Applying the FAMSF brand to design components

Although the same colors and typefaces employed in FAMSF’s main website were used, it was agreed from the beginning that the Digital Stories and the main website were going to be “cousins” within the same family as opposed to “siblings,” so they could definitely have their own unique feel. This supported the goal of designing the microsites to be an immersive and very targeted experience. This was achieved by expanding upon the existing color palette and using additional fonts within the brand’s font family.

Style tile created for the de Young: playful / challenging / contemporary / exploratory.

The de Young style tile creates a sense of excitement and delight through the use of whimsical icons and graphics. Easily recognizable iconography is incorporated in order to communicate effectively with a wide audience, with the added bonus of fun details such as saturated drop shadows and stripes.

In order to make sure the FAMSF team could reliably reproduce new, unique exhibit sites without having to change any code, we had to systematize the structure of the content and application of the interactions.

The Digital Story Content type for each exhibition had modular and reusable interactive features including the following:

  • An image comparison component for comparing two or three images side by side, where revealable text can give more context for each image 
  • An audio component for uploading audio files that included custom playback buttons and a revealable transcript
  • The ability to add a highly customizable caption or credit to any instance of an image or video
  • A zoomable image component where markers can be positioned on the initial image and that marker can be clicked, revealing a more detailed image and an area for more commentary on that detail
  • A revealable “read more” section that can contain further subsections
  • An image with an overlay, to be able to reveal a new image on top of the existing image. This was used to demonstrate aspects of the composition of a painting, showing a drawing on top of the painting.
  • A video component that could support uploaded videos or embedded streaming video
  • A horizontal slider that could contain images and captions with a variety of configurations
  • A stand-alone quotation with display type and an animated transition
The Results

The resulting platform we built allowed FAMSF to launch two exhibit sites in rather quick succession, which would have been incredibly difficult if they had to build each from scratch. In a matter of weeks, FAMSF launched two quite different interactive learning experiences:

Both exhibit sites have received praise, not only internally at FAMSF, but from online reviews of the exhibits, which mention the accompanying Digital Stories online learning tool.

Since the completion of the engagement with Palantir, FAMSF has already leveraged this tool to create an additional Digital Stories site (digitalstories.famsf.org/degas), and they have plans to create at least three more before the end of the year. Because of the simplicity of using the platform, they anticipate being able to spin up 4 - 5 different exhibit sites per year.

Current success for the Digital Stories sites is being measured by individual views and actual participation rate, and the initial results are on track with FAMSF’s initial goals:

  • The Monet site has over 30,000 views
  • The Summer of Love site has just under 30,000 views
  • Visitors are typically spending 4 - 5 minutes on each page

We’re pleased to have been part of a project that helps expand visitors’ understanding of important artists and their works. FAMSF was a great partner, allowing for a true collaboration focused on both pairing the best technologies to fit the material and also providing the best learning mechanism for those engaging with the content.

Drupal 8 digitalstories.famsf.org/summer-of-love digitalstories.famsf.org/early-monet/
Categories: FLOSS Project Planets

PyCharm: PyCharm 2017.2 RC

Planet Python - Thu, 2017-07-20 11:37

We’ve been putting the finishing touches on PyCharm 2017.2, and we have a release candidate ready! Go get it on our website.

Fixes since the last EAP:

  • Docker Compose on Windows issues with malformed environment variables
  • Various issues in Django project creation
  • Incorrect “Method may be static” inspection
  • AttributeError during package installation
  • And a couple more, see the release notes for details

As this is a release candidate, it does not come with a 30 day EAP license. If you don’t have a license for PyCharm Professional Edition you can use a trial license.

Even though this is not called an EAP version anymore, our EAP promotion still applies! If you find any issues in this version and report them on YouTrack, you can win prizes in our EAP competition.

To get all EAP builds as soon as we publish them, set your update channel to EAP (go to Help | Check for Updates, click the ‘Updates’ link, and then select ‘Early Access Program’ in the dropdown). If you’d like to keep all your JetBrains tools up to date, try JetBrains Toolbox!

-PyCharm Team
The Drive to Develop

Categories: FLOSS Project Planets