Feeds

The Sego Blog: Drupal 8.2 Release October 5th!

Planet Drupal - Mon, 2016-09-26 09:42

Earlier this month Acquia put on a great webinar hosted by Angie Byron & Gabor Hojtsy titled All You Need to Know About Drupal 8.2 and Beyond (slides linked below).

There were many topics covered and I could not help but get super excited for some things we can look forward to in the upcoming release of Drupal 8.2 slated for October 5th 2016.

Categories: FLOSS Project Planets

Doug Hellmann: copy — Duplicate Objects — PyMOTW 3

Planet Python - Mon, 2016-09-26 09:00
The copy module includes two functions, copy() and deepcopy() , for duplicating existing objects. Read more… This post is part of the Python Module of the Week series for Python 3. See PyMOTW.com for more articles from the series.
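The difference between the two is easy to see in a small sketch (an illustrative example, not taken from the article): copy() duplicates only the outer object, while deepcopy() also duplicates nested objects.

```python
import copy

original = {"nums": [1, 2, 3]}

shallow = copy.copy(original)   # new dict, but shares the inner list
deep = copy.deepcopy(original)  # recursively duplicates the inner list

original["nums"].append(4)

print(shallow["nums"])  # [1, 2, 3, 4] -- the shared list reflects the change
print(deep["nums"])     # [1, 2, 3]    -- the deep copy is unaffected
```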
Categories: FLOSS Project Planets

Mike Driscoll: PyDev of the Week: Katie McLaughlin

Planet Python - Mon, 2016-09-26 08:30

This week we welcome Katie McLaughlin (@glasnt) as our PyDev of the Week! She is a core developer of the BeeWare project. You should take a moment and check out her Github profile to see what fun projects she’s a part of. Katie also has a fun little website and was a speaker at PyCon 2016. Let’s take a few moments to get to know her better!

Can you tell us a little about yourself (hobbies, education, etc):

G’day! I’m Australian, originally from Brisbane, but now living in Sydney. I’ve got a Bachelor of Information Technology, and I’ve been in the tech industry for going on ten years now. I’ve been in a bunch of different roles and technologies, but mostly in web hosting and cloud stuff. When I’m not on a computer or attending conferences, I enjoy cooking and making tapestries.

Why did you start using Python?

To fix a bug in a bit of in-house code! There was a bug in an old script, and I saw the “#!/usr/bin/env python” and learnt from there. I didn’t go back to Python for a few years, but just after I was accepted to PyCon Australia 2015, I thought I should brush up on what little I knew. That’s about a year ago now, and it’s now my go-to language for scripting. I had previously used Ruby for years, and I only occasionally still automatically type “puts” instead of “print”.

What other programming languages do you know and which is your favorite?

Well! Based on just languages that I’ve been paid to do, I know JavaScript, Haskell, Scala, C, Python, Ruby, Perl, Bash/Shell, Powerscript, Powershell, PL/SQL, and probably a few others in there as well. Add a dozen or so other languages from high school and university (mostly Pascal, Lisp, Poplog, Assembly, ActionScript, C#, Java), and there’s a lot.

But what languages do I know? That’s a tough one. Personally I’d define knowing a language as having a working knowledge of it. Put any language in front of me and I could probably work it out, but writing is completely different.

Given that, I’d say I know JavaScript, Haskell, Python, Ruby & Bash. #polyglotLife

But as for a favourite, I know I adored Poplog back in the day, but I really don’t play favourites with languages. I use a programming language in an environment to articulate a solution specific to that environment. Using a favourite language in an environment where it doesn’t belong doesn’t do anyone any favours. The right tool for the job, etc.

Categories: FLOSS Project Planets

My Adventures and Misadventures in Qt Quick Land

Planet KDE - Mon, 2016-09-26 07:58

I venture into QMLand. I do not survive unscathed. I did not even get a shirt. I learned some lessons instead.

Categories: FLOSS Project Planets

myDropWizard.com: How to QUICKLY and SAFELY deploy to the live site WITHOUT comprehensive testing!

Planet Drupal - Mon, 2016-09-26 07:41

On the one hand, you want to deploy changes to the live site QUICKLY (for, say, a Highly Critical security update).

On the other hand, you want to make changes SAFELY, i.e. you don't want them to break the site.

Testing is good. Automated testing is great.

But what if you simply didn't have the resources to comprehensively test the change (either manually or automatically)?

Maybe the client isn't willing to fund a project to write automated tests. Maybe you don't have the extra time or extra people to do proper QA. Whatever the reason, you just couldn't do it.

Is it possible to both quickly AND safely deploy to the live site WITHOUT comprehensive testing?

... besides just crossing your fingers and hoping it doesn't break?

We think there is a way. :-) Read more to find out!

Categories: FLOSS Project Planets

Annertech: At DrupalCon? Need Help? We'll Take Care of You.

Planet Drupal - Mon, 2016-09-26 06:59

With over 2,000 people expected to be in Dublin for DrupalCon this week, it's likely that someone, somewhere is going to need some assistance. We're all very helpful people in the Drupal community and so help should easily be available. But sometimes you get caught out and can't find people nearby - you get lost, you lose your phone, you're in an area of town and haven't a clue how to get back to your home, you are locked out of your AirBnB, you've gone to kiss the Blarney Stone not realising it was 350km away!

Categories: FLOSS Project Planets

Rhonda D'Vine: LP

Planet Debian - Mon, 2016-09-26 06:00

I guess you know by now that I simply love music. It is powerful, it can move you, change your mood in a lot of directions, make you want to move your body to it (sometimes without you even noticing), and remind you of situations you want to keep in mind. The singer I present to you was introduced to me by a dear friend with the following words: So this hasn't happened to me in a looooong time: I hear a voice and can't stop crying. I can't decide which song I should send to you, thus I send three, of which the last one makes me think of you.

And I have to agree, that voice is really great. Thanks a lot for sharing LP with me, dear! And given that I got sent three songs and I am not good at holding excitement back, I want to share them with you, so here are the songs:

  • Lost On You: Her voice is really great in this one.
  • Halo: Have to agree that this is really a great cover.
  • Someday: When I hear that song and think about how it reminds my friend of me, I'm close to tears, too ...

Like always, enjoy!


Categories: FLOSS Project Planets

qed42.com: Lazy Builders in Drupal 8 - Caching FTW

Planet Drupal - Mon, 2016-09-26 04:36

The Drupal caching layer has become more advanced with the advent of cache tags & contexts in Drupal 8. In Drupal 7, core didn't offer many options to cache content for authenticated users. Reverse proxies like Varnish, Nginx etc. could only benefit anonymous users. The good news is that Drupal 8 handles many painful caching scenarios from the ground up and gives developers / site builders an array of options, making Drupal 8 a first-class option for all sorts of performance requirements.

Let's look at the criteria for un-cacheable content:

  • Content with a very high cardinality, e.g., a block that needs to display user info
  • Blocks that need to display content with a very high invalidation rate, e.g., a block displaying the current timestamp

In both of the cases above, we would have a mix of dynamic & static content. For simplicity, consider a block rendering the current timestamp, like "The current timestamp is 1421318815". This rendered content consists of 2 parts:

  • Static/cacheable: "The current timestamp is"
  • Dynamic: 1421318815

The Drupal 8 rendering & caching system provides us with a way to cache the static part of the block, leaving the dynamic part uncached. It does so using the concept of lazy builders. Lazy builders, as the name suggests, are very similar to what a lazy loader does in JavaScript. Lazy builder elements are replaced with unique placeholders, to be processed later once the processing of cached content is complete. So, at any point in rendering, we can have n pieces of cached content + m placeholders. Once the processing of cached content is complete, Drupal 8 uses its render strategy (single flush) to process all the placeholders & replace them with actual content (fetching the dynamic data). The placeholders can also be leveraged by the experimental BigPipe module to render the cached data & present it to the user while the placeholders keep processing in the background. As these placeholders are processed, the results are injected into the HTML DOM via embedded AJAX requests.

Let's see how lazy builders actually work in Drupal 8. Taking the above example, I've created a simple module called timestamp_generator. This module is responsible for providing a block that renders the text "The current timestamp is {{current_timestamp}}".

<?php

/**
 * @file
 * Contains \Drupal\timestamp_generator\Plugin\Block\Timestamp.
 */

namespace Drupal\timestamp_generator\Plugin\Block;

use Drupal\Core\Block\BlockBase;

/**
 * Provides a 'Timestamp' block.
 *
 * @Block(
 *   id = "timestamp",
 *   admin_label = @Translation("Timestamp"),
 * )
 */
class Timestamp extends BlockBase {

  /**
   * {@inheritdoc}
   */
  public function build() {
    $build = [];
    $user = \Drupal::currentUser()->getAccount();
    $build['timestamp'] = array(
      '#lazy_builder' => ['timestamp_generator.generator:generateUserTimestamp', array()],
      '#create_placeholder' => TRUE,
    );
    $build['#markup'] = $this->t('The current timestamp is');
    $build['#cache']['contexts'][] = 'languages';
    return $build;
  }

}

In the block plugin above, let's focus on the build array:

$build['timestamp'] = array(
  '#lazy_builder' => ['timestamp_generator.generator:generateUserTimestamp', array()],
  '#create_placeholder' => TRUE,
);
$build['#markup'] = $this->t('The current timestamp is');

All we need to do to define a lazy builder is add a #lazy_builder key to our render array.

#lazy_builder: The lazy builder value must be an array containing a callback function & the arguments that callback needs. In our case, we have created a service that can generate the current timestamp. Since it doesn't need any arguments, the second element is an empty array.

#create_placeholder: When set to TRUE, this makes sure a placeholder is generated & placed while processing this render element.

#markup: This is the cacheable part of the block plugin. Since the content is translatable, we have added a language cache context here. We can also add cache tags, depending on the content being rendered, using $build['#cache']['tags'] = ['...'];

Let's take a quick look at our service implementation:

services:
  timestamp_generator.generator:
    class: Drupal\timestamp_generator\UserTimestampGenerator
    arguments: []

 

<?php

/**
 * @file
 * Contains \Drupal\timestamp_generator\UserTimestampGenerator.
 */

namespace Drupal\timestamp_generator;

/**
 * Class UserTimestampGenerator.
 *
 * @package Drupal\timestamp_generator
 */
class UserTimestampGenerator {

  public function generateUserTimestamp() {
    return array(
      '#markup' => time(),
    );
  }

}

As we can see above, the data returned from the service callback function is just the timestamp, which is the dynamic part of the block content.

Let's see how Drupal renders it with its default single-flush strategy. The content of the block before placeholder processing would look as follows:

<div id="block-timestamp" class="contextual-region block block-timestamp-generator block-timestamp">
  <h2>Timestamp</h2>
  <div data-contextual-id="block:block=timestamp:langcode=en" class="contextual" role="form">
    <button class="trigger focusable visually-hidden" type="button" aria-pressed="false">Open Timestamp configuration options</button>
    <ul class="contextual-links" hidden=""><li class="block-configure"><a href="/admin/structure/block/manage/timestamp?destination=node">Configure block</a></li></ul>
  </div>
  <div class="content">
    The current timestamp is
    <drupal-render-placeholder callback="timestamp_generator.generator:generateUserTimestamp" arguments="" token="d12233422"></drupal-render-placeholder>
  </div>
</div>

Once the placeholders are processed, it would change to:

<div id="block-timestamp" class="contextual-region block block-timestamp-generator block-timestamp">
  <h2>Timestamp</h2>
  <div data-contextual-id="block:block=timestamp:langcode=en" class="contextual" role="form">
    <button class="trigger focusable visually-hidden" type="button" aria-pressed="false">Open Timestamp configuration options</button>
    <ul class="contextual-links" hidden=""><li class="block-configure"><a href="/admin/structure/block/manage/timestamp?destination=node">Configure block</a></li></ul>
  </div>
  <div class="content">
    The current timestamp is 1421319204
  </div>
</div>

Placeholder processing in Drupal 8 happens via Drupal\Core\Render\Placeholder\ChainedPlaceholderStrategy::processPlaceholders. Drupal 8 core also provides an interface, Drupal\Core\Render\Placeholder\PlaceholderStrategyInterface, for defining custom placeholder processing strategies. The BigPipe module implements this interface to provide its own placeholder processing strategy & is able to present users with cached content without waiting for the processing of the dynamic parts.

With BigPipe enabled, the block would render something like the one shown in the gif below:

As you can see in this example, as soon as the cached part of the timestamp block ("The current timestamp is") is ready, it's presented to the end users without waiting for the timestamp value. The current timestamp loads when the lazy builders kick in. Lazy builders are not limited to blocks; they work with any render array element in Drupal 8, which means any piece of content being rendered can leverage them.

N.B. -- We use BigPipe in the demo above to make the difference visible.

PIYUESH KUMAR Mon, 09/26/2016 - 14:06
Categories: FLOSS Project Planets

erdfisch: Arriving in Dublin

Planet Drupal - Mon, 2016-09-26 03:34
Single image: Hydra in front of anti-racism poster in Dublin. 26.09.2016, Michael Lenahan.

DrupalCon Dublin is starting this week. It's a particularly exciting one for me, since
I went to university here about a million years ago (a.k.a. the early nineties).

This was another era entirely, before the internet. I think the arrival of the
internet changed Ireland even more than most other countries. It is mainly
rural, and stuck on the western edge of Europe on the Atlantic. "Surrounded by
water" as the song says. The water comes down from above as well. Very, very
frequently.

So Ireland remains geographically on the edge of Europe, but digitally it is
extremely connected. Companies from Silicon Valley discovered a country with a
young and educated workforce, so they set up their European operations here from
the late 1990s on.

This changed Ireland. For most of its recent history it had been a poor country
where people mostly emigrated to find work. The Catholic church had a lot of
power over society in general, the health and education systems in particular.

Now, there are many foreigners living here and you hear people from all over the
world who speak English with Irish accents. This is something that is still
quite amazing for me.

It's a privilege to be here. Thank you, erdfisch! More to come ...

Categories: FLOSS Project Planets

Kdenlive news and packaging

Planet KDE - Mon, 2016-09-26 03:34

Following last week’s monthly Café, we decided to concentrate on advanced trimming features and, if possible, an audio mixer for the next Kdenlive 16.12 release. It was also mentioned that several users had requested the return of the rotoscoping effect, which was lost in the KF5 port, preventing some users from upgrading their Kdenlive version.

 

So, good news: I worked on it and rotoscoping is now back in git master, and it will be in the 16.12 release.

On the stability side, we just fixed a packaging issue on our PPA that caused frequent crashes, so if you experienced issues with our PPA, please update to enjoy a stable Kdenlive.

Next Café will be on the 19th of October, at 9pm (Central European Time). See you there!

Jean-Baptiste Mardelle

Categories: FLOSS Project Planets

PreviousNext: Join us at the DrupalSouth Code Sprint

Planet Drupal - Sun, 2016-09-25 20:51

The Drupal open source project only exists because of code contributions by tens of thousands of developers and Drupal focused companies around the world. In his recent post, project founder Dries Buytaert blogged that “The Drupal community has a shared responsibility to build Drupal and that those who get more from Drupal should consider giving more”.

Australia’s contribution to Drupal code is significantly underrepresented, with PreviousNext the only Australian company in the Top 100 contributors listed on Drupal.org’s global marketplace. DrupalSouth represents the best opportunity for a wider pool of Australian Drupal developers to change this status by participating in DrupalSouth's official Code Sprint, being held on Wednesday, 26th October.
 

Categories: FLOSS Project Planets

Clint Adams: Collect the towers

Planet Debian - Sun, 2016-09-25 19:57

Why is openbmap's North American coverage so sad? Is there a reason that RadioBeacon doesn't also submit to OpenCellID? Is there a free software Android app that submits data to OpenCellID?

Categories: FLOSS Project Planets

Vincent Sanders: I'll huff, and I'll puff, and I'll blow your house in

Planet Debian - Sun, 2016-09-25 19:23
Sometimes it really helps to have a different view on a problem and after my recent writings on my Public Suffix List (PSL) library I was fortunate to receive a suggestion from my friend Enrico Zini.

I had asked for suggestions on reducing the size of the library further and Enrico simply suggested Huffman coding. This was a technique I had learned about long ago in connection with data compression and the intervening years had made all the details fuzzy which explains why it had not immediately sprung to mind.

Huffman coding, named for David A. Huffman, is an algorithm that enables a very efficient representation of data. In a normal array of characters every character takes the same eight bits to represent, which is the best we can do when all 256 possible values are equally likely. If the data is not evenly distributed this is not the case; for example, if the data were English text, the value e is roughly fifteen times more likely than k.

So if we have some data with a non-uniform distribution of probabilities we need a way to encode the data with fewer bits for the common values and more bits for the rarer values. To be efficient we would need some way of having variable-length representations without storing the length separately. The term for this data representation is a prefix code, and there are several ways to generate them.

Such is the influence of Huffman on the area of prefix codes that they are often called Huffman codes even if they were not created using his algorithm. One can dream of becoming immortalised like this; to join the ranks of those whose names are given to units or whole ideas in a field must be immensely rewarding. However, given Huffman invented his algorithm, and proved it optimal, to answer a question on a term paper in his early twenties, I fear I may already be a bit too late.

The algorithm itself is relatively straightforward. First a frequency analysis is performed, a fancy way of saying: count how many of each character appear in the input data. Next a binary tree is created by using a priority queue initialised with the nodes sorted by frequency.

The counts of the two least frequent items are summed together and a node placed in the tree with the two original entries as child nodes. This step is repeated until a single node exists with a count value equal to the length of the input.

To encode data, one simply walks the tree, outputting a 0 for a left node or a 1 for a right node, until reaching the original value. This generates a mapping of values to bit outputs; the input is then simply converted value by value to the bit output. To decode, the data is used bit by bit to walk the tree and arrive at values.

If we perform this algorithm on the example string table *!asiabvcomcoopitamazonawsarsaves-the-whalescomputebasilicata we can reduce the 488 bits (61 × 8-bit characters) to 282 bits, a 40% reduction. Obviously in a real application the Huffman tree would need to be stored, which would probably exceed this saving, but for larger data sets it is probable this technique would yield excellent results on this kind of data.
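The build/encode/decode steps just described can be sketched in Python (an illustrative re-implementation, not the author's Perl encoder; the bit count it reports may differ from the figure quoted above, since tree shape and accounting depend on the implementation):

```python
import heapq
from collections import Counter
from itertools import count

def build_codes(data):
    """Build a Huffman code table {symbol: bitstring} from the input data."""
    tiebreak = count()  # keeps heap tuples comparable when counts tie
    heap = [(n, next(tiebreak), sym) for sym, n in Counter(data).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent entries into one internal node.
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (n1 + n2, next(tiebreak), (left, right)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):   # internal node: 0 = left, 1 = right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                         # leaf: an input symbol
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

def decode(bits, codes):
    """Consume bits one at a time; prefix-freedom makes the parse unambiguous."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)

data = "*!asiabvcomcoopitamazonawsarsaves-the-whalescomputebasilicata"
codes = build_codes(data)
encoded = "".join(codes[ch] for ch in data)
print(len(data) * 8, "bits ->", len(encoded), "bits")
assert decode(encoded, codes) == data
```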

Once I had proved this to myself I implemented the encoder within the existing conversion program. Although my Perl encoder is not very efficient, it can process the entire PSL string table (around six thousand labels, using 40KB or so) in less than a second, so unless the table grows massively an inelegant approach will suffice.

The resulting bits were packed into 32-bit values to improve decode performance (most systems prefer to deal with larger memory fetches less frequently), resulting in 18KB of output, or 47% of the original size. This is a great improvement in size and means the statically linked test program is now 59KB and is actually smaller than the gzipped source data.

$ ls -alh test_nspsl
-rwxr-xr-x 1 vince vince 59K Sep 25 23:58 test_nspsl
$ ls -al public_suffix_list.dat.gz
-rw-r--r-- 1 vince vince 62K Sep 1 08:52 public_suffix_list.dat.gz
To be clear, the statically linked program can determine if a domain is in the PSL with no additional heap allocations, and includes the entire PSL ordered tree, the domain label string table and the Huffman decode table to read it.

An unexpected side effect is that, because the decode loop is small, it sits in the processor cache. This appears to bring the performance of the string comparison function huffcasecmp() (which is not locale dependent, because we know the data is limited to ASCII) close to that of strcasecmp(); indeed, on ARM32 systems there is a very modest improvement in performance.

I think this is as much work as I am willing to put into this library, but I am pleased to have achieved a result on par with the best of breed (libpsl still has a data representation 20KB smaller than libnspsl, but requires additional libraries for additional functionality), and I got to (re)learn an important algorithm too.
Categories: FLOSS Project Planets

François Dion: Something for your mind: Polymath Podcast Episode 001

Planet Python - Sun, 2016-09-25 18:32
Two topics will be covered:

  • Chipmusic, limitations and creativity
  • Numfocus (Open code = better science)
The numfocus interview was recorded at PyData Carolinas 2016. There will be a future episode covering the keynotes, tutorials, talks and lightning talks later this year. This interview was really more about open source and less about PyData.
The episode concludes with "Learn more", on Claude Shannon and Harry Nyquist.
Something for your mind is available on
art·chiv.es/'ärt,kīv/
at artchiv.es/s4ym/

Francois Dion
@f_dion
Categories: FLOSS Project Planets

Greg Boggs: Content Modeling in Drupal 8

Planet Drupal - Sun, 2016-09-25 15:43

In many modern frameworks, data modeling is done by building out database tables. In Drupal, we use a web-based interface to build our models. This interface makes building the database accessible for people with no database experience. However, this easy access can lead to overly complex content models because it’s so easy to build out advanced structures with a few hours of clicking. It’s surprising how often Drupal developers are expected to be content modeling experts. Here’s a well-written overview of content modeling for the rest of us who aren’t experts yet.

Data Modeling Goal

Our goal when modeling content in Drupal is to build out the structure that will become our editor interface and HTML output. We also need to create a model that supports the functionality needed in the website. While accomplishing this, we want to reduce the complexity of our models as much as possible.

Getting Started

One of the first things to do when building a Drupal site is to build content types. So, before you start a site build, start with either a content model or a detail page wireframe. This spreadsheet from Palantir will help you. The home page design may look amazing, but it’s unhelpful for building out content types. Get the detail pages before you start building.

Why Reduce Complexity?

The more content types you create, the more effort it will take to produce a site. Further, the more types you have, the more time it will take to maintain the site in the future. If you have 15 content types and need to make a site-wide change, you need to edit 15 different pages.

The more pages you need to edit, the more mistakes you will make in choosing labels, settings, and formatters. Lastly, content can’t easily be copied from one type to another which makes moving content around your site harder when there are many content types. So, the first thing you’ll want to do with your content model is to collapse your types into as few types as feasible. How many is that?

5 Content Types is Enough

Drupal has many built-in entities, like files, taxonomy, users, nodes, comments, and config. So, the vast majority of sites don't need any more than 5 content types. Instead of adding a new content type for every design, look for ways to reuse existing types by adding fields and applying layouts to those fields.

Break Up the Edit Form

Drupal 8 allows you to have different form displays for a single content type. With either Form Mode Control or Form Mode Manager, you can create different edit experiences for the same content type without overloading the admin interface.

Now that we’ve got the goal in mind and some tools to get us there, we’ll dive into the specifics of configuring field types such as hero images and drop down lists in my next post.

Categories: FLOSS Project Planets

Julian Andres Klode: Introducing TrieHash, a order-preserving minimal perfect hash function generator for C(++)

Planet Debian - Sun, 2016-09-25 14:44
Abstract

I introduce TrieHash, an algorithm for constructing perfect hash functions from tries. The generated hash functions are pure C code, minimal, order-preserving, and outperform existing alternatives. Together with the generated header files, they can also be used as a generic string-to-enumeration mapper (enums are created by the tool).

Introduction

APT (and dpkg) spend a lot of time parsing various files, especially Packages files. APT currently uses a function called AlphaHash, which hashes the last 8 bytes of a word in a case-insensitive manner, to hash fields in those files (dpkg just compares strings in an array of structs).

There is one obvious drawback to using a normal hash function: When we want to access the data in the hash table, we have to hash the key again, causing us to hash every accessed key at least twice. It turned out that this affects something like 5 to 10% of the cache generation performance.

Enter perfect hash functions: A perfect hash function matches a set of words to constant values without collisions. You can thus just use the index to index into your hash table directly, and do not have to hash again (if you generate the function at compile time and store key constants) or handle collision resolution.

As #debian-apt people know, I happened to play a bit around with tries this week before guillem suggested perfect hashing. Let me tell you one thing: my trie implementation was very naive, and it did not really improve things a lot…

Enter TrieHash

Now, how is this related to hashing? The answer is simple: I wrote a perfect hash function generator that is based on tries. You give it a list of words, it puts them in a trie, and generates C code out of it, using recursive switch statements (see code generation below). The function achieves competitive performance with other hash functions, it even usually outperforms them.

Given a dictionary, it generates an enumeration (a C enum or C++ enum class) of all words in the dictionary, with the values corresponding to the order in the dictionary (the order-preserving property), and a function mapping strings to members of that enumeration.

By default, the first word is considered to be 0 and each subsequent word increases the counter by one (that is, it generates a minimal hash function). You can tweak that, however:

= 0
WordLabel ~ Word
OtherWord = 9

will return 0 for an unknown value, map “Word” to the enum member WordLabel, and map OtherWord to 9. That is, the input list functions like the body of a C enumeration. If no label is specified for a word, one will be generated from the word. For more details see the documentation.

C code generation

switch(string[0] | 32) {
case 't':
    switch(string[1] | 32) {
    case 'a':
        switch(string[2] | 32) {
        case 'g':
            return Tag;
        }
    }
}
return Unknown;

Yes, really recursive switches – they directly represent the trie. Now, we did not really do a straightforward translation, there are some optimisations to make the whole thing faster and easier to look at:

First of all, the 32 you see is used to make the check case-insensitive when all cases of the switch body are alphabetical characters. If there are non-alphabetical characters, it will generate two cases per character, one upper case and one lower case (with one break between them). I did not know before that lowercase and uppercase characters differ by only one bit; thanks to the clang compiler for pointing that out in its generated assembler code!
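That single-bit difference is easy to check (a quick illustration, not from the original post): ORing an ASCII letter with 32 sets bit 5, which maps upper case to lower case.

```python
# ASCII upper- and lowercase letters differ only in bit 5 (value 32).
for upper, lower in [("A", "a"), ("T", "t"), ("Z", "z")]:
    assert ord(upper) | 32 == ord(lower)   # setting bit 5 lowercases
    assert ord(upper) ^ ord(lower) == 32   # exactly one differing bit

print(bin(ord("A")), bin(ord("a")))  # 0b1000001 0b1100001
```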

Secondly, we insert breaks only between cases. Initially, each case ended with a return Unknown, but guillem (the dpkg developer) suggested it might be faster to let them fall through where possible. It turned out not to be faster on a good compiler, but it’s still more readable anyway.

Finally, we build one trie per word length, and switch by the word length first. Like the 32 trick, this gives a huge improvement in performance.

Digging into the assembler code

The whole code translates to roughly 4 instructions per byte:

  1. A memory load,
  2. an or with 32
  3. a comparison, and
  4. a conditional jump.

(On x86, the case sensitive version actually only has a cmp-with-memory and a conditional jump).

Due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729 this may be one instruction more: On some architectures an unneeded zero-extend-byte instruction is inserted – this causes a 20% performance loss.

Performance evaluation

I ran the hash against all 82 words understood by APT in Packages and Sources files, 1,000,000 times for each word, and summed up the average run-time:

host      arch     Trie  TrieCase  GPerfCase  GPerf   DJB
plummer   ppc64el   540       601       1914   2000  1345
eller     mipsel   4728      5255      12018   7837  4087
asachi    arm64    1000      1603       4333   2401  1625
asachi    armhf    1230      1350       5593   5002  1784
barriere  amd64     689       950       3218   1982  1776
x230      amd64     465       504       1200    837   693

Suffice to say, GPerf does not really come close.

All hosts except the x230 are Debian porterboxes. The x230 is my laptop with a Core i5-3320M; barriere has an Opteron 23xx. I included the DJB hash function as another reference.

Source code

The generator is written in Perl, licensed under the MIT license and available from https://github.com/julian-klode/triehash – I initially prototyped it in Python, but guillem complained that this would add new build dependencies to dpkg, so I rewrote it in Perl.

Benchmark is available from https://github.com/julian-klode/hashbench

Usage

See the script for POD documentation.


Filed under: General
Categories: FLOSS Project Planets

Obey the Testing Goat: Plans for the second edition

Planet Python - Sun, 2016-09-25 13:52

The second edition was mostly prompted by the announcement by Mozilla that they were shutting down Persona in November 2016. Given that it would affect almost all the chapters from 15 through 21, it seemed a good excuse to do a full second edition rather than just an update.

Here, in brief, is an outline of the plan:

Chapter rewrites:
  • Rewrite chapters 15 + 16, replace persona with passwordless auth: first draft done

  • Update chapters 17+ for persona changes: in progress

  • Update JavaScript chapter for new version of QUnit: done

  • Update deployment chapters to use Systemd instead of Upstart: started, but only in the Ansible appendix.

  • Two new chapters on REST APIs and Ajax: code spiked, but chapters not yet written

Minor updates + changes:
  • Switch to using a virtualenv from the very beginning
  • Upgrade to latest Django (1.10?)
  • Use fewer HTML ids and more classes
  • Use more early returns in FTs when refactoring partially finished user stories.

That's it, in very brief. You can read more on the google group, and feel free to join in the discussion there too, or here. Let me know what you think!

Categories: FLOSS Project Planets

Abu Ashraf Masnun: Introduction to Django Channels

Planet Python - Sun, 2016-09-25 11:27

Django is a brilliant web framework. In fact it is my most favourite one for various reasons. A year and a half ago, I switched to Python and Django for all my web development. I am a big fan of the ecosystem and the many third party packages. Particularly I use Django REST Framework whenever I need to create APIs. Having said that, Django was more than good enough for basic HTTP requests. But the web has changed. We now have HTTP/2 and web sockets. Django could not support them well in the past. For the web socket part, I usually had to rely on Tornado or NodeJS (with the excellent Socket.IO library). They are good technologies but most of my web apps being in Django, I really wished there were something that could work with Django itself. And then we had Channels. The project is meant to allow Django to support HTTP/2, websockets or other protocols with ease.

Concepts

The underlying concept is really simple - there are channels and there are messages, there are producers and there are consumers - the whole system is based on passing messages on to channels and consuming/responding to those messages.

Let’s look at the core components of Django Channels first:

  • channel - A channel is a FIFO queue-like data structure. We can have many channels depending on our need.
  • message - A message contains meaningful data for the consumers. Messages are passed on to the channels.
  • consumer - A consumer is usually a function that consumes a message and takes action.
  • interface server - The interface server knows how to handle different protocols. It works as a translator or a bridge between Django and the outside world.
How does it work?

An http request first comes to the Interface Server, which knows how to deal with a specific type of request. For example, for websockets and http, Daphne is a popular interface server. When a new http/websocket request comes to the interface server (daphne in our case), it accepts the request and transforms it into a message. Then it passes the message to the appropriate channel. There are predefined channels for specific types. For example, all http requests are passed to the http.request channel. For incoming websocket messages, there is websocket.receive. So these channels receive the messages when the corresponding types of requests come in to the interface server.

Now that we have channels getting filled with messages, we need a way to process these messages and take actions (if necessary), right? Yes! For that we write some consumer functions and register them to the channels we want. When messages come to these channels, the consumers are called with the message. They can read the message and act on them.

So far, we have seen how we can read an incoming request. But like all web applications, we should write something back too, no? How do we do that? As it happens, the interface server is quite clever. While transforming the incoming request into a message, it creates a reply channel for that particular client request and registers itself to that channel. Then it passes the reply channel along with the message. When our consumer function reads the incoming message, it can pass a response to the reply channel attached with the message. Our interface server is listening to that reply channel, remember? So when a response is sent back to the reply channel, the interface server grabs the message, transforms it into a http response and sends back to the client. Simple, no?
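The whole request/reply loop can be modeled with plain Python queues. This is a toy model to make the flow concrete, not the actual channels API:

```python
from collections import deque

queues = {}     # channel name -> FIFO of messages
consumers = {}  # channel name -> consumer function

def send(channel, message):
    queues.setdefault(channel, deque()).append(message)

def register(channel, consumer):
    consumers[channel] = consumer

def run_worker_once():
    # A worker pops pending messages FIFO and hands each to the
    # consumer registered for that channel.
    for channel, consumer in consumers.items():
        q = queues.get(channel)
        while q:
            consumer(q.popleft())

# The interface server would attach a per-client reply channel:
def echo(message):
    send(message["reply_channel"], {"text": "You said: " + message["text"]})

register("websocket.receive", echo)
send("websocket.receive", {"text": "hi", "reply_channel": "reply.client1"})
run_worker_once()
reply = queues["reply.client1"].popleft()
print(reply)  # {'text': 'You said: hi'}
```

In the real system the interface server, not our code, listens on the reply channel and turns the response back into a websocket frame.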

Writing a Websocket Echo Server

Enough with the theories, let’s get our hands dirty and build a simple echo server. The concept is simple. The server accepts websocket connections, the client writes something to us, we just echo it back. Plain and simple example.

Install Django & Channels

pip install channels

That should do the trick and install Django + Channels. Channels has Django as a dependency, so when you install channels, Django comes with it.

Create An App

Next we create a new django project and app -

django-admin.py startproject djchan
cd djchan
python manage.py startapp realtime

Configure INSTALLED_APPS

We have our Django app ready. We need to add channels and our django app (realtime) to the INSTALLED_APPS list under settings.py. Let’s do that:

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    "channels",
    "realtime",
]

Write our Consumer

After that, we need to start writing a consumer function that will process the incoming websocket messages and send back the response:

# consumers.py

def websocket_receive(message):
    text = message.content.get('text')
    if text:
        message.reply_channel.send({"text": "You said: {}".format(text)})

The code is simple enough. We receive a message, get its text content (we're expecting that the websocket connection will send only text data for this example) and then push it back to the reply_channel - just like we planned.

Channels Routing

We have our consumer function ready, now we need to tell Django how to route messages to our consumer. Just like URL routing, we need to define our channel routings.

# routing.py

from channels.routing import route
from .consumers import websocket_receive

channel_routing = [
    route("websocket.receive", websocket_receive, path=r"^/chat/"),
]

The code should be self explanatory. We have a list of route objects. Here we select the channel name (websocket.receive => for receiving websocket messages), pass the consumer function and then configure the optional path. The path is an interesting bit. If we didn't pass a value for it, the consumer would get all the messages in the websocket.receive channel on any URL. So if someone created a websocket connection to / or /private or /user/1234 - regardless of the url path, we would get all incoming messages. But that's not our intention, right? So we restrict the path to /chat so only connections made to that url are handled by the consumer. Please note the leading /: unlike url routing, in channels we have to include it.

Configuring The Channel Layers

We have defined a consumer and added it to a routing table. We’re more or less ready. There’s just a final bit of configuration we need to do. We need to tell channels two things - which backend we want to use and where it can find our channel routing.

Let’s briefly talk about the backend. The messages and the channels - Django needs some sort of data store or message queue to back this system. By default, Django can use an in-memory backend which keeps these things in memory, but for a distributed app that needs to scale, you need something else. Redis is a popular and proven piece of technology for these kinds of scenarios. In our case we will use the Redis backend.

So let’s install that:

pip install asgi_redis

And now we put this in our settings.py:

CHANNEL_LAYERS = {
    "default": {
        "BACKEND": "asgi_redis.RedisChannelLayer",
        "CONFIG": {
            "hosts": [("localhost", 6379)],
        },
        "ROUTING": "realtime.routing.channel_routing",
    },
}

Running The Servers

Make sure that Redis is running (usually redis-server should run it). Now run the django app:

python manage.py runserver

In a local environment, when you do runserver, Django launches both the interface server and the necessary background workers (to run the consumer functions in the background). But in production, we should run the workers separately. We will get to that soon.

Trying it Out!

Once our dev server starts up, let’s open up the web app. If you haven’t added any django views, no worries, you should still see the “It Worked!” welcome page of Django and that should be fine for now. We need to test our websocket and we are smart enough to do that from the dev console. Open up your Chrome Devtools (or Firefox | Safari | any other browser’s dev tools) and navigate to the JS console. Paste the following JS code:

socket = new WebSocket("ws://" + window.location.host + "/chat/");

socket.onmessage = function(e) {
    alert(e.data);
}

socket.onopen = function() {
    socket.send("hello world");
}

If everything worked, you should get an alert with the message we sent. Since we defined a path, the websocket connection works only on /chat/. Try modifying the JS code and send a message to some other url to see how they don’t work. Also remove the path from our route and see how you can catch all websocket messages from all the websocket connections regardless of which url they were connected to. Cool, no?

Our Custom Channels

We have seen that certain protocols have predefined channels for various purposes. But we are not limited to those. We can create our own channels. We don’t need to do anything fancy to initialize a new channel. We just need to mention a name and send some messages to it. Django will create the channel for us.

from channels import Channel

Channel("thumbnailer").send({
    "image_id": image.id
})

Of course we need corresponding workers to be listening to those channels. Otherwise nothing will happen. Please note that besides working with new protocols, Channels also allows us to create some sort of message based task queues. We create channels for certain tasks and our workers listen to those channels. Then we pass the data to those channels and the workers process them. So for simpler tasks, this could be a nice solution.
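A matching consumer for that custom channel could look like this. The names (thumbnailer, generate_thumbnail) and the message shape are illustrative assumptions, not from the channels API:

```python
# consumers.py (sketch)
def generate_thumbnail(message):
    image_id = message["image_id"]  # hypothetical message shape
    # A real consumer would load the image and write a thumbnail;
    # here we only return a marker so the flow is visible.
    return "thumbnail-for-{}".format(image_id)

# and registered in routing.py alongside the other routes with:
#     route("thumbnailer", generate_thumbnail)

print(generate_thumbnail({"image_id": 42}))  # thumbnail-for-42
```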

Scaling Production Systems

Running Workers Separately

On a production environment, we would want to run the workers separately (since we would not run runserver in production anyway). To run the background workers, we have to run this command:

python manage.py runworker

ASGI & Daphne

In our local environment, the runserver command took care of launching the interface server and background workers. But now we have to run the interface server ourselves. We mentioned Daphne already. It works with the ASGI standard (which is commonly used for HTTP/2 and websockets). Just like wsgi.py, we now need to create an asgi.py module and configure it.

import os
from channels.asgi import get_channel_layer

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "djchan.settings")

channel_layer = get_channel_layer()

Now we can run the server:

daphne djchan.asgi:channel_layer

If everything goes right, the interface server should start running!

ASGI or WSGI

ASGI is still new, while WSGI is a battle tested standard. So you might still want to keep using wsgi for your http-only parts and asgi for the parts where you need channels-specific features.

The popular recommendation is that you should use nginx or any other reverse proxy in front and route the urls to asgi or uwsgi depending on the url or the Upgrade: WebSocket header.
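As a sketch, such an nginx split might look like this (the upstream names, ports, and the /chat/ prefix are assumptions for this example):

```nginx
upstream daphne_app { server 127.0.0.1:8001; }
upstream wsgi_app   { server 127.0.0.1:8000; }

server {
    listen 80;

    # WebSocket traffic goes to the ASGI server (daphne)
    location /chat/ {
        proxy_pass http://daphne_app;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }

    # Everything else stays on the WSGI stack
    location / {
        proxy_pass http://wsgi_app;
    }
}
```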

Retries and Celery

The Channels system does not guarantee delivery. If there are tasks which need that certainty, it is highly recommended to use a system like Celery for those parts. Or we can also roll our own checks and retry logic if we feel like it.

Categories: FLOSS Project Planets

DrupalCon News: Badge Pick-Up is Open

Planet Drupal - Sun, 2016-09-25 10:18

DrupalCon is here! We have registration open here at the Dublin Convention Centre until 18:00 today. We will also be open bright and early at 7:00 on Monday to kick off the first day of DrupalCon Dublin! Come on by and pick up your badge and tote, and make sure to join us for the Opening Reception from 17:00-19:00, where everyone is welcome.

Categories: FLOSS Project Planets