Feeds

Drupal Watchdog: Baby Steps

Planet Drupal - Tue, 2014-09-30 00:47
Column

FADE IN:

INTERIOR. RONNIE’S APARTMENT – EVENING
(RONNIE paces, on the phone.)

RONNIE: Jeremy, I have a bit of a problem. My Drupal guru-guy got two million dollars seed capital for his start-up and moved to Palo Alto. But I was thinking, maybe you should send me to DrupalCon Austin... Why? So I can experience the whole Drupal community-thing first-hand... Really?... Awesome!

DISSOLVE TO:

EXTERIOR. CONVENTION CENTER – ESTABLISHING SHOT – MORNING
(RONNIE enters the massive Neal Kocurek Memorial Austin Convention Center, along with hundreds of enthusiastic Drupalists.)

[TITLE: DRUPALCON. DAY ONE]

INT. CONVENTION CENTER, REGISTRATION AREA
(Ronnie approaches a heavily costumed gentleman and explains that he is interviewing attendees for an article in Drupal Watchdog. HILMAR HALLBJÖRNSSON, a Drupal developer from Iceland, readily agrees to be questioned. Ronnie sets his cellphone to Record.)

RONNIE: Hilmar, why the helmet and horns?

HILMAR: Three days before I came here, I saw Morten DK announcing a big Vikings party and claiming that Vikings came from Denmark.

RONNIE: Uh-huh.

HILMAR: Well, everyone with a little knowledge of Vikings knows there are as many genuine Vikings in Denmark as there are high mountains. Which is: none.

RONNIE: So...?

HILMAR: So I decided to show him who was the boss.

RONNIE: And you showed him?

HILMAR: He was defeated and knelt before me.

INT. CONVENTION CENTER – LATER
(Exiting Exhibit Hall, Ronnie keeps up with a scurrying man-on-a-mission: JASON MOSS, an applications developer for the University of North Carolina in Chapel Hill.)

JASON: This is my fourth DrupalCon and what makes it most favorable over the last three is that it’s in a warm place.

RONNIE: Uh-huh. But aside from the weather –

JASON: – I always get a lot of little tidbits and useful tips at DrupalCon. It’s what keeps me coming.

RONNIE: Did you hear Dries’s keynote speech?

Categories: FLOSS Project Planets

Drupal core announcements: No Drupal 6 or Drupal 7 core release on Wednesday, October 1

Planet Drupal - Tue, 2014-09-30 00:44

The monthly Drupal core bug fix release window is scheduled for this Wednesday. However, due to DrupalCon and other scheduling conflicts, there will be no release on this date.

Upcoming release windows include:

  • Wednesday, October 15 (security release window)
  • Wednesday, November 5 (bug fix release window)

For more information on Drupal core release windows, see the documentation on release timing and security releases, and the discussion that led to this policy being implemented.

Categories: FLOSS Project Planets

Bryan Pendleton: A fairly random collection of links on Merkle Trees

Planet Apache - Tue, 2014-09-30 00:18

Just sitting around, hanging out, musing about Merkle Trees...

  • Merkle tree: Hash trees can be used to verify any kind of data stored, handled and transferred in and between computers. Currently the main use of hash trees is to make sure that data blocks received from other peers in a peer-to-peer network are received undamaged and unaltered, and even to check that the other peers do not lie and send fake blocks.
  • A Certified Digital Signature: The method is called tree authentication because the computation of H(1,n,Y) forms a binary tree of recursive calls. Authenticating a particular leaf Y(i) in the tree requires only those values of H() starting from the leaf and progressing to the root, i.e., from H(i,i,Y) to H(1,n,Y).
  • Recent Improvements in the Efficient Use of Merkle Trees: Additional Options for the Long Term. Fractal Merkle Tree Representation and Traversal shows how to modify Merkle’s scheduling algorithm to achieve a space-time trade-off. This paper was presented at the Cryptographer’s Track, RSA Conference 2003 (May 2003). This construction roughly speeds up the signing operation inherent in Merkle’s algorithm by an arbitrary factor of T (less than H), at a cost of requiring more space: 2^T times the space.
  • Merkle Signature Schemes, Merkle Trees and Their Cryptanalysis: The big advantage of the Merkle Signature Scheme is that the security does not rely on the difficulty of any mathematical problem. The security of the Merkle Signature Scheme depends on the availability of a secure hash function and a secure one-time digital signature. Even if a one-time signature or a hash function becomes insecure, it can be easily exchanged. This makes it very likely that the Merkle Signature Scheme stays secure even if the conventional signature schemes become insecure.
  • Caches and Merkle Trees for Efficient Memory Authentication: Our work addresses the issues in implementing hash tree machinery in hardware and integrating this machinery with an on-chip cache to reduce the log N memory bandwidth overhead.
  • Protocol specification: Merkle trees are binary trees of hashes. Merkle trees in bitcoin use a double SHA-256, the SHA-256 hash of the SHA-256 hash of something.

    If, when forming a row in the tree (other than the root of the tree), it would have an odd number of elements, the final double-hash is duplicated to ensure that the row has an even number of hashes.

    First form the bottom row of the tree with the ordered double-SHA-256 hashes of the byte streams of the transactions in the block.

    Then the row above it consists of half that number of hashes. Each entry is the double-SHA-256 of the 64-byte concatenation of the corresponding two hashes below it in the tree.

    This procedure repeats recursively until we reach a row consisting of just a single double-hash. This is the Merkle root of the tree.
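
    (A short code sketch of this construction follows at the end of this list.)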

  • Amazon's Dynamo: Merkle trees help in reducing the amount of data that needs to be transferred while checking for inconsistencies among replicas. For instance, if the hash values of the root of two trees are equal, then the values of the leaf nodes in the tree are equal and the nodes require no synchronization. If not, it implies that the values of some replicas are different. In such cases, the nodes may exchange the hash values of children and the process continues until it reaches the leaves of the trees, at which point the hosts can identify the keys that are “out of sync”. Merkle trees minimize the amount of data that needs to be transferred for synchronization and reduce the number of disk reads performed during the anti-entropy process.
  • Cassandra: Using Merkle trees to detect inconsistencies in data. A repair coordinator node requests a Merkle tree from each replica for a specific token range in order to compare them. Each replica builds a Merkle tree by scanning the data stored locally in the requested token range. The repair coordinator node compares the Merkle trees, finds all the sub token ranges that differ between the replicas, and repairs data in those ranges.
  • The Dangers of Rebasing A Branch: Personally, having studied Merkle Trees and discussed a possible use-case for using git/Merkle Trees as a caching solution, I view git as an entirely immutable structure of your code. Rebases break this immutability of commits.
  • Sigh. "grow-only", "rebase is dangerous", "detached head state is dangerous". STOP. Stop it now. git is a bag of commits organized into a tree (a tree of Merkle hash chains). Branches and tags are symbolic names for these. Think of it this way and there's no danger.

    ...

    I didn't need a local branch crutch to find my way around because I know the model: a tree of commits.

    Understanding the model is the key.

    There are other VCSes that also use Merkle hash trees. Internally they have the power that git has.

  • Google's end-to-end key distribution proposal Smells like a mixed blockchain/git type approach - which is a good thing. The "super-compressed" version of the log tip sounds like git revision hash. The append-only, globally distributed log is pretty much like a blockchain.
  • Ticking time bomb: Given only one verified hash in such a system, no part of the data, nor its history of mutation, can be forged. "History" can mean which software runs on your computer (TPM), which transactions are valid (Bitcoin), or which commits have been done in an SCM (git, mercurial).

    So git is not magical, it is just a practical implementation of something that works. Any other *general* solution will be based on similar basic principles. Mercurial does this and there is a GPG extension for it.

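As an aside, the bitcoin construction quoted above is compact enough to sketch directly; here is a minimal, illustrative Python version (toy transaction byte strings, ignoring bitcoin's little-endian display conventions):

import hashlib

def double_sha256(data):
    # bitcoin's double hash: the SHA-256 of the SHA-256 of the input
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(hashes):
    # hashes: the ordered double-SHA-256 hashes of the block's transactions
    row = list(hashes)
    while len(row) > 1:
        if len(row) % 2 == 1:
            row.append(row[-1])  # duplicate the final hash for odd-length rows
        # each parent is the double hash of the 64-byte concatenation below it
        row = [double_sha256(row[i] + row[i + 1]) for i in range(0, len(row), 2)]
    return row[0]

leaves = [double_sha256(tx) for tx in (b"tx1", b"tx2", b"tx3")]
print(merkle_root(leaves).hex())
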
Wow, that's a lot.

I shall have to find more time to read...

Categories: FLOSS Project Planets

Dirk Eddelbuettel: Rcpp 0.11.3

Planet Debian - Mon, 2014-09-29 21:39

A new release 0.11.3 of Rcpp is now on the CRAN network for GNU R, and an updated Debian package has been uploaded too.

Rcpp has become the most popular way of enhancing GNU R with C++ code. As of today, 273 packages on CRAN depend on Rcpp for making analyses go faster and further.

This release brings a fairly large number of continued enhancements, fixes and polishing to Rcpp. These were provided by a total of seven different contributors---which is a new record as well.

See below for a detailed list of changes extracted from the NEWS file, but some highlights included in this release are

  • Several API cleanups, polishes and a pre-announced code removal
  • New InternalFunction interface, and new Timer functionality.
  • More robust functionality of Rcpp Attributes as well as a new dryRun option.
  • The Rcpp FAQ was updated, as was the main Description: in the DESCRIPTION file.
  • Rcpp.package.skeleton() can now deploy functionality from pkgKitten to create Rcpp packages that purr.

One sore point, however, is that we missed that packages using Rcpp Modules appear to require a rebuild. We are sorry for the inconvenience; this has highlighted a shortcoming in our fairly robust and extensive tests. While we test our packages against all known CRAN dependents, such tests check for the ability to compile and run freshly and not whether previously built packages still run. We intend to augment our testing in this direction to avoid a repeat occurrence of such a misfeature.

Changes in Rcpp version 0.11.3 (2014-09-27)
  • Changes in Rcpp API:

    • The deprecation of RCPP_FUNCTION_* which was announced with release 0.10.5 last year is proceeding as planned, and the file macros/preprocessor_generated.h has been removed.

    • Timer no longer records time between steps, but times from the origin. It also gains a get_timers(int) method that creates a vector of Timer objects that have the same origin. This is modelled on the Rcpp11 implementation and is more useful for situations where we use timers in several threads. Timer also gains a constructor taking a nanotime_t to use as its origin, and an origin method. This can be useful for situations where the number of threads is not known in advance but we still want to track what goes on in each thread.

    • A cast to bool was removed in the vector proxy code as inconsistent behaviour between clang and g++ compilations was noticed.

    • A missing update(SEXP) method was added thanks to pull request by Omar Andres Zapata Mesa.

    • A proxy for DimNames was added.

    • A no_init option was added for Matrices and Vectors.

    • The InternalFunction class was updated to work with std::function (provided a suitable C++11 compiler is available) via a pull request by Christian Authmann.

    • A new_env() function was added to Environment.h

    • The return value of range eraser for Vectors was fixed in a pull request by Yixuan Qiu.

  • Changes in Rcpp Sugar:

    • In ifelse(), the returned NA type was corrected for operator[].

  • Changes in Rcpp Attributes:

    • Include LinkingTo in DESCRIPTION fields scanned to confirm that C++ dependencies are referenced by package.

    • Add dryRun parameter to sourceCpp.

    • Corrected issue with relative path and R chunk use for sourceCpp.

  • Changes in Rcpp Documentation:

    • The Rcpp-FAQ vignette was updated with respect to OS X issues.

    • A new entry in the Rcpp-FAQ clarifies the use of licenses.

    • Vignette build results are no longer copied to /tmp to please CRAN.

    • The Description in DESCRIPTION has been shortened.

  • Changes in Rcpp support functions:

    • The Rcpp.package.skeleton() function will now use pkgKitten package, if available, to create a package which passes R CMD check without warnings. A new Suggests: has been added for pkgKitten.

    • The modules=TRUE case for Rcpp.package.skeleton() has been improved and now runs without complaints from R CMD check as well.

  • Changes in Rcpp unit test functions:

    • Functions from the RUnit package are now prefixed with RUnit::

    • The testRcppModule and testRcppClass sample packages now pass R CMD check --as-cran cleanly without NOTES or WARNINGS

Thanks to CRANberries, you can also look at a diff to the previous release. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads page, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: FLOSS Project Planets

Justin Mason: Links for 2014-09-29

Planet Apache - Mon, 2014-09-29 19:58
  • Prototype

    Prototype is a brand new festival of play and interaction. This is your chance to experience the world from a new perspective with removable camera eyes, to jostle and joust to a Bach soundtrack whilst trying to disarm an opponent, to throw shapes as you figure out who got an invite to the silent disco, to duel with foam pool noodles, and play chase in the dark with flashlights. A unique festival that incites new types of social interaction, involving technology and the city, Prototype is a series of performances, workshops, talks, and games that spill across the city, alongside an adult playground in the heart of Temple Bar. Project Arts Centre, 17-18 October. looks nifty

    (tags: prototype festivals dublin technology make vr gaming)

  • Confessions of a former internet troll – Vox

    I want to tell you about when violent campaigns against harmless bloggers weren’t any halfway decent troll’s idea of a good time — even the then-malicious would’ve found it too easy to be fun. When the punches went up, not down. Before the best players quit or went criminal or were changed by too long a time being angry. When there was cruelty, yes, and palpable strains of sexism and racism and every kind of phobia, sure, but when these things had the character of adolescents pushing the boundaries of cheap shock, disagreeable like that but not criminal. Not because that time was defensible — it wasn’t, not really — but because it was calmer and the rage wasn’t there yet. Because trolling still meant getting a rise for a laugh, not making helpless people fear for their lives because they’re threatening some Redditor’s self-proclaimed monopoly on reason. I want to tell you about it because I want to make sense of how it is now and why it changed.

    (tags: vox trolls blogging gamergate 4chan weev history teenagers)

Categories: FLOSS Project Planets

Web Wash: 3 Easy Ways to Create View Modes in Drupal 7

Planet Drupal - Mon, 2014-09-29 17:14

View modes allow site builders to display the same piece of content in various ways. Drupal ships with a bunch of them out of the box like Teaser, "Full content", RSS and much more. There is even one for the search result page called "Search result". However, the two most prominent are Teaser and "Full content".

The "Full content" view mode is the one used to display content on its "node/123" page. It's the one you'll customise the most. Teaser, on the other hand, is used to display a summarised or trimmed down version of an article.

You can create as many view modes as necessary, and like many things in Drupal, there is more than one way to do it: view modes can be implemented in code or with a module or two.

In this tutorial, you'll learn how to create view modes in three ways: using hook_entity_info_alter(), using Display Suite and Entity view modes.

Categories: FLOSS Project Planets

Drupal Association News: Drupal.org Content Strategy: Announcing a Request for Proposals!

Planet Drupal - Mon, 2014-09-29 16:45

Earlier this year the Drupal Association began work on an initiative to launch a redesigned and improved Drupal.org in 2015. The first step of the plan was the Drupal.org user research, which was recently finished. Today we’d like to issue a Request for Proposals (RFP) for the content strategy for Drupal.org, the next step of our redesign project.

We’d like to develop the strategy, which will guide ongoing content development work performed by Drupal Association staff and the Drupal community, and inform our ongoing branding and design efforts.

If you or your company is interested in potentially performing the content strategy work, please see the dates and instructions in the RFP document for more detail. Also, if you know of a person or company who would be awesome for this project, please encourage them to participate. Thank you!

Drupal.org Content Strategy RFP

Categories: FLOSS Project Planets

Tomorrow: The Luminosity of Free Software Episode 21

Planet KDE - Mon, 2014-09-29 12:48

As I wrote in my blog last week, I was away from the Internets for nearly an entire week (oh my!) but am back, and tomorrow is the day when the next episode of the Luminosity of Free Software will happen.

I will be recording it on Google+ Hangouts on Tuesday the 30th of September at 18:00 UTC. The topics haven't changed since my last blog, but I'll repeat them here for convenience:


  1. Kdenlive: Free software non-linear video editing that rocks
  2. Funding Free Software: We'll discuss a number of models, each with their unique strengths and weaknesses and see if we can't pick out some of the better ones
  3. Q&A: You ask, and I do my best to answer
See you tomorrow on G+ Hangout and irc.freenode.net in #luminosity!
Categories: FLOSS Project Planets

Jim Jagielski: Shellshock: No, it IS a bash bug

Planet Apache - Mon, 2014-09-29 12:32

Reading over http://paste.lisp.org/display/143864, I am surprised just how wrong the entire post is.

The gist of the post is that the Shellshock bug is not bash's fault, but rather, in this argument, the fault of Apache and other public-facing programs in not "sanitizing" the environment before it gets into bash's hands.

Sweet Sassy Molassy! What kind of horse-sh*t is that?

As "proof" of this argument, pjb uses the tired old excuse: "It's not a bug, it's a feature", noting that bash's execution of commands "hidden" in environment variables is documented. But then we get the best line of all:

The implementation detail of using an environment variable whose value starts with "() {" and which may contain further commands after the function definition is not documented, but could still be considered a feature 

As far as outlandish statements go, this one takes the cake. Somehow, Apache and other programs should sanitize magical, undocumented features, and their failure to do so is the problem, not that this magic is undocumented as well as fraught with issues in and of itself.

Let's recall that if any other Bourne-type shell, or, in fact, any real POSIX-compliant shell (which bash claims to be), were being used in the exact situation that bash was being used, there would be no vulnerability. None. Nada. Zero. Replace it with ksh, zsh, dash, ... and you'd be perfectly fine. No vulnerability, and CGI would work just fine. And also let's recall, again focusing on Apache (and all web servers, in fact; it's not just Apache that is affected by this vulnerability, but any web server, such as nginx, etc.), that the CGI specification specifically makes it clear that environment variables are exactly where the parameters of the client's request live.
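
To make the mechanism concrete, here is a hedged re-creation in Python of the widely circulated Shellshock probe (not the Apache/CGI code path itself): it plants a CGI-style environment variable whose value begins with "() {" and checks whether bash runs the trailing command at startup. HTTP_USER_AGENT is just one example of a variable that CGI fills from client-supplied data.

import os
import subprocess

# On a vulnerable bash, importing this variable from the environment also
# executes the command that follows the function definition.
env = dict(os.environ, HTTP_USER_AGENT="() { :; }; echo VULNERABLE")
proc = subprocess.Popen(["bash", "-c", "echo harmless"],
                        env=env, stdout=subprocess.PIPE)
out, _ = proc.communicate()
print("vulnerable bash" if b"VULNERABLE" in out else "looks patched")

Swap bash for dash, ksh, or zsh in the same snippet and nothing extra runs, which is exactly the point above.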

Also, let's consider this: a shell is where the unwashed public interfaces with the OS. If there is ANY place where you don't want undocumented magic, especially in executing random code in an undocumented fashion, it AIN'T the shell. And finally, the default shell is also run by the start-up scripts themselves, again meaning that you want that shell to have as few undocumented bugs... *cough* *cough*, sorry, "features"... as possible, and certainly not ones that could possibly run things behind your back.

Yes, this bug, this vulnerability, is certainly bash's, no doubt at all. But it also goes without saying that if bash were not the default shell (/bin/sh) on Linux and OS X, this would have been a weaker vulnerability. Maybe that was, and is, the main takeaway here. Maybe it is time for the default shell on Linux to "return" to the old Bourne shell or, at least, dash.

Categories: FLOSS Project Planets

Riccardo Mottola: Improvements in GNUstep's native window look

GNU Planet! - Mon, 2014-09-29 12:11
In the past weeks, quite a bit of polish was added to Windows support.

First, a bug affecting popup menus and contextual menus on certain computers was fixed.




Then, the controls were not properly initialized. Native file dialogs, for example, as well as the upcoming print dialogs (a work in progress by Gregory), did not fit the theme properly. On XP, Windows 7 and Windows 8 they should follow the native look; instead they always got the "Win 95" look, creating a strange mix.

The fix requires initializing Windows' controls. I put the initialization code inside the WinUX theme loading; if that does not prove safe, it will need to be moved into NSApplication. Furthermore, an XML resource file is needed to enable the correct loading.




It really does look nice, doesn't it?
Categories: FLOSS Project Planets

DrupalCon Amsterdam: DrupalCon Amsterdam Kicks Off!

Planet Drupal - Mon, 2014-09-29 11:21

It's time for DrupalCon Amsterdam! We're excited to welcome more than 2,000 developers, designers, IT professionals and business executives from all over the world to Drupal’s largest-ever event in Europe.

We're especially excited to celebrate the progress made toward a Drupal 8 release. All critical “beta blocker” issues have been fixed and a beta version of Drupal 8 is the next step. Drupal 8 is full of promising improvements for developers, site builders, themers, IT professionals and marketers, and we can't wait to get it ready for the world.

Watch DrupalCon Live

DrupalCon Amsterdam will open this morning with a keynote from Dries Buytaert at 9:00 CEST -- but if you aren't in Amsterdam, don't worry! You can catch a live stream of the keynote (and Prenote!) here, courtesy of Brightcove.

The keynote promises to be fantastic -- Dries will discuss Drupal 8, as well as new ideas for how project contributors can be recognized. The Wednesday keynote will be excellent, as author and activist Cory Doctorow will take the stage to talk about net neutrality, privacy, and security. Doctorow’s keynote will be followed by a book signing.

View the full live stream schedule.

Categories: FLOSS Project Planets

DrupalCon Amsterdam: Looking Back on Monday in Amsterdam

Planet Drupal - Mon, 2014-09-29 11:06

We’re kicking off DrupalCon Amsterdam this evening, and our day of training and summits has been a huge success. At registration, over 900 badges were picked up, and this evening, the exhibit hall's opening reception promises to be a great bash.

We had more than 150 attendees show up for training, while the community summit and business summit each drew over 115 attendees. The rockstars in our community also built some fantastic DrupalCon apps:

  • One Shoe built an app for Android and iPhone.
  • Lemberg also built an app for Android and iPhone.

Remember that the fun starts tomorrow bright and early! The Prenote will kick off the day of sessions & more at 8 AM; we recommend showing up to the Prenote to get a great seat for Dries’ keynote. (Plus, it’s fun!)

We hope you have a great time here in Amsterdam, and we’re looking forward to seeing you tomorrow.

Categories: FLOSS Project Planets

Blair Wadman: Drupalcon for Non-Attendees

Planet Drupal - Mon, 2014-09-29 11:00

Drupalcon Amsterdam kicks off today and it looks like it is going to be a great event, especially with Drupal 8 Beta about to be released! Sadly, we can't all be there. But that doesn't mean we have to miss out entirely.

Tags: Drupal Community, Planet Drupal
Categories: FLOSS Project Planets

Baris Wanschers: Drupal Training Day, the largest Drupal training worldwide

Planet Drupal - Mon, 2014-09-29 09:53

On Friday September 26th, the largest Drupal training worldwide was held in Amsterdam. Over 250 students, teachers and professionals from Belgium and The Netherlands participated in a curriculum of 5 different tracks introducing them to Drupal.

Both Drupal agencies and client organisations have a serious demand for Drupal talent. The Dutch Drupal Foundation aims to onboard new talent. Drupal is a popular open source content management framework used by Ikea, European governments and Lady Gaga.

Over 250 schools and professionals from Belgium and The Netherlands attended the free Drupal Training Day, with 50 more on the waiting list. The workshop was the first attempt by the Dutch Drupal Foundation to create awareness amongst young professionals and universities and to meet the increased demand for Drupal in the market. The program included workshops aimed at code development, site building and content strategy, as well as a single track for university stakeholders on getting Drupal introduced into the curriculum.

The Training Day was carefully prepared by a team of volunteers and 35 trainers. Drupal Training Day was organised on Friday before DrupalCon, the largest Drupal event in the world, which is held in Amsterdam this week.

Check out the photos!

Credits to Imre Gmelig Meijling for writing this post.

 

Tags: Planet Drupal, Drupal Training Day, Community, Students
Categories: FLOSS Project Planets

InternetDevels: Meet our bloggers! 100th post!

Planet Drupal - Mon, 2014-09-29 09:14

Site building is a hell of a challenge. Like designing from scratch is easier… Or managing dozens of people devoted to the project. Or keeping a 2-floor office in order and cleanliness. Gosh, every duty and specialization is a challenge! And we can only thank our teammates for keeping to the high standards of their profession! But every professional comes to a point when it is necessary not only to use skills in practice but also to share the secrets of mastery with the community.

Read more
Categories: FLOSS Project Planets

Phase2: Introducing OpenPublic 1.0: The Next Era of Digital Government

Planet Drupal - Mon, 2014-09-29 08:24

Since 2011, OpenPublic has been transforming government by building government websites and applications which ensure security, mobility, and accessibility. Through our work with numerous government agencies, our teams have developed deep expertise in recognizing and responding to the specific technical challenges faced by public sector organizations.

But that doesn’t mean we couldn’t get better – and we did. As of today, we are proud to introduce the new and improved OpenPublic 1.0. It is the culmination of years of developing content management platforms for federal, state, and local government agencies. With each project, we gained a better understanding of these organizations’ digital needs. From the Department of Homeland Security, to Georgia.gov, to the recent launch of San Mateo County’s multi-site platform, OpenPublic has evolved into the mature product it is today.

What’s New in OpenPublic 1.0?

The 1.0 product version encapsulates all the most important OpenPublic functionality in a clean collection of Apps, simplifying the distribution’s powerful out-of-the-box capabilities. Not only does OpenPublic 1.0 break the mold by “appifying” what was once a wilderness of modules and complicated Drupal configuration, but the newly released product is also fully compliant with the Federal Information Security Management Act (FISMA).

The conception of OpenPublic 1.0 was based on Phase2’s significant experience building government technology solutions. While working with San Mateo County, we developed the idea of using apps to make the distribution’s functionality simple to configure for site administrators. Instead of wading through Drupal’s confusing configuration settings and modules, admins can now turn features on and off without affecting other parts of the platform. Apps like the Services App, Media Room App, Security App, and Workflow App provide distinct segments of functionality specifically designed to complement agencies’ digital needs. In July, Experience Director Shawn Mole elaborated on our OpenPublic App strategy and its potential to transform content management for the public sector.

Open Technology and Government

Like all distributions maintained by Phase2, OpenPublic is built with open technology, with good reason. Government agencies strive to reduce unnecessary costs for their taxpayers, and avoiding the recurring licensing fees of proprietary software is a major benefit of open source solutions. Bypassing proprietary vendor lock-in allows government to leverage the sustainable innovation of an open community working collaboratively to create, improve, and extend functionality, in addition to utilizing the community’s best practices for development. And because open technology is in the public domain, any agency can download, test drive, and learn about potential content management systems before choosing a provider.

San Mateo County, which worked with Phase2 to implement OpenPublic for the county’s CMS, recognized the value of openness in government technology and opened their code to the GitHub community. We were ecstatic that one of our clients embraced the open practices which are not only inherent in our work but laid the foundation for the development of OpenPublic. By making the innovative technology that went into building San Mateo County’s platform available for wider use, San Mateo County contributed to the government’s objective to foster openness, lower costs, and enhance service delivery. The “Open San Mateo” project demonstrates the power of open source to improve not just one government agency, but hundreds simultaneously by making the code available to other governments.

OpenPublic Moving Forward

We are hopeful that OpenPublic 1.0 will continue to advance more open government initiatives. In the meantime, Phase2 has several exciting projects in which we’ll show off some of the product’s enhanced features. Keep an eye open for the launch of the Department of the Interior (among others)!

Learn more about OpenPublic 1.0 at openpublicapp.com. For more information about Phase2’s services and how we can help build your site, email us at openpublic@phase2technology.com or comment below!

Categories: FLOSS Project Planets

Caktus Consulting Group: Celery in Production

Planet Python - Mon, 2014-09-29 08:00

(Thanks to Mark Lavin for significant contributions to this post.)

In a previous post, we introduced using Celery to schedule tasks.

In this post, we address things you might need to consider when planning how to deploy Celery in production.

At Caktus, we've made use of Celery in a number of projects ranging from simple tasks to send emails or create image thumbnails out of band to complex workflows to catalog and process large (10+ Gb) files for encryption and remote archival and retrieval. Celery has a number of advanced features (task chains, task routing, auto-scaling) to fit most task workflow needs.

Simple Setup

A simple Celery stack would contain a single queue and a single worker which processes all of the tasks as well as schedules any periodic tasks. Running the worker would be done with

python manage.py celery worker -B

This assumes the django-celery integration, but there are plenty of docs on running the worker (locally as well as daemonized). We typically use supervisord, for which there is an example configuration, but init.d, upstart, runit, or god are all viable alternatives.

The -B option runs the scheduler for any periodic tasks. It can also be run as its own process. See starting-the-scheduler.

We use RabbitMQ as the broker, and in this simple stack we would store the results in our Django database or simply ignore all of the results.

Large Setup

In a large setup we would make a few changes. Here we would use multiple queues so that we can prioritize tasks, and for each queue, we would have a dedicated worker running with the appropriate level of concurrency. The docs have more information on task routing.

The beat process would also be broken out into its own process.

# Default queue
python manage.py celery worker -Q celery
# High priority queue. 10 workers
python manage.py celery worker -Q high -c 10
# Low priority queue. 2 workers
python manage.py celery worker -Q low -c 2
# Beat process
python manage.py celery beat

Note that high and low are just names for our queues, and don't have any implicit meaning to Celery. We allow the high queue to use more resources by giving it a higher concurrency setting.
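
To route specific tasks onto those queues, a Django settings fragment along these lines would work with the Celery 3.x-era setting names (the task paths are hypothetical):

CELERY_DEFAULT_QUEUE = "celery"
CELERY_ROUTES = {
    # send receipts quickly; rebuild thumbnails whenever there is capacity
    "myapp.tasks.send_receipt_email": {"queue": "high"},
    "myapp.tasks.rebuild_thumbnails": {"queue": "low"},
}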

Again, supervisor would manage the daemonization and group the processes so that they can all be restarted together. RabbitMQ is still the broker of choice. With the additional task throughput, the task results would be stored in something with high write speed: Memcached or Redis. If needed, these worker processes can be moved to separate servers, but they would have a shared broker and results store.

Scaling Features

Creating additional workers isn't free. The default concurrency uses a new process for each worker and creates a worker per CPU. Pushing the concurrency far above the number of CPUs can quickly pin the memory and CPU resources on the server.

For I/O heavy tasks, you can dedicate workers using either the gevent or eventlet pools rather than new processes. These can have a lower memory footprint with greater concurrency but are both based on greenlets and cooperative multi-tasking. If there is a library which is not properly patched or greenlet safe, it can block all tasks.

There are some notes on using eventlet, though we have primarily used gevent. Not all of the features are available on all of the pools (time limits, auto-scaling, built-in rate limiting). Previously gevent seemed to be the better supported secondary pool, but eventlet seems to have closed that gap or surpassed it.

The process and gevent pools can also auto-scale. It is less relevant for the gevent pool since the greenlets are much lighter weight. As noted in the docs, you can implement your own subclass of the Autoscaler to adjust how/when workers are added or removed from the pool.
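
For example, still assuming the django-celery setup above, a dedicated I/O worker on the gevent pool and an auto-scaling process-pool worker could be started with something like the following (queue names and limits are illustrative):

python manage.py celery worker -Q io -P gevent -c 100
python manage.py celery worker -Q celery --autoscale=10,3

Here --autoscale=10,3 lets the process pool grow to ten workers under load and shrink back to three when idle.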

Common Patterns

Task state and coordination is a complex problem. There are no magic solutions whether you are using Celery or your own task framework. The Celery docs have some good best practices which have served us well.

Tasks must assert the state they expect when they are picked up by the worker. You won't know how much time has passed between when the original task was queued and when it executes. Another similar task might have already carried out the operation if there is a backlog.

We make use of a shared cache (Memcache/Redis) to implement task locks or rate limits. This is typically done via a decorator on the task. One example is given in the docs though it is not written as a decorator.
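
A minimal sketch of such a decorator, assuming Django's cache API backed by Memcached or Redis (the names are illustrative, not the exact example from the Celery docs):

from functools import wraps

from django.core.cache import cache

def single_instance_task(timeout=10 * 60):
    """Skip execution if another worker already holds the lock."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            lock_id = "task-lock-%s" % func.__name__
            # cache.add is atomic: it only succeeds if the key is absent
            if not cache.add(lock_id, "locked", timeout):
                return None
            try:
                return func(*args, **kwargs)
            finally:
                cache.delete(lock_id)
        return wrapper
    return decorator

A task decorated with @single_instance_task() simply returns if a concurrent copy is already running, and the timeout guards against locks left behind by a crashed worker.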

Key Choices

When getting started with Celery you must make two main choices:

  • Broker
  • Result store

The broker manages pending tasks, while the result store stores the results of completed tasks.

There is a comparison of the various brokers in the docs.

As previously noted, we use RabbitMQ almost exclusively, though we have used Redis successfully and experimented with SQS. We prefer RabbitMQ because Celery's message passing style and much of the terminology was written with AMQP in mind. There are no caveats with RabbitMQ like there are with Redis, SQS, or the other brokers which have to emulate AMQP features.

The major caveat with both Redis and SQS is the lack of built-in late acknowledgment, which requires a visibility timeout setting. This can be important when you have long running tasks. See acks-late-vs-retry.

To configure the broker, use BROKER_URL.

For the result store, you will need some kind of database. A SQL database can work fine, but using a key-value store can help take the load off of the database, as well as provide easier expiration of old results which are no longer needed. Many people choose to use Redis because it makes a great result store, a great cache server and a solid broker. AMQP backends like RabbitMQ are terrible result stores and should never be used for that, even though Celery supports it.

Results that are not needed should be ignored, using CELERY_IGNORE_RESULT or Task.ignore_result.

To configure the result store, use CELERY_RESULT_BACKEND.
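
Putting those choices together, an illustrative Django settings fragment (hostnames and credentials are placeholders) might read:

# RabbitMQ as the broker, Redis as the result store
BROKER_URL = "amqp://guest:guest@localhost:5672//"
CELERY_RESULT_BACKEND = "redis://localhost:6379/0"
# ignore results globally; individual tasks can opt back in
CELERY_IGNORE_RESULT = True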

RabbitMQ in production

When using RabbitMQ in production, one thing you'll want to consider is memory usage.

With its default settings, RabbitMQ will use up to 40% of the system memory before it begins to throttle, and even then can use much more memory. If RabbitMQ is sharing the system with other services, or you are running multiple RabbitMQ instances, you'll want to change those settings. Read the linked page for details.

Transactions and Django

You should be aware that Django's default handling of transactions can be different depending on whether your code is running in a web request or not. Furthermore, Django's transaction handling changed significantly between versions 1.5 and 1.6. There's not room here to go into detail, but you should review the documentation of transaction handling in your version of Django, and consider carefully how it might affect your tasks.

Monitoring

There are multiple tools available for keeping track of your queues and tasks. I suggest you try some and see which work best for you.

Summary

When going to production with your site that uses Celery, there are a number of decisions to be made that could be glossed over during development. In this post, we've tried to review some of the decisions that need to be thought about, and some factors that should be considered.

Categories: FLOSS Project Planets

Data Community DC: Social Network Analysis with Python Workshop on November 22nd

Planet Python - Mon, 2014-09-29 07:00

Data Community DC and District Data Labs are hosting a full-day Social Network Analysis with Python workshop on Saturday November 22nd.  For more info and to sign up, go to http://bit.ly/1lWFlLx.  Register before October 31st for an early bird discount!

Overview

Social networks are not new, even though websites like Facebook and Twitter might make you want to believe they are; and trust me- I’m not talking about Myspace! Social networks are extremely interesting models for human behavior, whose study dates back to the early twentieth century. However, because of those websites, data scientists have access to much more data than the anthropologists who studied the networks of tribes!

Because networks take a relationship-centered view of the world, the data structures that we will analyze model real world behaviors and community. Through a suite of algorithms derived from mathematical Graph theory we are able to compute and predict behavior of individuals and communities through these types of analyses. Clearly this has a number of practical applications from recommendation to law enforcement to election prediction, and more.

What You Will Learn

In this course we will construct a social network from email communications using Python. We will learn analyses that compute cardinality, as well as traversal and querying techniques on the graph, and even compute clusters to detect community. Besides learning the basics of graph theory, we will also make predictions and create visualizations from our graphs so that we can easily harness social networks in larger data products.

Course Outline

The workshop will cover the following topics:

  • Email Mbox format for conducting analysis

  • Reading emails with Python

  • Creating a graph using NetworkX

  • Serializing and deserializing NetworkX graphs

  • An introduction to Graph theory

  • Finding strong ties through link weighting

  • Computing centrality and key players (celebrities)

  • Finding communities through clustering techniques

  • Visualizing graphs with matplotlib

Upon completion of the course, you will understand how to conduct graph analyses on social networks, as well as have built a library for analyses on a social network constructed from email communications!
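
To give a flavour of what that library involves, here is a minimal, illustrative sketch (not course material; the mbox path is a placeholder) that builds a weighted sender-to-recipient graph with NetworkX and surfaces the most central addresses:

import mailbox
from email.utils import getaddresses, parseaddr

import networkx as nx

G = nx.DiGraph()
for message in mailbox.mbox("inbox.mbox"):
    _, sender = parseaddr(message.get("From", ""))
    for _, recipient in getaddresses(message.get_all("To", [])):
        if sender and recipient:
            # weight each edge by how many messages flow between the pair
            weight = G.get_edge_data(sender, recipient, default={}).get("weight", 0)
            G.add_edge(sender, recipient, weight=weight + 1)

# degree centrality is one simple way to spot the "key players"
top = sorted(nx.degree_centrality(G).items(), key=lambda kv: kv[1], reverse=True)
print(top[:5])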

Instructor: Benjamin Bengfort

Benjamin is an experienced Data Scientist and Python developer who has worked in military, industry, and academia for the past eight years. He is currently pursuing his PhD in Computer Science at The University of Maryland, College Park, doing research in Metacognition and Active Logic. He is also a Data Scientist at Cobrain Company in Bethesda, MD where he builds data products including recommender systems and classifier models. He holds a Masters degree from North Dakota State University where he taught undergraduate Computer Science courses. He is also adjunct faculty at Georgetown University where he teaches Data Science and Analytics.

For more information and to reserve a seat, please go to  http://bit.ly/1lWFlLx.

The post Social Network Analysis with Python Workshop on November 22nd appeared first on Data Community DC.

Categories: FLOSS Project Planets