FLOSS Project Planets

Martin-Éric Racine: xf86-video-geode 2.11.17

Planet Debian - Wed, 2015-05-20 05:46

This morning, I pushed out version 2.11.17 of the Geode X.Org driver. This is the driver used by the OLPC XO-1 and by a plethora of low-power desktops, micro notebooks and thin clients. This is a minor release. It merges conditional support for the OpenBSD MSR device (Marc Ballmer, Matthieu Herrb), fixes a condition that prevents compiling on some embedded platforms (Brian A. Lloyd) and upgrades the code for X server 1.17 compatibility (Maarten Lankhorst).

Pending issues:

  • toggle COM2 into DDC probing mode during driver initialization
  • reset the DAC chip when exiting X and returning to vcons
  • fix a rendering corner case with LibreOffice
Categories: FLOSS Project Planets

Enrico Zini: love-thy-neighbor

Planet Debian - Wed, 2015-05-20 05:35
Love thy neighbor as thyself

‘Love thy neighbor as thyself’, words which astoundingly occur already in the Old Testament.

One can love one’s neighbor less than one loves oneself; one is then the egoist, the racketeer, the capitalist, the bourgeois, and although one may accumulate money and power one does not of necessity have a joyful heart, and the best and most attractive pleasures of the soul are blocked.

Or one can love one’s neighbor more than oneself—then one is a poor devil, full of inferiority complexes, with a longing to love everything and still full of hate and torment towards oneself, living in a hell of which one lays the fire every day anew.

But the equilibrium of love, the capacity to love without being indebted to anyone, is the love of oneself which is not taken away from any other, this love of one’s neighbor which does no harm to the self.

(From Hermann Hesse, "My Belief")

I always have a hard time finding this quote on the Internet. Let's fix that.


Rhonda D'Vine: Berge

Planet Debian - Wed, 2015-05-20 05:21

I wrote well over one year ago about Earthlings. It really did have some impact on my life. Nowadays I try to avoid animal products where possible, especially for my food. And in the context of vegan information that I tracked I stumbled upon a great band from Germany: Berge. They recently started a deal with their record label which says that if they receive one million clicks within the next two weeks on their song 10.000 Tränen their record label is going to donate 10.000,- euros to a German animal rights organization. Reason enough for me to share this band with you! :)
(For those who are puzzled by the original upload date of the video: don't be confused, the call for views went out this Monday.)

  • 10.000 Tränen: This is the song that needs the views. It's a nice tune with great lyrics to think about. Even though it's in German, it has English subtitles. :)
  • Schauen was passiert: In the light of 10.000 Tränen it was hard for me to select other songs, but this one sounds nice. "Let's see what happens". :)
  • Meer aus Farben: I love colors. And I hate the fact that most conference shirts are black only. Or that it seems to be impossible to find colorful clothes and shoes for tall women.

Like always, enjoy!



Jim Birch: Using Drupal's Environment Indicator to help visually manage Dev, Stage, and Production Servers

Planet Drupal - Wed, 2015-05-20 05:00

There are days when I work on half a dozen different websites.  I'm sure some of you are in the same boat.  We make client edits and change requests with rapid efficiency.  We work locally, push to staging, test and review, then push to the live server and repeat.  I would be remiss if I claimed that I had never accidentally made a change on the live or staging site.

The Drupal Environment Indicator module allows you to name, color, and configure a multitude of visual cues for each of your different servers, or for other variables, like Git branch or path.  It is very easy to install, and can integrate with Toolbar, Admin Menu, and Mobile Friendly Navigation Toolbar so that it takes up no additional screen space.

Once it is installed, grant the relevant permission to the roles that should see the indicator.  You can adjust the general settings at /admin/config/development/environment-indicator/settings

While you can create different indicators inside the admin UI, I prefer to set these in the settings.php files on the various servers so they are not overridden when we move databases from Production back to Staging and Dev.



بايثون العربي (Python Arabi): How to Use the Random Module in Python

Planet Python - Wed, 2015-05-20 04:54
Computer programs, and games in particular, are more fun when they contain some randomness. Unfortunately, we have no way to generate truly random numbers reliably. Most programming languages, Python included, nevertheless provide functions that generate pseudo-random numbers: they follow a fixed series of steps to produce numbers that appear random.
It is hard for computers to generate truly random numbers. Doing so requires special hardware and is a complex and expensive process, so we settle for the pseudo-random generation that programming languages offer.

Random numbers in programs let us play games whose future events we cannot predict, because the stages are random.
I have previously talked, briefly, about the randrange function in the Random module; today we will look at some of the other functions in that module.
The Random module gives us access to a large set of functions, the most important of which let us generate random numbers.
When to use the Random module
We need the Random module when we want the computer to pick a number within a given range. It is not limited to numbers: we can also pick random elements from lists, dictionaries and many other things.
Functions of the Random module
As I said before, this module contains many functions that help us in our work, and we will present a useful selection of them.
  • randint
If you want to generate random integers, use the randint function. It takes two values:
the lower bound and the upper bound, both of which are included in the range of random selection. To make this clearer, let's take an example:

To generate a random number from 1 to 5 we write the following code:

import random
print(random.randint(1, 5))
The result will be one of the following numbers: 1, 2, 3, 4, 5
  • random
The random function returns a floating-point number from 0.0 up to (but not including) 1.0. If you want larger numbers, you can use multiplication.
The following example produces a random floating-point number from 0 up to (but not including) 100

import random
random.random() * 100

  • choice
If you want a random value from a list, use the choice function.
A program that uses it can show a different result each time it is run.
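
As a minimal sketch of choice (the list contents and the variable names are illustrative assumptions, not taken from the original post):

```python
import random

# Hypothetical list of options; any non-empty sequence works.
colors = ["red", "green", "blue", "yellow"]

# random.choice returns one element picked at random from the sequence.
picked = random.choice(colors)
print(picked)
```

Running the script several times will usually print different colors, because each run draws a new element.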


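  • shuffle

This function rearranges the elements of a list randomly, in place. Let's take an example to make this clearer:
import random

items = [20, 16, 10, 5]
print(items)

# Reorder the list in place
random.shuffle(items)
print(items)
The second print shows the same elements in a random order.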

  • randrange
This function returns a random value from a predefined range of numbers.

randrange(start, stop, step)
start: the number at which the random selection begins. This number can be among the results.
stop: the number at which the random selection ends. This number is never among the results.
step: the spacing between the candidate values, counted from start.
Let's take an example:

import random

# Pick a random even number from 100 to 998 (steps of 2 starting at 100)
random.randrange(100, 1000, 2)

# Pick a random number from 100, 103, 106, ..., 997 (steps of 3 starting at 100)
random.randrange(100, 1000, 3)


Modules Unraveled: 135 Writing the Book Drupal 8 Configuration Management with Anja Schirwinski and Stefan Borchert - Modules Unraveled Podcast

Planet Drupal - Wed, 2015-05-20 00:40
Published: Tue, 05/19/15
Writing a Book for D8
  • What’s it like writing a book for a piece of software that isn’t even officially released yet?
  • How long did the writing process take?
    • Packt publishing sent us a proposal to write this book in December of 2013. We got started quickly, sending them an outline of the chapters and an estimated page count in the same month. The original estimated page count was 150, it turned out to be around 120. We received a pretty strict time line, having to finish a chapter every two weeks, starting in December of 2013.
    • We managed to finish most chapters in two weeks, but some of the longer ones took a little longer since we also started one of our biggest projects we had had until then, also in January. That was pretty tough because that project took up way more than a regular full time job, so we ended up having to write all of the chapters late at night and on the weekends. In May, all of our chapters then went to the editors and we didn’t hear back from the publisher for a really long time.
    • We also told them that we would have to rewrite a lot of the chapters since there was so much work in progress with the Configuration Management Initiative and they were changing a lot about how it worked, like going from the file based default to the database default. I think it was in January of 2015 when chapters came back with some feedback and we started rewriting every chapter, which was pretty painful at the time. We were able to update some of the documentation at drupal.org with the changes we found. It felt good to contribute at least a small part, since between our project and the book we had no time left to contribute code to Drupal 8 like we usually do.
    • We spent around 40 days on the book between the two of us.
    • In December, Packt asked the first publisher to review the book. We had recommended them one of our team members at undpaul, Tom, who has a similar amount of Drupal knowledge as Stefan. We really wanted to have someone from CMI to review the book, like Greg Dunlap. They had turned down reviewing the book after the first chapters were written, because too much would still change. Then after the changes went in we kept recommending Greg but I never heard anything back, maybe he was busy or they didn’t bother to ask. At the beginning of this year they told us the book was planned to be published by March. We recommended waiting because we didn’t expect a release candidate before the European Drupalcon and we would have rather had someone like Greg take the time to review, but Packt had another opinion :) Since most of CMI changes were finished, we didn’t feel too uncomfortable about the time of publishing, and it was also kind of nice to finally be done with this thing :) So it took a little over a year from start to finish. It was published on March 24th.
  • Do you expect to need to rewrite anything between now and when 8.0 is released?
The Book: Drupal 8 Configuration Management
  • What do you cover in the book?
    • We start off with a basic introduction to what Configuration Management in Drupal means, because the term exists in software development in general, where it doesn’t usually mean what it does in Drupal: here it basically just means that configuration is saved in files, which makes deployment easier. In the first chapters, we make sure the reader understands what Configuration Management means and why version control is so important. We mention some best practices and then show how to use it for non-coders as well, since there’s a nice backend non-technical folks can use, even if you don’t use version control (which of course we don’t recommend). We also have a part that describes how managing configuration works in Drupal 7 (Features!) and then dive into code examples, explaining schema files, showing how to add configuration to a custom module, how to upgrade Drupal 7 variables to the new system and cover configuration management for multilingual sites.
  • Who is the target audience of the book?
  • Why did you decide to write about Configuration Management?
    • We have used Features to deploy configuration changes for a very long time, I don’t recall not using it since we started the company 5 years ago. We have talked about it at several DrupalCamps and Drupal User Groups and always tried to convince everyone to use it. We were really excited about the Configuration Management Initiative and thought it was a very good fit for us.
  • Before we started recording, you mentioned that there is a companion website to the book. Can you talk about what content we’ll find there, and what purpose that serves?
  • Are you building any sites in D8 at Undpaul?
Episode Links:
  • Anja on drupal.org
  • Anja on Twitter
  • Stefan on drupal.org
  • Stefan on Twitter
  • Where to buy the book
  • The website for the book
  • undpaul on Twitter
  • undpaul Instagram
  • undpaul website
Tags: Drupal 8, Book, planet-drupal

A. Jesse Jiryu Davis: Server Discovery And Monitoring In PyMongo, Perl, And C

Planet Python - Tue, 2015-05-19 23:09

(Cross-posted from the MongoDB Blog.)

How does a MongoDB driver discover and monitor a single server, a set of mongos servers, or a replica set? How does it determine what types of servers they are? How does it keep this information up to date? How does it discover an entire replica set given an initial host list, and how does it respond to stepdowns, elections, reconfigurations, network error, or the loss of a server?

In the past each MongoDB driver answered these questions a little differently, and mongos differed a little from the drivers. We couldn't answer questions like, "Once I add a secondary to my replica set, how long does it take for the driver to start using it?" Or, "How does a driver detect when the primary steps down, and how does it react?"

To standardize our drivers, I wrote the Server Discovery And Monitoring Spec, with David Golden, Craig Wilson, Jeff Yemin, and Bernie Hackett. Beginning with this spring's next-generation driver releases, all our drivers conform to the spec and answer these questions the same. Or, where there's a legitimate reason for them to differ, there are as few differences as possible and each is clearly explained in the spec. Even in cases where several answers seem equally good, drivers agree on one way to do it.

The spec describes how a driver monitors a topology:

Topology: The state of your deployment. What type of deployment it is, which servers are available, and what type of servers (mongos, primary, secondary, ...) they are.

The spec covers all MongoDB topologies, but replica sets are the most interesting. So I'll explain the spec's algorithm for replica sets by telling the story of your application as it passes through life stages: it starts up, discovers a replica set, and reaches a steady state. Then there is a crisis—I spill coffee on your primary server's motherboard—and a resolution—the replica set elects a new primary and the driver discovers it.

At each stage we'll observe a typical multi-threaded driver, PyMongo 3.0, a typical single-threaded driver, the Perl Driver 1.0, and a hybrid, the C Driver 1.2. (I implemented PyMongo's server discovery and monitoring. David Golden wrote the Perl version, and Samantha Ritter and Jason Carey wrote the one in C.)

To conclude, I'll tell you our strategy for verifying spec compliance in ten programming languages, and I'll share links for further reading.


Startup

When your application initializes, it creates a MongoClient. In Python:

from pymongo import MongoClient
client = MongoClient('mongodb://hostA,hostB/?replicaSet=my_rs')

In Perl:

my $client = MongoDB::MongoClient->new({ host => "mongodb://hostA,hostB/?replicaSet=my_rs" });

In C, you can either create a client directly:

mongoc_client_t *client = mongoc_client_new ("mongodb://hostA,hostB/?replicaSet=my_rs");

Or create a client pool:

mongoc_client_pool_t *pool = mongoc_client_pool_new ("mongodb://hostA,hostB/?replicaSet=my_rs");
mongoc_client_t *client = mongoc_client_pool_pop (pool);

A crucial improvement of the next gen drivers is that the constructor no longer blocks while it makes the initial connection. Instead, the constructor does no network I/O. PyMongo launches a background thread per server (two threads in this example) to initiate discovery, and returns control to your application without blocking. Perl does nothing until you attempt an operation; then it connects on demand.

In the C Driver, if you create a client directly it behaves like the Perl Driver: it connects on demand, on the main thread. But the C Driver's client pool launches one background thread to discover and monitor all servers.

The spec's "no I/O in constructors" rule is a big win for web applications that use our next gen drivers: In a crisis, your app servers might be restarted while your MongoDB servers are unreachable. Your application should not throw an error at startup, when it constructs the client object. It starts up disconnected and tries to reach your servers until it succeeds.


Discovery

The initial host list you provide is called the "seed list":

Seed list: The initial list of server addresses provided to the MongoClient.

The seed list is the stepping-off point for the driver's journey of discovery. As long as one seed is actually an available replica set member, the driver will discover the whole set and stay connected to it indefinitely, as described below. Even if every member of the set is replaced with a new host, like the Ship of Theseus, it is still the same replica set and the driver remains connected to it.

I tend to think of a driver as a tiny economy of information about your topology. Monitoring supplies information, and your application's operations demand information. Their demands are defined in David Golden's Server Selection Spec, while the method of supplying information is defined here, in the Server Discovery And Monitoring Spec. In the beginning, there is no information, and the monitors rush to supply some. I'll talk more about the demand side later, in the "Crisis" section.


Multi-threaded

Let's start with PyMongo. In PyMongo, like other multi-threaded drivers, the MongoClient constructor starts one monitor thread each for "hostA" and "hostB".

Monitor: A thread or async task that occasionally checks the state of one server.

Each monitor connects to its assigned server and executes the "ismaster" command. Ignore the command's archaic name, which dates from the days of master-slave replication, long superseded by replica sets. The ismaster command is the client-server handshake. Let's say the driver receives hostB's response first:

ismaster = {
    "setName": "my_rs",
    "ismaster": false,
    "secondary": true,
    "hosts": [
        "hostA:27017",
        "hostB:27017",
        "hostC:27017"]}

hostB confirms it belongs to your replica set, informs you that it is a secondary, and lists the members in the replica set config. PyMongo sees a host it didn't know about, hostC, so it launches a new thread to connect to it.

If your application threads are waiting to do any operations with the MongoClient, they block while awaiting discovery. But since PyMongo now knows of a secondary, if your application is waiting to do a secondary read, it can now proceed:

from pymongo import ReadPreference
db = client.get_database("dbname", read_preference=ReadPreference.SECONDARY)

# Unblocks when a secondary is found.
db.collection.find_one()

Meanwhile, discovery continues. PyMongo waits for ismaster responses from hostA and hostC. Let's say hostC responds next, and its response includes "ismaster": true:

ismaster = {
    "setName": "my_rs",
    "ismaster": true,
    "secondary": false,
    "hosts": [
        "hostA:27017",
        "hostB:27017",
        "hostC:27017"]}

Now PyMongo knows the primary, so all reads and writes are unblocked. PyMongo is still waiting to hear back from hostA; once it does, it can use hostA for secondary reads as well.
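
The discovery steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not PyMongo's actual implementation: the `Topology` class and its `on_ismaster` and `has_primary` methods are hypothetical names, and the sketch ignores network errors, removed hosts and most of the spec's edge cases.

```python
class Topology:
    """Toy model of a driver's view of a replica set (hypothetical names)."""

    def __init__(self, seeds, set_name):
        self.set_name = set_name
        # Every seed starts out with an unknown server type.
        self.servers = {address: "Unknown" for address in seeds}

    def on_ismaster(self, address, response):
        """Record one server's reply and return any newly discovered hosts."""
        if response.get("setName") != self.set_name:
            return []  # Not a member of our set; this sketch just ignores it.
        if response.get("ismaster"):
            self.servers[address] = "RSPrimary"
        elif response.get("secondary"):
            self.servers[address] = "RSSecondary"
        else:
            self.servers[address] = "RSOther"
        new_hosts = [h for h in response.get("hosts", []) if h not in self.servers]
        for host in new_hosts:
            # A multi-threaded driver would start a new monitor here.
            self.servers[host] = "Unknown"
        return new_hosts

    def has_primary(self):
        return "RSPrimary" in self.servers.values()


hosts = ["hostA:27017", "hostB:27017", "hostC:27017"]
topology = Topology(["hostA:27017", "hostB:27017"], "my_rs")

# hostB answers first: it is a secondary and reveals hostC.
discovered = topology.on_ismaster("hostB:27017", {
    "setName": "my_rs", "ismaster": False, "secondary": True, "hosts": hosts})
print(discovered, topology.has_primary())

# hostC answers next: it is the primary, so writes can proceed.
topology.on_ismaster("hostC:27017", {
    "setName": "my_rs", "ismaster": True, "secondary": False, "hosts": hosts})
print(topology.has_primary())
```

The point of the sketch is that each reply updates the shared view independently, so an operation that only needs a secondary can proceed after the first reply, while one that needs the primary waits for the second.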


Single-threaded

Multithreaded Perl code is problematic, so the Perl Driver doesn't launch a thread per host. How, then, does it discover your set? When you construct a MongoClient it does no I/O. It waits for you to begin an operation before it connects. Once you do, it scans the hosts serially, initially in random order.

Scan: A single-threaded driver's process of checking the state of all servers.

Let's say the driver begins with hostB, a secondary. Here's a detail I didn't show you earlier: replica set members tell you who they think the primary is. HostB's reply includes "primary": "hostC:27017":

ismaster = {
    "setName": "my_rs",
    "ismaster": false,
    "secondary": true,
    "primary": "hostC:27017",
    "hosts": [
        "hostA:27017",
        "hostB:27017",
        "hostC:27017"]}

The Perl Driver uses this hint to put hostC next in the scan order, because connecting to the primary is its top priority. It checks hostC and confirms that it's primary. Finally, it checks hostA to ensure it can connect, and discovers that hostA is another secondary. Scanning is now complete and the driver proceeds with your application's operation.
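
The serial scan with its primary hint can be sketched as follows. This is written in Python rather than Perl and is only an illustration: `serial_scan`, `fake_check` and `HOSTS` are hypothetical names, and `fake_check` stands in for a real network round trip. The demo seeds only hostB so that the resulting order is deterministic.

```python
import random


def serial_scan(seeds, check):
    """Check servers one at a time, moving a hinted primary to the front.

    `check(address)` is a caller-supplied function that returns an
    ismaster-style dict; both names are hypothetical.
    """
    to_check = list(seeds)
    random.shuffle(to_check)  # The initial order is random.
    order, seen = [], set()
    while to_check:
        address = to_check.pop(0)
        if address in seen:
            continue
        seen.add(address)
        order.append(address)
        reply = check(address)
        # Queue hosts we had not heard of yet.
        for host in reply.get("hosts", []):
            if host not in seen and host not in to_check:
                to_check.append(host)
        # A hint about the primary jumps the queue.
        hint = reply.get("primary")
        if hint and hint not in seen:
            if hint in to_check:
                to_check.remove(hint)
            to_check.insert(0, hint)
    return order


HOSTS = ["hostA:27017", "hostB:27017", "hostC:27017"]


def fake_check(address):
    """Stand-in for a network round trip; hostC is the primary."""
    return {"setName": "my_rs", "ismaster": address == "hostC:27017",
            "primary": "hostC:27017", "hosts": HOSTS}


scan_order = serial_scan(["hostB:27017"], fake_check)
print(scan_order)
```

Whichever seed the scan happens to start with, the hinted primary is checked second, which is the behavior the paragraph above describes.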


Hybrid

The C driver has two modes for server discovery and monitoring: single-threaded and pooled. Single-threaded mode is optimized for embedding the C Driver within languages like PHP: PHP applications deploy many single-threaded processes connected to MongoDB. Each process uses the same connections to scan the topology as it uses for application operations, so the total connection count from many processes is kept to a minimum.

Other applications should use pooled mode: as we shall see, in pooled mode a background thread monitors the topology, so the application need not block to scan it.

C Driver's single-threaded mode

The C driver scans servers on the main thread, if you construct a single client:

mongoc_client_t *client = mongoc_client_new ("mongodb://hostA,hostB/?replicaSet=my_rs");

In single-threaded mode, the C Driver blocks to scan your topology periodically with the main thread, just like the Perl Driver. But unlike the Perl Driver's serial scan, the C Driver checks all servers in parallel. Using a non-blocking socket per member, it begins a check on each member concurrently, and uses the asynchronous "poll" function to receive events from the sockets, until all have responded or timed out. The driver updates its topology as ismaster calls complete. Finally it ends the scan and returns control to your application.

Whereas the Perl Driver's topology scan lasts for the sum of all server checks (including timeouts), the C Driver's topology scan lasts only the maximum of any one check's duration, or the connection timeout setting, whichever is shorter. Put another way, in single-threaded mode the C Driver fans out to begin all checks concurrently, then fans in once all checks have completed or timed out. This "fan out, fan in" topology scanning method gives the C Driver an advantage scanning very large replica sets, or sets with several high-latency members.
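
The difference between the two scan durations is simple arithmetic, sketched here in Python. The function names and the example round-trip times are assumptions made for illustration; an unreachable server is modeled as an infinite duration that gets cut off by the timeout.

```python
def serial_scan_seconds(check_durations, timeout):
    """Serial scan: each check starts after the previous one finishes."""
    return sum(min(d, timeout) for d in check_durations)


def parallel_scan_seconds(check_durations, timeout):
    """Fan out, fan in: all checks run at once, so the slowest one dominates."""
    return max(min(d, timeout) for d in check_durations)


# Hypothetical round-trip times in seconds; the last two servers never answer.
durations = [0.02, 0.15, float("inf"), float("inf")]
connect_timeout = 10.0

serial = serial_scan_seconds(durations, connect_timeout)
parallel = parallel_scan_seconds(durations, connect_timeout)
print(serial, parallel)
```

With two unreachable members the serial scan pays the timeout twice, while the concurrent scan pays it once, and the gap widens with every additional slow or unreachable member.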

C Driver's pooled mode

To activate the C Driver's pooled mode, make a client pool:

mongoc_client_pool_t *pool = mongoc_client_pool_new ("mongodb://hostA,hostB/?replicaSet=my_rs");
mongoc_client_t *client = mongoc_client_pool_pop (pool);

The pool launches one background thread for monitoring. When the thread begins, it fans out and connects to all servers in the seed list, using non-blocking sockets and a simple event loop. As it receives ismaster responses from the servers, it updates its view of your topology, the same as a multi-threaded driver like PyMongo does. When it discovers a new server it begins connecting to it, and adds the new socket to the list of non-blocking sockets in its event loop.

As with PyMongo, when the C Driver is in background-thread mode, your application's operations are unblocked as soon as monitoring discovers a usable server. For example, if your C code is blocked waiting to insert into the primary, it is unblocked as soon as the primary is discovered, rather than waiting for all secondaries to be checked too.

Steady State

Once the driver has discovered your whole replica set, it periodically re-checks each server. The periodic check is necessary to keep track of your network latency to each server, and to detect when a new secondary joins the set. And in some cases periodic monitoring can head off errors, by proactively discovering when a server is offline.

By default, the monitor threads in PyMongo check their servers every ten seconds, as does the C Driver's monitor in background-thread mode. The Perl driver, and the C Driver in single-threaded mode, block your application to re-scan the replica set once per minute.

If you like my supply-and-demand model of a driver, the steady state is when your application's demand for topology information is satisfied. The driver occasionally refreshes its stock of information to make sure it's ready for future demands, but there is no urgency.


Crisis

So I wander into your data center, swirling my cappuccino, and I stumble and spill it on hostC's motherboard. Now your replica set has no primary. What happens next?

When your application next writes to the primary, it gets a socket timeout. Now it knows the primary is gone. Its demand for information is no longer in balance with supply. The next attempt to write blocks until a primary is found.

To meet demand, the driver works overtime. How exactly it responds to the crisis depends on which type of monitoring it uses.

Multi-threaded: In drivers like PyMongo, the monitor threads wait only half a second between server checks, instead of ten seconds. They want to know as soon as possible if the primary has come back, or if one of the secondaries has been elected primary.

Single-threaded: Drivers like the Perl Driver sleep half a second between scans of the topology. The application's write operation remains blocked until the driver finds the primary.

C Driver Single-Threaded: In single-threaded mode, the C Driver sleeps half a second between scans, just like the Perl Driver. During the scan the driver launches non-blocking "ismaster" commands on all servers concurrently, as I described above.

C Driver Pooled Mode: Each time the driver's monitor thread receives an ismaster response, it schedules that server's next ismaster call on the event loop only a half-second in the future.


Resolution

Your secondaries, hostA and hostB, promptly detect my sabotage of hostC, and hold an election. In MongoDB 3.0, the election takes just a couple seconds. Let's say hostA becomes primary.

A half second or less later, your driver rechecks hostA and sees that it is now the primary. It unblocks your application's writes and sends them to hostA. In PyMongo, the monitor threads relax, and return to their slow polling strategy: they sleep ten seconds between server checks. Same for the C Driver's monitor in background-thread mode. The Perl Driver, and the C Driver in single-threaded mode, do not rescan the topology for another minute. Demand and supply are once again in balance.
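
Why the faster polling matters can be shown with a little arithmetic. This is a rough sketch under simplifying assumptions: checks are taken to run at whole multiples of the interval, counted from the moment the old primary is lost, and the 2.2-second election time and all names are made up for illustration.

```python
import math

STEADY_INTERVAL = 10.0  # seconds between checks in the steady state
CRISIS_INTERVAL = 0.5   # seconds between checks while a write is blocked


def seconds_until_rediscovery(election_seconds, check_interval):
    """Time of the first check that runs at or after the election finishes.

    Assumes checks run at whole multiples of `check_interval`, counted from
    the moment the old primary was lost.
    """
    checks_needed = max(1, math.ceil(election_seconds / check_interval))
    return checks_needed * check_interval


election = 2.2  # hypothetical election duration in seconds
fast = seconds_until_rediscovery(election, CRISIS_INTERVAL)
slow = seconds_until_rediscovery(election, STEADY_INTERVAL)
print(fast, slow)
```

Under these assumptions the half-second interval finds the new primary within half a second of the election, whereas the steady-state interval would leave the blocked write waiting until the next ten-second check.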

Compliance Testing

I am particularly excited about the unit tests that accompany the Server Discovery And Monitoring Spec. We have 38 tests that are specified formally in YAML files, with inputs and expected outcomes for a range of scenarios. For each driver we write a test runner that feeds the inputs to the driver and verifies the outcome. This ends confusion about what the spec means, or whether all drivers conform to it. You can track our progress toward full compliance in MongoDB's issue tracker.

Further Study

The spec is long but tractable. It explains the monitoring algorithm in very fine detail, and a summary is available alongside the spec itself.

Its job is to describe the supply side of the driver's information economy. For the demand side, read my colleague David Golden's article on his Server Selection Spec.


LaKademy 2015 – here we go!

Planet KDE - Tue, 2015-05-19 22:10

Hi there,

Everything is ready for the 3rd edition of LaKademy – The KDE Latin America Summit \o/. The meeting will take place from 3 to 6 June 2015 in Salvador, north-eastern Brazil. Besides being the city where I live :), it was the venue of the 1st Akademy-BR in 2010, when we began our efforts to create and then expand the culture of KDE hacking sprints in Brazil and, later, in Latin America. Hence, we now have that cosy feeling of returning to grandma’s house for a portion of home-made cookies :). This year, we decided to have only hacking sessions and quick meetings, rather than talks and/or introductory short courses. We want to leverage contributions and get more things done during these four nice days of LaKademy 2015. We are by no means closed to newcomers, though. The LaKademy 2015 Call for Participation has already been announced, and everyone interested in knowing more about KDE contributions may join us at the hacking sessions, ask questions, get involved, and have fun.

For these four days, seven KDE contributors (and, hopefully, some visitors) will meet at the Information Technology Offices of the Federal University of Bahia. We are still settling the details of the program, but I would like to revisit some stuff I’ve done for KDevelop in the past, Filipe should keep working on Cantor enhancements, Lamarque on Plasma Network Manager, and Aracele on translation and promo stuff. As usual, we will also have a promo meeting involving all participants, where we set the plans for conquering the world with KDE :).

Stay tuned for upcoming news about LaKademy 2015! See you …


Norbert Preining: Shishiodoshi or Us and They – On the perceived exclusivity of Japanese

Planet Debian - Tue, 2015-05-19 21:01

The other day I received from my Japanese teacher an interesting article by Yamazaki Masakazu 山崎正和 comparing garden styles, and in particular the attitude towards and presentation of water in Japanese and European gardens (page 1, page 2). The author’s list of achievements is long: various professorships, dramatist, literary critic, recognized as a Person of Cultural Merit, just to name a few. I was looking forward to an interesting and high quality article!

The article itself introduces the reader to 鹿おどし Shishiodoshi, one of the standard ingredients of a Japanese garden: It is a device where water drips into a bamboo tube that is also a seesaw. At some point the water in the bamboo tube makes the seesaw switch over and the water pours out, after which the seesaw returns to the original position and a new cycle begins.

The author describes his feelings and thoughts about the shishiodoshi, and in particular connects human life (stress and relief, cycles), the flow of time, and some other concepts with the shishiodoshi. Up to here it is a wonderful article providing interesting insights into the way the author thinks. Unfortunately, the author then tries to underline his ideas by comparing the Japanese shishiodoshi with European-style water fountains, describing the former as having every favorable property and being full of deep meaning, while the latter is characterized as beautiful and nice, but devoid of any deeper meaning.

I won’t go into detail about why the comparison itself is a poor one: he considers only Baroque-style fountains, which belong to a very limited period; he ignores the fact that water fountains are not exclusively European (isn’t there one in the Kenrokuen, one of the three most famous gardens in Japan!?); and he does not consider any other “water installation” that might be available. What really spoils an article that is in principle very nice is the tone:

The general tone of the article can be summarized as: “The shishiodoshi is rich in meaning, connects to the life of humans, instigates philosophical reflections, represents nature, the flow of time etc. The water fountain is beautiful and gorgeous, but that is all.”

I don’t think that this separation, or this negative undertone, was created on purpose by the author. A person of his stature is supposedly above this level of primitive comparison. I believe that it is nothing else but a consequence of upbringing and the general attitude that permeates the whole society with this feeling of separateness.

Us and They

The article repeatedly offers sentences like “Japanese people and Western people have different tastes..” (日本人は西洋人と違った独特の好みを持っていたのである). Expressions like “Japanese” and “Westerner” appear about 10 times in this short article, leaving the reader with the bitter taste of an author who considers, first, the Japanese to be one people (what about the Ainu, the Ryukyuans, etc.?), and second, the Japanese to be exclusive, in the sense that they are set apart from the rest of the world in their way of thinking, living, being.

What puzzles me is that this is not a singular opinion, but a very general trait of the Japanese media: TV, radio, newspapers, books. Everyone considers “Japan” and “Japanese” as something that is fundamentally and profoundly different from everyone else in the world.

There is “We – the Japanese” (and that doesn’t mean nationality by passport, but blood line!), and there are “They – the Rest” or, as the way of writing and describing suggests on many occasions, “They – the Barbarians”.

A short anecdote will underline this: one of the favorite TV talk show / pseudo-documentary formats is about Japanese people living abroad. That day a lady married in Paris was interviewed. What followed was a guided tour through Paris showing dirt in the gutter, street cleaning cars, and waste disposal places. Yes, that was all. Just the “dirt”. Of course, the (unfortunately only apparent) cleanliness of Japanese cities and neighborhoods was mentioned and shown at length, to remind everyone how wonderful Japan is and how dirty the Barbarians are. I don’t want to say that I consider Japan dirtier than most other countries; it is just that only the visible part is clean. The moment you step a bit aside and around the corner, there is the worst trash, thrown away without consideration. Anyway.

To return to the topic of “Us and They” – I consider us all humans, first and foremost, and nationality, birthplace, and all that are just by chance. I do NOT reject cultural differences, they are here, of course. But cultural differences are one thing, separating oneself and one’s perceived people from the rest of the world is another.


I repeat, I don’t think that the author had any ill intentions, but it would have been nicer if the article hadn’t made such a stark distinction. He could have written about shishiodoshi and water fountains without using the “Us – They” categorization. He could have compared other water installations, could have discussed the long tradition of small ponds in European gardens, just to name a few things. But the author chose to highlight differences instead of commonalities.

It is the “Us against Them” feeling that often makes life in Japan for a foreigner difficult. Japanese are not special, Austrians, too, are not special, nor are Americans, Russians, Tibetans, or any other nationality. No nationality is special, we are all humans. Maybe at some point this will arrive also in the Japanese society and thinking.

Categories: FLOSS Project Planets

Gunnar Wolf: Feeling somewhat special

Planet Debian - Tue, 2015-05-19 19:36

Today I feel more special than I have ever felt.

Or... Well, or something like that.

Thing is, there is no clear adjective for this — But I successfully finished my Specialization degree! Yes, believe it or not, today I can formally say I am Specialist in Informatic Security and Information Technologies (Especialista en Seguridad Informática y Tecnologías de la Información), as awarded by the Higher School of Electric and Mechanic Engineering (Escuela Superior de Ingeniería Mecánica y Eléctrica) of the National Polytechnical Institute (Instituto Politécnico Nacional).

In Mexico and most Latin American countries, degrees are usually incorporated into your name as if they were a title of nobility. Thus, when graduating from Engineering studies (undergraduate university level), I became "Ingeniero Gunnar Wolf". People graduating from further postgraduate programs get to introduce themselves as "Maestro Foobar Baz" or "Doctor Quux Noox". And yes, a Specialization is a small postgraduate program (I often say, the smallest possible postgraduate). And as a Specialist... What can I brag about? Can I say I am Specially Gunnar Wolf? Or Special Gunnar Wolf? Nope. The honorific title for a Specialization is a pointer to null, and when cast into a char* it might corrupt your honor-recognizing function. So I'm still Ingeniero Gunnar Wolf, for information security reasons.

So that's the reason I am now enrolled in the Masters program. I hope to write an addendum to this message soonish (where soonish ≥ 18 months) saying I'm finally a Maestro.

As a sidenote, many people asked me: Why did I take on the specialization, which is a degree too small for most kinds of real work recognition? Because it's been around twenty years since I last attended a long-term academic program as a student. And my plate is quite full with activities and responsibilities. I decided to take a short program, designed for 12 months (I graduated in 16, minus two months that the university was on strike... Quite good, I'd say ;-) ) to see how I fared on it, and only later jump into the full version.

Because, yes, to advance my career at the university, I finally recognized and understood that I do need postgraduate studies.

Oh, and what kind of work did I do for this? Besides the classes I took, I wrote a thesis on a model for evaluating covert channels for establishing secure communications.

Categories: FLOSS Project Planets

Graham Dumpleton: Returning a string as the iterable from a WSGI application.

Planet Python - Tue, 2015-05-19 17:07
The possible performance consequences of returning many separate data blocks from a WSGI application were covered in the previous post. In that post the WSGI application used as an example was one which returned the contents of a file as many small blocks of data. Part of the performance problems seen arose due to how the WSGI servers would flush each individual block of data out, writing it onto
Categories: FLOSS Project Planets

Gizra.com: Visual regression tests on every commit

Planet Drupal - Tue, 2015-05-19 17:00

As we dive deeper into visual regression testing in our development workflow we realize a sad truth: on average, we break our own CSS every week and a half.

Don't feel bad for us, as in fact I'd argue that it's pretty common across all web projects - they just don't know it. It seems we all need a system that will tell us when we break our CSS.

While we don't know of a single (good) system that does this, we were able to connect together a few (good) systems to get just that, with the help of: Travis-CI, webdriverCSS, Shoov.io, BrowserStack/Sauce Labs, and ngrok. Oh my!

Don't be alarmed by the long list. Each one of these does one thing very well, and combining them proved to be neither too complicated nor too costly.

You can jump right into the .travis.yml file of the Gizra repo to see its configuration, or check the webdriverCSS test. Here's the high level overview of what we're doing:

Gizra.com is built on Jekyll but visual regression could be executed on every site, regardless of the underlying technology. Travis is there to help us build a local installation. Travis also allows adding encrypted keys, so even though the repo is public, we were able to add our Shoov.io and ngrok access tokens in a secure way.

We want to use services such as BrowserStack or Sauce Labs to test our local installation on different browsers (e.g. the latest Chrome and IE11). For that we need to have an external URL accessible by the outside world, which is where ngrok comes in: ngrok http -log=stdout -subdomain=$TRAVIS_COMMIT 9000 from the .travis.yml file exposes our Jekyll site inside the Travis box to a unique temporary URL based on the Git commit (e.g. https://someCommitHash.ngrok.io).

WebdriverCSS tests are responsible for capturing the screenshots and comparing them against the baseline images. If a regression is found, it is automatically pushed to Shoov, and a link to the regression is provided in the Travis log. This means that if a test breaks, we can immediately see where the regression is and figure out if it is indeed a bug - or, if not, replace the baseline image with the "regression" image.
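
The comparison step can be pictured with a small Python sketch. This is not WebdriverCSS's actual algorithm, only an illustration under simple assumptions: screenshots are equal-sized grids of RGB tuples, and the names `mismatch_percent`, `is_regression` and `MISMATCH_TOLERANCE` are invented here for the example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]

# Hypothetical threshold: percentage of differing pixels we still accept.
MISMATCH_TOLERANCE = 0.05


def mismatch_percent(baseline: Image, current: Image) -> float:
    """Return the percentage of pixels that differ between two images."""
    if len(baseline) != len(current) or any(
        len(a) != len(b) for a, b in zip(baseline, current)
    ):
        raise ValueError("images must have the same dimensions")
    total = sum(len(row) for row in baseline)
    if total == 0:
        return 0.0
    differing = sum(
        1
        for row_a, row_b in zip(baseline, current)
        for pixel_a, pixel_b in zip(row_a, row_b)
        if pixel_a != pixel_b
    )
    return 100.0 * differing / total


def is_regression(baseline: Image, current: Image) -> bool:
    """Flag the screenshot when the mismatch exceeds the tolerance."""
    return mismatch_percent(baseline, current) > MISMATCH_TOLERANCE
```

A real tool would also have to decide what to do with anti-aliasing noise and differing image sizes; the sketch simply rejects mismatched dimensions.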

Visual regression found and uploaded to shoov.io


Categories: FLOSS Project Planets

Mediacurrent: Contrib Committee Status Review for April, 2015

Planet Drupal - Tue, 2015-05-19 16:47

The fourth month of the year brought reminders that Winter can show up at unexpected times, with snow flurries during the early parts of the month. It also reminded us that we can only juggle so much. With many of us involved in organizing regional events and preparing for Drupalcon, our code contributions waned for a second month, down to a rather low 20 hours.

Categories: FLOSS Project Planets

Drupalpress, Drupal in the Health Sciences Library at UVA: executing an r script with bash

Planet Drupal - Tue, 2015-05-19 16:43

Here’s a tangent:

Let’s say you need to randomly generate a series of practice exam questions. You have a bunch of homework assignments, lab questions and midterms, all of which are numbered in a standard way so that you can sample from them.

Here’s a simple R script to run those samples and generate a practice exam that consists of references to the assignments and their original numbers.

    ## exam prep script

    ## build hw data
    j <- 1
    hw <- data.frame(hw_set = NA, problm = seq(1:17))
    for (i in seq(1:12)) {
      hw[j,1] <- paste0("hw",j)
      j <- j+1
    }
    library(tidyr)
    hw <- expand(hw)
    names(hw) <- c("problm_set", "problm")

    ## build exam data
    j <- 1
    exam <- data.frame(exam_num = NA, problm = seq(1:22))
    for (i in seq(1:8)) {
      exam[j,1] <- paste0("exam",j)
      j <- j+1
    }
    library(tidyr)
    exam <- expand(exam)
    names(exam) <- c("problm_set", "problm")

    ## create practice exam
    prctce <- rbind(exam,hw)
    prctce_test <- prctce[sample(1:nrow(prctce), size=22),]
    row.names(prctce_test) <- 1:nrow(prctce_test)
    print(prctce_test)

As the last line indicates, the final step of the script is to output a prctce_test … that will be randomly generated each time the script is run, but may include duplicates over time.
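
For readers who don't use R, the same sampling idea might look roughly like this in Python. This is a sketch based on my reading of the script's intent (12 homework sets of 17 problems, 8 exams of 22 problems, 22 questions drawn), not a line-by-line port; `build_pool` and `practice_exam` are names made up for the example.

```python
import random
from itertools import product
from typing import List, Optional, Tuple

Problem = Tuple[str, int]


def build_pool(prefix: str, sets: int, problems: int) -> List[Problem]:
    """Every (set name, problem number) pair, e.g. ("hw3", 7)."""
    return [
        (f"{prefix}{s}", p)
        for s, p in product(range(1, sets + 1), range(1, problems + 1))
    ]


def practice_exam(size: int = 22, seed: Optional[int] = None) -> List[Problem]:
    """Sample `size` distinct problems from the combined exam and hw pool."""
    pool = build_pool("exam", 8, 22) + build_pool("hw", 12, 17)
    return random.Random(seed).sample(pool, size)


if __name__ == "__main__":
    for number, (problem_set, problem) in enumerate(practice_exam(), start=1):
        print(number, problem_set, problem)
```

Within one run the sample has no repeats, because it is drawn without replacement; across runs the same problem can of course come up again, which matches the "duplicates over time" remark.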

Sure. Fine. Whatever.

Probably a way to do this with Drupal … or with Excel … or with a pencil and paper … why use R?

Two reasons: 1) using R to learn R and 2) scripting this simulation lets you automate things a little more easily.

In particular, you can use something like BASH to execute the script n number of times.

for n in {1..10}; do Rscript examprep.R > "YOUR_PATH_HERE/practice${n}.txt"; done

That will give you 10 practice test txt files that are all named with a tokenized number, with just one command. And of course that could be written into a shell script that’s automated or processed on a scheduler.

Sure. Fine. Whatever.

OK. While this is indeed a fairly underwhelming example, the potential here is kind of interesting. Our next step is to investigate using Drupal Rules to initiate a BASH script that in turn executes an algorithm written in R. The plan is to also use Drupal as the UI for entering the data to be processed in the R script.

Will document that here if/when that project comes together.

Categories: FLOSS Project Planets

Lars Wirzenius: Software development estimation

Planet Debian - Tue, 2015-05-19 15:50

Acceptable estimations for software development:

  • Almost certainly doable in less than a day.
  • Probably doable in less than a day, almost certainly not going to take more than three days.
  • Probably doable in less than a week, but who knows?
  • Certainly going to take longer than a week, and nobody can say how long, but if you press me, the estimate is between two weeks and four months.

Reality prevents better accuracy.

Categories: FLOSS Project Planets

Timothy Potter: Integrating Storm and Solr

Planet Apache - Tue, 2015-05-19 14:13
In this post I introduce a new open source project provided by Lucidworks for integrating Solr and Storm. Specifically, I cover features such as micro-buffering, data mapping, and how to send custom JSON documents to Solr from Storm. I assume you have a basic understanding of how Storm works, but if you need a quick refresher, please review the Storm concepts documentation.

As you read through this post, it will help to have the project source code on your local machine. After cloning https://github.com/LucidWorks/storm-solr, simply do:

    mvn clean package

This will create the unified storm-solr-1.0.jar in the target/ directory for the project.

The project discussed here started out as a simple bolt for indexing documents in Solr. My first pass at creating a Solr bolt was quite simple, but then a number of questions came up that made my simple bolt not quite so simple. For instance, how do I …
  • Separate my application business logic from Storm boilerplate code?
  • Unit test application logic in my bolts and spouts?
  • Run a topology locally while developing?
  • Configure my Solr bolt to specify environment-specific settings like the ZooKeeper connection string needed by SolrCloud?
  • Package my topology into something that can be deployed to a Storm cluster?
  • Measure the performance of my components at runtime?
  • Integrate with other services and databases when building a real-world topology?
  • Map Tuples in my topology to a format that Solr can process?

This is just a small sample of the types of questions that arise when building a high-performance streaming application with Storm. I quickly realized that I needed more than just a Solr bolt. Hence, the project evolved into a toolset that makes it easy to integrate Storm and Solr, as well as addressing all of the questions raised above.

I’ll spare you the nitty-gritty details of the framework supporting Solr integration with Storm. If you’re interested, the README for the project contains more details about how the framework was designed.

Packaging and Running a Storm Topology

To begin, let’s understand how to run a topology in Storm. Effectively, there are two basic modes of running a Storm topology: local and cluster mode. Local mode is great for testing your topology locally before pushing it out to a remote Storm cluster, such as staging or production. For starters, you need to compile and package your code and all of its dependencies into a unified JAR with a main class that runs your topology.

For this project, I use the Maven Shade plugin to create the unified JAR with dependencies. The benefit of the Shade plugin is that it can relocate classes into different packages at the byte-code level to avoid dependency conflicts. This comes in quite handy if your application depends on 3rd party libraries that conflict with classes on the Storm classpath. You can look at the project pom.xml file for specific details about how I use the Shade plugin. For now, let it suffice to say that the project makes it very easy to build a Storm JAR for your application.

Once you have a unified JAR (storm-solr-1.0.jar), you’re ready to run your topology in Storm. The project includes a main class named com.lucidworks.storm.StreamingApp that allows you to run a topology locally or in a remote Storm cluster. Specifically, StreamingApp provides the following:
  • Separates the process of defining a Storm topology from the process of running a Storm topology in different environments. This lets you focus on defining a topology for your specific requirements.
  • Provides a clean mechanism for separating environment-specific configuration settings.
  • Minimizes duplicated boilerplate code when developing multiple topologies and gives you a common place to insert reusable logic needed for all of your topologies.

To use StreamingApp, you simply need to implement the StormTopologyFactory interface, which defines the spouts and bolts in your topology:

    public interface StormTopologyFactory {
      String getName();
      StormTopology build(StreamingApp app) throws Exception;
    }

Let’s look at a simple example of a StormTopologyFactory implementation that defines a topology for indexing tweets into Solr:

    class TwitterToSolrTopology implements StormTopologyFactory {
      static final Fields spoutFields = new Fields("id", "tweet")

      String getName() { return "twitter-to-solr" }

      StormTopology build(StreamingApp app) throws Exception {
        // setup spout and bolts for accessing Spring-managed POJOs at runtime
        SpringSpout twitterSpout = new SpringSpout("twitterDataProvider", spoutFields);
        SpringBolt solrBolt = new SpringBolt("solrBoltAction", app.tickRate("solrBolt"));

        // wire up the topology to read tweets and send to Solr
        TopologyBuilder builder = new TopologyBuilder()
        builder.setSpout("twitterSpout", twitterSpout, app.parallelism("twitterSpout"))
        builder.setBolt("solrBolt", solrBolt, app.parallelism("solrBolt"))
               .shuffleGrouping("twitterSpout")

        return builder.createTopology()
      }
    }

A couple of things should stand out to you in this listing. First, there’s no command-line parsing, environment-specific configuration handling, or any code related to running this topology. All that you see here is code defining a StormTopology; StreamingApp handles all the boring stuff for you. Second, the code is quite easy to understand because it only does one thing. Lastly, this class is written in Groovy instead of Java, which helps keep things nice and tidy and I find Groovy to be more enjoyable to write. Of course if you don’t want to use Groovy, you can use Java, as the framework supports both seamlessly.

The following diagram depicts the TwitterToSolrTopology.

A key aspect of the solution is the use of the Spring framework to manage beans that implement application specific logic in your topology and leave the Storm boilerplate work to reusable components: SpringSpout and SpringBolt. We’ll get into the specific details of the implementation shortly, but first, let’s see how to run the TwitterToSolrTopology using the StreamingApp framework. For local mode, you would do:

    java -classpath $STORM_HOME/lib/*:target/storm-solr-1.0.jar com.lucidworks.storm.StreamingApp \
      example.twitter.TwitterToSolrTopology -localRunSecs 90

The command above will run the TwitterToSolrTopology for 90 seconds on your local workstation and then shut down. All the setup work is provided by the StreamingApp class.

To submit to a remote cluster, you would do:

    $STORM_HOME/bin/storm jar target/storm-solr-1.0.jar com.lucidworks.storm.StreamingApp \
      example.twitter.TwitterToSolrTopology -env staging

Notice that I’m using the -env flag to indicate I’m running in my staging environment. It’s common to need to run a Storm topology in different environments, such as test, staging, and production, so that’s built into the StreamingApp framework.

So far, I’ve shown you how to define a topology and how to run it. Now let’s get into the details of how to implement components in a topology. Specifically, let’s see how to build a bolt that indexes data into Solr, as this illustrates many of the key features of the framework.

SpringBolt

In Storm, a bolt performs some operation on a Tuple and optionally emits Tuples into the stream. In the example Twitter topology definition above, we see this code:

    SpringBolt solrBolt = new SpringBolt("solrBoltAction", app.tickRate("solrBolt"));

This creates an instance of SpringBolt that delegates message processing to a Spring-managed bean with ID “solrBoltAction”. The main benefit of the SpringBolt is it allows us to separate Storm-specific logic and boilerplate code from application logic.

The com.lucidworks.storm.spring.SpringBolt class allows you to implement your bolt logic as a simple Spring-managed POJO (Plain Old Java Object). To leverage SpringBolt, you simply need to implement the StreamingDataAction interface:

    public interface StreamingDataAction {
      SpringBolt.ExecuteResult execute(Tuple input, OutputCollector collector);
    }

At runtime, Storm will create one or more instances of SpringBolt per JVM. The number of instances created depends on the parallelism hint configured for the bolt. In the Twitter example, we simply pulled the number of tasks for the Solr bolt from our configuration:

    // wire up the topology to read tweets and send to Solr
    ...
    builder.setBolt("solrBolt", solrBolt, app.parallelism("solrBolt"))
    ...

The SpringBolt needs a reference to the solrBoltAction bean from the Spring ApplicationContext. The solrBoltAction bean is defined in resources/storm-solr-spring.xml as:

    <bean id="solrBoltAction" class="com.lucidworks.storm.solr.SolrBoltAction" scope="prototype">
      <property name="solrInputDocumentMapper" ref="solrInputDocumentMapper"/>
      <property name="maxBufferSize" value="${maxBufferSize}"/>
      <property name="bufferTimeoutMs" value="${bufferTimeoutMs}"/>
    </bean>

There are a couple of interesting aspects of this bean definition. First, the bean is defined with prototype scope, which means that Spring will create a new instance for each SpringBolt instance that Storm creates at runtime. This is important because it means your bean instance will only be accessed by one thread at a time so you don’t need to worry about thread-safety issues. Also notice that the maxBufferSize and bufferTimeoutMs properties are set using Spring’s dynamic variable resolution syntax, e.g. ${maxBufferSize}. These properties will be resolved during bean construction from a configuration file called resources/Config.groovy.

When the SpringBolt needs a reference to the solrBoltAction bean, it first needs to get the Spring ApplicationContext.
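
The delegation and prototype-scope ideas can be illustrated outside of Storm and Spring. The Python sketch below is an analogy, not the project's code: `Bolt`, `CountingAction` and the factory argument are hypothetical stand-ins for SpringBolt, a StreamingDataAction bean, and a prototype-scoped bean definition.

```python
from typing import Callable, List, Protocol


class Action(Protocol):
    """What the wrapper expects from application logic."""

    def execute(self, item: str) -> None: ...


class CountingAction:
    """Application logic with per-instance state and no locking."""

    def __init__(self) -> None:
        self.seen: List[str] = []

    def execute(self, item: str) -> None:
        self.seen.append(item)


class Bolt:
    """Generic wrapper: owns the plumbing, delegates the real work."""

    def __init__(self, action_factory: Callable[[], Action]) -> None:
        # Calling the factory once per bolt mimics prototype scope:
        # every bolt instance gets its own action instance.
        self.action = action_factory()

    def process(self, item: str) -> None:
        self.action.execute(item)


bolts = [Bolt(CountingAction) for _ in range(3)]
bolts[0].process("tweet-1")
```

Because no two wrappers share an action object, the action can keep plain mutable state, which is the point the post makes about not having to worry about thread safety.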

The StreamingApp class is responsible for bootstrapping the Spring ApplicationContext using storm-solr-spring.xml. StreamingApp ensures there is only one Spring context initialized per JVM instance per topology as multiple topologies may be running in the same JVM.

If you’re concerned about the Spring container being too heavyweight, rest assured there is only one container initialized per JVM per topology and bolts and spouts are long-lived objects that only need to be initialized once by Storm per task. Put simply, the overhead of Spring is quite minimal especially for long-running streaming applications.

The framework also provides a SpringSpout that allows you to implement a data provider as a simple Spring-managed POJO. I’ll refer you to the source code for more details about SpringSpout but it basically follows the same design patterns as SpringBolt.

Environment-specific Configuration

I’ve implemented several production Storm topologies in the past couple years and one pattern that keeps emerging is the need to manage configuration settings for different environments. For instance, we’ll need to index into a different SolrCloud cluster for staging and production. To address this need, the Spring-driven framework allows you to keep all environment-specific configuration properties in the same configuration file, see resources/Config.groovy.

Don’t worry if you don’t know Groovy, the syntax of the Config.groovy file is very easy to understand and allows you to cleanly separate properties for the following environments: test, dev, staging, and production. Put simply, this approach allows you to run the topology in multiple environments using a simple command-line switch, -env, to specify the environment settings that should be applied.
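
The general pattern of one file holding per-environment overrides, selected by a single switch, can be sketched in Python. Config.groovy itself is not shown in this post, so everything below is assumed for illustration: the host names, the `bufferTimeoutMs` value, and the helper names `load_config` and `parse_env`. Only the environment names, the -env switch and the maxBufferSize default of 100 come from the text.

```python
import argparse
from typing import Any, Dict, List, Optional

# Shared defaults; bufferTimeoutMs is an assumed value.
DEFAULTS: Dict[str, Any] = {"maxBufferSize": 100, "bufferTimeoutMs": 500}

# Hypothetical per-environment overrides.
ENVIRONMENTS: Dict[str, Dict[str, Any]] = {
    "test": {"zkHost": "localhost:2181"},
    "dev": {"zkHost": "localhost:2181"},
    "staging": {"zkHost": "zk-staging.example.com:2181"},
    "production": {"zkHost": "zk-prod.example.com:2181", "maxBufferSize": 500},
}


def load_config(env: str) -> Dict[str, Any]:
    """Merge the shared defaults with the overrides of one environment."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env}")
    return {**DEFAULTS, **ENVIRONMENTS[env]}


def parse_env(argv: Optional[List[str]] = None) -> str:
    """Read the -env switch from the command line, defaulting to dev."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-env", default="dev", choices=sorted(ENVIRONMENTS))
    return parser.parse_args(argv).env
```

For example, `load_config(parse_env(["-env", "staging"]))` yields the staging ZooKeeper host together with the shared buffer defaults.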

Metrics

Storm provides high-level metrics for bolts and spouts, but if you need more visibility into the inner workings of your application-specific logic, then it’s common to use the Java metrics library, see: https://dropwizard.github.io/metrics/3.1.0/. Fortunately, there are open source options for integrating metrics with Spring, see: https://github.com/ryantenney/metrics-spring.

The Spring context configuration file resources/storm-solr-spring.xml comes pre-configured with all the infrastructure needed to inject metrics into your bean implementations. When implementing your StreamingDataAction (bolt) or StreamingDataProvider (spout), you can have Spring auto-wire metrics objects using the @Metric annotation when declaring metrics-related member variables. For instance, the SolrBoltAction class uses a Timer to track how long it takes to send batches to Solr.

    @Metric
    public Timer sendBatchToSolr;

The SolrBoltAction class provides several examples of how to use metrics in your bean implementations. At this point you should have a basic understanding of the main features of the framework. Now let’s turn our attention to some Solr-specific features.

Micro-buffering and Ack’ing Input Tuples

It’s possible that thousands of documents per second will be flowing into each Solr bolt. To avoid sending too many requests into Solr and to avoid blocking too much in the topology, the bolt uses an internal buffer to send documents to Solr in small batches. This helps reduce the number of network round-trips between your bolt and Solr. The bolt supports a maximum buffer size setting to control when the buffer should be flushed, which defaults to 100.

Buffering poses two basic issues in a streaming topology. First, you’re likely using Storm to power a near real-time data processing application, so we don’t want to delay documents from getting into Solr for too long.
To support this, the bolt supports a buffer timeout setting that indicates when a buffer should be flushed to ensure documents flow into Solr in a timely manner. Consequently, the buffer will be flushed when either the size threshold or the time limit is reached.

There is a subtle side-effect that would normally require a background thread to flush the buffer if there was some delay in messages being sent into the bolt by upstream components. Fortunately, Storm provides a simple mechanism that allows your bolt to receive a special type of Tuple on a periodic schedule, known as a TickTuple. Whenever the SolrBoltAction bean receives a TickTuple, it checks to see if the buffer needs to be flushed, which avoids holding documents for too long and alleviates the need for a background thread to monitor the buffer.

Field Mapping

The SolrBoltAction bean takes care of sending documents to SolrCloud in an efficient manner, but it only works with SolrInputDocument objects from SolrJ. It’s unlikely that your Storm topology will be working with SolrInputDocument objects natively, so the SolrBoltAction bean delegates mapping of input Tuples to SolrInputDocument objects to a Spring-managed bean that implements the com.lucidworks.storm.solr.SolrInputDocumentMapper interface. This fits nicely with our design approach of separating concerns in our topology.

The default implementation provided in the project (DefaultSolrInputDocumentMapper) uses Java reflection to read data from a Java object to populate the fields of the SolrInputDocument. In the Twitter example, the default implementation uses Java reflection to read data from a Twitter4J Status object to populate dynamic fields on a SolrInputDocument instance.

It should be clear, however, that you can inject your own SolrInputDocumentMapper implementation into the bolt bean using Spring if the default implementation does not meet your needs.
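
The size-or-timeout flush policy from the micro-buffering discussion above, with a periodic tick standing in for a background thread, can be sketched in a few lines of Python. This is an illustration of the idea rather than the SolrBoltAction implementation: `MicroBuffer`, `send_batch` and the injectable `clock` are hypothetical names, and the 500 ms timeout is an assumed value (only the buffer size default of 100 comes from the post).

```python
import time
from typing import Any, Callable, List


class MicroBuffer:
    """Buffers documents and flushes on size or age, whichever comes first."""

    def __init__(
        self,
        send_batch: Callable[[List[Any]], None],
        max_buffer_size: int = 100,
        buffer_timeout_ms: int = 500,  # assumed value, not from the post
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send_batch = send_batch
        self._max_buffer_size = max_buffer_size
        self._buffer_timeout_ms = buffer_timeout_ms
        self._clock = clock
        self._docs: List[Any] = []
        self._oldest = 0.0

    def add(self, doc: Any) -> None:
        if not self._docs:
            # Age is measured from the first document in the buffer.
            self._oldest = self._clock()
        self._docs.append(doc)
        self._flush_if_needed()

    def tick(self) -> None:
        """Periodic check, the role a TickTuple plays for the Solr bolt."""
        self._flush_if_needed()

    def _flush_if_needed(self) -> None:
        if not self._docs:
            return
        age_ms = (self._clock() - self._oldest) * 1000.0
        too_big = len(self._docs) >= self._max_buffer_size
        too_old = age_ms >= self._buffer_timeout_ms
        if too_big or too_old:
            batch, self._docs = self._docs, []
            self._send_batch(batch)
```

Passing the clock in as a parameter is a design choice made for the sketch: it lets a test advance time by hand instead of sleeping.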

JSON

As of Solr 5, you can send arbitrary JSON documents to Solr and have it parse out documents for indexing. For more information about this cool feature in Solr, please see: http://lucidworks.com/blog/indexing-custom-json-data/

If you want to send arbitrary JSON objects to Solr and have it index documents during JSON parsing, you need to use the solrJsonBoltAction bean instead of solrBoltAction. For our Twitter example, you could define the solrJsonBoltAction bean as:

    <bean id="solrJsonBoltAction" class="com.lucidworks.storm.solr.SolrJsonBoltAction" scope="prototype">
      <property name="split" value="/"/>
      <property name="fieldMappings">
        <list>
          <value>$FQN:/**</value>
        </list>
      </property>
    </bean>

Lucidworks Fusion

Lastly, if you’re using Lucidworks Fusion (and you should be), then instead of sending documents directly to Solr, you can send them to a Fusion indexing pipeline using the FusionBoltAction class. FusionBoltAction posts JSON documents to the Fusion proxy which gives you security and the full power of Fusion pipelines for generating Solr documents.

The post Integrating Storm and Solr appeared first on Lucidworks.

Categories: FLOSS Project Planets

Drupal Watchdog: Web API Alphabet Soup

Planet Drupal - Tue, 2015-05-19 11:56

Drupal 8 offers unprecedented support for creating RESTful APIs. This was one of the major goals of the Web Services and Context Core Initiative (WSCCI), and one we've delivered on pretty well. However, as with most things that are worth doing, just because Drupal core “supports” it doesn't mean you'll see good results without an understanding of what's going on. In this article, we'll explore some of these principles, so that when it comes time to design with those systems, you'll know how to think about the problem.


Drupal 8 ships with support for encoding and representing its entities (and other objects) via the Hypertext Application Language (HAL) specification. HAL can currently be expressed in JSON or XML, and is a specification for describing resources. As the specification says, HAL is “a bit like HTML for machines.”

What that means is that a HAL API can provide enough data for a machine agent to visit the root ("/") of a website, then navigate its way through the remainder of the system exclusively by following the links provided in responses. Humans do the exact same thing by visiting a page and clicking on links. The notion that machines might also want to do this is a relatively obvious idea, but one that has, until recently, rarely been followed on the web.
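
To make the idea concrete, here is a small Python sketch of a machine agent that starts at the root and discovers everything else by following links. The `_links` and `href` keys are the ones HAL's JSON form uses; the in-memory `SITE` data, its paths and relation names, and the `crawl` helper are hypothetical and far simpler than what Drupal 8 actually emits.

```python
from collections import deque
from typing import Any, Dict, List

# A tiny in-memory "site": URL -> HAL-style JSON document (made-up data).
SITE: Dict[str, Dict[str, Any]] = {
    "/": {"_links": {"self": {"href": "/"}, "articles": {"href": "/node"}}},
    "/node": {
        "_links": {
            "self": {"href": "/node"},
            "item": [{"href": "/node/1"}, {"href": "/node/2"}],
        }
    },
    "/node/1": {"title": "First", "_links": {"self": {"href": "/node/1"}}},
    "/node/2": {"title": "Second", "_links": {"self": {"href": "/node/2"}}},
}


def link_targets(document: Dict[str, Any]) -> List[str]:
    """Collect every href under _links; a relation holds one link or a list."""
    targets: List[str] = []
    for link in document.get("_links", {}).values():
        links = link if isinstance(link, list) else [link]
        targets.extend(item["href"] for item in links)
    return targets


def crawl(root: str = "/") -> List[str]:
    """Visit every resource reachable from the root by following links only."""
    seen, queue, order = {root}, deque([root]), []
    while queue:
        url = queue.popleft()
        order.append(url)
        for target in link_targets(SITE[url]):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order
```

The agent never hard-codes a URL other than the root; in a real client the dictionary lookup would be an HTTP GET, but the navigation logic would stay the same.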

Still, though, it's pretty abstract. To really understand why HAL is powerful – and what it does for us in Drupal – it's necessary to go back to the basic constraints and capabilities of the problem space it operates in: HTTP and REST. The crucial documents there are RFC2616 and Roy Fielding's thesis, both well worth [re-]reading. But a more easily digestible version comes in the form of the Richardson Maturity Model, first laid out by Leonard Richardson in 2008, and since revisited by Martin Fowler and Steve Klabnik.


The Richardson Maturity Model helpfully suggests a set of four “maturity” levels into which HTTP APIs fall:

Categories: FLOSS Project Planets

Thorsten Alteholz: alpine and UTF-8 and Debian lists

Planet Debian - Tue, 2015-05-19 11:47

This is a note for my future self: When writing an email with only “charset=US-ASCII”, alpine creates an email with:

Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII

and everything is fine.

In case of UTF-8 characters inside the text, alpine creates something like:

Content-Type: MULTIPART/MIXED; BOUNDARY="705298698-1667814148-1432049085=:28313"

and the only available part contains:

Content-Type: TEXT/PLAIN; format=flowed; charset=UTF-8
Content-Transfer-Encoding: 8BIT

Google tells me that the reason for this is:

Alpine uses a single part MULTIPART/MIXED to apply a protection wrapper around QUOTED-PRINTABLE and BASE64 content to prevent it from being corrupted by various mail delivery systems that append little (typically advertising) things at the end of the message.

Ok, this behavior might come from bad experiences and it seems to work most of the time. Unfortunately if one sends a signed email to a Debian list that checks whether the signature is valid (like for example debian-lts-announce), such an email will be rejected with:

Failed to understand the email or find a signature: UDFormatError:
Cannot handle multipart messages not of type multipart/signed
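
A quick way to look at this structure is Python's standard email package. The sketch below parses a message shaped like the one alpine produces and applies a check modeled only on the error text above; `list_would_accept` is a hypothetical stand-in, not the code the Debian list software actually runs, and the addresses and boundary are made up.

```python
from email import message_from_string
from email.message import Message

RAW = """\
From: sender@example.org
To: list@example.org
Subject: test
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="b1"

--b1
Content-Type: TEXT/PLAIN; format=flowed; charset=UTF-8
Content-Transfer-Encoding: 8BIT

Grüße
--b1--
"""


def describe(msg: Message) -> str:
    """Summarize the MIME structure of a message."""
    if not msg.is_multipart():
        return msg.get_content_type()
    parts = ", ".join(part.get_content_type() for part in msg.get_payload())
    return f"{msg.get_content_type()} [{parts}]"


def list_would_accept(msg: Message) -> bool:
    """Hypothetical check: multipart is fine only if it is multipart/signed."""
    return not msg.is_multipart() or msg.get_content_type() == "multipart/signed"


message = message_from_string(RAW)
print(describe(message))
print(list_would_accept(message))
```

Running `describe` on a copy of your own outgoing mail before sending to such a list shows whether the single-part wrapper is present.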


Categories: FLOSS Project Planets

Caktus Consulting Group: Keynote by Catherine Bracy (PyCon 2015 Must-See Talk: 4/6)

Planet Python - Tue, 2015-05-19 10:12

Part four of six in our PyCon 2015 Must-See Series, a weekly highlight of talks our staff enjoyed at PyCon.

My recommendation would be Catherine Bracy's Keynote about Code for America. Cakti should be familiar with Code for America. Colin Copeland, Caktus CTO, is the founder of Code for Durham and many of us are members. Her talk made it clear how important this work is. She was funny, straight-talking, and inspirational. For a long time before I joined Caktus, I was a "hobbyist" programmer. I often had time to program, but wasn't sure what to build or make. Code for America is a great opportunity for people to contribute to something that will benefit all of us. I have joined Code for America and hope to contribute locally soon through Code for Durham.


More in the PyCon 2015 Must-See Talks Series.

Categories: FLOSS Project Planets