FLOSS Project Planets

Steve Kemp: Updating Debian Administration

Planet Debian - Thu, 2014-08-21 04:50

Recently I've been getting annoyed with the Debian Administration website; too often it would be slower than it should be considering the resources behind it.

As a brief recap I have six nodes:

  • 1 x MySQL Database - The only MySQL database I personally manage these days.
  • 4 x Web Nodes.
  • 1 x Misc server.

The misc server is designed to display events. There is a node.js listener which receives UDP messages and stores them in a rotating buffer. The messages might contain things like "User bob logged in", "Slaughter ran", etc. It's a neat hack which gives a good feeling of what is going on cluster-wide.

I need to rationalize that code - but there's a very simple predecessor posted on github for the curious.

Anyway, enough diversions: the database is tuned, and "small". The misc server is almost entirely irrelevant, non-public, and not explicitly advertised.

So what do the web nodes run? Well, they run a lot. Potentially.

Each web node has four services configured:

  • Apache 2.x - All nodes.
  • uCarp - All nodes.
  • Pound - Master node.
  • Varnish - Master node.

Apache runs the main site, listening on *:8080.

One of the nodes will be special and will claim a virtual IP provided via ucarp. The virtual IP is actually the end-point visitors hit, meaning we have:

The master host runs:

  • Apache.
  • Pound.
  • Varnish.

The other hosts run:

  • Apache.

Pound is configured to listen on the virtual IP and perform SSL termination. That means that incoming requests get proxied from "vip:443 -> vip:80". Varnish listens on "vip:80" and proxies to the back-end apache instances.

The end result should be high availability. In the typical case all four servers are alive, and all is well.

If one server dies and it is not the master, it will simply be dropped as a valid back-end. If the server that dies is the master, a new master will claim the virtual IP, thanks to the magic of ucarp, and the remaining three will be used as expected.

I'm sure there is a pathological case when all four hosts die, and at that point the site will be down, but that's something that should be atypical.

Yes, I am prone to over-engineering. The site doesn't have any availability requirements that justify this setup, but it is good to experiment and learn things.

So, with this setup in mind, with incoming requests (on average) being divided at random onto one of four hosts, why is the damn thing so slow?

We'll come back to that in the next post.

(Good news though; I fixed it ;)

Categories: FLOSS Project Planets

Wouter Verhelst: Multiarchified eID libraries, now public

Planet Debian - Thu, 2014-08-21 04:30

Yesterday, I spent most of the day finishing up the work I'd been doing to introduce multiarch to the eID middleware, and did another release of the Linux builds. As such, it's now possible to install 32-bit versions of the eID middleware on a 64-bit Linux distribution. For more details, please see the announcement.

Learning how to do multiarch (or biarch, as the case may be) for three different distribution families has been a, well, learning experience. Being a Debian Developer, figuring out the technical details for doing this on Debian and its derivatives wasn't all that hard. You just make sure the libraries are installed to the multiarch-safe directories (i.e., /usr/lib/<gnu arch triplet>), you add some Multi-Arch: foreign or Multi-Arch: same headers where appropriate, and you're done. Of course the devil is in the details (define "where appropriate"), but all in all it's not that difficult and fairly deterministic.

The Fedora (and derivatives, like RHEL) approach to biarch is that 64-bit distributions install into /usr/lib64 and 32-bit distributions install into /usr/lib. This goes for any architecture family, not just the x86 family; the same method works on ppc and ppc64. However, since Fedora doesn't do PowerPC anymore, that part is a detail of little relevance.

Once that's done, yum has some heuristics whereby it will prefer native-architecture versions of binaries when asked, and may install both the native-architecture and foreign-architecture version of a particular library package at the same time. Since RPM already has support for installing multiple versions of the same package on the same system (a feature that was originally created, AIUI, to support the installation of multiple kernel versions), that's really all there is to it. It feels a bit fiddly and somewhat fragile, since there isn't really a spec and some parts seem fairly undefined, but all in all it seems to work well enough in practice.

The openSUSE approach is vastly different from the other two. Rather than installing the foreign-architecture packages natively, as in the Debian and Fedora approaches, openSUSE wants you to take the native foo.ix86.rpm package and convert it to a foo-32bit.x86_64.rpm package. The conversion process filters out non-unique files (only allowing files to remain in the package if they are in library directories, IIUC), and copes with the lack of license files in /usr/share/doc by adding a dependency header on the native package. While the approach works, it feels like unnecessary extra work and bandwidth to me, and obviously also wouldn't scale beyond biarch.

It also isn't documented very well; when I went to openSUSE IRC channels and started asking questions, the reply was something along the lines of "hand this configuration file to your OBS instance". When I told them I wasn't actually using OBS and had no plans of migrating to it (because my current setup is complex enough as it is, and replacing it would be far too much work for too little gain), it suddenly got eerily quiet.

Eventually I found out that the part of OBS which does the actual build is a separate codebase, and integrating just that part into my existing build system was not that hard to do, even though it doesn't come with a specfile or RPM package and wants to install files into /usr/bin and /usr/lib. With all that and some more weirdness I've found in the past few months that I've been building packages for openSUSE I now have... Ideas(TM) about how openSUSE does things. That's for another time, though.

(disclaimer: there's a reason why I'm posting this on my personal blog and not on an official website... don't take this as an official statement of any sort!)

Categories: FLOSS Project Planets

Bertrand Delacretaz: So you want to talk at this conference?

Planet Apache - Thu, 2014-08-21 04:00

I regularly review talk submissions for tech conferences, and here’s a list of what I’m mostly looking for when deciding to accept or reject a talk.

Other reviewers might be looking for different things – these are just my own criteria.

My first question is always: are you going to get people interested in your stuff? Are you a dynamic speaker who keeps people on their toes, or the kind of person who delivers their talk seated at a desk? I’ve seen the latter happen, and it’s not pretty! For me, a brilliant speaker gets a slot almost every time, also because they usually know which topics will raise people’s interest.

Unless you’re famous already, the best way to convince me that you’re a good speaker is to point to a video of one of your talks. And I also need to know why you think you’re qualified to deliver this talk.

Then, I’m looking for a topic that will add value to the conference. Promoting your product or company might not add much value, whereas a talk that will open people’s minds and maybe save them hours of work in their practice is a guaranteed winner. Signs of a value-adding topic are pointers to concrete achievements using the techniques presented in the talk.

The quality of the submission comes next, especially if I don’t know the speaker. Someone who’s unable to present their ideas clearly in a talk submission is unlikely to present them clearly at the conference. Or maybe they’re a misunderstood genius; you should also look out for those, but they are rare. A concise submission that packs lots of useful information about what’s going to be delivered at the talk is a good promise of success.

Last but not least, original and inspiring ideas get lots of bonus points from me. Being able to predict the abstract’s contents from the title is usually a bad sign, unless it’s a talk for beginners. We don’t need conferences to exchange information today; that’s supposed to happen on the Web. Talks should be inspiring, maybe teasers to convince people to look at your value-adding stuff, but they should not rehash information that’s found elsewhere.


Categories: FLOSS Project Planets

Leonardo Giordani: Python 3 OOP Part 4 - Polymorphism

Planet Python - Thu, 2014-08-21 04:00
Previous post

Python 3 OOP Part 3 - Delegation: composition and inheritance

Good Morning, Polymorphism

The term polymorphism, in the OOP lingo, refers to the ability of an object to adapt the code to the type of the data it is processing.

Polymorphism has two major applications in an OOP language. The first is that an object may provide different implementations of one of its methods depending on the type of the input parameters. The second is that code written for a given type of data may be used on data with a derived type, i.e. methods understand the class hierarchy of a type.

In Python polymorphism is one of the key concepts, and we can say that it is a built-in feature. Let us deal with it step by step.

First of all, you know that in Python the type of a variable is not explicitly declared. Beware that this does not mean that Python variables are untyped. On the contrary, everything in Python has a type; it just happens that the type is implicitly assigned. If you remember the last paragraph of the previous post, I stated that in Python variables are just pointers (using a C-like nomenclature); in other words, they just tell the language where in memory a variable has been stored. What is stored at that address is not the variable's business.

``` python
>>> a = 5
>>> a
5
>>> type(a)
<class 'int'>
>>> hex(id(a))
'0x83fe540'
>>> a = 'five'
>>> a
'five'
>>> type(a)
<class 'str'>
>>> hex(id(a))
'0xb70d6560'
```

This little example shows a lot about the Python typing system. The variable a is not statically declared, after all it can contain only one type of data: a memory address. When we assign the number 5 to it, Python stores in a the address of the number 5 (0x83fe540 in my case, but your result will be different). The type() built-in function is smart enough to understand that we are not asking about the type of a (which is always a reference), but about the type of the content. When you store another value in a, the string 'five', Python shamelessly replaces the previous content of the variable with the new address.

So, thanks to the reference system, the Python type system is both strong and dynamic. The exact definition of those two concepts is not universal, so if you are interested be ready to dive into a broad subject. However, in Python, the meaning of those two words is the following:

  • the type system is strong because everything has a well-defined type that you can check with the type() built-in function
  • the type system is dynamic since the type of a variable is not explicitly declared, but changes with its content

Onward! We just scratched the surface of the whole thing.

To explore the subject a little more, try to define the simplest function in Python (apart from an empty function)

``` python
def echo(a):
    return a
```

The function works as expected: it just echoes the given parameter

``` python
>>> echo(5)
5
>>> echo('five')
'five'
```

Pretty straightforward, isn't it? Well, if you come from a statically compiled language such as C or C++ you should be at least puzzled. What is a? I mean: what type of data does it contain? Moreover, how can Python know what it is returning if there is no type specification?

Again, if you recall the references stuff everything becomes clear: that function accepts a reference and returns a reference. In other words we just defined a sort of universal function, that does the same thing regardless of the input.

This is exactly the problem that polymorphism wants to solve. We want to describe an action regardless of the type of objects, and this is what we do when we talk among humans. When you describe how to move an object by pushing it, you may explain it using a box, but you expect the person you are addressing to be able to repeat the action even if you need to move a pen, or a book, or a bottle.

There are two main strategies you can apply to get code that performs the same operation regardless of the input types.

The first approach is to cover all cases, and this is a typical approach of procedural languages. If you need to sum two numbers that can be integers, floats or complex numbers, you just need to write three sum() functions, one bound to the integer type, the second bound to the float type and the third bound to the complex type, and have some language feature that takes charge of choosing the correct implementation depending on the input type. This logic can be implemented by a compiler (if the language is statically typed) or by a runtime environment (if the language is dynamically typed) and is the approach chosen by C++. The disadvantage of this solution is that it requires the programmer to forecast all the possible situations: what if I need to sum an integer with a float? What if I need to sum two lists? (Please note that C++ is not so poorly designed, and the operator overloading technique allows such cases to be managed, but the base polymorphism strategy of that language is the one exposed here).

The second strategy, the one implemented by Python, is simply to require the input objects to solve the problem for you. In other words, you ask the data itself to perform the operation, reversing the problem. Instead of writing a bunch of functions that sum all the possible types in every possible combination, you just write one function that requires the input data to sum, trusting that it knows how to do it. Does it sound complex? It is not.

Let's look at the Python implementation of the + operator. When we write c = a + b, Python actually executes c = a.__add__(b). As you can see the sum operation is delegated to the first input variable. So if we write

``` python
def sum(a, b):
    return a + b
```

there is no need to specify the type of the two input variables. The object a (the object contained in the variable a) must be able to sum with the object b. This is a very beautiful and simple implementation of the polymorphism concept. Python functions are polymorphic simply because they accept everything and trust the input data to be able to perform some actions.
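
To see how far this trust goes, here is a quick sketch added for illustration (it is not part of the original post, and the Money class is invented for the example): any object that defines an __add__() method can be fed to the very same sum() function, without touching the function itself.

``` python
class Money:
    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency

    def __add__(self, other):
        # The meaning of "+" is delegated to the object itself
        if self.currency != other.currency:
            raise ValueError("cannot sum different currencies")
        return Money(self.amount + other.amount, self.currency)

    def __repr__(self):
        return "Money({}, {!r})".format(self.amount, self.currency)


def sum(a, b):   # the same sum() as above, repeated so the snippet stands alone
    return a + b

print(sum(3, 4))                              # 7
print(sum(Money(3, "EUR"), Money(4, "EUR")))  # Money(7, 'EUR')
```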

Let us consider another simple example before moving on. The built-in len() function returns the length of the input object. For example

``` python
>>> l = [1, 2, 3]
>>> len(l)
3
>>> s = "Just a sentence"
>>> len(s)
15
```

As you can see it is perfectly polymorphic: you can feed it either a list or a string, and it just computes the length. Does it work with any type? Let's check

``` python
>>> d = {'a': 1, 'b': 2}
>>> len(d)
2
>>> i = 5
>>> len(i)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'int' has no len()
```

Ouch! Seems that the len() function is smart enough to deal with dictionaries, but not with integers. Well, after all, the length of an integer is not defined.

Indeed this is exactly the point of Python polymorphism: the integer type does not define a length operation. While you might blame the len() function, it is actually the int type that is at fault. The len() function just calls the __len__() method of the input object, as you can see from this code

``` python
>>> l.__len__()
3
>>> s.__len__()
15
>>> d.__len__()
2
>>> i.__len__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute '__len__'
```

Very straightforward: the 'int' object does not define any __len__() method.

So, to sum up what we discovered until here, I would say that Python polymorphism is based on delegation. In the following sections we will talk about the EAFP Python principle, and you will see that the delegation principle is somehow ubiquitous in this language.

Type Hard

Another real-life concept that polymorphism wants to bring into a programming language is the ability to walk the class hierarchy, that is, to run code on specialized types. This is a complex sentence to describe something we are used to doing every day, and an example will clarify the matter.

You know how to open a door, it is something you learned in your early years. From an OOP point of view you are an object (sorry, no humiliation intended) which is capable of interacting with a wood rectangle rotating on hinges. When you can open a door, however, you can also open a window, which, after all, is a specialized type of wood-rectangle-with-hinges, hopefully with some glass in it too. You are also able to open a car door, which is also a specialized type (this one is a mix between a standard door and a window). This shows that, once you know how to interact with the most generic type (a basic door), you can also interact with specialized types (window, car door) as long as they act like the ancestor type (e.g. as long as they rotate on hinges).

This directly translates into OOP languages: polymorphism requires that code written for a given type may also be run on derived types. For example, a list (a generic list object, not a Python one) that can contain "numbers" shall be able to accept integers because they are numbers. The list could specify an ordering operation which requires the numbers to be able to compare each other. So, as soon as integers specify a way to compare each other they can be inserted into the list and ordered.

Statically compiled languages must provide specific language features to implement this part of the polymorphism concept. In C++, for example, the language needs to introduce the concept of pointer compatibility between parent and child classes.

In Python there is no need to provide special language features to implement subtype polymorphism. As we already discovered Python functions accept any variable without checking the type and rely on the variable itself to provide the correct methods. But you already know that a subtype must provide the methods of the parent type, either redefining them or through implicit delegation, so as you can see Python implements subtype polymorphism from the very beginning.

I think this is one of the most important things to understand when working with this language. Python is not really interested in the actual type of the variables you are working with. It is interested in how those variables act, that is it just wants the variable to provide the right methods. So, if you come from statically typed languages, you need to make a special effort to think about acting like instead of being. This is what we called "duck typing".

Time for an example. Let us define a Room class

``` python
class Room:
    def __init__(self, door):
        self.door = door

    def open(self):
        self.door.open()

    def close(self):
        self.door.close()

    def is_open(self):
        return self.door.is_open()
```

A very simple class, as you can see, just enough to exemplify polymorphism. The Room class accepts a door variable, and the type of this variable is not specified. Duck typing in action: the actual type of door is not declared, and there is no "acceptance test" built into the language. Indeed, the incoming variable must provide the following methods, which are used in the Room class: open(), close(), is_open(). So we can build the following classes

``` python
class Door:
    def __init__(self):
        self.status = "closed"

    def open(self):
        self.status = "open"

    def close(self):
        self.status = "closed"

    def is_open(self):
        return self.status == "open"


class BooleanDoor:
    def __init__(self):
        self.status = True

    def open(self):
        self.status = True

    def close(self):
        self.status = False

    def is_open(self):
        return self.status
```

Both represent a door that can be open or closed, and they implement the concept in two different ways: the first class relies on strings, while the second leverages booleans. Despite being two different types, both act the same way, so both can be used to build a Room object.

``` python
>>> door = Door()
>>> bool_door = BooleanDoor()
>>> room = Room(door)
>>> bool_room = Room(bool_door)

>>> room.open()
>>> room.is_open()
True
>>> room.close()
>>> room.is_open()
False

>>> bool_room.open()
>>> bool_room.is_open()
True
>>> bool_room.close()
>>> bool_room.is_open()
False
```
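
Subtype polymorphism, discussed earlier, comes for free as well: a class derived from Door keeps (or overrides) the parent's methods, so Room accepts it without any change. The ArmouredDoor class below is an extra illustration, not part of the original post.

``` python
class ArmouredDoor(Door):
    def __init__(self):
        super().__init__()
        self.locked = True

    def open(self):
        # Override the parent behaviour: refuse to open while locked
        if not self.locked:
            super().open()


secure_room = Room(ArmouredDoor())
secure_room.open()
print(secure_room.is_open())   # False, the door is still locked

secure_room.door.locked = False
secure_room.open()
print(secure_room.is_open())   # True
```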

File Like Us

File-like objects are a concrete and very useful example of polymorphism in Python. A file-like object is a class (or the instance of a class) that acts like a file, i.e. it provides those methods a file object exposes.

Say for example that you code a class that parses an XML tree, and that you expect the XML code to be contained in a file. So your class accepts a file in its __init__() method, and reads the content from it

``` python
class XMLReader:
    def __init__(self, xmlfile):
        xmlfile.open()
        self.content = xmlfile.read()
        xmlfile.close()

    [...]
```

The class works well until your application has to be modified to receive XML content from a network stream. To use the class without modifying it you would have to write the stream to a temporary file and load that, which sounds a little overkill. So you plan to change the class to accept a string, but that way you would have to change every piece of code that uses the class to read a file, since now you would have to open, read and close the file on your own, outside the class.

Polymorphism offers a better way. Why not store the incoming stream inside an object that acts like a file, even if it is not an actual one? If you check the io module you will find that such an object has already been invented and is provided in the standard Python library.
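
As a quick illustration (a sketch added here, not part of the original post), any function written against the file protocol happily accepts an in-memory io.StringIO holding data that arrived, say, from a network stream; no temporary file is involved.

``` python
import io

def read_xml(xmlfile):
    # Relies only on the "file protocol": read() and close()
    content = xmlfile.read()
    xmlfile.close()
    return content

# An in-memory stream that merely acts like a file
stream = io.StringIO("<root><child/></root>")
print(read_xml(stream))      # prints: <root><child/></root>
```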

Other very useful file-like classes are those contained in the gzip, bz2, and zipfile modules (just to name some of the most used), which provide objects that allow you to manage compressed files just like plain files, hiding the decompression/compression machinery.

Unforgiveness

EAFP is a Python acronym that stands for easier to ask for forgiveness than permission. This coding style is strongly encouraged in the Python community because it relies entirely on the duck typing concept, thus fitting well with the language philosophy.

The concept behind EAFP is fairly easy: instead of checking if an object has a given attribute or method before actually accessing or using it, just trust the object to provide what you need and manage the error case. This can probably be better understood by looking at some code. According to EAFP, instead of writing

``` python
if hasattr(someobj, 'open'):
    [...]
else:
    [...]
```

you should write

``` python
try:
    someobj.open()
    [...]
except AttributeError:
    [...]
```

As you can see, the second snippet directly uses the method and deals with the possible AttributeError exception (by the way: managing exceptions is one of the top Black Magic Topics in Python, more on it in a future post. A very quick preview: I think we may learn something from Erlang - check this).

Why is this coding style pushed so much in the Python community? I think the main reason is that through EAFP you think polymorphically: you are not interested in knowing if the object has the open attribute, you are interested in knowing if the object can satisfy your request, that is to perform the open() method call.

Movie Trivia

Section titles come from the following movies: Good Morning, Vietnam (1987), Die Hard (1988), Spies Like Us (1985), Unforgiven (1992).

Sources

You will find a lot of documentation in this Reddit post. Most of the information contained in this series comes from those sources.

Feedback

Feel free to use the blog Google+ page to comment on the post. The GitHub issues page is the best place to submit corrections.

Categories: FLOSS Project Planets

Understanding Icons: Participate in fantastic fourth survey

Planet KDE - Thu, 2014-08-21 03:31

We are starting the next round in our little test series of different icon sets. Please, again, participate in our little game and help us learn more about the usability of icon design.

Keep on reading: Understanding Icons: Participate in fantastic fourth survey

Categories: FLOSS Project Planets

Edward J. Yoon: Alone You Breathe - Savatage

Planet Apache - Thu, 2014-08-21 03:28
Lyrics:

You were never one for waiting
Still I always thought you'd wait for me
Have you from your dream awakened
And from where you are what do you see

Which of us is now in exile
Which in need of amnesty
Are you now but an illusion
In my mind alone you breathe

You believed in things that I will never know
You were out there drowning but it never showed
'Til inside a rain swept night you just let go

You've thrown it all away
And now we'll never see
The ending of the play
The grand design
The final line
And what was meant to be

In the dark a distant runner
Now has disappeared into the night
Leaving us to stand and wonder
Staring from this end into your life

You believed in things that I will never know
You were out there drowning but it never showed
'Til inside a rain swept night you just let go

You've thrown it all away
And now we'll never see
The ending of the play
The grand design
The final line
And what was meant to be

And if this is all illusion
Nothing more than pure delusion
Clinging to a fading fantasy

Like Icarus who heeds the calling
Of a sun but now is falling
As the feathers of his life fall free
Can you see
See

Tomorrow
And after
You tell me what am I to do
I stand here
Believing
That in the dark
There is a clue

Perhaps inside
This midnight sky
Perhaps tomorrow's new born eyes
Or could it be
We'll never know
And after all
This was the show

What am I to do

Gotta get back
Gotta get back
Gotta get back

What am I to do

Gotta get back
Gotta get back
Gotta get back

What am I to do

Standing on a dream
Isn't what it seems
Could we then reclaim a dream refused
Knowing what we know
Could we let it go
Realizing that all the years are used

Tomorrow and after
You tell me what am I to do
I stand here believing
That in the dark there is a clue
I am the way
I am the light
I am the dark inside the night
I hear your hopes
I feel your dreams
And in the dark I hear your screams

Tomorrow and after
You tell me what am I to do
I stand here believing
That in the dark there is a clue
Categories: FLOSS Project Planets

Python Anywhere: New release - a few new packages, some steps towards postgres, and forum post previews

Planet Python - Thu, 2014-08-21 03:05

Today's upgrade included a few new packages in the standard server image:

  • OpenSCAD
  • FreeCAD
  • inkscape
  • Pillow for 3.3 and 3.4
  • flask-bootstrap
  • gensim
  • textblob

We also improved the default "Unhandled Exception" page, which is shown when a user's web app allows an exception to bubble up to our part of the stack. We now include a slightly friendlier message, explaining to any of the user's users that there's an error, and telling the user where they can find their log files and look for debug info.

And in the background, we've deployed a bunch of infrastructure changes related to postgres support. We're getting there, slowly slowly!

Oh yes, and we've enabled dynamic previews in the forums, so you get an idea of how the markdown syntax will translate. It actually uses the same library as Stack Overflow; it's called pagedown. Hope you find 'em useful!

Categories: FLOSS Project Planets

Gunnar Wolf: Walking without crutches

Planet Debian - Thu, 2014-08-21 01:34

I still consider myself a newbie teacher. I'm just starting my fourth semester. And yes, I really enjoy it.

Now, how did I come to teaching? Well, my training has mostly been on stage at different conferences. More technical, more social, whatever — I have been giving ~10 talks a year for ~15 years, and I must have learnt something from that.

Some good things, some bad habits.

When giving presentations, a most usual technique is to prepare a set of slides to follow/support the ideas. And yes, that's what I did for my classes: since my first semester, I prepared a nice set of slides, thematically split into 17 files, with ~30 to ~110 pages each (yes, a huge variation). Given that the course spans 32 classes (72 hours, 2¼ hours per class), each slide deck lasts for about two classes.

But, yes, this tends to make the class much less dynamic, much more scripted, rigid, and... Boring. From my feedback, I understand the students don't think I am a bad teacher, but still, I want to improve!

So, today I was to give the introduction to memory management. An easy topic, with few diagrams and numbers, mostly talking about the intuitive parts of a set of functions. I started scribbling and shortening the main points on a piece of paper (yes, the one in the picture). I am sure I can reduce it even further — but this does feel like an improvement!

The class was quite successful. I didn't present 100% of the material (which is one of the reasons I cling to my presentations — I don't want to skip important material), and at some points I do feel I was going in circles a bit. However, Operating Systems is a very intuitive subject, and getting the students to sketch by themselves the answers that describe the workings of real operating systems was a very pleasant experience!

Of course, when I use my slides I do try to make the class as interactive and collaborative as possible. But that is often unfeasible when I'm following a script. Today I was able to follow the group's questions wherever they went, and still find my way back to the outline I had prepared.

I don't think I'll completely abandon my slides, especially for some subjects which include many diagrams or pictures. But I'll try to keep this alternative closer to my mind.

Categories: FLOSS Project Planets

Tomer Filiba: D for the Win

Planet Python - Wed, 2014-08-20 20:00

I'm a convert! I've seen the light!

You see, Python is nice and all and it excels in so many domains, but it was not crafted for the ever growing demands of the industry. Sure, you can build large-scale projects in Python (and I have built them), but once you take it out of the lab and into the real world, the price you pay is just too high. Literally. In terms of work per CPU cycle, you can't do worse.

The C10M problem is a reiteration of the C10K problem. In short, today's commodity hardware can handle millions of packets per second, but in reality you hardly ever reach such numbers. For example, I worked a short while at a company that used AWS and had tens of twisted-based Python servers accepting and logging requests (not doing any actual work). They managed to squeeze ~500 requests/sec out of this setup, which escalated in cost rather quickly. Moving to PyPy (not without trouble) did triple the numbers or so, but still, the cost simply didn't scale.

Python, I love you, but you help instill Gates' law -- "The speed of software halves every 18 months". In the end, we pay for our CPU cycles and we want to maximize our profit. It's not you, Guido, it's me. I've moved on to the C10M world, and for that I'd need a programming language that's designed for system programming with a strong and modern type system (after all, I love duck typing). I need to interface with external systems, so a C ABI is desirable (no foreign function interface), and meta-programming is a huge plus (so I won't need to incorporate cumbersome code-generation in my build system). Not to mention that mission-critical code can't allow for the occasional NameError or NoneType has no member __len__ exceptions. The code must compile.

I've looked into rust (nice, but will require a couple of years to mature enough for a large-scale project) and go (Google must be joking if they actually consider it for system programming), but as strange as it may sound, I've finally found what I've been looking for with D.

Dlang Dlang Über Alles

System programming is a vast ocean of specifics, technicalities and constraints, imposed by your specific needs. Instead of boring you to death with that, I thought it would be much more intriguing to compare D and Python. In other words, I'll try to show how D speaks fluent Python.

But first things first. In (the probable) case you don't know much D -- imagine it's what C++ would have dreamed to be. It offers cleaner syntax, much shorter compilation time, (optional) garbage collection, highly expressive templates and type inference, Pythonic operator overloading (implemented as rewriting), object-oriented and functional capabilities (multi-paradigm like Python), intermingles high-level constructs (like closures) with low-level ones (naked functions in inline assembly) to produce efficient code, has strong compile-time introspection capabilities and some extra cool features in the domain of code generation: mixin -- which evaluates an arbitrary string of D code at compile time, and CTFE -- compile-time function execution. Whoa, that was long.

In general, D follows Python's duck-typed (or protocol-oriented) spirit. If a type provides the necessary interface ("protocol") for an operation, it will just work, but you can also test for compliance at compile time. For example, ranges are a generalization of generators in Python. All you need to do in order to be an InputRange is implement bool empty(), void popFront() and auto front(), and you can use isInputRange!T to test whether T adheres to the protocol. By the way, the exclamation point (!), which we'll soon get acquainted with, distinguishes compile-time arguments from runtime ones.

For brevity's sake, I'm not going to demonstrate all the properties I listed up there. Instead, I'll show why Python programmers ought to love D.

Case Study #1: Generating HTML

In an old blog post I outlined my vision of HTML templating languages: kill them all. I argued they are all just crippled-down forms of Python with an ugly syntax, so just give me Python and an easy way to programmatically manipulate the DOM.

I've later extended the sketch into a library in its own right, named srcgen. You can use it to generate HTML, C-like languages and Python/Cython code. I used it in many of my commercial projects when I needed to generate code.

So here's an excerpt of how it's done in srcgen:

def buildPage():
    doc = HtmlDocument()
    with doc.head():
        doc.title("das title")
        doc.link(rel = "foobar", type="text/css")
    with doc.body():
        with doc.div(class_="mainDiv"):
            with doc.ul():
                for i in range(5):
                    with doc.li(id = str(i), class_="listItem"):
                        doc.text("I am bulletpoint #", i)
    return doc.render()

And here's how it's done in D:

auto buildPage() {
    auto doc = new Html();
    with (doc) {
        with (head) {
            title("das title");
            link[$.rel = "foobar", $.type = "text/css"];
        }
        with (body_) {
            with (div[$.class_ = "mainDiv"]) {
                with (ul) {
                    foreach(i; 0 .. 5) {
                        with (li[$.id = i, $.class_ = "listItem"]) {
                            text("I am bulletpoint #");
                            text(i);
                        }
                    }
                }
            }
        }
    }
    return doc.render();
}

You can find the source code on github, just keep in mind it's a sketch I wrote for this blog post, not a feature-complete library.

The funny thing is, Python's with and D's with are not even remotely related! The Python implementation builds a stack of context managers, while with in D merely alters symbol lookup. But lo and behold! The two versions are practically identical, modulo curly braces. You get the same expressive power in both.
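
As a side note, here is a minimal Python sketch of the context-manager machinery the Python version leans on. It is a rough simplification added for illustration, not srcgen's actual code; the Document and Element names are made up.

``` python
class Document:
    def __init__(self):
        self.stack = []

class Element:
    def __init__(self, doc, tag):
        self.doc = doc
        self.tag = tag

    def __enter__(self):
        # Entering the `with` block: children created from now on nest under us
        self.doc.stack.append(self)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Leaving the `with` block: the element is closed again
        self.doc.stack.pop()
        return False   # do not swallow exceptions

doc = Document()
with Element(doc, "ul"):
    with Element(doc, "li"):
        print([e.tag for e in doc.stack])   # ['ul', 'li']
```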

Case Study #2: Construct

But the pinnacle is clearly my D version of Construct. You see, I've been struggling for many years to create a compiled version of Construct. Generating efficient, static code from declarative constructs would make the library capable of handling real-world data, like packet sniffing or processing of large files. In other words, you won't have to write a toy parser in Construct and then rewrite it (by hand) in C++.

The issues with my C version of Construct were numerous, but they basically boiled down to the fact that I needed a stronger object model to represent strings, dynamic arrays, etc., and adapters. The real power of Construct comes from adapters, which operate at the representational ("DOM") level of the data, rather than on its binary form. That required lambdas, closures and other higher-level concepts that C lacks. I even tried writing a Haskell version, given that Haskell is so high-level and functional, but my colleague and I gave up hope after a while.

Last week, it struck me that D could be the perfect candidate: it has all the necessary high-level concepts while being able to generate efficient code with meta-programming. I began fiddling with a D version, which proved extremely promising. So without further ado, I present dconstruct -- an initial sketch of the library.

This is the canonical PascalString declaration in Python:

>>> pascal_string = Struct("pascal_string",
...     UBInt8("length"),
...     Array(lambda ctx: ctx.length, Field("data", 1),),
... )
>>>
>>> pascal_string.parse("\x05helloXXX")
Container({'length': 5, 'data': ['h', 'e', 'l', 'l', 'o']})
>>>
>>> pascal_string.build(Container(length=5, data="hello"))
'\x05hello'

And here's how it's done in D:

struct PascalString {
    Field!ubyte length;
    Array!(Field!ubyte, "length") data;

    // the equivalent of 'Struct' in Python,
    // to avoid confusion of keyword 'struct' and 'Struct'
    mixin Record;
}

PascalString ps;
auto stream = cast(ubyte[])"\x05helloXXXX".dup;
ps.unpack(stream);
writeln(ps);   // {length: 5, data: [104, 101, 108, 108, 111]}

Through the use of meta-programming (and assuming inlining and optimizations), that code snippet there actually boils down to something like

struct PascalString {
    ubyte length;
    ubyte[] data;

    void unpack(ref ubyte[] stream) {
        length = stream[0];
        stream = stream[1 .. $];        // advance stream
        data = stream[0 .. length];
        stream = stream[length .. $];   // advance stream
    }
}

Which is as efficient as it gets.

But wait, there's more! The real beauty here is how we handle the context. In Python, Construct builds a dictionary that travels along the parsing/building process, allowing constructs to refer to previously seen objects. This is possible in D too, of course, but it's highly inefficient (and not type safe). Instead, dconstruct uses a trick that's commonly found in template-enabled languages -- creating types on demand:

struct Context(T, U) {
    T* _curr;
    U* _;
    alias _curr this;   // see below
}

auto linkContext(T, U)(ref T curr, ref U parent) {
    return Context!(T, U)(&curr, &parent);
}

The strange alias _curr this is a lovely feature of D known as subtyping. It basically means that any property that doesn't exist at the struct's scope will be forwarded to _curr, e.g., when I write myCtx.foo and myCtx has no member named foo, the code is rewritten as myCtx._curr.foo.

As we travel along constructs, we link the current context with its ancestor (_). This means that for each combination of constructs, and at each nesting level, we get a uniquely-typed context. At runtime, this context is nothing more than a pair of pointers, but at compile time it keeps us type-safe. In other words, you can't reference a nonexistent field and expect the program to compile.

A more interesting example would thus be

struct MyStruct {
    Field!ubyte length;
    YourStruct child;
    mixin Record;
}

struct YourStruct {
    Field!ubyte whatever;
    Array!(Field!ubyte, "_.length") data;   // one level up, then 'length'
    mixin Record;
}

MyStruct ms;
ms.unpack(stream);

When we unpack MyStruct (which recursively unpacks YourStruct), a new context ctx will be created with ctx._curr=&ms.child and ctx._=&ms. When YourStruct refers to "_.length", the string is implanted into ctx, yielding ctx._.length. If we referred to the wrong path or misspelled anything, it would simply not compile. That, and you don't need dictionary lookups at runtime -- it's all resolved during compilation.

So again, this is a very preliminary version of Construct, miles away from production grade, but you can already see where it's going.

By the way, you can try out D online at dpaste and even play around with my demo version of dconstruct over there.

In Short

Python will always have a special corner in my heart, but as surprising as it may be (for a guy who's made his career over Python), this rather unknown, rapidly-evolving language, D, has become my new language of choice. It's expressive, concise and powerful, offers short compilation times (as opposed to C++) and makes programming both fun and efficient. It's the language for the C10M age.

Categories: FLOSS Project Planets

Justin Mason: Links for 2014-08-20

Planet Apache - Wed, 2014-08-20 19:58
Categories: FLOSS Project Planets

Ian Donnelly: Technical Demo

Planet Debian - Wed, 2014-08-20 19:46

Hi Everybody,

Today I wanted to talk a bit about our technical demo. We patched a version of Samba to use our elektra-merge script in order to handle its configuration file, smb.conf. Using the steps from my previous tutorial, we patched Samba to use this new technique of config merging. This patched version of Samba mounts its configuration to system/samba/smb in the Elektra Key Database. Then, during package upgrades, it uses the new --threeway-merge-command command with elektra-merge as the specified command. The result is automatic handling of smb.conf that is conffile-like (thanks, ucf!) and the ability to have a powerful, automatic, three-way merge solution on package upgrades.

The main thing I would like to discuss is how this project improves upon the current implementation of three-way merges in ucf. Before this project, ucf could attempt three-way merges on files it controlled using the diff3 tool. The main limitation of tools like diff3 is that they are line-based and don’t inherently understand the files they are dealing with. Elektra, on the other hand, allows for a powerful system of backends which use plug-ins to understand configuration files. Elektra doesn’t store configuration data on a line-by-line basis, but in a more abstract way that is tailored to each configuration file using backends. smb.conf is a great example of this because it uses the syntax of an ini file, so Elektra can mount it in a way that is intuitive for an ini file. Since data is stored as key=value pairs within ini files, Elektra stores this data in a similar way: for each key in smb.conf there is a Key in Elektra whose value is stored as a string. Then, during a merge, we can compare Keys in each version of smb.conf and easily see which ones changed and how they need to be merged into the result. On the other hand, diff3 has no concept of ini files or keys; it just compares the different versions line by line, which results in many more conflicts than using elektra-merge. Moreover, a traditional weakness of diff is moving lines or data around. While diff3 does a very good job at handling this, it’s not perfect. In Elektra, Keys are named in an intelligent way based on their backend, so for smb.conf the line workgroup = HOME would always be saved under system/samba/smb/workgroup. It doesn’t matter where the lines end up between versions because Elektra just has to check for the Key and its value.

My favorite example is a shortcoming in the diff3 algorithm (at least in my opinion). If something is changed to the same value in ours and theirs, but they differ from base, diff3 reports a conflict. On the other hand elektra-merge can easily handle this problem. A simple example of this would be changing the max log size value in Samba. Here is that line in each version of smb.conf:
Base:
max log size = 1000
Ours:
max log size = 2000
Theirs:
max log size = 2000

Obviously, in the merged version, result, one would expect this line to be:
max log size = 2000

Let’s check the result from elektra-merge:
max log size = 2000

Great! How about diff3:
<<<<<<< smb.conf.base
max log size = 1000
=======
max log size = 2000
>>>>>>> smb.conf.theirs

Whoops! As I mentioned, the diff3 algorithm can’t handle this type of change; it just reports a conflict. Note that smb.conf.base is just representative of the file used as base and that smb.conf.theirs is representative of the file used as theirs. The file names were changed for the sake of clarity.
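
To make the difference concrete, here is a small Python sketch of a key-level three-way merge for flat ini-style data. This is only an illustration of the idea, not elektra-merge’s actual implementation, and the helper names are invented for the example.

``` python
import configparser

def keys(ini_text):
    # Flatten an ini document into {(section, key): value}
    cp = configparser.ConfigParser()
    cp.read_string(ini_text)
    return {(s, k): v for s in cp.sections() for k, v in cp[s].items()}

def merge(base, ours, theirs):
    result, conflicts = {}, []
    for key in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:          # both sides agree, even if both changed from base
            merged = o
        elif o == b:        # only "theirs" changed this key
            merged = t
        elif t == b:        # only "ours" changed this key
            merged = o
        else:               # changed to different values: a real conflict
            conflicts.append(key)
            continue
        if merged is not None:
            result[key] = merged
    return result, conflicts

base   = keys("[global]\nmax log size = 1000\n")
ours   = keys("[global]\nmax log size = 2000\n")
theirs = keys("[global]\nmax log size = 2000\n")
print(merge(base, ours, theirs))
# ({('global', 'max log size'): '2000'}, [])
```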

There are many other examples of the benefits of storing configuration data in a database of Keys that can conform to the actual data, as opposed to storing configuration parameters in files where they can only be compared on a line-by-line basis. With the help of storage plug-ins, Elektra can 'understand' the configurations stored in its Key Database. Since we store the data in a way that makes sense for configuration data, we can easily merge actual configuration data as opposed to just lines in a file. A good example of this is in-line comments. Many of our storage plug-ins understand the difference between comments and actual configuration data. So if a configuration file has an inline comment like so:
max log size = 10000 ; Controls the size of the log file (in KiB)
we can compare the actual key-value pairs between versions (max log size = 10000) and deal with the comments separately.

As a result, if we have a base:
max log size = 1000 ; Size in KiB

Ours:
max log size = 10000 ; Size in KiB

Theirs:
max log size = 1000 ; Controls the size of the log file (in KiB)

The result using elektra-merge would be:
max log size = 10000 ; Controls the size of the log file (in KiB)

Obviously, this line would cause a conflict on any line-based merge algorithm such as diff3 or git. It is worth noting that the ability of elektra-merge is directly related to the quality of the storage plug-ins that Elektra relies on. elektra-merge only looks at the name, value, and any metadata affiliated with each key. As a result, using the line plug-in would result in a merge only as powerful as any other line-based merge. Yet by using the ini plug-in on an ini file we get a much more advanced merge like the one described above.

As you can tell, this new method offers clear advantages over the traditional method of using diff3 to handle configuration merges. I also hope this demo shows how easy it is to add these features to your Debian packages; the community can only benefit if maintainers take advantage of them. I am glad to say that my Google Summer of Code project has been a success, even if we had to do a little change of plans. The ucf integration ended up working great and is really easy for maintainers to implement. I hope you enjoyed this demo and that you better understand the power of using Elektra.

Sincerely,
Ian S. Donnelly

Categories: FLOSS Project Planets

James Duncan: James Foley and the last journalists in Syria

Planet Apache - Wed, 2014-08-20 18:00

As Foley’s murder this week underlines, Syria is the most dangerous place in the world to be a journalist. As The Atlantic’s David Rohde wrote in November, “Syria today is the scene of the single largest wave of kidnappings in modern journalism, more than in Iraq during the 2000s or Lebanon during the 1980s.”

via permalink
Categories: FLOSS Project Planets

James Duncan: Vox’s card stack on ISIS and Iraq

Planet Apache - Wed, 2014-08-20 18:00

Looking for a good background on the organization that executed American journalist James Foley this week? Vox’s card stack provides a good look at ISIS—the group that used to be known as Al-Qaeda in Iraq.

via permalink
Categories: FLOSS Project Planets

FSF Blogs: GNU hackers discover HACIENDA government surveillance and give us a way to fight back

GNU Planet! - Wed, 2014-08-20 17:45

According to Heise online, the intelligence agencies of the United States, Canada, the United Kingdom, Australia, and New Zealand have used HACIENDA to map every server in twenty-seven countries, employing a technique known as port scanning. The agencies have shared this map and use it to plan intrusions into the servers. Disturbingly, the HACIENDA system actually hijacks civilian computers to do some of its dirty work, allowing it to leech computing resources and cover its tracks.

But this was not enough to stop the team of GNU hackers and their collaborators. After making key discoveries about the details of HACIENDA, Julian Kirsch, Christian Grothoff, Jacob Appelbaum, and Holger Kenn designed the TCP Stealth system to protect unadvertised servers from port scanning. They revealed their work at the recent annual GNU Hackers' Meeting in Germany.

Please be sure to share this with everyone you know who cares about bulk surveillance.

We must fight the political battle for an end to mass surveillance and reduce the amount of data collected about people in the first place. On an individual level we have to do everything we can to thwart the surveillance programs that are already in place.

No matter your skill level, you can get involved at the FSF's surveillance page.

Ethical developers inside and outside GNU have been working for years on free software that does not keep secrets from users, and programs that anyone can review to remove potential vulnerabilities. These capabilities give free software users a fighting chance against surveillance. Now, our community is turning its attention to uncovering and undermining insidious programs like HACIENDA. Free software and its ideals are crucial to putting an end to government bulk surveillance.

Share this news with your friends, to help make people aware of the importance of free software in fighting bulk surveillance.

Jacob Appelbaum of the TCP Stealth team gave a remote keynote address at the FSF's LibrePlanet conference this year. Watch the recording of "Free Software for freedom: Surveillance and you."

"Knocking down the HACIENDA" by Julian Kirsch, produced by GNU, the GNUnet team and edited on short notice by Carlo von Lynx from #youbroketheinternet is licensed under a Creative Commons Attribution NoDerivatives 3.0 Unported License.

Categories: FLOSS Project Planets

Ian Ozsvald: Data Science Training Survey

Planet Python - Wed, 2014-08-20 17:08

I’ve put together a short survey to figure out what’s needed for Python-based Data Science training in the UK. If you want to be trained in strong data science, analysis and engineering skills, please complete the survey; it doesn’t need any sign-up and will take just a couple of minutes. I’ll share the results at the next PyDataLondon meetup.

If you want training, you probably want to be on our training announce list; this is a low-volume list (run by MailChimp) where we announce upcoming dates and suggest topics that you might want training around. You can unsubscribe at any time.

I’ve written about the two current courses that run in October through ModelInsight: one focuses on improving skills around data science using Python (including numpy, scipy and TDD), the second on high performance Python (I’ve now finished writing O’Reilly’s High Performance Python book). Both courses focus on practical skills; you’ll walk away with working systems and a stronger understanding of key Python skills. Your developer skills will be stronger, as will your debugging skills, and in the longer run you’ll develop stronger software with fewer defects.

If you want to talk about this, come have a chat at the next PyData London meetup or in the pub after.

Ian applies Data Science as an AI/Data Scientist for companies in ModelInsight, sign-up for Data Science tutorials in London. Historically Ian ran Mor Consulting. He also founded the image and text annotation API Annotate.io, co-authored SocialTies, programs Python, authored The Screencasting Handbook, lives in London and is a consumer of fine coffees.
Categories: FLOSS Project Planets

Four Kitchens: DrupalCamp Twin Cities: Frontend Wrap-up

Planet Drupal - Wed, 2014-08-20 16:58

This year’s Twin Cities DrupalCamp had no shortage of new faces, quality sessions, trainings, and after parties. Most of my time was spent in frontend sessions and talking with folks. Being that I live in Minneapolis, this camp is especially rewarding from a hometown Drupal represent kind of perspective. Below are some of my favorite sessions and camp highlights.

Community Drupal
Categories: FLOSS Project Planets

Aurelien Jarno: MIPS Creator CI20

Planet Debian - Wed, 2014-08-20 16:52

I have received two MIPS Creator CI20 boards, thanks to Imagination Technologies. It’s a small MIPS32 development board:

As you can see it comes in nice packaging with a world-compatible power adapter. It uses an Ingenic JZ4780 SoC with a dual-core MIPS32 CPU running at 1.2GHz and a PowerVR SGX540 GPU. The board is fitted with 1GB of RAM, 8GB of NOR flash, HDMI output, USB 2.0 ports, Ethernet + Wi-Fi + Bluetooth, an SD card slot, an IR receiver, expansion headers and more. The schematics are available. The Linux kernel and the U-Boot bootloader sources are also available.

With a USB keyboard, a USB mouse and an HDMI display attached, the board boots off the internal flash into a Debian Wheezy system, up to the XFCE environment. Besides the kernel, the Wi-Fi + Bluetooth firmware, and very few configuration changes, it runs a vanilla Debian. Unfortunately I haven't found time to play with it more yet, but it already looks quite promising.

The board has not been formally announced yet, so I do not know when it will become available, nor the price, but if you are interested I’ll bring it to DebConf14. Don’t hesitate to ask me if you want to look at it or play with it.

Categories: FLOSS Project Planets

Mediacurrent: UX - Above the Fold & Scrolling

Planet Drupal - Wed, 2014-08-20 16:51

More and more often I am asked, when putting together a Drupal design for a website, how important it is to design above the fold, and whether or not today’s users will scroll to read content.

Categories: FLOSS Project Planets

FSF Blogs: Friday Free Software Directory IRC meetup: August 22

GNU Planet! - Wed, 2014-08-20 16:28

Join the FSF and friends on Friday, August 22, from 2pm to 5pm EDT (18:00 to 21:00 UTC) to help improve the Free Software Directory by adding new entries and updating existing ones. We will be on IRC in the #fsf channel on freenode.


Tens of thousands of people visit directory.fsf.org each month to discover free software. Each entry in the Directory contains a wealth of useful information, from basic categories and descriptions to detailed info about version control, IRC channels, documentation, and licensing, all carefully checked by FSF staff and trained volunteers.


While the Free Software Directory has been, and continues to be, a great resource to the world over the past decade, it has the potential to be a resource of even greater value. But it needs your help!


If you are eager to help and you can't wait or are simply unable to make it onto IRC on Friday, our participation guide will provide you with all the information you need to get started on helping the Directory today!

Categories: FLOSS Project Planets