Feeds

Sergey Beryozkin: [OT] The best T-shirt I've seen at Apache Con EU 2014

Planet Apache - Sun, 2014-11-23 17:07
This is the first post about ApacheCon EU 2014, held in beautiful Budapest, which I've been lucky enough to attend.

One of the nice things about being an ApacheCon visitor is that one can see lots of cool T-shirts: the official T-shirts (I do treasure them) and other T-shirts with some great lines or digits printed on them, the kind of T-shirts that many software geeks would be happy to wear. And indeed the visitors at ApacheCon EU 2014 had a lot of different T-shirts to show off.

It was at the presentation about TomEE that I realized that, while the rest of the room was glued to the presentation screen, impressed by what TomEE could do, I was looking at the T-shirts of the TomEE experts doing the presentation and thinking how unfair it was that I did not have a T-shirt like that too.

You can see Romain wearing it here.

Tomitribe, the company which did it right once again :-) !






Categories: FLOSS Project Planets

Possible to download recipes – Foresighters repo

LinuxPlanet - Sun, 2014-11-23 16:29

I've been busy finding a way for users to see what's been added to the foresighters repo and to download the recipe for each package. And now it's possible.

First, you can see latest added recipes in the widget on the right side and browse all recipes here: https://www.foresightlinux.se/flr-recipes/

Don't forget to look at the Videos section, as I've been planning to make some videos in the near future.

Wondering what Foresight Linux 7 looks like? Take a look at the video below.

Categories: FLOSS Project Planets

orkjerns blogg: Headless Drupal with head fallback

Planet Drupal - Sun, 2014-11-23 16:22

I just wanted to take a moment to talk about how I approached the buzzword "headless Drupal" on my blog. It uses a sort of "headless" communication with the Drupal site, but it also leverages Drupal in a standard way, for different reasons. (By the way, if you are interested in "headless Drupal", there is a groups.drupal.org page about the subject.)

First of all, let's examine in what way this simple blog is headless. It is not headless in the sense that it offers all the functionality of Drupal without using Drupal's front-end. For example, these words I am typing are not typed into a decoupled web app or command-line tool. Its only headless feature is that it loads content pages with AJAX through Drupal 8's new REST module. Let's look at a typical setup for this, and how I approached it differently.

A typical setup

A common way to build a front-end JavaScript application leveraging a REST API is to use a framework of your choice (Backbone, Angular, or some other *.js) and build a single-page application (or SPA for short). Basically this could mean that you have an index.html file with some JavaScript and stylesheets, and all content is loaded with AJAX. This also means that if you request the site without JavaScript enabled, you would just see an empty page (except, of course, if you have some way of scraping the dynamic content and outputting plain HTML as a fallback).

Head fallback

I guess the "headless" metaphor sounds strange when I turn it around to talk about "head fallback". But what I mean by this is that I want a user to be able to read all pages with no JavaScript enabled, and I want Drupal (the head) to handle this. All URLs should also contain (more or less) the same content whether you are browsing with JavaScript or without it. Luckily, producing HTML is something Drupal has always done, so let's start there.

Now, this first part should be obvious. If a user comes to the site, we simply show the output of each URL as intended with the activated theme. This is an out-of-the-box feature of Drupal (and any other CMS). OK, so the fallback is covered. The next step is to leverage the REST module and load content asynchronously with AJAX.

Head first, headless later

A typical scenario would be that for the front page I would want to request the "/node" resource with the header "Accept: application/hal+json" to get a list of nodes. Then I would want to display these in the same way the theme displays them statically on a page load. The usual way of doing this is that when the document is ready, we request the resource and build and render the page client side. This is impractical in one way: you are waiting for the entire document to load before actually rendering anything at all. Or maybe even worse: you could be waiting for the entire /node list to load, only to destroy the DOM elements and replace them with the newly fetched and rendered JSON. This is bad for several reasons, but one concrete example is a smartphone on a slow network. This client could start rendering your page on the first chunk of HTML transferred, and that would maybe be enough to show what is called the "above the fold" content. This is also a criterion in the often-used Google PageSpeed. In theory, then, our page would get slower (on first page load) by building a SPA on top of the fallback head.
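As a concrete (if hedged) illustration of that first request, here is a minimal sketch in Python using the requests library rather than the site's actual JavaScript; the /node path and the Accept header come from the description above, while the base URL and everything else are placeholders:

# Sketch only: fetch a node listing from a Drupal 8 REST endpoint as hal+json.
# The base URL is a placeholder; authentication and error handling are omitted.
import requests

BASE_URL = "https://example.com"  # placeholder for the Drupal site

response = requests.get(
    BASE_URL + "/node",
    headers={"Accept": "application/hal+json"},
)
response.raise_for_status()
nodes = response.json()
print(nodes)  # the exact shape of the hal+json payload depends on the site's configuration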

Some "headless Drupal" goodness is very hip, but not at the cost of performance and speed. So what I do for the first page load is trust Drupal to do the rendering, and then initialize the JavaScript framework (Mithril.js in my case) when I need it. Take, for example, you, dear visitor, reading this right now. You probably came to this site via a direct link. Now, why would I need to set up all the client-side routes and re-render this node when all you probably wanted to do was read this article?

Results and side-effects

OK, so now I have a fallback for JavaScript that gives me this result (first picture is without JavaScript, second is with JavaScript):

As you can see, the only difference is that the Disqus comment count cannot be shown in the non-JS version. So the result is that I have a consistent style for both JS and non-JS visitors, and I only initialize the headless part of the site when it is needed.

A fun (and useful) side effect is the page speed. Measured with Google PageSpeed, this now gives me a score of 99 (with the only remaining suggestion being to increase the cache lifetime of the Google Analytics JS).

Is it really headless, then?

Yes and no. Given that you request my site with JavaScript enabled, the first page request is a regular Drupal page render. But after that, if you choose to go to the front page or any other articles, all content is fetched with AJAX and rendered client side.

Takeaways and lessons learned

I guess some of these are more obvious than others.

  • Do not punish your visitor for having JavaScript disabled. Make all pages available for all users. Mobile first is one thing, but you could also consider no-js first. Or both?
  • Do not punish your visitor for having JavaScript enabled. If you render the page based on an AJAX request, the time between initial page load and actual render time will be longer, and this is especially bad on mobile.
  • Subsequent pages are way faster to load with AJAX, both for mobile and desktop. You really don't need to download more than the content (that is, the text) of the page you are requesting, when the client already has the assets and wrapper content loaded in the browser.
Disclaimers

First: these techniques might not always be appropriate for everyone. You should obviously consider the use case before using a similar approach.

If you, after reading this article, find yourself turning off JavaScript to see what the page looks like, then you might notice that there are no stylesheets any more. Let me just point out that this would not be the case if your _first_ page request were without JavaScript. By requesting and rendering the first page with JavaScript, your subsequent requests tell my server that you have JavaScript enabled, and thus I also assume you have stored the CSS in localStorage (as the JS does). Please see this article for more information.

Let's just sum this up with this bad taste gif in the category "speed":

Categories: FLOSS Project Planets

grep @ Savannah: grep-2.21 released [stable]

GNU Planet! - Sun, 2014-11-23 16:22
This is to announce grep-2.21, a stable release.

There have been 94 commits by 3 people in the 25 weeks since 2.20. See the NEWS below for a brief summary. Thanks to everyone who has contributed! The following people contributed changes to this release:

  Jim Meyering (26)
  Norihiro Tanaka (17)
  Paul Eggert (51)

Jim [on behalf of the grep maintainers]

==================================================================

Here is the GNU grep home page:
  http://gnu.org/s/grep/

For a summary of changes and contributors, see:
  http://git.sv.gnu.org/gitweb/?p=grep.git;a=shortlog;h=v2.21
or run this command from a git-cloned grep directory:
  git shortlog v2.20..v2.21

To summarize the 123 gnulib-related changes, run these commands from a git-cloned grep directory:
  git checkout v2.21
  git submodule summary v2.20

Here are the compressed sources and a GPG detached signature[*]:
  http://ftp.gnu.org/gnu/grep/grep-2.21.tar.xz
  http://ftp.gnu.org/gnu/grep/grep-2.21.tar.xz.sig

Use a mirror for higher download bandwidth:
  http://ftpmirror.gnu.org/grep/grep-2.21.tar.xz
  http://ftpmirror.gnu.org/grep/grep-2.21.tar.xz.sig

[*] Use a .sig file to verify that the corresponding file (without the .sig suffix) is intact. First, be sure to download both the .sig file and the corresponding tarball. Then, run a command like this:

  gpg --verify grep-2.21.tar.xz.sig

If that command fails because you don't have the required public key, then run this command to import it:

  gpg --keyserver keys.gnupg.net --recv-keys 7FD9FCCB000BEEEE

and rerun the 'gpg --verify' command.

This release was bootstrapped with the following tools:
  Autoconf 2.69.117-1717
  Automake 1.99a
  Gnulib v0.1-262-g46d015f

==================================================================

NEWS

* Noteworthy changes in release 2.21 (2014-11-23) [stable]

** Improvements

Performance has been greatly improved for searching files containing holes, on platforms where lseek's SEEK_DATA flag works efficiently.

Performance has improved for rejecting data that cannot match even the first part of a nontrivial pattern.

Performance has improved for very long strings in patterns.

If a file contains data improperly encoded for the current locale, and this is discovered before any of the file's contents are output, grep now treats the file as binary.

grep -P no longer reports an error and exits when given invalid UTF-8 data. Instead, it considers the data to be non-matching.

** Bug fixes

grep no longer mishandles patterns that contain \w or \W in multibyte locales.

grep would fail to count newlines internally when operating in non-UTF8 multibyte locales, leading it to print potentially many lines that did not match. E.g., the command "seq 10 | env LC_ALL=zh_CN src/grep -n .." would print this:
  1:1
  2
  3
  4
  5
  6
  7
  8
  9
  10
implying that the match, "10", was on line 1. [bug introduced in grep-2.19]

grep -F -x -o no longer prints an extra newline for each match. [bug introduced in grep-2.19]

grep in a non-UTF8 multibyte locale could mistakenly match in the middle of a multibyte character when using a '^'-anchored alternate in a pattern, leading it to print non-matching lines. [bug present since "the beginning"]

grep -F Y no longer fails to match in non-UTF8 multibyte locales like Shift-JIS, when the input contains a 2-byte character, XY, followed by the single-byte search pattern, Y. grep would find the first, middle-of-multibyte matching "Y", and then mistakenly advance an internal pointer one byte too far, skipping over the target "Y" just after that. [bug introduced in grep-2.19]

grep -E rejected unmatched ')', instead of treating it like '\)'. [bug present since "the beginning"]

On NetBSD, grep -r no longer reports "Inappropriate file type or format" when refusing to follow a symbolic link. [bug introduced in grep-2.12]

** Changes in behavior

The GREP_OPTIONS environment variable is now obsolescent, and grep now warns if it is used. Please use an alias or script instead.

In locales with multibyte character encodings other than UTF-8, grep -P now reports an error and exits instead of misbehaving.

When searching binary data, grep now may treat non-text bytes as line terminators. This can boost performance significantly.

grep -z no longer automatically treats the byte '\200' as binary data.
Categories: FLOSS Project Planets

Dimitri John Ledkov: Analyzing public OpenPGP keys

Planet Debian - Sun, 2014-11-23 16:15
The OpenPGP Message Format (RFC 4880) defines key structure and wire formats (OpenPGP packets) well. Thus, when I looked into setting up a public key network (SKS) server, I quickly found pointers to dump files in said format for bootstrapping a key server.

I did not feel like experimenting with Python and instead opted for Go, and found the http://code.google.com/p/go.crypto/openpgp/packet library that has comprehensive support for parsing OpenPGP low-level structures. I downloaded the SKS dump, verified its MD5SUM hashes (lolz), and went ahead to process them in Go.

With help from http://github.com/lib/pq and database/sql, I've written a small program to churn through all the dump files, filter for primary RSA keys (not subkeys) and inject them into a database table. The things that I have chosen to inject are the fingerprint, N and E. N and E are the modulus and the public exponent of the RSA key pair; together they form the public part of an RSA keypair. So far, nothing fancy.

Next I ran an SQL query to see how unique things are... and found 92 unique N & E pairs that have from two up to fifteen duplicates each. In total that is 231 unique fingerprints which use key material with a known duplicate in the public key network. That didn't sound good. And also odd, given that over 940 000 other RSA keys managed to get enough entropy to pull a unique key out of the keyspace haystack (which is humongously huge, by the way).
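As a rough illustration of the kind of duplicate check described here (my sketch, not the author's program, which is written in Go with lib/pq; the table name, column names and connection string are all assumptions), the query boils down to grouping keys by their (N, E) pair:

# Sketch only: list (N, E) pairs shared by more than one fingerprint.
# "rsa_keys" and the DSN are assumed names, not the author's actual schema.
import psycopg2

conn = psycopg2.connect("dbname=sks_keys")  # placeholder connection string
cur = conn.cursor()
cur.execute("""
    SELECT n, e, COUNT(*) AS dupes
    FROM rsa_keys
    GROUP BY n, e
    HAVING COUNT(*) > 1
    ORDER BY dupes DESC
""")
for n, e, dupes in cur.fetchall():
    print("modulus shared by %d fingerprints (e=%s)" % (dupes, e))
cur.close()
conn.close()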

Having the list of the keys, I fetched them, and they do not look like regular keys - their UIDs do not have names & emails; instead they look like something from the monkeysphere. The keys look like they were originally used for TLS and/or SSH authentication, but were converted into OpenPGP format and uploaded to the public key server. This reminded me of Debian's SSL key generation vulnerability CVE-2008-0166. So these keys might have been generated with bad entropy due to tools affected by that CVE and later converted to OpenPGP.

Looking at the openssl-blacklist package, it should be relatively easy for me to generate all possible RSA key pairs, and I believe all other material that is hashed to generate the fingerprint is also available (RFC 4880#12.2). Thus it should be reasonably possible to generate matching private keys, generate revocation certificates and publish the revocation certificates with pointers to CVE-2008-0166 (or email them to the people who have signed the given monkeysphered keys). When I have a minute I will work on generating openpgp-blacklist type of scripts to address this.

If anyone is interested in the Go source code I've written to process openpgp packets, please drop me a line and I'll publish it on github or something.
Categories: FLOSS Project Planets

Andre Roberge: Practical Python and OpenCV: a review (part 2)

Planet Python - Sun, 2014-11-23 15:53
In part 1, I mentioned that I intended to review the "Case Studies" of the bundle I got from Practical Python and OpenCV  and that I would discuss using the included Ubuntu VirtualBox later.  However, after finishing the blog post on Part 1, I started looking at the "Case Studies" and encountered some "new" problems using the VirtualBox that I will mention near the end of this post.  So, I decided to forego using it altogether and install OpenCV directly.
Note: If you have experience using VirtualBoxes, then it might perhaps be useful to get the premium bundle that includes them; for me it was not.  Including a Ubuntu VirtualBox already set up with all the dependencies and the code samples from the two books is a very good idea and one  that may work very well for some people.
If you need to use VirtualBoxes on Windows for other reasons, perhaps you will find the following useful.
Setting up the VirtualBox
Running Windows 8.1, I encountered an error about vt-x not being enabled.   My OS is in French and googling French error messages is ... hopeless.  So, I used my best guess as to what were the relevant English pages.
From http://superuser.com/questions/785672/linux-64bit-on-virtual-box-with-window-7-profession-64-bit  I understood that I needed to access the BIOS to change the settings so that I could enable virtualization mode.
Unfortunately, I no longer saw an option to access the BIOS at boot time. There are *many* messages about how to re-enable BIOS access at boot time, most of which simply did not work for me. The simplest method I found was to follow (at least initially) the explanation given at http://www.7tutorials.com/how-boot-uefi-bios-any-windows-81-tablet-or-device.
(However, I found out afterwards that the BIOS not being accessible was possibly/likely simply because I had a fast startup option checked in the power settings.)
Once I got access to the BIOS, I changed my settings to enable virtualization; there were two such settings ... I enabled them both, not knowing which one was relevant. I do not recall exactly which settings they were (I did this a month ago and did not take notes of that step) ... but navigating through the options, it was very easy to identify them.
This made it possible to start the virtual box,  but when I tried for the first few times, I had to use the option to run as Administrator for this to work. 
The first time I tried to start the image (as an administrator), it got stuck at 20%. I killed the process. (I might have repeated this twice.) Eventually, it started just fine and I got to the same stage as shown in the demonstration video included with the bundle. I started the terminal - the file structure is slightly different from what is shown in the video but easy enough to figure out.
Using the VirtualBox
I've used the VirtualBox a few times since setting it up.  For some reason, it runs just fine as a normal user, without needing to use the option run as an Administrator anymore.  
My 50+ year old eyes not being as good as they once were, I found it easier to read the ebook on my regular computer while running the programs inside the VirtualBox. Running the included programs, and doing some small modifications, was easy to do and made me appreciate the possibility of using VirtualBoxes as a good way to either learn to use another OS or simply use a "package" already set up without having to worry about downloading and installing anything else.
As I set up to start the "Case Studies" samples, I thought it would be a good opportunity to do my own examples.  And this is where I ran into another problem - which may very well be due to my lack of experience with Virtual Boxes.
I wanted to use my own images.  However, I did not manage to find a way to set things up so that I could see a folder on my own computer.  There is an option to take control of a USB device ... but, while activating the USB device on the VirtualBox was clearly deactivating it under Windows (and deactivating it was enabling it again on Windows indicating that something was done correctly), I simply was not able to figure out how to see any files on the usb key from the Ubuntu VirtualBox.  (Problem between keyboard and chair perhaps?)
I did find a workaround: starting Firefox on the Ubuntu VirtualBox, I logged in my Google account and got access to my Google Drive.  I used it to transfer one image, ran a quick program to modify it using OpenCV.  To save the resulting image (and/or modified script) onto my Windows computer, I would have had to copy the files to my Google Drive ...
However, as I thought of the experiments I wanted to do, I decided that this "back-and-forth" copying (and lack of my usual environment and editor) was not a very efficient nor very pleasant way to do things.
So, I gave up on using the VirtualBox, used Anaconda to install Python 2.7, Numpy, Matplotlib (and many other packages not required for OpenCV), installed OpenCV (3.0 Beta), ran a quick test using the first program included with Practical Python and OpenCV ... (loading, viewing and saving an image)  which just worked.
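For reference, that first program amounts to something like the following minimal sketch (file names are placeholders; this is not the book's exact listing):

# Minimal load/display/save sketch with OpenCV; paths are placeholders.
import cv2

image = cv2.imread("input.png")   # load an image from disk
cv2.imshow("Image", image)        # display it in a window
cv2.waitKey(0)                    # wait for a key press before continuing
cv2.imwrite("output.jpg", image)  # save a copy, here in another format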
Take away
If you have some experience running VirtualBoxes successfully, including being able to copy easily files between the VirtualBox and your regular OS, then you are in a better position than I am to figure out if getting the premium bundle that includes a VirtualBox might be worth your while.
If you have no experience using and setting up VirtualBoxes, unless you wanted to use this opportunity to learn about them, my advice would be to not consider this option.
Now that I have all the required tools (OpenCV, Numpy, Matplotlib, ...) already installed on my computer, I look forward to spending some time exploring the various Case Studies.
---My evaluation so far:  Getting the "Practical Python and OpenCV" ebook with code and image samples was definitely worth it for me.   Getting the Ubuntu VirtualBox and setting it up was ... a learning experience, but not one that I would recommend as being very useful for people with my own level of expertise or lack thereof.
My evaluation of the "Case Studies" will likely take a while to appear - after all, it took me a month between purchasing the Premium bundle and writing the first two blog posts. (Going through the first book could easily be done in one day.)
I do intend to do some exploration with my own images and I plan to include them with my next review.
Categories: FLOSS Project Planets

youtube-dl, Python video download tool, on front page of Hacker News

LinuxPlanet - Sun, 2014-11-23 15:47

By Vasudev Ram



youtube-dl is a video download tool written in Python.

I had blogged about youtube-dl a while ago, here:

youtube-dl, a YouTube downloader in Python [1]

and again some days later, here:

How to download PyCon US 2013 videos for offline viewing using youtube-dl

(The comments on the above post give some better / easier ways to download the videos than I gave in the post.)

Today I saw that a Hacker News thread about youtube-dl was on the front page of Hacker News for at least part of the day (up to the time of writing this). The thread is here:

youtube-dl (on Hacker News)

I scanned the thread and saw many comments saying that the tool is good, what different users are using it for, many advanced tips on how to use it, etc. The original creator of youtube-dl, Ricardo Garcia, as well as a top contributor and the current maintainer (Filippo Valsorda and Philipp Hagemeister, respectively) also participated in the HN thread, as HN users rg3, FiloSottile and phihag_, respectively. I got to know from the thread that youtube-dl has many contributors, and that its source code is updated quite frequently (for changes in video site formats and other issues), both points which I did not know earlier. (I did know that you can use it to update itself, using the -U option).

Overall, the HN thread is a worthwhile read, IMO, for people interested in downloading videos for offline viewing. The thread had over 130 comments at the time of writing this post.

(On a personal note, since I first got to know about youtube-dl and downloaded it, I've been using it a fair amount to download videos every now and then, for offline viewing, and it has worked pretty well. There were only a few times when it gave an error saying the video could not be downloaded, and I am not sure whether it was due to a problem with the tool, or with the video site.)
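Incidentally, since youtube-dl is written in Python, it can also be used as a library from your own scripts rather than only as a command-line tool. A minimal sketch (the URL and the options dictionary are placeholders, and the available options vary between versions):

# Sketch only: calling youtube-dl programmatically; URL and options are placeholders.
import youtube_dl

options = {"outtmpl": "%(title)s.%(ext)s"}  # name downloaded files after the video title
ydl = youtube_dl.YoutubeDL(options)
ydl.download(["https://www.youtube.com/watch?v=EXAMPLE_VIDEO_ID"])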

[1] My first post about youtube-dl also had a brief overview of its code, which may be of interest to some Pythonistas.

This other post which mentions youtube-dl may also be of interest:

The most-watched Python repositories on Github

since youtube-dl was one of those most-watched repositories, at the time of writing that post.

- Vasudev Ram - Dancing Bison Enterprises

Signup for emails about new products from me.

Contact me for Python consulting and training.


Categories: FLOSS Project Planets

Steinar H. Gunderson: Scaling analysis.sesse.net

Planet Debian - Sun, 2014-11-23 15:06

As I previously mentioned, I've been running live chess analysis during the Carlsen–Anand World Chess Championship match. Now it's all over (congratulations to Magnus!), so I thought I should write a few words about scaling, as we ended up peaking at (I think) 1527 simultaneous users, much more than the system was originally designed for (2–3 or so :-) ).

Let me explain first the requirements. Generally the backend system outputs analysis as soon as it comes in from the two chess engines (although rate-limited so that it doesn't output more than once a second), and we want to push this out to the clients as soon as possible. The clients are regular web browsers (both on mobile and on desktop; I haven't checked the ratio) running a fair amount of JavaScript; they generally request an URL in a loop, and whenever something comes in, they display it on the chessboard, wait 100 ms (just as a safeguard) and then go fetch again.

Of course, I could have just had the clients ask every second, but it seems inelegant and a bit wasteful, especially for mobile. If the analysis has come far, or even has stopped entirely since e.g. the game is over, there's no need to go fetch the same data over and over again. Instead, what I want is a system where, if the client already has the latest data, the HTTP request hangs until there's more data, and then gets it immediately. Together with this, there's also a special header that says how many people are connected (which is also shown to the viewers). If a client has been hanging/sleeping for more than 30 seconds, I just re-send the latest analysis; in this world of NATs, transparent proxies and other unpredictable network conditions, I don't want to have connections hanging for minutes with no idea of whether I can actually answer them when the time comes.

The client tells the server, in a CGI parameter (again for simplicity, so that I don't have to deal with caching proxies etc. on the way), the timestamp of the latest data it has. This leads to the following different scenarios:

  1. A client comes in and has no existing data. They should get the latest data, immediately.
  2. A client has old data, and re-asks. They should also get the latest data, immediately.
  3. A client already has the newest data, which causes it to hang, and no new data is ready within 30 seconds. They should get the latest data anew (or I could have returned some other HTTP code, but I decided not to get fancy).
  4. A client already has the newest data, but new data comes in underway. They should get the new data.

Unsurprisingly, this leads to a lot of clients being in the “hanging” state at the same time, and then when new analysis comes in, there's a thundering herd of clients that should have it at the same time (and then come back for more soon after).

Like I wrote earlier, Node.js was pretty much the ideal case model-wise for this; there's only one process to handle all of them, which means the extra memory overhead per hanging client is very low, and when there's new data, we can just send to all of them after each other. Furthermore, since there's only one process, it is also easy to find the viewer count: Simply count all the hanging clients, plus the ones that I think are simply processing the latest data and should come back with a new request soon (the limit was five seconds or so).

However, around 6–700 clients, I started getting issues with requests not coming through. It turns out that the single Node.js process just couldn't handle all that many clients and started hitting the roof CPU-wise. Everyone who's done a bit of performance work in nontrivial systems knows that you can't really optimize anything without profiling it first, but unfortunately, Node.js was extremely limited here. There were some profiling systems that send lots of data to external services, which I didn't really feel like using. Then, there were some systems that try to interpret V8's debug output logs, and they simply gave out bogus answers.

In particular, they claimed 93% of my time was spent in glibc, but couldn't say where in glibc, and when I pointed perf at them, it was pretty clear that most of the time was actually spent in JavaScript and V8 support functions. I took a guess at my viewer count functions, optimized so I didn't have to count as often, and it helped—but I still wasn't really confident it would scale very far. Usually people start up more Node.js workers and then have some load balancer in front, but it would make the viewer counting much more complicated, and the CPU would need to be shared with the chess engine itself, so I wasn't happy with the “just give it more cores” approach either.

So I turned to everyone's favorite website scaling tool, Varnish. With lots of help from Lasse (Varnish Software) and Tollef (ex-Varnish Software, now Fastly), I got things working; it was a sort of bumpy road, though, especially as I hit two different crash bugs in Varnish 4.0.2 that only manifested themselves under actual production load. Here's what I ended up with (running on git master):

The first thing to realize is that we're not trying to keep backend traffic entirely minimal, just cut it significantly. For instance, if Varnish sees #1 (no existing data) and #2 (old existing data) as different and fires off two different backend requests for them, it doesn't really matter. However, we really want all the hanging clients to get the same backend connection; thankfully, Varnish gives us this entirely by itself with its backend coalescing: if it has a backend request going and another one comes in for the same URL, it simply puts the second one on the sleep list and gives both the same response when it comes back. Also, if a slow or new client doesn't manage to get onto the hanging request (i.e., it comes in after the backend response arrived), it should simply be served entirely out of Varnish' cache.

A lot of this comes automatically, and some of it comes with some cooperation between the backend and the VCL. In particular, we can let the client tag the response with the timestamp of the data, and once something comes in, simply purge/ban every object with a different timestamp header, causing us to never give out stale data from the Varnish cache. Varnish bans can be a bit tricky since they're checked lazily, and if you're not entirely careful, you can end up with a very long ban list, but it seems to work well in practice.

However, the distinction between #3 and #4 gives us a problem. We now have a situation where people ask for an URL, it times out after 30 seconds and gives us a response... which we then should give out to everybody, but the next time they ask, we should hang again! This was incredibly tricky to get right; the combination of TTL, Expires headers, grace, and the problem of clock skew for long-running requests (what exactly is the Date timestamp supposed to mean; request received, first byte of backend response, or something else?) and so on was just too much for me. Eventually I got tired of reading the Varnish source code (which, frankly, I find quite opaque) and decided to just sidestep the problem. Now, instead of the 30-second timeout, the backend simply just touches the data file every 30 seconds if it hasn't been changed, so it gets a new timestamp every time. Problem solved.
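That backend-side trick isn't shown in the post, but it could look roughly like the following Python sketch (my illustration, not the author's actual code; the file path and the interval are assumptions):

# Sketch only: bump the analysis file's mtime every 30 seconds if it hasn't changed,
# so hanging clients always get answered with a fresh timestamp.
import os
import time

ANALYSIS_FILE = "/srv/analysis.json"  # placeholder path
INTERVAL = 30                         # seconds

while True:
    time.sleep(INTERVAL)
    age = time.time() - os.path.getmtime(ANALYSIS_FILE)
    if age >= INTERVAL:
        os.utime(ANALYSIS_FILE, None)  # set mtime to "now" so the served timestamp changes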

Finally, there's the counting problem; the backend doesn't see all the requests anymore, so we need a different way of counting. I solve this by tailing the access logs (using varnishncsa) in a separate process, comparing to the updates of the analysis file, and then trying to figure out if they've fallen out or not. Then I simply inject the viewer count into the backend every second. Problem solved, again. (Well, at low traffic numbers, seemingly there's some sort of buffering somewhere, causing me to see the requests way too late, and this causes the count to oscillate down between 1 and the real number somehow. But I don't care too much right now.)

So, there you have it. Varnish' threaded architecture isn't super for this kind of thing; in a sense, much less than Node.js. However, even in this day and age, optimized C beats JavaScript any day of the week; seemingly by a factor five or so. In the end, the 1500 clients were handled with CPU usage of about 40% of one core. I don't really like the fact that it needs ~1500 worker threads for those 1500 clients (I had to increase it from the default of 1024 in order to keep the HTTP errors away), but I used taskset to restrict it to two physical cores in order not to disturb the chess worker threads too much (they are already rather sensitive to the kernel's scheduling decisions).

So, how far can it go? Well, those 1500 clients needed about 33 Mbit/sec, so we could go to ~45k based on bandwidth alone (the server is on gigabit). At that point, though, I sincerely doubt both Varnish and the chess engine could keep going on the same machine, so I'd probably have to move it externally. So next up, maybe Fastly? Well, at least if they start supporting IPv6.

You can find all the code, including the Varnish snippets, in the git repository. Until next time—perhaps WCC 2016! Waiting for Carlsen–Caruana. :-)

Final bonus: Munin graphs. Everyone loves Munin graphs; it's the Comic Sans of system administration.

Categories: FLOSS Project Planets

IO Digital Sec: SSH and SFTP with Paramiko & Python

Planet Python - Sun, 2014-11-23 13:58

Paramiko is a Python implementation of SSH with a whole range of supported features. To start, let's look at the most simple example – connecting to a remote SSH server and gathering the output of ls /etc/

import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())

try:
    ssh.connect('localhost', username='testuser', password='t3st@#test123')
except paramiko.SSHException:
    print "Connection Failed"
    quit()

stdin, stdout, stderr = ssh.exec_command("ls /etc/")
for line in stdout.readlines():
    print line.strip()
ssh.close()

After importing paramiko, we create a new variable ‘ssh’ to hold our SSHClient. ssh.set_missing_host_key_policy automatically adds our server’s host key without prompting. For security, this is not a good idea in production, and host keys should be added manually. Should a host key change unexpectedly, it could indicate that the connection has been compromised and is being diverted elsewhere.

Next, we create 3 variables, stdin, stdout and stderr allowing us to access the respective streams when calling ls /etc/

Then, for each “\n”-terminated line on stdout, we print the line, stripping the trailing “\n” (as print adds one). Finally, we close the SSH connection.
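As noted above, auto-adding host keys is not ideal for production. A safer variant of the connection step loads known host keys and rejects unknown hosts instead (a sketch reusing the same credentials):

# Sketch: verify the server against known host keys instead of blindly accepting them.
import paramiko

ssh = paramiko.SSHClient()
ssh.load_system_host_keys()                               # read ~/.ssh/known_hosts
ssh.set_missing_host_key_policy(paramiko.RejectPolicy())  # refuse hosts we don't know
ssh.connect('localhost', username='testuser', password='t3st@#test123')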

Let’s look at another example, where we communicate with stdin.

cat in its simplest form will print what it receives on stdin to stdout. This can be shown as follows:

root@w:~# echo "test" | cat
test

Simply, we can use:

stdin,stdout,stderr = ssh.exec_command("cat") stdin.write("test")

to allow us to communicate with stdin. But wait! Now the program hangs indefinitely: cat is waiting for an EOF to be received. To send one, we must close the write side of the channel:

stdin.channel.shutdown_write()

Now, let’s extend the example to read a colon separated username and password from a file:

import paramiko
import sys

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())

if len(sys.argv) != 2:
    print "Usage %s <filename>" % sys.argv[0]
    quit()

try:
    fd = open(sys.argv[1], "r")
except IOError:
    print "Couldn't open %s" % sys.argv[1]
    quit()

username, passwd = fd.readline().strip().split(":")  # TODO: add error checking!

try:
    ssh.connect('localhost', username=username, password=passwd)
    stdin, stdout, stderr = ssh.exec_command("ls /tmp")
    for line in stdout.readlines():
        print line.strip()
    ssh.close()
except paramiko.AuthenticationException:
    print "Authentication Failed"
    quit()
except:
    print "Unknown error"
    quit()

Lastly, let’s look at reading a remote directory over SFTP:

import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())

try:
    ssh.connect('localhost', username='testuser', password='t3st@#test123')
except paramiko.SSHException:
    print "Connection Error"

sftp = ssh.open_sftp()
sftp.chdir("/tmp/")
print sftp.listdir()
ssh.close()

Paramiko supports far more SFTP options, including of course the upload and download of files.
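For instance, uploading and downloading files comes down to sftp.put and sftp.get; a short sketch with placeholder paths:

# Sketch: upload and download over an SFTP session; all paths are placeholders.
import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect('localhost', username='testuser', password='t3st@#test123')

sftp = ssh.open_sftp()
sftp.put("/home/user/report.txt", "/tmp/report.txt")  # upload: local -> remote
sftp.get("/tmp/report.txt", "/home/user/copy.txt")    # download: remote -> local
sftp.close()
ssh.close()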

Categories: FLOSS Project Planets

Awesome BSP in München

Planet KDE - Sun, 2014-11-23 13:10

An awesome BSP just took place in München where teams from Kubuntu, Kolab, KDE PIM, Debian and LibreOffice came and planned the future and fixed bugs. This is my second year participating at this BSP and I must say it was an awesome experience. I got to see again my colleagues from Kubuntu and got to […]

Categories: FLOSS Project Planets

Foresighters repo up and running for F7

LinuxPlanet - Sun, 2014-11-23 12:56

We have created a foresighters repository for adding additional packages that don't belong in the main repository or in epel.

To be able to use all the packages without having to specify the label, you can add a search line to the system-model and add it in .conaryrc to make it work with conary rq ***

Categories: FLOSS Project Planets

Andre Roberge: Practical Python and OpenCV: a review (part 1)

Planet Python - Sun, 2014-11-23 11:25
A few weeks ago, I purchased the premium bundle of Practical Python and OpenCV consisting of a pdf book (Practical Python and OpenCV) and short Python programs explained in the book, a Case Studies bundle also consisting of a pdf book and Python programs, and a Ubuntu VirtualBox virtual machine with all the computer vision and image processing libraries pre-installed.

In this post, I will give my first impressions of going through the Practical Python and OpenCV part of the bundle. My intention at this point is to cover the Case Studies part in a future post, and conclude with a review of the Ubuntu VirtualBox, including some annoying Windows-specific problems I encountered when I attempted to install and use it and, more importantly, the solutions to these problems. (In short: my Windows 8 laptop came with BIOS settings that prevented such VirtualBoxes from working - something that may be quite common.)

The Practical Python and OpenCV pdf book (hereafter designated simply by "this book") consists of 12 chapters. Chapter 1 is a brief introduction motivating the reader to learn more about computer vision. Chapter 2 explains how to install NumPy, SciPy, Matplotlib, OpenCV and Mahotas. Since I used the virtual Ubuntu machine, I skipped that part. If I had to install these (which I may in the future), I would probably install the Anaconda Python distribution (which I have used on another computer before) as it already includes the first three packages mentioned above in addition to many other useful Python packages not included in the standard distribution.

Chapter 3 is a short chapter that explains how to load, display and save images using OpenCV and friends.  After reading the first 3 chapters, which numerically represent one quarter of the book, I was far from impressed by the amount of useful material covered.  This view was reinforced by the fourth chapter (Image Basics, explaining what a pixel is, how to access and manipulate pixels and the RGB color notation) and by the fifth chapter explaining how to draw simple shapes (lines, rectangles and circles).  However, and this is important to point out, Chapter 5 ends at page 36 ... which is only one-quarter of the book.  In my experience, most books produced "professionally" tend to have chapters of similar length (except for the introduction) so that one gets a subconscious impression of the amount of material covered in an entire book by reading a few chapters.  By contrast here, the author seems to have focused on defining a chapter as a set of closely related topics, with little regards to the amount of material (number of pages) included in a given chapter.  After reading the entire book, this decision makes a lot of sense to me here - even though it initially gave me a negative impression (Is that all there is? Am I wasting my time?) as I was reading the first few chapters.  So, if you purchased this book as well, and stopped reading before going through Chapter 6, I encourage you to resume your reading...

Chapter 6, Image Processing, is the first substantial chapter. It covers topics such as image transformations (translation, rotation, resizing, flipping, cropping), image arithmetic, bitwise operations, masking, splitting and merging channels, and concludes with a short section on color spaces which I would probably have put in an appendix. As everywhere else in the book, each topic is illustrated by a simple program.

Chapter 7 introduces color histograms explaining what they are, and how to manipulate them to change the appearance of an image.

Chapter 8, Smoothing and Blurring, explains four simple methods (simple averaging, gaussian, median and bilateral) used to perform smoothing and blurring of images.

Chapter 9, Thresholding, covers three methods (simple, adaptive, and Otsu and Riddler-Calvard) to do thresholding. What is thresholding? It is a way to separate pixels into two categories (think black or white) in a given image. This can be used as a preliminary step to identify individual objects in an image and focus on them.

Chapter 10 deals with Gradients and Edge Detection.  Two methods are introduced (Laplacian and Sobel, and Canny Edge Detector).  This is a prelude to Chapter 11 which uses these techniques to count the number of objects in an image.

Chapter 12 is a short conclusion.

After going (quickly) through the book, I found that every individual topic was well illustrated by at least one simple example (program) showing the original image and the expected output.  Since the source code and images used are included with the book, it was really easy to reproduce the examples and do further exploration either using the same images or using my own images.   Note that I have not (yet) tried all the examples but all those I tried ran exactly as expected and are explained in sufficient details that they are very straightforward to modify for further exploration.

For the advanced topic, you will not find some theoretical derivation (read: math) for the various techniques: this is a book designed for people having at least some basic knowledge of Python and who want to write programs to do image manipulation; it is not aimed at researchers or graduate students in computer vision.

At first glance, one may think that asking $22 for a short (143-page) ebook with code samples and images is a bit on the high side compared with other programming ebooks, taking into account how much free material is already available on the Internet. For example, I have not read (yet) any of the available tutorials on the OpenCV site ... However, I found that the very good organization of the material in the book, the smooth progression of topics introduced and the number of useful pointers (e.g. NumPy gives number of columns X number of rows, unlike the traditional rows X cols in linear algebra; OpenCV stores images in Blue Green Red order, as opposed to the traditional Red Green Blue, etc.) make it very worthwhile for anyone who would like to learn about image processing using OpenCV.
I should also point out that books on advanced topics (such as computer vision) tend to be much pricier than the average programming book.  So the asking price seems more than fair to me.

If you are interested in learning about image processing using OpenCV (and Python, of course!), I would tentatively recommend this book.  I wrote tentatively as I have not yet read the Case Studies book: it could well turn out that my recommendation would be to purchase both as a bundle.  So, stay tuned if you are interested in this topic.
Categories: FLOSS Project Planets

Junior Job: Breeze Icon theme for LibreOffice

Planet KDE - Sun, 2014-11-23 09:23

Here's a nice project if you're bored and want to help make a very visual difference to KDE: port the Breeze icon theme to LibreOffice.

Wiki page up at https://community.kde.org/KDE_Visual_Design_Group/LibreOffice_Breeze

All help welcome

Open, Save and PDF icons are Breeze; all the rest are still to go.

 

Categories: FLOSS Project Planets

xzgrep for searching in compressed files

LinuxPlanet - Sun, 2014-11-23 07:46
grep is a very powerful tool for everyone who works with text and strings: for searching through logs, errors and other text-based data. But if the same data is present inside a compressed archive, like a .bz2 or a .xz, then we need to uncompress it before we can use grep on the files.

If there are too many such compressed files, uncompressing all of them without knowing which one contains the data that we are looking for can be a waste of time. This is where xzgrep comes to the rescue.

xzgrep can search through compressed files of formats like .bz2, .xz etc. and look for strings in them like grep does. The only difference is that xzgrep can only confirm whether the search string is present in the compressed file or not; it is not able to list further details like grep does.

For example, let us take two files:

hello1, containing:
12345

hello2, containing:
ABCDEF

Let us create a tar file

tar -cf hello.tar hello1 hello2

Compress it using bzip2:

bzip2 -z hello.tar

This will create the compressed file hello.tar.bz2

Now to search for text inside this compressed file.

$ xzgrep 12 hello.tar.bz2
Binary file (standard input) matches

The string 12 is present in hello1, thus the output says that a match is found.

$ xzgrep 78 hello.tar.bz2

If a match is not found, xzgrep does not output anything.

But note that xzgrep works only on the text files inside the compressed file, just as grep works only on text files. Thus, if the search string is inside a PDF or any other binary document, xzgrep will not be able to pick it up.
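If you do need to know which member of the archive contains the match, one workaround (not part of xzgrep itself) is to inspect the archive programmatically; here is a small Python sketch using the standard tarfile module, with the archive and search string taken from the example above:

# Sketch: report which members of a .tar.bz2 archive contain a search string.
import tarfile

SEARCH = b"12"

with tarfile.open("hello.tar.bz2", "r:bz2") as tar:
    for member in tar.getmembers():
        if not member.isfile():
            continue
        data = tar.extractfile(member).read()
        if SEARCH in data:
            print("%s: match found" % member.name)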


Categories: FLOSS Project Planets

Python 4 Kids: Math (aka Maths) Trainer

Planet Python - Sun, 2014-11-23 07:33

In this tutorial I would like to write a short (command-line) program to help you practise your times tables. The program will ask you a series of questions like, for example, “5×6 =” which you need to answer. The time you take will be recorded, as well as your score, which will be printed out at the end.

First, I will define a model to store the data that we’ll be using. The model makes an allowance for an upper and lower limit on what is being multiplied together (so you don’t have to answer 1×1=). By default this produces the four through 12 times tables, but that can be changed by nominating different values.

# trainer.py
# Brendan Scott
# 19 November 2014
# train user on times tables
# from lowest to highest

from random import shuffle
from time import time

range = xrange

class Model(object):
    def __init__(self, lowest=4, highest=12):
        self.lowest = lowest
        self.highest = highest

    def make_guess_list(self, random_order=True, allow_reversals=True, max_questions=None):
        if allow_reversals:
            question_list = [(x, y) for x in range(self.lowest, self.highest + 1)
                             for y in range(self.lowest, self.highest + 1)]
        else:
            question_list = [(x, y) for x in range(self.lowest, self.highest + 1)
                             for y in range(x, self.highest + 1)]
        if random_order:
            shuffle(question_list)
        if max_questions is not None:
            question_list = question_list[:max_questions]
        return question_list

The model also contains a method for constructing a list of “questions”. Each question is a tuple with two elements. These two elements will be the numbers to be multiplied together. The method has a few parameters:
* random_order determines whether the trainer will ask questions in a random order or in the normal order for a times table (eg all of the four times followed by the five times)
* allow_reversals – this determines whether the trainer will allow both 4×5 and 5×4 or will treat them as the same. This reduces the total number of questions and avoids repetition.
* max_questions: limits the number of questions being asked by chopping questions off the end of the list. This parameter makes more sense when random_order is True.
time.time() is used later…

To this model we add the following interface code:

if __name__ == "__main__":
    m = Model()
    question_list = m.make_guess_list(random_order=True,
                                      allow_reversals=False,
                                      max_questions=10)
    qn_format = "%sx%s= "  # a x b
    summary_fmt = "You scored %s (%s%%) out of a possible %s. " \
        "You took %0.1f seconds in total or %0.2f seconds for each correct answer"
    # score, percent, possible, time_taken, time per correct
    start = raw_input("Press the enter key to start")
    score = 0
    t1 = time()
    for qn in question_list:
        answer = raw_input(qn_format % (qn))
        if answer.lower() == "q":
            break
        correct_answer = qn[0] * qn[1]
        try:
            if int(answer) == correct_answer:
                print("correct!")
                score += 1
            else:
                print("wrong! (should be %s)" % correct_answer)
        except(ValueError):
            print("wrong! (should be %s)" % correct_answer)
            # something other than a number has been typed, just ignore it
    t2 = time()
    total_questions = len(question_list)
    percent = int(100.0 * score / total_questions)
    time_taken = t2 - t1
    try:
        time_per_correct = float(time_taken) / score
    except(ZeroDivisionError):  # if score == 0 (!)
        time_per_correct = time_taken
    fmt_args = (score, percent, total_questions, time_taken, time_per_correct)
    print(summary_fmt % fmt_args)

When this script is run from the command line, a Model is created with default values (4 and 12). A question list is produced which is randomly ordered, does not allow reversals and has at most 10 questions. These values can be changed to suit your needs.
Two template strings are defined, the first for asking questions, the second for providing a summary at the end.
The user is asked to press the enter key to begin the test. Before the questions start, a time marker (t1) is initialised. After the questions have ended, a second time marker is taken (t2). For each of the questions in question_list, the question is put to the user. If the user responds with a “q”, the rest of the test is aborted (break) and the summary is printed. Otherwise the answer is either correct (in which case the score is increased by one) or it isn't (in which case the user is told what the correct answer is). There is a little bit of error checking.

This script could be improved by, for example, keeping track of which questions were wrong (for later practise) and by keeping a leaderboard of results.
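As a sketch of the first suggestion (hypothetical code that reuses the names question_list, qn_format and score from the script above, rather than a drop-in patch), the wrong answers could be collected in a list inside the question loop and printed at the end:

# Sketch only: collect questions answered incorrectly so they can be practised again.
wrong_questions = []
for qn in question_list:
    answer = raw_input(qn_format % qn)
    correct_answer = qn[0] * qn[1]
    try:
        correct = int(answer) == correct_answer
    except ValueError:
        correct = False
    if correct:
        score += 1
    else:
        wrong_questions.append(qn)

if wrong_questions:
    print("Questions to practise again:")
    for a, b in wrong_questions:
        print("  %sx%s = %s" % (a, b, a * b))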

 


Categories: FLOSS Project Planets

KDE Promo Idea

Planet KDE - Sun, 2014-11-23 07:11

New seasonal KDE marketing campaign.  Use Kubuntu to get off the naughty list.

 

Categories: FLOSS Project Planets

Iustin Pop: Debian, Debian…

Planet Debian - Sun, 2014-11-23 04:23

Due to some technical issues, I've been without access to my lists subscription email for a bit more than a week. Once I regained access and proceeded to read the batch of emails, I was - once again - shocked. Shocked at the amount of emails spent on the systemd issue, shocked at the number of people resigning, shocked at the amount of mud thrown.

I just hope that the GR results finally will mean silence and getting over the last 3-6 months.

For the record:

  • I seconded the GR because I believed we were moving too fast (I wanted one full release as a transition period, even if that's a long time)
  • I am quite happy with the result of the GR!
  • I am not happy with the amount of people leaving (I hope they're just taking a break)
  • I am, as usual, behind on my Debian packaging ☹

However, some of the more recent emails on -private give me more hope, so I'm looking forward to the next 6 months. I wonder how this will all look in two years?

(Side-note: emacs-nox shows me the italic word above as italic in text mode: I never saw that before, and didn't know that it's possible to have italic fonts in xterm! What is this trickery⁈ … it seems to be related to the font I use, fun!)

Categories: FLOSS Project Planets

Hiranya Jayathilaka: Running Python from Python

Planet Apache - Sat, 2014-11-22 22:36
It has been pointed out to me that I don't blog as often as I used to. So here's a first step towards rectifying that.

In this post, I'm going to briefly describe the support that Python provides for processing, well, "Python". If you're using Python for simple scripting and automation tasks, you might often have to load, parse and execute other Python files from your code. While you can always "import" some Python code as a module and execute it, in many situations it is impossible to determine precisely at development time which Python files your code needs to import. Also, some Python scripts are written as simple executable files, which are not ideal for inclusion via import. To deal with cases such as these, Python provides several built-in features that allow referring to and executing other Python files.

One of the easiest ways to execute an external Python file is by using the built-in execfile function. This function takes the path to another Python file as the only mandatory argument. Optionally, we can also provide a global and a local namespace. If provided, the external code will be executed within those namespace contexts. This is a great way to exert some control over how certain names mentioned in the external code will be resolved (more on this later).

execfile('/path/to/code.py')
Another way to include some external code in your script is by using the built-in __import__ function. This is the same function that gets called when we use the usual "import" keyword to include some module. But unlike the keyword, the __import__ function gives you a lot more control over certain matters like namespaces.
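A tiny illustration of the __import__ function (the module name here is just an example; in practice it would typically be computed at runtime):

# Equivalent to "import json", except that the module name can be a runtime value.
module_name = "json"
json_module = __import__(module_name)
print(json_module.dumps({"loaded": module_name}))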
Another way to run some external Python code from your Python script is to first read the external file contents into memory (as a string), and then use the exec keyword on it. The exec keyword can be used as a function call or as a keyword statement.

code_string = load_file_content('/path/to/code.py')
exec(code_string)
Similar to the execfile function, you have the option of passing custom global and local namespaces. Here's some code I've written for a project that uses the exec keyword:

globals_map = globals().copy()
globals_map['app'] = app
globals_map['assert_app_dependency'] = assert_app_dependency
globals_map['assert_not_app_dependency'] = assert_not_app_dependency
globals_map['assert_app_dependency_in_range'] = assert_app_dependency_in_range
globals_map['assert_true'] = assert_true
globals_map['assert_false'] = assert_false
globals_map['compare_versions'] = compare_versions
try:
    exec(self.source_code, globals_map, {})
except Exception as ex:
    utils.log('[{0}] Unexpected policy exception: {1}'.format(self.name, ex))

Here I first create a clone of the current global namespace, and pass it as an argument to the exec function. The clone is discarded at the end of the execution. This makes sure that the code in the external file does not pollute my existing global namespace. I also add some of my own variables and functions (e.g. assert_true, assert_false etc.) into the global namespace clone, which allows the external code to refer to them as built-in constructs. In other words, the external script can be written in a slightly extended version of Python.
There are other neat little tricks you can do using constructs like exec and execfile. Go through the official documentation for more details.
Categories: FLOSS Project Planets

Matthew Palmer: You stay classy, Uber

Planet Debian - Sat, 2014-11-22 19:00

You may have heard that Uber has been under a bit of fire lately for its desires to hire private investigators to dig up “dirt” on journalists who are critical of Uber. From using users’ ride data for party entertainment, putting the assistance dogs of blind passengers in the trunk, adding a surcharge to reduce the number of dodgy drivers, or even booking rides with competitors and then cancelling, or using the ride to try and convince the driver to change teams, it’s pretty clear that Uber is a pretty good example of how companies are inherently sociopathic.

However, most of those examples are internal stupidities that happened to be made public. It’s a very rare company that doesn’t do all sorts of shady things, on the assumption that the world will never find out about them. Uber goes quite a bit further, though, and is so out-of-touch with the world that it blogs about analysing people’s sexual activity for amusement.

You’ll note that if you follow the above link, it sends you to the Wayback Machine, and not Uber’s own site. That’s because the original page has recently turned into a 404. Why? Probably because someone at Uber realised that bragging about how Uber employees can amuse themselves by perving on your one night stands might not be a great idea. That still leaves the question open of what sort of a corporate culture makes anyone ever think that inspecting user data for amusement would be a good thing, let alone publicising it? It’s horrific.

Thankfully, despite Uber’s fairly transparent attempt at whitewashing (“clearwashing”?), the good ol’ Wayback Machine helps us to remember what really went on. It would be amusing if Uber tried to pressure the Internet Archive to remove their copies of this blog post (don’t bother, Uber; I’ve got a “Save As” button and I’m not afraid to use it).

In any event, I’ve never used Uber (not that I’ve got one-night stands to analyse, anyway), and I’ll certainly not be patronising them in the future. If you’re not keen on companies amusing themselves with your private data, I suggest you might consider doing the same.

Categories: FLOSS Project Planets

Bryan Pendleton: Turkey trot

Planet Apache - Sat, 2014-11-22 18:07

It seems like we are at that time of the year when everybody is incredibly busy.

And there's so much to read!

  • If you only watch one technical video this year, make it James Hamilton: AWS Innovation at Scale. This session, led by James Hamilton, VP and Distinguished Engineer, gives an insider view of some of the innovations that help make the AWS cloud unique. He will show examples of AWS networking innovations from the interregional network backbone, through custom routers and networking protocol stack, all the way down to individual servers. He will show examples from AWS server hardware, storage, and power distribution and then, up the stack, in high scale streaming data processing. James will also dive into fundamental database work AWS is delivering to open up scaling and performance limits, reduce costs, and eliminate much of the administrative burden of managing databases.
  • Google to Quadruple Computer Science Prize Winnings to $1 Million. The Turing Award had carried prize money of $250,000 and was jointly underwritten by Google and Intel since 2007. But Intel decided to step away as a funder, and Google stepped up and upped the ante.
  • Microsoft Releases Emergency Security Update. “The attacker could forge a Kerberos Ticket and send that to the Kerberos KDC which claims the user is a domain administrator,” writes Chris Goettl, product manager with Shavlik. “From there the attacker can impersonate any domain accounts, add themselves to any group, install programs, view\change\delete date, or create any new accounts they wish.”
  • Compiler Design in C. Compiler Design in C is now, unfortunately, out of print. However, you can download a copy.
  • Keeping Secrets. The conference featured the work of a group from Stanford that had drawn the ire of the National Security Agency and the attention of the national press. The researchers in question were Martin Hellman, then an associate professor of electrical engineering, and his students Steve Pohlig, MS ’75, PhD ’78, and Ralph Merkle, PhD ’79.

    A year earlier, Hellman had published “New Directions in Cryptography” with his student Whitfield Diffie, Gr. ’78. The paper introduced the principles that now form the basis for all modern cryptography.

  • Building a complete Tweet index. In this post, we describe how we built a search service that efficiently indexes roughly half a trillion documents and serves queries with an average latency of under 100ms.
  • Why I'm not signing up for Google Contributor (or giving up on web advertising). People say all kinds of stuff. You have to watch what they do. What they do, offline, is enjoy high-value ad-supported content, with the ads. Why is the web so different? Why do people treat web ads more like email spam and less like offline ads? The faster we can figure out the ad blocking paradox, the faster we can move from annoying, low-value web ads to ads that pull their weight economically.
  • Your developers aren’t slow. Feel like your team isn’t shipping fast enough? Chances are, your developers aren’t to blame.

    What’s really slowing down development?

    If it’s not your developers, what’s slowing down development? Here’s a hint: it’s your process.

  • Cache is the new RAM. You know things are really desperate when “less painful than writing it yourself” is the main selling point.
  • Delayed Durability in SQL Server 2014. With delayed durability, the transaction commit proceeds without the log block flush occurring – hence the act of making the transaction durable is delayed. Under delayed durability, log blocks are only flushed to disk when they reach their maximum size of 60KB. This means that transactions commit a lot faster, hold their locks for less time, and so Transactions/sec increases greatly (for this workload). You can also see that the Log Flushes/sec decreased greatly as well, as previously it was flushing lots of tiny log blocks and then changed to only flush maximum-sized log blocks.
  • Delayed Durability in SQL Server 2014. Like many other additions in recent versions of SQL Server (*cough* Hekaton), this feature is NOT designed to improve every single workload – and as noted above, it can actually make some workloads worse. See this blog post by Simon Harvey for some other questions you should ask yourself about your workload to determine if it is feasible to sacrifice some durability to achieve better performance.
  • The Programmer's Price: Want to hire a coding superstar? Call the agent. Hiring computer engineers used to be the province of tech companies, but, these days, every business—from fashion to finance—is a tech company. City governments have apps, and the actress Jessica Alba is the co-founder of a startup worth almost a billion dollars. All of these enterprises need programmers. The venture capitalist Marc Andreessen told New York recently, “Our companies are dying for talent. They’re like lying on the beach gasping because they can’t get enough talented people in for these jobs.”
  • git-p4 - Import from and submit to Perforce repositories. Create a new Git repository from an existing p4 repository using git p4 clone, giving it one or more p4 depot paths. Incorporate new commits from p4 changes with git p4 sync. The sync command is also used to include new branches from other p4 depot paths. Submit Git changes back to p4 using git p4 submit. The command git p4 rebase does a sync plus rebases the current branch onto the updated p4 remote branch.
Categories: FLOSS Project Planets