Planet Apache

Updated: 8 hours 1 sec ago

Holden Karau: Contributing to Spark 3 @ Spark BCN Meetup

9 hours 54 min ago
Thanks for joining me on 2019-03-19 at Spark BCN Meetup 2019 Barcelona, Spain for Contributing to Spark 3.

The slides are at

Contributing to Apache Spark 3 from Holden Karau. Comment below to join in the discussion :). Talk feedback is appreciated at
Categories: FLOSS Project Planets

James Duncan

16 hours 8 min ago

Props to Emma Haruka and her work with the Google Cloud Platform Developer Advocacy team to calculate 31.4 trillion digits of π.

They don’t say how much it would cost to do this yourself, but Corey Quinn estimates it’s about a quarter of a million dollars. Now that it’s been generated, however, you can just grab the data by accessing the delivery.pi API:

$ curl ""
{"content":"3.14159265"}

Or, you can clone off your own copy of their cloud disk image. The dataset is big enough, however, that it’ll cost you $40 a day.

Categories: FLOSS Project Planets

FeatherCast: CHAOSSCon EU 2019, Why it’s Important for Open Source Metrics to Tell a Story, Brian Proffitt

Mon, 2019-03-18 14:20

CHAOSS stands for Community Health Analytics Open Source Software. Recently, at CHAOSSCon EU in Brussels, we spoke briefly to Brian Proffitt, one of the CHAOSS Board members and also Senior Principal Community Architect for the Open Source and Standards team at Red Hat. He tells us why it’s important for metrics to tell a story, why previous metrics may not have been as impartial as people would want, and why increased mailing list traffic could indicate a potential community crisis!
Categories: FLOSS Project Planets

James Duncan

Mon, 2019-03-18 12:45

Somewhere in Kreuzberg

Categories: FLOSS Project Planets

James Duncan

Mon, 2019-03-18 08:20

Trevor Sumner’s tweet thread about why the 737MAX tragedies aren’t a software problem is an insightful read.

Hint: it’s probably a systems problem.

Categories: FLOSS Project Planets

Justin Mason: Links for 2019-03-17

Sun, 2019-03-17 19:58
  • 2-hour-long meetings can impair cognitive functioning

    ‘Study shows three people in a conference room over 2 hours can result in a CO2 level that can impair cognitive functioning. Ie. If you’re making decisions at the end of the meeting, you’re mentally less qualified to do so.’ Well, I’d say that fatigue could also result in this, but it’s interesting to see how unhealthy the typical office environment can be. (via Jeff Dean)

    (tags: via:jeffdean meetings work offices brain co2 cognition)

Categories: FLOSS Project Planets

James Duncan

Sun, 2019-03-17 05:20

Not everyone likes untitled blog posts. Dan Steinberg argues:

As a reader, I remember a post I liked from an author I admire and I look back through their posts and I see a sea of posts without a title. It makes it hard for me to find the one that resonated with me.

Titles are a straightforward affordance for scanning through posts in an archive page. And, it’s not just a human reader that might appreciate the hierarchy a title gives. Search engines really lean on them as well, both for indexing and for displaying search results.

Going title-optional in a meaningful way means thinking about both of these issues.

Categories: FLOSS Project Planets

James Duncan

Sat, 2019-03-16 11:30

Worth a re-read is Zeynep Tufekci’s article for the MIT Technology Review last August: How social media took us from Tahrir Square to Donald Trump.

The conclusion:

If digital connectivity provided the spark, it ignited because the kindling was already everywhere. The way forward is not to cultivate nostalgia for the old-world information gatekeepers or for the idealism of the Arab Spring. It’s to figure out how our institutions, our checks and balances, and our societal safeguards should function in the 21st century—not just for digital technologies but for politics and the economy in general. This responsibility isn’t on Russia, or solely on Facebook or Google or Twitter. It’s on us.

So, what are we going to do about it?

Categories: FLOSS Project Planets

James Duncan

Sat, 2019-03-16 06:15

Beto O’Rourke was a member of a group famous for hacktivism back in the days of the Apple IIe and 300-baud modems? Alright.

Categories: FLOSS Project Planets

Justin Mason: Links for 2019-03-15

Fri, 2019-03-15 19:58
  • The Oxygen of Amplification

    Offering extremely candid comments from mainstream journalists, this report provides a snapshot of an industry [news media] caught between the pressure to deliver page views, the impulse to cover manipulators and “trolls,” and the disgust (expressed in interviewees’ own words) of accidentally propagating extremist ideology. After reviewing common methods of “information laundering” of radical and racist messages through the press, Phillips uses journalists’ own words to propose a set of editorial “better practices” intended to reduce manipulation and harm. As social and digital media are leveraged to reconfigure the information landscape, Phillips argues that this new domain requires journalists to take what they know about abuses of power and media manipulation in traditional information ecosystems; and apply and adapt that knowledge to networked actors, such as white nationalist networks online.

    (tags: media news harassment nazis fascism overton-window journalism racism press)

  • Ash Sarkar on how to counter the new right

    ‘a) Acknowledge that the fascist threat has changed. Its political operations are far more nebulous and diffuse; it works in political institutions and dark corners of the internet; it will adopt and distort liberal tropes and talking points. b) Deal with the fact that traditional forms of policing will be of little effectiveness in countering it. Those with the most power to inhibit the dissemination of far-right and racist ideology are the digital platforms they rely on: reddit, Twitch, YouTube, Twitter, Facebook. c) Transform current affairs media. For too long, producers and editors have taken the alt-right at their word, and framed issues as free speech/limits of offensive humour. That must change. Unless you’re willing to do rigorous research first, don’t commission the debate. d) Overhaul the teaching of PSHE & Citizenship in education to prepare young people for the desensitising and extreme content they will see online. Create space for healthy debate and discussion in respectful environments. Don’t let groomers take advantage of their curiosity. e) Get a very big bin, and put Melanie Phillips, Rod Liddle, and Douglas Murray in it. Then fire the bin into outer space.’

    (tags: alt-right fascism media politics internet social-media twitter reddit ash-sarkar)

  • Why Do so Many Egyptian Statues Have Broken Noses? – Artsy

    wow, TIL. ‘The ancient Egyptians, it’s important to note, ascribed important powers to images of the human form. They believed that the essence of a deity could inhabit an image of that deity, or, in the case of mere mortals, part of that deceased human being’s soul could inhabit a statue inscribed for that particular person. These campaigns of vandalism were therefore intended to “deactivate an image’s strength,” as Bleiberg put it.’

    (tags: egypt culture art history noses)

Categories: FLOSS Project Planets

James Duncan: What are we going to do about this?

Fri, 2019-03-15 10:40

There’s not much that’s reasonable to say about the horrific news from Christchurch today. The actions of the gunman, a self-proclaimed fascist, being amplified and repeated by so many others online…

I just can’t…

I’ve been stuck most of the day. Stuck in thoughts that don’t go anywhere, at least not anywhere good.

The last few years, the worst side of humanity has been winning in a big way, and while there’s nothing new about white supremacy, fascism, violence, or hate, we’re seeing how those old human reflexes have adapted to the tools that we’ve built in and for our online world.

What are we going to do about this?

Categories: FLOSS Project Planets

James Duncan

Fri, 2019-03-15 05:20

Steven Sinofsky’s tweet thread about going through the Microsoft security crisis is related to Facebook starting its own pivot, but is useful thinking for taking any kind of organization through a significant shift of strategy:

All you can do as an organization executing on this sort of pivot is to do the work. Keep people informed. Build incremental trust across all stakeholders by actions.

Nobody knows if Facebook can pull off what Mark Zuckerberg says that he wants to do, but for sure it’s going to be messy, take a lot of time, and it will be a hard time for all involved to build up trust.

Categories: FLOSS Project Planets

James Duncan: Of course Kubernetes isn’t always the answer

Thu, 2019-03-14 06:40

Kubernetes is the new hotness, for sure, and pretty awesome to boot. But is it really needed, especially when you’re bootstrapping a new idea? Maybe not. UK-based Freetrade started out with a decently designed Kubernetes-based stack, complete with all the bells and whistles, but ended up scrapping that plan and launching with Firebase functions.

Even now at 20k users, it’s been the right decision for them.

In fact, not only did we not need it - but if we’d launched with this stack with just two engineers (our launch team size!) I’m confident our customers would have been very unhappy.

As they continue to grow, I’m sure that Freetrade may end up putting some functionality into on-demand container instances or even spin up a Kubernetes cluster to handle the parts of their app that need it. When they do, however, I’m sure they’ll be able to leave a significant portion of their application surface in functions.

Categories: FLOSS Project Planets

James Duncan

Thu, 2019-03-14 05:45

Helping companies through the Microsoft for Startups program has reinforced my feeling that many problems companies face as they grow are really people problems in disguise. To help figure out what a startup needs to focus on at each phase of the startup lifecycle, Wendy van Ierschot gives us a road map.

We’ve identified five stages, each pegged to a certain number of employees. For each stage, we’ve established which areas an organisation should put its HR focus on.

The diagram in the article showing what to focus on at each stage feels about right, and it’s a better starting point for implementing HR strategy in a startup than having nothing at all.

Categories: FLOSS Project Planets

Timothy Chen: The power of choice in data-aware cluster scheduling

Thu, 2019-03-14 00:39

In this post we’ll cover KMN, a scheduler that aims to improve the scheduling of I/O-intensive tasks in distributed compute frameworks like Spark or MapReduce. It differs from the schedulers we discussed previously in that it emphasizes data-aware scheduling.


Today’s batch computing frameworks like Hadoop and Spark run a number of stages and tasks for each job, which together form a DAG (directed acyclic graph) of dependencies. If we assume a large portion of these jobs are I/O intensive, then a scheduler’s job is to minimize the time it takes for tasks to read their data. However, in a large multi-tenant cluster, the ideal node with data locality is often unavailable.

Many data applications and algorithms today also have the option of using only a subset of the source data to approximate an answer, instead of requiring the full set of data.

Spark and MapReduce jobs typically have input tasks that read source data and intermediate tasks that receive data forwarded from the input tasks for further processing. For input tasks, a task scheduler can optimize by placing tasks closer to the source data (locality). For intermediate tasks, the scheduler instead optimizes for minimizing network transfer from the input tasks. One of the main bottlenecks for in-cluster network bandwidth is over-saturated cross-rack links. Using past Facebook traces, the authors simulated what would happen if network contention were avoided and data locality achieved, and estimated an 87.6% performance increase.

KMN Scheduler

The KMN scheduler is implemented in Spark and provides an application interface that lets users choose what fraction of the input data a query will select (1-100%).

Given all N available inputs and their locality choices, the KMN scheduler launches input tasks (one-to-one transfers) on a random sample of K available blocks with memory locality.
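To make the K-of-N choice concrete, here is a minimal sketch in Python. It is not the KMN implementation: the function name, the `block_locations` mapping, and the fallback to non-local blocks when fewer than K local ones exist are all assumptions made for illustration.

```python
import random


def choose_input_blocks(block_locations, free_nodes, k, rng=None):
    """Pick k of the N input blocks, preferring blocks with a replica on a free node.

    block_locations: dict mapping block id -> set of nodes holding a replica (hypothetical).
    free_nodes: set of nodes that currently have a free slot (hypothetical).
    """
    if k > len(block_locations):
        raise ValueError("k cannot exceed the number of input blocks")
    rng = rng or random.Random()

    # Blocks that can run with locality right now.
    local = [b for b, nodes in block_locations.items() if nodes & free_nodes]
    if len(local) >= k:
        # Any k of them will do, so sample at random.
        return rng.sample(local, k)

    # Assumed fallback: not enough local blocks, so top up with non-local ones.
    local_set = set(local)
    remote = [b for b in block_locations if b not in local_set]
    return local + rng.sample(remote, k - len(local))
```

The point of the sketch is that once a query only needs K of the N blocks, the scheduler has many equally valid combinations to pick from, so it is much more likely to find K that are local.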

For intermediate tasks that do many-to-one transfers, the authors’ main insight is that the key to avoiding skew in cross-rack network bandwidth is to launch more than K input tasks (M tasks), since this gives downstream tasks more choices of where to pull data from and so lets them avoid skew. While finding the optimal rack placement for tasks is an NP-hard problem, the authors suggest that either a greedy search, which works best for small jobs, or a variant of round-robin for larger jobs works quite well in their setup.
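A rough way to see why the extra choices help is the sketch below. It is a simple greedy heuristic written for this post, not the authors' algorithm: it picks K upstream tasks one at a time, always taking from the rack that has contributed the fewest so far. The task and rack names are made up, and rack ids are assumed to be sortable so that ties break deterministically.

```python
from collections import Counter, defaultdict


def pick_balanced_upstreams(task_racks, k):
    """Greedily pick k upstream tasks so the picks are spread across racks.

    task_racks: dict mapping upstream task id -> rack id (hypothetical names).
    """
    if k > len(task_racks):
        raise ValueError("k cannot exceed the number of upstream tasks")
    by_rack = defaultdict(list)
    for task, rack in task_racks.items():
        by_rack[rack].append(task)

    picked, load = [], Counter()
    while len(picked) < k:
        # Among racks that still have unpicked tasks, take the least-used one.
        rack = min((r for r in by_rack if by_rack[r]), key=lambda r: (load[r], r))
        picked.append(by_rack[rack].pop())
        load[rack] += 1
    return picked


def max_rack_load(picked, task_racks):
    """Largest number of picked tasks that sit on a single rack."""
    return max(Counter(task_racks[t] for t in picked).values())


# K = 4 outputs are needed. If only 4 tasks ran, there is no choice at all.
only_k = {"t0": "r1", "t1": "r1", "t2": "r1", "t3": "r2"}
# With M = 6 tasks launched, the downstream side can pick a better-spread 4.
with_m = {**only_k, "t4": "r3", "t5": "r3"}

print(max_rack_load(pick_balanced_upstreams(only_k, 4), only_k))  # 3
print(max_rack_load(pick_balanced_upstreams(with_m, 4), with_m))  # 2
```

With only K tasks, three of the four transfers come out of the same rack; launching two extra tasks lets the selection cap that at two.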

One important decision here is how many additional tasks to launch. Too many extra tasks lead to longer job wait times (also taking stragglers into account), but too few can cause network imbalance problems. The goal is to find the balance between the two. One strategy is for the scheduler to decide how long it will wait for upstream tasks to launch and complete before firing the downstream tasks, so that when you do encounter stragglers you won’t be waiting for all of them to complete in your sample.
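The straggler argument can be illustrated with a tiny calculation. This is only a sketch under a simplifying assumption that I am adding here: the downstream stage fires as soon as any K of the M launched upstream tasks have finished. The finish times are invented numbers, not measurements.

```python
def downstream_start_time(finish_times, k):
    """Time at which the k-th upstream task finishes, i.e. when downstream can fire.

    finish_times: completion time of each of the M launched upstream tasks (made-up units).
    """
    if not 1 <= k <= len(finish_times):
        raise ValueError("k must be between 1 and the number of upstream tasks")
    return sorted(finish_times)[k - 1]


# Five upstream tasks, one of which is a straggler.
finish_times = [10, 11, 12, 13, 60]

print(downstream_start_time(finish_times, 5))  # 60: waiting for every task means waiting for the straggler
print(downstream_start_time(finish_times, 4))  # 13: needing only 4 of the 5 skips it
```

The trade-off described above still applies: each extra upstream task costs cluster resources, so M should not be pushed much past K.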


When I chatted with several companies operating large on-prem clusters, cross-rack network congestion was still a real problem. While the importance of data locality is decreasing over time given the faster speeds available in the cloud, I think cross-AZ traffic and network congestion are still problems that I see companies often run into in the cloud.

I can certainly see all distributed data frameworks becoming more aware of cluster resource bottlenecks when making task and distribution decisions.

Categories: FLOSS Project Planets

Bryan Pendleton: The Weight of Ink: a very short review

Thu, 2019-03-14 00:11

Who can resist a love story?

Who can resist a love story, set in a library?

It's two great tastes, that taste great together: Rachel Kadish's The Weight of Ink.

Well, I'm not really being fair. It's not set in a library, it's set (partially) in a Rare Manuscripts Conservation Laboratory.

In a library.

Well, it's also set in a kibbutz in Israel.

Oh, and it's also set in 17th century London, during the time of the Inquisition, and the Plague, and yet also, the time of the birth of modern Philosophy.

It's a book about Baruch Spinoza, who you might never have spent much time thinking about (certainly I never did), and it's a book about being Jewish in England during a time when that was only barely legal.

And it's DEFINITELY a love story.

But it's rather a non-traditional love story, not least because a lot of it is about People Who Love Books, both now and then, back in the days when a book was still a thing that People Who Love Books built by hand, with agonizing care.

Our heroes and heroines are the sort of people who know immediately what a rare thing it is to find a 350 year old book, or even writing of any sort:

Her eyes were on the book. "Iron gall ink," she said after a moment.

Following her gaze, he understood that the damage had been done before he ever touched the ledger. The pages were like Swiss Cheese. Letters and words excised at random, holes eaten through the page over the centuries by the ink itself.

And they are the sort of people who can survive the most horrible tortures and injuries, and yet the thing that pains them the most is the loss of books:

Before she knew what she was saying, she turned to the rabbi. "What do you see," she said, "behind the lids of your eyes?"

For the first time there was unease beneath his silence. She felt a hard, thin satisfaction she was ashamed of.

"I shall not, at this moment, answer this question," he said. "But I will tell you what I learned after I lost my sight, in the first days as I came to understand how much of the world was now banned from me -- for my hands would never again turn the pages of a book, nor be stained with the sweet, grave weight of ink, a thing I had loved since first memory. I walked through rooms that had once been familiar, my arms outstretched, and was fouled and thwarted by every obstacle in my path. What I learned then, Ester, is a thing that I have been learning ever since."

The literary technique of trying to tell two stories, one old and set in the past, and one new and set in the current time, is well-known, and although it can be powerful, it can also be a bit of a crutch.

It also leads to a situation in which the book is packed full of characters, and can be a tad confusing when you jump back and forth, although I felt like, overall, The Weight of Ink pulled this off well, and did not over-burden the reader.

Some of the characters are extraordinarily compelling, and front and center is surely Ester, the 17th-century orphan girl who comes to live in the household of a blind, dying rabbi.

Other characters are, well, not quite so gripping, such as the young heiress Mary, or the extremely annoying graduate student Aaron.

But for my money, my favorite was the aging scholar Helen, absorbed in the study of history, dragging herself out of bed every day, overcoming her advanced Parkinson's disease, to get into the library and spend her time with The Books:

For a long time, Helen sat in the silent laboratory. All around her, on shelves and tables, on metal trays and in glass chambers, lay a silent company of paper: centuries old, leaf after leaf, torn or faded or brittle. Pages inked by long-dead hands. Pages damaged by time and worse. But they -- the pages -- would live again.

The climactic scene in which Helen must go to face the Dean, who waits for her to deliver her requested resignation, is remarkably more vivid and compelling and heart-wrenching than you could possibly imagine.

A large part of The Weight of Ink is the painstaking detective story of the literary historians, discovering The Books, poring over their contents, and then, slowly, but surely, reading between the lines to understand what they really say.

But The Weight of Ink succeeded, for me, because it balances that detective story quite nicely with the fill-in-the-blanks story of Ester and her adventures in London.

Bit by bit, page by page, Aaron and Helen come to understand what Ester's life was like, and what she did and thought and felt.

And yet, how could they? How could any of us know what it was like to be a young girl, alone in a city of tragedies at a time of horrors, still consumed by those most elemental of human passions:

"No," she said. "No, it's not that way. I choose with my heart, and my heart is for you." As she said it she felt her heart insisting within her ribs -- indeed, for the first time in her life she almost could see her heart, and to her astonishment it seemed a brave and hopeful thing: a small wooden cup of some golden liquid, brimming until it spilled over all -- the rabbi breathing in his bed, the dim candlelight by which Ester had so long strained at words on the page, the dead girl with her father in the cart. All that was beautiful and all that was precious, all of it streaming with sudden purpose here -- to this place where they now stood.

And, of course, in and around it all, there is that Birth of Modern Philosophy business, with plenty of Hobbes and Descartes and Spinoza.

And whether that's your thing, or not, probably depends a lot on how you feel about Philosophy.

The folly of her own words astonished her. She pulled the papers back from over the water, and read more, and as she read she saw the enormity of her blindness. In her arrogance and loneliness she'd thought she understood the world -- yet its very essence had been missing from her own philosophy.

The imperative -- she whispered it to herself -- to live. The universe was ruled by a force, and the force was life, and live, and live -- a pulsing, commanding law of its own. The comet making its fiery passage across their sky didn't signify divine displeasure, nor did it have anything to say of London's sin; the comet's light existed for the mere purpose of shining. It hurtled because the cosmos demanded it to hurtle. Just as the grass grew in order to grow. Just as the disfigured woman must defy Bescos, who'd consider her unfit for love; just as Ester herself had once, long ago, written because she had to write.

But I suspect that most People Who Love Books are also people who are quite interested in Philosophy, so I suspect that it's actually a pretty fair bet that if you want to read a love story (actually four or five different love stories, as it turns out) set in a library (yes, yes, I know, a Rare Manuscripts Conservation Laboratory), then you probably want to read a fair amount about the early days of Rationalism and its conflicts with the major Religious Philosophies of the day.

Or maybe you just want to read a great love story!

Oh, to heck with it: go read The Weight of Ink. It's well worth your time.

Categories: FLOSS Project Planets

Justin Mason: Links for 2019-03-13

Wed, 2019-03-13 19:58
Categories: FLOSS Project Planets

James Duncan

Wed, 2019-03-13 07:30

My group at Microsoft is going on tour to twelve cities around the world. This is a big part of my current work and I’m pretty stoked about it.

First stop: New York on April 9th.

Categories: FLOSS Project Planets

James Duncan

Wed, 2019-03-13 05:00

What’s a product roadmap and how do you build one in the early days of a startup? James Turnbull writes about one way to do it on the Microsoft for Startups blog.

Categories: FLOSS Project Planets

Justin Mason: Links for 2019-03-12

Tue, 2019-03-12 19:58
Categories: FLOSS Project Planets