Planet Apache


Justin Mason: Links for 2017-09-18

Mon, 2017-09-18 19:58
  • Native Memory Tracking

    Java 8 HotSpot feature to monitor and diagnose native memory leaks

    (tags: java jvm memory native-memory malloc debugging coding nmt java-8 jcmd)

  • This Heroic Captain Defied His Orders and Stopped America From Starting World War III

    Captain William Bassett, a USAF officer stationed at Okinawa on October 28, 1962, can now be added alongside Stanislav Petrov to the list of people who have saved the world from WWIII:

    By [John] Bordne’s account, at the height of the Cuban Missile Crisis, Air Force crews on Okinawa were ordered to launch 32 missiles, each carrying a large nuclear warhead. […] The Captain told Missile Operations Center over the phone that he either needed to hear that the threat level had been raised to DEFCON 1 and that he should fire the nukes, or that he should stand down. We don’t know exactly what the Missile Operations Center told Captain Bassett, but they finally received confirmation that they should not launch their nukes. After the crisis had passed Bassett reportedly told his men: “None of us will discuss anything that happened here tonight, and I mean anything. No discussions at the barracks, in a bar, or even here at the launch site. You do not even write home about this. Am I making myself perfectly clear on this subject?”

    (tags: wwiii history nukes cuban-missile-crisis 1960s usaf okinawa missiles william-bassett)

  • malware piggybacking on CCleaner

    On September 13, 2017 while conducting customer beta testing of our new exploit detection technology, Cisco Talos identified a specific executable which was triggering our advanced malware protection systems. Upon closer inspection, the executable in question was the installer for CCleaner v5.33, which was being delivered to endpoints by the legitimate CCleaner download servers. Talos began initial analysis to determine what was causing this technology to flag CCleaner. We identified that even though the downloaded installation executable was signed using a valid digital signature issued to Piriform, CCleaner was not the only application that came with the download. During the installation of CCleaner 5.33, the 32-bit CCleaner binary that was included also contained a malicious payload that featured a Domain Generation Algorithm (DGA) as well as hardcoded Command and Control (C2) functionality. We confirmed that this malicious version of CCleaner was being hosted directly on CCleaner’s download server as recently as September 11, 2017.

    (tags: ccleaner malware avast piriform windows security)
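A quick sketch of using the Native Memory Tracking feature from the first link above, with the flags as documented for Java 8 HotSpot (the jar name and PID are placeholders, not from the linked article):

```shell
# Start the JVM with native memory tracking enabled ("summary" or "detail").
java -XX:NativeMemoryTracking=summary -jar app.jar

# In another terminal, query the running JVM by PID via jcmd.
jcmd <pid> VM.native_memory summary

# Record a baseline, then later diff against it to spot native leaks.
jcmd <pid> VM.native_memory baseline
jcmd <pid> VM.native_memory summary.diff
```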

Categories: FLOSS Project Planets

Bryan Pendleton: News of the weird, part 4 (of four)

Sun, 2017-09-17 23:13

Well, this isn't exactly news, and I guess you'll have to judge for yourself whether it's weird or not.

But I thought both of these were pretty interesting.

  • How Half Of America Lost Its F**king Mind

    There's this universal shorthand that epic adventure movies use to tell the good guys from the bad. The good guys are simple folk from the countryside ...

    ... while the bad guys are decadent assholes who live in the city and wear stupid clothes.

    The theme expresses itself in several ways -- primitive vs. advanced, tough vs. delicate, masculine vs. feminine, poor vs. rich, pure vs. decadent, traditional vs. weird. All of it is code for rural vs. urban. That tense divide between the two doesn't exist because of these movies, obviously. These movies used it as shorthand because the divide already existed.

  • I Spent 5 Years With Some of Trump’s Biggest Fans. Here’s What They Won’t Tell You.

    Pervasive among the people I talked to was a sense of detachment from a distant elite with whom they had ever less contact and less in common.


    Trump has put on his blue-collar cap, pumped his fist in the air, and left mainstream Republicans helpless. Not only does he speak to the white working class’ grievances; as they see it, he has finally stopped their story from being politically suppressed. We may never know if Trump has done this intentionally or instinctively, but in any case he’s created a movement much like the anti-immigrant but pro-welfare-state right-wing populism on the rise in Europe. For these are all based on variations of the same Deep Story of personal protectionism.

Categories: FLOSS Project Planets

Bryan Pendleton: News of the weird, part 3

Sun, 2017-09-17 15:30

This one, for a change of pace, does not come out of the pages of Wired.

But it's just as weird.

So let's turn the microphone over to the great chess blogger Dana Mackenzie: Scandal Ruins World Cup’s Best Day

everybody is talking about the stupid dispute that caused the Canadian player, Anton Kovalyov, to forfeit his game and withdraw from the tournament — all over a pair of shorts.

Probably most of my readers are already familiar with the sad details, but for those who haven’t heard yet, these seem to be the facts:

  • Kovalyov showed up for his game against Maxim Rodshtein wearing a pair of shorts. He had worn the same shorts for his previous four games. Yes, apparently he only packed this one pair of shorts for a potentially month-long chess tournament. Cue jokes about chess players’ dressing habits.
  • The chief arbiter spoke to him and told him that the players’ dress code (which is in a legal contract they sign before the tournament) requires more dignified wear. He told him to go back to his room and change.
  • Kovalyov went back to his room but never reappeared. His opponent played one move (1. d4) and won by forfeit.

Even from these facts, it seems to me that the FIDE approach was very heavy-handed. From a legal point of view it seems to me that they have greatly weakened their case by allowing Kovalyov to play four games (!) in the offending garment. The arbiter said that nobody noticed earlier. Come on! If it’s a rule, then enforce it from the beginning. If it’s not enforced, then it’s not really a rule.

Kovalyov is actually Ukrainian, playing as a Canadian citizen, but living in Brownsville, Texas, where he studies computer science and got a chess scholarship!

Kovalyov later wrote about this on his Facebook page, then tried to delete what he wrote, then tried to close his Facebook account, then re-opened his Facebook account, then wrote about it some more.

More at The Guardian, where we find that the REAL issue may have involved an ethnic slur:

Azmaiparashvili refused to back down, said Kovalyov. “At this point I was really angry but tried not to do anything stupid, and asked him why he was so rude to me, and he said because I’m a gypsy,” he said.

He continued: “So imagine this, the round is about to start, I’m being bullied by the organiser of the tournament, being assured that I will be punished by FIDE, yelled at and racially insulted. What would you do in my situation? I think many people would have punched this person in the face or at least insulted him. I decided to leave.”

Assuming that is what actually happened, it's a shame, but clearly he made the right decision.

The internets took to calling this "shortsgate" for a little while.

But it has now passed from public interest.

Categories: FLOSS Project Planets

Bryan Pendleton: News of the weird, part 2

Sun, 2017-09-17 12:59

There are a lot of strange, disturbing, bizarre aspects to this long book excerpt that ran on the Wired website: Meet the CamperForce, Amazon's Nomadic Retiree Army.

The article is an excerpt from an upcoming book, by the way; it's not intended to be a stand-alone article on Wired.


The article winds through a long and close examination of what it's like to chase jobs in Amazon distribution centers around the country, camping out in your R.V. at night, getting up at 4:00 A.M. to get to work on time, taking advantage of "the free generic pain relievers on offer in the warehouse".

You won't be surprised to hear that this is No Fun At All:

Chuck was a picker. His job was to take items down from warehouse shelves as customers ordered them, scanning each product with a handheld barcode reader. The warehouse was so immense that he and his fellow workers used the names of states to navigate its vast interior. The western half was “Nevada,” and the eastern half was “Utah.” Chuck ended up walking about 13 miles a day. He told himself it was good exercise. Besides, he’d met another picker who was 80 years old—if that guy could do it, surely he could.

Barb was a stower. That meant scanning incoming merchandise and shelving it. Stowers didn’t have to walk as far as pickers did, though Barb’s muscles still ached from the lifting, squatting, reaching, and twisting motions that her job required. Much of the strain was mental. With the holiday season nearing, the warehouse’s shelves were crammed, and one day she wandered around the warehouse for 45 minutes—she timed it—looking for a place to stow a single oversized book. Barb murmured, “Breathe, breathe,” to herself to stay calm.

On days off, many of Barb and Chuck’s coworkers were too exhausted to do anything but sleep, eat, and catch up on laundry.

Much of this article won't be a surprise, as this part of America has been documented for decades (see, e.g., More retirees keep one foot in workforce for pay and play and More Help Wanted: Older Workers Please Apply and Older Workers Survey, Working Longer, Younger Employees, Dear Abby).

And, though CNBC rather sunnily quotes an expert on "Aging & Work" as saying that

"We're in a new era of retirement, and we're not going back."

He added that "most people assume that seniors keep working due to financial necessity, and some do, but the majority do it to keep active and stay alert."

the reality, clearly, is much closer to the converse of that viewpoint, as bluntly explained by the AARP, or by the Times, which observes that The recruitment efforts for the elderly are reaching a willing audience, as more older people seek work because they need extra cash and health benefits and sometimes because they miss having a 9-to-5 routine with other workers.

I mean: duh. I DO know some people who are, perhaps, deferring retirement because they really enjoy their current job and don't (yet) have enough saved up to be able to retire as they choose.

But, really?

"They don't want to go fishing; they want to stay sharp," said Jeanne Benoit, principal director of human resources at the Charles Stark Draper Laboratory, a military research contractor in Cambridge, Mass., that creates prototypes for aerospace projects.


They want to go fishing.

And they don't appreciate you telling them that they aren't sharp, you young whippersnapper.

Anyway, back to the Wired article.

One of the things that drives me crazy about this whole situation, and which seems vastly under-reported, is how people got into these situations in the first place.

And the Wired article provides some fascinating detail in this area.

For instance:

Chuck still remembers the call from Wells Fargo that brought the 2008 financial crisis crashing down on his head. He had invested his $250,000 nest egg in a fund that supposedly guaranteed him $4,000 a month to live on. “You have no more money,” he recalls his banker saying flatly. “What do you want us to do?”


Bob worked as an accountant for a timber products firm, and Anita was an interior decorator and part-time caregiver. They thought they would retire aboard a sailboat, funding that dream with equity from their three­ bedroom house. But then the housing bubble burst and their home’s value tumbled. Neither could imagine spending the rest of their lives servicing a loan worth more than their house. So they bought the trailer and drove away. “We just walked,” Anita says. “We told ourselves, ‘We’re not playing this game anymore.’”

Bob blamed Wall Street. When he spoke about his decision to abandon the house, he’d rush to add that, before that moment, he’d always paid the bills on time.

I mean yes, finances are complicated!

But it doesn't take much more than elementary school mathematics to be able to look at a $250,000 "nest egg" and realize that, if you withdraw $50,000 a year, it will only last (wait for it...): 5 years.

Nor should it take much more sophistication to understand that, if your entire plan for retirement is to depend on your house doubling in value so that you can sell it and buy a sailboat, well, you're gambling. You were a professional ACCOUNTANT? And you blamed "Wall Street"?

Now, part of this shame does indeed belong to the bankers and real-estate professionals and others who sold everyone a pipe dream back in the early 2000's.

They were con artists, and a lot of pain was caused by all that speculation, lying, pyramid schemes, and "financial engineering."

But, really, part of this shame is simpler to understand; it seems undeniable that, as a country, we are clearly failing our people.

We should be teaching basic "financial sense" in elementary school.

We should be making retirement savings accounts MANDATORY.

We should be providing universal health care to all. Yes, even if you're not working. Medicare for all.

And otherwise legitimate media organizations like CNBC and The New York Times should be flat-out ashamed of themselves to publish rot about "staying alert" or "pay and play" or "staying sharp" or "missing that 9-to-5 routine."

Call it what it is: elder abuse.

Categories: FLOSS Project Planets

Bryan Pendleton: News of the weird, part 1

Sat, 2017-09-16 20:32

I read a pair of (unrelated) stories on the Wired website recently that have stuck with me, for probably the wrong reasons.

Warning, ahead of time: these are weird stories. Odd, strange, disturbing, uncomfortable.

But, I think, not incorrect. Nor are they misdirected or misleading. I think this is just an honest assessment of Our Strange Times.

So, forthwith:

A Weird MIT Dorm Dies, and a Crisis Blooms at Colleges

This starts out being a story about how things at MIT are a little odd, which isn't, really, that much of a surprise.

MIT, after all, is the home of the Smoot, a measurement unit for bridge lengths, and of nearly legendary student pranks.

But, something about Senior House is not quite right.

This was Senior House, the oldest dormitory on campus, built in 1916 by the architect William Welles Bosworth. For 101 years it welcomed freshman and returning students. Since the ’60s it was a proudly anarchic community of creative misfits and self-described outcasts—the special kind of brilliant oddballs who couldn’t or didn’t want to fit in with the mainstream eggheads at MIT.

If it was just brilliant oddballs, there wouldn't be an issue. Something else happened, and the question that Wired wants to discuss is: is this MIT? Or is this America, changing?

The demise of Senior House is emblematic of a larger shift on campuses across the US. Last year my own alma mater, Wesleyan University, closed down its countercultural house Eclectic, which had existed for a century. A few years ago Caltech kicked students out of its countercultural dorm Ricketts.

And what, exactly, happened at Senior House? It seems it's rather a mystery:

the administration refused to disclose what precisely had happened, but Barnhart told the student newspaper The Tech that “we received highly credible reports of unsafe and illegal behavior in Senior House.”

Unsafe and illegal behavior? I am shocked!

Wired suggests that this is all due to risk-averse administrators:

college tuition has skyrocketed and with it the competition for students who can afford it. Parents footing the bill are paying a lot more attention. The world has become more litigious and more corporate. All of this has led to an atmosphere in which university administrations have little margin for error when it comes to student safety or even bad publicity. Money. And lawyers. And lawyers, worried about money.

or is it, rather, that you can't legislate weirdness?

groups like Senior House, which define themselves by being different, also run the risk of becoming highly conformist, Packer says. The punk rock movement is a particularly vivid example of this phenomenon. “They self-describe as being different, but from the outside they all look the same,” he says.

I'd hate to think that the weird is gone from college: discard the weird and you discard so much of what is important about school. And Wired seems to feel that way too, forecasting a rather glum future:

When school ends, they’ll head out into the big wide world, where building a nurturing community sometimes feels hard. Maybe the invisible threads of the internet will help bind them. Maybe Senior House alums will meet up in different cities to drink beer and trade stories of Steer Roasts past or find themselves across from each other at tech company boardroom tables, the memory of that shared place a secret tie between them.

One of my correspondents suggested a close parallel between the crackdown on Senior House, and the Ghost Ship backlash.

I think she makes a great point. Yes, these brilliant oddballs make us uncomfortable, and yes, they live on the edge.

But what do we sacrifice when we legislate their conformity?

I'm not betting that the invisible threads of the internet will solve this problem.

Categories: FLOSS Project Planets

Bryan Pendleton: All the Wild That Remains: a very short review

Sat, 2017-09-16 01:02

We're hoping to make a trip to southern Utah sometime later this year.

It's been on my list for a long time; the last time I was in those parts was 1972, and I don't remember much.

(What? I was only 11! And, how much do you remember from 45 years ago?)

Anyway, as a bit of a warm up, I came across David Gessner's All The Wild That Remains: Edward Abbey, Wallace Stegner, and the American West.

Oh, this is a wonderful book!

Gessner, a literature professor and writer himself, tries, and mostly succeeds, to tie together two of the great writers of the west: Abbey and Stegner.

It turns out that, in a bit of a coincidence, we're approaching the 50th anniversary of Abbey's Desert Solitaire, and the 75th anniversary of Stegner's The Big Rock Candy Mountain.

So it's a wonderful occasion to spend some time thinking about Abbey and Stegner.

But Gessner manages to do more than that; his interests are broad and before we are done he has discussed water rights, the Wilderness Act of 1964, fracking, the Dust Bowl, forest fires, whether the Russian Olive or the Green Tamarisk is the less "native" plant, and many other topics.

Oh, and pronghorn.

Gessner loves pronghorn, and rightly so. Here he is, driving through the west with his daughter:

Hadley and I thanked him and pushed off for points north and west, driving out of Colorado and into Wyoming. We spent hours crossing southern Wyoming. In late afternoon we saw a herd of pronghorn antelopes gliding across the prairie. Pronghorns are the fastest land animals in the West, and the truth is it isn't even close. I told Hadley a fact I had learned from a friend: the reason pronghorns run so fast, much faster than any predator of theirs, is that they are outrunning a ghost -- the long-extinct American cheetah, which centuries ago chased them across these grasslands.

To see a pronghorn run is to want to run yourself. A more graceful animal is hard to imagine. Delicate and gorgeously bedecked with rich brown-and-white patterns, with small horns and snow-white fur on their stomachs, they glide across the land. As we drove I was worried about all the barbed-wire fences that blocked their way as they roamed, at least until I saw one pronghorn fawn jump a fence like it was nothing, flowing over it like water.

It's marvelous fun to follow along with Gessner as he revisits the lands of Abbey and Stegner, kayaking and rafting the rivers they rode, hiking the trails they followed, looking out from the summits they climbed.

But that's just the icing. The hard work of Gessner's book involves a serious consideration of whether Abbey and Stegner have staying power, whether they deserve to be read and studied and considered, even now after so much time has passed.

It's rather easier to answer this question for Stegner, whose life and work is so obviously important: winner of the Pulitzer Prize and National Book Award, original member of the University of Iowa Writers' Workshop, founder of the Stanford Creative Writing Program, author of the Wilderness Letter, inspiration for the Wilderness Act, savior of Dinosaur National Monument, oh the list just goes on and on.

But Abbey? Troublemaker, rebel, outlaw, misogynist, curmudgeon? Should we still be reading and studying Abbey, as well?

Gessner's answer is an unqualified "yes":

So is Abbey passé, as dated as bad '70s hair? Obviously I wouldn't be out here tracking his spoor if I thought so. But it is difficult, at least at first, to see how his spirit might be adapted to fit our times. For instance, isn't monkeywrenching dead, not just in an FBI agent's eyes, but as a legitimate possibility for the environmental movement? I must admit that in my own grown-up life as a professor and father I don't blow a lot of things up. For most of us who care about the environment, Stegner provides a much more sensible model.

But I don't want to be so quick to toss Abbey on the scrap heap. Looked at in a different way, Abbey's ideas about freedom are exactly what is needed today. If the times have changed, the essence of what he offered has in some ways never been more relevant. Many of the things that he foresaw have come to pass: we currently live in an age of unprecedented surveillance, where the government regularly reads our letters (now called e-mails) and monitors our movements. Abbey offers resistance to this. Resistance to the worst of our times, the constant encroaching on freedom and wildness. He says to us: Question them, question their authority. Don't be so quick to give up the things you know are vital, no matter what others say.

Biography is usually not my thing; I often find it dry and dated.

But Gessner's treatment of Abbey and Stegner is warm, spirited, and refreshing.

Even though I came to it with a fondness for both writers, and for the region they both loved so well, I still found All the Wild That Remains a vivid, compelling, and lively treatment of people and topics that are just as crucial today as they were nearly a century ago.

Categories: FLOSS Project Planets

Justin Mason: Links for 2017-09-15

Fri, 2017-09-15 19:58
  • Malicious typosquatting packages in PyPI

    skcsirt-sa-20170909-pypi vulnerability announcement from SK-CSIRT:

    SK-CSIRT identified malicious software libraries in the official Python package repository, PyPI, posing as well known libraries. A prominent example is a fake package urllib-1.21.1.tar.gz, based upon a well known package urllib3-1.21.1.tar.gz. Such packages may have been downloaded by unwitting developer or administrator by various means, including the popular “pip” utility (pip install urllib). There is evidence that the fake packages have indeed been downloaded and incorporated into software multiple times between June 2017 and September 2017.

    (tags: pypi python typos urllib security malware)
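    The fake packages work because their names sit one small edit away from the real thing ("urllib" for "urllib3"). A toy sketch of that idea — illustrative only, and not how pip or PyPI actually screen uploads — flagging requested names that are suspiciously close to well-known packages via Levenshtein distance:

    ```python
    # Toy typosquatting check: flag a requested package name that is within a
    # small edit distance of a well-known package. The POPULAR list below is
    # illustrative, not an official registry of protected names.

    def edit_distance(a: str, b: str) -> int:
        """Classic Levenshtein distance via dynamic programming."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                  # deletion
                               cur[j - 1] + 1,               # insertion
                               prev[j - 1] + (ca != cb)))    # substitution
            prev = cur
        return prev[-1]

    POPULAR = {"urllib3", "requests", "setuptools", "numpy"}

    def suspicious(name: str, max_distance: int = 2):
        """Return the popular packages this name may be imitating."""
        return [p for p in POPULAR
                if name != p and edit_distance(name, p) <= max_distance]

    print(suspicious("urllib"))  # → ['urllib3']
    ```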

Categories: FLOSS Project Planets

Community Over Code: Three React-ions to the Facebook PATENTS License

Fri, 2017-09-15 14:35

There are really three aspects to your project’s decision (to use React.js or not based on the BSD+Patents license), and it’s important to consider each of them. You really need to consider which aspects are important to your project’s success — and which ones don’t really matter to you.
(See the updated FAQ about the PATENTS issue on Medium!)

  • Legal — both details of the license and PATENTS file that Facebook offers React.js under, and some realistic situations where the patent clauses might actually come into play (which is certainly rare in court, but it’s the chilling effect of uncertainty that’s the issue)
  • Technology — are other libraries sufficiently functional to provide the features your project needs? Does a project have the capacity to make a change, if they decided to?
  • Community — how does the rest of the open source community-of-communities see the issue, and care about your choices? This includes both future buyers of a startup, as well as future partners, as well as future talent (employees) or contributors (open source developers).


I’ll start off with what is almost certainly the least important issue to consider for your project: licenses and patents.

  • The legal issue is immaterial unless you’ve really thought through your project’s business model and really taken the time to evaluate how any patents you might now or in the future hold play into this. Seriously, before you read about this controversy (and even earlier), how much did you worry about potential future patent claims that might be in the many open source components your company uses?
  • The major legal point that’s worth bringing up as a generality is that including software under a non-OSI approved license always adds complexity, immaterial of the details of the license. In an honest open source world, there is never a good reason to use a license besides one of the OSI-approved licenses. OSI-approval is not a magic stamp; however, it does show licenses that are so widely used — and reviewed by lawyers — that there is seen as less risk to everyone else in consuming software under an OSI license.
  • Note: React is not offered under the “BSD + Patent” (or, more specifically, BSD-2-Clause-Patent; thanks, SPDX) OSI-approved license. It is offered under the BSD-3-Clause license (OSI-approved), plus Facebook’s own custom-written PATENTS file. It’s the addition of the custom bits about patents — which may be well written, but are different than other well-used licenses — that is the issue. Different licenses mean the lawyers need to spend extra time reviewing them before you can even get an informed opinion.
Technology Changes

If you are not yet using React.JS in your project(s), then now is an excellent time to review the functionality and ecosystems around the similar libraries, like Preact, Vue.js, React-lite, Inferno.js, Riot.js, or other great JS libraries out there.

If you are already using React.JS — like a lot of people — then you should take a brief moment to read up on this licensing issue. Don’t simply listen to the hype pro or con, but think how the issues you’ve read about apply to your project and your goals. React.JS has been using the Facebook PATENTS license for years now, so this is not a new situation and certainly doesn’t mean you need to make any quick changes.

If you are really worried about the legal aspects of the license now, then you need to ask yourself: is it practical for us to change libraries?

  • Is there another open source library that provides sufficient functionality for what your project needs?
  • Do you have the technical capacity — engineering staff (for a company) or passionate volunteers (for an open source project) — to change your architecture to use a new library?
  • Are there aspects of your project that could work better if you changed to a different library?

These questions probably look more familiar to most readers than the license and patent issues do. And in most cases, these are the most important questions for your project — your technical capacity to make any changes, and whether this is an opportunity to improve things, or just extra make-work to switch libraries.

Community Expectations

What does your community think about this issue? Again: not just the hype, take the time to think this through. And consider what “community” means to your specific project — VCs to buy your startup, customers to buy your software, contributors to join your open source project or developer talent you want to hire for your company. You need to understand who your community is to understand how they will view your decision.

  • If you are a big company, your lawyers have probably already told you what to do. Most likely if you’re already using React.JS, the issue was decided long ago when you first started using it.
  • If you are a startup thinking about VC exits, then don’t worry about the hype. But you do need to do an analysis of how this (old) news affects your project and your specific goals. My bet is that it won’t matter much — any major VC’s lawyers have long known about this issue and already calculated their reaction. More to the point, if you’re looking for a big buyout, at that point you’ll add enough staff to rewrite at the time if you decide it’s necessary (but I bet it won’t be).
  • If you are a software company building a variety of applications, you probably don’t need to worry about existing tools. Certainly, consider alternatives for new tools you start.
  • If you are a non-software company, you don’t need to worry about it. React.JS has used this license for ages, so there’s no change (just hype and news about the ASF policy change).
  • If you are an open source project, you’re probably already realizing that many open source contributors expect a level playing field for any software that calls itself “open source”. That means using an OSI-approved license, period. In particular, if your project might intend to ever join the Apache Software Foundation, then you need to consider OSI-licensed alternatives since the ASF no longer allows React.JS in its projects.
What Open Source Means

The big lesson here is: if you expect to play in the open source arena, you need to be honest about what “open source” means. There are a lot of aspects to the definition, but the most important one is publicly providing the source code under an OSI-approved license. React.JS is not offered under an OSI-approved license — and now that people are talking about it, they’re realizing it’s not the kind of open source they expected.

The details of licensing are complex but rarely matter in a developer’s day to day life. What is important is managing expectations and risk, not just for yourself but for consumers and contributors to your project. Using an OSI-approved license means that the world can easily and quickly understand what you’re offering. Using a custom license means… people need to pause and evaluate before considering contributing to your projects.

Even if you think you have a reason to use a custom license, you probably don’t (other than using a proprietary license, which is just fine too). Stick with OSI, because that’s what the world expects.

This Three Reactions post previously appeared on Medium.

The post Three React-ions to the Facebook PATENTS License appeared first on Community Over Code.

Categories: FLOSS Project Planets

Colm O hEigeartaigh: Securing Apache Hive - part V

Fri, 2017-09-15 06:40
This is the fifth in a series of blog posts on securing Apache Hive. The first post looked at installing Apache Hive and doing some queries on data stored in HDFS. The second post looked at how to add authorization to the previous tutorial using Apache Ranger. The third post looked at how to use Apache Ranger to create policies to both mask and filter data returned in the Hive query. The fourth post looked at how Apache Ranger can create "tag" based authorization policies for Apache Hive using Apache Atlas. In this post we will look at an alternative authorization solution called Apache Sentry.

1) Build the Apache Sentry distribution

First we will build and install the Apache Sentry distribution. Download Apache Sentry (1.8.0 was used for the purposes of this tutorial). Verify that the signature is valid and that the message digests match. Now extract and build the source and copy the distribution to a location where you wish to install it:
  • tar zxvf apache-sentry-1.8.0-src.tar.gz
  • cd apache-sentry-1.8.0-src
  • mvn clean install -DskipTests
  • cp -r sentry-dist/target/apache-sentry-1.8.0-bin ${sentry.home}
I previously covered the authorization plugin that Apache Sentry provides for Apache Kafka. In addition, Apache Sentry provides an authorization plugin for Apache Hive. For the purposes of this tutorial we will just configure the authorization privileges in a configuration file local to the Hive server. Therefore we don't need to do any further configuration to the distribution at this point.

2) Install and configure Apache Hive

Please follow the first tutorial to install and configure Apache Hadoop if you have not already done so. Apache Sentry 1.8.0 does not support Apache Hive 2.1.x, so we will need to download and extract Apache Hive 2.0.1. Set the "HADOOP_HOME" environment variable to point to the Apache Hadoop installation directory above. Then follow the steps as outlined in the first tutorial to create the table in Hive and make sure that a query is successful.

3) Integrate Apache Sentry with Apache Hive

Now we will integrate Apache Sentry with Apache Hive. We need to add three new configuration files to the "conf" directory of Apache Hive.

3.a) Configure Apache Hive to use authorization

Create a file called 'conf/hiveserver2-site.xml' with the content:
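The XML embedded in the original post did not survive syndication. As a rough sketch, such a file typically enables authorization and points Hive at the Sentry plugin; the exact class names below vary by Sentry version and should be treated as assumptions:

```xml
<configuration>
  <property>
    <name>hive.security.authorization.enabled</name>
    <value>true</value>
  </property>
  <property>
    <!-- assumed Sentry binding class for Hive v2; check your Sentry distribution -->
    <name>hive.security.authorization.manager</name>
    <value>org.apache.sentry.binding.hive.v2.SentryHiveAuthorizerFactory</value>
  </property>
  <property>
    <name>hive.security.authenticator.manager</name>
    <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
  </property>
</configuration>
```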
Here we are enabling authorization and adding the Sentry authorization plugin.

3.b) Add Sentry plugin configuration

Create a new file in the "conf" directory of Apache Hive called "sentry-site.xml" with the following content:
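The embedded XML was also lost in syndication here. A sketch of the sort of settings described in the next paragraph (the property names are assumptions based on Sentry's file-based provider, apart from the "testing.mode" parameter which the text mentions explicitly):

```xml
<configuration>
  <property>
    <!-- read authorization privileges from a local policy file -->
    <name>sentry.hive.provider.backend</name>
    <value>org.apache.sentry.provider.file.SimpleFileProviderBackend</value>
  </property>
  <property>
    <name>sentry.hive.provider.resource</name>
    <value>file:///path/to/hive/conf/sentry.ini</value>
  </property>
  <property>
    <!-- required because we are not using Kerberos -->
    <name>sentry.hive.testing.mode</name>
    <value>true</value>
  </property>
</configuration>
```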
This is the configuration file for the Sentry plugin for Hive. It essentially says that the authorization privileges are stored in a local file, and that the groups for authenticated users should be retrieved from this file. As we are not using Kerberos, the "testing.mode" configuration parameter must be set to "true".

3.c) Add the authorization privileges for our test-case

Next, we need to specify the authorization privileges. Create a new file in the config directory called "sentry.ini" with the following content:
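The ini content itself was lost in syndication. A minimal sketch matching the description below, in Sentry's standard policy-file layout (the server name "server1" and the group/role names are illustrative assumptions):

```ini
[users]
alice = select_group

[groups]
select_group = select_role

[roles]
select_role = server=server1->db=default->table=words->action=select
```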
Here we are granting the user "alice" a role which allows her to perform a "select" on the table "words".

3.d) Add Sentry libraries to Hive

Finally, we need to add the Sentry libraries to Hive. Copy the following files from ${sentry.home}/lib  to ${hive.home}/lib:
  • sentry-binding-hive-common-1.8.0.jar
  • sentry-core-model-db-1.8.0.jar
  • sentry*provider*.jar
  • sentry-core-common-1.8.0.jar
  • shiro-core-1.2.3.jar
  • sentry-policy*.jar
  • sentry-service-*.jar
In addition we need the "sentry-binding-hive-v2-1.8.0.jar" which is not bundled with the Apache Sentry distribution. This can be obtained from "" instead.

4) Test authorization with Apache Hive

Now we can test authorization after restarting Apache Hive. The user 'alice' can query the table according to our policy:
  • bin/beeline -u jdbc:hive2://localhost:10000 -n alice
  • select * from words where word == 'Dare'; (works)
However, the user 'bob' is denied access:
  • bin/beeline -u jdbc:hive2://localhost:10000 -n bob
  • select * from words where word == 'Dare'; (fails)

Categories: FLOSS Project Planets

Justin Mason: Links for 2017-09-14

Thu, 2017-09-14 19:58
  • London police’s use of AFR facial recognition falls flat on its face

    A “top-of-the-line” automated facial recognition (AFR) system trialled for the second year in a row at London’s Notting Hill Carnival couldn’t even tell the difference between a young woman and a balding man, according to a rights group worker invited to view it in action. Because yes, of course they did it again: London’s Met police used controversial, inaccurate, largely unregulated automated facial recognition (AFR) technology to spot troublemakers. And once again, it did more harm than good. Last year, it proved useless. This year, it proved worse than useless: it blew up in their faces, with 35 false matches and one wrongful arrest of somebody erroneously tagged as being wanted on a warrant for a rioting offense. […] During a recent, scathing US House oversight committee hearing on the FBI’s use of the technology, it emerged that 80% of the people in the FBI database don’t have any sort of arrest record. Yet the system’s recognition algorithm inaccurately identifies them during criminal searches 15% of the time, with black women most often being misidentified.

    (tags: face-recognition afr london notting-hill-carnival police liberty met-police privacy data-privacy algorithms)

Categories: FLOSS Project Planets

Colm O hEigeartaigh: Securing Apache Hive - part IV

Thu, 2017-09-14 08:02
This is the fourth in a series of blog posts on securing Apache Hive. The first post looked at installing Apache Hive and doing some queries on data stored in HDFS. The second post looked at how to add authorization to the previous tutorial using Apache Ranger. The third post looked at how to use Apache Ranger to create policies to both mask and filter data returned in the Hive query.

In this post we will show how Apache Ranger can create "tag" based authorization policies for Apache Hive using Apache Atlas. In the second post, we showed how to create a "resource" based policy for "alice" in Ranger, by granting "alice" the "select" permission for the "words" table. Instead, we can grant a user "bob" the "select" permission for a given "tag", which is synced into Ranger from Apache Atlas. This means that we can avoid managing specific resources in Ranger itself.

1) Start Apache Atlas and create entities/tags for Hive

First let's look at setting up Apache Atlas. Download the latest released version (0.8.1) and extract it. Build the distribution that contains an embedded HBase and Solr instance via:
  • mvn clean package -Pdist,embedded-hbase-solr -DskipTests
The distribution will then be available in 'distro/target/apache-atlas-0.8.1-bin'. To launch Atlas, we need to set some variables to tell it to use the local HBase and Solr instances:
  • export MANAGE_LOCAL_HBASE=true
  • export MANAGE_LOCAL_SOLR=true
Now let's start Apache Atlas with 'bin/'. Open a browser and go to 'http://localhost:21000/', logging on with credentials 'admin/admin'. Click on "TAGS" and create a new tag called "words_tag". Unlike for HDFS or Kafka, Atlas doesn't provide an easy way to create a Hive entity in the UI. Instead we can use the following JSON file to create a Hive entity for the "words" table that we are using in our example, based on the example given here:
You can upload it to Atlas via:
  • curl -v -H 'Accept: application/json, text/plain, */*' -H 'Content-Type: application/json;  charset=UTF-8' -u admin:admin -d @hive-create.json http://localhost:21000/api/atlas/entities
Once the new entity has been uploaded, you can search for it in the Atlas UI. When it is found, click on "+" beside "Tags" and associate the new entity with the "words_tag" tag.

2) Use the Apache Ranger TagSync service to import tags from Atlas into Ranger

To create tag-based policies in Apache Ranger, we have to import the entity and tag we have created in Apache Atlas into Ranger via the Ranger TagSync service. After building Apache Ranger, extract the file called "target/ranger-<version>-tagsync.tar.gz". Edit '' as follows:
  • Set TAG_SOURCE_ATLASREST_DOWNLOAD_INTERVAL_IN_MILLIS to "60000" (just for testing purposes)
Save '' and install the tagsync service via "sudo ./". Start the Apache Ranger admin service via "sudo ranger-admin start" and then the tagsync service via "sudo start".

3) Create Tag-based authorization policies in Apache Ranger

Now let's create a tag-based authorization policy in the Apache Ranger admin UI (http://localhost:6080). Click on "Access Manager" and then "Tag based policies". Create a new Tag service called "HiveTagService". Create a new policy for this service called "WordsTagPolicy". In the "TAG" field enter a "w" and the "words_tag" tag should pop up, meaning that it was successfully synced in from Apache Atlas. Create an "Allow" condition for the user "bob" with the "select" permissions for "Hive":
We also need to go back to the Resource based policies and edit "cl1_hive" that we created in the second tutorial, and select the tag service we have created above. Once our new policy (including tags) has synced to '/etc/ranger/cl1_hive/policycache' we can test authorization in Hive. Previously, the user "bob" was denied access to the "words" table, as only "alice" was assigned a resource-based policy for the table. However, "bob" can now access the table via the tag-based authorization policy we have created:
  • bin/beeline -u jdbc:hive2://localhost:10000 -n bob
  • select * from words where word == 'Dare';
Categories: FLOSS Project Planets

Colm O hEigeartaigh: Securing Apache Hive - part II

Thu, 2017-09-14 07:47
This is the second post in a series of articles on securing Apache Hive. The first post looked at installing Apache Hive and doing some queries on data stored in HDFS. In this post we will show how to add authorization to the previous example using Apache Ranger.

1) Install the Apache Ranger Hive plugin

If you have not done so already, please follow the first post to install and configure Apache Hadoop and Apache Hive. Next download Apache Ranger and verify that the signature is valid and that the message digests match. Due to some bugs that were fixed for the installation process, I am using version 1.0.0-SNAPSHOT in this post. Now extract and build the source, and copy the resulting plugin to a location where you will configure and install it:
  • mvn clean package assembly:assembly -DskipTests
  • tar zxvf target/ranger-1.0.0-SNAPSHOT-hive-plugin.tar.gz
  • mv ranger-1.0.0-SNAPSHOT-hive-plugin ${ranger.hive.home}
Now go to ${ranger.hive.home} and edit "". You need to specify the following properties:
  • POLICY_MGR_URL: Set this to "http://localhost:6080"
  • REPOSITORY_NAME: Set this to "cl1_hive".
  • COMPONENT_INSTALL_DIR_NAME: The location of your Apache Hive installation
Save "" and install the plugin as root via "sudo -E ./". The Apache Ranger Hive plugin should now be successfully installed. Make sure that the default policy cache for the Hive plugin '/etc/ranger/cl1_hive/policycache' is readable by the user who is running the Hive server. Then restart the Apache Hive server to enable the authorization plugin.

2) Create authorization policies in the Apache Ranger Admin console

Next we will use the Apache Ranger admin console to create authorization policies for Apache Hive. Follow the steps in this tutorial to install the Apache Ranger admin service. Start the Ranger admin service via 'sudo ranger-admin start' and open a browser at 'http://localhost:6080', logging on with the credentials 'admin/admin'. Click the "+" button next to the "HIVE" logo and enter the following properties:
  • Service Name: cl1_hive
  • Username/Password: admin
  • jdbc.url: jdbc:hive2://localhost:10000
Note that "Test Connection" won't work as the "admin" user will not have the necessary authorization to invoke on Hive at this point. Click "Add" to create the service. If you have not done so in a previous tutorial, click on "Settings" and then "Users/Groups" and add two new users called "alice" and "bob", who we will use to test authorization. Then go back to the newly created "cl1_hive" service, and click "Add new policy" with the following properties:
  • Policy Name: SelectWords
  • database: default
  • table: words
  • Hive column: *
Then under "Allow Conditions", give "alice" the "select" permission and click "Add".

3) Test authorization with Apache Hive

Once our new policy has synced to '/etc/ranger/cl1_hive/policycache' we can test authorization in Hive. The user 'alice' can query the table according to our policy:
  • bin/beeline -u jdbc:hive2://localhost:10000 -n alice
  • select * from words where word == 'Dare'; (works)
However, the user 'bob' is denied access:
  • bin/beeline -u jdbc:hive2://localhost:10000 -n bob
  • select * from words where word == 'Dare'; (fails)
Categories: FLOSS Project Planets

Aaron Morton: Phantom Consistency Mechanisms

Wed, 2017-09-13 20:00

In this blog post we will take a look at consistency mechanisms in Apache Cassandra. There are three reasonably well documented features serving this purpose:

  • Read repair gives the option to sync data on read requests.
  • Hinted handoff is a buffering mechanism for situations when nodes are temporarily unavailable.
  • Anti-entropy repair (or simply just repair) is a process of synchronizing data across the board.

What is far less known, and what we will explore in detail in this post, is a fourth mechanism Apache Cassandra uses to ensure data consistency. We are going to see Cassandra perform another flavour of read repair, but in a far sneakier way.

Setting things up

In order to see this sneaky repair happening, we need to orchestrate a few things. Let’s just blaze through some initial setup using Cassandra Cluster Manager (ccm - available on github).

# create a cluster of 2x3 nodes
ccm create sneaky-repair -v 2.1.15
ccm updateconf 'num_tokens: 32'
ccm populate --vnodes -n 3:3

# start nodes in one DC only
ccm node1 start --wait-for-binary-proto
ccm node2 start --wait-for-binary-proto
ccm node3 start --wait-for-binary-proto

# create table and keyspace
ccm node1 cqlsh -e "CREATE KEYSPACE sneaky WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};"
ccm node1 cqlsh -e "CREATE TABLE (k TEXT PRIMARY KEY , v TEXT);"

# insert some data
ccm node1 cqlsh -e "INSERT INTO (k, v) VALUES ('firstKey', 'firstValue');"

The familiar situation

At this point, we have a cluster up and running. Suddenly, “the requirements change” and we need to expand the cluster by adding one more data center. So we will do just that and observe what happens to the consistency of our data.

Before we proceed, we need to ensure some determinism and turn off Cassandra’s known consistency mechanisms (we will not be disabling anti-entropy repair as that process must be initiated by an operator anyway):

# disable hinted handoff
ccm node1 nodetool disablehandoff
ccm node2 nodetool disablehandoff
ccm node3 nodetool disablehandoff

# disable read repairs
ccm node1 cqlsh -e "ALTER TABLE WITH read_repair_chance = 0.0 AND dclocal_read_repair_chance = 0.0"

Now we expand the cluster:

# start nodes
ccm node4 start --wait-for-binary-proto
ccm node5 start --wait-for-binary-proto
ccm node6 start --wait-for-binary-proto

# alter keyspace
ccm node1 cqlsh -e "ALTER KEYSPACE sneaky WITH replication ={'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2':3 };"

With these commands, we have effectively added a new DC into the cluster. From this point, Cassandra can start using the new DC to serve client requests. However, there is a catch. We have not populated the new nodes with data. Typically, we would do a nodetool rebuild. For this blog post we will skip that, because this situation allows some sneakiness to be observed.

Sneakiness: blocking read repairs

Without any data being put on the new nodes, we can expect no data to be actually readable from the new DC. We will go to one of the new nodes (node4) and do a read request with LOCAL_QUORUM consistency to ensure only the new DC participates in the request. After the read request we will also check the read repair statistics from nodetool, but we will set that information aside for later:

ccm node4 cqlsh -e "CONSISTENCY LOCAL_QUORUM; SELECT * FROM WHERE k ='firstKey';"
ccm node4 nodetool netstats | grep -A 3 "Read Repair"

 k | v
---+---

(0 rows)

No rows are returned as expected. Now, let’s do another read request (again from node4), this time involving at least one replica from the old DC thanks to QUORUM consistency:

ccm node4 cqlsh -e "CONSISTENCY QUORUM; SELECT * FROM WHERE k ='firstKey';"
ccm node4 nodetool netstats | grep -A 3 "Read Repair"

 k        | v
----------+------------
 firstKey | firstValue

(1 rows)

This time we get a hit! This is quite unexpected, because we did not run a rebuild or repair in the meantime, and hinted handoff and read repairs have been disabled. How come Cassandra went ahead and fixed our data anyway?

In order to shed some light onto this issue, let's examine the nodetool netstats output from before. We should see something like this:

# after first SELECT using LOCAL_QUORUM
ccm node4 nodetool netstats | grep -A 3 "Read Repair"
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0

# after second SELECT using QUORUM
ccm node4 nodetool netstats | grep -A 3 "Read Repair"
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 1
Mismatch (Background): 0

# after third SELECT using LOCAL_QUORUM
ccm node4 nodetool netstats | grep -A 3 "Read Repair"
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 1
Mismatch (Background): 0

From this output we can tell that:

  • No read repairs happened (Attempted is 0).
  • One blocking read repair actually did happen (Mismatch (Blocking) is 1).
  • No background read repair happened (Mismatch (Background) is 0).

It turns out there are two kinds of read repair that can happen:

  • A blocking read repair happens when a query cannot complete with the desired consistency level without actually repairing the data. read_repair_chance has no impact on this.
  • A background read repair happens when a query succeeds but inconsistencies are found. This happens with probability read_repair_chance.
The take-away

To sum things up, it is not possible to entirely disable read repairs and Cassandra will sometimes try to fix inconsistent data for us. While this is pretty convenient, it also has some inconvenient implications. The best way to avoid any surprises is to keep the data consistent by running regular repairs.

In situations featuring non-negligible amounts of inconsistent data this sneakiness can cause a lot of unexpected load on the nodes, as well as the cross-DC network links. Having to do cross-DC reads can also introduce additional latency. Read-heavy workloads and workloads with large partitions are particularly susceptible to problems caused by blocking read repair.

A particular situation when a lot of inconsistent data is guaranteed happens when a new data center gets added to the cluster. In these situations, using LOCAL_QUORUM is necessary to avoid doing blocking repairs until a rebuild or a full repair is done. This is doubly important the first time a data center expansion happens: with a single data center, QUORUM and LOCAL_QUORUM have virtually the same semantics, and it is easy to forget which one is actually used.

Categories: FLOSS Project Planets

Justin Mason: Links for 2017-09-13

Wed, 2017-09-13 19:58
Categories: FLOSS Project Planets

Justin Mason: Links for 2017-09-12

Tue, 2017-09-12 19:58
  • “You Can’t Stay Here: The Efficacy of Reddit’s 2015 Ban Examined Through Hate Speech”

    In 2015, Reddit closed several subreddits—foremost among them r/fatpeoplehate and r/CoonTown—due to violations of Reddit’s anti-harassment policy. However, the effectiveness of banning as a moderation approach remains unclear: banning might diminish hateful behavior, or it may relocate such behavior to different parts of the site. We study the ban of r/fatpeoplehate and r/CoonTown in terms of its effect on both participating users and affected subreddits. Working from over 100M Reddit posts and comments, we generate hate speech lexicons to examine variations in hate speech usage via causal inference methods. We find that the ban worked for Reddit. More accounts than expected discontinued using the site; those that stayed drastically decreased their hate speech usage—by at least 80%. Though many subreddits saw an influx of r/fatpeoplehate and r/CoonTown “migrants,” those subreddits saw no significant changes in hate speech usage. In other words, other subreddits did not inherit the problem. We conclude by reflecting on the apparent success of the ban, discussing implications for online moderation, Reddit and internet communities more broadly. (Via Anil Dash)

    (tags: abuse reddit research hate-speech community moderation racism internet)

  • The Immortal Myths About Online Abuse – Humane Tech – Medium

    After building online communities for two decades, we’ve learned how to fight abuse. It’s a solvable problem. We just have to stop repeating the same myths as excuses not to fix things. Here are the 8 myths Anil Dash picks out: 1. False: You can’t fix abusive behavior online. 2. False: Fighting abuse hurts free speech! 3. False: Software can detect abuse using simple rules. 4. False: Most people say “abuse” when they just mean criticism. 5. False: We just need everybody to use their “real” name. 6. False: Just charge a dollar to comment and that’ll fix things. 7. False: You can call the cops! If it’s not illegal, it’s not harmful. 8. False: Abuse can be fixed without dedicated resources.

    (tags: abuse comments community harassment racism reddit anil-dash free-speech)

  • ‘Let’s all survive the GDPR’

    Simon McGarr and John Looney’s slides from their SRECon ’17 presentation

    (tags: simon-mcgarr data-privacy privacy data-protection gdpr slides presentations)

Categories: FLOSS Project Planets

Sergey Beryozkin: The Real Data Processing with Apache Beam and Tika

Tue, 2017-09-12 14:55
When we talk about data ingestion in big data streaming pipelines, it is fair to say that in the vast majority of cases the source data comes from files in CSV and other easy-to-parse text formats.

Things become more complex when the task is to read and parse files in a format such as PDF. One would need to create a reader/receiver capable of parsing PDF files and feeding the content fragments (the regular text, the text found in embedded attachments, and the file metadata) into the processing pipelines. That was tricky to do right, but you did it just fine.

The next morning you get a call from your team lead letting you know the customer actually needs the content ingested not only from PDF files but also from files in a format you've never heard of before. You spend the rest of the week looking for a library which can parse such files, and when you finish writing the code against that library's poorly documented API, all you can think is that the weekend has arrived just in time.

On Monday your new task is to ensure that the pipelines are initialized from the same network folder where files in PDF and other formats will be dropped. You end up writing frontend reader code which reads each file, checks the extension, and then chooses a more specific reader.
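That hand-rolled, extension-based dispatcher might look something like the following sketch (the per-format handlers here are hypothetical stand-ins; the point is that every new format means yet another parser and yet another dispatch entry):

```python
import os

# Hypothetical per-format parsers -- in real life each wraps a different
# third-party library with its own quirks and API.
def parse_pdf(path):
    return "pdf text from " + os.path.basename(path)

def parse_docx(path):
    return "docx text from " + os.path.basename(path)

READERS = {".pdf": parse_pdf, ".docx": parse_docx}

def read_document(path):
    # Dispatch purely on the file extension; anything unknown is an error.
    ext = os.path.splitext(path)[1].lower()
    try:
        return READERS[ext](path)
    except KeyError:
        raise ValueError("no reader registered for extension %r" % ext)
```

This is exactly the treadmill that a format-agnostic parser like Tika is meant to take you off.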

Next day, when you are told that Microsoft Excel and Word documents which may or may not be zipped will have to be parsed as well, you report back asking for the holidays...

I'm sure you already know I've been preparing you for a couple of good news.

The first one is the well-known fact that Apache Tika allows you to write generic code which can collect data from a massive number of text, binary, image and video formats. One only has to prepare or update the dependencies and configuration to have the same code serve data from a variety of data formats.

The other, and main, news is that Apache Beam 2.2.0-SNAPSHOT now ships a new TikaIO module (thanks to my colleague JB for reviewing and merging the PR). With Apache Beam capable of running pipelines on top of Spark, Flink and other runners, and Apache Tika taking care of various file formats, you get the most flexible data streaming system.

Do give it a try, help improve TikaIO with new PRs, and if you are really serious about supporting a variety of data formats in your pipelines, start planning to integrate it into your products :-)


Categories: FLOSS Project Planets

Justin Mason: Links for 2017-09-11

Mon, 2017-09-11 19:58
  • The React license for founders and CTOs – James Ide – Medium

    Decent explanation of _why_ Facebook came up with the BSD+Patents license: “Facebook’s patent grant is about sharing its code while preserving its ability to defend itself against patent lawsuits.”

    The difficulty of open sourcing code at Facebook, including React in 2013, was one of the reasons the company’s open-source contributions used to be a fraction of what they are today. It didn’t use to have a strong reputation as an open-source contributor to front-end technologies. Facebook wanted to open source code, though; when it grew communities for projects like React, core contributors emerged to help out and interview candidates often cited React and other Facebook open source as one of the reasons they were interested in applying. People at Facebook wanted to make it easier to open source code and not worry as much about patents. Facebook’s solution was the Facebook BSD+Patents license.

    (tags: facebook bsd licenses licensing asf patents swpats react license software-patents open-source rocksdb)

  • HN thread on the new Network Load Balancer AWS product

    looks like @colmmacc works on it. Lots and lots of good details here

    (tags: nlb aws load-balancing ops architecture lbs tcp ip)

  • Java Flame Graphs Introduction: Fire For Everyone!

    lots of good detail on flame graph usage in Java, and the Honest Profiler (honest because it’s safepoint-free)

    (tags: profiling java safepoints jvm flame-graphs perf measurement benchmarking testing)

  • Teaching Students to Code – What Works

    Lynn Langit describing her work as part of Microsoft Digigirlz and TKP to teach thousands of kids worldwide to code. Describes a curriculum from “K” (4-6-year olds) learning computational thinking with a block-based programming environment like Scratch, up to University level, solving problems with public clouds like AWS’ free tier.

    (tags: education learning coding teaching tkp lynn-langit scratch kids)

  • So much for that Voynich manuscript “solution”


    The idea that the book is a medical treatise on women’s health, however, might turn out to be correct. But that wasn’t Gibbs’ discovery. Many scholars and amateur sleuths had already reached that conclusion, using the same evidence that Gibbs did. Essentially, Gibbs rolled together a bunch of already-existing scholarship and did a highly speculative translation, without even consulting the librarians at the institute where the book resides. Gibbs said in the TLS article that he did his research for an unnamed “television network.” Given that Gibbs’ main claim to fame before this article was a series of books about how to write and sell television screenplays, it seems that his goal in this research was probably to sell a television screenplay of his own. In 2015, Gibbs did an interview where he said that in five years, “I would like to think I could have a returnable series up and running.” Considering the dubious accuracy of many History Channel “documentaries,” he might just get his wish.

    (tags: crypto history voynich-manuscript historians tls)

  • How to Optimize Garbage Collection in Go

    In this post, we’ll share a few powerful optimizations that mitigate many of the performance problems common to Go’s garbage collection (we will cover “fun with deadlocks” in a follow-up). In particular, we’ll share how embedding structs, using sync.Pool, and reusing backing arrays can minimize memory allocations and reduce garbage collection overhead.

    (tags: garbage performance gc golang go coding)

Categories: FLOSS Project Planets

Sam Ruby: Converting to Vue.js

Mon, 2017-09-11 14:35

Whimsy had four applications which made use of React.js; two of which previously were written using Angular.js.  One of these applications has already been converted to Vue, conversion of a second one is in progress.

The reason for the conversion was the decision by Facebook not to change their license.

Selection of Vue was based on two criteria: community size and the ability to support a React-like development model.  As a bonus, Vue supports an Angular-like development model too, is smaller in download size than either, and has a few additional features.  It is also fast, though I haven’t done any formal measurements.

Note that the API is different from React.js's, in particular lifecycle methods and event names. Oh, and the parameters to createElement are completely different. Much of my conversion was made easier by the fact that I was already using a ruby2js filter, so all I needed to do was to write a new filter.

Things I like a lot:

  • Setters actually change the values synchronously.  This has been a source of subtle bugs and surprises when implementing a React.js application.
  • Framework can be used without preprocessors.  This is mostly true for React, but React.createClass is now deprecated.

Things I find valuable:

  • Mixins.  And probably in the near future extends.  These make components true building blocks, not mere means of encapsulation.
  • Computed values.  Better than Angular’s watchers, and easier than React’s componentWillReceiveProps.
  • Events.  I haven’t made much use of these yet, but this looks promising.

Things I dislike (but can work around):

  • Warnings are issued if property and data values are named the same.  I can understand why this was done; but I can access properties and data separately, and I’m migrating a codebase which often uses properties to define the initial values for instance data. It would be fine if there were a way to silence this one warning, but the only option available is to silence all warnings.
  • If I have a logic error in my application (it happens :-)), the stack traceback on Chrome doesn’t show my application.  On firefox, it does, but it is formatted oddly, and doesn’t make use of source maps so I can’t directly navigate to either the original source or the downloaded code.
  • Mounting an element replaces the entire element instead of just its children.  In my case, I’m doing server side rendering followed by client side updates.  Replacing the element means that the client can’t find the mount point.  My workaround is to add the enclosing element to the render.
  • Rendering on both the server and client can create a timing problem for forms.  At times, there can be just enough of a delay where the user can check a box or start to input data only to have Vue on the client wipe out the input.  I’m not sure why this wasn’t a problem with React.js, but for now I’m rendering the input fields as disabled until mounted on the client.

Things I’m not using:

  • templates, directives, and filters.  Mostly because I’m migrating from React instead of Angular.  But also because I like components better than those three.

On balance, so far I like Vue best of the three (even ignoring licensing issues), and am optimistic that Vue will continue to improve.

Categories: FLOSS Project Planets

Claus Ibsen: Upcoming Kubernetes and Apache Camel presentations in Aarhus Denmark

Mon, 2017-09-11 08:00
I have just been confirmed for attending and speaking at another Javagruppen event which takes place on October 11th from 16:30 to 19:00 in Aarhus Denmark.

The event is a "Gå Hjem Møde" (a "go home meeting"), which means it starts towards the end of the work day. We have two speakers. First, Helge Tesgaard will talk about getting started with a highly available Kubernetes cluster. Then I follow up with my talk about agile integration with Apache Camel on Kubernetes.

After the event, you are welcome to join us for a coffee, beer or drink; some of us will head into the city to catch a few before heading home.

You can find more details about the event and how to register to attend.

Categories: FLOSS Project Planets

Colm O hEigeartaigh: Integrating JSON Web Tokens with Kerberos using Apache Kerby

Mon, 2017-09-11 07:28
JSON Web Tokens (JWTs) are a standard way of encapsulating a number of claims about a particular subject. Kerberos is a long-established and widely-deployed SSO protocol, used extensively in the Big-Data space in recent years. An interesting question is to examine how a JWT could be used as part of the Kerberos protocol. In this post we will consider one possible use-case, where a JWT is used to convey additional authorization information to the kerberized service provider.
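To make the "number of claims about a subject" concrete, here is a short Python sketch that round-trips an illustrative (and unsigned) JWT payload segment; the claim names below are examples only, and a real JWT also carries a header segment and a signature:

```python
import base64
import json

def decode_jwt_part(part):
    # JWT segments are base64url-encoded with padding stripped; restore it.
    padded = part + "=" * (-len(part) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

# A purely illustrative payload carrying a "role" claim of the kind a
# service could use for authorization decisions.
payload = {"sub": "alice", "role": "boss", "iss": "https://idp.example.com"}
encoded = base64.urlsafe_b64encode(json.dumps(payload).encode()).rstrip(b"=").decode()

claims = decode_jwt_part(encoded)
print(claims["role"])  # prints: boss
```

In the use-case below, claims like these ride along inside the Kerberos service ticket rather than in an HTTP header.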

This use-case is based on a document available at HADOOP-10959, called "A Complement and Short Term Solution to TokenAuth Based on Kerberos Pre-Authentication Framework", written by Kai Zheng and Weihua Jiang of Intel (also see here).

1) The test-case

To show how to integrate JWTs with Kerberos we will use a concrete test-case available in my github repo here:
  • cxf-kerberos-kerby: This project contains a number of tests that show how to use Kerberos with Apache CXF, where the KDC used in the tests is based on Apache Kerby
The test-case relevant to this blog entry is the JWTJAXRSAuthenticationTest. Here we have a trivial "double it" JAX-RS service implemented using Apache CXF, which is secured using Kerberos. An Apache Kerby-based KDC is launched which the client code uses to obtain a service ticket using JAAS (all done transparently by CXF), which is sent to the service code as part of the Authorization header when making the invocation.

So far this is just a fairly typical example of a kerberized web-service request. What is different is that the service configuration requires a level of authorization above and beyond the Kerberos ticket, by insisting that the user must have a particular role to access the web service. This is done by inserting the CXF SimpleAuthorizingInterceptor into the service interceptor chain. An authenticated user must have the "boss" role to access this service. 
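The role requirement can be pictured as a simple guard. This is an illustrative stand-in, not CXF's actual SimpleAuthorizingInterceptor implementation:

```java
import java.util.Set;

// Sketch of a role-based authorization check like the one
// SimpleAuthorizingInterceptor enforces (not CXF's real code).
public class RoleGuard {
    private final Set<String> requiredRoles;

    RoleGuard(Set<String> requiredRoles) {
        this.requiredRoles = requiredRoles;
    }

    // The call is permitted only if the authenticated user holds
    // at least one of the required roles.
    boolean permits(Set<String> userRoles) {
        return userRoles.stream().anyMatch(requiredRoles::contains);
    }
}
```

With the service configured to require the "boss" role, only users whose credentials carry that role would pass the check.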

So we need some way to convey the role of the user as part of the kerberized request. We can do this using a JWT, as will be explained in the next few sections.

2) High-level overview of JWT use-case with Kerberos
As stated above, we need to convey some additional claims about the user to the service. This can be done by including a JWT containing those claims in the Kerberos service ticket. Let's assume that the user is in possession of a JWT that is issued by an IdP that contains a number of claims relating to that user (including the "role" as required by the service in our test-case). The token must be sent to the KDC when obtaining a service ticket.

The KDC must validate the token (checking that the signature is correct, that the signing identity is trusted, etc.). The KDC must then extract some relevant information from the token and insert it somehow into the service ticket. The Kerberos spec defines a structure that can be used for this purpose, called AuthorizationData, which consists of a "type" along with some data to be interpreted according to that "type". We can use this structure to insert the encoded JWT as part of the data.  

On the receiving side, the service can extract the AuthorizationData structure from the received ticket, parse it to retrieve the JWT, and then obtain whatever claims are desired from the token.
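The shape of that structure can be sketched as follows. The field names follow RFC 4120's AuthorizationData element (ad-type, ad-data); the class itself and the type value used in the test below are illustrative, not Kerby's API:

```java
import java.nio.charset.StandardCharsets;

// Sketch of one RFC 4120 AuthorizationData entry:
// a type number plus opaque data interpreted according to that type.
public class AuthorizationDataEntry {
    final int adType;      // interpreted by the consumer; a custom value for JWTs here
    final byte[] adData;   // the serialized JWT in this use-case

    AuthorizationDataEntry(int adType, byte[] adData) {
        this.adType = adType;
        this.adData = adData;
    }

    // Wrap a compact JWT string as the entry's data.
    static AuthorizationDataEntry forJwt(int customType, String jwt) {
        return new AuthorizationDataEntry(customType, jwt.getBytes(StandardCharsets.UTF_8));
    }

    // Recover the JWT on the receiving side.
    String jwt() {
        return new String(adData, StandardCharsets.UTF_8);
    }
}
```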

3) Sending a JWT Token to the KDC

Let's take a look at how the test-case works in more detail, starting with the client. The test code retrieves a JWT for "alice" by invoking on the JAX-RS interface of the Apache CXF STS. The token contains the claim that "alice" has the "boss" role, which is required to invoke on the "double it" service. Now we need to send this token to the KDC to retrieve a service ticket for the "double it" service, with the JWT encoded in the ticket.

This cannot be done by the built-in Java GSS implementation. Instead we will use Apache Kerby. Apache Kerby has been covered extensively on this blog (see for example here). As well as providing the implementation for the KDC used in our test-case, Apache Kerby provides a complete GSS implementation that supports tokens in the forthcoming 1.1.0 release. To use the Kerby GSS implementation we need to register the KerbyGssProvider as a Java security provider.

To actually pass the JWT we got from the STS to the Kerby GSS layer, we need to use a custom implementation of the CXF HttpAuthSupplier interface. The KerbyHttpAuthSupplier implementation takes the JWT String, and creates a Kerby KrbToken class using it. This class is added to the private credential list of the current JAAS Subject. This way it will be available to the Kerby GSS layer, which will send the token to the KDC using Kerberos pre-authentication as defined in the document which is linked at the start of this post.
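The "add to the private credential list of the current JAAS Subject" step can be sketched with the standard javax.security.auth API. Here a plain String stands in for Kerby's KrbToken class, and the helper class is illustrative:

```java
import javax.security.auth.Subject;

public class CredentialStash {
    // Attach an opaque token to the Subject's private credentials,
    // where a GSS layer running under the same Subject can later find it.
    static void stash(Subject subject, Object token) {
        subject.getPrivateCredentials().add(token);
    }

    // Retrieve the first credential of the given type, or null if absent.
    static <T> T retrieve(Subject subject, Class<T> type) {
        return subject.getPrivateCredentials(type).stream().findFirst().orElse(null);
    }
}
```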

4) Processing the received token in the KDC

The Apache Kerby-based KDC extracts the JWT token from the pre-authentication data entry and verifies that it is signed and that the issuer is trusted. The KDC is configured in the test-case with a certificate to use for this purpose, and also with an issuer String against which the issuer of the JWT must match. If there is an audience claim in the token, then it must match the principal of the service for which we are requesting a ticket. 
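Those trust checks amount to something like the following simplified sketch. Kerby's actual verification also checks the token's signature against the configured certificate; the names and example values here are illustrative:

```java
public class TokenChecks {
    // KDC-side trust checks (sketch): the issuer must match the configured
    // value, and if an audience claim is present it must name the service
    // principal for which the ticket is being requested.
    static boolean accept(String issuer, String audience,
                          String trustedIssuer, String servicePrincipal) {
        if (!trustedIssuer.equals(issuer)) {
            return false;
        }
        return audience == null || audience.equals(servicePrincipal);
    }
}
```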

If the verification of the received JWT passes, then it is inserted into the AuthorizationData structure in the issued service ticket. The type that is used is a custom value defined here, as this behaviour is not yet standardized. The JWT is serialized and added to the data part of the token. Note that this behaviour is fully customizable.

5) Processing the AuthorizationData structure on the service end

After the service successfully authenticates the client, we have to access the AuthorizationData part of the ticket to extract the JWT. This can all be done using the standard Java APIs; Kerby is not required on the receiving side. The standard CXF interceptor for Kerberos is subclassed in the tests to set up a custom CXF SecurityContext using the GssContext. By casting it to an ExtendedGSSContext, we can access the AuthorizationData and hence the JWT. The role claim is then extracted from the JWT and used to enforce the standard "isUserInRole" method of the CXF SecurityContext. 
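The resulting security context behaves roughly like this. This is an illustrative sketch, not the test code's actual SecurityContext subclass:

```java
import java.util.Set;

// Sketch of a security context backed by claims extracted from the JWT
// carried in the ticket's AuthorizationData.
public class JwtSecurityContext {
    private final String principal;
    private final Set<String> roles;

    JwtSecurityContext(String principal, Set<String> roles) {
        this.principal = principal;
        this.roles = roles;
    }

    String getUserName() {
        return principal;
    }

    // Mirrors the standard isUserInRole contract, answered from
    // the JWT's role claim(s).
    boolean isUserInRole(String role) {
        return roles.contains(role);
    }
}
```

With "alice" holding the "boss" role from her token, the interceptor's role check passes and the "double it" invocation proceeds.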

If you are interested in exploring this topic further, please get involved with the Apache Kerby project, and help us to further improve and expand this integration between JWT and Kerberos.
Categories: FLOSS Project Planets