Planet Apache

Updated: 12 hours 19 min ago

Ian Boston: Ultrasonic Antifouling

12 hours 42 min ago

The board design went off to PCBWay via web browser, and 5 days later 5 boards arrived by DHL from China. The whole process was unbelievably smooth. This was the first time I had ordered boards using the output of KiCad, so I was impressed with both KiCad and PCBWay. The boards were simple in being 2-layer, but complex in being large, with some areas needing to carry high currents. So how did I do?

I made one mistake on the footprints. The 2-terminal connectors for the 600V ultrasound output didn’t have pads on both sides. This didn’t matter: being through-hole, the connectors soldered fine. Other than that, PCBWay did exactly what I had instructed them to. Even the Arduino Mega footprint fitted perfectly.

How did it perform ?

Once populated, the board initially appeared to perform well. Random frequency generation from 20 kHz to 150 kHz worked. The drive waveform from the MOSFET drivers into the MOSFETs was near perfect, with no high-frequency ringing on the edges and levels going from 0 to 12V and back in much less than 1us. However, I noticed some problems with the PWM control. There was none. With PWM pulses at 10%, the MOSFETs would turn on for 90% of the time and drive a wildly resonant waveform through the coil, rather like a little hammer hitting a big pendulum and feeding it back into resonance. On further investigation the scope showed that when a MOSFET tried to switch off, the inductor carried on producing a flyback voltage, causing the MOSFET to continue conducting until the opposing MOSFET turned on. Initially I thought this was ringing, but it turned out a simple pair of 1A high-frequency Schottky diodes across each winding of the primary coil returned the energy to the 12V line, eliminating the flyback. Now I had ringing, at 10MHz, but control over the power output via a digital pot. I could leave it at that, but this 10MHz would probably transmit and cause problems with other equipment on the boat.

I think the difference between the red and blue signals is due to slightly different track lengths on each MOSFET: the shorter track, shown in the blue signal, rings far less, while the longer track, with more capacitance, rings more and induces a parasitic ring in the blue track. To eliminate this, two things were done. Traditional RC snubber networks had little or no impact, so instead a 100nF cap placed as close as possible to the drain and source of each MOSFET (RPF50N6) eliminated some of the high frequency, and a 100uF cap on the centre tap stored the energy returned to the 12V line by flyback. This reduced the peak current.

There is still some ringing, but now the frequency is lower and it is less violent. The ripple on the 12V line is now less than 0.2V and is filtered out by decoupling caps on the supply pins to the Arduino Mega. All of these modifications have been accommodated on the underside of the board.

The board now produces 60W per transducer between 20 and 150 kHz at 50% PWM, drawing 5A from the supply. This is very loud on my desk and far louder than the ultrasound antifouling installed in Isador, which seems to work. I will need to implement a control program that balances power consumption against noise levels against effectiveness, but that is all software. There are sensors on board for temperature, current and voltage, so it should be possible to have the code adapt to its environment.
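The adaptive control program is still to be written, but the loop might look something like this. It is sketched in Ruby for brevity; the real firmware would be Arduino C on the Mega, and the helper name, limits and step sizes here are purely illustrative:

```ruby
# Hypothetical sketch of an adaptive duty-cycle controller: back the
# PWM duty off when current or temperature runs high, creep it back
# up otherwise, and clamp it to a safe range. All limits are invented
# for illustration.
MAX_CURRENT_A = 5.0
MAX_TEMP_C    = 60.0

def adjust_duty(duty, current_a, temp_c)
  if current_a > MAX_CURRENT_A || temp_c > MAX_TEMP_C
    duty -= 5   # large step down to protect the MOSFETs
  else
    duty += 1   # small step up towards full output
  end
  duty.clamp(10, 50)  # never fully off, never past 50% PWM
end
```

On the Mega this would run once per sensor-sampling tick, with the result written to the PWM registers.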

Board Layout mistakes

Apart from the circuit errors, I made some mistakes in the MOSFET power connections. Rev2 of the board will have the MOSFETs placed as close as possible to the primary of the transformer, with identical track lengths. Hopefully this will eliminate the ringing seen on the red trace and make both traces look like the blue one.

I have 4 spare unpopulated PCBs. If I do a rev2 board, I will use PCBWay again. Their boards were perfect, all the mistakes were mine.



Categories: FLOSS Project Planets

Steve Loughran: Dissent is a right: Dissent is a duty. @Dissidentbot

Mon, 2017-05-22 18:52
It looks like the Russians interfered with the US elections, not just from the alleged publishing of the stolen emails, or through the alleged close links with the Trump campaign, but in the social networks, creating astroturfed campaigns and repeating the messages the country deemed important.

Now the UK is having an election. And no doubt the bots will be out. But if the Russians can do bots: so can I.

This then, is @dissidentbot.

Dissidentbot is a Raspberry Pi running a 350-line Ruby script tasked with heckling politicians.

It offers:
  • The ability to listen to tweets from a number of sources: currently a few UK politicians
  • To respond by picking a random reply from a set written explicitly for each target
  • To tweet the reply after a 20-60s sleep.
  • Admin CLI over Twitter Direct Messaging
  • Live update of response sets via github.
  • Live add/remove of new targets (just follow/unfollow from the twitter UI)
  • Ability to assign a probability of replying, 0-100
  • Random response to anyone tweeting about it when that is not a reply (disabled due to issues)
  • Good PUE numbers, being powered off the USB port of the wifi base station, SSD storage and fanless naturally cooled DC. Oh, and we're generating a lot of solar right now, so zero-CO2 for half the day.
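The reply flow in the first three bullets can be sketched in a few lines of Ruby (the names here are hypothetical; the real ~350-line script drives the live Twitter API):

```ruby
# Per-target response sets; in the real bot these are loaded from
# data files and live-updated via github.
RESPONSES = {
  "some_politician" => ["Dissent is a duty.", "Is that a promise?", "Citation needed."]
}.freeze

# Reply only with the configured probability (0-100), picking a
# random heckle written explicitly for this target account.
def pick_reply(target, probability: 100)
  return nil unless rand(100) < probability
  replies = RESPONSES[target]
  replies && replies.sample
end

# Sleep 20-60s before tweeting so the replies look less robotic.
def jitter_sleep(min = 20, max = 60)
  sleep(min + rand(max - min))
end
```

The random per-target selection plus the jitter sleep is essentially all the "intelligence" the bot needs.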
It's the first Ruby script of more than ten lines I've ever written; interesting experience, and I've now got three chapters into a copy of the Pickaxe Book I've had sitting unloved alongside "ML for the working programmer".  It's nice to be able to develop just by saving the file & reloading it in the interpreter...not done that since I was Prolog programming. Refreshing.

Without type checking it's easy to ship code that's broken. I know, that's what tests are meant to find, but as this all depends on the live Twitter APIs, it'd take effort, including maybe some split between Model and Control. Instead: I've broken the code into little methods I can run in the CLI.

As usual, the real problems surface once you go live:
  1. The bot kept failing overnight; nothing in the logs. Cause: it's powered by the router, and DD-WRT was set to reboot every night. Fix: disable.
  2. Its "reply to any reference which isn't a reply itself" logic doesn't work right. I think it's partly RT related, but I've not fully tracked it down.
  3. Although it can do a live update of the dissident.rb script, it's not yet restarting: I need to ssh in for that.
  4. I've been testing it by tweeting things myself, so I've had to tweet random things during testing.
  5. Had to add handling of twitter blocking from too many API calls. Again: sleep a bit before retries.
  6. It's been blocked by the Conservative party. That was because they've been tweeting 2-4 times/hour, and dissidentbot originally didn't have any jitter/sleep. After 24h of replying within 5s of their tweets, it was blocked.
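The fix in item 5 amounts to a retry wrapper with a growing sleep. A minimal Ruby sketch (the helper name is mine, and the real bot would rescue the Twitter client's rate-limit error rather than StandardError):

```ruby
# Retry a failed API call a few times, sleeping a bit longer before
# each attempt so we stay under Twitter's rate limits.
def with_retries(attempts: 3, base_delay: 5)
  tries = 0
  begin
    yield
  rescue StandardError
    tries += 1
    raise if tries >= attempts   # give up, let the caller see the error
    sleep(base_delay * tries)    # back off a little longer each time
    retry
  end
end
```

Usage would be something like `with_retries { client.update(reply_text) }`, with the twitter gem's rate-limit exception ideally honouring its retry-after hint instead of a fixed backoff.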
The loopback code is the most annoying bug; nothing too serious though.

The DM CLI is nice; the fact that I haven't got live restart is something which interferes with the workflow.
Because the Pi is behind the firewall, I've no off-prem SSH access.

The fact the conservatives have blocked me, that's just amusing. I'll need another account.

One of the most amusing things is that people argue with the bot. Even with "bot" in the name and a profile saying "a raspberry pi", people argue.

Overall the big barrier is content.  It turns out that you don't need to do anything clever about string matching to select the right tweet: random heckles seem to blend in. That's probably a metric of political debate in social media: a 350-line Ruby script tweeting random phrases from a limited set is indistinguishable from humans.

I will accept pull requests of new content. Also: people are free to deploy their own copies. Without the self.txt file it won't reply to any random mentions; it will just listen to its followed accounts and reply to those with a matching file in the data dir.

If the Russians can do it, so can we.
Categories: FLOSS Project Planets

Colm O hEigeartaigh: Security advisories issued for Apache CXF Fediz

Mon, 2017-05-22 12:23
Two security advisories were recently issued for Apache CXF Fediz. In addition to fixing these issues, the recent releases of Fediz impose tighter security constraints in some areas by default compared to older releases. In this post I will document the advisories and the other security-related changes in the recent Fediz releases.

1) Security Advisories

The first security advisory is CVE-2017-7661: "The Apache CXF Fediz Jetty and Spring plugins are vulnerable to CSRF attacks". Essentially, both the Jetty 8/9 and Spring Security 2/3 plugins are subject to a CSRF-style vulnerability when the user doesn't complete the authentication process. In addition, the Jetty plugins are vulnerable even if the user does first complete the authentication process, but only the root context is available as part of this attack.

The second advisory is CVE-2017-7662: "The Apache CXF Fediz OIDC Client Registration Service is vulnerable to CSRF attacks". The OIDC client registration service is a simple web application that allows the creation of clients for OpenId Connect, as well as a number of other administrative tasks. It is vulnerable to CSRF attacks, where a malicious application could take advantage of an existing session to make changes to the OpenId Connect clients that are stored in the IdP.
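For readers unfamiliar with the class of bug, the standard defence both advisories point towards is the synchronizer-token pattern. A generic Ruby sketch (illustrative only, not Fediz code, which is Java):

```ruby
require 'securerandom'

# Issue a fresh per-session CSRF token; the web app embeds this in
# each form it serves.
def issue_csrf_token(session)
  session[:csrf] = SecureRandom.hex(16)
end

# A state-changing request is only accepted when it echoes back the
# token tied to the caller's session, which a cross-site attacker
# cannot read.
def csrf_valid?(session, submitted_token)
  token = session[:csrf]
  !token.nil? && token == submitted_token
end
```

A forged cross-site request carries the victim's session cookie but not the token, so it fails the check.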

2) Fediz IdP security constraints

This section only concerns the WS-Federation (and SAML-SSO) IdP in Fediz. The WS-Federation RP application sends its address via the 'wreply' parameter to the IdP. For SAML SSO, the address to reply to is taken from the consumer service URL of the SAML SSO Request. Previously, the Apache CXF Fediz IdP contained an optional 'passiveRequestorEndpointConstraint' configuration value in the 'ApplicationEntity', which allows the admin to specify a regular expression constraint on the 'wreply' URL.

From Fediz 1.4.0, 1.3.2 and 1.2.4, a new configuration option is available in the 'ApplicationEntity' called 'passiveRequestorEndpoint'. If specified, this is directly matched against the 'wreply' parameter. In a change that breaks backwards compatibility, but that is necessary for security reasons, one of 'passiveRequestorEndpointConstraint' or 'passiveRequestorEndpoint' must be specified in the 'ApplicationEntity' configuration. This ensures that the user cannot be redirected to a malicious client. Similarly, new configuration options are available called 'logoutEndpoint' and 'logoutEndpointConstraint', which validate the 'wreply' parameter when redirecting the user after logout; one of them must be specified.

3) Fediz RP security constraints

This section only concerns the WS-Federation RP plugins available in Fediz. When the user tries to log out of the Fediz RP application, a 'wreply' parameter can be specified to give the address that the Fediz IdP can redirect to after logout is complete. The old functionality was that if 'wreply' was not specified, then the RP plugin instead used the value from the 'logoutRedirectTo' configuration parameter.

From Fediz 1.4.0, 1.3.2 and 1.2.4, a new configuration option is available called 'logoutRedirectToConstraint'. If a 'wreply' parameter is presented, then it must match the regular expression that is specified for 'logoutRedirectToConstraint', otherwise the 'wreply' value is ignored and it falls back to 'logoutRedirectTo'. 
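The fallback logic described above can be sketched as follows (hypothetical Ruby, not Fediz source; Fediz itself is Java and reads these values from the plugin configuration):

```ruby
# Honour the client-supplied 'wreply' address only when it matches the
# configured 'logoutRedirectToConstraint' regular expression; otherwise
# ignore it and fall back to the configured 'logoutRedirectTo'.
def logout_redirect(wreply, logout_redirect_to, constraint)
  if wreply && constraint && wreply.match?(constraint)
    wreply
  else
    logout_redirect_to
  end
end
```

Anchoring the constraint regex (e.g. `\Ahttps://app\.example\.com/`) matters: an unanchored pattern could be satisfied by a malicious URL that merely contains the allowed host somewhere in its path.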
Categories: FLOSS Project Planets

Bryan Pendleton: The Upper Rhine Valley: prelude and overture

Sun, 2017-05-21 17:11

We took an altogether-too-short but thoroughly wonderful trip to the Upper Rhine Valley region of Europe. I'm not sure that "Upper Rhine Valley" is a recognized term for this region, so please forgive me if I've abused it; more technically, we visited:

  1. The Alsace region of France
  2. The Schwarzwald (Black Forest) region of Germany
  3. The neighboring areas of Frankfurt, Germany, and Basel, Switzerland.
But since we were at no point more than about 40 miles from the Rhine river, and since we were several hundred miles from the Rhine's mouth in the North Sea, it seems like a pretty good description to me.

Plus, it matches up quite nicely with this map.

So there you go.

Anyway, we spent 10 wonderful days there, which was hardly even close to enough, but it was what we had.

And I, in my inimitable fashion, packed about 30 days of sightseeing into those 10 days, completely exhausting my travel companions.

Once again, no surprise.

I'll have more to write about various aspects of the trip subsequently, but here let me try to crudely summarize the things that struck me about the trip.

  • Rivers are incredibly important in Europe, much more so than here in America. Rivers provide transportation, drinking water, sewage disposal, electric power, food (fish), and form the boundaries between regions and nations. They do some of these things in America, too, but we aren't nearly as attached to our rivers as they are in Central Europe, where some of the great rivers of the world arise.
  • For centuries, castles helped people keep an eye on their rivers, and make sure that their neighbors were behaving as they should in the river valleys.
  • Trains are how you go places in Europe. Yes, you can fly, or you can drive, but if you CAN take a train, you should. And, if you can take a first class ticket on TGV, you absolutely, absolutely should. I have never had a more civilized travel experience than taking the TGV from Frankfurt to Strasbourg. (Though full credit to Lufthansa for being a much-better-than-ordinary airline. If you get a chance to travel Lufthansa, do it.)
  • To a life-long inhabitant of the American West, Central Europe is odd for having almost no animals. People live in Central Europe, nowadays; animals do not. BUT: storks!
  • France, of course, is the country that perfected that most beautiful of beverages: wine. While most of the attention to wine in France goes to Southern France, don't under-rate Alsace, for they have absolutely delicious wines of many types, and have been making wine for (at least) 2,000 years. We Californians may think we know something about wine; we don't.
  • The visible history of the Upper Rhine Valley is deeply formed by the Franks. Don't try to understand the cathedrals, villages, cities, etc. without spending some time thinking about Charlemagne, etc. And, if you were like me and rather snored through this part of your schooling, prepare to have your eyes opened.
  • The other major history of the Upper Rhine Valley involves wars. My, but this part of the world has been fought over for a long time. Most recently, of course, we can distinguish these major events:
    1. The Franco-Prussian war, which unified Germany and resulted in Alsace being a German territory
    2. World War One
    3. World War Two
    Although the most recent of these events is now 75 years in the past, the centuries and centuries of conflict over who should rule these wonderful lands has left its mark, deeply.

    So often through my visit I thought to myself: "Am I in French Germany? Or perhaps is this German France?" Just trying to form and phrase these questions in my head, I realized how little I knew, and how much there is to learn, about how people form their bonds with their land, and their neighbors, and their thoughts. Language, food, customs, politics, literature: it's all complex and it's all one beautiful whole.

    This, after all, is the land where Johannes Gutenberg invented the printing press, where people like Johann Wolfgang von Goethe, Louis Pasteur, John Calvin, and Albert Schweitzer lived and did their greatest work.

I could, of course, have been much terser:

  1. The Upper Rhine Valley is one of the most beautiful places on the planet. The people who live there are very warm and welcoming, and it is a delightful place to take a vacation
  2. Early May is an absolutely superb time to go there.

I'll write more later, as I find time.

Categories: FLOSS Project Planets

Bryan Pendleton: Back online

Sun, 2017-05-21 09:01

I took a break from computers.

I had a planned vacation, and so I did something that's a bit rare for me: I took an 11 day break from computers.

I didn't use any desktops or laptops. I didn't have my smartphone with me.

I went 11 days without checking my email, or signing on to various sites where I'm a regular, or opening my Feedly RSS reader, or anything like that.

Now, I wasn't TOTALLY offline: there were newspapers and television broadcasts around, and I was traveling with other people who had computers.

But, overall, it was a wonderful experience to just "unplug" for a while.

I recommend it highly.

Categories: FLOSS Project Planets

Shawn McKinney: The Anatomy of a Secure Web App Using Java EE, Spring Security and Apache Fortress

Sat, 2017-05-20 22:55

Had a great time this week at ApacheCon.  This talk was presented on Thursday…

Categories: FLOSS Project Planets

Justin Mason: Links for 2017-05-20

Sat, 2017-05-20 19:58
Categories: FLOSS Project Planets

Bertrand Delacretaz: Apache: lean and mean, durable, fun!

Fri, 2017-05-19 05:45

Here’s another blog post of mine that was initially published by Computerworld UK.

My current Fiat Punto Sport is the second Diesel car that I own, and I love those engines. Very smooth yet quite powerful acceleration, good fuel savings, a discount on state taxes thanks to low pollution, and it’s very reliable and durable. And fun to drive. How often does Grandma go “wow” when you put the throttle down in your car? That happens here, and that Grandma is not usually a car freak.

Diesel engines used to be boring, but they have made incredible progress in the last few years – while staying true to their basic principles of simplicity, robustness and reliability.

The recent noise about the Apache Software Foundation (ASF) moving to Git, or not, made me think that the ASF might well be the (turbocharged, like my car) Diesel engine of open source. And that might be a good thing.

The ASF’s best practices are geared towards project sustainability, and building communities around our projects. That might not be as flashy as creating a cool new project in three days, but sometimes you need to build something durable, and you need to be able to provide your users with some reassurances that that will be the case – or that they can take over cleanly if not.

In a similar way to a high tech Diesel engine that’s built to last and operate smoothly, I think the ASF is well suited for projects that have a long term vision. We often encourage projects that want to join the ASF via its Incubator to first create a small community and release some initial code, at some other place, before joining the Foundation. That’s one way to help those projects prove that they are doing something viable, and it’s also clearly faster to get some people together and just commit some code to one of the many available code sharing services, than following the ASF’s rules for releases, voting etc.

A Japanese 4-cylinder 600cc gasoline-powered sports bike might be more exciting than my Punto on a closed track, but I don’t like driving those in day-to-day traffic or on long trips. Too brutal, requires way too much attention. There’s space for both that and my car’s high tech Diesel engine, and I like both styles actually, depending on the context.

Open Source communities are not one-size-fits-all: there’s space for different types of communities, and by exposing each community’s positive aspects, instead of trying to get them to fight each other, we might just grow the collective pie and live happily ever after (there’s a not-so-hidden message to sensationalistic bloggers in that last paragraph).

I’m very happy with the ASF being the turbocharged Diesel engine of Open Source – it does have to stay on its toes to make sure it doesn’t turn into a boring old-style Diesel, but there’s no need to rush evolution. There’s space for different styles.

Categories: FLOSS Project Planets

Justin Mason: Links for 2017-05-18

Thu, 2017-05-18 19:58
  • Spotting a million dollars in your AWS account · Segment Blog

    You can easily split your spend by AWS service per month and call it a day. Ten thousand dollars of EC2, one thousand to S3, five hundred dollars to network traffic, etc. But what’s still missing is a synthesis of which products and engineering teams are dominating your costs.  Then, add in the fact that you may have hundreds of instances and millions of containers that come and go. Soon, what started as simple analysis problem has quickly become unimaginably complex.  In this follow-up post, we’d like to share details on the toolkit we used. Our hope is to offer up a few ideas to help you analyze your AWS spend, no matter whether you’re running only a handful of instances, or tens of thousands.

    (tags: segment money costs billing aws ec2 ecs ops)

Categories: FLOSS Project Planets

FeatherCast: Kevin McGrail, Fundraising and Apachecon North America

Thu, 2017-05-18 16:35

ApacheCon 2017 attendee interview with Kevin McGrail. We talk to Kevin about his new role as VP Fundraising and the goals he’d like to achieve over the coming year.
Categories: FLOSS Project Planets

Colm O hEigeartaigh: Configuring Kerberos for HDFS in Talend Open Studio for Big Data

Thu, 2017-05-18 10:33
A recent series of blog posts showed how to install and configure Apache Hadoop as a single node cluster, and how to authenticate users via Kerberos and authorize them via Apache Ranger. Interacting with HDFS via the command line tools as shown in the article is convenient but limited. Talend offers a freely-available product called Talend Open Studio for Big Data which you can use to interact with HDFS instead (and many other components as well). In this article we will show how to access data stored in HDFS that is secured with Kerberos as per the previous tutorials.

1) HDFS setup

To begin with, please follow the first tutorial to install Hadoop and to store the LICENSE.txt in a '/data' folder. Then follow the fifth tutorial to set up an Apache Kerby based KDC testcase and configure HDFS to authenticate users via Kerberos. To test that everything is working correctly on the command line, do:
  • export KRB5_CONFIG=/pathtokerby/target/krb5.conf
  • kinit -k -t /pathtokerby/target/alice.keytab alice
  • bin/hadoop fs -cat /data/LICENSE.txt
2) Download Talend Open Studio for Big Data and create a job

Now we will download Talend Open Studio for Big Data (6.4.0 was used for the purposes of this tutorial). Unzip the file when it is downloaded and then start the Studio using one of the platform-specific scripts. It will prompt you to download some additional dependencies and to accept the licenses. Click on "Create a new job" and call it "HDFSKerberosRead". In the search bar under "Palette" on the right hand side enter "tHDFS" and hit enter. Drag "tHDFSConnection" and "tHDFSInput" to the middle of the screen. Do the same for "tLogRow":
We now have all the components we need to read data from HDFS. "tHDFSConnection" will be used to configure the connection to Hadoop. "tHDFSInput" will be used to read the data from "/data" and finally "tLogRow" will just log the data so that we can be sure that it was read correctly. The next step is to join the components up. Right click on "tHDFSConnection" and select "Trigger/On Subjob Ok" and drag the resulting line to "tHDFSInput". Right click on "tHDFSInput" and select "Row/Main" and drag the resulting line to "tLogRow":
3) Configure the components

Now let's configure the individual components. Double click on "tHDFSConnection". For the "version", select the "Hortonworks" Distribution with version HDP V2.5.0 (we are using the original Apache distribution as part of this tutorial, but it suffices to select Hortonworks here). Under "Authentication" tick the checkbox called "Use kerberos authentication". For the Namenode principal specify "hdfs/". Select the checkbox marked "Use a keytab to authenticate". Select "alice" as the principal and "<>/target/alice.keytab" as the "Keytab":
Now click on "tHDFSInput". Select the checkbox for "Use an existing connection" + select the "tHDFSConnection" component in the resulting component list. For "File Name" specify the file we want to read: "/data/LICENSE.txt":
Now click on "Edit schema" and hit the "+" button. This will create a "newColumn" column of type "String". We can leave this as it is, because we are not doing anything with the data other than logging it. Save the job. Now the only thing that remains is to point to the krb5.conf file that is generated by the Kerby project. Click on "Window/Preferences" at the top of the screen. Select "Talend" and "Run/Debug". Add a new JVM argument: "":

Now we are ready to run the job. Click on the "Run" tab and then hit the "Run" button. If everything is working correctly, you should see the contents of "/data/LICENSE.txt" displayed in the Run window.
Categories: FLOSS Project Planets

Piergiorgio Lucidi: Microsoft Build 2017: Open Source and interoperability

Thu, 2017-05-18 06:18

The last week I joined Microsoft Build 2017 in Seattle together with some members of the Technical Advisory Group (TAG).

I would like to share some of the good points for developers and the Enterprise field in terms of how Microsoft is continuing to adopt Open Source and Open Standards in different ways.

Categories: FLOSS Project Planets

Sergey Beryozkin: Distributed Tracing with CXF: New Features

Thu, 2017-05-18 05:23
As you may already know, Apache CXF has been offering simple but effective support for tracing CXF client and server calls with HTrace since 2015.

What is interesting about this feature is that it was done after the DevMind attended ApacheCon NA 2015 and got inspired to integrate CXF with HTrace.

You'll be glad to know this feature has now been enhanced to get the trace details propagated to the logs, which is the least intrusive way of working with HTrace. Should you need more advanced control, CXF will help; see this section for example.

CXF has also been integrated with Brave. That should work better for CXF OSGi users. The integration work with Brave 4 is under way now.
Categories: FLOSS Project Planets

Justin Mason: Links for 2017-05-17

Wed, 2017-05-17 19:58
  • Seeking medical abortions online is safe and effective, study finds | World news | The Guardian

    Of the 1,636 women who were sent the drugs between the start of 2010 and the end of 2012, the team were able to analyse self-reported data from 1,000 individuals who confirmed taking the pills. All were less than 10 weeks pregnant. The results reveal that almost 95% of the women successfully ended their pregnancy without the need for surgical intervention. None of the women died, although seven women required a blood transfusion and 26 needed antibiotics. Of the 93 women who experienced symptoms for which the advice was to seek medical attention, 95% did so, going to a hospital or clinic. “When we talk about self-sought, self-induced abortion, people think about coat hangers or they think about tables in back alleys,” said Aiken. “But I think this research really shows that in 2017 self-sourced abortion is a network of people helping and supporting each other through what’s really a safe and effective process in the comfort of their own homes, and I think is a huge step forward in public health.”

    (tags: health medicine abortion pro-choice data women-on-web ireland law repealthe8th)

Categories: FLOSS Project Planets

Nick Kew: The Great B Minor

Wed, 2017-05-17 11:53

This Sunday, May 21st, we’re performing Bach’s B Minor Mass at the Guildhall, Plymouth.  This work needs no introduction, and I have no hesitation recommending it for readers who enjoy music and are within evening-out distance of Plymouth.

Tickets are cheaper in advance than on the door, so you might want to visit your favourite regular ticket vendor or google for online sales.

Minor curiosity: the edition we’re using was edited by Arthur Sullivan.  Yes, he of G&S, and an entirely different era and genre of music!   It’s also the Novello edition used in most performances in Britain.

Categories: FLOSS Project Planets

Colm O hEigeartaigh: Securing Apache Hadoop Distributed File System (HDFS) - part V

Wed, 2017-05-17 05:23
This is the fifth in a series of blog posts on securing HDFS. The first post described how to install Apache Hadoop, and how to use POSIX permissions and ACLs to restrict access to data stored in HDFS. The second post looked at how to use Apache Ranger to authorize access to data stored in HDFS. The third post looked at how Apache Ranger can create "tag" based authorization policies for HDFS using Apache Atlas. The fourth post looked at how to implement transparent encryption for HDFS using Apache Ranger. Up to now, we have not shown how to authenticate users, concentrating only on authorizing local access to HDFS. In this post we will show how to configure HDFS to authenticate users via Kerberos.

1) Set up a KDC using Apache Kerby

If we are going to configure Apache Hadoop to use Kerberos to authenticate users, then we need a Kerberos Key Distribution Center (KDC). Typically most documentation revolves around installing the MIT Kerberos server, adding principals, and creating keytabs etc. However, in this post we will show a simpler way of getting started by using a pre-configured maven project that uses Apache Kerby. Apache Kerby is a subproject of the Apache Directory project, and is a complete open-source KDC written entirely in Java.

A github project that uses Apache Kerby to start up a KDC is available here:
  • bigdata-kerberos-deployment: This project contains some tests which can be used to test kerberos with various big data deployments, such as Apache Hadoop etc.
The KDC is a simple junit test that is available here. To run it just comment out the "org.junit.Ignore" annotation on the test method. It uses Apache Kerby to define the following principals:
  • hdfs/
  • HTTP/
Keytabs are created in the "target" folder for "alice", "bob" and "hdfs" (where the latter has both the hdfs/localhost + HTTP/localhost principals included). Kerby is configured to use a random port to launch the KDC each time, and it will create a "krb5.conf" file containing the random port number in the target directory. So all we need to do is point Hadoop to the keytabs that were generated and to the krb5.conf, and it should be able to communicate correctly with the Kerby-based KDC.

2) Configure Hadoop to authenticate users via Kerberos

Download and configure Apache Hadoop as per the first tutorial. For now, we will not enable the Ranger authorization plugin, but rather secure access to the "/data" directory using ACLs, as described in section (3) of the first tutorial, such that "alice" has permission to read the file stored in "/data" but "bob" does not. The next step is to configure Hadoop to authenticate users via Kerberos.

Edit 'etc/hadoop/core-site.xml' and add the following property name/values:
  • hadoop.security.authentication: kerberos
Next edit 'etc/hadoop/hdfs-site.xml' and add the following property name/values to configure Kerberos for the namenode:
  • dfs.namenode.keytab.file: Path to Kerby hdfs.keytab (see above).
  • dfs.namenode.kerberos.principal: hdfs/
  • dfs.namenode.kerberos.internal.spnego.principal: HTTP/
Add the exact same property name/values for the secondary namenode, except using the property name "secondary.namenode" instead of "namenode". We also need to configure Kerberos for the datanode:
  • dfs.datanode.data.dir.perm: 700
  • dfs.datanode.address:
  • dfs.datanode.http.address:
  • dfs.web.authentication.kerberos.principal: HTTP/
  • dfs.datanode.keytab.file: Path to Kerby hdfs.keytab (see above).
  • dfs.datanode.kerberos.principal: hdfs/
  • dfs.block.access.token.enable: true 
As we are not using SASL to secure the data transfer protocol (see here), we need to download and configure JSVC into JSVC_HOME. Then edit 'etc/hadoop/' and add the following properties:
  • export HADOOP_SECURE_DN_USER=(the user you are running HDFS as)
  • export JSVC_HOME=(path to JSVC as above)
  • export HADOOP_OPTS="-Djava.security.krb5.conf=<path to Kerby target/krb5.conf>"
You also need to make sure that you can ssh to localhost as "root" without specifying a password.

3) Launch Kerby and HDFS and test authorization

Now that we have hopefully configured everything correctly, it's time to launch the Kerby-based KDC and HDFS. Start Kerby by running the JUnit test as described in the first section. Now start HDFS via:
  • sbin/
  • sudo sbin/
Now let's try to read the file in "/data" using "bin/hadoop fs -cat /data/LICENSE.txt". You should see an exception as we have no credentials. Let's try to read as "alice" now:
  • export KRB5_CONFIG=/pathtokerby/target/krb5.conf
  • kinit -k -t /pathtokerby/target/alice.keytab alice
  • bin/hadoop fs -cat /data/LICENSE.txt
This should be successful. However the following should result in a "Permission denied" message:
  • kdestroy
  • kinit -k -t /pathtokerby/target/bob.keytab bob
  • bin/hadoop fs -cat /data/LICENSE.txt
Categories: FLOSS Project Planets

Mukul Gandhi: Turing Award 2016

Wed, 2017-05-17 01:20
It's nice to see Sir Tim Berners-Lee as a recipient of the A.M. Turing Award. More details are available on the ACM website.

Justin Mason: Links for 2017-05-15

Mon, 2017-05-15 19:58
  • The World Is Getting Hacked. Why Don’t We Do More to Stop It? – The New York Times

    Zeynep Tufekci is (as usual!) on the money with this op-ed. I strongly agree with the following:

    First, companies like Microsoft should discard the idea that they can abandon people using older software. The money they made from these customers hasn’t expired; neither has their responsibility to fix defects. Besides, Microsoft is sitting on a cash hoard estimated at more than $100 billion (the result of how little tax modern corporations pay and how profitable it is to sell a dominant operating system under monopolistic dynamics with no liability for defects). At a minimum, Microsoft clearly should have provided the critical update in March to all its users, not just those paying extra. Indeed, “pay extra money to us or we will withhold critical security updates” can be seen as its own form of ransomware. In its defense, Microsoft probably could point out that its operating systems have come a long way in security since Windows XP, and it has spent a lot of money updating old software, even above industry norms. However, industry norms are lousy to horrible, and it is reasonable to expect a company with a dominant market position, that made so much money selling software that runs critical infrastructure, to do more. Microsoft should spend more of that $100 billion to help institutions and users upgrade to newer software, especially those who run essential services on it. This has to be through a system that incentivizes institutions and people to upgrade to more secure systems and does not force choosing between privacy and security. Security updates should only update security, and everything else should be optional and unbundled. More on this twitter thread:

    (tags: security microsoft upgrades windows windows-xp zeynep-tufekci worms viruses malware updates software)

  • Fireside Chat with Vint Cerf & Marc Andreessen (Google Cloud Next ’17) – YouTube

    In which Vint Cerf calls for regulatory oversight of software engineering. “It’s a serious issue now”

    (tags: vint-cerf gcp regulation oversight politics law reliability systems)

  • don’t use String.intern() in Java

    String.intern is the gateway to native JVM String table, and it comes with caveats: throughput, memory footprint, pause time problems will await the users. Hand-rolled deduplicators/interners to reduce memory footprint are working much more reliably, because they are working on Java side, and also can be thrown away when done. GC-assisted String deduplication does alleviate things even more. In almost every project we were taking care of, removing String.intern from the hotpaths was the very profitable performance optimization. Do not use it without thinking, okay?

    (tags: strings interning java performance tips)

  • Moom removed from sale due to patent violation claim | Hacker News

    Well this sucks. Some scumbag applied for a patent on tiling window management in 2008, and it’s been granted. I use Moom every day :(

    (tags: moom patents bullshit swpat software window-management osx)

  • V2V and the challenge of cooperating technology

    A great deal of effort and attention has gone into a mobile data technology that you may not be aware of. This is “Vehicle to Vehicle” (V2V) communication designed so that cars can send data to other cars. There is special spectrum allocated at 5.9ghz, and a protocol named DSRC, derived from wifi, exists for communications from car-to-car and also between cars and roadside transmitters in the infrastructure, known as V2I. This effort has been going on for some time, but those involved have had trouble finding a compelling application which users would pay for. Unable to find one, advocates hope that various national governments will mandate V2V radios in cars in the coming years for safety reasons. In December 2016, the U.S. Dept. of Transportation proposed just such a mandate. [….] “Connected Autonomous Vehicles — Pick 2.”

    (tags: cars self-driving autonomous-vehicles v2v wireless connectivity networking security)

  • _Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases_

    ‘Amazon Aurora is a relational database service for OLTP workloads offered as part of Amazon Web Services (AWS). In this paper, we describe the architecture of Aurora and the design considerations leading to that architecture. We believe the central constraint in high throughput data processing has moved from compute and storage to the network. Aurora brings a novel architecture to the relational database to address this constraint, most notably by pushing redo processing to a multi-tenant scale-out storage service, purpose-built for Aurora. We describe how doing so not only reduces network traffic, but also allows for fast crash recovery, failovers to replicas without loss of data, and fault-tolerant, self-healing storage. We then describe how Aurora achieves consensus on durable state across numerous storage nodes using an efficient asynchronous scheme, avoiding expensive and chatty recovery protocols. Finally, having operated Aurora as a production service for over 18 months, we share the lessons we have learnt from our customers on what modern cloud applications expect from databases.’

    (tags: via:rbranson aurora aws amazon databases storage papers architecture)

  • Hello Sandwich Tokyo Guide

    a guide for people who like travelling like a local and visiting hidden places off the beaten track. There are tips on where to rent a bike, the best bike path, the best coffee, the best craft shops, the coolest shops, the cheapest drinks, the most delicious pizza, the best izakaya, the cutest cafes, the best rooftop bar, the coolest hotels (and the cheap and cheerful hotels), the loveliest parks and soooo much more. It’s a list of all of the places I frequent, making it a local insiders guide to Tokyo. Also included in the Hello Sandwich Tokyo Guide are language essentials and travel tips. It’s the bloggers guide to Tokyo and if you’d like to visit the places seen on Hello Sandwich, then this guide is the zine for you.

    (tags: shops tourism japan tokyo guidebooks)

  • jantman/awslimitchecker

    A script and python module to check your AWS service limits and usage, and warn when usage approaches limits. Users building out scalable services in Amazon AWS often run into AWS’ service limits – often at the least convenient time (i.e. mid-deploy or when autoscaling fails). Amazon’s Trusted Advisor can help this, but even the version that comes with Business and Enterprise support only monitors a small subset of AWS limits and only alerts weekly. awslimitchecker provides a command line script and reusable package that queries your current usage of AWS resources and compares it to limits (hard-coded AWS defaults that you can override, API-based limits where available, or data from Trusted Advisor where available), notifying you when you are approaching or at your limits. (via This Week in AWS)

    (tags: aws amazon limits scripts ops)


Colm O hEigeartaigh: Securing Apache Kafka with Kerberos

Mon, 2017-05-15 10:45
Last year, I wrote a series of blog articles on securing Apache Kafka. The articles covered how to secure access to the Apache Kafka broker using TLS client authentication, and how to implement authorization policies using Apache Ranger and Apache Sentry. Recently I wrote another article giving a practical demonstration of how to secure HDFS using Kerberos. In this post I will look at how to secure Apache Kafka using Kerberos, with a test-case based on Apache Kerby. For more information on securing Kafka with Kerberos, see the Kafka security documentation.

1) Set up a KDC using Apache Kerby

A github project that uses Apache Kerby to start up a KDC is available here:
  • bigdata-kerberos-deployment: This project contains some tests which can be used to test Kerberos with various big data deployments, such as Apache Hadoop.
The KDC is a simple JUnit test that is available here. To run it, just comment out the "org.junit.Ignore" annotation on the test method. It uses Apache Kerby to define the following principals:
  • zookeeper/
  • kafka/
Keytabs are created in the "target" folder. Kerby is configured to use a random port to launch the KDC each time, and it will create a "krb5.conf" file containing the random port number in the target directory.

2) Configure Apache Zookeeper

Download Apache Kafka and extract it ( was used for the purposes of this tutorial). Edit 'config/' and add the following properties:
  • authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
  • requireClientAuthScheme=sasl 
  • jaasLoginRenew=3600000
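For orientation, here is a sketch of what the Zookeeper properties file looks like after these additions; the first three lines are the defaults shipped with Kafka, and the rest are the SASL properties above:

```properties
dataDir=/tmp/zookeeper
clientPort=2181
maxClientCnxns=0
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
requireClientAuthScheme=sasl
jaasLoginRenew=3600000
```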
Now create 'config/zookeeper.jaas' with the following content:

Server { required refreshKrb5Config=true useKeyTab=true keyTab="/" storeKey=true principal="zookeeper/localhost"; };
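A complete JAAS "Server" section normally names a login module class explicitly. As a sketch, assuming the standard JDK Krb5LoginModule and a placeholder keytab path, it looks like this:

```
Server {
    com.sun.security.auth.module.Krb5LoginModule required
    refreshKrb5Config=true
    useKeyTab=true
    keyTab="/path/to/kerby/target/zookeeper.keytab"
    storeKey=true
    principal="zookeeper/localhost";
};
```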

Before launching Zookeeper, we need to point to the JAAS configuration file above and also to the krb5.conf file generated in the Kerby test-case above. This can be done by setting the "KAFKA_OPTS" system property with the JVM arguments:
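As a sketch, these JVM arguments are normally passed via the standard JDK system properties for JAAS and Kerberos configuration (the paths below are placeholders):

```shell
# Placeholder paths: point at the zookeeper.jaas file created above
# and at the krb5.conf generated by the Kerby test-case.
export KAFKA_OPTS="-Djava.security.auth.login.config=/path/to/config/zookeeper.jaas \
  -Djava.security.krb5.conf=/path/to/kerby/target/krb5.conf"
```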
Now start Zookeeper via:
  • bin/ config/ 
3) Configure Apache Kafka broker

Create 'config/kafka.jaas' with the content:

KafkaServer {
   required refreshKrb5Config=true useKeyTab=true keyTab="/" storeKey=true principal="kafka/localhost";
};

Client { required refreshKrb5Config=true useKeyTab=true keyTab="/" storeKey=true principal="kafka/localhost"; };

The "Client" section is used to talk to Zookeeper. Now edit 'config/' and add the following properties:
  • listeners=SASL_PLAINTEXT://localhost:9092
  • sasl.enabled.mechanisms=GSSAPI 
We will just concentrate on using SASL for authentication, and hence we are using "SASL_PLAINTEXT" as the protocol. For "SASL_SSL" please follow the keystore generation as outlined in the following article. Again, we need to set the "KAFKA_OPTS" system property with the JVM arguments:
Now we can start the server and create a topic as follows:
  • bin/ config/
  • bin/ --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
4) Configure Apache Kafka producers/consumers

To make the test-case simpler, we added a single principal "client" in the KDC for both the producer and the consumer. Create a file called "config/client.jaas" with the content:

KafkaClient { required refreshKrb5Config=true useKeyTab=true keyTab="/" storeKey=true principal="client"; };

Edit *both* 'config/' and 'config/' and add:
  • security.protocol=SASL_PLAINTEXT
  • sasl.mechanism=GSSAPI 
Now set the "KAFKA_OPTS" system property with the JVM arguments:
We should now be all set. Start the producer and consumer via:
  • bin/ --broker-list localhost:9092 --topic test --producer.config config/
  • bin/ --bootstrap-server localhost:9092 --topic test --from-beginning --consumer.config config/ --new-consumer
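For completeness, the same client settings can be assembled programmatically when building a producer or consumer. This is only a sketch: the class name is hypothetical, but the property keys are standard Kafka client settings, and the "kafka" service name mirrors the kafka/localhost broker principal defined in the Kerby KDC.

```java
import java.util.Properties;

public class KerberosClientConfig {

    // Assembles the SASL/GSSAPI client settings that the
    // producer/consumer config file edits above describe.
    public static Properties saslProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("security.protocol", "SASL_PLAINTEXT");
        props.put("sasl.mechanism", "GSSAPI");
        // "kafka" matches the service part of the broker principal
        // ("kafka/localhost") defined in the Kerby test-case.
        props.put("sasl.kerberos.service.name", "kafka");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(saslProps().getProperty("security.protocol"));
    }
}
```

These properties can then be passed directly to a KafkaProducer or KafkaConsumer constructor alongside the usual serializer or deserializer settings.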