Brian Osborne: Keeping a view of upcoming events fresh in Drupal 8

Planet Drupal - Thu, 2017-05-18 19:34

Imagine you have a view that lists upcoming events on your Drupal 8 site. There's a date filter that filters out any event whose start date is less than the current date. This works great until you realize that the output of the view will be cached in one or more places (dynamic page cache, internal page cache, Varnish, etc.). Once it's cached, Views doesn't execute the query and can't compare the date to the current time, so you may get older events sticking around.

Categories: FLOSS Project Planets

My adventures on Crafting PT II

Planet KDE - Thu, 2017-05-18 19:23

Five days ago I did part one of this adventure, which you can check here. Now it's time for part two. =D Well, I was able to Craft AtCore and have it running on Windows. However, that raised a problem that I had when I crafted AtCore at the beginning of the year. It [...]

Categories: FLOSS Project Planets

Dries Buytaert: Friduction: the internet's unstoppable drive to eliminate friction

Planet Drupal - Thu, 2017-05-18 19:02

There is one significant trend that I have noticed over and over again: the internet's continuous drive to mitigate friction in user experiences and business models.

Since the internet's commercial debut in the early 90s, it has captured success and upset the established order by eliminating unnecessary middlemen. Book stores, photo shops, travel agents, stock brokers, bank tellers and music stores are just a few examples of the kinds of middlemen who have been eliminated by their online counterparts. The act of buying books, printing photos or booking flights online alleviates the friction felt by consumers who must stand in line or wait on hold to speak to a customer service representative.

Rather than interpreting this evolution as disintermediation or taking something away, I believe there is value in recognizing that the internet is constantly improving customer experiences by reducing friction from systems — a process I like to call "friduction".

Open Source and cloud

Over the past 15 years, I've watched open source and cloud computing solutions transform content management into digital experience management. Specifically, I have observed open source and cloud-computing solutions remove friction from legacy approaches to technology. Open source takes the friction out of the technology evaluation and adoption process; you are not forced to get a demo, go through a sales and procurement process, or deal with the limitations of a proprietary license. Cloud computing took off because it also offers friduction; with cloud, companies pay for what they use, avoid large up-front capital expenditures, and gain speed-to-market.

Cross-channel experiences

Technology will continue to work to eliminate inefficiencies, and today, emerging distribution platforms will continue to improve user experience. There is a reason why Drupal's API-first initiative is one of the topics I've talked and written the most about in 2016; it enables Drupal to "move beyond the page" and integrate with different user engagement systems. We're quickly headed to a world where websites are evolving into cross-channel experiences, which include push notifications, conversational UIs, and more. Conversational UIs, such as chatbots and voice assistants, will eliminate certain inefficiencies inherent to traditional websites. These technologies will prevail because they improve and redefine the customer experience. In fact, Acquia Labs was founded last year to explore how we can help customers bring these browser-less experiences to market.

Personalization and contextualization

In the 90s, personalization meant that websites could address authenticated users by name. I remember the first time I saw my name appear on a website; I was excited! Obviously personalization strategies have come a long way since the 90s. Today, websites present recommendations based on a user's most recent activity, and consumers expect to be provided with highly tailored experiences. The drive for greater personalization and contextualization will never stop; there is too much value in removing friction from the user experience. When a commerce website can predict what you like based on past behavior, it eliminates friction from the shopping process. When a customer support website can predict what question you are going to ask next, it is able to provide a better customer experience. This is not only useful for the user, but also for the business. A more efficient user experience will translate into higher sales, improved customer retention and better brand exposure.

To keep pace with evolving user expectations, tomorrow's digital experiences will need to deliver more tailored, and even predictive customer experiences. This will require organizations to consume multiple sources of data, such as location data, historic clickstream data, or information from wearables to create a fine-grained user context. Data will be the foundation for predictive analytics and personalization services. Advancing user privacy in conjunction with data-driven strategies will be an important component of enhancing personalized experiences. Eventually, I believe that data-driven experiences will be the norm.

At Acquia, we started investing in contextualization and personalization in 2014, through the release of a product called Acquia Lift. Adoption of Acquia Lift has grown year over year, and we expect it to increase for years to come. Contextualization and personalization will become more pervasive, especially as different systems of engagements, big data, the internet of things (IoT) and machine learning mature, combine, and begin to have profound impacts on what the definition of a great user experience should be. It might take a few more years before trends like personalization and contextualization are fully adopted by the early majority, but we are patient investors and product builders. Systems like Acquia Lift will be of critical importance and premiums will be placed on orchestrating the optimal customer journey.


The history of the web dictates that lower-friction solutions will surpass what came before them because they eliminate inefficiencies from the customer experience. Friduction is a long-term trend. Websites, the internet of things, augmented and virtual reality, conversational UIs — all of these technologies will continue to grow because they will enable us to build lower-friction digital experiences.

Categories: FLOSS Project Planets

Lullabot: Modernizing JavaScript in Drupal 8

Planet Drupal - Thu, 2017-05-18 18:00
Mike and Matt host two of Drupal's JavaScript maintainers, Théodore Biadala and Matthew Grill, as well as Lullabot's resident JavaScript expert Sally Young, and talk about the history of JavaScript in Drupal, and attempts to modernize it.
Categories: FLOSS Project Planets

Third & Grove: Using the Batch API and hook_update_N in Drupal 8

Planet Drupal - Thu, 2017-05-18 17:07
Categories: FLOSS Project Planets

agoradesign: How Drupal Commerce 2.x improved my skills

Planet Drupal - Thu, 2017-05-18 16:59
As one of the first early adopters of Drupal Commerce 2.x, having started our first project in early 2016 (on the alpha2 version) and a second one soon after, I originally planned to write a blog post soon after beta1 was published. Unfortunately I was too busy at that time....
Categories: FLOSS Project Planets

Tomasz Früboes: MongoDB for Developers (python flavour) – course review

Planet Python - Thu, 2017-05-18 16:37

A while ago I decided to dive into the NoSQL world. This decision was driven partly by curiosity about whether it could be treated as an alternative to Elasticsearch, which I have seen used (along with Kibana) to implement rich analytics on mixed (text + numerical) data. I also wanted to understand when I would want to use a NoSQL database over a more conservative solution (e.g. PostgreSQL or MySQL).

It turns out the company behind the popular MongoDB NoSQL database offers a number of free online courses targeting audiences with different levels of expertise (see https://university.mongodb.com/). Among the introductory courses you can find “M101P: MongoDB for Developers”, targeting Python developers. Python expertise is not a strict requirement for this course, since a compact introduction to Python is available during the first lesson.

The course consists of 6 units and a final exam. Units are released on a weekly basis. After completing the course you will understand the following:

  • Database installation and Python-side setup
  • Working inside the mongo shell – creating/reading/updating data
  • Basics of schema design. Most importantly – when to denormalize
  • Basics of performance tuning, such as using database indexes and analyzing query execution via explain
  • The aggregation framework
  • Basics of multi-host setups, e.g. using sharding to spread the workload across multiple slaves

Each unit ends with a homework assignment; the homework makes up 50% of the final score, and the remaining part comes from the final exam.

The Python-related part (i.e. how to incorporate MongoDB into your application using the pymongo package) makes up maybe 20% of the course content. At first this may seem low, but the more I think about it, the more right it feels. Working via the mongo shell, i.e. using JSON to construct your queries or aggregations, is a valuable skill. If you are able to do that using JSON, you almost automatically know how to do it in Python using pymongo. There are some minor differences, but the course points them out.
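A hypothetical illustration (the `people` collection and the queries below are my own, not from the course material): a filter or pipeline written as JSON in the mongo shell is the same document expressed as Python dicts and lists in pymongo:

```python
# mongo shell:  db.people.find({ "age": { "$gt": 30 } })
# pymongo:      db.people.find(shell_filter) -- the filter is a plain dict
shell_filter = {"age": {"$gt": 30}}

# An aggregation pipeline translates the same way: JSON arrays become
# Python lists, JSON objects become dicts.
pipeline = [
    {"$match": {"age": {"$gt": 30}}},
    {"$group": {"_id": "$city", "count": {"$sum": 1}}},
]

print(len(pipeline))  # 2 stages, passed to collection.aggregate() unchanged
```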

As written above, homework assignments make up 50% of the final score, so the easiness of some assignments may be surprising. For example, some of them seem to test only copy-and-paste ability, like the one for building a three-server replica set (last chapter of the course). For some assignments you get a test case with a known answer to verify that your solution is working correctly; then you run it on different data and post the answer. In such cases it is nearly impossible to post a wrong answer (assuming you got the test case right).

Overall, this gives you a positive push towards trying various things yourself. In the above-mentioned case of building a replica set, I have a feeling I learned and memorized something. Even if it was only copy-and-paste into the terminal, it had a better educational effect than just reading the text or watching the video.

The course ends with an exam consisting of 10 questions. Since the worst-graded homework is automatically dropped, it is fairly easy to get a 100% homework score, which gives you a 50% “pedestal” toward the final grade. In that case you need just three correct answers in the final exam to pass the course (the final score threshold is 65%).
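The arithmetic works out neatly if we assume the ten exam questions are equally weighted (my assumption; the course may weight them differently):

```python
homework_points = 50      # full homework marks once the worst assignment is dropped
points_per_question = 5   # ten questions sharing the remaining 50 points
correct_answers = 3

final = homework_points + correct_answers * points_per_question
print(final)  # 65 -- exactly the pass threshold
```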

The only two flaws I have found in the course are the following:

  • Videos introduce a range of different MongoDB versions, sometimes “ancient” ones (e.g. 2.6). In some cases these behave differently from the latest and greatest (3.4, which, BTW, was not supported in that edition of the course). On the positive side, in all such cases you get a comment/errata under the video, so it's not really a problem.
  • From time to time the instructor plainly says he doesn't know how to explain some observed behavior and leaves it at that. One such case was why having elements in a set (on the mongo side) does not guarantee the set's ordering (an underlying implementation using hash tables, I guess). This leaves a bad impression, since the course was created by the company that created MongoDB. It's the only real issue I have with the course's quality; fortunately it is rare.

Overall the course is well prepared and was worth the time invested (BTW, the “required time” estimates for each unit felt overestimated in most cases). I would take this course again, even with my current, vague feeling that I won't decide to use MongoDB anytime soon (perhaps never). So go register here; a new course session starts in a couple of days.

Categories: FLOSS Project Planets

FeatherCast: Kevin McGrail, Fundraising and Apachecon North America

Planet Apache - Thu, 2017-05-18 16:35

ApacheCon North America 2017 attendee interview with Kevin McGrail. We talk to Kevin about his new role as VP Fundraising and the goals he’d like to achieve over the coming year.

Categories: FLOSS Project Planets

[No longer accepting applications] Call for a volunteer--Use your Web skills to protect users' rights

FSF Blogs - Thu, 2017-05-18 15:20

Update: We've received an overwhelming number of offers to volunteer. Thank you! Please do not submit any more. If you applied, you can expect to hear back from us by the end of May.

To help you protect yourself, we host a list of Web-based email services which respect your rights better than the big services such as GMail and Yahoo Mail. With thousands of unique visitors per month, this page is an important resource for the free software community and beyond.

Though the FSF hosts the page, we do not perform the evaluations. For more than three years, a community member named Ryan White has carried on these duties on a volunteer basis. Ryan has done an outstanding job, but the volume of requests for evaluations of Webmail systems has recently outstripped the time he is able to commit.

Are you able to help Ryan evaluate Webmail services and maintain the page on fsf.org? You will need:

  • Intermediate knowledge of JavaScript and Web development
  • A working understanding of free software licensing
  • A willingness to learn and analyze new issues
  • Enough free time to generally respond to emails within one week
  • The ability to write with correct grammar in English
  • About 5 hours a month to devote to this project

This is a one-person, four-month volunteer commitment, after which you will meet with FSF staff to review your experience and let us know whether or not you'd like to continue working with Ryan to maintain the page.

The image is in the public domain, copied from https://openclipart.org/detail/139507/homer-postal-pigeon, originally uploaded by rones.

Categories: FLOSS Project Planets

PyCharm: Meet the PyCharm team at PyCon in Portland, OR

Planet Python - Thu, 2017-05-18 12:53

The PyCharm team is traveling to Portland, Oregon for PyCon 2017!

JetBrains is sponsoring the event again this year, and we’ll have a booth in the expo hall during the conference. If you’re attending PyCon, drop by the expo hall and come find us!

Elizaveta Shashkova will share how she achieved the 40x speedup in the PyCharm 2017.1 debugger at her talk “Debugging in Python 3.6: Better, Faster, Stronger“. Come see her talk Saturday, May 20, at 10:50 am in the Oregon Ballroom 201–202.

Our developer advocate Paul Everitt will host a panel discussion with Python creator Guido van Rossum and other original Pythonistas on Sunday, May 21 at 9:20 am. Paul will also give a talk, “Develop With Speed: PyCharm and Intel® Distribution for Python”, on Friday, May 19 at 10:30 am at the Intel booth in the expo hall.

Andrey Vlasovskikh and Dmitry Trofimov are attending the Python Language Summit, while Dmitry Filippov is attending the educational summit at PyCon.

If you have any questions, suggestions, or even complaints about PyCharm: come visit us in the expo hall! We’ll have JetBrains goodies, we can do live demos, and we’ll have a raffle for free PyCharm Professional Edition licenses. We’re also testing a prototype of new PyCharm features, so if you’d like a sneak peek at the future, drop by!

Finally, some of our developers will participate in the sprints. Would you like to join us or invite us to your own sprint? Drop by our booth!

Categories: FLOSS Project Planets

Blair Wadman: End of life of Mollom. What can you use instead?

Planet Drupal - Thu, 2017-05-18 11:57

Acquia has announced the end of life of Mollom: as of April 2018, they will no longer support the product.

You have over a year to find a replacement. I am currently using Mollom and planning on changing mine now. Chances are, if I don't, I'll forget to change it closer to the time!

Categories: FLOSS Project Planets

Colm O hEigeartaigh: Configuring Kerberos for HDFS in Talend Open Studio for Big Data

Planet Apache - Thu, 2017-05-18 10:33
A recent series of blog posts showed how to install and configure Apache Hadoop as a single node cluster, and how to authenticate users via Kerberos and authorize them via Apache Ranger. Interacting with HDFS via the command line tools as shown in the article is convenient but limited. Talend offers a freely-available product called Talend Open Studio for Big Data which you can use to interact with HDFS instead (and many other components as well). In this article we will show how to access data stored in HDFS that is secured with Kerberos as per the previous tutorials.

1) HDFS setup

To begin with please follow the first tutorial to install Hadoop and to store the LICENSE.txt in a '/data' folder. Then follow the fifth tutorial to set up an Apache Kerby based KDC testcase and configure HDFS to authenticate users via Kerberos. To test everything is working correctly on the command line do:
  • export KRB5_CONFIG=/pathtokerby/target/krb5.conf
  • kinit -k -t /pathtokerby/target/alice.keytab alice
  • bin/hadoop fs -cat /data/LICENSE.txt
2) Download Talend Open Studio for Big Data and create a job

Now we will download Talend Open Studio for Big Data (6.4.0 was used for the purposes of this tutorial). Unzip the file when it is downloaded and then start the Studio using one of the platform-specific scripts. It will prompt you to download some additional dependencies and to accept the licenses. Click on "Create a new job" and call it "HDFSKerberosRead". In the search bar under "Palette" on the right hand side enter "tHDFS" and hit enter. Drag "tHDFSConnection" and "tHDFSInput" to the middle of the screen. Do the same for "tLogRow":
We now have all the components we need to read data from HDFS. "tHDFSConnection" will be used to configure the connection to Hadoop. "tHDFSInput" will be used to read the data from "/data" and finally "tLogRow" will just log the data so that we can be sure that it was read correctly. The next step is to join the components up. Right click on "tHDFSConnection" and select "Trigger/On Subjob Ok" and drag the resulting line to "tHDFSInput". Right click on "tHDFSInput" and select "Row/Main" and drag the resulting line to "tLogRow":
3) Configure the components

Now let's configure the individual components. Double click on "tHDFSConnection". For the "version", select the "Hortonworks" Distribution with version HDP V2.5.0 (we are using the original Apache distribution as part of this tutorial, but it suffices to select Hortonworks here). Under "Authentication" tick the checkbox called "Use kerberos authentication". For the Namenode principal specify "hdfs/localhost@hadoop.apache.org". Select the checkbox marked "Use a keytab to authenticate". Select "alice" as the principal and "<path.to.kerby.project>/target/alice.keytab" as the "Keytab":
Now click on "tHDFSInput". Select the checkbox for "Use an existing connection" + select the "tHDFSConnection" component in the resulting component list. For "File Name" specify the file we want to read: "/data/LICENSE.txt":
Now click on "Edit schema" and hit the "+" button. This will create a "newColumn" column of type "String". We can leave this as it is, because we are not doing anything with the data other than logging it. Save the job. Now the only thing that remains is to point to the krb5.conf file that is generated by the Kerby project. Click on "Window/Preferences" at the top of the screen. Select "Talend" and "Run/Debug". Add a new JVM argument: "-Djava.security.krb5.conf=/path.to.kerby.project/target/krb5.conf":

Now we are ready to run the job. Click on the "Run" tab and then hit the "Run" button. If everything is working correctly, you should see the contents of "/data/LICENSE.txt" displayed in the Run window.
Categories: FLOSS Project Planets

A. Jesse Jiryu Davis: Let's Grok the Gil at PyCon, Friday

Planet Python - Thu, 2017-05-18 09:14

I’m talking about fast and thread-safe Python tomorrow at 12:10, in the Oregon Ballroom 201-202. Come learn the internals of the Global Interpreter Lock, its effect on your code, and how to make your programs go fast without crashing. It boils down to two principles pithy enough to write on the backs of your two hands.

Categories: FLOSS Project Planets

Promet Source: Supporting Global Accessibility Awareness Day

Planet Drupal - Thu, 2017-05-18 08:43
Thursday, May 18, 2017 marks the sixth annual Global Accessibility Awareness Day (GAAD). The purpose of GAAD is to get everyone talking, thinking and learning about digital (web, software, mobile, etc.) access/inclusion and people with different disabilities. Promet Source is proud to support GAAD as we help our clients achieve equal access for all across their digital properties.
Categories: FLOSS Project Planets

Marc Kerins: Inspect PCAP Files Using AWS Lambda

Planet Python - Thu, 2017-05-18 08:36

AWS Lambda is a service that allows you to run code without provisioning a server. This has some interesting possibilities, especially when processing data asynchronously. When I first started learning about Lambda, most of the examples were about resizing images. I work with PCAP files on a daily basis and have used scapy for several years, so I thought it would be a good experiment to use Lambda to do some simple PCAP inspection.

It will work something like this:

  1. A PCAP file is uploaded to a specific S3 bucket.
  2. The bucket will trigger an event and notify the Lambda function.
  3. The Lambda function will load the PCAP file from S3 and extract MAC addresses.
  4. The manufacturer of the network interface will be looked up using an external API.

We'll be keeping track of the technical debt we accumulate and address it after we have a working proof of concept. If you want to put a more positive spin on it, you can call it opportunities for improvement. Let's get started.

It's always a good idea to set up a virtual environment when starting a Python project, no matter how small. Let's set one up, making sure to use Python 2.7, and install scapy:

$ mkdir inspect-pcap-lambda && cd inspect-pcap-lambda
$ virtualenv --python=python2.7 env
$ source env/bin/activate
$ pip freeze > pre_scapy.txt
$ pip install scapy
$ pip freeze > post_scapy.txt

You'll see later in the post why I captured the output of pip freeze both before and after installing scapy. scapy has a very useful REPL interface that can be loaded by simply running:

$ scapy

It's a great way to learn about the tool and I recommend exploring it further. We'll be writing a normal Python script, so scapy will be imported. To make things easier, we'll structure our script to match what AWS Lambda expects:

from __future__ import print_function
import scapy


def handler(event, context):
    pass

The handler function is the entry point for the Lambda function. It can be called anything you want, as the module will be imported. We'll call this file inspect_pcap.py, so when our Lambda function runs, inspect_pcap.handler(event, context) will be called.

Let's start using scapy to inspect a PCAP file. There are lots of places where you can download PCAP files, but you can easily create your own. Make sure all other applications are closed to keep the capture uncluttered, and start tcpdump in one terminal:

$ sudo tcpdump -i enp0s3 -s 1514 -w wget_google.pcap

In another terminal make a simple HTTP request to Google:

$ wget http://www.google.com

In the first terminal press Ctrl-C to stop the capture. It should output the number of packets captured. My capture had 29 packets, and yours shouldn't have too many more than that. Take a look at your capture by reading it:

$ tcpdump -r wget_google.pcap
reading from file wget_google.pcap, link-type EN10MB (Ethernet)
...
13:24:16.047269 IP > 63805+ A? www.google.com. (32)
13:24:16.066879 IP > 63805 1/0/0 A (48)
...

The packets shown represent a host performing a DNS lookup for www.google.com and a DNS server returning the IP address.

Ok, back to scapy. Let's add some code that opens the PCAP and prints a summary of each packet to the console. We also need a way to actually invoke the handler function, so we'll add that as well:

from __future__ import print_function
from scapy.all import rdpcap


def handler(event, context):
    # Load PCAP file
    pcap = rdpcap('wget_google.pcap')

    # Iterate over each packet in the PCAP file
    for pkt in pcap:
        print(pkt.summary())


if __name__ == '__main__':
    handler(event=None, context=None)

Next run the script:

$ python inspect_pcap_local.py
...
Ether / IP / UDP / DNS Qry "www.google.com."
Ether / IP / UDP / DNS Ans ""
...

Technical Debt Item 1 - write unit tests and configure CI

Ether, IP and UDP represent the layers of the packets. See Internet protocol suite for more information about what each of these layers does and how they work together. We want to take a closer look at the Ether layer for now, specifically the source and destination MAC addresses. Let's look up the manufacturer of each MAC address based on the first 24 bits, called the Organizationally Unique Identifier or OUI. Manufacturers of network interfaces are assigned one or more OUIs by the IEEE, each of which is universally unique. This means that every network interface has a completely unique MAC address. This is a simplification, but for our purposes it'll work. The MAC address of the (VirtualBox) host is 08:00:27:71:bc:15. The OUI is the first 24 bits, or 08:00:27, which according to the Wireshark - OUI Lookup page is assigned to "PCS Systemtechnik GmbH".
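The 24-bit prefix extraction is easy to make concrete; the `oui` helper below is my own illustration, not part of the original script:

```python
def oui(mac):
    """Return the first three octets (24 bits) of a colon-separated MAC address."""
    return ':'.join(mac.lower().split(':')[:3])


print(oui('08:00:27:71:bc:15'))  # 08:00:27
```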

To extract the MAC addresses from each packet, we need to look at the Ether layer.

from __future__ import print_function
from scapy.all import rdpcap, Ether


def handler(event, context):
    # Load PCAP file
    pcap = rdpcap('wget_google.pcap')

    # Iterate over each packet in the PCAP file
    for pkt in pcap:
        # Get the source and destination MAC addresses
        src_mac = pkt.getlayer(Ether).src
        dst_mac = pkt.getlayer(Ether).dst
        print('src_mac = {} dst_mac = {}'.format(src_mac, dst_mac))


if __name__ == '__main__':
    handler(event=None, context=None)

This is not robust code. If the packet doesn't have an Ether layer this will raise an unhandled exception.

Technical Debt Item 2 - add proper exception handling

$ python inspect_pcap_local.py
src_mac = 74:d0:2b:93:fb:12 dst_mac = ff:ff:ff:ff:ff:ff
...
src_mac = 08:00:27:71:bc:15 dst_mac = 2c:30:33:e9:3c:a3
...

We can see the source and destination MAC addresses of each packet in the PCAP. Next, we need to put these values into a set so we can query the external API. A set is a better choice than a list because many values will be repeated, and we don't want to query the external API any more than necessary.

Sending a few extra queries may not seem like a big deal, but we are using a public cloud where everything has a cost. Also, we want to be good Internet citizens and not unduly burden a free API.
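A quick illustration of the set choice (the packet list below is made up, reusing addresses seen in the captures above):

```python
# (src, dst) MAC pairs as they might come out of the capture
packets = [
    ('08:00:27:71:bc:15', 'ff:ff:ff:ff:ff:ff'),
    ('08:00:27:71:bc:15', '2c:30:33:e9:3c:a3'),
    ('2c:30:33:e9:3c:a3', '08:00:27:71:bc:15'),
]

macs = set()
for src, dst in packets:
    macs.add(src)  # duplicates are silently collapsed
    macs.add(dst)

print(len(macs))  # 3 unique addresses out of 6 fields
```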

Let's finish up the PCAP inspection part:

from __future__ import print_function
from scapy.all import rdpcap, Ether


def handler(event, context):
    # Load PCAP file
    pcap = rdpcap('wget_google.pcap')
    macs = set()

    # Iterate over each packet in the PCAP file
    for pkt in pcap:
        # Get the source and destination MAC addresses
        src_mac = pkt.getlayer(Ether).src
        dst_mac = pkt.getlayer(Ether).dst

        # Add them to the set of MAC addresses
        macs.add(src_mac)
        macs.add(dst_mac)

    print('Found {} MAC addresses'.format(len(macs)))


if __name__ == '__main__':
    handler(event=None, context=None)

Running the script should show something similar to this:

$ python inspect_pcap_local.py
Found 9 MAC addresses

Now we need to query the external API for each MAC address. I'm a big fan of requests, but our use case is very simple, so we can rely on the built-in urllib2 module. We'll use the API provided by MAC Vendors, which can be queried by appending the MAC address to the end of the http://api.macvendors.com/ URL. Using urllib2, we can perform a lookup like this:

>>> import urllib2
>>> response = urllib2.urlopen('http://api.macvendors.com/08:00:27:71:bc:15')
>>> print(response.getcode())
200
>>> print(response.readline())
PCS Systemtechnik GmbH

The response code should be checked to make sure the lookup was successful: it will be 200 if the vendor was found and 404 if it was not. urllib2 will raise an exception for the 404 case, which should be handled. All output that would normally go to stdout will be written to CloudWatch, and we don't want the logs getting cluttered up with tracebacks when we know that some queries will fail. After adding in the call to the external API, our function should look like this:

from __future__ import print_function
from scapy.all import rdpcap, Ether
import urllib2


def handler(event, context):
    # Load PCAP file
    pcap = rdpcap('wget_google.pcap')
    mac_addresses = set()

    # Iterate over each packet in the PCAP file
    for pkt in pcap:
        # Get the source and destination MAC addresses
        src_mac = pkt.getlayer(Ether).src
        dst_mac = pkt.getlayer(Ether).dst

        # Add them to the set of MAC addresses
        mac_addresses.add(src_mac)
        mac_addresses.add(dst_mac)

    print('Found {} MAC addresses'.format(len(mac_addresses)))

    # Iterate over the set() of MAC addresses
    for mac in mac_addresses:
        # Attempt to look up the manufacturer
        try:
            resp = urllib2.urlopen('http://api.macvendors.com/{}'.format(mac))
            if resp.getcode() == 200:
                vendor_str = resp.readline()
                print('{} is a {} network interface'.format(mac, vendor_str))
        # Handle not found queries
        except urllib2.HTTPError:
            print('The manufacturer for {} was not found'.format(mac))
            continue


if __name__ == '__main__':
    handler(event=None, context=None)

Technical Debt Item 3 - make external API calls in parallel rather than serially

To list the manufacturers of the network interfaces represented in the PCAP, run the script:

$ python inspect_pcap_local.py
Found 9 MAC addresses
...
08:00:27:71:bc:15 is a PCS Systemtechnik GmbH network interface
...
The manufacturer for 33:33:00:00:00:fb was not found
The manufacturer for ff:ff:ff:ff:ff:ff was not found
...

There's our VirtualBox host, but what about the ones that were not found? They're actually "special" MAC addresses that don't belong to a single host: 33:33:00:00:00:fb is used for IPv6 Neighbor Discovery and ff:ff:ff:ff:ff:ff is the broadcast MAC address.

As an aside, only the first 24 bits of the MAC address are needed to look up the manufacturer. These MAC addresses:

f8:bc:12:53:0b:da
f8:bc:12:53:0b:db

are unique and represent two separate network interfaces, but they are both manufactured by Dell Inc. As a result we only need to look up f8:bc:12 once.

Technical Debt Item 4 - lookup each OUI once
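One way to pay down that debt item is a small per-OUI cache. This sketch is my own: the `fake_api` stub stands in for the real HTTP call, and `lookup_vendor` is a hypothetical helper, not part of the original script:

```python
from __future__ import print_function

calls = []  # record every simulated API hit


def fake_api(oui):
    """Stand-in for the macvendors.com lookup."""
    calls.append(oui)
    return 'Dell Inc.' if oui == 'f8:bc:12' else 'Unknown'


vendor_cache = {}


def lookup_vendor(mac):
    oui = mac.lower()[:8]          # first 24 bits, e.g. 'f8:bc:12'
    if oui not in vendor_cache:    # hit the API only once per OUI
        vendor_cache[oui] = fake_api(oui)
    return vendor_cache[oui]


print(lookup_vendor('f8:bc:12:53:0b:da'))  # Dell Inc.
print(lookup_vendor('f8:bc:12:53:0b:db'))  # Dell Inc.
print(len(calls))                          # 1 -- both MACs shared one lookup
```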

The next step is to load a PCAP file from S3. Python-based Lambda functions have the boto3 module available implicitly, but we'll include it explicitly so we can test locally. Downloading a file from S3 is very simple. The example below assumes that the user executing the code has properly configured ~/.aws/config and ~/.aws/credentials files.

>>> import boto3
>>> s3 = boto3.resource('s3')
>>> pcap_file = open('/tmp/temp.pcap', 'wb')
>>> s3.Object('uploaded-pcaps', 'wget_google.pcap').download_file(pcap_file.name)
>>> pcap_file.close()

Let's update our script to use a PCAP file from S3. This is where we start using Lambda's special event argument.

Technical Debt Item 5 - put an upper limit on the size of the PCAP being downloaded from S3

from __future__ import print_function
import json
import os
import urllib
import urllib2

import boto3
from scapy.all import rdpcap, Ether


def handler(event, context):
    # Log the event
    print('Received event: {}'.format(json.dumps(event)))

    # Extract the bucket and key (from AWS 's3-get-object-python' example)
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.unquote_plus(event['Records'][0]['s3']['object']['key'].encode('utf8'))

    try:
        # Create a temporary file
        pcap_file = open('/tmp/temp.pcap', 'wb')

        # Download the PCAP from S3
        s3 = boto3.resource('s3')
        s3.Object(bucket, key).download_file(pcap_file.name)
        pcap_file.close()
    except Exception:
        print('Error getting object {} from the {} bucket'.format(key, bucket))

    # Load PCAP file
    pcap = rdpcap(pcap_file.name)
    mac_addresses = set()

    # Iterate over each packet in the PCAP file
    for pkt in pcap:
        # Get the source and destination MAC addresses
        src_mac = pkt.getlayer(Ether).src
        dst_mac = pkt.getlayer(Ether).dst

        # Add them to the set of MAC addresses
        mac_addresses.add(src_mac)
        mac_addresses.add(dst_mac)

    print('Found {} MAC addresses'.format(len(mac_addresses)))

    # Iterate over the set() of MAC addresses
    for mac in mac_addresses:
        # Attempt to look up the manufacturer
        try:
            resp = urllib2.urlopen('http://api.macvendors.com/{}'.format(mac))
            if resp.getcode() == 200:
                vendor_str = resp.readline()
                print('{} is a {} network interface'.format(mac, vendor_str))
        # Handle not found queries
        except urllib2.HTTPError:
            print('The manufacturer for {} was not found'.format(mac))
            continue

    # Delete the temporary file
    os.remove(pcap_file.name)


if __name__ == '__main__':
    handler(event=None, context=None)

We're now ready to prepare the archive for upload. It's tempting to just zip up the whole project directory, but that would include a lot of unnecessary data. Remember that when we first set up the project directory, we captured the output of pip freeze before and after we installed scapy? Comparing these two files will tell us which modules need to be included in our .zip archive.

    $ diff pre_scapy.txt post_scapy.txt
    3a4
    > scapy==2.3.3

Easy enough: there's just one package we need to include in addition to our script. Lambda won't know about our virtual environment, so we can either set environment variables or "flatten" the .zip archive. I'm sure there is a more elegant way to do this, but it'll do for now. We'll also want to exclude files that end in *.pyc:

    $ cd venv/lib/python2.7/site-packages/
    $ zip -x "*.pyc" -r ../../../../inspect_pcap.zip scapy
    $ cd ../../../../
    $ zip -x "*.pyc" -r inspect_pcap.zip inspect_pcap.py

Technical Debt Item 6 - automate the creation of Lambda .ZIP archive
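A sketch of what item 6 could look like with the stdlib zipfile module, mirroring the manual steps above (the function name and its arguments are assumptions, not an existing tool):

```python
import os
import zipfile


def build_lambda_zip(archive_path, site_packages, packages, scripts):
    """Build a flattened Lambda deployment archive, skipping *.pyc files.

    `site_packages` is the virtualenv's site-packages directory, and
    `packages`/`scripts` are the module directories and top-level files
    to include -- matching the manual zip commands above.
    """
    with zipfile.ZipFile(archive_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for pkg in packages:
            pkg_root = os.path.join(site_packages, pkg)
            for root, _dirs, files in os.walk(pkg_root):
                for name in files:
                    if name.endswith('.pyc'):
                        continue
                    full = os.path.join(root, name)
                    # Store paths relative to site-packages so the
                    # archive is "flattened" as Lambda expects.
                    zf.write(full, os.path.relpath(full, site_packages))
        for script in scripts:
            zf.write(script, os.path.basename(script))
```

Usage would then be a single call, e.g. build_lambda_zip('inspect_pcap.zip', 'venv/lib/python2.7/site-packages', ['scapy'], ['inspect_pcap.py']).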

We're now ready to upload our Lambda function with the scapy module included. Following the instructions here, I created a Lambda function, an S3 trigger and a role. The S3 trigger will send an event when an object is PUT to a specific bucket. A Lambda function has some permissions implicitly included (like writing logs to CloudWatch), but we need to explicitly grant it read-only permissions to S3 using a role. After everything is set up you should be able to see and edit your code, which is very helpful so you don't have to keep re-uploading the .zip archive. If you're going to be making a lot of changes in the UI, I would recommend checking out versions. I configured the test event by clicking the "Actions" button, then "Configure test event". The sample event template used was "S3 PUT", and I modified it for our use:

    {
      "Records": [
        {
          "eventVersion": "2.0",
          "eventTime": "1970-01-01T00:00:00.000Z",
          "requestParameters": {
            "sourceIPAddress": ""
          },
          "s3": {
            "configurationId": "testConfigRule",
            "object": {
              "eTag": "0123456789abcdef0123456789abcdef",
              "sequencer": "0A1B2C3D4E5F678901",
              "key": "wget_google.pcap",
              "size": 1024
            },
            "bucket": {
              "arn": "arn:aws:s3:::mybucket",
              "name": "uploaded-pcaps",
              "ownerIdentity": {
                "principalId": "EXAMPLE"
              }
            },
            "s3SchemaVersion": "1.0"
          },
          "responseElements": {
            "x-amz-id-2": "EXAMPLE123/5678abcdefghijklambdaisawesome/mnopqrstuvwxyzABCDEFGH",
            "x-amz-request-id": "EXAMPLE123456789"
          },
          "awsRegion": "us-east-1",
          "eventName": "ObjectCreated:Put",
          "userIdentity": {
            "principalId": "EXAMPLE"
          },
          "eventSource": "aws:s3"
        }
      ]
    }

Click the blue "Save and Test" button and in a few seconds you should see something like this:
To check the logs once the function is put into production, look at the CloudWatch log group:

We'll address our list of technical debt in the next post:

  1. write unit tests and configure CI
  2. add proper exception handling
  3. make external API calls in parallel rather than serially
  4. lookup each OUI once
  5. put an upper limit on the size of the PCAP being downloaded from S3
  6. automate the creation of Lambda .ZIP archive
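As a preview of items 3 and 4, here is one way they could be tackled together: deduplicate lookups by OUI (the first three octets of a MAC address) and run the remaining HTTP calls in a thread pool, since they are I/O-bound. This is only a sketch; the function names are assumptions, and concurrent.futures is stdlib in Python 3 (a backport exists for the Python 2 this post targets).

```python
from concurrent.futures import ThreadPoolExecutor


def oui(mac):
    """First three octets of a MAC address -- the vendor (OUI) prefix."""
    return ':'.join(mac.lower().split(':')[:3])


def lookup_vendors(mac_addresses, fetch, workers=8):
    """One fetch per unique OUI, run in parallel.

    `fetch` is whatever single-lookup callable you already have (e.g. a
    wrapper around the macvendors.com request in the script above).
    """
    unique_ouis = sorted({oui(mac) for mac in mac_addresses})
    with ThreadPoolExecutor(max_workers=workers) as pool:
        vendors = dict(zip(unique_ouis, pool.map(fetch, unique_ouis)))
    # Map every original MAC back to its (shared) OUI result
    return {mac: vendors[oui(mac)] for mac in mac_addresses}
```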

Thanks for following along, please leave any comments or questions below.

Categories: FLOSS Project Planets

EuroPython: EuroPython 2017 Keynote: Katharine Jarmul

Planet Python - Thu, 2017-05-18 08:23

We are pleased to announce our next keynote speaker for EuroPython 2017: Katharine Jarmul.

About Katharine

Katharine Jarmul is a pythonista and founder of Kjamistan, a data consulting company in Berlin, Germany. She’s been using Python since 2008 to solve and create problems. She helped form the first PyLadies chapter in Los Angeles in 2010, and co-authored an O'Reilly book along with several video courses on Python and data. She enjoys following the latest developments in machine learning, natural language processing and workflow automation infrastructure and is generally chatty and crabby on Twitter, where you can keep up with her latest shenanigans (@kjam).

The Keynote: If Ethics is not None

The history of computing, as it’s often covered in textbooks or talks, remains primarily focused on a series of hardware advancements, architectures, operating systems and software.

In this talk, we will instead explore the history of ethics in computing, touching on the early days of computers in warfare and science, leading up to ethical issues today such as Artificial Intelligence and privacy regulation.


EuroPython 2017 Team
EuroPython Society
EuroPython 2017 Conference


PyCon: Introducing Our 2017 Keystone Sponsor: Intel!

Planet Python - Thu, 2017-05-18 07:17

It has been a trend over the past several years that our top sponsors — the companies who step forward to make the biggest investment in PyCon and its community — tend to be companies that not only use Python for their own development, but who turn around and offer Python as a crucial tool for their own customers. And that is certainly true of PyCon’s biggest sponsor this year.

PyCon 2017’s Keystone Sponsor is Intel Corporation!

Did you see Intel’s booth in the Expo Hall at PyCon 2016 last year? It was a phenomenon. I remember remarking to a fellow volunteer that Intel was making stunningly good use of their space. Their booth was very nearly a small self-contained conference of its own. It featured a large display and space for a speaker to stand, which Intel used to run a busy schedule of quick presentations and tutorials that focused on both Intel hardware and their support tools for developers. There always seemed to be an attentive crowd gathered whenever I would pass by.

Given Intel’s contribution to our Expo Hall last year, I was especially happy when I received word that they are stepping forward as our Keystone sponsor this year.

Intel’s investment in Python is an index of how prominent the language is becoming as a standard tool for data — a startling development for those of us who have traditionally associated computation with arcane compiled languages like Fortran and the C language family. But an easy-to-read and easy-to-write language like Python is of course a perfect fit for professionals who write code not for its own sake, but because they have some bigger job to do.

Whatever data or compute problem a professional is tackling, they really want a programming language that will get out of their way and let them get their work done. They don’t want to be staring at their code because they are hung up on some sharp edge of a language’s syntax or rules. They want a language that is nearly transparent, that lets them look past the code at the problem they are trying to solve, and Python is filling that role for increasing numbers of people.

As Intel has stepped forward to offer their own distribution of Python — which compiles the language and its data libraries to take the best possible advantage of Intel processors and compute cores — it has been heartening to see their engagement with the existing Python community and its standard open-source tools. For example, instead of proffering yet another install mechanism for Python, Intel not only offers support for the standard “pip” installer but has also partnered with Continuum Analytics — a faithful sponsor of PyCon now for more than half a decade — to deliver their Intel Distribution for Python using Continuum’s “conda” install system that is so beloved by scientists.

The range of data problems against which Python is now flung every day is evidenced by the range of data tools that it now supports. Glancing just at Intel’s latest release notes, for example, one sees mention of a whole range of operations from different domains — Fourier transforms, NumPy vector operations, Scikit-learn machine learning optimizations, and even an accelerated neural network library.

We are excited that the elegant and simple Python language has been discovered by data scientists, academics, professionals, and students. And we are excited that Intel has chosen to support PyCon as our 2017 Keynote Sponsor as part of their own effort to make Intel hardware and compute services a standard choice for Python’s ever-widening community. Thank you!


Paul Everitt: Off to PyCon 2017

Planet Python - Thu, 2017-05-18 07:17

On the train to the plane to the plane to the whatever to the hotel to meet up with the PyCharm gang for PyCon. Like everybody else, I measure the passage of time by PyCons and look forward to old friends, familiar faces, and meeting new people.

We’re getting a gang of folks (Guido, Barry, Jim, me) from the original NIST workshop in 1994 together for the keynote panel on Sunday morning. What was going through our heads back then? What did Python do right and wrong? If you have any questions, come by the PyCharm booth, or even better, ask during the panel. It’s going to be interactive and a whole barrel-o’-fun.

Yes, I’m posting this on my ancient WordPress blog. I’m working on a replacement in Sphinx, teaching myself Sphinx extensions and brushing up on Bootstrap 4 etc. But that project is mega-months behind. I’m also experimenting with Hugo on a local girls lacrosse site (stafflax) hosted on Firebase. It’s been a fun learning experience.


Kwin Wayland High DPI Support

Planet KDE - Thu, 2017-05-18 06:29

The current world of high DPI works fine when dealing with a single monitor and only modern apps, but it breaks down with multiple monitors.

What we want to see:

What we need to render:

As well as windows being spread across outputs, we also want the following features to work:

  • Legacy applications to still be readable and usable
  • Mouse speed to be consistent
  • Screenshots to be consistent across screens
  • All toolkits behaving the same through a common protocol

Handling scaling is part of the core wayland protocol and, with some changes in kwin, solves all of these problems.

The system

The system is a bit counter-intuitive, yet at the same time very simple; instead of clients having bigger windows and adjusting all the input and positioning, we pretend everything is normal DPI and kwin just renders the entire screen at twice the size. Clients then provide textures (pictures of their window contents) that are twice the resolution of their window size.

This covers all possible cases:
- we render a 1x window on a 2x screen:
Because kwin scales up all the rendering, the window is shown twice the size, and therefore readable, albeit at standard resolution.

- we render a 2x window on a 1x screen:
The window texture will be downsampled to be the right size.

- we render a 2x window on a 2x screen:
Kwin scales up all the output, so we draw the window at twice the size. However, because the texture is twice as detailed this cancels out
and we end up showing it at the native drawn resolution giving us the high DPI detail.
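A toy model (my own illustration, not actual KWin code) makes it easy to check why all three cases work out: the output's scale multiplies everything up, while a texture that is more detailed than its window size must be scaled down by its buffer scale, so the net factor applied to the client's pixels is the ratio of the two.

```python
def net_texture_scale(window_buffer_scale, output_scale):
    """Net scaling applied to a client texture in the model above.

    kwin scales all rendering up by the output's scale factor, and a
    texture `buffer_scale` times more detailed than the window size is
    scaled down by that factor; the product is what ends up on screen.
    """
    return output_scale / float(window_buffer_scale)


# 1x window on a 2x screen: pixels doubled -> readable, standard DPI
assert net_texture_scale(1, 2) == 2.0
# 2x window on a 1x screen: texture downsampled to half size
assert net_texture_scale(2, 1) == 0.5
# 2x window on a 2x screen: factors cancel -> native high-DPI detail
assert net_texture_scale(2, 2) == 1.0
```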

The changes in KWin are not about adding explicit resizing or input redirection anywhere; but instead about decoupling the assumption between the size of a window or monitor, and its actual resolution.

What changes for app developers?



All the kwin code changes landed in time for Plasma 5.10, but dynamically changing the screen scale exposed some problems elsewhere in the stack. Therefore the main UI has been disabled until, hopefully, Plasma 5.11. This also allows us to expand our testing with users who want to manually edit their kscreen config and opt in.

What about fractional scaling?

Despite Qt having quite good fractional scaling support, the wayland protocol limits itself to integers. This is in the very core protocol and somewhat hard to avoid. However, there's technically nothing stopping kwin from scaling to a different size than it tells the client to scale at... so it's something we can revisit later.

Tags: kde

Piergiorgio Lucidi: Microsoft Build 2017: Open Source and interoperability

Planet Apache - Thu, 2017-05-18 06:18

Last week I joined Microsoft Build 2017 in Seattle together with some members of the Technical Advisory Group (TAG).

I would like to share some of the good points for developers and the Enterprise field in terms of how Microsoft is continuing to adopt Open Source and Open Standards in different ways.
