Feeds

Hook 42: September A11Y (Accessibility) Talk Review

Planet Drupal - Mon, 2017-10-09 21:20

We have all heard about website accessibility and know what it means in a broad sense, but what does website accessibility look like in a practical sense?

This month’s A11Y Talk featured Scott O'Hara from The Paciello Group. In this A11Y Talk, Scott O'Hara addressed questions like:
- How do I get started in a11y?
- How do I get my team to care about it?
- Where does one start in trying to incorporate a11y into the work they or their team produce?
- Who is in charge of a11y at your company anyway?

Categories: FLOSS Project Planets

Justin Mason: Links for 2017-10-09

Planet Apache - Mon, 2017-10-09 19:58
Categories: FLOSS Project Planets

Vincent Fourmond: Define a function with inline Ruby code in QSoas

Planet Debian - Mon, 2017-10-09 18:31
QSoas can read and execute Ruby code directly, while reading command files, or even at the command prompt. For that, just write plain Ruby code inside a ruby...ruby end block. Probably the most useful possibility is to define elaborate functions directly from within QSoas, or, preferably, from within a script; this is an alternative to defining a function in a completely separate Ruby-only file using ruby-run. For instance, you can define a function for plain Michaelis-Menten kinetics with a file containing:

ruby
def my_func(x, vm, km)
  return vm/(1 + km/x)
end
ruby end
This defines the function my_func with three parameters, x, vm and km, with the formula:

vm/(1 + km/x)

You can then test that the function has been correctly defined by running, for instance:

QSoas> eval my_func(1.0,1.0,1.0)
 => 0.5
QSoas> eval my_func(1e4,1.0,1.0)
 => 0.999900009999
This yields the correct answer: the first command evaluates the function with x = 1.0, vm = 1.0 and km = 1.0. For x = km, the result is vm/2 (here 0.5). For x much greater than km, the result is almost vm. You can use the newly defined my_func in any place you would use any Ruby code, such as in the optional argument to generate-buffer, or for arbitrary fits:

QSoas> generate-buffer 0 10 my_func(x,3.0,0.6)
QSoas> fit-arb my_func(x,vm,km)
To redefine my_func, just run the ruby code again with a new definition, such as:
ruby
def my_func(x, vm, km)
  return vm/(1 + km/x**2)
end
ruby end

The previous version is just erased, and all new uses of my_func will refer to your new definition.


See for yourself

The code for this example can be found there. Browse the qsoas-goodies GitHub repository for more goodies!

About QSoas

QSoas is a powerful open source data analysis program that focuses on flexibility and powerful fitting capacities. It is released under the GNU General Public License. It is described in Fourmond, Anal. Chem., 2016, 88 (10), pp 5050–5052. Current version is 2.1. You can download its source code or buy precompiled versions for MacOS and Windows there.

Categories: FLOSS Project Planets

Markus Koschany: My Free Software Activities in September 2017

Planet Debian - Mon, 2017-10-09 18:18

Welcome to gambaru.de. Here is my monthly report that covers what I have been doing for Debian. If you’re interested in  Java, Games and LTS topics, this might be interesting for you.

Debian Games
  • I sponsored a new release of hexalate for Unit193 and icebreaker for Andreas Gnau. The latter is a reintroduction.
  • New upstream releases this month: freeorion and hyperrogue.
  • I backported freeciv and freeorion to Stretch.
Debian Java

Debian LTS

This was my nineteenth month as a paid contributor and I have been paid to work 15.75 hours on Debian LTS, a project started by Raphaël Hertzog. In that time I did the following:

  • From September 18 to September 24 I was in charge of our LTS frontdesk. I triaged bugs in poppler, binutils, kannel, wordpress, libsndfile, libexif, nautilus, libstruts1.2-java, nvidia-graphics-drivers, p3scan, otrs2 and glassfish.
  • DLA-1108-1. Issued a security update for tomcat7 fixing 1 CVE.
  • DLA-1116-1. Issued a security update for poppler fixing 3 CVE.
  • DLA-1119-1. Issued a security update for otrs2 fixing 4 CVE.
  • DLA-1122-1. Issued a security update for asterisk fixing 1 CVE. I also investigated CVE-2017-14099 and CVE-2017-14603. I decided against a backport because the fix was too intrusive and the vulnerable option is disabled by default in Wheezy’s version which makes it a minor issue for most users.
  • I submitted a patch for Debian’s reportbug tool. (#878088) During our LTS BoF at DebConf 17 we came to the conclusion that we should implement a feature in reportbug that checks whether the bug reporter wants to report a regression for a recent security update. Usually the LTS and security teams  receive word from the maintainer or users who report issues directly to our mailing lists or IRC channels. However in some cases we were not informed about possible regressions and the new feature in reportbug shall ensure that we can respond faster to such reports.
  • I started to investigate the open security issues in wordpress and will complete the work in October.
Misc
  • I packaged a new version of xarchiver. Thanks to the work of Ingo Brückl, xarchiver can handle almost all archive formats in Debian now.
QA upload
  • I did a QA upload of xball, an ancient game from the '90s that simulates bouncing balls. It should be ready for another decade at least.

Thanks for reading and see you next time.

Categories: FLOSS Project Planets

Python Data: Text Analytics and Visualization

Planet Python - Mon, 2017-10-09 15:05

For this post, I want to describe a text analytics and visualization technique: a basic keyword extraction mechanism that uses nothing but a word counter to find the top 3 keywords from a corpus of articles that I’ve created from my blog at http://ericbrown.com.  To create this corpus, I downloaded all of my blog posts (~1400 of them) and grabbed the text of each post. Then, I tokenized each post using nltk and various stemming / lemmatization techniques, counted the keywords and took the top 3 keywords.  I then aggregated all keywords from all posts to create a visualization using Gephi.

I’ve uploaded a Jupyter notebook with the full code set for you to replicate this work. You can also get a subset of my blog articles in a CSV file here. You’ll need beautifulsoup and nltk installed. You can install them with:

pip install bs4 nltk

To get started, let’s load our libraries:

import pandas as pd
import numpy as np
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer, PorterStemmer
from string import punctuation
from collections import Counter
from collections import OrderedDict
import re
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
from HTMLParser import HTMLParser
from bs4 import BeautifulSoup

I’m loading warnings here because there’s a warning about BeautifulSoup that we can ignore.

Now, let’s set up some things we’ll need for this work.

First, let’s set up our stop words, stemmers and lemmatizers.

porter = PorterStemmer()
wnl = WordNetLemmatizer()
stop = stopwords.words('english')
stop.append("new")
stop.append("like")
stop.append("u")
stop.append("it'")
stop.append("'s")
stop.append("n't")
stop.append('mr.')
stop = set(stop)

Now, let’s set up some functions we’ll need.

The tokenizer function is taken from here.  If you want to see some cool topic modeling, jump over and read How to mine newsfeed data and extract interactive insights in Python…it's a really good article that gets into topic modeling and clustering…which is something I’ll hit on here as well in a future post.

# From http://ahmedbesbes.com/how-to-mine-newsfeed-data-and-extract-interactive-insights-in-python.html
def tokenizer(text):
    tokens_ = [word_tokenize(sent) for sent in sent_tokenize(text)]

    tokens = []
    for token_by_sent in tokens_:
        tokens += token_by_sent

    tokens = list(filter(lambda t: t.lower() not in stop, tokens))
    tokens = list(filter(lambda t: t not in punctuation, tokens))
    tokens = list(filter(lambda t: t not in [u"'s", u"n't", u"...", u"''", u'``', u'\u2014', u'\u2026', u'\u2013'], tokens))

    filtered_tokens = []
    for token in tokens:
        token = wnl.lemmatize(token)
        if re.search('[a-zA-Z]', token):
            filtered_tokens.append(token)

    filtered_tokens = list(map(lambda token: token.lower(), filtered_tokens))
    return filtered_tokens

Next, I had some HTML in my articles, so I wanted to strip it from my text before doing anything else with it…here’s a class to do that using HTMLParser. I found this code on Stack Overflow.

class MLStripper(HTMLParser):
    def __init__(self):
        self.reset()
        self.fed = []

    def handle_data(self, d):
        self.fed.append(d)

    def get_data(self):
        return ''.join(self.fed)


def strip_tags(html):
    s = MLStripper()
    s.feed(html)
    return s.get_data()

OK – now to the fun stuff. To get our keywords, we need only 2 lines of code. This function does a count and returns said count of keywords for us.

def get_keywords(tokens, num):
    return Counter(tokens).most_common(num)
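To make the output concrete, here is a quick hypothetical example (the token list below is made up for illustration, it is not from the blog corpus):

tokens = ['data', 'python', 'data', 'analysis', 'python', 'data']
print(get_keywords(tokens, 3))
# [('data', 3), ('python', 2), ('analysis', 1)]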

Finally, I created a function to take a pandas dataframe filled with urls/pubdate/author/text and then create my keywords from that. This function iterates over a pandas dataframe (each row is an article from my blog), tokenizes the ‘text’ from each row and returns a pandas dataframe with the keywords, the title of the article and the publication date of the article.

def build_article_df(urls):
    articles = []
    for index, row in urls.iterrows():
        try:
            data = row['text'].strip().replace("'", "")
            data = strip_tags(data)
            soup = BeautifulSoup(data)
            data = soup.get_text()
            data = data.encode('ascii', 'ignore').decode('ascii')
            document = tokenizer(data)
            top_5 = get_keywords(document, 5)

            unzipped = zip(*top_5)
            kw = list(unzipped[0])
            kw = ",".join(str(x) for x in kw)
            articles.append((kw, row['title'], row['pubdate']))
        except Exception as e:
            print e
            #print data
            #break
            pass
        #break
    article_df = pd.DataFrame(articles, columns=['keywords', 'title', 'pubdate'])
    return article_df

Time to load the data and start analyzing. This bit of code loads in my blog articles (found here) and then grabs only the interesting columns from the data, renames them and prepares them for tokenization. Most of this can be done in one line when reading in the CSV file, but I already had this written for another project and just used it as is.

df = pd.read_csv('../examples/tocsv.csv')

data = []
for index, row in df.iterrows():
    data.append((row['Title'], row['Permalink'], row['Date'], row['Content']))

data_df = pd.DataFrame(data, columns=['title', 'url', 'pubdate', 'text'])

Taking the tail() of the dataframe gets us:

Now, we can tokenize and do our word-count by calling our build_article_df function.

article_df = build_article_df(data_df)

This gives us a new dataframe with the top 3 keywords for each article (along with the pubdate and title of the article).

This is quite cool by itself. We’ve generated keywords for each article automatically using a simple counter. Not terribly sophisticated but it works and works well. There are many other ways to do this, but for now we’ll stick with this one. Beyond just having the keywords, it might be interesting to see how these keywords are ‘connected’ with each other and with other keywords. For example, how many times does ‘data’ show up in other articles?

There are multiple ways to answer this question, but one way is by visualizing the keywords in a topology / network map to see the connections between keywords. To do that, we need to do a ‘count’ of our keywords and then build a co-occurrence matrix. This matrix is what we can then import into Gephi to visualize. We could draw the network map using networkx, but it tends to be tough to get something useful from that without a lot of work…using Gephi is much more user friendly.

We have our keywords and need a co-occurrence matrix. To get there, we need to take a few steps to get our keywords broken out individually.

keywords_array = []
for index, row in article_df.iterrows():
    keywords = row['keywords'].split(',')
    for kw in keywords:
        keywords_array.append((kw.strip(' '), row['keywords']))

kw_df = pd.DataFrame(keywords_array).rename(columns={0: 'keyword', 1: 'keywords'})

We now have a keyword dataframe kw_df that holds two columns: keyword and keywords, with keyword holding a single keyword and keywords holding the full comma-separated keyword list of the article it came from.

This doesn’t really make a lot of sense yet, but we need both columns to build a co-occurrence matrix. We do this by iterating over each document's keyword list (the keywords column) and seeing if the keyword is included. If so, we add to our occurrence matrix and then build our co-occurrence matrix.

document = kw_df.keywords.tolist()
names = kw_df.keyword.tolist()

document_array = []
for item in document:
    items = item.split(',')
    document_array.append((items))

occurrences = OrderedDict((name, OrderedDict((name, 0) for name in names)) for name in names)

# Find the co-occurrences:
for l in document_array:
    for i in range(len(l)):
        for item in l[:i] + l[i + 1:]:
            occurrences[l[i]][item] += 1

co_occur = pd.DataFrame.from_dict(occurrences)

Now, we have a co-occurrence matrix in the co_occur dataframe, which can be imported into Gephi to view a map of nodes and edges. Save the co_occur dataframe as a CSV file for use in Gephi (you can download a copy of the matrix here).

co_occur.to_csv('out/ericbrown_co-occurancy_matrix.csv')

Over to Gephi

Now, it’s time to play around in Gephi. I’m a novice in the tool so can’t really give you much in the way of a tutorial, but I can tell you the steps you need to take to build a network map. First, import your co-occurrence matrix CSV file using File -> Import Spreadsheet and just leave everything at the default. Then, in the ‘overview’ tab, you should see a bunch of nodes and connections like the image below.

Network map of a subset of ericbrown.com articles

Next, move down to the ‘layout’ section and select the Fruchterman Reingold layout and push ‘run’ to get the map to redraw. At some point, you’ll need to press ‘stop’ after the nodes settle down on the screen. You should see something like the below.

Network map of a subset of ericbrown.com articles

 

Cool, huh? Now…let’s get some color into this graph. In the ‘appearance’ section, select ‘nodes’ and then ‘ranking’. Select ‘Degree’ and hit ‘apply’. You should see the network graph change and now have some color associated with it. You can play around with the colors if you want, but the default color scheme should look something like the following:

Still not quite interesting though. Where’s the text/keywords? Well…you need to switch over to the ‘preview’ tab to see that. You should see something like the following (after selecting ‘Default Curved’ in the drop-down).

Now that’s pretty cool. You can see two very distinct areas of interest here: ‘Data’ and ‘Canon’…which makes sense since I write a lot about data and share a lot of my photography (taken with a Canon camera).

Here’s a full map of all 1400 of my articles if you are interested.  Again, there are two main clusters around photography and data but there’s also another large cluster around ‘business’, ‘people’ and ‘cio’, which fits with what most of my writing has been about over the years.

There are a number of other ways to visualize text analytics.  I’m planning a few additional posts to talk about some of the more interesting approaches that I’ve used and run across recently. Stay tuned.

If you want to learn more about Text analytics, check out these books:

Text Analytics with Python: A Practical Real-World Approach to Gaining Actionable Insights from your Data 

Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit

Text Mining with R

 

The post Text Analytics and Visualization appeared first on Python Data.

Categories: FLOSS Project Planets

Stack Abuse: Python Generators

Planet Python - Mon, 2017-10-09 14:48
What is a Generator?

A Python generator is a function that produces a sequence of results. It works by maintaining its local state, so that the function can resume again exactly where it left off when called subsequent times. Thus, you can think of a generator as something like a powerful iterator.

The state of the function is maintained through the use of the keyword yield, which has the following syntax:

yield [expression_list]

This Python keyword works much like using return, but it has some important differences, which we'll explain throughout this article.

Generators were introduced in PEP 255, together with the yield statement. They have been available since Python version 2.2.

How do Python Generators Work?

In order to understand how generators work, let's use the simple example below:

# generator_example_1.py
def numberGenerator(n):
    number = 0
    while number < n:
        yield number
        number += 1

myGenerator = numberGenerator(3)

print(next(myGenerator))
print(next(myGenerator))
print(next(myGenerator))

The code above defines a generator named numberGenerator, which receives a value n as an argument, and then defines and uses it as the limit value in a while loop. In addition, it defines a variable named number and assigns the value zero to it.

Calling the "instantiated" generator (myGenerator) with the next() method runs the generator code until the first yield statement, which returns 0 in this case.

Even after returning a value to us, the function then keeps the value of the variable number for the next time the function is called and increases its value by one. So the next time this function is called, it will pick up right where it left off.

Calling the function two more times provides us with the next 2 numbers in the sequence, as seen below:

$ python generator_example_1.py
0
1
2

If we were to have called this generator again, we would have received a StopIteration exception since it had completed and returned from its internal while loop.

This functionality is useful because we can use generators to dynamically create iterables on the fly. If we were to wrap myGenerator with list(), then we'd get back a list of numbers (like [0, 1, 2]) instead of a generator object, which is a bit easier to work with in some applications.
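As a quick sketch of both points (not from the original article), reusing the numberGenerator defined above:

g = numberGenerator(3)
print(list(g))          # [0, 1, 2] -- list() consumes the whole generator

try:
    next(g)             # the generator is already exhausted at this point
except StopIteration:
    print("no more values")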

The Difference Between return and yield

The keyword return returns a value from a function, at which time the function then loses its local state. Thus, the next time we call that function, it starts over from its first statement.

On the other hand, yield maintains the state between function calls, and resumes from where it left off when we call the next() method again. So if yield is called in the generator, then the next time the same generator is called we'll pick right back up after the last yield statement.

Using return in a Generator

A generator can use a return statement, but only without a return value, that is in the form:

return

When the generator finds the return statement, it proceeds as in any other function return.

As the PEP 255 states:

Note that return means "I'm done, and have nothing interesting to return", for both generator functions and non-generator functions.

Let's modify our previous example by adding an if-else clause, which will discriminate against numbers higher than 20. The code is as follows:

# generator_example_2.py
def numberGenerator(n):
    if n < 20:
        number = 0
        while number < n:
            yield number
            number += 1
    else:
        return

print(list(numberGenerator(30)))

In this example, since our generator won't yield any values, the result is an empty list, as the number 30 is higher than 20. Thus, the return statement is working similarly to a break statement in this case.

This can be seen below:

$ python generator_example_2.py
[]

If we had passed a value less than 20, the results would have been similar to the first example.

Using next() to Iterate through a Generator

We can parse the values yielded by a generator using the next() method, as seen in the first example. This method tells the generator to only return the next value of the iterable, but nothing else.

For example, the following code will print on the screen the values 0 to 9.

# generator_example_3.py
def numberGenerator(n):
    number = 0
    while number < n:
        yield number
        number += 1

g = numberGenerator(10)

counter = 0
while counter < 10:
    print(next(g))
    counter += 1

The code above is similar to the previous ones, but calls each value yielded by the generator with the function next(). In order to do this, we must first instantiate a generator g, which is like a variable that holds our generator state.

When the function next() is called with the generator as its argument, the Python generator function is executed until it finds a yield statement. Then, the yielded value is returned to the caller and the state of the generator is saved for later use.

Running the code above will produce the following output:

$ python generator_example_3.py
0
1
2
3
4
5
6
7
8
9

Note: There is, however, a syntax difference between Python 2 and 3. The code above uses the Python 3 version. In Python 2, next() can be used with the previous syntax or with the following syntax:

print(g.next())

What is a Generator Expression?

Generator expressions are like list comprehensions, but they return a generator instead of a list. They were proposed in PEP 289, and have been part of Python since version 2.4.

The syntax is similar to list comprehensions, but instead of square brackets, they use parentheses.

For example, our code from before could be modified using generator expressions as follows:

# generator_example_4.py
g = (x for x in range(10))
print(list(g))

The results will be the same as in our first few examples:

$ python generator_example_4.py
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Generator expressions are useful when using reduction functions such as sum(), min(), or max(), as they reduce the code to a single line. They're also much shorter to type than a full Python generator function. For example, the following code will sum the first 10 numbers:

# generator_example_5.py
g = (x for x in range(10))
print(sum(g))

After running this code, the result will be:

$ python generator_example_5.py
45

Managing Exceptions

One important thing to note is that when generators were first introduced by PEP 255, the yield keyword was not permitted in the try part of a try/finally construct; PEP 342 (Python 2.5) later lifted that restriction. Either way, generators should allocate resources with caution.

However, yield can appear in finally clauses, except clauses, or in the try part of try/except clauses.

For example, we could have created the following code:

# generator_example_6.py
def numberGenerator(n):
    try:
        number = 0
        while number < n:
            yield number
            number += 1
    finally:
        yield n

print(list(numberGenerator(10)))

In the code above, as a result of the finally clause, the number 10 is included in the output, and the result is a list of numbers from 0 to 10. This normally wouldn't happen since the conditional statement is number < n. This can be seen in the output below:

$ python generator_example_6.py
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Sending Values to Generators

Generators have a powerful tool in the send() method for generator-iterators. This method was defined in PEP 342, and is available since Python version 2.5.

The send() method resumes the generator and sends a value that will be used to continue with the next yield. The method returns the new value yielded by the generator.

The syntax is send() or send(value). Without any value, the send method is equivalent to a next() call. This method can also use None as a value. In both cases, the result will be that the generator advances its execution to the first yield expression.

If the generator exits without yielding a new value (like by using return), the send() method raises StopIteration.

The following example illustrates the use of send(). In the first and third lines of our generator, we ask the program to assign the variable number the value previously yielded. In the first line after our generator function, we instantiate the generator, and we generate a first yield in the next line by calling the next function. Thus, in the last line we send the value 5, which will be used as input by the generator, and considered as its previous yield.

# generator_example_7.py
def numberGenerator(n):
    number = yield
    while number < n:
        number = yield number
        number += 1

g = numberGenerator(10)  # Create our generator
next(g)
print(g.send(5))

Note: Because there is no yielded value when the generator is first created, before using send(), we must make sure that the generator yielded a value using next() or send(None). In the example above, we execute the next(g) line for just this reason, otherwise we'd get an error saying "TypeError: can't send non-None value to a just-started generator".

After running the program, it prints on the screen the value 5, which is what we sent to it:

$ python generator_example_7.py
5

The third line of our generator from above also shows a new Python feature introduced in the same PEP: yield expressions. This feature allows the yield clause to be used on the right side of an assignment statement. The value of a yield expression is None, until the program calls the method send(value).
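A small sketch (not from the article) that shows this behaviour: the yield expression evaluates to None when the generator is resumed with next(), and to the sent value when it is resumed with send():

def echo():
    while True:
        received = yield
        print('got:', received)

gen = echo()
next(gen)          # advance to the first yield
gen.send('hi')     # prints: got: hi
next(gen)          # prints: got: None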

Connecting Generators

Since Python 3.3, a new feature allows generators to connect themselves and delegate to a sub-generator.

The new expression is defined in PEP 380, and its syntax is:

yield from <expression>

where <expression> is an expression evaluating to an iterable, which defines the delegating generator.

Let's see this with an example:

# generator_example_8.py
def myGenerator1(n):
    for i in range(n):
        yield i

def myGenerator2(n, m):
    for j in range(n, m):
        yield j

def myGenerator3(n, m):
    yield from myGenerator1(n)
    yield from myGenerator2(n, m)
    yield from myGenerator2(m, m+5)

print(list(myGenerator1(5)))
print(list(myGenerator2(5, 10)))
print(list(myGenerator3(0, 10)))

The code above defines three different generators. The first, named myGenerator1, has an input parameter, which is used to specify the limit in a range. The second, named myGenerator2, is similar to the previous one, but contains two input parameters, which specify the two limits allowed in the range of numbers. After this, myGenerator3 calls myGenerator1 and myGenerator2 to yield their values.

The last three lines of code print on the screen three lists generated from each of the three generators previously defined. As we can see when we run the program below, the result is that myGenerator3 uses the yields obtained from myGenerator1 and myGenerator2, in order to generate a list that combines the previous three lists.

The example also shows an important application of generators: the capacity to divide a long task into several separate parts, which can be useful when working with big sets of data.

$ python generator_example_8.py
[0, 1, 2, 3, 4]
[5, 6, 7, 8, 9]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

As you can see, thanks to the yield from syntax, generators can be chained together for more dynamic programming.

Benefits of Generators
  1. Simplified code

    As seen in the examples shown in this article, generators simplify code in a very elegant manner. This code simplification and elegance are even more evident in generator expressions, where a single line of code replaces an entire block of code.

  2. Better performance

    Generators work on lazy (on-demand) generation of values. This results in two advantages. First, lower memory consumption. However, this memory saving only works to our benefit if we use the generator's values once; if we use the values several times, it may be worthwhile to generate them all at once and keep them for later use (see the memory sketch after this list).

    The on-demand nature of generators also means we may not have to generate values that won't be used, and thus would have been wasted cycles if they were generated. This means your program can use only the values needed without having to wait until all of them have been generated.
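A rough illustration of the memory point above (the exact byte counts depend on the Python build, so treat the numbers in the comments as indicative only):

import sys

as_list = [x * x for x in range(1000000)]
as_gen = (x * x for x in range(1000000))

print(sys.getsizeof(as_list))   # several megabytes: every value is stored
print(sys.getsizeof(as_gen))    # a couple of hundred bytes, regardless of length

print(sum(as_gen))              # values are produced one at a time, on demand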

When to use Generators

Generators are an advanced tool present in Python. There are several programming cases where generators can increase efficiency. Some of these cases are:

  • Processing large amounts of data: generators provide calculation on-demand, also called lazy evaluation. This technique is used in stream processing.
  • Piping: stacked generators can be used as pipes, in a manner similar to Unix pipes (see the sketch after this list).
  • Concurrency: generators can be used to generate (simulate) concurrency.
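As a small sketch of the piping idea (the stages and the data below are made up for illustration), each stage is a generator that consumes the previous one, so values flow through the whole pipeline one at a time, much like a Unix pipe:

def read_numbers(lines):
    for line in lines:
        yield int(line)

def only_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            yield n

def squared(numbers):
    for n in numbers:
        yield n * n

lines = ['1', '2', '3', '4', '5', '6']   # stands in for a file object
pipeline = squared(only_even(read_numbers(lines)))
print(list(pipeline))                    # [4, 16, 36]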
Wrapping Up

Generators are a type of function that generates a sequence of values. As such they can act in a similar manner to iterators. Their use results in more elegant code and improved performance.

These aspects are even more evident in generator expressions, where one line of code can summarize a sequence of statements.

Generators' working capacity has been improved with new methods, such as send(), and enhanced statements, such as yield from.

As a result of these properties, generators have many useful applications, such as generating pipes, concurrent programming, and helping in creating streams from large amounts of data.

As a consequence of these improvements, Python is becoming more and more the language of choice in data science.

What have you used generators for? Let us know in the comments!

Categories: FLOSS Project Planets

Reuven Lerner: Announcing: Three new live courses, to level up your Python and Git skills

Planet Python - Mon, 2017-10-09 13:40
  • Confused by Python dicts, or wondering how you can take advantage of them in your programs?
  • Do you wonder how Python functions work, and how you can make them more “Pythonic,” and easier to maintain?
  • Do you wonder why everyone raves about Git, when it seems impossibly hard to understand?  Cloning, pulling, and pushing mostly work… but when they don’t, Git seems like magic, and not the good kind.

If any (or all) of the above is true, then you’ll likely be interested in one or more of the live, online classes I’m teaching later this month:

  1. Python dictionaries, on Wednesday, October 25
  2. Python functions, on Thursday, October 26
  3. Understanding Git, on Tuesday, October 31 and Wednesday, November 1

Each of these classes is live, with tons of live-coding demos, exercises, and time for Q&A. My goal is for you to understand these technologies, how they work, and (most importantly) how you can use them effectively in your work.

Previous classes have been small and highly interactive. These are the same classes I give to some of the best-known companies in the world, such as Apple, Cisco, IBM, PayPal, VMWare, and Western Digital; I’m sure that you’ll enjoy yourself, and come out a better engineer.

Better yet: Buy a ticket by this Friday, and you’ll get a substantial (20%) discount on the ticket price.

This is not a recorded class (although recordings will be available later on).  I’ll be speaking and interacting the entire time, giving you a chance to get your questions answered.  I want to make sure you really understand what’s going on, and will answer any questions you have!

Speaking of which: If you have questions, just e-mail me at reuven@lerner.co.il, and I’ll do my best to answer.

And if you’re a student, ask me for a coupon code that will give you a substantial discount off of the ticket price.

I hope that you can join me for one or more of these classes!

The post Announcing: Three new live courses, to level up your Python and Git skills appeared first on Lerner Consulting Blog.

Categories: FLOSS Project Planets

Palantir: Drupal 8 is Great for Showing Solutions Quickly

Planet Drupal - Mon, 2017-10-09 12:45
#D8isGr8 | Luke Wertz | Oct 9, 2017

The #D8isGr8 blog series will focus on why we love Drupal 8 and how it provides solutions for our clients. This post in the series comes from Luke Wertz, Solution Architect.


We often work on projects with clients who are juggling strict timelines and multiple stakeholders. From the time a vendor is selected, to contract signing and project kick-off meetings, it can sometimes be a whole month before our production team is able to really dig into a new project.

The thing I love about Drupal 8 is that it gives us the ability to skip parts of the prototyping phase and get into rapid proof of concept work very quickly. We can quickly demonstrate to our clients the problem space they’re working in and a potential solution. Drupal 8 allows us to get there quickly without writing a lot of code, which means our client product owners are able to show progress to their stakeholders sooner.

This proof of concept work is enabled by the functionality that is now baked into Drupal 8 core. In previous versions of Drupal, Views was a contrib module. A lot of how Views functions in Drupal 8 is the same as before, but that extra step of having to install, deploy, and configure it has been removed.

The ability to show value to a client early and quickly is reflective of Palantir’s move to Agile development. We are a data-driven company, and we like to use quantitative methods to prove our value to our clients. Drupal 8 helps us to iterate rapidly: have an idea, quickly show how it might work, test it, and prove it.

Categories: FLOSS Project Planets

Ben Hutchings: Debian LTS work, September 2017

Planet Debian - Mon, 2017-10-09 12:25

I was assigned 15 hours of work by Freexian's Debian LTS initiative and carried over 6 hours from August. I only worked 12 hours, so I will carry over 9 hours to the next month.

I prepared and released another update on the Linux 3.2 longterm stable branch (3.2.93). I then rebased the Debian linux package onto this version, added further security fixes, and uploaded it (DLA-1099-1).

Categories: FLOSS Project Planets

Will McGugan: Lifting the Curtain on Asyncio Magic - Part 1

Planet Python - Mon, 2017-10-09 12:13

The new asyncio module introduced in Python3.4 is a nice addition to the Python standard library, especially when used with the async and await keywords introduced in Python3.5.

If you have read the official docs, you should hopefully be able to write high performance network servers and clients with asyncio. But if something goes wrong, you may find it hard to debug. This is no reflection on asyncio; concurrency and IO are deceptively complex things, and asyncio is the point where they meet.

I suspect that the problem with debugging asyncio projects is that async code doesn't run itself. You are dependent on a rather large body of code to run your code for you, and the distinction between your code and library code is far more blurry than you may be used to.

Normally when trying to understand code in the standard library I would recommend reading the source. The Python standard library code is on GitHub these days, which makes that super easy. For the most part, the standard library code is not too intimidating--it's written by Python developers like yourself, and I would absolutely encourage developers at any level to read through it some time. But the asyncio code is a different beast; reading that might leave you happy to consider it magic written by wizards.

Unfortunately this means that acquiring the mental model you will need to tackle bugs in your async code isn't easy. Which is why I have put together a working example of a network server using async techniques but without using asyncio or any other async framework.

The project is asyncchat which runs a chat server you can connect to with telnet. The server is implemented in a single file. My hope is that it is small enough to be easily digestible but complete enough not to leave too much as an exercise for the reader.

Because the chat server uses similar techniques to asyncio, if you can grasp how chatserver.py works, then you will have a good grasp on how async is implemented in general.
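To give a rough idea of the technique (this is my own minimal sketch using the standard selectors module, not the author's chatserver.py), a single-threaded echo server can multiplex many clients without threads and without asyncio:

import selectors
import socket

sel = selectors.DefaultSelector()

def accept(server_sock):
    conn, addr = server_sock.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, read)

def read(conn):
    data = conn.recv(1024)
    if data:
        conn.send(data)            # echo the bytes back to the client
    else:
        sel.unregister(conn)       # an empty read means the client disconnected
        conn.close()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(('127.0.0.1', 2323))
server.listen(100)
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, accept)

while True:
    for key, _ in sel.select():    # wait until some socket is ready
        callback = key.data        # accept() for the listener, read() for clients
        callback(key.fileobj)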

Running chatserver.py

The chat server code is Python3.6 only (for the love of f strings). To run it, check out the asyncchat repos and run the following command in a terminal:

python3.6 chatserver.py

In another terminal, run the following:

telnet 127.0.0.1 2323

The telnet command works on Linux / OSX. On Windows you may have to open your favourite telnet client.

Run the telnet command again in yet another terminal, and you should be able to send messages back and forth between the two terminals.

If you want to test it over a network, you can run the server with the following:

sudo python3.6 asyncchat.py 0.0.0.0 23

Now you should be able to exchange messages over a network, by telnetting to the server's IP address. It will also work over the internet, assuming you know how to set up your router to do that.

© 2017 Will McGugan

An asyncchat.py session with two telnet clients. The server can support a large number of clients without multiple threads or processes.

In Part 2

After I've received any feedback, I'll post about how asyncchat.py works and dissect it line by line. Please do let me know which areas I should focus on.

Categories: FLOSS Project Planets

Michal Čihař: Better access control in Weblate

Planet Debian - Mon, 2017-10-09 12:00

Upcoming Weblate 2.17 will bring improved access control settings. Previously this could be controlled only by server admins, but now the project visibility and access presets can be configured.

This allows you to better tweak access control for your needs. There is an additional choice of making the project public but restricting translations, which has been requested by several projects.

You can see the possible choices on the UI screenshot:

On Hosted Weblate this feature is currently available only to commercial hosting customers. Projects hosted for free are limited to public visibility only.

Filed under: Debian English SUSE Weblate

Categories: FLOSS Project Planets

aleksip.net: Reasons for choosing standards-based technologies

Planet Drupal - Mon, 2017-10-09 11:12
The recent announcement that Drupal is looking to adopt React has inspired me to live up to my Twitter bio, and be an active advocate for open standards-based technologies. While my knee-jerk reaction to the announcement was to focus on React, this blog post approaches the topic of adopting technologies in a more general manner, while still aiming to contribute to the current front-end framework discussion.
Categories: FLOSS Project Planets

Antonio Terceiro: pristine-tar updates

Planet Debian - Mon, 2017-10-09 11:06
Introduction

pristine-tar is a tool that is present in the workflow of a lot of Debian people. I adopted it last year after it was orphaned by its creator, Joey Hess. A little after that, Tomasz Buchert joined me and we are now a functional two-person team.

pristine-tar's goals are to import the content of a pristine upstream tarball into a VCS repository, and to be able to later reconstruct that exact same tarball, bit by bit, based on the contents in the VCS, so we don’t have to store a full copy of that tarball. This is done by storing a binary delta file which can be used to reconstruct the original tarball from a tarball produced with the contents of the VCS. Ultimately, we want to make sure that the tarball that is uploaded to Debian is exactly the same as the one that has been downloaded from upstream, without having to keep a full copy of it around if all of its contents are already extracted in the VCS anyway.

The current state of the art, and perspectives for the future

pristine-tar solves a wicked problem, because our ability to reconstruct the original tarball is affected by changes in the behavior of tar and of all of the compression tools (gzip, bzip2, xz) and by what exact options were used when creating the original tarballs. Because of this, pristine-tar currently has a few embedded copies of old versions of compressors to be able to reconstruct tarballs produced by them, and also relies on an ever-evolving patch to tar that has been carried in Debian for a while.

So basically keeping pristine-tar working is a game of Whac-A-Mole. Joey provided a good summary of the situation when he orphaned pristine-tar.

Going forward, we may need to rely on other ways of ensuring integrity of upstream source code. That could take the form of signed git tags, signed uncompressed tarballs (so that the compression doesn’t matter), or maybe even a different system for storing actual tarballs. Debian bug #871806 contains an interesting discussion on this topic.

Recent improvements

Even if keeping pristine-tar useful in the long term will be hard, too much of Debian work currently relies on it, so we can’t just abandon it. Instead, we keep figuring out ways to improve. And I have good news: pristine-tar has recently received updates that improve the situation quite a bit.

In order to be able to understand how much better we are getting at it, I created a visualization of the regression test suite results. With the help of data from there, let’s look at the improvements made since pristine-tar 1.38, which was the version included in stretch.

pristine-tar 1.39: xdelta3 by default.

This was the first release made after the stretch release, and made xdelta3 the default delta generator for newly-imported tarballs. Existing tarballs with deltas produced by xdelta are still supported, this only affects new imports.

The support for having multiple delta generators was written by Tomasz, and was already there since 1.35, but we decided to only flip the switch after xdelta3 support was available in a stable release.

pristine-tar 1.40: improved compression heuristics

pristine-tar uses a few heuristics to produce the smallest delta possible, and this includes trying different compression options. In this release Tomasz included a contribution by Lennart Sorensen to also try the --gnu option, which greatly improved the support for rsyncable gzip-compressed files. We can see an example of the type of improvement we got in the regression test suite data for delta sizes for faad2_2.6.1.orig.tar.gz:

pristine-tar 1.41: support for signatures

This release saw the addition of support for storage and retrieval of upstream signatures, contributed by Chris Lamb.

pristine-tar 1.42: optionally recompressing tarballs

I had this idea and wanted to try it out: most of our problems reproducing tarballs come from tarballs produced with old compressors, or from changes in compressor behavior, or from uncommon compression options being used. What if we could just recompress the tarballs before importing them? Yes, this kind of breaks the “pristine” bit of the whole business, but on the other hand, 1) the contents of the tarball are not affected, and 2) even if the initial tarball is not bit by bit the same as the upstream release, at least future uploads of that same upstream version with Debian revisions can be regenerated just fine.

In some cases, as is the case for the test tarball util-linux_2.30.1.orig.tar.xz, recompressing is what makes reproducing the tarball (and thus importing it with pristine-tar) possible at all:

In other cases, if the current heuristics can’t produce a reasonably small delta, recompressing makes a huge difference. It’s the case for mumble_1.1.8.orig.tar.gz:

Recompressing is not enabled by default, and can be enabled by passing the --recompress option. If you are using pristine-tar via a wrapper tool like gbp-buildpackage, you can use the $PRISTINE_TAR environment variable to set options that will affect any pristine-tar invocations.

Also, even if you enable recompression, pristine-tar will only try it if the delta generation fails completely, or if the delta produced from the original tarball is too large. You can control what “too large” means by using the --recompress-threshold-bytes and --recompress-threshold-percent options. See the pristine-tar(1) manual page for details.

Categories: FLOSS Project Planets

Continuum Analytics Blog: Strata Data Conference Grows Up

Planet Python - Mon, 2017-10-09 10:31
The Strata conference will always hold a place in my heart, as it’s one of the events that inspired Travis and me to found Anaconda. We listened to open source-driven talks about data lakes and low-cost storage and knew there would be a demand for tools to help organizations and data scientists derive value from these mountains of information.
Categories: FLOSS Project Planets

Michal Čihař: stardicter 1.1

Planet Debian - Mon, 2017-10-09 09:15

Stardicter 1.1, the set of scripts to convert some freely available dictionaries to StarDict format, has been released today. The biggest change is that it will also keep source data together with generated dictionaries. This is good for licensing reasons and will also make it possible to actually build these as packages within Debian.

Full list of changes:

  • Various cleanups for first stable release.
  • Fixed generating of README for dictionaries.
  • Added support for generating source tarballs.
  • Fixed installation on systems with non utf-8 locale.

As usual, you can install from pip, download source or download generated dictionaries from my website. The package should be soon available in Debian as well.

Filed under: Debian English StarDict

Categories: FLOSS Project Planets

KTextEditorPreviewPlugin 0.2.0

Planet KDE - Mon, 2017-10-09 09:09

KTextEditorPreviewPlugin 0.2.0 has been released.

The KTextEditorPreviewPlugin software provides the KTextEditor Document Preview Plugin, a plugin for the editor Kate, the IDE KDevelop, or other software using the KTextEditor framework.

The plugin enables a live preview of the currently edited text document in the final format, in the sidebar (Kate) or as tool view (KDevelop). So when editing e.g. a Markdown text or an SVG image, the result is instantly visible next to the source text. For the display the plugin uses that KParts plugin which is currently selected as the preferred one for the MIME type of the document. If there is no KParts plugin for that type, no preview is possible.

Download from:
https://download.kde.org/stable/ktexteditorpreviewplugin/0.2.0/src

sha256:
ab54382dfd8e88247b53b72fdd9b259feb7c0266300b604db899edf0828677ae ktexteditorpreviewplugin-0.2.0.tar.xz

Signed with my new PGP key
E191 FD5B E6F4 6870 F09E 82B2 024E 7FB4 3D01 5474
Friedrich W. H. Kossebau
ktexteditorpreviewplugin-0.2.0.tar.xz.sig

Changes since 0.1.0
  • Add dropdown menu to toolbar with the main menu of the KParts plugin
  • Add About dialog for the currently used KParts plugin (invokable from the new dropdown menu)
Notes

Long term the plan is to merge this plugin into the Kate repository, or some new separate KTextEditor-Plugins repo, ideally already for KDE Applications 17.12.

For now though this plugin is in its own repository to allow an initial independent quick release cycle phase, following the release-often-and-early mantra. With the help of your feedback (file your issue), the plugin should rather soon have the features you would like it to have.

Developers: Improve your favourite KParts plugin

While a usual KParts plugin works out of the box, for a perfect experience with the Automatic Updating option some further improvements might be needed:

A few KParts plugins have already seen such adaptations, like the SVGPart and the KUIViewerPart (see also blog post), adaptations to be released with KDE Applications 17.12.
Another KParts plugin has been written with that in mind from the start, the KMarkdownWebViewPart (see also blog post), which already has been released.

You might want to take some guidance from the respective commit “Support loading by stream and restoring state on reload” to the SVGPart repository.


Categories: FLOSS Project Planets

Doug Hellmann: Get “The Python 3 Standard Library by Example” for $20 off this week

Planet Python - Mon, 2017-10-09 09:08
My book, The Python 3 Standard Library by Example, is the InformIT.com eBook Deal of the Week this week. Save $20 off the list price by purchasing directly from the publisher at http://informit.com/deals before 14 October.
Categories: FLOSS Project Planets

Doug Hellmann: linecache — Read Text Files Efficiently — PyMOTW 3

Planet Python - Mon, 2017-10-09 09:00
The linecache module is used within other parts of the Python standard library when dealing with Python source files. The implementation of the cache holds the contents of files, parsed into separate lines, in memory. The API returns the requested line(s) by indexing into a list, and saves time over repeatedly reading the file …
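A quick illustration of that API (the file path below is just an example, any readable text file works):

import linecache

# Line numbers are 1-based; missing lines return an empty string.
print(linecache.getline('/etc/hosts', 1))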
Categories: FLOSS Project Planets

Valuebound: Step-by-step guide to Foundation framework to develop responsive web applications

Planet Drupal - Mon, 2017-10-09 08:55

Creating responsive websites has always remained a challenge for many; even I faced similar difficulties in the beginning. Recently, our team came across a situation where we had to design a responsive and beautiful website in Drupal 8 for a media and publishing firm. In order to create such an amazing site, we came up with the idea to use the Foundation framework, and yes, it worked!

I have written this blog to help anyone having difficulty in understanding the Foundation framework to develop responsive websites, as it is a rising market demand. My idea is that this article will be a "living laboratory" to help you in understanding Foundation from scratch. The post comprises an intro to Foundation, its features, comparison…

Categories: FLOSS Project Planets

Mike Driscoll: PyDev of the Week: Giampaolo Rodola’

Planet Python - Mon, 2017-10-09 08:30

This week we welcome Giampaolo Rodola’ (@grodola) as our PyDev of the Week! Giampaolo is the creator and maintainer of the psutil project as well as the pyftpdlib and pysendfile packages. He has also been a maintainer of the asyncore and asynchat stdlib modules. You can check out some of his work over on Github or check out his blog! Let’s take some time to get to know our fellow Pythonista better!

Can you tell us a little about yourself (hobbies, education, etc):

I am a Python freelancer working remotely. I am Italian and I am currently based in Prague. My main hobbies are programming and music (I play guitar and I enjoy singing). I also enjoy traveling and have been semi-nomadic for the last 3.5 years. As such I may also work and travel at the same time, although if I get a long gig I prefer to stay in Prague. I have no formal IT education. I learned programming because it was fun and I ended up paying the rent with it by accident.

Why did you start using Python?

Initially (early 2000s) I was fascinated by hacking and the underground hacking scene. I remember I read lots of articles and hacking e-zines. I was particularly interested in networking and was fascinated by vulnerabilities which could be exploited via the network. I had a huge limit though: I didn’t know any programming language, and those articles often contained code samples and exploits which were written in C. I was stuck, so after a while I finally decided I would learn C. I started reading a tutorial but I ended up being discouraged pretty soon because I just couldn’t wrap my head around the language (pointers in particular). I didn’t give up and started looking for an easier language to learn and I eventually ended up on a Python tutorial (I don’t remember how exactly). Contrary to C, I found Python extremely pleasant and easy to learn and I immediately fell in love with it because I realized I could write easy programs which “did something”, and I enjoyed doing it! I wasn’t really sure what I was doing but I focused my efforts around (RAW) sockets and libpcap, because that was what I needed to realize my idea (a backdoor). After a couple of months I released my first hack-related tool: a remote shell using ICMP instead of TCP. The second tool was a port knocker, which is basically a sniffer which listens for combinations of packets. Funnily, shortly after the release of these two hack toys, I gradually started losing interest in hacking but I still liked networking and sockets, so I decided I would write a server; I didn’t care which one. I ended up bumping into an RFC about the FTP protocol (RFC-959), I kind of understood it (even though it was horrible =)) so I decided I would give it a try and ended up writing my first server. It was an end-user command line tool which was using threads for concurrency. I later rewrote it from scratch by using an event-driven approach. It was also no longer an end-user program but a library intended for other developers: pyftpdlib. That was my first “official” open source project including a VCS, a bug tracker and (some time later) an actual user base.

What other programming languages do you know and which is your favorite?

I know C, mainly because I had to learn it for psutil, but I’m not particularly good at it, nor do I particularly like it.

Categories: FLOSS Project Planets