Planet Python

Planet Python - http://planetpython.org/

PyCoder’s Weekly: Issue #378 (July 23, 2019)

1 hour 42 min ago

#378 – JULY 23, 2019
View in Browser »

Create a Flask Application With Google Login

In this step-by-step tutorial, you’ll create a Flask application that lets users sign in using their Google login. You’ll learn about OAuth 2 and OpenID Connect and also find out how to implement some code to handle user session management.
REAL PYTHON

Guido on PEG Parsers for Python

“Some years ago someone asked whether it would make sense to switch Python to a PEG parser. […] I looked into it a bit and wasn’t sure what to think, so I dropped the subject. Recently I’ve learned more about PEG (Parsing Expression Grammars), and I now think it’s an interesting alternative to the home-grown parser generator that I developed 30 years ago when I started working on Python.”
GUIDO VAN ROSSUM

Save 40% on Your Order at manning.com

Take the time to learn something new! Manning Publications are offering 40% off everything at manning.com, including everything Pythonic. Just enter the code pycoders40 at the cart before you checkout to save →
MANNING PUBLICATIONS sponsor

Python 2.x Support in Pip Going Forward

“pip will continue to ensure that it runs on Python 2.7 after the CPython 2.7 EOL date. Support for Python 2.7 will be dropped, if bugs in Python 2.7 itself make this necessary (which is unlikely) or Python 2 usage reduces to a level where pip maintainers feel it is OK to drop support. The same approach is used to determine when to drop support for other Python versions.”
PYPA.IO

Logging in Python

Python provides a logging system as a part of its standard library, so you can quickly add logging to your application. In this course, you’ll learn why using this module is the best way to add logging to your application as well as how to get started quickly, and you will get an introduction to some of the advanced features available.
REAL PYTHON video

Understand How Celery Works by Building a Clone

“A delayed job processor (also called a background processor, asynchronous task queue, etc.) is a software system that can run code at a later time. Examples of such software includes Celery, Resque, Sidekiq, and others. In this post we will try and understand how these things work by building a clone/replica of such software.”
KOMU WAIRAGU • Shared by Komu Wairagu

Simplify Your Python Developer Environment

Three tools (pyenv, pipx, pipenv) make for smooth, isolated, reproducible Python developer and production environments.
MASON EGGER • Shared by Python Bytes FM

Discussions

Exploring Best Practices for Upcoming Python 3.8 Features

“As a Python 3.8 learning exercise, I’m using the walrus operator, / notation, and the f-string = specifier at every opportunity and then evaluating the result for clarity.”
RAYMOND HETTINGER
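
For readers who have not yet tried these Python 3.8 features, here is a small illustrative sketch (not taken from the discussion itself) of the walrus operator, the positional-only / notation, and the f-string = specifier:

# Requires Python 3.8+

def clamp(value, low, high, /):   # `/` makes these parameters positional-only
    return max(low, min(value, high))

data = [3, 14, 15, 92, 65]
if (n := len(data)) > 3:          # walrus operator: assign and test in one expression
    print(f"{n=}")                # f-string `=` specifier prints "n=5"
print(clamp(100, 0, 50))          # 50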

Using NumPy With Pandas Without Importing NumPy

“Want to use NumPy without importing it? You can access all of its functionality from within pandas…”
TWITTER.COM/JUSTMARKHAM
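
The tip refers to the pandas.np shortcut that pandas exposed at the time (it has since been deprecated and removed in newer releases), so treat the sketch below as a historical curiosity rather than current practice:

import pandas as pd

# pandas (circa 0.25) re-exported NumPy as `pd.np`, so NumPy routines
# could be reached without an explicit `import numpy`.
arr = pd.np.arange(5)
print(pd.np.sqrt(arr))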

Writing the Digits of Pi Backwards…in Python?

;-)
REDDIT

Python Jobs

Software Engineering Lead, Python (Houston, TX)

SimpleLegal

Python Software Engineer (Multiple US Locations)

Invitae

Python Software Engineer (Munich, Germany)

Stylight GmbH

Senior Back-End Web Developer (Vancouver, BC)

7Geese

Lead Data Scientist (Buffalo, NY)

Utilant LLC

Python Developer (Remote)

418 Media

Sr. Python Engineer (Arlington, VA)

Public Broadcasting Service

Senior Backend Software Engineer (Remote)

Close

Data Engineer (Munich, Germany)

Stylight GmbH

More Python Jobs >>>

Articles & Tutorials

Making Python Classes More Modular Using Mixins

“In this article I want to discuss mixins: what they are, how they work, and when they’re useful. Hopefully after reading this brief article you will be ready to experiment with this pattern yourself in your own projects.”
ALEKSEY BILOGUR • Shared by Aleksey Bilogur
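
As a taste of the pattern the article discusses, here is a minimal, generic mixin sketch (illustrative only, not code from the article):

import json

class JsonSerializableMixin:
    """Mixin that adds JSON serialization to any class with a __dict__."""
    def to_json(self):
        return json.dumps(self.__dict__)

class User(JsonSerializableMixin):
    def __init__(self, name, email):
        self.name = name
        self.email = email

print(User("ada", "ada@example.com").to_json())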

Keras Learning Rate Schedules and Decay

In this tutorial, you will learn about learning rate schedules and decay using Keras. You’ll learn how to use Keras’ standard learning rate decay along with step-based, linear, and polynomial learning rate schedules.
ADRIAN ROSEBROCK
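
As a rough idea of what the tutorial covers, a custom step-based decay schedule can be plugged into Keras through the LearningRateScheduler callback; the numbers below are arbitrary illustrative values, not the tutorial's settings:

from keras.callbacks import LearningRateScheduler

INIT_LR = 0.01    # illustrative starting learning rate
FACTOR = 0.5      # multiply the rate by this factor...
DROP_EVERY = 10   # ...every DROP_EVERY epochs

def step_decay(epoch):
    # Step-based decay: halve the learning rate every DROP_EVERY epochs.
    return INIT_LR * (FACTOR ** (epoch // DROP_EVERY))

lr_callback = LearningRateScheduler(step_decay)
# model.fit(X, y, epochs=50, callbacks=[lr_callback])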

Safely Roll Out New Features in Python With Optimizely Rollouts

Tired of code rollbacks, hotfixes, or merge conflicts? Instantly turn on or off features in production. Comes with unlimited collaborators and feature flags. Embrace safer CI/CD releases with SDKs for Python and all major platforms →
OPTIMIZELY sponsor

How to Use np.arange()

In this step-by-step tutorial, you’ll learn how to use the NumPy arange() function, which is one of the routines for array creation based on numerical ranges. np.arange() returns arrays with evenly spaced values.
REAL PYTHON
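
For a quick preview of what the tutorial covers, np.arange() works much like the built-in range() but returns a NumPy array and accepts non-integer steps:

import numpy as np

print(np.arange(5))               # [0 1 2 3 4]
print(np.arange(1, 10, 2))        # [1 3 5 7 9]
print(np.arange(0.0, 1.0, 0.25))  # [0.   0.25 0.5  0.75]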

Protecting the Future of Python by Hunting Black Swans

An interview with Russell Keith-Magee about identifying potential black swan events for the Python ecosystem and how to address them for the future of the language and its community.
PYTHONPODCAST.COM podcast

Writing Sustainable Python Scripts

A standalone Python script can come with a discoverable interface, documentation, and some tests to keep it useful a year later.
VINCENT BERNAT

Decoupling Database Migrations From Server Startup: Why and How

Why running migrations on application startup is a bad idea (potential database corruption & downtime) and what to do instead.
ITAMAR TURNER-TRAURING

Train to a Guaranteed Machine Learning Job

1:1 personalized guidance with your own machine learning expert. Learn through a curated curriculum designed by hiring managers. Get career support and insider connections to jobs with an online, self-paced course. Get a machine learning job or your money back.
SPRINGBOARD sponsor

Let’s Build a Simple Interpreter: Recognizing Procedure Calls

Part 16 in Ruslan’s amazing tutorial series on building a scripting language interpreter using Python, from scratch.
RUSLAN SPIVAK

Practical Production Python Scripts

A step-by-step refactoring journey from simple fizzbuzz script to a cleaned up and “production-ready” piece of code.
DAN CONNOLLY

The Invasion of Giant Pythons Threatening Florida

Python is eating the world! Related discussion on Hacker News.
SMITHSONIANMAG.COM

Extensive Python Testing on Travis CI

Testing open-source Python on several operating systems using Travis for continuous integration.
SHAY PALACHY

List of Python Anti-Patterns and Worst Practices

QUANTIFIEDCODE.COM

Estimating Pi in Python

CHRIS WEBB

Multipage PDF to JPEG Image Conversion in Python

M. TARIK YURT

5 Common Beginner Mistakes in Python

DEEPSOURCE.IO

Ranking Capitals by the Number of Starbucks Using Python and Google Maps API

ARTEM RYS

Projects & Code

Pdb++: Fancy Pdb (The Python Debugger)

GITHUB.COM/PDBPP

preper: Persian (Farsi) Preprocessing Python Module

GITHUB.COM/ALIIE62 • Shared by Ali Hosseini

pyorbs: Tool for Convenient Python Virtual Environment Management

WBRP.CH

pyon: The Pythonic Way to Use JSON

GITHUB.COM/LAGMOELLERTIM

snoop: Python Debugging Library Designed for Maximum Convenience

GITHUB.COM/ALEXMOJAKI

StanfordNLP: Python NLP Library for Many Human Languages

STANFORDNLP.GITHUB.IO

dlint: Robust Static Analysis for Python

DUO.COM

pygraphblas: Graph Processing With Python and GraphBLAS

GITHUB.COM/MICHELP

Events

PythOnRio Meetup

July 27, 2019
PYTHON.ORG.BR

Python Sheffield

July 30, 2019
GOOGLE.COM

Heidelberg Python Meetup

July 31, 2019
MEETUP.COM

PiterPy Breakfast

July 31, 2019
TIMEPAD.RU

Reunión Python Valencia

August 1, 2019
GOOGLE.COM

DjangoCon AU 2019

August 2 to August 3, 2019
DJANGOCON.COM.AU

PyCon AU 2019

August 2 to August 7, 2019
PYCON-AU.ORG

Happy Pythoning!
This was PyCoder’s Weekly Issue #378.
View in Browser »

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

Categories: FLOSS Project Planets

Codementor: Python zip function tutorial (Simple Examples)

6 hours 6 min ago
Learn how to join multiple iterables into one single object using the Python zip function. Learn how to zip lists and matrices, send output to a file, and more.
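
For readers who just want the gist, here is a tiny sketch of what zip() does (not taken from the tutorial itself):

names = ["ada", "guido", "grace"]
ages = [36, 63, 85]

# zip() pairs up elements from several iterables into one iterator of tuples.
for name, age in zip(names, ages):
    print(f"{name} is {age}")

# Transposing a matrix by zipping its rows is a common trick.
matrix = [[1, 2, 3], [4, 5, 6]]
print(list(zip(*matrix)))   # [(1, 4), (2, 5), (3, 6)]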
Categories: FLOSS Project Planets

PSF GSoC students blogs: The week that has been @ 2048

6 hours 49 min ago

 


Week #9 17/07 to 23/07

 

Well, integration isn’t working out, and neither am I giving up. Also, Evaluation 2 is coming up!

What did you do this week?

Well, another week, another PR merged. (Do PRs that you merge yourself count?) With https://github.com/vipulgupta2048/spidermon/pull/4, the Translator has now been officially completed.

Over to integration: the ride has been quite bumpy, as Cerberus is not being detected by the pipeline. No worries, though; we have settled on a methodology to solve this problem. We will first check whether CerberusValidator works in the ItemValidationPipeline; if it does, then Spidermon works. Then we will start worrying about why it doesn’t work in other places.

Oh and I found a bug - https://github.com/scrapinghub/spidermon/issues/192

 

What is coming up next? 

For now, if you’d like to know: we will be completing the ItemValidationPipeline, then moving on to integration, which rounds this project up successfully.

 

Did you get stuck anywhere?

Don’t even ask, I somehow wasn’t able to install Scrapy in my first go (didn’t read the docs) and couldn’t implement the JSONSchemaValidator (didn’t read the docs enough times, with a magnifying glass). So yeah, bumpy.

Categories: FLOSS Project Planets

Codementor: Playing Tic Tac Toe using Reinforcement Learning

7 hours 11 min ago
How I made two bots competent enough to play Tic Tac Toe, and made them battle it out for glory.
Categories: FLOSS Project Planets

Real Python: Logging in Python

7 hours 12 min ago

Logging is a very useful tool in a programmer’s toolbox. It can help you develop a better understanding of the flow of a program and discover scenarios that you might not even have thought of while developing.

Logs provide developers with an extra set of eyes that are constantly looking at the flow that an application is going through. They can store information, like which user or IP accessed the application. If an error occurs, then they can provide more insights than a stack trace by telling you what the state of the program was before it arrived at the line of code where the error occurred.

By logging useful data from the right places, you can not only debug errors easily but also use the data to analyze the performance of the application to plan for scaling or look at usage patterns to plan for marketing.

Python provides a logging system as a part of its standard library, so you can quickly add logging to your application. In this course, you’ll learn why using this module is the best way to add logging to your application as well as how to get started quickly, and you will get an introduction to some of the advanced features available.
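
As a minimal taste of the module before diving into the course, the standard library's logging can be used with just a couple of lines:

import logging

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(name)s: %(message)s")
logger = logging.getLogger(__name__)

logger.info("Application started")
logger.warning("Disk usage at %d%%", 91)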

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

PSF GSoC students blogs: Weekly Check-in #9 : ( 19 July - 26 July )

8 hours 1 min ago
What did you do this week?
  • I added tests to make sure Protego doesn't throw exceptions on `robots.txt` of top 10,000 most popular websites.
  • Utilised Scrapy to create a tool to download `robots.txt` of top 10,000 most popular websites.
  • Benchmarked Protego: I ran Protego (written in Python), Robotexclusionrulesparser (written in Python), and Reppy (written in C++ but with a Python interface) on 570 `robots.txt` files downloaded from the top 1000 websites, and here are the results.
    • Time spent parsing the `robots.txt`
      • Protego : 79.00128873897484 seconds  
      • Robotexclusionrulesparser : 0.30100024401326664 seconds
      • Reppy : 0.05821833698428236 seconds
    • Time spent answering queries (1000 queries (all were identical) per `robots.txt`)
      • Protego : 14.736387926022871 seconds
      • Robotexclusionrulesparser : 67.33521732398367 seconds
      • Reppy : 1.0866852040198864 seconds
  • Added logging to Protego.
What is coming up next?
  • Will depend on the review from the mentors. If everything looks good to them, I would shift my focus back on Scrapy.
  • Make `SitemapSpider` use the new interface for `robots.txt` parsers.
  • Implement Crawl-delay & Host directive support in Scrapy.
Did you get stuck anywhere?
  • Nothing major.
Categories: FLOSS Project Planets

Ruslan Spivak: Let’s Build A Simple Interpreter. Part 16: Recognizing Procedure Calls

8 hours 52 min ago

“Learning is like rowing upstream: not to advance is to drop back.” — Chinese proverb

Today we’re going to extend our interpreter to recognize procedure calls. I hope by now you’ve flexed your coding muscles and are ready to tackle this step. This is a necessary step for us before we can learn how to execute procedure calls, which will be a topic that we will cover in great detail in future articles.

The goal for today is to make sure that when our interpreter reads a program with a procedure call, the parser constructs an Abstract Syntax Tree (AST) with a new tree node for the procedure call, and the semantic analyzer and the interpreter don’t throw any errors when walking the AST.

Let’s take a look at a sample program that contains a procedure call Alpha(3 + 5, 7):

Making our interpreter recognize programs like the one above will be our focus for today.

As with any new feature, we need to update various components of the interpreter to support this feature. Let’s dive into each of those components one by one.


First, we need to update the parser. Here is a list of all the parser changes that we need to make to be able to parse procedure calls and build the right AST:

  1. We need to add a new AST node to represent a procedure call
  2. We need to add a new grammar rule for procedure call statements; then we need to implement the rule in code
  3. We need to extend the statement grammar rule to include the rule for procedure call statements and update the statement method to reflect the changes in the grammar

1. Let’s start by creating a separate class to represent a procedure call AST node. Let’s call the class ProcedureCall:

class ProcedureCall(AST):
    def __init__(self, proc_name, actual_params, token):
        self.proc_name = proc_name
        self.actual_params = actual_params  # a list of AST nodes
        self.token = token

The ProcedureCall class constructor takes three parameters: a procedure name, a list of actual parameters (a.k.a arguments), and a token. Nothing really special here, just enough information for us to capture a particular procedure call.

2. The next step that we need to take is to extend our grammar and add a grammar rule for procedure calls. Let’s call the rule proccall_statement:

proccall_statement : ID LPAREN expr? (COMMA expr)* RPAREN

Here is a corresponding syntax diagram for the rule:

From the diagram above you can see that a procedure call is an ID token followed by a left parenthesis, followed by zero or more expressions separated by commas, followed by a right parenthesis. Here are some of the procedure call examples that fit the rule:

Alpha();
Alpha(1);
Alpha(3 + 5, 7);

Next, let’s implement the rule in our parser by adding a proccall_statement method:

def proccall_statement(self):
    """proccall_statement : ID LPAREN expr? (COMMA expr)* RPAREN"""
    token = self.current_token

    proc_name = self.current_token.value
    self.eat(TokenType.ID)
    self.eat(TokenType.LPAREN)

    actual_params = []
    if self.current_token.type != TokenType.RPAREN:
        node = self.expr()
        actual_params.append(node)

    while self.current_token.type == TokenType.COMMA:
        self.eat(TokenType.COMMA)
        node = self.expr()
        actual_params.append(node)

    self.eat(TokenType.RPAREN)

    node = ProcedureCall(
        proc_name=proc_name,
        actual_params=actual_params,
        token=token,
    )
    return node

The implementation is pretty straightforward and follows the grammar rule: the method parses a procedure call and returns a new ProcedureCall AST node.

3. And the last changes to the parser that we need to make are: extend the statement grammar rule by adding the proccall_statement rule and update the statement method to call the proccall_statement method.

Here is the updated statement grammar rule, which includes the proccall_statement rule:

statement : compound_statement
          | proccall_statement
          | assignment_statement
          | empty

Now, we have a tricky situation on hand where we have two grammar rules - proccall_statement and assignment_statement - that start with the same token, the ID token. Here are their complete grammar rules put together for comparison:

proccall_statement : ID LPAREN expr? (COMMA expr)* RPAREN

assignment_statement : variable ASSIGN expr

variable : ID

How do you distinguish between a procedure call and an assignment in a case like that? They are both statements and they both start with an ID token. In the fragment of code below, the ID token’s value(lexeme) for both statements is foo:

foo();     { procedure call }
foo := 5;  { assignment }

The parser should recognize foo(); above as a procedure call and foo := 5; as an assignment. But what can we do to help the parser to distinguish between procedure calls and assignments? According to our new proccall_statement grammar rule, procedure calls start with an ID token followed by a left parenthesis. And that’s what we are going to rely on in the parser to distinguish between procedure calls and assignments to variables - the presence of a left parenthesis after the ID token:

if (self.current_token.type == TokenType.ID and
    self.lexer.current_char == '('
):
    node = self.proccall_statement()
elif self.current_token.type == TokenType.ID:
    node = self.assignment_statement()

As you can see in the code above, first we check if the current token is an ID token and then we check if it’s followed by a left parenthesis. If it is, we parse a procedure call, otherwise we parse an assignment statement.

Here is the full updated version of the statement method:

def statement(self):
    """
    statement : compound_statement
              | proccall_statement
              | assignment_statement
              | empty
    """
    if self.current_token.type == TokenType.BEGIN:
        node = self.compound_statement()
    elif (self.current_token.type == TokenType.ID and
          self.lexer.current_char == '('
    ):
        node = self.proccall_statement()
    elif self.current_token.type == TokenType.ID:
        node = self.assignment_statement()
    else:
        node = self.empty()
    return node


So far so good. The parser can now parse procedure calls. One thing to keep in mind though is that Pascal procedures don’t have return statements, so we can’t use procedure calls in expressions. For example, the following example will not work if Alpha is a procedure:

x := 10 * Alpha(3 + 5, 7);

That’s why we added proccall_statement to the statements method only and nowhere else. Not to worry, later in the series we’ll learn about Pascal functions that can return values and also can be used in expressions and assignments.

These are all the changes for our parser. Next up is the semantic analyzer changes.


The only change we need to make in our semantic analyzer to support procedure calls is to add a visit_ProcedureCall method:

def visit_ProcedureCall(self, node):
    for param_node in node.actual_params:
        self.visit(param_node)

All the method does is iterate over a list of actual parameters passed to a procedure call and visit each parameter node in turn. It’s important not to forget to visit each parameter node because each parameter node is an AST sub-tree in itself.

That was easy, wasn’t it? Okay, now moving on to interpreter changes.


The interpreter changes, compared to the changes to the semantic analyzer, are even simpler - we only need to add an empty visit_ProcedureCall method to the Interpreter class:

def visit_ProcedureCall(self, node):
    pass

With all the above changes in place, we now have an interpreter that can recognize procedure calls. And by that I mean the interpreter can parse procedure calls and create an AST with ProcedureCall nodes corresponding to those procedure calls. Here is the sample Pascal program we saw at the beginning of the article that we want our interpreter to be tested on:

program Main;

procedure Alpha(a : integer; b : integer);
var x : integer;
begin
    x := (a + b ) * 2;
end;

begin { Main }

    Alpha(3 + 5, 7);  { procedure call }

end.  { Main }

Download the above program from GitHub or save the code to the file part16.pas

See for yourself that running our updated interpreter with the part16.pas as its input file does not generate any errors:

$ python spi.py part16.pas
$

So far so good, but no output is not that exciting. :) Let’s get a bit visual and generate an AST for the above program and then visualize the AST using an updated version of the genastdot.py utility:

$ python genastdot.py part16.pas > ast.dot && dot -Tpng -o ast.png ast.dot

That’s better. In the picture above you can see our new ProcCall AST node labeled ProcCall:Alpha for the Alpha(3 + 5, 7) procedure call. The two children of the ProcCall:Alpha node are the subtrees for the arguments 3 + 5 and 7 passed to the Alpha(3 + 5, 7) procedure call.

Okay, we have accomplished our goal for today: when encountering a procedure call, the parser constructs an AST with a ProcCall node for the procedure call, and the semantic analyzer and the interpreter don’t throw any errors when walking the AST.


Now, it’s time for an exercise.

Exercise: Add a check to the semantic analyzer that verifies that the number of arguments (actual parameters) passed to a procedure call equals the number of formal parameters defined in the corresponding procedure declaration. Let’s take the Alpha procedure declaration we used earlier in the article as an example:

procedure Alpha(a : integer; b : integer);
var x : integer;
begin
    x := (a + b ) * 2;
end;

The number of formal parameters in the procedure declaration above is two (integers a and b). Your check should throw an error if you try to call the procedure with a number of arguments other than two:

Alpha();         { 0 arguments —> ERROR }
Alpha(1);        { 1 argument  —> ERROR }
Alpha(1, 2, 3);  { 3 arguments —> ERROR }

You can find a solution to the exercise in the file solutions.txt on GitHub, but try to work out your own solution first before peeking into the file.


That’s all for today. In the next article we’ll begin to learn how to interpret procedure calls. We will cover topics like call stack and activation records. It is going to be a wild ride :) So stay tuned and see you next time!


Resources used in preparation for this article (some links are affiliate links):

  1. Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages (Pragmatic Programmers)
  2. Writing Compilers and Interpreters: A Software Engineering Approach
  3. Free Pascal Reference guide

Categories: FLOSS Project Planets

Catalin George Festila: Python 3.7.3 : Using the flask - part 001.

9 hours 2 min ago
A short intro to this Python module can be found on the PyPI website: Flask is a lightweight WSGI web application framework. It is designed to make getting started quick and easy, with the ability to scale up to complex applications. It began as a simple wrapper around Werkzeug and Jinja and has become one of the most popular Python web application frameworks. Flask offers suggestions but
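
For context, the canonical minimal Flask application (the usual hello-world pattern, shown here as a generic example rather than code from the post) looks like this:

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello, World!"

if __name__ == "__main__":
    app.run(debug=True)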
Categories: FLOSS Project Planets

Stack Abuse: Python for NLP: Word Embeddings for Deep Learning in Keras

9 hours 3 min ago

This is the 16th article in my series of articles on Python for NLP. In my previous article I explained how the N-Grams technique can be used to develop a simple automatic text filler in Python. The N-Gram model is basically a way to convert text data into numeric form so that it can be used by statistical algorithms.

Before N-Grams, I explained the bag of words and TF-IDF approaches, which can also be used to generate numeric feature vectors from text data. Till now we have been using machine learning approaches to perform different NLP tasks such as text classification, topic modeling, sentiment analysis, text summarization, etc. In this article we will start our discussion about deep learning techniques for NLP.

Deep learning approaches consist of different types of densely connected neural networks. These approaches have proven effective at solving several complex tasks such as self-driving cars, image generation, and image segmentation. Deep learning approaches have also proven quite effective for NLP tasks.

In this article, we will study word embeddings for NLP tasks that involve deep learning. We will see how word embeddings can be used to perform a simple classification task using a deep neural network in Python's Keras library.

Problems with One-Hot Encoded Feature Vector Approaches

A potential drawback with one-hot encoded feature vector approaches such as N-Grams, bag of words and TF-IDF is that the feature vector for each document can be huge. For instance, if you have half a million unique words in your corpus and you want to represent a sentence that contains 10 words, your feature vector will be a half-million-dimensional one-hot encoded vector where only 10 indexes will have 1. This is a waste of space and increases algorithm complexity exponentially, resulting in the curse of dimensionality.

Word Embeddings

In word embeddings, every word is represented as an n-dimensional dense vector. Words that are similar will have similar vectors. Word embedding techniques such as GloVe and Word2Vec have proven to be extremely efficient for converting words into corresponding dense vectors. The vector size is small, and none of the indexes in the vector is actually empty.

Implementing Word Embeddings with Keras Sequential Models

The Keras library is one of the most famous and commonly used deep learning libraries for Python that is built on top of TensorFlow.

Keras supports two types of APIs: Sequential and Functional. In this section we will see how word embeddings are used with the Keras Sequential API. In the next section, I will explain how to implement the same model via the Keras Functional API.

To implement word embeddings, the Keras library contains a layer called Embedding(). The embedding layer is implemented in the form of a class in Keras and is normally used as a first layer in the sequential model for NLP tasks.

The embedding layer can be used to perform three tasks in Keras:

  • It can be used to learn word embeddings and save the resulting model
  • It can be used to learn the word embeddings in addition to performing the NLP tasks such as text classification, sentiment analysis, etc.
  • It can be used to load pretrained word embeddings and use them in a new model

In this article, we will see the second and third use-case of the Embedding layer. The first use-case is a subset of the second use-case.

Let's see how the embedding layer looks:

embedding_layer = Embedding(200, 32, input_length=50)

The first parameter in the embedding layer is the size of the vocabulary, or the total number of unique words in a corpus. The second parameter is the number of dimensions for each word vector. For instance, if you want each word vector to have 32 dimensions, you will specify 32 as the second parameter. And finally, the third parameter is the length of the input sentence.

The output of the word embedding is a 2D vector where words are represented in rows, whereas their corresponding dimensions are presented in columns. Finally, if you wish to directly connect your word embedding layer with a densely connected layer, you first have to flatten your 2D word embeddings into 1D. These concepts will become more understandable once we see word embedding in action.

Custom Word Embeddings

As I said earlier, Keras can be used to either learn custom words embedding or it can be used to load pretrained word embeddings. In this section, we will see how the Keras Embedding Layer can be used to learn custom word embeddings.

We will perform a simple text classification task that will use word embeddings. Execute the following script to import the required libraries:

from numpy import array
from keras.preprocessing.text import one_hot
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.embeddings import Embedding

Next, we need to define our dataset. We will be using a very simple custom dataset that will contain reviews about movies. The following script creates our dataset:

corpus = [
    # Positive Reviews

    'This is an excellent movie',
    'The move was fantastic I like it',
    'You should watch it is brilliant',
    'Exceptionally good',
    'Wonderfully directed and executed I like it',
    'Its a fantastic series',
    'Never watched such a brillent movie',
    'It is a Wonderful movie',

    # Negative Reviews

    "horrible acting",
    'waste of money',
    'pathetic picture',
    'It was very boring',
    'I did not like the movie',
    'The movie was horrible',
    'I will not recommend',
    'The acting is pathetic'
]

Our corpus has 8 positive reviews and 8 negative reviews. The next step is to create label set for our data.

sentiments = array([1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0])

You can see that the first 8 items in the sentiment array contain 1, which correspond to positive sentiment. The last 8 items are zero that correspond to negative sentiment.

Earlier we said that the first parameter to the Embedding() layer is the vocabulary, or number of unique words in the corpus. Let's first find the total number of words in our corpus:

from nltk.tokenize import word_tokenize

all_words = []
for sent in corpus:
    tokenize_word = word_tokenize(sent)
    for word in tokenize_word:
        all_words.append(word)

In the script above, we simply iterate through each sentence in our corpus and then tokenize the sentence into words. Next, we iterate through the list of all the words and append the words into the all_words list. Once you execute the above script, you should see all the words in the all_words list. However, we do not want duplicate words.

We can retrieve all the unique words from a list by passing the list into the set function, as shown below.

unique_words = set(all_words)
print(len(unique_words))

In the output you will see "45", which is the number of unique words in our corpus. We will add a buffer of 5 to our vocabulary size and will set the value of vocab_length to 50.
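
The article does not show this assignment explicitly, but based on the description it amounts to:

vocab_length = 50  # 45 unique words plus a buffer of 5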

The Embedding layer expects the words to be in numeric form. Therefore, we need to convert the sentences in our corpus to numbers. One way to convert text to numbers is by using the one_hot function from the keras.preprocessing.text library. The function takes a sentence and the total length of the vocabulary and returns the sentence in numeric form.

embedded_sentences = [one_hot(sent, vocab_length) for sent in corpus]
print(embedded_sentences)

In the script above, we convert all the sentences in our corpus to their numeric form and display them on the console. The output looks like this:

[[31, 12, 31, 14, 9], [20, 3, 20, 16, 18, 45, 14], [16, 26, 29, 14, 12, 1], [16, 23], [32, 41, 13, 20, 18, 45, 14], [15, 28, 16, 43], [7, 9, 31, 28, 31, 9], [14, 12, 28, 46, 9], [4, 22], [5, 4, 9], [23, 46], [14, 20, 32, 14], [18, 1, 26, 45, 20, 9], [20, 9, 20, 4], [18, 8, 26, 34], [20, 22, 12, 23]]

You can see that our first sentence contained five words, therefore we have five integers in the first list item. Also, notice that the last word of the first sentence was "movie" in the first list item, and we have digit 9 at the fifth place of the resulting 2D array, which means that "movie" has been encoded as 9 and so on.

The embedding layer expects sentences to be of equal size. However, our encoded sentences are of different sizes. One way to make all the sentences of uniform size is to increase the length of all the sentences and make it equal to the length of the largest sentence. Let's first find the largest sentence in our corpus and then we will increase the length of all the sentences to the length of the largest sentence. To do so, execute the following script:

word_count = lambda sentence: len(word_tokenize(sentence))
longest_sentence = max(corpus, key=word_count)
length_long_sentence = len(word_tokenize(longest_sentence))

In the script above, we use a lambda expression to find the length of each sentence. We then use the max function to return the longest sentence. Finally, the longest sentence is tokenized into words and the number of words is counted using the len function.

Next, to make all the sentences of equal size, we will add zeros to the empty indexes that are created as a result of increasing the sentence length. To append the zeros at the end of the sentences, we can use the pad_sequences method. The first parameter is the list of encoded sentences of unequal sizes, the second parameter is the size of the longest sentence (the padding length), while the last parameter is padding, where you specify post to add the padding at the end of sentences.

Execute the following script:

padded_sentences = pad_sequences(embedded_sentences, length_long_sentence, padding='post')
print(padded_sentences)

In the output, you should see sentences with padding.

[[31 12 31 14  9  0  0]
 [20  3 20 16 18 45 14]
 [16 26 29 14 12  1  0]
 [16 23  0  0  0  0  0]
 [32 41 13 20 18 45 14]
 [15 28 16 43  0  0  0]
 [ 7  9 31 28 31  9  0]
 [14 12 28 46  9  0  0]
 [ 4 22  0  0  0  0  0]
 [ 5  4  9  0  0  0  0]
 [23 46  0  0  0  0  0]
 [14 20 32 14  0  0  0]
 [18  1 26 45 20  9  0]
 [20  9 20  4  0  0  0]
 [18  8 26 34  0  0  0]
 [20 22 12 23  0  0  0]]

You can see zeros at the end of the padded sentences.

Now we have everything that we need to create a sentiment classification model using word embeddings.

We will create a very simple text classification model with an embedding layer and no hidden layers. Look at the following script:

model = Sequential()
model.add(Embedding(vocab_length, 20, input_length=length_long_sentence))
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))

In the script above, we create a Sequential model and add the Embedding layer as the first layer to the model. The length of the vocabulary is specified by the vocab_length parameter. The dimension of each word vector will be 20 and the input_length will be the length of the longest sentence, which is 7. Next, the Embedding layer is flattened so that it can be directly used with the densely connected layer. Since it is a binary classification problem, we use the sigmoid activation function at the dense layer.

Next, we will compile the model and print the summary of our model, as shown below:

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])
print(model.summary())

The summary of the model is as follows:

Layer (type)                 Output Shape              Param #
=================================================================
embedding_1 (Embedding)      (None, 7, 20)             1000
_________________________________________________________________
flatten_1 (Flatten)          (None, 140)               0
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 141
=================================================================
Total params: 1,141
Trainable params: 1,141
Non-trainable params: 0

You can see that the first layer has 1000 trainable parameters. This is because our vocabulary size is 50 and each word will be represented as a 20 dimensional vector. Hence the total number of trainable parameters will be 1000. Similarly, the output from the embedding layer will be a sentence with 7 words where each word is represented by a 20 dimensional vector. However, when the 2D output is flattened, we get a 140 dimensional vector (7 x 20). The flattened vector is directly connected to the dense layer that contains 1 neuron.

Now let's train the model on our data using the fit method, as shown below:

model.fit(padded_sentences, sentiments, epochs=100, verbose=1)

The model will be trained for 100 epochs.

We will train and test the model using the same corpus. Execute the following script to evaluate the model performance on our corpus:

loss, accuracy = model.evaluate(padded_sentences, sentiments, verbose=0)
print('Accuracy: %f' % (accuracy*100))

In the output, you will see that model accuracy is 1.00 i.e. 100 percent.

Note: In real world applications, train and test sets should be different. We will see an example of that when we perform text classification on some real world data in an upcoming article.

Loading Pretrained Word Embeddings

In the previous section we trained custom word embeddings. However, we can also use pretrained word embeddings.

Several types of pretrained word embeddings exist, however we will be using the GloVe word embeddings from Stanford NLP since it is the most famous one and commonly used. The word embeddings can be downloaded from this link.

The smallest file is named "Glove.6B.zip". The size of the file is 822 MB. The file contains 50, 100, 200, and 300 dimensional word vectors for 400k words. We will be using the 100 dimensional vector.

The process is quite similar. First we have to import the required libraries:

from numpy import array
from keras.preprocessing.text import one_hot
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.embeddings import Embedding

Next, we have to create our corpus followed by the labels.

corpus = [
    # Positive Reviews

    'This is an excellent movie',
    'The move was fantastic I like it',
    'You should watch it is brilliant',
    'Exceptionally good',
    'Wonderfully directed and executed I like it',
    'Its a fantastic series',
    'Never watched such a brillent movie',
    'It is a Wonderful movie',

    # Negative Reviews

    "horrible acting",
    'waste of money',
    'pathetic picture',
    'It was very boring',
    'I did not like the movie',
    'The movie was horrible',
    'I will not recommend',
    'The acting is pathetic'
]

sentiments = array([1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0])

In the last section, we used the one_hot function to convert text to vectors. Another approach is to use the Tokenizer class from the keras.preprocessing.text library.

You simply have to pass your corpus to the Tokenizer's fit_on_texts method.

word_tokenizer = Tokenizer()
word_tokenizer.fit_on_texts(corpus)

To get the number of unique words in the text, you can simply count the length of the word_index dictionary of the word_tokenizer object. Remember to add 1 to the vocabulary size. This is to store the dimensions for the words for which no pretrained word embeddings exist.

vocab_length = len(word_tokenizer.word_index) + 1

Finally, to convert sentences to their numeric counterpart, call the texts_to_sequences function and pass it the whole corpus.

embedded_sentences = word_tokenizer.texts_to_sequences(corpus)
print(embedded_sentences)

In the output, you will see the sentences in their numeric form:

[[14, 3, 15, 16, 1], [4, 17, 6, 9, 5, 7, 2], [18, 19, 20, 2, 3, 21], [22, 23], [24, 25, 26, 27, 5, 7, 2], [28, 8, 9, 29], [30, 31, 32, 8, 33, 1], [2, 3, 8, 34, 1], [10, 11], [35, 36, 37], [12, 38], [2, 6, 39, 40], [5, 41, 13, 7, 4, 1], [4, 1, 6, 10], [5, 42, 13, 43], [4, 11, 3, 12]]

The next step is to find the number of words in the longest sentence and then to apply padding to the sentences having shorter lengths than the length of the longest sentence.

from nltk.tokenize import word_tokenize

word_count = lambda sentence: len(word_tokenize(sentence))
longest_sentence = max(corpus, key=word_count)
length_long_sentence = len(word_tokenize(longest_sentence))

padded_sentences = pad_sequences(embedded_sentences, length_long_sentence, padding='post')
print(padded_sentences)

The padded sentences look like this:

[[14  3 15 16  1  0  0]
 [ 4 17  6  9  5  7  2]
 [18 19 20  2  3 21  0]
 [22 23  0  0  0  0  0]
 [24 25 26 27  5  7  2]
 [28  8  9 29  0  0  0]
 [30 31 32  8 33  1  0]
 [ 2  3  8 34  1  0  0]
 [10 11  0  0  0  0  0]
 [35 36 37  0  0  0  0]
 [12 38  0  0  0  0  0]
 [ 2  6 39 40  0  0  0]
 [ 5 41 13  7  4  1  0]
 [ 4  1  6 10  0  0  0]
 [ 5 42 13 43  0  0  0]
 [ 4 11  3 12  0  0  0]]

We have converted our sentences into padded sequence of numbers. The next step is to load the GloVe word embeddings and then create our embedding matrix that contains the words in our corpus and their corresponding values from GloVe embeddings. Run the following script:

from numpy import array
from numpy import asarray
from numpy import zeros

embeddings_dictionary = dict()

glove_file = open('E:/Datasets/Word Embeddings/glove.6B.100d.txt', encoding="utf8")

In the script above, in addition to loading the GloVe embeddings, we also imported a few libraries. We will see the use of these libraries in the upcoming section. Here notice that we loaded glove.6B.100d.txt file. This file contains 100 dimensional word embeddings. We also created an empty dictionary that will store our word embeddings.

If you open the file, you will see a word at the beginning of each line, followed by a set of 100 numbers. The numbers form the 100 dimensional vector for the word at the beginning of each line.

We will create a dictionary that will contain words as keys and the corresponding 100 dimensional vectors as values, in the form of an array. Execute the following script:

for line in glove_file:
    records = line.split()
    word = records[0]
    vector_dimensions = asarray(records[1:], dtype='float32')
    embeddings_dictionary[word] = vector_dimensions
glove_file.close()

The dictionary embeddings_dictionary now contains words and corresponding GloVe embeddings for all the words.

We want the word embeddings for only those words that are present in our corpus. We will create a two-dimensional numpy array of 44 (the size of the vocabulary) rows and 100 columns. The array will initially contain zeros and will be named embedding_matrix.

Next, we will iterate through each word in our corpus by traversing the word_tokenizer.word_index dictionary that contains our words and their corresponding index.

Each word will be passed as key to the embedding_dictionary to retrieve the corresponding 100 dimensional vector for the word. The 100 dimensional vector will then be stored at the corresponding index of the word in the embedding_matrix. Look at the following script:

embedding_matrix = zeros((vocab_length, 100))
for word, index in word_tokenizer.word_index.items():
    embedding_vector = embeddings_dictionary.get(word)
    if embedding_vector is not None:
        embedding_matrix[index] = embedding_vector

Our embedding_matrix now contains pretrained word embeddings for the words in our corpus.

Now we are ready to create our sequential model. Look at the following script:

model = Sequential()
embedding_layer = Embedding(vocab_length, 100, weights=[embedding_matrix], input_length=length_long_sentence, trainable=False)
model.add(embedding_layer)
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))

The script remains the same, except for the embedding layer. Here in the embedding layer, the first parameter is the size of the vocabulary. The second parameter is the dimension of the output vector. Since we are using pretrained word embeddings that contain 100 dimensional vectors, we set the vector dimension to 100.

Another very important attribute of the Embedding() layer that we did not use in the last section is weights. You can pass your pretrained embedding matrix as default weights to the weights parameter. And since we are not training the embedding layer, the trainable attribute has been set to False.

Let's compile our model and see the summary of our model:

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])
print(model.summary())

We are again using adam as the optimizer to minimize the loss. The loss function being used is binary_crossentropy. And we want to see the results in the form of accuracy, so acc has been passed as the value for the metrics attribute.

The model summary is as follows:

Layer (type)                 Output Shape              Param #
=================================================================
embedding_1 (Embedding)      (None, 7, 100)            4400
_________________________________________________________________
flatten_1 (Flatten)          (None, 700)               0
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 701
=================================================================
Total params: 5,101
Trainable params: 701
Non-trainable params: 4,400
_________________________________________________________________

You can see that since we have 44 words in our vocabulary and each word will be represented as a 100 dimensional vector, the number of parameters for the embedding layer will be 44 x 100 = 4400. The output from the embedding layer will be a 2D vector with 7 rows (1 for each word in the sentence) and 100 columns. The output from the embedding layer will be flattened so that it can be used with the dense layer. Finally the dense layer is used to make predictions.

Execute the following script to train the algorithms:

model.fit(padded_sentences, sentiments, epochs=100, verbose=1)

Once the algorithm is trained, run the following script to evaluate the performance of the algorithm.

loss, accuracy = model.evaluate(padded_sentences, sentiments, verbose=0)
print('Accuracy: %f' % (accuracy*100))

In the output, you should see that accuracy is 1.000 i.e. 100%.

Word Embeddings with Keras Functional API

In the last section, we saw how word embeddings can be used with the Keras sequential API. While the sequential API is a good starting point for beginners, as it allows you to quickly create deep learning models, it is extremely important to know how Keras Functional API works. Most of the advanced deep learning models involving multiple inputs and outputs use the Functional API.

In this section, we will see how we can implement embedding layer with Keras Functional API.

The rest of the script remains similar as it was in the last section. The only change will be in the development of a deep learning model. Let's implement the same deep learning model as we implemented in the last section with Keras Functional API.

from keras.models import Model
from keras.layers import Input

deep_inputs = Input(shape=(length_long_sentence,))
embedding = Embedding(vocab_length, 100, weights=[embedding_matrix], input_length=length_long_sentence, trainable=False)(deep_inputs) # line A
flatten = Flatten()(embedding)
hidden = Dense(1, activation='sigmoid')(flatten)
model = Model(inputs=deep_inputs, outputs=hidden)

In the Keras Functional API, you have to define the input layer separately before the embedding layer. In the input layer, you simply have to pass the length of the input vector. To specify the previous layer as input to the next layer, the previous layer is passed as a parameter inside the parentheses at the end of the next layer.

For instance, in the above script, you can see that deep_inputs is passed as parameter at the end of the embedding layer. Similarly, embedding is passed as input at the end of the Flatten() layer and so on.

Finally, in the Model(), you have to pass the input layer, and the final output layer.

Let's now compile the model and take a look at the summary of the model.

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])
print(model.summary())

The output looks like this:

Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 7)                 0
_________________________________________________________________
embedding_1 (Embedding)      (None, 7, 100)            4400
_________________________________________________________________
flatten_1 (Flatten)          (None, 700)               0
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 701
=================================================================
Total params: 5,101
Trainable params: 701
Non-trainable params: 4,400

In the model summary, you can see the input layer as a separate layer before the embedding layer. The rest of the model remains the same.

Finally, the process to fit and evaluate the model is the same as the one used with the Sequential API:

model.fit(padded_sentences, sentiments, epochs=100, verbose=1)

loss, accuracy = model.evaluate(padded_sentences, sentiments, verbose=0)
print('Accuracy: %f' % (accuracy*100))

In the output, you will see an accuracy of 1.000 i.e. 100 percent.

Conclusion

To use text data as input to a deep learning model, we need to convert text to numbers. However, unlike classical machine learning models, passing sparse vectors of huge sizes can greatly affect deep learning models. Therefore, we need to convert our text to small dense vectors. Word embeddings help us convert text to dense vectors.

In this article we saw how word embeddings can be implemented with the Keras deep learning library. We implemented custom word embeddings as well as pretrained word embeddings to solve a simple classification task. Finally, we also saw how to implement word embeddings with the Keras Functional API.

Categories: FLOSS Project Planets

Zato Blog: Windows commands and PowerShell scripts as API microservices

9 hours 57 min ago

This post goes through the steps of exposing Windows commands and PowerShell scripts as remote Zato API services that can be invoked by REST clients.

This lets one access a fleet of Windows systems from a single place and makes it possible for Zato services to participate in Windows management processes.

Note that Zato servers always run on Linux and no installation of any kind of software under Windows is necessary for Zato to connect to remote systems.

Prerequisites

Start by installing a library that implements the remote Windows connectivity:

$ cd /opt/zato/current
$ ./bin/pip install pywinrm

Next, stop and start again any servers running, e.g.:

$ zato stop /path/to/server
$ zato start /path/to/server

Python code

Deploy the following service to your Zato cluster - note its name, windows.remote.management.

# -*- coding: utf-8 -*-

from __future__ import absolute_import, division, print_function, unicode_literals

# stdlib
from traceback import format_exc

# pywinrm
import winrm

# Zato
from zato.server.service import Service

class MyService(Service):
    name = 'windows.remote.management'

    class SimpleIO:
        input_required = 'type', 'data'
        output_required = 'exec_code'
        output_optional = 'status_code', 'stdout', 'stderr', 'details'
        response_elem = None
        skip_empty_keys = True

    def handle(self):

        # Local aliases
        input_type = self.request.input.type
        input_data = self.request.input.data

        # Validate input - we support either regular commands
        # or PowerShell scripts on input.
        if input_type not in ('cmd', 'ps'):
            self.response.payload.exec_code = 'error'
            self.response.payload.details = 'Invalid type'

        # Input was valid, we can try to execute the command now
        else:
            try:
                # Remote server details
                host = '10.151.139.17'

                # Credentials
                username = 'myuser'
                password = 'NZaIhMezvK00Y'

                # Establish a connection to the remote host
                session = winrm.Session(host, (username, password))

                # Dynamically select a function to run,
                # either for commands or PowerShell
                func = session.run_cmd if input_type == 'cmd' else session.run_ps

                # Run the function with input data given
                result = func(input_data)

                # Status code is always available
                self.response.payload.status_code = result.status_code
                self.response.payload.stdout = result.std_out
                self.response.payload.stderr = result.std_err

            except Exception:
                self.response.payload.exec_code = 'error'
                self.response.payload.details = format_exc()

            # Everything went fine
            else:
                self.response.payload.exec_code = 'ok'

REST channel

Create a new REST channel in web-admin and mount service windows.remote.management on it. Make sure to set data format to JSON.

Usage

Let us invoke the service from command line, using curl. For clarity, the output of commands below is limited to a few lines.

First, we will run a regular command to get a directory listing of drive C:

$ curl http://api:password@localhost:11223/windows -d '{"type":"cmd", "data":"dir c:"}'
{"status_code": 0, "exec_code": "ok", "stdout": " Volume in drive C has no label.\r\n Volume Serial Number is 1F76-3AB6\r\n\r\n Directory of C:\\Users\\Administrator\r\n\r\n07/22/2019 11:53 PM <DIR>"}

What if we provide an invalid drive name?

$ curl http://api:password@localhost:11223/windows -d '{"type":"cmd", "data":"dir z:"}'
{"status_code": 1, "exec_code": "ok", "stderr": "The system cannot find the path specified.\r\n"}

Now, invoke a PowerShell script, which in this case is a single-line one to check connection from the remote Windows system to example.com, but it could be much more complex, there are no limitations:

$ curl http://api:password@localhost:11223/windows -d \
  '{"type":"ps", "data":"Test-Connection example.com"}'

This time, both stdout and stderr are returned but because the overall status_code is 0, we know that the invocation was successful.

{"status_code": 0, "exec_code": "ok", "stdout": "WIN-A3I92B... example.com 93.184.216.34", "stderr": "#< CLIXML\r\n<Objs Version=\"1.1.0.1\" ...",

In conclusion

The service is just a starting point and there are a couple ways to extend it:

  • Details of remote servers, including credentials, should be kept separately
  • Permissions, including ACLs, can be added to allow or disallow access to particular commands to selected REST users only

Yet, even in this simple form, it already shows how easy it is to connect to Windows servers and turn remote commands into REST API microservices.

On top of it, REST is but one of many formats that Zato supports - one could just as well design workflows around AMQP, ZeroMQ, IBM MQ, FTP or other protocols in addition to REST with no changes to Python code required.

Categories: FLOSS Project Planets

PSF GSoC students blogs: Week #8 | Fixes in decorator for validation

14 hours 56 min ago

Issue: #508 Validating all the requests hitting Guillotina against a proper jsonschema.

This Week:

PR: #517 

Requests coming to the API were not being validated; the task was to validate every request hitting the API. A jsonschema is already defined for most of the endpoints, so only a validation wrapper needed to be added to each endpoint (along with jsonschemas for the missing endpoints and test cases for the validation). Now every request is validated against a well-defined schema, and a 412 (Precondition Failed) is thrown along with a proper error in case of a bad request.
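
The actual Guillotina wrapper lives in the PR linked above; as a rough standalone illustration of the idea, validating an incoming payload against a jsonschema and answering with 412 on failure could look like this (all names and the schema here are hypothetical):

import jsonschema

USER_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer", "minimum": 0},
    },
    "required": ["name"],
}

def validate_payload(payload, schema=USER_SCHEMA):
    """Return (status_code, error_message) for an incoming request body."""
    try:
        jsonschema.validate(instance=payload, schema=schema)
    except jsonschema.ValidationError as exc:
        return 412, exc.message   # Precondition Failed, with a proper error
    return 200, None

print(validate_payload({"name": "ada", "age": 36}))   # (200, None)
print(validate_payload({"age": -1}))                  # (412, <validation error message>)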

Next Week:

Issue: #507

The added validation is a wrapper over every endpoint, causing too much redundancy in the code base because it needs to be added to each endpoint. The next step is to replace the validation with a decorator, which will validate the request automatically with much less code redundancy.

Stuck anywhere:

No

Categories: FLOSS Project Planets

IslandT: Cryptocurrency user interface set up

15 hours 42 min ago
A simple cryptocurrency interface set up

As mentioned above, in this article we will start to create the user interface of our latest cryptocurrency project. Along the way we will also use the CryptoCompare API to retrieve data. In this article we will do the following:

  1. Create a simple interface with the cryptocurrency and the currency combo boxes for future use as well as a display panel to display the currency exchange rate.
  2. Make the API call which will retrieve the exchange rate of BTC vs USD, JPY, and the EUR.
  3. Display those exchange rates in the display panel.

Below is the entire python program.

import json
from tkinter import *
import tkinter.ttk as tk
import requests

win = Tk() # Create tk instance
win.title("Crypto Calculator") # Add a title
win.resizable(0, 0) # Disable resizing the GUI
win.configure(background='white') # change window background color

selectorFrame = Frame(win, background="white") # create top frame to hold search button and combobox
selectorFrame.pack(anchor = "nw", pady = 2, padx=10)
currency_label = Label(selectorFrame, text = "Select crypto / currency pair :", background="white")
currency_label.pack(anchor="w")

# Create a combo box for crypto currency
select_currency = StringVar() # create a string variable
crypto = tk.Combobox(selectorFrame, textvariable=select_currency)
crypto.pack(side = LEFT, padx=3)

s = StringVar() # create string variable

# create currency frame and text widget to display the incoming exchange rate data
currencyFrame = Frame(win)
currencyFrame.pack(side=TOP)
currency = Label(currencyFrame)
currency.pack()
text_widget = Text(currency, fg='white', background='black')
text_widget.pack()
s.set("Click the find button to load the crypto currency - currency exchange rate")
text_widget.insert(END, s.get())

buttonFrame = Frame(win) # create a bottom frame to hold the load button
buttonFrame.pack(side = BOTTOM, fill=X, pady = 6)

def search_currency(): # search currency pair
    pass # for future use

action_search = tk.Button(selectorFrame, text="Search", command=search_currency) # button used to search the currency pair within the text widget
action_search.pack(side=LEFT)

# Create currency combo box
base_currency = StringVar() # create a string variable
based = tk.Combobox(selectorFrame, textvariable=base_currency)
based.pack(side = LEFT, padx=3)

def get_exchange_rate(): # this method will display the incoming exchange rate data after the api called
    global exchange_rate
    global base_crypto
    base_crypto = "BTC"

    try:
        url = "https://min-api.cryptocompare.com/data/price" # url for API call
        data = {'fsym' : base_crypto, 'tsyms':"USD,JPY,EUR"}
        r = requests.get(url, params=data)
        exchange_rate_s = json.loads(json.dumps(r.json()))
    except:
        print("An exception occurred")

    count = 0
    found = False

    curr = tuple() # the tuple which will be populated by cryptocurrency
    curr += (base_crypto,)
    curr1 = tuple() # the tuple which will be populated by currency
    sell_buy = ''

    for key, value in exchange_rate_s.items(): # populate exchange rate string and the currency tuple
        sell_buy += base_crypto + ":" + key + " " + str(value) + " "
        curr1 += (key,)

    # fill up combo boxes for both currency and cryptocurrency
    based['values'] = curr1
    based.current(0)
    crypto['values'] = curr
    crypto.current(0)

    text_widget.delete('1.0', END) # clear all those previous text first
    s.set(sell_buy)
    text_widget.insert(INSERT, s.get()) # populate the text widget with new exchange rate data

action_vid = tk.Button(buttonFrame, text="Load", command=get_exchange_rate) # button used to load the exchange rate of currency pairs
action_vid.pack()

win.iconbitmap(r'ico.ico')
win.mainloop()

In this article, all the currencies are hardcoded into the program. In the next chapter, we will create two files to load those currencies into the currency combo boxes.

Categories: FLOSS Project Planets

Kushal Das: Using signify tool for sign and verification

Mon, 2019-07-22 23:53

We generally use GNUPG for signing and verifying files on our systems. There are other tools available to do so; some tools are written only for this purpose. signify is one such tool from the OpenBSD land.

How to install signify?

pkg install signify

I used the above command to install the tool on my FreeBSD system, and you can install it on your Debian system too; there the tool is called signify-openbsd, as Debian already has another tool with the same name. signify is yet to be packaged for Fedora; if you are a Fedora packager, you may want to package this one for all of us.

Creating a public/private key pair

signify -G -s atest.sec -p atest.pub -c "Test key for blog post"

The command will also ask for a password for the secret key. -c allows us to add a comment in our key files. The following is the content of the public keyfile.

untrusted comment: Test key for blog post public key
RWRjWJ28QRKKQCXxYPqwbnOqgsLYQSwvqfa2WDpp0dRDQX2Ht6Xl4Vz4

As it is very small in size, you can even create a QR code for the same.

Signing a file

In our demo directory, we have a hello.txt file, and we can use the newly generated key to create a signature.

signify -S -s atest.sec -m hello.txt

This will create a hello.txt.sig file as the signature.

Verifying the signature

$ signify -V -p atest.pub -m hello.txt
Signature Verified

This assumes the signature file is in the same directory. You can find the OpenBSD signature files under /usr/local/etc/signify (or in /etc/signify/ if you are on Debian).
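If you want to drive signify from a Python script (for example in a release pipeline), a small subprocess wrapper is enough. This is only a hedged sketch, not part of the signify tool itself: it assumes the signify binary (or signify-openbsd on Debian) is on your PATH and simply shells out to the same verify command shown above.

import shutil
import subprocess

def verify_signature(pubkey: str, message: str) -> bool:
    """Return True if `message` verifies against `pubkey` using signify."""
    # Debian ships the tool as signify-openbsd; fall back to plain signify otherwise.
    binary = shutil.which("signify-openbsd") or shutil.which("signify")
    if binary is None:
        raise RuntimeError("signify is not installed")
    result = subprocess.run(
        [binary, "-V", "-p", pubkey, "-m", message],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0  # signify exits 0 and prints "Signature Verified" on success

if __name__ == "__main__":
    print(verify_signature("atest.pub", "hello.txt"))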

To know more about the tool, read this paper.

Categories: FLOSS Project Planets

PSF GSoC students blogs: Getting started with new models

Mon, 2019-07-22 23:33

What did I do this week?

This week was mostly about testing and debugging the scikit linear regression model. After that I implemented saving and loading of the model, which itself took some time to debug. This work was completed by Friday, and I spent the weekend studying some other models, including k-Nearest Neighbors and K-Means, and learning about support vector machines.

Did I get stuck?

Oh, at loads of places. The most surprising and funny one was that I was getting a negative accuracy out of the scikit model and had absolutely no idea what it meant (for regressors, scikit-learn's default score is R², which can be negative when the model fits worse than simply predicting the mean). Now I have got it, and the mentors are looking into how we can make this work for DFFML.

I got stuck in saving and loading too: scikit offers saving and loading with pickle or joblib, but I also had to save the confidence of the model in a JSON file, which took tons of ideas and debugging.
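For context, here is a hedged sketch of the kind of persistence described above: joblib for the fitted scikit-learn estimator plus a small JSON file for the confidence value. The file names, the toy data, and the "confidence" key are illustrative, not the actual DFFML layout.

import json

from joblib import dump, load
from sklearn.linear_model import LinearRegression

# Train a toy model (stand-in for the real DFFML-wrapped estimator).
model = LinearRegression().fit([[0], [1], [2]], [0, 1, 2])
confidence = model.score([[3], [4]], [3, 4])  # R^2 score; can be negative for a bad fit

# Persist the estimator with joblib and the confidence separately as JSON.
dump(model, "model.joblib")
with open("confidence.json", "w") as fh:
    json.dump({"confidence": confidence}, fh)

# Later: restore both pieces.
restored = load("model.joblib")
with open("confidence.json") as fh:
    restored_confidence = json.load(fh)["confidence"]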

Plans for upcoming week?

It'll probably be discussed in the weekly sync, and the previous two models will be merged.

Categories: FLOSS Project Planets

Podcast.__init__: Protecting The Future Of Python By Hunting Black Swans

Mon, 2019-07-22 17:57
Summary

The Python language has seen exponential growth in popularity and usage over the past decade. This has been driven by industry trends such as the rise of data science and the continued growth of complex web applications. It is easy to think that there is no threat to the continued health of Python, its ecosystem, and its community, but there are always outside factors that may pose a threat in the long term. In this episode Russell Keith-Magee reprises his keynote from PyCon US in 2019 and shares his thoughts on potential black swan events and what we can do as engineers and as a community to guard against them.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • And to grow your professional network and find opportunities with the startups that are changing the world then Angel List is the place to go. Go to pythonpodcast.com/angel to sign up today.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Upcoming events include the O’Reilly AI Conference, the Strata Data Conference, and the combined events of the Data Architecture Summit and Graphorum. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email hosts@podcastinit.com
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Russell Keith-Magee about potential black swans for the Python language, ecosystem, and community and what we can do about them
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what a Black Swan is in the context of our conversation?
  • You were the opening keynote for PyCon this year, where you talked about some of the potential challenges facing Python. What motivated you to choose this topic for your presentation?
  • What effect did your talk have on the overall tone and focus of the conversations that you experienced during the rest of the conference?
    • What were some of the most notable or memorable reactions or pieces of feedback that you heard?
  • What are the biggest potential risks for the Python ecosystem that you have identified or discussed with others?
  • What is your overall sentiment about the potential for the future of Python?
  • As developers and technologists, does it really matter if Python continues to be a viable language?
  • What is your personal wish list of new capabilities or new directions for the future of the Python language and ecosystem?
  • For listeners to this podcast and members of the Python community, what are some of the ways that we can contribute to the long-term success of the language?
Keep In Touch
Picks
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Categories: FLOSS Project Planets

PSF GSoC students blogs: Google Summer of Code with Nuitka 4th Blog Post

Mon, 2019-07-22 17:27



This week, I continued to work on my script, which automates the pytest-based testing of Nuitka-built wheels. Details can be found on my pull request: https://github.com/Nuitka/Nuitka/pull/440

 

The script automates the previously manual task of comparing the pytest results of a wheel compiled with nuitka (`python setup.py bdist_nuitka`) against the pytest results of an uncompiled wheel built with `python setup.py bdist_wheel`, for the most popular PyPI packages. This testing ensures that nuitka builds the wheel correctly: if the pytests pass/fail in the same way, Nuitka built the wheel properly; if the results differ, something is wrong. Virtualenv is used to create a clean environment with no outside pollution.
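A heavily simplified sketch of that compare-the-two-wheels idea follows. It is not the actual script from the pull request: the package path and test directory are placeholders, it assumes nuitka, wheel, and setuptools are available in the build environment, and it compares only the overall pytest exit codes, whereas the real script compares individual test outcomes.

import subprocess
import venv
from pathlib import Path

PACKAGE_DIR = Path("some_package")  # placeholder: a checked-out PyPI package with a test suite

def build_wheel(dist_command):
    """Build the package with either bdist_wheel or bdist_nuitka and return the newest wheel."""
    subprocess.run(["python", "setup.py", dist_command], cwd=PACKAGE_DIR, check=True)
    wheels = (PACKAGE_DIR / "dist").glob("*.whl")
    return max(wheels, key=lambda path: path.stat().st_mtime)

def pytest_in_clean_env(wheel, env_dir):
    """Install the wheel into a fresh virtual environment and return pytest's exit code."""
    venv.create(env_dir, with_pip=True)
    python = str(Path(env_dir) / "bin" / "python")  # use Scripts\python.exe on Windows
    subprocess.run([python, "-m", "pip", "install", "pytest", str(wheel)], check=True)
    return subprocess.run([python, "-m", "pytest", str(PACKAGE_DIR / "tests")]).returncode

plain = pytest_in_clean_env(build_wheel("bdist_wheel"), "env_plain")
compiled = pytest_in_clean_env(build_wheel("bdist_nuitka"), "env_nuitka")
print("SAME" if plain == compiled else "DIFFERENT")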

 

There were a lot of changes made this week.

  • Added an `__import__` check to make sure pytest used the correct (uncompiled or compiled) package

  • Added a mechanism to ignore certain tests which are unimportant (fixed the comparison issue for urllib3)

  • Added coloring to output for better visuals

  • Added a summary of all packages at the very end of testing

  • Extended testing to many more packages

  • Added local caching of `git clone`

  • Many runtime improvements

 

Very proud of the work I have done this week. The plan for next week is to extend the automation to more PyPI packages.

 

Categories: FLOSS Project Planets

Artem Rys: Ranking capitals by the number of Starbucks using Python and Google Maps API

Mon, 2019-07-22 17:09
Photo by Khadeeja Yasser on Unsplash

While recently in Budapest (a great city, by the way), I saw lots of Starbucks coffee shops and decided to write a small script to rank capitals by the number of Starbucks they have.

I am not pretending to get the most accurate results; it is just an example. So please don't use it to make any serious business decisions :)

We will use the Google Places API and a local copy of a slightly reworked list of capitals' coordinates from here.

The full code is available here.

First of all, let me say that the Google Places API has a serious limitation: it returns a maximum of 60 results per search. That's why we need to be a bit tricky: we will request it several times with different coordinates and then count the number of unique places that we get.

This is probably not the best solution, but it is at least simple and returns more than 60 places. If you need more precise results, there are lots of Google Places API wrappers around the web.
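Since the author's own code lives in the embedded gists below, here is a rough, hedged sketch of the core idea rather than his exact implementation: query the Places Nearby Search endpoint around several points of a capital and count the union of unique place_ids. The API key environment variable, the coordinates, and the radius are placeholders.

import os

import requests

PLACES_URL = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"
API_KEY = os.environ["GOOGLE_API_KEY"]  # placeholder: your own Places API key

def starbucks_place_ids(lat, lng, radius_m=10000):
    """Return the unique place_ids of Starbucks found around one coordinate."""
    params = {
        "location": f"{lat},{lng}",
        "radius": radius_m,
        "keyword": "Starbucks",
        "key": API_KEY,
    }
    response = requests.get(PLACES_URL, params=params).json()
    return {place["place_id"] for place in response.get("results", [])}

# Work around the 60-result cap: query several points around the capital
# and count the union of unique place_ids.
budapest_points = [(47.4979, 19.0402), (47.53, 19.08), (47.46, 19.00)]  # illustrative points
unique_places = set()
for lat, lng in budapest_points:
    unique_places |= starbucks_place_ids(lat, lng)
print(f"Budapest: {len(unique_places)} Starbucks (approximate)")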

To run the real code yourself, you need to get an API key. Don't worry about the money: I spent $0 while testing my code.

https://medium.com/media/51e890f14de642963c2ffc692202e24b/href

Using this function, we can create a class that searches for Starbucks in different parts of the world. get_starbucks is just a function that makes a call to the Google Places API.

https://medium.com/media/19c814174c185e89be60e40c58c2fd04/href

Fact: 117 out of 240 capitals don't have Starbucks at all.

Top 30 capitals by the number of Starbucks there:

https://medium.com/media/60a34271f168238cc2504a32b3c3fb18/href


Thanks for your attention to the topic; feel free to leave your questions in the comments for discussion.

Ranking capitals by the number of Starbucks using Python and Google Maps API was originally published in python4you on Medium, where people are continuing the conversation by highlighting and responding to this story.

Categories: FLOSS Project Planets

PSF GSoC students blogs: Week 8: Appearances and WMS server

Mon, 2019-07-22 16:51

What did you do this week?

As planned, the basic working of topview with all the code migrated to cartopy was fine, but I now needed to address the remaining bugs, check each piece of functionality, and bring anything broken back to a working state. Building on that, my mentor identified a few errors, one of which was the completely garbled image being projected when using the WMS server. I was a little lost, experimented with the wrong things, and spent more time on it than needed; I assumed the issue traced back to wms_control.py, but in fact it was merely the absence of the `extent` keyword in the `imshow` call. Other than that, I also worked on the appearance feature for changing the color of water bodies, which previously didn't work unless you restarted mss.
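For illustration only (this is not the actual wms_control.py fix), the sketch below shows the kind of difference the `extent` keyword makes when drawing a WMS-style image on a cartopy GeoAxes: without it, the image is not georeferenced to the map. The random array stands in for a real WMS response.

import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import numpy as np

# Stand-in for an image fetched from a WMS server covering the whole globe.
wms_image = np.random.rand(180, 360, 3)

ax = plt.axes(projection=ccrs.PlateCarree())
ax.coastlines()
# extent tells cartopy which lon/lat box the image actually covers;
# leaving it out is what produced the misaligned, garbled-looking map.
ax.imshow(
    wms_image,
    origin="upper",
    extent=(-180, 180, -90, 90),   # (x0, x1, y0, y1) in the given CRS
    transform=ccrs.PlateCarree(),
)
plt.show()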

What is coming up next?

I think the current plan is to remove any source of crashes in the program through the interface, and then proceed to rewriting the tests which were basemap-specific.

Did you get stuck anywhere?

Apart from tracing the issue back to the imshow function, no, I did not get stuck anywhere.

Categories: FLOSS Project Planets

Codementor: Python Snippet 2: Quick Sequence Reversal

Mon, 2019-07-22 16:23
Python lists have a handy method called reverse, but it's not always what we want. For a start, we can't use it on other sequence types, like tuples and strings, and it also performs an in-place modification of the original sequence. Read more!
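Before clicking through, a quick hedged illustration of what the teaser is getting at (not necessarily the snippet from the article itself): list.reverse() mutates in place and only exists on lists, while slicing and reversed() work on any sequence and leave the original untouched.

numbers = [1, 2, 3]
numbers.reverse()                  # in-place: mutates the list and returns None
print(numbers)                     # [3, 2, 1]

word = "hello"
print(word[::-1])                  # 'olleh' -- slicing returns a new, reversed sequence
print(tuple(reversed((1, 2, 3))))  # (3, 2, 1) -- reversed() works on any sequence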
Categories: FLOSS Project Planets

PSF GSoC students blogs: Week 8: Blog Post (#4)

Mon, 2019-07-22 16:20

Last week, I was working on parametrizing the tests for SourceEstimates with tfr_morlet, and making sure the tests also pass for all the different function arguments. Overall this worked quite smoothly for most parameters. In fact, I was already able to start implementing the same procedure for tfr_multitaper. However, for tfr_multitaper there is no reference function as there is for tfr_morlet, so its tests are for now basically just running the function and checking that everything is in the right place.
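As a rough, self-contained illustration of that parametrization style (the toy transform and the argument names here are hypothetical stand-ins, not MNE-Python's actual test suite):

import numpy as np
import pytest

# Hypothetical stand-in for the real time-frequency transform under test.
def toy_transform(data, output="power", use_fft=True):
    spec = np.fft.rfft(data) if use_fft else np.array([np.dot(data, data)])
    return np.abs(spec) ** 2 if output == "power" else spec

@pytest.mark.parametrize("output", ["power", "complex"])
@pytest.mark.parametrize("use_fft", [True, False])
def test_toy_transform_arguments(output, use_fft):
    # Every combination of arguments should run and return finite values.
    result = toy_transform(np.ones(8), output=output, use_fft=use_fft)
    assert np.all(np.isfinite(result))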

However, some of the parameters were still causing problems. By far the most time this week was spent on tfr_morlet processing data from 'vector' representations (i.e. instead of one virtual "sensor" in source space, there are 3 virtual "sensors" created, each one representing one directional activation axis). Besides adding a further dimension (which was no big problem to adapt to), the data fields consistently showed a slight difference (of about 2-4%).
After noticing that in this case the results are theoretically not expected to be the same for the two procedures the functions take, I'll have to find a case where a comparison between the two is possible.

So this is one thing I'm going to do in the following week. But mainly, I'm going to continue covering cases for the tfr_multitaper function, as well as starting with the last function, tfr_stockwell.
Another thing to possibly look at now is a further option of these functions: additionally returning the inter-trial coherence of the time-frequency transforms.

Categories: FLOSS Project Planets
