Planet Python

Subscribe to Planet Python feed
Planet Python - http://planetpython.org/
Updated: 13 hours 9 min ago

Codementor: ML with Python: Part-1

Sat, 2019-09-21 07:02
Now we are comfortable with Python and ready to get started with Machine Learning (ML) projects. But where to go next? Can we directly dive into coding ML projects? Please follow along to...
Categories: FLOSS Project Planets

Test and Code: 88: Error Monitoring, Crash Reporting, Performance Monitoring - JD Trask

Sat, 2019-09-21 00:45

Error monitoring, crash reporting, and performance monitoring are tools that help you create a better user experience, and they are fast becoming crucial for web development and site reliability. But what are they, really? And when do you need them?

You've built a cool web app or service, and you want to make sure your customers have a great experience.

You know I advocate for utilizing automated tests so you find bugs before your customers do. However, fast development lifecycles and quickly reacting to customer needs are good things, and we all know that complete testing is not possible. That's why I firmly believe that site monitoring tools like logging, crash reporting, performance monitoring, etc. are awesome for maintaining and improving user experience.

John-Daniel Trask, JD, the CEO of Raygun, agreed to come on the show and let me ask all my questions about this whole field.

Special Guest: John-Daniel Trask.

Sponsored By:

Raygun: Detect, diagnose, and destroy Python errors that are affecting your customers. With smart Python error monitoring software from Raygun.com, you can be alerted to issues affecting your users the second they happen.

Support Test & Code - Python Testing & Development
Categories: FLOSS Project Planets

Brett Cannon: A Primer on the Stellar Network

Fri, 2019-09-20 22:11

On September 10, 2019, while I was in London, UK at the Python core dev sprints, I got a message from a user named "spacedrop" on Keybase. The message said I was being given 356.2904939 XLM as a surprise gift of "free Lumens worth $20.98 USD" from the Stellar Development Foundation. All of that screamed "cryptocurrency", which isn't my thing, so my initial reaction was that this was some scam by someone who randomly messaged me on Keybase trying to get me to buy into some new cryptocurrency. But then I realized that Keybase wouldn't let a random person message me like that. Curious, I read the rest of the message and found a link to Keybase's "airdrop" announcement, which explained that Keybase was actually facilitating the message. Trusting that Keybase wasn't getting into anything nefarious, I was enticed enough to dig a little deeper into Stellar and find their overview page, which has the following summary:

Stellar is a multi-currency payment backend that tens of thousands of people use every day. It’s decentralized, open-source, and developer-friendly, so anyone can issue assets, settle payments, and trade. Stellar is a blockchain, but it works more like cash—Stellar is much faster and cheaper than bitcoin, for example. And it uses far less electricity.

Okay, that sounds nice. But when I poked around the website and found a code of conduct and a roadmap that both seemed reasonable, I decided to dive into Stellar, and I came out thinking that it's actually a rather cool piece of technology for people to track "what they own ... and what they want to do with what they own".

So this blog post is basically me writing down what I learned about Stellar and why I found it interesting from the perspective of trying to find a cheap, secure way to send remittance to the United States from Canada (which, spoiler alert, Stellar can't do for me yet, but the technology is there if someone would let me get CAD on to the Stellar network).

What is Stellar for?

I will go into more detail later, but to help motivate reading the rest of the blog post, I want to quickly outline what Stellar is. Basically it's a public ledger that tracks ownership of assets. Those assets do not need to be inherent to Stellar, and in fact a key part of Stellar is that 3rd-parties can provide their own assets to have managed on the network.

You can also trade assets on the network. Stellar lets you put out buy and sell orders on the network and the network will exchange these orders with assets as necessary at the best price for you. This is just like a stock market with buy and sell orders, but instead of stock certificates it's assets on the Stellar network. But one extra twist is that since Stellar lets anyone put assets on to the network, the network will do up to 6 different exchanges to try to get you the best value for your assets. For instance, if you're trying to buy spam with bacon, but people are only selling bacon for eggs and buying spam for eggs, the network will do the bacon -> eggs -> spam trade for you to get you the best result.

Now substitute "bacon" for "CAD" and "USD" for "spam" and you start to see how Stellar might be really handy for exchanging money around the world.
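To make that multi-hop exchange concrete, here is a toy sketch of picking the best conversion path over a set of offers. The assets, rates, and the search itself are made up for illustration; real Stellar path payments work over live order books rather than fixed rates, though the network does cap a path at 6 hops:

```python
# Hypothetical offers: (sell_asset, buy_asset) -> rate, i.e. how much
# buy_asset one unit of sell_asset fetches. These pairs and rates are
# invented to show why a multi-hop path can beat a direct trade.
OFFERS = {
    ("bacon", "eggs"): 2.0,   # 1 bacon buys 2 eggs
    ("eggs", "spam"): 0.5,    # 1 egg buys 0.5 spam
    ("bacon", "spam"): 0.8,   # direct, but a worse deal
}

def best_conversion(source, target, amount, max_hops=6):
    """Depth-first search for the path from source to target that
    yields the most target asset, using at most max_hops exchanges
    (Stellar caps path payments at 6 hops)."""
    best_amount, best_path = 0.0, []
    stack = [(source, amount, [source])]
    while stack:
        asset, amt, path = stack.pop()
        if asset == target:
            if amt > best_amount:
                best_amount, best_path = amt, path
            continue
        if len(path) > max_hops:
            continue
        for (sell, buy), rate in OFFERS.items():
            if sell == asset and buy not in path:
                stack.append((buy, amt * rate, path + [buy]))
    return best_amount, best_path

amount, path = best_conversion("bacon", "spam", 1.0)
print(path, amount)  # ['bacon', 'eggs', 'spam'] 1.0 -- beats the direct 0.8
```

The indirect route yields 1.0 spam per bacon versus 0.8 for the direct offer, which is exactly the kind of result the network's pathfinding is after.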

Lumens (yes, there's a digital currency)

To start discussing Stellar you need to know about lumens (or XLM for short). There have been 100,000,000,000 lumens on the network since it went live, so there's no mining new ones like with Bitcoin. The smallest unit of lumens is called a stroop, and it's 0.0000001 of a lumen (they are named after stroopwafels, which my wife and I like, and stroopwafels are Dutch, which just makes the Pythonista in me smile 😊).

Now when I read that Stellar had lumens, I 🙄, thinking this was yet another cryptocurrency that people just speculate with (which some people do), but when I began to read about what lumens are used for, I realized they're actually an anti-spam mechanism and a baseline asset to exchange for other things, more than a play to make money from lumens themselves.

Accounts

Accounts on Stellar are a public key and a private seed. Nothing crazy, but also nothing terribly difficult to calculate either. So how does Stellar prevent people from creating a ton of accounts to spam the network?

By having a minimum account balance required to even create an account. Since lumens are the original asset on Stellar, they are what you need to open an account (and keep it open). As of today it's 1 XLM, which is about $0.07275 USD as I write this. In other words, it is not a financial hardship to require having a single lumen to open an account, but it also won't lead to everyone creating 1 billion accounts on their own either.

Trading

So now that you have your account, how do you do something as simple as send or receive an asset? Once again, lumens are used as an anti-spam mechanism for trading.

Every change you want to make to the Stellar network is an operation. All the operations you want to perform as a single unit make up a transaction (just like with databases). All the transactions that get resolved end up in a new version of the ledger, which tracks the state of the network at that point in time.

Each transaction costs at least the base fee of 100 stroops per operation contained in that transaction. That way you can't flood the network with operations without having to at least pay a little bit for it.

And what exactly are you paying for? Well, there's a limit to how many new operations can occur on the Stellar network per ledger update. Protocol 11 made it so the network votes on what the maximum number of operations per ledger should be, and as of right now it's sitting at 1000 operations/ledger (if you look at any ledger like ledger 25923589 you will see max_tx_set_size and that shows the network's current operations/ledger rate). Even with ledgers resolving every 5 seconds, that still means there's limited capacity if the network gets backed up (i.e. it's about 200 operations/second). In those instances where there's not enough capacity there's surge pricing.

You specify the maximum base fee you're willing to pay when you create a transaction. An auction is held where your maximum base fee is offered to fund resolving your transaction. In the end, though, you end up paying only what was required for you to get your transaction resolved (e.g. you might offer to pay a total of 1000 stroops as a maximum base fee for your one operation, but if all it took was 150 stroops for your transaction to get resolved during surge pricing then that's all you end up paying).

So you're paying to prevent spam, and you're paying to potentially prioritize your transaction in case the network is backed up. Currently the network is not at capacity, so worrying about surge pricing isn't a big deal, but even if it did increase we're talking about minuscule amounts of XLM. With the price for 1 XLM that I quoted above, 100 stroops is $0.0000007275 USD, so even if you had to go up by several orders of magnitude to get your operation resolved it wouldn't exactly be expensive.
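As a sanity check on those numbers, the arithmetic can be written out. The constants below just restate the figures from the text (100 stroops base fee per operation, 1 stroop = 0.0000001 XLM, and the roughly $0.07275 USD price quoted above, which obviously changes over time):

```python
STROOP_IN_XLM = 0.0000001   # 1 stroop = 1e-7 XLM
BASE_FEE_STROOPS = 100      # minimum fee per operation
XLM_IN_USD = 0.07275        # price quoted above; it changes constantly

def min_fee_usd(num_operations, fee_stroops=BASE_FEE_STROOPS):
    """Minimum USD cost of a transaction with num_operations operations."""
    fee_xlm = num_operations * fee_stroops * STROOP_IN_XLM
    return fee_xlm * XLM_IN_USD

print(min_fee_usd(1))                       # well under a millionth of a dollar
print(min_fee_usd(1, fee_stroops=100_000))  # still cheap at 1000x surge pricing
```

Even at a thousand times the base fee, a single operation costs a fraction of a tenth of a cent.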

Inflation

One other thing that helps hit home the fact that lumens are more of an anti-spam mechanism and baseline asset everyone can agree on than an investment vehicle is that the network has built-in inflation. The network automatically distributes 1% worth of lumens compared to the amount in circulation annually. The network also gives back all transaction fees that were collected. This is done to discourage speculation, since it keeps lumens from becoming a scarce commodity you want to hoard as the value will systematically go down over time.

The way inflation and transaction fees are disbursed is via voting for an account to receive those funds which gets its proportion based on the amount of votes it got (which equates to the amount of lumens that voting account holds). Since an account must hold the votes of accounts in total of at least 0.05% of all lumens in circulation, people typically join an inflation pool. A popular one is https://pool.lumenaut.net/ which redistributes the lumens that the pool acquires back to those who voted for it in proportion to the amount of votes/lumens the account gave to the pool.
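The proportional split works out as follows; the account names and balances here are made up, and a real pool like Lumenaut does its accounting against actual on-chain votes:

```python
def distribute(total_payout, votes):
    """Split an inflation payout among pool members in proportion to
    the lumens each account voted with. Accounts and balances are
    invented for illustration."""
    total_votes = sum(votes.values())
    return {account: total_payout * amount / total_votes
            for account, amount in votes.items()}

payouts = distribute(1000.0, {"alice": 600, "bob": 300, "carol": 100})
print(payouts)  # {'alice': 600.0, 'bob': 300.0, 'carol': 100.0}
```

Each member gets back exactly their share of the pool's winnings, which is why joining one is essentially free lumens.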

Basically there's no reason not to join an inflation pool. It's free lumens and it's easy to do. If you are on Keybase and were part of the airdrop, make sure to go into your lumens wallet and opt into one of the inflation pools as it was not done for you automatically.

Anchors (or what makes Stellar interesting)

So up to this point you're probably wondering how the heck remittance from Canada to the United States might work if everything is being done in lumens and I said they are not meant to act as investment vehicles. The answer to that is assets and anchors.

Basically, anchors join the network and offer tokens which represent assets that the anchor holds. The anchor can then send those tokens to other accounts on the Stellar network, expressing the fact that an account owns those tokens representing that asset. While lumens are the asset we have talked about up until now, anything can be an asset on the network.

Let's say I run a bank and it acts as an anchor on the Stellar network that will generate tokens representing CAD. What that would mean is customers could withdraw CAD cash from their bank accounts and exchange them for CAD tokens on Stellar. My bank would hold the physical CAD in escrow to back the tokens in circulation. This allows people to then exchange their CAD tokens for real/fiat currency at my bank by sending the tokens to their account, whereby my bank would destroy the token so there isn't double-counting of the money in the world.
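The escrow bookkeeping in that bank example can be sketched as a toy model. This captures only the 1:1 backing idea; a real Stellar anchor is an account issuing assets on the network, not a Python object:

```python
class ToyAnchor:
    """Toy model of an anchor: tokens in circulation are always backed
    1:1 by cash held in escrow. Purely illustrative."""

    def __init__(self, asset_code):
        self.asset_code = asset_code
        self.escrow = 0.0    # cash the anchor holds to back its tokens
        self.balances = {}   # account -> tokens circulating on the network

    def issue(self, account, cash_amount):
        """Customer hands over cash; the anchor emits matching tokens."""
        self.escrow += cash_amount
        self.balances[account] = self.balances.get(account, 0.0) + cash_amount

    def redeem(self, account, token_amount):
        """Customer returns tokens; the anchor destroys them and pays
        out cash, so the money is never double-counted."""
        if self.balances.get(account, 0.0) < token_amount:
            raise ValueError("not enough tokens to redeem")
        self.balances[account] -= token_amount
        self.escrow -= token_amount
        return token_amount

bank = ToyAnchor("CAD")
bank.issue("me", 100.0)
cash = bank.redeem("me", 40.0)
print(bank.escrow, bank.balances["me"], cash)  # 60.0 60.0 40.0
```

Note the invariant: escrow always equals the tokens still in circulation, which is what makes the tokens trustworthy stand-ins for the cash.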

To take this bank analogy a little farther, think of physical cash as tokens, your wallet as your Stellar account, and the world of Canadian money as the Stellar network. When you withdraw money from the ATM, you are exchanging money in your bank account for a different format; in this case it's physical cash. You can then transact with it at stores, etc. And then eventually that physical cash comes back out of circulation when you deposit money into your bank account and becomes bits in some bank database.

And this is how anchors that back fiat currency work. For instance, AnchorUSD takes money in USD from you and then converts it 1:1 into a token on Stellar for you to send to whomever. It also lets you receive those USD tokens and then convert them back into USD money by destroying the token. Basically, it's a gateway between USD money and Stellar. This is also where lumens come in, acting as a baseline asset everyone accepts and understands. That way you can transact in and out of XLM as necessary and still end up with what you want. In other words, you can think of lumens as an intermediary asset that everyone understands. This also makes the value of lumens not critical if you do your end-to-end transaction at once, as lumens are then just a temporary part of the transaction (if they're even necessary as an intermediary asset).

In order to prevent people from trusting any random anchor, Stellar has the concept of trustlines. Basically it's a way to say on the network, "we both agree that this token represents what the anchor says it does". That way you enter into an agreement with the anchor to avoid getting ripped off.
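The idea can be sketched as an explicit opt-in recorded per account. This is a hypothetical simplification for illustration, not the real Stellar ledger structure:

```python
# Hypothetical registry of trustlines: each entry means "this account
# agrees that tokens of asset_code from this issuer are what the
# issuer says they are".
trustlines = set()

def add_trustline(account, issuer, asset_code):
    trustlines.add((account, issuer, asset_code))

def can_receive(account, issuer, asset_code):
    """An account can only hold tokens it has explicitly opted into."""
    return (account, issuer, asset_code) in trustlines

add_trustline("alice", "my_bank", "CAD")
print(can_receive("alice", "my_bank", "CAD"))        # True
print(can_receive("alice", "random_anchor", "CAD"))  # False
```

A payment in a token the recipient never opted into simply can't land, which is what keeps random anchors from pushing worthless tokens at you.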

Why this interests me

When I realized that the Stellar network was set up so that it would be feasible for a CAD equivalent of AnchorUSD to exist, such that I could send family in the United States actual USD that they could deposit into their bank accounts from CAD money in my bank account, that got me excited. Typically I use TransferWise (if you choose to sign up to TransferWise, this is a referral link that gets me and the first couple of people who use it some money), but it takes a few days for the money to arrive, and to get the cheapest fee with the fastest result I have to jump through some hoops by letting them log into my bank account to check I actually have the funds, which has always bugged me from a security perspective.

Add on to that the fact that PayPal is about the only solution I know of for sending small amounts of money internationally – which happens regularly to me when I'm at a conference outside of Canada and the restaurant won't split the cheque – and you start to wonder why there aren't more potential solutions out there for sending money internationally in a fast, cheap manner.

And apparently I'm not the only one who thinks this: IBM has a service called World Wire built on the Stellar network specifically for moving money quickly and cheaply between banks. So now I'm just waiting for someone to set up a CAD-based anchor which acts as an Interac e-Transfer bridge between my Stellar account and my Canadian chequing account so I can send money to the United States cheaply and easily.

Interesting Links
  • Stellar - an open network for money: Stellar lets you hold, send, and swap digital versions of everyday currencies. Dozens of financial institutions issue assets and settle payments on Stellar.
  • Stellar Dashboard
  • Keybase Stellar Space Drop, 2 Billion Lumens for the World: Keybase is for keeping everyone’s chats and files safe, from families to communities to companies. If you were a Keybase user before September 9, 2019 you might as well go and collect your lumens and sign up to get the monthly disbursement they are going to be doing for the next 20 months, and remember to sign up for an inflation pool.
  • Lumenaut Pool: 100% payout, no fees. By the community, for the community. If you're looking for an inflation pool.
  • StellarX: A user-friendly, peer-to-peer marketplace which makes all trades free since Stellar itself handles all buy/sell orders.

https://www.coinbase.com/earn/stellar/

(If you want to watch some videos from Coinbase and get some free lumens if you have a Coinbase account. If you don't have a Coinbase account and you want to sign up for one, you can use this referral link which will give me some extra lumens.)

Stellar News, Education, and Insights - Lumenauts.com: your source for Stellar news, educational resources, and insights. A site with a few YouTube videos covering similar things to what the Coinbase videos do, but without having to sign up for anything.

My Keybase Stellar account is:

GDZE4QWQLXHR6SWPOZ2ZUF6JWS3GBKEYQJDKWPZSD7JQTE2XP5GPS4KW

if you want a place to send XLM in order to experience that (I don't need more lumens so please don't view this as me begging for more, it's just that I didn't know anyone with a Stellar account when I started this so all I could do was just stare at my Keybase wallet until I got my free XLM from Coinbase and sent the assets to my own Keybase account).

Categories: FLOSS Project Planets

Python Engineering at Microsoft: Come meet Microsoft at DjangoCon 2019

Fri, 2019-09-20 20:00

DjangoCon 2019 is happening next week Sept 22-27 in San Diego, and Microsoft is pleased to be supporting the conference as Gold sponsors this year. We will have some members from our Python team, our Azure Cloud Advocates, and the PostgreSQL team at the conference giving talks and at our booth.

Be sure to check out the talks from our team during the regular conference, all on Tuesday Sept 24th:

After the conference we’ll update this post with links to slides and content for the talks, so be sure to check back here!

If you are at the conference, come by our booth to meet the team and ask us your questions about Python, Django, and PostgreSQL in VS Code and Azure. If you cannot make it, or want more information after the conference, here are some helpful resources that cover a subset of what we’ll be sharing:

 

The post Come meet Microsoft at DjangoCon 2019 appeared first on Python.

Categories: FLOSS Project Planets

Roberto Alsina: Episodio 9: Generadores

Fri, 2019-09-20 13:00

Generators in Python... what are they? What are they good for?

Categories: FLOSS Project Planets

PyCharm: 2019.3 EAP 2

Fri, 2019-09-20 07:49

We have a new Early Access Program (EAP) version of PyCharm that can be now downloaded from our website.

New in the EAP
  • Support was added for namespaces coming from packages. PyCharm now recognizes namespaces defined by other packages.
  • Fixed an issue with the new version of TensorFlow that was preventing PyCharm from recognizing it properly as a library.
  • Annotations were improved so that they won't show unnecessary error messages for string literals.
  • We fixed the position of the caret when input is requested on the console. Previously it was shown at the beginning of the input call; now it is properly positioned at the end, right next to where the input is about to be entered.
  • The autocomplete for methods inside classes now includes the ‘self’ parameter if smart enter is used.
Further Improvements
  • The code inspection warning reprioritized the list of actions, giving a higher priority to quick fixes rather than run/debug actions.
  • Option to customize scrollbar visibility from the appearance settings.
  • And much more, check out the release notes for further details.
Interested?

Get the latest EAP build from our website. Alternatively, you can use the JetBrains Toolbox App to stay up to date throughout the entire EAP.

EAP Program Key Facts
  • The EAP version of PyCharm Professional Version is free to use
  • EAP build will expire after 30 days
  • This is pre-release software, you may face stability issues and other rough edges
  • You can install the EAP version alongside a stable version of PyCharm
  • EAP versions of PyCharm report statistics by default, you can opt out by changing the settings in Preferences | Appearance & Behavior | System Settings | Data Sharing
  • There’s an EAP version of the documentation as well
Categories: FLOSS Project Planets

Python Circle: Intellectual property Law and Coding

Fri, 2019-09-20 06:45
IP laws and coding, patenting code, code copyrights, intellectual property law and code
Categories: FLOSS Project Planets

Codementor: Simple rules of good programming

Fri, 2019-09-20 06:13
Hi guys, I have worked as a programmer for more than 15 years and have used many different languages, paradigms, frameworks and other shit. And I want to share with you my rules of writing good...
Categories: FLOSS Project Planets

Codementor: Create a simple image search engine in OpenCV and Flask

Fri, 2019-09-20 00:13
Learn how to use OpenCV to extract image colors and then use Flask based web apps to search them.
Categories: FLOSS Project Planets

Peter Bengtsson: uwsgi weirdness with --http

Thu, 2019-09-19 09:20

Instead of upgrading everything on my server, I'm just starting from scratch: from Ubuntu 16.04 to Ubuntu 19.04, and I also upgraded everything else in sight. One of those things was uwsgi. I copied various user config files, but for uwsgi things didn't go very well. On the old server I had uwsgi version 2.0.12-debian and on the new one 2.0.18-debian. The uWSGI changelog is pretty hard to read, but I sure don't see any mention of this.

You see, on SongSearch I have it so that Nginx talks to Django via a uWSGI socket. But the NodeJS server talks to Django via 127.0.0.1:PORT. So I need my uWSGI config to start both. Here was the old config:

[uwsgi]
plugins = python35
virtualenv = /var/lib/django/songsearch/venv
pythonpath = /var/lib/django/songsearch
user = django
uid = django
master = true
processes = 3
enable-threads = true
touch-reload = /var/lib/django/songsearch/uwsgi-reload.touch
http = 127.0.0.1:9090
module = songsearch.wsgi:application
env = LANG=en_US.utf8
env = LC_ALL=en_US.UTF-8
env = LC_LANG=en_US.UTF-8

(The only difference on the new server was the python37 plugin instead)

I start it and everything looks fine. No errors in the log files. And netstat looks like this:

# netstat -ntpl | grep 9090
tcp        0      0 127.0.0.1:9090          0.0.0.0:*               LISTEN      1855/uwsgi

But every time I tried to curl localhost:9090 I kept getting curl: (52) Empty reply from server. Nothing in the log files! It seemed no matter what I tried I just couldn't talk to it over HTTP. No, I'm not a sysadmin. I'm just a hobbyist trying to stand up my little server with the tools and limited techniques I know, but I was stumped.

The solution

After endless Googling for a resolution and trying all sorts of uwsgi commands directly, I somehow stumbled on the solution.

[uwsgi]
plugins = python35
virtualenv = /var/lib/django/songsearch/venv
pythonpath = /var/lib/django/songsearch
user = django
uid = django
master = true
processes = 3
enable-threads = true
touch-reload = /var/lib/django/songsearch/uwsgi-reload.touch
-http = 127.0.0.1:9090
+http-socket = 127.0.0.1:9090
module = songsearch.wsgi:application
env = LANG=en_US.utf8
env = LC_ALL=en_US.UTF-8
env = LC_LANG=en_US.UTF-8

With this one subtle change, I can now curl localhost:9090 and I still have the /var/run/uwsgi/app/songsearch/socket socket. So, yay!

I'm blogging about this in case someone else ever gets stuck in the same nasty surprise as me.

Also, I have to admit, I was fuming with rage from this frustration. It's really inspired me to revive the quest for an alternative to uwsgi because I'm not sure it's that great anymore. There are new alternatives such as gunicorn, gunicorn with Meinheld, bjoern etc.

Categories: FLOSS Project Planets

Will Kahn-Greene: Markus v2.0.0 released! Better metrics API for Python projects.

Thu, 2019-09-19 09:00
What is it?

Markus is a Python library for generating metrics.

Markus makes it easier to generate metrics in your program by:

  • providing multiple backends (Datadog statsd, statsd, logging, logging roll-up, and so on) for sending metrics data to different places
  • sending metrics to multiple backends at the same time
  • providing a testing framework for easy metrics generation testing
  • providing a decoupled architecture making it easier to write code to generate metrics without having to worry about making sure creating and configuring a metrics client has been done--similar to the Python logging module in this way

We use it at Mozilla on many projects.
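That logging-style decoupling (modules grab a named metrics client up front, configuration happens separately at startup) can be sketched in plain Python. This is a simplified illustration of the pattern, not Markus's actual internals, though the real library exposes similarly named configure() and get_metrics() functions:

```python
# Simplified sketch of the decoupled-metrics pattern: a module can
# create its metrics client before any backend exists, just like
# logging.getLogger() works before logging.basicConfig().

_backends = []

def configure(backends):
    """Install backends once at app startup."""
    _backends.clear()
    _backends.extend(backends)

class Metrics:
    def __init__(self, name):
        self.name = name

    def incr(self, key, value=1):
        # Every emit goes to all configured backends.
        for backend in _backends:
            backend.emit(self.name + "." + key, value)

def get_metrics(name):
    """Modules call this at import time, before configure() runs."""
    return Metrics(name)

class ListBackend:
    """A toy backend that just records what was emitted."""
    def __init__(self):
        self.records = []

    def emit(self, key, value):
        self.records.append((key, value))

# Module-level client, created before any configuration exists:
metrics = get_metrics("myapp.views")

backend = ListBackend()
configure([backend])
metrics.incr("pageview")
print(backend.records)  # [('myapp.views.pageview', 1)]
```

Because the client holds only a name and looks up backends at emit time, generating metrics never has to worry about whether configuration has happened yet, which is the property the bullet list above describes.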

v2.0.0 released!

I released v2.0.0 just now. Changes:

Features

  • Use time.perf_counter() if available. Thank you, Mike! (#34)
  • Support Python 3.7 officially.
  • Add filters for adjusting and dropping metrics getting emitted. See documentation for more details. (#40)

Backwards incompatible changes

  • tags now defaults to [] instead of None which may affect some expected test output.

  • Adjust internals to run .emit() on backends. If you wrote your own backend, you may need to adjust it.

  • Drop support for Python 3.4. (#39)

  • Drop support for Python 2.7.

    If you're still using Python 2.7, you'll need to pin to <2.0.0. (#42)

Bug fixes

  • Document feature support in backends. (#47)
  • Fix MetricsMock.has_record() example. Thank you, John!
Where to go for more

Changes for this release: https://markus.readthedocs.io/en/latest/history.html#september-19th-2019

Documentation and quickstart here: https://markus.readthedocs.io/en/latest/index.html

Source code and issue tracker here: https://github.com/willkg/markus

Let me know whether this helps you!

Categories: FLOSS Project Planets

Stack Abuse: Solving Sequence Problems with LSTM in Keras: Part 2

Thu, 2019-09-19 08:56

This is the second and final part of a two-part series of articles on solving sequence problems with LSTMs. In part 1 of the series, I explained how to solve one-to-one and many-to-one sequence problems using LSTM. In this part, you will see how to solve one-to-many and many-to-many sequence problems via LSTM in Keras.

Image captioning is a classic example of one-to-many sequence problems where you have a single image as input and you have to predict the image description in the form of a word sequence. Similarly, stock market prediction for the next X days, where input is the stock price of the previous Y days, is a classic example of many-to-many sequence problems.

In this article you will see very basic examples of one-to-many and many-to-many problems. However, the concepts learned in this article will lay the foundation for solving advanced sequence problems, such as stock price prediction and automated image captioning that we will see in the upcoming articles.

One-to-Many Sequence Problems

One-to-many sequence problems are the type of sequence problems where input data has one time-step and the output contains a vector of multiple values or multiple time-steps. In this section, we will see how to solve one-to-many sequence problems where the input has a single feature. We will then move on to see how to work with multiple features input to solve one-to-many sequence problems.

One-to-Many Sequence Problems with a Single Feature

Let's first create a dataset and understand the problem that we are going to solve in this section.

Creating the Dataset

The following script imports the required libraries:

from numpy import array
from keras.preprocessing.text import one_hot
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers.core import Activation, Dropout, Dense
from keras.layers import Flatten, LSTM
from keras.layers import GlobalMaxPooling1D
from keras.models import Model
from keras.layers.embeddings import Embedding
from sklearn.model_selection import train_test_split
from keras.preprocessing.text import Tokenizer
from keras.layers import Input
from keras.layers.merge import Concatenate
from keras.layers import Bidirectional
import pandas as pd
import numpy as np
import re
import matplotlib.pyplot as plt

And the following script creates the dataset:

X = list()
Y = list()
X = [x+3 for x in range(-2, 43, 3)]

for i in X:
    output_vector = list()
    output_vector.append(i+1)
    output_vector.append(i+2)
    Y.append(output_vector)

print(X)
print(Y)

Here is the output:

[1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43]
[[2, 3], [5, 6], [8, 9], [11, 12], [14, 15], [17, 18], [20, 21], [23, 24], [26, 27], [29, 30], [32, 33], [35, 36], [38, 39], [41, 42], [44, 45]]

Our input contains 15 samples with one time-step and one feature value. For each value in the input sample, the corresponding output vector contains the next two integers. For instance, if the input is 4, the output vector will contain values 5 and 6. Hence, the problem is a simple one-to-many sequence problem.

The following script reshapes our data as required by the LSTM:

X = np.array(X).reshape(15, 1, 1)
Y = np.array(Y)

We can now train our models. We will train simple and stacked LSTMs.

Solution via Simple LSTM

model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(1, 1)))
model.add(Dense(2))
model.compile(optimizer='adam', loss='mse')
model.fit(X, Y, epochs=1000, validation_split=0.2, batch_size=3)

Once the model is trained we can make predictions on the test data:

test_input = array([10])
test_input = test_input.reshape((1, 1, 1))
test_output = model.predict(test_input, verbose=0)
print(test_output)

The test data contains a value 10. In the output, we should get a vector containing 11 and 12. The output I received is [10.982891 12.109697] which is actually very close to the expected output.

Solution via Stacked LSTM

The following script trains stacked LSTMs on our data and makes prediction on the test points:

model = Sequential()
model.add(LSTM(50, activation='relu', return_sequences=True, input_shape=(1, 1)))
model.add(LSTM(50, activation='relu'))
model.add(Dense(2))
model.compile(optimizer='adam', loss='mse')
history = model.fit(X, Y, epochs=1000, validation_split=0.2, verbose=1, batch_size=3)

test_output = model.predict(test_input, verbose=0)
print(test_output)

The answer is [11.00432 11.99205] which is very close to the actual output.

Solution via Bidirectional LSTM

The following script trains a bidirectional LSTM on our data and then makes a prediction on the test set.

from keras.layers import Bidirectional

model = Sequential()
model.add(Bidirectional(LSTM(50, activation='relu'), input_shape=(1, 1)))
model.add(Dense(2))
model.compile(optimizer='adam', loss='mse')
history = model.fit(X, Y, epochs=1000, validation_split=0.2, verbose=1, batch_size=3)

test_output = model.predict(test_input, verbose=0)
print(test_output)

The output I received is [11.035181 12.082813].

One-to-Many Sequence Problems with Multiple Features

In this section we will see one-to-many sequence problems where input samples will have one time-step, but two features. The output will be a vector of two elements.

Creating the Dataset

As always, the first step is to create the dataset:

nums = 25

X1 = list()
X2 = list()
X = list()
Y = list()

X1 = [(x+1)*2 for x in range(25)]
X2 = [(x+1)*3 for x in range(25)]

for x1, x2 in zip(X1, X2):
    output_vector = list()
    output_vector.append(x1+1)
    output_vector.append(x2+1)
    Y.append(output_vector)

X = np.column_stack((X1, X2))
print(X)

Our input dataset looks like this:

[[ 2  3]
 [ 4  6]
 [ 6  9]
 [ 8 12]
 [10 15]
 [12 18]
 [14 21]
 [16 24]
 [18 27]
 [20 30]
 [22 33]
 [24 36]
 [26 39]
 [28 42]
 [30 45]
 [32 48]
 [34 51]
 [36 54]
 [38 57]
 [40 60]
 [42 63]
 [44 66]
 [46 69]
 [48 72]
 [50 75]]

You can see each input time-step consists of two features. The output will be a vector which contains the next two elements that correspond to the two features in the time-step of the input sample. For instance, for the input sample [2, 3], the output will be [3, 4], and so on.

Let's reshape our data:

X = np.array(X).reshape(25, 1, 2)
Y = np.array(Y)

Solution via Simple LSTM

model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(1, 2)))
model.add(Dense(2))
model.compile(optimizer='adam', loss='mse')
model.fit(X, Y, epochs=1000, validation_split=0.2, batch_size=3)

Let's now create our test point and see how well our algorithm performs:

test_input = array([40, 60])
test_input = test_input.reshape((1, 1, 2))
test_output = model.predict(test_input, verbose=0)
print(test_output)

The input is [40, 60], the output should be [41, 61]. The output predicted by our simple LSTM is [40.946873 60.941723] which is very close to the expected output.

Solution via Stacked LSTM

model = Sequential()
model.add(LSTM(50, activation='relu', return_sequences=True, input_shape=(1, 2)))
model.add(LSTM(50, activation='relu'))
model.add(Dense(2))
model.compile(optimizer='adam', loss='mse')
history = model.fit(X, Y, epochs=1000, validation_split=0.2, verbose=1, batch_size=3)

test_input = array([40, 60])
test_input = test_input.reshape((1, 1, 2))
test_output = model.predict(test_input, verbose=0)
print(test_output)

The output in this case is: [40.978477 60.994644]

Solution via Bidirectional LSTM

from keras.layers import Bidirectional

model = Sequential()
model.add(Bidirectional(LSTM(50, activation='relu'), input_shape=(1, 2)))
model.add(Dense(2))
model.compile(optimizer='adam', loss='mse')
history = model.fit(X, Y, epochs=1000, validation_split=0.2, verbose=1, batch_size=3)

test_output = model.predict(test_input, verbose=0)
print(test_output)

The output obtained is: [41.0975 61.159065]

Many-to-Many Sequence Problems

In one-to-many and many-to-one sequence problems, we saw that the output vector can contain multiple values. Depending upon the problem, an output vector containing multiple values can be considered as having single (since the output contains one time-step data in strict terms) or multiple (since one vector contains multiple values) outputs.

However, in some sequence problems, we want multiple outputs divided over time-steps. In other words, for each time-step in the input, we want a corresponding time-step in the output. Such models can be used to solve many-to-many sequence problems with variable lengths.

Encoder-Decoder Model

To solve such sequence problems, the encoder-decoder model was designed. The encoder-decoder model is basically a fancy name for a neural network architecture with two LSTM layers.

The first layer works as an encoder layer and encodes the input sequence. The decoder is also an LSTM layer, which accepts three inputs: the encoded sequence from the encoder LSTM, the previous hidden state, and the current input. During training, the actual output at each time-step is used to train the encoder-decoder model. While making predictions, the encoder output, the current hidden state, and the previous output are used as input to make a prediction at each time-step. These concepts will become clearer when you see them in action in an upcoming section.

Many-to-Many Sequence Problems with Single Feature

In this section we will solve many-to-many sequence problems via the encoder-decoder model, where each time-step in the input sample will contain one feature.

Let's first create our dataset.

Creating the Dataset

X = list()
Y = list()

X = [x for x in range(5, 301, 5)]
Y = [y for y in range(20, 316, 5)]

X = np.array(X).reshape(20, 3, 1)
Y = np.array(Y).reshape(20, 3, 1)

The input X contains 20 samples where each sample contains 3 time-steps with one feature. One input sample looks like this:

[[[ 5]
  [10]
  [15]]

You can see that the input sample contains 3 values that are basically 3 consecutive multiples of 5. The corresponding output sequence for the above input sample is as follows:

[[[20]
  [25]
  [30]]

The output contains the next three consecutive multiples of 5. You can see the output in this case is different from what we have seen in the previous sections. For the encoder-decoder model, the output should also be converted into a 3D format containing the number of samples, time-steps, and features. This is because the decoder generates an output per time-step.

We have created our dataset; the next step is to train our models. We will train stacked LSTM and bidirectional LSTM models in the following sections.

Solution via Stacked LSTM

The following script creates the encoder-decoder model using stacked LSTMs:

from keras.layers import RepeatVector
from keras.layers import TimeDistributed

model = Sequential()

# encoder layer
model.add(LSTM(100, activation='relu', input_shape=(3, 1)))

# repeat vector
model.add(RepeatVector(3))

# decoder layer
model.add(LSTM(100, activation='relu', return_sequences=True))

model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
print(model.summary())

In the above script, the first LSTM layer is the encoder layer.

Next, we have added the repeat vector to our model. The repeat vector takes the output from the encoder and feeds it repeatedly as input at each time-step to the decoder. For instance, in the output we have three time-steps. To predict each output time-step, the decoder will use the value from the repeat vector, the hidden state from the previous output, and the current input.
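What the repeat vector does can be sketched in plain Python (a conceptual illustration only, not Keras internals):

```python
# Conceptual sketch of RepeatVector(3): the single encoder output vector
# is handed to the decoder unchanged at each of the 3 output time-steps.
encoded = [0.1, 0.2, 0.3]   # pretend encoder output (3 units)
repeated = [encoded] * 3    # one copy per output time-step

assert len(repeated) == 3
assert all(step == encoded for step in repeated)
```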

Next we have a decoder layer. Since the output is in the form of time-steps, which is a 3D format, return_sequences for the decoder LSTM has been set to True. The TimeDistributed layer is used to individually predict the output for each time-step.

The model summary for the encoder-decoder model created in the script above is as follows:

Layer (type)                 Output Shape              Param #
=================================================================
lstm_40 (LSTM)               (None, 100)               40800
_________________________________________________________________
repeat_vector_7 (RepeatVecto (None, 3, 100)            0
_________________________________________________________________
lstm_41 (LSTM)               (None, 3, 100)            80400
_________________________________________________________________
time_distributed_7 (TimeDist (None, 3, 1)              101
=================================================================
Total params: 121,301
Trainable params: 121,301
Non-trainable params: 0

You can see that the repeat vector only repeats the encoder output and has no parameters to train.
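As an aside (not part of the original article), the parameter counts in the summary can be verified by hand: an LSTM layer holds four weight sets, one per gate, each with input weights, recurrent weights, and a bias, giving 4 × units × (units + input_features + 1) parameters:

```python
# Sanity-check the Keras model summary above by hand.
def lstm_params(units, input_features):
    # 4 gates, each with input weights, recurrent weights, and a bias
    return 4 * units * (units + input_features + 1)

assert lstm_params(100, 1) == 40800      # encoder LSTM (1 input feature)
assert lstm_params(100, 100) == 80400    # decoder LSTM (fed the 100-unit vector)
dense_params = 100 * 1 + 1               # TimeDistributed Dense(1): weights + bias
assert dense_params == 101
assert 40800 + 80400 + 101 == 121301     # matches "Total params: 121,301"
```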

The following script trains the above encoder-decoder model.

history = model.fit(X, Y, epochs=1000, validation_split=0.2, verbose=1, batch_size=3)

Let's create a test-point and see if our encoder-decoder model is able to predict the multi-step output. Execute the following script:

test_input = array([300, 305, 310])
test_input = test_input.reshape((1, 3, 1))
test_output = model.predict(test_input, verbose=0)
print(test_output)

Our input sequence contains three time-step values: 300, 305, and 310. The output should be the next three multiples of 5, i.e. 315, 320, and 325. I received the following output:

[[[316.02878]
  [322.27145]
  [328.5536 ]]]

You can see that the output is in 3D format.

Solution via Bidirectional LSTM

Let's now create an encoder-decoder model with bidirectional LSTMs and see if we can get better results:

from keras.layers import RepeatVector
from keras.layers import TimeDistributed

model = Sequential()
model.add(Bidirectional(LSTM(100, activation='relu', input_shape=(3, 1))))
model.add(RepeatVector(3))
model.add(Bidirectional(LSTM(100, activation='relu', return_sequences=True)))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
history = model.fit(X, Y, epochs=1000, validation_split=0.2, verbose=1, batch_size=3)

The above script trains the encoder-decoder model via bidirectional LSTM. Let's now make predictions on the test point i.e. [300, 305, 310].

test_output = model.predict(test_input, verbose=0) print(test_output)

Here is the output:

[[[315.7526 ]
  [321.47153]
  [327.94025]]]

The output I got via bidirectional LSTMs is better than what I got via the simple stacked LSTM-based encoder-decoder model.

Many-to-Many Sequence Problems with Multiple Features

As you might have guessed by now, in many-to-many sequence problems, each time-step in the input sample contains multiple features.

Creating the Dataset

Let's create a simple dataset for our problem:

X = list()
Y = list()

X1 = [x1 for x1 in range(5, 301, 5)]
X2 = [x2 for x2 in range(20, 316, 5)]
Y = [y for y in range(35, 331, 5)]

X = np.column_stack((X1, X2))

In the script above we create two lists, X1 and X2. The list X1 contains all the multiples of 5 from 5 to 300 (inclusive) and the list X2 contains all the multiples of 5 from 20 to 315 (inclusive). Finally, the list Y, which is the output, contains all the multiples of 5 between 35 and 330 (inclusive). The final input list X is a column-wise merger of X1 and X2.

As always, we need to reshape our input X and output Y before they can be used to train LSTM.

X = np.array(X).reshape(20, 3, 2)
Y = np.array(Y).reshape(20, 3, 1)

You can see the input X has been reshaped into 20 samples of three time-steps with 2 features where the output has been reshaped into similar dimensions but with 1 feature.

The first sample from the input looks like this:

[[ 5 20]
 [10 25]
 [15 30]]

The input contains six consecutive multiples of 5, three in each of the two columns. Here is the corresponding output for the above input sample:

[[35]
 [40]
 [45]]

As you can see, the output contains the next three consecutive multiples of 5.

Let's now train our encoder-decoder model to learn the above sequence. We will first train a simple stacked LSTM-based encoder-decoder.

Solution via Stacked LSTM

The following script trains the stacked LSTM model. You can see that the input shape is now (3, 2) corresponding to three time-steps and two features in the input.

from keras.layers import RepeatVector
from keras.layers import TimeDistributed

model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(3, 2)))
model.add(RepeatVector(3))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
history = model.fit(X, Y, epochs=1000, validation_split=0.2, verbose=1, batch_size=3)

Let's now create a test point that will be used for making a prediction.

X1 = [300, 305, 310]
X2 = [315, 320, 325]

test_input = np.column_stack((X1, X2))
test_input = test_input.reshape((1, 3, 2))
print(test_input)

The test point looks like this:

[[[300 315]
  [305 320]
  [310 325]]]

The actual output of the above test point is [330, 335, 340]. Let's see what our model predicts:

test_output = model.predict(test_input, verbose=0)
print(test_output)

The predicted output is:

[[[324.5786 ]
  [328.89658]
  [335.67603]]]

The output is far from being correct.

Solution via Bidirectional LSTM

Let's now train an encoder-decoder model based on bidirectional LSTMs and see if we can get improved results. The following script trains the model.

from keras.layers import RepeatVector
from keras.layers import TimeDistributed

model = Sequential()
model.add(Bidirectional(LSTM(100, activation='relu', input_shape=(3, 2))))
model.add(RepeatVector(3))
model.add(Bidirectional(LSTM(100, activation='relu', return_sequences=True)))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
history = model.fit(X, Y, epochs=1000, validation_split=0.2, verbose=1, batch_size=3)

The following script makes predictions on the test set:

test_output = model.predict(test_input, verbose=0)
print(test_output)

Here is the output:

[[[330.49133]
  [335.35327]
  [339.64398]]]

The output achieved is pretty close to the actual output, i.e. [330, 335, 340]. Hence our bidirectional LSTM outperformed the simple stacked LSTM.

Conclusion

This is the second part of my article on "Solving Sequence Problems with LSTM in Keras" (part 1 here). In this article you saw how to solve one-to-many and many-to-many sequence problems with LSTMs. You also saw how the encoder-decoder model can be used to predict multi-step outputs. The encoder-decoder model is used in a variety of natural language processing applications, such as neural machine translation and chatbot development.

In the upcoming article, we will see the application of the encoder-decoder model in NLP.

Categories: FLOSS Project Planets

Wingware Blog: Viewing Arrays and Data Frames in Wing Pro 7

Wed, 2019-09-18 21:00

Wing Pro 7 introduced an array and data frame viewer that can be used to inspect data objects in the debugger. Values are transferred to the IDE according to what portion of the data is visible on the screen, so working with large data sets won't slow down the IDE.

The array viewer works with Pandas, numpy, sqlite3, xarray, Python's builtin lists, tuples, and dicts, and other classes that emulate lists, tuples, or dicts.

To use the array viewer, right-click on a value in the Stack Data tool in Wing Pro and select Show Value as Array:

This reveals the array viewer and displays the selected item from the Stack Data tree, in this case the global variable pandas_df:

Wing fetches data for display as you move the scroll bars. The Filter can be used to display only matching rows:

The drop down next to the Filter field may be used to select plain text, wildcard, or regular expression searching, to control whether searches are case sensitive, and to select whether to search on all columns or only the visible columns.

If more space is needed to view data, the Stack Data tool's tab can be dragged out of the window, to create a separate window for it.



That's it for now! We'll be back soon with more Wing Tips for Wing Python IDE.

Categories: FLOSS Project Planets

Audrey Roy Greenfeld: Voronoi Mandalas

Wed, 2019-09-18 13:46
SciPy has tools for creating Voronoi tessellations. Besides the obvious data science applications, you can use them to make pretty art like this:
The above was generated by this code:




I started with Carlos Focil's mandalapy code, modifying the parameters until I had a design I liked. I decided to make the Voronoi diagram show both points and vertices, and I gave it an equal aspect ratio. Carlos' mandalapy code is a port of Antonio Sánchez Chinchón's inspiring work drawing mandalas with R, using the deldir library to plot Voronoi tesselations.
Categories: FLOSS Project Planets

Mike Driscoll: Python Code Kata: Fizzbuzz

Wed, 2019-09-18 13:22

A code kata is a fun way for computer programmers to practice coding. They are also used a lot for learning how to implement Test Driven Development (TDD) when writing code. One of the popular programming katas is called FizzBuzz. This is also a popular interview question for computer programmers.

The concept behind FizzBuzz is as follows:

  • Write a program that prints the numbers 1-100, each on a new line
  • For each number that is a multiple of 3, print “Fizz” instead of the number
  • For each number that is a multiple of 5, print “Buzz” instead of the number
  • For each number that is a multiple of both 3 and 5, print “FizzBuzz” instead of the number

Now that you know what you need to write, you can get started!

Creating a Workspace

The first step is to create a workspace or project folder on your machine. For example, you could create a katas folder with a fizzbuzz inside of it.

The next step is to install a source control program. One of the most popular is Git, but you could use something else like Mercurial. For the purposes of this tutorial, you will be using Git. You can get it from the Git website.

Now open up a terminal or run cmd.exe if you are a Windows user. Then navigate in the terminal to your fizzbuzz folder. You can use the cd command to do that. Once you are inside the folder, run the following command:


git init

This will initialize the fizzbuzz folder into a Git repository. Any files or folders that you add inside the fizzbuzz folder can now be added to Git and versioned.

The Fizz Test

To keep things simple, you can create your test file inside of the fizzbuzz folder. A lot of people will save their tests in a sub-folder called test or tests and tell their test runner to add the top level folder to sys.path so that the tests can import it.

Note: If you need to brush up on how to use Python’s unittest library, then you might find Python 3 Testing: An Intro to unittest helpful.

Go ahead and create a file called test_fizzbuzz.py inside your fizzbuzz folder.

Now enter the following into your Python file:

import fizzbuzz
import unittest


class TestFizzBuzz(unittest.TestCase):

    def test_multiple_of_three(self):
        self.assertEqual(fizzbuzz.process(6), 'Fizz')


if __name__ == '__main__':
    unittest.main()

Python comes with the unittest library builtin. To use it, all you need to do is import it and subclass unittest.TestCase. Then you can create a series of functions that represent the tests that you want to run.

Note that you also import the fizzbuzz module. You haven’t created that module yet, so you will receive a ModuleNotFoundError when you run this test code. You could create this file without even adding any code other than the imports and have a failing test. But for completeness, you go ahead and assert that fizzbuzz.process(6) returns the correct string.

The fix is to create an empty fizzbuzz.py file. This will only fix the ModuleNotFoundError, but it will allow you to run the test and see its output now.

You can run your test by doing this:


python test_fizzbuzz.py

The output will look something like this:


ERROR: test_multiple_of_three (__main__.TestFizzBuzz)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/michael/Dropbox/code/fizzbuzz/test_fizzbuzz.py", line 7, in test_multiple_of_three
self.assertEqual(fizzbuzz.process(6), 'Fizz')
AttributeError: module 'fizzbuzz' has no attribute 'process'

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (errors=1)

So this tells you that your fizzbuzz module is missing an attribute called process.

You can fix that by adding a process() function to your fizzbuzz.py file:

def process(number):
    if number % 3 == 0:
        return 'Fizz'

This function accepts a number and uses the modulus operator to divide the number by 3 and check to see if there is a remainder. If there is no remainder, then you know that the number is divisible by 3 so you can return the string “Fizz”.
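As a quick sanity check of that remainder logic (an aside, not part of the kata itself):

```python
# A multiple of 3 leaves no remainder when divided by 3; other numbers do.
assert 6 % 3 == 0   # divisible, so process(6) returns 'Fizz'
assert 7 % 3 == 1   # not divisible, so the if branch is skipped
```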

Now when you run the test, the output should look like this:


.
----------------------------------------------------------------------
Ran 1 test in 0.000s

OK

The period on the first line above means that you ran one test and it passed.

Let’s take a quick step back here. When a test is failing, it is considered to be in a “red” state. When a test is passing, that is a “green” state. This refers to the Test Driven Development (TDD) mantra of red/green/refactor. Most developers will start a new project by creating a failing test (red). Then they will write the code to make the test pass, usually in the simplest way possible (green).

When your tests are green, that is a good time to commit your test and the code change(s). This allows you to have a working piece of code that you can rollback to. Now you can write a new test or refactor the code to make it better without worrying that you will lose your work because now you have an easy way to roll back to a previous version of the code.

To commit your code, you can do the following:


git add fizzbuzz.py test_fizzbuzz.py
git commit -m "First commit"

The first command will add the two new files. You don't need to commit *.pyc files, just the Python files. There is a handy file called .gitignore that you can add to your Git repository that you may use to exclude certain file types or folders, such as *.pyc. Github has some default gitignore files for various languages that you can get if you'd like to see an example.
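For example, a minimal .gitignore for this project might contain just the Python bytecode patterns (an illustrative sketch, not one of Github's templates):

```
# Ignore compiled Python bytecode
*.pyc
__pycache__/
```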

The second command is how you can commit the code to your local repository. The “-m” is for message followed by a descriptive message about the changes that you’re committing. If you would like to save your changes to Github as well (which is great for backup purposes), you should check out this article.

Now we are ready to write another test!

The Buzz Test

The second test that you can write can be for multiples of five. To add a new test, you can create another method in the TestFizzBuzz class:

import fizzbuzz
import unittest


class TestFizzBuzz(unittest.TestCase):

    def test_multiple_of_three(self):
        self.assertEqual(fizzbuzz.process(6), 'Fizz')

    def test_multiple_of_five(self):
        self.assertEqual(fizzbuzz.process(20), 'Buzz')


if __name__ == '__main__':
    unittest.main()

This time around, you want to use a number that is only divisible by 5. When you call fizzbuzz.process(), you should get “Buzz” returned. When you run the test though, you will receive this:


F.
======================================================================
FAIL: test_multiple_of_five (__main__.TestFizzBuzz)
----------------------------------------------------------------------
Traceback (most recent call last):
File "test_fizzbuzz.py", line 10, in test_multiple_of_five
self.assertEqual(fizzbuzz.process(20), 'Buzz')
AssertionError: None != 'Buzz'

----------------------------------------------------------------------
Ran 2 tests in 0.000s

FAILED (failures=1)

Oops! Right now your code uses the modulus operator to check for a remainder after dividing by 3. Since 20 divided by 3 leaves a remainder, that return statement won't run. The default return value of a function is None, which is why you end up getting the failure above.
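You can see this default behavior with a stripped-down stand-in for the current fizzbuzz.process():

```python
# Stand-in for the current version of fizzbuzz.process(), which only
# handles multiples of 3 so far.
def process(number):
    if number % 3 == 0:
        return 'Fizz'

result = process(20)  # 20 % 3 != 0, so no return statement runs
print(result)         # None: the implicit default return value
```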

Go ahead and update the process() function to be the following:

def process(number):
    if number % 3 == 0:
        return 'Fizz'
    elif number % 5 == 0:
        return 'Buzz'

Now you can check for remainders with both 3 and 5. When you run the tests this time, the output should look like this:


..
----------------------------------------------------------------------
Ran 2 tests in 0.000s

OK

Yay! Your tests passed and are now green! That means you can commit these changes to your Git repository.

Now you are ready to add a test for FizzBuzz!

The FizzBuzz Test

The next test that you can write will be for when you want to get “FizzBuzz” back. As you may recall, you will get FizzBuzz whenever the number is divisible by 3 and 5. Go ahead and add a third test that does just that:

import fizzbuzz
import unittest


class TestFizzBuzz(unittest.TestCase):

    def test_multiple_of_three(self):
        self.assertEqual(fizzbuzz.process(6), 'Fizz')

    def test_multiple_of_five(self):
        self.assertEqual(fizzbuzz.process(20), 'Buzz')

    def test_fizzbuzz(self):
        self.assertEqual(fizzbuzz.process(15), 'FizzBuzz')


if __name__ == '__main__':
    unittest.main()

For this test, test_fizzbuzz, you ask your program to process the number 15. This shouldn’t work right yet, but go ahead and run the test code to check:


F..
======================================================================
FAIL: test_fizzbuzz (__main__.TestFizzBuzz)
----------------------------------------------------------------------
Traceback (most recent call last):
File "test_fizzbuzz.py", line 13, in test_fizzbuzz
self.assertEqual(fizzbuzz.process(15), 'FizzBuzz')
AssertionError: 'Fizz' != 'FizzBuzz'

----------------------------------------------------------------------
Ran 3 tests in 0.000s

FAILED (failures=1)

Three tests were run with one failure. You are now back to red. This time the error is 'Fizz' != 'FizzBuzz' instead of comparing None to 'FizzBuzz'. The reason is that your code checks whether 15 is divisible by 3 first, and since it is, it returns 'Fizz'.

Since that isn’t what you want to happen, you will need to update your code to check if the number is divisible by 3 and 5 before checking for just 3:

def process(number):
    if number % 3 == 0 and number % 5 == 0:
        return 'FizzBuzz'
    elif number % 3 == 0:
        return 'Fizz'
    elif number % 5 == 0:
        return 'Buzz'

Here you do the divisibility check for 3 and 5 first. Then you check for the other two as before.
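To see why the ordering matters, compare against a hypothetical version that checks for 3 first (a deliberately wrong ordering, not code from the article):

```python
# Wrong ordering: the more specific "divisible by both" check is
# unreachable, because 15 matches the "divisible by 3" branch first.
def process_wrong(number):
    if number % 3 == 0:
        return 'Fizz'
    elif number % 3 == 0 and number % 5 == 0:
        return 'FizzBuzz'   # never reached: anything matching here matched above
    elif number % 5 == 0:
        return 'Buzz'

# Correct ordering, as in the article: most specific condition first.
def process(number):
    if number % 3 == 0 and number % 5 == 0:
        return 'FizzBuzz'
    elif number % 3 == 0:
        return 'Fizz'
    elif number % 5 == 0:
        return 'Buzz'

print(process_wrong(15))  # Fizz
print(process(15))        # FizzBuzz
```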

Now if you run your tests, you should get the following output:


...
----------------------------------------------------------------------
Ran 3 tests in 0.000s

OK

So far so good. However you don’t have the code working for returning numbers that aren’t divisible by 3 or 5. Time for another test!

The Final Test

The last thing that your code needs to do is return the number when it does have a remainder when divided by 3 and 5. Let’s test it a couple of different ways:

import fizzbuzz
import unittest


class TestFizzBuzz(unittest.TestCase):

    def test_multiple_of_three(self):
        self.assertEqual(fizzbuzz.process(6), 'Fizz')

    def test_multiple_of_five(self):
        self.assertEqual(fizzbuzz.process(20), 'Buzz')

    def test_fizzbuzz(self):
        self.assertEqual(fizzbuzz.process(15), 'FizzBuzz')

    def test_regular_numbers(self):
        self.assertEqual(fizzbuzz.process(2), 2)
        self.assertEqual(fizzbuzz.process(98), 98)


if __name__ == '__main__':
    unittest.main()

For this test, you test normal numbers 2 and 98 with the test_regular_numbers() test. These numbers will always have a remainder when divided by 3 or 5, so they should just be returned.

When you run the tests now, you should get something like this:


...F
======================================================================
FAIL: test_regular_numbers (__main__.TestFizzBuzz)
----------------------------------------------------------------------
Traceback (most recent call last):
File "test_fizzbuzz.py", line 16, in test_regular_numbers
self.assertEqual(fizzbuzz.process(2), 2)
AssertionError: None != 2

----------------------------------------------------------------------
Ran 4 tests in 0.000s

FAILED (failures=1)

This time you are back to comparing None to the number, which is what you probably suspected would be the output.

Go ahead and update the process() function as follows:

def process(number):
    if number % 3 == 0 and number % 5 == 0:
        return 'FizzBuzz'
    elif number % 3 == 0:
        return 'Fizz'
    elif number % 5 == 0:
        return 'Buzz'
    else:
        return number

That was easy! All you needed to do at this point was add an else statement that returns the number.

Now when you run the tests, they should all pass:


....
----------------------------------------------------------------------
Ran 4 tests in 0.000s

OK

Good job! Now your code works. You can verify that it works for all the numbers, 1-100, by adding the following to your fizzbuzz.py module:

if __name__ == '__main__':
    for i in range(1, 101):
        print(process(i))

Now when you run fizzbuzz yourself using python fizzbuzz.py, you should see the appropriate output that was specified at the beginning of this tutorial.

This is a good time to commit your code and push it to the cloud.

Wrapping Up

Now you know the basics of using Test Driven Development to drive you to solve a coding kata. Python’s unittest module has many more types of asserts and functionality than is covered in this brief tutorial. You could also modify this tutorial to use pytest, another popular 3rd party Python package that you can use in place of Python’s own unittest module.
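For instance, a pytest version of two of the tests above could be written as plain functions with bare assert statements (a sketch only; process() is inlined here as a stand-in for importing the fizzbuzz module):

```python
# pytest tests are plain functions using bare assert; no TestCase
# subclass is needed. process() is a stand-in for the fizzbuzz module.
def process(number):
    if number % 3 == 0 and number % 5 == 0:
        return 'FizzBuzz'
    elif number % 3 == 0:
        return 'Fizz'
    elif number % 5 == 0:
        return 'Buzz'
    else:
        return number

def test_fizzbuzz():
    assert process(15) == 'FizzBuzz'

def test_regular_numbers():
    assert process(2) == 2
    assert process(98) == 98
```

Running pytest in the project folder automatically discovers files named test_*.py and runs functions whose names start with test_.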

The nice thing about having these tests is that now you can refactor your code and verify you didn’t break anything by running the tests. This also allows you to add new features more easily without breaking existing features. Just be sure to add more tests as you add more features.

Related Reading

The post Python Code Kata: Fizzbuzz appeared first on The Mouse Vs. The Python.

Categories: FLOSS Project Planets

Real Python: How to Convert a Python String to int

Wed, 2019-09-18 10:00

Integers are whole numbers. In other words, they have no fractional component. Two data types you can use to store an integer in Python are int and str. These types offer flexibility for working with integers in different circumstances. In this tutorial, you’ll learn how you can convert a Python string to an int. You’ll also learn how to convert an int to a string.

By the end of this tutorial, you’ll understand:

  • How to store integers using str and int
  • How to convert a Python string to an int
  • How to convert a Python int to a string

Let’s get started!

Python Pit Stop: This tutorial is a quick and practical way to find the info you need, so you’ll be back to your project in no time!

Free Bonus: Click here to get a Python Cheat Sheet and learn the basics of Python 3, like working with data types, dictionaries, lists, and Python functions.

Representing Integers in Python

An integer can be stored using different types. Two possible Python data types for representing an integer are:

  1. str
  2. int

For example, you can represent an integer using a string literal:

>>> s = "110"

Here, Python understands you to mean that you want to store the integer 110 as a string. You can do the same with the integer data type:

>>> i = 110

It’s important to consider what you specifically mean by "110" and 110 in the examples above. As a human who has used the decimal number system for your whole life, it may be obvious that you mean the number one hundred and ten. However, there are several other number systems, such as binary and hexadecimal, which use different bases to represent an integer.

For example, you can represent the number one hundred and ten in binary and hexadecimal as 1101110 and 6e respectively.
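You can check these representations with the built-in bin() and hex() functions:

```python
# The built-ins bin() and hex() confirm the representations above.
print(bin(110))  # 0b1101110
print(hex(110))  # 0x6e
```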

You can also represent your integers with other number systems in Python using the str and int data types:

>>> binary = 0b1010
>>> hexadecimal = "0xa"

Notice that binary and hexadecimal use prefixes to identify the number system. All integer prefixes are in the form 0?, in which you replace ? with a character that refers to the number system:

  • b: binary (base 2)
  • o: octal (base 8)
  • d: decimal (base 10)
  • x: hexadecimal (base 16)
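As a quick aside, the three prefixes Python accepts in integer literals (there is no 0d literal; bare digits are already decimal) all denote the same value here:

```python
# All of these literals evaluate to the integer one hundred and ten.
assert 0b1101110 == 110   # binary
assert 0o156 == 110       # octal
assert 0x6e == 110        # hexadecimal
```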

Technical Detail: The prefix is not required in either an integer or string representation when it can be inferred.

int assumes the literal integer to be decimal:

>>> decimal = 303
>>> hexadecimal_with_prefix = 0x12F
>>> hexadecimal_no_prefix = 12F
  File "<stdin>", line 1
    hexadecimal_no_prefix = 12F
                              ^
SyntaxError: invalid syntax

The string representation of an integer is more flexible because a string holds arbitrary text data:

>>> decimal = "303"
>>> hexadecimal_with_prefix = "0x12F"
>>> hexadecimal_no_prefix = "12F"

Each of these strings represents the same integer.

Now that you have some foundational knowledge about how to represent integers using str and int, you’ll learn how to convert a Python string to an int.

Converting a Python String to an int

If you have a decimal integer represented as a string and you want to convert the Python string to an int, then you just pass the string to int(), which returns a decimal integer:

>>> int("10")
10
>>> type(int("10"))
<class 'int'>

By default, int() assumes that the string argument represents a decimal integer. If, however, you pass a hexadecimal string to int(), then you’ll see a ValueError:

>>> int("0x12F")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '0x12F'

The error message says that the string is not a valid decimal integer.

Note:

It’s important to recognize the difference between two types of failed results of passing a string to int():

  1. Syntax Error: A ValueError will occur when int() doesn’t know how to parse the string using the provided base (10 by default).
  2. Logical Error: int() does know how to parse the string, but not the way you expected.
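The first kind of failure can be handled with a try/except block, which is a common pattern when parsing untrusted input. The helper function below is illustrative, not part of the article:

```python
def parse_int(text, base=10):
    """Return the parsed integer, or None if text isn't valid in that base."""
    try:
        return int(text, base)
    except ValueError:
        return None

print(parse_int("303"))        # 303
print(parse_int("0x12F"))      # None: not a valid base-10 string
print(parse_int("0x12F", 16))  # 303
```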

Here’s an example of a logical error:

>>> binary = "11010010"
>>> int(binary)  # Using the default base of 10, instead of 2
11010010

In this example, you meant for the result to be 210, which is the decimal representation of the binary string. Unfortunately, because you didn’t specify that behavior, int() assumed that the string was a decimal integer.

One good safeguard for this behavior is to always define your string representations using explicit bases:

>>> int("0b11010010")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '0b11010010'

Here, you get a ValueError because int() doesn’t know how to parse the binary string as a decimal integer.

When you pass a string to int(), you can specify the number system that you’re using to represent the integer. The way to specify the number system is to use base:

>>> int("0x12F", base=16)
303

Now, int() understands you are passing a hexadecimal string and expecting a decimal integer.
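The same base argument fixes the earlier binary example. The matching prefix is allowed but optional once you pass the base explicitly:

```python
print(int("0b11010010", base=2))  # 210
print(int("11010010", base=2))    # also 210: the prefix is optional here
```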

Technical Detail: The argument that you pass to base is not limited to 2, 8, 10, and 16:

>>> int("10", base=3)
3
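A couple of further details worth knowing, which go beyond the examples above: base accepts any value from 2 to 36 (digits beyond 9 use the letters a through z), and the special value 0 tells int() to infer the base from the string's prefix:

```python
# base can be anything from 2 to 36.
print(int("zz", base=36))     # 35 * 36 + 35 = 1295

# base=0 infers the number system from the prefix.
print(int("0x12F", base=0))   # 303
print(int("0b1010", base=0))  # 10

# Out-of-range bases raise a ValueError.
try:
    int("10", base=1)
except ValueError:
    print("base must be 2-36, or 0")
```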

Great! Now that you’re comfortable with the ins and outs of converting a Python string to an int, you’ll learn how to do the inverse operation.

Converting a Python int to a String

In Python, you can convert a Python int to a string using str():

>>> str(10)
'10'
>>> type(str(10))
<class 'str'>

By default, str() behaves like int() in that it results in a decimal representation:

>>> str(0b11010010)
'210'

In this example, the binary literal 0b11010010 is evaluated to the integer 210 before str() ever sees it, so the result is the decimal string '210'.

If you want a string to represent an integer in another number system, then you use a formatted string, such as an f-string (in Python 3.6+), and an option that specifies the base:

>>> octal = 0o1073
>>> f"{octal}"  # Decimal
'571'
>>> f"{octal:x}"  # Hexadecimal
'23b'
>>> f"{octal:b}"  # Binary
'1000111011'

str is a flexible way to represent an integer in a variety of different number systems.
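Besides f-strings, the built-in format(), hex(), oct(), and bin() functions (mentioned at the end of this article) produce the same representations, with the last three including the number-system prefix:

```python
n = 0o1073  # 571 in decimal

print(format(n, "x"))  # '23b'  -- same as f"{n:x}"
print(hex(n))          # '0x23b'
print(oct(n))          # '0o1073'
print(bin(n))          # '0b1000111011'
```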

Conclusion

Congratulations! You’ve learned so much about integers and how to represent and convert them between Python string and int data types.

In this tutorial, you learned:

  • How to use str and int to store integers
  • How to specify an explicit number system for an integer representation
  • How to convert a Python string to an int
  • How to convert a Python int to a string

Now that you know so much about str and int, you can learn more about representing numerical types using float(), hex(), oct(), and bin()!


Categories: FLOSS Project Planets

Podcast.__init__: Cultivating The Python Community In Argentina

Wed, 2019-09-18 08:53
Summary

The Python community in Argentina is large and active, thanks largely to the motivated individuals who manage and organize it. In this episode Facundo Batista explains how he helped to found the Python user group for Argentina and the work that he does to make it accessible and welcoming. He discusses the challenges of encompassing such a large and distributed group, the types of events, resources, and projects that they build, and his own efforts to make information free and available. He is an impressive individual with a substantial list of accomplishments, as well as exhibiting the best of what the global Python community has to offer.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, and Data Council. Upcoming events include the O’Reilly AI conference, the Strata Data conference, the combined events of the Data Architecture Summit and Graphorum, and Data Council in Barcelona. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Facundo Batista about his experiences founding and fostering the Argentinian Python community, working as a core developer, and his career in Python
Interview
  • Introductions
  • How did you get introduced to Python?
  • What was your motivation for organizing a Python user group in Argentina?
  • How does the geography and culture of Argentina influence the focus of the community?
  • Argentina is a fairly large country. What is the reasoning for having the user group encompass the whole nation and how is it organized to provide access to everyone?
  • What are some notable projects that have been built by or for members of PyAr?
    • What are some of the challenges that you faced while building CDPedia and what aspects of it are you most proud of?
  • How did you get started as a core developer?
    • What areas of the language and runtime have you been most involved with?
  • As a core developer, what are some of the most interesting/unexpected/challenging lessons that you have learned?
  • What other languages do you currently use and what is it about Python that has motivated you to spend so much of your attention on it?
  • What are some of the shortcomings in Python that you would like to see addressed in the future?
  • Outside of CPython, what are some of the projects that you are most proud of?
  • How has your involvement with core development and PyAr influenced your life and career?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Categories: FLOSS Project Planets

Talk Python to Me: #230 Python in digital humanities research

Wed, 2019-09-18 04:00
You've often heard me talk about Python as a superpower. It can amplify whatever you're interested in or have specialized in for your career. This episode is an amazing example of that. You'll meet Cornelis van Lit. He is a scholar of medieval Islamic philosophy and works at Utrecht University in the Netherlands. What he is doing with Python is pretty amazing.
Categories: FLOSS Project Planets
