Planet Python


IslandT: Python Tutorial — Chapter 9

Mon, 2022-09-26 05:16

In this chapter, let us look at the Python set object, which stores multiple items just as tuples, lists, and dictionaries do. Unlike a list or tuple, a set keeps only unique items and does not keep them in any particular order.

You can declare a set object with one of these methods:-

set1 = {1,2,3}
set2 = set((1,2,3))

In order to remove an element from a set, use the remove or discard method as follows. If the element is not in the set, remove raises a KeyError while discard does nothing:-

set1.remove(2) # {1,3}
set1.discard(3) # {1}

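In order to remove and return an arbitrary element from a set, use the pop method. A set is unordered, so it has no front element and you should not rely on which element is removed.

set1.pop() # removes and returns one element, for example 1, leaving {2,3}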

Add an element to the set with the add method. A set has no end position, and adding an element that is already present changes nothing.

set1.add(5) # {1,2,3,5}

Find the difference between two sets:-

set1 = {1,2,3}
set2 = set((1,4,6))
set3 = set1.difference(set2) # {2,3}

As you can see, this method returns the elements of the first set that do not appear in the second set. The comparison is by membership, not by index position, because sets are unordered.
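To make the membership-based behaviour of difference concrete, here is a small sketch. The variable names only_in_set1 and only_in_set2 are illustrative and not part of the tutorial.

```python
# Set difference keeps the elements of the first set that are absent
# from the second. Order plays no role because sets are unordered.
set1 = {1, 2, 3}
set2 = {6, 4, 1}  # same members as set((1, 4, 6)), written in another order

only_in_set1 = set1.difference(set2)
print(sorted(only_in_set1))  # [2, 3]

# For two sets, the - operator gives the same result as difference().
same_with_operator = set1 - set2

# Difference is not symmetric: swapping the operands gives another result.
only_in_set2 = set2.difference(set1)
print(sorted(only_in_set2))  # [4, 6]
```

Writing set2 in a different order does not change the result, which is the quickest way to see that index position is not involved.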

Clear all the elements within a set:-

set1.clear() # set()

Find the common elements within two sets:-

set1 = {1,2,3}
set2 = set((1,4,6))
set3 = set1.intersection(set2) # {1}

Combine all the elements within two sets:-

set1 = {1,2,3}
set2 = set((1,4,6))
set3 = set1.union(set2) # {1,2,3,4,6}

Copy a set of elements into another set:-

set3 = set1.copy() # {1, 2, 3}

Find out whether a set is a subset of another set:-

set1 = {1,2,3}
set2 = set((1,4,6))
set1.issubset(set2) # False
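The subset check above returns False, so it may help to see a case that returns True as well. This is a minimal sketch; the names small and big are made up for the illustration.

```python
small = {1, 2}
big = {1, 2, 3}

is_subset = small.issubset(big)        # True: every element of small is in big
is_subset_op = small <= big            # operator form of issubset
is_proper_subset = small < big         # True only if small is a subset and not equal to big
is_superset = big.issuperset(small)    # the same relation seen from the other side

# The example from the text: 2 and 3 are missing from the second set.
unrelated = {1, 2, 3}.issubset({1, 4, 6})

print(is_subset, is_subset_op, is_proper_subset, is_superset, unrelated)
```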

There are still lots of methods that you can read on the official documentation page of Python regarding the set object!

Categories: FLOSS Project Planets

Matthew Wright: Use pandas DateOffsets for easy date manipulation

Sun, 2022-09-25 20:54

So much useful data has a date or time component. Often, data has a timestamp to represent when the data was acquired, or when an event will take place, or as an identifying attribute like an expiration date. For this reason, understanding how to work with dates and times effectively can be a very useful skill. One common need is to select dates (and times) using rules based on their offset from known times. This article will focus on some handy ways to use pandas DateOffsets for working with dates specifically.

Since my experience is in the areas of finance and trading, I’ll use some practical examples I’ve encountered over the years. But even if you don’t work in finance, the techniques should work for any data that has dates.

What is a DateOffset?

A DateOffset is just a special object that represents a way to shift a date to a new date. This turns out to be really useful.

The DateOffset class and a number of useful offset aliases are in the pd.offsets package (an alias to pandas.tseries.offsets).

Quick overview

Before we look at some ideas of how to use these DateOffsets, let’s just review how they work. This is all just a high level of what you’ll find in the documentation, so head there for more detail.

First, let’s just look at the DateOffset class itself. You can do quite a bit with it alone!

The DateOffset constructor takes a number of keyword arguments. Plural arguments (such as days or months) shift the date. Singular arguments (such as hour or month) replace the corresponding value in the resulting date. Use normalize to set the time to midnight. Note that DateOffset will respect timezones, unlike Timedelta, so if you cross a daylight saving boundary, it will make sure you aren’t off by an hour.

import pandas as pd
now =
print("Add a day:", now + pd.offsets.DateOffset(days=1))
print("Add a week:", now + pd.offsets.DateOffset(weeks=1))
print("Add a month:", now + pd.offsets.DateOffset(months=1))
print("Add an hour:", now + pd.offsets.DateOffset(hours=1))
print("Add a day, replace the hour:", now + pd.offsets.DateOffset(days=1, hour=13))
print("Set the month to January, normalize:", now + pd.offsets.DateOffset(month=1, normalize=True))
print("Add 2 days across DST change:", pd.Timestamp("2022-11-05 00:00:00", tz="America/Chicago") + pd.offsets.DateOffset(days=2))
print("Add 2 days across DST change (with Timedelta, no adjustment):", pd.Timestamp("2022-11-05 00:00:00", tz="America/Chicago") + pd.Timedelta(days=2))

Add a day: 2022-09-26 14:20:30.243984
Add a week: 2022-10-02 14:20:30.243984
Add a month: 2022-10-25 14:20:30.243984
Add an hour: 2022-09-25 15:20:30.243984
Add a day, replace the hour: 2022-09-26 13:20:30.243984
Set the month to January, normalize: 2022-01-25 00:00:00
Add 2 days across DST change: 2022-11-07 00:00:00-06:00
Add 2 days across DST change (with Timedelta, no adjustment): 2022-11-06 23:00:00-06:00

Offset aliases

However, you don’t need to use the DateOffset class directly. Pandas has a ton of named offset aliases that do what you want for a number of common scenarios. You’ll find these to be extremely useful.

print("Next business day (or weekday):", now + pd.offsets.BDay(normalize=True))
print("Three business days (or weekday):", now + pd.offsets.BDay(3, normalize=True))
print("Next Easter:", now + pd.offsets.Easter(normalize=True))

Next business day (or weekday): 2022-09-26 00:00:00
Three business days (or weekday): 2022-09-28 00:00:00
Next Easter: 2023-04-09 00:00:00

You can also subtract offsets.

print("Beginning of month:", now - pd.offsets.MonthBegin(normalize=True))
print("Beginning of quarter:", now - pd.offsets.QuarterBegin(normalize=True))
print("Beginning of year:", now - pd.offsets.YearBegin(normalize=True))

Beginning of month: 2022-09-01 00:00:00
Beginning of quarter: 2022-09-01 00:00:00
Beginning of year: 2022-01-01 00:00:00

Full offset alias list

Pandas has a plethora of configured offset aliases. You can create them by constructing them as an object as shown above, or you can pass their code (listed in parentheses below) to other pandas methods that take offsets as a parameter, as you’ll see below. Here’s a list taken right from the documentation.

  • DateOffset Generic offset class, defaults to absolute 24 hours
  • BDay or BusinessDay (B) business day or weekday
  • CDay or CustomBusinessDay (C) custom business day
  • Week (W) one week, optionally anchored on a day of the week
  • WeekOfMonth (WOM) the x-th day of the y-th week of each month
  • LastWeekOfMonth (LWOM) the x-th day of the last week of each month
  • MonthEnd (M) calendar month end
  • MonthBegin (MS) calendar month begin
  • BMonthEnd or BusinessMonthEnd (BM) business month end
  • BMonthBegin or BusinessMonthBegin (BMS) business month begin
  • CBMonthEnd or CustomBusinessMonthEnd (CBM) custom business month end
  • CBMonthBegin or CustomBusinessMonthBegin (CBMS) custom business month begin
  • SemiMonthEnd (SM) 15th (or other day_of_month) and calendar month end
  • SemiMonthBegin (SMS) 15th (or other day_of_month) and calendar month begin
  • QuarterEnd (Q) calendar quarter end
  • QuarterBegin (QS) calendar quarter begin
  • BQuarterEnd (BQ) business quarter end
  • BQuarterBegin (BQS) business quarter begin
  • FY5253Quarter (REQ) retail (aka 52-53 week) quarter
  • YearEnd (A) calendar year end
  • YearBegin (AS) or (BYS) calendar year begin
  • BYearEnd (BA) business year end
  • BYearBegin (BAS) business year begin
  • FY5253 (RE) retail (aka 52-53 week) year
  • Easter Easter holiday
  • BusinessHour (BH) business hour
  • CustomBusinessHour (CBH) custom business hour
  • Day (D) one absolute day
  • Hour (H) one hour
  • Minute (T) or (min) one minute
  • Second (S) one second
  • Milli (L) or (ms) one millisecond
  • Micro (U) or (us) one microsecond
  • Nano (N) one nanosecond

A useful place to use the offset aliases is in pd.date_range. The code can be passed in as the freq argument, optionally prefixed with a number as a multiplier. Here are a few examples.

print("Beginning of the quarter\n", pd.date_range(start='2022-01-01', freq='QS', periods=4))
print("Beginning of the month\n", pd.date_range(start='2022-01-01', freq='MS', periods=4))
print("Beginning of every 3rd month\n", pd.date_range(start='2022-01-01', freq='3MS', periods=4))

Beginning of the quarter
 DatetimeIndex(['2022-01-01', '2022-04-01', '2022-07-01', '2022-10-01'], dtype='datetime64[ns]', freq='QS-JAN')
Beginning of the month
 DatetimeIndex(['2022-01-01', '2022-02-01', '2022-03-01', '2022-04-01'], dtype='datetime64[ns]', freq='MS')
Beginning of every 3rd month
 DatetimeIndex(['2022-01-01', '2022-04-01', '2022-07-01', '2022-10-01'], dtype='datetime64[ns]', freq='3MS')

What is the alternative to pandas DateOffsets?

You’ve probably seen a lot of code that tries to do complex date logic using basic Python datetime objects. This might make sense for trivial cases, but you will quickly run into situations that cause that code to turn ugly. For example, if given a date you want to find the next Monday, you could write something like this:

import datetime
today =
while today.weekday() != 0: # Monday
    today += datetime.timedelta(days=1)
today

, 9, 26)

Compare the above to

( + pd.offsets.Week(1, weekday=0)).date()

, 9, 26)
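Here is a self-contained sketch of the comparison. Two things in it are assumptions rather than the author's code: the fixed start date (Sunday 2022-09-25, chosen so the result is reproducible) and the names start, next_monday_loop, and next_monday_offset.

```python
import datetime

import pandas as pd

# Assumption: a fixed start date (a Sunday) so the result is reproducible.
start = datetime.date(2022, 9, 25)

# Plain datetime: step forward one day at a time until we land on a Monday.
day = start
while day.weekday() != 0:  # Monday is 0
    day += datetime.timedelta(days=1)
next_monday_loop = day

# pandas: Week(0, weekday=0) rolls forward to a Monday and stays put if the
# date already is a Monday, which matches the loop above.
next_monday_offset = (pd.Timestamp(start) + pd.offsets.Week(0, weekday=0)).date()

print(next_monday_loop, next_monday_offset)
```

With Week(1, weekday=0) the result is the same for any start date that is not a Monday. When the start date is itself a Monday, Week(1, ...) moves a full week ahead while the loop does not move at all, so Week(0, ...) is the closer equivalent.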

Plus, it’s easy to use these offsets on pandas Series and DataFrames.

s = pd.Series(pd.date_range('2022-01-01', periods=5))
s + pd.offsets.Week(1, weekday=0)

0   2022-01-03
1   2022-01-03
2   2022-01-10
3   2022-01-10
4   2022-01-10
dtype: datetime64[ns]

More complicated scenarios

One way I’ve found offsets to be useful is to select data for certain events. For example, a very important report for the US financial markets is made available every month from the U.S. Bureau of Labor Statistics. It’s called the Employment Situation, with the “Non-Farm payrolls” number in that report being one of the most closely watched pieces of data by traders. Their schedule is [listed on their website]. It generally follows the schedule of the first Friday of the month. We can generate this pretty easily using pandas. There are a couple of ways we could do this, but here’s one technique. We can make a date index using date_range, and pass in the MonthBegin as the freq, using the code from the list above.

dates = pd.date_range('2022-01-01', '2022-12-31', freq='MS')
dates

DatetimeIndex(['2022-01-01', '2022-02-01', '2022-03-01', '2022-04-01', '2022-05-01', '2022-06-01', '2022-07-01', '2022-08-01', '2022-09-01', '2022-10-01', '2022-11-01', '2022-12-01'], dtype='datetime64[ns]', freq='MS')

Now, given the first day of the month, can we get the first Friday of the month? One way to do this is to back up 1 day (in case the first day is a Friday itself), then move forward one week, but setting the weekday to Friday.

dates - pd.offsets.Day(1) + pd.offsets.Week(1, weekday=4)

DatetimeIndex(['2022-01-07', '2022-02-04', '2022-03-04', '2022-04-01', '2022-05-06', '2022-06-03', '2022-07-01', '2022-08-05', '2022-09-02', '2022-10-07', '2022-11-04', '2022-12-02'], dtype='datetime64[ns]', freq=None)

But it turns out you can pass in a 0 as the week move, and in that case it will not shift if the start date is already on the anchor point. (I hadn’t realized that until I was writing this up, so I used to do it the first way. The pandas docs are full of great information; you should read them!)

dates + pd.offsets.Week(0, weekday=4)

DatetimeIndex(['2022-01-07', '2022-02-04', '2022-03-04', '2022-04-01', '2022-05-06', '2022-06-03', '2022-07-01', '2022-08-05', '2022-09-02', '2022-10-07', '2022-11-04', '2022-12-02'], dtype='datetime64[ns]', freq=None)
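One way to cross-check the result is the WeekOfMonth alias from the list above. The sketch below is not from the original article; it assumes the freq string "WOM-1FRI" (first Friday of each month) and uses illustrative variable names.

```python
import pandas as pd

month_starts = pd.date_range("2022-01-01", "2022-12-31", freq="MS")

# Technique from the text: roll each month start forward to a Friday (weekday=4),
# leaving it alone if it already is one.
first_fridays = month_starts + pd.offsets.Week(0, weekday=4)

# Cross-check: "WOM-1FRI" asks date_range for the first Friday of every month.
first_fridays_wom = pd.date_range("2022-01-01", "2022-12-31", freq="WOM-1FRI")

both_agree = list(first_fridays) == list(first_fridays_wom)
print(both_agree)
```

If all you need is the schedule itself, the WOM alias skips the intermediate month-start index. The Week(0, ...) form is handy when you already have dates to shift.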

Now if I compare the values above with this year’s dates listed at the BLS site, I see that the July data was released on July 8th, not July 1st. This is related to the U.S. Independence Day holiday the following Monday. This is a good reminder to never completely trust your understanding of the data! The BLS can choose to move things around if they want to, so having a reliable reference source for events is probably required if you are depending on this data.


What about dealing with holidays? If we look at the 2021 schedule, we can see that the first Friday in January falls on New Year’s Day. It turns out that adding holidays is not that hard with pandas. If all you want to do is select the next business day, you can just use the calendar with a CustomBusinessDay offset, with a value of 0 that means we should only move forward if the date is a holiday.

dates_2021 = pd.date_range('2021-01-01', '2021-12-31', freq='MS')
dates_2021

DatetimeIndex(['2021-01-01', '2021-02-01', '2021-03-01', '2021-04-01', '2021-05-01', '2021-06-01', '2021-07-01', '2021-08-01', '2021-09-01', '2021-10-01', '2021-11-01', '2021-12-01'], dtype='datetime64[ns]', freq='MS')

from import USFederalHolidayCalendar
bday_us = pd.offsets.CustomBusinessDay(0, calendar=USFederalHolidayCalendar())
dates_2021 + bday_us

/Users/mcw/.pyenv/versions/pandas/lib/python3.8/site-packages/pandas/core/arrays/ PerformanceWarning: Non-vectorized DateOffset being applied to Series or DatetimeIndex.
  warnings.warn(
DatetimeIndex(['2021-01-04', '2021-02-01', '2021-03-01', '2021-04-01', '2021-05-03', '2021-06-01', '2021-07-01', '2021-08-02', '2021-09-01', '2021-10-01', '2021-11-01', '2021-12-01'], dtype='datetime64[ns]', freq=None)

Note that we get a warning that the offset is non-vectorized. This means that if you want to use this technique on an extremely large dataset, this will be quite slow (as of the time of writing with pandas 1.4.3). For this reason, for larger data sets you may want to create this index once and use it multiple times with your data.
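As a rough sketch of the "create it once and reuse it" idea, the snippet below builds the adjusted index a single time and silences the warning around that one step. It assumes USFederalHolidayCalendar is imported from pandas.tseries.holiday, and the variable names are illustrative.

```python
import warnings

import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

month_starts_2021 = pd.date_range("2021-01-01", "2021-12-31", freq="MS")
bday_us = pd.offsets.CustomBusinessDay(0, calendar=USFederalHolidayCalendar())

# Build the adjusted index once. The offset is applied element by element,
# so the PerformanceWarning is expected here and is silenced for this step only.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=pd.errors.PerformanceWarning)
    first_business_days = month_starts_2021 + bday_us

# Reuse first_business_days for every later selection instead of recomputing it.
print(first_business_days[0], first_business_days[4])
```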

Now, note that above we used a holiday calendar from pandas. But the holidays on the web site were slightly different – the BLS listed Inauguration Day as a holiday as well. We can make a custom holiday calendar ourselves.

bls_holidays = [
    "2021-01-01",
    "2021-01-18",
    "2021-01-20",
    "2021-02-15",
    "2021-05-31",
    "2021-07-05",
    "2021-09-06",
    "2021-10-11",
    "2021-11-11",
    "2021-11-25",
    "2021-12-24",
    "2021-12-31",
]
bday_bls = pd.offsets.CustomBusinessDay(0, holidays=bls_holidays)
dates_2021_bls = dates_2021 + bday_bls
dates_2021_bls

/Users/mcw/.pyenv/versions/pandas/lib/python3.8/site-packages/pandas/core/arrays/ PerformanceWarning: Non-vectorized DateOffset being applied to Series or DatetimeIndex.
  warnings.warn(
DatetimeIndex(['2021-01-04', '2021-02-01', '2021-03-01', '2021-04-01', '2021-05-03', '2021-06-01', '2021-07-01', '2021-08-02', '2021-09-01', '2021-10-01', '2021-11-01', '2021-12-01'], dtype='datetime64[ns]', freq=None)

Now, if you had a Series or DataFrame of data, say returns for a financial instrument for every day of the year, you could use this index to pick out the ones from the dates in question using pandas indexing. If you want to know more about indexing time series data in pandas, you can check out this article. Here’s an example:

import numpy as np

# make some fake data, one value per day of the year
df = pd.DataFrame(np.random.rand(365), index=pd.date_range('2021-01-01', '2021-12-31'))
df.loc[dates_2021_bls]

                   0
2021-01-04  0.151260
2021-02-01  0.201709
2021-03-01  0.921957
2021-04-01  0.072389
2021-05-03  0.821674
2021-06-01  0.561620
2021-07-01  0.926453
2021-08-02  0.055801
2021-09-01  0.768521
2021-10-01  0.294276
2021-11-01  0.651574
2021-12-01  0.099297

In summary, you can use pandas DateOffsets to shift dates easily. This can be a huge timesaver when you need to select data using complex (and not so complex) criteria. How will you use them in your next data investigation?


Go Deh: Answering a Reddit Question with timings.

Sun, 2022-09-25 06:01

Best viewed on larger than a phone screen.


Someone had a problem on Reddit r/python:

Hello guys, I want to find a string in a list and this list has 350K elements all they are strings . I want to find out a good algorithm that can find the string very quick . I know linear search but want to figure out other ways if possible.

The problem had already had over seventy answers when I came across it, but I wanted to check:

  1. Had people teased more information out of the original poster?
  2. Was there a pocket of timed goodness in there?

I read on.

The answers given were polite. Yay, here's to the python community 🍻👌🏾👍🏾! 

The answers did give alternatives to linear search, but so many posts had no timings, and the OP was still pretty vague in the data he gave, considering he was wanting "faster".

I was late to this party, but wanted to write some Python, as I can find coding relaxing if I write in the spirit of helping, (and remember I too might learn something). It's not the first time I've tried to help out and decided on three points:

  1. Get more realistic timings, or a timing framework where they can go in and alter things to get timings that better represent their data.
  2. The answers mentioned linear search of the text, sets, and tries. I would skip tries as I have already found that it's hard to get over their speed of interpretation limitations.
  3. I'm gonna need several files to do the timings - let's try creating them from one initial Python file. (Why not, keeps me interested).


I chose to code in the Spyder IDE for this, rather than VS Code. YMMV, but I still find Spyder to support a more "dynamic" development style, great for scripting, and I am not writing hundreds of lines of code for this.

I thought of the steps I would need; they seemed simple enough, so I just started coding the first:

Create a text file of 350K sorted "strings"

What is meant by strings? I don't know, don't sweat it - I'll make each line several words with one word having an increasing count appended:

# -*- coding: utf-8 -*-
"""
Created on Sat Sep 24 10:00:51 2022

@author: paddy3118
"""
# %% Create txt file of 350_000 strings
from pathlib import Path

lines = '\n'.join(f"Fee fi word{i:06} fum" for i in range(350_000))
txtfile = Path('string_in_long_list_2.txt')
txtfile.write_text(lines, 'utf8')

Notice the # %%  cell comment. It allows me to quickly run and rerun the code in the cell and check the values of variables in Spyder until I get it right. (Vscode will do this too, but I'm more comfortable with Spyder's implementation).

Sets; On your marks...

If the OP was going to test for strings in the text file multiple times for the one text file, which I think I read in the original poster's replies to others, then I would like to time set inclusion, but reduce the time to create or read the set.

I decided on creating a module file that assigns the set of lines in the text file to a variable data. Another program would load this module and check for string inclusion. Python would compile the module on first import, so subsequent loadings would be faster.

# %% Create a module of the set of those lines assigned to name 'data'
import pprint

set_of_strings = set(txtfile.read_text('utf8').split('\n'))
moduletxt = f"""\
# -*- coding: utf-8 -*-
"Set of each line from the text file."

data = {pprint.pformat(set_of_strings, indent=2)}
"""
modulefile = Path('')
modulefile.write_text(moduletxt, 'utf8')

I ran the cell and checked the generated script until it looked right.

Linear search.

Although the OP said the text file was sorted, others had already noted that trying binary search would be slowed as you would have to read every line first into a list... you might as well do a comparison as you read each line, so I needed to generate a script to do just that:

# %% Create a script to just search for a line containing its arg whilst reading.
scripttxt = """\
# -*- coding: utf-8 -*-
"Search for first line in text file matching argument."

import sys

to_find = sys.argv[1] if len(sys.argv) > 1 else ''

with open('string_in_long_list_2.txt', encoding='utf8') as txtfile:
    for line in txtfile:
        if to_find == line.rstrip('\\n'):
            print("True")
            sys.exit(0)
            break
    else:
        print("False")
        sys.exit(1)
"""
linesearchfile = Path('')
linesearchfile.write_text(scripttxt, 'utf8')

When writing this, I first had no triple quotes around what became scripttxt =, and could rerun the contained code as I debugged. I then encapsulated it in scripttxt = """\ and added the last couple of lines to generate the file.

 Match against set

This generates the python script that when given a string as its argument, will check against data loaded as a set from its module.

# %% Create a script to look for the arg in the modules data set
scripttxt = """\
# -*- coding: utf-8 -*-
"test if argument is in data loaded from module as a set."

from string_in_long_list_3 import data
import sys

to_find = sys.argv[1] if len(sys.argv) > 1 else ''

if to_find in data:
    print("True")
    sys.exit(0)
"""
modulesearchfile = Path('')
modulesearchfile.write_text(scripttxt, 'utf8')

The end of the Python script!


I was always going to time runs of the scripts out in the shell of the OS, using the GNU/Linux time command that I, and many others, are familiar with. I would add comments to tell a story and show I had "nothing up my sleeves".

(py310) $
(py310) $ $
(py310) $ # New dir
(py310) $ mkdir work
(py310) $ cd work
(py310) $ ls -a
.  ..
(py310) $ cp ../ .
(py310) $ ls -a
.  .. $
(py310) $ # Create files
(py310) $ python3 $
(py310) $ # (How many lines)
(py310) $ wc -l *
      75
  349999 string_in_long_list_2.txt
  350002
      16
      14
  700106 total
(py310) $
(py310) $ # time linear search of text file
(py310) $ time python3 'Fee fi word175000 fum'
True

real    0m0.102s
user    0m0.058s
sys     0m0.000s
(py310) $ time python3 'Fee fi word175000 fum'
True

real    0m0.102s
user    0m0.045s
sys     0m0.015s
(py310) $ time python3 'Fee fi word175000 fum'
True

real    0m0.094s
user    0m0.043s
sys     0m0.009s
(py310) $ time python3 'Fee fi word175000 fum'
True

real    0m0.100s
user    0m0.056s
sys     0m0.000s
(py310) $ time python3 'Fee fi word175000 fum'
True

real    0m0.106s
user    0m0.030s
sys     0m0.030s
(py310) $
(py310) $  string_in_long_list_5.py
string_in_long_list_2.txt $
(py310) $ # time creation of module .pyc file and search of data in set
(py310) $ time python3 'Fee fi word175000 fum'
True

real    0m1.578s
user    0m0.869s
sys     0m0.667s
(py310) $
(py310) $ # time search of data in set from compiled module file
(py310) $ time python3 'Fee fi word175000 fum'
True

real    0m0.186s
user    0m0.120s
sys     0m0.047s
(py310) $ time python3 'Fee fi word175000 fum'
True

real    0m0.220s
user    0m0.178s
sys     0m0.021s
(py310) $ time python3 'Fee fi word175000 fum'
True

real    0m0.199s
user    0m0.155s
sys     0m0.023s
(py310) $ time python3 'Fee fi word175000 fum'
True

real    0m0.193s
user    0m0.136s
sys     0m0.033s
(py310) $ time python3 'Fee fi word175000 fum'
True

real    0m0.187s
user    0m0.127s
sys     0m0.040s
(py310) $

My linear search is faster than the set lookup, even after the set data is precompiled into a .pyc module.

The OP could adapt to use their text file I guess, and use more typical string searches, but only they have that data.
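For anyone who wants a starting point to adapt, here is a minimal in-process sketch using the standard library's timeit. It is not one of the scripts above and it measures something different: only the lookup itself, with the data already in memory, whereas the shell timings include interpreter start-up and loading the data. The variable names are illustrative.

```python
import timeit

# Same synthetic data as in the text: 350_000 distinct lines.
lines = [f"Fee fi word{i:06} fum" for i in range(350_000)]
line_set = set(lines)
target = "Fee fi word175000 fum"

found_in_list = target in lines     # linear scan, stops at the first match
found_in_set = target in line_set   # hash lookup

# Time only the lookups; building the list and the set is excluded here.
repeats = 20
list_seconds = timeit.timeit(lambda: target in lines, number=repeats)
set_seconds = timeit.timeit(lambda: target in line_set, number=repeats)

print(f"list: {list_seconds / repeats:.6f} s per lookup")
print(f"set:  {set_seconds / repeats:.9f} s per lookup")
```

Which measurement matters depends on the use case: a long-running program that loads the data once will see the in-memory numbers, while a script started afresh for every query pays the loading cost each time, as the shell timings show.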

Big O

The things about big O comparisons that cloud their use when people have real data are:

  • People forget that actual timings put values on the constants that are abstracted away when you compare only BigO. If the actual timings of two algorithms are 100*x + 100 and 3*x**2 + 2*x + 1, then although the first algorithm is proportional to x and the second is proportional to x**2, there is a range for x, the data, where the second x**2 algorithm is faster.
  • Python is interpreted. It makes it harder to get the true BigO dependency for an interpreted algorithm when things like variable name accesses and repeated code interpretation speeds may be significant.     
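The crossover in the first point can be worked out directly. The sketch below treats the two formulas as hypothetical cost functions (the function names are made up) and lists the integer sizes for which the quadratic one is cheaper.

```python
def cost_linear(x):
    """Hypothetical timing of the algorithm proportional to x."""
    return 100 * x + 100


def cost_quadratic(x):
    """Hypothetical timing of the algorithm proportional to x**2."""
    return 3 * x**2 + 2 * x + 1


# Collect every size in 0..999 where the quadratic algorithm is the faster one.
quadratic_wins = [x for x in range(1000) if cost_quadratic(x) < cost_linear(x)]

print(quadratic_wins[0], quadratic_wins[-1])  # 0 33
```

So with these constants the x**2 algorithm wins for every size up to 33, and the linear one only pulls ahead from 34 onwards.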

If people want faster, then they should ask with an expectation of needing to give data.



David Amos: Want cleaner code? Use the rule of six

Sat, 2022-09-24 08:00

This article contains affiliate links. See my affiliate disclosure for more information.

Everyone wants to write clean code. There are whole books about it!

But you don't need to read a book to write cleaner code right now. There's one "trick" that every coder can learn to make their code less confusing.

The key is:

Every line does only one thing

One line, one task.

But don't go crazy with it.

Don't be like this.

Here's the main idea: Short lines of code require less brainpower to read than long ones. Code that's easy to read is easier to reason about. Programs with shorter lines are, in theory, easier to maintain.

But compact code can be cryptic. (Ever seen APL?) And just because you can split a line doesn't mean you should.

In some languages, you can assign two values to two variables on one line:

x, y = 2, 7

You could put both assignments on their own line:

x = 2
y = 7

But, c'mon. Do you really need to? How can you tell if a line should be split up?

It's not all about line length

Felienne Hermans opens her book The Programmer's Brain with an undeniable truth: "Confusion is a part of programming."

It probably means it's time to take a break.

Hermans' book (which I highly recommend) explains how your brain's three memory functions work together to understand code:

  • Long-term memory (LTM): Stores information for long-term retrieval, such as keywords, syntax, and commonly used idioms and patterns.
  • Short-term memory (STM): Stores new information for short-term retrieval (less than 30 seconds!), such as variable names and special values.
  • Working memory (WM): Processes information from LTM and STM to draw conclusions and derive new knowledge.

STM and WM are small. Both can only store about 4 to 6 things at a time! Overload them and you've got a recipe for confusion.

How your brain processes information.

That gives us a rule for deciding if a line of code is too complex:

A line of code containing 6+ pieces of information should be simplified.

I call it the "rule of six."

Here's an example in Python:

map(lambda x: x.split('=')[1], s.split('?')[1].split('&')[-3:])

Is that hard for you to read? Me too. There's a good reason why.

You have to know what map, lambda, and .split() are. The variables x and s, the strings '=', '?', and '&', the index [1], and the slice [-3:] all take up space in STM and WM. In total: ten things! Your brain can't keep up.

Or maybe yours can.

If so, you've got some good experience under your belt.

Your brain "chunks" syntax like s.split('?')[1] into "the part of the string to the right of the question mark." And you can reconstruct the code using information stored in your LTM. But you still only process a few chunks at a time.

So… we can identify when a line of code is too complex. Now what?

If code is confusing, break it

Break it into smaller pieces, that is!

There are two strategies I use to break up code. I call them SIMPLE and MORF.

The SIMPLE strategy adds lines of code to decrease cognitive load.

Let's apply SIMPLE to that nasty one-liner we saw earlier. Remove the second argument from map() and put it on its own line:

query_params = s.split('?')[1].split('&')[-3:]
map(lambda x: x.split('=')[1], query_params)

It still might be hard to read. There are seven things to keep track of in the first line:

  • query_params
  • s
  • .split()
  • '?'
  • [1]
  • '&'
  • [-3:]

But each line has fewer things to track than before. Your brain can process them more easily.

Apply SIMPLE again and move s.split('?')[1] to a new line:

url_query_string = s.split('?')[1]
query_params = url_query_string.split('&')[-3:]
map(lambda x: x.split('=')[1], query_params)

Compare that to the original one-liner. Which one is easier to process?

The MORF strategy takes a different approach and groups code into functions.

Here's what MORF looks like applied to our one-liner:

def query_params(url):
    return url.split('?')[1].split('&')[-3:]

map(lambda x: x.split('=')[1], query_params(s))

You can even combine MORF and SIMPLE:

def query_params(url):
    query_string = url.split('?')[1]
    return query_string.split('&')[-3:]

map(lambda x: x.split('=')[1], query_params(s))

You don't have to understand the code to feel the effect. Each line is easier for your brain to process.
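If you do want to run it, here is a small sketch that checks the refactored version against the original one-liner. The URL in s is a made-up example (the article never shows one), and map is wrapped in list() only so the results can be printed and compared.

```python
# Hypothetical input: any URL with at least three query parameters will do.
s = "https://example.com/search?lang=en&q=python&page=2&sort=date"

# The original one-liner.
one_liner = list(map(lambda x: x.split('=')[1], s.split('?')[1].split('&')[-3:]))


# MORF and SIMPLE combined, as in the text.
def query_params(url):
    query_string = url.split('?')[1]
    return query_string.split('&')[-3:]


refactored = list(map(lambda x: x.split('=')[1], query_params(s)))

print(one_liner)   # ['python', '2', 'date']
print(refactored)  # ['python', '2', 'date']
```

Both versions return the values of the last three query parameters, so the extra lines change how the code reads, not what it does.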

There's a bonus benefit, too!

Once you know that your WM and STM aren't overloaded, you know that any confusion left over is due to missing information in your LTM.

In other words, SIMPLE and MORF don't just help you write cleaner code. They help you identify knowledge gaps that you can improve with practice!

Want to know more about how your brain works while you're coding?

Check out The Programmer's Brain by Felienne Hermans.

Get instant access through Manning, or order it from Amazon.


Look at the code we ended up with using SIMPLE:

url_query_string = s.split('?')[1]
query_params = url_query_string.split('&')[-3:]
map(lambda x: x.split('=')[1], query_params)

One line still has over six "ideas" in it and should, according to the rule of six, be split up:

  • Which line?
  • What are the "ideas?"
  • How would you split it up?
  • Did splitting it up make a big difference?

IslandT: Python Tutorial — Chapter 8

Sat, 2022-09-24 03:14

In this chapter let us look at the Python dictionary collection object. A Python dictionary stores data as key and value pairs. Keys must be unique: a dictionary cannot hold the same key twice.

In order to create the dictionary’s key and value pair, you need a key that links to a value (data) like below:-

power = { "station1": 200, "station2": 300, "location": "New York" }

In order to retrieve the value from a dictionary, use its key as follows:-

power['station1'] # 200

In order to change the value of a key, all you need to do is as follows:-

power['station1'] = 500

Example: Loop and print the key and value pairs within the above dictionary.

for key, value in power.items():
    print(key + " : " + str(value))

station1 : 200
station2 : 300
location : New York

Example: Loop and print only the values of the above dictionary.

for value in power.values(): print(value)

Example: Get an item from the key.

a = power.get("station1") # 200
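The difference between get and square brackets only shows up when the key is missing. Here is a short sketch of that case; the key "station9" and the variable names are made up for the illustration.

```python
power = {"station1": 200, "station2": 300, "location": "New York"}

present = power.get("station1")      # 200, same as power["station1"]
missing = power.get("station9")      # None: get does not raise for a missing key
fallback = power.get("station9", 0)  # 0: the second argument is the default value

# Square brackets raise KeyError when the key is not in the dictionary.
try:
    power["station9"]
except KeyError:
    raised_key_error = True
else:
    raised_key_error = False

print(present, missing, fallback, raised_key_error)
```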

Example: Get rid of one item in the above dictionary.

print(power.pop("station1"))
print(power)

200
{'station2': 300, 'location': 'New York'}

Example: Put more items into the above dictionary.

power['station3'] = 600

Example: Clear all the key and value pairs in the above dictionary.

power.clear() # {}

Example: Print the keys of the above dictionary.

for key in power.keys(): print(key)

Example: Remove and return the key and value pair of the last item in the above dictionary.

power.popitem()

Example: Make and assign a copy of the above dictionary to another dictionary.

power1 = power.copy()

There are still a lot of dictionary methods I have not yet covered, and you can learn all of them on the official Python documentation page.


PyBites: Help, I need to refactor a mega class! Here are 5 tips …

Sat, 2022-09-24 01:38

Somebody asked the other day for tips on how to refactor a mega-class. It was actually one of the first tasks on the new job, ouch!

A single class, several thousand lines of code, no tests available.

You might scratch your head and say WTF?! After all, good developers decouple code into manageable pieces and write a test suite, no? Well, not everywhere; horror stories like this are more common than you might think.

So the person tasked with this was now puzzled:

– How do you figure out what is going on with a new class?

– Do you go function by function and read the code and make notes?

– Do you step through with a debugger?  

– Maybe call the class methods directly to understand what they do?

Luckily we have an awesome community of developers (join here) which graciously came to the rescue.

Here are some tips I distilled from the conversation:

1. Run it.

You need a way to run the code. See where it is used and how. That should give you hints and hopefully it allows you to write a couple of tests for it so you have a safety net for refactoring.

2. Narrow down the scope.

It would be good to ask why the code needs to be refactored. Is it causing any problems?

Then zoom in on specific areas and address those first, not touching the rest yet (“if it ain’t broken, don’t fix it”).

3. Apply 80/20 (Pareto principle).

Sometimes it works to clean up and refactor only the most-used pieces of the code first. Once a high percentage has been cleaned up, you can take on the full refactor of the remainder, and you may find that the full refactor is unnecessary.

There is actually a great book on this topic: Working Effectively with Legacy Code by Michael Feathers.

Also the following video was shared as a helpful resource providing a step-by-step approach: Brett Slatkin’s Refactoring Python: Why and how to restructure your code (PyCon 2016).

4. Regroup the code by functionality.

One long God class might not be necessary. Can it be broken up into multiple functions, smaller classes, and more modules?

Maybe you can logically break it out over different areas. A nice side effect is that overall naming and encapsulation (namespacing) will improve as well.

One thing I typically find is that once I start breaking it down into smaller classes, components, methods, and so on, I keep going until it is all changed. That makes me feel better, but it can take longer. You may want to create some initial tests to confirm that the as-is and the to-be code work the same for external calls.
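
To make the "as-is versus to-be" test idea concrete, here is a minimal sketch of such a safety-net test. LegacyReport and format_total are hypothetical names invented for this illustration; they stand in for a piece of the mega class and the function extracted from it.

```python
# Hypothetical stand-in for a method buried in the legacy mega class.
class LegacyReport:
    def __init__(self, prices):
        self.prices = prices

    def render(self):
        total = 0
        for price in self.prices:
            total += price
        return "TOTAL: " + str(round(total, 2))


def format_total(prices):
    """Hypothetical refactored ("to-be") version of LegacyReport.render."""
    return f"TOTAL: {round(sum(prices), 2)}"


def test_refactor_keeps_behaviour():
    # Record what the old code does today and require the new code to match.
    cases = [[], [1], [1.5, 2.25], [10, -3, 0.333]]
    for prices in cases:
        assert format_total(prices) == LegacyReport(prices).render()


test_refactor_keeps_behaviour()
print("as-is and to-be agree")
```

The test does not judge whether the old output is right. It only pins the current behaviour down so that each refactoring step can be checked against it.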

5. Improve your code reading skills.

In The Programmer’s Brain, Felienne Hermans offers 7 strategies to use as you are reading code:

  • Activating: Thinking of related things to activate prior knowledge
  • Monitoring: Keeping track of your understanding of a text
  • Determining importance: Deciding what parts of a text are most relevant
  • Inferring: Filling in facts that are not explicitly given in the text
  • Visualizing: Drawing diagrams of the read text to deepen understanding
  • Questioning: Asking questions about the text at hand
  • Summarising: Creating a short summary of a text

You can use these steps to create a model of the code. Don’t be afraid of non-digital tools either. White boards and paper and pencil can be used to draw things out. It’s important to create mental models of how the different parts of the code relate to each other.

By the way, we spoke with AJ about The Programmer’s Brain book on our podcast, you can listen here.

We hope that these tips help you if you’re ever faced with this situation at your (new) developer job.

Thanks and shoutout to Heather, Thomas, AJ, Richard, Russell and David from our Pybites Community for contributing to this discussion!

Categories: FLOSS Project Planets

PyCharm: PyCharm 2022.3 EAP 1 Is Out!

Fri, 2022-09-23 12:34

We are launching the Early Access Program (EAP) for PyCharm 2022.3! This means that you can get access to the features that we are still polishing for the major release. We are looking forward to all your feedback on the EAP versions of PyCharm. This will help us to catch unforeseen bugs quickly – your active participation in the EAP helps us make PyCharm better!

Important! PyCharm EAP builds are not fully tested and might be unstable.

The Toolbox App is the easiest way to get the EAP builds and to keep both your stable and EAP versions up to date. You can also manually download the EAP builds from our website.


Below, you’ll find some of the improvements in PyCharm 2022.3 EAP #1. Please try them out and share your feedback using our issue tracker or in the comments.

UI

New UI available via settings

In May of this year, we announced a closed preview program for the new UI for JetBrains IDEs. We aimed to introduce the reworked look and feel of the IntelliJ-based products to a limited number of users. The preview program helped us accumulate and process a lot of insightful feedback, and now we are ready to invite everyone to try out the new UI.

We invite you to switch to the new UI in Preferences / Settings | Appearance & Behavior | New UI Preview. Give it a test drive and share your thoughts about this huge change with us!

Option to dock tool windows to floating editor tabs

To make it more convenient to arrange your working space and interact with PyCharm using multiple monitors, we’ve implemented the option to drag tool windows out of the main window and dock them to floating editor tabs.

Improved user experience with Search Everywhere results

We have fine-tuned the algorithm behind the Search Everywhere results list to make its behavior more predictable and the selection of the elements you’re searching for more accurate. Now, when you start typing your query, the IDE freezes the first search results that appear and doesn’t re-sort them when more options are found as earlier versions did.

The machine learning ranking is now enabled by default for the Files tab, resulting in improved accuracy of the lookup results and shorter search sessions.

Terminal: support for Conda environments on Windows

For Windows OS, the PyCharm built-in terminal now recognizes if the project has a Conda environment and sets itself up accordingly. This now works for the default built-in PowerShell terminal. 

Improved UX for Python Console and debugger Command Queue: easy switch between on and off modes

In PyCharm 2021.3, we added a Command Queue to the Python Console. The Command Queue allows you to write new commands in the console while previous commands are still being executed. We also added a convenient way to switch Command Queue off: Go to Preferences / Settings | Build, Execution, Deployment | Console and uncheck the Command Queue for Python Console checkbox.

Performance improvements for Special Variables

While working in the Python Console or debugger, you are able to preview variables and get a more detailed visualization of them as dataframes or arrays in the Data tab of the SciView window. For PyCharm 2022.3 we sped up the Special Variables list loading by making the elements of the groups load on demand.

We also fixed the problem with the array display of complex numbers. To see complex numbers as arrays, right-click on them in the Python Console and choose the Show as Array option.

Fixes for Python 2 support

We fixed an issue with running Python 2 code in the debugger. Now PyCharm can locate the Python executable file and run the file in the debugger.


As we continue improving support for docstrings, we included some enhancements in this EAP.  For numpy docstrings, PyCharm recognizes function parameters documented in the Other Parameters section and provides proper code insight for such function parameters.

Minor update on the Google docstrings: PyCharm now properly handles the multiline blocks in the Returns section of Google docstrings so that all lines are now displayed.
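
For readers who have not met these docstring sections, here is a small sketch of what they look like. The functions scale and describe are invented for this illustration and are not part of PyCharm or of the release notes. The first uses a numpy-style docstring with an Other Parameters section, the second a Google-style docstring with a multiline Returns block.

```python
def scale(values, factor=1.0, *, clip=None):
    """Multiply each value by a factor.

    Parameters
    ----------
    values : list of float
        Numbers to scale.
    factor : float, optional
        Multiplier applied to every value.

    Other Parameters
    ----------------
    clip : float, optional
        Upper bound applied after scaling. Rarely needed.

    Returns
    -------
    list of float
        The scaled values.
    """
    scaled = [v * factor for v in values]
    if clip is not None:
        scaled = [min(v, clip) for v in scaled]
    return scaled


def describe(values):
    """Summarize a sequence of numbers.

    Args:
        values: Numbers to summarize.

    Returns:
        A tuple ``(smallest, largest)`` with the minimum and
        maximum of ``values``. The description deliberately
        spans several lines.
    """
    return min(values), max(values)


print(scale([1, 2, 3], 2, clip=5))
print(describe([3, 1, 2]))
```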

Descriptive error messages for unsuccessful virtual environment creation

Creating virtual environments is not always easy. Knowing that, we changed the messages the user receives when a virtual environment could not be successfully created. When environment creation fails in this version, the reason should be clear to the user.

Frontend development

Bundled plugins

We updated the list of bundled plugins that might be helpful for frontend development. It now includes vue.js, tailwind, prettier, karma, styled-components, node.js, and intellij.nextjs. This means that these plugins are now available in PyCharm out of the box so that you can get proper code insight, completion, and more for Vue or Tailwind CSS without additional tweaks to your IDE.

New project templates for Vite and Next.js

PyCharm 2022.3 includes project templates to help you get up and running quickly with Vite and Next.js. The new project templates run all of the necessary backend scripts for you and set up all of the dependencies. This leaves you with a nice skeleton project that has everything installed and ready to go. You can find the new templates in the main menu under File | New | Project… or on the Welcome screen.

For the full list of the improvements available in PyCharm 2022.3 EAP #1, check out the release notes.

The PyCharm team

Categories: FLOSS Project Planets

Python for Beginners: Rename Columns in a Dataframe in Python

Fri, 2022-09-23 09:00

Pandas dataframes are one of the most efficient data structures to handle tabular data in python. When we import tabular data into dataframes from csv files, we usually need to rename the columns in the dataframes. In this article, we will discuss how we can rename columns in a dataframe in python.

Rename DataFrame Columns Using the rename() Method

The pandas module provides us with the rename() method to rename columns in a dataframe. The rename() method, when invoked on a dataframe, takes a python dictionary as its first input argument. The keys in the dictionary should consist of the original name of the columns that are to be renamed. The values associated with the keys should be the new column names.  After execution, the rename() method returns a new dataframe with the modified name. For example, we can rename the ‘Roll’ column of the given dataframe using the rename() method as shown in the following example.

import pandas as pd
import numpy as np
df=pd.read_csv("demo_file.csv")
print("The dataframe is:")
print(df)
print("The original column names are:")
print(df.columns.values)
nameDict={"Roll":"Roll No."}
df=df.rename(columns=nameDict)
print("The modified column names are:")
print(df.columns.values)

The dataframe is:
     Name  Roll    Language
0  Aditya     1      Python
1     Sam     2        Java
2   Chris     3         C++
3    Joel     4  TypeScript
The original column names are:
['Name' 'Roll' 'Language']
The modified column names are:
['Name' 'Roll No.' 'Language']

If you want to rename multiple columns in the dataframe, you can pass the old column names and new column names in the dictionary as follows.

import pandas as pd
import numpy as np
df=pd.read_csv("demo_file.csv")
print("The dataframe is:")
print(df)
print("The original column names are:")
print(df.columns.values)
nameDict={"Name":"Person","Roll":"Roll No."}
df=df.rename(columns=nameDict)
print("The modified column names are:")
print(df.columns.values)

The dataframe is:
     Name  Roll    Language
0  Aditya     1      Python
1     Sam     2        Java
2   Chris     3         C++
3    Joel     4  TypeScript
The original column names are:
['Name' 'Roll' 'Language']
The modified column names are:
['Person' 'Roll No.' 'Language']

In the above examples, the rename() method does not modify the column names of the original dataframe. Instead, it returns a new dataframe with the modified column names, which we assign back to the same variable.

You can also rename the columns of the original dataframe. For this, we will use the ‘inplace’ parameter of the rename() method. The ‘inplace’ parameter takes an optional input argument and it has the default value False. Due to this, the column names in the original dataframe aren’t modified. You can set the ‘inplace’ parameter to the value True to modify the column names of the original dataframe as shown below.

import pandas as pd
import numpy as np
df=pd.read_csv("demo_file.csv")
print("The dataframe is:")
print(df)
print("The original column names are:")
print(df.columns.values)
nameDict={"Name":"Person","Roll":"Roll No."}
df.rename(columns=nameDict,inplace=True)
print("The modified column names are:")
print(df.columns.values)

The dataframe is:
     Name  Roll    Language
0  Aditya     1      Python
1     Sam     2        Java
2   Chris     3         C++
3    Joel     4  TypeScript
The original column names are:
['Name' 'Roll' 'Language']
The modified column names are:
['Person' 'Roll No.' 'Language']

Suggested Reading: If you are into machine learning, you can read this article on regression in machine learning. You might also like this article on k-means clustering with numerical example.

Rename DataFrame Columns Using a List of Column Names

If you have to rename all the columns of the dataframes at once, you can do it using a python list. For this, we just have to assign the list containing the new dataframe names to the ‘columns’ attribute of the dataframe as shown below.

import pandas as pd
import numpy as np
df=pd.read_csv("demo_file.csv")
print("The dataframe is:")
print(df)
print("The original column names are:")
print(df.columns.values)
df.columns=['Person', 'Roll No.', 'Language']
print("The modified column names are:")
print(df.columns.values)

The dataframe is:
     Name  Roll    Language
0  Aditya     1      Python
1     Sam     2        Java
2   Chris     3         C++
3    Joel     4  TypeScript
The original column names are:
['Name' 'Roll' 'Language']
The modified column names are:
['Person' 'Roll No.' 'Language']

Conclusion

In this article, we have discussed how to rename columns in a dataframe in python. To know more about python programming, you can read this article on dictionary comprehension in python. You might also like this article on list comprehension in python.

The post Rename Columns in a Dataframe in Python appeared first on

Categories: FLOSS Project Planets

Real Python: The Real Python Podcast – Episode #126: Python as an Efficiency Tool for Non-Developers

Fri, 2022-09-23 08:00

Are you interested in using Python in an industry outside of software development? Would adding a few custom software tools increase efficiency and make your coworkers' jobs easier? This week on the show, Josh Burnett talks about using Python as a mechanical engineer.

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Python Meeting Düsseldorf - 2022-09-28

Fri, 2022-09-23 05:00

The following text was originally written in German, since it announces a regional user group meeting in Düsseldorf, Germany.


The next Python Meeting Düsseldorf will take place on the following date:

28.09.2022, 18:00
Room 1, 2nd floor (2. OG), Bürgerhaus Stadtteilzentrum Bilk
Düsseldorfer Arcaden, Bachstr. 145, 40217 Düsseldorf

Program

Talks already registered

Marc-Andre Lemburg:
         "Choosing the right database for your next project - Looking at options beyond PostgreSQL and MySQL"

Lars Lengersdorf:
        "Introduction to Stable Diffusion"

Marc-Andre Lemburg:
        "Report from PyCon SK 2022 and PyCon UK 2022"

Detlef Lannert:
        "Using exceptions vs. structured return values"

Detlef Lannert:
        "Introduction to Structural Pattern Matching"

Additional talks are welcome. If you are interested, please get in touch.

Start time and location

We meet at 18:00 at the Bürgerhaus in the Düsseldorfer Arcaden.

The Bürgerhaus shares its entrance with the swimming pool and is located next to the entrance to the underground car park of the Düsseldorfer Arcaden.

Above the entrance there is a large "Schwimm’ in Bilk" logo. Behind the door, turn directly left to the two elevators, then go up to the 2nd floor. The entrance to Room 1 is directly on the left when you leave the elevator.

>>> Entrance in Google Street View


The Corona restrictions have since been lifted. Caution is still advised, but it is now up to each individual.

⚠️ Important: Please only register if you are absolutely sure that you will attend. Given the limited number of seats, we have no sympathy for last-minute cancellations or no-shows.


The Python Meeting Düsseldorf is a regular event in Düsseldorf aimed at Python enthusiasts from the region.

Our PyDDF YouTube channel, where we publish videos of the talks after the meetings, gives a good overview of the talks.

The meeting is organized by the GmbH, Langenfeld, in cooperation with Clark Consulting & Research, Düsseldorf:


The Python Meeting Düsseldorf uses a mix of (lightning) talks and open discussion.

Talks can be registered in advance or brought in spontaneously during the meeting. A projector with XGA resolution is available.

To register a (lightning) talk, please send an informal email to


The Python Meeting Düsseldorf is organized by Python users for Python users.

Since the meeting room, projector, internet, and drinks incur costs, we ask participants for a contribution of EUR 10.00 incl. 19% VAT. School and university students pay EUR 5.00 incl. 19% VAT.

We ask all participants to bring the amount in cash.


Since we can only accommodate 25 people in the rented room, we ask that you register in advance.

Please register for the meeting via Meetup

Further information

Further information is available on the meeting's website:


Have fun!

Marc-Andre Lemburg,

Categories: FLOSS Project Planets

Nicola Iarocci: Eve 2.0.2 released

Fri, 2022-09-23 02:05
Eve 2.0.2 was just released today. It fixes a problem introduced with v2.0 in which ETag generation failed if uuidRepresentation was not set in MONGO_OPTIONS. See issue #1486 for details. Many thanks to @tgm for reporting and then contributing the fix. Subscribe to the newsletter, the RSS feed, or follow @nicolaiarocci on Twitter.
Categories: FLOSS Project Planets

PyBites: Tips for Navigating the Job Hunt with Rhys Powell

Thu, 2022-09-22 10:26

Listen here:

The job hunt is on! With so many people looking to change things up with work, whether it be a new role, a new company, remote work – you name it – we decided it was time to talk a little about the Job Hunt.

This week, I.T veteran and long-standing Pybites Community Member, Rhys Powell joins me (Julian!) to share some tips around searching and applying for jobs these days.

Rhys brings a wealth of experience from both sides of the fence as someone seeking a change in I.T and also as the Hiring Manager.

In this casual, fun and lively discussion, Rhys shares his tips on:

  • Identifying the right company for you
  • Challenging yourself to just get in there and apply
  • Thinking through your personal values
  • Tackling your job interview
  • Questions to ask in your interview
  • and more.

If you’d like to follow Rhys, and we 100% recommend you do, you can find him in the following places. Totally check out his Twitch, it’s one of my favourite ways to wind down after hosting the Mindset call in PDM!




Links from this episode: 

Categories: FLOSS Project Planets

IslandT: Python Tutorial — Chapter 7

Thu, 2022-09-22 05:36

In this chapter let us look at how to store various items within a tuple. A tuple is just like a list that can be used to store multiple items and then allows us to retrieve those items through its index.

Here is how to declare a tuple, either through its constructor or the (), and how to retrieve a particular item within that tuple.

atuple = tuple((1,2,3))
atuple1 = (1,2,3)
atuple[0] # 1, tuple indexes start at 0

In the example below, let us create the above tuple again and use a for loop to sum up all the numbers within that tuple.

atuple = tuple((1,2,3))
total = 0 # named total so the built-in sum() is not shadowed
for num in atuple:
    total += num
print(total) # 6

Just like a list, a tuple can hold elements of different types:

atuple = ("hi", True, 1)

A tuple is one of the sequence types in Python and works almost the same as a list, except that a tuple cannot be changed once it has been created.
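
One difference from a list is worth a short sketch: a tuple cannot be modified in place. The variable names below are made up for illustration.

```python
point = (1, 2, 3)

try:
    point[0] = 10  # tuples do not support item assignment
except TypeError as error:
    message = str(error)
    print(message)

# Build a new tuple instead of changing the existing one.
moved = (10,) + point[1:]
print(moved)  # (10, 2, 3)
```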

Categories: FLOSS Project Planets

Talk Python to Me: #382: Apache Superset: Modern Data Exploration Platform

Thu, 2022-09-22 04:00
When you think data exploration using Python, Jupyter notebooks likely come to mind. They are excellent for those of us who gravitate towards Python. But what about your everyday power user? Think of that person who is really good at Excel but has never written a line of code. They can still harness the power of modern Python using a cool application called Superset.

This open source Python-based web app is all about connecting to live data and creating charts and dashboards based on it using only UI tools. It's super popular too, with almost 50,000 GitHub stars. Its creator, Max Beauchemin, is here to introduce it to us all.

Links from the show

  • Max on Twitter: @mistercrunch
  • Superset
  • 60 notebook environments
  • SQL Fluff linter
  • DB API PEP
  • Preset Company
  • Watch this episode on YouTube
  • Episode transcripts

Stay in touch with us

  • Subscribe to us on YouTube
  • Follow Talk Python on Twitter: @talkpython
  • Follow Michael on Twitter: @mkennedy

Sponsors

  • Sentry's DEX Conference
  • Talk Python Training
Categories: FLOSS Project Planets

Real Python: What Does if __name__ == "__main__" Do in Python?

Wed, 2022-09-21 10:00

You’ve likely encountered Python’s if __name__ == "__main__" idiom when reading other people’s code. No wonder—it’s widespread! You might have even used if __name__ == "__main__" in your own scripts. But did you use it correctly?

Maybe you’ve programmed in a C-family language like Java before, and you wonder whether this construct is a clumsy accessory to using a main() function as an entry point.

Syntactically, Python’s if __name__ == "__main__" idiom is just a normal conditional block:

1 if __name__ == "__main__":
2     ...

The indented block starting in line 2 contains all the code that Python will execute when the conditional statement in line 1 evaluates to True. In the code example above, the specific code logic that you’d put in the conditional block is represented with a placeholder ellipsis (...).

So—if there’s nothing special about the if __name__ == "__main__" idiom, then why does it look confusing, and why does it continue to spark discussion in the Python community?

If the idiom still seems a little cryptic, and you’re not completely sure what it does, why you might want it, and when to use it, then you’ve come to the right place! In this tutorial, you’ll learn all about Python’s if __name__ == "__main__" idiom—starting with what it really does in Python, and ending with a suggestion for a quicker way to refer to it.

Source Code: Click here to download the free source code that you’ll use to understand the name-main idiom.

In Short: It Allows You to Execute Code When the File Runs as a Script, but Not When It’s Imported as a Module

For most practical purposes, you can think of the conditional block that you open with if __name__ == "__main__" as a way to store code that should only run when your file is executed as a script.

You’ll see what that means in a moment. For now, say you have the following file:

 1 #
 2
 3 def echo(text: str, repetitions: int = 3) -> str:
 4     """Imitate a real-world echo."""
 5     echoed_text = ""
 6     for i in range(repetitions, 0, -1):
 7         echoed_text += f"{text[-i:]}\n"
 8     return f"{echoed_text.lower()}."
 9
10 if __name__ == "__main__":
11     text = input("Yell something at a mountain: ")
12     print(echo(text))

In this example, you define a function, echo(), that mimics a real-world echo by gradually printing fewer and fewer of the final letters of the input text.

Below that, in lines 10 to 12, you use the if __name__ == "__main__" idiom. This code starts with the conditional statement if __name__ == "__main__" in line 10. In the indented lines, 11 and 12, you then collect user input and call echo() with that input. These two lines will execute when you run as a script from your command line:

$ python
Yell something at a mountain: HELLOOOO ECHOOOOOOOOOO
ooo
oo
o
.

When you run the file as a script by passing the file object to your Python interpreter, the expression __name__ == "__main__" returns True. The code block under if then runs, so Python collects user input and calls echo().

Try it out yourself! You can download all the code files that you’ll use in this tutorial from the link below:

Source Code: Click here to download the free source code that you’ll use to understand the name-main idiom.

At the same time, if you import echo() in another module or a console session, then the nested code won’t run:

>>> from echo import echo
>>> print(echo("Please help me I'm stuck on a mountain"))
ain
in
n
.

In this case, you want to use echo() in the context of another script or interpreter session, so you won’t need to collect user input. Running input() would mess with your code by producing a side effect when importing echo.

When you nest the code that’s specific to the script usage of your file under the if __name__ == "__main__" idiom, then you avoid running code that’s irrelevant for imported modules.

Nesting code under if __name__ == "__main__" allows you to cater to different use cases:

  • Script: When run as a script, your code prompts the user for input, calls echo(), and prints the result.
  • Module: When you import echo as a module, then echo() gets defined, but no code executes. You provide echo() to the main code session without any side effects.

By implementing the if __name__ == "__main__" idiom in your code, you set up an additional entry point that allows you to use echo() right from the command line.
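
The two use cases can be tried out in one self-contained sketch. It uses the standard library's runpy module to execute the same file twice: once with __name__ set to "__main__", as when the file runs as a script, and once under a module name, which is roughly what an import does. The file name demo_module.py and the ran_as_script flag are made up for this illustration and do not come from the tutorial.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical module source: the flag flips only inside the name-main block.
SOURCE = '''\
ran_as_script = False

if __name__ == "__main__":
    ran_as_script = True
'''

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo_module.py"
    path.write_text(SOURCE)

    # Execute with __name__ == "__main__", like running the file as a script.
    as_script = runpy.run_path(str(path), run_name="__main__")
    # Execute with __name__ == "demo_module", similar to an import.
    as_module = runpy.run_path(str(path), run_name="demo_module")

print(as_script["ran_as_script"])  # True
print(as_module["ran_as_script"])  # False
```

runpy.run_path() returns the globals of the executed file, which is why the flag can be read back from both runs.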

Read the full article at »


Categories: FLOSS Project Planets

Django Weblog: 2022 Django Developers Survey

Wed, 2022-09-21 09:49

Please take a moment to fill out the 2022 Django Developers Survey. We are once again partnering with JetBrains, and the survey is available in 10 different languages.

The survey is an important metric of Django usage and helps guide future technical and community decisions. One recent example: past surveys demonstrated how popular Redis is, and built-in Redis caching support was added in Django 4.0 as a direct result of that feedback.
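
For context, this is roughly what enabling the built-in Redis cache backend looks like in a settings file. It is a minimal sketch, and the Redis address is an assumption about a locally running server.

```python
# settings.py (sketch)
CACHES = {
    "default": {
        # Built-in Redis backend, available since Django 4.0.
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        # Assumed address of a locally running Redis server.
        "LOCATION": "redis://127.0.0.1:6379",
    }
}
```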

After the survey is over, the aggregated results and anonymized raw data will be published.

Categories: FLOSS Project Planets

Python for Beginners: Rename Column by Index in Dataframes

Wed, 2022-09-21 09:00

Dataframes are used to handle tabular data in python. In this article, we will discuss how we can rename a column by index in dataframes in python.

Change Column Name Using Index Number

We can access the column names in a dataframe using the ‘columns’ attribute. The columns attribute of a dataframe contains an Index object. The Index object contains a list of column names as you can see in the following example.

import pandas as pd
import numpy as np
df=pd.read_csv("demo_file.csv")
print("The dataframe is:")
print(df)
print("The column object is:")
print(df.columns)

The dataframe is:
     Name  Roll    Language
0  Aditya     1      Python
1     Sam     2        Java
2   Chris     3         C++
3    Joel     4  TypeScript
The column object is:
Index(['Name', 'Roll', 'Language'], dtype='object')

You can access the array of column names using the ‘values’ attribute of the Index object as follows.

import pandas as pd
import numpy as np
df=pd.read_csv("demo_file.csv")
print("The dataframe is:")
print(df)
print("The column object is:")
print(df.columns)
print("The columns are:")
print(df.columns.values)

The dataframe is:
     Name  Roll    Language
0  Aditya     1      Python
1     Sam     2        Java
2   Chris     3         C++
3    Joel     4  TypeScript
The column object is:
Index(['Name', 'Roll', 'Language'], dtype='object')
The columns are:
['Name' 'Roll' 'Language']

To rename the column by index in the dataframe, we can modify the array in the values attribute. For instance, you can change the name of the first column using index 0 of the values array as follows.

import pandas as pd
import numpy as np
df=pd.read_csv("demo_file.csv")
print("The dataframe is:")
print(df)
print("The column object is:")
print(df.columns)
print("The columns are:")
print(df.columns.values)
df.columns.values[0]="First Name"
print("The modified column object is:")
print(df.columns)
print("The modified columns are:")
print(df.columns.values)

The dataframe is:
     Name  Roll    Language
0  Aditya     1      Python
1     Sam     2        Java
2   Chris     3         C++
3    Joel     4  TypeScript
The column object is:
Index(['Name', 'Roll', 'Language'], dtype='object')
The columns are:
['Name' 'Roll' 'Language']
The modified column object is:
Index(['First Name', 'Roll', 'Language'], dtype='object')
The modified columns are:
['First Name' 'Roll' 'Language']

In this approach, we cannot change multiple column names at once. To change multiple column names, you need to rename each column name one by one as follows.

import pandas as pd
import numpy as np
df=pd.read_csv("demo_file.csv")
print("The dataframe is:")
print(df)
print("The column object is:")
print(df.columns)
print("The columns are:")
print(df.columns.values)
df.columns.values[0]="First Name"
df.columns.values[1]="Roll Number"
print("The modified column object is:")
print(df.columns)
print("The modified columns are:")
print(df.columns.values)

The dataframe is:
     Name  Roll    Language
0  Aditya     1      Python
1     Sam     2        Java
2   Chris     3         C++
3    Joel     4  TypeScript
The column object is:
Index(['Name', 'Roll', 'Language'], dtype='object')
The columns are:
['Name' 'Roll' 'Language']
The modified column object is:
Index(['First Name', 'Roll Number', 'Language'], dtype='object')
The modified columns are:
['First Name' 'Roll Number' 'Language']

We can also change multiple column names by index at once using the rename() method. Let us discuss this approach. 

Suggested Reading: If you are into machine learning, you can read this article on regression in machine learning. You might also like this article on k-means clustering with numerical example.

Change Column Name Using rename() Method in a DataFrame

We can use the rename() method to rename multiple columns using the index numbers. The rename() method, when invoked on a dataframe, takes a dictionary as its input argument. The dictionary should contain the column names that need to be renamed as the keys. The new column names should be the values associated with the original keys. After execution, the rename() method returns a new dataframe with the modified column names. 

To modify the column names using the index number and rename() method, we will first obtain the array of column names using the columns.values attribute of the dataframe. After that, we will create a dictionary with column names as keys and the new column names as associated values for the keys. Then, we will pass the dictionary to the rename() method. After execution, the rename() method will return the dataframe with modified column names as follows.
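
The steps just described can be sketched so that the dictionary is actually built from index positions. This is a minimal illustration, not the article's own code: the in-memory dataframe stands in for demo_file.csv, and the names new_names_by_index and name_dict are made up.

```python
import pandas as pd

# Assumption: a small in-memory dataframe stands in for demo_file.csv.
df = pd.DataFrame(
    {
        "Name": ["Aditya", "Sam", "Chris", "Joel"],
        "Roll": [1, 2, 3, 4],
        "Language": ["Python", "Java", "C++", "TypeScript"],
    }
)

# Map column positions to new names, then look up the current names by index.
new_names_by_index = {0: "First Name", 1: "Roll No."}
name_dict = {df.columns[i]: new for i, new in new_names_by_index.items()}

renamed = df.rename(columns=name_dict)
print(list(renamed.columns))  # ['First Name', 'Roll No.', 'Language']
print(list(df.columns))       # ['Name', 'Roll', 'Language']
```

Looking the old names up with df.columns[i] keeps the renaming driven by position while leaving the original dataframe untouched. The article's own example, which spells the old names out by hand, follows.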

import pandas as pd
import numpy as np
df=pd.read_csv("demo_file.csv")
print("The dataframe is:")
print(df)
print("The column object is:")
print(df.columns)
print("The columns are:")
print(df.columns.values)
nameDict={"Name":"First Name","Roll":"Roll No."}
df=df.rename(columns=nameDict)
print("The modified column object is:")
print(df.columns)
print("The modified columns are:")
print(df.columns.values)

The dataframe is:
     Name  Roll    Language
0  Aditya     1      Python
1     Sam     2        Java
2   Chris     3         C++
3    Joel     4  TypeScript
The column object is:
Index(['Name', 'Roll', 'Language'], dtype='object')
The columns are:
['Name' 'Roll' 'Language']
The modified column object is:
Index(['First Name', 'Roll No.', 'Language'], dtype='object')
The modified columns are:
['First Name' 'Roll No.' 'Language']

Conclusion

In this article, we have discussed how to rename column by index in dataframes in python. To know more about python programming, you can read this article on dictionary comprehension in python. You might also like this article on list comprehension in python.

The post Rename Column by Index in Dataframes appeared first on

Categories: FLOSS Project Planets

Python GUIs: Getting started with VS Code for Python: Setting up a development environment

Wed, 2022-09-21 05:00

Setting up a working development environment is the first step for any project. Your development environment setup will determine how easy it is to develop and maintain your projects over time. That makes it important to choose the right tools for your project. This article will guide you through how to set up Visual Studio Code, which is a popular free-to-use, cross-platform code editor developed by Microsoft, in order to develop Python applications.

Visual Studio Code is not to be confused with Visual Studio, which is a separate product also offered by Microsoft. Visual Studio is a fully-fledged IDE that is mainly geared towards Windows application development using C# and the .NET Framework.

Set up a Python environment

In case you haven't already done this, Python needs to be installed on the development machine. You can do this by grabbing the installer for either Windows or macOS from the official Python website. Python is also available for installation via Microsoft Store on Windows devices.

Make sure that you select the option to Add Python to PATH during installation (via the installer).

If you are on Linux, you can check if Python is already installed on your machine by typing python3 --version in a terminal. If it returns an error, you need to install it from your distribution's repository. On Ubuntu/Debian, this can be done by typing sudo apt install python3. Both pip (or pip3) and venv are distributed as separate packages on Ubuntu/Debian and can also be installed by typing sudo apt install python3-pip python3-venv.
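Once Python runs, a similar check can be made from inside the interpreter. The following is a minimal sketch using only the standard library; it reports the running version and whether the pip and venv modules can be imported. The function name is hypothetical, and an importable module is only a rough signal that the corresponding distribution package is fully installed.

```python
import importlib.util
import sys


def describe_python_setup(modules=("pip", "venv")):
    """Return the running Python version and whether each module is importable."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    # find_spec returns None when a top-level module cannot be found
    available = {
        name: importlib.util.find_spec(name) is not None for name in modules
    }
    return version, available


version, available = describe_python_setup()
print(f"Running Python {version}")
for name, found in available.items():
    print(f"{name}: {'available' if found else 'missing'}")
```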

Set up Visual Studio Code

First, head over to the official Visual Studio Code website and grab the installer for your specific platform.

If you are on a Raspberry Pi (with Raspberry Pi OS), you can also install VS Code by simply typing sudo apt install code. On Linux distributions that support Snaps, you can do it by typing sudo snap install code --classic.

Once VS Code is installed, head over to the Extensions tab in the sidebar on the left by clicking on it or by pressing CTRL+SHIFT+X. Search for the 'Python' extension published by Microsoft and click on Install.

The Extensions tab in the left-hand sidebar.

Usage and Configuration

Now that you have finished setting up VS Code, you can go ahead and create a new Python file. Remember that the Python extension only works if you open a .py file or have selected the language mode for the active file as Python.

To change the language mode for the active file, simply press CTRL+K once and then press M after releasing the previous keys. This kind of keyboard shortcut is called a chord in VS Code. You can see more of them by pressing CTRL+K CTRL+S (another chord).

The Python extension in VS Code allows you to directly run a Python file by clicking on the 'Play' button on the top-right corner of the editor (without having to type python in the terminal).

You can also do it by pressing CTRL+SHIFT+P to open the Command Palette and running the > Python: Run File in Terminal command.

Finally, you can configure VS Code's settings by going to File > Preferences > Settings or by pressing CTRL+COMMA. In VS Code, each individual setting has a unique identifier, which you can see by clicking on the cog wheel that appears to the left of each setting and clicking on 'Copy Setting ID'. This ID is how specific settings are referred to in the rest of this article. You can also search for this ID in the search bar under Settings.

Linting and Formatting Support (Optional)

Linters make it easier to find errors and check the quality of your code. Code formatters, on the other hand, help keep the source code of your application compliant with the PEP 8 style guide (PEP stands for Python Enhancement Proposal), which makes it easier for other developers to read your code and collaborate with you.
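To give a feel for the kind of problem a linter reports, here is a toy sketch that flags imports that are never used. It is not how flake8 or pylint are implemented; it only uses the standard library's ast module, and the function name and sample source are made up for illustration.

```python
import ast

# Sample program to inspect: "os" is imported but never used
SOURCE = """\
import os
import sys

print(sys.version)
"""


def unused_imports(source):
    """Return the sorted names that are imported but never referenced."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a"
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)


print(unused_imports(SOURCE))
```

A real linter checks far more than this, but the idea is the same: it inspects your source without running it and reports likely mistakes.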

For VS Code to provide linting support for your projects, you must first install a preferred linter like flake8 or pylint.

pip install flake8

Then, go to Settings in VS Code and toggle the relevant setting (e.g. python.linting.flake8Enabled) for the Python extension depending on what you installed. You also need to make sure that python.linting.enabled is toggled on.

A similar process must be followed for code formatting. First, install something like autopep8 or black.

pip install autopep8

You then need to tell VS Code which formatter to use by modifying python.formatting.provider, and then toggle on editor.formatOnSave so that formatting happens without manual intervention.
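As a rough illustration, the settings mentioned so far end up as entries in VS Code's settings.json file. The sketch below builds them as a Python dictionary and prints the resulting JSON. The setting IDs are the ones named in this article; picking autopep8 as the provider is an assumption that matches the formatter installed above.

```python
import json

# Setting IDs as named in the article; the values are illustrative choices
settings = {
    "python.linting.enabled": True,
    "python.linting.flake8Enabled": True,
    "python.formatting.provider": "autopep8",
    "editor.formatOnSave": True,
}

settings_json = json.dumps(settings, indent=4)
print(settings_json)
```

Toggling the same options in the Settings UI has the same effect, so editing the JSON by hand is optional.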

If pip warns that the installed modules aren't in your PATH, you may have to specify the path to their location in VS Code (under Settings). Follow the method described under Working With Virtual Environments to do that.

Now, when you create a new Python file, VS Code automatically gives you a list of Problems (CTRL+SHIFT+M) in your program and formats the code on saving the file.

Identified problems in the source code, along with a description and line/column numbers.

You can also find the location of identified problems in the source overview on the right-hand side, inside the scrollbar.

Working With Virtual Environments

Virtual environments are a way of life for Python developers. Most Python projects require the installation of external packages and modules (via pip). Virtual environments allow you to separate one project's packages from your other projects, which may require a different version of those same packages. Hence, it allows all those projects to have the specific dependencies they require to work.

The Python extension makes it easier for you by automatically activating the desired virtual environment for the in-built terminal and Run Python File command after you set the path to the Python interpreter. By default, the path is set to use the system's Python installation (without a virtual environment).

To use a virtual environment for your project/workspace, you need to first make a new one by opening a terminal (View > Terminal) and typing python -m venv .venv. Then, you can set the default interpreter for that project by opening the Command Palette (CTRL+SHIFT+P) and selecting > Python: Select Interpreter.

You should now either close the terminal pane in VS Code and open a new one or type source .venv/bin/activate (on Windows, .venv\Scripts\activate) into the existing one to start using the virtual environment. Then, install the required packages for your project by typing pip install <package_name>.
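The same environment can be created from Python with the standard library's venv module, which is what python -m venv runs. The following is a minimal sketch under a few assumptions: it works in a temporary folder rather than a real project, the function name is hypothetical, and it skips installing pip to keep the example quick.

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_environment(project_dir):
    """Create a .venv folder inside project_dir and return its interpreter path."""
    env_dir = Path(project_dir) / ".venv"
    # with_pip=False keeps the sketch fast; "python -m venv" installs pip by default
    venv.create(env_dir, with_pip=False)
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


with tempfile.TemporaryDirectory() as project_dir:
    interpreter = create_environment(project_dir)
    interpreter_exists = interpreter.exists()
    print(f"Interpreter created: {interpreter_exists}")
```

The returned path is the kind of interpreter location that the Python: Select Interpreter command expects.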

VS Code, by default, looks for tools like linters and code formatters in the current Python environment. If you don't want to keep installing them over and over again for each new virtual environment you make (unless your project requires a specific version of that tool), you can specify the path to their location under Settings in VS Code.

  • flake8: python.linting.flake8Path
  • autopep8: python.formatting.autopep8Path

To find the global location of these packages on macOS and Linux, type which flake8 and which autopep8 in a terminal. If you are on Windows, you can use where <command_name>. Both these commands assume that flake8 and autopep8 are in your PATH.
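If you would rather not remember two different commands, Python's shutil.which performs the same PATH lookup on every platform. This is a small sketch; the function name is hypothetical, and the tool names are the ones used in this article.

```python
import shutil


def find_tools(names=("flake8", "autopep8")):
    """Map each tool name to its full path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


for name, path in find_tools().items():
    print(f"{name}: {path or 'not found on PATH'}")
```

Any path it prints can be pasted into the matching setting, such as python.linting.flake8Path.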

Understanding Workspaces in VS Code

VS Code has a concept of Workspaces. Each 'project folder' (or the root/top folder) is treated as a separate workspace. This allows you to have project-specific settings and enable/disable certain extensions for that workspace. It is also what allows VS Code to quickly recover the UI state (e.g. files that were previously kept open) when you open that workspace again.

In VS Code, each workspace (or folder) has to be 'trusted' before certain features like linters, autocomplete suggestions and the in-built terminal are allowed to work.

In the context of Python projects, if you tend to keep your virtual environments outside the workspace (where VS Code is unable to detect it), you can use this feature to set the default path to the Python interpreter for that workspace. To do that, first Open a Folder (CTRL+K CTRL+O) and then go to File > Preferences > Settings > Workspace to modify python.defaultInterpreterPath.

Setting the default interpreter path for the workspace.

In VS Code settings you can search for settings by name using the bar at the top.

You can also use this approach to do things like use a different linter for that workspace or disable the code formatter for it. The workspace-specific settings you change are saved in a .vscode folder inside that workspace, which you can share with others.
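To make the .vscode folder less mysterious, the sketch below writes python.defaultInterpreterPath into a workspace's .vscode/settings.json file by hand. It rests on several assumptions: the function name and the interpreter path are hypothetical, a temporary folder stands in for the workspace, and any existing settings file is plain JSON without comments.

```python
import json
import tempfile
from pathlib import Path


def set_workspace_interpreter(workspace, interpreter_path):
    """Write python.defaultInterpreterPath into <workspace>/.vscode/settings.json."""
    settings_file = Path(workspace) / ".vscode" / "settings.json"
    settings_file.parent.mkdir(parents=True, exist_ok=True)
    # Keep any settings already saved for this workspace
    settings = {}
    if settings_file.exists():
        settings = json.loads(settings_file.read_text(encoding="utf-8"))
    settings["python.defaultInterpreterPath"] = str(interpreter_path)
    settings_file.write_text(json.dumps(settings, indent=4), encoding="utf-8")
    return settings_file


with tempfile.TemporaryDirectory() as workspace:
    # Hypothetical location of a virtual environment kept outside the workspace
    written = set_workspace_interpreter(workspace, "/home/user/envs/demo/bin/python")
    saved = json.loads(written.read_text(encoding="utf-8"))
    print(saved)
```

In practice the Settings UI writes this file for you; the point is only that workspace settings are ordinary text that can be shared or kept under version control.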

If your VS Code is not recognizing libraries you are using in your code, double check the correct interpreter is being used. You can find which Python version you're using on the command line by running which python or which python3 on macOS/Linux, or where python or where python3 on Windows.
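You can also ask the interpreter itself. Running the short sketch below from VS Code (for example, with the 'Play' button) shows which Python executable is actually being used and whether it belongs to a virtual environment. The function name is hypothetical; the sys attributes are part of the standard library.

```python
import sys


def interpreter_info():
    """Return the interpreter path and whether it belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    in_virtual_env = sys.prefix != sys.base_prefix
    return sys.executable, in_virtual_env


executable, in_virtual_env = interpreter_info()
print(f"Interpreter: {executable}")
print(f"Virtual environment active: {in_virtual_env}")
```

If the printed path is not the one inside your project's virtual environment, select the right interpreter again from the Command Palette.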

Working With Git in VS Code (Optional)

Using Version Control is required for developing applications. VS Code does have in-built support for Git but it is pretty barebones, not allowing much more than tracking changes that you have currently made and committing/pushing those changes once you are done.

For the best experience, it is recommended to use the GitLens extension. It lets you view your commit history, check who made the changes and much more. To set it up, you first need to have Git installed on your machine and then install GitLens from the Extensions tab in the sidebar on the left. You can now use those Git-related features by going to the Git tab in the sidebar (CTRL+SHIFT+G).

There are more Git-related extensions you could try as well, like Git History and GitLab Workflow. Give them a whirl too!

Community-driven & open source alternatives

While VS Code is open source (MIT-licensed), the distributed versions include some Microsoft-specific proprietary modifications, such as telemetry (app tracking). If you would like to avoid this, there is also a community-driven distribution of Visual Studio Code called VSCodium that provides freely-licensed binaries without telemetry.

Due to legal restrictions, VSCodium is unable to use the official Visual Studio Marketplace for extensions. Instead, it uses a separate vendor neutral, open source marketplace called Open VSX Registry. It doesn't have every extension, especially proprietary ones, and some are not kept up-to-date but both the Python and GitLens extensions are available on it.

You can also use the open source Jedi language server for the Python extension, rather than the bundled Pylance language server/extension, by configuring the python.languageServer setting. You can then completely disable Pylance by going to the Extensions tab. Note that, if you are on VSCodium, Jedi is used by default (as Pylance is not available on Open VSX Registry) when you install the Python extension.


Having the right tools and making sure they're set up correctly will greatly simplify your development process. While Visual Studio Code starts as a simple tool, it is flexible and extendable with plugins to suit your own preferred workflow. In this tutorial we've covered the basics of setting up your environment, and you should now be ready to start developing your own applications with Python!

For an in-depth guide to building GUIs with Python see my PyQt6 book.

Categories: FLOSS Project Planets

PyCoder’s Weekly: Issue #543 (Sept. 20, 2022)

Tue, 2022-09-20 15:30

#543 – SEPTEMBER 20, 2022

Build an Alexa Equivalent in Python

It’s not as difficult as you think to build an AI program that listens to speech and answers questions. You can make the magic happen in an afternoon by leveraging a few Python packages and APIs.

Recipes From Python SQLite Docs

The official documentation of Python’s sqlite3 module is a little short on examples. This article lists various in-depth examples that cover the most commonly used APIs in the module.
REDOWAN DELOWAR • Shared by Redowan Delowar

Scout APM: A Performance Monitoring Tool Built for Developers

Scout’s always-on monitoring will keep you ahead of performance outliers and allow you to analyze increased response time. With dashboards that will help you drill down into specific endpoints, Scout will save you time and resources and give your developers time to build applications people love →
SCOUT APM sponsor

Custom Python Lists: Inheriting From list vs UserList

In this tutorial, you’ll learn how to create custom list-like classes in Python by inheriting from the built-in list class or by subclassing UserList from the collections module.

Python 3.11.0rc2 Released


Discussions

Lazy Imports for Python

A short article discussing PEP 690 which proposes support for lazy imports in Python, followed by an in-depth discussion by the LWN community.

What’s the Best Source Code You’ve Read?


Python Jobs

Senior Software Engineer Backend (USA)

Muck Rack

Senior Backend Engineer (Anywhere)


Django Developer (USA)

Abnormal Security

Python Developer (Anywhere)

SIGMA Assessment Systems, Inc.

Senior Software Engineer, Python (Backend) (Anywhere)


Software Development Lead (Ann Arbor, MI, USA)

University of Michigan

Software Engineer - Backend/Python (100% Remote) (Anywhere)


Software Engineer (Los Angeles or Dallas) (Los Angeles, CA, USA)

Causeway Capital Management LLC

Enterprise GIS Data Engineer (Information Systems Analyst) (San Jose, CA, USA)

City of San Jose


Articles & Tutorials

Django Favicon Guide

Favicons are the little icons you see in your browser tabs. Your web browser looks in very specific places for these icons, and different browsers expect different file names and types. This article runs you through two different ways of getting favicons working in your Django web project.

Python Basics: Conditional Logic & Control Flow

In this Python Basics video course, you’ll learn how to use conditional logic to write programs that perform different actions based on different conditions. Paired with functions and loops, conditional logic allows you to write complex programs that can handle many different situations.

Your AI Opportunity Awaits

Python devs with AI training are rapidly advancing their careers while building the future. This is a great opportunity to sharpen your skills to tackle tomorrow’s technological challenges. Stand out in a competitive economic environment with the Intel® Edge AI Certification →

Meta Spins Off PyTorch Foundation

PyTorch is a popular open-source deep-learning framework originally created by Meta/Facebook. Meta has announced that it is creating an independent organization called the PyTorch Foundation that will operate as part of the Linux Foundation, making the framework vendor-neutral.

Python Dictionary Operations You Should Know

The dict is one of the basic data structures in Python. It is truly at the core of Python and is used everywhere. This article runs you through some common operations on dictionaries, including initialization, merging, comprehensions, and more.

How to Replace a String in Python

In this tutorial, you’ll learn how to remove or replace a string or substring. You’ll go from the basic string method .replace() all the way up to a multi-layer regex pattern using the sub() function from Python’s re module.

Evolution of Access Control Explained Through Python

Sometimes writing code can help you explore and understand concepts. This article shows a history of access controls in software using Python scripts to re-implement the ideas.
ADAM BUGGIA • Shared by Adam Buggia

Find Your Next Tech Job Through Hired

Hired has 1000s of companies ranging from startups to Fortune 500s that are actively hiring developers, data scientists, mobile engineers, and more. Create a profile with your skills and preferences for hiring managers to reach you. Sign up today!
HIRED sponsor

Why You Should Use Data Classes in Python

Know what a Data Class is? Do you know how to use one? Know the differences from regular classes? This article answers these questions and more.
GIULIANO PERTILE • Shared by Giuliano Pertile

The Maze of Python Dependency Management

This article gives an overview of how dependencies are handled within virtual environments and what you can do when transitive dependencies are in conflict.

Projects & Code

simplerecon: 3D Reconstruction Without Convolutions


django-imagekit: Automated Image Processing for Django


s3sqlite: Query SQLite Files in S3 Using S3fs


Python Data Visualization Cookbook


chard: async/await Task Queue for Django


Events

DjangoCon Europe 2022

September 21 to September 26, 2022

An Applied Introduction to Finite State Machines

September 21, 2022

Weekly Real Python Office Hours Q&A (Virtual)

September 21, 2022

PyCon Portugal 2022

September 24 to September 25, 2022

Webinar: Writing REST With Django and Ninja

September 27, 2022, 11AM EDT

Happy Pythoning!
This was PyCoder’s Weekly Issue #543.


Categories: FLOSS Project Planets

Real Python: Building Python Project Documentation With MkDocs

Tue, 2022-09-20 10:00

In this course, you’ll learn how to quickly build documentation for a Python package using MkDocs and mkdocstrings. These tools allow you to generate nice-looking and modern documentation from Markdown files and your code’s docstrings.

Maintaining auto-generated documentation means less effort because you’re linking information between your code and the documentation pages. However, good documentation is more than just the technical description pulled from your code! Your project will appeal more to users if you guide them through examples and connect the dots between the docstrings.

The Material for MkDocs theme makes your documentation look good without any extra effort and is used by popular projects such as Typer CLI and FastAPI.

In this course, you’ll:

  • Work with MkDocs to produce static pages from Markdown
  • Pull in code documentation from docstrings using mkdocstrings
  • Follow best practices for project documentation
  • Use the Material for MkDocs theme to make your documentation look good
  • Host your documentation on GitHub Pages


Categories: FLOSS Project Planets