Planet Python

Planet Python - http://planetpython.org/

Mike Driscoll: How to Create a Command-line Application with argparse

Thu, 2022-05-19 08:30

When you are creating an application, you will usually want to be able to tell your application how to do something. There are two popular methods for accomplishing this task. You can make your application accept command-line arguments or you can create a graphical user interface. Some applications support both.

Command-line interfaces are helpful when you need to run your code on a server. Most servers do not have a monitor hooked up, especially if they are Linux servers. In those cases, you might not be able to run a graphical user interface even if you wanted to.

Python comes with a built-in library called argparse that you can use to create a command-line interface. In this article, you will learn about the following:

  • Parsing Arguments
  • Creating Helpful Messages
  • Adding Aliases
  • Using Mutually Exclusive Arguments
  • Creating a Simple Search Utility

There is a lot more to the argparse module than what will be covered in this article. If you would like to know more about it, you can check out the documentation.

Now it's time to get started with parsing arguments from the command-line!

Parsing Arguments

Before you learn how to use argparse, it's good to know that there is another way to pass arguments to a Python script. You can pass any arguments to a Python script and access those arguments by using the sys module.

To see how that works, create a file named sys_args.py and enter the following code into it:

# sys_args.py
import sys

def main():
    print('You passed the following arguments:')
    print(sys.argv)

if __name__ == '__main__':
    main()

This code imports sys and prints out whatever is in sys.argv. The argv attribute contains a list of everything that was passed to the script with the first item being the script itself.

Here's an example of what happens when you run this code along with a couple of sample arguments:

$ python3 sys_args.py --s 45
You passed the following arguments:
['sys_args.py', '--s', '45']

The problem with using sys.argv is that you have no control over the arguments that can be passed to your application:

  • You can't ignore arguments
  • You can't create default arguments
  • You can't really tell what is a valid argument at all

This is why using argparse is the way to go when working with Python's standard library. The argparse module is very powerful and useful. Let's think about a common process that a command line application follows:

  • pass in a file
  • do something to that file in your program
  • output the result

Here is a generic example of how that might work. Go ahead and create file_parser.py and add the following code:

# file_parser.py
import argparse

def file_parser(input_file, output_file=''):
    print(f'Processing {input_file}')
    print('Finished processing')
    if output_file:
        print(f'Creating {output_file}')

def main():
    parser = argparse.ArgumentParser('File parser')
    parser.add_argument('--infile', help='Input file')
    parser.add_argument('--out', help='Output file')
    args = parser.parse_args()
    if args.infile:
        file_parser(args.infile, args.out)

if __name__ == '__main__':
    main()

The file_parser() function is where the logic for the parsing would go. For this example, it only takes in a file name and prints it back out. The output_file argument defaults to an empty string.

The meat of the program is in main() though. Here you create an instance of argparse.ArgumentParser() and give your parser a name. Then you add two arguments, --infile and --out. To use the parser, you need to call parse_args(), which will return whatever valid arguments were passed to your program. Finally, you check to see if the user used the --infile flag. If they did, then you run file_parser().

Here is how you might run the code in your terminal:

$ python file_parser.py --infile something.txt
Processing something.txt
Finished processing

Here you run your script with the --infile flag along with a file name. This will run main(), which in turn calls file_parser().

The next step is to try your application using both command-line arguments you declared in your code:

$ python file_parser.py --infile something.txt --out output.txt
Processing something.txt
Finished processing
Creating output.txt

This time around, you get an extra line of output that mentions the output file name. This represents a branch in your code logic. When you specify an output file, you can have your code go through the process of generating that file using a new block of code or a function. If you do not specify an output file, then that block of code would not run.

When you create your command-line tool using argparse, you can easily add messages that help your users when they are unsure of how to correctly interact with your program.

Now it's time to find out how to get help from your application!

Creating Helpful Messages

The argparse library will automatically create a helpful message for your application using the information that you provided when you created each argument. Here is your code again:

# file_parser.py
import argparse

def file_parser(input_file, output_file=''):
    print(f'Processing {input_file}')
    print('Finished processing')
    if output_file:
        print(f'Creating {output_file}')

def main():
    parser = argparse.ArgumentParser('File parser')
    parser.add_argument('--infile', help='Input file')
    parser.add_argument('--out', help='Output file')
    args = parser.parse_args()
    if args.infile:
        file_parser(args.infile, args.out)

if __name__ == '__main__':
    main()

Now try running this code with the -h flag and you should see the following:

$ file_parser.py -h
usage: File parser [-h] [--infile INFILE] [--out OUT]

optional arguments:
  -h, --help       show this help message and exit
  --infile INFILE  Input file
  --out OUT        Output file

The help parameter to add_argument() is used to create the help message above. The -h and --help options are added automatically by argparse. You can make your help more informative by giving it a description and an epilog.

Let's use them to improve your help messages. Start by copying the code from above into a new file named file_parser_with_description.py, then modify it to look like this:

# file_parser_with_description.py
import argparse

def file_parser(input_file, output_file=''):
    print(f'Processing {input_file}')
    print('Finished processing')
    if output_file:
        print(f'Creating {output_file}')

def main():
    parser = argparse.ArgumentParser(
        'File parser',
        description='PyParse - The File Processor',
        epilog='Thank you for choosing PyParse!',
    )
    parser.add_argument('--infile', help='Input file for conversion')
    parser.add_argument('--out', help='Converted output file')
    args = parser.parse_args()
    if args.infile:
        file_parser(args.infile, args.out)

if __name__ == '__main__':
    main()

Here you pass in the description and epilog arguments to ArgumentParser. You also update the help arguments to add_argument() to be more descriptive.

When you run this script with -h or --help after making these changes, you will see the following output:

$ python file_parser_with_description.py -h
usage: File parser [-h] [--infile INFILE] [--out OUT]

PyParse - The File Processor

optional arguments:
  -h, --help       show this help message and exit
  --infile INFILE  Input file for conversion
  --out OUT        Converted output file

Thank you for choosing PyParse!

Now you can see the new description and epilog in your help output. This gives your command-line application some extra polish.

You can also disable help entirely in your application via the add_help argument to ArgumentParser. If you think that your help text is too wordy, you can disable it like this:

# file_parser_no_help.py
import argparse

def file_parser(input_file, output_file=''):
    print(f'Processing {input_file}')
    print('Finished processing')
    if output_file:
        print(f'Creating {output_file}')

def main():
    parser = argparse.ArgumentParser(
        'File parser',
        description='PyParse - The File Processor',
        epilog='Thank you for choosing PyParse!',
        add_help=False,
    )
    parser.add_argument('--infile', help='Input file for conversion')
    parser.add_argument('--out', help='Converted output file')
    args = parser.parse_args()
    if args.infile:
        file_parser(args.infile, args.out)

if __name__ == '__main__':
    main()

By setting add_help to False, you are disabling the -h and --help flags.

You can see this demonstrated below:

$ python file_parser_no_help.py --help
usage: File parser [--infile INFILE] [--out OUT]
File parser: error: unrecognized arguments: --help

In the next section, you'll learn about adding aliases to your arguments!

Adding Aliases

An alias is a fancy word for using an alternate flag that does the same thing. For example, you learned that you can use both -h and --help to access your program's help message. -h is an alias for --help, and vice versa.

Look for the changes in the parser.add_argument() methods inside of main():

# file_parser_aliases.py
import argparse

def file_parser(input_file, output_file=''):
    print(f'Processing {input_file}')
    print('Finished processing')
    if output_file:
        print(f'Creating {output_file}')

def main():
    parser = argparse.ArgumentParser(
        'File parser',
        description='PyParse - The File Processor',
        epilog='Thank you for choosing PyParse!',
        add_help=False,
    )
    parser.add_argument('-i', '--infile', help='Input file for conversion')
    parser.add_argument('-o', '--out', help='Converted output file')
    args = parser.parse_args()
    if args.infile:
        file_parser(args.infile, args.out)

if __name__ == '__main__':
    main()

Here you changed the first add_argument() to accept -i in addition to --infile, and you added -o to the second add_argument(). This allows you to run your code using two new shortcut flags.

Here's an example:

$ python3 file_parser_aliases.py -i something.txt -o output.txt
Processing something.txt
Finished processing
Creating output.txt

If you go looking through the argparse documentation, you will find that you can add aliases to subparsers too. A subparser is a way to create sub-commands in your application so that it can do other things. A good example is Docker, the containerization platform. It has a series of commands that you can run under docker, as well as under docker compose and more. Each of these commands has separate sub-commands that you can use.

Here is a typical docker command for working with a running container:

docker exec -it container_name bash

This opens an interactive bash shell inside a running container. If you were to use docker compose instead, you would use a different set of commands. Here, exec and compose are examples of sub-commands, which argparse implements with subparsers.

The topic of subparsers is otherwise outside the scope of this tutorial, but a minimal sketch follows below. If you are interested in more details, dive right into the documentation.
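
To give you a taste, here is a minimal sketch of a subparser. This example is not from the original article, and the sub-command and argument names are made up for illustration:

# subparser_sketch.py
import argparse

parser = argparse.ArgumentParser('demo')
subparsers = parser.add_subparsers(dest='command')

# An "exec"-style sub-command with its own argument and an alias
exec_parser = subparsers.add_parser('exec', aliases=['x'], help='Run a command')
exec_parser.add_argument('target', help='What to run the command against')

args = parser.parse_args()
print(args.command, args.target)

Running python subparser_sketch.py exec mybox prints exec mybox, and the alias x selects the same sub-command.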

Using Mutually Exclusive Arguments

Sometimes you need to have your application accept some arguments but not others. For example, you might want to limit your application so that it can only create or delete files, but not both at once.

The argparse module provides the add_mutually_exclusive_group() method that does just that!

Change your two arguments to be mutually exclusive by adding them to a group object like in the example below:

# file_parser_exclusive.py
import argparse

def file_parser(input_file, output_file=''):
    print(f'Processing {input_file}')
    print('Finished processing')
    if output_file:
        print(f'Creating {output_file}')

def main():
    parser = argparse.ArgumentParser(
        'File parser',
        description='PyParse - The File Processor',
        epilog='Thank you for choosing PyParse!',
        add_help=False,
    )
    group = parser.add_mutually_exclusive_group()
    group.add_argument('-i', '--infile', help='Input file for conversion')
    group.add_argument('-o', '--out', help='Converted output file')
    args = parser.parse_args()
    if args.infile:
        file_parser(args.infile, args.out)

if __name__ == '__main__':
    main()

First, you created a mutually exclusive group. Then, you added the -i and -o arguments to the group instead of to the parser object. Now these two arguments are mutually exclusive.

Here is what happens when you try to run your code with both arguments:

$ python3 file_parser_exclusive.py -i something.txt -o output.txt
usage: File parser [-i INFILE | -o OUT]
File parser: error: argument -o/--out: not allowed with argument -i/--infile

Running your code with both arguments causes your parser to show the user an error message that explains what they did wrong.

After covering all this information related to using argparse, you are ready to apply your new skills to create a simple search tool!

Creating a Simple Search Utility

Before starting to create an application, it is always good to figure out what you are trying to accomplish. The application you want to build in this section should be able to search for files of a specific file type. To make it more interesting, you can add an additional argument that allows you to optionally search for specific file sizes as well.

You can use Python's glob module to search for files by type. You can read all about this module in Python's documentation.

There is also the fnmatch module, which glob itself uses. You should use glob for now as it is easier to use, but if you're interested in writing something more specialized, then fnmatch may be what you are looking for.

However, since you want to be able to optionally filter the files returned by the file size, you can use pathlib which includes a glob-like interface. The glob module itself does not provide file size information.

You can start by creating a file named pysearch.py and entering the following code:

# pysearch.py
import argparse
import pathlib

def search_folder(path, extension, file_size=None):
    """
    Search folder for files
    """
    folder = pathlib.Path(path)
    files = list(folder.rglob(f'*.{extension}'))
    if not files:
        print(f'No files found with {extension=}')
        return
    if file_size is not None:
        files = [f for f in files
                 if f.stat().st_size >= file_size]
    print(f'{len(files)} *.{extension} files found:')
    for file_path in files:
        print(file_path)

You start the code snippet above by importing argparse and pathlib. Next you create the search_folder() function which takes in three arguments:

  • path - The folder to search within
  • extension - The file extension to look for
  • file_size - The minimum file size to filter on, in bytes

You turn the path into a pathlib.Path object and then use its rglob() method to search in the folder for the extension that the user passed in. If no files are found, you print out a meaningful message to the user and exit.

If any files are found, you check to see whether file_size has been set. If it was set, you use a list comprehension to filter out the files that are smaller than the specified file_size.

Next, you print out the number of files that were found and finally loop over these files to print out their names.

To make this all work correctly, you need to create a command-line interface. You can do that by adding a main() function that contains your argparse code like this:

def main():
    parser = argparse.ArgumentParser(
        'PySearch',
        description='PySearch - The Python Powered File Searcher',
    )
    parser.add_argument('-p', '--path',
                        help='The path to search for files',
                        required=True,
                        dest='path')
    parser.add_argument('-e', '--ext',
                        help='The extension to search for',
                        required=True,
                        dest='extension')
    parser.add_argument('-s', '--size',
                        help='The file size to filter on in bytes',
                        type=int,
                        dest='size',
                        default=None)
    args = parser.parse_args()
    search_folder(args.path, args.extension, args.size)

if __name__ == '__main__':
    main()

This ArgumentParser() has three arguments added to it that correspond to the arguments that you pass to search_folder(). You make the --path and --ext arguments required while leaving the --size argument optional. Note that the --size argument is set to type=int, which means argparse will convert the value to an integer and reject anything that is not a valid one.

There is one new argument to the add_argument() function here: dest, which tells the parser the attribute name under which each argument's value is stored on the parsed args object.
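
To see concretely what dest changes, here is a tiny sketch (not from the original article):

import argparse

parser = argparse.ArgumentParser('demo')
# Without dest, the attribute name comes from the long flag:
parser.add_argument('--ext')                 # value lands in args.ext
# With dest, you choose the attribute name yourself:
parser.add_argument('--size', dest='bytes')  # value lands in args.bytes
print(parser.parse_args(['--ext', 'py', '--size', '100']))
# Prints: Namespace(ext='py', bytes='100')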

Here is an example run of the script:

$ python3 pysearch.py -p /Users/michael/Dropbox/python101code/chapter32_argparse -e py -s 650
6 *.py files found:
/Users/michael/Dropbox/python101code/chapter32_argparse/file_parser_aliases2.py
/Users/michael/Dropbox/python101code/chapter32_argparse/pysearch.py
/Users/michael/Dropbox/python101code/chapter32_argparse/file_parser_aliases.py
/Users/michael/Dropbox/python101code/chapter32_argparse/file_parser_with_description.py
/Users/michael/Dropbox/python101code/chapter32_argparse/file_parser_exclusive.py
/Users/michael/Dropbox/python101code/chapter32_argparse/file_parser_no_help.py

That worked quite well! Now try running it with -s and a string:

$ python3 pysearch.py -p /Users/michael/Dropbox/python101code/chapter32_argparse -e py -s python
usage: PySearch [-h] -p PATH -e EXTENSION [-s SIZE]
PySearch: error: argument -s/--size: invalid int value: 'python'

This time, you received an error because -s and --size only accept integers. Go try this code on your own machine and see if it works the way you want when you use -s with an integer.

Here are some ideas you can use to improve your version of the code:

  • Handle the extensions better. Right now it will accept *.py, which won't work the way you might expect
  • Update the code so you can search for multiple extensions at once (see the sketch below)
  • Update the code to filter on a range of file sizes (e.g., 1 MB to 5 MB)

There are lots of other features and enhancements you can add to this code, such as error handling or unit tests.
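
As a starting point for the multiple-extensions idea above, argparse's nargs='+' gathers one or more values into a list. This is a sketch under the assumption that you also adjust search_folder() to loop over the extensions:

# A possible tweak to main(): accept one or more extensions
parser.add_argument('-e', '--ext',
                    help='One or more extensions to search for, e.g. -e py txt md',
                    required=True,
                    nargs='+',          # gather all values into a list
                    dest='extensions')

# ... and in search_folder(), search once per extension:
# files = []
# for extension in extensions:
#     files.extend(folder.rglob(f'*.{extension}'))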

Wrapping Up

The argparse module is full featured and can be used to create great, flexible command-line applications. In this chapter, you learned about the following:

  • Parsing Arguments
  • Creating Helpful Messages
  • Adding Aliases
  • Using Mutually Exclusive Arguments
  • Creating a Simple Search Utility

You can do a lot more with the argparse module than what was covered in this chapter. Be sure to check out the documentation for full details. Now go ahead and give it a try yourself. You will find that once you get the hang of using argparse, you can create some really neat applications!

The post How to Create a Command-line Application with argparse appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

Python GUIs: PyQt6, PySide6, PyQt5 and PySide2 Books: Create GUI Applications with Python & Qt

Thu, 2022-05-19 05:00

Hello! Today I have released new digital editions of my PyQt5, PyQt6, PySide2 and PySide6 book Create GUI Applications with Python & Qt.

This update adds over 200 pages of Qt examples and exercises - the book is now 800 pages long! - and continues to be updated and extended. The latest additions include:

  • Built-in dialogs, including QMessageBox and QFileDialog
  • Working with multiple windows, cross-window communication
  • Using QThreadPool.start() to execute Python functions
  • Long-running threads with QThread
  • Using custom widgets in Qt Designer
  • Recurring & single shot timers
  • Managing data files, working with paths
  • Packaging with PyInstaller on Windows, macOS & Linux
  • Creating distributable installers on Windows, macOS & Linux

This update marks the 2nd Edition of the PyQt6 & PySide6 books, and the 5th Edition of the PyQt5 & PySide2 books.

As always, if you've previously bought a copy of the book you get these updates for free! Just go to the downloads page and enter the email you used for the purchase. If you have problems getting this update just get in touch.

Enjoy!

For more, see the complete PySide2 tutorial.

Categories: FLOSS Project Planets

ListenData: Wish Christmas with Python and R

Thu, 2022-05-19 04:15
This post is dedicated to all the Python and R programming lovers. Flaunt your knowledge in your peer group with the following programs! As a data science professional, you want your wish to be special on the eve of Christmas. If you observe the code, you may also learn a trick or two that you can use later in your daily tasks.
Method 1 : Run the following program and see what I mean

R Code


paste(intToUtf8(acos(log(1))*180/pi-13),
toupper(substr(month.name[2],2,2)),
paste(rep(intToUtf8(acos(exp(0)/2)*180/pi+2^4+3*2),2), collapse = intToUtf8(0)),
LETTERS[5^(3-1)], intToUtf8(atan(1/sqrt(3))*180/pi+2),
toupper(substr(month.abb[10],2,2)),
intToUtf8(acos(log(1))*180/pi-(2*3^2)),
toupper(substr(month.name[4],3,4)),
intToUtf8(acos(exp(0)/2)*180/pi+2^4+3*2+1),
intToUtf8(acos(exp(0)/2)*180/pi+2^4+2*4),
intToUtf8(acos(log(1))*180/pi-13),
LETTERS[median(0:2)],
intToUtf8(atan(1/sqrt(3))*180/pi*3-7),
sep = intToUtf8(0)
)

Python Code


import math
import datetime

(chr(int(math.acos(math.log(1))*180/math.pi-13)) \
+ datetime.date(1900, 2, 1).strftime('%B')[1] \
+ 2 * datetime.date(1900, 2, 1).strftime('%B')[3] \
+ datetime.date(1900, 2, 1).strftime('%B')[7] \
+ chr(int(math.atan(1/math.sqrt(3))*180/math.pi+2)) \
+ datetime.date(1900, 10, 1).strftime('%B')[1] \
+ chr(int(math.acos(math.log(1))*180/math.pi-18)) \
+ datetime.date(1900, 4, 1).strftime('%B')[2:4] \
+ chr(int(math.acos(math.exp(0)/2)*180/math.pi+2**4+3*2+1)) \
+ chr(int(math.acos(math.exp(0)/2)*180/math.pi+2**4+2*4)) \
+ chr(int(math.acos(math.log(1))*180/math.pi-13)) \
+ "{:c}".format(97) \
+ chr(int(math.atan(1/math.sqrt(3))*180/math.pi*3-7))).upper()
Method 2 : Audio Wish for Christmas

Turn on computer speakers before running the code.

R Code



install.packages("audio")
library(audio)
christmas_file <- tempfile()
download.file("https://github.com/deepanshu88/Datasets/raw/master/UploadedFiles/merrychristmas1.wav", christmas_file, mode = "wb")
xmas <- load.wave(christmas_file)
play(xmas)

Python Code


#Install packages
!pip install requests
!pip install playsound

import requests
from playsound import playsound

audio_url = "https://github.com/deepanshu88/Datasets/raw/master/UploadedFiles/merrychristmas1.wav"
r = requests.get(audio_url)

with open("christmas.wav",'wb') as f:
f.write(r.content)

playsound('christmas.wav')
Make sure the requests and playsound packages are installed before running the above code. If you get a "Permission Denied" error, make sure your working directory is set to a location where you have write access. You can change the working directory with os.chdir(), as shown below.
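
For example (the path below is hypothetical; point it at any folder you can write to):

import os

os.chdir("/path/to/a/writable/folder")  # hypothetical path; use your own
print(os.getcwd())                      # confirm the new working directory
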
Categories: FLOSS Project Planets

death and gravity: The unreasonable effectiveness of f‍-‍strings and re.VERBOSE

Wed, 2022-05-18 18:24

... in which we look at one or two ways to make life easier when working regular expressions in Python.

tl;dr: You can compose verbose regular expressions using f‍-‍strings.

Here's a real-world example – instead of this:

1pattern = r"((?:\(\s*)?[A-Z]*H\d+[a-z]*(?:\s*\+\s*[A-Z]*H\d+[a-z]*)*(?:\s*[\):+])?)(.*?)(?=(?:\(\s*)?[A-Z]*H\d+[a-z]*(?:\s*\+\s*[A-Z]*H\d+[a-z]*)*(?:\s*[\):+])?(?![^\w\s])|$)"

... do this:

code = r"""
[A-Z]*H  # prefix
\d+      # digits
[a-z]*   # suffix
"""

multicode = fr"""
(?: \( \s* )?             # maybe open paren and maybe space
{code}                    # one code
(?: \s* \+ \s* {code} )*  # maybe followed by other codes, plus-separated
(?: \s* [\):+] )?         # maybe space and maybe close paren or colon or plus
"""

pattern = fr"""
( {multicode} )  # code (capture)
( .*? )          # message (capture): everything ...
(?=              # ... up to (but excluding) ...
    {multicode}      # ... the next code
    (?! [^\w\s] )    # (but not when followed by punctuation)
    | $              # ... or the end
)
"""

For comparison, here is the same pattern without f‍-‍strings:

pattern = r"""
(  # code (capture)

    # BEGIN multicode

    (?: \( \s* )?  # maybe open paren and maybe space

    # code
    [A-Z]*H  # prefix
    \d+      # digits
    [a-z]*   # suffix

    (?:             # maybe followed by other codes,
        \s* \+ \s*  # ... plus-separated

        # code
        [A-Z]*H  # prefix
        \d+      # digits
        [a-z]*   # suffix

    )*
    (?: \s* [\):+] )?  # maybe space and maybe close paren or colon or plus

    # END multicode

)
( .*? )  # message (capture): everything ...
(?=      # ... up to (but excluding) ...

    # ... the next code

    # BEGIN multicode

    (?: \( \s* )?  # maybe open paren and maybe space

    # code
    [A-Z]*H  # prefix
    \d+      # digits
    [a-z]*   # suffix

    (?:             # maybe followed by other codes,
        \s* \+ \s*  # ... plus-separated

        # code
        [A-Z]*H  # prefix
        \d+      # digits
        [a-z]*   # suffix

    )*
    (?: \s* [\):+] )?  # maybe space and maybe close paren or colon or plus

    # END multicode

    # (but not when followed by punctuation)
    (?! [^\w\s] )

    # ... or the end
    | $
)
"""

It's better than the non-verbose one, but even with careful formatting and comments, the repetition makes it pretty hard to follow – and wait until you have to change something!

Read on for more details and some caveats.

Prerequisites #

Formatted string literals (f‍-‍strings) were added in Python 3.6 [1], and provide a way to embed expressions inside string literals, using a syntax similar to that of str.format():

>>> name = "world" >>> >>> "Hello, {name}!".format(name=name) 'Hello, world!' >>> >>> f"Hello, {name}!" 'Hello, world!'

Verbose regular expressions (re.VERBOSE) have been around since forever [2], and allow writing regular expressions with non-significant whitespace and comments:

>>> text = "H1 code (AH2b+EUH3) fancy code" >>> >>> code = r"[A-Z]*H\d+[a-z]*" >>> re.findall(code, text) ['H1', 'AH2b', 'EUH3'] >>> >>> code = r""" ... [A-Z]*H # prefix ... \d+ # digits ... [a-z]* # suffix ... """ >>> re.findall(code, text, re.VERBOSE) ['H1', 'AH2b', 'EUH3'] The "one weird trick" #

Once you see it, it's obvious – you can use f‍-‍strings to compose regular expressions:

>>> multicode = fr""" ... (?: \( )? # maybe open paren ... {code} # one code ... (?: \+ {code} )* # maybe other codes, plus-separated ... (?: \) )? # maybe close paren ... """ >>> re.findall(multicode, text, re.VERBOSE) ['H1', '(AH2b+EUH3)']

It's so obvious, it only took me three years to do it, despite using both features in that time.

Of course, there's any number of libraries for building regular expressions; the benefit of this is that it has zero dependencies, and zero extra things you need to learn.

Caveats #

Hashes and spaces need to be escaped #

Because a hash is used to mark the start of a comment, and spaces are mostly ignored, you have to represent them in some other way.

The documentation of re.VERBOSE is quite helpful:

When a line contains a # that is not in a character class and is not preceded by an unescaped backslash, all characters from the leftmost such # through the end of the line are ignored.

That is, this won't work as the non-verbose version:

>>> re.findall("\d+#\d+", "1#23a") ['1#23'] >>> re.findall("\d+ # \d+", "1#23a", re.VERBOSE) ['1', '23']

... but these will:

>>> re.findall("\d+ [#] \d+", "1#23a", re.VERBOSE) ['1#23'] >>> re.findall("\d+ \# \d+", "1#23a", re.VERBOSE) ['1#23']

The same is true for spaces:

>>> re.findall("\d+ [ ] \d+", "1 23a", re.VERBOSE) ['1 23'] >>> re.findall("\d+ \ \d+", "1 23a", re.VERBOSE) ['1 23'] Hashes need extra care #

When composing regexes, ending a pattern on the same line as a comment might accidentally comment the following line in the enclosing pattern:

>>> one = "1 # comment" >>> onetwo = f"{one} 2" >>> re.findall(onetwo, '0123', re.VERBOSE) ['1'] >>> print(onetwo) 1 # comment 2

This can be avoided by always ending the pattern on a new line:

>>> one = """\ ... 1 # comment ... """ >>> onetwo = f"""\ ... {one} 2 ... """ >>> re.findall(onetwo, '0123', re.VERBOSE) ['12']

While a bit cumbersome, in real life most patterns would span multiple lines anyway, so it's not really an issue.

(Note that this is only needed if you use comments.)

Brace quantifiers need to be escaped #

Because f‍-‍strings already use braces for replacements, to represent brace quantifiers you must double the braces:

>>> re.findall("m{2}", "entire mm but only two of mmm") ['mm', 'mm'] >>> letter = "m" >>> pattern = f"{letter}{{2}}" >>> re.findall(pattern, "entire mm but only two of mmm") ['mm', 'mm'] I don't control the flags #

Maybe you'd like to use verbose regexes, but don't control the flags passed to the re functions (for example, because you're passing the regex to an API).

Worry not! The regular expression syntax supports inline flags:

(?aiLmsux)
(One or more letters [...]) The group matches the empty string; the letters set the corresponding flags: [...] re.X (verbose), for the entire regular expression. [...] This is useful if you wish to include the flags as part of the regular expression, instead of passing a flag argument to the re.compile() function. Flags should be used first in the expression string.
(?aiLmsux-imsx:...)
[...] The letters set or remove the corresponding flags [...] for the part of the expression. [...]

So, you can do this:

>>> onetwo = """\ ... (?x) ... 1 # look, ma ... 2 # no flags ... """ >>> re.findall(onetwo, '0123') ['12']

... or this:

>>> onetwo = """\ ... (?x: ... 1 # verbose until the close paren ... )2""" >>> re.findall(onetwo, '0123') ['12']

That's it for now.


Bonus: I don't use Python #

Lots of other languages support the inline verbose flag, too! You can build a pattern in whichever language is more convenient, and use it in any other one. [3] Languages like...

C (with PCRE – and by extension, C++, PHP, and many others):

echo '0123' | pcregrep -o '(?x)
    1 2  # such inline
'

... yeah, the C version is actually really long:

char *pattern =
    "(?x)\n"
    "1 2  # much verbose\n"
;
char *subject = "0123";
int subject_length = strlen(subject);

int errornumber;
PCRE2_SIZE erroroffset;

pcre2_code *re = pcre2_compile(
    (PCRE2_SPTR)pattern,
    PCRE2_ZERO_TERMINATED,
    0,
    &errornumber,
    &erroroffset,
    NULL
);

pcre2_match_data *match_data = pcre2_match_data_create_from_pattern(re, NULL);

pcre2_match(
    re,
    (PCRE2_SPTR)subject,
    subject_length,
    0,
    0,
    match_data,
    NULL
);

PCRE2_SIZE *ovector = pcre2_get_ovector_pointer(match_data);
PCRE2_SPTR substring_start = (PCRE2_SPTR)subject + ovector[0];
size_t substring_length = ovector[1] - ovector[0];
printf("%.*s\n", (int)substring_length, (char *)substring_start);

C#:

Console.WriteLine(new Regex(@"(?x)
    1 2  # wow
").Match("0123"));

grep (only the GNU one):

echo '0123' | grep -Po '(?x)
    1 2  # no line
'

Java (and by extension, lots of JVM languages, like Scala):

var p = Pattern.compile(
    "(?x)\n" +
    "1 2  # much class\n"
);
var m = p.matcher("0123");
m.find();
System.out.println(m.group(0));

Perl:

"0123" =~ /(?x)( 1 2 # no scare )/; print $1 . "\n";

PostgreSQL:

select substring(
    '0123' from $$(?x)
    1 2  # such declarative
    $$
);

Ruby:

puts /(?x)
    1 2  # nice
/.match('0123')

Rust:

let re = Regex::new(
    r"(?x)
    1 2  # much safe
    "
).unwrap();
println!("{}", re.find("0123").unwrap().as_str());

Swift:

let string = "0123" let range = string.range( of : """ (?x) 1 2 # omg hi """, options : .regularExpression ) print(string[range!])

Notable languages that don't support inline verbose flags out of the box:

  • C (regex.h – POSIX regular expressions)
  • C++ (regex)
  • Go (regexp)
  • Javascript
  • Lua
  1. In PEP 498 – Literal String Interpolation.

  2. That is, since at least Python 1.5.2, released in 1998 – for all except a tiny minority of Python users, that's before forever.

  3. If they both support inline flags, they likely share most other features.

Categories: FLOSS Project Planets

Real Python: Build a URL Shortener With FastAPI and Python

Wed, 2022-05-18 10:00

In this tutorial, you’ll build a URL shortener with Python and FastAPI. URLs can be extremely long and not user-friendly. This is where a URL shortener can come in handy. A URL shortener reduces the number of characters in a URL, making it easier to read, remember, and share.

By following this step-by-step project, you’ll build a URL shortener with Python and FastAPI. At the end of this tutorial, you’ll have a fully functional API-driven web app that creates shortened URLs that forward to target URLs.

In this tutorial, you’ll learn how to:

  • Create a REST API with FastAPI
  • Run a development web server with Uvicorn
  • Model an SQLite database
  • Investigate the auto-generated API documentation
  • Interact with the database with CRUD actions
  • Optimize your app by refactoring your code

This URL shortener project is for intermediate Python programmers who want to try out FastAPI and learn about API design, CRUD, and interaction with a database. To follow along, it’ll help if you’re familiar with the basics of handling HTTP requests. If you need a refresher on FastAPI, Using FastAPI to Build Python Web APIs is a great introduction.

Get Source Code: Click here to get access to the source code that you’ll use to build your Python URL shortener with FastAPI.

Demo: Your Python URL Shortener

In this step-by-step project, you’ll build an API to create and manage shortened URLs. The main purpose of this API is to receive a full target URL and return a shortened URL. To try out your API endpoints, you’ll leverage the documentation that FastAPI automatically creates:

When you post a target URL to the URL shortener app, you get a shortened URL and a secret key back. The shortened URL contains a random key that forwards to the target URL. You can use the secret key to see the shortened URL’s statistics or delete the forwarding.

Project Overview

Your URL shortener Python project will provide API endpoints that are capable of receiving different HTTP request types. Each endpoint will perform an action that you’ll specify. Here’s a summary of your URL shortener’s API endpoints:

Endpoint              HTTP Verb  Request Body     Action
/                     GET                         Returns a Hello, World! string
/url                  POST       Your target URL  Shows the created url_key with additional info, including a secret_key
/{url_key}            GET                         Forwards to your target URL
/admin/{secret_key}   GET                         Shows administrative info about your shortened URL
/admin/{secret_key}   DELETE     Your secret key  Deletes your shortened URL
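
As a taste of what is ahead, the first endpoint from the table above can start out as small as this minimal sketch (the tutorial builds the full app, including the database-backed endpoints, step by step):

# main.py - a minimal sketch of the "/" endpoint only
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def read_root():
    return "Hello, World!"

With FastAPI and Uvicorn installed, uvicorn main:app --reload serves this on a local development server.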

The code you’ll write in this tutorial focuses on getting the app working first. However, having a working app doesn’t always mean that the code behind it is perfect. That’s why you’ll find a step in this tutorial where you’ll refactor parts of your app.

If you want to have a look at the final source code, go ahead and download it using the source code link above.

This project is a great starting point to extend your API with more functionality. At the end of this tutorial, you’ll find ideas about what to build next.

Prerequisites

To get the most out of this tutorial, you should be comfortable with the following concepts:

The linked resources will help you better understand the code that you write in this tutorial. However, in this tutorial, you’ll build your app step by step. So even if you’re not familiar with the concepts above, then you’ll be able to follow along.

Step 1: Prepare Your Environment

In this step, you’ll prepare the development environment for your FastAPI app. First, you’ll create the folder structure for your app. Then, you’ll create a virtual environment and install all project dependencies that you need for your project. Finally, you’ll learn how to store environment variables outside of your code and how to load the variables into your app.

Create the Project’s Folder Structure

In this section, you’ll create your project structure. You can name the root folder of your project any way you like. For example, you could name it url_shortener_project/.

Read the full article at https://realpython.com/build-a-python-url-shortener-with-fastapi/ »


Categories: FLOSS Project Planets

Python for Beginners: Convert Tuple to String in Python

Wed, 2022-05-18 09:00

In Python, a tuple can contain elements of different types, while a string is just a sequence of characters. In this article, we will discuss how to convert a tuple to a string when we are given a tuple of numbers or characters in Python.

How to Convert Tuple of Characters to a String?

Suppose that we have the following tuple of characters.

myTuple = ("P", "y", "t", "h", "o", "n")

Now, we have to convert this tuple to the string “Python”.

To convert the tuple to string, we will first create an empty string named output_string. After that, We will traverse the tuple using a for loop. While traversal, we will add each character in the tuple to the output_string using the string concatenation operation. After execution of the for loop, we will get the desired string in the variable output_string. You can observe this in the following example.

myTuple = ("P", "y", "t", "h", "o", "n") output_string = "" for character in myTuple: output_string += character print("The input tuple is:", myTuple) print("The output string is:", output_string)

Output:

The input tuple is: ('P', 'y', 't', 'h', 'o', 'n')
The output string is: Python

Instead of using the for loop, we can use the join() method to convert the tuple of characters into a string. The join() method, when invoked on a string, takes an iterable of strings as its input argument. It returns a new string made of the elements of the iterable joined together, with the string it was invoked on used as the separator.
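
For example, you can see the separator semantics at work here:

print("-".join(("P", "y", "t", "h", "o", "n")))  # prints P-y-t-h-o-n
print("".join(("P", "y", "t", "h", "o", "n")))   # prints Python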

To convert a tuple of characters to a string, we will first create an empty string named  output_string. After that, we will invoke the join() method on the output_string with the tuple as input argument. After execution, the join() method will return the desired string as shown below.

myTuple = ("P", "y", "t", "h", "o", "n") output_string = "" output_string = output_string.join(myTuple) print("The input tuple is:", myTuple) print("The output string is:", output_string)

Output:

The input tuple is: ('P', 'y', 't', 'h', 'o', 'n')
The output string is: Python

Convert Tuple of Numbers to a String

If you try to convert a tuple of numbers to a string using any of the methods discussed above, the program will raise a TypeError exception. You can observe this in the following example.

myTuple = (1, 2, 3, 4, 5, 6)
output_string = ""
for character in myTuple:
    output_string += character
print("The input tuple is:", myTuple)
print("The output string is:", output_string)

Output:

Traceback (most recent call last):
  File "/home/aditya1117/PycharmProjects/pythonProject/string1.py", line 4, in <module>
    output_string += character
TypeError: can only concatenate str (not "int") to str

Similarly, when we use the second approach, the program runs into TypeError exception as shown below.

myTuple = (1, 2, 3, 4, 5, 6)
output_string = ""
output_string = output_string.join(myTuple)
print("The input tuple is:", myTuple)
print("The output string is:", output_string)

Output:

Traceback (most recent call last):
  File "/home/aditya1117/PycharmProjects/pythonProject/string1.py", line 3, in <module>
    output_string = output_string.join(myTuple)
TypeError: sequence item 0: expected str instance, int found

To avoid the error, we just have to convert each element of the tuple into a string before adding it to the output_string. In the first approach, we will first convert each element of the tuple into a string using the str() function. After that, we will perform the concatenation operation. In this way, we can convert a tuple of numbers to a string.

myTuple = (1, 2, 3, 4, 5, 6)
output_string = ""
for element in myTuple:
    character = str(element)
    output_string += character
print("The input tuple is:", myTuple)
print("The output string is:", output_string)

Output:

The input tuple is: (1, 2, 3, 4, 5, 6)
The output string is: 123456

For the approach using the join() method, we will first convert the tuple of numbers into a tuple of strings. For this, we will use the map() function and the str() function. 

The map() function takes a function as its first argument and an iterable object as the second input argument. After execution, it returns a map object in which the function is applied to each element of the iterable object. We will pass the str() function and the tuple of numbers to the map() function as input arguments. After that, we will convert the map object using the tuple() constructor. Hence, we will get a tuple of strings as shown below.

myTuple = (1, 2, 3, 4, 5, 6)
newTuple = tuple(map(str, myTuple))
print("The input tuple is:", myTuple)
print("The output tuple is:", newTuple)

Output:

The input tuple is: (1, 2, 3, 4, 5, 6)
The output tuple is: ('1', '2', '3', '4', '5', '6')

After getting the tuple of strings from the tuple of numbers, we can directly obtain the string as follows.

myTuple = (1, 2, 3, 4, 5, 6)
newTuple = tuple(map(str, myTuple))
output_string = ''.join(newTuple)
print("The input tuple is:", myTuple)
print("The output string is:", output_string)

Output:

The input tuple is: (1, 2, 3, 4, 5, 6)
The output string is: 123456

Conclusion

In this article, we have discussed how to convert a tuple to a string in python. To know more about strings in python, you can read this article on string formatting in python. You might also like this article on list comprehension in python.

The post Convert Tuple to String in Python appeared first on PythonForBeginners.com.

Categories: FLOSS Project Planets

PyBites: Configure a Linux Development Environment on Windows with WSL and VS Code

Wed, 2022-05-18 08:12
About WSL

It seems like everyone is using Linux or Mac for software development these days, but if you're a Windows user, you may have looked into what it would take to use Linux on your PC and found that dual-booting or virtual machines sounded like too much trouble. These days, most employers want their devs to know their way around the Linux command line, and there's never been a better time for Windows users to start learning those skills, since there's finally a way to have our cake and eat it too. It's called the Windows Subsystem for Linux (WSL).

“The Windows Subsystem for Linux lets developers run a GNU/Linux environment — including most command-line tools, utilities, and applications — directly on Windows, unmodified, without the overhead of a traditional virtual machine or dualboot setup.”

What is Windows Subsystem for Linux | Microsoft Docs

If this sounds interesting, read on as I walk you through setting up a Python development environment in WSL.

Enable and install WSL

In order to use WSL, you must be using Windows 10 (build 19041 or higher) or any version of Windows 11. For this tutorial I'm using Windows 11, and I'm also going to integrate Visual Studio Code with WSL, so you'll want to install that too. (WSL is not just for VS Code users, but at the time of this writing, VS Code is the only lightweight code editor that integrates with WSL natively.)

To get started on Windows 11, open a Powershell prompt as administrator and run
wsl --install

Note: if you’re using Windows 10 you’ll need to specify the distribution you want to install, e.g.: wsl --install -d Ubuntu

The process for installing WSL through PowerShell

When prompted, reboot your computer

You should get a PowerShell window prompting you for your credentials to use within the Ubuntu distribution. After entering your new username and password, you should arrive at a Bash prompt.

You can use this shell the same as you would in the Linux terminal and can open it in the future by going to the Start Menu and searching for the Ubuntu app.

Launch your Windows Explorer by pressing WIN + E and you’ll see that you have a new Linux icon at the bottom of your navigation pane. This is a mounted drive that contains all your Linux files.

Access your files through the Linux drive mounted in Windows Explorer

Congratulations, you have installed Linux on Windows!

Install Python, Pip, and Venv on Ubuntu

The included Ubuntu distribution already has Python installed. Run python3 --version to check the version. As of the time of this writing, Python 3.8.2 is included.

Let’s ensure your python is up to date in Linux and you have the basics by returning to the Bash shell (Start Menu -> Ubuntu) and following the next 3 steps

  • If you need to update your version of Python, first update your Ubuntu version by entering: 
    sudo apt update && sudo apt upgrade
    then update Python using sudo apt upgrade python3.
  • Install pip by entering: sudo apt install python3-pip.
  • If you use virtual environments, install the venv module by entering: sudo apt install python3-venv. (A quick example session follows below.)
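
Once python3-venv is installed, a typical session for creating and activating a virtual environment looks like this (the folder name .venv is just a common convention):

python3 -m venv .venv          # create a virtual environment in .venv
source .venv/bin/activate      # activate it; your prompt gains a (.venv) prefix
pip install requests           # packages now install into the environment
deactivate                     # leave the environment when done
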
Run VS Code on WSL

Now that you have Linux installed on Windows and have your basic Python installation ready to go, we can start setting up a real Linux development environment in VS Code.

Open VS Code and install the extension called Remote WSL

The Remote WSL Extension

You will notice a new icon in the lower-left corner of the application. Click that icon to open a remote window.

The new Remote Window icon in VS Code from the Remote WSL Extension

From here you have multiple options:

New WSL Window will give you a clean session of VS Code in Linux. Alternatively, you can launch a new session in a different distro (see Windows Store for more distros to download if you wish).

Open Folder in WSL will open any folder on your Windows or Linux drives in VS Code on WSL. Although you can open files in WSL from the Windows file system, I recommend you work with projects and files actually stored within your Linux file system for compatibility. However, if you have a rather large project, porting it over could potentially be a lot of work, and guidance on that process is beyond the scope of this article, always make sure you have your code backed up in case something goes wrong.

The final option is Reopen Folder in WSL which will reload a folder opened in VS Code under Windows and open it under WSL instead.

The Remote Window options

For the purposes of this tutorial, click on New WSL Window. A new instance of VS Code will open and it will load into Ubuntu. You know you are running your VS Code instance in Linux by the icon in the lower left.

WSL indicator/button

NOTE: There’s a possibility you may have to adjust some of your settings or reinstall/reconfigure some of your extensions through WSL.

Now that you have the hard part done, there’s just one more thing to walk through.

Creating and running a sample project

You’re all set up and ready to rock with your new Linux environment, but before you go, let’s run a simple hello world.

Create a new python file, write a hello world statement, then File -> Save. Notice VS Code is prompting you to save a file in the Linux file system.

Save prompt with path to Linux home folder

After saving the file, press CTRL + ` to open the integrated terminal. VS Code now opens the default Ubuntu shell (aka “Bash”). If you have never used Linux before, you’ll want to learn the terminal as it’s one of the main benefits you’re getting by going through this process!

Type in the following command and press Enter to see your first program running in Linux on Windows
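
(This assumes you saved the file as hello.py in your current folder.)

python3 hello.py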

Hello World

Enjoy your new development environment! To learn more about Linux, see Linux.org

Categories: FLOSS Project Planets

Zato Blog: Web scraping as an API service

Wed, 2022-05-18 05:31
Overview

In systems-to-systems integrations, there comes an inevitable time when we have to employ some kind of a web scraping tool to integrate with a particular application. Despite its not being our first choice, it is good to know what to use at such a time - in this article, I provide a gentle introduction to my favourite tool of this kind, called Playwright, followed by sample Python code that integrates it with an API service.

Naturally, in the context of backend integrations, web scraping should be avoided and, generally, it should be considered the last resort. The basic issue here is that while the UI term contains the “interface” part, it is not really the “Application Programming” Interface that we would like to have.

It is not that the UI cannot be programmed against. After all, a web browser does just that, it takes a web page and renders it as expected. Same goes for desktop or mobile applications. Also, anyone integrating with mainframe computers will recognize that this is basically what 3270 can be used for too.

Rather, the fundamental issue is that web scraping goes against the principles of separation of layers and roles across frontend, middleware and backend, which in turn means that authors of resources (e.g. HTML pages) do not really expect for many people to access them in automated ways.

Perhaps they actually should expect it, and web pages should finally start to resemble genuine knowledge graphs, easy to access by humans, be it manually or through automation tools, but the reality today is that it is not the case and, in comparison with backend systems, the whole of the web scraping space is relatively brittle, which is why we shun this approach in integrations.

Yet, another part of reality, particularly in enterprise integrations, is that people may be sometimes given access to a frontend application on an internal network and that is it. No API, no REST, no JSON, no POST data, no real data formats, and one is simply supposed to fill out forms as part of a business process.

Typically, such a situation will result in an integration gap. There will be fully automated parts in the business process preceding this gap, with multiple systems coordinated towards a specific goal and there will be subsequent steps in the process, also fully automated.

Or you may be given access only to a specific frontend and only through VPN via a single remote Windows desktop. Getting access to a REST API may take months or may be never realized because of some high level licensing issues. This is not uncommon in the real life.

Such a gap can be a jarring and sore point, truly ruining the whole, otherwise fluid, integration process. This creates a tension and to resolve the tension, we can, should all the attempts to find a real API fail, finally resort to web scraping.

It is mostly in this context that I am looking at Playwright below. The tool is good, it has many other uses beyond the scope of this text, and it is well worth knowing, for instance for frontend testing of your backend systems. But when we deal with API integrations, we should not overdo it with web scraping.

Needless to say, if web scraping is what you do primarily, your perspective will be somewhat different - you will not need any explanation of why it is needed or when, and you may only be looking for a way to wrap your web scraping code in API services. This article will explain that too.

Introducing Playwright

The nice part of Playwright is that we can use it to visually prepare a draft of Python code that will scrape a given resource. That is, instead of programming it in Python, we go to an address, fill out a form, click buttons and otherwise use everything as usually and Playwright generates for us code that will be later used in integrations.

That code will require a bit of clean-up work, which I will talk about below, but overall it works very nicely and is certainly useful. The result is not one of these do-not-touch auto-generated pieces of code that are better left to their own.

While there are better ways to integrate with Jira, I chose that application as an example of Playwright’s usage simply because I cannot show you any internal application in a public blog post.

Below, there are two windows. One is Playwright’s emulating a Blackberry device to open a resource. I was clicking around, I provided an email address and then I clicked the same email field once more. To the right, based on my actions, we can find the generated Python code, which I consider quite good and readable.

The Playwright Inspector, the tool that gave us the code, will keep recording all of our actions until we click the “Record” button which then allows us to click the button next to “Record” which is “Copy code to clipboard”. We can then save the code to a separate file and run it on demand, automatically.

But first, we will need to install Playwright.

Installing and starting Playwright

The tool is written in TypeScript and can be installed using npx, which in turn is part of NodeJS.

Afterwards, the “playwright install” call is needed as well because that will potentially install runtime dependencies, such as Chrome libraries.

Finally, we install Playwright using pip as well because we want to access it from Python. Note that if you are installing Playwright under Zato, the "/path/to/pip" will typically be "/opt/zato/code/bin/pip".

npx -g --yes playwright install
playwright install
/path/to/pip install playwright

We can now start it as below. I am using BlackBerry as an example of what Playwright is capable of. Also, it is usually more convenient to use a mobile version of a site when the main window and Inspector are opened side by side, but you may prefer to use Chrome, Firefox or anything else.

playwright codegen https://example.atlassian.net/jira --device "BlackBerry Z30"

That is practically everything as using Playwright to generate code in our context goes. Open the tool, fill out forms, copy code to a Python module, done.

What is still needed, though, is cleaning up the resulting code and embedding it in an API integration process.

Code clean-up

After you keep using Playwright for a while with longer forms and pages, you will note that the generated code tends to accumulate parts that repeat.

For instance, in the module below, which I already cleaned up, the same "[placeholder="Enter email"]" reference to the email field is used twice, even if a programmer developing this code would prefer to introduce a variable for it.

There is no single good answer to the question of what to do about it. On the one hand, being programmers, we would obviously prefer not to repeat that kind of detail. On the other hand, if we clean up the code too much, it may become a maintenance burden in its own right - we need to keep in mind that we do not really want to invest too much in web scraping and, should there be a need to regenerate Playwright's code from scratch once more, we do not want to lose a lot of hand-made clean-up work.

A good compromise position is to at least extract any kind of credentials from the code to environment variables or a similar place and to remove some of the code comments that Playwright generates. The result below is what it should look like at the end - not too much effort, without leaving the whole code exactly as it was generated either.

Save the code below as “play1.py” as this is what the API service below will use.

# -*- coding: utf-8 -*-

# stdlib
import os

# Playwright
from playwright.sync_api import Playwright, sync_playwright

class Config:
    Email = os.environ.get('APP_EMAIL', 'zato@example.com')
    Password = os.environ.get('APP_PASSWORD', '')
    Headless = bool(os.environ.get('APP_HEADLESS', False))

def run(playwright: Playwright) -> None:

    browser = playwright.chromium.launch(headless=Config.Headless) # type: ignore
    context = browser.new_context()

    # Open new page
    page = context.new_page()

    # Open project boards
    page.goto("https://example.atlassian.net/jira/software/projects/ABC/boards/1")
    page.goto("https://id.atlassian.com/login?continue=https%3A%2F%2Fexample.atlassian.net%2Flogin%3FredirectCount%3D1%26dest-url%3D%252Fjira%252Fsoftware%252Fprojects%252FABC%252Fboards%252F1%26application%3Djira&application=jira")

    # Fill out the email
    page.locator("[placeholder=\"Enter email\"]").click()
    page.locator("[placeholder=\"Enter email\"]").fill(Config.Email)

    # Click #login-submit
    page.locator("#login-submit").click()

with sync_playwright() as playwright:
    run(playwright)

We have the generated code so the first thing to do with it is to run it from command line. This will result in a new Chrome window’s accessing Jira - it is Chrome, not Blackberry, because that is the default for Playwright.

The window will close soon enough but this is fine, that code only demonstrates a principle, it is not a full integration task.

python /path/to/play1.py

It is also useful that we can run the same Python module from our IDE, giving us the ability to step through the code line by line, observing what changes when and why.

Web scraping as an API service

Finally, we are ready to invoke the standalone module from an API service, as in the following code that we are also going to make available as a REST channel.

A couple of notes about the Python service below:

  • We invoke Playwright in a subprocess, as a shell command
  • We accept input through data models although we do not provide any output definition because it is not needed here
  • When we invoke Playwright, we set the APP_HEADLESS environment variable to True, which ensures that it does not attempt to actually display a Chrome window. After all, we intend for this service to run on Linux servers, in the backend, where a graphical window is unlikely to work.

Other than that, this is a straightforward Zato service - it receives input, carries out its work, and returns a reply to the caller (here, an empty one).

# -*- coding: utf-8 -*-

# stdlib
from dataclasses import dataclass

# Zato
from zato.server.service import Model, Service

# ###########################################################################

@dataclass(init=False)
class WebScrapingDemoRequest(Model):
    email: str
    password: str

# ###########################################################################

class WebScrapingDemo(Service):
    name = 'demo.web-scraping'

    class SimpleIO:
        input = WebScrapingDemoRequest

    def handle(self):

        # Path to a Python installation that Playwright was installed under
        py_path = '/path/to/python'

        # Path to a Playwright module with code to invoke
        playwright_path = '/path/to/the-playwright-module.py'

        # This is a template script that we will invoke in a subprocess
        command_template = """
        APP_EMAIL={app_email} APP_PASSWORD={app_password} APP_HEADLESS=True
        {py_path} {playwright_path}
        """

        # This is our input data
        input = self.request.input # type: WebScrapingDemoRequest

        # Extract credentials from the input ..
        email = input.email
        password = input.password

        # .. build the full command, taking all the config into account ..
        command = command_template.format(
            app_email = email,
            app_password = password,
            py_path = py_path,
            playwright_path = playwright_path,
        )

        # .. invoke the command in a subprocess ..
        result = self.commands.invoke(command)

        # .. if it was not a success, log the details received ..
        if not result.is_ok:
            self.logger.info('Exit code -> %s', result.exit_code)
            self.logger.info('Stderr -> %s', result.stderr)
            self.logger.info('Stdout -> %s', result.stdout)

# ###########################################################################

Now, the REST channel:

The last thing to do is to invoke the service - I am using curl from the command line below but it could very well be Postman or a similar option.

curl localhost:17010/demo/web-scraping -d '{"email":"hello@example.com", "password":"abc"}' ; echo

There will be no Chrome window this time around because we run Playwright in headless mode. There will be no output from curl either, because we do not return anything from the service, but in the server logs we will find details such as the ones below.

We can learn from the log that the command took close to 4 seconds to complete, that the exit code was 0 (indicating success), and that there was no stdout or stderr at all.

INFO - Command ` APP_EMAIL=hello@example.com APP_PASSWORD=abc APP_HEADLESS=True /path/to/python /path/to/the-playwright-module.py ` completed in 0:00:03.844157, exit_code -> 0; len-out=0 (0 Bytes); len-err=0 (0 Bytes); cid -> zcmdc5422816b2c6ff9f10742134
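
If you would rather call the channel from Python than from curl, the equivalent request could look like the sketch below - it assumes the third-party requests package and the same local server address as the curl command above:

import requests

# Same call as the curl example above; the host and port
# assume a Zato server running locally
response = requests.post(
    'http://localhost:17010/demo/web-scraping',
    json={'email': 'hello@example.com', 'password': 'abc'},
)
print(response.status_code, response.text)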

We are now ready to continue to work on it - for instance, you will notice that the password is visible in logs and this should not be allowed.

But all such work is extra compared with the main theme - we have Playwright, a tool that allows us to quickly integrate with frontend applications, and we can automate it through API services. Just as expected.

Next steps
  • Start the tutorial to learn how to integrate APIs and build systems. After completing it, you will have a multi-protocol service representing a sample scenario often seen in banking systems with several applications cooperating to provide a single and consistent API to its callers.

  • Visit the support page if you need assistance.

  • Para aprender más sobre las integraciones de Zato y API en español, haga clic aquí.

  • Pour en savoir plus sur les intégrations API avec Zato en français, cliquez ici.

Categories: FLOSS Project Planets

Python Bytes: #284 Spicy git for Engineers

Wed, 2022-05-18 04:00
Watch the live stream: Watch on YouTube (https://www.youtube.com/watch?v=Go2-6sboFS4)

About the show

Sponsored by us! Support our work through:

  • Our courses at Talk Python Training
  • Test & Code Podcast
  • Patreon Supporters

Brian #1: distinctipy

  • “distinctipy is a lightweight python package providing functions to generate colours that are visually distinct from one another.”
  • Small, focused tool, but really cool.
  • Say you need to plot a dynamic number of lines.
  • Why not let distinctipy pick colors for you that will be distinct?
  • Also can display the color swatches.
  • Some example palettes here: https://github.com/alan-turing-institute/distinctipy/tree/main/examples

from distinctipy import distinctipy

# number of colours to generate
N = 36

# generate N visually distinct colours
colors = distinctipy.get_colors(N)

# display the colours
distinctipy.color_swatch(colors)

Michael #2: Soda SQL

  • Soda SQL is a free, open-source command-line tool.
  • It utilizes user-defined input to prepare SQL queries that run tests on datasets in a data source to find invalid, missing, or unexpected data.
  • Looks good for data pipelines and other CI/CD work!

Daniel #3: Python in Nature

  • There’s a review article from Sept 2020 on array programming with NumPy in the research journal Nature.
  • For reference, in grad school we had a fancy paper on quantum entanglement that got rejected from Nature Communications, a sub-journal to Nature. Nature is hard to get into.
  • The list of authors includes Travis Oliphant, who started NumPy. Covers NumPy as the foundation, building up to specialized libraries like QuTiP for quantum computing.
  • If you search “Python” on their site, many papers come up. Interesting to see their take on publishing software work.

Brian #4: Supercharging GitHub Actions with Job Summaries

  • From a tweet by Simon Willison and an article: GH Actions job summaries
  • Also, Ned Batchelder is using it for coverage reports
  • “You can now output and group custom Markdown content on the Actions run summary page.”
  • “Custom Markdown content can be used for a variety of creative purposes, such as: aggregating and displaying test results, generating reports, custom output independent of logs”
  • Coverage.py example:

- name: "Create summary"
  run: |
    echo '### Total coverage: ${{ env.total }}%' >> $GITHUB_STEP_SUMMARY
    echo '[${{ env.url }}](${{ env.url }})' >> $GITHUB_STEP_SUMMARY

Michael #5: The Language Summit write-up is out

  • via Itamar, by Alex Waygood
  • Python without the GIL: a talk by Sam Gross
  • Reaching a per-interpreter GIL: a talk by Eric Snow
  • The “Faster CPython” project: 3.12 and beyond: a talk by Mark Shannon
  • WebAssembly: Python in the browser and beyond: a talk by Christian Heimes
  • F-strings in the grammar: a talk by Pablo Galindo Salgado
  • Cinder Async Optimisations: a talk by Itamar Ostricher
  • The issue and PR backlog: a talk by Irit Katriel
  • The path forward for immortal objects: a talk by Eddie Elizondo and Eric Snow
  • Lightning talks, featuring short presentations by Carl Meyer, Thomas Wouters, Kevin Modzelewski, Samuel Colvin and Larry Hastings

Daniel #6: AllSpice is Git for EEs

  • Software engineers have Git/SVN/Mercurial/etc.
  • None of the other engineering disciplines (mechanical, electrical, optical, etc.) have it nearly as good. Altium has their Vault and “365,” but there’s nothing with a Git-like UX.
  • Supports version history, diffs, all the things you expect. Even self-hosting and a Gov Cloud version.
  • “Bring your workflow to the 21st century, finally.”

Extras

Brian:

  • Will McGugan talks about Rich, Textual, and Textualize on Test & Code 188
  • Also 3 other episodes since last week. (I have a backlog I’m working through.)

Michael:

  • Power On-Xbox Documentary | Full Movie
  • The 4 Reasons To Branch with Git - Illustrated Examples with Python
  • A Python spotting - via Jason Pecor
  • The 2022 StackOverflow Developer Survey is live, via Brian
  • TextSniper macOS App
  • Pandas Tutor on WebAssembly

Daniel:

  • I know Adafruit’s a household name; shout-out to Sparkfun, Seeed Studio, OpenMV, and other companies in the field.

Joke: A little awkward (https://www.reddit.com/r/ProgrammerHumor/comments/un1pmg/i_can_explain_this/)
Categories: FLOSS Project Planets

Django Weblog: Django 4.1 alpha 1 released

Wed, 2022-05-18 01:58

Django 4.1 alpha 1 is now available. It represents the first stage in the 4.1 release cycle and is an opportunity for you to try out the changes coming in Django 4.1.

Django 4.1 has a profusion of new features, which you can read about in the in-development 4.1 release notes.

This alpha milestone marks the feature freeze. The current release schedule calls for a beta release in about a month and a release candidate about a month after that. We'll only be able to keep this schedule if we get early and frequent testing from the community. Updates on the release schedule are available on the django-developers mailing list.

As with all alpha and beta packages, this is not for production use. But if you'd like to take some of the new features for a spin, or to help find and fix bugs (which should be reported to the issue tracker), you can grab a copy of the alpha package from our downloads page or on PyPI.
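
If you want to take the alpha for that spin in a fresh virtual environment, installing it from PyPI with a pinned pre-release version should work - note that the exact 4.1a1 version string is our assumption, based on Django's usual pre-release naming:

python -m pip install Django==4.1a1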

The PGP key ID used for this release is Carlton Gibson: E17DF5C82B4F9D00.

Categories: FLOSS Project Planets

Python⇒Speed: CPUs, cloud VMs, and noisy neighbors: the limits of parallelism

Tue, 2022-05-17 20:00

Sometimes your program is slow not because of your code, but because of where it’s running. If you have other processes competing for the same limited hardware resources, your code will run more slowly.

Once you add virtualization into the mix, those competing processes might be invisible… but they’re still there.

In this article we’ll cover:

  • The hardware limits of CPUs’ cores and “hyperthreads”.
  • How operating systems deal with multiple processes that want to use a limited number of CPU cores.
  • The impact of virtualization on these various resources, including what cloud vCPUs mean.
  • CPU sharing limits available via containers.
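
As a small taste of the topic, the sketch below (ours, not from the article) shows how the CPU count a machine reports can differ from what a process is actually allowed to use, for example inside a container with CPU limits:

import os

# Logical CPUs the machine reports, hyperthreads included
print("os.cpu_count():", os.cpu_count())

# CPUs this process may actually run on; sched_getaffinity
# is Linux-only, and container CPU pinning shrinks this set
print("usable CPUs:", len(os.sched_getaffinity(0)))
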
Read more...
Categories: FLOSS Project Planets

scikit-learn: The Value of Open Source Sprints, the scikit-learn Experience

Tue, 2022-05-17 20:00
Author: Reshama Shaikh

With contributions from: Gaël Varoquaux, Andreas Mueller, Olivier Grisel, Julien Jerphanion, Guillaume Lemaitre

Top Line Summary

Sprints are working sessions to contribute to an open source library. The goals and achievements differ between Developer and Community sprints. The long-term impact of open source sprints, particularly community events, is not easily quantifiable or measurable. Positive outcomes of sprints have slowly been emerging, and for that reason, realizing the value of open source sprints requires playing the “long game”.

Introduction

The scikit-learn project has a long and extraordinary legacy of open source sprints. Since 2010, when its first public version was released, at least 45 sprints have been organized. That number is a lower bound, since there are likely more sprints that have not been listed.

To date, scikit-learn has over 2300 contributors. The number of contributors to scikit-learn exceeds those of other related libraries such as numpy, scipy, and matplotlib, with the exception of pandas, which has a greater number of contributors (see Appendix A).

The public discourse on open source has expanded to explore topics of sustainability, funding models, and diversity and inclusion, to name a few. A reasonable, yet ”difficult to answer” question that has been posed is:

What is the effectiveness of sprint models and what is the long-term engagement as a result of these sprints?

Due to technological limitations of GitHub, we do not hold precise data on how many scikit-learn contributors connected to the project via a sprint. We have no formal data collection process which records statistics on how many sprint participants are recurring or information on their contributions to other open source projects or other long term positive ripple effects. A scientific look at the correlation between the number of sprints and contributors is beyond the scope of this article. What we will examine in this article are the objectives, results and aspirations of running the scikit-learn open source sprints.

Queries from other open-source projects requesting guidance on sprints and diversity and inclusion have been increasing. We share these experiences and lessons learned with the community, potential funders, and open source project maintainers, particularly those projects which are nascent in their quest to build community, sustainability, and diversity and inclusion.

Outline

In this article we examine the following:

  • What is a “sprint”?
  • What are the differences between “Developer” and “Community” sprints?
  • What are the goals of the open source sprints?
  • What value do open source sprints bring to the project and community?
  • What are the aspirations of the scikit-learn project, in terms of connecting with the community?
Definition of Sprint

A sprint has traditionally been an event where contributors come together to work on issues in the scikit-learn repository. A sprint can be as short as a few hours, or last over several days, even a week or longer. They may be in-person, online, hybrid or asynchronous. Sprints may be organized by the developers of the library, community groups (such as Meetups), scheduled alongside scientific or Python conferences, or even at home with a few friends. They can more simply and less dauntingly be described as working sessions to contribute to the open source library.

Developer vs Community Sprint

We distinguish between a Developer (Dev) and Community sprint because the goals and results differ significantly between the two.

Developer (Dev) Sprint

A Developer, or “dev”, sprint is one that is typically organized by the maintainers of the library. A dev sprint is one where the developers or maintainers of the library gather to work on issues and to discuss the resolution of ongoing complex issues. This also provides the team an opportunity to focus on tasks related to the long-term roadmap of the project.

For scikit-learn, the early sprints were alongside the SciPy conferences and the practice has continued for over a decade.

Community Sprint

A Community sprint can be organized by individuals, by affinity communities such as Meetup groups (Data Umbrella, PyLadies, etc.), or by conferences (SciPy, PyData Global, JupyterCon, etc.). A Community sprint is held with the general public, and its participants may be beginners, experts, or a combination of both.

At a Developer sprint, a contributor may work on a PR that has been ongoing for three months. Conversely, Community sprints require curated issues which newcomers can complete in a shorter period of time (such as 1 day, or 1 day with 1-2 months follow-up).

The landscape of community sprints with other scientific python libraries is unknown. It is possible that scikit-learn may have had community sprints earlier than other projects.

Goals of the Sprints

Goals of Dev Sprints
  • Get maintainers in one room to efficiently discuss open issues and pull requests
  • Move along contributions in a synchronous fashion
  • To foster existing collaborations with external developers synchronously (Julien)
  • Building rapport: Maintainers reside in various continents and the in-person sprints build rapport within the team. Social interactions are critical in having a productive team
  • To foster collaborations with the project’s corporate sponsors (members of the scikit-learn Consortium)
Goals of Community & Beginner Sprints
  • To broaden the project’s contributor base
  • To build community and connect the project maintainers with its users
  • To get interactive feedback from new scikit-learn users and contributors
  • To onboard new contributors to scikit-learn and PyData generally
  • To onboard new contributors who would become recurring contributors
  • To collaborate with community groups to increase diversity of contributor base with intentional outreach
  • To increase the number of recurring contributors
scikit-learn Team Members Who Connected to the Project Via a Sprint

It is notable that a number of the current maintainers of the library found their way to the project via a sprint. Additionally, some members of the Contributor Experience Team also connected to the scikit-learn project via the sprints.

Olivier Grisel

Olivier Grisel has been a contributor and maintainer for more than 12 years. Olivier met Gaël Varoquaux at a local conference organized in Paris by the French speaking Python users group AFPy.org. After chatting 5 minutes about toy ML experiments in Python, Gaël invited Olivier to join the first sprint organized at Inria in March 2010:

Olivier shares:

At the time, scikit-learn coding sprints gathered only 6 people sitting around a table with some wifi and a coffee machine :)

First scikit-learn sprint, Paris, March 2010; Photo credit: Fabian Pedregosa

Andreas Mueller

Andreas Mueller has been a maintainer of the project since 2011. He joined a sprint at a conference because he was a user and wanted to contribute. He shares in a 2017 interview:

While working on my Ph.D. in computer vision and learning, the scikit-learn library became an essential part of my toolkit. My initial participation in open source began in 2011 at the NeurIPS conference in Granada, Spain, where I had attended a scikit-learn sprint. The scikit-learn release manager at the time had to leave, and the project leads asked me to become release manager; that’s how it all got started.

Julien Jerphanion

Julien Jerphanion participated in a sprint in February 2019 at AXA as a first-time contributor while interning at Dataiku. The sprint gave Julien an opportunity to experience scikit-learn and meet the maintainers. Prior to the sprint, he had only used the library in a few projects. He has contributed code, reviews, and documentation since March 2021; he joined Inria in April 2021, and in October 2021 he became a core developer.

Other Maintainers

There are other maintainers and emeritus contributors who participated in a developer or community sprint along their journey with the scikit-learn team, such as Vlad Niculae (current maintainer), Gilles Louppe (emeritus), and Thouis (Ray) Jones (emeritus).

Reshama Shaikh

Reshama Shaikh has organized nine scikit-learn community sprints from 2017 to 2021. She first contributed code and documentation fixes to scikit-learn in September 2018. In September 2020, she was invited to join the scikit-learn team.

In her PyConDE PyData Berlin keynote from April 2022, 5 Years, 10 Sprints, a scikit-learn Open Source Journey, she shares a history and progression of the Community sprints.

Juan Martín Loyola

Juan Martín Loyola started contributing to scikit-learn as preparation for the Data Umbrella Latin America, June 2021 sprint. He continued to contribute prolifically after the sprint, and he was invited to join the team in December 2021. Given his location in Argentina, he will be providing support at the 2022 SciPy Latin America sprint.

Second Degree Impact

Lauren Burke joined the scikit-learn Communications Team in November 2021 at the recommendation of Reshama Shaikh, and this can be considered a network effect.

Sprints: Observed Impact and Lessons Learned

There are a number of observed favorable outcomes from the sprints for both the project and contributors.

Onboarding

The sprints help the community discover the open source process and get started with contributing.

Building community

Sprint participants, whether one-time or recurring, become ambassadors for the project.

Open source workflow knowledge

Users learn about testing, version control systems (i.e. git), and documentation, knowledge which they bring to their work. The sprint experience helps contributors develop a wider set of technical skills that carry across projects, into networking, on to jobs, and more.

Overcoming barriers to entry

The sprints, as hands-on working sessions, provide an avenue for potential contributors to overcome common barriers to entry, particularly “getting started”, and to move from possibility to actuality.

Providing an avenue for advanced contributions

As sprints provide an on-ramp for new contributors, they similarly provide an opportunity for returning contributors to advance their contributing skills to the next level in a structured environment and with mentorship.

Building confidence

The sprints help to build confidence for both new and returning contributors.

Gaël shares:

I believe those sprints helped resourceful people (like Juan Martín) to gain confidence and provide valuable contributions (especially reviews).

Increase open-source literacy

The sprints are a forum for users to gain a greater understanding of how an open source project functions and for the user/contributor to learn of an actual contribution, from start to finish.

Value of synchronous interaction

Typically, open source contributions to scikit-learn occur on the GitHub repository in asynchronous fashion. The sprints provide real-time synchronous interaction. This experience provides more direct access to technical assistance and feedback to the contributor, and in a direct, efficient, and time-saving manner.

Julien shares:

I think having a setup like this [beginner/community sprint] is valuable for first-time contributors because they can synchronously get specific information they would hardly have gotten otherwise. To me, this allows giving feedback which is immediate, specific and exact, making contributing to open-source enjoyable and preventing frustration: giving such feedback is what we should aim for, and in this regard this setup is convenient.

Online Sprints

Since the start of the pandemic, Data Umbrella organized 4 online sprints. Additionally, there were 2 online sprints with SciPy and EuroPython.

These have been the observed benefits of the online sprints, which began in 2020 due to the global pandemic:

Networking

Sprints make it easier to meet new people with different backgrounds, and in particular, online sprints help break geographical barriers.

International collaboration

Collaborating with affinity communities can attract more candidates from various backgrounds.

Pair programming

The pairing of contributors seems to work well. Pair programming was consistently ranked as a positive experience by online sprint participants.

Increases accessibility

The use of online tools in particular makes it possible to interact with people who would not have joined traditional community events organized in North America or Western Europe, e.g. because of the travel costs and the complexity of getting a visa in time. Attending those online events is probably also less disruptive for people with young children.

For the scikit-learn project itself, it made it possible to “recruit” a couple of new recurring contributors who attend regular office hours after the original sprints.

Office Hours

Actually, the fact that we now have community office hours on Discord is probably a consequence of us attending the Data Umbrella online sprints.

Olivier shares:

I think they [the sprints] were the most interesting online events I attended during the COVID-19 crisis when all traditional on-site tech events were canceled. In particular the active planning by the Data Umbrella team for participants to work in pairs with audio rooms on Discord + a central help desk audio room worked really well.

The pre-sprint and post-sprint office hours also made it possible to limit the time spent on helping fix setup issues compared to what we experience in traditional sprints. They also forced us as maintainers to review and fix our documentation before the event.

Creation of supplementary resources in various medium forms

Data Umbrella coordinated the creation of a series of videos and transcripts that provided learning materials for the community to prepare for the sprint. These resources were available to the public and have a wide reach:

This is the Contributing to scikit-learn list of videos that were created for the sprints.

Photo credit: Reshama Shaikh

Aspirations for Future scikit-learn Sprints

One of the primary goals of the Community sprints was to onboard new contributors who would become recurring contributors. This goal has generally not been realized. scikit-learn is a complex and advanced project, and a one-time sprint does not provide sufficient opportunity and support for sprint participants to become recurring contributors. A few sprint participants have progressed to become returning contributors, but it is a very small number relative to the total number of sprint participants.

Onboarding a first-time contributor takes time. People contributing for the first time need to absorb a lot of information simultaneously, covering both the technical and organizational aspects of contributing. Depending on their setup and experience, people may run into unexpected issues at the very start, get frustrated and/or discouraged, and not report the problem they are having (thinking it is their fault). Pre-event office hours have been successful at alleviating some of these roadblocks for those sprint participants who completed their pre-work.

Here are some adjustments that can be made in the future to reach the goal of recruiting recurring contributors:

  • Provide mentoring
  • Improve onboarding process
  • Improve issues definitions
  • Have sprints alongside tutorials
  • Expand types of contributions that new contributors can make
  • Have smaller sprint events

Mentoring

Sprints alone may not be sufficient for onboarding people. Mentoring is needed to take contributors to the next level. Mentoring relationships can be established during sprint events.

Improve the onboarding process

While the scikit-learn project has improved significantly in the past few years as a result of feedback and learnings from the sprints, there is still room for improvement.

The scikit-learn project is complex, the contributor learning curve is steep, and it has been getting more difficult to contribute to scikit-learn.

Improve issues definitions

There are 1600+ open issues in the GitHub repository. Issues could be better defined, and it would be valuable to break them into smaller, more approachable steps.

Sprints alongside tutorials

Scheduling sprints alongside tutorial sessions would help users connect the tool's use cases with the motivation and product vision of scikit-learn.

Expand types of contributions

While the sprints have typically focused on documentation and code contributions, the project needs support in other areas. There is a backlog of open issues (1600+ !) and open pull requests (650+). The project needs support in triaging issues and reviewing pull requests. It would be beneficial to have sprint contributors work on increasingly complex issues.

Julien shares from personal experience:

In particular, and in my opinion, reviewing pull requests is as valuable as authoring them. I also find it a preferable way to learn about scikit-learn internals rather than solving issues.

Have smaller sprints

Julien suggests:

Would sprints with a really small number of people (e.g. 2 mentees per mentor) be more valuable in the long term? Personally, I would prefer mentoring one or two people closely instead (ideally in-person), as I think it is a more achievable, enjoyable and fruitful experience (this is something I am trying to do at the moment when I can get some time, but I currently have limited amounts of it).

Finally, I would also really treasure having in-person sprints [in Paris] with external (recurring) contributors (with a specific expertise) on advanced subjects when it is possible in the future.

Appendix A: GitHub Contributors Comparison of Libraries

A comparison of the contributor base to other related libraries in the same space (May 2022):

Categories: FLOSS Project Planets

PyCoder’s Weekly: Issue #525 (May 17, 2022)

Tue, 2022-05-17 15:30

#525 – MAY 17, 2022
View in Browser »

Python’s min() and max(): Find Smallest and Largest Values

In this tutorial, you’ll learn how to use Python’s built-in min() and max() functions to find the smallest and largest values. You’ll also learn how to modify their standard behavior by providing a suitable key function. Finally, you’ll code a few practical examples of using min() and max().
REAL PYTHON

Using django-rich for Testing

The django-rich library adds color and formatting to Django management commands, including colorized tracebacks. Make your debugging and testing more visual.
OKKEN, JOHNSON, & SMITH podcast

Analyze Code-Level Performance Across Your app’s Environment With Minimal Performance Overhead

Datadog’s profiler allows you to capture code profiles for all of your production instances. Compare those profiles in the profile comparison view to see how the performance of your code changes over time. You can quantify the changes you’ve made to fix a bottleneck. Try Datadog APM free →
DATADOG sponsor

The Well-Maintained Test: 12 Questions for New Dependencies

There is lots of openly available code out there, but how do you know if you should build a dependency on some random coder’s package? 12 Questions you should ask yourself before using a library.
ADAM JOHNSON

DjangoCon Europe 2022 Call for Proposals

DJANGOCON.EU

DjangoCon US 2022 Call for Proposals

PRETALX.COM

Python Release Python 3.11.0b1

PYTHON.ORG

Python in Visual Studio Code: May 2022 Release

MICROSOFT.COM

Discussions

Python Language Summit: Python Without the GIL

What’s a language summit without a conversation about the GIL? This HN discussion is all about the “nogil” conversation at the 2022 summit.
HACKER NEWS

Which Python Packages Do You Use the Most?

MIKE DRISCOLL

Python Jobs

Academic Innovation Developer (Ann Arbor, MI, USA)

University of Michigan

Software Development Lead (Ann Arbor, MI, USA)

University of Michigan

Senior Backend Engineer (Anywhere)

Doist

Senior Storytelling Framework Engineer - Python (France)

GoPro

Senior Software Engineer - Python Full Stack (USA)

Blenderbox

Gameful Learning Developer (Ann Arbor, MI, USA)

University of Michigan

Data & Operations Engineer (Ann Arbor, MI, USA)

University of Michigan

Lead Software Engineer (Anywhere)

Right Side Up

More Python Jobs >>>

Articles & Tutorials

Deploying a Flask Application Using Heroku

In this video course, you’ll learn how to create a Python Flask example web application and deploy it using Heroku. You’ll also use Git to track changes to the code, and you’ll configure a deployment workflow with different environments for staging and production.
REAL PYTHON course

Python Decorator Patterns

Decorators are a way of wrapping functions around functions; they're a common technique for providing pre- and post-conditions on your code. Learn about the different ways decorators get invoked and how to write each pattern.
MARTON TRENCSENI

Pulumi: Developer-First Infrastructure with Python

Developing on the cloud is complex. What if you could use your existing programming knowledge to build, deploy, and manage infrastructure on any cloud using your favorite languages and tools? With Pulumi you can. Get started for free at pulumi.com →
PULUMI sponsor

Introduction to Linear Programming in Python

Linear programming is a technique in mathematics for optimizing multi-variable problems. This article introduces you to the world of linear programming and some Python libraries you can use to solve these kinds of problems.
MAXIME LABONNE

Gevent Performance

Gevent is a co-routine based networking library whose sweet spot is network-bound workloads. Learn how gevent allows you to efficiently interleave other CPU work while waiting on the network for results.
ROY WILLIAMS

REPL Python Programming and Debugging With IPython

IPython is a powerful alternative to the built-in REPL. Learn how to use it for exploratory programming and debugging, including using it in the Django shell.
LUKE PLANT

Unlock Secret Knowledge From Python Experts for Just $10

Packt’s Spring Sale is on and for a limited period, all eBooks and Videos are only $10. Our Products are available as PDF, ePub, and MP4 files for you to download and keep forever. All the practical content you need - by developers for developers.
PACKT PUBLISHING sponsor

Profiling and Analyzing Performance of Python Programs

The tools and techniques for finding all the bottlenecks in your Python programs and fixing them, fast. Includes info on cProfile, py-spy, py-heat, and more.
MARTIN HEINZ • Shared by Martin Heinz

How to Code a Blockchain in 6 Steps

The best way to understand blockchains is to see one in action, or better yet, to build one. Learn how to use Python and hashlib to create your own.
ARI COHEN

Projects & Code

ads: Store Data in Soundwaves

GITHUB.COM/STACKBUFFER

TatSu: Generate Python Parsers From EBNF Grammars

GITHUB.COM/NEOGENY

woodwork: Data Typing Namespace for Many ML Tools

GITHUB.COM/ALTERYX

pony: Pony Object Relational Mapper

GITHUB.COM/PONYORM

open-data-anonymizer: Data Anonymization & Masking

GITHUB.COM/ARTLABSS

Events

PiterPy Breakfast

May 18, 2022
TIMEPAD.RU

Weekly Real Python Office Hours Q&A (Virtual)

May 18, 2022
REALPYTHON.COM

PyData Bristol Meetup

May 19, 2022
MEETUP.COM

PyLadies Dublin

May 19, 2022
PYLADIES.COM

Karlsruhe Python User Group (KaPy)

May 20, 2022
BL0RG.NET

Django Girls Malabo

May 21 to May 22, 2022
DJANGOGIRLS.ORG

Happy Pythoning!
This was PyCoder’s Weekly Issue #525.
View in Browser »

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

Categories: FLOSS Project Planets

Real Python: Using Python Class Constructors

Tue, 2022-05-17 10:00

Class constructors are a fundamental part of object-oriented programming in Python. They allow you to create and properly initialize objects of a given class, making those objects ready to use. Class constructors internally trigger Python’s instantiation process, which runs through two main steps: instance creation and instance initialization.

If you want to dive deeper into how Python internally constructs objects and learn how to customize the process, then this video course is for you.

In this video course, you’ll:

  • Understand Python’s internal instantiation process
  • Customize object initialization using .__init__()
  • Fine-tune object creation by overriding .__new__()
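
As a minimal sketch of those two steps (our example, not taken from the course itself):

class Point:
    def __new__(cls, *args, **kwargs):
        # Step 1: instance creation - build and return the bare object
        print("Creating a new Point instance")
        return super().__new__(cls)

    def __init__(self, x, y):
        # Step 2: instance initialization - populate the new object
        print("Initializing the instance")
        self.x = x
        self.y = y

point = Point(2, 3)  # Prints both messages: creation first, initialization second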

With this knowledge, you’ll be able to tweak the creation and initialization of objects in your custom Python classes, which will give you control over the instantiation process at a more advanced level.

To better understand the examples and concepts in this video course, you should be familiar with object-oriented programming and special methods in Python.

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Python Insider: Python 3.9.13 is now available

Tue, 2022-05-17 09:33

 

This is the thirteenth maintenance release of Python 3.9. Get it here:
Python 3.9.13

According to the release calendar specified in PEP 596, Python 3.9.13 is the final regular maintenance release. Starting now, the 3.9 branch will only accept security fixes and releases of those will be made in source-only form until October 2025.

This is a milestone moment for me as it means that now both of my release series are security-only. My work as release manager enters its final stage. I’m not crying, you’re crying!

Compared to the 3.8 series, this last regular bugfix release is still pretty active at 166 commits since 3.9.12. In comparison, version 3.8.10, the final regular bugfix release of Python 3.8, included only 92 commits. However, it’s likely that it was 3.8 that was special here with the governance changes occupying core developers’ minds. For reference, version 3.7.8, the final regular bugfix release of Python 3.7, included 187 commits.

In any case, 166 commits is quite a few changes, some of which being pretty important fixes. Take a look at the change log for details.

Major new features of the 3.9 series, compared to 3.8

Some of the major new features and changes in Python 3.9 are:

  • PEP 573, Module State Access from C Extension Methods
  • PEP 584, Union Operators in dict
  • PEP 585, Type Hinting Generics In Standard Collections
  • PEP 593, Flexible function and variable annotations
  • PEP 602, Python adopts a stable annual release cadence
  • PEP 614, Relaxing Grammar Restrictions On Decorators
  • PEP 615, Support for the IANA Time Zone Database in the Standard Library
  • PEP 616, String methods to remove prefixes and suffixes
  • PEP 617, New PEG parser for CPython
  • BPO 38379, garbage collection does not block on resurrected objects;
  • BPO 38692, os.pidfd_open added that allows process management without races and signals;
  • BPO 39926, Unicode support updated to version 13.0.0;
  • BPO 1635741, when Python is initialized multiple times in the same process, it does not leak memory anymore;
  • A number of Python builtins (range, tuple, set, frozenset, list, dict) are now sped up using PEP 590 vectorcall;
  • A number of Python modules (_abc, audioop, _bz2, _codecs, _contextvars, _crypt, _functools, _json, _locale, operator, resource, time, _weakref) now use multiphase initialization as defined by PEP 489;
  • A number of standard library modules (audioop, ast, grp, _hashlib, pwd, _posixsubprocess, random, select, struct, termios, zlib) are now using the stable ABI defined by PEP 384.
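
To make a few of these concrete, here is a short sketch of ours (not from the release announcement) exercising PEP 584, PEP 585, and PEP 616:

# PEP 584: union operators in dict
defaults = {"theme": "light", "lang": "en"}
overrides = {"lang": "fr"}
settings = defaults | overrides   # {'theme': 'light', 'lang': 'fr'}

# PEP 585: type hinting generics in standard collections
def positive(counts: dict[str, int]) -> list[str]:
    return [name for name, n in counts.items() if n > 0]

# PEP 616: string methods to remove prefixes and suffixes
assert "test_case".removeprefix("test_") == "case"
assert "module.py".removesuffix(".py") == "module"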

You can find a more comprehensive list in this release’s “What’s New” document.

We hope you enjoy Python 3.9!

Thanks to all of the many volunteers who help make Python Development and these releases possible! Please consider supporting our efforts by volunteering yourself or through organization contributions to the Python Software Foundation.

Your friendly release team,
Ned Deily @nad
Steve Dower @steve.dower
Łukasz Langa @ambv

Categories: FLOSS Project Planets

Test and Code: 188: Python's Rich, Textual, and Textualize - Innovating the CLI

Tue, 2022-05-17 09:15

Will McGugan has brought a lot of color to CLIs within Python due to Rich.
Then Textual started rethinking full command line applications, including layout with CSS.
And now Textualize, a new startup, is bringing CLI apps to the web.

Special Guest: Will McGugan.

Sponsored By:

  • Rollbar: With Rollbar, developers deploy better software faster.

Links:

  • rich
  • rich-cli
  • textual
  • Textualize.io
  • Rich Gallery
  • Textualize Gallery
  • Python Bytes Podcast
Categories: FLOSS Project Planets

Nicola Iarocci: Eve-Swagger v0.2 released

Tue, 2022-05-17 02:05
I just released Eve-Swagger v0.2 on PyPI. Eve-Swagger is a Swagger/OpenAPI extension for Eve-powered RESTful APIs. This maintenance release addresses a few issues and adds support for eve-auth-jwt. Many thanks to Roberto Romero for his contributions to this release. Subscribe to the newsletter, the RSS feed, or follow @nicolaiarocci on Twitter.
Categories: FLOSS Project Planets

Hynek Schlawack: Better Python Object Serialization

Mon, 2022-05-16 20:00

The Python standard library is full of underappreciated gems. One of them allows for simple and elegant function dispatching based on argument types. This makes it perfect for serialization of arbitrary objects – for example to JSON in web APIs and structured logs.
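
The gem in question is functools.singledispatch; the sketch below is our own minimal illustration of the idea, not code from the article:

import json
from datetime import datetime
from functools import singledispatch

@singledispatch
def to_serializable(val):
    # Fallback for types we have not registered: use str()
    return str(val)

@to_serializable.register
def _(val: datetime) -> str:
    # Serialize datetimes as ISO 8601 strings
    return val.isoformat()

# json.dumps calls to_serializable for anything it cannot encode itself
print(json.dumps({"when": datetime(2022, 5, 16)}, default=to_serializable))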

Categories: FLOSS Project Planets

Kushal Das: OAuth Security Workshop 2022

Mon, 2022-05-16 11:46

Last week I attended OAuth Security Workshop at Trondheim, Norway. It was a 3-day single-track conference, where the first half of the days were pre-selected talks, and the second parts were unconference talks/side meetings. This was also my first proper conference after COVID emerged in the world.

Back to the starting line

After many years, I felt the whole excitement of being a total newbie in something and of suddenly being able to meet all the people behind the ideas. I reached the conference hotel in the afternoon of day 0 and met the organizers in the lobby area. That chat went on for a long time, and as more and more people kept checking into the hotel, I realized that it was a kind of reunion for many of the participants. Though a few of them had met at a conference in California just a week earlier, they were all excited to meet again.

To understand how welcoming any community is, just notice how it behaves towards new folks. I think the Python community stands high in this regard. And I am very happy to say that the whole OAuth/OIDC/identity-related community is excellent in this regard as well. Even though I kept introducing myself as the new person in this identity land, not once did I feel unwelcome. I attended OpenID-related working group meetings during the conference, had multiple hallway chats, and talked to people while walking around the beautiful city. Everyone was happy to explain things in detail to me, even though most of the people there have already spent 5-15+ years in the identity world.

The talks & meetings

What happens in Trondheim, stays in Trondheim.

I generally do not attend many talks at conferences, as they get recorded. But here, the conference was a single track, and also, there were no recordings.

The first talk was related to formal verification, and this was the first time I saw those (scary in my mind) maths on the big screen. But, full credit to the speakers as they explained things in such a way so that even an average programmer like me understood each step. And after this talk, we jumped into the world of OAuth/OpenID. One funny thing was whenever someone mentioned some RFC number, we found the authors inside the meeting room.

In the second half, we had the GNAP master class from Justin Richer. And once again, the speaker straightforwardly explained such deep technical details so that everyone in the room could understand it.

The evening before, people mentioned a few times that in heated technical discussions many RFC numbers would be thrown around, though there were not so many that I got too scared :)

I also managed to meet Roland for the first time. We had longer chats about the status of Python in the identity ecosystem and also about Identity Python. I took some notes about how we can improve the usage of Python in this, and I will most probably start writing about those in the coming weeks.

In multiple talks, researchers and people from the industry pointed out the mistakes made in this space from a security point of view. Even though for many things we have clear instructions in the specs, there is no guarantee that implementors will follow them properly, and this causes security gaps.

At the end of day 1, we had a special Organ concert at the beautiful Trondheim Cathedral. On day 2, we had a special talk, “The Viking Kings of Norway”.

If you let me talk about my experience at the conference, I don't think I will stop before 2 hours. There was so much excitement and new information, and the whole feeling of going back to my starting days, when I knew next to nothing. Every discussion was full of learning opportunities (all discussions are, anyway, but being a newbie is a different level of excitement). The one sadness was leaving Anwesha & Py back in Stockholm - this was the first time I was staying away from them after moving to Sweden.

Just before the conference ended, Aaron Parecki gave me a surprise gift. I spent time with it during the whole flight back to Stockholm.

This conference had the best food experience of my life for a conference. Starting from breakfast to lunch, big snack tables, dinners, or restaurant foods. In front of me, at least 4 people during the conference said, “oh, it feels like we are only eating and sometimes talking”.

Another thing I really loved to see is that the two primary conference organizers are university roommates who are continuing the friendship and journey in a very beautiful way. After midnight, standing outside of the hotel and talking about random things about life and just being able to see two longtime friends excited about similar things, it felt so nice.

I also want to thank the whole organizing team, including local organizers, Steinar, and the rest of the team did a superb job.

Categories: FLOSS Project Planets

Python Morsels: Reading binary files in Python

Mon, 2022-05-16 11:00

How can you read binary files in Python? And how can you read very large binary files in small chunks?

Table of contents

  1. How to read a binary file in Python
  2. Use a library to read your binary file
  3. Working at byte level in Python
  4. Reading binary files in chunks
  5. Aside: using an assignment expression
  6. Summary

How to read a binary file in Python

If we try to read a zip file with Python's built-in open function in its default read mode, we'll get an error:

>>> with open("exercises.zip") as zip_file:
...     contents = zip_file.read()
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8e in position 11: invalid start byte

We get an error because zip files aren't text files, they're binary files.

To read from a binary file, we need to open it with the mode rb instead of the default mode of rt:

>>> with open("exercises.zip", mode="rb") as zip_file:
...     contents = zip_file.read()
...

When you read from a binary file, you won't get back strings. You'll get back a bytes object, also known as a byte string:

>>> with open("exercises.zip", mode="rb") as zip_file:
...     contents = zip_file.read()
...
>>> type(contents)
<class 'bytes'>
>>> contents[:20]
b'PK\x03\x04\n\x00\x00\x00\x00\x00Y\x8e\x84T\x00\x00\x00\x00\x00\x00'

Byte strings don't have characters in them: they have bytes in them.

The bytes in a file won't help us very much unless we understand what they mean.
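
The table of contents above also promises chunked reading; as a preview, a common pattern for it looks like the sketch below (ours, using the assignment expression the table of contents hints at):

def read_in_chunks(path, chunk_size=8192):
    """Yield successive chunks of bytes from a binary file."""
    with open(path, mode="rb") as binary_file:
        # The assignment expression ends the loop at end-of-file,
        # when read() returns an empty bytes object
        while chunk := binary_file.read(chunk_size):
            yield chunk

total = sum(len(chunk) for chunk in read_in_chunks("exercises.zip"))
print(f"{total} bytes read")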

Use a library to read your binary file

You probably won't read a …
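
For zip files in particular, the standard library's zipfile module is the natural choice; a minimal sketch (ours, not from the truncated article):

import zipfile

# Inspect the archive's members instead of decoding raw bytes by hand
with zipfile.ZipFile("exercises.zip") as archive:
    for info in archive.infolist():
        print(info.filename, info.file_size)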

Read the full article: https://www.pythonmorsels.com/reading-binary-files-in-python/
Categories: FLOSS Project Planets
