Planet Python
Zero to Mastery: Python Monthly Newsletter 💻🐍
Tryton News: Newsletter March 2023
This is the last newsletter for the Tryton 6.6 series. Below you will find the latest developments:
Changes for the User
A warning is raised when cancelling an account move related to an invoice, because the invoice will be marked as paid and not cancelled.
The receivable/payable lines now have a column with the cumulative balance. This helps you see the evolution of the customer/supplier debt.
We’ve added links from a party to its purchase and sale lines, in addition to its purchases and sales. All of those links also show the number of pending and done records.
An option to cancel a move with a reversal move has been added. So, when checked, Tryton will post the move with the debit and credit swapped over, instead of posting negative debits and credits.
Now Tryton notifies the user when they are registering a party identifier that is already in use with another party. This helps prevent duplicate parties from being created.
The stock moves and invoice lines linked to a sale or purchase line are now displayed on their form. This is useful to understand the status of an order.
Tryton now supports setting a price for a carrier cost that is based on weight even if the products have no weight.
We’ve added a new category of unit of measure for energy with some common units like the Joule and Kilowatt-hour.
And we removed the non-standard “Work day” unit because its value depends on the company’s working practices.
The size of each cache can now be fine tuned using entries in the configuration file. So you no longer need to develop a new module to change the cache size for a particular usage.
Changes for the Developer
The server now includes the tools to generate barcodes and QR codes.
The product module makes use of a method that generates the barcode for a corresponding product identifier.
We’ve added two methods on the Field: searchable, which returns True if the field can be used in a domain expression, and sortable, which returns True if the field can be used in an order expression.
The list-form view now has abilities to do validation, pre-validation and automatic saving when the selected record changes. This mimics the behaviour of the editable list.
Also the selections are now restored on list-form view.
New types of exceptions have been introduced: RPCReturnException and ButtonActionException. They are used to launch an action from a button using an exception (which rolls back the transaction started by the button).
This is useful if, for example, you need to launch a wizard before executing a transition.
Now it’s easy to extend the context keys that the caches must ignore.
For the calendar view, it is possible to scroll to a given time by default by including the calendar_scroll_time key in the context. This is only supported by the web client for now.
Janusworx: TIL: The Difference Between a CLI Tool and a Freeze Tool
I keep writing these tiny utilities for myself in Python and while I love writing in Python, I definitely don’t enjoy the little war dance I have to do every time I want to run it on a new machine. Keeping track of virtual environments, and then installing packages in them, quickly gets tiresome. I want to just run the program once I’m done with it. Like a C program. Or Rust. Or Go.
I could go learn one of those languages. Sure. That’s an option.
And I see no problem doing it for other needs (like if I’m working with a group that uses said language) or for my career’s sake or just for being a polyglot.
But when I write for myself, I’d rather stick to Python.
Its mental model fits my brain. I find writing Python joyful.
So what’s a boy to do?
Since I was writing CLI applications, I went and looked for how to write Python CLIs and I found argparse and Typer and Click and I was confused. They didn’t seem to give me my executable. They had all sorts of other powerful features, but not the one I wanted.
And I went looking some more. And the penny dropped.
What I wanted, wasn’t a way to build a CLI, (although that is what I was writing).
What I wanted, was a way to bundle my environment into an executable.
What I wanted, I learned, was a way to freeze my code.
More searching led to Nuitka, PyOxidizer, PyInstaller and their ilk.
I went with Nuitka, because that is what I found first, and what I found was dead simple.
And boom!
I got myself an executable: mastodon-to-moi.bin!
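The invocation was, roughly, a single command along these lines (the exact file name and flags will vary with your setup and Nuitka version):

$ python -m nuitka mastodon-to-moi.py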
And then I got greedy and did a mastodon-to-moi.bin --help and figured that’s where the CLI/TUI frameworks like Typer and Textualize come in handy: to scaffold and help build these interfaces.
But when I ran it on my laptop, which has nothing other than the system Python, it hiccoughed. I tried PyInstaller next, and that too did not do the trick: it choked on a loop that worked fine when I ran the program with Python normally. PyOxidizer scared me off.
This needs further looking into, but at least I know what I need to do now.
Afterthought: The one con I do see is losing platform independence. My binaries will be Linux on Intel only.¹ I could just take my Python source anywhere, set up my environment and go. Well, I still have that option if I need it, and in the meanwhile, having binaries affords me a lot of convenience.
Feedback on this post? Mail me at feedback@janusworx.com
P.S. Subscribe to my mailing list!
Forward these posts and letters to your friends and get them to subscribe!
P.P.S. Feed my insatiable reading habit.
-
1. Until I learn to cross-compile, that is. ↩︎
TestDriven.io: Deploying a Django App to Azure App Service
Mike Driscoll: Python’s Tuple Methods (Video)
Do you know how many methods Python’s tuple data type has?
If not, you can find out in this video:
Python’s tuple data type only has TWO methods! This video will teach you about both of them.
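For the record, here they are (a quick REPL sketch):

>>> numbers = (1, 2, 2, 3)
>>> numbers.count(2)   # how many times a value appears
2
>>> numbers.index(3)   # position of the first occurrence
3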
The post Python’s Tuple Methods (Video) appeared first on Mouse Vs Python.
PyCoder’s Weekly: Issue #566 (Feb. 28, 2023)
#566 – FEBRUARY 28, 2023
View in Browser »
Are you still using loops and lists to process your data in Python? Have you heard of a Python library with optimized data structures and built-in operations that can speed up your data science code? This week on the show, Jodie Burchell, developer advocate for data science at JetBrains, returns to share secrets for harnessing linear algebra and NumPy for your projects.
REAL PYTHON podcast
While multiprocessing allows Python to scale to multiple CPUs, it has some performance overhead compared to threading. This article details why processes have performance issues that threads don’t, ways to work around it, and a sample bad solution.
ITAMAR TURNER-TRAURING
We built TelemetryHub to make observability easy. It’s our goal to provide you with peace of mind by providing a user experience that’s intuitive, easy to use, and allows you to monitor and understand what’s happening inside your application at a glance →
SCOUT APM sponsor
XML and YAML are two of the most popular text based data formats. This article teaches you how to use third-party Python libraries to convert from one to the other.
ADITYA RAJ
Text annotation is the process of reading natural language data and adding additional information to it in a way your program can use it. This info can be used to train models or help process the data. This article describes 6 different tools that can help you annotate your text data.
NEWSCATCHER
Find your favorite package and turn to the readme to get it installed: it seems dead simple, just a ‘pip install’ away. Nothing could possibly go wrong, right? If you’re used to it, it is easy to forget that almost all the instructions are skipping a step: using a virtual environment.
IAN WOOTTEN
Get the Python resources you need to build your notifications infrastructure, faster. Everything from code tutorials, to an SDK and docs all in one place. See for yourself →
COURIER sponsor
In this tutorial, you’ll learn how to iterate over a pandas DataFrame’s rows, but you’ll also understand why looping is against the way of the panda. You’ll understand vectorization, see how to choose vectorized methods, and compare the performance of iteration against pandas.
REAL PYTHON
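A tiny illustration of the difference, with made-up data:

import pandas as pd

df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 2, 3]})

# Row-by-row iteration: the pattern the tutorial argues against
totals = [row.price * row.qty for row in df.itertuples()]

# Vectorized: one operation over whole columns
df["total"] = df["price"] * df["qty"]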
App Fiddle is to apps what JSFiddle is to JavaScript. Use this instance to learn Flask/SQLAlchemy, running an app in Codespaces. You can browse and explore using VSCode on the web, customize, and debug a complete project, including a database.
APILOGICSERVER.GITHUB.IO • Shared by Val Huber
This is Duarte’s take on what tools and practices to use for a new Python project. Includes samples for pyproject.toml, details on using pip-tools, and even the occasional Makefile.
DUARTE O.CARMO
Metaclasses are part of the darker corners of Python and many developers avoid them. This article dives deep into how you can use them to reduce boilerplate code and build APIs.
IONEL CRISTIAN MĂRIEȘ
In this quick training, learn how to build a face recognition application using open source tools like OpenCV (Open Source Computer Vision Library) and InfluxDB time series platform. Github repository included.
INFLUXDATA sponsor
This step-by-step guide shows you how to build a REST API with Create, Read, Update, and Delete methods using Flask, SQLAlchemy, Postgres, and Docker.
FRANCESCO CIULLA
Everybody is talking about GPT; this article actually builds one. Learn how to implement a GPT model from scratch in NumPy.
JAY MODY
March 1, 2023: MEETUP.COM
March 1, 2023: REALPYTHON.COM
March 1, 2023: PYSTADA.GITHUB.IO
March 2, 2023: MEETUP.COM
March 2, 2023: MEETUP.COM
March 4 to March 5, 2023: DJANGOGIRLS.ORG
March 6 to March 9, 2023: GEOPYTHON.NET
Happy Pythoning!
This was PyCoder’s Weekly Issue #566.
View in Browser »
[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]
Python Morsels: Using Python's "list" constructor
When should you use the built-in list function in Python? And when shouldn't you?
Table of contents
- What does list do?
- Don't use list() to create empty lists
- The four uses of list
- Shallow copying lists
- Turning lazy iterables into lists
- Turning any iterable into a list
- Using list as a factory function
- When merging iterables use * instead of list()
- Embrace list, but don't overuse it
There are two ways to use the built-in list function (or "callable" if you prefer, as it's technically a class).
The list function accepts a single argument, which must be an iterable.
>>> a_new_list = list(my_iterable)
The list function will loop over the given iterable and make a new list out of it.
Like str, int, dict, and many constructor functions in Python, list can also be called without any arguments:
>>> list()
[]
When given no arguments, list returns an empty list.
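A quick sketch of two of the uses listed above (shallow copying and turning a lazy iterable into a list):

>>> squares = (n ** 2 for n in range(5))   # a lazy iterable (a generator)
>>> list(squares)                          # turn it into a list
[0, 1, 4, 9, 16]
>>> original = [1, 2, 3]
>>> copy = list(original)                  # a new list: a shallow copy
>>> copy is original
False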
Don't use list() to create empty lists
Creating an empty list is …
Read the full article: https://www.pythonmorsels.com/using-list/
Real Python: Writing Clean, Pythonic Code With namedtuple
Python’s collections module provides a factory function called namedtuple(), which is specially designed to make your code more Pythonic when you’re working with tuples. With namedtuple(), you can create immutable sequence types that allow you to access their values using descriptive field names and the dot notation instead of unclear integer indices.
If you have some experience using Python, then you know that writing Pythonic code is a core skill for Python developers. In this video course, you’ll level up that skill using namedtuple.
In this video course, you’ll learn how to:
- Create namedtuple classes using namedtuple()
- Identify and take advantage of cool features of namedtuple
- Use namedtuple instances to write Pythonic code
- Decide whether to use a namedtuple or a similar data structure
- Subclass a namedtuple to provide new features
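As a small taste of what that looks like (the names here are illustrative, not taken from the course):

>>> from collections import namedtuple
>>> Point = namedtuple("Point", ["x", "y"])
>>> p = Point(x=2, y=3)
>>> p.x, p.y          # dot notation instead of p[0], p[1]
(2, 3)
>>> class LabeledPoint(namedtuple("LabeledPoint", ["x", "y", "label"])):
...     def describe(self):
...         return f"{self.label}: ({self.x}, {self.y})"
...
>>> LabeledPoint(1, 2, "origin-ish").describe()
'origin-ish: (1, 2)'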
To get the most out of this course, you need to have a general understanding of Python’s philosophy related to writing Pythonic and readable code. You also need to know the basics of working with:
If you don’t have all the required knowledge before starting this video course, then that’s okay! You can stop and review the above resources as needed.
[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
PyBites: Django class based vs function based views
This post first appeared as an email on our friends list. We decided to publish it here because the question is still commonly asked
The other day I did a presentation on Django class based vs function based views. (Warning: this post is opinionated.)
Also, if you’re not into Django, don’t stop reading just yet, because there are parallels with common software design patterns you might find interesting.
Here is a summary of interesting things I presented (and credit to Luke Plant’s Django Views the Right Way for cementing what I already started to sense using both types of views, and Ryan for sending it my way):
– Class based views (CBVs from here on) are great for simple use cases (e.g.: CRUD or create-read-update-delete), but going beyond that is tricky, because you have to know what methods to override. Luckily there is Classy Class-Based Views but in practice you end up having to learn a monstrous API (that is, reading a lot of source code of parent classes and mixins) to make sense of it all. Also as these CBVs are generic, code is less explicit (e.g. the use of self.kwargs – what does this hold?!)
– So you need to know the signature of the thing you override (attribute or method). For example, if you override get_context_data() to pass extra data to a template you need to know:
– It takes **kwargs
– You have to call the parent’s get_context_data() using super()
– You have to return the context dictionary.
Compare that to a function based view, where this dictionary is passed into the response directly: you literally just add another key/value pair to it, a single line of code change. Much more transparent.
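A minimal sketch of that contrast (the view and template names are made up):

from django.views.generic import TemplateView
from django.shortcuts import render

class DashboardView(TemplateView):
    template_name = "dashboard.html"

    def get_context_data(self, **kwargs):
        # You need to know: it takes **kwargs, you must call super(),
        # and you must return the dict.
        context = super().get_context_data(**kwargs)
        context["greeting"] = "hello"
        return context

def dashboard(request):
    # Function based: the dict is right there; extra data is one more line.
    context = {"greeting": "hello"}
    return render(request, "dashboard.html", context)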
– Overall, I find function based views easier to reason about. It shows what a view actually is (!): a function that receives a web request and returns a web response. CBVs hide this from you, which makes them more implicit, and as the Zen of Python says: explicit is better than implicit.
– CBVs leverage multiple inheritance and mixins. Inheritance trees can be hard to follow (and they tend to get quite deep with CBVs), because there are more dependencies, and mixins (common code you “plug in” to a bunch of classes) make the number of permutations even greater (and can even lead to code clashes).
Again functions are more straightforward, easier to isolate and test (see this article). To learn more about the limitation of inheritance you might want to read The Composition Over Inheritance Principle.
What about code duplication across function based views? Well, nothing stops you from extracting duplicated code into helper functions or using decorators, which are an elegant pattern for this (e.g. in Django you can “lock down” your views with @login_required).
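A minimal sketch, using Django’s built-in decorator (view and template names made up):

from django.contrib.auth.decorators import login_required
from django.shortcuts import render

@login_required
def orders(request):
    # Only reachable by logged-in users; no mixin or inheritance needed.
    return render(request, "orders.html", {"user": request.user})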
The Zen of Python’s simple is better than complex axiom is probably my favorite and for that reason I am sticking with function based views for now. What about you?
– Bob
This is the sort of stuff we discuss on our weekly Code Clinic calls inside PDM.
However the calls are not even the main part of the program. It’s the 1:1 coaching that is shaving off years of struggle and procrastination for Pythonistas we work with.
People credit PDM as a (if not the) major stepping stone in their careers. So if you’re ready to take your Python developer journey to the next level, check out our program here.
Python Bytes: #325 It's called a merge conflict
Codementor: What's inside a programming language
Sumana Harihareswara - Cogito, Ergo Sumana: PyCon 2023: "Argument Clinic" & Mitigating COVID Risk
PyBites: Feel Comfortable with Git?
Folks come to me to ask for help with Git.
Sometimes they can’t guess what git subcommand they need. (Git 2.37 has 169.)
Sometimes they know what subcommand they want, but don’t know what flags to use. (git log now has 149 flags and options.)
Sometimes they issued a command, and Git didn’t do what they expected.
Maybe you’ve had one of those problems yourself. Typically, their problem isn’t Git.
They even want Git to do something that it can do, easily. They’re just asking Git for it the wrong way.
Usually these folks just have the wrong mental model of how Git works. They’ve learned a bunch of commands, drawn mental cartoons of how Git works, and then typed in a command based on that model.
They’re frustrated because they’ve built the wrong mental model.
The questions sometimes seem like a student saying,
“Dr. Haemer, Dr. Haemer! I understood everything
you said, except the difference between a loop and a CPU.”
I almost never answer with, “Oh, you just need to add the flag --foobarmumble,” or “You need to use git frabitz instead of git zazzle,” or “Git just can’t do that.”
Instead, it’s, “Aha. You just need to understand how Git works … the big picture. Let’s start there.”
Q: “Wait. You’re telling me that the best way for me to solve my Git problems is to understand what it’s doing?”
A: “Yup.”
Git’s magic isn’t in pieces hidden from view; its magic is its simple, open design.
You can watch it work, under the hood, yourself. And should. I suspect making it easy to watch also made it easier for Linus Torvalds to debug.
I’ll show you.
Everything from here on out will be on the command line.
GUI interfaces to Git are just layered on top of shell-level equivalents.
Working on the command line removes an obfuscating layer.
Oh, and I’m using “Git” to mean the whole, distributed, version-control system, and git when I mean the command.
I’m also going to assume you’re using Linux or something like it:
Unix, OS/X, Penguin, BSD, …
Linus designed and wrote both Linux and Git. Though Git is now pretty portable, guessing which OS you can expect it to make the most sense on is not much of a challenge.
Watching Git at Work
Begin by making a directory to work in:
$ mkdir /tmp/scratch
$ cd /tmp/scratch
$ ls -a
That’s empty all right. Now put it under Git control.
$ git init
$ ls # no files
$ ls -a # ah! a hidden directory
The .git directory is where Git will stuff everything it knows about.
You haven’t even created any files of your own, much less committed any. What did that git init command put into .git?
The most useful tool to explore this is the tree command,
which lays out directory hierarchies for you to see.
If your operating system didn’t supply tree by default,
stop for a second to install it with your favorite package manager: apt, brew, … whatever.
$ tree .git
Now you’re cooking.
Spend a few minutes looking through everything that’s there.
It’s mostly empty directories, plus a few files that are obviously boilerplate and templates.
Nothing useful.
Git knows it’s there, though. Try these:
$ git status # before removing .git
$ rm -rf .git
$ git status # after removing it
Ask Git a question, and it looks for answers in .git .
Want to wipe out a git repo and start over? Just remove .git .
Okay, now put it back.
$ git init
$ tree .git
That was easy. Next, make an empty file.
$ touch my-empty-file
Does that do anything to .git?
$ tree .git
Doesn’t look like it. What’s Git think?
$ git status
Now it sees a file outside of .git but there’s no information about that file inside of .git . And that’s what “untracked” means.
What would change if it were tracked? It tells you to use git add to track it, so try that. Why not? You know that if something goes wrong, you can just start over with rm -rf .git
$ git add my-empty-file
$ tree .git
Oho!
There’s a new file here.
.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
And, since that’s in .git, Git sees it, too.
$ git status
Much of the work you do with Git adds and queries objects in .git/objects, and that’s where I’ll be pointing out things from now on.
Go ahead and commit it, and watch what changes.
$ git commit -m"My first commit: an empty file."
$ git status
$ tree .git/objects
The original .git/objects/e6/9de29* is still there,
but you’ve created two new objects, so now you have three.
Two of the three are Git’s version of a file and of a directory.
Linus calls objects that hold files “blobs,” and objects that hold directories, “trees.”
– e6/9de29*: a blob for the empty file
– bb/216ad*: a tree for the directory containing that empty file
To stave off a potential mental mix-up, take a short pause to think through that, carefully.
You’re juggling *two* filesystems here. One is your OS’s filesystem, which has a directory called .git/objects/, with subdirectories and files.
The second is Git’s filesystem, which stores all its pieces as objects in that first filesystem. You’re going to explore this second filesystem.
Notice, especially, that the blob object for the empty file isn’t stored under the tree object in your OS’s filesystem. That blob is “in” that tree only in Git’s view of the world, and you’re about to see how that’s done.
Calling the files and directories for Git’s filesystems “blobs” and “trees” will help you keep straight which of the two filesystems you’re talking about.
Trees
In the Unix filesystem, you look at a directory’s content with the ls command. In Linus’s Git filesystem, you can use git ls-tree .
Try that now.
$ git ls-tree bb216ad97a6d296d1feedbc3e097343ce93f8f43
Git sees that this tree has one blob (file), called `my-empty-file`, that it has permissions 100644, and that the blob is e69de29bb2d1d6434b8b29ae775ad8c2e48c5391.
Linus sticks that blob in
e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 so it doesn’t have to store every object in the same directory in the parent’s filesystem. That’s just an implementation detail.
If you’ve already guessed that the file bb/216ad* is where Git put the tree bb216ad97a6d296d1feedbc3e097343ce93f8f43, you’ve guessed correctly.
You’re already building a new, detailed, and *correct* mental model of what Git’s doing.
But what’s that third file?
Commits
To take your model to the next level, first take a peek inside those files.
$ cat .git/objects/e6/9de29*
$ cat .git/objects/bb/216ad*
Ugh. They’re encoded in some weird way, so cat isn’t useful.
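If you’re curious, the weird encoding is just zlib. Here’s a minimal Python sketch, using the empty-file blob from above, that shows there’s no magic:

import zlib
from pathlib import Path

# A loose object is a zlib-compressed file: a header, a NUL byte, then the contents.
raw = zlib.decompress(
    Path(".git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391").read_bytes()
)
print(raw)  # b'blob 0\x00' for the empty blob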
Fortunately, Linus provides git cat-file -p, which decodes and shows the contents of objects in his file system.
git cat-file -p e69de29
Well, that doesn’t seem to do anything, right? Oh. Wait. That blob was the empty file. There’s no content to show. Duh.
I’ll pause to point out a piece of syntax: There’s no slash in that name.
It’s e69de29, not e6/9de29. Linus spread Git’s objects out across subdirectories, but Git still thinks of them without the slashes.
Again: those subdirectories are just an implementation detail.
Luckily, Git also lets you abbreviate names with the first few characters: you can type git cat-file -p e69de29, not
git cat-file -p e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
Since there was nothing in the blob, let’s look at the tree.
git cat-file -p bb216ad
Now you’re cooking! That’s the tree all right. So, what’s the third object? Might as well peek at it.
git cat-file -p 659e774
(Your third object will be named something different from mine, but at this point, I bet you can work out what to type to see yours.)
You’re looking at the commit itself.
Notice what’s in it: your name, your email address, and your commit message. Here’s another way to see the same information:
git log
So, git log looks in .git for your commit object, reads the information, and formats it in a pretty way. And now you see what that leading line of the log comment, “commit …”, means. It’s Git’s name for that commit object.
You can also see that the first line of the commit object says,
tree bb216ad97a6d296d1feedbc3e097343ce93f8f43
So now you see how Git is connecting up all the pieces.
– A commit object keeps track of meta-information about the commit and points at the tree being committed.
– A tree keeps track of the blobs in it, and their human names.
– A blob stores the contents of a file.
What You Now Know
– Git stores all its information in the directory .git/
– git init creates that directory
– Linus implements a user-level filesystem with the files under .git/objects/.
– Each Git object is stored in a subdirectory of .git/objects/. The subdirectory is the first two characters of the name. This is an implementation detail to keep you from having to store every object in the same directory in your OS’s filesystem.
The name of the object in your OS’s file
.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
is e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 ,
which Git lets you abbreviate as e69de29, thank goodness.
– There are three important flavors of objects: blobs, trees, and commits.
– Blobs are files, trees are directories, and commits are, um, commits.
– The objects are encoded, but you can use git cat-file -p to look inside them.
– You can use git cat-file -p to see the contents of a blob.
– You can use git cat-file -p to see the contents of a tree: a list of objects, with their types, permissions and their human names.
git ls-tree serves up the same information in a slightly nicer format.
– You can use git cat-file -p to see the contents of a commit: the commit message, timestamp, committer, and the tree you committed.
git log will format that information in loads of different, friendly ways.
– All these are bound together: commits point at trees, trees point at blobs and other trees.
– The Git filesystem is completely user-visible. You can see the whole thing. The Linux filesystem implements directories and files in a similar way — no surprise — but the implementation details are hidden. You can get a directory listing with ls, but you can’t actually open up a directory and look at its guts with cat.
There’s a lot more here to explore:
– tags & branches,
– SHA1s and object encoding,
– configuration files,
– remotes with their fetches, pushes & pulls,
– merges & rebases,
– indexes & packfiles,
– the git command and its subcommands,
…
Yes, Git is big. But its design is also simple.
Now you know you can watch and understand how all these work. You can see into Git’s secrets for yourself.
If you want some guidance along the way, I can recommend some resources:
– One is Ian Miell’s Learn Git the Hard Way, available on Kindle for $10.
I think it does a good job of teaching Git by teaching how it works.
– A second is Git Under the Hood, a set of videos that I did for Pearson, and available either directly, or through O’Reilly.
– You can even see what Git looked like at the beginning and at every step of its evolution: Linus Torvalds started writing Git on April 3, 2005, released it three days later, and made it self-hosting the next day. You can clone the source with git clone https://github.com/git/git, and then check out the very first version, or any version after that.
Real Python: Using NumPy reshape() to Change the Shape of an Array
The main data structure that you’ll use in NumPy is the N-dimensional array. An array can have one or more dimensions to structure your data. In some programs, you may need to change how you organize your data within a NumPy array. You can use NumPy’s reshape() to rearrange the data.
The shape of an array describes the number of dimensions in the array and the length of each dimension. In this tutorial, you’ll learn how to change the shape of a NumPy array to place all its data in a different configuration. When you complete this tutorial, you’ll be able to alter the shape of any array to suit your application’s needs.
In this tutorial, you’ll learn how to:
- Change the shape of a NumPy array without changing its number of dimensions
- Add and remove dimensions in a NumPy array
- Control how data is rearranged when reshaping an array with the order parameter
- Use a wildcard value of -1 for one of the dimensions in reshape()
For this tutorial, you should be familiar with the basics of NumPy and N-dimensional arrays. You can read NumPy Tutorial: Your First Steps Into Data Science in Python to learn more about NumPy before diving in.
Supplemental Material: Click here to download the image repository that you’ll use with NumPy reshape().
Install NumPy
You’ll need to install NumPy to your environment to run the code in this tutorial and explore reshape(). You can install the package using pip within a virtual environment. Select either the Windows or Linux + macOS tab below to see instructions for your operating system:
Windows:
PS> python -m venv venv
PS> .\venv\Scripts\activate
(venv) PS> python -m pip install numpy

Linux + macOS:
$ python -m venv venv
$ source venv/bin/activate
(venv) $ python -m pip install numpy

It’s a convention to use the alias np when you import NumPy. To get started, you can import NumPy in the Python REPL:
>>> import numpy as np

Now that you’ve installed NumPy and imported the package in a REPL environment, you’re ready to start working with NumPy arrays.
Understand the Shape of NumPy Arrays
You’ll use NumPy’s ndarray in this tutorial. In this section, you’ll review the key features of this data structure, including an array’s overall shape and number of dimensions.
You can create an array from a list of lists:
>>> import numpy as np
>>> numbers = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
>>> numbers
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

The function np.array() returns an object of type np.ndarray. This data structure is the main data type in NumPy.
You can describe the shape of an array using the length of each dimension of the array. NumPy represents this as a tuple of integers. The array numbers has two rows and four columns. Therefore, this array has a (2, 4) shape:
>>> numbers.shape
(2, 4)

You can represent the same data using a different shape:
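For example, reshaping numbers to four rows and two columns:

>>> numbers.reshape(4, 2)
array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])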
Both of these arrays contain the same data. The array with the shape (2, 4) has two rows and four columns and the array with the shape (4, 2) has four rows and two columns. You can check the number of dimensions of an array using .ndim:
>>> numbers.ndim
2

The array numbers is two-dimensional (2D). You can arrange the same data contained in numbers in arrays with a different number of dimensions:
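For instance (a quick sketch, not taken from the tutorial itself), the same eight values can fill a three-dimensional array:

>>> numbers.reshape(2, 2, 2)
array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])
>>> numbers.reshape(2, 2, 2).ndim
3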
Read the full article at https://realpython.com/numpy-reshape/ »
[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Python for Beginners: Working With an XML File in Python
XML files are used to store data as well as to transmit data in software systems. This article discusses how to read, write, update, and delete data from an XML file in Python. For this task, we will use the xmltodict module in python.
Table of Contents
- What is an XML File?
- Create an XML File in Python
- Read an XML File in Python
- Add a New Section to an XML File in Python
- Update Value in an XML File Using Python
- Delete data From XML File in Python
- Conclusion
XML (eXtensible Markup Language) is a markup language that is used to store and transmit data. It is similar to HTML in structure. But, unlike HTML, XML is designed to store and manipulate data, not to display data. XML uses a set of markup symbols to describe the structure and meaning of the data it contains, and it can be used to store any type of data, including text, numbers, dates, and other information.
An XML file is a plain text file that contains XML code, which can be read by a wide range of software applications, including web browsers, text editors, and specialized XML tools. The structure of an XML file consists of elements, which are defined by tags, and the data within those elements is stored as text or other data types.
The syntax for declaring data in an XML file is as follows.
<field_name> value </field_name>

To understand this, consider the following example.
<?xml version="1.0"?> <employee> <name>John Doe</name> <age>35</age> <job> <title>Software Engineer</title> <department>IT</department> <years_of_experience>10</years_of_experience> </job> <address> <street>123 Main St.</street> <city>San Francisco</city> <state>CA</state> <zip>94102</zip> </address> </employee>The above XML string is a simple XML document that describes the details of an employee.
- The first line of the document, <?xml version="1.0"?>, is the XML declaration and it specifies the version of XML that is being used in the document.
- The root element of the document is <employee>, which contains several other elements:
- The <name> element stores the name of the employee, which is “John Doe”.
- The <age> element stores the age of the employee, which is “35”.
- The <job> element contains information about the employee’s job, including the <title> (Software Engineer), the <department> (IT), and the <years_of_experience> (10) elements.
- The <address> element contains information about the employee’s address, including the <street> (123 Main St.), <city> (San Francisco), <state> (CA), and <zip> (94102) elements.
Each of these elements is nested within the parent <employee> element, creating a hierarchical structure. This structure allows for the data to be organized in a clear and concise manner, making it easy to understand and process.
XML is widely used for data exchange and storage because it is platform-independent, meaning that XML data can be transported and read on any platform or operating system, without the need for proprietary software. It is also human-readable, which makes it easier to debug and maintain, and it is extensible, which means that new elements can be added as needed, without breaking existing applications that use the XML data.
XML files are saved using the .xml extension. Now, we will discuss approaches to read and manipulate XML files using the xmltodict module. You can install this module using pip by executing the following command in your command prompt.
pip3 install xmltodict

Create an XML File in Python
To create an XML file in python, we can use a python dictionary and the unparse() method defined in the xmltodict module. For this, we will use the following steps.
- First, we will create a dictionary containing the data that needs to be put into the XML file.
- Next, we will use the unparse() method to convert the dictionary to an XML string. The unparse() method takes the python dictionary as its input argument and returns the XML representation of the string.
- Now, we will open an XML file in write mode using the open() function. The open() function takes the file name as its first input argument and the literal “w” as its second input argument. After execution, it returns a file pointer.
- Next, we will write the XML string into the file using the write() method. The write() method, when invoked on the file pointer, takes the XML string as its input argument and writes it to the file.
- Finally, we will close the file using the close() method.
After execution of the above steps, the XML file will be saved in the file system. You can observe this in the following example.
import xmltodict

employee = {'employee': {'name': 'John Doe',
                         'age': '35',
                         'job': {'title': 'Software Engineer',
                                 'department': 'IT',
                                 'years_of_experience': '10'},
                         'address': {'street': '123 Main St.',
                                     'city': 'San Francisco',
                                     'state': 'CA',
                                     'zip': '94102'}}}
file = open("employee.xml", "w")
xml_string = xmltodict.unparse(employee)
file.write(xml_string)
file.close()

Instead of using the write() method, we can directly write the XML data into the file using the unparse() method. For this, we will pass the python dictionary as the first input argument and the file pointer as the second input argument to the unparse() method. After execution of the unparse() method, the data will be saved to the file.
You can observe this in the following example.
import xmltodict

employee = {'employee': {'name': 'John Doe',
                         'age': '35',
                         'job': {'title': 'Software Engineer',
                                 'department': 'IT',
                                 'years_of_experience': '10'},
                         'address': {'street': '123 Main St.',
                                     'city': 'San Francisco',
                                     'state': 'CA',
                                     'zip': '94102'}}}
file = open("employee.xml", "w")
xmltodict.unparse(employee, file)
file.close()

The output file looks as follows.
(Image: XML File)

Read an XML File in Python
To read an XML file in python, we will use the following steps.
- First, we will open the file in read mode using the open() function. The open() function takes the file name as its first input argument and the python literal “r” as its second input argument. After execution, it returns a file pointer.
- Once we get the file pointer, we will read the file using the read() method. The read() method, when invoked on the file pointer, returns the file contents as a python string.
- Now, we have read the XML file into a string. Next, we will parse it using the parse() method defined in the xmltodict module. The parse() method takes the XML string as its input and returns the contents of the XML string as a python dictionary.
- After parsing the contents of the XML file, we will close the file using the close() method.
After executing the above steps, we can read the XML file into a python dictionary. You can observe this in the following example.
import xmltodict file=open("employee.xml","r") xml_string=file.read() print("The XML string is:") print(xml_string) python_dict=xmltodict.parse(xml_string) print("The dictionary created from XML is:") print(python_dict) file.close()Output:
The XML string is:
<?xml version="1.0" encoding="utf-8"?>
<employee><name>John Doe</name><age>35</age><job><title>Software Engineer</title><department>IT</department><years_of_experience>10</years_of_experience></job><address><street>123 Main St.</street><city>San Francisco</city><state>CA</state><zip>94102</zip></address></employee>
The dictionary created from XML is:
{'employee': {'name': 'John Doe', 'age': '35', 'job': {'title': 'Software Engineer', 'department': 'IT', 'years_of_experience': '10'}, 'address': {'street': '123 Main St.', 'city': 'San Francisco', 'state': 'CA', 'zip': '94102'}}}

Add a New Section to an XML File in Python
To add a new section to an existing XML file, we will use the following steps.
- We will open the XML file in “r+” mode using the open() function. This will allow us to modify the file. Then, we will read it into a python dictionary using the read() method and the parse() method.
- Next, we will add the desired data to the python dictionary using key-value pairs.
- After adding the data to the dictionary, we will erase the existing data from the file. For this, we will first go to the start of the file using the seek() method. Then, we will erase the file contents using the truncate() method.
- Next, we will write the updated dictionary as XML to the file using the unparse() method.
- Finally, we will close the file using the close() method.
After executing the above steps, new data will be added to the XML file. You can observe this in the following example.
import xmltodict file=open("employee.xml","r+") xml_string=file.read() python_dict=xmltodict.parse(xml_string) #add a single element python_dict["employee"]["boss"]="Aditya" #Add a section with nested elements python_dict["employee"]["education"]={"University":"MIT", "Course":"B.Tech", "degree":"Hons."} file.seek(0) file.truncate() xmltodict.unparse(python_dict,file) file.close()The output file looks as follows.
(Image: XML file after adding data)

In this example, you can observe that we have added a single element as well as a nested element to the XML file.
To add a single element to the XML file, we just need to add a single key-value pair to the dictionary. To add an entire section, we need to add a nested dictionary.
Update Value in an XML File Using Python
To update a value in the XML file, we will first read it into a python dictionary. Then, we will update the values in the dictionary. Finally, we will write the dictionary back into the XML file as shown below.
import xmltodict file=open("employee.xml","r+") xml_string=file.read() python_dict=xmltodict.parse(xml_string) #update values python_dict["employee"]["boss"]="Chris" python_dict["employee"]["education"]={"University":"Harvard", "Course":"B.Sc", "degree":"Hons."} file.seek(0) file.truncate() xmltodict.unparse(python_dict,file) file.close()Output:
(Image: XML file after the update)

Delete data From XML File in Python
To delete data from the XML file, we will first read it into a python dictionary. Then, we will delete the key-value pairs from the dictionary. Next, we will dump the dictionary back into the XML file using the unparse() method. Finally, we will close the file using the close() method as shown below.
import xmltodict file=open("employee.xml","r+") xml_string=file.read() python_dict=xmltodict.parse(xml_string) #delete single element python_dict["employee"].pop("boss") #delete nested element python_dict["employee"].pop("education") file.seek(0) file.truncate() xmltodict.unparse(python_dict,file) file.close()Output:
(Image: XML file after deletion)

In this example, you can observe that we have deleted a single element as well as a nested element from the XML file.
Conclusion
In this article, we have discussed how to perform create, read, update, and delete operations on an XML file in python using the xmltodict module. To learn more about XML files, you can read this article on how to convert XML to YAML in Python. You might also like this article on how to convert JSON to XML in python.
I hope you enjoyed reading this article. Stay tuned for more informative articles.
Happy Learning!
The post Working With an XML File in Python appeared first on PythonForBeginners.com.
Mike Driscoll: PyDev of the Week: Roni Kobrosly
This week we welcome Roni Kobrosly as our PyDev of the Week! Roni is the creator of the causal-curve package. You can catch up with Roni on Roni’s website. If you want to see some of Roni’s code, you can check it out on GitHub.
Let’s spend a few moments getting to know Roni better!
Can you tell us a little about yourself (hobbies, education, etc)?
I grew up in Austin, TX, back when it was a small, hippie, college town known for its live music and not yet known as the “Silicon Hills”. My parents and brother still live there. My wife and I have a cute 5-week-old baby girl named Cora so my current primary hobbies are hanging out with her and napping, but before that I loved cooking elaborate meals, biking, climbing, doing digital art (using Procreate, Vectornator, and Pixaki), and traveling. I have a PhD in Epidemiology and in a former life I was doing research in environmental health.
Why did you start using Python?
Writing code for ETL and statistical modeling is something I’ve been doing since 2006, but until 2015 it had been in R. In 2015 I joined the now-defunct Insight Data Science Fellowship in NYC to transition into the data industry, and after hearing so many industry folks talk about its benefits (e.g. a versatile general purpose language, very readable, lots of support for data and math projects) I set out to learn it through web tutorials and such. Almost immediately I loved working with it and I voraciously watched and worked through tutorials on its various packages.
I’m one of those python people who didn’t start learning python with an engineering perspective or background in place. That is to say, I initially wasn’t thinking about PEP8, the principles of solid application design, what a pure function was, OOP vs functional programming, what a virtual environment was, what a container was, etc. So over time I had to learn to complement my data analysis and statistical technical skills with engineering ones. I developed my software engineering skills while using python, so I have a particular fondness for it (while also completely getting that a language is just a tool).
What other programming languages do you know and which is your favorite?
R, Go, Scala, and a modest amount of JS. Python is my favorite, hands down, though I’ve come to really appreciate how readable, opinionated, and fast Go can be. In learning a new language, I would look for associations between python’s built-in functions and data structures to those in the new language (e.g. “It sounds like a Go `map` is like a python dictionary…”). So in that sense, python is pretty hard-wired into my head!
What projects are you working on now?
Two active projects, but only one of them involves writing python code.
I’m putting the finishing touches on a 3-hour O’Reilly Media live training session entitled “An introduction to causal inference using python”, which is set to occur on March 16. It’s a workshop that will involve slides, Q&A, and Google Colab notebooks with python-based exercises for attendees. Part of this workshop will cover the causal-curve python package that I rolled out to PyPI back in 2020.
Beyond that, I’ve been working on building out an extensive public collection of articles, posts, and videos pertaining to data team leadership. I love the feeling of getting into the python programming “flow state” but these days I lead data teams, so I spend more time thinking about how I can create the best conditions for coders. I spend a lot of time searching for suggestions around different aspects of management, and I thought a giant aggregation of this content could be helpful to others. It follows the awesome-list format, so it’s in the form of a GitHub README file.
Which Python libraries are your favorite (core or 3rd party)?
It’s hard to pick a few! I would say I’m an enormous fan of the typical python data stack, by which I mean libraries like NumPy, SciPy, Pandas, Polars, pyspark, Dask, statsmodels, PyMC3, scikit-learn, to name a few. They’re tools, and like all tools they have their strengths and limitations, but I’ve always appreciated their fantastic documentation and the communities behind them.
I would also add the core package `pdb`. I love the `pdb` debugger and it makes me sad when I switch over to another language and can’t find a tool that works equally well…
How did you get involved with the causal-curve project?
I started causal-curve in the dead of the first COVID summer in 2020. Everyone I spoke with at the time was scared, confused, isolated, and bored. Primarily, it was a project to give my mind something to focus on, but I genuinely noticed a gap in a subfield known as “causal inference” within the data science / data analysis python space. It was a problem I regularly came upon in my professional work involving causal inference, and yet there didn’t seem to be any established tools for addressing it in python (there were and are many papers in the academic literature around this topic).
I’m not arrogant enough to assume your readers would know about `causal-curve`, so I’ll briefly describe what it does.
In industry and in academia, the best tool we have for determining the effect something has on some outcome is to do an experiment where you have a “treated” group and some sort of control group. In the medical field, for example, they will randomize the folks in some population to either receive a new drug (the treatment) or get a sugar pill (the control), and then follow them up to see whether their cholesterol levels (for example) have improved after 6 weeks. The experiment allows you to determine the true causal effect of the treatment (assuming the experiment was run properly). Experiments aren’t always feasible though, particularly in the tech industry at a company; they can be resource-intensive or tricky to run. You can’t simply look at raw correlations and averages in observational, non-experimental data, because they are subject to all sorts of biases (e.g. confounding). “Causal inference” methods are a set of tools for taking observational, non-experimental data, and making clean causal estimates from them, like you would have gotten from an experiment if you were able to run one.
Typically causal inference methods assume the “treatment” is binary (e.g. you saw an ad vs you didn’t), but there are tons of scenarios where a treatment could be continuous in nature (e.g. price of a product, minutes per week of exercise someone does, customer service phone wait times in minutes). This python package allows one to estimate the “causal curve”, or the causal relationship between some continuous treatment and some outcome.
What do you love about causal-curve?
Primarily, I love hearing from folks that they found causal-curve to be useful.
Is there anything else you’d like to say?
I don’t think I have to say this as the python community is a friendly lot, but just in case… a friendly reminder for the more experienced python developers out there: be kind to and mentor junior folks!
Thanks for doing the interview, Roni!
The post PyDev of the Week: Roni Kobrosly appeared first on Mouse Vs Python.
The Three of Wands: Python is two languages now, and that's actually great
Everyone doing Python nowadays is aware that Python supports optional type hints, and has for some time now. This has created a small schism in the community, with some people being completely uninterested in type hinting and a little defensive about the language partially going in a new direction, some people being very excited about the potential of our evolving type tooling, and the vast majority of folks in the middle, not entirely sure where and how to apply type hinting best.
I am of the belief that currently, Python is actually two very similar programming languages sharing the same name. This certainly isn't a surprise to anyone who's been using Python for a while. What might be a surprise, though, is that I think this is actually a good thing. The languages, let's call them untyped Python and typed Python, even though they share a very large common base, are fundamentally different in how they enable the developers using them to solve problems.
Allow me to propose a model of thinking about code: there's infrastructure code and there's business logic code. Infrastructure code is exciting, powerful code that exposes easy-to-use interfaces that solve difficult and tricky problems, like talking to browsers (Flask), talking to databases (the Django ORM, SQLAlchemy), dependency injection frameworks (incant), serialization (cattrs) or defining classes (attrs, dataclasses). Business logic code is boring and unexciting code that enables you to solve problems and finish tickets and sprints at your day job. The point of infrastructure code is to enable and empower business logic code; business logic code provides the actual value to your employer, your users, or whomever is using what you're writing. Infrastructure code is the libraries you're using, business logic code is the code you yourself write and deploy.
(Note that this way of thinking about code is, like all abstractions, leaky. A library that you use may be a simple layer between other libraries and hence have all the characteristics of business logic code. If you're employed at a typical software developer position, your work codebase will, almost certainly, have pieces of infrastructure code in it that you've written for that codebase. Even so, I find this way of thinking about software is very useful.)
It is usually impossible for infrastructure code to be fully type-hinted internally; the Python type system isn't, and probably never will be, powerful enough to support the types of operations libraries like cattrs and attrs need to do. This makes sense; one of the greatest strengths of untyped Python (and what brought me to Python in the first place) is that the infrastructure code available can offer amazingly friendly and powerful APIs. So untyped Python is, and has historically been, great for infrastructure code. Untyped Python is not very good for business logic code, which is why historically software developers have been quick to complain about maintaining large systems written in Python, and with good reason.
Business logic code is usually much simpler than infrastructure code, and there's a lot more of it in the world today; for each SQLAlchemy or Django, there are probably hundreds, if not millions, of codebases actually using it in simple ways. Because of this, business logic code is an amazing match for typed Python. Using typed Python brings a ton of benefits to the development process, like moving entire categories of bugs from runtime into typechecking time, ease of refactoring (which is crucial for healthy codebase lifecycles), great editor support (including autocomplete, robustly listing references, and good code navigation) and lessening the need for tests (which increase the amount of code that needs to be written and maintained drastically).
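To make that concrete, here's a minimal, made-up sketch of typed business logic; nothing here is from a real codebase, just plain dataclasses and hints that a type checker (mypy, pyright) can verify before the code ever runs:

from dataclasses import dataclass

@dataclass
class Order:
    order_id: str
    total_cents: int

def apply_discount(order: Order, percent: float) -> Order:
    # The signature documents exactly what goes in and what comes out.
    discounted_total = int(order.total_cents * (1 - percent / 100))
    return Order(order.order_id, discounted_total)

order = apply_discount(Order("A-123", 5000), 10)
# apply_discount(order, "10")  # a type checker flags this before it ever runs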
For this marriage to work, we need infrastructure code not to be type-hinted internally, but to provide type-hinted interfaces at the code boundaries. This is exactly where the ecosystem is going, with noteworthy examples being SQLAlchemy 2.0 and a new generation of web frameworks like FastAPI. Also, as the Python type system matures it will enable a category of infrastructure code to be fully typed, but my gut feeling is that the most interesting pieces will still be untyped.
As for why this is a good thing: if you know one (typed or untyped Python), it's relatively easy for you to learn the other (an order of magnitude easier than learning a completely different language, in any case), and learning it will greatly empower you as a software developer.
Now, to address a potential elephant in the room: could we have had a single language good at both of these? I don't know, but I don't really think this was in the cards for a language like Python. I'm somewhat proficient in several different languages, so let's examine their situations:
- JavaScript also seems to have a split situation with TypeScript, although I don't know what the situation there is vis-a-vis infra vs business-logic code. Going to guess it's similar.
- I haven't touched Java for almost a decade now, but I used to be very proficient at it. The Java I used was a business logic language through and through, which handily explains its popularity in the industry (since the vast majority of jobs in the industry are for writing business-logic code), the terrible interfaces all major libraries had, and the horror that was the ORM code I looked at once. I posit that Java is actually two languages as well, but that infrastructure Java is just very difficult to work with. This is why if a fellow developer tells me they've written an ORM in Python I'd excitedly want to share notes, but back then if a fellow developer had told me they'd written an ORM in Java, I'd look at them as if they were mad.
- I think Rust has a very interesting approach to infrastructure code with their powerful macro system. I don't really know enough Rust to be able to comment with any confidence, but I suppose you can look at Rust macros as a different, infrastructure language on top of Rust. The way it feeds into (typed) Rust is especially elegant to me.
In conclusion: the addition of typed Python is a great thing for our community, and untyped Python isn't going anywhere. We just need to learn the right place for each, and work on combining them effectively.