Feeds

Petter Reinholdtsen: Nikita version 0.4 released - free software archive API server

Planet Debian - Wed, 2019-05-22 05:30

This morning, a new release of the Nikita Noark 5 core project was announced on the project mailing list. Nikita is a free software implementation of the Norwegian archive standard Noark 5, which is used by government offices in Norway. These are the changes from version 0.3 to version 0.4; see the announcement email for links to a demo site:

  • Roll out OData handling to all endpoints where applicable
  • Changed the relation key for "ny-journalpost" to the official one.
  • Better link generation on outgoing links.
  • Tidy up code and make code and approaches more consistent throughout the codebase
  • Update rels to be in compliance with updated version in the interface standard
  • Avoid printing links on empty objects as they can't have links
  • Small bug fixes and improvements
  • Start moving generation of outgoing links to @Service layer so access control can be used when generating links
  • Log exception that was being swallowed so it's traceable
  • Fix name mapping problem
  • Update templated printing so templated should only be printed if it is set true. Requires more work to roll out across entire application.
  • Remove Record->DocumentObject as per domain model of n5v4
  • Add ability to delete lists filtered with OData
  • Return NO_CONTENT (204) on delete as per interface standard
  • Introduce support for ConstraintViolationException exception
  • Make Service classes extend NoarkService
  • Make code base respect X-Forwarded-Host, X-Forwarded-Proto and X-Forwarded-Port
  • Update CorrespondencePart* code to be more in line with Single Responsibility Principle
  • Make package name follow directory structure
  • Make sure Document number starts at 1, not 0
  • Fix issues discovered by FindBugs
  • Update from Date to ZonedDateTime
  • Fix wrong tablename
  • Introduce Service layer tests
  • Improvements to CorrespondencePart
  • Continued work on Class / Classificationsystem
  • Fix feature where authors were stored as storageLocations
  • Update HQL builder for OData
  • Update OData search capability from webpage

If a free and openly standardized archiving API sounds interesting to you, please contact us on IRC (#nikita on irc.freenode.net) or email (the nikita-noark mailing list).

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

Categories: FLOSS Project Planets

Patrick Kennedy: Setting Up GitLab CI for a Python Application

Planet Python - Wed, 2019-05-22 00:28

Introduction

This blog post describes how to configure a Continuous Integration (CI) process on GitLab for a Python application, using one of my Python applications (bild) as the example project.

In this blog post, I'll show how I set up a GitLab CI process to run the following jobs on a Python application:

  • Unit and functional testing using pytest
  • Linting using flake8
  • Static analysis using pylint
  • Type checking using mypy

What is CI?

To me, Continuous Integration (CI) means frequently testing your application in an integrated state.  However, the term ‘testing’ should be interpreted loosely, as it can mean:

  • Integration testing
  • Unit testing
  • Functional testing
  • Static analysis
  • Style checking (linting)
  • Dynamic analysis

To facilitate running these tests, it’s best to have these tests run automatically as part of your configuration management (git) process.  This is where GitLab CI is awesome!

In my experience, I’ve found it really beneficial to develop a test script locally and then add it to the CI process that gets automatically run on GitLab CI.

Getting Started with GitLab CI

Before jumping into GitLab CI, here are a few definitions:

  • pipeline: a set of tests to run against a single git commit.
  • runner: GitLab uses runners on different servers to actually execute the tests in a pipeline; GitLab provides runners to use, but you can also spin up your own servers as runners.
  • job: a single test being run in a pipeline.
  • stage: a group of related tests being run in a pipeline.

Here’s a screenshot from GitLab CI that helps illustrate these terms:

GitLab utilizes the ‘.gitlab-ci.yml’ file to run the CI pipeline for each project.  The ‘.gitlab-ci.yml’ file must be located in the top-level directory of your project.

While there are different methods of running a test in GitLab CI, I prefer to utilize a Docker container to run each test.  I’ve found the overhead in spinning up a Docker container to be trivial (in terms of execution time) when doing CI testing.

Creating a Single Job in GitLab CI

The first job that I want to add to GitLab CI for my project is to run a linter (flake8).  In my local development environment, I would run this command:

$ flake8 --max-line-length=120 bild/*.py

This command can be transformed into a job on GitLab CI in the ‘.gitlab-ci.yml’ file:

image: "python:3.7" before_script: - python --version - pip install -r requirements.txt stages: - Static Analysis flake8: stage: Static Analysis script: - flake8 --max-line-length=120 bild/*.py

This YAML file tells GitLab CI what to run on each commit pushed up to the repository. Let’s break down each section…

The first line (image: "python:3.7") instructs GitLab CI to utilize Docker for performing ALL of the tests for this project, specifically to use the ‘python:3.7‘ image that is found on DockerHub.

The second section (before_script) is the set of commands to run in the Docker container before starting each job. This is really beneficial for getting the Docker container in the correct state by installing all the python packages needed by the application.

The third section (stages) defines the different stages in the pipeline. There is only a single stage (Static Analysis) at this point, but later a second stage (Test) will be added. I like to think of stages as a way to group together related jobs.

The fourth section (flake8) defines the job; it specifies the stage (Static Analysis) that the job should be part of and the commands to run in the Docker container for this job. For this job, the flake8 linter is run against the python files in the application.

At this point, the updates to the ‘.gitlab-ci.yml’ file should be committed to git and then pushed up to GitLab:

git add .gitlab-ci.yml
git commit -m "Updated .gitlab-ci.yml"
git push origin master

GitLab CI will see that there is a CI configuration file (.gitlab-ci.yml) and use it to run the pipeline:

This is the start of a CI process for a python project!  GitLab CI will run a linter (flake8) on every commit that is pushed up to GitLab for this project.

Running Tests with pytest on GitLab CI

When I run my unit and functional tests with pytest in my development environment, I run the following command in my top-level directory:

$ pytest

My initial attempt at creating a new job to run pytest in ‘.gitlab-ci.yml’ file was:

image: "python:3.7" before_script: - python --version - pip install -r requirements.txt stages: - Static Analysis - Test ... pytest: stage: Test script: - pytest

However, this did not work, as pytest was unable to find the ‘bild’ module (i.e. the source code) to test:

$ pytest
========================= test session starts ==========================
platform linux -- Python 3.7.3, pytest-4.5.0, py-1.5.4, pluggy-0.11.0
rootdir: /builds/patkennedy79/bild, inifile: pytest.ini
plugins: datafiles-2.0
collected 0 items / 3 errors
=============================== ERRORS =================================
___________ ERROR collecting tests/functional/test_bild.py _____________
ImportError while importing test module '/builds/patkennedy79/bild/tests/functional/test_bild.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests/functional/test_bild.py:4: in <module>
    from bild.directory import Directory
E   ModuleNotFoundError: No module named 'bild'
...
====================== 3 error in 0.24 seconds =========================
ERROR: Job failed: exit code 1

The problem encountered here is that the ‘bild’ module cannot be found by the test_*.py files, as the top-level directory of the project was not being specified in the system path:

$ python -c "import sys;print(sys.path)"
['', '/usr/local/lib/python37.zip', '/usr/local/lib/python3.7', '/usr/local/lib/python3.7/lib-dynload', '/usr/local/lib/python3.7/site-packages']

The solution that I came up with was to add the top-level directory to the system path within the Docker container for this job:

pytest:
  stage: Test
  script:
    - pwd
    - ls -l
    - export PYTHONPATH="$PYTHONPATH:."
    - python -c "import sys;print(sys.path)"
    - pytest

With the updated system path, this job was able to run successfully:

$ pwd
/builds/patkennedy79/bild
$ export PYTHONPATH="$PYTHONPATH:."
$ python -c "import sys;print(sys.path)"
['', '/builds/patkennedy79/bild', '/usr/local/lib/python37.zip', '/usr/local/lib/python3.7', '/usr/local/lib/python3.7/lib-dynload', '/usr/local/lib/python3.7/site-packages']

Final GitLab CI Configuration

Here is the final .gitlab-ci.yml file that runs the static analysis jobs (flake8, mypy, pylint) and the tests (pytest):

image: "python:3.7"

before_script:
- python --version
- pip install -r requirements.txt

stages:
- Static Analysis
- Test

mypy:
stage: Static Analysis
script:
- pwd
- ls -l
- python -m mypy bild/file.py
- python -m mypy bild/directory.py

flake8:
stage: Static Analysis
script:
- flake8 --max-line-length=120 bild/*.py

pylint:
stage: Static Analysis
allow_failure: true
script:
- pylint -d C0301 bild/*.py

unit_test:
stage: Test
script:
- pwd
- ls -l
- export PYTHONPATH="$PYTHONPATH:."
- python -c "import sys;print(sys.path)"
- pytest

Here is the resulting output from GitLab CI:

One item that I’d like to point out is that pylint is reporting some warnings, but I find this acceptable. I still want pylint running in my CI process, but I don’t care if it reports failures; I’m more concerned with trends over time (is the number of warnings growing?). Therefore, I set the pylint job to be allowed to fail via the ‘allow_failure’ setting:

pylint:
  stage: Static Analysis
  allow_failure: true
  script:
    - pylint -d C0301 bild/*.py

Categories: FLOSS Project Planets

The Digital Cat: Object-Oriented Programming in Python 3 - Abstract Base Classes

Planet Python - Tue, 2019-05-21 21:47

This post is available as an IPython Notebook here

The Inspection Club

As you know, Python leverages polymorphism to the fullest by dealing only with generic references to objects. This makes OOP not an addition to the language but part of its structure from the ground up. Moreover, Python pushes the EAFP approach, which tries to avoid direct inspection of objects as much as possible.

It is however very interesting to read what Guido van Rossum says in PEP 3119: Invocation means interacting with an object by invoking its methods. Usually this is combined with polymorphism, so that invoking a given method may run different code depending on the type of an object. Inspection means the ability for external code (outside of the object's methods) to examine the type or properties of that object, and make decisions on how to treat that object based on that information. [...] In classical OOP theory, invocation is the preferred usage pattern, and inspection is actively discouraged, being considered a relic of an earlier, procedural programming style. However, in practice this view is simply too dogmatic and inflexible, and leads to a kind of design rigidity that is very much at odds with the dynamic nature of a language like Python.

The author of Python recognizes that forcing the use of a pure polymorphic approach sometimes leads to solutions that are too complex or even incorrect. In this section I want to show some of the problems that can arise from a pure polymorphic approach and introduce Abstract Base Classes, which aim to solve them. I strongly suggest reading PEP 3119 (as with any other PEP), since it contains a deeper and better explanation of the whole matter. Indeed, I think that this PEP is so well written that any further explanation is hardly needed. I am, however, used to writing explanations to check how much I understood about the topic, so I am going to try this time too.

E.A.F.P the Extra Test Trial

The EAFP coding style requires you to trust the incoming objects to provide the attributes and methods you need, and to manage the possible exceptions if you know how to do it. Sometimes, however, you need to test whether the incoming object matches a complex behaviour. For example, you could be interested in testing whether the object acts like a list, but you quickly realize that the number of methods a list provides is very large, and this could lead to odd EAFP code like

try:
    obj.append
    obj.count
    obj.extend
    obj.index
    obj.insert
    [...]
except AttributeError:
    [...]

where the methods of the list type are accessed (not called) just to force the object to raise the AttributeError exception if they are not present. This code, however, is not only ugly but also wrong. If you recall the "Enter the Composition" section of the third post of this series, you know that in Python you can always customize the __getattr__() method, which is called whenever the requested attribute is not found in the object. So I could write a class that passes the test but actually does not act like a list

class FakeList:
    def fakemethod(self):
        pass

    def __getattr__(self, name):
        if name in ['append', 'count', 'extend', 'index', 'insert', ...]:
            return self.fakemethod

This is obviously just an example, and no one will ever write such a class, but this demonstrates that just accessing methods does not guarantee that a class acts like the one we are expecting.

Many such examples could be constructed by leveraging the highly dynamic nature of Python and its rich object model. I would summarize them by saying that sometimes you'd better check the type of the incoming object.

In Python you can obtain the type of an object using the type() built-in function, but to check it you'd better use isinstance(), which returns a boolean value. Let us see an example before moving on

>>> isinstance([], list)
True
>>> isinstance(1, int)
True
>>> class Door:
...     pass
...
>>> d = Door()
>>> isinstance(d, Door)
True
>>> class EnhancedDoor(Door):
...     pass
...
>>> ed = EnhancedDoor()
>>> isinstance(ed, EnhancedDoor)
True
>>> isinstance(ed, Door)
True

As you can see, the function can also walk the class hierarchy, so the check is not as trivial as the one you would obtain by directly using type().

The isinstance() function, however, does not completely solve the problem. If we write a class that actually acts like a list but does not inherit from it, isinstance() does not recognize that the two may be considered the same thing. The following code returns False regardless of the content of the MyList class

>>> class MyList:
...     pass
...
>>> ml = MyList()
>>> isinstance(ml, list)
False

since isinstance() does not check the content of the class or its behaviour, it just considers the class and its ancestors.

The problem, thus, may be summed up with the following question: what is the best way to test that an object exposes a given interface? Here, the word interface is used for its natural meaning, without any reference to other programming solutions, which however address the same problem.

A good way to address the problem could be to write inside an attribute of the object the list of interfaces it promises to implement, and to agree that any time we want to test the behaviour of an object we simply have to check the content of this attribute. This is exactly the path followed by Python, and it is very important to understand that the whole system is just about a promised behaviour.

The solution proposed through PEP 3119 is, in my opinion, very simple and elegant, and it perfectly fits the nature of Python, where things are usually agreed upon rather than enforced. Moreover, the solution follows the spirit of polymorphism, where information is provided by the object itself and not extracted by the calling code.

In the next sections I am going to try and describe this solution in its main building blocks. The matter is complex, so my explanation will lack some details: please refer to the aforementioned PEP 3119 for a complete description.

Who Framed the Metaclasses

As already described, Python provides two built-ins to inspect objects and classes, isinstance() and issubclass(), and a solution to the inspection problem should allow the programmer to keep using those two functions.

This means that we need to find a way to inject the "behaviour promise" into both classes and instances. This is why metaclasses come into play. Recall what we said about them in the fifth issue of this series: metaclasses are the classes used to build classes, which means that they are the preferred way to change the structure of a class and, consequently, of its instances.

Another way to do the same job would be to leverage the inheritance mechanism, injecting the behaviour through a dedicated parent class. This solution has many downsides, which I am not going to detail. It is enough to say that affecting the class hierarchy may lead to complex situations or subtle bugs. Metaclasses may provide here a different entry point for the introduction of a "virtual base class" (as PEP 3119 specifies, this is not the same concept as in C++).

Overriding Places

As said, isinstance() and issubclass() are built-in functions, not object methods, so we cannot simply override them by providing a different implementation in a given class. So the first part of the solution is to change the behaviour of those two functions to first check whether the class or the instance contains a special method, which is __instancecheck__() for isinstance() and __subclasscheck__() for issubclass(). Both built-ins try to run the respective special method, reverting to the standard algorithm if it is not present.

A note about naming. Methods must accept the object they belong to as the first argument, so the two special methods shall have the form

def __instancecheck__(cls, inst):
    [...]

def __subclasscheck__(cls, sub):
    [...]

where cls is the class where they are injected, that is the one representing the promised behaviour. The two built-ins, however, have a reversed argument order, where the behaviour comes after the tested object: when you write isinstance([], list) you want to check if the [] instance has the list behaviour. This is the reason behind the name choice: just calling the methods __isinstance__() and __issubclass__() and passing arguments in a reversed order would have been confusing.
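To see the hook in action, here is a minimal sketch (the class names are invented for the example) of a metaclass that implements __instancecheck__() to recognize any object exposing an append() method:

class HasAppendMeta(type):
    def __instancecheck__(cls, inst):
        # Promise-based check: accept any object that exposes append()
        return hasattr(inst, 'append')

class Appendable(metaclass=HasAppendMeta):
    pass

# isinstance() delegates the check to __instancecheck__()
assert isinstance([], Appendable)
assert not isinstance(42, Appendable)

Note that isinstance([], Appendable) returns True even though list does not inherit from Appendable: the built-in asked the class instead of walking the hierarchy.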

This is ABC

The proposed solution is thus called Abstract Base Classes, as it provides a way to attach to a concrete class a virtual class whose only purpose is to signal a promised behaviour to anyone inspecting it with isinstance() or issubclass().

To help programmers implement Abstract Base Classes, the standard library has been given an abc module, that contains the ABCMeta class (and other facilities). This class is the one that implements __instancecheck__() and __subclasscheck__() and shall be used as a metaclass to augment a standard class. The latter will then be able to register other classes as implementations of its behaviour.

Sounds complex? An example may clarify the whole matter. The one from the official documentation is rather simple:

from abc import ABCMeta

class MyABC(metaclass=ABCMeta):
    pass

MyABC.register(tuple)

assert issubclass(tuple, MyABC)
assert isinstance((), MyABC)

Here, the MyABC class is given the ABCMeta metaclass. This puts the two __instancecheck__() and __subclasscheck__() methods inside MyABC so that, when issuing isinstance(), what Python actually executes is

>>> d = {'a': 1}
>>> isinstance(d, MyABC)
False
>>> MyABC.__class__.__instancecheck__(MyABC, d)
False
>>> isinstance((), MyABC)
True
>>> MyABC.__class__.__instancecheck__(MyABC, ())
True

After the definition of MyABC we need a way to signal that a given class is an instance of the Abstract Base Class, and this happens through the register() method, provided by the ABCMeta metaclass. Calling MyABC.register(tuple) we record inside MyABC the fact that the tuple class shall be identified as a subclass of MyABC itself. This is analogous to saying that tuple inherits from MyABC, but not quite the same. As already said, registering a class in an Abstract Base Class with register() does not affect the class hierarchy. Indeed, the whole tuple class is unchanged, as shown below.
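A quick check shows that the MRO of tuple is indeed untouched by the registration, while the inspection built-ins report the promised relationship:

>>> tuple.__mro__
(<class 'tuple'>, <class 'object'>)
>>> issubclass(tuple, MyABC)
True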

The current implementation of ABCs stores the registered types inside the _abc_registry attribute. Actually, it stores weak references to the registered types there (this part is outside the scope of this article, so I'm not detailing it)

>>> MyABC._abc_registry.data
{<weakref at 0xb682966c; to 'type' at 0x83dcca0 (tuple)>}

Movie Trivia

Section titles come from the following movies: The Breakfast Club (1985), E.T. the Extra-Terrestrial (1982), Who Framed Roger Rabbit (1988), Trading Places (1983), This is Spinal Tap (1984).

Sources

You will find a lot of documentation in this Reddit post. Most of the information contained in this series comes from those sources.

Feedback

Feel free to use the blog Google+ page to comment on the post. The GitHub issues page is the best place to submit corrections.

Categories: FLOSS Project Planets

The Digital Cat: Object-Oriented Programming in Python 3 - Metaclasses

Planet Python - Tue, 2019-05-21 21:47

This post is available as an IPython Notebook here

The Type Brothers

The first step into the most intimate secrets of Python objects comes from two components we already met in the first post: class and object. These two things are the very fundamental elements of the Python OOP system, so it is worth spending some time to understand how they work and relate to each other.

First of all recall that in Python everything is an object, that is everything inherits from object. Thus, object seems to be the deepest thing you can find digging into Python variables. Let's check this

>>> a = 5
>>> type(a)
<class 'int'>
>>> a.__class__
<class 'int'>
>>> a.__class__.__bases__
(<class 'object'>,)
>>> object.__bases__
()

The variable a is an instance of the int class, and the latter inherits from object, which inherits from nothing. This demonstrates that object is at the top of the class hierarchy. However, as you can see, both int and object are called classes (<class 'int'>, <class 'object'>). Indeed, while a is an instance of the int class, int itself is an instance of another class, a class that is instanced to build classes

>>> type(a)
<class 'int'>
>>> type(int)
<class 'type'>
>>> type(float)
<class 'type'>
>>> type(dict)
<class 'type'>

Since in Python everything is an object, everything is the instance of a class, even classes. Well, type is the class that is instanced to get classes. So remember this: object is the base of every object, type is the class of every type. Sounds puzzling? It is not your fault, don't worry. However, just to strike you with the finishing move, this is what Python is built on

>>> type(object)
<class 'type'>
>>> type.__bases__
(<class 'object'>,)

If you are not about to faint at this point, chances are that you are Guido van Rossum or one of his friends down at the Python core development team (in this case, let me thank you for your beautiful creation). You may get a cup of tea, if you need it.

Jokes aside, at the very base of the Python type system there are two things, object and type, which are inseparable. The previous code shows that object is an instance of type, and type inherits from object. Take your time to understand this subtle concept, as it is very important for the upcoming discussion about metaclasses.

When you think you have grasped the type/object matter, read this and start thinking again

>>> type(type)
<class 'type'>

The Metaclasses Take Python

You are now familiar with Python classes. You know that a class is used to create an instance, and that the structure of the latter is ruled by the source class and all its parent classes (until you reach object).

Since classes are objects too, you know that a class itself is an instance of a (super)class, and this class is type. That is, as already stated, type is the class that is used to build classes.

So for example you know that a class may be instanced, i.e. it can be called and by calling it you obtain another object that is linked with the class. What prepares the class for being called? What gives the class all its methods? In Python the class in charge of performing such tasks is called metaclass, and type is the default metaclass of all classes.

The point of exposing this structure of Python objects is that you may change the way classes are built. As you know, type is an object, so it can be subclassed just like any other class. Once you get a subclass of type you need to instruct your class to use it as the metaclass instead of type, and you can do this by passing it as the metaclass keyword argument in the class definition.

>>> class MyType(type):
...     pass
...
>>> class MySpecialClass(metaclass=MyType):
...     pass
...
>>> msp = MySpecialClass()
>>> type(msp)
<class '__main__.MySpecialClass'>
>>> type(MySpecialClass)
<class '__main__.MyType'>
>>> type(MyType)
<class 'type'>

Metaclasses 2: Singleton Day

Metaclasses are a very advanced topic in Python, but they have many practical uses. For example, by means of a custom metaclass you may log any time a class is instanced, which can be important for applications that shall keep memory usage low or need to monitor it.

I am going to show here a very simple example of a metaclass, the Singleton. Singleton is a well-known design pattern, and many descriptions of it may be found on the Internet. It has also been heavily criticized, mostly because of its bad behaviour when subclassed, but I do not want to introduce it here for its technological value, but for its simplicity (so please do not question the choice, it is just an example).

Singleton has one purpose: to return the same instance every time it is instanced, like a sort of object-oriented global variable. So we need to build a class that does not work like standard classes, which return a new instance every time they are called.

"Build a class"? This is a task for metaclasses. The following implementation comes from Python 3 Patterns, Recipes and Idioms.

class Singleton(type):
    instance = None

    def __call__(cls, *args, **kw):
        if not cls.instance:
            cls.instance = super(Singleton, cls).__call__(*args, **kw)
        return cls.instance

We are defining a new type, which inherits from type to provide all the bells and whistles of Python classes. We override the __call__ method, a special method invoked when we call the class, i.e. when we instance it. The new method wraps the original method of type by calling it only when the instance attribute is not set, i.e. the first time the class is instanced; otherwise it just returns the recorded instance. As you can see this is a very basic cache class, the only trick being that it is applied to the creation of instances.

To test the new type we need to define a new class that uses it as its metaclass

>>> class ASingleton(metaclass=Singleton):
...     pass
...
>>> a = ASingleton()
>>> b = ASingleton()
>>> a is b
True
>>> hex(id(a))
'0xb68030ec'
>>> hex(id(b))
'0xb68030ec'

By using the is operator we test that the two objects are the very same structure in memory, that is their ids are the same, as explicitly shown. What actually happens is that when you issue a = ASingleton() the ASingleton class runs its __call__() method, which is taken from the Singleton type behind the class. That method recognizes that no instance has been created (Singleton.instance is None) and acts just like any standard class does. When you issue b = ASingleton() the very same things happen, but since Singleton.instance is now different from None its value (the previous instance) is directly returned.

Metaclasses are a very powerful programming tool and, by leveraging them, you can achieve very complex behaviours with little effort. Their use is a must every time you are actually metaprogramming, that is, writing code that has to drive the way your code works. Good examples are creational patterns (injecting custom class attributes depending on some configuration), testing, debugging, and performance monitoring.
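As a minimal sketch of the instance-logging use case mentioned above (the names are invented for the example), a metaclass can wrap __call__() to count how many times a class is instanced:

class InstanceCounter(type):
    def __call__(cls, *args, **kw):
        # Create the instance through the standard machinery
        obj = super().__call__(*args, **kw)
        # Then update a per-class counter
        cls.created = getattr(cls, 'created', 0) + 1
        return obj

class Tracked(metaclass=InstanceCounter):
    pass

Tracked()
Tracked()
print(Tracked.created)  # 2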

Coming to Instance

Before introducing you to a very smart use of metaclasses by talking about Abstract Base Classes (read: to save some topics for the next part of this series), I want to dive into the object creation procedure in Python, that is what happens when you instance a class. In the first post this procedure was described only partially, by looking at the __init__() method.

In the first post I recalled the object-oriented concept of constructor, which is a special method of the class that is automatically called when the instance is created. The class may also define a destructor, which is called when the object is destroyed. In languages without a garbage collection mechanism such as C++ the destructor shall be carefully designed. In Python the destructor may be defined through the __del__() method, but it is hardly used.

The constructor mechanism in Python is on the contrary very important, and it is implemented by two methods, instead of just one: __new__() and __init__(). The tasks of the two methods are very clear and distinct: __new__() shall perform actions needed when creating a new instance while __init__ deals with object initialization.

Since in Python you do not need to declare attributes due to its dynamic nature, __new__() is rarely defined by programmers, who may rely on __init__() to perform the majority of the usual tasks. Typical uses of __new__() are very similar to those listed in the previous section, since it allows you to trigger some code whenever your class is instanced.

The standard way to override __new__() is

class MyClass():
    def __new__(cls, *args, **kwds):
        obj = super().__new__(cls, *args, **kwds)
        [put your code here]
        return obj

just like you usually do with __init__(). When your class inherits from object you do not need to call the parent method (object.__init__()), because it is empty, but you do need to when overriding __new__().

Remember that __new__() is not forced to return an instance of the class in which it is defined, even though you shall have very good reasons to break this behaviour. Anyway, __init__() will be called only if you return an instance of the container class. Please also note that __new__(), unlike __init__(), accepts the class as its first parameter. The name is not important in Python, and you can also call it self, but it is worth using cls to remember that it is not an instance.
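A short sketch (an intentionally silly class, just for illustration) makes the last two points visible: __new__() receives the class as cls, and __init__() is skipped when __new__() returns something that is not an instance of that class.

class Weird:
    def __new__(cls, *args, **kwds):
        # Return an object of a completely different type
        return 42

    def __init__(self):
        # Never called, since __new__() did not return a Weird instance
        print("initialized")

w = Weird()
print(type(w))  # <class 'int'>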

Movie Trivia

Section titles come from the following movies: The Blues Brothers (1980), The Muppets Take Manhattan (1984), Terminator 2: Judgement Day (1991), Coming to America (1988).

Sources

You will find a lot of documentation in this Reddit post. Most of the information contained in this series comes from those sources.

Feedback

Feel free to use the blog Google+ page to comment on the post. The GitHub issues page is the best place to submit corrections.

Categories: FLOSS Project Planets

The Digital Cat: Object-Oriented Programming in Python 3 - Composition and inheritance

Planet Python - Tue, 2019-05-21 21:47

This post is available as an IPython Notebook here

The Delegation Run

If classes are objects what is the difference between types and instances?

When I talk about "my cat" I am referring to a concrete instance of the "cat" concept, which is a subtype of "animal". So, despite both being objects, types can be specialized, while instances cannot.

Usually an object B is said to be a specialization of an object A when:

  • B has all the features of A
  • B can provide new features
  • B can perform some or all the tasks performed by A in a different way

Those targets are very general and valid for any system, and the key to achieving them with the maximum reuse of already existing components is delegation. Delegation means that an object shall perform only what it knows best, and leave the rest to other objects.

Delegation can be implemented with two different mechanisms: composition and inheritance. Sadly, very often only inheritance is listed among the pillars of OOP techniques, forgetting that it is an implementation of the more generic and fundamental mechanism of delegation; perhaps a better nomenclature for the two techniques could be explicit delegation (composition) and implicit delegation (inheritance).

Please note that, again, composition and inheritance are two ways of delegating, focused respectively on behaviour and on structure. Another way to think about the difference between composition and inheritance is to consider whether the object knows who can satisfy your request, or whether the object is itself the one that satisfies it.

Please, please, please do not forget composition: in many cases, composition can lead to simpler systems, with benefits on maintainability and changeability.

Usually composition is said to be a very generic technique that needs no special syntax, while inheritance and its rules are strongly dependent on the language of choice. Actually, the strong dynamic nature of Python softens the boundary line between the two techniques.

Inheritance Now
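
The examples in this section reuse the Door class developed in the previous post of this series. For reference, a minimal version with the members used below is reproduced here (the __dict__ dumps later in this post show colour as 'yellow' because the class was repainted in the previous post's examples):

class Door:
    colour = 'brown'

    def __init__(self, number, status):
        self.number = number
        self.status = status

    @classmethod
    def knock(cls):
        print("Knock!")

    @classmethod
    def paint(cls, colour):
        cls.colour = colour

    def open(self):
        self.status = 'open'

    def close(self):
        self.status = 'closed'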

In Python a class can be declared as an extension of one or more different classes, through the class inheritance mechanism. The child class (the one that inherits) has the same internal structure of the parent class (the one that is inherited), and for the case of multiple inheritance the language has very specific rules to manage possible conflicts or redefinitions among the parent classes. A very simple example of inheritance is

class SecurityDoor(Door):
    pass

where we declare a new class SecurityDoor that, at the moment, is a perfect copy of the Door class. Let us investigate what happens when we access attributes and methods. First we instance the class

>>> sdoor = SecurityDoor(1, 'closed')

The first check we can do is that class attributes are still global and shared

>>> SecurityDoor.colour is Door.colour
True
>>> sdoor.colour is Door.colour
True

This shows us that Python tries to resolve instance members not only by looking into the class the instance comes from, but also by investigating the parent classes. In this case, sdoor.colour becomes SecurityDoor.colour, which in turn becomes Door.colour. SecurityDoor is a Door.

If we investigate the content of __dict__ we can catch a glimpse of the inheritance mechanism in action

>>> sdoor.__dict__
{'number': 1, 'status': 'closed'}
>>> sdoor.__class__.__dict__
mappingproxy({'__doc__': None, '__module__': '__main__'})
>>> Door.__dict__
mappingproxy({'__dict__': <attribute '__dict__' of 'Door' objects>,
    'colour': 'yellow',
    'open': <function Door.open at 0xb687e224>,
    '__init__': <function Door.__init__ at 0xb687e14c>,
    '__doc__': None,
    'close': <function Door.close at 0xb687e1dc>,
    'knock': <classmethod object at 0xb67ff6ac>,
    '__weakref__': <attribute '__weakref__' of 'Door' objects>,
    '__module__': '__main__',
    'paint': <classmethod object at 0xb67ff6ec>})

As you can see, the content of __dict__ for SecurityDoor is very narrow compared to that of Door. The inheritance mechanism takes care of the missing elements by climbing up the class tree. Where does Python get the parent classes? A class always contains a __bases__ tuple that lists them

>>> SecurityDoor.__bases__
(<class '__main__.Door'>,)

So an example of what Python does to resolve a class method call through the inheritance tree is

>>> sdoor.__class__.__bases__[0].__dict__['knock'].__get__(sdoor)
<bound method type.knock of <class '__main__.SecurityDoor'>>
>>> sdoor.knock
<bound method type.knock of <class '__main__.SecurityDoor'>>

Please note that this is just an example that does not consider multiple inheritance.

Let us try now to override some methods and attributes. In Python you can override (redefine) a parent class member simply by redefining it in the child class.

class SecurityDoor(Door):
    colour = 'gray'
    locked = True

    def open(self):
        if not self.locked:
            self.status = 'open'

As you might expect, the overridden members now are present in the __dict__ of the SecurityDoor class

>>> SecurityDoor.__dict__
mappingproxy({'__doc__': None,
    '__module__': '__main__',
    'open': <function SecurityDoor.open at 0xb6fcf89c>,
    'colour': 'gray',
    'locked': True})

So when you override a member, the one you put in the child class is used instead of the one in the parent class simply because the former is found before the latter while climbing the class hierarchy. This also shows you that Python does not implicitly call the parent implementation when you override a method. So, overriding is a way to block implicit delegation.

If we want to call the parent implementation we have to do it explicitly. In the former example we could write

class SecurityDoor(Door):
    colour = 'gray'
    locked = True

    def open(self):
        if self.locked:
            return
        Door.open(self)

You can easily test that this implementation is working correctly.

>>> sdoor = SecurityDoor(1, 'closed')
>>> sdoor.status
'closed'
>>> sdoor.open()
>>> sdoor.status
'closed'
>>> sdoor.locked = False
>>> sdoor.open()
>>> sdoor.status
'open'

This form of explicit parent delegation is heavily discouraged, however.

The first reason is the very high coupling that results from explicitly naming the parent class again when calling the method. Coupling, in computer science lingo, means linking two parts of a system so that changes in one of them directly affect the other, and it is usually avoided as much as possible. In this case, if you decide to use a new parent class you have to manually propagate the change to every method that calls it. Moreover, since in Python the class hierarchy can be dynamically changed (i.e. at runtime), this form of explicit delegation could be not only annoying but also wrong.

The second reason is that in general you need to deal with multiple inheritance, where you do not know a priori which parent class implements the original form of the method you are overriding.

To solve these issues, Python supplies the super() built-in function, which climbs the class hierarchy and returns the correct class that shall be called. The syntax for calling super() is

class SecurityDoor(Door):
    colour = 'gray'
    locked = True

    def open(self):
        if self.locked:
            return
        super().open()

The output of super() is not exactly the Door class. It returns a super object whose representation is <super: <class 'SecurityDoor'>, <SecurityDoor object>>. This object, however, acts like the parent class, so you can safely ignore its custom nature and use it just like you would the Door class in this case.
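You can inspect such an object directly; outside of a class body super() needs explicit arguments, as in this quick check:

>>> sdoor = SecurityDoor(1, 'closed')
>>> super(SecurityDoor, sdoor)
<super: <class 'SecurityDoor'>, <SecurityDoor object>>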

Enter the Composition

Composition means that an object knows another object, and explicitly delegates some tasks to it. While inheritance is implicit, composition is explicit: in Python, however, things are far more interesting than this =).

First of all let us implement classic composition, which simply makes an object part of the other as an attribute

class SecurityDoor:
    colour = 'gray'
    locked = True

    def __init__(self, number, status):
        self.door = Door(number, status)

    def open(self):
        if self.locked:
            return
        self.door.open()

    def close(self):
        self.door.close()

The primary goal of composition is to relax the coupling between objects. This little example shows that now SecurityDoor is an object and no longer a Door, which means that the internal structure of Door is not copied. For this very simple example both Door and SecurityDoor are not big classes, but in a real system objects can be very complex; this means that their allocation consumes a lot of memory, and if a system contains thousands or millions of objects that could be an issue.

The composed SecurityDoor has to redefine the colour attribute since the concept of delegation applies only to methods and not to attributes, doesn't it?

Well, no. Python provides a very high degree of indirection for object manipulation, and attribute access is one of the most useful. As you already discovered, accessing attributes is ruled by a special method called __getattribute__() that is called whenever an attribute of the object is accessed. Overriding __getattribute__(), however, is overkill; it is a very complex method and, being called on every attribute access, any change makes the whole thing slower.

The method we have to leverage to delegate attribute access is __getattr__(), which is a special method that is called whenever the requested attribute is not found in the object. So basically it is the right place to dispatch all attribute and method access our object cannot handle. The previous example becomes

class SecurityDoor:
    locked = True

    def __init__(self, number, status):
        self.door = Door(number, status)

    def open(self):
        if self.locked:
            return
        self.door.open()

    def __getattr__(self, attr):
        return getattr(self.door, attr)

Using __getattr__() blurs the line between inheritance and composition, since after all the former is a form of automatic delegation of every member access.

class ComposedDoor:
    def __init__(self, number, status):
        self.door = Door(number, status)

    def __getattr__(self, attr):
        return getattr(self.door, attr)

As this last example shows, delegating every member access through __getattr__() is very simple. Pay attention to getattr(), which is different from __getattr__(). The former is a built-in that is equivalent to the dotted syntax, i.e. getattr(obj, 'someattr') is the same as obj.someattr, but you have to use it when the name of the attribute is contained in a string.
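A quick interactive session shows the delegation at work: the attribute is not found in ComposedDoor, so __getattr__() forwards the lookup to the inner Door.

>>> cdoor = ComposedDoor(1, 'closed')
>>> cdoor.status
'closed'
>>> cdoor.open()
>>> cdoor.status
'open'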

Composition provides a superior way to manage delegation since it can selectively delegate the access, and even mask some attributes or methods, while inheritance cannot. In Python you also avoid the memory problems that might arise when you put many objects inside another; Python handles everything through references, i.e. through pointers to the memory position of the thing, so the size of an attribute is constant and very limited.

Movie Trivia

Section titles come from the following movies: The Cannonball Run (1981), Apocalypse Now (1979), Enter the Dragon (1973).

Sources

You will find a lot of documentation in this Reddit post. Most of the information contained in this series comes from those sources.

Feedback

Feel free to use the blog Google+ page to comment on the post. The GitHub issues page is the best place to submit corrections.

Categories: FLOSS Project Planets

The Digital Cat: Object-Oriented Programming in Python 3 - Classes and members

Planet Python - Tue, 2019-05-21 21:47

This post is available as an IPython Notebook here

Python Classes Strike Again

The Python implementation of classes has some peculiarities. The bare truth is that in Python the class of an object is an object itself. You can check this by issuing type() on the class

>>> a = 1
>>> type(a)
<class 'int'>
>>> type(int)
<class 'type'>

This shows that the int class is an object, an instance of the type class.

This concept is not as difficult to grasp as it may seem at first sight: in the real world we deal with concepts, using them like things: for example, we can talk about the concept of "door", telling people what a door looks like and how it works. In this case the concept of door is the topic of our discussion, so in our everyday experience the type of an object is an object itself. In Python this can be expressed by saying that everything is an object.

If the class of an object is itself an instance, then it is a concrete object and is stored somewhere in memory. Let us leverage the inspection capabilities of Python and its id() function to check the status of our objects. The id() built-in function returns the memory position of an object.

In the first post we defined this class

class Door:
    def __init__(self, number, status):
        self.number = number
        self.status = status

    def open(self):
        self.status = 'open'

    def close(self):
        self.status = 'closed'

First of all, let's create two instances of the Door class and check that the two objects are stored at different addresses

>>> door1 = Door(1, 'closed')
>>> door2 = Door(1, 'closed')
>>> hex(id(door1))
'0xb67e148c'
>>> hex(id(door2))
'0xb67e144c'

This confirms that the two instances are separate and unrelated. Please note that your values are very likely to be different from the ones I got. Being memory addresses they change at every execution. The second instance was given the same attributes of the first instance to show that the two are different objects regardless of the value of the attributes.

However if we use id() on the class of the two instances we discover that the class is exactly the same

>>> hex(id(door1.__class__))
'0xb685f56c'
>>> hex(id(door2.__class__))
'0xb685f56c'

Well, this is very important. In Python, a class is not just the schema used to build an object. Rather, the class is a shared living object whose code is accessed at run time.

As we already tested, however, attributes are not stored in the class but in every instance, due to the fact that __init__() works on self when creating them. Classes, however, can be given attributes like any other object; with a terrific effort of imagination, let's call them class attributes.

As you can expect, class attributes are shared among the class instances just like their container

class Door:
    colour = 'brown'

    def __init__(self, number, status):
        self.number = number
        self.status = status

    def open(self):
        self.status = 'open'

    def close(self):
        self.status = 'closed'

Pay attention: the colour attribute here is not created using self, so it is contained in the class and shared among instances

>>> door1 = Door(1, 'closed')
>>> door2 = Door(2, 'closed')
>>> Door.colour
'brown'
>>> door1.colour
'brown'
>>> door2.colour
'brown'

Up to here things are no different from the previous case. Let's see if changes to the shared value are reflected in all instances

>>> Door.colour = 'white'
>>> Door.colour
'white'
>>> door1.colour
'white'
>>> door2.colour
'white'
>>> hex(id(Door.colour))
'0xb67e1500'
>>> hex(id(door1.colour))
'0xb67e1500'
>>> hex(id(door2.colour))
'0xb67e1500'

Raiders of the Lost Attribute

Any Python object is automatically given a __dict__ attribute, which contains its list of attributes. Let's investigate what this dictionary contains for our example objects:

>>> Door.__dict__
mappingproxy({'open': <function Door.open at 0xb68604ac>,
    'colour': 'brown',
    '__dict__': <attribute '__dict__' of 'Door' objects>,
    '__weakref__': <attribute '__weakref__' of 'Door' objects>,
    '__init__': <function Door.__init__ at 0xb7062854>,
    '__module__': '__main__',
    '__doc__': None,
    'close': <function Door.close at 0xb686041c>})
>>> door1.__dict__
{'number': 1, 'status': 'closed'}

Leaving aside the difference between a dictionary and a mappingproxy object, you can see that the colour attribute is listed among the Door class attributes, while status and number are listed for the instance.

How come we can call door1.colour, if that attribute is not listed for that instance? This is a job performed by the magic __getattribute__() method; in Python the dotted syntax automatically invokes this method, so when we write door1.colour, Python executes door1.__getattribute__('colour'). That method performs the attribute lookup action, i.e. finds the value of the attribute by looking in different places.

The standard implementation of __getattribute__() searches first the internal dictionary (__dict__) of an object, then the type of the object itself; in this case door1.__getattribute__('colour') executes first door1.__dict__['colour'] and then, since the latter raises a KeyError exception, door1.__class__.__dict__['colour']

>>> door1.__dict__['colour']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'colour'
>>> door1.__class__.__dict__['colour']
'brown'

Indeed, if we compare the objects' equality through the is operator we can confirm that both door1.colour and Door.colour are exactly the same object

>>> door1.colour is Door.colour
True

When we try to assign a value to a class attribute directly on an instance, we just put in the __dict__ of the instance a value with that name, and this value masks the class attribute since it is found first by __getattribute__(). As you can see from the examples of the previous section, this is different from changing the value of the attribute on the class itself.

>>> door1.colour = 'white'
>>> door1.__dict__['colour']
'white'
>>> door1.__class__.__dict__['colour']
'brown'
>>> Door.colour = 'red'
>>> door1.__dict__['colour']
'white'
>>> door1.__class__.__dict__['colour']
'red'

Revenge of the Methods

Let's play the same game with methods. First of all you can see that, just like class attributes, methods are listed only in the class __dict__. Chances are that they behave the same as attributes when we get them

>>> door1.open is Door.open
False

Whoops. Let us further investigate the matter

>>> Door.__dict__['open']
<function Door.open at 0xb68604ac>
>>> Door.open
<function Door.open at 0xb68604ac>
>>> door1.open
<bound method Door.open of <__main__.Door object at 0xb67e162c>>

So the class method is listed in the members dictionary as a function. So far, so good. The same happens when taking it directly from the class (here Python 2 needed to introduce unbound methods, which are not present in Python 3). Taking it from the instance, instead, returns a bound method.

Well, a function is a procedure you named and defined with the def statement. When you refer to a function as part of a class in Python 3 you get a plain function, without any difference from a function defined outside a class.

When you get the function from an instance, however, it becomes a bound method. The name method simply means "a function inside an object", according to the usual OOP definitions, while bound signals that the method is linked to that instance. Why does Python bother with methods being bound or not? And how does Python transform a function into a bound method?

First of all, if you try to call a class function you get an error

>>> Door.open()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: open() missing 1 required positional argument: 'self'

Yes. Indeed the function was defined to require an argument called 'self', and calling it without an argument raises an exception. This perhaps means that we can give it one instance of the class and make it work

>>> Door.open(door1)
>>> door1.status
'open'

Python does not complain here, and the method works as expected. So Door.open(door1) is the same as door1.open(), and this is the difference between a plain function coming from a class and a bound method: the bound method automatically passes the instance as an argument to the function.

Again, under the hood, __getattribute__() is working to make everything work and when we call door1.open(), Python actually calls door1.__class__.open(door1). However, door1.__class__.open is a plain function, so there is something more that converts it into a bound method that Python can safely call.

When you access a member of an object, Python calls __getattribute__() to satisfy the request. This magic method, however, conforms to a procedure known as the descriptor protocol. For read access, __getattribute__() checks if the object has a __get__() method and calls the latter. So the conversion of a function into a bound method happens through such a mechanism. Let us review it by means of an example.

>>> door1.__class__.__dict__['open']
<function Door.open at 0xb68604ac>

This syntax retrieves the function defined in the class; the function knows nothing about objects, but it is an object itself (remember "everything is an object"). So we can look inside it with the dir() built-in function

>>> dir(door1.__class__.__dict__['open'])
['__annotations__', '__call__', '__class__', '__closure__', '__code__',
 '__defaults__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__',
 '__format__', '__ge__', '__get__', '__getattribute__', '__globals__',
 '__gt__', '__hash__', '__init__', '__kwdefaults__', '__le__', '__lt__',
 '__module__', '__name__', '__ne__', '__new__', '__qualname__',
 '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
 '__str__', '__subclasshook__']
>>> door1.__class__.__dict__['open'].__get__
<method-wrapper '__get__' of function object at 0xb68604ac>

As you can see, a __get__ method is listed among the members of the function, and Python recognizes it as a method-wrapper. This method shall connect the open function to the door1 instance, so we can call it passing the instance alone

>>> door1.__class__.__dict__['open'].__get__(door1)
<bound method Door.open of <__main__.Door object at 0xb67e162c>>

and we get exactly what we were looking for. This complex syntax is what happens behind the scenes when we call a method of an instance.

When Methods met Classes

Using type() on functions defined inside classes reveals some other details on their internal representation

>>> Door.open
<function Door.open at 0xb687e074>
>>> door1.open
<bound method Door.open of <__main__.Door object at 0xb6f9834c>>
>>> type(Door.open)
<class 'function'>
>>> type(door1.open)
<class 'method'>

As you can see, Python tells the two apart recognizing the first as a function and the second as a method, where the second is a function bound to an instance.

What if we want to define a function that operates on the class instead of operating on the instance? As we may define class attributes, we may also define class methods in Python, through the classmethod decorator. Class methods are functions that are bound to the class and not to an instance.

class Door:
    colour = 'brown'

    def __init__(self, number, status):
        self.number = number
        self.status = status

    @classmethod
    def knock(cls):
        print("Knock!")

    def open(self):
        self.status = 'open'

    def close(self):
        self.status = 'closed'

Such a definition makes the method callable on both the instance and the class

>>> door1.knock()
Knock!
>>> Door.knock()
Knock!

and Python identifies both as (bound) methods

>>> door1.__class__.__dict__['knock']
<classmethod object at 0xb67ff6ac>
>>> door1.knock
<bound method type.knock of <class '__main__.Door'>>
>>> Door.knock
<bound method type.knock of <class '__main__.Door'>>
>>> type(Door.knock)
<class 'method'>
>>> type(door1.knock)
<class 'method'>

As you can see, the knock() function accepts one argument, which is called cls just to remind us that it is not an instance but the class itself. This means that inside the function we can operate on the class, and the class is shared among instances.

class Door:
    colour = 'brown'

    def __init__(self, number, status):
        self.number = number
        self.status = status

    @classmethod
    def knock(cls):
        print("Knock!")

    @classmethod
    def paint(cls, colour):
        cls.colour = colour

    def open(self):
        self.status = 'open'

    def close(self):
        self.status = 'closed'

The paint() classmethod now changes the class attribute colour which is shared among instances. Let's check how it works

>>> door1 = Door(1, 'closed')
>>> door2 = Door(2, 'closed')
>>> Door.colour
'brown'
>>> door1.colour
'brown'
>>> door2.colour
'brown'
>>> Door.paint('white')
>>> Door.colour
'white'
>>> door1.colour
'white'
>>> door2.colour
'white'

The class method can be called on the class, but this affects both the class and the instances, since the colour attribute of instances is taken at runtime from the shared class.

>>> door1.paint('yellow')
>>> Door.colour
'yellow'
>>> door1.colour
'yellow'
>>> door2.colour
'yellow'

Class methods can be called on instances too, however, and their effect is the same as before. The class method is bound to the class, so it works on the latter regardless of the actual object that calls it (class or instance).

Movie Trivia

Section titles come from the following movies: The Empire Strikes Back (1980), Raiders of the Lost Ark (1981), Revenge of the Nerds (1984), When Harry Met Sally (1989).

Sources

You will find a lot of documentation in this Reddit post. Most of the information contained in this series comes from those sources.

Feedback

Feel free to use the blog Google+ page to comment on the post. The GitHub issues page is the best place to submit corrections.

Categories: FLOSS Project Planets

The Digital Cat: Object-Oriented Programming in Python 3 - Objects and types

Planet Python - Tue, 2019-05-21 21:47

This post is available as an IPython Notebook here

About this series

Object-oriented programming (OOP) has been the leading programming paradigm for several decades now, starting from the initial attempts back in the 60s to some of the most important languages used nowadays. Being a set of programming concepts and design methodologies, OOP can never be said to be "correctly" or "fully" implemented by a language: indeed there are as many implementations as languages.

So one of the most interesting aspects of OOP languages is to understand how they implement those concepts. In this post I am going to try and start analyzing the OOP implementation of the Python language. Due to the richness of the topic, however, I consider this attempt just like a set of thoughts for Python beginners trying to find their way into this beautiful (and sometimes peculiar) language.

This series of posts wants to introduce the reader to the Python 3 implementation of Object Oriented Programming concepts. The content of this and the following posts will not be completely different from that of the previous "OOP Concepts in Python 2.x" series, however. The reason is that while some of the internal structures change a lot, the global philosophy doesn't: Python 3 is an evolution of Python 2 and not a new language.

So I chose to split the previous series and to adapt the content to Python 3 instead of posting a mere list of corrections. I find this approach more useful for new readers, who otherwise would be forced to read the previous series.

Print

One of the most noticeable changes introduced by Python 3 is the transformation of the print keyword into the print() function. This is indeed a very small change, compared to other modifications made to the internal structures, but it is the most visually striking one, and it will be the source of 80% of your syntax errors when you start writing Python 3 code.

Remember that print is now a function so write print(a) and not print a.

Back to the Object

Computer science deals with data and with procedures to manipulate that data. Everything, from the earliest Fortran programs to the latest mobile apps is about data and their manipulation.

So if data are the ingredients and procedures are the recipes, it seems (and can be) reasonable to keep them separate.

Let's do some procedural programming in Python

# This is some data
data = (13, 63, 5, 378, 58, 40)

# This is a procedure that computes the average
def avg(d):
    return sum(d) / len(d)

print(avg(data))

As you can see, the code is quite good and general: the procedure (function) operates on a sequence of data, and it returns the average of the sequence items. So far, so good: computing the average of some numbers leaves the numbers untouched and creates new data.

The observation of the everyday world, however, shows that complex data mutate: an electrical device is on or off, a door is open or closed, the content of a bookshelf in your room changes as you buy new books.

You can still manage this while keeping data and procedures separate, for example

# These are two numbered doors, initially closed
door1 = [1, 'closed']
door2 = [2, 'closed']

# This procedure opens a door
def open_door(door):
    door[1] = 'open'

open_door(door1)
print(door1)

I described a door as a structure containing a number and the status of the door (as you would do in languages like LISP, for example). The procedure knows how this structure is made and may alter it.

This also works like a charm. Some problems arise, however, when we start building specialized types of data. What happens, for example, when I introduce a "lockable door" data type, which can be opened only when it is not locked? Let's see

# These are two standard doors, initially closed
door1 = [1, 'closed']
door2 = [2, 'closed']

# This is a lockable door, initially closed and unlocked
ldoor1 = [1, 'closed', 'unlocked']

# This procedure opens a standard door
def open_door(door):
    door[1] = 'open'

# This procedure opens a lockable door
def open_ldoor(door):
    if door[2] == 'unlocked':
        door[1] = 'open'

open_door(door1)
print(door1)

open_ldoor(ldoor1)
print(ldoor1)

Everything still works, no surprises in this code. However, as you can see, I had to find a different name for the procedure that opens a lockable door since its implementation differs from the procedure that opens a standard door. But, wait... I'm still opening a door; the action is the same, and it just changes the status of the door itself. So why should I have to remember that a lockable door must be opened with open_ldoor() instead of open_door() if the verb is the same?

Chances are that this separation between data and procedures doesn't perfectly fit some situations. The key problem is that the "open" action is not actually using the door; rather it is changing its state. So, just like the volume control buttons of your phone, which are on your phone, the "open" procedure should stick to the "door" data.

This is exactly what leads to the concept of object: an object, in the OOP context, is a structure holding data and procedures operating on them.

What About Type?

When you talk about data you immediately need to introduce the concept of type. This concept may have two meanings that are worth being mentioned in computer science: the behavioural and the structural one.

The behavioural meaning represents the fact that you know what something is by describing how it acts. This is the foundation of the so-called "duck typing" (here "typing" means "to give a type" and not "to type on a keyboard"): if it acts like a duck, it is a duck.

The structural meaning identifies the type of something by looking at its internal structure. So two things that act in the same way but are internally different are of different type.

Both points of view can be valid, and different languages may implement and emphasize one meaning of type or the other, and even both.
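
To make the behavioural meaning concrete, here is a minimal duck-typing sketch (the Duck and Dog classes are invented for illustration): the function never checks the type of its argument, it only assumes the argument provides a sound() method.

class Duck:
    def sound(self):
        return "Quack!"

class Dog:
    def sound(self):
        return "Woof!"

def make_sound(animal):
    # No isinstance() check: anything with a sound() method will do
    print(animal.sound())

make_sound(Duck())  # prints: Quack!
make_sound(Dog())   # prints: Woof!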

Class Games

Objects in Python may be built describing their structure through a class. A class is the programming representation of a generic object, such as "a book", "a car", "a door": when I talk about "a door" everyone can understand what I'm saying, without the need of referring to a specific door in the room.

In Python, the type of an object is represented by the class used to build the object: that is, in Python the word type has the same meaning as the word class.

For example, one of the built-in classes of Python is int, which represents an integer number

>>> a = 6
>>> print(a)
6
>>> print(type(a))
<class 'int'>
>>> print(a.__class__)
<class 'int'>

As you can see, the built-in function type() returns the content of the magic attribute __class__ (magic here means that its value is managed by Python itself offstage). The type of the variable a, or its class, is int. (This is a very inaccurate description of this rather complex topic, so remember that at the moment we are just scratching the surface).

Once you have a class you can instantiate it to get a concrete object (an instance) of that type, i.e. an object built according to the structure of that class. The Python syntax to instantiate a class is the same as that of a function call

>>> b = int()
>>> type(b)
<class 'int'>

When you create an instance, you can pass some values, according to the class definition, to initialize it.

>>> b = int()
>>> print(b)
0
>>> c = int(7)
>>> print(c)
7

In this example, the int class creates an integer with value 0 when called without arguments, otherwise it uses the given argument to initialize the newly created object.

Let us write a class that represents a door to match the procedural examples done in the first section

class Door:
    def __init__(self, number, status):
        self.number = number
        self.status = status

    def open(self):
        self.status = 'open'

    def close(self):
        self.status = 'closed'

The class keyword defines a new class named Door; everything indented under class is part of the class. The functions you write inside the object are called methods and don't differ at all from standard functions; the nomenclature changes only to highlight the fact that those functions now are part of an object.

Methods of a class must accept as their first argument a special value called self (the name is a convention, but please never break it).

The class can be given a special method called __init__() which is run when the class is instantiated, receiving the arguments passed when calling the class; the general name of such a method, in the OOP context, is constructor, even if the __init__() method is not the only part of this mechanism in Python.

The self.number and self.status variables are called attributes of the object. In Python, methods and attributes are both members of the object and are accessible with the dotted syntax; the difference between attributes and methods is that the latter can be called (in Python lingo you say that a method is a callable).
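
The built-in callable() function makes this distinction visible. Anticipating the instance we will create in a moment (a minimal sketch):

>>> door1 = Door(1, 'closed')
>>> callable(door1.open)
True
>>> callable(door1.status)
False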

As you can see, the __init__() method must create and initialize the attributes, since they are not declared elsewhere. This is very important in Python and is strictly linked to the way the language handles the types of variables. I will detail those concepts when dealing with polymorphism in a later post.

The class can be used to create a concrete object

>>> door1 = Door(1, 'closed')
>>> type(door1)
<class '__main__.Door'>
>>> print(door1.number)
1
>>> print(door1.status)
closed

Now door1 is an instance of the Door class; type() returns the class as __main__.Door since the class was defined directly in the interactive shell, that is in the current main module.

To call a method of an object, that is to run one of its internal functions, you just access it as an attribute with the dotted syntax and call it like a standard function.

>>> door1.open()
>>> print(door1.number)
1
>>> print(door1.status)
open

In this case, the open() method of the door1 instance has been called. No arguments have been passed to the open() method, but if you review the class declaration, you see that it was declared to accept an argument (self). When you call a method of an instance, Python automatically passes the instance itself to the method as the first argument.
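
You can verify this yourself: calling the function through the class and passing the instance explicitly has exactly the same effect (a quick sketch):

>>> Door.open(door1)
>>> door1.status
'open'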

You can create as many instances as needed, and they are completely unrelated to each other. That is, the changes you make on one instance are not reflected on another instance of the same class.
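
For example (a small sketch with two fresh doors):

>>> door1 = Door(1, 'closed')
>>> door2 = Door(2, 'closed')
>>> door1.open()
>>> door1.status
'open'
>>> door2.status
'closed'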

Recap

Objects are described by a class, which can generate one or more instances that are unrelated to each other. A class contains methods, which are functions, and they accept at least one argument called self, which is the actual instance on which the method has been called. A special method, __init__(), deals with the initialization of the object, setting the initial values of the attributes.

Movie Trivia

Section titles come from the following movies: Back to the Future (1985) , What About Bob? (1991), Wargames (1983).

Sources

You will find a lot of documentation in this Reddit post. Most of the information contained in this series comes from those sources.

Feedback

Feel free to use the blog Google+ page to comment on the post. The GitHub issues page is the best place to submit corrections.

Categories: FLOSS Project Planets

Molly de Blanc: remuneration

Planet Debian - Tue, 2019-05-21 17:08

I am a leader in free software. As evidence for this claim, I like to point out that I once finagled an invitation to the Google OSCON luminaries dinner, and was once invited to a Facebook party for open source luminaries.

In spite of my humor, I am a leader and have taken on leadership roles for a number of years. I was in charge of guests of honor (and then some) at Penguicon for several years at the start of my involvement in FOSS. I’m a delegate on the Debian Outreach team. My participation in Debian A-H is a leadership role as well. I’m president of the OSI Board of Directors. I’ve given keynote presentations on two continents, and talks on four. And that’s not even getting into my paid professional life. My compensated labor has been nearly exclusively for nonprofits.

Listing my credentials in such concentration feels a bit distasteful, but sometimes I think it’s important. Right now, I want to convey that I know a thing or two about free/open source leadership. I’ve even given talks on that.

Other than my full-time job, my leadership positions come without material remuneration — that is to say I don’t get paid for any of them — though I’ve accepted many a free meal and have had travel compensated on a number of occasions. I am not interested in getting paid for my leadership work, though I have come to believe that more leadership positions should be paid.

One of my criticisms about unpaid project/org leadership positions is that they are so time consuming it means that the people who can do the jobs are:

  • students
  • contractors
  • unemployed
  • those with few to no other responsibilities
  • those with very supportive partners
  • those with very supportive employers
  • those who don’t need much sleep
  • those with other forms of financial privilege

I have few responsibilities beyond some finicky plants and Bash (my cat). I also have extremely helpful roommates and modern technology (e.g. automatic feeders) that assist with these things while traveling. I can spend my evenings and weekends holed up in my office plugging away on my free software work. I have a lot of freedom and flexibility — economic, social, professional — that affords me this opportunity. Very few of us do.

This is a problem! One solution is to pay more leadership positions; another is to have these projects hire someone in an executive director-like capacity and turn their leadership roles into advisory roles; or replace the positions with committees (the problem with the latter is that most committees still have/need a leader).

Diversity is good.

The time requirements for leadership roles severely limit the pool of potential participants. This limits the perspectives and experiences brought to the positions — and diversity in experience is widely considered to be good. People from underrepresented backgrounds generally overlap with marginalized communities — including ethnic, geographic, gender, race, and socio-economic minorities.

Volunteer work is not “more pure.”

One of the arguments for not paying people for these positions is that their motives will be more pure if they are doing it as a volunteer — because they aren’t “in it for the money.” I would argue that your motives can be less pure if you aren’t being paid for your labor.

In mission-driven nonprofits, you want as much of your funding as possible to come from individual or community donors rather than corporate sponsors. You want the number of individual and community donors and members to be greater than that of your sponsors. You want to ensure you have enough money that should a corporate sponsor drop you (or you drop them), you are still in a sustainable position. You want to do this so that you are not beholden to any of your corporate or government sponsors. Freeing yourself from corporate influence allows you to focus on the mission of your work.

When searching for a volunteer leader, you need to look at them as a mission-driven nonprofit. Ask: What are their conflicts of interest? What happens if their employers pull away their support? What sort of financial threats are they susceptible to?

In a capitalist system, when someone is being paid for their labor, they are able to prioritize that labor. Adequate compensation enables a person to invest more fully in their work. When your responsibilities as the leader of a free software project, for which you are unpaid, come into direct conflict with the interests of your employer, who is going to win?

Note, however, that it’s important to make sure the funding to pay your leadership does not come with strings attached so that your work isn’t contingent upon any particular sponsor or set of sponsors getting what they want.

It’s a lot of work. Like, a lot of work.

By turning a leadership role into a job (even a part-time one), the associated labor can be prioritized over other labor. Many volunteer leadership positions require the same commitment as a part-time job, and some can be close to if not actually full-time jobs.

Someone’s full-time employer needs to be supportive of their volunteer leadership activities. I have some flexibility in the schedule for my day job, so I can plan meetings with people who are doing their day jobs, or in different time zones, that will work for them. Not everyone has this flexibility when they have a full-time job that isn’t their leadership role. Many people in leadership roles — I know past presidents of the OSI and previous Debian Project Leaders who will attest to this — are only able to do so because their employer allows them to shift their work schedule in order to do their volunteer work. Even when you’re “just” attending meetings, you’re doing so either with your employer giving you the time off, or using your PTO to do so.

A few final thoughts.

Many of us live in capitalist societies. One of the ways you show respect for someone’s labor is by paying them for it. This isn’t to say I think all FOSS contributions should be paid (though some argue they ought to be!), but that certain things require levels of dedication that go significantly above and beyond that which is reasonable. Our free software leaders are incredible, and we need to change how we recognize that.

(Please note that I don’t feel as though I should be paid for any of my leadership roles and, in fact, have reasons why I believe they should be unpaid.)

Categories: FLOSS Project Planets

Quansight Labs Blog: Spyder 4.0 takes a big step closer with the release of Beta 2!

Planet Python - Tue, 2019-05-21 16:02

It has been almost two months since I joined Quansight in April, to start working on Spyder maintenance and development. So far, it has been a very exciting and rewarding journey under the guidance of long time Spyder maintainer Carlos Córdoba. This is the first of a series of blog posts we will be writing to showcase updates on the development of Spyder, new planned features and news on the road to Spyder 4.0 and beyond.

First off, I would like to give a warm welcome to Edgar Margffoy, who recently joined Quansight and will be working with the Spyder team to take its development even further. Edgar has been a core Spyder developer for more than two years now, and we are very excited to have his (almost) full-time commitment to the project.

Spyder 4.0 Beta 2 released!

Since August 2018, when the first beta of the 4.x series was released, the Spyder development team has been working hard on our next release. Over the past year, we've implemented the long awaited full-interface dark theme; overhauled our entire code completion and linting architecture to use the Language Server Protocol, opening the door to supporting many other languages in the future; added a new Plots pane to view and manage the figures generated by your code; and numerous other feature enhancements, bug fixes and internal improvements.

Dark theme

A full-interface dark theme has been a long awaited feature, and is enabled by default in Spyder 4. You can still select the light theme under Preferences > Appearance by either choosing a light-background syntax-highlighting scheme, or changing Interface theme to Light.

Pretty, right :-) ?


Categories: FLOSS Project Planets

Python Engineering at Microsoft: Who put Python in the Windows 10 May 2019 Update?

Planet Python - Tue, 2019-05-21 15:59

Today the Windows team announced the May 2019 Update for Windows 10. In this post we’re going to look at what we, the Python team, have done to make Python easier to install on Windows by helping the community publish to the Microsoft Store and, in collaboration with Windows, adding a default “python.exe” command to help find it. You may have already heard about these on the Python Bytes podcast, at PyCon US, or through Twitter.

As software moves from the PC to the cloud, the browser, and the Internet of Things, development workflows are changing. While Visual Studio remains a great starting point for any workload on Windows, many developers now prefer to acquire tools individually and on-demand.

For other operating systems, the platform-endorsed package manager is the traditional place to find individual tools that have been customized, reviewed, and tested for your system. On Windows we are exploring ways to provide a similar experience for developers without impacting non-developer users or infringing publishers’ ability to manage their own releases. The Windows Subsystem for Linux is one approach, offering developers consistency between their build and deployment environments. But there are other developer tools that also matter.

One such tool is Python. Microsoft has been involved with the Python community for over twelve years, and currently employs four of the key contributors to the language and primary runtime. The growth of Python has been incredible, as it finds homes among data scientists, web developers, system administrators, and students, and roughly half of this work is already happening on Windows. And yet, Python developers on Windows find themselves facing more friction than on other platforms.

Installing Python on Windows

It’s been widely known for many years that Windows is the only mainstream operating system that does not include a Python interpreter out of the box. For many users who are never going to need it, this helps reduce the size and improve the security of the operating system. But for those of us who do need it, Python’s absence has been keenly felt.

Once you discover that you need to get Python, you are quickly faced with many choices. Will you download an installer from python.org? Or perhaps a distribution such as Anaconda? The Visual Studio installer is also an option. And which version? How will you access it after it’s been installed? You quickly find more answers than you need, and depending on your situation, any of them might be correct.

We spent time figuring out why someone would hit the error above and what help they need. If you’re already a Python expert with complex needs, you probably know how to install and use it. It’s much more likely that someone will hit this problem the first time they are trying to use Python. Many of the teachers we spoke to confirmed this hypothesis – students encounter this far more often than experienced developers.

So we made things easier.

First, we helped the community release their distribution of Python to the Microsoft Store. This version of Python is fully maintained by the community, installs easily on Windows 10, and automatically makes common commands such as python, pip and idle available (as well as equivalents with version numbers python3 and python3.7, for all the commands, just like on Linux).

Finally, with the May 2019 Windows Update, we are completing the picture. While Python continues to remain completely independent from the operating system, every install of Windows will include python and python3 commands that take you directly to the Python store page. We believe that the Microsoft Store package is perfect for users starting out with Python, and given our experience with and participation in the Python community we are pleased to endorse it as the default choice.

We hope everyone will be as excited as Scott Hanselman was when he discovered it. Over time, we plan to extend similar integration to other developer tools and reduce the getting started friction. We’d love to hear your thoughts, and suggestions, so feel free to post comments here or use the Windows Feedback app.

 

The post Who put Python in the Windows 10 May 2019 Update? appeared first on Python.

Categories: FLOSS Project Planets

PyCoder’s Weekly: Issue #369 (May 21, 2019)

Planet Python - Tue, 2019-05-21 15:30

#369 – MAY 21, 2019

Interactive Data Visualization in Python With Bokeh

This course will get you up and running with the Bokeh library, using examples and a real-world dataset. You’ll learn how to visualize your data, customize and organize your visualizations, and add interactivity.
REAL PYTHON video

Build a Hardware-Based Face Recognition System for $150 With the Nvidia Jetson Nano and Python

“With the Nvidia Jetson Nano, you can build stand-alone hardware systems that run GPU-accelerated deep learning models on a tiny budget. It’s just like a Raspberry Pi, but a lot faster.”
ADAM GEITGEY

Leverage Data Science to Optimize Your Application

PyCharm 2019.1 Professional Edition has all-new Jupyter Notebooks support. You can use the same IDE that you use for building your application to analyze the data to improve it. Try it now →
JETBRAINS sponsor

PEP 581 Accepted (Using GitHub Issues for CPython)

CPython’s issue tracker will be migrated from Roundup to GitHub issues.
PYTHON.ORG

Batteries Included, but They’re Leaking

“Amber Brown of the Twisted project shared her criticisms of the Python standard library [at PyCon 2019]. This proved to be the day’s most controversial talk; Guido van Rossum stormed from the room during Q & A.” Related discussion on Hacker News
PYFOUND.BLOGSPOT.COM

Unicode & Character Encodings in Python: A Painless Guide

Get a Python-centric introduction to character encodings and unicode. Handling character encodings and numbering systems can at times seem painful and complicated, but this guide is here to help with easy-to-follow Python examples.
REAL PYTHON

Python Built-Ins Worth Learning

“Which built-ins should you know about? I estimate most Python developers will only ever need about 30 built-in functions, but which 30 depends on what you’re actually doing with Python.”
TREY HUNNER

PSF Q2 2019 Fundraiser

Support the Python Software Foundation by donating in the quarterly donation drive. Your donations help fund Python conferences, workshops, user groups, community web services, and more.
PYTHON.ORG

Discussions

Black: The Uncompromising Python Code Formatter

“I often dislike autoformatter output too, but then I remember that while no-one likes what the autoformatter does to their code, everyone likes what the autoformatter does to their coworkers’ code, and then I chill out about it. Having a standard is more important than the standard being excellent.”
HACKER NEWS

Python Jobs

SIPS Programmer (Madison, WI)

University of Wisconsin

Senior API Developer (Copenhagen, Denmark)

GameAnalytics Ltd.

Senior Backend Python Developer (Remote)

Kiwi.com

More Python Jobs >>>

Articles & Tutorials

Structuring Your Python Project

“Which functions should go into which modules? How does data flow through the project? What features and functions can be grouped together and isolated? By answering questions like these you can begin to plan, in a broad sense, what your finished product will look like.”
PYTHON-GUIDE.ORG

5 Reasons Why People Are Choosing Masonite Over Django

The creator of Masonite explains why you should consider Masonite for use in your next Python web dev project.
JOSEPH MANCUSO

Join a Community of 3.5 Million Developers on DigitalOcean

Discover why Python developers love self-hosting their apps on DigitalOcean, the simplest cloud platform. Click here to learn more and get started within minutes →
DIGITALOCEAN sponsor

New Features Planned for Python 4.0 (Satire)

“All new libraries and standard lib modules must include the phrase “for humans” somewhere in their title.” “Type-hinting has been extended to provide even fewer tangible benefits. This new and “nerfed” type hinting will be called type whispering.”
CHARLES LEIFER

Announcing the Wolfram Client Library for Python

Get full access to the Wolfram Language from Python.
WOLFRAM.COM

Scalable Python Code With Pandas UDFs

“Pandas UDFs are a feature that enable Python code to run in a distributed environment, even if the library was developed for single node execution.”
BEN WEBER

Three Ways of Storing and Accessing Lots of Images in Python

In this tutorial, you’ll cover three ways of storing and accessing lots of images in Python. You’ll also see experimental evidence for the performance benefits and drawbacks of each one.
REAL PYTHON

Support for Python 2 Ends in 2019, and It’s High Time for Developers to Take Action

Your weekly reminder :-)
REVYUH.COM

Docker Is Different: Configuring Gunicorn for Containers

This article covers preventing slowness due to worker heartbeats, configuring the number of workers, and logging to stdout.
PYTHONSPEED.COM

Come to Mexico for PyCon Latam 2019

Come join us in beautiful Puerto Vallarta in the first installment of this conference. With an all-inclusive ticket that covers food and lodging, you can’t miss this opportunity!
PYCON sponsor

Remote Development Using PyCharm

VUYISILE NDLOVU

Overview of Async IO in Python 3.7

STACKABUSE.COM

Python Project Tooling Explained

SIMONE ROBUTTI

Projects & Code

Stackless Python

“Stackless Python, or Stackless, is a Python programming language interpreter, so named because it avoids depending on the C call stack for its own stack. […] The most prominent feature of Stackless is microthreads, which avoid much of the overhead associated with usual operating system threads.”
WIKIPEDIA.ORG

DeleteFB: Selenium Script to Delete All of Your Facebook Wall Posts

GITHUB.COM/WESKERFOOT

pydebug: Decorators for Debugging Python

GITHUB.COM/BENMEZGER

PyMedPhys: Common, Core Python Package for Medical Physics

PYMEDPHYS.COM

Django-CRM: Sales Dashboard

DJANGO-CRM.READTHEDOCS.IO

Events

PyLadies Bratislava

Thursday, May 23
FACEBOOK.COM • Shared by Filipa Andrade

PyConWeb 2019

May 25 to May 27, 2019
PYCONWEB.COM

Python Toulouse Meetup

June 3 in Toulouse
MEETUP.COM • Shared by Thibault Ducret

PyLondinium 2019

June 14–16 in London, UK
PYLONDINIUM.ORG • Shared by Carlos Pereira Atencio

Dash Conference

July 16–17 in NYC
DASHCON.IO

Happy Pythoning!
This was PyCoder’s Weekly Issue #369.


Categories: FLOSS Project Planets

Duo Consulting: Drupal vs WordPress: Which is Right for You?

Planet Drupal - Tue, 2019-05-21 15:10

For website builders, the perennial debate between WordPress and Drupal rages on. As a Drupal-focused agency, it would be easy for us to promote Drupal’s benefits while badmouthing WordPress. Ultimately, though, that kind of thinking distracts from a more nuanced take on the debate: which CMS is best for you? While we’ve covered the comparisons between the two platforms before, it’s always worth revisiting the similarities and differences between them.

Part of the reason why the “WordPress vs Drupal” narrative persists is because there is no definitive “winner.” Drupal and WordPress are both great tools that we’d have no problem recommending. In fact, the two platforms have more in common than you might realize. Both WordPress and Drupal are free, open source content management systems with vast ecosystems of contributed plugins and modules. Both are also sustained by communities of users and developers who continue to make each platform successful.

Ultimately, the choice between WordPress and Drupal comes down to you and your site’s requirements. Both platforms come with advantages and disadvantages depending on the task at hand, so it really is a case-by-case basis. Instead of boiling the matter down to “Drupal vs. WordPress,” consider the following comparisons against your needs to determine which platform is the best fit for your project.

Ease vs Order

Imagine that you want to publish a new piece of content on the site. If you’re just trying to, say, publish a blog on your site as quickly as you can, it’s hard to beat WordPress. With its simple-to-use interface, WordPress streamlines the content management process and makes it easier for editors to swiftly publish or edit a basic story.

On the other hand, if you have content originating from multiple sources and you want to be able to publish across channels, consider the Drupal CMS. While slightly more difficult to master, the Drupal back end can handle varying data types and keep them organized. Essentially, if you are managing multiple sites or are publishing more complex content types, Drupal has the power to deliver a robust, seamless experience.

Model vs. Building Blocks

Consider a model kit. If you follow the directions and don’t deviate, you’ll end up with a sleek and stylish figure. WordPress is very much the same. Sites built using WordPress are specially optimized for easy posting and content creation. If your needs are contained and fit within the boundaries of what WordPress was designed to do, it’s a perfect out-of-the-box solution.

Adding custom features to a WordPress site, however, can be complicated. This is not the case with Drupal, which is more akin to building blocks than to a model. Much like a field of Lego bricks strewn on the floor, Drupal allows for so much customization that you may not even know where to start. Once you have a plan, though, a Drupal site can be configured to your exact specifications while still leaving room for changes later.

Solo vs Team

Because of its aforementioned ease-of-use, WordPress gives plenty of power to content creators. If you stick to OOTB functionality, you can manage an entire WordPress site on your own. Even the plugins and themes that you can add to a site can be updated with a click of a button, making routine maintenance easier.

Given its enterprise-level capabilities, Drupal is better suited to a site run by a team. Different roles with custom permissions can be assigned to different team members inside a Drupal site. These hierarchies can make it easier to collaborate on a site and ensure that there’s accountability throughout the development process.

Pages vs. Architecture

Even without any technical experience, a content creator could easily design a page on a WordPress site. The OOTB editing suite allows you to build and layout rich pages with text, images and other assets that you can quickly deploy and publish.

Though Drupal has taken strides to make their page layout builder more accessible, creating pages in Drupal takes some practice. What Drupal has going for it is its structure. Drupal offers various levels of tagging and taxonomy that allow you to organize and repurpose content in endless permutations. Further, you can create custom content types in Drupal, expanding the possibilities of what kinds of content you can publish.  

What these comparisons illustrate isn’t that one platform is better than the other. Rather, they show that each tool has its own strengths and weaknesses depending on the situation. And in the end, your mileage may vary; our team has seen enterprise sites that run on WordPress and run on Drupal. It’s all about what each user wants and needs.

Duo specializes in Drupal because we like working with the CMS’s flexibility at an enterprise scale. If you think Drupal is right for you or if you still need help deciding, please feel free to reach out to us!

Categories: FLOSS Project Planets

Jonathan Wiltshire: RC candidate of the day (1)

Planet Debian - Tue, 2019-05-21 14:39

Sometimes the list of release-critical bugs is overwhelming, and it’s hard to find something to tackle.

So I invite you to have a go at #928040, which may only be a case of reviewing and uploading the included patch.

Categories: FLOSS Project Planets

Acro Media: Drupal for Open Source Experience-Led Ecommerce

Planet Drupal - Tue, 2019-05-21 12:32

This is the first of a two-part series discussing a couple of different platforms that we at Acro Media endorse for our clients. First up, I’ll talk about Drupal, a popular open-source content management system, and how its excellent content capabilities can be extended using an ecommerce component of your choice. For companies that require experience-led commerce architecture solutions, Drupal as an integration friendly content engine is an ideal open source choice.

A quick introduction

People who follow our blog will already know about open source technology and Drupal because we’ve talked about them a lot. For those of you who don’t know, here’s a quick introduction.

Open Source

Wikipedia sums up open source software well.

Open-source software is a type of computer software in which source code is released under a license in which the copyright holder grants users the rights to study, change, and distribute the software to anyone and for any purpose. Open-source software may be developed in a collaborative public manner. Open-source software is a prominent example of open collaboration.

Open-source software development can bring in diverse perspectives beyond those of a single company. A 2008 report by the Standish Group stated that adoption of open-source software models have resulted in savings of about $60 billion (£48 billion) per year for consumers.

While that describes open source software as a whole, there are many advantages of open source specifically for content creation and ecommerce. No licensing fees brings the total cost of ownership down, businesses are fully in control of their data, and integrations with virtually any other system can be created. If you like, you can read more about the advantages of open source for ecommerce via this ebook.

Drupal

Drupal is a leading open source content management system that is known for being extremely customizable and ideal for creating rich content experiences. In a CMS world dominated by WordPress, Drupal is often overlooked because of its complexity and somewhat steep learning curve. Don’t let that stop you from considering it, however, as this complexity is actually one of Drupal’s greatest strengths and the learning curve is continuously improving through admin-focused UX initiatives.

The platform can literally be made to do anything and it shines when very specialized or unique functionality is required. It has a rich ecosystem of extensions and is very developer friendly, boasting a massive development community ensuring that businesses using Drupal always have support.

On top of this, Drupal has various strategic initiatives that will keep it modern and relevant now and into the future. One of the initiatives is for the platform to be fully API-first, meaning that a primary focus of Drupal is to be integration friendly. Developers can integrate Drupal with any other software that has an API available.

Drupal for experience-led commerce

Drupal is suited for any of the three main architectures (discover your ideal architecture here), but experience-led commerce is where it’s most capable. Experience-led is for businesses who keep the customer experience top of mind. These businesses want more than to just sell products, they want to also tell their story and foster a community around their brand and their customers. They want their customer experiences to be personalized and content rich. It’s these experiences that set them apart from their competitors, and they want the freedom to innovate in whatever way is best for their business.

More often than not, SaaS ecommerce platforms alone just don’t cut it here. This is simply because they’re built for ecommerce, not as an engine for other content. Although there are a lot of benefits to SaaS for ecommerce, businesses using SaaS must conform to the limitations set by the platform and its extensions. Robust content is just not typically possible. Sure, a business may be able to maintain a blog through their SaaS ecommerce platform, but that’s about it.

Drupal, on the other hand, is a content engine first. It was built for content, whatever that content may be. If you can dream it, Drupal can do. On top of this, Drupal, being integration friendly through its API-first initiative, allows businesses the freedom to integrate any compatible SaaS or open source ecommerce platform. At this point, a complete content & commerce solution has been created and the only limitation is your imagination and available resources to implement it. Implementation can be done in-house with an internal IT team or outsourced to one of the many service providers within the Drupal marketplace, Acro Media being one of them.

Let’s look at three widely different examples of Drupal based experience-led commerce.

TELUS Mobility

Website: www.telus.com

TELUS Mobility is one of Canada’s largest telecommunications companies. Imagine the missed opportunities when a customer’s online billing isn’t connected to your latest promotions and customer service can’t quickly or easily get this information in front of them. This was a problem that they faced, and business restrictions (one being that they need to own all of their code and data) required that they look outside of the SaaS marketplace for a solution. Drupal, combined with a Drupal-native Drupal Commerce extension, was the solution that they needed. The open source code base of both Drupal and the Commerce extension meant that TELUS Mobility had the control and ownership that they needed. The result was huge: many important customer and customer service UX improvements were made, which enabled TELUS Mobility to outperform their competitors.

You can read the full TELUS Mobility case study here.

Bug Out Bag Builder

Website: www.bugoutbagbuilder.com

Bug Out Bag Builder (BOBB) is a content-rich resource centered around preparedness. They generate a lot of different types of content and needed a way to do it that is easy and reusable. They also had a very unique commerce element that needed to tie in seamlessly. Here’s how we did it.

First is the content aspect. BOBB is full of content! They maintain an active blog, continuously write lengthy product reviews and provide their readers with various guides and tutorials. They’re a one-stop-shop for anything preparedness and have a ton of information to share. As you can see, a simple blog wouldn’t be sufficient enough for this business. They needed a way to create various types of content that can be shared and reused in multiple places. The Drupal CMS was easily able to accommodate. All of the content has a specific home within the site, but each article is categorized and searchable. Content can be featured on the homepage with the click of a button. Various blocks throughout the site show visitors the most recent content. Reviews can be attached to products within their online custom bug out bag builder application (more on this below). All of this is great, but what makes Drupal a fantastic content engine is that if BOBB ever needs to use this content in another way, all of the saved data can be reused and repurposed without needing to recreate the content. Just a little configuration and theming work would need to be done.

Second is the commerce aspect. BOBB is not a standard ecommerce store. At their core, they’re actually an Amazon Associate. They’ve developed a trust with their readers by providing fair and honest reviews of preparedness products that are listed on the Amazon marketplace. If a reader then goes and buys the product, BOBB gets a cut since they helped make the sale.

That’s all pretty normal, but what makes BOBB unique is that they also have a web-based Custom Bag Builder application. This tool has a number of pre-built “BOBB recommended” bag configurations for certain situations. Customers can select these bags (or start from scratch), view/add/remove any of the products, and finally complete the purchase. Since BOBB doesn’t need the full capabilities of ecommerce, it didn’t make sense for them to be paying monthly licensing fees. Drupal Commerce was selected for this purpose. It’s used as a catalog for holding the product information and creating a cart. Then, an integration between Drupal Commerce and Amazon transfers the cart information to Amazon where the customer ultimately goes through checkout. Amazon then handles all of the fulfillment and BOBB gets the commission.

BikeHike Adventures

Website: www.bikehike.com

BikeHike Adventures was founded as a way of bringing like-minded adventurers together through the unique style of world travel that they promote – activity, culture and experience. They provide curated travel packages that customers enquire about through the BikeHike Adventure website. Travel is all about experience and they needed to share this experience through their website. They also needed more than just a standard article page to do it since there is a ton of information to share about each package. Furthermore, they wanted to give customers a way to reserve a trip for pre-selected dates or through a custom trip planner. Again, Drupal was a perfect fit.

When you visit the site, you’re immediately thrown into the world of active travel through a rich video banner followed by a series of travel packages, a travel blog and more. There are lots of exciting locations and vibrant imagery throughout.

Clicking into a package, you’re again hit with spectacular photography and all of the information you would need to make a decision. You can read about the trip, view the itinerary and locations marked on a map, learn about what’s included and where you’ll be staying, read interesting and useful facts about the country and location, see a breakdown of day-to-day activities, read previous traveler reviews, and more. When a customer is ready to book, they can submit an enquiry which is then handed off to the BikeHike Adventures travel agents.

A commerce component isn’t actually being used in this site, but it’s just a great example of content and customer experience that is used to facilitate a booking with a travel agent. If BikeHike Adventures wanted to in the future, they are free to integrate the booking and payment platforms of their choice to automate some, if not all, of that aspect of this process. By utilizing the open source Drupal CMS, this is an option that they can exercise at any point in time.

Who is Drupal best suited for?

Drupal could be used for any business, but it’s typically best suited for ecommerce businesses:

  • Who want to differentiate their brand through personalized shopping experiences
  • Who want to showcase products outside of a standard product page
  • Who want the power to develop a content-rich experience AND have an industry standard checkout process
  • Who want to sell across multiple channels and third-party marketplaces
  • Who need to develop and execute cohesive and synchronized marketing campaigns across multiple channels
  • Who want the freedom to integrate and connect their CMS and commerce platform with other components within their overall architecture
  • Who want to limit platform fees and instead invest in their own commerce infrastructure

In closing, there’s a reason why the ecommerce market is open to open source more than ever. Businesses are increasingly seeing that open source provides a quality foundation on which to build and integrate the solutions they need for today’s new-age ecommerce. Customer experience is now seen as a competitive advantage and there are a handful of options that can provide this experience, Drupal being one of them. If you’re looking for experience-led ecommerce solutions, consider Drupal. It might just be what you need.

Additional resources

If you liked this article, check out these related resources.

Categories: FLOSS Project Planets

CodeGrades: Hello CodeGrades!

Planet Python - Tue, 2019-05-21 12:30

This is a blog about CodeGrades, an experiment to help folks learn about programming (initially in Python). We’ll use it to celebrate the successes, learn from the failures and reflect upon the feedback of participants. We’ll also share project news here too.

So, what are CodeGrades?

At a time when technology is finding its way into every aspect of our lives, many folks want to be more than just passive consumers of technology. They feel a desire to become creators of technology. They want to take control of their digital world. They want the skills to make their technology reflect their own needs.

This is where CodeGrades come in…

CodeGrades are a programming version of time-proven techniques like music grades or belts in martial arts. Learners level up by applying the knowledge and skills needed for each grade to their own fun, interesting and challenging coding projects. Learners present their projects to professional software developers who assess the projects against the criteria for the grade being taken and provide a set of marks and written feedback so the learner can see where they’re doing well, what needs to improve and what their next steps may be.

CodeGrades are eight cumulative steps for learning how to write code. The first grade is easy enough for most people to take as a first step into programming. The eighth grade is of equivalent standard to the skills and knowledge needed to be an effective junior professional software developer. The middle grades bridge the way so the skill gaps between each of the grades is achievable. They’re like stepping stones into coding, or perhaps a modern day Gradus ad Parnassum.

The syllabus for CodeGrades is written by professional software developers. The grades reflect current best practice found in the software industry. They offer a framework for sustained and structured long term learning to write code. All the resources associated with CodeGrades are free, learners only pay to take the grading. Grades will be competitively priced and will certainly not cost the many thousands needed to attend a code bootcamp.

Passing a grade is undeniable evidence that an expert programmer believes the learner has attained the level of competence, knowledge and skill for the grade taken. Nobody can take that achievement away. It’s something to be celebrated and gives learners the confidence and momentum to continue on their path to programming mastery.

The professional developers who assess the candidates in a grading (we call them code mentors because that sounds more friendly than examiners) are paid for their time at a level commensurate with that of a senior software engineer. We like to think this may be an alternative source of income for FLOSS developers who want to concentrate on their software projects rather than work in an office.

That’s it in a nutshell.

It’s early days but we have already successfully graduated a first cohort of learners through “grade 1 Python” (with better than expected outcomes). We have just started a second cohort of learners to test the new syllabus (more on that soon) and hope to engage with further test cohorts over the summer. Eventually we will open up our website so learners will be able to book and pay for grading. We expect this to happen by the end of 2019 at the latest.

There is still much to do! If you think you could support our work, or perhaps you have feedback or maybe want to get more involved, please don’t hesitate to get in touch via the email address at the bottom of this page.

Onwards and upwards.

Categories: FLOSS Project Planets

Trey Hunner: Python built-ins worth learning

Planet Python - Tue, 2019-05-21 11:40

In every Intro to Python class I teach, there’s always at least one “how can we be expected to know all this” question.

It’s usually along the lines of either:

  1. Python has so many functions in it, what’s the best way to remember all these?
  2. What’s the best way to learn the functions we’ll need day-to-day like enumerate and range?
  3. How do you know about all the ways to solve problems in Python? Do you memorize them?

There are dozens of built-in functions and classes, hundreds of tools bundled in Python’s standard library, and thousands of third-party libraries on PyPI. There’s no way anyone could ever memorize all of these things.

I recommend triaging your knowledge:

  1. Things I should memorize such that I know them well
  2. Things I should know about so I can look them up more effectively later
  3. Things I shouldn’t bother with at all until/unless I need them one day

We’re going to look through the Built-in Functions page in the Python documentation with this approach in mind.

This will be a very long article, so I’ve linked to 5 sub-sections and 20 specific built-in functions in the next section so you can jump ahead if you’re pressed for time or looking for one built-in in particular.

    Which built-ins should you know about?

    I estimate most Python developers will only ever need about 30 built-in functions, but which 30 depends on what you’re actually doing with Python.

    We’re going to take a look at all 69 of Python’s built-in functions, in a birds eye view sort of way.

    I’ll attempt to categorize these built-ins into five categories:

    1. Commonly known: most newer Pythonistas get exposure to these built-ins pretty quickly out of necessity
    2. Overlooked by beginners: these functions are useful to know about, but they’re easy to overlook when you’re newer to Python
    3. Learn it later: these built-ins are generally useful to know about, but you’ll find them when/if you need them
    4. Maybe learn it eventually: these can come in handy, but only in specific circumstances
    5. You likely don’t need these: you’re unlikely to need these unless you’re doing something fairly specialized

    The built-in functions in categories 1 and 2 are the essential built-ins that nearly all Python programmers should eventually learn about. The built-ins in categories 3 and 4 are the specialized built-ins, which are often very useful but your need for them will vary based on your use for Python. And category 5 are arcane built-ins, which might be very handy when you need them but which many Python programmers are likely to never need.

    Note for pedantic Pythonistas: I will be referring to all of these built-ins as functions, even though 27 of them aren’t actually functions (as discussed in my functions and callables article).

    The commonly known built-in functions (which you likely already know about):

    1. print
    2. len
    3. str
    4. int
    5. float
    6. list
    7. tuple
    8. dict
    9. set
    10. range

    The built-in functions which are often overlooked by newer Python programmers:

    1. sum
    2. enumerate
    3. zip
    4. bool
    5. reversed
    6. sorted
    7. min
    8. max
    9. any
    10. all

    There are also 5 commonly overlooked built-ins which I recommend knowing about solely because they make debugging easier: dir, vars, breakpoint, type, help.
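
    For example, type and dir tell you what you're holding and what you can do with it, vars dumps an object's attributes, help pulls up documentation, and breakpoint() (on Python 3.7+) drops you into the debugger. A small sketch (the Point class here is a made-up example):

    >>> numbers = [2, 1, 3]
    >>> type(numbers)
    <class 'list'>
    >>> dir(numbers)[-4:]
    ['pop', 'remove', 'reverse', 'sort']
    >>> class Point:
    ...     def __init__(self, x, y):
    ...         self.x, self.y = x, y
    ...
    >>> vars(Point(1, 2))
    {'x': 1, 'y': 2}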

    In addition to the 25 built-in functions above, we’ll also briefly see the other 44 built-ins in the learn it later, maybe learn it eventually, and you likely don’t need these sections.

    10 Commonly known built-in functions

    If you’ve been writing Python code, these built-ins are likely familiar already.

    print

    You already know the print function. Implementing hello world requires print.

    You may not know about the various keyword arguments accepted by print though:

    >>> words = ["Welcome", "to", "Python"]
    >>> print(words)
    ['Welcome', 'to', 'Python']
    >>> print(*words, end="!\n")
    Welcome to Python!
    >>> print(*words, sep="\n")
    Welcome
    to
    Python

    You can look up print on your own.

    len

    In Python, we don’t write things like my_list.length() or my_string.length; instead we strangely (for new Pythonistas at least) say len(my_list) and len(my_string).

    1 2 3 >>> words = ["Welcome", "to", "Python"] >>> len(words) 3

    Regardless of whether you like this operator-like len function, you’re stuck with it so you’ll need to get used to it.

    str

    Unlike many other programming languages, you cannot concatenate strings and numbers in Python.

    >>> version = 3
    >>> "Python " + version
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: can only concatenate str (not "int") to str

    Python refuses to coerce that 3 integer to a string, so we need to manually do it ourselves, using the built-in str function (class technically, but as I said, I’ll be calling these all functions):

    >>> version = 3
    >>> "Python " + str(version)
    'Python 3'

    int

    Do you have user input and need to convert it to a number? You need the int function!

    The int function can convert strings to integers:

    >>> program_name = "Python 3"
    >>> version_number = program_name.split()[-1]
    >>> int(version_number)
    3

    You can also use int to truncate a floating point number to an integer:

    >>> from math import sqrt
    >>> sqrt(28)
    5.291502622129181
    >>> int(sqrt(28))
    5

    Note that if you need to truncate while dividing, the // operator is likely more appropriate (though this works differently with negative numbers): int(3 / 2) == 3 // 2.
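    For example, here’s where truncation and flooring diverge:

    >>> int(-7 / 2)  # int truncates toward zero
    -3
    >>> -7 // 2      # floor division rounds toward negative infinity
    -4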

    float

    Is the string you’re converting to a number not actually an integer? Then you’ll want to use float instead of int for this conversion.

    >>> program_name = "Python 3"
    >>> version_number = program_name.split()[-1]
    >>> float(version_number)
    3.0
    >>> pi_digits = '3.141592653589793238462643383279502884197169399375'
    >>> len(pi_digits)
    50
    >>> float(pi_digits)
    3.141592653589793

    You can also use float to convert integers to floating point numbers.

    In Python 2, we used to use float to convert integers to floating point numbers to force float division instead of integer division. “Integer division” isn’t a thing anymore in Python 3 (unless you’re specifically using the // operator), so we don’t need float for that purpose anymore. So if you ever see float(x) / y in your Python 3 code, you can change that to just x / y.

    list

    Want to make a list out of some other iterable?

    The list function does that:

    >>> numbers = [2, 1, 3, 5, 8]
    >>> squares = (n**2 for n in numbers)
    >>> squares
    <generator object <genexpr> at 0x7fd52dbd5930>
    >>> list_of_squares = list(squares)
    >>> list_of_squares
    [4, 1, 9, 25, 64]

    If you know you’re working with a list, you could use the copy method to make a new copy of a list:

    >>> copy_of_squares = list_of_squares.copy()

    But if you don’t know what the iterable you’re working with is, the list function is the more general way to loop over an iterable and copy it:

    >>> copy_of_squares = list(list_of_squares)

    You could also use a list comprehension for this, but I wouldn’t recommend it.
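    (In case you’re curious, that not-recommended comprehension version would look like this:)

    >>> copy_of_squares = [n for n in list_of_squares]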

    Note that when you want to make an empty list, using the list literal syntax (those [] brackets) is recommended:

    >>> my_list = list()  # Don't do this
    >>> my_list = []  # Do this instead

    Using [] is considered more idiomatic since those square brackets ([]) actually look like a Python list.

    tuple

    The tuple function is pretty much just like the list function, except it makes tuples instead:

    >>> numbers = [2, 1, 3, 4, 7]
    >>> tuple(numbers)
    (2, 1, 3, 4, 7)

    If you need a tuple instead of a list (say, because you’re trying to make a hashable collection to use as a dictionary key), you’ll want to reach for tuple over list.
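    For example, here’s a sketch of tuples used as dictionary keys (the coordinates here are made up):

    >>> locations = {(0, 0): 'origin', (1, 2): 'destination'}
    >>> locations[(1, 2)]
    'destination'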

    dict

    The dict function makes a new dictionary.

    Like list and tuple, the dict function is equivalent to looping over an iterable of key-value pairs and making a dictionary from them.

    Given a list of two-item tuples:

    1 >>> color_counts = [('red', 2), ('green', 1), ('blue', 3), ('purple', 5)]

    This:

    >>> colors = {}
    >>> for color, n in color_counts:
    ...     colors[color] = n
    ...
    >>> colors
    {'red': 2, 'green': 1, 'blue': 3, 'purple': 5}

    Can instead be done with the dict function:

    >>> colors = dict(color_counts)
    >>> colors
    {'red': 2, 'green': 1, 'blue': 3, 'purple': 5}

    The dict function accepts two types of arguments:

    1. another dictionary (mapping is the generic term), in which case that dictionary will be copied
    2. a list of key-value tuples (more correctly, an iterable of two-item iterables), in which case a new dictionary will be constructed from these

    So this works as well:

    >>> colors
    {'red': 2, 'green': 1, 'blue': 3, 'purple': 5}
    >>> new_dictionary = dict(colors)
    >>> new_dictionary
    {'red': 2, 'green': 1, 'blue': 3, 'purple': 5}

    The dict function can also accept keyword arguments to make a dictionary with string-based keys:

    >>> person = dict(name='Trey Hunner', profession='Python Trainer')
    >>> person
    {'name': 'Trey Hunner', 'profession': 'Python Trainer'}

    But I very much prefer to use a dictionary literal instead:

    >>> person = {'name': 'Trey Hunner', 'profession': 'Python Trainer'}
    >>> person
    {'name': 'Trey Hunner', 'profession': 'Python Trainer'}

    The dictionary literal syntax is more flexible and a bit faster but most importantly I find that it more clearly conveys the fact that we are creating a dictionary.

    Like with list and tuple, an empty dictionary should be made using the literal syntax as well:

    >>> my_dict = dict()  # Don't do this
    >>> my_dict = {}  # Do this instead

    Using {} is slightly more CPU efficient, but more importantly it’s more idiomatic: it’s common to see curly braces ({}) used for making dictionaries but dict is seen much less frequently.

    set

    The set function makes a new set. It takes an iterable of hashable values (strings, numbers, or other immutable types) and returns a set:

    >>> numbers = [1, 1, 2, 3, 5, 8]
    >>> set(numbers)
    {1, 2, 3, 5, 8}

    There’s no way to make an empty set with the {} set literal syntax (plain {} makes a dictionary), so the set function is the only way to make an empty set:

    >>> numbers = set()
    >>> numbers
    set()

    Actually that’s a lie because we have this:

    >>> {*()}  # This makes an empty set
    set()

    But that syntax is confusing (it relies on a lesser-used feature of the * operator), so I don’t recommend it.

    range

    The range function gives us a range object, which represents a range of numbers:

    >>> range(10_000)
    range(0, 10000)
    >>> range(-1_000_000_000, 1_000_000_000)
    range(-1000000000, 1000000000)

    The resulting range of numbers includes the start number but excludes the stop number (range(0, 10) does not include 10).

    The range function is useful when you’d like to loop over numbers.

    >>> for n in range(0, 50, 10):
    ...     print(n)
    ...
    0
    10
    20
    30
    40

    A common use case is to do an operation n times (that’s a list comprehension by the way):

    first_five = [get_things() for _ in range(5)]

    Python 2’s range function returned a list, which means the expressions above would make very very large lists. Python 3’s range works like Python 2’s xrange (though they’re a bit different) in that numbers are computed lazily as we loop over these range objects.

    Built-ins overlooked by new Pythonistas

    If you’ve been programming Python for a bit or if you just taken an introduction to Python class, you probably already knew about the built-in functions above.

    I’d now like to show off 15 built-in functions that are very handy to know about, but are more frequently overlooked by new Pythonistas.

    The first 10 of these functions you’ll find floating around in Python code, but the last 5 you’ll most often use while debugging.

    bool

    The bool function checks the truthiness of a Python object.

    For numbers, truthiness is a question of non-zeroness:

    >>> bool(5)
    True
    >>> bool(-1)
    True
    >>> bool(0)
    False

    For collections, truthiness is usually a question of non-emptiness (whether the collection has a length greater than 0):

    >>> bool('hello')
    True
    >>> bool('')
    False
    >>> bool(['a'])
    True
    >>> bool([])
    False
    >>> bool({})
    False
    >>> bool({1: 1, 2: 4, 3: 9})
    True
    >>> bool(range(5))
    True
    >>> bool(range(0))
    False
    >>> bool(None)
    False

    Truthiness (called truth value testing in the docs) is kind of a big deal in Python.

    Instead of asking questions about the length of a container, many Pythonistas ask questions about truthiness instead:

    # Instead of doing this
    if len(numbers) == 0:
        print("The numbers list is empty")

    # Many of us do this
    if not numbers:
        print("The numbers list is empty")

    You likely won’t see bool used often, but on the occasion that you need to coerce a value to a boolean to ask about its truthiness, you’ll want to know about bool.

    enumerate

    Whenever you need to count upward, one number at a time, while looping over an iterable at the same time, the enumerate function will come in handy.

    That might seem like a very niche task, but it comes up quite often.

    For example we might want to keep track of the line number in a file:

    >>> with open('hello.txt', mode='rt') as my_file:
    ...     for n, line in enumerate(my_file, start=1):
    ...         print(f"{n:03}", line)
    ...
    001 This is the first line of the file
    002 This is the second line
    003 This is the last line of the file

    The enumerate function is also very commonly used to keep track of the index of items in a sequence.

    def palindromic(sequence):
        """Return True if the sequence is the same thing in reverse."""
        for i, item in enumerate(sequence):
            if item != sequence[-(i+1)]:
                return False
        return True

    Note that you may see newer Pythonistas use range(len(sequence)) in Python. If you ever see code with range(len(...)), you’ll almost always want to use enumerate instead.

    def palindromic(sequence):
        """Return True if the sequence is the same thing in reverse."""
        for i in range(len(sequence)):
            if sequence[i] != sequence[-(i+1)]:
                return False
        return True

    If enumerate is news to you (or if you often use range(len(...))), see my article on looping with indexes in Python.

    zip

    The zip function is even more specialized than enumerate.

    The zip function is used for looping over multiple iterables at the same time. We actually used it above in the explanations of list and dict.

    >>> one_iterable = [2, 1, 3, 4, 7, 11]
    >>> another_iterable = ['P', 'y', 't', 'h', 'o', 'n']
    >>> for n, letter in zip(one_iterable, another_iterable):
    ...     print(letter, n)
    ...
    P 2
    y 1
    t 3
    h 4
    o 7
    n 11

    If you ever have to loop over two lists (or any other iterables) at the same time, zip is preferred over enumerate. The enumerate function is handy when you need indexes while looping, but zip is great when we care specifically about looping over two iterables at once.

    If you’re new to zip, I also talk about it in my looping with indexes article.

    Both enumerate and zip return iterators to us. Iterators are the lazy iterables that power for loops. I have a whole talk on iterators as well as a somewhat advanced article on how to make your own iterators.

    By the way, if you need to use zip on iterables of different lengths, you may want to look up itertools.zip_longest in the Python standard library.
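    For instance, here’s a quick sketch of zip_longest filling in the gaps:

    >>> from itertools import zip_longest
    >>> list(zip_longest([1, 2, 3], ['a'], fillvalue=None))
    [(1, 'a'), (2, None), (3, None)]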

    reversed

    The reversed function, like enumerate and zip, returns an iterator.

    >>> numbers = [2, 1, 3, 4, 7]
    >>> reversed(numbers)
    <list_reverseiterator object at 0x7f3d4452f8d0>

    The only thing we can do with this iterator is loop over it (but only once):

    >>> reversed_numbers = reversed(numbers)
    >>> list(reversed_numbers)
    [7, 4, 3, 1, 2]
    >>> list(reversed_numbers)
    []

    Like enumerate and zip, reversed is a sort of looping helper function. You’ll pretty much see reversed used exclusively in the for part of a for loop:

    >>> for n in reversed(numbers):
    ...     print(n)
    ...
    7
    4
    3
    1
    2

    There are some other ways to reverse Python lists besides the reversed function:

    # Slicing syntax
    for n in numbers[::-1]:
        print(n)

    # In-place reverse method
    numbers.reverse()
    for n in numbers:
        print(n)

    But the reversed function is usually the best way to reverse any iterable in Python.

    Unlike the list reverse method (e.g. numbers.reverse()), reversed doesn’t mutate the list (it returns an iterator of the reversed items instead).

    Unlike the numbers[::-1] slice syntax, reversed(numbers) doesn’t build up a whole new list: the lazy iterator it returns retrieves the next item in reverse as we loop. Also reversed(numbers) is a lot more readable than numbers[::-1] (which just looks weird if you’ve never seen that particular use of slicing before).

    If we combine the non-copying nature of the reversed and zip functions, we can rewrite the palindromic function (from enumerate above) without taking any extra memory (no copying of lists is done here):

    def palindromic(sequence):
        """Return True if the sequence is the same thing in reverse."""
        for n, m in zip(sequence, reversed(sequence)):
            if n != m:
                return False
        return True

    sum

    The sum function takes an iterable of numbers and returns the sum of those numbers.

    >>> sum([2, 1, 3, 4, 7])
    17

    There’s not much more to it than that.

    Python has lots of helper functions that do the looping for you, partly because they pair nicely with generator expressions:

    >>> numbers = [2, 1, 3, 4, 7, 11, 18]
    >>> sum(n**2 for n in numbers)
    524

    If you’re curious about generator expressions, I discuss them in my Comprehensible Comprehensions talk (and my 3 hour tutorial on comprehensions and generator expressions).

    min and max

    The min and max functions do what you’d expect: they give you the minimum and maximum items in an iterable.

    >>> numbers = [2, 1, 3, 4, 7, 11, 18]
    >>> min(numbers)
    1
    >>> max(numbers)
    18

    The min and max functions compare the items given to them by using the < operator. So all values need to be orderable and comparable to each other (fortunately many objects are orderable in Python).

    The min and max functions also accept a key function to allow customizing what “minimum” and “maximum” really mean for specific objects.
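    For example, finding the shortest and longest words in a list:

    >>> words = ["python", "is", "lovely"]
    >>> min(words, key=len)
    'is'
    >>> max(words, key=len)
    'python'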

    sorted

    The sorted function takes any iterable and returns a new list of all the values in that iterable in sorted order.

    >>> numbers = [1, 8, 2, 13, 5, 3, 1]
    >>> words = ["python", "is", "lovely"]
    >>> sorted(words)
    ['is', 'lovely', 'python']
    >>> sorted(numbers, reverse=True)
    [13, 8, 5, 3, 2, 1, 1]

    The sorted function, like min and max, compares the items given to it by using the < operator, so all values given to it need to be orderable.

    The sorted function also allows customization of its sorting via a key function (just like min and max).
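    For example, sorting those same words by length (ties keep their original order because Python’s sort is stable):

    >>> sorted(words, key=len)
    ['is', 'python', 'lovely']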

    By the way, if you’re curious about sorted versus the list.sort method, Florian Dahlitz wrote an article comparing the two.

    any and all

    The any and all functions can be paired with a generator expression to determine whether any or all items in an iterable match a given condition.

    Our palindromic function from earlier checked whether all items were equal to their corresponding item in the reversed sequence (is the first value equal to the last, second to the second from last, etc.).

    We could rewrite palindromic using all like this:

    def palindromic(sequence):
        """Return True if the sequence is the same thing in reverse."""
        return all(
            n == m
            for n, m in zip(sequence, reversed(sequence))
        )

    Negating the condition and the return value from all would allow us to use any equivalently (though this is more confusing in this example):

    def palindromic(sequence):
        """Return True if the sequence is the same thing in reverse."""
        return not any(
            n != m
            for n, m in zip(sequence, reversed(sequence))
        )

    If the any and all functions are new to you, you may want to read my article on them: Checking Whether All Items Match a Condition in Python.

    The 5 debugging functions

    The following 5 functions will be useful for debugging and troubleshooting code.

    breakpoint

    Need to pause the execution of your code and drop into a Python command prompt? You need breakpoint!

    Calling the breakpoint function will drop you into pdb, the Python debugger. There are many tutorials and talks out there on PDB: here’s a short one and here’s a long one.

    This built-in function was added in Python 3.7, but if you’re on older versions of Python you can get the same behavior with import pdb ; pdb.set_trace().
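    For instance, here’s a made-up function with a breakpoint call in it:

    def divide(x, y):
        breakpoint()  # execution pauses here and drops into pdb
        return x / y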

    dir

    The dir function can be used for two things:

    1. Seeing a list of all your local variables
    2. Seeing a list of all attributes on a particular object

    Here we can see our local variables right after starting a new Python shell, and then after creating a new variable x:

    >>> dir()
    ['__annotations__', '__doc__', '__name__', '__package__']
    >>> x = [1, 2, 3, 4]
    >>> dir()
    ['__annotations__', '__doc__', '__name__', '__package__', 'x']

    If we pass that x list into dir we can see all the attributes it has:

    >>> dir(x)
    ['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']

    We can see the typical list methods, append, pop, remove, and more as well as many dunder methods for operator overloading.

    vars

    The vars function is sort of a mashup of two related things: checking locals() and accessing the __dict__ attribute of objects.

    When vars is called with no arguments, it’s equivalent to calling the locals() built-in function (which shows a dictionary of all local variables and their values).

    >>> vars()
    {'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <class '_frozen_importlib.BuiltinImporter'>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>}

    When it’s called with an argument, it accesses the __dict__ attribute on that object (which on many objects represents a dictionary of all instance attributes).

    >>> from itertools import chain
    >>> vars(chain)
    mappingproxy({'__getattribute__': <slot wrapper '__getattribute__' of 'itertools.chain' objects>, '__iter__': <slot wrapper '__iter__' of 'itertools.chain' objects>, '__next__': <slot wrapper '__next__' of 'itertools.chain' objects>, '__new__': <built-in method __new__ of type object at 0x5611ee76fac0>, 'from_iterable': <method 'from_iterable' of 'itertools.chain' objects>, '__reduce__': <method '__reduce__' of 'itertools.chain' objects>, '__setstate__': <method '__setstate__' of 'itertools.chain' objects>, '__doc__': 'chain(*iterables) --> chain object\n\nReturn a chain object whose .__next__() method returns elements from the\nfirst iterable until it is exhausted, then elements from the next\niterable, until all of the iterables are exhausted.'})

    If you ever try to use my_object.__dict__, you can use vars instead.

    I usually reach for dir just before using vars.

    type

    The type function will tell you the type of the object you pass to it.

    The type of a class instance is the class itself:

    >>> x = [1, 2, 3]
    >>> type(x)
    <class 'list'>

    The type of a class is its metaclass, which is usually type:

    >>> type(list)
    <class 'type'>
    >>> type(type(x))
    <class 'type'>

    If you ever see someone reach for __class__, know that they could reach for the higher-level type function instead:

    >>> x.__class__
    <class 'list'>
    >>> type(x)
    <class 'list'>

    The type function is sometimes helpful in actual code (especially object-oriented code with inheritance and custom string representations), but it’s also useful when debugging.

    Note that when type checking, the isinstance function is usually used instead of type (also note that we tend not to type check in Python because we prefer to practice duck typing).

    help

    If you’re in an interactive Python shell (the Python REPL as I usually call it), maybe debugging code using breakpoint, and you’d like to know how a certain object, method, or attribute works, the help function will come in handy.

    Realistically, you’ll likely resort to getting help from your favorite search engine more often than using help. But if you’re already in a Python REPL, it’s quicker to call help(list.insert) than it would be to look up the list.insert method documentation in Google.

    Learn it later

    There are quite a few built-in functions you’ll likely want eventually, but you may not need right now.

    I’m going to mention 14 more built-in functions which are handy to know about, but not worth learning until you actually need to use them.

    open

    Need to open a file in Python? You need the open function!

    Don’t work with files directly? Then you likely don’t need the open function!

    You might think it’s odd that I’ve put open in this section because working with files is so common. While most programmers will read or write to files using open at some point, some Python programmers, such as Django developers, may not use the open function very much (if at all).

    Once you need to work with files, you’ll learn about open. Until then, don’t worry about it.

    By the way, you might want to look into pathlib (which is in the Python standard library) as an alternative to using open. I love the pathlib module so much I’ve considered teaching files in Python by mentioning pathlib first and the built-in open function later.
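    For a taste, here’s a minimal sketch of reading a file with pathlib (assuming a hello.txt file exists in the current directory):

    from pathlib import Path

    contents = Path("hello.txt").read_text()  # read the whole file as a string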

    input

    The input function prompts the user for input, waits for them to hit the Enter key, and then returns the text they typed.
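    For example (the prompt and the response typed here are made up):

    >>> name = input("What's your name? ")
    What's your name? Trey
    >>> name
    'Trey'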

    Reading from standard input (which is what the input function does) is one way to get inputs into your Python program, but there are so many other ways too! You could accept command-line arguments, read from a configuration file, read from a database, and much more.

    You’ll learn this once you need to prompt the user of a command-line program for input. Until then, you won’t need it. And if you’ve been writing Python for a while and don’t know about this function, you may simply never need it.

    repr

    Need the programmer-readable representation of an object? You need the repr function!

    For many objects, the str and repr representations are the same:

    >>> str(4), repr(4)
    ('4', '4')
    >>> str([]), repr([])
    ('[]', '[]')

    But for some objects, they’re different:

    >>> str('hello'), repr("hello")
    ('hello', "'hello'")
    >>> from datetime import date
    >>> str(date(2020, 1, 1)), repr(date(2020, 1, 1))
    ('2020-01-01', 'datetime.date(2020, 1, 1)')

    The string representation we see at the Python REPL uses repr, while the print function relies on str:

    >>> date(2020, 1, 1)
    datetime.date(2020, 1, 1)
    >>> "hello!"
    'hello!'
    >>> print(date(2020, 1, 1))
    2020-01-01
    >>> print("hello!")
    hello!

    You’ll see repr used when logging, handling exceptions, and implementing dunder methods.

    super

    If you create classes in Python, you’ll likely need to use super. The super function is pretty much essential whenever you’re inheriting from another Python class.

    Many Python users rarely create classes. Creating classes isn’t an essential part of Python, though many types of programming require it. For example, you can’t really use the Django web framework without creating classes.

    If you don’t already know about super, you’ll end up learning this if and when you need it.

    property

    The property function is a decorator and a descriptor (only click those weird terms if you’re extra curious) and it’ll likely seem somewhat magical when you first learn about it.

    This decorator allows us to create an attribute which will always seem to contain the return value of a particular function call. It’s easiest to understand with an example.

    Here’s a class that uses property:

    class Circle:

        def __init__(self, radius=1):
            self.radius = radius

        @property
        def diameter(self):
            return self.radius * 2

    Here’s an access of that diameter attribute on a Circle object:

    >>> circle = Circle()
    >>> circle.diameter
    2
    >>> circle.radius = 5
    >>> circle.diameter
    10

    If you’re doing object-oriented Python programming (you’re making classes a whole bunch), you’ll likely want to learn about property at some point. Unlike other object-orient programming languages, we use properties instead of getter methods and setter methods.

    issubclass and isinstance

    The issubclass function checks whether a class is a subclass of one or more other classes.

    >>> issubclass(int, bool)
    False
    >>> issubclass(bool, int)
    True
    >>> issubclass(bool, object)
    True

    The isinstance function checks whether an object is an instance of one or more classes.

    >>> isinstance(True, str)
    False
    >>> isinstance(True, bool)
    True
    >>> isinstance(True, int)
    True
    >>> isinstance(True, object)
    True

    You can think of isinstance as delegating to issubclass:

    >>> issubclass(type(True), str)
    False
    >>> issubclass(type(True), bool)
    True
    >>> issubclass(type(True), int)
    True
    >>> issubclass(type(True), object)
    True

    If you’re overloading operators (e.g. customizing what the + operator does on your class) you might need to use isinstance, but in general we try to avoid strong type checking in Python so we don’t see these much.

    In Python we usually prefer duck typing over type checking. These functions actually do a bit more than the strong type checking I noted above (the behavior of both can be customized) so it’s actually possible to practice a sort of isinstance-powered duck typing with abstract base classes like collections.abc.Iterable. But this isn’t seen much either (partly because we tend to practice exception-handling and EAFP a bit more than condition-checking and LBYL in Python).

    The last two paragraphs were filled with confusing jargon that I may explain more thoroughly in a future series of articles if there’s enough interest.
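    But if you’re extra curious, here’s a quick taste of that isinstance-powered duck typing:

    >>> from collections.abc import Iterable
    >>> isinstance([1, 2, 3], Iterable)
    True
    >>> isinstance(4, Iterable)
    False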

    hasattr, getattr, setattr, and delattr

    Need to work with an attribute on an object but the attribute name is dynamic? You need hasattr, getattr, setattr, and delattr.

    Say we have some thing object we want to check for a particular value on:

    >>> class Thing: pass
    ...
    >>> thing = Thing()

    The hasattr function allows us to check whether the object has a certain attribute:

    >>> hasattr(thing, 'x')
    False
    >>> thing.x = 4
    >>> hasattr(thing, 'x')
    True

    The getattr function allows us to retrieve the value of that attribute:

    >>> getattr(thing, 'x')
    4

    The setattr function allows for setting the value:

    >>> setattr(thing, 'x', 5)
    >>> thing.x
    5

    And delattr deletes the attribute:

    >>> delattr(thing, 'x')
    >>> thing.x
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'Thing' object has no attribute 'x'

    These functions allow for a specific flavor of metaprogramming and you likely won’t see them often.

    classmethod and staticmethod

    The classmethod and staticmethod decorators are somewhat magical in the same way the property decorator is somewhat magical.

    If you have a method that should be callable on either an instance or a class, you want the classmethod decorator. Factory methods (alternative constructors) are a common use case for this:

    class RomanNumeral:
        """A Roman numeral, represented as a string and numerically."""

        def __init__(self, number):
            self.value = number

        @classmethod
        def from_string(cls, string):
            return cls(roman_to_int(string))  # function doesn't exist yet

    It’s a bit harder to come up with a good use for staticmethod, since you can pretty much always use a module-level function instead of a static method.

    from itertools import zip_longest  # needed for the pairwise loop below

    class RomanNumeral:
        """A Roman numeral, represented as a string and numerically."""

        SYMBOLS = {'M': 1000, 'D': 500, 'C': 100, 'L': 50, 'X': 10, 'V': 5, 'I': 1}

        def __init__(self, number):
            self.value = number

        @classmethod
        def from_string(cls, string):
            return cls(cls.roman_to_int(string))

        @staticmethod
        def roman_to_int(numeral):
            total = 0
            for symbol, next_symbol in zip_longest(numeral, numeral[1:]):
                value = RomanNumeral.SYMBOLS[symbol]
                next_value = RomanNumeral.SYMBOLS.get(next_symbol, 0)
                if value < next_value:
                    value = -value
                total += value
            return total

    The above roman_to_int function doesn’t require access to the instance or the class, so it doesn’t even need to be a @classmethod. There’s no actual need to make this function a staticmethod (instead of a classmethod): staticmethod is just more restrictive to signal the fact that we’re not reliant on the class our function lives on.

    I find that learning these causes folks to think they need them when they often don’t. You can go looking for these if you really need them eventually.

    next

    The next function returns the next item in an iterator.

    I’ve written about iterators before (how for loops work and how to make an iterator) but a very quick summary of iterators you’ll likely run into includes:

    • enumerate objects
    • zip objects
    • the return value of the reversed function
    • files (the thing you get back from the open function)
    • csv.reader objects
    • generator expressions
    • generator functions

    You can think of next as a way to manually loop over an iterator to get a single item and then break.

    >>> numbers = [2, 1, 3, 4, 7, 11]
    >>> squares = (n**2 for n in numbers)
    >>> next(squares)
    4
    >>> for n in squares:
    ...     break
    ...
    >>> n
    1
    >>> next(squares)
    9

    Maybe learn it eventually

    We’ve already covered nearly half of the built-in functions.

    The rest of Python’s built-in functions definitely aren’t useless, but they’re a bit more special-purposed.

    The 15 built-ins I’m mentioning in this section are things you may eventually need to learn, but it’s also very possible you’ll never reach for these in your own code.

    • iter: get an iterator from an iterable: this function powers for loops and it can be very useful when you’re making helper functions for looping lazily
    • callable: return True if the argument is a callable (I talked about this a bit in my article functions and callables)
    • filter and map: as I discuss in my article on overusing lambda functions, I recommend using generator expressions over the built-in map and filter functions
    • id, locals, and globals: these are great tools for teaching Python and you may have already seen them, but you won’t see these much in real Python code
    • round: you’ll look this up if you need to round a number
    • divmod: this function does a floor division (//) and a modulo operation (%) at the same time
    • bin, oct, and hex: if you need to display a number as a string in binary, octal, or hexadecimal form, you’ll want these functions
    • abs: when you need the absolute value of a number, you’ll look this up
    • hash: dictionaries and sets rely on the hash function to test for hashability, but you likely won’t need it unless you’re implementing a clever de-duplication algorithm
    • object: this function (yes it’s a class) is useful for making unique default values and sentinel values, if you ever need those

    You’re unlikely to need all the above built-ins, but if you write Python code for long enough you’re likely to see nearly all of them.

    You likely don’t need these

    You’re unlikely to need these built-ins. There are sometimes really appropriate uses for a few of these, but you’ll likely be able to get away with never learning about these.

    • ord and chr: these are fun for teaching ASCII tables and unicode code points, but I’ve never really found a use for them in my own code
    • exec and eval: for evaluating a string as if it was code
    • compile: this is related to exec and eval
    • slice: if you’re implementing __getitem__ to make a custom sequence, you may need this (some Python Morsels exercises require this actually), but unless you make your own custom sequence you’ll likely never see slice
    • bytes, bytearray, and memoryview: if you’re working with bytes often, you’ll reach for some of these (just ignore them until then)
    • ascii: like repr but returns an ASCII-only representation of an object; I haven’t needed this in my code yet
    • frozenset: like set, but it’s immutable; neat but not something I’ve reached for in my own code
    • __import__: this function isn’t really meant to be used by you, use importlib instead
    • format: this calls the __format__ method, which is used for string formatting (f-strings and str.format); you usually don’t need to call this function directly
    • pow: the exponentiation operator (**) usually supplants this… unless you’re doing modulo-math (maybe you’re implementing RSA encryption from scratch…?)
    • complex: if you didn’t know that 4j+3 is valid Python code, you likely don’t need the complex function
    There’s always more to learn

    There are 69 built-in functions in Python (technically only 42 of them are actually functions).

    When you’re newer in your Python journey, I recommend focusing on only 20 of these built-in functions in your own code (the 10 commonly known built-ins and the 10 built-ins that are often overlooked), in addition to the 5 debugging functions.

    After that there are 14 more built-ins which you’ll probably learn later (depending on the style of programming you do).

    Then come the 15 built-ins which you may or may not ever end up needing in your own code. Some people love these built-ins and some people never use them: as you get more specific in your coding needs, you’ll likely find yourself reaching for considerably more niche tools.

    After that I mentioned the last 15 built-ins which you’ll likely never need (again, very much depending on how you use Python).

    You don’t need to learn all the Python built-in functions today. Take it slow: focus on those first 20 important built-ins and then work your way into learning about others if and when you eventually need them.

    Categories: FLOSS Project Planets

    Python Software Foundation: Petr Viktorin: Extension Modules And Subinterpreters

    Planet Python - Tue, 2019-05-21 11:19
    When a Python subinterpreter loads an extension module written in C, it tends to unwittingly share state with other subinterpreters that have loaded the same module, unless that module is written very carefully. Petr Viktorin addressed the Python Language Summit to describe the problem in detail and propose a cleaner isolation of subinterpreters.

    Read more 2019 Python Language Summit coverage.

    Python-Based Libraries Use Subinterpreters For Isolation
    Python can run several interpreter instances in a single process, keeping each subinterpreter relatively isolated from the others. There are two ways this feature could be used in the future, but both require improvements to Python. First, Python could achieve parallelism by giving each subinterpreter its own Global Interpreter Lock (GIL) and passing messages between them; Eric Snow has proposed this use of subinterpreters in PEP 554.

    Another scenario is when libraries happen to use Python as part of their implementation. Viktorin described, for example, a simulation library that uses Python and NumPy internally, or a chat library that uses Python and asyncio. It should be possible for one application to load multiple libraries such as this, each of which uses a Python interpreter, without cross-contamination. This use case was the subject of Viktorin’s presentation. The problem, he said, is that “CPython is not ready for this,” because it does not properly manage global state.

    There Are Many Kinds Of Global State
    Viktorin described a hierarchy, or perhaps a tree, of kinds of global state in an interpreter.

    Process state: For example, open file descriptors.

    Runtime state: The Python memory allocator’s data structures, and the GIL (until PEP 554).

    Interpreter state: The contents of the "builtins" module and the dict of all imported modules.

    Thread state: Thread locals like asyncio’s current event loop; fortunately this is per-interpreter.

    Context state: Implicit state such as decimal.context.

    Module state: Python variables declared at file scope or with the “global” keyword, which in fact creates module-local state.


    Module State Behaves Surprisingly
    With a series of examples, Viktorin demonstrated the subtle behavior of module-level state.

    To begin with a non-surprising example, a pure-Python module’s state is recreated by re-importing it:

    import sys
    import enum

    old_enum = enum
    del sys.modules['enum']
    import enum
    old_enum == enum  # False
    But surprisingly, a C extension module only appears to be recreated when it is re-imported:

    import sys
    import _sqlite3

    old_sqlite3 = _sqlite3
    del sys.modules['_sqlite3']
    import _sqlite3
    old_sqlite3 == _sqlite3  # False
    The last line seems to show that the two modules are distinct, but as Viktorin said, “This is a lie.” The module’s initialization is not re-run, and the contents of the two modules are shared:

    old_sqlite3.Error is _sqlite3.Error # True

    It is far too easy to contaminate other subinterpreters with these shared contents; in effect, a C extension’s module state is process-global state.

    Modules Must Be Rewritten Thoughtfully
    C extensions written in the new style avoid this problem with subinterpreters. Not all C extensions in the standard library are updated yet; Christian Heimes commented that the ssl module must be ported to the new style of initialization. Although it is simple to find modules that must be ported, the actual porting requires thought. Coders must meticulously distinguish among different kinds of global state. C static variables are process globals, PyState_FindModule returns an interpreter-global reference to a module, and PyModule_GetState returns module-local state. Each nugget of module data must be deliberately placed at one of the levels in the hierarchy.

    As an example of how tricky this is, Viktorin pointed out a bug in the csv module. If it is imported twice, exception-handling breaks:

    import sys
    import _csv

    old_csv = _csv
    del sys.modules['_csv']
    import _csv

    try:
        # Pass an invalid list to reader(): its items should be strings, not 1.
        list(old_csv.reader([1]))
    except old_csv.Error:
        # The except clause should catch the error but doesn't.
        pass
    The old_csv.reader function ought to raise an instance of old_csv.Error, which would match the except clause. In fact, the csv module has a bug. When it is re-imported it overwrites interpreter-level state, including the _csv.Error type, instead of keeping its state at the module-local level.

    Audience members agreed this was a bug, but Viktorin insists that this particular bug is merely a symptom of a larger problem: it is too hard to write properly isolated extension modules. Viktorin and three coauthors have proposed PEP 573 to ease this problem, with special attention to exception types.

    Viktorin advised all module authors to keep state at the module level. He recognized that this is not always possible: for example, the Python standard library’s readline module wraps the C readline library, which has global hooks. These are necessarily process-global state. He asked the audience, how should this scenario be handled? Should readline error if it is imported in more than one subinterpreter? He said, “There’s some thinking to do.” In any case, CPython needs a good default.

    The correct way to code a C extension is to use module-local state, and that should be the most obvious place to store state from C. It seems to Viktorin that the newest style APIs do emphasize module-local state as he desires, but they are not yet well-known.

    Further reading:

    PEP 384 (3.2): Defining a Stable ABI

    PEP 489 (3.5): Multi-phase extension module initialization

    PEP 554 (3.9): Multiple Interpreters in the Stdlib

    PEP 573 (3.9): Module State Access from C Extension Methods

    Not a PEP yet: CPython C API Design Guidelines (layers & rings)
    Categories: FLOSS Project Planets

    Stack Abuse: Python: Append Contents to a File

    Planet Python - Tue, 2019-05-21 10:20

    In this article, we'll examine how to append content to an existing file using Python.

    Let's say we have a file called helloworld.txt containing the text "Hello world!" and it is sitting in our current working directory on a Unix file system:

    $ cat ./helloworld.txt
    Hello world!

    Now assume we want to append the additional text "It's good to have been born!" to the end of this file from a Python program.

    The first step is to obtain a reference to the file from our program. This can be done with the built-in open method, using the file path/name as the first argument and the mode as the second argument, as follows:

    f = open("./helloworld.txt", "a")

    The variable f now holds a reference to a file object that we can use to write to the end of the file. If the file doesn't already exist, it will be created. Note that the second argument "a" specifies the mode to open the file with, in this case append mode. This sets the writing position to the end of the file.

    If we had used the "w" (Write mode), then anything we write to the file will start at the very beginning and overwrite the existing content.

    Now we can write content to the file like this:

    f.write("\nIt's good to have been born!") f.close()

    Remember to call the close method after writing to files so that it doesn't remain locked after the program exits and to ensure that any buffered content in memory gets written to the file.

    Here is what the file looks like after we append to it:

    $ cat ./helloworld.txt
    Hello world!
    It's good to have been born!
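    As an aside, a with block (not used in this walkthrough) will close the file for us automatically, even if an error occurs partway through. A minimal sketch of the same kind of append (the text written here is arbitrary):

    with open("./helloworld.txt", "a") as f:
        f.write("\nAnother line")
    # the file is closed automatically when the block exits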

    One final note: if we add a "+" to the mode argument of the open method, we can open the file for both appending and reading. Without the "+", an IOError exception will occur if we try to read from the file. By default both reading and writing will occur at the end of the file, but this can be changed at any time using the seek method.

    Here are the commands to achieve this (note that we use the flush method to ensure the new content is written to the file before we try to read it back):

    f = open("./helloworld.txt", "a+") f.write("I am grateful.") f.flush() f.seek(0) content = f.read() print content f.close()

    And here is what the final file looks like:

    $ cat ./helloworld.txt
    Hello world!
    It's good to have been born!
    I am grateful.

    About the Author

    This article was written by Jacob Stopak, a software consultant and developer with passion for helping others improve their lives through code. Jacob is the creator of Initial Commit - a site dedicated to helping curious developers learn how their favorite programs are coded. Its featured project helps people learn Git at the code level.

    Categories: FLOSS Project Planets
