Feeds

Made With Mu: Kushal’s Colourful Adafruit Adventures

Planet Python - Wed, 2019-05-22 11:45

Friend of Mu, community hero, Tor core team member, Python core developer and programmer extraordinaire Kushal Das, has blogged about the fun he’s been having with Adafruit’s Circuit Playground Express board, CircuitPython and Mu.

A fortnight ago was PyCon, the world’s largest gathering of Python developers (around 3.5 thousand of us arrived in Cleveland for more than a week of Python related tutorials, presentations, keynotes, open spaces, code sprints and social events). Thanks to the enormous generosity of Adafruit and DigiKey, every attendee found a Circuit Playground Express in their swag bag!

Thank you to everyone who helped make this happen!

At the time I was sorely tempted to blog about this amazing gift, but decided to wait until I found evidence of a friend of mine actually using the board. I figured that if a friend had the unprompted motivation to blog or share their experience with the device, then it would be evidence that the sponsorship was worthwhile. After all, for every blogger there will be a large number of folks who simply don’t share what they’ve been up to.

Upon his return to India, Kushal (pictured above, sporting a very “Morpheus” look) had lots of fun and, happily, his experiments resulted in a positive educational outcome: now his young daughter (Py) wants to have a go too!

Kushal explains his experiment like this:

The goal is to guess a color for the next NeoPixel on the board and then press Button A to see if you guessed right or not. Py and I have been playing this continuously for the last few weeks.

The idea of CircuitPython, where we can connect the device to a computer, start editing code and see the changes live, is super fantastic and straightforward. It takes almost no time to start working on these, and the documentation is unambiguous, with many examples.

You can see his code for the game in the screenshot below:

I hope you agree that such sponsorship, together with the ease of CircuitPython, is (literally) lighting up the lives of folks all over the world and inspiring the coders of the future.

Categories: FLOSS Project Planets

TEN7 Blog's Drupal Posts: Episode 060: A Recap of Drupaldelphia 2019

Planet Drupal - Wed, 2019-05-22 11:44

In this week’s podcast, TEN7’s DevOps Tess Flynn, aka @socketwench, is our guest, giving us her observations of the Drupaldelphia 2019 conference she recently attended, as well as a summary of her helpful session, “Return of the Clustering: Kubernetes for Drupal.”

Host: Ivan Stegic
Guest: Tess Flynn, TEN7 DevOps

In this podcast we'll discuss: 

Categories: FLOSS Project Planets

Nathan Piccini Data Science Dojo Blog: Enhance your AI superpowers with Geospatial Visualization

Planet Python - Wed, 2019-05-22 10:00

There is so much to explore when it comes to spatial visualization using Python’s Folium library. For problems such as crime mapping, housing prices or travel route optimization, spatial visualization can be the most resourceful tool for getting a glimpse of how instances are geographically distributed. This is especially valuable now that we collect massive amounts of location data from sources such as cellphones, smartwatches and trackers, because patterns and correlations that might otherwise go unrecognized can be extracted visually.

This blog will show you the potential of spatial visualization using the Folium library with Python, and will walk you through the visualization tools that are most useful when analyzing spatial data.

Introduction to Folium

Folium is an incredible library that allows you to build Leaflet maps. Given latitude and longitude points, Folium lets you create a map of any location in the world. Furthermore, the maps Folium creates are interactive, so you can zoom in and out after they are rendered.

We’ll get some hands-on practice with building a few maps using the Seattle Real-time Fire 911 calls dataset. This dataset provides Seattle Fire Department 911 dispatches and every instance of this dataset provides information about the address, location, date/time and type of emergency of a particular incident. It's extensive and we’ll limit the dataset to a few emergency types for the purpose of explanation.

Let's Begin

Folium can be downloaded using the following commands.

Using pip:

$ pip install folium

Using conda:

$ conda install -c conda-forge folium

Start by importing the required libraries.

import pandas as pd
import numpy as np
import folium

Let us now create an object named 'seattle_map', defined as a folium.Map object. We can add other Folium objects on top of the folium.Map to improve the rendered map. The map is centered on the latitude and longitude points passed in the location parameter. The zoom_start parameter sets the magnification level for the map that's going to be rendered. We have also set the tiles parameter to 'OpenStreetMap', which is the default tile for this parameter. You can explore more tiles, such as Stamen Terrain or Mapbox Control, in Folium's documentation.

seattle_map = folium.Map(
    location = [47.6062, -122.3321],
    tiles = 'OpenStreetMap',
    zoom_start = 11
)
seattle_map

We can observe the map rendered above. Let's create another map object with a different tile and zoom level. With the 'Stamen Terrain' tile, we can visualize terrain data, which is useful for several important applications.

We've also added a folium.Marker to our 'seattle_map2' map object below. The marker can be placed at any location specified in the square brackets. The string passed to the popup parameter will be displayed once the marker is clicked, as shown below.

seattle_map2 = folium.Map(
    location = [47.6062, -122.3321],
    tiles = 'Stamen Terrain',
    zoom_start = 10
)
# inserting marker
folium.Marker(
    [47.6740, -122.1215],
    popup = 'Redmond'
).add_to(seattle_map2)
seattle_map2

We want to use the Seattle 911 calls dataset to visualize only the 911 calls made in 2019, and we are limiting the emergency types to 3 specific emergencies that took place during this time.

We will now import our dataset, which is available through this link (in CSV format). The dataset is huge, so we'll only import the first 10,000 rows using pandas' read_csv method, and we'll use the head method to display the first 5 rows.

(This process will take some time because the dataset is huge. Alternatively, you can download it to your local machine and then insert the file path below.)

path = "https://data.seattle.gov/api/views/kzjm-xkqj/rows.csv?accessType=DOWNLOAD"
seattle911 = pd.read_csv(path, nrows = 10000)
seattle911.head()

Using the code below, we'll convert the datatype of our Datetime variable to Date-time format and extract the year, removing all other instances that occurred before 2019.

seattle911['Datetime'] = pd.to_datetime(seattle911['Datetime'], format='%m/%d/%Y %H:%M', utc=True)
seattle911['Year'] = pd.DatetimeIndex(seattle911['Datetime']).year
seattle911 = seattle911[seattle911.Year == 2019]

We'll now limit the Emergency type to 'Aid Response Yellow', 'Auto Fire Alarm' and 'MVI - Motor Vehicle Incident'. The remaining instances will be removed from the 'seattle911' dataframe.

seattle911 = seattle911[seattle911.Type.isin(['Aid Response Yellow', 'Auto Fire Alarm', 'MVI - Motor Vehicle Incident'])]

We'll remove any instance that has a missing longitude or latitude coordinate. Without these values, the particular instance cannot be visualized and will cause an error while rendering.

# drop rows with missing latitude/longitude values
seattle911.dropna(subset = ['Longitude', 'Latitude'], inplace = True)
seattle911.head()

Now let's step into the most interesting part. We'll map all the instances onto the map object we created above, 'seattle_map'. Using the code below, we loop over every row of the dataframe and create a folium.CircleMarker (similar to the folium.Marker we added above), assigning each instance's latitude and longitude coordinates to the location parameter. The radius of the circle is set to 3, while the popup displays the address of the particular instance.

As you can see, the color of the circle depends on the emergency type. We will now render our map.

for i in range(len(seattle911)):
    folium.CircleMarker(
        location = [seattle911.Latitude.iloc[i], seattle911.Longitude.iloc[i]],
        radius = 3,
        popup = seattle911.Address.iloc[i],
        color = '#3186cc' if seattle911.Type.iloc[i] == 'Aid Response Yellow' else
                '#6ccc31' if seattle911.Type.iloc[i] == 'Auto Fire Alarm' else '#ac31cc',
    ).add_to(seattle_map)
seattle_map

Voila! The map above gives us insight into where, and what kinds of, emergencies took place across Seattle during 2019. This can be extremely helpful for the local government to place its emergency-response resources more efficiently.

Advanced Features Provided by Folium

Let us now move on to slightly more advanced features provided by Folium. For this, we will use the National Obesity by State dataset, which is hosted on data.gov. We'll be using 2 types of files: a CSV file containing the list of all states and the percentage of obesity in each state, and a GeoJSON file (based on JSON) that contains geographical features in the form of polygons.

Before using our dataset, we'll create a new folium.Map object whose location parameter holds coordinates that center the US on the map, and we've set the zoom_start level to 4 so that all the states are visible.

usa_map = folium.Map(
    location = [37.0902, -95.7129],
    tiles = 'Mapbox Bright',
    zoom_start = 4
)
usa_map

We will assign the URLs of our datasets to 'obesity_link' and 'state_boundaries' variables, respectively.

obesity_link = 'http://data-lakecountyil.opendata.arcgis.com/datasets/3e0c1eb04e5c48b3be9040b0589d3ccf_8.csv'
state_boundaries = 'http://data-lakecountyil.opendata.arcgis.com/datasets/3e0c1eb04e5c48b3be9040b0589d3ccf_8.geojson'

We will use the 'state_boundaries' file to visualize the boundaries and areas covered by each state on our folium.Map object. This is an overlay on our original map; in the same way, we can visualize multiple layers on the same map. This overlay will assist us in creating the choropleth map discussed below.

folium.GeoJson(state_boundaries).add_to(usa_map)
usa_map

The 'obesity_data' dataframe can be viewed below. It contains 5 variables. However, for the purpose of this demonstration, we are only concerned with the 'NAME' and 'Obesity' attributes.

obesity_data = pd.read_csv(obesity_link)
obesity_data.head()

Choropleth Map

Now comes the most interesting part: creating a choropleth map. We'll bind the 'obesity_data' data frame to our 'state_boundaries' GeoJSON file, assigning them to the data and geo_data parameters respectively. The columns parameter indicates which DataFrame columns to use, whereas the key_on parameter indicates the layer in the GeoJSON on which to key the data.

We have additionally specified several other parameters that will define the color scheme we're going to use. Colors are generated from Color Brewer's sequential palettes.

By default, linear binning is used between the min and the max of the values. Custom binning can be achieved with the bins parameter.

folium.Choropleth(
    geo_data = state_boundaries,
    name = 'choropleth',
    data = obesity_data,
    columns = ['NAME', 'Obesity'],
    key_on = 'feature.properties.NAME',
    fill_color = 'YlOrRd',
    fill_opacity = 0.9,
    line_opacity = 0.5,
    legend_name = 'Obesity Percentage'
).add_to(usa_map)
folium.LayerControl().add_to(usa_map)
usa_map
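As a hedged aside, here is what the custom binning mentioned above might look like using the bins parameter. The cut points below are illustrative assumptions, not values taken from the original post, so adjust them to the actual spread of your data.

folium.Choropleth(
    geo_data = state_boundaries,
    name = 'choropleth custom bins',
    data = obesity_data,
    columns = ['NAME', 'Obesity'],
    key_on = 'feature.properties.NAME',
    fill_color = 'YlOrRd',
    bins = [15, 20, 25, 30, 35, 40],  # illustrative cut points, not from the original post
    legend_name = 'Obesity Percentage'
).add_to(usa_map)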

Awesome! We've been able to create a choropleth map using a simple set of functions offered by Folium. We can visualize the obesity pattern geographically and uncover patterns that were not visible before, gaining more clarity about the data than the raw table alone provides.

Now that you have the skills to visualize spatial data effectively, go ahead and explore Folium's documentation to discover the incredible capabilities this open-source library has to offer.

Thanks for reading! If you want more datasets to play with, check out this blog post. It consists of 30 free datasets with questions for you to solve.

Categories: FLOSS Project Planets

Real Python: Python Logging: A Stroll Through the Source Code

Planet Python - Wed, 2019-05-22 10:00

The Python logging package is a lightweight but extensible package for keeping better track of what your own code does. Using it gives you much more flexibility than just littering your code with superfluous print() calls.

However, Python’s logging package can be complicated in certain spots. Handlers, loggers, levels, namespaces, filters: it’s not easy to keep track of all of these pieces and how they interact.

One way to tie up the loose ends in your understanding of logging is to peek under the hood to its CPython source code. The Python code behind logging is concise and modular, and reading through it can help you get that aha moment.

This article is meant to complement the logging HOWTO document as well as Logging in Python, which is a walkthrough on how to use the package.

By the end of this article, you’ll be familiar with the following:

  • Logging levels and how they work
  • Thread safety versus process safety in logging
  • The design of logging from an OOP perspective
  • Logging in libraries versus applications
  • Best practices and design patterns for using logging

For the most part, we’ll go line-by-line down the core module in Python’s logging package in order to build a picture of how it’s laid out.

Free Bonus: 5 Thoughts On Python Mastery, a free course for Python developers that shows you the roadmap and the mindset you'll need to take your Python skills to the next level.

How to Follow Along

Because the logging source code is central to this article, you can assume that any code block or link is based on a specific commit in the Python 3.7 CPython repository, namely commit d730719. You can find the logging package itself in the Lib/ directory within the CPython source.

Within the logging package, most of the heavy lifting occurs within logging/__init__.py, which is the file you’ll spend the most time on here:

cpython/
│
├── Lib/
│   ├── logging/
│   │   ├── __init__.py
│   │   ├── config.py
│   │   └── handlers.py
│   ├── ...
├── Modules/
├── Include/
...
... [truncated]

With that, let’s jump in.

Preliminaries

Before we get to the heavyweight classes, the top hundred lines or so of __init__.py introduce a few subtle but important concepts.

Preliminary #1: A Level Is Just an int!

Objects like logging.INFO or logging.DEBUG can seem a bit opaque. What are these variables internally, and how are they defined?

In fact, the uppercase constants from Python’s logging are just integers, forming an enum-like collection of numerical levels:

CRITICAL = 50
FATAL = CRITICAL
ERROR = 40
WARNING = 30
WARN = WARNING
INFO = 20
DEBUG = 10
NOTSET = 0

Why not just use the strings "INFO" or "DEBUG"? Levels are int constants to allow for the simple, unambiguous comparison of one level with another. They are given names as well to lend them semantic meaning. Saying that a message has a severity of 50 may not be immediately clear, but saying that it has a level of CRITICAL lets you know that you’ve got a flashing red light somewhere in your program.
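For instance, because the levels are plain ints, you can compare and sort them directly (a quick illustration, not taken from the source):

>>> import logging
>>> logging.ERROR > logging.WARNING  # 40 > 30
True
>>> sorted([logging.CRITICAL, logging.DEBUG, logging.INFO])
[10, 20, 50]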

Now, technically, you can pass just the str form of a level in some places, such as logger.setLevel("DEBUG"). Internally, this will call _checkLevel(), which ultimately does a dict lookup for the corresponding int:

_nameToLevel = {
    'CRITICAL': CRITICAL,
    'FATAL': FATAL,
    'ERROR': ERROR,
    'WARN': WARNING,
    'WARNING': WARNING,
    'INFO': INFO,
    'DEBUG': DEBUG,
    'NOTSET': NOTSET,
}

def _checkLevel(level):
    if isinstance(level, int):
        rv = level
    elif str(level) == level:
        if level not in _nameToLevel:
            raise ValueError("Unknown level: %r" % level)
        rv = _nameToLevel[level]
    else:
        raise TypeError("Level not an integer or a valid string: %r" % level)
    return rv

Which should you prefer? I’m not too opinionated on this, but it’s notable that the logging docs consistently use the form logging.DEBUG rather than "DEBUG" or 10. Also, passing the str form isn’t an option in Python 2, and some logging methods such as logger.isEnabledFor() will accept only an int, not its str cousin.
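Here is a small sketch of both spellings side by side; the logger name "levels" is an arbitrary choice for illustration:

>>> import logging
>>> logger = logging.getLogger("levels")
>>> logger.setLevel("DEBUG")        # str form, converted internally via _checkLevel()
>>> logger.setLevel(logging.DEBUG)  # int form, works everywhere (including Python 2)
>>> logger.isEnabledFor(logging.DEBUG)  # methods like this expect the int form
True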

Preliminary #2: Logging Is Thread-Safe, but Not Process-Safe

A few lines down, you’ll find the following short code block, which is sneakily critical to the whole package:

import threading

_lock = threading.RLock()

def _acquireLock():
    if _lock:
        _lock.acquire()

def _releaseLock():
    if _lock:
        _lock.release()

The _lock object is a reentrant lock that sits in the global namespace of the logging/__init__.py module. It makes pretty much every object and operation in the entire logging package thread-safe, enabling threads to do read and write operations without the threat of a race condition. You can see in the module source code that _acquireLock() and _releaseLock() are ubiquitous to the module and its classes.

There’s something not accounted for here, though: what about process safety? The short answer is that the logging module is not process safe. This isn’t inherently a fault of logging: generally, two processes can’t write to the same file without a lot of proactive effort on the part of the programmer first.

This means that you’ll want to be careful before using classes such as a logging.FileHandler with multiprocessing involved. If two processes want to read from and write to the same underlying file concurrently, then you can run into a nasty bug halfway through a long-running routine.

If you want to get around this limitation, there’s a thorough recipe in the official Logging Cookbook. Because this entails a decent amount of setup, one alternative is to have each process log to a separate file based on its process ID, which you can grab with os.getpid().
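Here’s a minimal sketch of that per-process alternative; the filename pattern and logger name are assumptions for illustration only:

import logging
import os

# One log file per process, keyed on the process ID
handler = logging.FileHandler(f"/tmp/myapp-{os.getpid()}.log")
handler.setLevel(logging.DEBUG)

logger = logging.getLogger("myapp")
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.warning("written only to this process's own file")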

Package Architecture: Logging’s MRO

Now that we’ve covered some preliminary setup code, let’s take a high-level look at how logging is laid out. The logging package uses a healthy dose of OOP and inheritance. Here’s a partial look at the method resolution order (MRO) for some of the most important classes in the package:

object
│
├── LogRecord
├── Filterer
│   ├── Logger
│   │   └── RootLogger
│   └── Handler
│       ├── StreamHandler
│       └── NullHandler
├── Filter
└── Manager

The tree diagram above doesn’t cover all of the classes in the module, just those that are most worth highlighting.

Note: You can use the dunder attribute logging.StreamHandler.__mro__ to see the chain of inheritance. A definitive guide to the MRO can be found in the Python 2 docs, though it is applicable to Python 3 as well.
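For example, inspecting StreamHandler shows the chain of inheritance from the diagram above:

>>> import logging
>>> logging.StreamHandler.__mro__
(<class 'logging.StreamHandler'>, <class 'logging.Handler'>,
 <class 'logging.Filterer'>, <class 'object'>)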

This litany of classes is typically one source of confusion because there’s a lot going on, and it’s all jargon-heavy. Filter versus Filterer? Logger versus Handler? It can be challenging to keep track of everything, much less visualize how it fits together. A picture is worth a thousand words, so here’s a diagram of a scenario where one logger with two handlers attached to it writes a log message with level logging.INFO:

Flow of logging objects (Image: Real Python)

In Python code, everything above would look like this:

import logging
import sys

logger = logging.getLogger("pylog")
logger.setLevel(logging.DEBUG)

h1 = logging.FileHandler(filename="/tmp/records.log")
h1.setLevel(logging.INFO)
h2 = logging.StreamHandler(sys.stderr)
h2.setLevel(logging.ERROR)

logger.addHandler(h1)
logger.addHandler(h2)
logger.info("testing %d.. %d.. %d..", 1, 2, 3)

There’s a more detailed map of this flow in the Logging HOWTO. What’s shown above is a simplified scenario.

Your code defines just one Logger instance, logger, along with two Handler instances, h1 and h2.

When you call logger.info("testing %d.. %d.. %d..", 1, 2, 3), the logger object serves as a filter because it also has a level associated with it. Only if the message level is severe enough will the logger do anything with the message. Because the logger has level DEBUG, and the message carries a higher INFO level, it gets the go-ahead to move on.

Internally, logger calls logger.makeRecord() to put the message string "testing %d.. %d.. %d.." and its arguments (1, 2, 3) into a bona fide class instance of a LogRecord, which is just a container for the message and its metadata.

The logger object looks around for its handlers (instances of Handler), which may be tied directly to logger itself or to its parents (a concept that we’ll touch on later). In this example, it finds two handlers:

  1. One with level INFO that dumps log data to a file at /tmp/records.log
  2. One that writes to sys.stderr but only if the incoming message is at level ERROR or higher

At this point, there’s another round of tests that kicks in. Because the LogRecord and its message only carry level INFO, the record gets written to Handler 1 (green arrow), but not to Handler 2’s stderr stream (red arrow). For Handlers, writing the LogRecord to their stream is called emitting it, which is captured in their .emit().
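To make .emit() concrete, here is a minimal sketch of a custom handler (a hypothetical ListHandler, not part of logging) that “emits” by appending the formatted message to an in-memory list:

import logging

class ListHandler(logging.Handler):
    """Hypothetical handler: emits records into a Python list."""
    def __init__(self, level=logging.NOTSET):
        super().__init__(level)
        self.messages = []

    def emit(self, record):
        # self.format() merges the record with this handler's Formatter
        self.messages.append(self.format(record))

logger = logging.getLogger("emit-demo")
logger.setLevel(logging.DEBUG)
handler = ListHandler(level=logging.INFO)
logger.addHandler(handler)

logger.info("kept")                     # INFO >= INFO, so the handler emits it
logger.debug("dropped by the handler")  # DEBUG < INFO, never reaches .emit()
print(handler.messages)                 # ['kept']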

Next, let’s further dissect everything from above.

The LogRecord Class

What is a LogRecord? When you log a message, an instance of the LogRecord class is the object you send to be logged. It’s created for you by a Logger instance and encapsulates all the pertinent info about that event. Internally, it’s little more than a wrapper around a dict that contains attributes for the record. A Logger instance sends a LogRecord instance to zero or more Handler instances.

The LogRecord contains some metadata, such as the following:

  1. A name
  2. The creation time as a Unix timestamp
  3. The message itself
  4. Information on what function made the logging call

Here’s a peek into the metadata that it carries with it, which you can introspect by stepping through a logging.error() call with the pdb module:

>>> import logging
>>> import pdb
>>> def f(x):
...     logging.error("bad vibes")
...     return x / 0
...
>>> pdb.run("f(1)")

After stepping through some higher-level functions, you end up at line 1517:

(Pdb) l
1514                 exc_info = (type(exc_info), exc_info, exc_info.__traceback__)
1515             elif not isinstance(exc_info, tuple):
1516                 exc_info = sys.exc_info()
1517         record = self.makeRecord(self.name, level, fn, lno, msg, args,
1518                                  exc_info, func, extra, sinfo)
1519  ->     self.handle(record)
1520
1521     def handle(self, record):
1522         """
1523         Call the handlers for the specified record.
1524
(Pdb) from pprint import pprint
(Pdb) pprint(vars(record))
{'args': (),
 'created': 1550671851.660067,
 'exc_info': None,
 'exc_text': None,
 'filename': '<stdin>',
 'funcName': 'f',
 'levelname': 'ERROR',
 'levelno': 40,
 'lineno': 2,
 'module': '<stdin>',
 'msecs': 660.067081451416,
 'msg': 'bad vibes',
 'name': 'root',
 'pathname': '<stdin>',
 'process': 2360,
 'processName': 'MainProcess',
 'relativeCreated': 295145.5490589142,
 'stack_info': None,
 'thread': 4372293056,
 'threadName': 'MainThread'}

A LogRecord, internally, contains a trove of metadata that’s used in one way or another.

You’ll rarely need to deal with a LogRecord directly, since the Logger and Handler do this for you. It’s still worthwhile to know what information is wrapped up in a LogRecord, because this is where all of the useful info, like the timestamp, comes from when you see formatted log messages.

Note: Below the LogRecord class, you’ll also find the setLogRecordFactory(), getLogRecordFactory(), and makeLogRecord() factory functions. You won’t need these unless you want to use a custom class instead of LogRecord to encapsulate log messages and their metadata.
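If you are curious, makeLogRecord() gives you a quick way to build a bare LogRecord by hand and poke at it (a small sketch, with arbitrary values):

>>> import logging
>>> record = logging.makeLogRecord({"msg": "built by hand", "levelno": logging.INFO})
>>> record.getMessage()
'built by hand'
>>> record.levelno
20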

The Logger and Handler Classes

The Logger and Handler classes are both central to how logging works, and they interact with each other frequently. A Logger, a Handler, and a LogRecord each have a .level associated with them.

The Logger takes the LogRecord and passes it off to the Handler, but only if the effective level of the LogRecord is equal to or higher than that of the Logger. The same goes for the LogRecord versus Handler test. This is called level-based filtering, which Logger and Handler implement in slightly different ways.

In other words, there is (at least) a two-step test applied before the message that you log goes anywhere. In order to be fully passed from a logger to a handler and then logged to the end stream (which could be sys.stdout, a file, or an email via SMTP), a LogRecord must have a level at least as high as that of both the logger and the handler.

PEP 282 describes how this works:

Each Logger object keeps track of a log level (or threshold) that it is interested in, and discards log requests below that level. (Source)

So where does this level-based filtering actually occur for both Logger and Handler?

For the Logger class, it’s a reasonable first assumption that the logger would compare its .level attribute to the level of the LogRecord, and be done there. However, it’s slightly more involved than that.

Level-based filtering for loggers occurs in .isEnabledFor(), which in turn calls .getEffectiveLevel(). Always use logger.getEffectiveLevel() rather than just consulting logger.level. The reason has to do with the organization of Logger objects in a hierarchical namespace. (You’ll see more on this later.)

By default, a Logger instance has a level of 0 (NOTSET). However, loggers also have parent loggers, one of which is the root logger, which functions as the parent of all other loggers. A Logger will walk upwards in its hierarchy and get its effective level vis-à-vis its parent (which ultimately may be root if no other parents are found).

Here’s where this happens in the Logger class:

class Logger(Filterer):
    # ...
    def getEffectiveLevel(self):
        logger = self
        while logger:
            if logger.level:
                return logger.level
            logger = logger.parent
        return NOTSET

    def isEnabledFor(self, level):
        try:
            return self._cache[level]
        except KeyError:
            _acquireLock()
            if self.manager.disable >= level:
                is_enabled = self._cache[level] = False
            else:
                is_enabled = self._cache[level] = level >= self.getEffectiveLevel()
            _releaseLock()
        return is_enabled

Correspondingly, here’s an example that calls the source code you see above:

>>> import logging
>>> logger = logging.getLogger("app")
>>> logger.level  # No!
0
>>> logger.getEffectiveLevel()
30
>>> logger.parent
<RootLogger root (WARNING)>
>>> logger.parent.level
30

Here’s the takeaway: don’t rely on .level. If you haven’t explicitly set a level on your logger object, and you’re depending on .level for some reason, then your logging setup will likely behave differently than you expected it to.

What about Handler? For handlers, the level-to-level comparison is simpler, though it actually happens in .callHandlers() from the Logger class:

class Logger(Filterer):
    # ...
    def callHandlers(self, record):
        c = self
        found = 0
        while c:
            for hdlr in c.handlers:
                found = found + 1
                if record.levelno >= hdlr.level:
                    hdlr.handle(record)

For a given LogRecord instance (named record in the source code above), a logger checks with each of its registered handlers and does a quick check on the .level attribute of that Handler instance. If the .levelno of the LogRecord is greater than or equal to that of the handler, only then does the record get passed on. A docstring in logging refers to this as “conditionally emit[ting] the specified logging record.”

Handlers and the Places They Go

The most important attribute for a Handler subclass instance is its .stream attribute. This is the final destination where logs get written to and can be pretty much any file-like object. Here’s an example with io.StringIO, which is an in-memory stream (buffer) for text I/O.

First, set up a Logger instance with a level of DEBUG. You’ll see that, by default, it has no direct handlers:

>>> import io
>>> import logging
>>> logger = logging.getLogger("abc")
>>> logger.setLevel(logging.DEBUG)
>>> print(logger.handlers)
[]

Next, you can subclass logging.StreamHandler to make the .flush() call a no-op. We would want to flush sys.stderr or sys.stdout, but not the in-memory buffer in this case:

class IOHandler(logging.StreamHandler):
    def flush(self):
        pass  # No-op

Now, declare the buffer object itself and tie it in as the .stream for your custom handler with a level of INFO, and then tie that handler into the logger:

>>> stream = io.StringIO()
>>> h = IOHandler(stream)
>>> h.setLevel(logging.INFO)
>>> logger.addHandler(h)
>>> logger.debug("extraneous info")
>>> logger.warning("you've been warned")
>>> logger.critical("SOS")
>>> try:
...     print(stream.getvalue())
... finally:
...     stream.close()
...
you've been warned
SOS

This last chunk is another illustration of level-based filtering.

Three messages with levels DEBUG, WARNING, and CRITICAL are passed through the chain. At first, it may look as if they don’t go anywhere, but two of them do. All three of them make it out of the gates from logger (which has level DEBUG).

However, only two of them get emitted by the handler because it has a higher level of INFO, which exceeds DEBUG. Finally, you get the entire contents of the buffer as a str and close the buffer to explicitly free up system resources.

The Filter and Filterer Classes

Above, we asked the question, “Where does level-based filtering happen?” In answering this question, it’s easy to get distracted by the Filter and Filterer classes. Paradoxically, level-based filtering for Logger and Handler instances occurs without the help of either of the Filter or Filterer classes.

Filter and Filterer are designed to let you add additional function-based filters on top of the level-based filtering that is done by default. I like to think of it as à la carte filtering.

Filterer is the base class for Logger and Handler because both of these classes are eligible for receiving additional custom filters that you specify. You add instances of Filter to them with logger.addFilter() or handler.addFilter(), which is what self.filters refers to in the following method:

class Filterer(object):
    # ...
    def filter(self, record):
        rv = True
        for f in self.filters:
            if hasattr(f, 'filter'):
                result = f.filter(record)
            else:
                result = f(record)
            if not result:
                rv = False
                break
        return rv

Given a record (which is a LogRecord instance), .filter() returns True or False depending on whether that record gets the okay from this class’s filters.

Here is .handle() in turn, for the Logger and Handler classes:

class Logger(Filterer):
    # ...
    def handle(self, record):
        if (not self.disabled) and self.filter(record):
            self.callHandlers(record)

# ...

class Handler(Filterer):
    # ...
    def handle(self, record):
        rv = self.filter(record)
        if rv:
            self.acquire()
            try:
                self.emit(record)
            finally:
                self.release()
        return rv

Neither Logger nor Handler come with any additional filters by default, but here’s a quick example of how you could add one:

>>> import logging
>>> logger = logging.getLogger("rp")
>>> logger.setLevel(logging.INFO)
>>> logger.addHandler(logging.StreamHandler())
>>> logger.filters  # Initially empty
[]
>>> class ShortMsgFilter(logging.Filter):
...     """Only allow records that contain long messages (> 25 chars)."""
...     def filter(self, record):
...         msg = record.msg
...         if isinstance(msg, str):
...             return len(msg) > 25
...         return False
...
>>> logger.addFilter(ShortMsgFilter())
>>> logger.filters
[<__main__.ShortMsgFilter object at 0x10c28b208>]
>>> logger.info("Reeeeaaaaallllllly long message")  # Length: 31
Reeeeaaaaallllllly long message
>>> logger.info("Done")  # Length: <25, no output

Above, you define a class ShortMsgFilter and override its .filter(). In .addFilter(), you could also just pass a callable, such as a function, a lambda, or a class that defines .__call__().
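Continuing the session above, here is a hedged sketch of the callable form, using a lambda that rejects any record whose message mentions "password":

>>> logger.addFilter(lambda record: "password" not in record.getMessage())
>>> logger.info("the password should never reach the log stream")  # filtered out
>>> logger.info("this harmless message is long enough to pass both filters")
this harmless message is long enough to pass both filters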

The Manager Class

There’s one more behind-the-scenes actor of logging that is worth touching on: the Manager class. What matters most is not the Manager class but a single instance of it that acts as a container for the growing hierarchy of loggers that are defined across packages. You’ll see in the next section how just a single instance of this class is central to gluing the module together and allowing its parts to talk to each other.

The All-Important Root Logger

When it comes to Logger instances, one stands out. It’s called the root logger:

class RootLogger(Logger):
    def __init__(self, level):
        Logger.__init__(self, "root", level)

# ...

root = RootLogger(WARNING)
Logger.root = root
Logger.manager = Manager(Logger.root)

The last three lines of this code block are one of the ingenious tricks employed by the logging package. Here are a few points:

  • The root logger is just a no-frills Python object with the identifier root. It has a level of logging.WARNING and a .name of "root". As far as the class RootLogger is concerned, this unique name is all that’s special about it.

  • The root object in turn becomes a class attribute for the Logger class. This means that all instances of Logger, as well as the Logger class itself, have a .root attribute that is the root logger. This is another example of a singleton-like pattern being enforced in the logging package.

  • A Manager instance is set as the .manager class attribute for Logger. This eventually comes into play in logging.getLogger("name"). The .manager does all the facilitation of searching for existing loggers with the name "name" and creating them if they don’t exist.

The Logger Hierarchy

Everything is a child of root in the logger namespace, and I mean everything. That includes loggers that you specify yourself as well as those from third-party libraries that you import.

Remember earlier how the .getEffectiveLevel() for our logger instances was 30 (WARNING) even though we had not explicitly set it? That’s because the root logger sits at the top of the hierarchy, and its level is a fallback if any nested loggers have a null level of NOTSET:

>>> root = logging.getLogger()  # Or getLogger("")
>>> root
<RootLogger root (WARNING)>
>>> root.parent is None
True
>>> root.root is root  # Self-referential
True
>>> root is logging.root
True
>>> root.getEffectiveLevel()
30

The same logic applies to the search for a logger’s handlers. The search is effectively a reverse-order search up the tree of a logger’s parents.

A Multi-Handler Design

The logger hierarchy may seem neat in theory, but how beneficial is it in practice?

Let’s take a break from exploring the logging code and foray into writing our own mini-application—one that takes advantage of the logger hierarchy in a way that reduces boilerplate code and keeps things scalable if the project’s codebase grows.

Here’s the project structure:

project/
│
└── project/
    ├── __init__.py
    ├── utils.py
    └── base.py

Don’t worry about the application’s main functions in utils.py and base.py. What we’re paying more attention to here is the interaction in logging objects between the modules in project/.

In this case, say that you want to design a multipronged logging setup:

  • Each module gets a logger with multiple handlers.

  • Some of the handlers are shared between different logger instances in different modules. These handlers only care about level-based filtering, not the module where the log record emanated from. There is a handler for DEBUG messages, one for INFO, one for WARNING, and so on.

  • Each logger is also tied to one more additional handler that only receives LogRecord instances from that lone logger. You can call this a module-based file handler.

Visually, what we’re shooting for would look something like this:

A multipronged logging design (Image: Real Python)

The two turquoise objects are instances of Logger, established with logging.getLogger(__name__) for each module in a package. Everything else is a Handler instance.

The thinking behind this design is that it’s neatly compartmentalized. You can conveniently look at messages coming from a single logger, or look at messages of a certain level and above coming from any logger or module.

The properties of the logger hierarchy make it suitable for setting up this multipronged logger-handler layout. What does that mean? Here’s a concise explanation from the Django documentation:

Why is the hierarchy important? Well, because loggers can be set to propagate their logging calls to their parents. In this way, you can define a single set of handlers at the root of a logger tree, and capture all logging calls in the subtree of loggers. A logging handler defined in the project namespace will catch all logging messages issued on the project.interesting and project.interesting.stuff loggers. (Source)

The term propagate refers to how a logger keeps walking up its chain of parents looking for handlers. The .propagate attribute is True for a Logger instance by default:

>>> logger = logging.getLogger(__name__)
>>> logger.propagate
True

In .callHandlers(), if propagate is True, each successive parent gets reassigned to the local variable c until the hierarchy is exhausted:

class Logger(Filterer):
    # ...
    def callHandlers(self, record):
        c = self
        found = 0
        while c:
            for hdlr in c.handlers:
                found = found + 1
                if record.levelno >= hdlr.level:
                    hdlr.handle(record)
            if not c.propagate:
                c = None
            else:
                c = c.parent
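Here is a small sketch of that switch in action, assuming a fresh session; the logger name "thirdparty.noisy" is arbitrary:

>>> import logging
>>> logging.basicConfig()                    # gives root a stderr StreamHandler
>>> noisy = logging.getLogger("thirdparty.noisy")
>>> noisy.warning("walks up to root's handler")
WARNING:thirdparty.noisy:walks up to root's handler
>>> noisy.propagate = False                  # stop the walk right here
>>> noisy.addHandler(logging.NullHandler())  # give it a do-nothing handler of its own
>>> noisy.warning("quietly discarded")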

Here’s what this means: because the __name__ dunder variable within a package’s __init__.py module is just the name of the package, a logger there becomes a parent to any loggers present in other modules in the same package.

Here are the resulting .name attributes from assigning to logger with logging.getLogger(__name__):

Module                 .name Attribute
project/__init__.py    'project'
project/utils.py       'project.utils'
project/base.py        'project.base'

Because the 'project.utils' and 'project.base' loggers are children of 'project', they will latch onto not only their own direct handlers but whatever handlers are attached to 'project'.

Let’s build out the modules. First comes __init__.py:

# __init__.py
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

levels = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
for level in levels:
    handler = logging.FileHandler(f"/tmp/level-{level.lower()}.log")
    handler.setLevel(getattr(logging, level))
    logger.addHandler(handler)

def add_module_handler(logger, level=logging.DEBUG):
    handler = logging.FileHandler(
        f"/tmp/module-{logger.name.replace('.', '-')}.log"
    )
    handler.setLevel(level)
    logger.addHandler(handler)

This module is imported when the project package is imported. You add a handler for each level in DEBUG through CRITICAL, then attach it to a single logger at the top of the hierarchy.

You also define a utility function that adds one more FileHandler to a logger, where the filename of the handler corresponds to the module name where the logger is defined. (This assumes the logger is defined with __name__.)

You can then add some minimal boilerplate logger setup in base.py and utils.py. Notice that you only need to add one additional handler with add_module_handler() from __init__.py. You don’t need to worry about the level-oriented handlers because they are already added to their parent logger named 'project':

# base.py
import logging

from project import add_module_handler

logger = logging.getLogger(__name__)
add_module_handler(logger)

def func1():
    logger.debug("debug called from base.func1()")
    logger.critical("critical called from base.func1()")

Here’s utils.py:

# utils.py
import logging

from project import add_module_handler

logger = logging.getLogger(__name__)
add_module_handler(logger)

def func2():
    logger.debug("debug called from utils.func2()")
    logger.critical("critical called from utils.func2()")

Let’s see how all of this works together from a fresh Python session:

>>> from pprint import pprint
>>> import project
>>> from project import base, utils

>>> project.logger
<Logger project (DEBUG)>
>>> base.logger, utils.logger
(<Logger project.base (DEBUG)>, <Logger project.utils (DEBUG)>)

>>> base.logger.handlers
[<FileHandler /tmp/module-project-base.log (DEBUG)>]
>>> pprint(base.logger.parent.handlers)
[<FileHandler /tmp/level-debug.log (DEBUG)>,
 <FileHandler /tmp/level-info.log (INFO)>,
 <FileHandler /tmp/level-warning.log (WARNING)>,
 <FileHandler /tmp/level-error.log (ERROR)>,
 <FileHandler /tmp/level-critical.log (CRITICAL)>]

>>> base.func1()
>>> utils.func2()

You’ll see in the resulting log files that our filtration system works as intended. Module-oriented handlers direct one logger to a specific file, while level-oriented handlers direct multiple loggers to a different file:

$ cat /tmp/level-debug.log
debug called from base.func1()
critical called from base.func1()
debug called from utils.func2()
critical called from utils.func2()

$ cat /tmp/level-critical.log
critical called from base.func1()
critical called from utils.func2()

$ cat /tmp/module-project-base.log
debug called from base.func1()
critical called from base.func1()

$ cat /tmp/module-project-utils.log
debug called from utils.func2()
critical called from utils.func2()

A drawback worth mentioning is that this design introduces a lot of redundancy. One LogRecord instance may be written to as many as six files. That’s also a non-negligible amount of file I/O that may add up in a performance-critical application.

Now that you’ve seen a practical example, let’s switch gears and delve into a possible source of confusion in logging.

The “Why Didn’t My Log Message Go Anywhere?” Dilemma

There are two common situations with logging when it’s easy to get tripped up:

  1. You logged a message that seemingly went nowhere, and you’re not sure why.
  2. Instead of being suppressed, a log message appeared in a place that you didn’t expect it to.

Each of these has a reason or two commonly associated with it.

You logged a message that seemingly went nowhere, and you’re not sure why.

Don’t forget that the effective level of a logger for which you don’t otherwise set a custom level is WARNING, because a logger will walk up its hierarchy until it finds the root logger with its own WARNING level:

>>> import logging
>>> logger = logging.getLogger("xyz")
>>> logger.debug("mind numbing info here")
>>> logger.critical("storm is coming")
storm is coming

Because of this default, the .debug() call goes nowhere.
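A minimal fix, sketched here, is to lower the logger’s level and attach an explicit handler yourself:

>>> import logging
>>> logger = logging.getLogger("xyz")
>>> logger.setLevel(logging.DEBUG)              # lower the logger's threshold
>>> logger.addHandler(logging.StreamHandler())  # explicit stderr handler
>>> logger.debug("mind numbing info here")
mind numbing info here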

Instead of being suppressed, a log message appeared in a place that you didn’t expect it to.

When you defined your logger above, you didn’t add any handlers to it. So, why is it writing to the console?

The reason for this is that logging sneakily uses a handler called lastResort that writes to sys.stderr if no other handlers are found:

class _StderrHandler(StreamHandler):
    # ...
    @property
    def stream(self):
        return sys.stderr

_defaultLastResort = _StderrHandler(WARNING)
lastResort = _defaultLastResort

This kicks in when a logger goes to find its handlers:

class Logger(Filterer):
    # ...
    def callHandlers(self, record):
        c = self
        found = 0
        while c:
            for hdlr in c.handlers:
                found = found + 1
                if record.levelno >= hdlr.level:
                    hdlr.handle(record)
            if not c.propagate:
                c = None
            else:
                c = c.parent
        if (found == 0):
            if lastResort:
                if record.levelno >= lastResort.level:
                    lastResort.handle(record)

If the logger gives up on its search for handlers (both its own direct handlers and those of its parent loggers), then it picks up the lastResort handler and uses that.

There’s one more subtle detail worth knowing about. This section has largely talked about the instance methods (methods that a class defines) rather than the module-level functions of the logging package that carry the same name.

If you use the functions, such as logging.info() rather than logger.info(), then something slightly different happens internally. The function calls logging.basicConfig(), which adds a StreamHandler that writes to sys.stderr. In the end, the behavior is virtually the same:

>>> import logging
>>> root = logging.getLogger("")
>>> root.handlers
[]
>>> root.hasHandlers()
False
>>> logging.basicConfig()
>>> root.handlers
[<StreamHandler <stderr> (NOTSET)>]
>>> root.hasHandlers()
True

Taking Advantage of Lazy Formatting

It’s time to switch gears and take a closer look at how messages themselves are joined with their data. While it’s been supplanted by str.format() and f-strings, you’ve probably used Python’s percent-style formatting to do something like this:

>>> print("To iterate is %s, to recurse %s" % ("human", "divine"))
To iterate is human, to recurse divine

As a result, you may be tempted to do the same thing in a logging call:

>>> # Bad! Check out a more efficient alternative below.
>>> logging.warning("To iterate is %s, to recurse %s" % ("human", "divine"))
WARNING:root:To iterate is human, to recurse divine

This uses the entire format string and its arguments as the msg argument to logging.warning().

Here is the recommended alternative, straight from the logging docs:

>>> # Better: formatting doesn't occur until it really needs to.
>>> logging.warning("To iterate is %s, to recurse %s", "human", "divine")
WARNING:root:To iterate is human, to recurse divine

It looks a little weird, right? This seems to defy the conventions of how percent-style string formatting works, but it’s a more efficient function call because the format string gets formatted lazily rather than greedily. Here’s what that means.

The method signature for Logger.warning() looks like this:

def warning(self, msg, *args, **kwargs)

The same applies to the other methods, such as .debug(). When you call warning("To iterate is %s, to recurse %s", "human", "divine"), both "human" and "divine" get caught as *args and, within the scope of the method’s body, args is equal to ("human", "divine").

Contrast this to the first call above:

logging.warning("To iterate is %s, to recurse %s" % ("human", "divine"))

In this form, everything in the parentheses gets immediately merged together into "To iterate is human, to recurse divine" and passed as msg, while args is an empty tuple.

Why does this matter? Repeated logging calls can degrade runtime performance slightly, but the logging package does its very best to control that and keep it in check. By not merging the format string with its arguments right away, logging is delaying the string formatting until the LogRecord is requested by a Handler.

This happens in LogRecord.getMessage(), so only after logging deems that the LogRecord will actually be passed to a handler does it become its fully merged self.

All that is to say that the logging package makes some very fine-tuned performance optimizations in the right places. This may seem like minutia, but if you’re making the same logging.debug() call a million times inside a loop, and the args are function calls, then the lazy nature of how logging does string formatting can make a difference.

Before doing any merging of msg and args, a Logger instance will check its .isEnabledFor() to see if that merging should be done in the first place.
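You can borrow the same guard in your own code when building the log arguments is itself expensive; this sketch uses a hypothetical expensive_summary() helper to stand in for that cost:

import logging

logger = logging.getLogger(__name__)

def expensive_summary():
    # Hypothetical stand-in for a costly computation
    return "lots of stats"

# Skip the expensive call entirely unless DEBUG records would actually be processed
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("Summary: %s", expensive_summary())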

Functions vs Methods

Towards the bottom of logging/__init__.py sit the module-level functions that are advertised up front in the public API of logging. You already saw the Logger methods such as .debug(), .info(), and .warning(). The top-level functions are wrappers around the corresponding methods of the same name, but they have two important features:

  1. They always call their corresponding method from the root logger, root.

  2. Before calling the root logger methods, they call logging.basicConfig() with no arguments if root doesn’t have any handlers. As you saw earlier, it is this call that sets a sys.stderr handler for the root logger.

For illustration, here’s logging.error():

def error(msg, *args, **kwargs):
    if len(root.handlers) == 0:
        basicConfig()
    root.error(msg, *args, **kwargs)

You’ll find the same pattern for logging.debug(), logging.info(), and the others as well. Tracing the chain of commands is interesting. Eventually, you’ll end up at the same place, which is where the internal Logger._log() is called.

The calls to debug(), info(), warning(), and the other level-based functions all route to here. _log() primarily has two purposes:

  1. Call self.makeRecord(): Make a LogRecord instance from the msg and other arguments you pass to it.

  2. Call self.handle(): This determines what actually gets done with the record. Where does it get sent? Does it make it there or get filtered out?

Here’s that entire process in one diagram:

Internals of a logging call (Image: Real Python)

You can also trace the call stack with pdb.

Tracing the Call to logging.warning()

>>> import logging
>>> import pdb
>>> pdb.run('logging.warning("%s-%s", "uh", "oh")')
> <string>(1)<module>()
(Pdb) s
--Call--
> lib/python3.7/logging/__init__.py(1971)warning()
-> def warning(msg, *args, **kwargs):
(Pdb) s
> lib/python3.7/logging/__init__.py(1977)warning()
-> if len(root.handlers) == 0:
(Pdb) unt
> lib/python3.7/logging/__init__.py(1978)warning()
-> basicConfig()
(Pdb) unt
> lib/python3.7/logging/__init__.py(1979)warning()
-> root.warning(msg, *args, **kwargs)
(Pdb) s
--Call--
> lib/python3.7/logging/__init__.py(1385)warning()
-> def warning(self, msg, *args, **kwargs):
(Pdb) l
1380             logger.info("Houston, we have a %s", "interesting problem", exc_info=1)
1381             """
1382             if self.isEnabledFor(INFO):
1383                 self._log(INFO, msg, args, **kwargs)
1384
1385  ->     def warning(self, msg, *args, **kwargs):
1386             """
1387             Log 'msg % args' with severity 'WARNING'.
1388
1389             To pass exception information, use the keyword argument exc_info with
1390             a true value, e.g.
(Pdb) s
> lib/python3.7/logging/__init__.py(1394)warning()
-> if self.isEnabledFor(WARNING):
(Pdb) unt
> lib/python3.7/logging/__init__.py(1395)warning()
-> self._log(WARNING, msg, args, **kwargs)
(Pdb) s
--Call--
> lib/python3.7/logging/__init__.py(1496)_log()
-> def _log(self, level, msg, args, exc_info=None, extra=None, stack_info=False):
(Pdb) s
> lib/python3.7/logging/__init__.py(1501)_log()
-> sinfo = None
(Pdb) unt 1517
> lib/python3.7/logging/__init__.py(1517)_log()
-> record = self.makeRecord(self.name, level, fn, lno, msg, args,
(Pdb) s
> lib/python3.7/logging/__init__.py(1518)_log()
-> exc_info, func, extra, sinfo)
(Pdb) s
--Call--
> lib/python3.7/logging/__init__.py(1481)makeRecord()
-> def makeRecord(self, name, level, fn, lno, msg, args, exc_info,
(Pdb) p name
'root'
(Pdb) p level
30
(Pdb) p msg
'%s-%s'
(Pdb) p args
('uh', 'oh')
(Pdb) up
> lib/python3.7/logging/__init__.py(1518)_log()
-> exc_info, func, extra, sinfo)
(Pdb) unt
> lib/python3.7/logging/__init__.py(1519)_log()
-> self.handle(record)
(Pdb) n
WARNING:root:uh-oh

What Does getLogger() Really Do?

Also hiding in this section of the source code is the top-level getLogger(), which wraps Logger.manager.getLogger():

def getLogger(name=None):
    if name:
        return Logger.manager.getLogger(name)
    else:
        return root

This is the entry point for enforcing the singleton logger design:

  • If you specify a name, then the underlying .getLogger() does a dict lookup on the string name. What this comes down to is a lookup in the loggerDict of logging.Manager. This is a dictionary of all registered loggers, including the intermediate PlaceHolder instances that are generated when you reference a logger far down in the hierarchy before referencing its parents.

  • Otherwise, root is returned. There is only one root—the instance of RootLogger discussed above.

This feature is what lies behind a trick that can let you peek into all of the registered loggers:

>>> import logging
>>> logging.Logger.manager.loggerDict
{}

>>> from pprint import pprint
>>> import asyncio
>>> pprint(logging.Logger.manager.loggerDict)
{'asyncio': <Logger asyncio (WARNING)>,
 'concurrent': <logging.PlaceHolder object at 0x10d153710>,
 'concurrent.futures': <Logger concurrent.futures (WARNING)>}

Whoa, hold on a minute. What’s happening here? It looks like something changed internally to the logging package as a result of an import of another library, and that’s exactly what happened.

Firstly, recall that Logger.manager is a class attribute, where an instance of Manager is tacked onto the Logger class. The manager is designed to track and manage all of the singleton instances of Logger. These are housed in .loggerDict.

Now, when you initially import logging, this dictionary is empty. But after you import asyncio, the same dictionary gets populated with three loggers. This is an example of one module setting the attributes of another module in-place. Sure enough, inside of asyncio/log.py, you’ll find the following:

import logging

logger = logging.getLogger(__package__)  # "asyncio"

The key-value pair is set in Manager.getLogger() so that the manager can oversee the entire namespace of loggers. This means that the object asyncio.log.logger gets registered in the logger dictionary that belongs to the logging package. Something similar happens in the concurrent.futures package as well, which is imported by asyncio.

You can see the power of the singleton design in an equivalence test:

>>> obj1 = logging.getLogger("asyncio")
>>> obj2 = logging.Logger.manager.loggerDict["asyncio"]
>>> obj1 is obj2
True

This comparison illustrates (glossing over a few details) what getLogger() ultimately does.

Library vs Application Logging: What Is NullHandler?

That brings us to the final hundred or so lines in the logging/__init__.py source, where NullHandler is defined. Here’s the definition in all its glory:

class NullHandler(Handler):
    def handle(self, record):
        pass

    def emit(self, record):
        pass

    def createLock(self):
        self.lock = None

The NullHandler is all about the distinctions between logging in a library versus an application. Let’s see what that means.

A library is an extensible, generalizable Python package that is intended for other users to install and set up. It is built by a developer with the express purpose of being distributed to users. Examples include popular open-source projects like NumPy, dateutil, and cryptography.

An application (or app, or program) is designed for a more specific purpose and a much smaller set of users (possibly just one user). It’s a program or set of programs highly tailored by the user to do a limited set of things. An example of an application is a Django app that sits behind a web page. Applications commonly use (import) libraries and the tools they contain.

When it comes to logging, there are different best practices in a library versus an app.

That’s where NullHandler fits in. It’s basically a do-nothing stub class.

If you’re writing a Python library, you really need to do this one minimalist piece of setup in your package’s __init__.py:

# Place this in your library's uppermost `__init__.py`
# Nothing else!

import logging

logging.getLogger(__name__).addHandler(logging.NullHandler())

This serves two critical purposes.

Firstly, a library logger that is declared with logger = logging.getLogger(__name__) (without any further configuration) will log to sys.stderr by default, even if that’s not what the end user wants. This could be described as an opt-out approach, where the end user of the library has to go in and disable logging to their console if they don’t want it.

Common wisdom says to use an opt-in approach instead: don’t emit any log messages by default, and let the end users of the library determine if they want to further configure the library’s loggers and add handlers to them. Here’s that philosophy worded more bluntly by the author of the logging package, Vinay Sajip:

A third party library which uses logging should not spew logging output by default which may not be wanted by a developer/user of an application which uses it. (Source)

This leaves it up to the library user, not the library developer, to incrementally call methods such as logger.addHandler() or logger.setLevel().

The second reason that NullHandler exists is more archaic. In Python 2.7 and earlier, trying to log a LogRecord from a logger that has no handler set would raise a warning. Adding the no-op class NullHandler will avert this.

Here’s what specifically happens in the line logging.getLogger(__name__).addHandler(logging.NullHandler()) from above:

  1. Python gets (creates) the Logger instance with the same name as your package. If you’re designing the calculus package, within __init__.py, then __name__ will be equal to 'calculus'.

  2. A NullHandler instance gets attached to this logger. That means that Python will not default to using the lastResort handler.

Keep in mind that any logger created in any of the other .py modules of the package will be children of this logger in the logger hierarchy and that, because this handler also belongs to them, they won’t need to use the lastResort handler and won’t default to logging to standard error (stderr).

As a quick example, let’s say your library has the following structure:

calculus/
│
├── __init__.py
└── integration.py

In integration.py, as the library developer you are free to do the following:

# calculus/integration.py

import logging

logger = logging.getLogger(__name__)

def func(x):
    logger.warning("Look!")
    # Do stuff
    return None

Now, a user comes along and installs your library from PyPI via pip install calculus. They use from calculus.integration import func in some application code. This user is free to manipulate and configure the logger object from the library like any other Python object, to their heart’s content.
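For a concrete picture of that opt-in step, here is a minimal sketch of what the application side might look like. The handler, format, and level choices below are illustrative assumptions, not part of the original example:

# app.py -- hypothetical application code that uses the `calculus` library
import logging

from calculus.integration import func

# Opt in: explicitly configure the library's top-level logger.
calc_logger = logging.getLogger("calculus")
calc_logger.setLevel(logging.DEBUG)

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s"))
calc_logger.addHandler(handler)

func(42)  # "Look!" is now emitted, formatted the way the application chose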

What Logging Does With Exceptions

One thing that you may be wary of is the danger of exceptions that stem from your calls to logging. If you have a logging.error() call that is designed to give you some more verbose debugging information, but that call itself for some reason raises an exception, that would be the height of irony, right?

Cleverly, if the logging package encounters an exception that has to do with logging itself, then it will print the traceback, but not raise the exception itself.

Here’s an example that deals with a common typo: passing two arguments to a format string that is only expecting one argument. The important distinction is that what you see below is not an exception being raised, but rather a prettified printed traceback of the internal exception, which itself was suppressed:

>>> logging.critical("This %s has too many arguments", "msg", "other")
--- Logging error ---
Traceback (most recent call last):
  File "lib/python3.7/logging/__init__.py", line 1034, in emit
    msg = self.format(record)
  File "lib/python3.7/logging/__init__.py", line 880, in format
    return fmt.format(record)
  File "lib/python3.7/logging/__init__.py", line 619, in format
    record.message = record.getMessage()
  File "lib/python3.7/logging/__init__.py", line 380, in getMessage
    msg = msg % self.args
TypeError: not all arguments converted during string formatting
Call stack:
  File "<stdin>", line 1, in <module>
Message: 'This %s has too many arguments'
Arguments: ('msg', 'other')

This lets your program gracefully carry on with its actual program flow. The rationale is that you wouldn’t want an uncaught exception to come from a logging call itself and stop a program dead in its tracks.

Tracebacks can be messy, but this one is informative and relatively straightforward. What enables the suppression of exceptions related to logging is Handler.handleError(). When the handler calls .emit(), which is the method where it attempts to log the record, it falls back to .handleError() if something goes awry. Here’s the implementation of .emit() for the StreamHandler class:

def emit(self, record):
    try:
        msg = self.format(record)
        stream = self.stream
        stream.write(msg + self.terminator)
        self.flush()
    except Exception:
        self.handleError(record)

Any exception related to the formatting and writing gets caught rather than being raised, and handleError gracefully writes the traceback to sys.stderr.
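If you want to silence even that printed traceback, for example in production, the module-level logging.raiseExceptions attribute is what handleError() consults. Here is a minimal sketch; the decision to disable it is purely illustrative:

import logging

logging.raiseExceptions = False  # handleError() now ignores logging-related errors silently

# Without the flag, this formatting mistake would print "--- Logging error ---"
# and a traceback to stderr; with it, nothing is printed and execution continues.
logging.critical("This %s has too many arguments", "msg", "other")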

Logging Python Tracebacks

Speaking of exceptions and their tracebacks, what about cases where your program encounters them but should log the exception and keep chugging along in its execution?

Let’s walk through a couple of ways to do this.

Here’s a contrived example of a lottery simulator using code that isn’t Pythonic on purpose. You’re developing an online lottery game where users can wager on their lucky number:

import random

class Lottery(object):
    def __init__(self, n):
        self.n = n

    def make_tickets(self):
        for i in range(self.n):
            yield i

    def draw(self):
        pool = self.make_tickets()
        random.shuffle(pool)
        return next(pool)

Behind the frontend application sits the critical code below. You want to make sure that you keep track of any errors caused by the site that may make a user lose their money. The first (suboptimal) way is to use logging.error() and log the str form of the exception instance itself:

try:
    lucky_number = int(input("Enter your ticket number: "))
    drawn = Lottery(n=20).draw()
    if lucky_number == drawn:
        print("Winner chicken dinner!")
except Exception as e:
    # NOTE: See below for a better way to do this.
    logging.error("Could not draw ticket: %s", e)

This will only get you the actual exception message, rather than the traceback. You check the logs on your website’s server and find this cryptic message:

ERROR:root:Could not draw ticket: object of type 'generator' has no len()

Hmm. As the application developer, you’ve got a serious problem, and a user got ripped off as a result. But maybe this exception message itself isn’t very informative. Wouldn’t it be nice to see the lineage of the traceback that led to this exception?

The proper solution is to use logging.exception(), which logs a message with level ERROR and also displays the exception traceback. Replace the two final lines above with these:

except Exception:
    logging.exception("Could not draw ticket")

Now you get a better indication of what’s going on:

ERROR:root:Could not draw ticket
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "<stdin>", line 9, in draw
  File "lib/python3.7/random.py", line 275, in shuffle
    for i in reversed(range(1, len(x))):
TypeError: object of type 'generator' has no len()

Using exception() saves you from having to reference the exception yourself because logging pulls it in with sys.exc_info().

This makes it clearer that the problem stems from random.shuffle(), which needs to know the length of the object it is shuffling. Because our Lottery class passes a generator to shuffle(), it gets held up and raises before the pool can be shuffled, much less generate a winning ticket.
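For completeness, here is one possible fix as a sketch (it is not from the original article): materialize the generator before shuffling it.

def draw(self):
    # random.shuffle() needs a sized, mutable sequence, so turn the
    # generator of tickets into a list first.
    pool = list(self.make_tickets())
    random.shuffle(pool)
    return pool[0]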

In large, full-blown applications, you’ll find logging.exception() to be even more useful when deep, multi-library tracebacks are involved, and you can’t step into them with a live debugger like pdb.

The code for logging.Logger.exception(), and hence logging.exception(), is just a single line:

def exception(self, msg, *args, exc_info=True, **kwargs):
    self.error(msg, *args, exc_info=exc_info, **kwargs)

That is, logging.exception() just calls logging.error() with exc_info=True, which is otherwise False by default. If you want to log an exception traceback but at a level different than logging.ERROR, just call that function or method with exc_info=True.
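For example, this illustrative snippet (not from the source) logs the same traceback at WARNING level instead of ERROR:

try:
    1 / 0
except ZeroDivisionError:
    # Same traceback output as logging.exception(), but at WARNING severity.
    logging.warning("Recoverable problem", exc_info=True)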

Keep in mind that exception() should only be called in the context of an exception handler, inside of an except block:

for i in data:
    try:
        result = my_longwinded_nested_function(i)
    except ValueError:
        # We are in the context of exception handler now.
        # If it's unclear exactly *why* we couldn't process
        # `i`, then log the traceback and move on rather than
        # ditching completely.
        logger.exception("Could not process %s", i)
        continue

Use this pattern sparingly rather than as a means to suppress any exception. It can be most helpful when you’re debugging a long function call stack where you’re otherwise seeing an ambiguous, unclear, and hard-to-track error.

Conclusion

Pat yourself on the back, because you’ve just walked through almost 2,000 lines of dense source code. You’re now better equipped to deal with the logging package!

Keep in mind that this tutorial has been far from exhaustive in covering all of the classes found in the logging package. There’s even more machinery that glues everything together. If you’d like to learn more, then you can look into the Formatter classes and the separate modules logging/config.py and logging/handlers.py.


Categories: FLOSS Project Planets

InternetDevels: Drupal multisite: how it works and when it is the best choice

Planet Drupal - Wed, 2019-05-22 09:40

Drupal allows you to find the most effective solution for every business case. Large organizations often need not just one website but a collection of related sites that are easy to manage. To meet this need, there is the Drupal multisite feature, which has become popular among education websites, government websites, corporate websites, and so on. Let’s explore what Drupal multisite is, how it works, and when it is the best choice.

Read more
Categories: FLOSS Project Planets

Lullabot: A Patch-less Composer Workflow for Drupal Using Forks

Planet Drupal - Wed, 2019-05-22 08:46

One of the significant advantages of using free and open-source software is the ability to fix bugs without being dependent on external teams or organizations. As PHP and JavaScript developers, our team deals with in-development bugs and new features daily. So how do we get these changes onto sites before they’re released upstream?

Categories: FLOSS Project Planets

Thomas Goirand: Wrote a Debian mirror setup puppet module in 3 hours

Planet Debian - Wed, 2019-05-22 08:40

As I needed the functionality, I wrote this:

https://salsa.debian.org/openstack-team/puppet/puppet-module-debian-archvsync

The matching Debian package has been uploaded and is now in the NEW queue. Thanks a lot to Waldi for packaging ftpsync, which I’m using.

Comments and contributions are welcome.

Categories: FLOSS Project Planets

OpenSense Labs: The Alternatives in Decoupled Drupal Ecosystem

Planet Drupal - Wed, 2019-05-22 08:34
The Alternatives in Decoupled Drupal Ecosystem Shankar Wed, 05/22/2019 - 22:48

More and more users want content to be available via a host of different devices. Various new interfaces and devices are thronging the technology landscape and are touted to bring about sweeping changes. There is even talk of a website-less future. Smart wearables, the Internet of Things, conversational user interfaces and so on have been gaining traction and are changing the way we experience the internet. New web-enabled devices need the same content that websites do, but in a different format, which creates complications in the way we develop. Disseminating content can have different needs from one setup to another.


The content management system (CMS) space is rife with labels such as ‘decoupled’ and ‘hybrid’, which need to be dissected for a better understanding of their benefits. Decoupling the backend of a content management system (CMS) from the frontend can be a remarkable solution for a lot of issues that arise when you move away from standard website-only deliveries. Decoupling the CMS streamlines the process of republishing content across numerous channels ranging from websites to applications. Decoupled CMS is not new, but as the digital arena changes, it gets more and more important.

The Drupal Connect Source: Dries Buytaert’s blog

As one of the leaders in the CMS market, Drupal takes a content-as-a-service approach that enables you to get out of the page-based mentality. It gives you the flexibility to separate content management from content display and allows front-end developers to build engrossing customer experiences.

Decoupled Drupal implementations are becoming ubiquitous due to its immense capability in giving a push to digital innovation

Decoupled Drupal implementations, which involves a full separation of concerns between the structure of your content and its presentation, are becoming ubiquitous due to Drupal’s immense capability in giving a push to digital innovation and being a great solution in adopting novel approaches.

The Drupal community has been working on a plenitude of API-first architectures by utilising its core REST, JSON:API and GraphQL features. But there are alternative solutions available in the decoupled Drupal ecosystem, too, that can be of great significance. Let’s explore them:

RESTful Web Services and others

While RESTful Web services module helps in exposing entities and other resources as RESTful web API and is one of the most important modules when it comes to decoupled Drupal implementation, there are several others that can come handy as well.

Implementing the Apache CouchDB specification and focussing upon content staging use cases as part of the Drupal Deploy ecosystem, RELAXed Web Services module can be of immense help. The CouchDB helps in storing data within JSON documents that are exposed via a RESTful API and unlike Drupal’s core REST API, it enables not just GET, POST, DELETE but also PUT and COPY.

There’s a REST UI module that offers a user interface (UI) for the configuration of Drupal 8’s REST module. You can utilise the Webform REST module, which helps in retrieving and submitting webforms via REST. For extending core’s REST Export views display so that any JSON string field is automatically converted to JSON in the output, the REST Export Nested module can be useful. By offering a REST endpoint, the REST menu items module can be helpful in retrieving menu items on the basis of menu name. And when you need to use REST for resetting the password, the REST Password Request module can be helpful. It is worth noting that Webform REST, REST Export Nested and REST Password Request are not covered by Drupal’s security advisory policy but are really valuable.

JSON:API and others

JSON: API module is an essential tool when you are formatting your JSON responses and is one of the most sought-after options in decoupled Drupal implementations. But there are, again, plenty of options for additional features and functionality.

When in need of overriding the defaults that are preconfigured upon the installation of JSON: API module, you can leverage JSON: API Extras. By offering interfaces to override default settings and registering new ones that the resultant API need to follow, JSON: API Extras can be hugely advantageous in aliasing resource names and paths, modifying field output via field enhancers, aliasing field names and so on. There is JSON API File module that enables enhanced files integration for JSON: API module. And for the websites that expose consumer-facing APIs through REST, JSON: API or something similar, Key auth module offers simple key-based authentication to every user.

For streamlined ingestion of content by other applications, Lightning API module gives you a standard API with authentication and authorisation and utilises JSON: API and OAuth2 standards through JSON API and Simple Oauth modules.

GraphQL and others

GraphQL module is great for exposing Drupal entities to your GraphQL client applications. There are some more useful modules based on GraphQL. To enable integration between GraphQL and Search API modules, there is a GraphQL Search API module. Injecting data into Twig templates by just adding a GraphQL query can be done with GraphQL Twig module. To expose Drupal content entity definitions through GraphQL via GraphQL Drupal module and develop forms or views for entities via front-end automatically, you have GraphQL Entity Definitions module.

OpenAPI and related modules

OpenAPI describes RESTful web services on the basis of the schema. There is OpenAPI module in Drupal, which is not covered by Drupal’s security advisory policies but can integrate well with both core REST and JSON: API for documentation of available entity routes in those services. And for implementing an API to display OpenAPI specifications inside a Drupal website, you get OpenAPI UI module. And ReDoc for OpenAPI UI module offers the ReDoc library, which is an Open API/ Swagger-generated API reference documentation, for displaying Open API specifications inside Drupal website. Then there is Swagger UI for OpenAPI UI module that gives you Swagger UI library in order to display OpenAPI specifications inside Drupal site. You can also utilise Schemata module that provides schemas for facilitating generated documentation and generated code.

Contentajs and related modules

The need for a Node.js proxy that acts as middleware between the Drupal content API layer and a JavaScript application can be addressed with the help of Contenta.js. For Contenta.js to function properly, the Contenta JS Drupal module is needed. This module is part of the Contenta CMS Drupal distribution. The Node.js proxy is essential for decoupled Drupal because of data aggregation, server-side rendering and caching. Contenta.js comprises a multithreaded Node.js server, a Subrequests server for the facilitation of request aggregation, a Redis integration and a simple approach to CORS (Cross-origin resource sharing).

There are several effective modules that are part of the Contenta CMS decoupled distribution and integrate perfectly with Contenta.js. You have the Subrequests module that tells the system to execute multiple requests in a single bootstrap and then return everything. The JSON-RPC module offers a lightweight protocol for remote procedure calls and serves as a canonical foundation for Drupal administrative actions that rely on more than just REST. To resolve path aliases, the Decoupled Router module gives you an endpoint and redirects for entity-related routes.

In order to enable decoupled Drupal implementations to have variations on the basis of consumer who is making the request, you can use Consumers module. You can use Consumer Image Styles that integrates well with JSON: API for giving image styles to your images in the decoupled Drupal project. And there is Simple OAuth module for implementing OAuth 2.0 Authorisation Framework RFC.

Of relating to JavaScript frameworks

Decoupled Blocks module is great for progressive decoupling and enables front-end developers to write custom blocks in whatever javascript framework that they prefer to work on. If you want to implement using Vue.js, you can use Decoupled Blocks: Vuejs. It should be noted that both of these are not covered by Drupal’s security advisory policies.

React Comments comes as a drop-in replacement for the Drupal core comment module frontend.

Also, there is a jDrupal module, which is a JavaScript library and API for Drupal REST and can be leveraged for Drupal 8 application development.

Conclusion

With Drupal as the decoupled web content management system, developers can use a plenitude of technologies to render the front-end experience.

RESTful web services, GraphQL and JSON: API are not the only resources that you get in the decoupled Drupal ecosystem. In fact, there are plenty of other alternatives that can be of paramount importance.

Drupal development is our forte and bringing stupendous digital experience to our partners has been our prime objective. Contact us at hello@opensenselabs.com and let us know how you want us to be a part of your digital transformation goals.

Categories: FLOSS Project Planets

Srijan Technologies: Layout Builder: A DIY Tool to Design a Website with Drupal 8

Planet Drupal - Wed, 2019-05-22 07:37

There's this millennial phrase "You snooze, you lose" and nowhere is it more applicable than for enterprises today. Opportunities to engage with your audience pop up fast and disappear faster, and you have to act quickly if you wish to leverage any of it. 

Categories: FLOSS Project Planets

Catalin George Festila: Python 3.7.3 : Using the win32com - part 001.

Planet Python - Wed, 2019-05-22 07:19
The tutorial is about the win32com python module using python 3.7.3. The first example is simple:

import sys
import win32com.client
from win32com.client import constants

speaker = win32com.client.Dispatch("SAPI.SpVoice")
print("Type word or phrase, then enter.")
while 1:
    text_to_say = input("> ")
    speaker.Speak(text_to_say)

Let's run it. The Microsoft Speech SDK is used to speak what you type in.
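As a small, hedged extension that is not part of the original post: the SAPI.SpVoice COM object also exposes Rate and Volume properties, so the voice can be tuned before speaking (the specific values below are assumptions within the documented ranges):

import win32com.client

speaker = win32com.client.Dispatch("SAPI.SpVoice")
speaker.Rate = 2      # roughly -10 (slow) to 10 (fast)
speaker.Volume = 80   # 0 to 100
speaker.Speak("Testing the adjusted voice settings.")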
Categories: FLOSS Project Planets

FSF Events: Richard Stallman - "The Free Software Movement" (Brno, Czech Republic)

GNU Planet! - Wed, 2019-05-22 07:07
The Free Software Movement campaigns for computer users' freedom to cooperate and control their own computing. The Free Software Movement developed the GNU operating system, typically used together with the kernel Linux, specifically to make these freedoms possible.

This speech by Richard Stallman will be nontechnical, admission is gratis, and the public is encouraged to attend.

Location: University Cinema Scala, Moravské nám. 127/3, 602 00 Brno, Czech Republic

Registration, which can be done anonymously, while not required, is appreciated; it will help us ensure we can accommodate all the people who wish to attend.

Please fill out our contact form, so that we can contact you about future events in and around Brno.

Categories: FLOSS Project Planets

Wim Leers: API-First Drupal: what's new in 8.7?

Planet Drupal - Wed, 2019-05-22 07:00

Drupal 8.7 was released with huge API-First improvements!

The REST API only got fairly small improvements in the 7th minor release of Drupal 8, because it reached a good level of maturity in 8.6 (where we added file uploads, exposed more of Drupal’s data model and improved DX.), and because we of course were busy with JSON:API :)

Thanks to everyone who contributed!

  1. JSON:API #2843147

    Need I say more? :) After keeping you informed about progress in October, November, December and January, followed by one of the most frantic Drupal core contribution periods I’ve ever experienced, the JSON:API module was committed to Drupal 8.7 on March 21, 2019.

    Surely you’re now thinking: But when should I use Drupal’s JSON:API module instead of the REST module? Well, choose REST if you have non-entity data you want to expose. In all other cases, choose JSON:API.

    In short, after having seen people use the REST module for years, we believe JSON:API makes solving the most common needs an order of magnitude simpler — sometimes two. Many things that require a lot of client-side code, a lot of requests or even server-side code you get for free when using JSON:API. That’s why we added it, of course :) See Dries’ overview if you haven’t already. It includes a great video that Gabe recorded that shows how it simplifies common needs. And of course, check out the spec!

  2. datetime & daterange fields now respect standards #2926508

    They were always intended to respect standards, but didn’t.

    For a field configured to store date + time:

    "field_datetime":[{ "value": "2017-03-01T20:02:00", }] ⬇ "field_datetime":[{ "value": "2017-03-01T20:02:00+11:00", }]

    The site’s timezone is now present! This is now a valid RFC3339 datetime string.

    For a field configured to store date only:

    "field_dateonly":[{ "value": "2017-03-01T20:02:00", }] ⬇ "field_dateonly":[{ "value": "2017-03-01", }]

    Time information used to be present despite this being a date-only field! RFC3339 only covers combined date and time representations. For date-only representations, we need to use ISO 8601. There isn’t a particular name for this date-only format, so we have to hardcode the format. See https://en.wikipedia.org/wiki/ISO_8601#Calendar_dates.

    Note: backward compatibility is retained: a datetime string without timezone information is still accepted. This is discouraged though, and deprecated: it will be removed for Drupal 9.

    Previous Drupal 8 minor releases have fixed similar problems. But this one, for the first time ever, implements these improvements as a @DataType-level normalizer, rather than a @FieldType-level normalizer. Perhaps that sounds too far into the weeds, but … it means it fixed this problem in both rest.module and jsonapi.module! We want the entire ecosystem to follow this example.

  3. rest.module has proven to be maintainable!

    In the previous update, I proudly declared that Drupal 8 core’s rest.module was in a “maintainable” state. I’m glad to say that has proven to be true: six months later, the number of open issues still fit on “a single page” (fewer than 50). :) So don’t worry as a rest.module user: we still triage the issue queue, it’s still maintained. But new features will primarily appear for jsonapi.module.

Want more nuance and detail? See the REST: top priorities for Drupal 8.7.x issue on drupal.org.

Are you curious what we’re working on for Drupal 8.8? Want to follow along? Click the follow button at API-First: top priorities for Drupal 8.8.x — whenever things on the list are completed (or when the list gets longer), a comment gets posted. It’s the best way to follow along closely!1

Was this helpful? Let me know in the comments!

For reference, historical data:

  1. ~50 comments per six months — so very little noise. ↩︎

Categories: FLOSS Project Planets

EuroPython Society: EuroPython 2019: First batch of accepted sessions

Planet Python - Wed, 2019-05-22 06:25


Our program work group (WG) has been working hard over the weekend to select the sessions for EuroPython 2019.

We’re now happy to announce the first batch with:

brought to you by 129 speakers.

More advanced talks than in previous EuroPython editions

We are glad that the Python community has heard our call to submit more advanced talks this year. This will make EuroPython 2019 even more interesting than our previous edition.

Waiting List

Some talks are still in the waiting list. We will inform all speakers who have submitted talks about the selection status by email.

PyData EuroPython 2019

As in previous years, we will have lots of PyData talks and trainings:

  • 4 trainings on Monday and Tuesday (July 8-9)
  • 34 talks on Wednesday and Thursday (July 10-11)
  • no PyData talks on Friday (July 12)
Full Schedule

The full schedule will be available early in June.

Training Tickets & Combined Tickets

If you want to attend the trainings offered on the training days (July 8-9), please head over to the registration page in the next couple of days. Sales are going strong and we only have 300 tickets available.

Combined tickets are new this year and allow attending both training and conference days with a single ticket.


Enjoy,

EuroPython 2019 Team
https://ep2019.europython.eu/
https://www.europython-society.org/ 

Categories: FLOSS Project Planets

EuroPython: EuroPython 2019: First batch of accepted sessions

Planet Python - Wed, 2019-05-22 06:11

Our program work group (WG) has been working hard over the weekend to select the sessions for EuroPython 2019.

We’re now happy to announce the first batch with:

brought to you by 129 speakers.

More advanced talks than in previous EuroPython editions

We are glad that the Python community has heard our call to submit more advanced talks this year. This will make EuroPython 2019 even more interesting than our previous edition.

Waiting List

Some talks are still in the waiting list. We will inform all speakers who have submitted talks about the selection status by email.

PyData EuroPython 2019

As in previous years, we will have lots of PyData talks and trainings:

  • 4 trainings on Monday and Tuesday (July 8-9)
  • 34 talks on Wednesday and Thursday (July 10-11)
  • no PyData talks on Friday (July 12)
Full Schedule

The full schedule will be available early in June.

Training Tickets & Combined Tickets

If you want to attend the trainings offered on the training days (July 8-9), please head over to the registration page in the next couple of days. Sales are going strong and we only have 300 tickets available.

Combined tickets are new this year and allow attending both training and conference days with a single ticket.


Enjoy,

EuroPython 2019 Team
https://ep2019.europython.eu/
https://www.europython-society.org/ 

Categories: FLOSS Project Planets

Kushal Das: Game of guessing colors using CircuitPython

Planet Python - Wed, 2019-05-22 05:54

Every participant of PyCon US 2019 received a Circuit Playground Express (cpx) in the swag bag from DigiKey and Adafruit, which is some of the best conference swag I have seen. The only other thing that comes to mind is the YubiKeys sponsored by Yubico at a rootconf a few years ago.

I did not play around much with my cpx during PyCon, but decided to go through the documentation and examples over the last week. I used the Mu editor (thank you @ntoll) to write a small game.

The goal is to guess a color for the next NeoPixel on the board and then press Button A to see if you guessed right or not. Py and I are continuously playing this for the last weeks.

The idea of CircuitPython, where we can connect the device to a computer, start editing code and see the changes live, is super fantastic and straightforward. It takes almost no time to start working on these boards, and the documentation is unambiguous and full of examples. Py (our 4-year-old daughter) is so excited that now she wants to learn programming so that she can build her own things with this board.
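For readers who want to try something similar, here is a minimal sketch of such a guessing game. This is not Kushal's actual code; the color table and the debounce delay are my own assumptions, and it relies on the adafruit_circuitplayground library that ships with CircuitPython on the Circuit Playground Express:

# code.py -- minimal color guessing-game sketch
import random
import time

from adafruit_circuitplayground.express import cpx

COLORS = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
}

cpx.pixels.brightness = 0.1
position = 0

while True:
    if cpx.button_a:
        # Say your guess out loud, press Button A, then check the pixel.
        name = random.choice(list(COLORS))
        cpx.pixels[position] = COLORS[name]
        print("Pixel", position, "is", name)
        position = (position + 1) % 10  # the board has 10 NeoPixels
        time.sleep(0.5)                 # crude debounce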

Categories: FLOSS Project Planets

KDSoap 1.8.0 released

Planet KDE - Wed, 2019-05-22 05:32

KDAB has released a new version of KDSoap. This is version 1.8.0 and comes more than a year after the last release (1.7.0).

KDSoap is a tool for creating client applications for web services without the need for any further component such as a dedicated web server.

KDSoap lets you interact with applications which have APIs that can be exported as SOAP objects. The web service then provides a machine-accessible interface to its functionality via HTTP. Find out more...

Version 1.8.0 has a large number of improvements and fixes:

General
  • Fixed internally-created faults lacking an XML element name (so e.g. toXml() would abort)
  • KDSoapMessage::messageAddressingProperties() is now correctly filled in when receiving a message with WS-Addressing in the header
Client-side
  • Added support for timing out requests (default 30 minutes, configurable with KDSoapClientInterface::setTimeout())
  • Added support for soap 1.2 faults in faultAsString()
  • Improved detection of soap 1.2 faults in HTTP response
  • Stricter namespace check for Fault elements being received
  • Report client-generated faults as SOAP 1.2 if selected
  • Fixed error code when authentication failed
  • Autodeletion of jobs is now configurable (github pull #125)
  • Added error details in faultAsString() – and the generated lastError() – coming from the SOAP 1.2 detail element.
  • Fixed memory leak in KDSoapClientInterface::callNoReply
  • Added support for WS-UsernameToken, see KDSoapAuthentication
  • Extended KDSOAP_DEBUG functionality (e.g. “KDSOAP_DEBUG=http,reformat” will now print http-headers and pretty-print the xml)
  • Added support for specifying requestHeaders as part of KDSoapJob via KDSoapJob::setRequestHeaders()
  • Renamed the missing KDSoapJob::returnHeaders() to KDSoapJob::replyHeaders(), and provide an implementation
  • Made KDSoapClientInterface::soapVersion() const
  • Added lastFaultCode() for error handling after sync calls. Same as lastErrorCode() but it returns a QString rather than an int.
  • Added conversion operator from KDDateTime to QVariant to void implicit conversion to base QDateTime (github issue #123).
Server-side
  • New method KDSoapServerObjectInterface::additionalHttpResponseHeaderItems to let server objects return additional http headers. This can be used to implement support for CORS, using KDSoapServerCustomVerbRequestInterface to implement OPTIONS response, with “Access-Control-Allow-Origin” in the headers of the response (github issue #117).
  • Stopped generation of two job classes with the same name, when two bindings have the same operation name. Prefixed one of them with the binding name (github issue #139 part 1)
  • Prepended this-> in method class to avoid compilation error when the variable and the method have the same name (github issue #139 part 2)
WSDL parser / code generator changes, applying to both client and server side
  • Source incompatible change: all deserialize() functions now require a KDSoapValue instead of a QVariant. If you use a deserialize(QVariant) function, you need to port your code to use KDSoapValue::setValue(QVariant) before deserialize()
  • Source incompatible change: all serialize() functions now return a KDSoapValue instead of a QVariant. If you use a QVariant serialize() function, you need to port your code to use QVariant KDSoapValue::value() after serialize()
  • Source incompatible change: xs:QName is now represented by KDQName instead of QString, which allows the namespace to be extracted. The old behaviour is available via KDQName::qname().
  • Fixed double-handling of empty elements
  • Fixed fault elements being generated in the wrong namespace, must be SOAP-ENV:Fault (github issue #81).
  • Added import-path argument for setting the local path to get (otherwise downloaded) files from.
  • Added -help-on-missing option to kdwsdl2cpp to display extra help on missing types.
  • Added C++17 std::optional as possible return value for optional elements.
  • Added -both to create both header(.h) and implementation(.cpp) files in one run
  • Added -namespaceMapping @mapping.txt to import url=code mappings, affects C++ class name generation
  • Added functionality to prevent downloading the same WSDL/XSD file twice in one run
  • Added “hasValueFor{MemberName}()” accessor function, for optional elements
  • Generated services now include soapVersion() and endpoint() accessors to match the setSoapVersion(…) and setEndpoint(…) mutators
  • Added support for generating messages for WSDL files without services or bindings
  • Fixed erroneous QT_BEGIN_NAMESPACE around forward-declarations like Q17__DialogType.
  • KDSoapValue now stores the namespace declarations during parsing of a message and writes namespace declarations during sending of a message
  • Avoid serialize crash with required polymorphic types, if the required variable wasn’t actually provided
  • Fixed generated code for restriction to base class (it wouldn’t compile)
  • Prepended “undef daylight” and “undef timezone” to all generated files, to fix compilation errors in wsdl files that use those names, due to nasty Windows macros
  • Added generation for default attribute values.

Get KDSoap…

KDSoap on github…

The post KDSoap 1.8.0 released appeared first on KDAB.

Categories: FLOSS Project Planets

Petter Reinholdtsen: Nikita version 0.4 released - free software archive API server

Planet Debian - Wed, 2019-05-22 05:30

This morning, a new release of Nikita Noark 5 core project was announced on the project mailing list. The Nikita free software solution is an implementation of the Norwegian archive standard Noark 5 used by government offices in Norway. These were the changes in version 0.4 since version 0.3, see the email link above for links to a demo site:

  • Roll out OData handling to all endpoints where applicable
  • Changed the relation key for "ny-journalpost" to the official one.
  • Better link generation on outgoing links.
  • Tidy up code and make code and approaches more consistent throughout the codebase
  • Update rels to be in compliance with updated version in the interface standard
  • Avoid printing links on empty objects as they can't have links
  • Small bug fixes and improvements
  • Start moving generation of outgoing links to @Service layer so access control can be used when generating links
  • Log exception that was being swallowed so it's traceable
  • Fix name mapping problem
  • Update templated printing so templated should only be printed if it is set true. Requires more work to roll out across entire application.
  • Remove Record->DocumentObject as per domain model of n5v4
  • Add ability to delete lists filtered with OData
  • Return NO_CONTENT (204) on delete as per interface standard
  • Introduce support for ConstraintViolationException exception
  • Make Service classes extend NoarkService
  • Make code base respect X-Forwarded-Host, X-Forwarded-Proto and X-Forwarded-Port
  • Update CorrespondencePart* code to be more in line with Single Responsibility Principle
  • Make package name follow directory structure
  • Make sure Document number starts at 1, not 0
  • Fix issues discovered by FindBugs
  • Update from Date to ZonedDateTime
  • Fix wrong tablename
  • Introduce Service layer tests
  • Improvements to CorrespondencePart
  • Continued work on Class / Classificationsystem
  • Fix feature where authors were stored as storageLocations
  • Update HQL builder for OData
  • Update OData search capability from webpage

If free and open standardized archiving API sound interesting to you, please contact us on IRC (#nikita on irc.freenode.net) or email (nikita-noark mailing list).

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

Categories: FLOSS Project Planets

Patrick Kennedy: Setting Up GitLab CI for a Python Application

Planet Python - Wed, 2019-05-22 00:28

Introduction

This blog post describes how to configure a Continuous Integration (CI) process on GitLab for a python application.  It utilizes one of my python applications (bild) to show how to set up the CI process.

In this blog post, I’ll show how I set up a GitLab CI process to run the following jobs on a python application:

  • Unit and functional testing using pytest
  • Linting using flake8
  • Static analysis using pylint
  • Type checking using mypy

What is CI?

To me, Continuous Integration (CI) means frequently testing your application in an integrated state.  However, the term ‘testing’ should be interpreted loosely as this can mean:

  • Integration testing
  • Unit testing
  • Functional testing
  • Static analysis
  • Style checking (linting)
  • Dynamic analysis

To facilitate running these tests, it’s best to have these tests run automatically as part of your configuration management (git) process.  This is where GitLab CI is awesome!

In my experience, I’ve found it really beneficial to develop a test script locally and then add it to the CI process that gets automatically run on GitLab CI.

Getting Started with GitLab CI

Before jumping into GitLab CI, here are a few definitions:

 – pipeline: a set of tests to run against a single git commit.

  – runner: GitLab uses runners on different servers to actually execute the tests in a pipeline; GitLab provides runners to use, but you can also spin up your own servers as runners.

  – job: a single test being run in a pipeline.

  – stage: a group of related tests being run in a pipeline.

Here’s a screenshot from GitLab CI that helps illustrate these terms:

GitLab utilizes the ‘.gitlab-ci.yml’ file to run the CI pipeline for each project.  The ‘.gitlab-ci.yml’ file should be found in the top-level directory of your project.

While there are different methods of running a test in GitLab CI, I prefer to utilize a Docker container to run each test.  I’ve found the overhead in spinning up a Docker container to be trivial (in terms of execution time) when doing CI testing.

Creating a Single Job in GitLab CI

The first job that I want to add to GitLab CI for my project is to run a linter (flake8).  In my local development environment, I would run this command:

$ flake8 --max-line-length=120 bild/*.py

This command can be transformed into a job on GitLab CI in the ‘.gitlab-ci.yml’ file:

image: "python:3.7"

before_script:
  - python --version
  - pip install -r requirements.txt

stages:
  - Static Analysis

flake8:
  stage: Static Analysis
  script:
    - flake8 --max-line-length=120 bild/*.py

This YAML file tells GitLab CI what to run on each commit pushed up to the repository. Let’s break down each section…

The first line (image: "python:3.7") instructs GitLab CI to utilize Docker for performing ALL of the tests for this project, specifically to use the ‘python:3.7‘ image that is found on DockerHub.

The second section (before_script) is the set of commands to run in the Docker container before starting each job. This is really beneficial for getting the Docker container in the correct state by installing all the python packages needed by the application.

The third section (stages) defines the different stages in the pipeline. There is only a single stage (Static Analysis) at this point, but later a second stage (Test) will be added. I like to think of stages as a way to group together related jobs.

The fourth section (flake8) defines the job; it specifies the stage (Static Analysis) that the job should be part of and the commands to run in the Docker container for this job. For this job, the flake8 linter is run against the python files in the application.

At this point, the updates to ‘.gitlab-ci.yml’ file should be commited to git and then pushed up to GitLab:

git add .gitlab-ci.yml
git commit -m "Updated .gitlab-ci.yml"
git push origin master

GitLab CI will see that there is a CI configuration file (.gitlab-ci.yml) and use this to run the pipeline:

This is the start of a CI process for a python project!  GitLab CI will run a linter (flake8) on every commit that is pushed up to GitLab for this project.

Running Tests with pytest on GitLab CI

When I run my unit and functional tests with pytest in my development environment, I run the following command in my top-level directory:

$ pytest

My initial attempt at creating a new job to run pytest in ‘.gitlab-ci.yml’ file was:

image: "python:3.7"

before_script:
  - python --version
  - pip install -r requirements.txt

stages:
  - Static Analysis
  - Test

...

pytest:
  stage: Test
  script:
    - pytest

However, this did not work as pytest was unable to find the ‘bild’ module (ie. the source code) to test:

$ pytest
========================= test session starts ==========================
platform linux -- Python 3.7.3, pytest-4.5.0, py-1.5.4, pluggy-0.11.0
rootdir: /builds/patkennedy79/bild, inifile: pytest.ini
plugins: datafiles-2.0
collected 0 items / 3 errors

============================ ERRORS ====================================
___________ ERROR collecting tests/functional/test_bild.py _____________
ImportError while importing test module '/builds/patkennedy79/bild/tests/functional/test_bild.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests/functional/test_bild.py:4: in <module>
    from bild.directory import Directory
E   ModuleNotFoundError: No module named 'bild'
...
==================== 3 error in 0.24 seconds ======================
ERROR: Job failed: exit code 1

The problem encountered here is that the ‘bild’ module is not able to be found by the test_*.py files, as the top-level directory of the project was not being specified in the system path:

$ python -c "import sys;print(sys.path)"
['', '/usr/local/lib/python37.zip', '/usr/local/lib/python3.7', '/usr/local/lib/python3.7/lib-dynload', '/usr/local/lib/python3.7/site-packages']

The solution that I came up with was to add the top-level directory to the system path within the Docker container for this job:

pytest:
  stage: Test
  script:
    - pwd
    - ls -l
    - export PYTHONPATH="$PYTHONPATH:."
    - python -c "import sys;print(sys.path)"
    - pytest

With the updated system path, this job was able to run successfully:

$ pwd
/builds/patkennedy79/bild
$ export PYTHONPATH="$PYTHONPATH:."
$ python -c "import sys;print(sys.path)"
['', '/builds/patkennedy79/bild', '/usr/local/lib/python37.zip', '/usr/local/lib/python3.7', '/usr/local/lib/python3.7/lib-dynload', '/usr/local/lib/python3.7/site-packages']

Final GitLab CI Configuration

Here is the final .gitlab-ci.yml file that runs the static analysis jobs (flake8, mypy, pylint) and the tests (pytest):

image: "python:3.7"

before_script:
  - python --version
  - pip install -r requirements.txt

stages:
  - Static Analysis
  - Test

mypy:
  stage: Static Analysis
  script:
    - pwd
    - ls -l
    - python -m mypy bild/file.py
    - python -m mypy bild/directory.py

flake8:
  stage: Static Analysis
  script:
    - flake8 --max-line-length=120 bild/*.py

pylint:
  stage: Static Analysis
  allow_failure: true
  script:
    - pylint -d C0301 bild/*.py

unit_test:
  stage: Test
  script:
    - pwd
    - ls -l
    - export PYTHONPATH="$PYTHONPATH:."
    - python -c "import sys;print(sys.path)"
    - pytest

Here is the resulting output from GitLab CI:

One item that I’d like to point out is that pylint is reporting some warnings, but I find this to be acceptable. However, I still want to have pylint running in my CI process, but I don’t care if it has failures. I’m more concerned with trends over time (are there warnings being created). Therefore, I set the pylint job to be allowed to fail via the ‘allow_failure’ setting:

pylint:
  stage: Static Analysis
  allow_failure: true
  script:
    - pylint -d C0301 bild/*.py
Categories: FLOSS Project Planets

David Bremner: Dear UNB: please leave my email alone.

Planet Debian - Tue, 2019-05-21 23:00

1 Background

Apparently motivated by recent phishing attacks against @unb.ca addresses, UNB's Integrated Technology Services unit (ITS) recently started adding banners to the body of email messages. Despite (cough) several requests, they have been unable and/or unwilling to let people opt out of this. Recently ITS has reduced the size of the banner; this does not change the substance of what is discussed here. In this blog post I'll try to document some of the reasons this reduces the utility of my UNB email account.

2 What do I know about email?

I have been using email since 1985 1. I have administered my own UNIX-like systems since the mid 1990s. I am a Debian Developer 2. Debian is a mid-sized organization (there are more Debian Developers than UNB faculty members) that functions mainly via email (including discussions and a bug tracker). I maintain a mail user agent (informally, an email client) called notmuch 3. I administer my own (non-UNB) email server. I have spent many hours reading RFCs 4. In summary, my perspective might be different from that of an enterprise email administrator, but I do know something about the nuts and bolts of how email works.

3 What's wrong with a helpful message? 3.1 It's a banner ad.

I don't browse the web without an ad-blocker and I don't watch TV with advertising in it. Apparently the main source of advertising in my life is a service provided by my employer. Some readers will probably dispute my description of a warning label inserted by an email provider as "advertising". Note that this is information inserted by a third party to promote their own (well intentioned) agenda, and inserted in an intentionally attention-grabbing way. Advertisements from charities are still advertisements. Preventing phishing attacks is important, but so are an almost countless number of priorities of other units of the University. For better or worse those units are not so far able to insert messages into my email. As a thought experiment, imagine inserting a banner into every PDF file stored on UNB servers reminding people of the fiscal year end.

3.2 It makes us look unprofessional.

Because the banner is contained in the body of email messages, it almost inevitably ends up in replies. This lets funding agencies, industrial partners, and potential graduate students know that we consider them as potentially hostile entities. Suggesting that people should edit their replies is not really an acceptable answer, since it suggests that it is acceptable to download the work of maintaining the previous level of functionality onto each user of the system.

3.3 It doesn't help me

I have an archive of 61270 email messages received since 2003. Of these 26215 claim to be from a unb.ca address 5. So historically about 42% of the mail to arrive at my UNB mailbox is internal 6. This means that warnings will occur in the majority of messages I receive. I think the onus is on the proposer to show that a warning that occurs in the large majority of messages will have any useful effect.

3.4 It disrupts my collaboration with open-source projects

Part of my job is to collaborate with various open source projects. A prominent example is Eclipse OMR 7, the technological driver for a collaboration with IBM that has brought millions of dollars of graduate student funding to UNB. Git is now the dominant version control system for open source projects, and one popular way of using git is via git-send-email 8

Adding a banner breaks the delivery of patches by this method. In a previous experiment I did about a month ago, it "only" caused the banner to end up in the git commit message. Those of you familiar with software development will know that this is roughly the equivalent of walking out of the bathroom with toilet paper stuck to your shoe. You'd rather avoid it, but it's not fatal. The current implementation breaks things completely by quoted-printable re-encoding the message. In particular '=' gets transformed to '=3D' like the following

-+      gunichar *decoded=g_utf8_to_ucs4_fast (utf8_str, -1, NULL);
-+      const gunichar *p = decoded;
++      gunichar *decoded=3Dg_utf8_to_ucs4_fast (utf8_str, -1, NULL);

I'm not currently sure if this is a bug in git or some kind of failure in the re-encoding. It would likely require an investment of several hours of time to localize that.

3.5 It interferes with the use of cryptography.

Unlike many people, I don't generally read my email on a phone. This means that I don't rely on the previews that are apparently disrupted by the presence of a warning banner. On the other hand I do send and receive OpenPGP signed and encrypted messages. The effects of the banner on both signed and encrypted messages are similar, so I'll stick to discussing signed messages here. There are two main ways of signing a message. The older method, still unfortunately required for some situations, is called "inline PGP". The signed region is re-encoded, which causes gpg to issue a warning about a buggy MTA 9, namely gpg: quoted printable character in armor - probably a buggy MTA has been used. This is not exactly confidence inspiring. The more robust and modern standard is PGP/MIME. Here the insertion of a banner does not provoke warnings from the cryptography software, but it does make it much harder to read the message (before and after screenshots are given below). Perhaps more importantly it changes the message from one which is entirely signed or encrypted 10, to one which is partially signed or encrypted. Such messages were instrumental in the EFAIL exploit 11 and will probably soon be rejected by modern email clients.

Figure 1: Intended view of PGP/MIME signed message

Figure 2: View with added banner

Footnotes: 1

On Multics, when I was a high school student

2

https://www.debian.org

3

https://notmuchmail.org

4

IETF Requests for Comments, which define most of the standards used by email systems.

5

possibly overcounting some spam as UNB originating email

6

In case it's not obvious dear reader, communicating with the world outside UNB is part of my job.

7

https://github.com/eclipse/omr

8

Some important projects function exclusively that way. See https://git-send-email.io/ for more information.

9

Mail Transfer Agent

10

https://dkg.fifthhorseman.net/blog/e-mail-cryptography.html

11

https://efail.de

Author: David Bremner

Created: 2019-05-22 Wed 17:04


Categories: FLOSS Project Planets
