Russ Allbery: Free software log (September 2017)

Planet Debian - Mon, 2017-10-16 00:47

I said that I was going to start writing these regularly, so I'm going to stick to it, even when the results are rather underwhelming. One of the goals is to make the time for more free software work, and I do better at doing things that I record.

The only piece of free software work for September was that I made rra-c-util compile cleanly with the Clang static analyzer. This was fairly tedious work that mostly involved unconfusing the compiler or converting (semi-intentional) crashes into explicit asserts, but it unblocks using the Clang static analyzer as part of the automated test suite of my other projects that are downstream of rra-c-util.

One of the semantic changes I made was that the vector utilities in rra-c-util (which maintain a resizable array of strings) now always allocate room for at least one string pointer. This wastes a small amount of memory for empty vectors that are never used, but ensures that the strings struct member is always valid. This isn't, strictly speaking, a correctness fix, since all the checks were correct, but after some thought, I decided that humans might have the same problem that the static analyzer had. It's a lot easier to reason about a field that's never NULL. Similarly, the replacement function for a missing reallocarray now does an allocation of size 1 if given a size of 0, just to avoid edge case behavior. (I'm sure the behavior of a realloc with size 0 is defined somewhere in the C standard, but if I have to look it up, I'd rather not make a human reason about it.)

I started on, but didn't finish, making rra-c-util compile without Clang warnings (at least for a chosen set of warnings). By far the hardest problem here is the Clang warnings for comparisons between unsigned and signed integers. In theory, I like this warning, since it's the cause of a lot of very obscure bugs. In practice, gah does C ever do this all over the place, and it's incredibly painful to avoid. (One of the biggest offenders is write, which returns a ssize_t that you almost always want to compare against a size_t.) I did a bunch of mechanical work, but I now have a lot of bits of code like:

    if (status < 0)
        return;
    written = (size_t) status;
    if (written < avail)
        buffer->left += written;

which is ugly and unsatisfying. And I also have a ton of casts, such as with:

buffer_resize(buffer, (size_t) st.st_size + used);

since st.st_size is an off_t, which may be signed. This is all deeply unsatisfying and ugly, and I think it makes the code moderately harder to read, but I do think the warning will potentially catch bugs and even security issues.
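To make the hazard behind this warning concrete, here is a small illustration written in Python with ctypes rather than C. The helper name as_size_t and the sample values are hypothetical and not part of rra-c-util. It only shows what happens to a negative value once it is converted to size_t, which is the kind of silent conversion the warning is meant to flag.

```python
import ctypes


def as_size_t(value):
    """Reinterpret an integer the way a C conversion to size_t would.

    ctypes integer types do no overflow checking, so a negative value
    wraps around to a large unsigned one.
    """
    return ctypes.c_size_t(value).value


status = -1    # e.g. an error return from write()
avail = 4096   # an unsigned buffer size

# As plain (signed, unbounded) integers the comparison behaves as expected.
signed_less = status < avail

# Once the negative value is converted to size_t, the comparison flips.
converted = as_size_t(status)
unsigned_less = converted < avail

print(signed_less, unsigned_less)
```

This is why the explicit status < 0 check has to come before the cast in the C snippet above.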

I'm still torn. Maybe I can find some nice macros or programming styles to avoid the worst of this problem. It definitely requires more thought, rather than just committing this huge mechanical change with lots of ugly code.

Mostly, this kind of nonsense makes me want to stop working on C code and go finish learning Rust....

Anyway, apart from work, the biggest thing I managed to do last month that was vaguely related to free software was upgrading my personal servers to stretch (finally). That mostly went okay; only a few things made it unnecessarily exciting.

The first was that one of my systems had a very tiny / partition that was too small to hold the downloaded debs for the upgrade, so I had to resize it (VM disk, partition, and file system), and that was a bit exciting because it has an old-style DOS partition table that isn't aligned (hmmm, which is probably why disk I/O is so slow on those VMs), so I had to use the obsolete fdisk -c=dos mode because I wasn't up for replacing the partition right then.

The second was that my first try at an upgrade died with a segfault during the libc6 postinst and then every executable segfaulted. A mild panic and a rescue disk later (and thirty minutes and a lot of swearing), I tracked the problem down to libc6-xen. Nothing in the dependency structure between jessie and stretch forces libc6-xen to be upgraded in lockstep or removed, but it's earlier in the search path. So ld.so gets upgraded, and then finds the old libc6 from the libc6-xen package, and the mismatch causes immediate segfaults. A chroot dpkg --purge from the rescue disk solved the problem as soon as I knew what was going on, but that was a stressful half-hour.

The third problem was something I should have known was going to be an issue: an old Perl program that does some internal stuff for one of the services I ran had a defined @array test that has been warning for eons and that I never fixed. That became a full syntax error with the most recent Perl, and then I fixed it incorrectly the first time and had a bunch of trouble tracking down what I'd broken. All sorted out now, and everything is happily running stretch. (ejabberd, which other folks had mentioned was a problem, went completely smoothly, although I suspect I now have too many of the plugin packages installed and should do a purging.)

Categories: FLOSS Project Planets

Lullabot: Behind the Screens with Chris Teitzel

Planet Drupal - Mon, 2017-10-16 00:00
Chris Teitzel of Cellar Door Media gives us a preview of Security Saturday at BadCamp 2017 and provides some great tips for securing your website. He tells us why we should always say yes to the community; you never know where it's going to lead. Chris also shares some amazing stories about bringing a Drupal-based communications tool, developed from the DrupalCon Denver Tropo Hackathon, to Haiti in 2012 to help with relief efforts after the devastating 2010 earthquake.

Bay Area Drupal Camp: BADCamp 2017 starts this Wednesday

Planet Drupal - Sun, 2017-10-15 23:00
Anne, Sun, 10/15/2017 - 8:00pm

BADCamp kicks off this Wednesday! We are looking forward to seeing you and are excited to share some logistical details and tips for making the most of your time at BADCamp.

Where do I register and pick up my badge?

Central BADCamp registration opens at 8:15 am each morning. It’s located in the Martin Luther King (MLK) Student Union, on the 3rd Floor in the Kerr Lobby.

Map to Martin Luther King Student Union

2495 Bancroft Way, at Telegraph Avenue

University of California

Berkeley CA 94720


If you are attending a summit at the Marsh Art Center, badges will be available for pick up when you arrive.

Map to Marsh Art Center

2120 Allston Way

Berkeley, CA 94704


Be sure to come back to BADCamp Expo Hall at MLK Pauley West during breaks. We’ll have coffee, pinball, 15-min relaxation massages and a chance to thank our generous sponsors ... many are hiring!

Here is an overview of what is happening at each venue.


Where is everything? Where do I go?
  • Take a look at our Event Timeline to find out what is happening when.

  • Check out the Venues to see what is happening where.

  • Be sure to log in and make your session schedule in advance and then follow along on your mobile device.


What’s the 411 on food and beverage?

As always, BADCamp will provide an endless supply of coffee, tea, and water.


Wednesday & Thursday

  • All Training & Summits will have light snacks in the morning.

  • For lunch, head outside to discover some of Berkeley’s best food!

  • Stop by the Sponsor Expo on Thursday for specialty coffees.


Friday & Saturday

  • The Sponsor Expo will have a waffle bar and specialty coffees.

  • Lunch is sponsored by Acquia on both Friday & Saturday.



Parking at Berkeley can be extremely challenging. Consider taking public transportation whenever possible.  


Anything else to know?
  • Wear good shoes! You will do a lot of walking.

  • Bring layers, or donate at the $100 level and get not only an awesome 2017 t-shirt but also a solar charger and a cozy BADCamp hoodie!

  • The Fires. We are keeping an eye on things and will provide updates if the air quality or anything else impacts the event. Stay in touch with BADCamp on Twitter.

  • The BADCamp Contribution Lounge is open 24 hours, beginning at 9 am on Wednesday and going until 10 pm on Saturday. We welcome and encourage you to participate!



Our sponsors make the magic of BADCamp possible! Stop by to thank them at the event. As an added bonus, many of them are hiring! We’re also sending an extra big virtual hug to Platform.sh, Pantheon & Acquia for sponsoring at the Core level and helping to keep BADCamp AWESOME!


Norbert Preining: Fixing vim in Debian

Planet Debian - Sun, 2017-10-15 21:18

I was wondering for quite some time why vim on my server behaves so stupidly with respect to the mouse: it jumped around, and copy and paste wasn’t possible the usual way. All this despite having

set mouse=

in my /etc/vim/vimrc.local. Finally I found out why, thanks to bug #864074 and fixed it.

The whole mess comes from the fact that, when there is no ~/.vimrc, vim loads defaults.vim after vimrc.local, thus overwriting several settings put in there.

There is a comment (I didn’t see, though) in /etc/vim/vimrc explaining this:

    " Vim will load $VIMRUNTIME/defaults.vim if the user does not have a vimrc.
    " This happens after /etc/vim/vimrc(.local) are loaded, so it will override
    " any settings in these files.
    " If you don't want that to happen, uncomment the below line to prevent
    " defaults.vim from being loaded.
    " let g:skip_defaults_vim = 1

I agree that this is a good way to set up vim on a normal installation of Vim, but the Debian package could do better. The problem is laid out clearly in the bug report: If there is no ~/.vimrc, settings in /etc/vim/vimrc.local are overwritten.

This is as counterintuitive as it can be in Debian – and I don’t know any other package that does it in a similar way.

Since the settings in defaults.vim are quite reasonable, I want to have them, but only fix a few of the items I disagree with, like the mouse. At the end what I did is the following in my /etc/vim/vimrc.local:

    if filereadable("/usr/share/vim/vim80/defaults.vim")
      source /usr/share/vim/vim80/defaults.vim
    endif
    " now set the line that the defaults file is not reloaded afterwards!
    let g:skip_defaults_vim = 1
    " turn off mouse
    set mouse=
    " other override settings go here

There is probably a better way to get a generic load statement that does not depend on the Vim version, but for now I am fine with that.


Matthew Rocklin: Streaming Dataframes

Planet Python - Sun, 2017-10-15 20:00

This work is supported by Anaconda Inc and the Data Driven Discovery Initiative from the Moore Foundation

This post is about experimental software. This is not ready for public use. All code examples and API in this post are subject to change without warning.


This post describes a prototype project to handle continuous data sources of tabular data using Pandas and Streamz.


Some data never stops. It arrives continuously in a constant, never-ending stream. This happens in financial time series, web server logs, scientific instruments, IoT telemetry, and more. Algorithms to handle this data are slightly different from what you find in libraries like NumPy and Pandas, which assume that they know all of the data up-front. It’s still possible to use NumPy and Pandas, but you need to combine them with some cleverness and keep enough intermediate data around to compute marginal updates when new data comes in.

Example: Streaming Mean

For example, imagine that we have a continuous stream of CSV files arriving and we want to print out the mean of our data over time. Whenever a new CSV file arrives we need to recompute the mean of the entire dataset. If we’re clever we keep around enough state so that we can compute this mean without looking back over the rest of our historical data. We can accomplish this by keeping running totals and running counts as follows:

    total = 0
    count = 0
    for filename in filenames:  # filenames is an infinite iterator
        df = pd.read_csv(filename)
        total = total + df.sum()
        count = count + df.count()
        mean = total / count
        print(mean)

Now as we add new files to our filenames iterator our code prints out new means that are updated over time. We don’t have a single mean result, we have a continuous stream of mean results that are each valid for the data up to that point. Our output data is an infinite stream, just like our input data.
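The idea that the output is itself a stream can be made explicit with a generator. The following is a minimal sketch using only the standard library, with plain lists of numbers standing in for CSV files; the function name running_means and the sample batches are hypothetical and not part of streamz or Pandas.

```python
def running_means(batches):
    """Yield the mean of all values seen so far, once per arriving batch.

    Only two pieces of state are kept (a running total and a running
    count), so no historical data has to be revisited.
    """
    total = 0.0
    count = 0
    for batch in batches:
        total += sum(batch)
        count += len(batch)
        if count:  # nothing to report until at least one value has arrived
            yield total / count


# Three small batches standing in for three arriving files.
batches = [[1, 2, 3], [4, 5], [6]]
means = list(running_means(batches))
print(means)  # [2.0, 3.0, 3.5]
```

Because running_means is lazy, batches could just as well be an infinite iterator, in which case the means would be produced one at a time as data arrives.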

When our computations are linear and straightforward like this a for loop suffices. However when our computations have several streams branching out or converging, possibly with rate limiting or buffering between them, this for-loop approach can grow complex and difficult to manage.


A few months ago I pushed a small library called streamz, which handled control flow for pipelines, including linear map operations, operations that accumulated state, branching, joining, as well as back pressure, flow control, feedback, and so on. Streamz was designed to handle all of the movement of data and signaling of computation at the right time. This library was quietly used by a couple of groups and now feels fairly clean and useful.

Streamz was designed to handle the control flow of such a system, but did nothing to help you with streaming algorithms. Over the past week I’ve been building a dataframe module on top of streamz to help with common streaming tabular data situations. This module uses Pandas and implements a subset of the Pandas API, so hopefully it will be easy to use for programmers with existing Python knowledge.

Example: Streaming Mean

Our example above could be written as follows with streamz

    source = Stream.filenames('path/to/dir/*.csv')  # stream of filenames
    sdf = (source.map(pd.read_csv)                  # stream of Pandas dataframes
                 .to_dataframe(example=...))        # logical streaming dataframe
    sdf.mean().stream.sink(print)                   # printed stream of mean values

This example is no more clear than the for-loop version. On its own this is probably a worse solution than what we had before, just because it involves new technology. However it starts to become useful in two situations:

  1. You want to do more complex streaming algorithms

    sdf = sdf[sdf.name == 'Alice']
    sdf.x.groupby(sdf.y).mean().sink(print)
    # or
    sdf.x.rolling('300ms').mean()

    It would require more cleverness to build these algorithms with a for loop as above.

  2. You want to do multiple operations, deal with flow control, etc.

    sdf.mean().sink(print)
    sdf.x.sum().rate_limit(0.500).sink(write_to_database)
    ...

    Branching off computations, routing data correctly, and handling time can all be challenging to accomplish consistently.
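To give a sense of the extra state a hand-written loop needs for the groupby-mean case in item 1, here is a minimal standard-library sketch. The class name StreamingGroupMean and the (key, value) row format are assumptions made for illustration; this is not how streamz implements it.

```python
from collections import defaultdict


class StreamingGroupMean:
    """Keep per-group running totals and counts across arriving batches."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    def update(self, rows):
        """Fold in a batch of (key, value) pairs and return current means."""
        for key, value in rows:
            self.totals[key] += value
            self.counts[key] += 1
        return {key: self.totals[key] / self.counts[key] for key in self.totals}


agg = StreamingGroupMean()
first = agg.update([("a", 1.0), ("b", 10.0)])
second = agg.update([("a", 3.0)])
print(first)   # {'a': 1.0, 'b': 10.0}
print(second)  # {'a': 2.0, 'b': 10.0}
```

Even this small case needs a dedicated state object, which is the bookkeeping a streaming dataframe layer is meant to hide.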

Jupyter Integration and Streaming Outputs

During development we’ve found it very useful to have live updating outputs in Jupyter.

Usually when we evaluate code in Jupyter we have static inputs and static outputs:

However now both our inputs and our outputs are live:

We accomplish this using a combination of ipywidgets and Bokeh plots both of which provide nice hooks to change previous Jupyter outputs and work well with the Tornado IOLoop (streamz, Bokeh, Jupyter, and Dask all use Tornado for concurrency). We’re able to build nicely responsive feedback whenever things change.

In the following example we build our CSV to dataframe pipeline that updates whenever new files appear in a directory. Whenever we drag files to the data directory we see that all of our outputs update.

What is supported?

This project is very young and could use some help. There are plenty of holes in the API. That being said, the following works well:

Elementwise operations:

    sdf['z'] = sdf.x + sdf.y
    sdf = sdf[sdf.z > 2]

Simple reductions:

    sdf.sum()
    sdf.x.mean()

Groupby reductions:


Rolling reductions by number of rows or time window

    sdf.rolling(20).x.mean()
    sdf.rolling('100ms').x.quantile(0.9)

Real time plotting with Bokeh (one of my favorite features)
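As a rough picture of what a rolling reduction by number of rows has to track, here is a minimal standard-library sketch of a rolling mean. The function name rolling_means is hypothetical, and this is not the streamz or Pandas implementation; unlike Pandas, it also emits means for the initial, partially filled windows.

```python
from collections import deque


def rolling_means(values, window):
    """Yield the mean of the last `window` values after each new value."""
    buf = deque(maxlen=window)
    total = 0.0
    for value in values:
        if len(buf) == window:
            total -= buf[0]  # the oldest value is about to be evicted
        buf.append(value)
        total += value
        yield total / len(buf)


result = list(rolling_means([1, 2, 3, 4, 5], window=3))
print(result)  # [1.0, 1.5, 2.0, 3.0, 4.0]
```

A time-based window such as '100ms' would need timestamps in the buffer and eviction by age rather than by count.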


What’s missing?
  1. Parallel computing: The core streamz library has an optional Dask backend for parallel computing. I haven’t yet made any attempt to attach this to the dataframe implementation.
  2. Data ingestion from common streaming sources like Kafka. We’re in the process now of building asynchronous-aware wrappers around Kafka Python client libraries, so this is likely to come soon.
  3. Out-of-order data access: soon after parallel data ingestion (like reading from multiple Kafka partitions at once) we’ll need to figure out how to handle out-of-order data access. This is doable, but will take some effort. This is where more mature libraries like Flink are quite strong.
  4. Performance: Some of the operations above (particularly rolling operations) do involve non-trivial copying, especially with larger windows. We’re relying heavily on the Pandas library which wasn’t designed with rapidly changing data in mind. Hopefully future iterations of Pandas (Arrow/libpandas/Pandas 2.0?) will make this more efficient.
  5. Filled out API: Many common operations (like variance) haven’t yet been implemented. Some of this is due to laziness and some is due to wanting to find the right algorithm.
  6. Robust plotting: Currently this works well for numeric data with a timeseries index but not so well for other data.

But most importantly this needs use by people with real problems to help us understand what here is valuable and what is unpleasant.

Help would be welcome with any of this.

You can install this from GitHub:

pip install git+https://github.com/mrocklin/streamz.git

Documentation and code are here:

Current work

Current and upcoming work is focused on data ingestion from Kafka and parallelizing with Dask.


Simple is Better Than Complex: A Complete Beginner's Guide to Django - Part 7

Planet Python - Sun, 2017-10-15 20:00

Welcome to the last part of our tutorial series! In this tutorial, we are going to deploy our Django application to a production server. We are also going to configure an Email service and HTTPS certificates for our servers.

At first, I thought about giving one example using a Virtual Private Server (VPS), which is more generic, and then another using a Platform as a Service such as Heroku. But it was too much detail, so I ended up creating this tutorial focused on VPSs.

Our project is live! If you want to check online before you go through the text, this is the application we are going to deploy: www.djangoboards.com.

Version Control

Version control is an extremely important topic in software development, especially when working in teams and maintaining production code while several features are being developed in parallel. Whether it’s a one-developer project or a multi-developer project, every project should use version control.

There are several version control systems out there. Perhaps because of the popularity of GitHub, Git became the de facto standard in version control. So if you are not familiar with version control, Git is a good place to start. There are many tutorials, courses, and resources in general, so it’s easy to find help.

GitHub and Code School have a great interactive tutorial about Git, which I used years ago when I started moving from SVN to Git. It’s a very good introduction.

This is such an important topic that I probably should have brought it up in the first tutorial. But the truth is I wanted the focus of this tutorial series to be on Django. If all this is new for you, don’t worry. It’s important to take one step at a time. Your first project won’t be perfect. It’s important to keep learning and evolving your skills slowly but steadily.

A very good thing about Git is that it’s much more than just a version control system. There’s a rich ecosystem of tools and services built around it. Some good examples are continuous integration, deployment, code review, code quality, and project management.

Using Git to support the deployment process of Django projects works very well. It’s a convenient way to pull the latest version from the source code repository or to roll back to a specific version in case of a problem. There are many services that integrate with Git to automate test execution and deployment, for example.

If you don’t have Git installed on your local machine, grab the installer from https://git-scm.com/downloads.

Basic Setup

First thing, set your identity:

    git config --global user.name "Vitor Freitas"
    git config --global user.email vitor@simpleisbetterthancomplex.com

In the project root (the same directory as manage.py is), initialize a git repository:

    git init
    Initialized empty Git repository in /Users/vitorfs/Development/myproject/.git/

Check the status of the repository:

    git status
    On branch master

    Initial commit

    Untracked files:
      (use "git add <file>..." to include in what will be committed)

      accounts/
      boards/
      manage.py
      myproject/
      requirements.txt
      static/
      templates/

    nothing added to commit but untracked files present (use "git add" to track)

Before we proceed in adding the source files, create a new file named .gitignore in the project root. This special file will help us keep the repository clean, without unnecessary files like cache files or logs for example.

You can grab a generic .gitignore file for Python projects from GitHub.

Make sure to rename it from Python.gitignore to just .gitignore (the dot is important!).

You can complement the .gitignore file telling it to ignore SQLite database files for example:


    __pycache__/
    *.py[cod]
    .env
    venv/

    # SQLite database files
    *.sqlite3

Now add the files to the repository:

git add .

Notice the dot here. The command above is telling Git to add all untracked files within the current directory.

Now make the first commit:

git commit -m "Initial commit"

Always write a commit message saying what this commit is about, briefly describing what you have changed.

Remote Repository

Now let’s set up GitHub as a remote repository. First, create a free account on GitHub, then confirm your email address. After that, you will be able to create public repositories.

For now, just pick a name for the repository. Don’t initialize it with a README, and don’t add a .gitignore or a license yet. Make sure you start with an empty repository:

After you create the repository you should see something like this:

Now let’s configure it as our remote repository:

git remote add origin git@github.com:sibtc/django-boards.git

Now push the code to the remote server, that is, to the GitHub repository:

    git push origin master
    Counting objects: 84, done.
    Delta compression using up to 4 threads.
    Compressing objects: 100% (81/81), done.
    Writing objects: 100% (84/84), 319.70 KiB | 0 bytes/s, done.
    Total 84 (delta 10), reused 0 (delta 0)
    remote: Resolving deltas: 100% (10/10), done.
    To git@github.com:sibtc/django-boards.git
     * [new branch]      master -> master

I created this repository just to demonstrate the process of creating a remote repository with an existing code base. The source code of the project is officially hosted in this repository: https://github.com/sibtc/django-beginners-guide.

Project Settings

No matter if the code is stored in a public or private remote repository, sensitive information should never be committed and pushed to the remote repository. That includes secret keys, passwords, API keys, etc.

At this point, we have to deal with two specific types of configuration in our settings.py module:

  • Sensitive information such as keys and passwords;
  • Configurations that are specific to a given environment.

Passwords and keys can be stored in environment variables or using local files (not committed to the remote repository):

    # environment variables
    import os
    SECRET_KEY = os.environ['SECRET_KEY']

    # or local files
    with open('/etc/secret_key.txt') as f:
        SECRET_KEY = f.read().strip()

For that, there’s a great utility library called Python Decouple that I use in every single Django project I develop. It reads configuration variables from a local file named .env and from environment variables. It also provides an interface to define default values and to cast the data to int, bool, or list when applicable.

It’s not mandatory, but I really find it a very useful tool. And it works like a charm with services like Heroku.

First, let’s install it:

pip install python-decouple


    from decouple import config

    SECRET_KEY = config('SECRET_KEY')

Now we can place the sensitive information in a special file named .env (notice the dot in front) in the same directory where the manage.py file is:

    myproject/
     |-- myproject/
     |    |-- accounts/
     |    |-- boards/
     |    |-- myproject/
     |    |-- static/
     |    |-- templates/
     |    |-- .env        <-- here!
     |    |-- .gitignore
     |    |-- db.sqlite3
     |    +-- manage.py
     +-- venv/



The .env file is ignored in the .gitignore file, so every time we deploy the application or run it on a different machine, we will have to create a .env file and add the necessary configuration.

Now let’s install another library to help us write the database connection in a single line. This way it’s easier to write different database connection strings in different environments:

pip install dj-database-url

For now, these are all the settings we need to decouple:


    from decouple import config, Csv
    import dj_database_url

    SECRET_KEY = config('SECRET_KEY')
    DEBUG = config('DEBUG', default=False, cast=bool)
    ALLOWED_HOSTS = config('ALLOWED_HOSTS', cast=Csv())
    DATABASES = {
        'default': dj_database_url.config(
            default=config('DATABASE_URL')
        )
    }

Example of a .env file for our local machine:

    SECRET_KEY=rqr_cjv4igscyu8&&(0ce(=sy=f2)p=f_wn&@0xsp7m$@!kp=d
    DEBUG=True
    ALLOWED_HOSTS=.localhost,

Notice that in the DEBUG configuration we have a default, so in production we can ignore this configuration because it will be set to False automatically, as it is supposed to be.
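To illustrate how a default and a cast interact for a setting like DEBUG, here is a minimal standard-library sketch. The names read_setting and to_bool are hypothetical, and this is not python-decouple's implementation: it only looks values up in a mapping (os.environ by default) and does not parse a .env file.

```python
import os

_MISSING = object()  # sentinel so that None and False can be real defaults


def to_bool(raw):
    """Convert common textual spellings of booleans to a bool."""
    if isinstance(raw, bool):
        return raw
    text = str(raw).strip().lower()
    if text in {"1", "true", "yes", "on"}:
        return True
    if text in {"0", "false", "no", "off", ""}:
        return False
    raise ValueError("not a boolean: %r" % (raw,))


def read_setting(name, default=_MISSING, cast=str, environ=None):
    """Return the cast value of `name`, falling back to `default` if unset."""
    environ = os.environ if environ is None else environ
    if name in environ:
        return cast(environ[name])
    if default is _MISSING:
        raise KeyError("%s is not set and has no default" % name)
    return cast(default)


# Local machine: DEBUG is set explicitly.
local_debug = read_setting("DEBUG", default=False, cast=to_bool,
                           environ={"DEBUG": "True"})
# Production: DEBUG is absent, so the safe default applies.
prod_debug = read_setting("DEBUG", default=False, cast=to_bool, environ={})
print(local_debug, prod_debug)  # True False
```

A setting without a default, such as SECRET_KEY, fails loudly when it is missing, which is the behavior you want for required secrets.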

Now the ALLOWED_HOSTS will be transformed into a list like ['.localhost', ''. ]. Now, this is on our local machine, for production we will set it to something like ['.djangoboards.com', ] or whatever domain you have.

This particular configuration makes sure your application is only served to this domain.

Tracking Requirements

It’s good practice to keep track of the project’s dependencies, so that they are easier to install on another machine.

We can check the currently installed Python libraries by running the command:

    pip freeze
    dj-database-url==0.4.2
    Django==1.11.6
    django-widget-tweaks==1.4.1
    Markdown==2.6.9
    python-decouple==3.1
    pytz==2017.2

Create a file named requirements.txt in the project root, and add the dependencies there:


    dj-database-url==0.4.2
    Django==1.11.6
    django-widget-tweaks==1.4.1
    Markdown==2.6.9
    python-decouple==3.1

I kept the pytz==2017.2 out, because it is automatically installed by Django.

You can update your source code repository:

    git add .
    git commit -m "Add requirements.txt file"
    git push origin master

Domain Name

If we are going to deploy a Django application properly, we will need a domain name. It’s important to have a domain name to serve the application, configure an email service and configure an https certificate.

Lately, I’ve been using Namecheap a lot. You can get a .com domain for $8.88/year, or if you are just trying things out, you could register a .xyz domain for $0.99/year.

Anyway, you are free to use any registrar. To demonstrate the deployment process, I registered the www.DjangoBoards.com domain.

Deployment Strategy

Here is an overview of the deployment strategy we are going to use in this tutorial:

The cloud is our Virtual Private Server provided by Digital Ocean. You can sign up to Digital Ocean using my affiliate link to get a free $10 credit (only valid for new accounts).

Up front we will have NGINX, illustrated by the ogre. NGINX will receive all requests to the server, but it won’t try to do anything smart with the request data. All it is going to do is decide whether the requested information is a static asset that it can serve by itself, or something more complicated. If it is something more complicated, it will pass the request to Gunicorn.

NGINX will also be configured with HTTPS certificates, meaning it will only accept requests via HTTPS. If the client tries to request via HTTP, NGINX will first redirect the user to HTTPS, and only then will it decide what to do with the request.

We are also going to install Certbot to automatically renew the Let’s Encrypt certificates.

Gunicorn is an application server. Depending on the number of processors the server has, it can spawn multiple workers to process multiple requests in parallel. It manages the workload and executes the Python and Django code.
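As a rough guide to sizing the worker pool, the Gunicorn documentation suggests starting with (2 x number of cores) + 1 workers. A Gunicorn configuration file is a plain Python module, so this can be sketched as follows. The helper name suggested_workers is hypothetical, and this is not the configuration used for the deployment described here.

```python
import multiprocessing


def suggested_workers(cpu_count=None):
    """Return the (2 x cores) + 1 starting point for the number of workers."""
    cores = multiprocessing.cpu_count() if cpu_count is None else cpu_count
    return 2 * cores + 1


# In a Gunicorn config module, the `workers` setting controls how many
# worker processes are spawned.
workers = suggested_workers()
print(workers)
```

On the smallest single-core droplet this works out to three workers.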

Django is the one doing the hard work. It may access the database (PostgreSQL) or the file system. But for the most part, the work is done inside the views, rendering templates, all those things that we’ve been coding for the past weeks. After Django processes the request, it returns a response to Gunicorn, which returns the result to NGINX, which finally delivers the response to the client.

We are also going to install PostgreSQL, a production quality database system. Because of Django’s ORM system, it’s easy to switch databases.

The last step is to install Supervisor. It’s a process control system and it will keep an eye on Gunicorn and Django to make sure everything runs smoothly. If the server restarts, or if Gunicorn crashes, it will automatically restart it.

Deploying to a VPS (Digital Ocean)

You may use any other VPS (Virtual Private Server) you like. The configuration should be very similar, after all, we are going to use Ubuntu 16.04 as our server.

First, let’s create a new server (on Digital Ocean they call it “Droplet”). Select Ubuntu 16.04:

Pick the size. The smallest droplet is enough:

Then choose a hostname for your droplet (in my case “django-boards”):

If you have an SSH key, you can add it to your account. Then you will be able to log in to the server using it. Otherwise, they will email you the root password.

Now pick the server’s IP address:

Before we log in to the server, let’s point our domain name to this IP address. This will save some time because DNS settings usually take a few minutes to propagate.

So here we added two A records, one pointing to the naked domain “djangoboards.com” and the other one for “www.djangoboards.com”. We will use NGINX to configure a canonical URL.

Now let’s log in to the server using your terminal:

    ssh root@
    root@'s password:

Then you should see the following message:

    You are required to change your password immediately (root enforced)
    Welcome to Ubuntu 16.04.3 LTS (GNU/Linux 4.4.0-93-generic x86_64)

     * Documentation:  https://help.ubuntu.com
     * Management:     https://landscape.canonical.com
     * Support:        https://ubuntu.com/advantage

      Get cloud support with Ubuntu Advantage Cloud Guest:
        http://www.ubuntu.com/business/services/cloud

    0 packages can be updated.
    0 updates are security updates.

    Last login: Sun Oct 15 18:39:21 2017 from
    Changing password for root.
    (current) UNIX password:

Set the new password, and let’s start to configure the server.

    sudo apt-get update
    sudo apt-get -y upgrade

If you get any prompt during the upgrade, select the option “keep the local version currently installed”.

Python 3.6

    sudo add-apt-repository ppa:deadsnakes/ppa
    sudo apt-get update
    sudo apt-get install python3.6


sudo apt-get -y install postgresql postgresql-contrib


sudo apt-get -y install nginx


    sudo apt-get -y install supervisor
    sudo systemctl enable supervisor
    sudo systemctl start supervisor


    wget https://bootstrap.pypa.io/get-pip.py
    sudo python3.6 get-pip.py
    sudo pip3.6 install virtualenv

Application User

Create a new user with the command below:

adduser boards

Usually, I just pick the name of the application. Enter a password and optionally add some extra info to the prompt.

Now add the user to the sudoers list:

    gpasswd -a boards sudo

PostgreSQL Database Setup

First switch to the postgres user:

sudo su - postgres

Create a database user:

createuser u_boards

Create a new database and set the user as the owner:

createdb django_boards --owner u_boards

Define a strong password for the user:


We can now exit the postgres user:

exit

Django Project Setup

Switch to the application user:

sudo su - boards

First, we can check where we are:

pwd
/home/boards

Next, let’s clone the repository with our code:

git clone https://github.com/sibtc/django-beginners-guide.git

Start a virtual environment:

virtualenv venv -p python3.6

Initialize the virtualenv:

source venv/bin/activate

Install the requirements:

pip install -r django-beginners-guide/requirements.txt

We will have to add two extra libraries here, Gunicorn and the PostgreSQL driver:

pip install gunicorn
pip install psycopg2

Now inside the /home/boards/django-beginners-guide folder, let’s create a .env file to store the database credentials, the secret key and everything else:


SECRET_KEY=rqr_cjv4igscyu8&&(0ce(=sy=f2)p=f_wn&@0xsp7m$@!kp=d
ALLOWED_HOSTS=.djangoboards.com
DATABASE_URL=postgres://u_boards:BcAZoYWsJbvE7RMgBPzxOCexPRVAq@localhost:5432/django_boards

Here is the syntax of the database URL: postgres://db_user:db_password@db_host:db_port/db_name.
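To make the URL syntax concrete, here is a minimal sketch that splits such a URL into its parts using only Python's standard library. The helper name parse_database_url and the placeholder credentials are assumptions made for illustration; the project may well rely on a dedicated library for this step.

```python
from urllib.parse import unquote, urlsplit


def parse_database_url(url):
    """Split a database URL into its components (illustrative helper)."""
    parts = urlsplit(url)
    return {
        "engine": parts.scheme,
        "user": unquote(parts.username or ""),
        "password": unquote(parts.password or ""),
        "host": parts.hostname,
        "port": parts.port,
        # The path starts with "/", so strip it to get the database name.
        "name": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    # Placeholder values only, following db_user:db_password@db_host:db_port/db_name.
    print(parse_database_url("postgres://db_user:db_password@localhost:5432/db_name"))
```

Passwords containing characters such as @ or / would need to be percent-encoded in the URL, which is why the sketch runs the user and password through unquote.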

Now let’s migrate the database, collect the static files and create a super user:

cd django-beginners-guide
python manage.py migrate

Operations to perform:
  Apply all migrations: admin, auth, boards, contenttypes, sessions
Running migrations:
  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying admin.0002_logentry_remove_auto_add... OK
  Applying contenttypes.0002_remove_content_type_name... OK
  Applying auth.0002_alter_permission_name_max_length... OK
  Applying auth.0003_alter_user_email_max_length... OK
  Applying auth.0004_alter_user_username_opts... OK
  Applying auth.0005_alter_user_last_login_null... OK
  Applying auth.0006_require_contenttypes_0002... OK
  Applying auth.0007_alter_validators_add_error_messages... OK
  Applying auth.0008_alter_user_username_max_length... OK
  Applying boards.0001_initial... OK
  Applying boards.0002_auto_20170917_1618... OK
  Applying boards.0003_topic_views... OK
  Applying sessions.0001_initial... OK

Now the static files:

python manage.py collectstatic

Copying '/home/boards/django-beginners-guide/static/js/jquery-3.2.1.min.js'
Copying '/home/boards/django-beginners-guide/static/js/popper.min.js'
Copying '/home/boards/django-beginners-guide/static/js/bootstrap.min.js'
Copying '/home/boards/django-beginners-guide/static/js/simplemde.min.js'
Copying '/home/boards/django-beginners-guide/static/css/app.css'
Copying '/home/boards/django-beginners-guide/static/css/bootstrap.min.css'
Copying '/home/boards/django-beginners-guide/static/css/accounts.css'
Copying '/home/boards/django-beginners-guide/static/css/simplemde.min.css'
Copying '/home/boards/django-beginners-guide/static/img/avatar.svg'
Copying '/home/boards/django-beginners-guide/static/img/shattered.png'
...

This command copies all the static assets to an external directory where NGINX can serve the files for us. More on that later.

Now create a super user for the application:

python manage.py createsuperuser

Configuring Gunicorn

So, Gunicorn is the one responsible for executing the Django code behind a proxy server.

Create a new file named gunicorn_start inside /home/boards:

#!/bin/bash

NAME="django_boards"
DIR=/home/boards/django-beginners-guide
USER=boards
GROUP=boards
WORKERS=3
BIND=unix:/home/boards/run/gunicorn.sock
DJANGO_SETTINGS_MODULE=myproject.settings
DJANGO_WSGI_MODULE=myproject.wsgi
LOG_LEVEL=error

cd $DIR
source ../venv/bin/activate

export DJANGO_SETTINGS_MODULE=$DJANGO_SETTINGS_MODULE
export PYTHONPATH=$DIR:$PYTHONPATH

exec ../venv/bin/gunicorn ${DJANGO_WSGI_MODULE}:application \
  --name $NAME \
  --workers $WORKERS \
  --user=$USER \
  --group=$GROUP \
  --bind=$BIND \
  --log-level=$LOG_LEVEL \
  --log-file=-

This script will start the application server. We are providing some information such as where the Django project is, which application user should run the server, and so on.

Now make this file executable:

chmod u+x gunicorn_start

Create two empty folders, one for the socket file and one to store the logs:

mkdir run logs

Right now the directory structure inside /home/boards should look like this:

django-beginners-guide/
gunicorn_start
logs/
run/
staticfiles/
venv/

The staticfiles folder was created by the collectstatic command.

Configuring Supervisor

First, create an empty log file inside the /home/boards/logs/ folder:

touch logs/gunicorn.log

Now create a new supervisor file:

sudo vim /etc/supervisor/conf.d/boards.conf

[program:boards]
command=/home/boards/gunicorn_start
user=boards
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/home/boards/logs/gunicorn.log

Save the file and run the commands below:

sudo supervisorctl reread
sudo supervisorctl update

Now check the status:

sudo supervisorctl status boards
boards                           RUNNING   pid 308, uptime 0:00:07

Configuring NGINX

Next step is to set up the NGINX server to serve the static files and to pass the requests to Gunicorn:

Add a new configuration file named boards inside /etc/nginx/sites-available/:

upstream app_server {
    server unix:/home/boards/run/gunicorn.sock fail_timeout=0;
}

server {
    listen 80;
    server_name www.djangoboards.com;  # here can also be the IP address of the server

    keepalive_timeout 5;
    client_max_body_size 4G;

    access_log /home/boards/logs/nginx-access.log;
    error_log /home/boards/logs/nginx-error.log;

    location /static/ {
        alias /home/boards/staticfiles/;
    }

    # checks for static file, if not found proxy to app
    location / {
        try_files $uri @proxy_to_app;
    }

    location @proxy_to_app {
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_redirect off;
        proxy_pass http://app_server;
    }
}

Create a symbolic link to the sites-enabled folder:

sudo ln -s /etc/nginx/sites-available/boards /etc/nginx/sites-enabled/boards

Remove the default NGINX website:

sudo rm /etc/nginx/sites-enabled/default

Restart the NGINX service:

sudo service nginx restart

At this point, if the DNS records have already propagated, the website should be available at the URL www.djangoboards.com.

Configuring an Email Service

One of the best options to get started is Mailgun. It offers a very reliable free plan covering 12,000 emails per month.

Sign up for a free account, then just follow the steps; it’s very straightforward. You will have to work together with the service where you registered your domain. In my case, it was Namecheap.

Click on add domain to add a new domain to your account. Follow the instructions and make sure you use “mg.” subdomain:

Now grab the first set of DNS records, it’s two TXT records:

Add it to your domain, using the web interface offered by your registrar:

Do the same thing with the MX records:

Add them to the domain:

Now this step is not mandatory, but since we are already here, confirm it as well:

After adding all the DNS records, click the Check DNS Records Now button:

Now we need to have some patience. Sometimes it takes a while to validate the DNS.

Meanwhile, we can configure the application to receive the connection parameters.


EMAIL_BACKEND = config('EMAIL_BACKEND', default='django.core.mail.backends.smtp.EmailBackend')
EMAIL_HOST = config('EMAIL_HOST', default='')
EMAIL_PORT = config('EMAIL_PORT', default=587, cast=int)
EMAIL_HOST_USER = config('EMAIL_HOST_USER', default='')
EMAIL_HOST_PASSWORD = config('EMAIL_HOST_PASSWORD', default='')
EMAIL_USE_TLS = config('EMAIL_USE_TLS', default=True, cast=bool)

DEFAULT_FROM_EMAIL = 'Django Boards <noreply@djangoboards.com>'
EMAIL_SUBJECT_PREFIX = '[Django Boards] '
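The pattern used in these settings (look up a name in the environment, fall back to a default, then cast the raw string) can be sketched with the standard library alone. The helpers env_config and to_bool below are hypothetical names written for this illustration; they are not the implementation of the config function used in the project, only an approximation of the behaviour the settings rely on.

```python
import os

_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off", ""}


def to_bool(value):
    """Interpret common textual spellings of a boolean."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    raise ValueError("not a boolean: %r" % (value,))


def env_config(name, default=None, cast=str, environ=None):
    """Return environ[name] passed through cast, or default when unset."""
    environ = os.environ if environ is None else environ
    if name not in environ:
        return default
    return cast(environ[name])


if __name__ == "__main__":
    # A stand-in for the values a .env file would provide.
    fake_env = {"EMAIL_PORT": "2525", "EMAIL_USE_TLS": "False"}
    print(env_config("EMAIL_PORT", default=587, cast=int, environ=fake_env))
    print(env_config("EMAIL_USE_TLS", default=True, cast=to_bool, environ=fake_env))
    print(env_config("EMAIL_HOST", default="", environ=fake_env))
```

The explicit to_bool matters because environment values are always strings: calling the built-in bool on the string "False" would give True.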

Then, my local machine .env file would look like this:

SECRET_KEY=rqr_cjv4igscyu8&&(0ce(=sy=f2)p=f_wn&@0xsp7m$@!kp=d
DEBUG=True
ALLOWED_HOSTS=.localhost,
DATABASE_URL=sqlite:///db.sqlite3
EMAIL_BACKEND=django.core.mail.backends.console.EmailBackend

And my production .env file would look like this:

SECRET_KEY=rqr_cjv4igscyu8&&(0ce(=sy=f2)p=f_wn&@0xsp7m$@!kp=d
ALLOWED_HOSTS=.djangoboards.com
DATABASE_URL=postgres://u_boards:BcAZoYWsJbvE7RMgBPzxOCexPRVAq@localhost:5432/django_boards
EMAIL_HOST=smtp.mailgun.org
EMAIL_HOST_USER=postmaster@mg.djangoboards.com
EMAIL_HOST_PASSWORD=ED2vmrnGTM1Rdwlhazyhxxcd0F

You can find your credentials in the Domain Information section on Mailgun.

  • EMAIL_HOST: SMTP Hostname
  • EMAIL_HOST_USER: Default SMTP Login
  • EMAIL_HOST_PASSWORD: Default Password

We can test the new settings on the production server. Make the changes in the settings.py file on your local machine and commit them to the remote repository. Then, on the server, pull the new code and restart the Gunicorn process:

git pull

Edit the .env file with the email credentials.

Then restart the Gunicorn process:

sudo supervisorctl restart boards

Now we can try to start the password reset process:

On the Mailgun dashboard you can have some statistics about the email delivery:

Configuring HTTPS Certificate

Now let’s protect our application with a nice HTTPS certificate provided by Let’s Encrypt.

Setting up HTTPS has never been so easy. Better still, we can get it for free nowadays. They provide a solution called certbot which takes care of installing and renewing the certificates for us. It’s very straightforward:

sudo apt-get update
sudo apt-get install software-properties-common
sudo add-apt-repository ppa:certbot/certbot
sudo apt-get update
sudo apt-get install python-certbot-nginx

Now install the certs:

sudo certbot --nginx

Just follow the prompts. When asked about:

Please choose whether or not to redirect HTTP traffic to HTTPS, removing HTTP access.

Choose 2 to redirect all HTTP traffic to HTTPS.

With that the site is already being served over HTTPS:

Set up automatic renewal of the certificates. Run the command below to edit the crontab file:

sudo crontab -e

Add the following line to the end of the file:

0 4 * * * /usr/bin/certbot renew --quiet

This command will run every day at 4 am. All certificates expiring within 30 days will automatically be renewed.
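The 30-day rule can be pictured as a simple date comparison. The sketch below is only an illustration of that rule and is not certbot's code; the function name needs_renewal and the example dates are made up for the purpose.

```python
from datetime import datetime, timedelta


def needs_renewal(expires_at, now, window_days=30):
    """Return True when the certificate expires within window_days of now."""
    return expires_at - now <= timedelta(days=window_days)


if __name__ == "__main__":
    # Pretend the daily 4 am job is running on this date.
    now = datetime(2017, 10, 16, 4, 0)
    print(needs_renewal(datetime(2017, 11, 10), now))  # inside the 30-day window
    print(needs_renewal(datetime(2018, 1, 10), now))   # still far from expiry
```

Running the check every day is cheap because most runs find nothing inside the window and do nothing.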


Thanks a lot to all those who followed this tutorial series, giving comments and feedback! I really appreciate it! This was the last tutorial of the series. I hope you enjoyed it!

Even though this was the last part of the tutorial series, I plan to write a few follow-up tutorials exploring other interesting topics as well, such as database optimization and adding more features on top of what we have at the moment.

By the way, if you are interested in contributing to the project, feel free to submit pull requests! The source code of the project is available on GitHub: https://github.com/sibtc/django-beginners-guide/

And please let me know what else you would like to see next! :-)

Categories: FLOSS Project Planets

Justin Mason: Links for 2017-10-15

Planet Apache - Sun, 2017-10-15 19:58

Iain R. Learmonth: Free Software Efforts (2017W41)

Planet Debian - Sun, 2017-10-15 18:00

Here’s my weekly report for week 41 of 2017. In this week I have explored some Java 8 features, looked at automatic updates in a few Linux distributions and decided that actually I don’t need swap anymore.


The issue that was preventing the migration of the Tasktools Packaging Team’s mailing list from Alioth to Savannah has now been resolved.

Ana’s chkservice package that I sponsored last week has been ACCEPTED into unstable and since MIGRATED to testing.

Tor Project

I have produced a patch for the Tor Project website to update links to the Onionoo documentation now that this has moved (#23802). I’ve updated the Debian and Ubuntu relay configuration instructions to use systemctl instead of service where appropriate (#23048).

When a Tor relay is less than 2 years old, an alert will now appear on Atlas to link to the new relay lifecycle blog post (#23767). This should hopefully help new relay operators understand why their relay is not immediately fully loaded but instead takes some time to ramp up.

I have gone through the tickets for Tor Cloud and did not find any that contain important information that would be useful to someone reviving the project. I have closed out these tickets and the Tor Cloud component no longer has any non-closed tickets (#7763, #8544, #8768, #9064, #9751, #10282, #10637, #11153, #11502, #13391, #14035, #14036, #14073, #15821).

I’ve continued to work on turning the Atlas application into an integrated part of Tor Metrics (#23518) and you can see some progress here.

Finally, I’ve continued hacking on a Twitter bot to tweet factoids about the public Tor network and you can now enjoy some JavaDoc documentation if you’d like to learn a little about its internals. I am still waiting for a git repository to be created (#23799) but will be publishing the sources shortly after that ticket is actioned.


I believe it is important to be clear not only about the work I have already completed but also about the sustainability of this work into the future. I plan to include a short report on the current sustainability of my work in each weekly report.

I have not had any free software related expenses this week. The current funds I have available for equipment, travel and other free software expenses remain £60.52. I do not believe that any hardware I rely on is looking at imminent failure.

I’d like to thank Digital Ocean for providing me with further credit for their platform to support my open source work.

I do not find it likely that I’ll be travelling to Cambridge for the miniDebConf as the train alone would be around £350 and hotel accommodation a further £600 (to include both me and Ana).


Yasoob Khalid: Weird Comparison Issue in Python

Planet Python - Sun, 2017-10-15 17:31

Hi guys! I am back with a new article. This time I will tackle a problem which seems easy enough at first but will surprise some of you. Suppose you have the following piece of code:

a = 3
b = False
c = """12"""
d = 4.7

and you have to evaluate this:

d + 2 * a > int(c) == b

Before reading the rest of the post please take a minute to solve this statement in your head and try to come up with the answer.

So while solving it my thought process went something like this:

2 * a = 6
d + 6 = 10.7
10.7 > int(c) is equal to False
False == b is equal to True

But lo and behold, if we run this code in the Python shell we get the following output:

False

Dang! What went wrong there? Was our thinking wrong? I was pretty sure it was supposed to return True. I went through the official docs a couple of times but couldn’t find the answer. There was also a possibility in my mind that this might be some Python 2 bug, but when I tried this code in Python 3 I got the same output. Finally, I turned to Python’s IRC channel, which is always full of extremely helpful people. I got my answer from there.

So I got to know that I was chaining comparisons. But I knew that already. What I didn’t know was that whenever you chain comparisons, Python compares each thing in order and then does an “AND”. So our comparison code is equivalent to:

(d + 2*a) > (int(c)) and (int(c)) == (b)

This brings us to the question: whenever you chain comparisons, does Python really compare each pair in order and then do an “AND”?

As it turns out, this is exactly what Python does: ‘x <comparison> y <comparison> z’ is executed just like ‘x <comparison> y and y <comparison> z’, except ‘y’ is only evaluated once.
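The difference between the chained reading and the left-to-right reading can be checked directly. The variable names chained, expanded and left_to_right are introduced here only for the comparison; the values are the ones from the start of the article.

```python
a = 3
b = False
c = """12"""
d = 4.7

# What Python actually evaluates: a chained comparison.
chained = d + 2 * a > int(c) == b

# The equivalent expansion with an explicit "and".
expanded = (d + 2 * a) > int(c) and int(c) == b

# The (incorrect) mental model: evaluate ">" first, then compare its result with b.
left_to_right = ((d + 2 * a) > int(c)) == b

print(chained)        # False
print(expanded)       # False
print(left_to_right)  # True
```

Adding parentheses, as in the last expression, is the way to get the left-to-right behaviour when that is what you really mean.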

I hope you found this article helpful. If you have any questions, comments, suggestions please feel free to reach out to me via email or the comments section below.



LibreOffice Conference 2017

Planet KDE - Sun, 2017-10-15 12:00

This week the annual LibreOffice conference was held in Rome and I had the pleasure to attend. The city of Rome is migrating their IT infrastructure to open software and standards and the city council was kind enough to provide the awesome venue for the event, the Campidoglio.

Photo by Simon Phipps

It is always interesting to meet new people from other communities that share the same values we have in KDE. You meet new friends and you get to know another perspective about the things you are doing.

As a bonus point, I also had the pleasure to meet in person with KDE contributors Andreas Kainz, Franklin Weng, Heiko Tzietze and Jos van den Oever. See you all at Akademy next year!

LibreOffice in Plasma 5

Among the speakers, Katarina Behrens from CIB talked about the status of the Qt5 port of the VCL plugin for KDE Plasma. VCL is the toolkit used by LibreOffice to draw the UI of the program, and its plugin-based architecture allows the UI to be adapted to the various native toolkits (such as Qt or GTK).

The KDE plugin is currently stuck with Qt4/kdelibs4 and Katarina has been working on porting it to the new Qt5/KF5 stack. The city of Munich is also sponsoring this work, since they will continue to use LibreOffice for at least some years. The main challenge has been getting rid of the legacy X11 code used for drawing the UI. As a result of this task, the new version of the KDE plugin will get proper Wayland and Hi-DPI support.

If you are wondering if this will bring the native Plasma 5 file picker in LibreOffice, the answer is yes! If any developer wants to help reach this milestone, feel free to contact Katarina who will introduce you to what still needs to be done (a lot).

LibreOffice Online

Lastly, I talked with the Collabora people about the issues that KDE faced with LibreOffice Online in our Nextcloud instance. They assured me that the product has been greatly improved with respect to collaborative editing. By the number of talks and speakers about this topic, it is clear that they have been working hard on it.

Our instance was also using a slightly old version of Collabora Online (2.0.7), so they recommended upgrading to the 2.1.x series (which Ben quickly did). I think that we as a community should give LibreOffice Online another try and report back to the Collabora developers if we still find issues with the tool. As always, that’s the best way to improve FLOSS!

More photos of the event are available in this album.


Kubuntu Artful Aardvark (17.10) initial RC images now available

Planet KDE - Sat, 2017-10-14 22:40

Artful Aardvark (17.10) initial Release Candidate (RC) images are now available for testing. Help us make 17.10 the best release yet!

Note: This is an initial spin of the RC images. It is likely that at least one more rebuild will be done on Monday.

Adam Conrad from the Ubuntu release team list:

Today, I spun up a set of images for everyone with serial 20171015.

Those images are *not* final images (ISO volid and base-files are still
not set to their final values), intentionally, as we had some hiccups
with langpack uploads that are landing just now.

That said, we need as much testing as possible, bugs reported (and, if
you can, fixed), so we can turn around and have slightly more final
images produced on Monday morning. If we get no testing, we get no
fixing, so no time like the present to go bug-hunting.

… Adam

The Kubuntu team will be releasing 17.10 on October 19, 2017.

This is an initial pre-release. Kubuntu RC pre-releases are NOT recommended for:

  • Regular users who are not aware of pre-release issues
  • Anyone who needs a stable system
  • Anyone uncomfortable running a possibly frequently broken system
  • Anyone in a production environment with data or workflows that need to be reliable

Kubuntu pre-releases are recommended for:

  • Regular users who want to help us test by finding, reporting, and/or fixing bugs
  • Kubuntu, KDE, and Qt developers

Getting Kubuntu 17.10 Initial Release Candidate:

To upgrade to Kubuntu 17.10 pre-releases from 17.04, run

sudo do-release-upgrade -d

from a command line.

Download a Bootable image and put it onto a DVD or USB Drive here:

http://iso.qa.ubuntu.com/qatracker/milestones/383/builds (the little CD icon)

See our release notes: https://wiki.ubuntu.com/ArtfulAardvark/Kubuntu

Please report any bugs on Launchpad using the commandline:

ubuntu-bug packagename

Check on IRC channels, Kubuntuforum or the Kubuntu mail lists if you don’t know the package name. Once the bug is reported on Launchpad, please link to it on the qatracker where you got your RC image. Join the community ISO testing party: https://community.ubuntu.com/t/ubuntu-17-10-community-iso-testing/458

KDE bugs (bugs in Plasma or KDE applications) are still filed at https://bugs.kde.org.


Bryan Ruby: Drupal 8.4 Available and Fixes Significant Database Caching Issues

Planet Drupal - Sat, 2017-10-14 21:58
Bryan Ruby - Sat, 10/14/2017 - 20:58

Your hosting account was found to be causing an overload of MySQL resources. What can you do? Upgrade your Drupal 8 website to Drupal 8.4 or higher.

One of my goals in rebranding my website from CMS Report to socPub was to write diverse articles beyond the topic of content management systems. Yet, here we go again with another CMS related article. The Drupal open source project recently made available Drupal 8.4 and for me this version has been a long time coming as it addresses some long standing frustrations I've had with Drupal 8 from the perspective of a site administrator. While Drupal 8.4 adds some nice new features, I'm just as excited about the bug fixes and performance improvements delivered in this new version of Drupal.

When Drupal 8 was introduced it made significant improvements in how it caches and renders pages. That's great news for websites that use Drupal's built-in caching to speed up delivery of pages or page elements. But there was one unwanted side effect of the cache enhancements: excessive growth of cache tables, with tens or hundreds of thousands of entries and gigabytes in size. For my own website it is not uncommon to see my database reach 4 GB in size. Let's put it this way: it was no fun to receive a letter from my hosting provider saying that they weren't too happy with my resource usage. Worse, they threatened to shut down my website if I didn't manage the database size better. Just in the nick of time for you and me, Drupal 8.4 delivers a fix for the cache growth by introducing a new default limit of 5000 rows per cache bin.

I'm still playing with this change and I haven't found a lot of documentation, but you can override the default row limit in Drupal's settings.php via the setting "database_cache_max_rows". For my site, the following settings have helped me keep my MySQL database under half a gigabyte:

$settings['database_cache_max_rows']['default'] = 5000;
$settings['database_cache_max_rows']['bins']['page'] = 500;
$settings['database_cache_max_rows']['bins']['dynamic_page_cache'] = 500;
$settings['database_cache_max_rows']['bins']['render'] = 1000;

For those of you that may not be ready to upgrade to Drupal 8.4 but still need to handle the oversized caching tables today, I had some luck with the Slushi cache module. An additional good summary of similar solutions for Drupal 8 versions prior to 8.4 can be found on Jeff Geerling's blog.

Notable New Features in Drupal 8.4

Of course the purpose of Drupal 8.4 isn't just to address my pet peeve about Drupal caching but also to bring Drupal users a number of new features and improvements. Some of the more significant additions and changes in Drupal that affect me and possibly you include:

Datetime Range

For non-Drupal users I know this is going to sound odd, but despite a number of community approaches there has never really been a standard format for expressing a date or time range, as commonly used in event and planning calendars. Drupal 8.4 addresses this missing field type with the new core Datetime Range module, which supports contributed modules like Calendar and shares a consistent API with other Datetime fields. Future releases may improve Views support, usability, Datetime Range field validation, and REST support.

Content Moderation and Workflow

Although I've been a longtime user of Drupal, for a two year period I managed my website on the Agility CMS. One of the benefits of Agility over Drupal was the workflow and moderation tools delivered "out of the box". The ability to moderate content becomes especially important for websites that have multiple authors and editors collaborating and needing to mark whether content is a draft, ready for review, in need of revision, ready to publish, etc. With Drupal 8.4 the Workflow module is now stable and provides the framework to build additional modules such as the much anticipated Content Moderation module. Currently, the new core Content Moderation module is considered experimental and beta stable, so additional changes should be expected. Content moderation workflows can now apply to any entity types that support revisions, and numerous usability issues and critical bugs are resolved in this release.

Media Handling

Another long standing issue for me has been how Drupal handles, displays, and allows you to reuse images (it doesn't, without outside help). Over the years, a host of solutions have appeared as contributed modules, but I've often found myself frustrated that support for these modules varies and compatible versions are often not made available until weeks or months after a new major version of Drupal has been released. The new core Media module aims to remove this hurdle by providing an API for reusable media entities and references. It is based on the contributed Media Entity module, which has become popular in recent years among Drupal users.

Unfortunately, the core Media module still needs work and is currently marked hidden. In other words, Media will not appear by default on Drupal 8.4's module administration page. The module will be displayed to site builders normally once related user experience issues are resolved in a future release. However, if you elect to use a contributed module under development that depends on the core Media module, it will enable Media automatically for you. Similarly, the REST API and normalizations for Media are not final and support for decoupled applications will be improved in a future release. So while the Media API is available in this version of Drupal, most of us non-developers will need to wait for additional development to see the benefits of this module.

Additional Information on Drupal 8.4

An overview of Drupal 8.4 can be found at Drupal.org, but for a better list of the changes and fixes you'll want to check out the release notes. As always, links to the latest version of Drupal can be found on the project page. I've seen a few strange errors in the logs since updating my site from Drupal 8.3 to 8.4, but nothing significant enough for me to recommend waiting to install Drupal 8.4. For those who are more cautious, the next bugfix release (8.4.1) is scheduled for November 1, 2017.

Article originally published at socPub.


Norbert Preining: TeX Live Manager: JSON output

Planet Debian - Sat, 2017-10-14 21:32

With the development of TLCockpit continuing, I found the need for an easy exchange format between the TeX Live Manager tlmgr and frontend programs like TLCockpit. Thus, I have implemented JSON output for the tlmgr info command.

While the format is not 100% stable – I might change some things – I consider it pretty settled. The output of tlmgr info --data json is a JSON array with JSON objects for each package requested (default is to list all).

[ TLPackageObj, TLPackageObj, ... ]

The structure of the JSON object TLPackageObj reflects the internal Perl hash. Keys guaranteed to be present are name (String) and available (Boolean). If the package is available, the following further keys are present, sorted by type:

  • String type: name, shortdesc, longdesc, category, catalogue, containerchecksum, srccontainerchecksum, doccontainerchecksum
  • Number type: revision, runsize, docsize, srcsize, containersize, srccontainersize, doccontainersize
  • Boolean type: available, installed, relocated
  • Array type: runfiles (Strings), docfiles (Strings), srcfiles (Strings), executes (Strings), depends (Strings), postactions (Strings)
  • Object type:
    • binfiles: keys are architecture names, values are arrays of strings (list of binfiles)
    • binsize: keys are architecture names, values are numbers
    • docfiledata: keys are docfile names, values are objects with optional keys details and lang
    • cataloguedata: optional keys are topics, version, license, ctan, date, values are all strings

A rather long example showing the output for the package latex, formatted with json_pp and having the list of files and the long description shortened:

[
   {
      "installed" : true,
      "doccontainerchecksum" : "5bdfea6b85c431a0af2abc8f8df160b297ad73f6a324ca88df990f01f24611c9ae80d2f6d12c7b3767308fbe3de3fca3d11664b923ea4080fb13fd056a1d0c3d",
      "docfiles" : [
         "texmf-dist/doc/latex/base/README.txt",
         ....
         "texmf-dist/doc/latex/base/webcomp.pdf"
      ],
      "containersize" : 163892,
      "depends" : [
         "luatex",
         "pdftex",
         "latexconfig",
         "latex-fonts"
      ],
      "runsize" : 414,
      "relocated" : false,
      "doccontainersize" : 12812184,
      "srcsize" : 752,
      "revision" : 43813,
      "srcfiles" : [
         "texmf-dist/source/latex/base/alltt.dtx",
         ....
         "texmf-dist/source/latex/base/utf8ienc.dtx"
      ],
      "category" : "Package",
      "cataloguedata" : {
         "version" : "2017/01/01 PL1",
         "topics" : "format",
         "license" : "lppl1.3",
         "date" : "2017-01-25 23:33:57 +0100"
      },
      "srccontainerchecksum" : "1d145b567cf48d6ee71582a1f329fe5cf002d6259269a71d2e4a69e6e6bd65abeb92461d31d7137f3803503534282bc0c5546e5d2d1aa2604e896e607c53b041",
      "postactions" : [],
      "binsize" : {},
      "longdesc" : "LaTeX is a widely-used macro package for TeX, [...]",
      "srccontainersize" : 516036,
      "containerchecksum" : "af0ac85f89b7620eb7699c8bca6348f8913352c473af1056b7a90f28567d3f3e21d60be1f44e056107766b1dce8d87d367e7f8a82f777d565a2d4597feb24558",
      "executes" : [],
      "binfiles" : {},
      "name" : "latex",
      "catalogue" : null,
      "docsize" : 3799,
      "available" : true,
      "runfiles" : [
         "texmf-dist/makeindex/latex/gglo.ist",
         ...
         "texmf-dist/tex/latex/base/x2enc.dfu"
      ],
      "shortdesc" : "A TeX macro package that defines LaTeX"
   }
]
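A frontend could consume this output roughly as follows. This is a minimal sketch in Python, not how TLCockpit does it; the function name summarize_packages is hypothetical, and the second entry in SAMPLE is a made-up package added only to show the case where just the two guaranteed keys are present.

```python
import json

# A cut-down stand-in for the output of "tlmgr info --data json".
SAMPLE = """
[
  {"name": "latex", "available": true, "installed": true, "revision": 43813},
  {"name": "no-such-package", "available": false}
]
"""


def summarize_packages(json_text):
    """Return (name, installed, revision) for every available package."""
    summary = []
    for pkg in json.loads(json_text):
        if not pkg.get("available"):
            # Only name and available are guaranteed, so skip the rest.
            continue
        summary.append((pkg["name"], pkg.get("installed", False), pkg.get("revision")))
    return summary


if __name__ == "__main__":
    for name, installed, revision in summarize_packages(SAMPLE):
        print(name, installed, revision)
```

Using .get for everything beyond name and available keeps the consumer tolerant of the small format changes the author says may still happen.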

What is currently not available via tlmgr info and thus also not via the JSON output is access to virtual TeX Live databases with several member databases (multiple repositories). I am thinking about how to incorporate this information.

These changes are currently available in the tlcritical repository, but will enter proper TeX Live repositories soon.

Using this JSON output I will rewrite the current TLCockpit tlmgr interface to display more complete information.


Arnav Khare: Useful Executable Modules in the Python Standard Library

Planet Python - Sat, 2017-10-14 20:00

Python comes with many handy tools that can make our lives as developers or sysadmins easier. These tools are in the form of modules and libraries that are also executable. Many of these tools are known, but not all are as well known as they should be. I will mention a few useful tools that I have found in this post.

How to write an executable Python script

First, for beginners, a quick introduction to how to write executable scripts in Python. It is actually really easy.

We can simply run a python script from the command line by prepending python to the name, such as python <script>.

To run a module which is present in the current PYTHONPATH, you can also use the command line:

$ python -m <module>

Typing python or python -m in front of your script every time can be tedious, so on Unix you can tell the shell how to execute your script. This is done by putting the path of the executing binary, prefixed with #! (the shebang), on the first line of the script and then simply running the script.


or better

#!/usr/bin/env python

When Python executes a script it runs the code top-down line by line. All the functions and classes at the top level of the script will get compiled and any module level statements will be executed. This process is the same as when Python imports a module from another module.

If you want to write code that only executes when the module is run as a script, you can write it in an if __name__ == '__main__' block as below.

The argparse module in the stdlib can be used to parse command-line parameters and to ensure that the interface is clearly specified.

import argparse

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-o', '--output')
    parser.add_argument('-v', dest='verbose', action='store_true')
    args = parser.parse_args()
    # ... do something with args.output ...
    # ... do something with args.verbose ...
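
A common variation, sketched here on the assumption that you want the parsing logic to be importable and easy to test, moves the code into a function that accepts the argument list. The names build_parser and main are illustrative only and are not part of argparse.

```python
import argparse

def build_parser():
    """Describe the command-line interface in one place."""
    parser = argparse.ArgumentParser(description="Example command-line interface")
    parser.add_argument('-o', '--output', help="where to write the result")
    parser.add_argument('-v', dest='verbose', action='store_true',
                        help="print extra information")
    return parser

def main(argv=None):
    # With argv=None, argparse falls back to sys.argv[1:].
    args = build_parser().parse_args(argv)
    if args.verbose:
        print("writing to %s" % args.output)
    return args

# Passing an explicit list makes the function easy to try out or test.
args = main(['-o', 'result.txt', '-v'])
```

In a real script, the last line would typically sit inside the if __name__ == '__main__' block and call main() without arguments.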

Now let’s look at some interesting and useful runnable modules in the Python Standard Library…

File sharing

A very useful tool is the HTTP server module. This module can be run to share local files over your internal network easily using Python. To start a web server and serve the files in the current directory, simply run the following command…

# Python 3
$ python3 -m http.server

# Python 2
$ python2 -m SimpleHTTPServer

Now the files in the local directory and its subdirectories are visible at http://localhost:8000/. Others on the local network can access the files by replacing localhost with our machine’s IP address.

I have used this in the past to quickly share files with colleagues when a file share was not readily available and the file was too big to send over email.
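
The same server can also be started from Python code. The following is a minimal Python 3 sketch, assuming it is acceptable to serve the current directory on the loopback interface only; the helper name serve_current_directory is made up for this example.

```python
import http.client
import http.server
import threading

def serve_current_directory(host="127.0.0.1", port=0):
    """Serve the current directory in a background thread.

    Port 0 asks the operating system for any free port.
    """
    server = http.server.HTTPServer((host, port),
                                    http.server.SimpleHTTPRequestHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server

server = serve_current_directory()
port = server.server_address[1]

# Fetch the directory listing once to show that the server responds.
connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
connection.request("GET", "/")
status = connection.getresponse().status
connection.close()
print("GET / returned %d" % status)

# Stop the background server again.
server.shutdown()
server.server_close()
```

Binding to 127.0.0.1 keeps the files private to the local machine; sharing with others on the network would require binding to an address they can reach.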

JSON pretty printing

json.tool is a handy way of pretty printing and validating JSON format files from the command line.

$ echo '{"json":"obj"}' | python -m json.tool
{
    "json": "obj"
}
$ echo '{ 1.2:3.4}' | python -m json.tool
Expecting property name enclosed in double quotes: line 1 column 3 (char 2)

I have used json.tool in the past to debug integration with HTTP APIs that serve JSON results.
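
Inside a program, the same validate-and-reindent step can be approximated with the json module directly. This is only a sketch; the function name pretty is an assumption of this example.

```python
import json

def pretty(text):
    """Return text re-indented, or an error message if it is not valid JSON."""
    try:
        return json.dumps(json.loads(text), indent=4)
    except ValueError as error:  # json.JSONDecodeError is a subclass of ValueError
        return "invalid JSON: %s" % error

print(pretty('{"json":"obj"}'))
print(pretty('{ 1.2:3.4}'))
```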


The Python debugger pdb makes it very easy to debug issues with Python scripts. Simply run the target script using the pdb module rather than running it directly. If an unhandled exception is raised, the debugger drops into a debug shell, allowing us to run a post-mortem analysis by inspecting state and variables.

$ python -m pdb script.py

If you want the debugger to stop at a particular point in execution, simply add the below statement above it in the code.

import pdb; pdb.set_trace()

Performance analysis


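The timeit module is an easy way to time a piece of code. The module can run some setup code (such as imports or variable assignments), and then run the test code many times to time its execution. When run from the command line without -n, it picks a suitable number of loops itself (10,000,000 in the examples below).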

$ python -m timeit -s 'text = "sample string"; char = "g"' 'char in text'
10000000 loops, best of 3: 0.0408 usec per loop
$ python -m timeit -s 'text = "sample string"; char = "g"' 'text.find(char)'
10000000 loops, best of 3: 0.195 usec per loop

Some useful command-line options are:

  • -n - how many times to execute the statement per timing run (chosen automatically if omitted)
  • -r - how many times to repeat the timer (default 3 at the time of writing; 5 since Python 3.7)
  • -s - setup statement to be executed once before the test
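
The options above map onto the module's Python API. As a hedged sketch, the two command-line measurements could be reproduced in code roughly like this; the loop count of 100,000 is an arbitrary choice to keep the run short.

```python
import timeit

setup = 'text = "sample string"; char = "g"'
loops = 100000

# Each entry is the total time in seconds for `loops` executions.
in_times = timeit.repeat('char in text', setup=setup, number=loops, repeat=3)
find_times = timeit.repeat('text.find(char)', setup=setup, number=loops, repeat=3)

# The minimum is usually the most representative figure, since the
# slower runs tend to reflect interference from other processes.
print("in:   %.4f usec per loop" % (min(in_times) / loops * 1e6))
print("find: %.4f usec per loop" % (min(find_times) / loops * 1e6))
```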

Profiling

The cProfile module makes it easy to measure the time spent executing a script and to pinpoint the slow bits.

$ python -m cProfile scriptfile [arg] ...

A couple of useful flags are:

  • -o - output file path
  • -s - sort output
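
When only one part of a program is of interest, the profiler can also be driven from code. The sketch below assumes a made-up function slow_sum as the thing being measured and sorts the report by cumulative time, which is roughly what -s cumulative does on the command line.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """A deliberately naive function to give the profiler something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
slow_sum(200000)
profiler.disable()

# Collect the report in a string instead of printing straight to stdout.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(5)  # show at most 5 entries
print(report.getvalue())
```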

Running tests - Doctests and Unit tests

Python doctests can be run from the command-line using the doctest executable module.

$ python -m doctest -v example.py

Similarly, unit tests can be executed using the unittest module:

$ python3 -m unittest

This very useful command-line tool will scan the current directory and its sub-packages to discover tests and run them. We can also run specific modules or functions by specifying them. See the unittest documentation for the various options.
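
To make the two kinds of tests concrete, here is a small sketch with one function that carries both a doctest and a unit test, run programmatically rather than via the command line. The function add and the class AddTests are invented for this example.

```python
import doctest
import io
import unittest

def add(a, b):
    """Return the sum of a and b.

    >>> add(2, 3)
    5
    >>> add(-1, 1)
    0
    """
    return a + b

class AddTests(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(2, 3), 5)

# Run the docstring examples of add() without going through the command line.
examples = doctest.DocTestParser().get_doctest(
    add.__doc__, {"add": add}, "add", None, 0)
doctest_result = doctest.DocTestRunner(verbose=False).run(examples)
print(doctest_result)

# Run the unit tests, sending the report to a string buffer.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddTests)
unittest_result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(unittest_result.wasSuccessful())
```

Saved as a file, the same code could presumably be checked with the python -m doctest and python -m unittest commands shown above.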

Working with archives

Creating and opening Zip and TAR archive files

In case you don’t have tar or Zip tools handy, in Python 3 tarfile and zipfile modules allow us to bundle directories into archives and open existing ones.

# Create a new TAR archive
$ python3 -m tarfile -c <tarname>.tgz <file> <file>

# Extract from an existing TAR archive
$ python3 -m tarfile -e <tarname>.tgz
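
The command-line interface is a thin layer over the tarfile API. A minimal sketch of the same create and extract steps in code, using a temporary directory and invented file names:

```python
import os
import tarfile
import tempfile

workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "notes.txt")
with open(source, "w") as handle:
    handle.write("hello archive\n")

# Create a gzip-compressed archive containing the file.
archive = os.path.join(workdir, "example.tgz")
with tarfile.open(archive, "w:gz") as tar:
    tar.add(source, arcname="notes.txt")

# List the members, then extract them into a separate directory.
target = os.path.join(workdir, "extracted")
os.makedirs(target)
with tarfile.open(archive, "r:gz") as tar:
    members = tar.getnames()
    tar.extractall(target)

print(members)
print(os.listdir(target))
```

As with any archive tool, extracting archives from untrusted sources deserves care, since member paths are chosen by whoever created the archive.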

Making executable Zip files

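In Python 3 (3.5 and later), the zipapp module also allows us to pack up a directory into an archive, and makes it executable. When run, the archive executes the __main__.py file at the top level of the myapp directory; zipapp's -m option can generate a __main__.py that calls a given function instead.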
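
A sketch of this packaging step done from Python, assuming a throwaway myapp directory whose only content is a __main__.py; the directory and message are invented for this example.

```python
import os
import subprocess
import sys
import tempfile
import zipapp

workdir = tempfile.mkdtemp()
app_dir = os.path.join(workdir, "myapp")
os.mkdir(app_dir)

# The archive's entry point is the __main__.py at the top of the directory.
with open(os.path.join(app_dir, "__main__.py"), "w") as handle:
    handle.write('print("hello from myapp")\n')

archive = os.path.join(workdir, "myapp.pyz")
zipapp.create_archive(app_dir, archive)

# Run the archive with the same interpreter that is running this script.
output = subprocess.check_output([sys.executable, archive],
                                 universal_newlines=True)
print(output)
```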

$ python3 -m zipapp myapp
$ python3 myapp.pyz
<output from myapp>

Other useful and interesting modules

Opening a web page in a Browser locally

The webbrowser module has a programmatic as well as a command-line interface.

$ python -m webbrowser -t http://www.yahoo.com

Base64 encoding and decoding

When working with REST APIs, especially where authentication tokens are involved, use of Base64 encoding is quite common. Python's base64 module can also be used as a command-line tool in the shell.
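
For reference, the programmatic side looks like the sketch below. The credentials are the well-known sample pair from the HTTP Basic authentication specification, used here purely as a stand-in for a real token.

```python
import base64

# Sample "user:password" pair, as it would be encoded for an
# HTTP Basic "Authorization" header.
credentials = b"Aladdin:open sesame"

encoded = base64.b64encode(credentials)   # bytes in, bytes out
decoded = base64.b64decode(encoded)

print(encoded.decode("ascii"))
print(decoded == credentials)
```

Note that Base64 is an encoding, not encryption: anyone who sees the encoded value can decode it just as easily.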

$ python -m base64 -e

Calendar

Did you know that Python comes with a built-in text calendar? The calendar module is executable and can take many parameters that allow for customised display.
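
The customisation is also available from code. As a small sketch, this prints a single month with weeks starting on Sunday; the choice of October 2017 is arbitrary.

```python
import calendar

# firstweekday=6 starts each week on Sunday (0 would be Monday, the default).
sunday_first = calendar.TextCalendar(firstweekday=6)
october = sunday_first.formatmonth(2017, 10)
print(october)
```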

$ python -m calendar 2017

      January                   February                   March
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
                   1             1  2  3  4  5             1  2  3  4  5
 2  3  4  5  6  7  8       6  7  8  9 10 11 12       6  7  8  9 10 11 12
 9 10 11 12 13 14 15      13 14 15 16 17 18 19      13 14 15 16 17 18 19
16 17 18 19 20 21 22      20 21 22 23 24 25 26      20 21 22 23 24 25 26
23 24 25 26 27 28 29      27 28                     27 28 29 30 31
30 31

       April                      May                       June
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
                1  2       1  2  3  4  5  6  7                1  2  3  4
 3  4  5  6  7  8  9       8  9 10 11 12 13 14       5  6  7  8  9 10 11
10 11 12 13 14 15 16      15 16 17 18 19 20 21      12 13 14 15 16 17 18
17 18 19 20 21 22 23      22 23 24 25 26 27 28      19 20 21 22 23 24 25
24 25 26 27 28 29 30      29 30 31                  26 27 28 29 30

        July                     August                  September
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
                1  2          1  2  3  4  5  6                   1  2  3
 3  4  5  6  7  8  9       7  8  9 10 11 12 13       4  5  6  7  8  9 10
10 11 12 13 14 15 16      14 15 16 17 18 19 20      11 12 13 14 15 16 17
17 18 19 20 21 22 23      21 22 23 24 25 26 27      18 19 20 21 22 23 24
24 25 26 27 28 29 30      28 29 30 31               25 26 27 28 29 30
31

      October                   November                  December
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
                   1             1  2  3  4  5                   1  2  3
 2  3  4  5  6  7  8       6  7  8  9 10 11 12       4  5  6  7  8  9 10
 9 10 11 12 13 14 15      13 14 15 16 17 18 19      11 12 13 14 15 16 17
16 17 18 19 20 21 22      20 21 22 23 24 25 26      18 19 20 21 22 23 24
23 24 25 26 27 28 29      27 28 29 30               25 26 27 28 29 30 31
30 31

Print system configuration

The sysconfig module allows you to print the detailed system configuration, including the build-time configuration variables, which might be useful for debugging purposes.

$ python -m sysconfig
Platform: "macosx-10.6-intel"
Python version: "3.6"
Current installation scheme: "posix_prefix"

Paths:
        data = "/Library/Frameworks/Python.framework/Versions/3.6"
        include = "/Library/Frameworks/Python.framework/Versions/3.6/include/python3.6m"
        platinclude = "/Library/Frameworks/Python.framework/Versions/3.6/include/python3.6m"
        platlib = "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages"
        platstdlib = "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6"
        purelib = "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages"
        scripts = "/Library/Frameworks/Python.framework/Versions/3.6/bin"
        stdlib = "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6"

Variables:
        ABIFLAGS = "m"
        AC_APPLE_UNIVERSAL_BUILD = "1"
        AIX_GENUINE_CPLUSPLUS = "0"
        ANDROID_API_LEVEL = "0"
        ...
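
Individual values from this report can be queried in code as well. The following sketch picks a few of them; EXT_SUFFIX is just one example of a variable name and may be undefined on some platforms.

```python
import sys
import sysconfig

print("Platform:       %s" % sysconfig.get_platform())
print("Python version: %s" % sysconfig.get_python_version())

# get_paths() returns a dict of installation directories for the current scheme.
paths = sysconfig.get_paths()
print("stdlib:         %s" % paths["stdlib"])
print("purelib:        %s" % paths["purelib"])

# Individual build-time variables can be looked up by name;
# a variable that is not defined on this platform comes back as None.
print("EXT_SUFFIX:     %s" % sysconfig.get_config_var("EXT_SUFFIX"))
```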

Sandipan Dey: Seam Carving: Using Dynamic Programming to implement Content-Aware Image Resizing in Python

Planet Python - Sat, 2017-10-14 19:18
The following problem appeared as an assignment in the Algorithm Course (COS 226) at Princeton University taught by Prof. Sedgewick. The following description of the problem is taken from the assignment itself.

The Seam Carving Problem

Seam-carving is a content-aware image resizing technique where the image is reduced in size by one pixel of height (or width) at a time. A vertical seam in an image …

Lior Kaplan: Debian Installer git repository

Planet Debian - Sat, 2017-10-14 18:15

While dealing with d-i’s translation last month in FOSScamp, I was kinda surprised it’s still on SVN. While reviewing PO files from others, I couldn’t select specific parts to commit.

Debian does have a git server, and many DDs (Debian Developers) use it for their Debian work, but it’s not as public as I wish it to be. Meaning I lack the pull / merge request abilities as well as the review process.

Recently I got a reminder that the D-I’s Hebrew translation needs some love. I asked my local community for help. Receiving a PO file by mail, reminded me of the SVN annoyance. So this time I decided to convert it to git and ask people to send me pull requests. Another benefit would be making the process more transparent as others could see these PRs (and hopefully comment if needed).

For this experiment, I opened a repository on GitHub at https://github.com/kaplanlior/debian-installer. I know GitHub isn’t open source like GitLab, but it’s a popular choice, which makes it a good start for my experiment. If and when it succeeds, we can discuss the platform.

Debian 9

(featured image by Jonathan Carter)



Petter Reinholdtsen: A one-way wall on the border?

Planet Debian - Sat, 2017-10-14 16:10

I find it fascinating how many of the people being locked inside the proposed border wall between the USA and Mexico support the idea. The proposal to keep Mexicans out reminds me of the propaganda twist from the East German government calling the wall the “Antifascist Bulwark” after erecting the Berlin Wall, claiming that the wall was erected to keep enemies from creeping into East Germany, while it was obvious to the people locked inside it that it was erected to keep the people from escaping.

Do the people in the USA supporting this wall really believe it is a one-way wall, only keeping people on the outside from getting in, while not keeping people on the inside from getting out?


Stack Abuse: Python: List Files in a Directory

Planet Python - Sat, 2017-10-14 09:30

I prefer to work with Python because it is a very flexible programming language, and allows me to interact with the operating system easily. This also includes file system functions. To simply list files in a directory the modules os, subprocess, fnmatch, and pathlib come into play. The following solutions demonstrate how to use these methods effectively.

Using os.walk()

The os module contains a long list of methods that deal with the filesystem, and the operating system. One of them is walk(), which generates the filenames in a directory tree by walking the tree either top-down or bottom-up (with top-down being the default setting).

os.walk() yields a tuple of three items for each directory it visits: the path of that directory, a list of the names of its subdirectories, and a list of the names of the files in it. Listing 1 shows how to write this with only a few lines of code. This works with both Python 2 and 3 interpreters.
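
To make the shape of those tuples concrete, the following sketch builds a tiny throwaway tree and collects the path of every file relative to the top. The directory layout and file names are invented for this example.

```python
import os
import tempfile

# Build a small throwaway tree so the sketch has something to walk.
top = tempfile.mkdtemp()
os.mkdir(os.path.join(top, "sub"))
for relative in ("a.txt", os.path.join("sub", "b.txt")):
    with open(os.path.join(top, relative), "w") as handle:
        handle.write("x")

found = []
for root, dirs, files in os.walk(top):
    dirs.sort()  # in top-down mode, editing dirs in place controls the traversal
    for filename in sorted(files):
        # filename is relative to root, so join the two for a usable path
        found.append(os.path.relpath(os.path.join(root, filename), top))

print(found)
```

Joining root and filename matters as soon as the walk descends into subdirectories, because the bare file names alone do not say where a file lives.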

Listing 1: Traversing the current directory using os.walk()

import os

for root, dirs, files in os.walk("."):
    for filename in files:
        print(filename)

Using the Command Line via Subprocess

As already described in the article Parallel Processing in Python, the subprocess module allows you to execute a system command, and collect its result. The system command we call in this case is the following one:

Example 1: Listing the files in the current directory

$ ls -p . | grep -v /$

The command ls -p . lists directory files for the current directory, and adds the delimiter / at the end of the name of each subdirectory, which we'll need in the next step. The output of this call is piped to the grep command that filters the data as we need it.

The parameters -v /$ exclude all the names of entries that end with the delimiter /. Actually, /$ is a Regular Expression that matches all the strings that contain the character / as the very last character before the end of the string, which is represented by $.

The subprocess module allows you to build real pipes, and to connect the input and output streams as you do on a command line. Calling subprocess.Popen() starts the corresponding process, and its stdin and stdout parameters define where the process reads its input and writes its output.

Listing 2 shows how to program that. The first variable ls is defined as a process executing ls -p . that outputs to a pipe. That's why the stdout channel is defined as subprocess.PIPE. The second variable grep is defined as a process, too, but executes the command grep -v /$, instead.

To read the output of the ls command from the pipe, the stdin channel of grep is defined as ls.stdout. Finally, the variable endOfPipe reads the output of grep from grep.stdout that is printed to stdout element-wise in the for-loop below. The output is seen in Example 2.

Listing 2: Defining two processes connected with a pipe

import subprocess

# define the ls command
ls = subprocess.Popen(["ls", "-p", "."],
                      stdout=subprocess.PIPE,
                     )

# define the grep command
grep = subprocess.Popen(["grep", "-v", "/$"],
                        stdin=ls.stdout,
                        stdout=subprocess.PIPE,
                        universal_newlines=True,  # yield text rather than bytes
                       )

# let ls receive SIGPIPE if grep exits first
ls.stdout.close()

# read from the end of the pipe (stdout)
endOfPipe = grep.stdout

# output the files line by line, without the trailing newline
for line in endOfPipe:
    print(line.rstrip("\n"))

Example 2: Running the program

$ python find-files3.py
find-files2.py
find-files3.py
find-files4.py
...

This solution works quite well with both Python 2 and 3, but can we improve it somehow? Let us have a look at the other variants, then.
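
One possible simplification, sketched here for Python 3.5 or later and assuming a POSIX ls is available, is to run only ls as a subprocess and do the filtering in Python instead of in grep. The function name list_files is hypothetical.

```python
import os
import subprocess
import tempfile

def list_files(path="."):
    """List non-directory entries using `ls -p` (assumes a POSIX `ls`)."""
    result = subprocess.run(["ls", "-p", path],
                            stdout=subprocess.PIPE,
                            universal_newlines=True,
                            check=True)
    # ls -p marks directories with a trailing slash; drop those lines.
    return [name for name in result.stdout.splitlines()
            if not name.endswith("/")]

# Try it on a throwaway directory with one file and one subdirectory.
demo = tempfile.mkdtemp()
os.mkdir(os.path.join(demo, "subdir"))
open(os.path.join(demo, "file.txt"), "w").close()
print(list_files(demo))
```

This keeps the dependency on an external command but avoids wiring two processes together by hand.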

Combining os and fnmatch

As you have seen before the solution using subprocesses is elegant but requires lots of code. Instead, let us combine the methods from the two modules os, and fnmatch. This variant works with Python 2 and 3, too.

As the first step, we import the two modules os and fnmatch. Next, we use os.listdir() to read the entries of the directory we would like to list, and define the pattern used to filter the files. In a for loop we iterate over the list of entries stored in the variable listOfFiles.

Finally, with the help of fnmatch we filter for the entries we are looking for, and print the matching entries to stdout. Listing 3 contains the Python script, and Example 3 the corresponding output.

Listing 3: Listing files using os and fnmatch module

import os, fnmatch

listOfFiles = os.listdir('.')
pattern = "*.py"
for entry in listOfFiles:
    if fnmatch.fnmatch(entry, pattern):
        print (entry)

Example 3: The output of Listing 3

$ python2 find-files.py
find-files.py
find-files2.py
find-files3.py
...

Using os.listdir() and Generators

In simple terms, a generator is a powerful iterator that keeps its state. To learn more about generators, check out one of our previous articles, Python Generators.

The following variant combines the listdir() method of the os module with a generator function. The code works with both versions 2 and 3 of Python.

As you may have noted before, the listdir() method returns the list of entries for the given directory. The method os.path.isfile() returns True if the given entry is a file. The yield statement suspends the function while keeping its current state, and hands back only the name of an entry detected as a file. This allows us to loop over the generator function (see Listing 4). The output is identical to the one from Example 3.

Listing 4: Combining os.listdir() and a generator function

import os

def files(path):
    for file in os.listdir(path):
        if os.path.isfile(os.path.join(path, file)):
            yield file

for file in files("."):
    print (file)

Use pathlib

The pathlib module describes itself as a way to "Parse, build, test, and otherwise work on filenames and paths using an object-oriented API instead of low-level string operations". This sounds cool - let's do it. The module has been part of the standard library since Python 3.4.

In Listing 5, we first define the directory. The dot (".") defines the current directory. Next, the iterdir() method returns an iterator that yields a path object for every entry in the directory, files and subdirectories alike. In a for loop we print the entries one after the other.

Listing 5: Reading directory contents with pathlib

import pathlib

# define the path
currentDirectory = pathlib.Path('.')

for currentFile in currentDirectory.iterdir():
    print(currentFile)

Again, the output resembles the one from Example 3, except that iterdir() lists every entry of the directory rather than only the Python files.
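
If only regular files are wanted, each entry can be tested with is_file(). The sketch below demonstrates this on a throwaway directory; the helper name plain_files and the file names are assumptions of this example.

```python
import pathlib
import tempfile

def plain_files(directory):
    """Return the sorted names of regular files directly inside directory."""
    return sorted(entry.name
                  for entry in pathlib.Path(directory).iterdir()
                  if entry.is_file())

demo = pathlib.Path(tempfile.mkdtemp())
(demo / "subdir").mkdir()
(demo / "script.py").write_text("print('hi')\n")
(demo / "notes.txt").write_text("notes\n")

print(plain_files(demo))
```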

As an alternative, we can select files by matching their filenames against a pattern called a glob. This way we retrieve only the files we want. For example, in the code below we only want to list the Python files in our directory, which we do by specifying "*.py" in the glob.

Listing 6: Using pathlib with the glob method

import pathlib

# define the path
currentDirectory = pathlib.Path('.')

# define the pattern
currentPattern = "*.py"

for currentFile in currentDirectory.glob(currentPattern):
    print(currentFile)

Using os.scandir()

The os module also offers scandir(), which was added in Python 3.5; since Python 3.6 it can be used as a context manager, as in Listing 7. It significantly simplifies the call to list files in a directory.

Having imported the os module first, use the getcwd() method to detect the current working directory, and save this value in the path variable. Next, scandir() returns an iterator of directory entries for this path, each of which we test for being a file using its is_file() method.

Listing 7: Reading directory contents with scandir()

import os

# detect the current working directory
path = os.getcwd()

# read the entries
with os.scandir(path) as listOfEntries:
    for entry in listOfEntries:
        # print all entries that are files
        if entry.is_file():
            print(entry.name)

Again, the output of Listing 7 is identical to the one from Example 3.
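
The entries produced by scandir() carry more than a name. As a sketch, the following collects the size of each regular file via the entry's stat() method, using a throwaway directory; the helper name file_sizes is invented here.

```python
import os
import tempfile

def file_sizes(path):
    """Map each regular file directly inside path to its size in bytes."""
    sizes = {}
    with os.scandir(path) as entries:   # context manager needs Python 3.6+
        for entry in entries:
            if entry.is_file():
                # DirEntry caches information from the directory scan,
                # which can save system calls compared to os.path functions.
                sizes[entry.name] = entry.stat().st_size
    return sizes

demo = tempfile.mkdtemp()
os.mkdir(os.path.join(demo, "subdir"))
with open(os.path.join(demo, "data.bin"), "wb") as handle:
    handle.write(b"12345")

print(file_sizes(demo))
```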


There is disagreement about which version is the best, which is the most elegant, and which is the most "pythonic" one. I like the simplicity of the os.walk() method as well as the usage of both the fnmatch and pathlib modules.

The two versions with the processes/piping and the iterator require a deeper understanding of UNIX processes and Python knowledge, so they may not be best for all programmers due to their added (and unnecessary) complexity.

To find an answer to which version is the quickest one, the timeit module is quite handy. This module counts the time that has elapsed between two events.

To compare all of our solutions without modifying them, we use a Python functionality: call the Python interpreter with the name of the module, and the appropriate Python code to be executed. To do that for all the Python scripts at once a shell script helps (Listing 8).

Listing 8: Evaluating the execution time using the timeit module

#! /bin/bash

for filename in *.py; do
    echo "$filename:"
    cat $filename | python3 -m timeit
    echo " "
done
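
When reproducing such a comparison, keep in mind that python3 -m timeit takes the statement to measure from its command-line arguments and times a bare pass when none is given; it does not read a script from standard input. A hedged alternative is to time the variants in-process, as in the sketch below. The function names, the throwaway directory, and the loop counts are all assumptions of this example, and the numbers it prints are not comparable to the table in this article.

```python
import os
import tempfile
import timeit

def with_walk(path):
    # Take only the first tuple, i.e. the files directly inside path.
    return next(os.walk(path))[2]

def with_listdir(path):
    return [name for name in os.listdir(path)
            if os.path.isfile(os.path.join(path, name))]

def with_scandir(path):
    with os.scandir(path) as entries:
        return [entry.name for entry in entries if entry.is_file()]

# A small directory with a few files keeps the measurement quick.
demo = tempfile.mkdtemp()
for index in range(5):
    open(os.path.join(demo, "file%d.py" % index), "w").close()

calls = 20
timings = {}
for function in (with_walk, with_listdir, with_scandir):
    # timeit accepts a callable; take the best of 3 runs of `calls` calls each.
    best = min(timeit.repeat(lambda: function(demo), number=calls, repeat=3))
    timings[function.__name__] = best / calls
    print("%-14s %.1f usec per call" % (function.__name__, best / calls * 1e6))
```

Measured this way, the differences between the variants depend heavily on the size of the directory and on the file system, so any ranking should be checked on realistic data.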

The tests were run using Python 3.5.3. The results are as follows, with os.walk() giving the best result. Running the tests with Python 2 returns different values but does not change the order - os.walk() is still on top of the list.

Method                 Result for 100,000,000 loops
os.walk                0.0085 usec per loop
subprocess/pipe        0.00859 usec per loop
os.listdir/fnmatch     0.00912 usec per loop
os.listdir/generator   0.00867 usec per loop
pathlib                0.00854 usec per loop
pathlib/glob           0.00858 usec per loop
os.scandir             0.00856 usec per loop

Acknowledgements

The author would like to thank Gerold Rupprecht for his support, and comments while preparing this article.


mark.ie: Adding Tokens for Metatag Image Fields when using Drupal Media Entity

Planet Drupal - Sat, 2017-10-14 09:19

Metatag cannot directly extract an image url from a media field referenced by another entity.

markconroy Sat, 10/14/2017 - 14:19

I upgraded my site from Drupal 7 to Drupal 8 this week (yes, that's why it's running on Bartik - a PatternLab developed theme will be installed in time).

This morning I enabled the Metatag module and set some defaults for page title, description, image, etc. The help notes on the image metatag field say "An image associated with this page, for use as a thumbnail in social networks and other services. This will be able to extract the URL from an image field." This is true, except in my case, all the image fields on the site use the Media Entity module, so they are entity reference fields rather than image fields.

When I put in a token of [node:field_main_image], the result in the outputted metatags was:

In that case, "Mark Conroy | DrupalCon Dublin 2017" is the name of the referenced media. I needed to output the image field of the referenced media.

After a little trial and error, I came up with this:


which outputs:

In this case, "field_main_image" is the name of the image field on my content type, and "field_m_image_image" is the name of the image field on my image media bundle.

I hope that helps!
