FLOSS Project Planets

Julien Danjou: Python and fast HTTP clients

Planet Python - Mon, 2019-10-07 05:30

Nowadays, it is more than likely that you will have to write an HTTP client for your application that will have to talk to another HTTP server. The ubiquity of REST APIs makes HTTP a first-class citizen, which is why knowing optimization patterns is a prerequisite.

There are many HTTP clients in Python; the most widely used and easiest to work with is requests. It is the de facto standard nowadays.

Persistent Connections

The first optimization to take into account is the use of a persistent connection to the Web server. Persistent connections have been standard since HTTP 1.1, though many applications do not leverage them. This lack of optimization is simple to explain if you know that when using requests in its simple mode (e.g. with the get function) the connection is closed on return. To avoid that, an application needs to use a Session object, which allows reusing an already opened connection.

import requests

session = requests.Session()
session.get("http://example.com")
# Connection is re-used
session.get("http://example.com")

Using Session with requests

Each connection is stored in a pool of connections (10 by default), the size of
which is also configurable:

import requests

session = requests.Session()
adapter = requests.adapters.HTTPAdapter(
    pool_connections=100,
    pool_maxsize=100)
session.mount('http://', adapter)
response = session.get("http://example.org")

Changing pool size

Reusing the TCP connection to send out several HTTP requests offers a number of performance advantages:

  • Lower CPU and memory usage (fewer connections opened simultaneously).
  • Reduced latency in subsequent requests (no TCP handshaking).
  • Exceptions can be raised without the penalty of closing the TCP connection.

The HTTP protocol also provides pipelining, which allows sending several requests on the same connection without waiting for the replies to come (think batch). Unfortunately, this is not supported by the requests library. However, pipelining requests may not be as fast as sending them in parallel. Indeed, the HTTP 1.1 protocol forces the replies to be sent in the same order as the requests were sent – first-in first-out.

Parallelism

requests also has one major drawback: it is synchronous. Calling requests.get("http://example.org") blocks the program until the HTTP server replies completely. Having the application wait and do nothing is a drawback here: the program could be doing something else rather than sitting idle.

A smart application can mitigate this problem by using a pool of threads, like the ones provided by concurrent.futures, which makes it possible to parallelize the HTTP requests very easily.

from concurrent import futures

import requests

with futures.ThreadPoolExecutor(max_workers=4) as executor:
    futures = [
        executor.submit(
            lambda: requests.get("http://example.org"))
        for _ in range(8)
    ]
    results = [
        f.result().status_code
        for f in futures
    ]

print("Results: %s" % results)

Using futures with requests

Since this pattern is quite useful, it has been packaged into a library named requests-futures, which makes the use of Session objects transparent to the developer:

from requests_futures import sessions

session = sessions.FuturesSession()

futures = [
    session.get("http://example.org")
    for _ in range(8)
]

results = [
    f.result().status_code
    for f in futures
]

print("Results: %s" % results)

Using futures with requests

By default, an executor with two worker threads is created, but a program can easily customize this value by passing the max_workers argument, or even its own executor, to the FuturesSession object – for example: FuturesSession(executor=ThreadPoolExecutor(max_workers=10)).
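
For instance, a minimal sketch of passing a custom executor (reusing the example.org URL from the earlier snippets) could look like this:

from concurrent.futures import ThreadPoolExecutor
from requests_futures import sessions

# A larger thread pool than the default, passed in explicitly
# (requests-futures also accepts max_workers directly).
session = sessions.FuturesSession(
    executor=ThreadPoolExecutor(max_workers=10))
futures = [session.get("http://example.org") for _ in range(8)]
print("Results: %s" % [f.result().status_code for f in futures])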

Asynchronicity

As explained earlier, requests is entirely synchronous. That blocks the application while waiting for the server to reply, slowing down the program. Making HTTP requests in threads is one solution, but threads do have their own overhead and this implies parallelism, which is not something everyone is always glad to see in a program.

Starting with version 3.5, Python offers asynchronicity at its core through asyncio. The aiohttp library provides an asynchronous HTTP client built on top of asyncio. This library allows sending requests in series without waiting for the first reply to come back before sending the next one. In contrast to HTTP pipelining, aiohttp sends the requests over multiple connections in parallel, avoiding the ordering issue explained earlier.

import aiohttp
import asyncio

async def get(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return response

loop = asyncio.get_event_loop()
coroutines = [get("http://example.com") for _ in range(8)]
results = loop.run_until_complete(asyncio.gather(*coroutines))

print("Results: %s" % results)

Using aiohttp

All those solutions (using Session, threads, futures or asyncio) offer different approaches to making HTTP clients faster.

Performance

The snippet below is an HTTP client sending requests to httpbin.org, an HTTP API that provides (among other things) an endpoint simulating a long request (a second here). This example implements all the techniques listed above and times them.

import contextlib
import time

import aiohttp
import asyncio
import requests
from requests_futures import sessions

URL = "http://httpbin.org/delay/1"
TRIES = 10


@contextlib.contextmanager
def report_time(test):
    t0 = time.time()
    yield
    print("Time needed for `%s' called: %.2fs"
          % (test, time.time() - t0))


with report_time("serialized"):
    for i in range(TRIES):
        requests.get(URL)


session = requests.Session()
with report_time("Session"):
    for i in range(TRIES):
        session.get(URL)


session = sessions.FuturesSession(max_workers=2)
with report_time("FuturesSession w/ 2 workers"):
    futures = [session.get(URL)
               for i in range(TRIES)]
    for f in futures:
        f.result()


session = sessions.FuturesSession(max_workers=TRIES)
with report_time("FuturesSession w/ max workers"):
    futures = [session.get(URL)
               for i in range(TRIES)]
    for f in futures:
        f.result()


async def get(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            await response.read()

loop = asyncio.get_event_loop()
with report_time("aiohttp"):
    loop.run_until_complete(
        asyncio.gather(*[get(URL)
                         for i in range(TRIES)]))

Program to compare the performances of different requests usage

Running this program gives the following output:

Time needed for `serialized' called: 12.12s
Time needed for `Session' called: 11.22s
Time needed for `FuturesSession w/ 2 workers' called: 5.65s
Time needed for `FuturesSession w/ max workers' called: 1.25s
Time needed for `aiohttp' called: 1.19s

Without any surprise, the slowest result comes from the dumb, serialized version, since all the requests are made one after another without reusing the connection — 12 seconds to make 10 requests.

Using a Session object, and therefore reusing the connection, saves about 8% in terms of time, which is already a big and easy win. At a minimum, you should always use a Session.

If your system and program allow the use of threads, it is a good call to use them to parallelize the requests. However, threads have some overhead and they are not weightless: they need to be created, started and then joined.

Unless you are still using an old version of Python, using aiohttp is without a doubt the way to go nowadays if you want to write a fast and asynchronous HTTP client. It is the fastest and the most scalable solution, as it can handle hundreds of parallel requests. The alternative, managing hundreds of threads in parallel, is not a great option.

Streaming

Another speed optimization that can be efficient is streaming the responses. When making a request, by default the body of the response is downloaded immediately. The stream parameter provided by the requests library and the content attribute of aiohttp responses both provide a way to avoid loading the full content in memory as soon as the request is executed.

import requests

# Use `with` to make sure the response stream is closed and the connection can
# be returned back to the pool.
with requests.get('http://example.org', stream=True) as r:
    print(list(r.iter_content()))

Streaming with requests

import aiohttp
import asyncio

async def get(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.content.read()

loop = asyncio.get_event_loop()
tasks = [asyncio.ensure_future(get("http://example.com"))]
loop.run_until_complete(asyncio.wait(tasks))

print("Results: %s" % [task.result() for task in tasks])

Streaming with aiohttp

Not loading the full content is extremely important in order to avoid allocating potentially hundreds of megabytes of memory for nothing. If your program does not need to access the entire content as a whole but can work on chunks, it is probably better to use those methods. For example, if you're going to save the content to a file, reading only a chunk and writing it at the same time is going to be much more memory efficient than reading the whole HTTP body, allocating a giant pile of memory, and then writing it to disk.
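
To make that concrete, here is a small sketch (the URL and file name are made up) of saving a response to disk chunk by chunk with requests:

import requests

# Stream a (hypothetical) large file to disk chunk by chunk instead of
# loading the whole body into memory first.
with requests.get("http://example.org/big-file", stream=True) as r:
    with open("big-file", "wb") as f:
        for chunk in r.iter_content(chunk_size=8192):
            f.write(chunk)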

I hope this makes it easier for you to write proper HTTP clients and requests. If you know any other useful technique or method, feel free to write it down in the comments section below!

Categories: FLOSS Project Planets

Agiledrop.com Blog: Our blog posts from September 2019

Planet Drupal - Mon, 2019-10-07 05:00

Missed some of our blog posts last month? Don't worry - here’s a recap of all our posts from September. Check it out and make sure you’re all caught up!

READ MORE
Categories: FLOSS Project Planets

WatchData PROXKey digital signature using emSigner in Fedora 30

Planet KDE - Mon, 2019-10-07 04:19

TL;DR — go to Howto section to make WatchData PROXKey work with emSigner in GNU/Linux system.

Introduction

Hardware tokens with digital signature are used for filing various financial documents in Govt of India portals. The major tokens supported by eMudhra are WatchData ProxKey, ePass 2003, Aladdin, Safenet, TrustKey etc. Many of these hardware tokens come (in CDROM image mode) with drivers and utilities to manage the signatures, unfortunately only in Windows platform.

Failed attempts

Sometime in 2017, I tried to make these tokens work for signing GST returns under GNU/Linux, using the de-facto pcsc tool. I got a WatchData PROXKey, which doesn’t work out-of-the-box with pcsc. Digging further brings up this report and it seems the driver is a spinoff of upstream (LGPL licensed), but no source code made available, so there is no hope of using these hardware tokens with upstream tools. The only option is depending on vendor provided drivers, unfortunately. There are some instructions by a retailer to get this working under Ubuntu.

Once you download and install that driver (ProxKey_Redhat.rpm), it does a few things — installs a separate pcsc daemon named pcscd_wd, installs the driver CCID bundles and certain supporting binaries/libraries. (The drawback of such custom driver implementations is that different drivers clash with each other, as each one provides a different pcscd_wd binary and their installation scripts silently overwrite existing files!) To avoid any clashes with this pcscd_wd daemon, disable the standard pcscd daemon with systemctl stop pcscd.service.

Plug in the USB hardware token and to the dismay observe that it spews the following error messages in journalctl:

Oct 06 09:16:51 athena pcscd_wd[2408]: ifdhandler.c:134:IFDHCreateChannelByName() failed
Oct 06 09:16:51 athena pcscd_wd[2408]: readerfactory.c:1043:RFInitializeReader() Open Port 0x200001 Failed (usb:163c/0417:libhal:/org/freedesktop/Hal/devices/usb_device_163c_0417_serialnotneeded_if1)
Oct 06 09:16:51 athena pcscd_wd[2408]: readerfactory.c:335:RFAddReader() WD CCID UTL init failed.

This prompted me to try different drivers, mostly from the eMudhra repository — including eMudhra Watchdata, Trust Key and even ePass (there were no *New* drivers at this time) — but none of them seemed to work. Many references were towards Ubuntu, so I tried various Ubuntu versions from 14.04 to 18.10, but they didn't yield a different result either. At this point, I put the endeavour on the back burner.

A renewed interest

Around September 2019, KITE announced that they would start supporting government officials using digital signatures under GNU/Linux, as most Kerala government offices now run on libre software. KITE has made the necessary drivers, signing tools and manuals available.

I tried this on a (recommended) Ubuntu 18.04 system, but the pcscd_wd errors persisted and the NICDSign tool couldn't recognize the PROXKey digital token. However, their installation methods gave me a better idea of how these drivers are supposed to work with the signing middleware.

A couple of days ago, with a better understanding of how these drivers work, I thought they should also work on a Fedora 30 system (which is my main OS), so I set out for another attempt.

How to
  1. Remove all the wdtokentool-proxkey, wdtokentool-trustkey, wdtokentool-eMudhra, ProxKey_Redhat and similar drivers, if installed, to start from a clean slate.
  2. Download WatchData ProxKey (Linux) *New* driver from eMudhra.
  3. Unzip and install the wdtokentool-ProxKey-1.1.1 RPM/DEB package. Note that this package installs the TRUSTKEY driver (/usr/lib/WatchData/TRUSTKEY/lib/libwdpkcs_TRUSTKEY.so), not the ProxKey driver (/usr/lib/WatchData/ProxKey/lib/libwdpkcs_SignatureP11.so), and it seems the ProxKey token only works with the TRUSTKEY driver!
  4. Start pcscd_wd.service by systemctl start pcscd_wd.service (only if not auto-started)
  5. Plug in your PROXKey token. (journalctl -f would still show the error message, but — lesson learned — this error can be safely ignored!)
  6. Download emsigner from GST website and unzip it into your ~/Documents or another directory (say ~/Documents/emSigner).
  7. Ensure port 1585 is open in the firewall settings: firewall-cmd --add-port=1585/tcp --zone=FedoraWorkstation (adjust the firewall zone if necessary). Repeat the same command with --permanent added to make the change persist across reboots.
  8. Go to ~/Documents/emSigner in shell and run ./startserver.sh (make sure to chmod 0755 startserver.sh, or double-click on this script from a file browser).
  9. Login to GST portal and try to file your return with DSC.
  10. If you get the error Failed to establish connection to the server. Kindly restart the Emsigner when trying to sign, open another tab in the browser window, go to https://localhost:1585, and try signing again.
  11. You should be prompted for the digital signature PIN and signing should succeed.

It is possible to use this digital token in Firefox as well (via Preferences → Privacy & Security → Certificates → Security Devices → Load, with Module filename set to /usr/lib/WatchData/TRUSTKEY/lib/libwdpkcs_TRUSTKEY.so) as long as the key is plugged in. Here again, you can ignore the error message unable to load the module.

Categories: FLOSS Project Planets

DrupalCon News: DrupalCon Minneapolis - extended deadlines for call for proposals & scholarships and grants

Planet Drupal - Mon, 2019-10-07 03:17
Photo by Rob Shea

Mark your calendars: The deadline for Proposals and applying for Grants & Scholarships is now Wednesday, December 4
 

Categories: FLOSS Project Planets

Lullabot: Behind the Screens: Behind the Screens with Amitai Burstein

Planet Drupal - Mon, 2019-10-07 03:00

How do you maintain one of Drupal's most prolific modules, run a business, have a family, and stay balanced? Amitai Burstein spills  his secrets to success and how you can join him! Also, sushi!

Categories: FLOSS Project Planets

Glyph Lefkowitz: The Numbers, They Lie

Planet Python - Mon, 2019-10-07 02:25

It’s October, and we’re all getting ready for Halloween, so allow me to me tell you a horror story, in Python:

>>> 0.1 + 0.2 - 0.3
5.551115123125783e-17

Some of you might already be familiar with this chilling tale, but for those who might not have experienced it directly, let me briefly recap.

In Python, the default representation of a number with a decimal point in it is something called an “IEEE 754 double precision binary floating-point number”. This standard achieves a generally useful trade-off between performance and correctness, and is widely implemented in hardware, making it a popular choice for numbers in many programming languages.

However, as our spooky story above indicates, it’s not perfect. 0.1 + 0.2 is very slightly more than 0.3 in this representation, because it is a floating-point representation in base 2.

If you’ve worked professionally with software that manipulates money1, you typically learn this lesson early; it’s quite easy to smash head-first into the problem with binary floating-point the first time you have an item that costs 30 cents and for some reason three dimes doesn’t suffice to cover it.

There are a few different approaches to the problem; one is using integers for everything, and denominating your transactions in cents rather than dollars. A strategy which requires less weird unit conversion[2] is to use the built-in decimal module, which provides a floating-point base-10 representation rather than the standard base-2 one, and which doesn’t have any of these weird glitches surrounding numbers like 0.1.

This is often where a working programmer’s numerical education ends; don’t use floats, they’re bad, use decimals, they’re good. Indeed, this advice will work well up to a pretty high degree of application complexity. But the story doesn’t end there. Once division gets involved, things can still get weird really fast:

>>> from decimal import Decimal
>>> (Decimal("1") / 7) * 14
Decimal('2.000000000000000000000000001')

The problem is the same: before, we were working with 1/10, a value that doesn’t have a finite (non-repeating) representation in base 2; now we’re working with 1/7, which has the same problem in base 10.

Any time you have a representation of a number which uses digits and a decimal point, no matter the base, you’re going to run into some rational values which do not have an exact representation with a finite number of digits; thus, you’ll drop some digits off the (necessarily finite) end, and end up with a slightly inaccurate representation.

But Python does have a way to maintain symbolic accuracy for arbitrary rational numbers -- the fractions module!

>>> from fractions import Fraction
>>> Fraction(1)/3 + Fraction(2)/3 == 1
True
>>> (Fraction(1)/7) * 14 == 2
True

You can multiply and divide and add and subtract to your heart’s content, and still compare against zero and it’ll always work exactly, giving you the right answers.

So if Python has a “correct” representation, which doesn’t screw up our results under a basic arithmetic operation such as division, why isn’t it the default? We don’t care all that much about performance, right? Python certainly trades off performance for correctness and safety in plenty of other areas.

First of all, while Python’s willing to trade off some storage or CPU efficiency for correctness, precise fractions rapidly consume huge amounts of storage even under very basic algorithms, like consuming gigabytes while just trying to maintain a simple running average over a stream of incoming numbers.

But even more importantly, you’ll notice that I said we could maintain symbolic accuracy for arbitrary rational numbers; but, as it turns out, a whole lot of interesting math you might want to do with a computer involves numbers which are irrational: like π. If you want to use a computer to do it, pretty much all trigonometry[3] involves a slightly inaccurate approximation unless you have a literally infinite amount of storage.

As Morpheus put it, “welcome to the desert of the real”.

  1. or any proxy for it, like video-game virtual currency 

  2. and less time saying weird words like “nanodollars” to your co-workers 

  3. or, for that matter, geometry, or anything involving a square root 

Categories: FLOSS Project Planets

Mike Driscoll: PyDev of the Week: Paul Ivanov

Planet Python - Mon, 2019-10-07 01:05

This week we welcome Paul Ivanov (@ivanov) as our PyDev of the Week! Paul is a core developer of IPython and Jupyter. He is also an instructor at Software Carpentry. You can learn more about Paul on his website. You can also see what he’s been up to in open source by visiting his Github profile. Let’s take some time to get to know Paul!

Can you tell us a little about yourself (hobbies, education, etc):

I grew up in Moscow and moved to the United States with my family when I was 10. I have lived in Northern California ever since. I earned a degree in Computer Science at UC Davis. After that, I worked on a Ph.D. in Vision Science at UC Berkeley.

I really enjoy a lot of different aspects of computing, be it tinkering with hardware (especially microcontrollers) and trying out different operating systems and programming languages. Outside of things involving a keyboard, my main hobby is endurance cycling. I have a touring bike with a front basket that I’ve ridden on for a dozen 200km, two 300km, two 400km and one 600km rides. I also write in my journal (the pen and paper kind), which sometimes turns into poetry, some of which I have posted on my website.

Why did you start using Python?

In college, my roommate, Philip Neustrom, and my brother, Mike Ivanov, started DavisWiki, which was initially based on MoinMoin, wiki software implemented in Python. I remember pitching in with some minor patches and being able to make some progress despite not knowing the language. It was so intuitive and self-explanatory.

At the time, I was studying Computer Science, so I was used to “priesthood” languages that required a compile cycle, like C++, C, and Java. I had also been exposed to Perl through a Bioinformatics class I took, but there was a bunch of mysterious syntax in it that you couldn’t comprehend unless someone explained it to you. Python was so simple by comparison.

That was my first exposure to it, around 2004-2005, but I didn’t start using it regularly until grad school. I finished college early with two quarters of the academic year left and applied to a few grad schools which I’d only hear back from months later. In the interim, as a backup plan, I got a job at a Java shop.

While waiting for the big monolithic Java2EE project I was working on to start-up or reload (three to eight minutes spent grinding all those enterprise beans), I started playing with Ruby on Rails. Its interactive experience was so refreshing in comparison to a compiled language. Again, it was simple to use, though it had a little too much magic. For example, setting up a model for a “Person” created a table called “People”?!

I started grad school at UC Berkeley in 2006. My first lab rotation in Jack Gallant’s Neuroscience lab was my first real exposure to Matlab, far beyond the backslash solving I learned in an introductory linear algebra class. Again, it was a similar feeling of being able to whip up code and experiment interactively, particularly with matrices. But it was (and still is) quite a step backwards for interacting with the file system, trying to build a GUI, or interacting with a database — things like that. I was also frustrated with the out-of-memory errors that surprisingly cropped up, and its license requirement was a no-go.

I wanted to have skills that would transcend academia. Matlab licenses were cheap for students. I could get one through a campus deal at the time, but I knew that it would be a different story in industry. This was right when the first dual-core laptops started to come out, and I certainly wanted to take advantage of that. But for Matlab, I’d need a license per processor!

Someone in the Gallant lab had a PDF of Travis Oliphant’s “Guide to NumPy,” so I started using Python in my next rotation in the Redwood Center for Theoretical Neuroscience, where I ended up joining Bruno Olshausen’s lab. Luckily, there were a few other people embracing Python in the lab, and in the Brain Imaging Center, which we shared offices with for a while.

What other programming languages do you know and which is your favorite?

I’ve written serious code in C, C++, Java, Go, JavaScript, TypeScript, Elm, Idris, and Haskell. The ones that make me feel particularly giddy when I write code for fun are Elm, Go, and Idris.

I really enjoy Elm for finally providing a path into using functional languages regularly, as well as simplifying front-end code. The Elm Architecture has since been popularized by React with Redux. It’s a pattern I’ve used subsequently for developing personal user interface-based projects in Go, Idris, and Haskell. It’s also a deliberately slower-moving language. I view JavaScript, and now TypeScript, as “treadmill” languages – you have to stay on them and keep running forward just to keep up with current practices or you will fall off and get left behind. I appreciate being able to come back to Elm after six to eight months and not have the whole world shift under me in the meantime. It helps that it’s a smaller community, but I like that it feels quieter.

The Go language embraces simplicity and frowns upon clever solutions. The tooling that comes with it is fantastic – from formatting to fixing code to account for API changes, to being able to cross-compile binaries for multiple architectures *AND* operating systems by just changing some environment variables. There’s nothing else like it. Someone fond of JVM languages like Java (or Clojure or Scala) might pipe up with an objection because the same executable JAR compiled once can usually run anywhere using the Java runtime. However, the same is true for vanilla Python code – it will run anywhere there’s a Python interpreter. With Go, what you get as a result of running `GOOS=openbsd GOARCH=386 go build` will be an executable of your program that will run on OpenBSD on old 32-bit hardware. Period. It does not matter if you run that command on Debian, Windows, or macOS. And it doesn’t matter if your underlying architecture is 386, AMD64, ARM, or one of the other supported ones. This works because the binary doesn’t link against any C libraries; it just makes system calls directly to the kernel. So, what you get are true stand-alone binaries!

Idris is the most different. Writing code there is a dialogue between you and the computer. You get helpful feedback and boilerplate generation that is fractal: by writing down the type signatures, you can get the compiler to fill in a big picture sketch, zoom in on a chunk and ask the compiler to fill in more of the skeleton there as well. Dependent types gave me a new way to think about programming. It’s where I want the future of programming to go. But, in some ways, Idris is the least mature and most academic of the languages I know. Compile times can be slow (though the situation is apparently much improved with the work-in-progress Idris 2). And the community is fairly small, so there aren’t a ton of ready-to-use interfacing libraries.

So, for me, Idris can simultaneously be the most fun, yet the least productive way to code. But, there’s a good parallel here with my proclivity for cycling. There are many ways of traveling between points A and B. You can drive, you can take public transportation (be it by bus or train), or you can take your bike and get some exercise and hear the birds chirping with the wind blowing in your hair.

Apologies for the United States-centric nature of this travel analogy (and setting aside both the environmental footprint and the reality of traffic jams), but for me, in many ways, Python is like driving my own car. Frequently, it is the most practical choice for me to get from point A to point B. It will be fast, and I can go pretty far at a predictable speed. But, practical can get kind of boring. It certainly wasn’t at first. I got my driver’s license when I was 18, and I still remember how much fun it was to drive. The destinations were secondary to the journey.

What projects are you working on now?

With the help of my colleagues at Bloomberg, I’ve been organizing and hosting two-day events in our San Francisco Engineering Office every three months for the past year to encourage and facilitate experimentation in the Jupyter ecosystem. We’ve called them “Open Studio Days.” The current Wikipedia summary for ‘Open studio’ captures the spirit we want to make more prominent in the tech community: “A studio or workroom which is made accessible to all-comers, where artistic or creative work can be viewed and created collaboratively. An Open Studio is intended to foster creativity and encourage experimentation in an atmosphere of cultural exchange, conversation, encouragement, and freedom of expression.” Unlike a sprint or hackathon, where the goal is to produce something specific at the end, the point of our effort is to emphasize that sometimes we simply need to explore and participate by teaching one another, by having a discussion, or just by sharing some feelings and thoughts that we might have.

I’m also helping organize the NumFOCUS Summit this year. This is a chance for folks from the open source projects that are fiscally sponsored by the organization to get together to catch up and teach each other what we’ve been up to and figure out how we can grow our projects and our communities.

I’ve also had a commit bit for Matplotlib for a while. Though I haven’t been as active there lately, I did help Tom Caswell with a pair of releases earlier this year (2.2.4 and 3.0.3), and made my first solo release over the summer (3.1.1). Prior to that, Tom had been doing those releases single-handed for the past several years. The plan is for me to continue handling these, and I am the release manager for Matplotlib 3.2.0 which should be ready in September.

I also have a half-dozen personal projects that I haven’t released which I push forward on in the background. I say this not to tease or withhold them, but to let newer developers know that it’s okay, and even desirable, to have side projects that you don’t share with others. I consider it a public service that I haven’t released a bunch of my half-baked code over the years, though some did trickle out.

What non-Python open source projects do you enjoy using?

There are too many to name, but I suppose I have to start somewhere. Regardless of the operating system I’m on, I prefer the Vim text editor — though plain vi is fine in a pinch. I use Debian, OpenBSD, and FreeBSD operating systems, GIMP and Inkscape for creating graphics, and write code in Go, Idris, Elm, and Haskell.

How did you get involved in the Jupyter and Matplotlib communities?

A dozen years ago, using the tools in the Scientific Python (SciPy) ecosystem was definitely a counter-culture thing to do. Some of the edges were sharp, so “bleeding edge” would definitely have been an apt description at the time.

I mentioned how I started grad school in 2006 and started using Python in 2007. A year later, Fernando Perez, the creator of IPython, showed up on campus. By that point, the Redwood Center had moved to a different building on campus, so we no longer shared space with some of the other Scientific Python users on campus. One major benefit of this move was that we now had access to a premium, hard-to-come-by commodity: our own conference room. So, we started gathering together every week as a py4science group. We would teach each other how to write C extensions, different submodules of NumPy and SciPy, Matplotlib, SWIG, and Weave.

Before GitHub, Stack Overflow, and Discourse, mailing lists were where the majority of the community’s activity took place. For a while, I was very active on the Matplotlib mailing list. One time, someone had a question about whether it was possible to use the Matplotlib event handling code to support interactivity in multiple backends. I wrote a clone of Pong to illustrate that it is indeed possible — it’s crazy that pipong.py is now more than 10 years old!

Is there anything else you’d like to say?

Going back to my “Python is like driving a car” analogy, I hope I’m not dissuading anyone from learning Python or continuing to use it. By all means, please do, and I will continue as well. It’s just that I hope folks are reminded that there are other modes of transportation to leverage: you can steer a ship, pilot an airplane, fly a rocket, or just go for a walk. They all have value.

Thanks for doing the interview Paul!

The post PyDev of the Week: Paul Ivanov appeared first on The Mouse Vs. The Python.

Categories: FLOSS Project Planets

Quansight Labs Blog: Quansight Labs Work Update for September, 2019

Planet Python - Mon, 2019-10-07 01:00

As of November, 2018, I have been working at Quansight. Quansight is a new startup founded by the same people who started Anaconda, which aims to connect companies and open source communities, and offers consulting, training, support and mentoring services. I work under the heading of Quansight Labs. Quansight Labs is a public-benefit division of Quansight. It provides a home for a "PyData Core Team" which consists of developers, community managers, designers, and documentation writers who build open-source technology and grow open-source communities around all aspects of the AI and Data Science workflow.

My work at Quansight is split between doing open source consulting for various companies, and working on SymPy. SymPy, for those who do not know, is a symbolic mathematics library written in pure Python. I am the lead maintainer of SymPy.

In this post, I will detail some of the open source work that I have done recently, both as part of my open source consulting, and as part of my work on SymPy for Quansight Labs.

Bounds Checking in Numba

As part of work on a client project, I have been working on contributing code to the numba project. Numba is a just-in-time compiler for Python. It lets you write native Python code and with the use of a simple @jit decorator, the code will be automatically sped up using LLVM. This can result in code that is up to 1000x faster in some cases:
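
A minimal sketch of the kind of usage being described (my own example, not taken from the post):

from numba import jit
import numpy as np

# A plain Python loop that Numba compiles to machine code with LLVM the
# first time it is called.
@jit(nopython=True)
def total(values):
    result = 0.0
    for v in values:
        result += v
    return result

print(total(np.arange(1_000_000, dtype=np.float64)))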

Read more… (7 min remaining to read)

Categories: FLOSS Project Planets

Brad Lucas: Book Squire Is Ten Years Old

Planet Python - Mon, 2019-10-07 00:00

While releasing a new version of Book Squire the other day I realized that Book Squire is ten years old. What first started as a quickly developed application to solve a personal need has grown into one of my longest running applications.

Back in 2009 I was frustrated with the online access to our Library. It was tedious to enter the card number and PIN, then navigate to the page to see the status of my account. In addition, I was checking on accounts for family members, and with four Library cards in hand I was finding my patience tested.

I figured: why couldn't a program do this? Maybe do it every day and at some point send me a note if there was something important to know about.

That was the plan which resulted in Book Squire.

Platforms

I chose Python for the first version. I got the logging in, the navigation of the Library site and the scraping of account data working as a script. Then I decided to build it into an application running under the then-new Google App Engine platform. That worked just fine for a while. Over time I added a database to store user information and an email notification feature, with nightly reports delivered when accounts had notable events worth mentioning.

After working on a few Django applications I decided to move Book Squire to Django and host it on a VPS. Here it stayed for many years working well except for the random updates made to the Library site which broke the parsing of the pages.

Eventually, the Library upgraded their system in a significant way and actually made it somewhat user friendly. Still, it didn't support multiple cards and you had to click around a bit, so Book Squire was reworked and continued on.

For my latest update to Book Squire I've rewritten it in Clojure. The latest version is much cleaner internally and I suspect the maintenance going forward will be easier. The old Python code did suffer over time, as refactoring was never justified enough because it just worked.

If you live in Westchester County, New York and have a Library card you can use Book Squire. All of the 30-plus Libraries in the county share the same central system, the Westchester Library System, so you can use Book Squire to check on your accounts.

The address to try Book Squire is:

Categories: FLOSS Project Planets

FSF News: FSF and GNU

GNU Planet! - Sun, 2019-10-06 22:45

The Free Software Foundation (FSF) and the GNU Project were both started by Richard M. Stallman (RMS), and he served until recently as the head of both. Because of that, the relationship between the FSF and GNU has been fluid.

As part of our commitment to supporting the development and distribution of fully free operating systems, the FSF provides GNU with services like fiscal sponsorship, technical infrastructure, promotion, copyright assignment, and volunteer management.

GNU decision-making has largely been in the hands of GNU leadership. Since RMS resigned as president of the FSF, but not as head of GNU ("Chief GNUisance"), the FSF is now working with GNU leadership on a shared understanding of the relationship for the future. As part of that, we invite comments from free software community members at fsf-and-gnu@fsf.org.

Update 2019-10-07: GNU leadership has also published a statement. The contact address for sending comments to GNU is gnu-and-fsf@gnu.org.

Categories: FLOSS Project Planets

Amjith Ramanujam: Examples are Awesome

Planet Python - Sun, 2019-10-06 22:15

There are two things I look for whenever I check out an Opensource project or library that I want to use.

1. Screenshots (A picture is worth a thousand words).

2. Examples (Don't tell me what to do, show me how to do it).

Having a fully working example (or many examples) helps me shape my thought process.

Here are a few projects that are excellent examples of this.

1. https://github.com/prompt-toolkit/python-prompt-toolkit

A CLI framework for building rich command line interfaces. The project comes with a collection of small self-sufficient examples that showcase every feature available in the framework and a nice little tutorial.

2. https://github.com/coleifer/peewee

A small ORM for Python that ships with multiple web projects to showcase how to use the ORM effectively. I'm always overwhelmed by SQLAlchemy's documentation site. PeeWee is a breath of fresh air with a clear purpose and succinct documentation.

3. https://github.com/coleifer/huey

An asynchronous task queue for Python that is simpler than Celery and more featureful than RQ. This project also ships with an awesome set of examples that show how to integrate the task queue with Django, Flask or standalone use case.

The beauty of these examples is that they're self-documenting and show us how the different pieces in the library work with each other as well as external code outside of their library such as Flask, Django, Asyncio etc.

Examples save the users hours of sifting through documentation to piece together how to use a library.

Please include examples in your project.

Categories: FLOSS Project Planets

Antoine Beaupré: This is why native apps matter

Planet Debian - Sun, 2019-10-06 20:15

I was just watching a web stream on Youtube today and was wondering why my CPU was so busy. So I fired up top and saw my web browser (Firefox) took up around 70% of a CPU to play the stream.

I thought, "this must be some high resolution crazy stream! how modern! such wow!" Then I thought, wait, this is the web, there must be something insane going on.

So I did a little experiment: I started chromium --temp-profile on the stream, alongside vlc (which can also play Youtube streams!). Then I took a snapshot of the top(1) command after 5 minutes. Here are the results:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
16332 anarcat   20   0 1805160 269684 102660 S  60,2  1,7   3:34.96 chromium
16288 anarcat   20   0  974872 119752  87532 S  33,2  0,7   1:47.51 chromium
16410 anarcat   20   0 2321152 176668  80808 S  22,0  1,1   1:15.83 vlc
 6641 anarcat   20   0   21,1g 520060 137580 S  13,8  3,2  55:36.70 x-www-browser
16292 anarcat   20   0  940340  83980  67080 S  13,2  0,5   0:41.28 chromium
 1656 anarcat   20   0 1970912  18736  14576 S  10,9  0,1   4:47.08 pulseaudio
 2256 anarcat   20   0  435696  93468  78120 S   7,6  0,6  16:03.57 Xorg
16262 anarcat   20   0 3240272 165664 127328 S   6,2  1,0   0:31.06 chromium
  920 message+  20   0   11052   5104   2948 S   1,3  0,0   2:43.37 dbus-daemon
17915 anarcat   20   0   16664   4164   3276 R   1,3  0,0   0:02.07 top

To deconstruct this, you can see my Firefox process (masquerading as x-www-browser) which has been started for a long time. It's taken 55 hours of CPU time, but let's ignore that for now as it's not in the benchmark. What I find fascinating is there are at least 4 chromium processes running here, and they collectively take up over 7 minutes of CPU time.

Compare this to a little over one (1!!!11!!!) minute of CPU time for VLC, and you realize why people are so ranty about everything being packaged as web apps these days. It's basically using up an order of magnitude more processing power (and therefore electric power and slave labor) to watch those silly movies in your web browser than in a proper video player.

Keep that in mind next time you let Youtube go on a "autoplay Donald Drumpf" playlist...

Categories: FLOSS Project Planets

Calvin Spealman: Announcing Feet, a Python Runner

Planet Python - Sun, 2019-10-06 19:52

I've been working on a problem that's bugged me for about as long as I've used Python and I want to announce my stab at a solution, finally!

I've been working on the problem of "How do i get this little thing I made to my friend so they can try it out?" Python is great. Python is especially a great language to get started in, when you
don't know a lot about software development, and probably don't even know a lot about computers in general.

Yes, Python has a lot of options for tackling some of these distribution problems for games and apps. Py2EXE was an early option, PyInstaller is very popular now, and PyOxide is an interesting recent entry. These can be great options, but they didn't fit the kind of use case and experience that made sense to me. I'd never really been able to put my finger on it, until earlier this year:

Python needs LÖVE.

LÖVE, also known as "Love 2D", is a game engine that makes it super easy to build small Lua games and share them. Before being a game engine, a graphics library, or anything else: LÖVE is a portable runtime that's perfect for distribution these games.

The trick is skipping the build process entirely. We've tackled the distribution problems in Python over the years with many tricks to build self-contained executables of our Python projects. These work, but they add extra steps and infrastructure to projects. They add another set of new, unfamiliar things for newcomers to learn, getting in between their excitement over having built their first thing to show off and their actually being able to share it with anyone.

Learning to make your first Pygame game and then immediately having no idea how to get it into someone else's hands can be a really demoralizing barrier. So, I set out to replicate the LÖVE model in Python.

However, I didn't want to build a game engine. I didn't want to reinvent wheels, and Python already has many of them. I wanted to combine the Python language with the workflow of LÖVE projects and build on top of the huge ecosystem of Python tooling and libraries, like wxWindows and Pyglet and Numpy. I just wanted a way to make Python projects run.

So I built Feet, a Python Runner.

Feet is different from executable generators like PyInstaller. There is no build step. You don't even need to install Python. Feet is a complete Python runtime that sits inside your project and provides an obvious EXE for users to double-click. It finds the main.py file in your project and runs it, but it also lets you manage packages from the Python ecosystem. That's the real magic sauce. If you distribute a requirements.txt with your project, it'll install the dependencies for your users, locally to the project, and run everything out of the box; or you can package the whole thing up (dependencies included) and hand your users a single Zip or EXE file.

There will be a lot of work ahead to make Feet everything it can be for the Python community. I hope to talk more about why I've wanted to solve this problem for nearly twenty years now and also share technical details about what I'm doing with Feet.

For now, please go try it out. Download the EXE release into a Pygame or other Python project and try using Feet to run it on Windows without having to install Python or package anything. Give me feedback, complain in bug tickets, contribute back if you see improvements, or just please let me know what you think!
Categories: FLOSS Project Planets

Iustin Pop: IronBike Einsiedeln 2019

Planet Debian - Sun, 2019-10-06 17:00

The previous race I did at the end of August was soo much fun—I forgot how much fun this is—that I did the unthinkable and registered to the IronBike as well, at a short (but not the shortest) distance.

Why unthinkable? Last time I was here, three years ago, I apparently told my wife I will never, ever go to this race again, since it’s not a fun one. Steep up, steep down, or flat, but no flowing sections.

Well, it seems I forgot I said that, I only remembered “it’s a hard race”, so I registered at the 53km distance, where there are no age categories. Just “Herren Fun” ☺

September

The month of September was a pretty good one, overall, and I managed to even lose 2kg (of fat, I’m 100% sure), and continued to increase my overall fitness - 23→34 CTL in Training Peaks.

I also did the Strava Escape challenge and the Sufferfest Yoga challenge, so core fitness improved as well. And I was more rested: Training Peaks form was -8, up from -18 for the previous race.

Overall, I felt reasonably confident to complete (compete?) the shorter distance; official distance 53km, 1’400m altitude.

Race (day)

The race for this distance starts very late (around half past ten), so it was an easy morning, breakfast, drive to place, park, get dressed, etc.

The weather was actually better than I feared, it was plain shorts and t-shirt, no need for arm warmers, or anything like that. And it got sunnier as the race went on, which reminded me why on my race prep list it says “Sunscreen (always)”, and not just sunscreen.

And yes, this race as well, even at this “fun” distance, started very strong. The first 8.7km are flat (total altitude gain 28m, i.e. nothing), and I did them at an average speed of 35.5km/h! Again, this is on a full-suspension mountain bike, with thick tires. I actually drafted, a lot, and it was the first time I realised even with MTB, knowing how to ride in a group is useful.

And then, the first climb starts. About 8.5km, gaining 522m altitude (around a third only of the goal…), and which was tiring. On the few places where the climb flattened I saw again that one of the few ways in which I can gain on people is by timing my effort well enough such that when it flattens, I can start strongly and thus gain, sometimes significantly.

I don’t know if this is related to the fact that I’m on the heavier but also stronger side (this is comparing peak 5s power with other people on Sufferfest/etc.), or just I was fresh enough, but I was able to apply this repeatedly over the first two thirds of the race.

At this moment I was still well, and the only negative aspect—my lower back pain, which started very early—was on and off, not continuous, so all good.

First descent: 4.7km, average almost -12%, thus losing almost all the gain, ~450m. And this was an interesting descent: a lot of it was on a quite wide trail, but "paved" with square wood logs. The gaps between the logs, about 3-4cm, were not filled well enough with soil, so it was a very jittery ride. I saw a lot of people unable to ride there (to my surprise) and, an even bigger surprise, probably 10 or 15 people with flats. I guess they were running pressures low enough to get pinch flats ("snakebites") and thus a flat. I had no issues; I was running a bit of extra pressure (lucky guess), which I let out later, but here it served me very well.

Another short uphill, 1.8km/230m altitude, but I gave up and pushed the bike. Not fancy, but when I got off the bike it was 16%!! And that’s not my favourite place in the world to be. Pushing the bike was not that bad; I was going 4.3km/h versus 5.8km/h beforehand, so I wasn’t losing much time. But yes, I was losing time.

Then another downhill, thankfully not so steep, so I could enjoy going down for longer, and then a flat section. Overall almost 8km of flat, which I remembered quite well. Actually I was remembering quite a bit from the previous race, despite the different route, but this flat section in particular I remembered in two ways:

  • that I was already very, very tired back then
  • and that I didn't remember at all what was following

So I started on this flat section, expecting good progress, only to find myself going slower and slower. I couldn't understand why it was so tiring to go straight, until I started approaching the group ahead. I realised then it was a "false flat"; I knew the term before, but hadn't really understood what it meant; now I do ☺ About 1km at a mean grade of 4.4%, which is not much when you know it's a climb, but it is not flat. Once I realised this, I felt better. The other ~6.5km I spent pursuing a large group ahead, which I managed to catch about half a kilometre before the end of the flat.

I enjoyed this section, then the road turned and I saw what my brain didn't want to remember: the last climb, 4km long, only ~180m of altitude but a mean grade of 10%, and I didn't have the energy for all of it. Which is kind of hilarious: 180m of altitude is only twice and a bit my daily one-way commute, so peanuts, but I was quite tired from the two previous climbs and from chasing that group, so after 11 minutes I got off the bike and started pushing.

And the pushing was the same as before, actually even better: I was going 5.2km/h on average here, versus 6.1km/h before, so less than 1km/h difference. This walking speed is faster than normal walking; since it was not very steep, I could walk fast. Being able to walk fast and feeling recovered, I got back on the bike before the climb finished, only to feel a solid cramp in the right leg as soon as I tried to start pedalling. OK, let me pedal with the left; ouch, cramp as well. So I was actually going almost as slow as walking, just managing the cramps and keeping them below critical, and after about 5 minutes they went away.

It was the first time I got such cramps, or any cramps, during a race. I'm not sure I understand why; I did drink energy drinks, not just water (so I assume it was not an acute lack of electrolytes). From what I read on the internet, cramps are not a solved problem, and are most likely just due to hard efforts at high intensity. It was scary for about ten seconds, as I feared a full cramp in one or both legs (and being clipped in… not funny), but once I realised I could keep it under control, it was just interesting.

I now remembered the rest of the route (ahem, see below), so I knew I was looking at a long (7km), flat-ish section (100m gain) before the last, relatively steep descent. Steep as in -24.4% maximum gradient, -15% average. That done, I was on relatively flat roads (paved, even), and my Garmin was showing 1'365m of altitude gain since the start of the race. The difference from the official 1'400m was only 35m, which I was sure was down to simple measurement error; the distance so far was 50.5km, so I expected the remaining 2.5km to be a fast run towards the finish.

Then, in a village, we left the main road and took a small road going up a bit, and then up a bit more, and then—I was pretty annoyed here already—the road opened onto a climb. A CLIMB!!!! I could see the top of the hill and people going up, and it was not a small climb! It was not 35m, nothing close to that; it was a short but real climb!

I was swearing with all my power in various languages. I had completely and totally forgotten this last climb, I had not planned for it, and I was pissed off ☺ It turned out to be 1.3km long, with a mean grade of 11.3% (!) and 123m of elevation gain. That was almost 100m more than I had planned for… I got off the bike again, especially as it was a bit tricky to pedal where the gradient hit 20% (!!!), so I pushed, then got back on, then back off, and then back on. At the end of the climb there was a photographer again, so I got on the bike and hoped I might get a smiling picture, which almost happened:

Pretending I’m enjoying this, and showing how much weight I should get rid of :/

And from here on it was a clean, easy downhill, at most -20% grade, back into Einsiedeln, and back to the start with a 10m climb, on which I pushed hard, overtook a few people (lol), crossed the finish, and promptly found the first possible place to get off the bike and lie down. This was all behind me now:

Thanks VeloViewer!

And, my Garmin said 1'514m of altitude gain, more than 100m above the official number. Boo!

Aftermath

It was the first race after which I actually had to lie down. First race with cramps, as well ☺ Third-highest 1h30m heart rate value ever, and a lot of other 2019 peaks too. And I was dead tired.

I learned it’s very important to know well the route altitude profile, in order to be able to time your efforts well. I got reminded I suck at climbing (so I need to train this over and over), but I learned that I can push better than other people at the end of a climb, so I need to train this strength as well.

I also learned that you can't rely on the official race data, since it can be off by more than 100m. Or maybe I can't rely on my Garmin(s), but since they have a barometric altimeter, I don't think the problem was on my side.

I think I ate a bit too much during the race, which was not optimal, but I was very hungry already after the first climb…

But the biggest thing overall was that, despite there being no age groups, I placed better than I expected. I finished in 3h37m14s, 1h21m56s behind first place, but a good enough time to come 325th out of 490 finishers.

By my calculations, 490 finishers means the thirds are 1-163, 164-326, and 327-490. Which means, I was in the 2nd third, not in the last one!!! Hence the subtitle of this post, moving up, since I usually am in the bottom third, not the middle one. And there were also 11 DNF people, so I’m well in the middle third ☺

Joke aside, this made me very happy, as it’s the first time I feel like efforts (in a bad year like this, even) do amount to something. So good.

Speaking of DNFs, I was shocked at the number of people I saw either trying to fix their bike on the side of the trail, walking their bike (to the next repair post), or just shouting angrily at a broken component. I counted probably close to 20 such people, many after that first descent, but given only 11 DNFs, I guess many people do carry spare tubes. I didn't; hope is a strategy sometimes ☺

Now, with the season really closed, time to start thinking about next year. And how to lose those 10kg of fat I still need to lose…

Categories: FLOSS Project Planets

Antoine Beaupré: Calibre replacement considerations

Planet Debian - Sun, 2019-10-06 15:27
Summary

TL;DR: I'm considering replacing those various Calibre components with...

See below for why, and for a deeper discussion of all the features.

Problems with Calibre

Calibre is an amazing piece of software: it allows users to manage ebooks on their desktop and on a multitude of ebook readers. It's used by Linux geeks as well as Windows power-users and vastly surpasses any native app shipped by ebook manufacturers. I know almost exactly zero people who own an ebook reader and do not use Calibre.

However, it has had many problems over the years:

Update: a previous version of that post claimed that all of Calibre had been removed from Debian. This was inaccurate, as the Debian Calibre maintainer pointed out. What happened was Calibre 4.0 was uploaded to Debian unstable, then broke because of missing Python 2 dependencies, and an older version (3.48) was uploaded in its place. So Calibre will stay around in Debian for the foreseeable future, hopefully, but the current latest version (4.0) cannot get in because it depends on older Python 2 libraries.

The latest issue (the lack of Python 3 support) is the last straw for me. While Calibre is an awesome piece of software, I can't help but think it's doing too much, and the wrong way. It's one of those tools that looks amazing on the surface, but when you look underneath, it's a monster that is impossible to maintain, a liability that is just bound to cause more problems in the future.

What does Calibre do anyways

So let's say I wanted to get rid of Calibre, what would that mean exactly? What do I actually use Calibre for anyways?

Calibre is...

  • an ebook viewer: Calibre ships with the ebook-viewer command, which allows one to browse a vast variety of ebook formats. I rarely use this feature, since I read my ebooks on an e-reader, on purpose. There is, besides, a good variety of ebook readers on different platforms that can replace Calibre here:

    • Atril, MATE's version of Evince, supports ePUBs (Evince doesn't seem to)
    • Bookworm looks very promising; it is not in Debian (Debian bug #883867), but is on Flathub. It has the same problem as GNOME Books in finding my books (i.e. it can't).
    • Buka is another "ebook" manager written in Javascript, but only supports PDFs for now.
    • coolreader is another alternative, not yet in Debian (#715470)
    • Emacs (of course) supports ebooks through nov.el
    • fbreader also supports ePUBs, but is much slower than all the others, and it turned proprietary, so it is unmaintained
    • Foliate looks gorgeous and is built on top of the ePUB.js library; it is not in Debian, but is on Flathub.
    • GNOME Books is interesting, but relies on the GNOME search engine and doesn't find my books (finding lots of other garbage instead). It's been described as "basic" and "the least mature" in this OMG Ubuntu review
    • koreader is a good alternative reader for the Kobo devices and now also has builds for Debian, but no Debian package
    • lucidor is a Firefox extension that can read and organize books, but is not packaged in Debian either (although upstream provides a .deb). It depends on older Firefox releases (or "Pale Moon", a Firefox fork); see also the firefox XULocalypse for details
    • MuPDF also reads ePUBs without problems and is really fast
    • Okular supports ePUBs when okular-extra-backends is installed
    • plato is another alternative reader for Kobo readers, not in Debian
  • an ebook editor: Calibre also ships with an ebook-edit command, which allows you to do all sorts of nasty things to your ebooks. I have rarely used this tool, having found it hard to use and not giving me the results I needed, in my use case (which was to reformat ePUBs before publication). For this purpose, Sigil is a much better option, now packaged in Debian. There are also various tools that render to ePUB: I often use the Sphinx documentation system for that purpose (a minimal sketch follows after this list), and have been able to produce ePUBs from LaTeX for some projects.

  • a file converter: Calibre can convert between many ebook formats, to accommodate the various readers. In my experience, this doesn't work very well: the layout is often broken and I have found it's much better to find pristine copies of ePUB books than to fight with the converter. There are, however, very few alternatives to this functionality, unfortunately.

  • a collection browser: this is the main functionality I would miss from Calibre. I am constantly adding books to my library, and Calibre does have this incredibly nice functionality of just hitting "add book" and having it Just Do The Right Thing™ after that. Specifically, what I like is that it:

    • sort, view, and search books in folders, per author, date, editor, etc
    • quick search is especially powerful
    • allows downloading and editing metadata (like covers) easily
    • track read/unread status (although that's a custom field I had to add)

    Calibre is, as far as I know, the only tool that goes so deep in solving that problem. The Liber web server, however, does provide similar search and metadata functionality. It also supports migrating from an existing Calibre database as it can read the Calibre metadata stores. When no metadata is found, it fetches some from online sources (currently Google Books).

    One major limitation of Liber in this context is that it's solely search-driven: it will not allow you to see (for example) the "latest books added" or to "browse by author". It also doesn't support "uploading" books, although it will incrementally pick up new books added by hand to the library. It somewhat assumes Calibre already exists to properly curate the library, and is designed more as a search engine and book-sharing system between Liber instances.

    This also connects with the more general "book inventory" problem I have, which involves an inventory of physical books and a directory of online articles. See also firefox (Zotero section) and ?bookmarks for a longer discussion of that problem.

  • a device synchronization tool: I mostly use Calibre to synchronize books with an ebook reader. It can also automatically update the database on the ebook reader with relevant metadata (e.g. collection or "shelves"), although I do not really use that feature. I do like to use Calibre to quickly search and prune books from my ebook reader, however. I might be able to use git-annex for this, given that I already use it to synchronize and back up my ebook collection in the first place...

  • an RSS reader: I used this for a while to read RSS feeds on my ebook reader, but it was pretty clunky. Calibre would continuously generate new ebooks based on those feeds and I would never read them, because I would never find the time to transfer them to my ebook reader in the first place. Instead, I use a regular RSS feed reader (I ended up writing my own, feed2exec), and when I find an article I like, I add it to Wallabag, which gets sync'd to my reader using wallabako, another tool I wrote.

  • an ebook web server: Calibre can also act as a web server, presenting your entire ebook collection as a website. It also supports acting as an OPDS directory, which is kind of neat. There are, as far as I know, no alternatives for such a system, although there are servers to share and store ebooks, like Trantor or Liber.
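
As mentioned in the ebook editor item above, Sphinx can render documentation straight to ePUB. Here is a minimal sketch of driving its ePUB builder from Python; it assumes a standard Sphinx project whose sources and conf.py live in a docs/ directory (the paths are illustrative, and the same result is usually obtained with sphinx-build -b epub).

  from sphinx.application import Sphinx

  # hypothetical paths for a standard Sphinx layout
  app = Sphinx(
      srcdir="docs",                      # reStructuredText sources
      confdir="docs",                     # directory containing conf.py
      outdir="docs/_build/epub",          # where the .epub ends up
      doctreedir="docs/_build/doctrees",  # Sphinx's internal cache
      buildername="epub",                 # same builder as `sphinx-build -b epub`
  )
  app.build()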

Note that I might have forgotten functionality in Calibre in the above list: I'm only listing the things I have used or am using on a regular basis. For example, you can have a USB stick with Calibre on it to carry the actual software, along with the book library, around on different computers, but I never used that feature.

So there you go. It's a colossal task! And while it's great that Calibre does all those things, I can't help but think that it would be better if Calibre were split up into multiple components, each maintained separately. I would love to use only the document converter, for example. It's possible to do that on the command line, but it still means I have the entire Calibre package installed.

Maybe a simple solution, from Debian's point of view, would be to split the package into multiple components, with the GUI and web servers packaged separately from the command-line converter. This way I would be able to install only the parts of Calibre I need and have limited exposure to other security issues. It would also make it easier to run Calibre headless, in a virtual machine or on a remote server, for extra isolation.

Update: this post generated some activity on Mastodon, follow the conversation here or on your favorite Mastodon instance.

Categories: FLOSS Project Planets

Akademy 2019: new goals, new board, new president

Planet KDE - Sun, 2019-10-06 13:43

Akademy 2019 has been over for a little more than 3 weeks now. It’s been a great and eventful Akademy. Let’s take a look at what happened.

New goals

In 2017 we chose 3 goals to work towards together as a community: improved onboarding, usability and productivity of basic software, and privacy. For all of them we've made great progress and I'm thrilled by the result. But the original idea behind the goals wasn't just to get work done on some specific topics. The other reason was to give us something to work towards together as a community (as we were starting to lose that uniting factor that binds us all together as KDE grows) and to bring in new people by making it clearer what we need help with and where to dive in. Looking back now, my expectations were exceeded quite a bit. It makes me so happy to see a lot of new people joining and being enthusiastic about contributing to KDE in meaningful ways – even going so far as to propose 2 of the 3 selected initial goals.

At the beginning of the year we decided it was time to shift our focus to new goals and started the process for proposing and voting on new goals. We started Akademy with a review of the initial goals and then I had the pleasure to announce the new ones:

  • Wayland: We will finalize the transition to Wayland and embrace the future of desktop. This is a necessary step towards a lot of new features and improvements our users want to see, like better touchscreen support.
  • Consistency: As KDE's software evolved, small and large inconsistencies crept in that make our software less pleasant to use. It also means having to maintain different implementations of essentially the same component, like a scrollbar. We will identify and remove these inconsistencies across all of KDE's software.
  • All about the Apps: We want to refocus on KDE’s applications and make them easier to discover and install for our users.

I’m looking forward to the progress on these goals over the next 2 to 3 years. To start helping out please have a look at the goals page and get in touch.

New board

As every year during Akademy we held the general assembly of KDE e.V. and elected new board members for the two open positions. I’m delighted to welcome Adriaan and Neofytos to the board.

After the election it was time to decide on the board positions. I have been on the board of KDE e.V. for 8 years now and its president for 5 years. Leading this organisation has been one of the most important things I have done so far, and I believe I have made an impact. At the same time I am convinced that it is not healthy for an organisation to be led by the same person for too long. That's why at the start of my current term we discussed how we see the future of the organisation and our role in it. It was clear that Aleix has been doing invaluable work on the board as vice president and would be a good choice to lead the organisation in the future. We decided that there will be at least one year at the end of our current term where I stay on the board to support, advise, and ensure a smooth transition for Aleix. This time has now come. I would like to ask you all to welcome Aleix as the new president of KDE e.V. and to provide him with all the support he needs. I am looking forward to working with our new Board and seeing where we will take KDE e.V. together in the coming years.

The new board positions we agreed on are as follows:

  • Aleix Pol i Gonzalez: President
  • Lydia Pintscher: Vice President
  • Eike Hein: Vice President and Treasurer
  • Adriaan de Groot: Board Member
  • Neofytos Kolokotronis: Board Member
Next Akademy

We are still looking for a host for Akademy 2020. If you'd like to host the KDE community next year, please have a look at the call for hosts, which has all the details, and reach out if you have any questions.

Categories: FLOSS Project Planets

libredwg @ Savannah: libredwg-0.9 released

GNU Planet! - Sun, 2019-10-06 11:21

This is a major release, the first beta,
adding the new dxf importer and dxf2dwg (experimental),
the first usage of the new dynapi and the encoder.

More here: https://www.gnu.org/software/libredwg/ and http://git.savannah.gnu.org/cgit/libredwg.git/tree/NEWS

Here are the compressed sources:
  http://ftp.gnu.org/gnu/libredwg/libredwg-0.9.tar.gz   (10.2MB)
  http://ftp.gnu.org/gnu/libredwg/libredwg-0.9.tar.xz   (4.3MB)

Here are the GPG detached signatures[*]:
  http://ftp.gnu.org/gnu/libredwg/libredwg-0.9.tar.gz.sig
  http://ftp.gnu.org/gnu/libredwg/libredwg-0.9.tar.xz.sig

Use a mirror for higher download bandwidth:
  https://www.gnu.org/order/ftp.html

Here are more binaries:
  https://github.com/LibreDWG/libredwg/releases/tag/0.9

Here are the SHA256 checksums:
e39ac35bc174fe8d0b05fc800970c685692714daacd6026a4e4f0f4d0ddb08e0  libredwg-0.9.tar.gz
954f74753860315eb313a3bbb83bf7e5ad03e84bd10408cc629ff2e4e4b3fd46  libredwg-0.9.tar.xz
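
For completeness, here is a minimal Python sketch for checking one of those checksums (the shell equivalent is sha256sum); the filename and expected digest are the ones published above, everything else is illustrative.

  import hashlib

  expected = "e39ac35bc174fe8d0b05fc800970c685692714daacd6026a4e4f0f4d0ddb08e0"

  h = hashlib.sha256()
  with open("libredwg-0.9.tar.gz", "rb") as f:
      # hash in 1 MiB chunks to avoid loading the whole tarball into memory
      for chunk in iter(lambda: f.read(1 << 20), b""):
          h.update(chunk)

  print("OK" if h.hexdigest() == expected else "MISMATCH")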

[*] Use a .sig file to verify that the corresponding file (without the .sig suffix) is intact.  First, be sure to download both the .sig file and the corresponding tarball.  Then, run a command like this:

  gpg --verify libredwg-0.9.tar.gz.sig

If that command fails because you don't have the required public key, then run this command to import it:

  gpg --keyserver keys.gnupg.net --recv-keys B4F63339E65D6414

and rerun the 'gpg --verify' command.

Categories: FLOSS Project Planets

hussainweb.me: Open content in new tab with an image formatter

Planet Drupal - Sun, 2019-10-06 07:21
One of my sites has a listing of content shown as teaser. The teaser for this content type is defined to show the title, an image, and a few other fields. In the listing, the image is linked to the content so that the visitor may click on the image (or the title) to open the content. All this is easily achievable through regular Drupal site building.
Categories: FLOSS Project Planets
