FLOSS Project Planets

PSF GSoC students blogs: GSoC: Week 6: class InputEngine

Planet Python - Mon, 2020-07-06 03:12
What did I do this week?

I started working on the input engine this week. Until now, we only had csv2cve, which accepts a csv file of vendor, product and version as input and produces a list of CVEs as output; it is a separate module with its own command line entry point. I have created a module called input_engine that can process data from any supported input format (currently csv and json). Users can now add a remarks field in csv or json, which can take any of the following values (the values in parentheses are aliases for that specific type):

  1. NewFound (1, n, N)
  2. Unexplored (2, u, U)
  3. Mitigated (3, m, M)
  4. Confirmed (4, c, C)
  5. Ignored (5, i, I)

I have added an --input-file (-i) option to cli.py to specify the input file, which input_engine parses to create an intermediate data structure that output_engine uses to display data according to remarks. Output is displayed in the same order as the priority given to the remarks. I have also created a dummy csv2cve which just calls cli.py with the -i option, passing along the argument given to csv2cve. Here is an example of using -i with an input file to produce CVEs: cve-bin-tool -i=test.csv. Users can also use -i to supplement remarks data while scanning a directory, so that the output is sorted according to remarks. Here is an example of that usage: cve-bin-tool -i=test.csv /path/to/scan.
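To make the remarks handling concrete, here is a minimal sketch of how aliases could be resolved and rows ordered by remark priority. It is not the actual input_engine code: the names Remark, ALIASES, parse_remark and load_rows, the sample vendor and product values, and the fallback to NewFound for an unknown or missing remark are all assumptions made for illustration.

```python
import csv
import io
from enum import IntEnum


class Remark(IntEnum):
    """Remark types from the list above; a lower value sorts first."""
    NEW_FOUND = 1
    UNEXPLORED = 2
    MITIGATED = 3
    CONFIRMED = 4
    IGNORED = 5


# Map every accepted spelling (lower-cased) to its Remark.
ALIASES = {
    alias: remark
    for remark, aliases in {
        Remark.NEW_FOUND: ("newfound", "1", "n"),
        Remark.UNEXPLORED: ("unexplored", "2", "u"),
        Remark.MITIGATED: ("mitigated", "3", "m"),
        Remark.CONFIRMED: ("confirmed", "4", "c"),
        Remark.IGNORED: ("ignored", "5", "i"),
    }.items()
    for alias in aliases
}


def parse_remark(value):
    """Resolve a remarks cell; the NEW_FOUND fallback is an assumption."""
    return ALIASES.get((value or "").strip().lower(), Remark.NEW_FOUND)


def load_rows(handle):
    """Read vendor/product/version/remarks rows and sort them by remark."""
    rows = [
        {**row, "remarks": parse_remark(row.get("remarks"))}
        for row in csv.DictReader(handle)
    ]
    return sorted(rows, key=lambda row: row["remarks"])


# Hypothetical input file contents.
SAMPLE = """vendor,product,version,remarks
examplevendor,libfoo,1.0,i
examplevendor,libbar,2.3,Confirmed
examplevendor,libbaz,0.9,N
"""

rows = load_rows(io.StringIO(SAMPLE))
products = [row["product"] for row in rows]
print(products)
```

Using an IntEnum keeps the priority order and the numeric aliases in one place, so sorting by the parsed value is enough to get output in remark order.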

I have also added test cases for input_engine and removed old test cases of the csv2cve.

What am I doing this week? 

I have exams this week, from today until 9th July, so I won't be able to do much during the week. I will spend my weekend improving input_engine, for example by giving more fine-grained control over remarks and custom severity.

Have I got stuck anywhere?

No, I didn't get stuck anywhere this week :)

Categories: FLOSS Project Planets

Mike Driscoll: PyDev of the Week: Philip James

Planet Python - Mon, 2020-07-06 01:05

This week we welcome Philip James (@phildini) as our PyDev of the Week! Philip is a core contributor to the BeeWare project. He has worked on several other open source projects that you’ll learn about in this interview. He is also a popular speaker at PyCons and DjangoCons. You can find out more about Philip on his website or check out his work on GitHub.

Let’s spend some time getting to know Philip better!

Can you tell us a little about yourself (hobbies, education, etc):

My name is Philip, but I’m probably better known on the internet as phildini. That nickname came from a stage name; I used to do magic shows in high school for pocket money. In the Python community, I’m maybe best known as a frequent conference speaker; I’ve spoken at PyCons and DjangoCons around the world for the past 5 years. Beyond being a speaker, I’ve helped organize some Python meetups and conferences, and I serve on the PSF Conduct Working Group as its Chair. I’m also one of the early Core Contributors to the BeeWare project.

I’m the Head of Engineering at a personal finance company called Trim, where we try to automate saving people money on things like their Internet bill. I also co-run a publishing company and print shop called Galaxy Brain with a friend I met while I was at Patreon. We started as a Risograph print shop, making a zine about wine called Adult Juice Box and doing art prints. Galaxy Brain has been moving into software with the pandemic, because accessing our studio is harder, but we’re planning on keeping the printing going once things calm down. It’s kind of hilarious to us that we moved into software as an afterthought; I think we both resisted it for so long because the software is our day job.

Why did you start using Python?

I can remember helping to run a youth retreat in the Santa Cruz mountains in… I want to say 2005 or 2006, and one of the adults on the trip, who’s still a very good friend, showing me Python on a computer we had hooked up to one of the camp’s projectors. My first Python lesson happened on a 6-foot widescreen. Then in college, I took a couple courses on web applications and didn’t want to use PHP, so I started building apps in Django. That got me my first job in programming, then a job at Eventbrite, which got me into speaking, and the rest is history.

What other programming languages do you know and which is your favorite?

College theoretically taught me C and Java, but I know them like some people know ancient Greek — I can read it, but good luck speaking it. Towards the end of college I picked up some C#, and I really enjoyed my time in that language. It hit a lot of nice compromises between direct management and object-oriented modern languages, and I think a lot of that had to do with the fact that Visual Studio was such an incredible IDE.

Since I moved into web programming, I’ve picked up Javascript and Ruby, enough that I can write things in them but not enough to feel comfortable starting a project with them. Web development is in this really weird place right now, where you can maybe get away with only knowing Javascript, but you need a working familiarity with HTML, CSS, Javascript, Python, Ruby, and Shell to be effective at a high level. Maybe you just need to be good at googling those things.

I’ve recently started going deep on a language called ink, which is a language for writing Interactive Fiction games. We used to use this term “literate programming” way more; ink (along with twine and some others) is how you “program literature”. You can use ink to make standalone games or export it into a format that will drive narrative events in more complex Unity games. Stories and narratives don’t lend themselves well to modularization in the way programmers think of it, so it’s been fun watching my optimize-everything programmer brain clash with my get-the-narrative-out writer brain as I learn ink.

What projects are you working on now?

The trick is getting me to stop working on projects. Right now there’s my day job, as well as a host of Galaxy Brain projects. VictoryPic is a little Slack app for bringing an Instagram-like experience to Slack. Hello Caller is a tool for doing podcast call-in shows. I’ve got some scripts I put up for building an “on-air” monitor for my office using a Raspberry Pi and CircuitPlayground Express. I’m writing a scraping library for the game store itch, so that I can do some interesting video game streaming projects. All those are in Python, for the most part, and then there’s the Interactive Fiction game, written in ink, that I’m working on for Galaxy Brain’s wine zine.

I also continue to write for Adult Juice Box, and run a podcast called Thought & A Chaser.

Which Python libraries are your favorite (core or 3rd party)?

I think the CSV and sqlite libraries in the standard library are the two most important batteries Python comes with, outside the core language. With those two libraries, you can build a deeper well of data-driven apps than any other language I’ve seen. Outside of the stdlib, requests is the first library I reach for when I’m starting a project, and Django is how I build most of the projects I listed up above. Django is the most powerful set of abstractions for building webapps I’ve seen, in any language.

How did you get involved with the BeeWare project?

I got involved in BeeWare because of my speaking career. I was accepted to speak at DjangoCon Europe in Budapest a few years back, and met Dr. Russell Keith-Magee, the creator of BeeWare, along with Katie McLaughlin, one of the original Core Contributors. We started chatting about BeeWare there, and I hacked on it a bit at sprints, and then I saw them at DjangoCon US in Philadelphia, and then again at PyCon US in Portland, and I had kept working on BeeWare during that time and at sprints for those events. At PyCon I got the commit bit and became a Core Contributor.

The thing I take away from this story, that I tell others who want to get involved, is two-fold: (1) Submit talks to conferences, early and often. Being a conference speaker may or may not be good for your career, but it’s incredible for your sense of community and your friend circle within the tech communities you care about. (2) Show Up. There is immeasurable power in being consistent, in showing up to help regularly, in trying to ease the burdens of the people and projects you care about. The people you value in turn value those who Show Up, even if they’re not able to voice it.

Which BeeWare package do you enjoy the most?

It feels like cheating to say Briefcase, but I really think Briefcase is the most interesting part of the whole project, because it’s the closest to solving Python’s ecosystem and package management problems. We shouldn’t have to teach people what pip is to let them benefit from what Python can do.

Is there anything else you’d like to say?

I think it’s important for those of us in programming especially to remember that our communities are not as insular as we think; we exist in the world, and this world has granted quite a bit to many of us. We need to be thinking about how we can give back, not just to the tech communities but to the world as a whole. Programming can be a force for justice, but sometimes the greatest difference is made when we show up to our local protest or city council meeting.

Thanks for doing the interview, Philip!

The post PyDev of the Week: Philip James appeared first on The Mouse Vs. The Python.


PSF GSoC students blogs: Weekly Blog #3 (29th Jun - 6th Jul)

Planet Python - Mon, 2020-07-06 01:03

Hey everyone, we are done with the first third of the program, and I will use this blog both to give the weekly update and to summarize the current state of progress. In the past 4 weeks, we have created a new number-parser library from scratch and built an MVP that is being continuously improved.

Last week was spent fine-tuning the parser to retrieve the relevant data from the CLDR RBNF repo. This 'rule-based number format' (RBNF) repo is basically a Java library that converts a number (23) to the corresponding words (twenty-three). It has a lot of hard-coded values and data that are very useful to our library, so we plan to extract all this information accurately and efficiently.

In addition, each language has multiple nuances that had to be taken care of, such as accents. For example, the French '0' is written as zéro (with an accent aigu over the e). However, we don't expect users to enter these accents each time, so we need to normalise (i.e. remove) them.
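One common way to do this kind of accent removal in Python is Unicode decomposition followed by dropping the combining marks. The sketch below uses only the standard library; the function name strip_accents is hypothetical and this is not necessarily how number-parser implements it.

```python
import unicodedata


def strip_accents(text):
    """Remove combining accents, e.g. 'zéro' -> 'zero'."""
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("zéro"))
```

Letters that have no decomposition (for example "ø") pass through unchanged, so a real implementation may need extra per-language handling.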

The most challenging aspect was definitely understanding the CLDR RBNF structure (which I am still not completely clear about). There is only a little documentation explaining some of the basic rules, and it's tough to identify which rules are relevant and which aren't.

Originally I was hoping to add more tests this week as well; however, all this took longer than expected, so the testing is going to be pushed to the current week.


Mike Driscoll: Using Widgets in Jupyter Notebook (Video)

Planet Python - Sun, 2020-07-05 21:23

Learn how to use Jupyter Notebook’s built-in widgets in this video tutorial.

Get the book: https://leanpub.com/jupyternotebook101/

The post Using Widgets in Jupyter Notebook (Video) appeared first on The Mouse Vs. The Python.


Python⇒Speed: Massive memory overhead: Numbers in Python and how NumPy helps

Planet Python - Sun, 2020-07-05 20:00

Let’s say you want to store a list of integers in Python:

    list_of_numbers = []
    for i in range(1000000):
        list_of_numbers.append(i)

Those numbers can easily fit in a 64-bit integer, so one would hope Python would store those million integers in no more than ~8MB: a million 8-byte objects.

In fact, Python uses more like 35MB of RAM to store these numbers. Why? Because Python integers are objects, and objects have a lot of memory overhead.

Let’s see what’s going on under the hood, and then how using NumPy can get rid of this overhead.
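As a rough illustration of where the overhead comes from, the sketch below adds up the size of the list's pointer array and of every integer object, then compares it with a packed buffer of 8-byte integers. It assumes 64-bit CPython, where a small integer object typically takes 28 bytes and each list slot is an 8-byte pointer. The standard-library array module is used here as a stand-in for a NumPy int64 array, which likewise stores 8 bytes per element; this is not the measurement method used in the original article.

```python
import sys
from array import array

N = 1_000_000
list_of_numbers = list(range(N))

# The list itself holds one pointer per element...
pointer_bytes = sys.getsizeof(list_of_numbers)
# ...and every element is a separate integer object on the heap.
object_bytes = sum(sys.getsizeof(n) for n in list_of_numbers)
list_total = pointer_bytes + object_bytes

# A packed array of signed 64-bit integers stores only the raw values.
packed = array("q", list_of_numbers)
packed_total = packed.itemsize * len(packed)

print(f"list + int objects: {list_total / 1e6:.1f} MB")
print(f"packed 64-bit ints: {packed_total / 1e6:.1f} MB")
```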


Enrico Zini: COVID-19 and Capitalism

Planet Debian - Sun, 2020-07-05 18:00

  • Astroturfing: How To Spot A Fake Movement
    capitalism covid19 news politics | Crowds on Demand - Protests, Rallies and Advocacy | archive.org | 2020-07-06
    If the Reopen America protests seem a little off to you, that's because they are. In this video we're going to talk about astroturfing and how insidious it i...

  • Volunteers 3D-Print Unobtainable $11,000 Valve For $1 To Keep Covid-19 Patients Alive; Original Manufacturer Threatens To Sue
    capitalism covid19 health news | Volunteers produce 3D-printed valves for life-saving coronavirus treatments | archive.org | 2020-07-06
    Techdirt has just written about the extraordinary legal action taken against a company producing Covid-19 tests. Sadly, it's not the only example of some individuals putting profits before people. Here's a story from Italy, which is...

  • Germany tries to stop US from luring away firm seeking coronavirus vaccine
    capitalism covid19 health news | archive.org | 2020-07-06
    Berlin is trying to stop Washington from persuading a German company seeking a coronavirus vaccine to move its research to the United States.

  • He Has 17,700 Bottles of Hand Sanitizer and Nowhere to Sell Them
    capitalism covid19 news | archive.org | 2020-07-06
    Amazon cracked down on coronavirus price gouging. Now, while the rest of the world searches, some sellers are holding stockpiles of sanitizer and masks.

  • Theranos vampire lives on: Owner of failed blood-testing biz's patents sues maker of actual COVID-19-testing kit
    capitalism covid19 news | archive.org | 2020-07-06
    And 3D-printed valve for breathing machine sparks legal threat

  • How an Austrian ski paradise became a COVID-19 hotspot
    capitalism covid19 news | archive.org | 2020-07-06
    Ischgl, an Austrian ski resort, has achieved tragic international fame: hundreds of tourists are believed to have contracted the coronavirus there and taken it home with them. The Tyrolean state government is now facing serious criticism. EURACTIV Germany reports.

  • Hospitals Need to Repair Ventilators. Manufacturers Are Making That Impossible
    capitalism covid19 health news | archive.org | 2020-07-06
    We are seeing how the monopolistic repair and lobbying practices of medical device companies are making our response to the coronavirus pandemic harder.

  • Homeless people in Las Vegas sleep 6 feet apart in parking lot as thousands of hotel rooms sit empty
    capitalism covid19 news privilege | archive.org | 2020-07-06
    Las Vegas, Nevada has come under criticism after reportedly setting up a temporary homeless shelter in a parking lot complete with social distancing barriers.


Nikola: Nikola v8.1.1 is out!

Planet Python - Sun, 2020-07-05 17:44

On behalf of the Nikola team, I am pleased to announce the immediate availability of Nikola v8.1.1. This release is mainly due to an incorrect PGP key being used for the PyPI artifacts; three regressions were also fixed in this release.

What is Nikola?

Nikola is a static site and blog generator, written in Python. It can use Mako and Jinja2 templates, and input in many popular markup formats, such as reStructuredText and Markdown — and can even turn Jupyter Notebooks into blog posts! It also supports image galleries, and is multilingual. Nikola is flexible, and page builds are extremely fast, courtesy of doit (which is rebuilding only what has been changed).

Find out more at the website: https://getnikola.com/


Install using pip install Nikola.

Changes

Bugfixes
  • Default to no line numbers in code blocks, honor CodeHilite requesting no line numbers. Listing pages still use line numbers (Issue #3426)

  • Remove duplicate MathJax config in bootstrap themes (Issue #3427)

  • Fix doit requirement to doit>=0.32.0 (Issue #3422)


Glyph Lefkowitz: Zen Guardian

Planet Python - Sun, 2020-07-05 16:44

There should be one — and preferably only one — obvious way to do it.

— Tim Peters, “The Zen of Python”

Moshe wrote a blog post a couple of days ago which neatly constructs a wonderful little coding example from a scene in a movie. And, as we know from the Zen of Python quote, there should only be one obvious way to do something in Python. So my initial reaction to his post was of course to do it differently — to replace an __init__ method with the new @dataclasses.dataclass decorator.

But as I thought about the code example more, I realized there are a number of things beyond just dataclasses that make the difference between “toy”, example-quality Python, and what you’d do in a modern, professional, production codebase today.

So let’s do everything the second, not-obvious way!

There’s more than one way to do it

— Larry Wall, “The Other Zen of Python”

Getting started: the __future__ is now

We will want to use type annotations. But, the Guard and his friend are very self-referential, and will have lots of annotations that reference things that come later in the file. So we’ll want to take advantage of a future feature of Python, which is to say, Postponed Evaluation of Annotations. In addition to the benefit of slightly improving our import time, it’ll let us use the nice type annotation syntax without any ugly quoting, even when we need to make forward references.

So, to begin:

    from __future__ import annotations

Doors: safe sets of constants

Next, let’s tackle the concept of “doors”. We don’t need to gold-plate this with a full-blown Door class with instances and methods - doors don’t have any behavior or state in this example, and we don’t need to add it. But, we still wouldn’t want anyone using this library to mix up a door or accidentally plunge to their doom by accidentally passing "certian death" when they meant certain. So a Door clearly needs a type of its own, which is to say, an Enum:

    from enum import Enum

    class Door(Enum):
        certain_death = "certain death"
        castle = "castle"

Questions: describing type interfaces

Next up, what is a “question”? Guards expect a very specific sort of value as their question argument, and if we’re using type annotations, we should specify what it is. We want a Question type that defines arguments for each part of the universe of knowledge that these guards understand. This includes who they are themselves, who the set of both guards are, and what the doors are.

We can specify it like so:

    from typing import Protocol, Sequence

    class Question(Protocol):
        def __call__(
            self, guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]
        ) -> bool:
            ...

The most flexible way to define a type of thing you can call using mypy and typing is to define a Protocol with a __call__ method and nothing else [1]. We could also describe this type as Question = Callable[[Guard, Sequence[Guard], Sequence[Door]], bool] instead, but as you may be able to infer, that doesn’t let you easily specify names of arguments, or keyword-only or positional-only arguments, or required default values. So Protocol-with-__call__ it is.
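To see what the Protocol form buys over Callable, here is a small separate sketch, unrelated to the guards themselves, of a callable type with a keyword-only argument and a default value, which Callable[..., bool] cannot spell out. The names Comparison, less_than and check are hypothetical, and typing.Protocol requires Python 3.8 or later.

```python
from typing import Protocol


class Comparison(Protocol):
    """Any callable taking two ints plus a keyword-only 'strict' flag."""

    def __call__(self, left: int, right: int, *, strict: bool = False) -> bool:
        ...


def less_than(left: int, right: int, *, strict: bool = False) -> bool:
    # With strict=True equal values do not count as "less than".
    return left < right if strict else left <= right


def check(compare: Comparison) -> bool:
    # A type checker can verify that 'strict' is a valid keyword here.
    return compare(1, 1, strict=True)


print(check(less_than))
```

At runtime nothing is enforced; the benefit is that a checker such as mypy can compare the argument names, kinds and defaults of less_than against Comparison.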

At this point, we also get to consider: does the questioner need the ability to change the collection of doors they’re passed? Probably not; they’re just asking questions, not giving commands. So they should receive an immutable version, which means we need to import Sequence from the typing module and not List, and use that for both guards and doors argument types.

Guards and questions: annotating existing logic with types

Next up, what does Guard look like now? Aside from adding some type annotations — and using our shiny new Door and Question types — it looks substantially similar to Moshe’s version:

    from dataclasses import dataclass

    @dataclass
    class Guard:
        _truth_teller: bool
        _guards: Sequence[Guard]
        _doors: Sequence[Door]

        def ask(self, question: Question) -> bool:
            answer = question(self, self._guards, self._doors)
            if not self._truth_teller:
                answer = not answer
            return answer

Similarly, the question that we want to ask looks quite similar, with the addition of:

  1. type annotations for both the “outer” and the “inner” question,
  2. using Door.castle for our comparison rather than the string "castle",
  3. replacing List with Sequence, as discussed above, since the guards in this puzzle also have no power to change their environment, only to answer questions, and
  4. using the [var] = value syntax for destructuring bind, rather than the more subtle var, = value form.

    def question(guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]) -> bool:
        [other_guard] = (candidate for candidate in guards if candidate != guard)

        def other_question(
            guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]
        ) -> bool:
            return doors[0] == Door.castle

        return other_guard.ask(other_question)

Eliminating global state: building the guard post

Next up, how shall we initialize this collection of guards? Setting a couple of global variables is never good style, so let’s encapsulate this within a function:

    from typing import List

    def make_guard_post() -> Sequence[Guard]:
        doors = list(Door)
        guards: List[Guard] = []
        guards[:] = [Guard(True, guards, doors), Guard(False, guards, doors)]
        return guards

Defining the main point

And finally, how shall we actually have this execute? First, let’s put this in a function, so that it can be called by things other than running the script directly; for example, if we want to use entry_points to expose this as a script. Then, let's put it in a "__main__" block, and not just execute it at module scope.

Secondly, rather than inspecting the output of each one at a time, let’s use the all function to express that the interesting thing is that all of the guards will answer the question in the affirmative:

    def main() -> None:
        print(all(each.ask(question) for each in make_guard_post()))


    if __name__ == "__main__":
        main()

Appendix: the full code

To sum up, here’s the full version:

    from __future__ import annotations

    from dataclasses import dataclass
    from typing import List, Protocol, Sequence
    from enum import Enum


    class Door(Enum):
        certain_death = "certain death"
        castle = "castle"


    class Question(Protocol):
        def __call__(
            self, guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]
        ) -> bool:
            ...


    @dataclass
    class Guard:
        _truth_teller: bool
        _guards: Sequence[Guard]
        _doors: Sequence[Door]

        def ask(self, question: Question) -> bool:
            answer = question(self, self._guards, self._doors)
            if not self._truth_teller:
                answer = not answer
            return answer


    def question(guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]) -> bool:
        [other_guard] = (candidate for candidate in guards if candidate != guard)

        def other_question(
            guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]
        ) -> bool:
            return doors[0] == Door.castle

        return other_guard.ask(other_question)


    def make_guard_post() -> Sequence[Guard]:
        doors = list(Door)
        guards: List[Guard] = []
        guards[:] = [Guard(True, guards, doors), Guard(False, guards, doors)]
        return guards


    def main() -> None:
        print(all(each.ask(question) for each in make_guard_post()))


    if __name__ == "__main__":
        main()

Acknowledgments

I’d like to thank Moshe Zadka for the post that inspired this, as well as Nelson Elhage, Jonathan Lange, Ben Bangert and Alex Gaynor for giving feedback on drafts of this post.

  1. I will hopefully have more to say about typing.Protocol in another post soon; it’s the real hero of the Mypy saga, but more on that later... 


GSoC'20 progress : Phase I

Planet KDE - Sun, 2020-07-05 14:30
Wrapping up the first phase of Google Summer of Code

PSF GSoC students blogs: Weekly Check-In: Week 6

Planet Python - Sun, 2020-07-05 13:35

Make sure to check out Project FURY : https://github.com/fury-gl/fury

Hey!
Spherical harmonics, Continued!

What did I do this week

Last week I added a basic implementation of spherical harmonics based actors. However, the implementation was quite restricted and we needed to add support for more accurate generation of spherical harmonics. So the task assigned this week was to implement the spherical harmonics function within the shader rather than passing variables as uniforms. This was quite a challenging task, as it involved understanding mathematical formulae and implementing them using existing GLSL functions.
The output of the implementation is shown below.



While I was able to complete the task, the frame rate of the generated output was much lower than expected.

The code for the above render is available at the branch:


What's coming up next

The next task is to discuss possible performance improvements with the mentors and also look into alternative ideas to add spherical harmonics as actors in FURY.

Did I get stuck anywhere

Spherical harmonics involve a lot of complicated math under the hood; as a result, the generated output has a very poor frame rate. Currently, we are looking into improving this.


PSF GSoC students blogs: Weekly Check-in #6

Planet Python - Sun, 2020-07-05 12:03
Translation, Reposition, Rotation.

Hello and welcome to my 6th weekly check-in. The first evaluation period has officially ended and I am very excited to move on to the second coding period. I will be sharing my progress on handling the properties of a specific object among multiple objects rendered by a single actor. I am mainly focusing on making it easier to translate, rotate and reposition a particular object, so that I can use these operations to render physics simulations more efficiently. The official repository of my sub-org, FURY can always be found here.

What did you do this week?

Last week I worked on physics simulations rendered in FURY with the help of PyBullet. The simulations were highly unoptimized, especially the brick wall simulation, as each brick was rendered by its own actor. In other words, 1 brick = 1 actor. My objective was to render all the bricks using a single actor, but before jumping into the simulation I had to figure out how to modify specific properties of an individual object. Thanks to my mentor's PR, I was able to experiment with my implementations quickly.


The algorithm behind translation is to first identify the vertices of the object, then bring the vertices to the origin by subtracting their centers and then adding the displacement vector. The said operation can be achieved by the following snippet:

    # Update vertices positions
    vertices[object_index * sec: object_index * sec + sec] = \
        (vertices[object_index * sec: object_index * sec + sec] - centers[object_index]) + transln_vector


The algorithm behind rotation is to first calculate the difference between the vertices and the center of the object. We then matrix-multiply the resulting matrix with the rotation matrix and add the centers back, so that the position of the object is preserved. The rotation matrix can be defined as:

R = Rz(alpha) · Ry(beta) · Rx(gamma)

where alpha, beta and gamma correspond to the angles of rotation about the Z-axis, Y-axis and X-axis respectively.

    def get_R(gamma, beta, alpha):
        """ Returns rotational matrix. """
        r = [
            [np.cos(alpha)*np.cos(beta),
             np.cos(alpha)*np.sin(beta)*np.sin(gamma) - np.sin(alpha)*np.cos(gamma),
             np.cos(alpha)*np.sin(beta)*np.cos(gamma) + np.sin(alpha)*np.sin(gamma)],
            [np.sin(alpha)*np.cos(beta),
             np.sin(alpha)*np.sin(beta)*np.sin(gamma) + np.cos(alpha)*np.cos(gamma),
             np.sin(alpha)*np.sin(beta)*np.cos(gamma) - np.cos(alpha)*np.sin(gamma)],
            [-np.sin(beta), np.cos(beta)*np.sin(gamma), np.cos(beta)*np.cos(gamma)]
        ]
        r = np.array(r)
        return r

    vertices[object_index * sec: object_index * sec + sec] = \
        (vertices[object_index * sec: object_index * sec + sec] - centers[object_index])@get_R(0, np.pi/4, np.pi/4) + centers[object_index]
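The same rotate-about-the-center steps can be checked in isolation with a small pure-Python sketch (no NumPy or FURY needed). The helper names rotation_matrix, rotate_about_center and distance and the sample coordinates are hypothetical, and note that rotation_matrix takes its angles in the order (alpha, beta, gamma), unlike get_R above. Because a rotation about the center must not change how far each vertex is from that center, the distances before and after make a convenient sanity check.

```python
import math


def rotation_matrix(alpha, beta, gamma):
    """Return the same 3x3 matrix as get_R, as nested lists."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [ca * cb, ca * sb * sg - sa * cg, ca * sb * cg + sa * sg],
        [sa * cb, sa * sb * sg + ca * cg, sa * sb * cg - ca * sg],
        [-sb, cb * sg, cb * cg],
    ]


def rotate_about_center(vertices, center, rot):
    """Compute (v - center) @ rot + center for every vertex v."""
    rotated = []
    for v in vertices:
        d = [v[i] - center[i] for i in range(3)]
        rotated.append([
            sum(d[k] * rot[k][j] for k in range(3)) + center[j]
            for j in range(3)
        ])
    return rotated


def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


center = [1.0, 2.0, 3.0]
vertices = [[2.0, 2.0, 3.0], [1.0, 3.0, 3.0], [1.0, 2.0, 4.0]]
rot = rotation_matrix(math.pi / 4, math.pi / 4, 0.0)
moved = rotate_about_center(vertices, center, rot)

for before, after in zip(vertices, moved):
    print(round(distance(before, center), 6), round(distance(after, center), 6))
```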


Repositioning is similar to translation, except that in this case we also update centers with the new position value.

    new_pos = np.array([1, 2, 3])

    # Update vertices positions
    vertices[object_index * sec: object_index * sec + sec] = \
        (vertices[object_index * sec: object_index * sec + sec] - centers[object_index]) + new_pos
    centers[object_index] = new_pos

What is coming up next?

Currently, I am yet to figure out the orientation problem. Once I figure that out I will be ready to implement simulations without any major issues. I am also tasked with creating a wrecking ball simulation and a quadruped robot simulation.

Did you get stuck anywhere?

I did face some problems while rotating objects. My mentors suggested that I implement it via a rotation matrix. I still haven't figured out the orientation problem, which I plan to work on next. Apart from these, I did not face any major issues.

Thank you for reading, see you next week!!

Ian Ozsvald: Weekish notes

Planet Python - Sun, 2020-07-05 11:42

I gave another iteration of my Making Pandas Fly talk sequence for PyDataAmsterdam recently and received some lovely postcards from attendees as a result. I’ve also had time to list new iterations of my training courses for Higher Performance Python (October) and Software Engineering for Data Scientists (September), both will run virtually via Zoom & Slack in the UK timezone.

I’ve been using my dtype_diet tool to time more performance improvements with Pandas and I look forward to talking more on this at EuroPython this month.

In baking news I’ve improved my face-making on sourdough loaves (but still have work to do) and I figure now is a good time to have a crack at dried-yeast baking again.


Ian is a Chief Interim Data Scientist via his Mor Consulting. Sign-up for Data Science tutorials in London and to hear about his data science thoughts and jobs. He lives in London, is walked by his high energy Springer Spaniel and is a consumer of fine coffees.

The post Weekish notes appeared first on Entrepreneurial Geekiness.


PSF GSoC students blogs: [Week 5] Check-in

Planet Python - Sun, 2020-07-05 10:21

1. What did you do this week?

  • Add more test cases to cover more functions.
2. Difficulty

No difficulties this week.

3. What is coming up next?
  • Use unumpy multimethods.
  • Improve documentation.
  • Publish a simple version of udiff on PyPI.

Thorsten Alteholz: My Debian Activities in June 2020

Planet Debian - Sun, 2020-07-05 10:13

FTP master

This month I accepted 377 packages and rejected 30. The overall number of packages that got accepted was 411.

Debian LTS

This was my seventy-second month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian.

This month my overall workload was 30h. During that time I did LTS uploads of:

  • [DLA 2255-1] libtasn1-6 security update for one CVE
  • [DLA 2256-1] libtirpc security update for one CVE
  • [DLA 2257-1] pngquant security update for one CVE
  • [DLA 2258-1] zziplib security update for eight CVEs
  • [DLA 2259-1] picocom security update for one CVE
  • [DLA 2260-1] mcabber security update for one CVE
  • [DLA 2261-1] php5 security update for one CVE

I started to work on curl as well but did not upload a fixed version, so this has to go to ELTS now.

Last but not least I did some days of frontdesk duties.

Debian ELTS

This month was the twenty fourth ELTS month.

Unfortunately, in the last month of Wheezy ELTS I did not find any package with a CVE to fix, so during my small allocated time I didn’t upload anything.

But at least I did some days of frontdesk duties and updated my working environment for the new ELTS Jessie.

Other stuff

I uploaded a new upstream version of …


The Digital Cat: Flask project setup: TDD, Docker, Postgres and more - Part 1

Planet Python - Sun, 2020-07-05 08:00

There are tons of tutorials on the Internet that teach you how to use a web framework and how to create Web applications, and many of these cover Flask, first of all the impressive Flask Mega-Tutorial by Miguel Grinberg (thanks Miguel!).

Why another tutorial, then? Recently I started working on a small personal project and decided that it was a good chance to refresh my knowledge of the framework. For this reason I temporarily dropped the clean architecture I often recommend, and started from scratch following some tutorials. My development environment quickly became very messy, and after a while I realised I was very unsatisfied by the global setup.

So, I decided to start from scratch again, this time writing down some requirements I want from my development setup. I also know very well how complicated deploying an application in production can be, so I want my setup to be as "deploy-friendly" as possible. Having seen too many projects suffer from legacy setups, and knowing that many times such issues can be avoided with a minimum amount of planning, I thought this might be interesting for other developers as well. I consider this setup by no means better than others; it simply addresses different concerns.

What you will learn

This post contains a step-by-step description of how I set up a real Flask project that I am working on. It's important that you understand that this is just one of many possible setups, and that my choices are both a matter of personal taste and dictated by some goals that I will state in this section. Changing the requirements would clearly result in a change of the structure. The target of the post is then to show that the setup of a project can take into account many things upfront, without leaving them to an undetermined future when it will likely be too late to tackle them properly.

The requirements of my setup are the following:

  • Use the same database engine in production, in development and for tests
  • Run tests on an ephemeral database
  • Run in production with no changes other than the static configuration
  • Have a command to initialise databases and manage migrations
  • Have a way to spin up "scenarios" starting from an empty database, to create a sandbox where I can test queries
  • Possibly simulate production in the local environment

As for the technologies, I will use Flask, obviously, as the web framework. I will also use Gunicorn as HTTP server (in production) and Postgres for the database part. I won't show here how to create the production infrastructure, but as I work daily with AWS, I will take into account some of its requirements, trying however not to be too committed to a specific solution.

General advice

Proper setup is an investment for the future. As we do in TDD, where we decide to spend time now (writing tests) to avoid spending tenfold later (to find and correct bugs), setting up a project requires time, and might frustrate the desire to "see things happen". Proper setup is a discipline that requires patience and commitment!

If you are ready to go, join me for this journey towards a great setup of a Flask application.

The golden rule

The golden rule of any proper infrastructural work is: there has to be a single source of information. The configuration of your project shouldn't be scattered among different files or repositories (not considering secrets, which have to be stored securely). The configuration has to be accessible and easy to convert into different formats to accommodate the needs of different tools. For this reason, the configuration should be stored in a static file format like JSON, YAML, INI, or similar, which can be read and processed by different programming languages and tools.

My format of choice for this tutorial is JSON, as it can be read by both Python and Terraform, and is natively used by ECS on AWS.

Step 1 - Requirements and editor

My standard structure for Python requirements uses 3 files: production.txt, development.txt, and testing.txt. They are all stored in the same directory called requirements, and are hierarchically connected.

File: requirements/production.txt

## This file is currently empty

File: requirements/testing.txt

-r production.txt

File: requirements/development.txt

-r testing.txt

There is also a final requirements.txt file that points to the production one.

File: requirements.txt

-r requirements/production.txt

As you can see this allows me to separate the requirements to avoid installing unneeded packages, which greatly speeds up the deploy in production and keeps things as essential as possible. Production contains the minimum requirements needed to run the project, testing adds to those the packages used to test the code, and development adds to the latter the tools needed during development. A minor shortcoming of this setup is that I might not need in development everything I need in production, for example the HTTP server. I don't think this is significantly affecting my local setup, though, and if I have to decide between production and development, I prefer to keep the former lean and tidy.
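To make the hierarchy concrete, here is a minimal Python sketch of how the `-r` includes chain together. It is only an illustration of the layout described above, not how pip is implemented; the helper names `resolve_requirements` and `demo` and the package names `flask`, `pytest` and `black` are placeholders chosen for the example.

```python
import os
import tempfile


def resolve_requirements(path, seen=None):
    """Return the packages listed in a requirements file, following -r includes."""
    seen = set() if seen is None else seen
    path = os.path.abspath(path)
    if path in seen:  # guard against circular includes
        return []
    seen.add(path)

    packages = []
    with open(path) as f:
        for raw in f:
            line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
            if not line:
                continue
            if line.startswith("-r "):
                # Included files are resolved relative to the including file
                included = os.path.join(os.path.dirname(path), line[3:].strip())
                packages.extend(resolve_requirements(included, seen))
            else:
                packages.append(line)
    return packages


def demo():
    """Build the three-file layout in a temporary directory and resolve it."""
    with tempfile.TemporaryDirectory() as root:
        req = os.path.join(root, "requirements")
        os.mkdir(req)
        files = {
            "production.txt": "flask\n",                   # placeholder package
            "testing.txt": "-r production.txt\npytest\n",  # placeholder package
            "development.txt": "-r testing.txt\nblack\n",  # placeholder package
        }
        for name, body in files.items():
            with open(os.path.join(req, name), "w") as f:
                f.write(body)
        return resolve_requirements(os.path.join(req, "development.txt"))


print(demo())
```

Resolving development.txt pulls in everything from testing.txt, which in turn pulls in production.txt, while resolving production.txt alone would return only the production packages.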

I have my linters already installed system-wide, but as I'm using black to format the code I have to configure flake8 to accept what I'm doing.

File: .flake8

    [flake8]
    # Recommend matching the black line length (default 88),
    # rather than using the flake8 default of 79:
    max-line-length = 100
    ignore = E231

This is clearly a very personal choice, and you might have different requirements. Take your time to properly configure the editor and the linter(s). Remember that the editor for a programmer is like the violin for the violinist. You need to know it, and to take care of it. So, set it up properly.

At this point I also create my virtual environment and activate it.

Step 2 - Flask project boilerplate

As this will be a Flask application the first thing to do is to install Flask itself. That goes in the production requirements, as that is needed at every stage.

File: requirements/production.txt


Now, install the development requirements with

$ pip install -r requirements/development.txt

As we saw before, that file automatically installs the testing and production requirements as well.

Then we need a directory where to keep all the code that is directly connected with the Flask framework, and where we will start creating the configuration for the application. Create the application directory and the file config.py in it.

File: application/config.py

    import os

    basedir = os.path.abspath(os.path.dirname(__file__))


    class Config(object):
        """Base configuration"""


    class ProductionConfig(Config):
        """Production configuration"""


    class DevelopmentConfig(Config):
        """Development configuration"""


    class TestingConfig(Config):
        """Testing configuration"""

        TESTING = True

There are many ways to configure a Flask application, one of which is using Python objects. This allows me to leverage inheritance to avoid duplication (which is always good), so it's my method of choice.
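As a small illustration of why inheritance helps here, the following sketch adds one shared setting to the base class and collects the upper-case attributes of each class, which are the ones Flask's `from_object` copies into the configuration. `APP_NAME` and the helper `uppercase_settings` are hypothetical names used only for this example.

```python
class Config:
    """Base configuration"""

    APP_NAME = "demo"  # hypothetical setting shared by every environment
    TESTING = False


class DevelopmentConfig(Config):
    """Development configuration"""


class TestingConfig(Config):
    """Testing configuration"""

    TESTING = True  # overrides only what differs from the base


def uppercase_settings(obj):
    """Collect the upper-case attributes of a configuration object."""
    return {key: getattr(obj, key) for key in dir(obj) if key.isupper()}


print(uppercase_settings(DevelopmentConfig))
print(uppercase_settings(TestingConfig))
```

Each subclass only states what differs from the base, so a setting defined once in `Config` shows up in every environment.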

It's important to understand the variables and the parameters involved in the configuration. As the documentation clearly states, FLASK_ENV and FLASK_DEBUG have to be initialised outside the application as the code might misbehave if they are changed once the engine has been started. Furthermore the FLASK_ENV variable can have only the two values development and production, and the main difference is in performance. The most important thing we need to be aware of is that if FLASK_ENV is development, then FLASK_DEBUG automatically becomes True. To sum up we have the following guidelines:

  • It's pointless to set DEBUG and ENV in the application configuration, they have to be environment variables.
  • Generally you don't need to set FLASK_DEBUG, just set FLASK_ENV to development.
  • Testing doesn't need the debug server turned on, so you can set FLASK_ENV to production during that phase. It needs TESTING set to True, though, and that has to be done inside the application.

We now need to create the application and to properly configure it. I decided to use an application factory that accepts a config_name string that is then converted into the name of the config object. For example, if config_name is development the variable config_module becomes application.config.DevelopmentConfig so that app.config.from_object can import it.

File: application/app.py

    from flask import Flask


    def create_app(config_name):

        app = Flask(__name__)

        config_module = f"application.config.{config_name.capitalize()}Config"

        app.config.from_object(config_module)

        @app.route("/")
        def hello_world():
            return "Hello, World!"

        return app

I also added the standard "Hello, world!" route to have a quick way to see if the server is working or not.
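The name conversion inside the factory can be tried in isolation. This is a minimal sketch that repeats the f-string from create_app in a stand-alone helper; the function name `config_object_path` is made up for the example.

```python
def config_object_path(config_name):
    """Build the dotted path that create_app passes to app.config.from_object."""
    return f"application.config.{config_name.capitalize()}Config"


for name in ("development", "production", "TESTING"):
    print(name, "->", config_object_path(name))
```

Note that `str.capitalize` upper-cases the first letter and lower-cases the rest, so `TESTING` and `testing` select the same class. A name with no matching class in application/config.py would still produce a path, and the problem should only surface when the object is imported.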

Last, we need something that initializes the application by running the application factory and passing the correct value for the config_name parameter. The Flask development server can automatically use any file named wsgi.py in the root directory, and since WSGI is a standard specification, using it assures me that any HTTP server we use in production (for example Gunicorn or uWSGI) will work immediately.

File: wsgi.py

    import os

    from application.app import create_app

    app = create_app(os.environ["FLASK_CONFIG"])

Here, I decided to read the value of config_name from the FLASK_CONFIG variable. This is not a variable requested by the framework, but I decided to use the FLASK_ prefix anyway because it is tightly connected with the structure of the Flask application.

At this point we can happily run the Flask development server with

    $ FLASK_CONFIG="development" flask run
     * Environment: production
       WARNING: This is a development server. Do not use it in a production deployment.
       Use a production WSGI server instead.
     * Debug mode: off
     * Running on (Press CTRL+C to quit)

Please note that it says Environment: production because we haven't configured FLASK_ENV yet. If you open the server address in your browser you can see the greeting message.

Step 3 - Application configuration

As I mentioned in the introduction, I am going to use a static JSON configuration file. The choice of JSON comes from the fact that it is a widespread file format, accessible from many programming languages and tools, including Terraform, which I plan to use to create my production infrastructure.

File: config/development.json

    [
      {
        "name": "FLASK_ENV",
        "value": "development"
      }
    ]

I obviously need a script that extracts variables from the JSON file and converts them into environment variables, so it's time to start writing my own manage.py file. This is a pretty standard concept in the world of Python web frameworks, a tradition initiated by Django. The idea is to centralise all the management functions like starting/stopping the development server or managing database migrations. As in flask this is partially done by the flask command itself, for the time being I just need to wrap it providing suitable environment variables.

File: manage.py

    #! /usr/bin/env python

    import os
    import json
    import signal
    import subprocess

    import click


    # Ensure an environment variable exists and has a value
    def setenv(variable, default):
        os.environ[variable] = os.getenv(variable, default)


    setenv("APPLICATION_CONFIG", "development")

    # Read configuration from the relative JSON file
    config_json_filename = os.getenv("APPLICATION_CONFIG") + ".json"
    with open(os.path.join("config", config_json_filename)) as f:
        config = json.load(f)

    # Convert the config into a usable Python dictionary
    config = dict((i["name"], i["value"]) for i in config)

    for key, value in config.items():
        setenv(key, value)


    @click.group()
    def cli():
        pass


    @cli.command(context_settings={"ignore_unknown_options": True})
    @click.argument("subcommand", nargs=-1, type=click.Path())
    def flask(subcommand):
        cmdline = ["flask"] + list(subcommand)

        try:
            p = subprocess.Popen(cmdline)
            p.wait()
        except KeyboardInterrupt:
            p.send_signal(signal.SIGINT)
            p.wait()


    cli.add_command(flask)

    if __name__ == "__main__":
        cli()

Remember to make the script executable with

$ chmod 775 manage.py

As you can see I'm using click, which is the recommended way to implement Flask commands. As I might use it to customise subcommands of the flask main script, I decided to stick to one tool and use it for the manage.py script as well.

The APPLICATION_CONFIG variable is the only one that I need to specify, and its default value is development. From that variable I infer the name of the JSON file with the full configuration and load environment variables from that. The flask function simply wraps the flask command provided by Flask so that I can run ./manage.py flask <subcommand> to run it using the development configuration or APPLICATION_CONFIG="foobar" ./manage.py flask <subcommand> to use the foobar one.
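One detail of this script is worth seeing in isolation: setenv never overwrites a variable that is already set, so values passed on the command line win over the JSON file. The sketch below reproduces that logic against a plain dictionary instead of the real environment; the `environ` parameter, the `load_config` helper and the inline `CONFIG_JSON` string are assumptions made to keep the example self-contained.

```python
import json
import os

# Inline stand-in for a file such as config/development.json
CONFIG_JSON = """
[
  {"name": "FLASK_ENV", "value": "development"},
  {"name": "FLASK_CONFIG", "value": "development"}
]
"""


def setenv(variable, default, environ=os.environ):
    """Keep an existing value, otherwise fall back to the default."""
    environ[variable] = environ.get(variable, default)


def load_config(text, environ):
    """Convert the name/value list into a dict and export it without overriding."""
    config = {item["name"]: item["value"] for item in json.loads(text)}
    for key, value in config.items():
        setenv(key, value, environ)
    return config


# A dictionary plays the role of the environment, with FLASK_ENV already set
fake_environ = {"FLASK_ENV": "production"}
load_config(CONFIG_JSON, fake_environ)
print(fake_environ)
```

In this run FLASK_ENV keeps the value it already had, while FLASK_CONFIG is filled in from the JSON data.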

A clarification, to be sure you don't confuse environment variables with each other:

  • APPLICATION_CONFIG is strictly related to my project and is used only to load a JSON configuration file with the name specified in the variable itself.
  • FLASK_CONFIG is used to select the Python object that contains the configuration for the Flask application (see application/app.py and application/config.py). The value of the variable is converted into the name of a class.
  • FLASK_ENV is a variable used by Flask itself, and its values are dictated by it. See the Flask configuration documentation.

Now we can run the development server

    $ ./manage.py flask run
     * Environment: development
     * Debug mode: on
     * Running on (Press CTRL+C to quit)
     * Restarting with stat
     * Debugger is active!
     * Debugger PIN: 172-719-201

Note that it now says Environment: development because FLASK_ENV has been set to development in the configuration. As we did before, a quick visit with the browser shows us that everything is up and running.

Step 4 - Containers and orchestration

There is no better way to simplify your development than using Docker.

There is also no better way to complicate your life than using Docker.

As you might guess, I have mixed feelings about Docker. Don't get me wrong, Linux containers are an amazing concept, and Docker is very useful. It's also a complex technology that sometimes requires a lot of work to get properly configured. In this case the setup will be pretty simple, but there is a major complication with using a database server that I will describe later.

Running the application in a Docker container allows me to isolate it and to simulate the way I will run it in production. I will use docker-compose, as I expect to have other containers running in my development setup (at least the database), so I can leverage the fact that the docker-compose configuration file can interpolate environment variables. Once again through the APPLICATION_CONFIG environment variable I will select the correct JSON file, load its values in environment variables and then run the docker-compose file.
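The idea of interpolation can be shown with a rough Python analogy. The sketch uses `string.Template`, which happens to share the `${NAME}` placeholder syntax; it only illustrates the concept and says nothing about how docker-compose implements it. The fragment and the `interpolate` helper are made up for the example.

```python
from string import Template

# A fragment written in the style of a docker-compose file
COMPOSE_FRAGMENT = """\
environment:
  FLASK_ENV: ${FLASK_ENV}
  FLASK_CONFIG: ${FLASK_CONFIG}
"""


def interpolate(text, environ):
    """Substitute ${NAME} placeholders with values from a mapping."""
    # Template.substitute raises KeyError when a name is missing; that is the
    # behaviour of this sketch, not necessarily what docker-compose does.
    return Template(text).substitute(environ)


print(interpolate(COMPOSE_FRAGMENT, {"FLASK_ENV": "development", "FLASK_CONFIG": "development"}))
```

The point is simply that the file stays static while the values come from whatever environment the command is run in.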

First of all we need an image for the Flask application

File: docker/Dockerfile

    FROM python:3

    ENV PYTHONUNBUFFERED 1

    RUN mkdir /opt/code
    RUN mkdir /opt/requirements
    WORKDIR /opt/code

    ADD requirements /opt/requirements
    RUN pip install -r /opt/requirements/development.txt

As you can see the requirements directory is copied into the image, so that Docker can run the pip install command at build time. The whole code directory will be mounted live into the container at run time.

This clearly means that every time we change the development requirements we need to rebuild the image. This is not a complicated process, so I will keep it as a manual process for now. To run the image we can create a configuration file for docker-compose.

File: docker/development.yml

    version: '3.4'

    services:
      web:
        build:
          context: ${PWD}
          dockerfile: docker/Dockerfile
        environment:
          FLASK_ENV: ${FLASK_ENV}
          FLASK_CONFIG: ${FLASK_CONFIG}
        command: flask run --host
        volumes:
          - ${PWD}:/opt/code
        ports:
          - "5000:5000"

As you can see, the docker-compose configuration file can read environment variables natively. To run it we first need to add docker-compose itself to the development requirements.

File: requirements/development.txt

    -r testing.txt

    docker-compose

Install it with pip install -r requirements/development.txt, then build the image with

$ FLASK_ENV="development" FLASK_CONFIG="development" docker-compose -f docker/development.yml build web

We are explicitly passing environment variables here, as we have not wrapped docker-compose in the manage script yet. Once the image has been built, we can run it with the up command

$ FLASK_ENV="development" FLASK_CONFIG="development" docker-compose -f docker/development.yml up

This command should give us the following output

    Creating network "docker_default" with the default driver
    Creating docker_web_1 ... done
    Attaching to docker_web_1
    web_1  |  * Environment: development
    web_1  |  * Debug mode: on
    web_1  |  * Running on (Press CTRL+C to quit)
    web_1  |  * Restarting with stat
    web_1  |  * Debugger is active!
    web_1  |  * Debugger PIN: 234-361-737

You can stop the containers by pressing Ctrl-C, which gracefully tears down the system. If you run the command up -d, docker-compose will run as a daemon, leaving you in control of the current terminal. If docker-compose is running you can run docker ps and you should see an output similar to this

    CONTAINER ID IMAGE COMMAND ... PORTS NAMES
    c98f35635625 docker_web "flask run --host 0.…" ...>5000/tcp docker_web_1

If you need to explore the container you can login directly with

$ docker exec -it docker_web_1 bash

or with

$ FLASK_ENV="development" FLASK_CONFIG="development" docker-compose -f docker/development.yml exec web bash

In either case, you will end up in the /opt/code directory (which is the WORKDIR of the image), where the current directory in the host is mounted.

To tear down the containers, when running as daemon, you can run

$ FLASK_ENV="development" FLASK_CONFIG="development" docker-compose -f docker/development.yml down

Notice that the server now says Running on, as the Docker container is using that network interface to communicate with the outside world. Since the ports are mapped, however, you can head to either http://localhost:5000 or with your browser.

To simplify the usage of docker-compose, I want to wrap it in the manage.py script, so that it automatically receives environment variables, as their number is going to increase soon when we add a database.

File: manage.py

    #! /usr/bin/env python

    import os
    import json
    import signal
    import subprocess

    import click

    docker_compose_file = "docker/development.yml"
    docker_compose_cmdline = ["docker-compose", "-f", docker_compose_file]


    # Ensure an environment variable exists and has a value
    def setenv(variable, default):
        os.environ[variable] = os.getenv(variable, default)


    setenv("APPLICATION_CONFIG", "development")

    # Read configuration from the relative JSON file
    config_json_filename = os.getenv("APPLICATION_CONFIG") + ".json"
    with open(os.path.join("config", config_json_filename)) as f:
        config = json.load(f)

    # Convert the config into a usable Python dictionary
    config = dict((i["name"], i["value"]) for i in config)

    for key, value in config.items():
        setenv(key, value)


    @click.group()
    def cli():
        pass


    @cli.command(context_settings={"ignore_unknown_options": True})
    @click.argument("subcommand", nargs=-1, type=click.Path())
    def flask(subcommand):
        cmdline = ["flask"] + list(subcommand)

        try:
            p = subprocess.Popen(cmdline)
            p.wait()
        except KeyboardInterrupt:
            p.send_signal(signal.SIGINT)
            p.wait()


    @cli.command(context_settings={"ignore_unknown_options": True})
    @click.argument("subcommand", nargs=-1, type=click.Path())
    def compose(subcommand):
        cmdline = docker_compose_cmdline + list(subcommand)

        try:
            p = subprocess.Popen(cmdline)
            p.wait()
        except KeyboardInterrupt:
            p.send_signal(signal.SIGINT)
            p.wait()


    if __name__ == "__main__":
        cli()

You might have noticed that the two functions flask and compose are basically the same code, but I resisted the temptation to refactor them because I know that the compose command will need some changes as soon as I add a database.

The last change we need in order to make everything work properly is adding the FLASK_CONFIG variable to the config file

File: config/development.json

    [
      {
        "name": "FLASK_ENV",
        "value": "development"
      },
      {
        "name": "FLASK_CONFIG",
        "value": "development"
      }
    ]

Now I can run ./manage.py compose up -d and ./manage.py compose down and have the environment variables automatically passed to the system.
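The reason this works is that a process started with subprocess and no explicit env argument inherits the environment of its parent, so whatever manage.py puts in os.environ is visible to docker-compose. A minimal sketch of that mechanism follows, using a second Python interpreter as the child instead of docker-compose; the helper name `child_sees` is made up for the example.

```python
import os
import subprocess
import sys


def child_sees(variable, value):
    """Set a variable in this process and read it back from a child process."""
    os.environ[variable] = value
    # No env= argument: the child inherits the parent's environment
    result = subprocess.run(
        [sys.executable, "-c", f"import os; print(os.environ[{variable!r}])"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(child_sees("FLASK_CONFIG", "development"))
```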

Final words

That's enough for this first post. We started from scratch and added some boilerplate code for a Flask project, exploring what environment variables are used by the framework, then we added a configuration system, a management script, and finally we ran everything in a Docker container. In the next post I will show you how to add a persistent database to the development setup and how to use an ephemeral one for the tests. If you find my posts useful please share them with whoever you think might be interested.

Happy development!


Feel free to reach me on Twitter if you have questions. The GitHub issues page is the best place to submit corrections.


eiriksm.dev: Set id of CSV migration row based on line number

Planet Drupal - Sun, 2020-07-05 07:49

Migrate in core is among my favorite parts of Drupal 8 and 9. The framework is super flexible, and it makes migrating content from any source you can dream up pretty straightforward. Today I want to show a trick that I use when I receive a csv (or Excel file) from clients, where they want all of the contents in it migrated to Drupal. One very simple example would be a list of categories.

Typically the file will come with one term on each line. However, migrate would want us to set an ID for all of the terms, which currently none of the rows have. One solution to this is to place an ID on all of the rows manually with some sort of spreadsheet software, and then point our migration to the new column for its IDs. But since that involves both the words "manual" and "spreadsheet software", it immediately makes me want to find another solution. Is there a way we can set the row id programmatically based on the row number instead? Why, yes, there is!

So, here is a trick I use to set the ID from the line number:

The migration configuration looks something like this:

    id: my_module_categories_csv
    label: My module categories
    migration_group: my_module
    source:
      # We will use a custom source plugin, so we can set the
      # ID from there.
      plugin: my_module_categories_csv
      track_changes: TRUE
      header_row_count: 1
      keys:
        - id
      delimiter: ';'
    # ... And the rest of the file

As stated in the yaml file, we will use a custom source plugin for this. Let's say we have a custom module called "my_module". Inside that module folder, we create a file called CategoriesCsv.php inside the folder src/Plugin/migrate/source. And in that file we put something like this:

    <?php

    namespace Drupal\my_module\Plugin\migrate\source;

    use Drupal\migrate\Plugin\MigrationInterface;
    use Drupal\migrate\Row;
    use Drupal\migrate_source_csv\Plugin\migrate\source\CSV;

    /**
     * Source plugin for Categories in csv.
     *
     * @MigrateSource(
     *  id = "my_module_categories_csv"
     * )
     */
    class CategoriesCsv extends CSV {

      /**
       * {@inheritdoc}
       */
      public function prepareRow(Row $row) {
        // Delta is here the row number.
        $delta = $this->file->key();
        $row->setSourceProperty('id', $delta);
        return parent::prepareRow($row);
      }

    }

In the code above we set the source property of id to the delta (the row number) of the row. Which means you can have a source like this:

    Name
    Category1
    Category2
    Category3

Instead of this

    id;Name
    1;Category1
    2;Category2
    3;Category3

The best part of this is that when your client changes their mind, you can just update the file instead of editing it before updating it. And with editing, I mean "manually" and with "spreadsheet software". Yuck.
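For readers who want to see the idea outside of Drupal, here is a small Python sketch of the same trick: the id of each row is derived from its position in the file instead of being stored in it. This is an analogy only, not Drupal or migrate_source_csv code; the helper `rows_with_ids` is made up, and numbering the data rows from 1 is a choice of the sketch.

```python
import csv
import io

# The id-less source: a header row followed by one term per line
SOURCE = "Name\nCategory1\nCategory2\nCategory3\n"


def rows_with_ids(text, delimiter=";"):
    """Yield each data row with an id taken from its position in the file."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    for number, row in enumerate(reader, start=1):
        yield {"id": number, **row}


for row in rows_with_ids(SOURCE):
    print(row)
```

One consequence of deriving ids from positions is that they are only stable as long as existing lines keep their place in the file.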

To finish this post, here is an animated gif called "spreadsheet software yuck"


Pythonicity: Closing files

Planet Python - Sun, 2020-07-05 03:00
Contrarian view on closing files.

It has become conventional wisdom to always explicitly close file-like objects, via context managers. The Google style guide is representative:

Explicitly close files and sockets when done with them. Leaving files, sockets or other file-like objects open unnecessarily has many downsides, including:

  • They may consume limited system resources, such as file descriptors. Code that deals with many such objects may exhaust those resources unnecessarily if they're not returned to the system promptly after use.
  • Holding files open may prevent other actions being performed on them, such as moves or deletion.
  • Files and sockets that are shared throughout a program may inadvertently be read from or written to after logically being closed. If they are actually closed, attempts to read or write from them will throw exceptions, making the problem known sooner.

Furthermore, while files and sockets are automatically closed when the file object is destructed, tying the life-time of the file object to the state of the file is poor practice, for several reasons:

  • There are no guarantees as to when the runtime will actually run the file's destructor. Different Python implementations use different memory management techniques, such as delayed Garbage Collection, which may increase the object's lifetime arbitrarily and indefinitely.
  • Unexpected references to the file may keep it around longer than intended (e.g. in tracebacks of exceptions, inside globals, etc).

The preferred way to manage files is using the "with" statement:

    with open("hello.txt") as hello_file:
        for line in hello_file:
            print(line)

In theory

Good points, and why limit this advice to file descriptors? Any resource may be limited or require exclusivity; that's why they're called resources. Similarly one should always explicitly call dict.clear when finished with a dict. After all, "there are no guarantees as to when the runtime will actually run the <object's> destructor". And "code that deals with many such objects may exhaust those resources unnecessarily", such as memory, or whatever else is in the dict.

But in all seriousness, this advice is applying a notably higher standard of premature optimization to file descriptors than to any other kind of resource. There are plenty of Python projects that are guaranteed to run on CPython for a variety of reasons, where destructors are immediately called. And there are plenty of Python projects where file descriptor usage is just a non-issue. It's now depressingly commonplace to see this in setup.py files:

    with open("README.md") as readme:
        long_description = readme.read()

Let's consider a practical example: a load function which is supposed to read and parse data given a file path.

    import csv
    import json


    def load(filepath):
        """the supposedly bad way"""
        return json.load(open(filepath))


    def load(filepath):
        """the supposedly good way"""
        with open(filepath) as file:
            return json.load(file)


    def load(filepath):
        """with a different file format"""
        with open(filepath) as file:
            return csv.reader(file)

Which versions work correctly? Are you sure? If it's not immediately obvious why one of these is broken, that's the point. In fact, it's worth trying out before reading on.


The csv version returns an iterator over a closed file. It's a violation of procedural abstraction to know whether the result of load is lazily evaluated or not; it's just supposed to implement an interface. Moreover, according to this best practice, it's impossible to write the csv version correctly. As absurd as it sounds, it's just an abstraction that can't exist.
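The failure is easy to reproduce. The sketch below writes a tiny csv file to a temporary directory and then tries to read the first row through the csv version of load; the helpers `try_first_row` and `demo` and the sample data are made up for the example.

```python
import csv
import os
import tempfile


def load(filepath):
    """The csv version from above: the reader outlives the with block."""
    with open(filepath, newline="") as file:
        return csv.reader(file)


def try_first_row(filepath):
    """Return the first row, or the error message if reading fails."""
    try:
        return next(load(filepath))
    except ValueError as error:
        return str(error)


def demo():
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "data.csv")
        with open(path, "w", newline="") as f:
            f.write("a,b\n1,2\n")
        return try_first_row(path)


print(demo())
```

The reader is created while the file is still open, so load itself succeeds; the error only appears later, when the caller asks for the first row.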

Defiantly clever readers are probably already trying to fix it. Maybe like this:

    def load(filepath):
        with open(filepath) as file:
            yield from csv.reader(file)

No, it will not be fixed. This version only appears to work by not closing the file until the generator is exhausted or collected.
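That behaviour can be observed directly. The following sketch keeps a reference to the file object opened inside the generator version, purely so its state can be inspected from outside; the `opened` list, the `demo` helper and the sample data are made up for the example.

```python
import csv
import os
import tempfile

opened = []  # records the file object so its state can be inspected


def load(filepath):
    with open(filepath, newline="") as file:
        opened.append(file)
        yield from csv.reader(file)


def demo():
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "data.csv")
        with open(path, "w", newline="") as f:
            f.write("a,b\n1,2\n")
        rows = load(path)
        first = next(rows)  # runs the generator up to its first yield
        open_while_iterating = not opened[-1].closed
        rest = list(rows)  # exhausting the generator leaves the with block
        closed_after = opened[-1].closed
        return first, open_while_iterating, rest, closed_after


print(demo())
```

The with statement is still there, but the moment the file gets closed is now decided by whoever consumes the generator, not by the function that opened it.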

This trivial example has deeper implications. If one accepts this practice, then one must also accept that storing a file handle anywhere, such as on an instance, is also disallowed. Unless of course that object then virally implements its own context manager, ad infinitum.

Furthermore it demonstrates that often the context is not being managed locally. If a file object is passed to another function, then it's being used outside of the context. Let's revisit the json version, which works because the file is fully read. Doesn't a json parser have some expensive parsing to do after it's read the file? It might even throw an error. And isn't it desirable, trivial, and likely that the implementation releases interest in the file as soon as possible?

So in reality there are scenarios where the supposedly good way could keep the file open longer than the supposedly bad way. The original inline version does exactly what it's supposed to do: close the file when all interested parties are done with it. Python uses garbage collection to manage shared resources. Any attempt to pretend otherwise will result in code that is broken, inefficient, or reinventing reference counting.

A true believer now has to accept that json.load is a useless and dangerous wrapper, and that the only correct implementation is:

    def load(filepath):
        with open(filepath) as file:
            contents = file.read()
        return json.loads(contents)

This line of reasoning reduces to the absurd: a file should never be passed or stored anywhere. Next an example where the practice has caused real-world damage.

In practice

Requests is one of the most popular Python packages, and officially recommended. It includes a Session object which supports closing via a context manager. The vast majority of real-world code uses the top-level functions or single-use sessions.

    response = requests.get(...)

    with requests.Session() as session:
        response = session.get(...)

Sessions manage the connection pool, so this pattern of usage is establishing a new connection every time. There are popular standard API clients which seriously do this, for every single request to the same endpoint.

Requests' documentation prominently states that "Keep-alive and HTTP connection pooling are 100% automatic". So part of the blame may lie with that phrasing, since it's only "automatic" if sessions are reused. But surely a large part of the blame is the dogma of closing sockets, and therefore sessions, explicitly. The whole point of a connection pool is that it may leave connections open, so users who genuinely need this granularity are working at the wrong abstraction layer. http.client is already built in for that level of control.

Tellingly, requests' own top-level functions didn't always close sessions. There's a long history to that code, including a version that only closed sessions on success. An older version was causing warnings, when run to check for such warnings, and was being blamed for the appearance of leaking memory. Those threads are essentially debating whether a resource pool is "leaking" resources.


Prior to with being introduced in Python 2.5, nobody recommended wrapping the inlined reading of a file in a try... finally block. Far from it, in the past idioms like open(...).read() and for line in open(...) were lauded for being succinct and expressive. But if all this orphaned file descriptor paranoia was well-founded, it would have been a problem back then too.

Finally, let's address readability. It could be argued (though it rarely is) that showing the reader when the file is closed has inherent value. Conveniently, that tends to align with having opened the file for writing anyway, thereby needing a reference to it. In which case, the readability is approximately equal, and potential pitfalls are more realistic. But readability is genuinely lost when the file would have been opened in an inline expression.

The best practice is unjustifiably special-casing file descriptors, and not seeing its own reasoning through to its logical conclusion. This author proposes advocating for anonymous read-only open expressions. Your setup script is not going to run out of file descriptors because you wrote open("README.md").read().
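For a side-by-side comparison, here is a small sketch of the idioms under discussion. The helper names and the temporary file are made up for illustration. On CPython, reference counting normally closes the anonymous file object as soon as the expression finishes; other interpreters may defer that to garbage collection.

```python
import os
import tempfile


def read_inline(path):
    """Read a file with an anonymous open expression (no with block)."""
    return open(path).read()


def read_with_block(path):
    """Read the same file using the commonly recommended with block."""
    with open(path) as file:
        return file.read()


def read_lines_inline(path):
    """Iterate over an anonymous file object, stripping newlines."""
    return [line.rstrip("\n") for line in open(path)]


def demo():
    """Write a small temporary file and read it back three ways."""
    fd, path = tempfile.mkstemp(suffix=".md")
    try:
        with os.fdopen(fd, "w") as file:  # writing needs a reference anyway
            file.write("# Title\nbody\n")
        return read_inline(path), read_with_block(path), read_lines_inline(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

All three readers return the same content; the difference is only in how much ceremony surrounds the read.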


Kushal Das: dns-tor-proxy 0.2.0 aka DoH release

Planet Python - Sun, 2020-07-05 01:31

I just now released 0.2.0 of the dns-tor-proxy tool. The main feature of this release is DNS over HTTPS support. At first I started writing it from scratch, and then decided to use modified code from the amazing dns-over-https project instead.


✦ ❯ ./dns-tor-proxy -h
Usage of ./dns-tor-proxy:
      --doh                 Use DoH servers as upstream.
      --dohaddress string   The DoH server address. (default "https://mozilla.cloudflare-dns.com/dns-query")
  -h, --help                Prints the help message and exists.
      --port int            Port on which the tool will listen. (default 53)
      --proxy string        The Tor SOCKS5 proxy to connect locally, IP:PORT format. (default "")
      --server string       The DNS server to connect IP:PORT format. (default "")
  -v, --version             Prints the version and exists.
Make sure that your Tor process is running and has a SOCKS proxy enabled.

You can now pass the --doh flag to use a DoH server as the upstream. By default it uses https://mozilla.cloudflare-dns.com/dns-query, but you can pass any server with the --dohaddress flag. I found that the following servers work well over Tor.

  • https://doh.libredns.gr/dns-query
  • https://doh.powerdns.org
  • https://dns4torpnlfs2ifuz2s2yf3fc7rdmsbhm6rw75euj35pac6ap25zgqad.onion/dns-query
  • https://dnsforge.de/dns-query
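To give an idea of what a DoH endpoint such as the ones above receives, here is a minimal Python sketch that builds a DNS query in wire format and encodes it into a GET URL in the style of RFC 8484. This is not how dns-tor-proxy itself is implemented (the tool is a separate binary and may well use POST); the function names are made up for illustration, and no network request is sent.

```python
import base64
import struct


def build_dns_query(hostname, qtype=1):
    """Build a minimal DNS query in wire format (qtype 1 is an A record)."""
    # Header: ID 0, flags 0x0100 (recursion desired), one question,
    # and no answer, authority, or additional records.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    # Each label is prefixed with its length; a zero byte ends the name.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)  # class 1 is IN
    return header + question


def doh_get_url(server, hostname):
    """Return a GET URL carrying the query in the dns parameter."""
    # The query is base64url encoded with the padding removed.
    encoded = base64.urlsafe_b64encode(build_dns_query(hostname)).rstrip(b"=")
    return f"{server}?dns={encoded.decode('ascii')}"


if __name__ == "__main__":
    print(doh_get_url("https://doh.libredns.gr/dns-query", "www.example.com"))
```

A client would request that URL with an Accept header of application/dns-message and parse the binary answer it gets back.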

The release also has a binary executable for Linux x86_64. You can verify the executable using the signature file available on the release page.


Full Stack Python: Quickly Use Bootstrap 4 in a Django Template with a CDN

Planet Python - Sun, 2020-07-05 00:00

The Django web framework makes it easy to render HTML using the Django template engine. However, the default styling on HTML pages usually needs a Cascading Style Sheet (CSS) framework such as Bootstrap to make the design look decent. In this beginner's tutorial, we'll use the Bootstrap Content Delivery Network (CDN) to quickly add Bootstrap to a rendered HTML page.

Here is what the <h1> element styling will look like at the end of this tutorial:

Tutorial Requirements

Make sure you have Python 3, preferably 3.7 or newer, installed in your environment. Throughout this tutorial we are going to use the following dependencies, which we will set up in just a moment:

  • Django 3.0.8, installed with pip
  • Bootstrap 4.5.0, loaded from the CDN

All code in this blog post is available open source under the MIT license on GitHub under the bootstrap-4-django-template directory of the blog-code-examples repository. Use the source code as you desire for your own projects.

Development environment set up

Change into the directory where you keep your Python virtual environments. I recommend using a separate directory such as ~/venvs/ (the tilde is a shortcut for your user's home directory) so that you always know where all your virtualenvs are located. Create a new virtualenv for this project using the following command.

python3 -m venv ~/venvs/djbootstrap4

Activate the virtualenv with the activate shell script:

source ~/venvs/djbootstrap4/bin/activate

After the above command is executed, the command prompt will change so that the name of the virtualenv is prepended to the original command prompt format, so if your prompt is simply $, it will now look like the following:

(djbootstrap4) $

Remember, you have to activate your virtualenv in every new terminal window where you want to use dependencies in the virtualenv.
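If you are ever unsure whether the virtualenv is active in a given terminal, a quick check from Python itself can help. This is a small sketch that assumes the environment was created with python3 -m venv as above; the function name is made up for illustration.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    # In a venv created with `python3 -m venv`, sys.prefix points at the
    # environment while sys.base_prefix points at the base installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv(), "-", sys.prefix)
```

When the djbootstrap4 virtualenv is active, the printed prefix should end in venvs/djbootstrap4.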

We can now install the Django package into the activated but otherwise empty virtualenv.

pip install django==3.0.8

Look for output similar to the following to confirm the appropriate packages were installed correctly from PyPI.

Collecting django
  Using cached https://files.pythonhosted.org/packages/ca/ab/5e004afa025a6fb640c6e983d4983e6507421ff01be224da79ab7de7a21f/Django-3.0.8-py3-none-any.whl
Collecting sqlparse>=0.2.2 (from django)
  Using cached https://files.pythonhosted.org/packages/85/ee/6e821932f413a5c4b76be9c5936e313e4fc626b33f16e027866e1d60f588/sqlparse-0.3.1-py2.py3-none-any.whl
Collecting asgiref~=3.2 (from django)
  Using cached https://files.pythonhosted.org/packages/d5/eb/64725b25f991010307fd18a9e0c1f0e6dff2f03622fc4bcbcdb2244f60d6/asgiref-3.2.10-py3-none-any.whl
Collecting pytz (from django)
  Using cached https://files.pythonhosted.org/packages/4f/a4/879454d49688e2fad93e59d7d4efda580b783c745fd2ec2a3adf87b0808d/pytz-2020.1-py2.py3-none-any.whl
Installing collected packages: sqlparse, asgiref, pytz, django
Successfully installed asgiref-3.2.10 django-3.0.8 pytz-2020.1 sqlparse-0.3.1

We can get started coding the application now that we have all of our required dependencies installed.

Building our application

Let's begin coding our application.

We can use the Django django-admin tool to create the boilerplate code structure to get our project started. Change into the directory where you develop your applications. For example, I typically use /Users/matt/devel/py/ for all of my Python projects. Then run the following command to start a Django project named djbootstrap4:

django-admin startproject djbootstrap4

Note that in this tutorial we are using the same name for both the virtualenv and the Django project directory, but they can be different names if you prefer that for organizing your own projects.

The django-admin command creates a directory named djbootstrap4 along with several subdirectories that you should be familiar with if you have previously worked with Django.

Change directories into the new project.

cd djbootstrap4

Create a new Django app within djbootstrap4.

python manage.py startapp bootstrap4

Django will generate a new folder named bootstrap4 for the project. We should update the URLs so the app is accessible before we write our views.py code.

Open djbootstrap4/djbootstrap4/urls.py. Add the two lines marked "# added" so that the URL resolver will check the bootstrap4 app for additional routes to match with URLs that are requested of this Django application.

# djbootstrap4/djbootstrap4/urls.py
from django.conf.urls import include  # added
from django.contrib import admin
from django.urls import path


urlpatterns = [
    path('', include('bootstrap4.urls')),  # added
    path('admin/', admin.site.urls),
]

Save djbootstrap4/djbootstrap4/urls.py and open djbootstrap4/djbootstrap4/settings.py. Add the bootstrap4 app to settings.py by inserting the line marked "# added":

# djbootstrap4/djbootstrap4/settings.py

# Application definition

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'bootstrap4',  # added
]

Make sure you change the default DEBUG and SECRET_KEY values in settings.py before you deploy any code to production. Secure your app properly with the information from the Django production deployment checklist so that you do not add your project to the list of hacked applications on the web.

Save and close settings.py.

Next change into the djbootstrap4/bootstrap4 directory. Create a new file named urls.py to contain routes for the bootstrap4 app.

Add all of these lines to the empty djbootstrap4/bootstrap4/urls.py file.

# djbootstrap4/bootstrap4/urls.py
from django.conf.urls import url
from . import views

urlpatterns = [
    url(r'', views.bootstrap4_index, name="index"),
]
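One detail worth knowing about the route above: url() treats its first argument as a regular expression, and an empty pattern matches any string. The sketch below uses only the standard re module and assumes that Django matches such patterns with a search against the remaining URL path (my understanding for Django 3.0, not something this tutorial states). ROOT_ONLY is a hypothetical stricter alternative shown for comparison.

```python
import re

CATCH_ALL = re.compile(r'')    # the pattern used in bootstrap4/urls.py
ROOT_ONLY = re.compile(r'^$')  # hypothetical stricter alternative


def matches(pattern, path):
    """Mimic a search-based match against the remaining URL path."""
    return pattern.search(path) is not None


if __name__ == "__main__":
    for path in ['', 'admin/', 'no/such/page/']:
        print(repr(path), matches(CATCH_ALL, path), matches(ROOT_ONLY, path))
```

Under that assumption the empty pattern acts as a catch-all, so every path would render the index view. That is fine for this single-page tutorial, but if you add more routes later, an anchored pattern such as r'^$' may be what you want.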

Save djbootstrap4/bootstrap4/urls.py. Open djbootstrap4/bootstrap4/views.py and add the two lines marked "# added". You can keep the boilerplate comment "# Create your views here." or delete it, as I usually do.

# djbootstrap4/bootstrap4/views.py
from django.shortcuts import render


def bootstrap4_index(request):  # added
    return render(request, 'index.html', {})  # added

Next, create a directory for your template files named templates under the djbootstrap4/bootstrap4 app directory.

mkdir templates

Create a new file named index.html within djbootstrap4/bootstrap4/templates that contains the following Django template language markup.

<!DOCTYPE html>
<html>
  <head>
    <title>First step for bootstrap4</title>
  </head>
  <body>
    <h1>Hello, world!</h1>
  </body>
</html>

We can test out this static page to make sure all of our code is correct before we start adding the meat of the functionality to the project. Change into the base directory of your Django project where the manage.py file is located. Execute the development server with the following command:

python manage.py runserver

The Django development server should start up with no issues other than an unapplied migrations warning.

Watching for file changes with StatReloader
Performing system checks...

System check identified no issues (0 silenced).

You have 17 unapplied migration(s). Your project may not work properly until you apply the migrations for app(s): admin, auth, contenttypes, sessions.
Run 'python manage.py migrate' to apply them.

July 05, 2020 - 10:59:58
Django version 3.0.8, using settings 'djbootstrap4.settings'
Starting development server at
Quit the server with CONTROL-C.

Open a web browser and go to "http://localhost:8000".

With our base application working, we can now add Bootstrap.

Integrating Bootstrap

Time to add Bootstrap into the template so we can use its styling.

Open djbootstrap4/bootstrap4/templates/index.html back up and update it to match the following markup, which is very similar to what you will find in the Bootstrap introduction guide. The new or modified lines are the two meta elements, the stylesheet link, the title, and the comments and script elements at the end of the body:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.5.0/css/bootstrap.min.css" integrity="sha384-9aIt2nRpC12Uk9gS9baDl411NQApFmC26EwAOH8WgZl5MYYxFfc+NcPb1dKGj7Sk" crossorigin="anonymous">
    <title>bootstrap4</title>
  </head>
  <body>
    <h1>Hello, world!</h1>

    <!-- Optional JavaScript -->
    <!-- jQuery first, then Popper.js, then Bootstrap JS -->
    <script src="https://code.jquery.com/jquery-3.5.1.slim.min.js" integrity="sha384-DfXdz2htPH0lsSSs5nCTpuj/zy4C+OGpamoFVy38MVBnE+IbbVYUew+OrCXaRkfj" crossorigin="anonymous"></script>
    <script src="https://cdn.jsdelivr.net/npm/popper.js@1.16.0/dist/umd/popper.min.js" integrity="sha384-Q6E9RHvbIyZFJoft+2mJbHaEWldlvI9IOYy5n3zV9zzTtmI3UksdQRVvoxMfooAo" crossorigin="anonymous"></script>
    <script src="https://stackpath.bootstrapcdn.com/bootstrap/4.5.0/js/bootstrap.min.js" integrity="sha384-OgVRvuATP1z7JjHLkuOU7Xw704+h835Lr+6QL9UvYjZE3Ipu6Tp75j7Bh/kR0JKI" crossorigin="anonymous"></script>
  </body>
</html>

The above new lines in the <head> section add a couple of meta elements that are important to Bootstrap's styling, and add the mandatory Bootstrap stylesheet.
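The integrity attributes on the link and script elements let the browser verify that the file served by the CDN is the one you expected. As a rough sketch of how such a value is derived, assuming the usual Subresource Integrity format of an algorithm name followed by a base64-encoded digest, the helper below computes one from raw bytes. The function name and the local file name in the usage note are hypothetical.

```python
import base64
import hashlib


def sri_hash(content, algorithm="sha384"):
    """Return a Subresource Integrity value for the given bytes."""
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


if __name__ == "__main__":
    # Any bytes work; a real check would read a downloaded copy of the file.
    print(sri_hash(b"body { color: black; }"))
```

If you downloaded the stylesheet to a local file such as bootstrap.min.css, passing its bytes to sri_hash should give a value you can compare against the one in the link element.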

We keep the same <h1> header, which will automatically get the CSS styling. Then there are 3 optional script elements that pull in Bootstrap JavaScript for more advanced features. We are not using them in this tutorial because we just wanted to quickly show how to use the CDN. With this in place, you can look through the Bootstrap content docs to decide what you want to add to the template next.

Refresh the page at "http://localhost:8000" and you should see "Hello, world!" change fonts.

If you see that, it means everything works as expected.

What now?

We just added Bootstrap via the CDN so we can use it in our Django template. This was the absolute simplest way to add Bootstrap to a single Django page and now there's a ton more you can do with it.

Next, try out some of these other related Django tutorials:

Questions? Contact me via Twitter @fullstackpython or @mattmakai. I am also on GitHub with the username mattmakai. If you see an issue or error in this tutorial, please fork the source repository on GitHub and submit a pull request with the fix.


Russell Coker: Debian S390X Emulation

Planet Debian - Sat, 2020-07-04 22:58

I decided to set up some virtual machines for different architectures. One that I decided to try was S390X, the latest 64-bit version of the IBM mainframe. Here’s how to do it. I tested on a host running Debian/Unstable, but Buster should work in the same way.

First you need to create a filesystem in an image file with commands like the following:

truncate -s 4g /vmstore/s390x
mkfs.ext4 /vmstore/s390x
mount -o loop /vmstore/s390x /mnt/tmp

Then visit the Debian Netinst page [1] to download the S390X net install ISO. Then loopback mount it somewhere convenient like /mnt/tmp2.

The package qemu-system-misc has the program for emulating an S390X system (among many others). The qemu-user-static package has the program for emulating S390X for a single program (i.e. a statically linked program or a chroot environment); you need this to run debootstrap. The following commands should be most of what you need.

# Install the basic packages you need
apt install qemu-system-misc qemu-user-static debootstrap

# List the support for different binary formats
update-binfmts --display

# qemu s390x needs exec stack to solve "Could not allocate dynamic translator buffer"
# so you probably need this on SE Linux systems
setsebool allow_execstack 1

# commands to do the main install
debootstrap --foreign --arch=s390x --no-check-gpg buster /mnt/tmp file:///mnt/tmp2
chroot /mnt/tmp /debootstrap/debootstrap --second-stage

# set the apt sources
cat << END > /mnt/tmp/etc/apt/sources.list
deb http://YOURLOCALMIRROR/pub/debian/ buster main
deb http://security.debian.org/ buster/updates main
END

# for minimal install do not want recommended packages
echo "APT::Install-Recommends False;" > /mnt/tmp/etc/apt/apt.conf

# update to latest packages
chroot /mnt/tmp apt update
chroot /mnt/tmp apt dist-upgrade

# install kernel, ssh, and build-essential
chroot /mnt/tmp apt install bash-completion locales linux-image-s390x man-db openssh-server build-essential
chroot /mnt/tmp dpkg-reconfigure locales
echo s390x > /mnt/tmp/etc/hostname
chroot /mnt/tmp passwd

# copy kernel and initrd
mkdir -p /boot/s390x
cp /mnt/tmp/boot/vmlinuz* /mnt/tmp/boot/initrd* /boot/s390x

# setup /etc/fstab
cat << END > /mnt/tmp/etc/fstab
/dev/vda / ext4 noatime 0 0
#/dev/vdb none swap defaults 0 0
END

# clean up
umount /mnt/tmp
umount /mnt/tmp2

# setcap binary for starting bridged networking
setcap cap_net_admin+ep /usr/lib/qemu/qemu-bridge-helper

# afterwards set the access on /etc/qemu/bridge.conf so it can only
# be read by the user/group permitted to start qemu/kvm
echo "allow all" > /etc/qemu/bridge.conf

Some of the above can be considered more as pseudo-code in shell script rather than an exact way of doing things. While you can copy and paste all the above into a command line and have a reasonable chance of having it work, I think it would be better to look at each command and decide whether it’s right for you and whether you need to alter it slightly for your system.

To run qemu as non-root you need to have a helper program with extra capabilities to setup bridged networking. I’ve included that in the explanation because I think it’s important to have all security options enabled.

The “-object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-ccw,rng=rng0” part is to give entropy to the VM from the host, otherwise it will take ages to start sshd. Note that this is slightly but significantly different from the command used for other architectures (the “ccw” is the difference).

I’m not sure if “noresume” on the kernel command line is required, but it doesn’t do any harm. The “net.ifnames=0” stops systemd from renaming Ethernet devices. For the virtual networking the “ccw” again is a difference from other architectures.

Here is a basic command to run a QEMU virtual S390X system. If all goes well it should give you a login: prompt on a curses based text display. You can then log in as root and should be able to run “dhclient eth0” and other similar commands to set up networking and allow ssh logins.

qemu-system-s390x -drive format=raw,file=/vmstore/s390x,if=virtio -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-ccw,rng=rng0 -nographic -m 1500 -smp 2 -kernel /boot/s390x/vmlinuz-4.19.0-9-s390x -initrd /boot/s390x/initrd.img-4.19.0-9-s390x -curses -append "net.ifnames=0 noresume root=/dev/vda ro" -device virtio-net-ccw,netdev=net0,mac=02:02:00:00:01:02 -netdev tap,id=net0,helper=/usr/lib/qemu/qemu-bridge-helper
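Since a command this long is awkward to adapt by hand, one option is to assemble it from parts. The following Python sketch builds the same argument list as the command above; the function name and its parameters are made up for illustration, and it only prints the command rather than running QEMU.

```python
import shlex


def s390x_qemu_command(image, kernel, initrd, memory_mb=1500, cpus=2,
                       mac="02:02:00:00:01:02"):
    """Assemble the basic qemu-system-s390x command as an argument list."""
    return [
        "qemu-system-s390x",
        "-drive", f"format=raw,file={image},if=virtio",
        # Entropy from the host; note the s390x-specific "ccw" device name.
        "-object", "rng-random,filename=/dev/urandom,id=rng0",
        "-device", "virtio-rng-ccw,rng=rng0",
        "-nographic", "-m", str(memory_mb), "-smp", str(cpus),
        "-kernel", kernel, "-initrd", initrd,
        "-curses",
        "-append", "net.ifnames=0 noresume root=/dev/vda ro",
        # Bridged networking through the helper given cap_net_admin earlier.
        "-device", f"virtio-net-ccw,netdev=net0,mac={mac}",
        "-netdev", "tap,id=net0,helper=/usr/lib/qemu/qemu-bridge-helper",
    ]


command = s390x_qemu_command(
    "/vmstore/s390x",
    "/boot/s390x/vmlinuz-4.19.0-9-s390x",
    "/boot/s390x/initrd.img-4.19.0-9-s390x",
)

if __name__ == "__main__":
    # Quote each argument so the output can be pasted into a shell.
    print(" ".join(shlex.quote(arg) for arg in command))
```

The list could be passed to subprocess.run to start the VM, and changing the image, kernel version, or MAC address then only touches the call at the bottom.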

Here is a slightly more complete QEMU command. It has 2 block devices, for root and swap. It has SE Linux enabled for the VM (SE Linux works nicely on S390X). I added the “lockdown=confidentiality” kernel security option even though it’s not supported in 4.19 kernels, it doesn’t do any harm and when I upgrade systems to newer kernels I won’t have to remember to add it.

qemu-system-s390x -drive format=raw,file=/vmstore/s390x,if=virtio -drive format=raw,file=/vmswap/s390x,if=virtio -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-ccw,rng=rng0 -nographic -m 1500 -smp 2 -kernel /boot/s390x/vmlinuz-4.19.0-9-s390x -initrd /boot/s390x/initrd.img-4.19.0-9-s390x -curses -append "net.ifnames=0 noresume security=selinux root=/dev/vda ro lockdown=confidentiality" -device virtio-net-ccw,netdev=net0,mac=02:02:00:00:01:02 -netdev tap,id=net0,helper=/usr/lib/qemu/qemu-bridge-helper

Try It Out

I’ve got a S390X system online for a while, “ssh root@s390x.coker.com.au” with password “SELINUX” to try it out.


I’ve tried running a PPC64 virtual machine, I did the same things to set it up and then tried launching it with the following result:

qemu-system-ppc64 -drive format=raw,file=/vmstore/ppc64,if=virtio -nographic -m 1024 -kernel /boot/ppc64/vmlinux-4.19.0-9-powerpc64le -initrd /boot/ppc64/initrd.img-4.19.0-9-powerpc64le -curses -append "root=/dev/vda ro"

Above is the minimal qemu command that I’m using. Below is the result; it stops after the “4.” from “4.19.0-9”. Note that I had originally tried with a more complete and usable set of options, but I trimmed it to the minimum needed to demonstrate the problem.

Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
This program and the accompanying materials are made available
under the terms of the BSD License available at
http://www.opensource.org/licenses/bsd-license.php

Booting from memory...
Linux ppc64le
#1 SMP Debian 4.

The kernel is from the package linux-image-4.19.0-9-powerpc64le which is a dependency of the package linux-image-ppc64el in Debian/Buster. The program qemu-system-ppc64 is from version 5.0-5 of the qemu-system-ppc package.

Any suggestions on what I should try next would be appreciated.
