FLOSS Project Planets

PSF GSoC students blogs: Week 6 Blog

Planet Python - Tue, 2020-07-07 15:01

Hello everyone!
So it's the beginning of the sixth week. The first evaluation results are out and, fortunately, I made it through. :D
This week I implemented the query functions present in the DetourNavMeshQuery class. There are primarily two query functions: findPath() and findStraightPath(), and each has its own uses. The implementation initially looked easy, but it got quite complex once I started coding: there were many other function calls I had to make in order to implement these two functions, while keeping track of the variables involved in the process.

Both functions have their use cases. findPath() returns the polygons that form the path corridor; it is then up to us how we connect those polygons to complete the path. findStraightPath() directly returns an array of vertices. Here is an example of each pathfinding query:

 

The green line shows the path in both of the above images. The first image uses the findPath() query while the second one uses the findStraightPath() query.
On the basis of these two images you might conclude that findStraightPath() is obviously the better of the two, but trust me, you won't always want to use findStraightPath(). You will know when you use it. :P
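To make the difference concrete, here is a rough usage sketch. Every name in it (NavMeshQuery, find_path, find_straight_path, start_point, end_point) is a placeholder I invented for illustration, not the actual Panda3D binding that is being developed:

# Hypothetical sketch only -- names are placeholders, not the real API.
query = NavMeshQuery(navmesh)

# findPath(): returns the polygons of the path corridor; turning the corridor
# into a drawable line (e.g. by connecting polygon centers) is up to the caller.
polygons = query.find_path(start_point, end_point)

# findStraightPath(): returns the vertices of the "string-pulled" path,
# which can be drawn directly as the green line in the images above.
vertices = query.find_straight_path(start_point, end_point)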

Apart from this, I spent time dividing the library into two separate libraries: navmeshgen and navigation. The navmeshgen library provides the tools to generate the polygon mesh, the detail polygon mesh and the navmesh. The navigation library provides the tools to further process the navmesh and make queries over it. Recast and NavMeshBuilder are included in navmeshgen, while Detour, NavMesh, NavMeshNode and NavMeshQuery are included in navigation.

This week I plan to make BAM serialization possible and work on the review comments made by my mentor @moguri on my PR on GitHub.

Will return next week with more exciting stuff. Till then, stay safe!

 

 

Categories: FLOSS Project Planets

Week 4 and 5: GSoC Project Report

Planet KDE - Tue, 2020-07-07 13:11

This is the report for weeks 4 and 5 combined into one, because I couldn't do much during week 4 due to college tests and assignments, so there was not much to report for that week. These two weeks I worked on implementing interactions between the storyboard docker and the timeline docker (or the new Animation Timeline docker). Most of the interactions from the timeline docker to the storyboard docker are implemented. The things implemented:

  • Selections are synced.
  • Adding, removing and moving keyframes in the timeline docker results in items being added, removed and moved (sensibly).
  • Thumbnails for the current frame would be updated when you change the contents of that frame.

Selections between the storyboard docker and the timeline docker are synced. If you select a frame in the timeline docker, the item corresponding to that frame in the storyboard docker gets selected. This works the other way too: if you select an item in the storyboard docker, that frame is selected in the timeline docker.

Adding a keyframe in the timeline docker would add an item for that frame in the storyboard docker, if there is not an item for that frame already.

Removing a keyframe in the timeline docker would remove the item for that frame in the storyboard docker, if there is no other keyframe at that frame.

Moving a keyframe in the timeline docker would move the item for that frame in the storyboard docker so that items in the storyboard docker are always sorted according to frames.

It is possible not to add items to the storyboard on addition of a keyframe by toggling the lock button in the storyboard docker. Removing and moving keyframes, however, always causes removal and movement of items in the storyboard docker. So the number of items can be less than the number of keyframes, but it can never be more.

Changing a keyframe updates the thumbnail for that frame in the storyboard docker. Not all affected items are updated right now, but I am working on it.

This week I will work on updating all affected items when a keyframe is changed, and I will write unit tests for the interactions.

Categories: FLOSS Project Planets

PSF GSoC students blogs: Weekly Check-in #6

Planet Python - Tue, 2020-07-07 12:40
What did I do this week?

Completed the chatbot example and added documentation for the same.

 

What's next?

I'll start working on the second phase of the project, implementing a distributed orchestrator.


Did I get stuck somewhere?

Yes, I had some doubts regarding the implementation of the distributed orchestrator. I had a meeting with my mentor today, and we have decided on an initial path.

Categories: FLOSS Project Planets

Paolo Amoroso: Repl.it Redesigned the Mobile Experience

Planet Python - Tue, 2020-07-07 11:36
The cloud IDE Repl.it was redesigned to improve the user experience on mobile devices.
On smartphones, the focused REPL pane now takes up most of the screen. The redesign takes advantage of native mobile design patterns and lets you switch to a different pane from the bottom navigation bar. There are panes for the code editor, the console, and the output.
A Python REPL in Repl.it on my Pixel 2 XL phone.
Tapping the code in the editor brings up a contextual menu with some of the options of the desktop version. You can select, search, or paste text, or open the full command palette.
On my Pixel 2 XL phone in Chrome, lines with up to 42 characters fit in the editor’s width. The editor wraps longer lines. But most of the code usually keeps the original indentation and its structure is still clear at a glance. The console pane wraps text, too, so no horizontal scrolling is required.
You can get an idea of what Repl.it looks like on mobile by opening the browser on your device and visiting a Python REPL I set up for testing the mobile interface.  It’s an instance of Repl.it’s Multi-Page-Flask-Template, a Flask app that generates pages based on the slug entered as input.
Repl.it is a multi-language development environment in the cloud. It supports dozens of programming languages and frameworks. It’s my favorite IDE as it works fully in the cloud. This is a killer feature for a Chrome OS enthusiast like me.

This post by Paolo Amoroso was published on Moonshots Beyond the Cloud.

Categories: FLOSS Project Planets

PSF GSoC students blogs: Week 5 Check-in

Planet Python - Tue, 2020-07-07 11:33
What did you do this week?

I continued the PR started in the previous week by adding more multimethods for array manipulation. The following multimethods were added:

Tiling arrays

  • tile
  • repeat

Adding and removing elements

  • delete
  • insert
  • append
  • resize
  • trim_zeros

Rearranging elements

  • flip
  • fliplr
  • flipud
  • reshape
  • roll
  • rot90

Most of the work was adding default implementations for the above multimethods, which relied heavily on array slicing. Contrary to what I mentioned in my last blog post, not much time was dedicated to writing overriden_class for other backends, so I will try to compensate in the coming weeks.
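To give an idea of what such a slicing-based default can look like, here is a rough sketch of a default implementation for flip. This is my own illustration of the idea, not the actual uarray/unumpy code:

def flip_default(a, axis=None):
    # Reverse the array along every axis when axis is None,
    # otherwise only along the given axis (or tuple of axes).
    if axis is None:
        axes = range(a.ndim)
    elif isinstance(axis, int):
        axes = (axis,)
    else:
        axes = axis

    index = [slice(None)] * a.ndim
    for ax in axes:
        index[ax] = slice(None, None, -1)

    return a[tuple(index)]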

What is coming up next?

As described in my proposal's timeline I'll be starting a new PR that adds multimethods for mathematical functions. This will be the focus of this week's work as some of these multimethods are also needed by sangyx, another GSoC student working on uarray. He is working on the udiff library that uses uarray and unumpy for automatic differentiation. To avoid getting in his way I will try to finish this PR as soon as possible.

Did you get stuck anywhere?

There were times when I didn't know exactly how to implement a specific default but this was easily overcome with the help of my mentors who would point me in the right direction. Asking for help sooner rather than later has proven to be invaluable. Looking back I think there were no major blocks this week.

Categories: FLOSS Project Planets

Codementor: Python Flask Tutorial: How to Make a Basic Page (Source Code Included!) 📃👨‍💻

Planet Python - Tue, 2020-07-07 11:28
Boilerplate app for Python Flask. Example source code in Python for template flask.
Categories: FLOSS Project Planets

Doug Hellmann: beagle 0.3.0

Planet Python - Tue, 2020-07-07 10:12
beagle is a command line tool for querying a hound code search service such as http://codesearch.openstack.org

What's new in 0.3.0?

  • Add repo-pattern usages examples in the doc (contributed by Hervé Beraud)
  • Add an option to filter repositories in search results
  • Refresh python's versions and their usages (contributed by Hervé Beraud)
Categories: FLOSS Project Planets

Mike Driscoll: Python 101 – Debugging Your Code with pdb

Planet Python - Tue, 2020-07-07 10:00

Mistakes in your code are known as “bugs”. You will make mistakes. You will make many mistakes, and that’s totally fine. Most of the time, they will be simple mistakes such as typos. But since computers are very literal, even typos prevent your code from working as intended. So they need to be fixed. The process of fixing your mistakes in programming is known as debugging.

The Python programming language comes with its own built-in debugger called pdb. You can use pdb on the command line or import it as a module. The name, pdb, is short for “Python debugger”.

Here is a link to the full documentation for pdb: https://docs.python.org/3/library/pdb.html

In this article, you will familiarize yourself with the basics of using pdb. Specifically, you will learn the following:

  • Starting pdb in the REPL
  • Starting pdb on the Command Line
  • Stepping Through Code
  • Adding Breakpoints in pdb
  • Creating a Breakpoint with set_trace()
  • Using the built-in breakpoint() Function
  • Getting Help

While pdb is handy, most Python editors have debuggers with more features. You will find the debugger in PyCharm or WingIDE to have many more features, such as auto-complete, syntax highlighting, and a graphical call stack.

A call stack is what your debugger will use to keep track of function and method calls. When possible, you should use the debugger that is included with your Python IDE as it tends to be a little easier to understand.

However, there are times where you may not have your Python IDE, for example when you are debugging remotely on a server. It is those times when you will find pdb to be especially helpful.

Let’s get started!

Starting pdb in the REPL

The best way to start is to have some code that you want to run pdb on. Feel free to use your own code or a code example from another article on this blog.

Or you can create the following code in a file named debug_code.py:

# debug_code.py

def log(number):
    print(f'Processing {number}')
    print(f'Adding 2 to number: {number + 2}')


def looper(number):
    for i in range(number):
        log(i)

if __name__ == '__main__':
    looper(5)

There are several ways to start pdb and use it with your code. For this example, you will need to open up a terminal (or cmd.exe if you’re a Windows user). Then navigate to the folder that you saved your code to.

Now start Python in your terminal. This will give you the Python REPL where you can import your code and run the debugger, pdb. Here’s how:

>>> import debug_code
>>> import pdb
>>> pdb.run('debug_code.looper(5)')
> <string>(1)<module>()
(Pdb) continue
Processing 0
Adding 2 to number: 2
Processing 1
Adding 2 to number: 3
Processing 2
Adding 2 to number: 4
Processing 3
Adding 2 to number: 5
Processing 4
Adding 2 to number: 6

The first two lines of code import your code and pdb. To run pdb against your code, you need to use pdb.run() and tell it what to do. In this case, you pass in debug_code.looper(5) as a string. When you do this, the pdb module will transform the string into an actual function call of debug_code.looper(5).

The next line is prefixed with (Pdb). That means you are now in the debugger. Success!

To run your code in the debugger, type continue or c for short. This will run your code until one of the following happens:

  • The code raises an exception
  • You get to a breakpoint (explained later on in this article)
  • The code finishes

In this case, there were no exceptions or breakpoints set, so the code worked perfectly and finished execution!

Starting pdb on the Command Line

An alternative way to start pdb is via the command line. The process for starting pdb in this manner is similar to the previous method. You still need to open up your terminal and navigate to the folder where you saved your code.

But instead of opening Python, you will run this command:

python -m pdb debug_code.py

When you run pdb this way, the output will be slightly different:

> /python101code/chapter26_debugging/debug_code.py(1)<module>()
-> def log(number):
(Pdb) continue
Processing 0
Adding 2 to number: 2
Processing 1
Adding 2 to number: 3
Processing 2
Adding 2 to number: 4
Processing 3
Adding 2 to number: 5
Processing 4
Adding 2 to number: 6
The program finished and will be restarted
> /python101code/chapter26_debugging/debug_code.py(1)<module>()
-> def log(number):
(Pdb) exit

The 3rd line of output above has the same (Pdb) prompt that you saw in the previous section. When you see that prompt, you know you are now running in the debugger. To start debugging, enter the continue command.

The code will run successfully as before, but then you will see a new message:

The program finished and will be restarted

The debugger finished running through all your code and then started again from the beginning! That is handy for running your code multiple times! If you do not wish to run through the code again, you can type exit to quit the debugger.

Stepping Through Code

Stepping through your code is when you use your debugger to run one line of code at a time. You can use pdb to step through your code by using the step command, or s for short.

Following is the first few lines of output that you will see if you step through your code with pdb:

$ python -m pdb debug_code.py
> /python101code/chapter26_debugging/debug_code.py(3)<module>()
-> def log(number):
(Pdb) step
> /python101code/chapter26_debugging/debug_code.py(8)<module>()
-> def looper(number):
(Pdb) s
> /python101code/chapter26_debugging/debug_code.py(12)<module>()
-> if __name__ == '__main__':
(Pdb) s
> /python101code/chapter26_debugging/debug_code.py(13)<module>()
-> looper(5)
(Pdb)

The first command that you pass to pdb is step. Then you use s to step through the following two lines. You can see that both commands do exactly the same thing, since "s" is a shortcut or alias for "step".

You can use the next (or n) command to continue execution until the next line within the function. If there is a function call within your function, next will step over it. What that means is that it will call the function, execute its contents, and then continue to the next line in the current function. This, in effect, steps over the function.

You can use step and next to navigate your code and run various pieces efficiently.

If you want to step into the looper() function, continue to use step. On the other hand, if you don’t want to run each line of code in the looper() function, then you can use next instead.

You should continue your session in pdb by calling step so that you step into looper():

(Pdb) s
--Call--
> /python101code/chapter26_debugging/debug_code.py(8)looper()
-> def looper(number):
(Pdb) args
number = 5

When you step into looper(), pdb will print out --Call-- to let you know that you called the function. Next you used the args command to print out all the current args in your namespace. In this case, looper() has one argument, number, which is displayed in the last line of output above. You can replace args with the shorter a.

The last command that you should know about is jump or j. You can use this command to jump to a specific line number in your code by typing jump followed by a space and then the line number that you wish to go to.
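For example, if you were stopped inside looper(), jumping to line 10 would move execution there and pdb would show the new current line. The exchange looks roughly like this (the path and line shown depend on your own file, so treat this as an illustration of the output shape only):

(Pdb) jump 10
> /python101code/chapter26_debugging/debug_code.py(10)looper()
-> log(i)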

Now let’s learn how you can add a breakpoint!

Adding Breakpoints in pdb

A breakpoint is a location in your code where you want your debugger to stop so you can check on variable states. What this allows you to do is to inspect the call stack, which is a fancy term for the chain of frames holding all the variables and function arguments that are currently in memory.

If you have PyCharm or WingIDE, then they will have a graphical way of letting you inspect the callstack. You will probably be able to mouse over the variables to see what they are set to currently. Or they may have a tool that lists out all the variables in a sidebar.

Let’s add a breakpoint to the last line in the looper() function which is line 10.

Here is your code again:

# debug_code.py

def log(number):
    print(f'Processing {number}')
    print(f'Adding 2 to number: {number + 2}')


def looper(number):
    for i in range(number):
        log(i)

if __name__ == '__main__':
    looper(5)

To set a breakpoint in the pdb debugger, you can use the break or b command followed by the line number you wish to break on:

$ python3.8 -m pdb debug_code.py
> /python101code/chapter26_debugging/debug_code.py(3)<module>()
-> def log(number):
(Pdb) break 10
Breakpoint 1 at /python101code/chapter26_debugging/debug_code.py:10
(Pdb) continue
> /python101code/chapter26_debugging/debug_code.py(10)looper()
-> log(i)
(Pdb)

Now you can use the args command here to find out what the current arguments are set to. You can also print out the value of variables, such as the value of i, using the print (or p for short) command:

(Pdb) print(i)
0

Now let’s find out how to add a breakpoint to your code!

Creating a Breakpoint with set_trace()

The Python debugger allows you to import the pdb module and add a breakpoint to your code directly, like this:

# debug_code_with_settrace.py

def log(number):
    print(f'Processing {number}')
    print(f'Adding 2 to number: {number + 2}')


def looper(number):
    for i in range(number):
        import pdb; pdb.set_trace()
        log(i)

if __name__ == '__main__':
    looper(5)

Now when you run this code in your terminal, it will automatically launch into pdb when it reaches the set_trace() function call:

$ python3.8 debug_code_with_settrace.py
> /python101code/chapter26_debugging/debug_code_with_settrace.py(12)looper()
-> log(i)
(Pdb)

This requires you to add a fair amount of extra code that you’ll need to remove later. You can also have issues if you forget to add the semi-colon between the import and the pdb.set_trace() call.

To make things easier, the Python core developers added breakpoint() which is the equivalent of writing import pdb; pdb.set_trace().

Let’s discover how to use that next!

Using the built-in breakpoint() Function

Starting in Python 3.7, the breakpoint() function has been added to the language to make debugging easier. You can read all about the change in PEP 553: https://peps.python.org/pep-0553/

Go ahead and update your code from the previous section to use breakpoint() instead:

# debug_code_with_breakpoint.py

def log(number):
    print(f'Processing {number}')
    print(f'Adding 2 to number: {number + 2}')


def looper(number):
    for i in range(number):
        breakpoint()
        log(i)

if __name__ == '__main__':
    looper(5)

Now when you run this in the terminal, Pdb will be launched exactly as before.

Another benefit of using breakpoint() is that many Python IDEs will recognize that function and automatically pause execution. This means you can use the IDE’s built-in debugger at that point to do your debugging. This is not the case if you use the older set_trace() method.
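A related convenience, defined by PEP 553 but not covered in this article, is the PYTHONBREAKPOINT environment variable, which controls what breakpoint() does. For instance:

# Disable every breakpoint() call without touching the code
$ PYTHONBREAKPOINT=0 python3.8 debug_code_with_breakpoint.py

# Or route breakpoint() to another debugger, e.g. ipdb (if it is installed)
$ PYTHONBREAKPOINT=ipdb.set_trace python3.8 debug_code_with_breakpoint.py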

Getting Help

This chapter doesn’t cover all the commands that are available to you in pdb. So to learn more about how to use the debugger, you can use the help command within pdb. It will print out the following:

(Pdb) help

Documented commands (type help <topic>):
========================================
EOF    c          d        h         list      q        rv       undisplay
a      cl         debug    help      ll        quit     s        unt
alias  clear      disable  ignore    longlist  r        source   until
args   commands   display  interact  n         restart  step     up
b      condition  down     j         next      return   tbreak   w
break  cont       enable   jump      p         retval   u        whatis
bt     continue   exit     l         pp        run      unalias  where

Miscellaneous help topics:
==========================
exec  pdb

If you want to learn what a specific command does, you can type help followed by the command.

Here is an example:

(Pdb) help where
w(here)
        Print a stack trace, with the most recent frame at the bottom.
        An arrow indicates the "current frame", which determines the
        context of most commands. 'bt' is an alias for this command.

Go give it a try on your own!

Wrapping Up

Being able to debug your code successfully takes practice. It is great that Python provides you with a way to debug your code without installing anything else. You will find that using breakpoint() to enable breakpoints in your IDE is also quite handy.

In this article you learned about the following:

  • Starting pdb in the REPL
  • Starting pdb on the Command Line
  • Stepping Through Code
  • Creating a Breakpoint with set_trace()
  • Adding Breakpoints in pdb
  • Using the built-in breakpoint() Function
  • Getting Help

You should go and try to use what you have learned here in your own code. Adding intentional errors to your code and then running them through your debugger is a great way to learn how things work!

The post Python 101 – Debugging Your Code with pdb appeared first on The Mouse Vs. The Python.

Categories: FLOSS Project Planets

Real Python: Pointers and Objects in Python

Planet Python - Tue, 2020-07-07 10:00

If you’ve ever worked with lower-level languages like C or C++, then you may have heard of pointers. Pointers are essentially variables that hold the memory address of another variable. They can make parts of your code very efficient, but they can also lead to various memory management bugs.

You’ll learn about Python’s object model and see why pointers in Python don’t really exist. For the cases where you need to mimic pointer behavior, you’ll learn ways to simulate pointers in Python without managing memory.

In this course, you’ll:

  • Learn why pointers in Python don’t exist
  • Explore the difference between C variables and Python names
  • Simulate pointers in Python
  • Experiment with real pointers using ctypes

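As a tiny preview of the ctypes part (this snippet is my own example, not taken from the course), this is what a real pointer looks like from Python:

import ctypes

number = ctypes.c_int(42)         # a C integer managed from Python
pointer = ctypes.pointer(number)  # a real pointer to its memory

print(pointer.contents)  # c_int(42)
pointer[0] = 99          # dereference and write through the pointer
print(number.value)      # 99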

Categories: FLOSS Project Planets

Erik Marsja: Adding New Columns to a Dataframe in Pandas (with Examples)

Planet Python - Tue, 2020-07-07 09:52


In this Pandas tutorial, we are going to learn all there is about adding new columns to a dataframe. Here, we are going to use the same three methods that we used to add empty columns to a Pandas dataframe. Specifically, when adding columns to the dataframe we are going to use the following 3 methods:

  1. Simply assigning new data to the dataframe
  2. The assign() method to add new columns
  3. The insert() method to add new columns
Outline

The outline of the tutorial is as follows: a brief introduction, then a quick overview of how to add new columns to a Pandas dataframe (all three methods). Following the overview, we create some fake data and then use the three methods to add columns to the created dataframe.

Introduction

There are many things that we may want to do after we have created, or loaded, our dataframe in Pandas. For instance, we may go on to do some data manipulation tasks such as manipulating the columns of the dataframe. Now, if we are reading most of the data from one data source but some data from another, we need to know how to add columns to a dataframe.

Adding a column to a Pandas dataframe is easy. Furthermore, as you surely have noticed, there are a few ways to carry out this task. Of course, this can create some confusion for beginners: you might see several different ways to add a column to a dataframe and ask yourself, which one should I use?

How to Add New Columns to a Dataframe in Pandas in 3 Ways

As previously mentioned, this tutorial is going to go through 3 different methods we can use when adding columns to the dataframe. First, we are going to use the method you may be familiar with if you know Python but have not worked with Pandas that much yet. Namely, we are going to use simple assigning:

1. Adding a New Column by Assigning New Data:

Here's how to add a list, for example, to an existing dataframe in Pandas: df['NewCol'] = [1, 3, 4, 5, 6]. In the next example, we are going to use the assign() method:

2. Adding New Columns Using the assign() Method:

Here's how to add new columns by using the assign() method: df = df.assign(NewCol1=[1, 2, 3, 4, 5], NewCol2=[.1, .2, .3, .5, -3]). After this, we will see an example of adding new columns using the insert() method:

3. Adding New Columns Using the insert() Method:

Here's how new columns can be added with the insert() method: df.insert(4, 'NewCol', [1, 2, 3, 4, 5]). In the next section, before we go through the examples, we are going to create some example data to play around with.

Pandas dataframe from a dictionary

In most cases, we are going to read our data from an external data source. Here, however, we are going to create a Pandas dataframe from a dictionary:

import pandas as pd

gender = ['M', 'F', 'F', 'M']
cond = ['Silent', 'Silent', 'Noise', 'Noise']
age = [19, 21, 20, 22]
rt = [631.2, 601.3, 721.3, 722.4]

data = {'Gender': gender, 'Condition': cond,
        'age': age, 'RT': rt}

# Creating the dataframe from dict:
df = pd.DataFrame(data)

In the code chunk above, we imported Pandas and created 4 Python lists. Second, we created a dictionary with the column names we later want in our dataframe as keys and the 4 lists as values. Finally, we used the dataframe constructor to create a dataframe from our dictionary. If you need to learn more about importing data to a Pandas dataframe check the following tutorials:

Example 1: Adding New Columns to a dataframe by Assigning Data

In the first example, we are going to add new columns to the dataframe by assigning new data. For example, if we have new data, such as a constant value or a list, that we need to add to an existing dataframe, we can just assign it as follows:

df['NewCol1'] = 'A'
df['NewCol2'] = [1, 2, 3, 4]

display(df)

In the code above, we first added a new column (NewCol1) by assigning the single string 'A' to it. To explain, the new column was created using the brackets ([]). Note that assigning a single value, as we did here, will fill the entire newly added column with that value. Second, we added another column (NewCol2) in the same way, this time assigning the list [1, 2, 3, 4]. Finally, when adding columns using this method we set the new column names using Python strings.

The dataframe with the new, added, columns

Now, it's important to know that each list we assign to a new column needs to be of the exact same length as the existing columns in the Pandas dataframe, that is, one value per row. For example, the example dataframe we are working with has 4 rows:

If we try to add a list with only 3 values, it won't work (see the image below for the error message).

Example 2: Adding New Columns to a dataframe with the assign() method

In the second example, we are adding new columns to the Pandas dataframe with the assign() method:

df.assign(NewCol1='A', NewCol2=[1, 2, 3, 4])

In this second example, we added two new columns to our dataframe by passing two keyword arguments to the assign() method. The argument names become the new column names, and each column receives the same data we used in the previous example (the string 'A' and the list [1, 2, 3, 4]), so the result is exactly the same as in the first example. Note that assign() returns a new dataframe, so you need to assign the result back (e.g., df = df.assign(...)) if you want to keep it. Importantly, if we use the same names as already existing columns in the dataframe, the old columns will be overwritten. Again, when adding new columns the data you want to add needs to be of the exact same length as the number of rows of the Pandas dataframe.

Example 3: Adding New Columns to dataframe in Pandas with the insert() method

In the third example, we are going to add new columns to the dataframe using the insert() method:

df.insert(4, 'NewCol1', 'Bre')
df.insert(5, 'NewCol2', [1, 2, 3, 4])

display(df)

To explain the code above: we added two columns using the 3 arguments of the insert() method. First, we used the loc argument to "tell" Pandas where we want the new column to be located in the dataframe. In our case, we add them at the last positions in the dataframe. Second, we used the column argument (which takes a string for the new column name). Lastly, we used the value argument to actually add the same data as in the previous examples. Here is the resulting dataframe:

As you may have noticed, when working with the insert() method we need to know how many columns there are in the dataframe if we want to add the new columns at the end. If we don't know the number of columns, we can use len(df.columns) instead. Here is the same example as above, using the length of the columns:

df.insert(len(df.columns), 'NewCol1', 'Bre')
df.insert(len(df.columns), 'NewCol2', [1, 2, 3, 4])

Note, if we really want to, we can actually insert a column whose name already exists in the dataframe. To accomplish this we need to set allow_duplicates to True. For example, the following column-adding example will work even if NewCol1 and NewCol2 are already present:

df.insert(1, 'NewCol1', 'Bre', allow_duplicates=True)
df.insert(3, 'NewCol2', [1, 2, 3, 4], allow_duplicates=True)

Now, if we have a lot of columns there are, of course, alternatives that may be more feasible than the ones we have covered here. For instance, if we want to add columns from another dataframe we can use the join or merge methods, or the concat function, as sketched below.
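As a quick illustration of that last point (the second dataframe and its Score and Group columns are made up for this sketch, they are not part of the tutorial's data), columns from another dataframe can be added like this:

import pandas as pd

# The dataframe from earlier in the post
df = pd.DataFrame({'Gender': ['M', 'F', 'F', 'M'],
                   'Condition': ['Silent', 'Silent', 'Noise', 'Noise'],
                   'age': [19, 21, 20, 22],
                   'RT': [631.2, 601.3, 721.3, 722.4]})

# A second dataframe with the extra columns (made-up example data)
other_df = pd.DataFrame({'Score': [10, 20, 30, 40],
                         'Group': ['a', 'a', 'b', 'b']})

# Column-wise concatenation: rows are matched on the index
df = pd.concat([df, other_df], axis=1)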

Conclusion

In this post, we learned how to add new columns to a dataframe in Pandas. Specifically, we used 3 different methods. First, we added a column by simply assigning a string and a list. This method is very similar to assigning values to Python variables. Second, we used the assign() method and added new columns to the Pandas dataframe. Finally, we had a look at the insert() method and used this method to add new columns in the dataframe. In conclusion, the best method to add columns is the assign() method. Of course, if we read data from other sources and want to merge two dataframes, only getting the new columns from one of them, we should use other methods (e.g., concat or merge).

Hope you enjoyed this Pandas tutorial and please leave a comment below. Especially, if there is something you want to be covered on the blog or something that should be added to this blog post. Finally, please share the post if you learned something new!

The post Adding New Columns to a Dataframe in Pandas (with Examples) appeared first on Erik Marsja.

Categories: FLOSS Project Planets

Everyday Superpowers: Stop working so hard on paths. Get started with pathlib!

Planet Python - Tue, 2020-07-07 09:39

Most people are working too hard to access files and folders with Python. pathlib makes it so much easier, and I have released two resources to help you get started using it.
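As a small taste of what pathlib offers (this snippet is mine, not from the original article, and the path is made up), building and inspecting a path becomes:

from pathlib import Path

config = Path.home() / "projects" / "demo" / "settings.toml"  # made-up path

print(config.name)    # settings.toml
print(config.suffix)  # .toml
print(config.parent)  # .../projects/demo

if config.exists():
    text = config.read_text()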


Read more...
Categories: FLOSS Project Planets

OpenSense Labs: Why Low Code Development Is Not As Great As We Think

Planet Drupal - Tue, 2020-07-07 08:30
By Tuba Ayyubi, Tue, 07/07/2020 - 18:00

Low-code development replaces the traditional method of hard coding and allows us to create our own applications without any help from the IT developers. It requires minimal hand-coding and enables faster delivery of applications with the help of pre-packaged templates, drag and drop tools, and graphic design techniques.

From leading low-code development platforms like Mendix and Outsystems to Acquia Cohesion (suited for Drupal websites), the low-code approach has been making waves as a great option for easy application development.

I am sure that after reading the above lines you are left confused: if low-code is such an easy way out, why does the title say it is not as great as we think? Well, if anything looks too good to be true, it is usually not that great. Let me tell you why!

Functionality-first and user needs later

Even though low code is a great help in making developers' lives easier, it unfortunately puts user experience at risk. A design-led or progressive approach becomes harder to achieve with low code, and putting functionality over the needs of the user never ends well.

Low code, as we know, saves time, and hence is said to be efficient. The truth is that it is efficient only with respect to development time: applications built on low code are rarely optimized for runtime efficiency. If you want your web app to run smoothly and fast, low code is not the go-to option for you.

No technical requirement: a myth

Low code is easy and can be done without including the technical team: True
Low code does not require any technical skill: False

For any of us to start working with low code, an understanding of how low-code development works is the first and most basic requirement. It takes time to learn and understand the process. So, before one starts using the tools, it is important to ensure that they have the basic technical skills that are required.
Limited functions 

In a low code development tool, the number of functions that you can implement is limited. It is definitely a quick way to build applications but in case you want to try out something different, you do not have many options.

Also, once an app is created on low code, it is not very easy to add custom code or any other required functionality to it.

Does it help in cost-cutting?

When it comes to low code, the cost is both a draw and a drawback. 

Because of its flexibility, low code is easier to use and requires a small set of skills. So, you don’t have to specially hire someone and pay a hefty amount to do that.

Although it is easy to drag and drop building blocks that fulfil your requirements, once you need a special feature that is unavailable, you will need custom code. Merging the custom code can cost a lot more than a completely customized solution as a whole.

When a company starts, it starts small, and hence it is advisable to have a provision in its low-code contract for ramping up in the future. If not, the company may face a major setback before it is even able to get going properly.

Is it secure?

Low code has been giving rise to the question: Is it secure enough?

When you build an application using low code, it requires your complete trust in the platform. You have little control over data security and privacy and no access to the source code, which makes it difficult to identify possible vulnerabilities.

Using low code to produce code that does not adhere to established best practices could violate an organization's compliance measures, even if the resulting application itself is secure.

Vendor Lock-In Risks

Vendor lock-in is one of the major limitations of low code development.

For teams that use low code, vendor lock-in can mean poorly documented or even undocumented code that is difficult to maintain outside of the platform.

Hence, it is important to understand each vendor’s policies before licensing any tool and ensure that you know whether or not you are able to maintain applications outside of the platform.

Conclusion

Low code is indeed a useful tool, but it comes with cons you can't ignore. Low-code platform vendors will only tell you that it's faster and easier, but the lack of options and functions, the security risks, and the other major drawbacks discussed above make us rethink whether it is actually the solution we want for an enterprise application.

Categories: FLOSS Project Planets

The Digital Cat: Flask project setup: TDD, Docker, Postgres and more - Part 3

Planet Python - Tue, 2020-07-07 08:00

In this series of posts I explore the development of a Flask project with a setup that is built with efficiency and tidiness in mind, using TDD, Docker and Postgres.

Catch-up

In the first and second posts I created a Flask project with a tidy setup, using Docker to run the development environment and the tests, and mapping important commands in a management script, so that the configuration can be in a single file and drive the whole system.

In this post I will show you how to easily create scenarios, that is databases created on the fly with custom data, so that it is possible to test queries in isolation, either with the Flask application or with the command line. I will also show you how to define a configuration for production and give some hints for the deployment.

Step 1 - Creating scenarios

The idea of scenarios is simple. Sometimes you need to investigate specific use cases for bugs, or maybe increase the performances of some database queries, and you might need to do this on a customised database. This is a scenario, a Python file that populates the database with a specific set of data and that allows you to run the application or the database shell on it.

Often the development database is a copy of the production one, maybe with sensitive data stripped to avoid leaking private information, and while this gives us a realistic case in which to test queries (e.g. how does the query perform on 1 million rows?) it might not help during the initial investigations, where you need to have all the data in front of you to properly understand what happens. Whoever has learned how joins work in relational databases understands what I mean here.

In principle, to create a scenario we just need to spin up an empty database and to run the scenario code against it. In practice, things are not much more complicated, but there are a couple of minor issues that we need to solve.

First, I am already running a database for development and one for testing. The second is ephemeral, but I decided to set up the project so that I can run the tests while the development database is up, and the way I did it was by using port 5432 (the standard Postgres one) for development and 5433 for testing. Spinning up scenarios adds more databases to the equation. Clearly I do not expect to run 5 scenarios at the same time while running the development and the test databases, but I have a rule to make something generic as soon as I do it for the third time.

This means that I won't create a database for a scenario on port 5434 and will instead look for a more generic solution. This is offered to me by the Docker networking model, where I can map a container port to the host without assigning the destination port, which will be chosen randomly by Docker itself among the unprivileged ones. This means that I can create a Postgres container exposing port 5432 (the port in the container) and have Docker connect it to port 32838 on the host (for example). As long as the application knows which port to use, this is absolutely the same as using port 5432.

Unfortunately the Docker interface is not extremely script-friendly when it comes to providing information and I have to parse the output a bit. Practically speaking, after I spin up the containers, I will run the command docker-compose port db 5432 which will return a string like 0.0.0.0:32838, and I will extract the port from it. Nothing major, but these are the (sometimes many) issues you face when you orchestrate different systems together.
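In isolation, that parsing boils down to something like this (using scenario_foo as an example project name and compose file, mirroring what the management script below does):

import subprocess

# Ask Docker which host port is mapped to the container's 5432
cmdline = [
    "docker-compose", "-p", "scenario_foo",
    "-f", "docker/scenario_foo.yml",
    "port", "db", "5432",
]
out = subprocess.check_output(cmdline)

# The output looks like "0.0.0.0:32838\n"; keep only the port
port = out.decode("utf-8").strip().split(":")[1]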

The new management script is

File: manage.py

#! /usr/bin/env python

import os
import json
import signal
import subprocess
import time
import shutil

import click
import psycopg2
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT


# Ensure an environment variable exists and has a value
def setenv(variable, default):
    os.environ[variable] = os.getenv(variable, default)


setenv("APPLICATION_CONFIG", "development")

APPLICATION_CONFIG_PATH = "config"
DOCKER_PATH = "docker"


def app_config_file(config):
    return os.path.join(APPLICATION_CONFIG_PATH, f"{config}.json")


def docker_compose_file(config):
    return os.path.join(DOCKER_PATH, f"{config}.yml")


def configure_app(config):
    # Read configuration from the relative JSON file
    with open(app_config_file(config)) as f:
        config_data = json.load(f)

    # Convert the config into a usable Python dictionary
    config_data = dict((i["name"], i["value"]) for i in config_data)

    for key, value in config_data.items():
        setenv(key, value)


@click.group()
def cli():
    pass


@cli.command(context_settings={"ignore_unknown_options": True})
@click.argument("subcommand", nargs=-1, type=click.Path())
def flask(subcommand):
    configure_app(os.getenv("APPLICATION_CONFIG"))

    cmdline = ["flask"] + list(subcommand)

    try:
        p = subprocess.Popen(cmdline)
        p.wait()
    except KeyboardInterrupt:
        p.send_signal(signal.SIGINT)
        p.wait()


def docker_compose_cmdline(commands_string=None):
    config = os.getenv("APPLICATION_CONFIG")
    configure_app(config)
    compose_file = docker_compose_file(config)

    if not os.path.isfile(compose_file):
        raise ValueError(f"The file {compose_file} does not exist")

    command_line = [
        "docker-compose",
        "-p",
        config,
        "-f",
        compose_file,
    ]

    if commands_string:
        command_line.extend(commands_string.split(" "))

    return command_line


@cli.command(context_settings={"ignore_unknown_options": True})
@click.argument("subcommand", nargs=-1, type=click.Path())
def compose(subcommand):
    cmdline = docker_compose_cmdline() + list(subcommand)

    try:
        p = subprocess.Popen(cmdline)
        p.wait()
    except KeyboardInterrupt:
        p.send_signal(signal.SIGINT)
        p.wait()


def run_sql(statements):
    conn = psycopg2.connect(
        dbname=os.getenv("POSTGRES_DB"),
        user=os.getenv("POSTGRES_USER"),
        password=os.getenv("POSTGRES_PASSWORD"),
        host=os.getenv("POSTGRES_HOSTNAME"),
        port=os.getenv("POSTGRES_PORT"),
    )

    conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)
    cursor = conn.cursor()
    for statement in statements:
        cursor.execute(statement)

    cursor.close()
    conn.close()


def wait_for_logs(cmdline, message):
    logs = subprocess.check_output(cmdline)
    while message not in logs.decode("utf-8"):
        time.sleep(0.1)
        logs = subprocess.check_output(cmdline)


@cli.command()
def create_initial_db():
    configure_app(os.getenv("APPLICATION_CONFIG"))

    try:
        run_sql([f"CREATE DATABASE {os.getenv('APPLICATION_DB')}"])
    except psycopg2.errors.DuplicateDatabase:
        print(
            f"The database {os.getenv('APPLICATION_DB')} already exists and will not be recreated"
        )


@cli.command()
@click.argument("filenames", nargs=-1)
def test(filenames):
    os.environ["APPLICATION_CONFIG"] = "testing"
    configure_app(os.getenv("APPLICATION_CONFIG"))

    cmdline = docker_compose_cmdline("up -d")
    subprocess.call(cmdline)

    cmdline = docker_compose_cmdline("logs db")
    wait_for_logs(cmdline, "ready to accept connections")

    run_sql([f"CREATE DATABASE {os.getenv('APPLICATION_DB')}"])

    cmdline = ["pytest", "-svv", "--cov=application", "--cov-report=term-missing"]
    cmdline.extend(filenames)
    subprocess.call(cmdline)

    cmdline = docker_compose_cmdline("down")
    subprocess.call(cmdline)


@cli.group()
def scenario():
    pass


@scenario.command()
@click.argument("name")
def up(name):
    os.environ["APPLICATION_CONFIG"] = f"scenario_{name}"
    config = os.getenv("APPLICATION_CONFIG")

    scenario_config_source_file = app_config_file("scenario")
    scenario_config_file = app_config_file(config)

    if not os.path.isfile(scenario_config_source_file):
        raise ValueError(f"File {scenario_config_source_file} doesn't exist")
    shutil.copy(scenario_config_source_file, scenario_config_file)

    scenario_docker_source_file = docker_compose_file("scenario")
    scenario_docker_file = docker_compose_file(config)

    if not os.path.isfile(scenario_docker_source_file):
        raise ValueError(f"File {scenario_docker_source_file} doesn't exist")
    shutil.copy(docker_compose_file("scenario"), scenario_docker_file)

    configure_app(f"scenario_{name}")

    cmdline = docker_compose_cmdline("up -d")
    subprocess.call(cmdline)

    cmdline = docker_compose_cmdline("logs db")
    wait_for_logs(cmdline, "ready to accept connections")

    cmdline = docker_compose_cmdline("port db 5432")
    out = subprocess.check_output(cmdline)
    port = out.decode("utf-8").replace("\n", "").split(":")[1]
    os.environ["POSTGRES_PORT"] = port

    run_sql([f"CREATE DATABASE {os.getenv('APPLICATION_DB')}"])

    scenario_module = f"scenarios.{name}"
    scenario_file = os.path.join("scenarios", f"{name}.py")
    if os.path.isfile(scenario_file):
        import importlib

        os.environ["APPLICATION_SCENARIO_NAME"] = name

        scenario = importlib.import_module(scenario_module)
        scenario.run()

    cmdline = " ".join(
        docker_compose_cmdline(
            "exec db psql -U {} -d {}".format(
                os.getenv("POSTGRES_USER"), os.getenv("APPLICATION_DB")
            )
        )
    )
    print("Your scenario is ready. If you want to open a SQL shell run")
    print(cmdline)


@scenario.command()
@click.argument("name")
def down(name):
    os.environ["APPLICATION_CONFIG"] = f"scenario_{name}"
    config = os.getenv("APPLICATION_CONFIG")

    cmdline = docker_compose_cmdline("down")
    subprocess.call(cmdline)

    scenario_config_file = app_config_file(config)
    os.remove(scenario_config_file)

    scenario_docker_file = docker_compose_file(config)
    os.remove(scenario_docker_file)


if __name__ == "__main__":
    cli()

where I added the scenario up and scenario down commands. As you can see the function up first copies the config/scenario.json and the docker/scenario.yml files (that I still have to create) into files named after the scenario.

Then I run the up -d command and wait for the database to be ready, as I already do for tests. After that, it's time to extract the port of the container with some very simple Python string processing and to initialise the correct environment variable.

Last, I import and execute the Python file containing the code of the scenario itself and print a friendly message with the command line to run psql to have a Postgres shell into the newly created database.
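Stripped of the surrounding logic, that dynamic import is just this:

import importlib

name = "users"  # the scenario name passed on the command line

# scenarios/<name>.py must define a run() function
scenario = importlib.import_module(f"scenarios.{name}")
scenario.run()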

The down function simply tears down the containers and removes the scenario configuration files.

The two missing config files are pretty simple. The docker compose configuration is

File: docker/scenario.yml

version: '3.4'

services:
  db:
    image: postgres
    environment:
      POSTGRES_DB: ${POSTGRES_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_HOSTNAME: ${POSTGRES_HOSTNAME}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    ports:
      - "5432"
  web:
    build:
      context: ${PWD}
      dockerfile: docker/web/Dockerfile
    environment:
      FLASK_ENV: ${FLASK_ENV}
      FLASK_CONFIG: ${FLASK_CONFIG}
      APPLICATION_DB: ${APPLICATION_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_HOSTNAME: ${POSTGRES_HOSTNAME}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_PORT: ${POSTGRES_PORT}
    command: flask run --host 0.0.0.0
    volumes:
      - ${PWD}:/opt/code
    ports:
      - "5000"

Here you can see that the database is ephemeral, that the port on the host is automatically assigned, and that I also spin up the application (mapping it to a random port as well to avoid clashing with the development one).

The configuration file is

File: config/scenario.json

[
  {
    "name": "FLASK_ENV",
    "value": "development"
  },
  {
    "name": "FLASK_CONFIG",
    "value": "development"
  },
  {
    "name": "POSTGRES_DB",
    "value": "postgres"
  },
  {
    "name": "POSTGRES_USER",
    "value": "postgres"
  },
  {
    "name": "POSTGRES_HOSTNAME",
    "value": "localhost"
  },
  {
    "name": "POSTGRES_PASSWORD",
    "value": "postgres"
  },
  {
    "name": "APPLICATION_DB",
    "value": "application"
  }
]

which doesn't add anything new to what I already did for development and testing.

Resources

Scenario example 1

Let's have a look at a very simple scenario that doesn't do anything on the database, just to understand the system. The code for the scenario is

File: scenarios/foo.py

import os


def run():
    print("HEY! This is scenario", os.environ["APPLICATION_SCENARIO_NAME"])

When I run the scenario I get the following output

$ ./manage.py scenario up foo
Creating network "scenario_foo_default" with the default driver
Creating scenario_foo_db_1 ... done
Creating scenario_foo_web_1 ... done
HEY! This is scenario foo
Your scenario is ready. If you want to open a SQL shell run
docker-compose -p scenario_foo -f docker/scenario_foo.yml exec db psql -U postgres -d application

The command docker ps shows that my development environment is happily running alongside with the scenario

$ docker ps
CONTAINER ID  IMAGE             COMMAND                  [...]  PORTS                    NAMES
85258892a2df  scenario_foo_web  "flask run --host 0.…"   [...]  0.0.0.0:32826->5000/tcp  scenario_foo_web_1
a031b6429e07  postgres          "docker-entrypoint.s…"   [...]  0.0.0.0:32827->5432/tcp  scenario_foo_db_1
1a449d23da01  development_web   "flask run --host 0.…"   [...]  0.0.0.0:5000->5000/tcp   development_web_1
28aa566321b5  postgres          "docker-entrypoint.s…"   [...]  0.0.0.0:5432->5432/tcp   development_db_1

And the output of the scenario up foo command contains the string HEY! This is scenario foo that was printed by the file foo.py. We can also successfully run the suggested command

$ docker-compose -p scenario_foo -f docker/scenario_foo.yml exec db psql -U postgres -d application
psql (12.3 (Debian 12.3-1.pgdg100+1))
Type "help" for help.

application=# \l
                                 List of databases
    Name     |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges
-------------+----------+----------+------------+------------+-----------------------
 application | postgres | UTF8     | en_US.utf8 | en_US.utf8 |
 postgres    | postgres | UTF8     | en_US.utf8 | en_US.utf8 |
 template0   | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
             |          |          |            |            | postgres=CTc/postgres
 template1   | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
             |          |          |            |            | postgres=CTc/postgres
(4 rows)

application=#

And inside the database we find the application database created explicitly for the scenario (the name is specified in config/scenario.json). If you don't know psql you can exit with \q or Ctrl-d.

Before tearing down the scenario have a look at the two files config/scenario_foo.json and docker/scenario_foo.yml. They are just copies of config/scenario.json and docker/scenario.yml but I think seeing them there might help to understand how the whole thing works. When you are done run ./manage.py scenario down foo.

Scenario example 2

Let's do something a bit more interesting. The new scenario is contained in scenarios/users.py

File: scenarios/users.py

from application.app import create_app
from application.models import db, User

app = create_app("development")


def run():
    with app.app_context():
        db.drop_all()
        db.create_all()

        # Administrator
        admin = User(email="admin@server.com")
        db.session.add(admin)

        # First user
        user1 = User(email="user1@server.com")
        db.session.add(user1)

        # Second user
        user2 = User(email="user2@server.com")
        db.session.add(user2)

        db.session.commit()

I decided to be as agnostic as possible in the scenarios, to avoid creating something too specific that eventually would not give me enough flexibility to test what I need. This means that the scenario has to create the app and to use the database session explicitly, as I do in this example. The application is created with the "development" configuration. Remember that this is the Flask configuration that you find in application/config.py, not the one that is in config/development.json.

I can run the scenario with

$ ./manage.py scenario up users

and then connect to the database to find my users

$ docker-compose -p scenario_users -f docker/scenario_users.yml exec db psql -U postgres -d application
psql (12.3 (Debian 12.3-1.pgdg100+1))
Type "help" for help.

application=# \dt
         List of relations
 Schema | Name  | Type  |  Owner
--------+-------+-------+----------
 public | users | table | postgres
(1 row)

application=# select * from users;
 id |      email
----+------------------
  1 | admin@server.com
  2 | user1@server.com
  3 | user2@server.com
(3 rows)

application=# \q

Step 2 - Simulating the production environment

As I stated at the very beginning of this mini series of posts, one of my goals was to run in development the same database that I run in production, and for this reason I went through the configuration steps that allowed me to have a Postgres container running both in development and during tests. In a real production scenario Postgres would probably run in a separate instance, for example on the RDS service in AWS, but as long as you have the connection parameters nothing changes in the configuration.

Docker actually allows us to easily simulate the production environment as well. Well, if our notebook were connected 24/7 we might as well host production there directly. Not that I recommend this nowadays, but this is how many important companies began many years ago, when cloud computing was not here yet. Instead of installing a LAMP stack we configure containers, but the idea doesn't change.

I will then create a configuration that simulates a production environment and then give some hints on how to translate this into a proper production infrastructure. If you want to have a clear picture of the components of a web application in production read my post Dissecting a web stack that analyses them one by one.

The first component that we have to change here is the HTTP server. In development we use Flask's development server, and the first message that server prints is WARNING: This is a development server. Do not use it in a production deployment. Got it, Flask! A good choice to replace it is Gunicorn, so first of all I add it to the requirements

File: requirements/production.txt

Flask
flask-sqlalchemy
psycopg2
flask-migrate
gunicorn

Then I need to create a docker-compose configuration for production

File: docker/production.yml

version: '3.4'

services:
  db:
    image: postgres
    environment:
      POSTGRES_DB: ${POSTGRES_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_HOSTNAME: ${POSTGRES_HOSTNAME}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    ports:
      - "${POSTGRES_PORT}:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data
  web:
    build:
      context: ${PWD}
      dockerfile: docker/web/Dockerfile.production
    environment:
      FLASK_ENV: ${FLASK_ENV}
      FLASK_CONFIG: ${FLASK_CONFIG}
      APPLICATION_DB: ${APPLICATION_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_HOSTNAME: ${POSTGRES_HOSTNAME}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_PORT: ${POSTGRES_PORT}
    command: gunicorn -w 4 -b 0.0.0.0 wsgi:app
    volumes:
      - ${PWD}:/opt/code
    ports:
      - "8000:8000"

volumes:
  pgdata:

As you can see here, the command that runs the application is slightly different, gunicorn -w 4 -b 0.0.0.0 wsgi:app. It spawns 4 worker processes (-w 4) bound to the container's address 0.0.0.0, loading the app object from the wsgi.py file (wsgi:app). As Gunicorn exposes port 8000 by default, I mapped that to the same port on the host.
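The wsgi.py module itself is not shown in this post; assuming the create_app factory used elsewhere in this series, it is typically just a couple of lines like this sketch:

# wsgi.py -- minimal sketch, assuming the create_app factory from this series
import os

from application.app import create_app

app = create_app(os.environ["FLASK_CONFIG"])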

As the production image needs to install the production requirements, I also took the opportunity to create the docker/web subdirectory and move the web Dockerfile there. Then I created the Dockerfile.production one

File: docker/web/Dockerfile.production

FROM python:3

ENV PYTHONUNBUFFERED 1

RUN mkdir /opt/code
RUN mkdir /opt/requirements

WORKDIR /opt/code

ADD requirements /opt/requirements
RUN pip install -r /opt/requirements/production.txt

Having moved the development Dockerfile into the subdirectory I also fixed the other docker-compose files to match the new configuration. I can now build the image with

$ APPLICATION_CONFIG="production" ./manage.py compose build web

The last thing I need is a configuration file

File: config/production.json

[
  {
    "name": "FLASK_ENV",
    "value": "production"
  },
  {
    "name": "FLASK_CONFIG",
    "value": "production"
  },
  {
    "name": "POSTGRES_DB",
    "value": "postgres"
  },
  {
    "name": "POSTGRES_USER",
    "value": "postgres"
  },
  {
    "name": "POSTGRES_HOSTNAME",
    "value": "localhost"
  },
  {
    "name": "POSTGRES_PORT",
    "value": "5432"
  },
  {
    "name": "POSTGRES_PASSWORD",
    "value": "postgres"
  },
  {
    "name": "APPLICATION_DB",
    "value": "application"
  }
]

As you can notice, this is not very different from the development one, as I just changed the values of FLASK_ENV and FLASK_CONFIG. Clearly this contains a secret that shouldn't be written in plain text, POSTGRES_PASSWORD, but after all this is a simulation of production. In a real environment secrets should be kept in an encrypted manager such as AWS Secrets Manager.

Remember that FLASK_ENV changes the internal settings of Flask, most notably disabling the debugger, and the FLASK_CONFIG=production loads the ProductionConfig object from application/config.py. That object is empty for the moment, but it might contain public configuration for the production server.

Resources

Step 3 - Scale up

Mapping the container port to the host is not a great idea, though, as it makes it impossible to scale up and down to serve more load, which is the main point of running containers in production. This might be solved in many ways in the cloud; for example in AWS you might run the containers in AWS Fargate and register them with an Application Load Balancer. Another way to do it on a single host is to run a web server in front of your HTTP server, and this can be easily implemented with docker-compose.

I will add nginx and serve HTTP from there, reverse proxying the application containers through docker-compose networking. First of all the new configuration for docker-compose

File: docker/production.yml

version: '3.4'

services:
  db:
    image: postgres
    environment:
      POSTGRES_DB: ${POSTGRES_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_HOSTNAME: ${POSTGRES_HOSTNAME}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    ports:
      - "${POSTGRES_PORT}:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data
  web:
    build:
      context: ${PWD}
      dockerfile: docker/web/Dockerfile.production
    environment:
      FLASK_ENV: ${FLASK_ENV}
      FLASK_CONFIG: ${FLASK_CONFIG}
      APPLICATION_DB: ${APPLICATION_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_HOSTNAME: ${POSTGRES_HOSTNAME}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_PORT: ${POSTGRES_PORT}
    command: gunicorn -w 4 -b 0.0.0.0 wsgi:app
    volumes:
      - ${PWD}:/opt/code
  nginx:
    image: nginx
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
    ports:
      - 8080:8080

volumes:
  pgdata:

As you can see I added a service nginx that runs the default Nginx image, mapping a custom configuration file that I will create in a minute. The application container doesn't need any port mapping, as I won't access it directly from the host anymore. The Nginx configuration file is

File: docker/nginx/nginx.conf

worker_processes 1;

events {
    worker_connections 1024;
}

http {
    sendfile on;

    upstream app {
        server web:8000;
    }

    server {
        listen 8080;

        location / {
            proxy_pass http://app;
            proxy_redirect off;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Host $server_name;
        }
    }
}

This is a pretty standard configuration, and in a real production environment I would add many other configuration values (most notably serving HTTPS instead of HTTP). The upstream section leverages docker-compose networking referring to web, which in the internal DNS directly maps to the IPs of the service with the same name. The port 8000 comes from the default Gunicorn port that I already mentioned before. I won't run the nginx container as root on my notebook, so I will expose port 8080 instead of the traditional 80 for HTTP, and this is also something that would be different in a real production environment.

I can at this point run

$ APPLICATION_CONFIG="production" ./manage.py compose up -d Starting production_db_1 ... done Starting production_nginx_1 ... done Starting production_web_1 ... done

It's interesting to have a look at the logs of the nginx container, as Nginx by default prints all the incoming requests

$ APPLICATION_CONFIG="production" ./manage.py compose logs -f nginx Attaching to production_nginx_1 [...] nginx_1 | 172.30.0.1 - - [05/Jul/2020:10:40:44 +0000] "GET / HTTP/1.1" 200 13 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0"

The last line is what I get when I visit localhost:8080 while the production setup is up and running.

Scaling up and down the service is now a breeze

$ APPLICATION_CONFIG="production" ./manage.py compose up -d --scale web=3 production_db_1 is up-to-date Starting production_web_1 ... Starting production_web_1 ... done Creating production_web_2 ... done Creating production_web_3 ... done Resources Bonus step - A closer look at Docker networking

I mentioned that docker-compose creates a connection between services, and used that in the configuration of the nginx container, but I understand that this might look like black magic to some people. While I believe that this is actually black magic, I also think that we can investigate it a bit, so let's open the grimoire and reveal (some of) the dark secrets of Docker networking.

While the production setup is running we can connect to the nginx container and see what is happening in real time, so first of all I run a bash shell on it

$ APPLICATION_CONFIG="production" ./manage.py compose exec nginx bash

Once inside I can see my configuration file at /etc/nginx/nginx.conf, but this has not changed. Remember that Docker networking doesn't work as a templating engine, but with a local DNS. This means that if we try to resolve web from inside the container we should see multiple IPs. The command dig is a good tool to investigate the DNS, but it doesn't come preinstalled in the nginx container, so I need to run

root@33cbaea369be:/# apt update && apt install dnsutils

and at this point I can run it

root@33cbaea369be:/# dig web

; <<>> DiG 9.11.5-P4-5.1+deb10u1-Debian <<>> web
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30539
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;web.               IN  A

;; ANSWER SECTION:
web.            600 IN  A   172.30.0.4
web.            600 IN  A   172.30.0.6
web.            600 IN  A   172.30.0.5

;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Sun Jul 05 10:58:18 UTC 2020
;; MSG SIZE  rcvd: 78

root@33cbaea369be:/#

The command outputs 3 IPs, which correspond to the 3 containers of the web service that I am currently running. If I scale down (from outside the container)

$ APPLICATION_CONFIG="production" ./manage.py compose up -d --scale web=1

then the output of dig becomes

root@33cbaea369be:/# dig web

; <<>> DiG 9.11.5-P4-5.1+deb10u1-Debian <<>> web
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 13146
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;web.               IN  A

;; ANSWER SECTION:
web.            600 IN  A   172.30.0.4

;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Sun Jul 05 11:01:46 UTC 2020
;; MSG SIZE  rcvd: 40

root@33cbaea369be:/#

How to create the production infrastructure

This will be a very short section, as creating infrastructure and deploying in production are complex topics, so I want to just give some hints to stimulate your research.

AWS ECS is basically Docker in the cloud, and the whole structure can map almost 1 to 1 to the docker-compose setup, so it is worth learning. ECS can work on explicit EC2 instances that you manage, or in Fargate, which means that the EC2 instances running the containers are transparently managed by AWS itself.

Terraform is a good tool to create infrastructure. It has many limitations, mostly coming from its custom HCL language, but it's slowly becoming better (version 0.13 will finally allow us to run for loops on modules, for example). Despite its shortcomings, it's a great tool to create static infrastructure, so I recommend working on it.

Terraform is not the right tool to deploy your code, though, as that requires a dynamic interaction with the system, so you need to setup a good Continuous Integration system. Jenkins is a very well known open source CI, but I personally ended up dropping it because it doesn't seem to be designed for large scale systems. For example, it is very complicated to automate the deploy of a Jenkins server, and dynamic large scale systems should require zero manual intervention to be created. Anyway, Jenkins is a good tool to start with, but you might want to have a look at other products like CircleCI or Buildkite.

When you create your deploy pipeline you need to do much more than just creating the image and running it, at least for real applications. You need to decide when to apply database migrations and if you have a web front-end you will also need to compile and install the JavaScript assets. Since you don't want to have downtime when you deploy you will need to look into blue/green deployments, and in general to strategies that allow you to run different versions of the application at the same time, at least for short periods of time. Or for longer periods, if you want to perform A/B testing or zonal deployments.

Final words

This is the last post of this short series. I hope you learned something useful, and that it encouraged you to properly set up your projects and to investigate technologies like Docker. As always, feel free to send me feedback or questions, and if you find my posts useful please share them with whoever you think might be interested.

Feedback

Feel free to reach me on Twitter if you have questions. The GitHub issues page is the best place to submit corrections.

Categories: FLOSS Project Planets

Google Summer of Code 2020 - Week 4

Planet KDE - Tue, 2020-07-07 08:00

According to my GSoC proposal, I should be done with the general purpose graph layout capabilities for Rocs and free to start working on layout algorithms specifically designed to draw trees. This is not the case for a series of reasons, including my decision to write my own implementation of a general purpose force-based graph layout algorithm and my failure to anticipate the need for non-functional tests to evaluate the quality of the layouts. I still need to document the functionalities of the plugin and improve the code documentation as well. Besides that, although it is not present in my original proposal, I really want to include the layout algorithm presented in [1], because I have high expectations about the quality of the layouts it can produce.

Even though I am not done with this first part, I decided to start working on the layout algorithms for trees. Now that I am more used to the code base and to the processes used by KDE, I expect to be more productive and finish everything on time.

The main motivation for implementing layout algorithms specifically designed for trees is the possibility of exploiting the properties of trees to come up with better layouts. Yesterday, I started studying more about layout algorithms for trees. Most of them are based on a subset of the following ideas:

  • Select a root node for the tree. This node can be provided by the user or selected automatically.
  • Partition the nodes by their depth in the rooted tree and draw all nodes with the same depth at the same layer (usually a line or a circle).
  • Draw the sub-trees in a recursive way and compose the resulting drawings to get the complete layout.

All of these ideas are related to the structure of trees. In order to improve my intuition about them, I wrote an experimental implementation based on the first two ideas above. The application of this implementation to a tree with 15 nodes generated the following result:

By taking advantage of the properties of trees, even simple solutions such as my one-day experimental implementation can guarantee some desirable layout properties that the general purpose force-based layout algorithm can not. For instance, it guarantees that there are no intersections between edges or between nodes. The force-based layout algorithm I implemented can generate layouts with pairs of edges that intersect even when applied to trees.
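To make the layered idea concrete, here is a rough Python sketch of a layout based on the first two points above (select a root, then place all nodes of the same depth on the same horizontal line). This is only an illustration and not the code used in Rocs; the spacing parameters and the dictionary-based tree representation are arbitrary choices made for the example.

from collections import defaultdict

def layered_tree_layout(children, root, layer_gap=1.0, node_gap=1.0):
    """children maps each node to a list of its children; returns {node: (x, y)}."""
    layers = defaultdict(list)

    def visit(node, depth):
        layers[depth].append(node)
        for child in children.get(node, []):
            visit(child, depth + 1)

    visit(root, 0)

    positions = {}
    for depth, nodes in layers.items():
        width = (len(nodes) - 1) * node_gap
        for i, node in enumerate(nodes):
            # Center each layer horizontally; deeper layers are placed lower.
            positions[node] = (i * node_gap - width / 2.0, -depth * layer_gap)
    return positions

# Example: a small tree rooted at 'a'.
tree = {"a": ["b", "c"], "b": ["d", "e"], "c": ["f"]}
print(layered_tree_layout(tree, "a"))

Note that this naive placement can still put a parent far from its children horizontally; composing recursively drawn sub-trees (the third idea) is one way to address that.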

[1] R. Davidson and D. Harel. Drawing graphs nicely using simulated annealing. ACM Transactions on Graphics, 15(4):301–331, 1996.

Categories: FLOSS Project Planets

Specbee: Stop Spam! How to use the Captcha and ReCaptcha module in Drupal 8

Planet Drupal - Tue, 2020-07-07 06:58
Stop Spam! How to use the Captcha and ReCaptcha module in Drupal 8 | Suresh Prabhu | 07 Jul, 2020

Have you had enough of spam comments, form submissions and email submissions by bots trying to infiltrate your website? Then you need a guard called the Completely Automated Public Turing test to tell Computers and Humans Apart, or CAPTCHA for short. As annoying as it may be to prove time and again that we are not bots, Captcha and ReCaptcha are among the most effective defences against automated programs trying to get into our websites. The Captcha module and ReCaptcha module in Drupal 8 are extremely helpful in protecting your Drupal website against spambots and are widely used in user web forms and other regions of a web page where user input is required. Let’s learn more about the modules and how to implement them in your Drupal 8 website.

What is Captcha and ReCaptcha?

When we try to login to or register on a website, we are sometimes asked to identify and type the distorted numbers and letters into a provided box. This is the Captcha system. Captcha helps us to verify whether the visitor of your site is an actual human or a bot. ReCaptcha does the same in terms of protecting your website from spam except that it makes it tougher for spambots and more user friendly for humans.

How to use the Captcha module in Drupal 8

The Captcha module in Drupal 8 is an easy to use module largely used in forms to identify if the user is a human or a bot. The Captcha module is also compatible with Drupal 9. Let’s get started with installing and using the Captcha module in Drupal 8.

Download and Enable the captcha module

Download the Captcha module from https://www.drupal.org/project/captcha and enable it. To enable the module, go to Extend, and in the Spam control category you will find the CAPTCHA option. Click on the checkbox and then click Install.

 

 

Enable both Captcha and Image Captcha. Image captcha provides an image-based captcha.

Configure the Captcha module

After installing the module, we must configure the module as per our requirements.

To configure the module, go to Configuration > People > CAPTCHA module settings.

Select the Default challenge type. This type is used on all the forms, but you can change it for an individual form. The module has two built-in types:

  • Math : This will provide a simple math problem to the user to solve. 

  • Image : It provides an image of a word that can’t be read by bots.

The example of this type is given in the CAPTCHA examples tab on the same page.

To change the type for an individual form, go to the Form Settings tab on the same page. Here we can see the list of forms on the site. Click on the Enable button to enable the captcha for a form.

To change the challenge type to a particular form, click on the down-arrow and click edit.

Enter the form ID for which you want to change the challenge type, then pick the new type from the dropdown provided under Challenge type.

This is not required unless the structure of the form does not change.

Adding the description to the Captcha for the visitor

Click on the checkbox to show the Challenge description box; it is not visible by default. Once enabled, a default description is already filled in. This description is editable and can display any message of your choice to the visitor.

Set validation and persistence

These are some features related to the validation of the captcha. Here we can make the validation stricter by requiring case-sensitive answers, and we can also change the appearance of the challenges. The second option under Persistence makes the process simpler for the visitor by hiding the challenge once the visitor is logged in and has successfully completed a challenge.

Permissions

The captcha can be controlled by giving permissions.

Only users with the Administer CAPTCHA settings permission can change the captcha settings. Those who have the Skip CAPTCHA permission are not given any challenge. To test the captcha, the user should not have the Skip CAPTCHA permission; administrators cannot test it as they have this permission by default.

  ReCaptcha module in Drupal 8

The Captcha module works as required, but it has some drawbacks. It is not user-friendly for visitors with visual disabilities, and reading distorted numbers and letters can be annoying to regular users. In the worst case, a legitimate user may not manage to get into the site at all.

The solution for this problem is the ReCaptcha module. ReCaptcha module uses Google reCAPTCHA to improve the captcha system.

Download and Enable the ReCaptcha module

Download the ReCaptcha module from https://www.drupal.org/project/recaptcha and enable it.

Configure the module

After installing the module, go to Configuration > People > CAPTCHA Module Settings.

Select ReCaptcha in Default challenge type and click save configuration. After saving, go to ReCaptcha tab on the same page.

As ReCaptcha uses the Google reCAPTCHA service, a site key and a secret key are required to use the ReCaptcha module. These keys are provided by Google once we register our site with Google reCAPTCHA. To register, click on the register for reCAPTCHA link.

Once we click on it, we will see a registration form. We have to provide some information, such as the domain name and the type of reCAPTCHA, and accept the Terms of Service before clicking Submit. After submission, you will get the site key and secret key. Enter them in the reCAPTCHA tab.

Choose the forms on which you would like to use ReCaptcha, and then test them.

 

If you want to test it in a local environment, disable domain name validation in the reCAPTCHA configuration on Google's side.

 

Captchas are used on almost all sites these days. They are an efficient way to reduce spam and fight bots: they distinguish between humans and robots and protect systems and applications from automated attacks. Captchas are now being replaced by ReCaptcha, which is easier for humans to solve but harder for bots. Drupal 8 has tons of functional modules like Captcha and ReCaptcha to help secure your website from these spambots. Specbee is a Drupal development company with years of experience in leveraging the best of Drupal to build compelling websites. Have a requirement for your next Drupal project? Feel free to connect with us!


Categories: FLOSS Project Planets

ComputerMinds.co.uk: Recovering deleted content

Planet Drupal - Tue, 2020-07-07 05:00

Drupal 7 introduced the brilliant feature of letting users cancel their own account and with it various options for what to do with content they've created when they are cancelled. One of these options is to:

Delete the account and its content.

Which can prove somewhat problematic if used incorrectly.

You see, Drupal is very good at the latter part: deleting all the content created by the user. It's not very good at warning someone that they are about to delete potentially a lot of important content.

The scenario

Let me set the scene for you. Someone had an account on a Drupal site and did a lot of work, making pages etc. Then they left the organisation. Someone else comes along and after a while thinks: I should clean up all these old user accounts and delete them, we don't need them any more.
Unfortunately they use the aforementioned Delete the account and its content option.

A few days pass and then they notice that the cookie policy page has gone missing. And they are sure that the FAQ section had more than 3 questions in it.
Oh dear.

They now face a serious problem. They have two 'easy' options to resolve it:

  1. Restore a database backup from before they deleted the user to recover all the lost content.
  2. Attempt to manually re-create all the content that was deleted.

However, they've been using the site in the interim and have changed lots of content, and so have other users of the site. They can't simply restore a database backup from before all the content was deleted because they'd lose all the changes since then. They also size up the volume of missing content and simply aren't sure what has gone, but know that it's hundreds of pages. Also, references between content have been broken: content that still exists on the site is trying to reference content that isn't there. So now not only do they need to re-create content, they also have to go around fixing all the other site content that references it. Oh my.

The third option

There is another way:

  1. Automatically re-create all the content that was deleted.

But how?

If you've got a decent backup from before the deletion happened then you contact your friendly ComputerMinds and we'll help you out by following something along the lines of the below. If you don't have a decent backup, then you're toast: Learn your lesson and start making backups of your data that you can restore from!

But you've got that backup, right? Ideally from as close as possible to, but not after, the account and content being deleted. So let's see what you/we do with it:

We're going to repeat the deletion and work out how to put it all back.

Begin by restoring the code, files and database from your backup to a development machine.
Load up the site in your browser and get ready to perform the exact same operation that caused the problem in the first place, but don't perform it yet!

Now, identify tables that contain changes that you don't really care about, the more the merrier. I'm thinking the watchdog table, any cache_* tables etc. You might need expert knowledge of the site to make this list as long as possible, it'll help later because you can really cut down the amount of noise and work you'll have to do later.

Once you've done that you want to make a 'pre-delete' database dump. Something like this:

drush sql-dump --structure-tables-list='sessions,cache,watchdog' > pre-delete.sql

Now, go back to your browser and cancel the account in the same way that was done before, so: Delete the account and its content.

Once the deletion has happened we want to run the same drush command as before, but save the results to another file.

drush sql-dump --structure-tables-list='sessions,cache,watchdog' > post-delete.sql

Now we essentially have two database snapshots; the difference between them is all the content that was deleted. So we'll aim to produce a set of SQL queries to restore all of that to the production database.

I had very mixed results with trying to get two MySQL dump files that would diff easily in a way that would leave the correct INSERT/UPDATE statements to put all the content back. Comparing the two dump files pre-delete.sql and post-delete.sql directly just didn't seem to work.

Percona to the rescue!

There's a tool in the Percona suite called pt-table-sync that will diff two databases and produce a set of SQL statements that would make the data consistent between the two, i.e. the SQL 'diff'.

There's a final wrinkle that means that you actually need another database server at this point, because pt-table-sync can only sync from one server to another, not between two databases on the same server. However, in the age of Vagrant or Docker getting multiple MySQL servers running on your machine is no big issue. I'm going to suppose you have two database servers running on ports 3306 and 3307 on your local machine.

Restore each of the SQL dump files from before to an identically named database on the servers respectively. Then you can get pt-table-sync to produce the magic:

pt-table-sync --print --databases=db_name h=127.0.0.1,P=3306 h=127.0.0.1,P=3307 > content-restore.sql

To make the diff go the right 'way' make sure the server with the post-delete.sql file is listed first in the command line. And you may need to adjust the command to get it to connect to your servers correctly.

Once you've done that content-restore.sql should contain a set of SQL commands that you could run on the production server to restore all the deleted content. However, I'd recommend doing one final manual look through the file and making sure that nothing is going to run against tables that don't really matter or that can't be recovered in other ways.
It's a text file so review it line by line and understand what each line is going to do and make sure they are the expected changes!

Once you've done all that you can execute the content-restore.sql file on your production server and that should restore everything that was deleted from the database!

Wrap up

So we've done this twice now, for different clients. We were happy that we were able to recover their content and not force them to either lose all other changes made or have to re-create a lot of pages.
We learnt so much the first time we did this that the second time it was actually a fairly smooth process that didn't take very long at all, despite having to restore thousands of pieces of content. We've also taken steps to stop people from using this particularly dangerous option when cancelling a user's account.

Obviously all of the above relies on having backups of your database, and being able to retrieve a point-in-time, not just the 'latest' one. If you don't have this in place already, go now and get that sorted!
If you have backups, maybe bookmark this page so that if you ever need to recover a large amount of accidentally deleted content you'll know a (fairly) easy way that works well.

Categories: FLOSS Project Planets

PSF GSoC students blogs: Weekly Blog Post | Gsoc'2020 | #6

Planet Python - Tue, 2020-07-07 03:38


Greeting, People of the world!

Our first evaluation took place last week and guess what? I passed it!!! Well, I am pretty excited and happy about it. 

 

1. What did you do this week?

I built the modal for colour customization of a single icon and made small edits here and there. Now it works perfectly.

 

2. What is coming up next?

This week, I will be implementing logic for icon rotation in the icons picker API and also adding buttons for rotation to the icon editing modal.

 

3. Did you get stuck anywhere?

Yes, a little, when a few icons were not getting customized. It turned out to be because their SVG files contain a style tag. This issue was resolved with the help of my mentors.

Categories: FLOSS Project Planets

Cantor - Plots handling improvments

Planet KDE - Tue, 2020-07-07 03:22
Hello everyone,

this is the third post about the progress in my GSoC project and I want to present new changes in the handling of the external packages in Cantor.
The biggest changes done recently happened for Python. We now properly support integrated plots created with matplotlib.
Cantor intercepts the creation of plots and embeds the result into its worksheet.
This also works if multiple plots are created in one step; the order of the plots is preserved.
Text results between plots are supported as well.

Besides matplotlib, we also properly handle Plot.ly - another popular graphing library for Python and R. This package has some requirements
that have to be fulfilled first. The user is notified about these requirements in case they are not fulfilled.

Similar implementations were also done for Julia and Octave, but to a smaller extent.
Though many preparatory changes were made in the code for this, the only visible result for the user
at the moment is the new messages about unfulfilled requirements of graphing packages.
Especially for Julia this is important now, since the graphing package GR was hard-coded in the past and there was no notification to the user
if this package was not installed, so it was not immediately clear why the creation of plots failed.
With these improvements Cantor is taking the next steps towards becoming more user friendly.

There is another important change - the settings for graphing packages become dynamic.
The user can now change them on the fly without having to restart the session.


Also, the plot menu was extended. Julia and Python can now produce code for multiple packages; the preferred package can be chosen in the settings.


In the next post I plan to show how the usability of Cantor panels is going to be improved.
Categories: FLOSS Project Planets

Gunnar Wolf: Raspberry Pi 4, now running your favorite distribution!

Planet Debian - Tue, 2020-07-07 03:00

Great news, great news! New images available! Grab them while they are hot!

With lots of help (say, all of the heavy lifting) from the Debian Raspberry Pi Maintainer Team, we have finally managed to provide support for auto-building and serving bootable minimal Debian images for the Raspberry Pi 4 family of single-board, cheap, small, hacker-friendly computers!

The Raspberry Pi 4 was released close to a year ago, and is a very major bump in the Raspberry lineup; it took us this long because we needed to wait until all of the relevant bits entered Debian (mostly the kernel bits). The images are shipping a kernel from our Unstable branch (currently, 5.7.0-2), and are less tested and more likely to break than our regular, clean-Stable images. Nevertheless, we do expect them to be useful for many hackers –and even end-users– throughout the world.

The images we are generating are very minimal: they carry basically a bare Debian install. Once downloaded, of course, you can install whatever your heart desires (because… Face it, if your heart desires it, it must be free and of high quality. It must already be in Debian!)

Oh, and very important: if you get the 8GB model (currently the top-of-the-line RPi4), it will still not have USB support, due to a change in its memory layout (that means, no local keyboard/mouse ☹). We are working on getting it ironed out!

Categories: FLOSS Project Planets
