FLOSS Project Planets

Nikola: New feature in Nikola: Sections

Planet Python - Thu, 2015-09-03 10:01

This post is reproduced with permission from the author. See it on the original site.

Sections are like newspaper sections: they let you group related content together in a collection. Every post in a section appears under a common name and folder/address, and can optionally use distinct styling. Sections also have their own landing pages containing an index of all their posts, and their own syndication feeds. With sections and post collections, you can diversify your Nikola blog by writing on different topics all on the same website. Readers who are only interested in one subsection of the content you publish can subscribe to the feeds of the section or sections that interest them the most.

In Nikola, sections are normally built automatically based on the output folders specified in the POSTS option. Each output folder is a new section. The index pages and feeds for each section are output in the same directory as its posts. Alternatively, sections can be assigned using a section property in each post’s metadata. Note that this will not change the output folder or address of a post, so you lose some of the uniformity you get when posts include their section name as part of their address.

The following configuration example demonstrates how three sections on different topics are created. The first argument is the source path where the posts are stored, the second argument is the output folder and section name, and the third argument is the template to use for each section. Posts can use the same template, but you may want to customize the template for each section: bigger hero images for your food section, say, or a star rating system and different HTML markup for your reviews.

POSTS = (
    ('posts/blog/*.md', 'blog', 'post.tmpl'),
    ('posts/food/*.md', 'food', 'post_recipe.tmpl'),
    ('posts/review/*.md', 'review', 'post_reviews.tmpl'),
)

Posts cannot be added to multiple sections, as this might create duplicate pages with different addresses. Duplicate pages are something you will want to avoid in most cases. If you really want a post to appear in multiple sections, you’re looking for Nikola’s tags or categories functionality.

Some customizations I’ve made to my own templates after reorganizing to use sections:

  • Display the name and color of the section a post belongs to on the front page.
  • Display a link to the syndication feed for each section, as well as the everything-feed, at the top of each section and of each post belonging to that section.
  • Breadcrumb navigation from posts to their sections and from the sections to the front page. This encourages visitors to your site to find more content from the same section.

Additionally, each section, and every post in that section, is automatically assigned a color created by shifting the hue of the site’s THEME_COLOR option in the HUSL color space. This creates a visually similar color that can be used to style the posts and the sections in a uniform way, while still giving each section a distinct look. The color can be called from a theme using post.section_color() and used in inline styles or a style element. The color manipulation functions can also be accessed directly in theme templates, allowing you to shift the hue, saturation, or lightness of a given color. For example, a lighter version of a section’s color can be retrieved using color_hsl_adjust_hex(post.section_color(), adjust_l=0.05).
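To make the color shifting concrete, here is a minimal Python sketch of the idea. Nikola does this in the HUSL color space; the sketch below approximates it with the standard library’s HLS support, and shift_hue_hex is an invented name for illustration, not Nikola’s actual API:

import colorsys

def shift_hue_hex(hex_color, degrees):
    # Rotate the hue of an '#rrggbb' color by the given number of degrees.
    r, g, b = (int(hex_color[i:i + 2], 16) / 255.0 for i in (1, 3, 5))
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    h = (h + degrees / 360.0) % 1.0
    r, g, b = colorsys.hls_to_rgb(h, l, s)
    return '#%02x%02x%02x' % tuple(int(round(c * 255)) for c in (r, g, b))

print(shift_hue_hex('#5670d4', 30))  # a visually related color for a section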

The options for controlling the behavior of sections are documented in more detail in conf.py and include the following (an illustrative snippet follows the list):

  • POSTS_SECTIONS for enabling or disabling sections (on by default)
  • POSTS_SECTION_ARE_INDEXES for choosing between full index pages and plain post lists for the section landing pages
  • POSTS_SECTION_COLOR for manually assigning colors to sections rather than auto-generated colors from THEME_COLOR
  • POSTS_SECTION_NAME for naming sections separately from their output folders
  • POSTS_SECTION_TITLE for controlling the title of the section indexes
  • POSTS_SECTION_DESCRIPTION for giving each section a description
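As a rough sketch, a conf.py fragment using these options could look like the one below. The option names come from the list above; the values, and the assumption that the per-section options are dictionaries keyed by section slug, are mine, so check the comments in your own conf.py for the exact formats:

POSTS_SECTIONS = True
POSTS_SECTION_ARE_INDEXES = True
# Assumed format: dictionaries keyed by section slug (illustrative values).
POSTS_SECTION_COLOR = {'food': '#d58453'}
POSTS_SECTION_NAME = {'review': 'Reviews'}
POSTS_SECTION_TITLE = {'food': 'Food and Recipes'}
POSTS_SECTION_DESCRIPTION = {'food': 'Recipes, restaurant notes and other food writing.'}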

There is currently no way of generating a list of all sections. A site is not expected to need more than three to twelve sections at most.

Sections will be available in Nikola version 7.7.0, due out later this week.

Drop by the Nikola mailing list or chat room if you’ve built something cool with sections or just need a little help.

What on Earth is “Nikola,” anyway?

Nikola is a static site generator built in Python. What that means is that it can turn a collection of text files into a beautiful website using templates and a collection of ready-made themes. This website (even this very page!) was built using Nikola. Learn more at the Nikola website.

I’ve contributed to the development of Nikola for the last two years — the new sectioning system only in the last week — and I’m really happy with how Nikola works, the community, and especially how it has helped me build a great website that I’m really proud of.

Categories: FLOSS Project Planets

InternetDevels: InternetDevels on Clutch’s Top Ukrainian Developers List

Planet Drupal - Thu, 2015-09-03 10:01

Hey Drupalists! Any news – fine or great? Because we cannot restrain our emotions any longer: Clutch, an American firm that researches the world’s IT industry, has ranked us as one of the 12 Top Web and Software Developers in Ukraine. Not bad, yes? Especially taking into account that we are the only Drupal developer on the list.

Read more
Categories: FLOSS Project Planets

Updating the Shop!

Planet KDE - Thu, 2015-09-03 09:39

Today we finally updated the offerings in the Krita webshop. The Comics with Krita training DVD by Timothée Giet is available as a free download using BitTorrent, but if you want to support Krita development, you can now download it directly for just €9,95. It’s still a really valuable resource, discussing not just Krita’s user interface, but also the technical details of creating comic book panels and even going to print.

We’ve also now got the USB sticks that were rewards in last year’s kickstarter for sale. By default, you get Comics with Krita, the Muses DVD and Krita 2.9.2 for Windows and OSX, as well as some brushes and other resources. That’s €34,95. For five euros more, I’ll put the latest Krita builds and the latest brush packs on it before sending it out. That’s a manual process at the moment since we release so often that it’s impossible to order the USB sticks from the supplier with the right version pre-loaded!

Because you can now get the Muses DVD, Comics with Krita and Krita itself on a USB Stick, we’ve reduced the price of the Muses DVD to €24,95! You can select either the download or the physical DVD, the price is the same.

And check out the very nice black tote bags and cool mugs as well!

All prices include shipping; V.A.T. is added only in the Netherlands.

Categories: FLOSS Project Planets

Acquia Developer Center Blog: Why Drupal, Why Now? An Agency CTO’s Perspective

Planet Drupal - Thu, 2015-09-03 09:22
Corey Caplette

Reason #1: Drupal is Technology Ready

For the past 15 years Velir has been customizing and implementing top-tier content management systems. Given our focus on creating digital solutions, content management is at the core of almost every project we take on.

We’re veterans in the CMS space, but up until recently we exclusively worked with commercial CMS platforms. As the CTO of my agency, I wanted to explain why we’ve now decided to take the leap with Drupal.  

Tags: acquia drupal planet
Categories: FLOSS Project Planets

.VDMi/Blog: Pane Field: World domination with ctools panes inside your content

Planet Drupal - Thu, 2015-09-03 08:29
If you're building flexible content with something like Paragraphs and build your sites from an editor's perspective, you will really like this one. Pane Field brings the ease of ctools panes to entity fields. With this module you can use ctools panes inside your content!

Pane Field is a new Drupal module by VDMi.
Most of the time we will give our users 3 Paragraphs types:

  • Text (just a simple WYSIWYG and a title field)
  • Text with image (simple WYSIWYG, title field, image field (media/scald/image), image position)
  • Image (media/scald/image)

Over time we learned that when content editors discover such flexible content, they immediately start thinking about exceptions. You’ll probably get some questions about “special” content like animations, videos and call-to-actions. That's where Pane Field comes in!

While most of these exceptions can be implemented in Paragraphs, it becomes a problem when you have 10 different animations or 10 different call-to-actions and the list keeps growing. That’s why we thought of another way to keep the ease of content editing but gain the flexibility of ctools panes. This gives you, the developer, a very easy way to quickly create new pieces of content that the editor can use.

Ctools panes (also known as content types) are simple ctools plugins that basically define what they are, how they are configured and how they should be rendered. A definition is placed inside the same file that takes care of configuration/rendering. The definition will look something like this:

<?php
/**
 * Plugins are described by creating a $plugin array which will be used
 * by the system that includes this file.
 */
$plugin = array(
  'title' => t('Buy our product'),
  'description' => t('Buy our product and get another one for free call to action.'),
  'category' => t('Call to Actions'),
  'edit form' => 'shop_cta_buy_our_product_edit_form',
  'render callback' => 'shop_cta_buy_our_product_render',
  'admin info' => 'shop_cta_buy_our_product_admin_info',
  'required context' => new ctools_context_required(t('Node'), 'node'),
  'defaults' => array(
    'title' => 'Improve your daily workflow with our turboblender',
    'subtitle' => 'Buy now and get one free!',
    'button' => 'Buy now!',
    'button_link' => 'order',
  ),
);

The callbacks for a simple pane will look like this; they live in the same file as the definition:

/**
 * 'Edit form' callback for the content type.
 */
function shop_cta_buy_our_product_edit_form($form, &$form_state) {
  $conf = $form_state['conf'];
  $form['title'] = array(
    '#title' => t('Title'),
    '#type' => 'textfield',
    '#default_value' => $conf['title'],
  );
  $form['subtitle'] = array(
    '#title' => t('Subtitle'),
    '#type' => 'textfield',
    '#default_value' => $conf['subtitle'],
    '#maxlength' => 2048,
  );
  $form['button'] = array(
    '#title' => t('Button'),
    '#type' => 'textfield',
    '#default_value' => $conf['button'],
  );
  $form['button_link'] = array(
    '#title' => t('Button Link'),
    '#type' => 'textfield',
    '#default_value' => $conf['button_link'],
  );
  return $form;
}

function shop_cta_buy_our_product_admin_info($subtype, $conf, $context) {
  $output = new stdClass();
  $output->title = '';
  $output->content = '<strong>' . t('Title') . ':</strong> ' . check_plain($conf['title']) . '<br />';
  $output->content .= '<strong>' . t('Subtitle') . ':</strong> ' . check_plain($conf['subtitle']) . '<br />';
  $output->content .= '<strong>' . t('Button') . ':</strong> ' . check_plain($conf['button']) . '<br />';
  $output->content .= '<strong>' . t('Button Link') . ':</strong> ' . check_plain($conf['button_link']) . '<br />';
  return $output;
}

function shop_cta_buy_our_product_render($subtype, $conf, $panel_args, $context) {
  $node = $context->data;
  $block = new stdClass();
  $block->title = '';
  $content = array(
    '#theme' => 'shop_cta_buy_our_product',
    '#path' => drupal_get_path('module', 'shop_cta'),
    '#title' => $conf['title'],
    '#subtitle' => $conf['subtitle'],
    '#button' => $conf['button'],
    '#button_link' => $conf['button_link'],
  );
  $block->content = $content;
  return $block;
}

And there you go, you created a new piece of custom content for your website. In the theme function you can use a template that has some custom HTML. Now you can enable the pane in the instance settings of the pane field:

As you can see, your new pane is available; all the default ctools panes are also available.

When you edit content, you can now add a new Paragraph, select the paragraph type with the pane field and then select the new ctools pane:

You’re probably wondering “WHY?” at this moment.

Pane Field gives you extreme flexibility in your content without the hassle of creating a new Paragraphs type, adding fields to it, adding preprocess functions and creating a bundle template. It allows you to create exceptions to your base types much faster. It is also faster (in CPU time) than using a new Paragraphs type, because it’s not a new entity; it’s just a field value with a reference to the ctools pane and some configuration values.

One big note: because the pane configuration is simply stored serialized in the field value, you can’t use it in things like Views or Search API (you can render the field and index that, though).

Here is a more comprehensive article about how to write content panes.

Categories: FLOSS Project Planets

Reinout van Rees: Automation for better behaviour

Planet Python - Thu, 2015-09-03 07:31

Now... that's a provocative title! In a sense, it is intended that way. Some behaviour is better than other behaviour. A value judgment! In the Netherlands, where I live, value judgments are suspect. If you have a comment on someone's behaviour, a common question is whether you're "better" than them. If you have a judgment, you apparently automatically think you've got a higher social status or a higher moral standing. And that's bad, apparently.

Well, I despise such thinking :-)

Absolute values

I think there are absolutes you can refer to, that you can compare to. Lofty goals you can try to accomplish. Obvious truths (which can theoretically be wrong...) that are recognized by many.

Nihilism is fine, btw. If you're a pure nihilist: I can respect that. It is an internally-logical viewpoint. Only you shouldn't complain if some other nihilist cuts you to pieces if that suits his purely-individual nihilistic purposes.

So for practical purposes I'm going to assume there's some higher goal/law/purpose/whatever that we should attain.

Take programming in python. PEP 8, python's official style guide, is recognized by most python programmers as the style guide they should adhere to. At least, nobody in my company complained if I adjusted/fixed their code to comply with PEP 8. And the addition of bin/pep8 in all of our software projects to make it easy to check for compliance didn't raise any protests. Pyflakes is even clearer, as it often points at real errors or obvious omissions.
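For instance, pyflakes immediately flags both problems in a toy module like this (my own example, not from the original post):

import os  # pyflakes: 'os' imported but unused


def greet(name):
    return "Hello " + nmae  # pyflakes: undefined name 'nmae'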

For django projects, possible good things include:

  • Sentry integration for nicely-accessible error logging.
  • Using a recent and supported django version. So those 1.4 instances we still have at my workplace should go the way of the dodo.
  • Using proper releases instead of using the latest master git checkout.
  • Using migrations.
  • Tests.
Automation is central to good behaviour

My take on good behaviour is that you should either make it easy to do the good thing or you should make non-good behaviour visible.

As an example, take python releases. As a manager you can say "thou shalt make good releases". Oh wow. An impressive display of power. It reminds me of a certain SF comic where, to teach them a lesson, an entire political assembly was threatened with obliteration from orbit. Needless to say, the strong words didn't have a measurable effect.

You can say the same words at a programmer meeting, of course. "Let's agree to make proper releases". Yes. Right.

What do you have to do for a proper release?

  • Adjust the version in setup.py from 1.2.dev.0 to 1.2.
  • Record the release date in the changelog.
  • Tag the release.
  • Update the version number in setup.py to 1.3.dev.0.
  • Add a new header for 1.3 in the changelog.

Now... That's quite an amount of work. If I'm honest, I trust about 40% of my colleagues to make that effort every time they release a package.

There is a better way. Those very same colleagues can be relied on to make perfect releases all the time if all they have to do is to call bin/fullrelease and press ENTER a few times to do all of the above automatically. Thanks to zest.releaser.

Zest.releaser makes it easier and quicker to make good releases than it is to make bad/quick/sloppy releases by hand.
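To see why this pays off, here is the version-bumping step from the checklist above expressed as a few lines of Python. This is just a sketch of the idea, not zest.releaser's actual code:

def bump(version):
    # '1.2.dev.0' -> ('1.2', '1.3.dev.0'), following the scheme used above.
    released = version.replace('.dev.0', '')
    major, minor = released.rsplit('.', 1)
    return released, '%s.%d.dev.0' % (major, int(minor) + 1)


assert bump('1.2.dev.0') == ('1.2', '1.3.dev.0')

Once the fiddly steps are code, running them every time costs nothing.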

Further examples

Now... here are some further examples to get you thinking.

All of our projects are started with "nensskel", a tool to create a skeleton for a new project (python lib, django app, django site). It uses "paste script"; many people now use "cookiecutter", which serves the same purpose.

  • For all projects, a proper test setup is included. You can always run bin/test and your test case will run. You only have to fill it in.

  • bin/fullrelease, bin/pep8, bin/pyflakes: if you haven't yet installed those programs globally, how easy can I make it for you to use them???

  • If you want to add documentation, sphinx is all set up for you. The docs/source/ directory is there and sphinx is automatically run every time you run buildout.

  • The README.rst has some easy do-this-do-that comments in there for when you've just started your project. Simple quick things like "add your name in the setup.py author field". And "add a one-line summary to the setup.py and add that same one to the github.com description".

    I cannot make it much easier, right?

    Now... quite some projects still have this TODO list in their README.

Conclusion: you need automation to enable policy

You need automation to enable policy, but even that isn't enough. I cannot possibly automatically write a one-line summary for a just-generated project. So I have to make do with a TODO note in the README and in the setup.py. Which gets disregarded.

If even such simple things get disregarded, bigger things like "add a test" and "provide documentation" and "make sure there is a proper release script" will be hard to get right. I must admit to not always adding tests for functionality.

I'll hereby torture myself with a quote. "Unit testing is for programmers what washing your hands is for doctors before an operation". It is an essential part of your profession. If you go to the hospital, you don't expect to have to ask your doctor to disinfect their hands before the operation. That's expected. Likewise, you shouldn't expect your clients to explicitly ask you for software tests: those should be there by default!

Again, I admit to not always adding tests. That's bad. As a professional software developer I should make sure that at least 90% test coverage is considered normal at my company. In the cases where we measure it, coverage is probably around 50%. Which means "bad". Which also means "you're not measuring it all the time". 90% should also be normal for my own code and I also don't always attain that.

Our company-wide policy should be to get our test coverage to at least 90%. But whether or not that's our policy, we'll never make 90% if we don't measure it.
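As an illustration, coverage.py's standard command line can enforce exactly such a threshold in a build (the 90% number is ours; bin/test is the buildout-generated test script mentioned earlier):

coverage run bin/test
coverage report --fail-under=90

The second command exits non-zero when coverage drops below 90%, which is enough to fail a jenkins or travis-ci.org job.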

And that is the point I want to make. You need tools. You need automation. If you don't measure your test coverage, any developer or management policy statement will be effectively meaningless. If you have a jenkins instance that's in a state of serious neglect (70% of the projects are red), you don't effectively have meaningful tests. Without a functioning jenkins instance (or travis-ci.org), you cannot properly say you're delivering quality software.

Without tooling and automation to prove your policy, your policy statements are effectively worthless. And that's quite a strong value statement :-)

Categories: FLOSS Project Planets

Steinar H. Gunderson: Intel GPU memory bandwidth

Planet Debian - Thu, 2015-09-03 06:30

Two days ago, I commented I was seeing only 1/10th or so of the theoretical bandwidth my Intel GPU should have been able to push, and asked if anyone could help me figure out the discrepancy. Now, several people (including the helpful people at #intel-gfx) helped me understand more of the complete picture, so I thought I'd share:

First of all, my application was pushing (more or less) a 1024x576 texture from one FBO to another, in fp16 RGBA, so eight bytes per pixel. This was measured to take 1.3 ms (well, sometimes longer and sometimes shorter); 1024x576x8 bytes / 1.3 ms = 3.6 GB/sec. Given that the spec sheet for my CPU says 25.6 GB/sec, that's the basis for my “about 1/10th” number. (There's no separate number for the GPU bandwidth, since the CPU and GPU share the memory subsystem and even the bottom level of the cache.)

But it turns out these numbers are not bidirectional as I thought they'd be; they cover both read and write. So I need to include the write bandwidth into the equation as well (and I was writing to a 1280x720 framebuffer). So with that into the picture, the number goes up to 9.3 GB/sec. In synthetic benchmarks, I was able to push this to 9.8, but no further.
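For the record, the arithmetic is easy to reproduce (sizes and timing as above; I'm assuming the 1280x720 destination is also eight bytes per pixel, which matches the quoted 9.3 GB/sec figure):

read_bytes = 1024 * 576 * 8   # fp16 RGBA source texture
write_bytes = 1280 * 720 * 8  # fp16 RGBA destination framebuffer
seconds = 1.3e-3

print(read_bytes / seconds / 1e9)                  # ~3.6 GB/sec, reads only
print((read_bytes + write_bytes) / seconds / 1e9)  # ~9.3 GB/sec, reads plus writes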

So we're still a bit over a factor 2x off. But lo and behold, the quoted number is assuming dual memory channels—and Lenovo has only fitted the X240 with a single RAM chip, with no possibility of adding more! So the theoretical number is 12.8 GB/sec, not 25.6. ~75% of the theoretical memory bandwidth is definitely within what I'd say is reasonable.

So, to sum up: Me neglecting to count the writes, and Lenovo designing the X240 with a memory subsystem reminiscent of the Amiga. Thanks to all developers!

Categories: FLOSS Project Planets

Anatoly Techtonik: SCons build targets

Planet Python - Thu, 2015-09-03 06:03
SCons is awesome. Just saying. If you want to know (or troubleshoot) how SCons selects targets to be built, add this snippet at the end of your SConstruct:
def dump_targets(targets):
  for t in targets:
    if type(t) == str:
      name = t
    else:
      name = t.name
    print("  <" + str(t.__class__.__name__) + "> " + name)

print("[*] Default targets:")
dump_targets(DEFAULT_TARGETS)

print("[*] Command line targets:")
dump_targets(COMMAND_LINE_TARGETS)
print("[*] All build targets:")
dump_targets(BUILD_TARGETS)
For my copy of Wesnoth, 'scons .' produces this output:

[*] Default targets:
<Alias> wesnoth
<Alias> wesnothd
[*] Command line targets:
<str> .
[*] All build targets:
  <str> .

And if you want to know how to specify targets or what they mean, read the second page of the SCons man documentation. Just for convenience I quote it here.

scons is normally executed in a top-level directory containing a SConstruct file, optionally specifying as command-line arguments the target file or files to be built.
By default, the command scons will build all target files in or below the current directory. Explicit default targets (to be built when no targets are specified on the command line) may be defined in the SConscript file(s) using the Default() function, described below.
Even when Default() targets are specified in the SConscript file(s), all target files in or below the current directory may be built by explicitly specifying the current directory (.) as a command-line target:

scons .

Building all target files, including any files outside of the current directory, may be specified by supplying a command-line target of the root directory (on POSIX systems):

scons /

or the path name(s) of the volume(s) in which all the targets should be built (on Windows systems):

scons C:\ D:\

To build only specific targets, supply them as command-line arguments:

scons foo bar

in which case only the specified targets will be built (along with any derived files on which they depend).
Specifying "cleanup" targets in SConscript files is not usually necessary. The -c flag removes all files necessary to build the specified target:

scons -c .

to remove all target files, or:

scons -c build export

to remove target files under build and export. Additional files or directories to remove can be specified using the Clean() function. Conversely, targets that would normally be removed by the -c invocation can be prevented from being removed by using the NoClean() function.
A subset of a hierarchical tree may be built by remaining at the top-level directory (where the SConstruct file lives) and specifying the subdirectory as the target to be built:

scons src/subdir

or by changing directory and invoking scons with the -u option, which traverses up the directory hierarchy until it finds the SConstruct file, and then builds targets relatively to the current subdirectory:

cd src/subdir
scons -u .
Categories: FLOSS Project Planets

Jorgen Schäfer: Elpy 1.9.0 released

Planet Python - Thu, 2015-09-03 05:46

I just released version 1.9.0 of Elpy, the Emacs Python Development Environment. This is a feature release.

Elpy is an Emacs package to bring powerful Python editing to Emacs. It combines a number of other packages, both written in Emacs Lisp as well as Python.

Quick Installation

Evaluate this:

(require 'package)
(add-to-list 'package-archives
             '("elpy" . "https://jorgenschaefer.github.io/packages/"))

Then run M-x package-install RET elpy RET.

Finally, run the following (and add them to your .emacs):

(package-initialize)
(package-initialize)
(elpy-enable)

Changes in 1.9.0
  • Elpy now supports the autopep8 library for automatically formatting Python code. All refactoring-related code is now grouped under C-c C-r. Use C-c C-r i to fix up imports using importmagic, C-c C-r p to fix up Python code with autopep8, and C-c C-r r to bring up the old Rope refactoring menu.
  • C-c C-b will now select a region containing surrounding lines of the current indentation or more.
  • C-c C-z in a Python shell will now switch back to the last Python buffer, allowing you to use the key to cycle back and forth between the Python buffer and shell.
  • The pattern used for C-c C-s is now customizable via elpy-rgrep-file-pattern.
  • <C-return> can now be used to send the current statement to the Python shell. Be careful: this can break with nested statements.
  • The Elpy minor mode now also works in modes derived from python-mode, not just in the mode itself.

Thanks to ChillarAnand, raylu and Chedi for their contributions!

Categories: FLOSS Project Planets

KDE signs the User Data Manifesto 2.0 and continues to defend your freedom

Planet KDE - Thu, 2015-09-03 04:28

I believe that in today’s world, where more and more of our daily life depends on technology, it is crucial that people have control over that technology. You should be empowered to know what your technology does and you should be empowered to influence it. This is at the core of Free Software. Unfortunately it is not at the core of most of the technology people interact with every day – quite the opposite – walled gardens and locks wherever you look, with few exceptions. KDE is working hard to provide you with technology that you control every single day, so you are empowered and the one ultimately in charge of your technology, data and life – the basis for freedom for many today. This is written down in the first sentence of our manifesto: “We are a community of technologists, designers, writers and advocates who work to ensure freedom for all people through our software.”

Therefore I am proud to announce that KDE (through KDE e.V.) is one of the launch partners and thereby initial signatories of the User Data Manifesto 2.0. The User Data Manifesto defines basic rights for people to control their own data in the internet age:

  • Control over user data access
  • Knowledge of how the data is stored
  • Freedom to choose a platform

Do you want to join us in providing more people with more access to Free technology? Today is a good day!

Categories: FLOSS Project Planets

Calligra 2.9.7 Released

Planet KDE - Thu, 2015-09-03 03:48

The Calligra team is pleased to announce the release of Calligra Suite and Calligra Active 2.9.7. It is a recommended update that brings further improvements to the 2.9 series of the applications and underlying development frameworks.

Support Calligra!

Bugfixes in This Release

Here is an overview of the most important fixes. There are several others that may be not mentioned here.

General
  • Removed a number of memory leaks in common code
  • Properly set normal default paragraphstyle as parent to footnote/endnote default ones
  • Fix copying text from inside a table cell without copying the entire cell (bug 350175)
  • Optimization of table cell formatting
  • Fix: pressing Backspace in some cases didn’t delete the selected table (bug 350426)
  • Fix: Inserting a variable when having a selection should overwrite the selection (bug 350435)
  • Fix: Pasting into the before-table-paragraph breaks it (bug 350427)
  • Fix: the final spell checking markup could be drawn the wrong way, giving some weird visual glitches (bug 350433)
  • Fix writing direction button not working the first time in some cases. Changed the way of detecting the current direction. (bug 350432)
  • Make icon size of the toolbox configurable (right-click on the toolbox to select a new size) (bug 336686)
  • Add a couple smaller toolbox icon sizes (14 pixels)
  • Make the default toolbox icons 14px since that looks closest to what they were before
  • Update tool tips to include keyboard shortcut (tool tips will automatically change with changes to shortcuts) (bug 348626)
  • Make the default size of the toolbox buttons dependent on screen resolution
  • Create subfolders for presets (related bug 321361)
  • Initialize colors to black, as per docs
  • Improved memory usage (use vectors)
  • Set the full file name as default directory in file dialogs
Kexi
  • General:
    • Fix vertical alignment of text in command link button widgets, it was especially broken in Breeze widget style
  • Tables:
    • Restore ability of altering table design. This was serious regression present in Kexi 2.9.5 and 2.9.6. (bug 350457)
  • Queries:
    • Don’t force saving when switching a never-stored query to Data view (on 2nd try)
  • CSV Import:
    • Fix detection of primary key column on CSV import (bug 351487)
    • Fix updates of primary key detection when value of ‘Start at line’ changes
  • SQLite databases:
    • Better results and error reporting for prepared statements
Krita

(See also Krita.org)

  • Highlights:
    • As is traditional, our September release has the first Google Summer of Code results. Wolthera’s Tangent Normal Brush engine has already been merged and provides:
      • Tangent Normal Brush Engine
      • Phong Bumpmap now accepts normal map input
      • Normalize filter
      • Tilt Cursor
    • We’ve got all-new icons!
    • You can configure the size of the icons used in the toolbox
    • Colorspacebrowser: if you want to know the nitty-gritty details about the colorspaces and profiles Krita offers, all information is now available in the new colorspace browser. Still under heavy polishing!
    • You can pick colors and use the fill tool everywhere in wrap-around mode
  • Other new features:
    • Implement ‘Scalable smoothness’ feature for Stabilizer smoother
    • Update tooltips for toolbox icons
    • Right click to undo last path point
    • Update tooltips to include keyboard shortcut
    • Make the default size of the toolbox buttons dependent on screen resolution
    • Added ability to merge down Selection Masks
    • Improve loading of PSDs of any colour space big time; 16bit CMYK psd files can now be loaded
    • Add three shortcuts to fill with opacity
    • Implement loading for ZIP compressed PSD files
    • XCF: load group layers from XCF files v3 or higher
    • Allow ‘shift’-modifier after dragging an assistant handle to snap lines
    • Add snap-single checkbox under assistant snapping.
    • Update brushes with optimised versions.(Basic_tip_default.kpp, Basic_tip_soft.kpp, Basic_wet.kpp, Block_basic.kpp, Block_bristles.kpp, Block_tilt.kpp, Ink_brush_25.kpp, Ink_gpen_10.kpp, Ink_gpen_25.kpp)
    • Mathematically robust normal map combination blending mode
    • Slow down updates for randomized brushes.
    • added convert to shape for selections
    • Added Trim to Image Size action
    • Optimise dodge and burn filter
    • Multiple layers merge with layer styles on Ctrl+E. (1) “Merge selected layers” is now deprecated and you can use usual Ctrl+E to merge multiple selection, (2) Mass-merging of layers with layer styles works correctly now, (3) Merging of clone layers together with their sources will not break Krita now)
    • Make color to alpha work with 16f channel depths
    • Add new shortcuts (Scale Image to new size = CTRL+ALT+I, Resize Canvas = CTRL+ALT+C, Create Group, Layer = CTRL+G, Feather Selection = SHIFT+F6)
  • Bug fixes:
    • Fix abr brush loading (bug 351599)
    • Remember the toolbar visibility state (bug 343615)
    • Do not let the wheel zoom if there are modifiers pressed (patch by var.spool.mail700@gmail.com. Thanks!) (bug 338839)
    • Fix active layer activation mask (bug 347500)
    • Remove misleading error message after saving fails
    • Prevent Krita from loading incomplete assistant (bug 350289)
    • Add ctrl-shift-s as default shortcut for save as (bug 350960)
    • Fix Bristle brush presets
    • Fix use normal map checkbox in phongbumpmap filter UI
    • Fix loading the system-set monitorprofile
    • Make cs-convert UI attempt to automatically determine when to uncheck optimise
    • Do not share textures when that’s not possible (related bug 351488)
    • Remove disabling of system profile checkbox
    • Update the display profile when moving screens (related bug 351488)
    • Update the display profile after changing the settings
    • Fix crash due to calling a virtual method from c-tor of KisToolPaint
    • Disable the layerbox if there’s no open image (bug 351664)
    • Correctly install the xcf import plugin on Windows (bug 345285)
    • Fix Fill with… (Opacity) actions
    • Make a transform tool work with Pass Through group layers (bug 351548)
    • Fix parsing XML with MSVC 2012
    • Make all the lines of paintop options look the same
    • Make sure a default KoColor is transparent (bug 351560)
    • Lots of memory leak fixes (pointers that weren’t deleted are now deleted)
    • Blacklist “photoshop:DateCreated” when saving (bug 351497)
    • Only add shortcuts for Krita
    • Only ask for a profile for 16 bits png images, since there we assume linear by default, which is wrong for most png images
    • Don’t build the thumb creator on Windows or OSX
    • Work around encoding issues in kzip (bug 350498)
    • Better error message in PNG export (bug 348099)
    • Don’t rename resources
    • Also change the color selector when selecting a vector layer (bug 336693)
    • Remove old compatibility code (bug 349554)
    • Disable the opacity setting for the shape brush (bug 349571)
    • Initialize KoColor to black, as per apidox
    • Add some explanation to the recovery dialog (related bug 351411)
    • Load resources from subfolders (bug 321361)
    • Recreate a default bounds object on every KisMask::setImage() call (related bug 345619)
    • Fix a severe crash in Transformation Masks (bug 349819)
    • Add a barrier between sequentially undone commands with setIndex (bug 349819)
    • Fixed API of KisPNGConverter to not access the entire KisImage
    • Check which color spaces PNG supports before passing the preview device to it (bug 351383)
    • Save CMYK JPEG’s correctly (bug 351298)
    • Do not crash saving 16 bit CMYK to JPEG (bug 351298)
    • Fix slowdown when activating “Isolate Layer” mode (bug 351195)
    • Fix loading of selection masks
    • Accept events so oxygen doesn’t get them (bug 337187)
    • Added optional flags to KisDocument::openUrl() and made “File Layer” not add its file to the recent files list (bug 345560)
    • Fix crash when activating Passthrough mode for a group with transparency mask (bug 351224)
    • Don’t truncate fractional brush sizes on eraser switch (patch by Alexey Elnatanov. Thanks!) (bug 347798)
    • Fix layout of the color options page (bug 351271)
    • Don’t add new layers to the group if it is locked
    • Transform invisible layers if they are part of the group
    • Allow Drag & Drop of masks (bug 345619)
    • Optimise the advanced color selector
    • Select the right list item in the fill layer dialog
    • Remove excessive qDebug statements (bug 349871)
    • Remove the non-working fast grid settings (bug 349514)
    • Make the luma inputboxes aware of locale (bug 344490)
    • Don’t crash if there isn’t a pattern (bug 348940)
    • Fix location of colon in color management settings (bug 350128)
    • Don’t hang when isolating a layer during a stroke (bug 351193)
    • Palette docker: Avoid showing a horizontal scrollbar (bug 349621)
    • Stamp and Clipboard brush fixes
    • Sort the dockers alphabetically
    • Add the toolbox to the docker menu (bug 349732)
    • Make it possible to R-select layers in a pass-through group (bug 351185)
    • Set a minimum width for the tool option popup (bug 350298)
    • Fix build on ARM (bug 351164)
    • Fixing pattern png loading on bundles
    • Don’t stop loading a bundle when a wrong manifest entry is found
    • Fix inherit alpha on fill layers (bug 349333)
    • Fix to resource md5 generation
    • Fix full-screen/canvas-only state confusion (patch by Alexey Elnatanov, Thanks!) (bug 348981)
    • Brush editor Stamp and Clipboard refactoring (bug 345195)
    • Don’t crash on closing krita if the filter manager is open (bug 351005)
    • Fix a memory leak in KisWeakSharedPtr
    • Re-enable antialias for selection tools (bug 350803)
    • Open the Krita Manual on F1 on all platforms (bug 347285)
    • Update all the action icons when the theme is changed
    • Workaround for Surface Pro 3 Eraser (bug 341899)
    • Fix an issue with mimetype detection
    • Fix a crash when PSD file type is not magic-recognized by the system (bug 350588)
    • Fix a hangup when pressing ‘v’ and ‘b’ in the brush tool simultaneously (bug 350280)
    • Fix crash in the line tool (bug 350280)
    • Fix crash when loading a transform mask with a non-affine transform (bug 350507)
    • Fixed Flatten Layer and Merge Down actions for layer with layer styles (bug 349479)
Document filters
  • Fix encoding of import filter source files for Applix* files
Try It Out

The source code of the release is available for download here: calligra-2.9.7.tar.xz.
Also translations to many languages and MD5 sums.
Alternatively, you can download binaries for many Linux distributions and for Windows (users: feel free to update that page).


What’s Next and How to Help?

The next step after the 2.9 series is Calligra 3.0 which will be based on new technologies. We expect it later in 2015.

You can meet us to share your thoughts or offer your support on general Calligra forums or dedicated to Kexi or Krita. Many improvements are only possible thanks to the fact that we’re working together within the awesome community.

(Some Calligra apps need new maintainers; you can become one, it’s fun!)

How and Why to Support Calligra?

Calligra apps may be totally free, but their development is costly. Power, hardware, office space, internet access, travelling for meetings – everything costs. Direct donation is the easiest and fastest way to efficiently support your favourite applications. Everyone, regardless of any degree of involvement can do so. You can choose to:
Support the entire Calligra project indirectly by donating to KDE, the parent organization and community of Calligra: http://www.kde.org/community/donations.

Support Krita directly by donating to the Krita Foundation, to support Krita development in general or development of a specific feature: https://krita.org/support-us/donations.

Support Kexi directly by donating to its current BountySource fundraiser, supporting development of a specific feature or the team in general: https://www.bountysource.com/teams/kexi.

About the Calligra Suite

Calligra Suite is a graphic art and office suite developed by the KDE community. It is available for desktop PCs, tablet computers and smartphones. It contains applications for word processing, spreadsheets, presentation, databases, vector graphics and digital painting. For more information visit calligra.org.


About KDE

KDE is an international technology team that creates free and open source software for desktop and portable computing. Among KDE’s products are a modern desktop system for Linux and UNIX platforms, comprehensive office productivity and groupware suites and hundreds of software titles in many categories including Internet, multimedia, entertainment, education, graphics and software development. KDE’s software is available in more than 60 languages on Linux, BSD, Solaris, Windows and Mac OS X.


Categories: FLOSS Project Planets

Codementor: Data Science with Python & R: Data Frames I

Planet Python - Thu, 2015-09-03 03:35

Motivation

This series of tutorials on Data Science will compare how different concepts in the discipline can be implemented in the two dominant ecosystems nowadays: R and Python. We will do this from a neutral point of view. Our opinion is that each environment has good and bad things, and any data scientist should know how to use both in order to be as prepared as possible for the job market or to start a personal project.

To get a feeling of what is going on regarding this hot topic, we refer the reader to DataCamp’s Data Science War infographic. Their infographic explores what the strengths of R are over Python and vice versa, and aims to provide a basic comparison between these two programming languages from a data science and statistics perspective.

Far from being a repetition of the previous, our series of tutorials will go hands-on into how to actually perform different data science tasks such as working with data frames, doing aggregations, or creating different statistical models in the areas of supervised and unsupervised learning.

As usual, we will use real-world datasets. This will help us to quickly transfer what we learn here to actual data analysis situations.

The first tutorial in our series deals with an important abstraction: the data frame. In the very next tutorial, we will introduce one of the first tasks we face when we have our data loaded: exploratory data analysis. This task can be performed using data frames and basic plots, as we will show here for both Python and R.

All the source code for the different parts of this series of tutorials and applications can be found on GitHub. Feel free to get involved and share your progress with us!

What is a DataFrame?

A data frame is used for storing tabular data. It has labeled axes (rows and columns) that we can use to perform arithmetic operations on both levels.

The concept was introduced in R before it arrived in Python Pandas, so the latter repeats many of the ideas from the former. In R, a data.frame is a list of vector variables of the same number of elements (rows) with unique row names. That is, each column is a vector with an associated name, and each row is a series of vector elements that correspond to the same position in each of the column-vectors.

In Pandas, a DataFrame can be thought of as a dict-like container for Series objects, where a Series is a one-dimensional NumPy ndarray with axis labels (including time series). By default, each Series corresponds to a column in the resulting DataFrame.
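A minimal sketch of that dict-of-Series view (the toy numbers are the first three years of the Afghanistan and Albania series used later in this tutorial):

import pandas as pd

afghanistan = pd.Series([436, 429, 422], index=['1990', '1991', '1992'])
albania = pd.Series([42, 40, 41], index=['1990', '1991', '1992'])

# Each Series becomes a column; the shared index provides the row labels.
df = pd.DataFrame({'Afghanistan': afghanistan, 'Albania': albania})
print(df)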

But let’s see both data types in practice. First of all we will introduce a data set that will be used in order to explain the data frame creation process and what data analysis tasks can be done with a data frame. Then we will have a separate section for each platform repeating every task for you to be able to move from one to the other easily in the future.

Introducing Gapminder World datasets

The Gapminder website presents itself as a fact-based worldview. It is a comprehensive resource for data regarding different country and territory indicators. Its Data section contains a list of datasets that can be accessed as Google Spreadsheet pages (add &output=csv to the URL to download as CSV). Each indicator dataset is tagged with a Data provider, a Category, and a Subcategory.

For this tutorial, we will use three datasets related to Infectious Tuberculosis: estimated deaths, existing cases, and new cases, each per 100K people.

The first thing we need to do is download the files for later use within our R and Python environments. There is a description of each dataset if we click on its title in the list of datasets. When performing any data analysis task, it is essential to understand our data as much as possible, so go there and have a read. Basically, each cell in the dataset contains the data related to the number of tuberculosis cases per 100K people during the given year (column) for each country or region (row).

We will use these datasets to better understand the TB incidence in different regions in time.

Downloading files and reading CSV

Python

Download Google Spreadsheet data as CSV.

import urllib

tb_deaths_url_csv = 'https://docs.google.com/spreadsheets/d/12uWVH_IlmzJX_75bJ3IH5E-Gqx6-zfbDKNvZqYjUuso/pub?gid=0&output=CSV'
tb_existing_url_csv = 'https://docs.google.com/spreadsheets/d/1X5Jp7Q8pTs3KLJ5JBWKhncVACGsg5v4xu6badNs4C7I/pub?gid=0&output=csv'
tb_new_url_csv = 'https://docs.google.com/spreadsheets/d/1Pl51PcEGlO9Hp4Uh0x2_QM0xVb53p2UDBMPwcnSjFTk/pub?gid=0&output=csv'

local_tb_deaths_file = 'tb_deaths_100.csv'
local_tb_existing_file = 'tb_existing_100.csv'
local_tb_new_file = 'tb_new_100.csv'

deaths_f = urllib.urlretrieve(tb_deaths_url_csv, local_tb_deaths_file)
existing_f = urllib.urlretrieve(tb_existing_url_csv, local_tb_existing_file)
new_f = urllib.urlretrieve(tb_new_url_csv, local_tb_new_file)

Read CSV into DataFrame by using read_csv().

import pandas as pd

deaths_df = pd.read_csv(local_tb_deaths_file, index_col = 0, thousands = ',').T
existing_df = pd.read_csv(local_tb_existing_file, index_col = 0, thousands = ',').T
new_df = pd.read_csv(local_tb_new_file, index_col = 0, thousands = ',').T

We have specified index_col to be 0 since we want the country names to be the row labels. We also specified the thousands separator to be ‘,’ so Pandas automatically parses cells as numbers. Then, we transpose the table to make the time series for each country correspond to each column.

We will concentrate on the existing cases for a while. We can use head() to check the first few lines.

existing_df.head()

TB prevalence, all forms (per 100 000 population per year)

      Afghanistan  Albania  Algeria  American Samoa  Andorra  Angola  Anguilla  Antigua and Barbuda  Argentina  Armenia  ...  Uruguay  Uzbekistan  Vanuatu  Venezuela  Viet Nam  Wallis et Futuna  West Bank and Gaza  Yemen  Zambia  Zimbabwe
1990          436       42       45              42       39     514        38                   16         96       52  ...       35         114      278         46       365               126                  55    265     436       409
1991          429       40       44              14       37     514        38                   15         91       49  ...       34         105      268         45       361               352                  54    261     456       417
1992          422       41       44               4       35     513        37                   15         86       51  ...       33         102      259         44       358                64                  54    263     494       415
1993          415       42       43              18       33     512        37                   14         82       55  ...       32         118      250         43       354               174                  52    253     526       419
1994          407       42       43              17       32     510        36                   13         78       60  ...       31         116      242         42       350               172                  52    250     556       426

5 rows × 207 columns

By using the attribute columns we can read and write column names.

existing_df.columns

Index([u'Afghanistan', u'Albania', u'Algeria', u'American Samoa', u'Andorra',
       u'Angola', u'Anguilla', u'Antigua and Barbuda', u'Argentina', u'Armenia',
       u'Australia', u'Austria', u'Azerbaijan', u'Bahamas', u'Bahrain',
       u'Bangladesh', u'Barbados', u'Belarus', u'Belgium', u'Belize', u'Benin',
       u'Bermuda', u'Bhutan', u'Bolivia', u'Bosnia and Herzegovina', u'Botswana',
       u'Brazil', u'British Virgin Islands', u'Brunei Darussalam', u'Bulgaria',
       u'Burkina Faso', u'Burundi', u'Cambodia', u'Cameroon', u'Canada',
       u'Cape Verde', u'Cayman Islands', u'Central African Republic', u'Chad',
       u'Chile', u'China', u'Colombia', u'Comoros', u'Congo, Rep.',
       u'Cook Islands', u'Costa Rica', u'Croatia', u'Cuba', u'Cyprus',
       u'Czech Republic', u'Cote dIvoire', u'Korea, Dem. Rep.',
       u'Congo, Dem. Rep.', u'Denmark', u'Djibouti', u'Dominica',
       u'Dominican Republic', u'Ecuador', u'Egypt', u'El Salvador',
       u'Equatorial Guinea', u'Eritrea', u'Estonia', u'Ethiopia', u'Fiji',
       u'Finland', u'France', u'French Polynesia', u'Gabon', u'Gambia',
       u'Georgia', u'Germany', u'Ghana', u'Greece', u'Grenada', u'Guam',
       u'Guatemala', u'Guinea', u'Guinea-Bissau', u'Guyana', u'Haiti',
       u'Honduras', u'Hungary', u'Iceland', u'India', u'Indonesia', u'Iran',
       u'Iraq', u'Ireland', u'Israel', u'Italy', u'Jamaica', u'Japan',
       u'Jordan', u'Kazakhstan', u'Kenya', u'Kiribati', u'Kuwait',
       u'Kyrgyzstan', u'Laos', ...], dtype='object')

Similarly, we can access row names by using index.

existing_df.index

Index([u'1990', u'1991', u'1992', u'1993', u'1994', u'1995', u'1996', u'1997',
       u'1998', u'1999', u'2000', u'2001', u'2002', u'2003', u'2004', u'2005',
       u'2006', u'2007'], dtype='object')

We will use them to assign proper names to our column and index names.

deaths_df.index.names = ['year']
deaths_df.columns.names = ['country']
existing_df.index.names = ['year']
existing_df.columns.names = ['country']
existing_df

country  Afghanistan  Albania  Algeria  American Samoa  Andorra  Angola  Anguilla  Antigua and Barbuda  Argentina  Armenia  ...  Uruguay  Uzbekistan  Vanuatu  Venezuela  Viet Nam  Wallis et Futuna  West Bank and Gaza  Yemen  Zambia  Zimbabwe
year
1990             436       42       45              42       39     514        38                   16         96       52  ...       35         114      278         46       365               126                  55    265     436       409
1991             429       40       44              14       37     514        38                   15         91       49  ...       34         105      268         45       361               352                  54    261     456       417
1992             422       41       44               4       35     513        37                   15         86       51  ...       33         102      259         44       358                64                  54    263     494       415
1993             415       42       43              18       33     512        37                   14         82       55  ...       32         118      250         43       354               174                  52    253     526       419
1994             407       42       43              17       32     510        36                   13         78       60  ...       31         116      242         42       350               172                  52    250     556       426
1995             397       43       42              22       30     508        35                   12         74       68  ...       30         119      234         42       346                93                  50    244     585       439
1996             397       42       43               0       28     512        35                   12         71       74  ...       28         111      226         41       312               123                  49    233     602       453
1997             387       44       44              25       23     363        36                   11         67       75  ...       27         122      218         41       273               213                  46    207     626       481
1998             374       43       45              12       24     414        36                   11         63       74  ...       28         129      211         40       261               107                  44    194     634       392
1999             373       42       46               8       22     384        36                    9         58       86  ...       28         134      159         39       253               105                  42    175     657       430
2000             346       40       48               8       20     530        35                    8         52       94  ...       27         139      143         39       248               103                  40    164     658       479
2001             326       34       49               6       20     335        35                    9         51       99  ...       25         148      128         41       243                13                  39    154     680       523
2002             304       32       50               5       21     307        35                    7         42       97  ...       27         144      149         41       235               275                  37    149     517       571
2003             308       32       51               6       18     281        35                    9         41       91  ...       25         152      128         39       234               147                  36    146     478       632
2004             283       29       52               9       19     318        35                    8         39       85  ...       23         149      118         38       226                63                  35    138     468       652
2005             267       29       53              11       18     331        34                    8         39       79  ...       24         144      131         38       227                57                  33    137     453       680
2006             251       26       55               9       17     302        34                    9         37       79  ...       25         134      104         38       222                60                  32    135     422       699
2007             238       22       56               5       19     294        34                    9         35       81  ...       23         140      102         39       220                25                  31    130     387       714

R

In R we use read.csv to read CSV files into data.frame variables. Although the R function read.csv can work with URLs, https is a problem for R in many cases, so you need to use a package like RCurl to get around it.

library(RCurl)
## Loading required package: bitops

existing_cases_file <- getURL("https://docs.google.com/spreadsheets/d/1X5Jp7Q8pTs3KLJ5JBWKhncVACGsg5v4xu6badNs4C7I/pub?gid=0&output=csv")
existing_df <- read.csv(text = existing_cases_file, row.names=1, stringsAsFactor=F)
str(existing_df)
## 'data.frame': 207 obs. of 18 variables:
##  $ X1990: chr "436" "42" "45" "42" ...
##  $ X1991: chr "429" "40" "44" "14" ...
##  $ X1992: chr "422" "41" "44" "4" ...
##  $ X1993: chr "415" "42" "43" "18" ...
##  $ X1994: chr "407" "42" "43" "17" ...
##  $ X1995: chr "397" "43" "42" "22" ...
##  $ X1996: int 397 42 43 0 28 512 35 12 71 74 ...
##  $ X1997: int 387 44 44 25 23 363 36 11 67 75 ...
##  $ X1998: int 374 43 45 12 24 414 36 11 63 74 ...
##  $ X1999: int 373 42 46 8 22 384 36 9 58 86 ...
##  $ X2000: int 346 40 48 8 20 530 35 8 52 94 ...
##  $ X2001: int 326 34 49 6 20 335 35 9 51 99 ...
##  $ X2002: int 304 32 50 5 21 307 35 7 42 97 ...
##  $ X2003: int 308 32 51 6 18 281 35 9 41 91 ...
##  $ X2004: chr "283" "29" "52" "9" ...
##  $ X2005: chr "267" "29" "53" "11" ...
##  $ X2006: chr "251" "26" "55" "9" ...
##  $ X2007: chr "238" "22" "56" "5" ...

The str() function in R gives us information about a variable’s type. In this case we can see that, due to the ‘,’ thousands separator, some of the columns haven’t been parsed as numbers but as character. If we want to work properly with our dataset we need to convert them to numbers. Once we know a bit more about indexing and mapping functions, I promise you will be able to understand the following piece of code. By now, let’s just say that we convert a column and assign it again to its reference in the data frame.

existing_df[c(1,2,3,4,5,6,15,16,17,18)] <-
  lapply(existing_df[c(1,2,3,4,5,6,15,16,17,18)],
         function(x) { as.integer(gsub(',', '', x)) })
str(existing_df)
## 'data.frame': 207 obs. of 18 variables:
##  $ X1990: int 436 42 45 42 39 514 38 16 96 52 ...
##  $ X1991: int 429 40 44 14 37 514 38 15 91 49 ...
##  $ X1992: int 422 41 44 4 35 513 37 15 86 51 ...
##  $ X1993: int 415 42 43 18 33 512 37 14 82 55 ...
##  $ X1994: int 407 42 43 17 32 510 36 13 78 60 ...
##  $ X1995: int 397 43 42 22 30 508 35 12 74 68 ...
##  $ X1996: int 397 42 43 0 28 512 35 12 71 74 ...
##  $ X1997: int 387 44 44 25 23 363 36 11 67 75 ...
##  $ X1998: int 374 43 45 12 24 414 36 11 63 74 ...
##  $ X1999: int 373 42 46 8 22 384 36 9 58 86 ...
##  $ X2000: int 346 40 48 8 20 530 35 8 52 94 ...
##  $ X2001: int 326 34 49 6 20 335 35 9 51 99 ...
##  $ X2002: int 304 32 50 5 21 307 35 7 42 97 ...
##  $ X2003: int 308 32 51 6 18 281 35 9 41 91 ...
##  $ X2004: int 283 29 52 9 19 318 35 8 39 85 ...
##  $ X2005: int 267 29 53 11 18 331 34 8 39 79 ...
##  $ X2006: int 251 26 55 9 17 302 34 9 37 79 ...
##  $ X2007: int 238 22 56 5 19 294 34 9 35 81 ...

Everything looks fine now. But our dataset is still a bit tricky. Let’s have a look at what we got into the data frame with head:

head(existing_df, 3)
##             X1990 X1991 X1992 X1993 X1994 X1995 X1996 X1997 X1998 X1999
## Afghanistan   436   429   422   415   407   397   397   387   374   373
## Albania        42    40    41    42    42    43    42    44    43    42
## Algeria        45    44    44    43    43    42    43    44    45    46
##             X2000 X2001 X2002 X2003 X2004 X2005 X2006 X2007
## Afghanistan   346   326   304   308   283   267   251   238
## Albania        40    34    32    32    29    29    26    22
## Algeria        48    49    50    51    52    53    55    56

and at its dimensions with nrow and ncol:

nrow(existing_df)
## [1] 207
ncol(existing_df)
## [1] 18

we see that we have a data frame with 207 observations, one for each country, and 18 variables or features, one for each year. This doesn’t seem the most natural shape for this dataset. It is very unlikely that we will add new countries (observations or rows in this case) to the dataset, while it is quite possible to add additional years (variables or columns in this case). If we keep it as it is, we will end up with a dataset that grows in features and not in observations, and that seems counterintuitive (and impractical, depending on the analysis we want to do).

We won’t need to do this preprocessing all the time, but here we go. Thankfully, R has a function t(), similar to the .T attribute in Pandas, that allows us to transpose a data.frame variable. The result is given as a matrix, so we need to convert it to a data frame again by using as.data.frame.

# we will save the "trasposed" original verison for later use if needed existing_df_t <- existing_df existing_df <- as.data.frame(t(existing_df)) head(existing_df,3) ## Afghanistan Albania Algeria American Samoa Andorra Angola Anguilla ## X1990 436 42 45 42 39 514 38 ## X1991 429 40 44 14 37 514 38 ## X1992 422 41 44 4 35 513 37 ## Antigua and Barbuda Argentina Armenia Australia Austria Azerbaijan ## X1990 16 96 52 7 18 58 ## X1991 15 91 49 7 17 55 ## X1992 15 86 51 7 16 57 ## Bahamas Bahrain Bangladesh Barbados Belarus Belgium Belize Benin ## X1990 54 120 639 8 62 16 65 140 ## X1991 53 113 623 8 54 15 64 138 ## X1992 52 108 608 7 59 15 62 135 ## Bermuda Bhutan Bolivia Bosnia and Herzegovina Botswana Brazil ## X1990 10 924 377 160 344 124 ## X1991 10 862 362 156 355 119 ## X1992 9 804 347 154 351 114 ## British Virgin Islands Brunei Darussalam Bulgaria Burkina Faso ## X1990 32 91 43 179 ## X1991 30 91 48 196 ## X1992 28 91 54 208 ## Burundi Cambodia Cameroon Canada Cape Verde Cayman Islands ## X1990 288 928 188 7 449 10 ## X1991 302 905 199 7 438 10 ## X1992 292 881 200 7 428 9 ## Central African Republic Chad Chile China Colombia Comoros ## X1990 318 251 45 327 88 188 ## X1991 336 272 41 321 85 177 ## X1992 342 282 38 315 82 167 ## Congo, Rep. Cook Islands Costa Rica Croatia Cuba Cyprus ## X1990 209 0 30 126 32 14 ## X1991 222 10 28 123 29 13 ## X1992 231 57 27 121 26 13 ## Czech Republic Cote d'Ivoire Korea, Dem. Rep. Congo, Dem. Rep. ## X1990 22 292 841 275 ## X1991 22 304 828 306 ## X1992 22 306 815 327 ## Denmark Djibouti Dominica Dominican Republic Ecuador Egypt ## X1990 12 1,485 24 183 282 48 ## X1991 12 1,477 24 173 271 47 ## X1992 11 1,463 24 164 259 47 ## El Salvador Equatorial Guinea Eritrea Estonia Ethiopia Fiji Finland ## X1990 133 169 245 50 312 68 14 ## X1991 126 181 245 50 337 65 12 ## X1992 119 187 242 56 351 62 11 ## France French Polynesia Gabon Gambia Georgia Germany Ghana Greece ## X1990 21 67 359 350 51 15 533 30 ## X1991 20 55 340 350 48 15 519 29 ## X1992 19 91 325 349 50 14 502 27 ## Grenada Guam Guatemala Guinea Guinea-Bissau Guyana Haiti Honduras ## X1990 7 103 113 241 404 39 479 141 ## X1991 7 101 111 248 403 43 464 133 ## X1992 7 96 108 255 402 34 453 128 ## Hungary Iceland India Indonesia Iran Iraq Ireland Israel Italy ## X1990 67 5 586 443 50 88 19 11 11 ## X1991 68 4 577 430 51 88 18 10 10 ## X1992 70 4 566 417 56 88 18 10 10 ## Jamaica Japan Jordan Kazakhstan Kenya Kiribati Kuwait Kyrgyzstan ## X1990 10 62 19 95 125 1,026 89 90 ## X1991 10 60 18 87 120 1,006 84 93 ## X1992 10 58 17 85 134 986 80 93 ## Laos Latvia Lebanon Lesotho Liberia Libyan Arab Jamahiriya Lithuania ## X1990 428 56 64 225 476 46 64 ## X1991 424 57 64 231 473 45 66 ## X1992 420 59 63 229 469 45 71 ## Luxembourg Madagascar Malawi Malaysia Maldives Mali Malta Mauritania ## X1990 19 367 380 159 143 640 10 585 ## X1991 18 368 376 158 130 631 9 587 ## X1992 17 369 365 156 118 621 9 590 ## Mauritius Mexico Micronesia, Fed. Sts. 
Monaco Mongolia Montserrat ## X1990 53 101 263 3 477 14 ## X1991 51 93 253 3 477 14 ## X1992 50 86 244 3 477 14 ## Morocco Mozambique Myanmar Namibia Nauru Nepal Netherlands ## X1990 134 287 411 650 170 629 11 ## X1991 130 313 400 685 285 607 10 ## X1992 127 328 389 687 280 585 10 ## Netherlands Antilles New Caledonia New Zealand Nicaragua Niger ## X1990 28 112 10 145 317 ## X1991 27 107 10 137 318 ## X1992 25 104 9 129 319 ## Nigeria Niue Northern Mariana Islands Norway Oman Pakistan Palau ## X1990 282 118 142 8 40 430 96 ## X1991 307 115 201 8 36 428 66 ## X1992 321 113 301 8 29 427 43 ## Panama Papua New Guinea Paraguay Peru Philippines Poland Portugal ## X1990 74 498 95 394 799 88 51 ## X1991 73 498 93 368 783 87 49 ## X1992 71 497 92 343 766 86 47 ## Puerto Rico Qatar Korea, Rep. Moldova Romania Russian Federation ## X1990 17 71 223 105 118 69 ## X1991 15 69 196 99 125 64 ## X1992 17 69 174 103 134 70 ## Rwanda Saint Kitts and Nevis Saint Lucia ## X1990 190 17 26 ## X1991 211 17 26 ## X1992 226 16 25 ## Saint Vincent and the Grenadines Samoa San Marino ## X1990 45 36 9 ## X1991 45 35 9 ## X1992 44 34 8 ## Sao Tome and Principe Saudi Arabia Senegal Seychelles Sierra Leone ## X1990 346 68 380 113 465 ## X1991 335 60 379 110 479 ## X1992 325 59 379 106 492 ## Singapore Slovakia Slovenia Solomon Islands Somalia South Africa ## X1990 52 55 66 625 597 769 ## X1991 52 56 62 593 587 726 ## X1992 53 59 59 563 577 676 ## Spain Sri Lanka Sudan Suriname Swaziland Sweden Switzerland ## X1990 44 109 409 109 629 5 14 ## X1991 42 106 404 100 590 5 13 ## X1992 40 104 402 79 527 6 12 ## Syrian Arab Republic Tajikistan Thailand Macedonia, FYR Timor-Leste ## X1990 94 193 336 92 706 ## X1991 89 162 319 90 694 ## X1992 84 112 307 89 681 ## Togo Tokelau Tonga Trinidad and Tobago Tunisia Turkey Turkmenistan ## X1990 702 139 45 17 49 83 105 ## X1991 687 140 44 17 46 79 99 ## X1992 668 143 43 17 49 77 101 ## Turks and Caicos Islands Tuvalu Uganda Ukraine United Arab Emirates ## X1990 42 593 206 67 47 ## X1991 40 573 313 64 44 ## X1992 37 554 342 67 42 ## United Kingdom Tanzania Virgin Islands (U.S.) ## X1990 9 215 30 ## X1991 9 228 28 ## X1992 10 240 27 ## United States of America Uruguay Uzbekistan Vanuatu Venezuela ## X1990 7 35 114 278 46 ## X1991 7 34 105 268 45 ## X1992 7 33 102 259 44 ## Viet Nam Wallis et Futuna West Bank and Gaza Yemen Zambia Zimbabwe ## X1990 365 126 55 265 436 409 ## X1991 361 352 54 261 456 417 ## X1992 358 64 54 263 494 415
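On the pandas side the same reshaping is a one-liner, since .T returns a DataFrame directly and no matrix conversion is involved. A sketch, reusing the hypothetical existing_df_py from the previous snippet:

# keep the original orientation around, as the R code does
existing_df_py_orig = existing_df_py.copy()

# .T transposes rows and columns in one step
existing_df_py = existing_df_py.T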

Row names are roughly what we get in Pandas when we use the .index attribute of a data frame.

rownames(existing_df) ## [1] "X1990" "X1991" "X1992" "X1993" "X1994" "X1995" "X1996" "X1997" ## [9] "X1998" "X1999" "X2000" "X2001" "X2002" "X2003" "X2004" "X2005" ## [17] "X2006" "X2007"

In our data frame we see that we have weird names for them: every year is prefixed with an X. This is because they started out as column names. From the definition of a data.frame in R, we know that each column is a vector with a variable name. A name in R cannot start with a digit, so R automatically prefixes numbers with the letter X. Right now we will leave them as they are, since it doesn't really stop us from doing our analysis.

In the case of column names, they pretty much correspond to the .columns attribute of a Pandas data frame.

colnames(existing_df) ## [1] "Afghanistan" "Albania" ## [3] "Algeria" "American Samoa" ## [5] "Andorra" "Angola" ## [7] "Anguilla" "Antigua and Barbuda" ## [9] "Argentina" "Armenia" ## [11] "Australia" "Austria" ## [13] "Azerbaijan" "Bahamas" ## [15] "Bahrain" "Bangladesh" ## [17] "Barbados" "Belarus" ## [19] "Belgium" "Belize" ## [21] "Benin" "Bermuda" ## [23] "Bhutan" "Bolivia" ## [25] "Bosnia and Herzegovina" "Botswana" ## [27] "Brazil" "British Virgin Islands" ## [29] "Brunei Darussalam" "Bulgaria" ## [31] "Burkina Faso" "Burundi" ## [33] "Cambodia" "Cameroon" ## [35] "Canada" "Cape Verde" ## [37] "Cayman Islands" "Central African Republic" ## [39] "Chad" "Chile" ## [41] "China" "Colombia" ## [43] "Comoros" "Congo, Rep." ## [45] "Cook Islands" "Costa Rica" ## [47] "Croatia" "Cuba" ## [49] "Cyprus" "Czech Republic" ## [51] "Cote d'Ivoire" "Korea, Dem. Rep." ## [53] "Congo, Dem. Rep." "Denmark" ## [55] "Djibouti" "Dominica" ## [57] "Dominican Republic" "Ecuador" ## [59] "Egypt" "El Salvador" ## [61] "Equatorial Guinea" "Eritrea" ## [63] "Estonia" "Ethiopia" ## [65] "Fiji" "Finland" ## [67] "France" "French Polynesia" ## [69] "Gabon" "Gambia" ## [71] "Georgia" "Germany" ## [73] "Ghana" "Greece" ## [75] "Grenada" "Guam" ## [77] "Guatemala" "Guinea" ## [79] "Guinea-Bissau" "Guyana" ## [81] "Haiti" "Honduras" ## [83] "Hungary" "Iceland" ## [85] "India" "Indonesia" ## [87] "Iran" "Iraq" ## [89] "Ireland" "Israel" ## [91] "Italy" "Jamaica" ## [93] "Japan" "Jordan" ## [95] "Kazakhstan" "Kenya" ## [97] "Kiribati" "Kuwait" ## [99] "Kyrgyzstan" "Laos" ## [101] "Latvia" "Lebanon" ## [103] "Lesotho" "Liberia" ## [105] "Libyan Arab Jamahiriya" "Lithuania" ## [107] "Luxembourg" "Madagascar" ## [109] "Malawi" "Malaysia" ## [111] "Maldives" "Mali" ## [113] "Malta" "Mauritania" ## [115] "Mauritius" "Mexico" ## [117] "Micronesia, Fed. Sts." "Monaco" ## [119] "Mongolia" "Montserrat" ## [121] "Morocco" "Mozambique" ## [123] "Myanmar" "Namibia" ## [125] "Nauru" "Nepal" ## [127] "Netherlands" "Netherlands Antilles" ## [129] "New Caledonia" "New Zealand" ## [131] "Nicaragua" "Niger" ## [133] "Nigeria" "Niue" ## [135] "Northern Mariana Islands" "Norway" ## [137] "Oman" "Pakistan" ## [139] "Palau" "Panama" ## [141] "Papua New Guinea" "Paraguay" ## [143] "Peru" "Philippines" ## [145] "Poland" "Portugal" ## [147] "Puerto Rico" "Qatar" ## [149] "Korea, Rep." "Moldova" ## [151] "Romania" "Russian Federation" ## [153] "Rwanda" "Saint Kitts and Nevis" ## [155] "Saint Lucia" "Saint Vincent and the Grenadines" ## [157] "Samoa" "San Marino" ## [159] "Sao Tome and Principe" "Saudi Arabia" ## [161] "Senegal" "Seychelles" ## [163] "Sierra Leone" "Singapore" ## [165] "Slovakia" "Slovenia" ## [167] "Solomon Islands" "Somalia" ## [169] "South Africa" "Spain" ## [171] "Sri Lanka" "Sudan" ## [173] "Suriname" "Swaziland" ## [175] "Sweden" "Switzerland" ## [177] "Syrian Arab Republic" "Tajikistan" ## [179] "Thailand" "Macedonia, FYR" ## [181] "Timor-Leste" "Togo" ## [183] "Tokelau" "Tonga" ## [185] "Trinidad and Tobago" "Tunisia" ## [187] "Turkey" "Turkmenistan" ## [189] "Turks and Caicos Islands" "Tuvalu" ## [191] "Uganda" "Ukraine" ## [193] "United Arab Emirates" "United Kingdom" ## [195] "Tanzania" "Virgin Islands (U.S.)" ## [197] "United States of America" "Uruguay" ## [199] "Uzbekistan" "Vanuatu" ## [201] "Venezuela" "Viet Nam" ## [203] "Wallis et Futuna" "West Bank and Gaza" ## [205] "Yemen" "Zambia" ## [207] "Zimbabwe"
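For reference, the pandas equivalents of these two calls are attributes rather than functions. A sketch on the hypothetical existing_df_py from the earlier snippets:

existing_df_py.index    # row labels (the years)
existing_df_py.columns  # column labels (the country names)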

These two R functions show a common idiom in R, where we use the same function to get a value and to assign it. For example, if we want to change the column names we will do something like:

colnames(existing_df) <- new_col_names

But as we said, we will leave them as they are for now.
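The pandas counterpart of this get/assign idiom is plain attribute assignment. A sketch, where new_col_names again stands for a list of the right length, just as in the R example:

# assigning to .columns renames every column at once
existing_df_py.columns = new_col_names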

Data Indexing

Python

There is a whole section devoted to indexing and selecting data in DataFrames in the official documentation. Let's apply some of those techniques to our tuberculosis cases data frame.

We can access each data frame Series object by using its column name, as with a Python dictionary. In our case we can access each country series by its name.

existing_df['United Kingdom'] year 1990 9 1991 9 1992 10 1993 10 1994 9 1995 9 1996 9 1997 9 1998 9 1999 9 2000 9 2001 9 2002 9 2003 10 2004 10 2005 11 2006 11 2007 12 Name: United Kingdom, dtype: int64

Or just use the column name as an attribute.

existing_df.Spain year 1990 44 1991 42 1992 40 1993 37 1994 35 1995 34 1996 33 1997 30 1998 30 1999 28 2000 27 2001 26 2002 26 2003 25 2004 24 2005 24 2006 24 2007 23 Name: Spain, dtype: int64

Or we can access multiple series by passing their column names as a Python list.

existing_df[['Spain', 'United Kingdom']] country Spain United Kingdom year     1990 44 9 1991 42 9 1992 40 10 1993 37 10 1994 35 9 1995 34 9 1996 33 9 1997 30 9 1998 30 9 1999 28 9 2000 27 9 2001 26 9 2002 26 9 2003 25 10 2004 24 10 2005 24 11 2006 24 11 2007 23 12

We can also access individual cells as follows.

existing_df.Spain['1990'] 44

Or using any Python list indexing for slicing the series.

existing_df[['Spain', 'United Kingdom']][0:5] country Spain United Kingdom year     1990 44 9 1991 42 9 1992 40 10 1993 37 10 1994 35 9

With the whole DataFrame, slicing inside of [] slices the rows. This is provided largely as a convenience since it is such a common operation.

existing_df[0:5] country Afghanistan Albania Algeria American Samoa Andorra Angola Anguilla Antigua and Barbuda Argentina Armenia … Uruguay Uzbekistan Vanuatu Venezuela Viet Nam Wallis et Futuna West Bank and Gaza Yemen Zambia Zimbabwe year                                           1990 436 42 45 42 39 514 38 16 96 52 … 35 114 278 46 365 126 55 265 436 409 1991 429 40 44 14 37 514 38 15 91 49 … 34 105 268 45 361 352 54 261 456 417 1992 422 41 44 4 35 513 37 15 86 51 … 33 102 259 44 358 64 54 263 494 415 1993 415 42 43 18 33 512 37 14 82 55 … 32 118 250 43 354 174 52 253 526 419 1994 407 42 43 17 32 510 36 13 78 60 … 31 116 242 42 350 172 52 250 556 426 5 rows × 207 columns

Indexing in production Python code

As stated in the official documentation, the Python and NumPy indexing operators [] and attribute operator . provide quick and easy access to pandas data structures across a wide range of use cases. This makes interactive work intuitive, as there’s little new to learn if you already know how to deal with Python dictionaries and NumPy arrays. However, since the type of the data to be accessed isn’t known in advance, directly using standard operators has some optimization limits. For production code, it is recommended that you take advantage of the optimized pandas data access methods exposed in this section.

For example, the .iloc method can be used for positional index access.

existing_df.iloc[0:2] country Afghanistan Albania Algeria American Samoa Andorra Angola Anguilla Antigua and Barbuda Argentina Armenia … Uruguay Uzbekistan Vanuatu Venezuela Viet Nam Wallis et Futuna West Bank and Gaza Yemen Zambia Zimbabwe year                                           1990 436 42 45 42 39 514 38 16 96 52 … 35 114 278 46 365 126 55 265 436 409 1991 429 40 44 14 37 514 38 15 91 49 … 34 105 268 45 361 352 54 261 456 417 2 rows × 207 columns

While .loc is used for label access.

existing_df.loc['1992':'2005'] country Afghanistan Albania Algeria American Samoa Andorra Angola Anguilla Antigua and Barbuda Argentina Armenia … Uruguay Uzbekistan Vanuatu Venezuela Viet Nam Wallis et Futuna West Bank and Gaza Yemen Zambia Zimbabwe year                                           1992 422 41 44 4 35 513 37 15 86 51 … 33 102 259 44 358 64 54 263 494 415 1993 415 42 43 18 33 512 37 14 82 55 … 32 118 250 43 354 174 52 253 526 419 1994 407 42 43 17 32 510 36 13 78 60 … 31 116 242 42 350 172 52 250 556 426 1995 397 43 42 22 30 508 35 12 74 68 … 30 119 234 42 346 93 50 244 585 439 1996 397 42 43 0 28 512 35 12 71 74 … 28 111 226 41 312 123 49 233 602 453 1997 387 44 44 25 23 363 36 11 67 75 … 27 122 218 41 273 213 46 207 626 481 1998 374 43 45 12 24 414 36 11 63 74 … 28 129 211 40 261 107 44 194 634 392 1999 373 42 46 8 22 384 36 9 58 86 … 28 134 159 39 253 105 42 175 657 430 2000 346 40 48 8 20 530 35 8 52 94 … 27 139 143 39 248 103 40 164 658 479 2001 326 34 49 6 20 335 35 9 51 99 … 25 148 128 41 243 13 39 154 680 523 2002 304 32 50 5 21 307 35 7 42 97 … 27 144 149 41 235 275 37 149 517 571 2003 308 32 51 6 18 281 35 9 41 91 … 25 152 128 39 234 147 36 146 478 632 2004 283 29 52 9 19 318 35 8 39 85 … 23 149 118 38 226 63 35 138 468 652 2005 267 29 53 11 18 331 34 8 39 79 … 24 144 131 38 227 57 33 137 453 680 14 rows × 207 columns

And we can combine that with series indexing by column.

existing_df.loc[['1992','1998','2005'],['Spain','United Kingdom']] country Spain United Kingdom 1992 40 10 1998 30 9 2005 24 11

This last approach is the recommended one when using Pandas data frames, especially when doing assignments (something we are not doing here). Otherwise, we might run into the assignment problems described here.
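To make the warning concrete, here is a sketch of the failure mode (an illustration only, not from the original tutorial; don't run it on data you care about): chained indexing may write into a temporary copy, whereas a single .loc call always targets the frame itself.

# chained indexing: two separate lookups; the write may land on a
# temporary copy and be lost, typically with a SettingWithCopyWarning
existing_df['Spain']['1990'] = 0

# a single .loc call: one lookup that unambiguously writes to the frame
existing_df.loc['1990', 'Spain'] = 0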

R

Similarly to what we do in Pandas (actually, Pandas was inspired by R), we can access a data.frame column by its position.

existing_df[,1] ## X1990 X1991 X1992 X1993 X1994 X1995 X1996 X1997 X1998 X1999 X2000 X2001 ## 436 429 422 415 407 397 397 387 374 373 346 326 ## X2002 X2003 X2004 X2005 X2006 X2007 ## 304 308 283 267 251 238 ## 17 Levels: 238 251 267 283 304 308 326 346 373 374 387 397 407 415 ... 436

The position-based indexing in R uses the first element for the row number and the second one for the column number. If either is left blank, we are telling R to get all the rows/columns. In the previous example we retrieved all the rows for the first column (Afghanistan) in the data.frame. And yes, R has a 1-based indexing scheme.
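For comparison, the pandas analogue of existing_df[,1] is positional indexing with .iloc. A sketch against the pandas frame from the Python section above, remembering that pandas positions are 0-based:

existing_df.iloc[:, 0]  # every row, first column (Afghanistan); ':' plays the role of R's blank index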

Like in Pandas, we can use column names to access columns (series in Pandas). However, R data.frame variables aren't exactly objects, so we don't use the . operator but $, which allows accessing labels within a list.

existing_df$Afghanistan ## X1990 X1991 X1992 X1993 X1994 X1995 X1996 X1997 X1998 X1999 X2000 X2001 ## 436 429 422 415 407 397 397 387 374 373 346 326 ## X2002 X2003 X2004 X2005 X2006 X2007 ## 304 308 283 267 251 238 ## 17 Levels: 238 251 267 283 304 308 326 346 373 374 387 397 407 415 ... 436

And finally, since a data.frame is a list of elements (its columns), we can access columns as list elements using the list indexing operator [[]].

existing_df[[1]] ## X1990 X1991 X1992 X1993 X1994 X1995 X1996 X1997 X1998 X1999 X2000 X2001 ## 436 429 422 415 407 397 397 387 374 373 346 326 ## X2002 X2003 X2004 X2005 X2006 X2007 ## 304 308 283 267 251 238 ## 17 Levels: 238 251 267 283 304 308 326 346 373 374 387 397 407 415 ... 436

At this point you should have realised that in R there are multiple ways of doing the same thing, and that this seems to happen more because of the language itself than because somebody wanted to provide different ways of doing things. This strongly contrasts with Python’s philosophy of having one clear way of doing things (the Pythonic way).

For row indexing we have the positional approach.

existing_df[1,] ## Afghanistan Albania Algeria American Samoa Andorra Angola Anguilla ## X1990 436 42 45 42 39 514 38 ## Antigua and Barbuda Argentina Armenia Australia Austria Azerbaijan ## X1990 16 96 52 7 18 58 ## Bahamas Bahrain Bangladesh Barbados Belarus Belgium Belize Benin ## X1990 54 120 639 8 62 16 65 140 ## Bermuda Bhutan Bolivia Bosnia and Herzegovina Botswana Brazil ## X1990 10 924 377 160 344 124 ## British Virgin Islands Brunei Darussalam Bulgaria Burkina Faso ## X1990 32 91 43 179 ## Burundi Cambodia Cameroon Canada Cape Verde Cayman Islands ## X1990 288 928 188 7 449 10 ## Central African Republic Chad Chile China Colombia Comoros ## X1990 318 251 45 327 88 188 ## Congo, Rep. Cook Islands Costa Rica Croatia Cuba Cyprus ## X1990 209 0 30 126 32 14 ## Czech Republic Cote d'Ivoire Korea, Dem. Rep. Congo, Dem. Rep. ## X1990 22 292 841 275 ## Denmark Djibouti Dominica Dominican Republic Ecuador Egypt ## X1990 12 1,485 24 183 282 48 ## El Salvador Equatorial Guinea Eritrea Estonia Ethiopia Fiji Finland ## X1990 133 169 245 50 312 68 14 ## France French Polynesia Gabon Gambia Georgia Germany Ghana Greece ## X1990 21 67 359 350 51 15 533 30 ## Grenada Guam Guatemala Guinea Guinea-Bissau Guyana Haiti Honduras ## X1990 7 103 113 241 404 39 479 141 ## Hungary Iceland India Indonesia Iran Iraq Ireland Israel Italy ## X1990 67 5 586 443 50 88 19 11 11 ## Jamaica Japan Jordan Kazakhstan Kenya Kiribati Kuwait Kyrgyzstan ## X1990 10 62 19 95 125 1,026 89 90 ## Laos Latvia Lebanon Lesotho Liberia Libyan Arab Jamahiriya Lithuania ## X1990 428 56 64 225 476 46 64 ## Luxembourg Madagascar Malawi Malaysia Maldives Mali Malta Mauritania ## X1990 19 367 380 159 143 640 10 585 ## Mauritius Mexico Micronesia, Fed. Sts. Monaco Mongolia Montserrat ## X1990 53 101 263 3 477 14 ## Morocco Mozambique Myanmar Namibia Nauru Nepal Netherlands ## X1990 134 287 411 650 170 629 11 ## Netherlands Antilles New Caledonia New Zealand Nicaragua Niger ## X1990 28 112 10 145 317 ## Nigeria Niue Northern Mariana Islands Norway Oman Pakistan Palau ## X1990 282 118 142 8 40 430 96 ## Panama Papua New Guinea Paraguay Peru Philippines Poland Portugal ## X1990 74 498 95 394 799 88 51 ## Puerto Rico Qatar Korea, Rep. Moldova Romania Russian Federation ## X1990 17 71 223 105 118 69 ## Rwanda Saint Kitts and Nevis Saint Lucia ## X1990 190 17 26 ## Saint Vincent and the Grenadines Samoa San Marino ## X1990 45 36 9 ## Sao Tome and Principe Saudi Arabia Senegal Seychelles Sierra Leone ## X1990 346 68 380 113 465 ## Singapore Slovakia Slovenia Solomon Islands Somalia South Africa ## X1990 52 55 66 625 597 769 ## Spain Sri Lanka Sudan Suriname Swaziland Sweden Switzerland ## X1990 44 109 409 109 629 5 14 ## Syrian Arab Republic Tajikistan Thailand Macedonia, FYR Timor-Leste ## X1990 94 193 336 92 706 ## Togo Tokelau Tonga Trinidad and Tobago Tunisia Turkey Turkmenistan ## X1990 702 139 45 17 49 83 105 ## Turks and Caicos Islands Tuvalu Uganda Ukraine United Arab Emirates ## X1990 42 593 206 67 47 ## United Kingdom Tanzania Virgin Islands (U.S.) ## X1990 9 215 30 ## United States of America Uruguay Uzbekistan Vanuatu Venezuela ## X1990 7 35 114 278 46 ## Viet Nam Wallis et Futuna West Bank and Gaza Yemen Zambia Zimbabwe ## X1990 365 126 55 265 436 409

There we retrieved data for every country in 1990. We can combine this with a column number.

existing_df[1,1] ## X1990 ## 436 ## 17 Levels: 238 251 267 283 304 308 326 346 373 374 387 397 407 415 ... 436

Or its name.

existing_df$Afghanistan[1] ## X1990 ## 436 ## 17 Levels: 238 251 267 283 304 308 326 346 373 374 387 397 407 415 ... 436

What did we just do there? Basically, we retrieved a column, which is a vector, and accessed the first element of that vector. That way we got the value for Afghanistan for the year 1990. We can do the same thing using the [[]] operator instead of the list element label.

existing_df[[1]][1] ## X1990 ## 436 ## 17 Levels: 238 251 267 283 304 308 326 346 373 374 387 397 407 415 ... 436
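On the pandas side, all three of these lookups have direct counterparts. A sketch against the pandas frame from the Python section, with 0-based positions:

existing_df.iloc[0, 0]              # positional: first row (1990), first column (Afghanistan): 436
existing_df['Afghanistan'].iloc[0]  # fetch the column as a Series, then its first element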

We can also select multiple columns and/or rows by passing R vectors.

existing_df[c(3,9,16),c(170,194)] ## Spain United Kingdom ## X1992 40 10 ## X1998 30 9 ## X2005 24 11
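The pandas version of this vector-based selection passes plain Python lists to .iloc. A sketch: the positions below are the R ones shifted down by one, since pandas counts from 0.

existing_df.iloc[[2, 8, 15], [169, 193]]  # rows 1992, 1998 and 2005; columns Spain and United Kingdom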

Finally, we can also use names instead of positions for indexing.

existing_df["X1992","Spain"] ## X1992 ## 40 ## Levels: 25 26 27 28 30 33 23 24 34 35 37 40 42 44

And we can combine that with vectors.

existing_df[c("X1992", "X1998", "X2005"), c("Spain", "United Kingdom")] ## Spain United Kingdom ## X1992 40 10 ## X1998 30 9 ## X2005 24 11 Next Steps

So enough about indexing. In the next part of the tutorial on data frames we will see how to perform more complex data access using selection. Additionally, we will explain how to apply functions to data frame elements, and how to group them.

Remember that all the source code for the different parts of this series of tutorials and applications can be found on GitHub. Feel free to get involved and share your progress with us!

Categories: FLOSS Project Planets

ShiningPanda: Track your licenses!

Planet Python - Thu, 2015-09-03 02:00

Requires.io is proud to introduce a new feature: license tracking for your requirements!

If a license differs between your version of a package and its latest one, you get the information in both the Requirement and Latest columns.

We hope you will enjoy this new feature!

Categories: FLOSS Project Planets

Python Software Foundation: CSA Awards to Tollervey, Stinner, and Storchaka

Planet Python - Wed, 2015-09-02 21:48
Greetings Readers,

I apologize for the hiatus I’ve taken recently from writing this blog -- other commitments temporarily got in the way. But during this time the PSF has been hard at work, and I intend to catch you up on their activities in the next few posts.

First of all, the Community Service Awards have been given out for both the second and third quarters of 2015. I am extremely happy to announce that the second quarter award went to our good friend, Nicholas Tollervey, for his excellent work in education and outreach. You can read more about Nick in a recent previous post to this blog (Tollervey), so I’ll forgo saying more about him here, other than congratulations, and will turn to telling you about our third quarter award recipients.

RESOLVED, that the Python Software Foundation award the 2015 3rd Quarter Community Service Award to Victor Stinner and Serhiy Storchaka (PSF CSA).

Both Stinner and Storchaka are extremely active Python core developers. In the past three years, Serhiy has contributed well over 2000 commits, while Victor comes in a close second with almost 2000. Their hard work and dedication have helped increase Python’s vitality, relevance, and amazing growth -- a fact that the PSF wishes to recognize with this award.

In addition, Serhiy Storchaka is active on the Python tracker, taking the time to help other contributors by reviewing and committing their patches. Victor Stinner’s work additionally includes 20 PEPs (see PEPs) as well as active participation in the Python community. You can view his PyCon 2014 talk here. He is also one of the developers of the tulip/asyncio project, which provides asynchronous I/O support to Python. It was Victor who ported tulip/asyncio to Python 2; its usefulness has resulted in its recently being included as part of the Python 3.4 standard library.

Please join me in congratulating our latest CSA recipients and in thanking them for their important work. I would love to hear from readers. Please send feedback, comments, or blog ideas to me at msushi@gnosis.cx.
Categories: FLOSS Project Planets

Continuum Analytics Blog: Continuum Analytics - September Tech Events

Planet Python - Wed, 2015-09-02 20:00

Our team is gearing up for a big presence at the Strata Hadoop World Conference at the end of this month, where we’ll be presenting theatre talks from our founder and developers, as well as demos, giveaways, and much more. Take a look at where we’ll be all month and let us know at info@continuum.io if you’d like to schedule a meeting.

Categories: FLOSS Project Planets

Justin Mason: Links for 2015-09-02

Planet Apache - Wed, 2015-09-02 19:58
Categories: FLOSS Project Planets

Mozilla Web Development: Node.js static file build steps in Python Heroku apps

Planet Python - Wed, 2015-09-02 19:07

I write a lot of webapps. I like to use Python for the backend, but most
frontend tools are written in Node.js. LESS gives me nicer style sheets, Babel
lets me write next-generation JavaScript, and NPM helps manage dependencies
nicely. As a result, most of my projects are polyglots that can be difficult to
deploy.

Modern workflows have already figured this out: Run all the tools. Most
READMEs I’ve written lately tend to look like this:

$ git clone https://github.example.com/foo/bar.git
$ cd bar
$ pip install -r requirements.txt
$ npm install
$ gulp static-assets
$ python ./manage.py runserver

I like to deploy my projects using Heroku. They take care of the messy details
about deployment, but they don’t seem to support multi-language projects easily.
There are Python and Node buildpacks, but no clear way of combining the two.

Multi Buildpack

GitHub is littered with attempts to fix this by building new buildpacks.
The problem is they invariably fall out of compatibility with Heroku. I could
probably fix them, but then I’d have to maintain them. I use Heroku to avoid
maintaining infrastructure; custom buildpacks are one step forward, but two
steps back.

Enter Multi Buildpack, which runs multiple buildpacks at once.

It is simple enough that it is unlikely to fall out of compatibility. Heroku
has a fork of the project on their GitHub account, which implies that it will
be maintained in the future.

To configure the buildpack, first tell Heroku you want to use it:

$ heroku buildpacks:set https://github.com/heroku/heroku-buildpack-multi.git

Next, add a .buildpacks file to your project that lists the buildpacks to run:

https://github.com/heroku/heroku-buildpack-nodejs.git https://github.com/heroku/heroku-buildpack-python.git

Buildpacks are executed in the order they’re listed in, allowing later
buildpacks to use the tools and scripts installed by earlier buildpacks.

The Problem With Python

There’s one problem: The Python buildpack moves files around, which makes it
incompatible with the way the Node buildpack installs commands. This means that
any asset compilation or minification done as a step of the Python buildpack
that depends on Node will fail.

The Python buildpack automatically detects a Django project and runs
./manage.py collectstatic. But the Node environment isn’t available, so this
fails. No static files get built.

There is a solution: bin/post_compile! If present in your repository, this
script will be run at the end of the build process. Because it runs outside of
the Python buildpack, commands installed by the Node buildpack are available and
will work correctly.

This trick works with any Python webapp, but let’s use a Django project as an
example. I often use Django Pipeline for static asset compilation. Assets
are compiled using the command ./manage.py collectstatic, which, when properly
configured, will call all the Node commands.

#!/bin/bash
export PATH=/app/.heroku/node/bin:$PATH
./manage.py collectstatic --noinput
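For context, the Django settings driving that collectstatic call might look roughly like the sketch below. The setting names follow django-pipeline's documentation of that era and vary between versions, and the binary path is an assumption about where NPM puts executables on a Heroku build; treat it as illustrative, not as this post's actual configuration.

# settings.py (sketch): django-pipeline shells out to Node-based
# compilers, which is why Node must be on PATH during collectstatic
STATICFILES_STORAGE = 'pipeline.storage.PipelineCachedStorage'

PIPELINE_COMPILERS = (
    'pipeline.compilers.less.LessCompiler',  # wraps the lessc binary installed via NPM
)

PIPELINE_LESS_BINARY = '/app/node_modules/.bin/lessc'  # assumed install location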

Alternatively, you could call Node tools like Gulp or Webpack directly.

In the case of Django Pipeline, it is also useful to disable the Python
buildpack from running collectstatic, since it will fail anyways. This is done
using an environment variable:

heroku config:set DISABLE_COLLECTSTATIC=1

Okay, so there is a little hack here. We still had to prepend the Node binary
folder to PATH. Pretend you didn’t see that! Or don’t, because you’ll need
to do it in your script too.

That’s it

To recap, this approach:

  1. Only uses buildpacks available from Heroku
  2. Supports any sort of Python and/or Node build steps
  3. Doesn’t require vendoring or pre-compiling any static assets

Woot!

Categories: FLOSS Project Planets

Colan Schwartz: Upgrading a Drupal distribution

Planet Drupal - Wed, 2015-09-02 18:54
Topics: 

Upgrading Drupal distributions, technically referred to as installation profiles, can be tricky. If you aren't using Drupal Core, but rather a distribution of it, it's not possible to follow standard processes for upgrading Drupal core and contributed modules. You must upgrade the distribution as a whole.

In this article, we'll be working with the Web Experience Toolkit (wetkit) as the example distribution.

Assumptions

Steps
  1. Switch to your Web directory and make sure your working tree is clean.
    1. cd $(drush dd @site); git status
  2. Note any commits made to the distro since it was last upgraded. These will have to be reapplied (cherry-picked in git-speak) after the upgrade unless they were explicitly added to the distro in the latest release. Find them with this command.
    • git log profiles/wetkit/
  3. Make sure that you read and understand any distro-specific information on upgrading it. In this example, the documentation is available over at Release Cycle / Updates.
  4. This is a workaround until "Drush up should update contrib profiles as well" is fixed. Download the distro with the following command. For the version number, use the version that was released immediately after the version that you currently have installed. For example, if you have 7.x-1.4, you need to download 7.x-1.5. If you're multiple versions behind, you'll need to repeat these instructions multiple times. Jumping more than one release ahead could cause problems.
  5. Unpack it.
    1. tar zxvf /tmp/wetkit-7.x-Y.Z-core.tar.gz --directory /tmp
  6. Move your sites folder out of the way so that it won't be overwritten. It's necessary to do this as the administrator to maintain permissions on the files directory.
    1. sudo mv sites /tmp
  7. Copy the new release's files into your Web dir. Don't copy core's Git ignore file; keep ours. This is a workaround for "Rename core .gitignore file to example.gitignore and add explanatory comments".
    1. cp -r /tmp/wetkit-7.x-Y.Z/.htaccess /tmp/wetkit-7.x-Y.Z/* .
  8. Replace the new default sites dir with our own.
    1. rm -rf sites
    2. sudo mv /tmp/sites .
  9. Remove the distro-specific Git ignore files, as they'll cause us to ignore files we shouldn't.
    1. rm $(find profiles/wetkit -name ".gitignore")
  10. Stage all of the changed files.
    1. git add --all
  11. Commit the upgrade with a comment like, "Issue #123: Upgraded the WxT distro from release 7.x-A.B to 7.x-A.C."
    1. git commit
  12. Cherry-pick each of the commits that you noted in step 2. Ideally, these are upstream patches that you either found or posted yourself while working on the distro. For each commit message, use something like "Issue #567: Applied patch from https://www.drupal.org/node/1417630#comment-6810906." so you'll know whether you'll need to re-apply it again, or whether it's been committed (and you no longer need to worry about it).

    1. git cherry-pick --edit COMMIT_ID_1
    2. git cherry-pick --edit COMMIT_ID_2
    3. ...
  13. Update the database schema.
    1. drush updb
  14. Clear all of the application caches.
    1. drush cc all
  15. Test the local site to ensure everything is working.
Notes
  • Whenever we override the distro's module versions in sites/all/modules/contrib (this should be an extremely rare occurrence, if ever), we should set up Profile Status Check. In fact, it probably wouldn't hurt to include this module in all of the official distributions.

This article, Upgrading a Drupal distribution, appeared first on the Colan Schwartz Consulting Services blog.

Categories: FLOSS Project Planets

Kontact and GnuPG under Windows

Planet KDE - Wed, 2015-09-02 17:53

Kontact has, in contrast to Thunderbird, integrated crypto support (OpenPGP and S/MIME) out-of-the-box.
That means on Linux you can simply start Kontact and read encrypted mails (if you have already created keys).
After you select your crypto keys, you can immediately start writing encrypted mails. With that great user experience I never needed to dig further into the crypto stack.

But on Windows there is no GnuPG installed by default, so I needed to dig into the whole world of crypto layers
that sit between Kontact and the actual part that does the de-/encryption.

Crypto Stack

Kontact uses a number of libraries that the team has written around GPGME.

The lowest-level one is gpgmepp, which is an object-oriented wrapper for gpgme. This lets us avoid having to write code in C for KMail. Then we have libkleo, which is a library built on top of gpgmepp that KMail uses to trigger de-/encryption in the lower levels. GPGME is the only required dependency to compile Kontact with crypto support.

But this is not enough to send and receive encrypted mail with Kontact on Windows, as I mentioned earlier. There are still runtime dependencies that we need to have in place. Fortunately, the runtime crypto stack is already packaged by the GPG4Win team. Simply installing it is still not enough to have crypto support, though. With GPG4Win, it is possible to select OpenPGP keys, and to create and read encrypted mails, but unfortunately it doesn't work with S/MIME.

So I had to dig further into how GnuPG actually works.

OpenPGP is handled by the gpg binary, and for S/MIME we have gpgsm. Both are called directly from GPGME, using libassuan. Both applications then talk to gpg-agent, which is actually the only program that interacts with the key data. Both applications can be used from the command line, so it was easy to verify that they were working and that we had no problems with the GnuPG setup.

So first we started by creating keys (gpg --gen-key and gpgsm --gen-key) and then tested further what works with GPG4Win and what does not. We found a bug in GnuPG in the version we used, but it was closed in a newer version. Still, Kontact didn't want to communicate with GPG4Win. The reason was a wrong standard path, preventing gpgme from finding gpgsm. With that fixed, we now have a working crypto stack under Windows.

But to be honest, there are more applications involved in a working crypto stack. First, we need gpgconf and gpgme-w32-spawn to be available in the Kontact directory. gpgconf helps gpgme find gpg and gpgsm and is responsible for modifying the content of .gnupg in the user's home directory. Additionally, it informs you about changes in config files. gpgme-w32-spawn is responsible for creating the other needed processes.

To have a UI where you can enter your password, you need pinentry. S/MIME needs another agent that does the CRL/OCSP checks; this is done by dirmngr. In GnuPG 2.1, dirmngr is the only component that performs connections to the outside, so every request that requires the Internet goes through dirmngr.

This is, in short, the crypto stack whose pieces need to work together to give you working encrypted mail support.

We are happy that we now have a fully working Kontact under Windows (again!). There are rumours that Kontact also worked under Windows with crypto support before, but unfortunately when we started, the encrypted part was not working.

This work has been done in the kolabsys branch, which is based on KDE Libraries 4. The next steps are to merge the changes over to make sure that the current master branch of Kontact, which uses KDE Frameworks 5, also works.

Randa

Coming up next week is the yearly Randa meeting, where we will have the chance to sit together for a week and work on the future of Kontact. These meetings help tremendously in injecting momentum into the project, and we have a variety of topics to cover to direct the development for the time to come (and of course a lot of stuff to actively hack on). If you’d like to contribute to that you can help us with some funding. Much appreciated!

Categories: FLOSS Project Planets

Randa – KDE sprints 2015

Planet KDE - Wed, 2015-09-02 17:25

So, following this year’s Google Summer of Code, I now have the opportunity to be a part of the annual KDE sprints, which will be hosted in Randa, Switzerland from the 6th to the 13th of September this year. Thankfully, I have all my documents sorted out and I’ll be flying to Switzerland from […]

Categories: FLOSS Project Planets