Feeds

Kushal Das: Keeping the tools simple

Planet Python - Wed, 2017-02-08 01:02

When I talk about programming or teach in a workshop, I keep repeating one motto: try to keep things simple. Whenever I look at the modern, complex systems that are popular among users, they generally turn out to be many simple tools working together to solve a complex problem.

The Unix philosophy

Back in my college days, I read about the Unix philosophy. It is a set of ideas and philosophical approaches to software development. The Wikipedia page lists four points:

  • Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new “features”.
  • Expect the output of every program to become the input to another, as yet unknown, program. Don’t clutter output with extraneous information. Avoid stringently columnar or binary input formats. Don’t insist on interactive input.
  • Design and build software, even operating systems, to be tried early, ideally within weeks. Don’t hesitate to throw away the clumsy parts and rebuild them.
  • Use tools in preference to unskilled help to lighten a programming task, even if you have to detour to build the tools and expect to throw some of them out after you’ve finished using them.
Do one thing

The first point is something many people decide to skip. The idea of a perfect system which can do everything leads to complex, over-engineered software that takes many years to land in production. The first casualty is the user of the system, then the team that has to keep the system up and running. A simple system with proper documentation also attracts a number of users. These first users become your community. They try out the new features and provide valuable feedback. If you are working on an Open Source project, creating that community around your project is always important for the sustainability of the project.

Pipe and redirection between tools

Piping and redirection in Linux shells were another bit of simple magic I learned during my early days in college. How a tool like grep can take an input stream and produce an output, which in turn can be fed to the next tool, was one of the best things I found in the terminal. As developers we spend a lot of time in terminals, and we use piping and redirection many times a day.
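The same kind of composition can be reproduced from Python with the standard subprocess module; here is a minimal sketch (the ps and grep commands are only placeholders for whatever two tools you want to chain):

import subprocess

# Chain two commands the way a shell pipe does: the stdout of the first
# process becomes the stdin of the second.
ps = subprocess.Popen(["ps", "aux"], stdout=subprocess.PIPE)
grep = subprocess.Popen(["grep", "python"], stdin=ps.stdout, stdout=subprocess.PIPE)
ps.stdout.close()  # let ps receive SIGPIPE if grep exits first
output = grep.communicate()[0]
print(output.decode())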

Build and try and repeat

I think all of the modern agile followers know this point very well. Unless you allow your users to try out your tool and provide feedback, your tool will not be welcomed by them. We programmers have this attitude that every problem can be solved with code, and that our ideas are always correct. No, that is not the case at all. Go out into the world, show your tool to as many people as possible, and take feedback. Rewrite and rebuild your tool as required. If you wait for 3 years and hope that someone will force your tool on the users, that will not go well in the long run.

Do One Thing and Do It Well

The whole idea of Do One Thing and Do It Well has been discussed many times. Search for the term in your favorite search engine, and you will surely find many documents explaining the idea in detail. Following this idea while designing tools or systems has helped me to date. Tunir and gotun try to follow the same ideas as much as possible. They are built to execute some commands on a remote system and act according to the exit codes. I think that is the one-line description of both tools. To verify whether a tool is simple or not, I keep throwing the tool at new users and go through their feedback.
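As a rough illustration of that one-line description (this is not Tunir's or gotun's actual code, and the real tools run the commands on a remote virtual machine over SSH), the core loop boils down to running commands and branching on their exit codes:

import subprocess

def run_job(commands):
    """Run each command in turn and act on the exit codes."""
    for command in commands:
        result = subprocess.run(command, shell=True)
        if result.returncode != 0:
            print("FAILED ({}): {}".format(result.returncode, command))
            return False
        print("PASSED: {}".format(command))
    return True

run_job(["true", "echo hello"])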

Last night we received a mail from Dusty Mabe on the Fedora Cloud list, asking to test the updates-testing tree for Fedora Atomic. At the end of the email, he also gave the command to execute to rebase to the updates-testing tree.

# rpm-ostree rebase fedora-atomic/25/x86_64/testing/docker-host

With that as input from upstream, it was just a matter of adding that command as one line on top of the current official Fedora Atomic tests, followed by a reboot command and a wait for the machine to come back online.

sudo rpm-ostree rebase fedora-atomic/25/x86_64/testing/docker-host
@@ sudo reboot
SLEEP 120
curl -O http://infrastructure.fedoraproject.org/infra/autocloud/tunirtests.tar.gz
tar -xzvf tunirtests.tar.gz
...

This helped me find the regression in the atomic command within the next few minutes, while I was working on something else. After I reported the issue upstream, they started working on a solution (some discussion here). The simplicity of the tool helped me get things done faster in this case.

Please let me know what you think about this particular idea of designing software in the comments below.

Categories: FLOSS Project Planets

Drupal CMS Guides at Daymuse Studios: Products & Product Types in Drupal Commerce Module Guide

Planet Drupal - Tue, 2017-02-07 21:49

Our first Drupal Commerce Module Guide begins our e-commerce site creation tutorial. We need to understand what Product and Product Types are and how to implement them.

Categories: FLOSS Project Planets

Aaron Morton: Modeling real life workloads with cassandra-stress is hard

Planet Apache - Tue, 2017-02-07 20:00

The de-facto tool to model and test workloads on Cassandra is cassandra-stress. It is a widely known tool, appearing in numerous blog posts to illustrate performance testing on Cassandra and often recommended for stress testing specific data models. Theoretically there is no reason why cassandra-stress couldn’t fit your performance testing needs. But cassandra-stress has some caveats when modeling real workloads, the most important of which we will cover in this blog post.

Using cassandra-stress

When the time comes to run performance testing on Cassandra there’s the choice between creating your own code or using cassandra-stress. Saving time by relying on a tool that ships with your favourite database is the obvious choice for many.

Predefined mode

Using cassandra-stress in predefined mode is very cool. By running a simple command line, you can supposedly test the limits of your cluster, in terms of write ingestion for example:

cassandra-stress write n=100000 cl=one -mode native cql3 -node 10.0.1.24

This will output the following:

Created keyspaces. Sleeping 1s for propagation. Sleeping 2s... Warming up WRITE with 50000 iterations... Connected to cluster: c228 Datatacenter: datacenter1; Host: /127.0.0.1; Rack: rack1 Datatacenter: datacenter1; Host: /127.0.0.2; Rack: rack1 Datatacenter: datacenter1; Host: /127.0.0.3; Rack: rack1 Failed to connect over JMX; not collecting these stats Running WRITE with 200 threads for 100000 iteration Failed to connect over JMX; not collecting these stats type, total ops, op/s, pk/s, row/s, mean, med, .95, .99, .999, max, time, stderr, errors, gc: #, max ms, sum ms, sdv ms, mb total, 14889, 14876, 14876, 14876, 13,4, 5,5, 49,4, 82,1, 98,0, 110,9, 1,0, 0,00000, 0, 0, 0, 0, 0, 0 total, 25161, 9541, 9541, 9541, 21,0, 8,4, 75,2, 84,7, 95,8, 107,4, 2,1, 0,15451, 0, 0, 0, 0, 0, 0 total, 33715, 8249, 8249, 8249, 24,5, 10,3, 91,9, 113,5, 135,1, 137,6, 3,1, 0,15210, 0, 0, 0, 0, 0, 0 total, 45773, 11558, 11558, 11558, 17,1, 8,8, 55,3, 90,6, 97,7, 105,3, 4,2, 0,11311, 0, 0, 0, 0, 0, 0 total, 57748, 11315, 11315, 11315, 17,6, 7,9, 67,2, 93,1, 118,0, 124,1, 5,2, 0,09016, 0, 0, 0, 0, 0, 0 total, 69434, 11039, 11039, 11039, 18,2, 8,5, 63,1, 95,4, 113,5, 122,8, 6,3, 0,07522, 0, 0, 0, 0, 0, 0 total, 83486, 13345, 13345, 13345, 15,1, 5,5, 56,8, 71,1, 91,3, 101,1, 7,3, 0,06786, 0, 0, 0, 0, 0, 0 total, 97386, 12358, 12358, 12358, 16,1, 7,1, 63,6, 92,4, 105,8, 138,6, 8,5, 0,05954, 0, 0, 0, 0, 0, 0 total, 100000, 9277, 9277, 9277, 19,6, 7,5, 64,2, 86,0, 96,6, 97,6, 8,7, 0,05802, 0, 0, 0, 0, 0, 0 Results: op rate : 11449 [WRITE:11449] partition rate : 11449 [WRITE:11449] row rate : 11449 [WRITE:11449] latency mean : 17,4 [WRITE:17,4] latency median : 7,4 [WRITE:7,4] latency 95th percentile : 63,6 [WRITE:63,6] latency 99th percentile : 91,8 [WRITE:91,8] latency 99.9th percentile : 114,8 [WRITE:114,8] latency max : 138,6 [WRITE:138,6] Total partitions : 100000 [WRITE:100000] Total errors : 0 [WRITE:0] total gc count : 0 total gc mb : 0 total gc time (s) : 0 avg gc time(ms) : NaN stdev gc time(ms) : 0 Total operation time : 00:00:08 END

So at CL.ONE, a ccm cluster running on my laptop can go up to 9200 writes/s.
Is that interesting? Probably not…

So far cassandra-stress has run in self-driving mode, creating its own schema and generating its own data. Taking a close look at the table that was actually generated, here's what we get:

CREATE TABLE keyspace1.standard1 (
    key blob PRIMARY KEY,
    "C0" blob,
    "C1" blob,
    "C2" blob,
    "C3" blob,
    "C4" blob
) WITH COMPACT STORAGE
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
    AND compression = {}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = 'NONE';

A compact storage table, with blob-only columns and no clustering columns. Let's just say it might not be representative of the data models you are running in production. Furthermore, keyspace1 is created at RF=1 by default. That is understandable, since cassandra-stress should run on any cluster, even one with a single node, but you might be happier with an RF=3 keyspace to model real life workloads.

Even though this is most likely nothing like your use case, it is a works-everywhere, out-of-the-box solution that is useful for evaluating hardware configurations (tuning IO for example) or for directly comparing different versions and/or configurations of Cassandra.

That said, running a mixed workload will prove misleading, even when simply comparing the raw performance of different hardware configurations. Unless run for a sufficient number of iterations, your read workload might exclusively be hitting memtables, and not a single SSTable:

adejanovski$ cassandra-stress mixed n=100000 cl=one -mode native cql3 -node 10.0.1.24 Sleeping for 15s Running with 16 threadCount Running [WRITE, READ] with 16 threads for 100000 iteration Failed to connect over JMX; not collecting these stats type, total ops, op/s, pk/s, row/s, mean, med, .95, .99, .999, max, time, stderr, errors, gc: #, max ms, sum ms, sdv ms, mb READ, 10228, 10211, 10211, 10211, 0,8, 0,6, 1,8, 2,8, 7,8, 8,5, 1,0, 0,00000, 0, 0, 0, 0, 0, 0 WRITE, 10032, 10018, 10018, 10018, 0,8, 0,6, 1,7, 2,7, 7,6, 8,5, 1,0, 0,00000, 0, 0, 0, 0, 0, 0 total, 20260, 20226, 20226, 20226, 0,8, 0,6, 1,7, 2,8, 7,7, 8,5, 1,0, 0,00000, 0, 0, 0, 0, 0, 0 READ, 21766, 11256, 11256, 11256, 0,7, 0,5, 1,6, 2,4, 7,1, 14,2, 2,0, 0,03979, 0, 0, 0, 0, 0, 0 WRITE, 21709, 11387, 11387, 11387, 0,7, 0,5, 1,5, 2,5, 9,0, 14,1, 2,0, 0,03979, 0, 0, 0, 0, 0, 0 total, 43475, 22639, 22639, 22639, 0,7, 0,5, 1,5, 2,4, 8,9, 14,2, 2,0, 0,03979, 0, 0, 0, 0, 0, 0 READ, 32515, 10600, 10600, 10600, 0,7, 0,5, 1,8, 4,0, 7,9, 11,8, 3,0, 0,02713, 0, 0, 0, 0, 0, 0 WRITE, 32311, 10454, 10454, 10454, 0,7, 0,5, 1,9, 4,5, 7,8, 10,5, 3,0, 0,02713, 0, 0, 0, 0, 0, 0 total, 64826, 21050, 21050, 21050, 0,7, 0,5, 1,9, 4,2, 7,9, 11,8, 3,0, 0,02713, 0, 0, 0, 0, 0, 0 READ, 42743, 10065, 10065, 10065, 0,8, 0,6, 2,0, 3,2, 8,4, 8,7, 4,1, 0,02412, 0, 0, 0, 0, 0, 0 WRITE, 42502, 10031, 10031, 10031, 0,8, 0,6, 1,9, 2,9, 7,4, 8,9, 4,1, 0,02412, 0, 0, 0, 0, 0, 0 total, 85245, 20095, 20095, 20095, 0,8, 0,6, 2,0, 3,0, 7,7, 8,9, 4,1, 0,02412, 0, 0, 0, 0, 0, 0 READ, 50019, 10183, 10183, 10183, 0,8, 0,6, 1,8, 2,7, 8,0, 11,1, 4,8, 0,01959, 0, 0, 0, 0, 0, 0 WRITE, 49981, 10473, 10473, 10473, 0,8, 0,5, 1,8, 2,7, 7,2, 12,3, 4,8, 0,01959, 0, 0, 0, 0, 0, 0 total, 100000, 20651, 20651, 20651, 0,8, 0,6, 1,8, 2,7, 7,6, 12,3, 4,8, 0,01959, 0, 0, 0, 0, 0, 0 Results: op rate : 20958 [READ:10483, WRITE:10476] partition rate : 20958 [READ:10483, WRITE:10476] row rate : 20958 [READ:10483, WRITE:10476] latency mean : 0,7 [READ:0,8, WRITE:0,7] latency median : 0,5 [READ:0,6, WRITE:0,5] latency 95th percentile : 1,8 [READ:1,8, WRITE:1,8] latency 99th percentile : 2,8 [READ:2,9, WRITE:3,0] latency 99.9th percentile : 6,5 [READ:7,8, WRITE:7,7] latency max : 14,2 [READ:14,2, WRITE:14,1] Total partitions : 100000 [READ:50019, WRITE:49981] Total errors : 0 [READ:0, WRITE:0] total gc count : 0 total gc mb : 0 total gc time (s) : 0 avg gc time(ms) : NaN stdev gc time(ms) : 0 Total operation time : 00:00:04

Here we can see that read latencies have been almost as low as write latencies, which even with the help of SSDs is still unexpected.
Checking nodetool cfhistograms confirms that we didn't hit a single SSTable:

adejanovski$ ccm node1 nodetool cfhistograms keyspace1 standard1
No SSTables exists, unable to calculate 'Partition Size' and 'Cell Count' percentiles
keyspace1/standard1 histograms
Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                              (micros)          (micros)           (bytes)
50%             0,00             11,86             17,08               NaN               NaN
75%             0,00             14,24             20,50               NaN               NaN
95%             0,00             20,50             29,52               NaN               NaN
98%             0,00             29,52             35,43               NaN               NaN
99%             0,00             42,51             51,01               NaN               NaN
Min             0,00              2,76              4,77               NaN               NaN
Max             0,00          25109,16          17436,92               NaN               NaN

As stated here, “No SSTables exists”.

After a few runs with different thread counts (the mixed workload runs several times, changing the number of threads), we finally get some flushes, which mildly change the distribution:

keyspace1/standard1 histograms
Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                              (micros)          (micros)           (bytes)
50%             0,00             11,86             17,08               310                 5
75%             0,00             14,24             20,50               310                 5
95%             1,00             20,50             61,21               310                 5
98%             1,00             42,51             88,15               310                 5
99%             1,00             88,15            105,78               310                 5
Min             0,00              1,92              3,97               259                 5
Max             1,00         155469,30          17436,92               310                 5

There we can see that we're reading 310-byte partitions, and still hitting only memtables in at least 75% of our reads.
Anyone operating Cassandra in production would love to see such output when running cfhistograms :)

As a best practice, always run separate write and read commands, each with an appropriate fixed throttle rate. This gives you a properly specified read-to-write ratio, mitigates the memtable and flushing issue, and avoids coordinated omission.

Clearly the predefined mode is of limited interest, and one might want to use a very high number of iterations to ensure data is accessed both in memory and on disk.

Let’s now dive into user defined workloads, which should prove way more useful.

User defined mode

The user defined mode of cassandra-stress allows running performance tests on custom data models, using yaml files for configuration.
I will not detail how that works here; instead I invite you to check this Datastax blog post and this Instaclustr blog post, which should give you a good understanding of how to model your custom workload in cassandra-stress.

Our use case for cassandra-stress was to test how a customer workload would behave on different instance sizes in AWS. The plan was to model every table in yml files, try replicating partition size distribution and row mean size and use the rate limiter to be as close as possible to the observed production workload.

No support for maps or UDTs

The first problem we ran into is that cassandra-stress doesn't support maps, and we had a map in a high traffic table. Changing that meant going deep into the cassandra-stress code to add support, which was not feasible in the time we had allocated to this project.
The poor man's solution we chose here was to replace the map with a text column. While not the same thing, we were lucky enough that the map was written immutably only once and not partially updated over time. Getting the size right for that field was then done by trying different settings and checking the mean row size.

Getting partition size distribution is hard

To be fair, cassandra-stress does a great job of letting you specify a distribution for partition sizes. As shown in the above links, you can use GAUSSIAN, EXTREME, UNIFORM, EXP and FIXED as distribution patterns.
Getting the right distribution will require several attempts though, and any change to the mean row size will require tweaking the distribution parameters again to get a cfhistograms output that looks like your production one.

To batch or not to batch, it should be up to you…

There is no straightforward way to tell cassandra-stress not to use batches for queries and to send them individually instead. You have to take the upper bound of your cluster distribution and use it as the divider for the select distribution ratio:

...
  - name: event_id
    population: UNIFORM(1..100B)
    cluster: EXTREME(10..500, 0.4)
...
insert:
  partitions: FIXED(1)
  select: FIXED(1)/500
  batchtype: UNLOGGED
...

Unfortunately, and as reported in this JIRA ticket, the current implementation prevents it from working as one could expect.
Diving into the code, it appears clearly that all rows in a partition will get batched together, whatever happens (from SchemaInsert.java):

public boolean run() throws Exception
{
    List<BoundStatement> stmts = new ArrayList<>();
    partitionCount = partitions.size();

    for (PartitionIterator iterator : partitions)
        while (iterator.hasNext())
            stmts.add(bindRow(iterator.next()));

    rowCount += stmts.size();

    // 65535 is max number of stmts per batch, so if we have more, we need to manually batch them
    for (int j = 0 ; j < stmts.size() ; j += 65535)
    {
        List<BoundStatement> substmts = stmts.subList(j, Math.min(j + stmts.size(), j + 65535));
        Statement stmt;
        if (stmts.size() == 1)
        {
            stmt = substmts.get(0);
        }
        else
        {
            BatchStatement batch = new BatchStatement(batchType);
            batch.setConsistencyLevel(JavaDriverClient.from(cl));
            batch.addAll(substmts);
            stmt = batch;
        }

        client.getSession().execute(stmt);
    }
    return true;
}

All partitions in an operation will get batched by chunks of 65k queries at most, with no trace of the batchSize argument that exists in the SchemaInsert constructor.

Since in our specific case, we were using Apache Cassandra 2.1, we patched that code path in order to get proper batch sizes (in our case, one row per batch).
The patch sadly does not apply on trunk as there were other changes made in between that prevent it from working correctly.

The code here should also be further optimized by using asynchronous queries instead of batches, as this has been considered a best practice for a long time now, especially if your operations contain several partitions.
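If you are writing your own load generator rather than patching cassandra-stress, the asynchronous approach is easy to reproduce. Here is a rough sketch with the DataStax Python driver; the contact point and the keyspace1.standard1 table come from the examples above, the row values are made up, and this is an illustration rather than cassandra-stress's own Java code:

from cassandra.cluster import Cluster

# Send each row as its own asynchronous INSERT instead of grouping rows into batches.
cluster = Cluster(["10.0.1.24"])
session = cluster.connect("keyspace1")
insert = session.prepare('INSERT INTO standard1 (key, "C0") VALUES (?, ?)')

rows = [(b"key-%d" % i, b"value-%d" % i) for i in range(1000)]  # made-up payload
futures = [session.execute_async(insert, row) for row in rows]
for future in futures:
    future.result()  # block until the write completes; raises on error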

I’ve submitted a new patch on CASSANDRA-11105 to fix the issue on the 4.0 trunk and switch to asynchronous queries instead of synchronous ones. You can still use batches of course if that fits your use case.

You can also consider the visits switch on inserts. It allows spreading the inserts for each partition in chunks, which will break partitions into several operations over the course of the stress test. The difficult part here is that you will have to closely correlate the number of iterations, the number of possible distinct partition keys, the number of visits per partition and the select ratio in order to get a representative distribution of inserts, otherwise the resulting partition sizes won't match reality.

On the bright side, the visits switch can bring you closer to real life scenarios by spreading the inserts of a partition over time instead of doing it all at once (batched or not). The effect is that your rows will get spread over multiple SSTables instead of living in a single one.

You can achieve something similar by running several write tests before starting your read test.

Using our example from above, we could make sure each partition spreads its rows over different operations using the following command line:

cassandra-stress user profile=my_stress_config.yaml n=100000 'ops(insert=1)' cl=LOCAL_QUORUM truncate=always no-warmup -node 10.0.1.24 -mode native cql3 compression=lz4 -rate threads=20 -insert visits=FIXED\(500\)

Note that if you're allowing a great number of distinct partition keys in your yaml, using 100k iterations as above will probably create small partitions (a rough back-of-the-envelope check follows the list), because:

  • you won’t run enough operations to create enough distinct partitions
  • Some partitions will be visited way more times than they have rows, which will generate a lot of overwrites
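Putting rough numbers on the first point above (the partition count is hypothetical; the visits value comes from the command line above):

# Rough sanity check of the iteration count needed for the example above:
# with -insert visits=FIXED(500) and select FIXED(1)/500, each partition needs
# about 500 operations before all of its rows have been written.
distinct_partitions = 1000        # hypothetical number of partitions you want on disk
visits_per_partition = 500        # from -insert visits=FIXED(500)
needed_iterations = distinct_partitions * visits_per_partition
print(needed_iterations)          # 500000, far more than the 100k iterations used above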

You could try using the same distribution as your clustering key, but the two won't be correlated and you'll end up with batches being generated.

You can find some information on the -insert switch in this JIRA comment.

Rate limiter ftw!

Once the yaml files are ready, you'll have to find the right arguments for your cassandra-stress command line, which is not for the faint of heart. There are lots of switches, and the documentation is not always as thorough as one would hope.
One feature that we were particularly interested in was the rate limiter. Since we had to model a production load over more than 10 tables, it was clear that all of this would be useless if each table weren't rate limited to match its specific production rate.
The -rate switch allows specifying both the number of threads used by cassandra-stress and the rate limit:

cassandra-stress ... -rate threads=${THREADS} fixed="${RATE}/s"

We quickly observed while testing that the rate limiter was not limiting queries but operations, and an operation can contain several partitions, which can contain several rows. This makes sense if you account for the bug above on batch sizes, as one operation would most often be performed as a single query. Since being able to mimic the production rate was mandatory, we had to patch cassandra-stress to rate limit each individual query.

Make sure you understand coordinated omission before running cassandra-stress with the rate limiter (which you should almost always do) and use fixed instead of throttle on the rate limiter.

The problem we couldn’t fix : compression

With our fancy patched stress.jar, we went and ran the full stress test suite: rate limits were respected, the mean row size was right and each query was sent individually.
Alas, our nodes soon backed up on compactions, which was totally unexpected as we were using SSDs that could definitely keep up with our load.

The good news was that our biggest table was using TWCS with a 1 day time window, which allowed us to get the exact size on disk that was generated for each node per day. After letting the stress test run for a full day and waiting for compaction to catch up, it appeared we had been generating 4 to 6 times the size of what we had in production.
It took some time to realize that the problem came from compression. While in production we had a compression ratio ranging from 0.15 to 0.25, the stress test environment showed ratios going from 0.6 to 0.9 in the worst case.
Our explanation for that behavior is that cassandra-stress generates highly randomized sequences of characters (many outside the [a-zA-Z0-9] range) that compress poorly, for lack of repetition. Data in a real table usually follows real life patterns that are prone to repeated character sequences. Those compress well.
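You can get a feel for that difference with a few lines of Python: random bytes barely compress, while repetitive, real-life-looking text compresses very well. This is a standalone illustration, not something cassandra-stress does:

import os
import zlib

random_blob = os.urandom(100000)  # highly randomized data, like stress-generated blobs
repetitive_text = b"2017-02-07 user=alice action=login status=ok\n" * 2500  # log-like data

for name, data in (("random", random_blob), ("repetitive", repetitive_text)):
    compressed = zlib.compress(data)
    # random data yields a ratio close to 1.0, repetitive text far below it
    print(name, "ratio = {:.2f}".format(len(compressed) / len(data)))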

Such a difference in compression ratio is a big deal when it comes to I/O. For the same uncompressed data size, we couldn't keep up with compactions and read latencies were way higher.
The only workaround we found, and it's a poor man's one once again, was to reduce our mean row size to match the size on disk. This is clearly unsatisfactory, as the uncompressed size in memory then no longer matches the production one, which can change the heap usage of our workload.

Conclusion

There are a few deal breakers right now when using cassandra-stress to model a real life workload.
If you've tried to use it for that purpose, it is highly probable that you've developed a love/hate relationship with the tool: it is very powerful in theory and is the natural solution to your problem, but in real life it can be difficult and time-consuming to get close to a production stress case against an existing, defined data model.

On the bright side, there are no fundamental flaws, and with a coordinated effort, without even redesigning it from the ground up, we could easily make it better.

Beware of short stress sessions, they have very limited value: you must be accessing data from disk in order to have realistic reads, and you must wait for serious compaction to kick in in order to get something that looks like the real world.

I invite you to watch Chris Batey’s talk at Cassandra Summit 2016 for more insights on cassandra-stress.

Categories: FLOSS Project Planets

Weekly Python Chat: Working with iterables: itertools &amp; more

Planet Python - Tue, 2017-02-07 20:00

The itertools library is one of my favorite standard library modules. Let's talk about itertools and other utilities for working with iterables!

Categories: FLOSS Project Planets

Vincent Bernat: Write your own terminal emulator

Planet Debian - Tue, 2017-02-07 19:54

I was a happy user of rxvt-unicode until I got a laptop with a HiDPI display. Switching from a LoDPI to a HiDPI screen and back was a pain: I had to manually adjust the font size on all terminals or restart them.

VTE is a library to build a terminal emulator using the GTK+ toolkit, which handles DPI changes. It is used by many terminal emulators, like GNOME Terminal, evilvte, sakura, termit and ROXTerm. The library is quite straightforward and writing a terminal doesn’t take much time if you don’t need many features.

Let’s see how to write a simple one.

A simple terminal§

Let’s start small with a terminal with the default settings. We’ll write that in C. Another supported option is Vala.

#include <vte/vte.h>

int
main(int argc, char *argv[])
{
    GtkWidget *window, *terminal;

    /* Initialise GTK, the window and the terminal */
    gtk_init(&argc, &argv);
    terminal = vte_terminal_new();
    window = gtk_window_new(GTK_WINDOW_TOPLEVEL);
    gtk_window_set_title(GTK_WINDOW(window), "myterm");

    /* Start a new shell */
    gchar **envp = g_get_environ();
    gchar **command = (gchar *[]){g_strdup(g_environ_getenv(envp, "SHELL")), NULL };
    g_strfreev(envp);
    vte_terminal_spawn_sync(VTE_TERMINAL(terminal),
        VTE_PTY_DEFAULT,
        NULL,         /* working directory */
        command,      /* command */
        NULL,         /* environment */
        0,            /* spawn flags */
        NULL, NULL,   /* child setup */
        NULL,         /* child pid */
        NULL, NULL);

    /* Connect some signals */
    g_signal_connect(window, "delete-event", gtk_main_quit, NULL);
    g_signal_connect(terminal, "child-exited", gtk_main_quit, NULL);

    /* Put widgets together and run the main loop */
    gtk_container_add(GTK_CONTAINER(window), terminal);
    gtk_widget_show_all(window);
    gtk_main();
}

You can compile it with the following command:

gcc -O2 -Wall $(pkg-config --cflags --libs vte-2.91) term.c -o term

And run it with ./term:

More features§

From here, you can have a look at the documentation to alter behavior or add more features. Here are three examples.

Colors§

You can define the 16 basic colors with the following code:

#define CLR_R(x)   (((x) & 0xff0000) >> 16)
#define CLR_G(x)   (((x) & 0x00ff00) >> 8)
#define CLR_B(x)   (((x) & 0x0000ff) >> 0)
#define CLR_16(x)  ((double)(x) / 0xff)
#define CLR_GDK(x) (const GdkRGBA){ .red = CLR_16(CLR_R(x)), \
                                    .green = CLR_16(CLR_G(x)), \
                                    .blue = CLR_16(CLR_B(x)), \
                                    .alpha = 0 }

vte_terminal_set_colors(VTE_TERMINAL(terminal),
    &CLR_GDK(0xffffff),
    &(GdkRGBA){ .alpha = 0.85 },
    (const GdkRGBA[]){
        CLR_GDK(0x111111), CLR_GDK(0xd36265), CLR_GDK(0xaece91), CLR_GDK(0xe7e18c),
        CLR_GDK(0x5297cf), CLR_GDK(0x963c59), CLR_GDK(0x5E7175), CLR_GDK(0xbebebe),
        CLR_GDK(0x666666), CLR_GDK(0xef8171), CLR_GDK(0xcfefb3), CLR_GDK(0xfff796),
        CLR_GDK(0x74b8ef), CLR_GDK(0xb85e7b), CLR_GDK(0xA3BABF), CLR_GDK(0xffffff)
    }, 16);

While you can't see it in the screenshot[1], this also enables background transparency.

Miscellaneous settings§

VTE comes with many settings to change the behavior of the terminal. Consider the following code:

vte_terminal_set_scrollback_lines(VTE_TERMINAL(terminal), 0);
vte_terminal_set_scroll_on_output(VTE_TERMINAL(terminal), FALSE);
vte_terminal_set_scroll_on_keystroke(VTE_TERMINAL(terminal), TRUE);
vte_terminal_set_rewrap_on_resize(VTE_TERMINAL(terminal), TRUE);
vte_terminal_set_mouse_autohide(VTE_TERMINAL(terminal), TRUE);

This will:

  • disable the scrollback buffer,
  • not scroll to the bottom on new output,
  • scroll to the bottom on keystroke,
  • rewrap content when the terminal size changes, and
  • hide the mouse cursor when typing.
Update the window title§

An application can change the window title using XTerm control sequences (for example, with printf "\e]2;${title}\a"). If you want the actual window title to reflect this, you need to define this function:

static gboolean
on_title_changed(GtkWidget *terminal, gpointer user_data)
{
    GtkWindow *window = user_data;
    gtk_window_set_title(window,
        vte_terminal_get_window_title(VTE_TERMINAL(terminal))?:"Terminal");
    return TRUE;
}

Then, connect it to the appropriate signal, in main():

g_signal_connect(terminal, "window-title-changed",
                 G_CALLBACK(on_title_changed), GTK_WINDOW(window));

Final words§

I don't need much more as I am using tmux inside each terminal. In my own copy, I have also added the ability to complete a word using words from the current window or other windows (also known as dynamic abbrev expansion). This requires implementing a terminal daemon to handle all terminal windows in one process, similar to urxvtcd.

While writing a terminal "from scratch"[2] suits my needs, it may not be worth it. evilvte is quite customizable and can be lightweight. Consider it as a first alternative. Honestly, I don't remember why I didn't pick it.

UPDATED: evilvte has not seen an update since 2014. Its GTK+3 support is buggy. It doesn’t support the latest versions of the VTE library. Therefore, it’s not a good idea to use it.

You should also note that the primary goal of VTE is to be a library to support GNOME Terminal. Notably, if a feature is not needed for GNOME Terminal, it won't be added to VTE. If it already exists, it will likely be deprecated and removed.

  1. Transparency is handled by the composite manager (Compton, in my case). 

  2. For some definition of “scratch” since the hard work is handled by VTE

Categories: FLOSS Project Planets

Deeson: Getting started with Drupal 8 and Composer

Planet Drupal - Tue, 2017-02-07 19:00

At Deeson we are constantly looking for ways to improve the way we work, iterating on past projects to incorporate new techniques and best-practices.

We started playing with Composer recently, as a tool for speeding up Drupal module updates. There were a few quirks, but it generally worked.

Then we discovered the Drupal Composer project, which makes it much simpler to manage Drupal core as well as modules.

This is great! We can use Composer to install and update Drupal core, modules and themes, and all in a consistent manner; anyone can run Composer and end up with the exact same set of code.

So now we can start excluding some of the off-the-shelf code from our Git repository (contrib modules, themes, and libraries.) This slims down our repositories and speeds up development for the whole team.

Combined with our approach to managing settings we’re really starting to limit the amount of custom stuff in the docroot now.

Having recently completed a site using this approach I started thinking: “Why do we even need the docroot in Git?”

So we got rid of it! One of the many benefits of working in self-managing teams!

We now have a very flat repository structure where the entire docroot is compiled during deployments. The project repository contains a CMI config directory, settings.php, modules and themes directories, and the all-important composer.json which manages everything that isn’t project-specific custom code.

Internally we use Bitbucket pipelines to manage building and deploying our projects.

Every commit triggers a pipelines build. The docroot is built, tests are run, and if all goes well, it gets pushed to the hosting platform.

We have put together a small Composer script which simply symlinks the modules, themes and settings into the docroot when running composer install. The rest of the build is vanilla composer.

Our composer.json is based on the version provided by Drupal Composer, but with a change to the post-install-cmd and post-update-cmd hooks.

"scripts": { "drupal-scaffold": "DrupalComposer\\DrupalScaffold\\Plugin::scaffold", "pre-install-cmd": [ "DrupalProject\\composer\\ScriptHandler::checkComposerVersion" ], "pre-update-cmd": [ "DrupalProject\\composer\\ScriptHandler::checkComposerVersion" ], "post-install-cmd": [ "@drupal-scaffold", "DrupalProject\\composer\\DeesonScriptHandler::createRequiredFiles" ], "post-update-cmd": [ "@drupal-scaffold", "DrupalProject\\composer\\DeesonScriptHandler::createRequiredFiles" ] }

Here we have replaced the script handler plugin with our own customised version, which creates the symlinks mentioned above during composer install and composer update. We also run the Drupal Scaffold plugin on every Composer install or Composer update, to ensure that all of the extra Drupal files like index.php and update.php exist in the docroot.

Taking the Drupal docroot out of our project repositories has required a shift in the way we think about developing Drupal projects, but ultimately we believe it has streamlined and simplified our development workflows.

We have turned this approach into a Drupal 8 Quick Start template, which can be used to very quickly get up and running with Drupal 8 using a composer-based workflow. The project is available on Github. PRs welcome!

Categories: FLOSS Project Planets

Carl Chenet: The Gitlab database incident and the Backup Checker project

Planet Python - Tue, 2017-02-07 19:00

The Gitlab.com database incident of 2017/01/31 and the resulting data loss reminded everyone (at least for the next few days) how easy it is to lose data, even when you think all your systems are safe.

Being really interested in the process of backing up data, I read the report with interest (kudos to the Gitlab company for being so transparent about it) and I was soooo excited to find the following sentence:

Regular backups seem to also only be taken once per 24 hours, though team-member-1 has not yet been able to figure out where they are stored. According to team-member-2 these don’t appear to be working, producing files only a few bytes in size.

Whoa, guys! I'm so sorry for you about the data loss, but from my point of view I was so excited to find a big FOSS company publicly admitting and communicating about a perfect use case for the Backup Checker project, a Free Software project I've been writing these last years.

Data loss: nobody cares before, everybody cries after

Usually people don't care about backups. It's a serious business for web hosters and for the backup teams of big companies, but otherwise and in other places, nobody cares.

Usually everybody agrees on how important backups are, but few people make them or install an automated system to create them, and before the day they are needed, nobody verifies they are usable. The reason is obvious: it's totally boring, and in some cases, e.g. for large archives, difficult.

Because verifying backups is boring for humans, I launched the Backup Checker project in order to automate this task.

Backup Checker offers a wide range of features, checking lots of different archive types (tar.{gz,bz2,xz}, zip, trees of files) and offering lots of different tests (hash sum, size {equal, smaller/greater than}, unix rights, …). Have a look at the official documentation for an exhaustive list of features and possible tests.

Automate the checks of your backups with Backup Checker

Checking your backups means describing in a configuration file what a backup should look like, e.g. a gzipped database dump. You usually know roughly what size the archive is going to be, and what the owner and the group owner should be.

Even easier, with Backup Checker you can generate this list of criteria from an actual archive, and remove unneeded criteria to create a template you can re-use for different kinds of archives.

Ok, 2 minutes of your time for a real world example: I use an existing database SQL dump in a tar.gz archive to automatically create the list describing this backup:

$ backupchecker -G database-dump.tar.gz
$ cat database-dump.list
[archive]
mtime| 1486480274.2923253

[files]
database.sql| =7854803 uid|1000 gid|1000 owner|chaica group|chaica mode|644 type|f mtime|1486480253.0

Now, just remove the overly precise parameters from this list to get a backup template. Here is a possible result:

[files]
database.sql| >6m uid|1000 gid|1000 mode|644 type|f

We define here a template for the archive, meaning that the database.sql file in the archive should have a size greater than 6 megabytes, be owned by the user with uid 1000 and the group with gid 1000, have mode 644 and be a regular file. In order to use a template instead of the complete list, you also need to remove the sha512 from the .conf file.

Pretty easy, hmm? Ok, just for fun, let's replicate the part of the Gitlab.com database incident mentioned above and create an archive with an empty sql dump inside:

$ touch /tmp/database.sql && \
  tar zcvf /tmp/database-dump.tar.gz /tmp/database.sql && \
  cp /tmp/database-dump.tar.gz .

Now we launch Backup Checker with the previously created template. If you didn't change the name of the database-dump.list file, the command should simply be:

$ backupchecker -C database-dump.conf
$ cat a.out
WARNING:root:1 file smaller than expected while checking /tmp/article-backup-checker/database-dump.tar.gz:
WARNING:root:database.sql size is 0. Should have been bigger than 6291456.

The automated controls of Backup Checker trigger a warning in the log file. The empty sql dump has been identified inside the archive.
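To make the idea concrete, a check of that kind boils down to very little code; here is a tiny standalone sketch (a simplified illustration with a hard-coded threshold, not Backup Checker's actual implementation):

import tarfile

MIN_SIZE = 6 * 1024 * 1024  # expect the dump to be bigger than 6 MB

with tarfile.open("database-dump.tar.gz", "r:gz") as archive:
    for member in archive.getmembers():
        if member.name.endswith("database.sql") and member.size < MIN_SIZE:
            print("WARNING: {} size is {}. Should have been bigger than {}.".format(
                member.name, member.size, MIN_SIZE))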

A step further

As you could read in this article, verifying some of your backups is not a time consuming task, given that you have a FOSS project dedicated to this task, with an easy way to create a template of your backups and to use it.

This article provided a really simple example of such a use case; Backup Checker has lots more features to offer when verifying your backups. Read the official documentation for more complete descriptions of the available possibilities.

Data loss, especially for projects storing user data, is always a terrible event in the life of an organization. Let's try to learn from mistakes which could happen to anyone and build better backup systems.

More information about the Backup Checker project

Categories: FLOSS Project Planets


Agaric Collective: Help kick the door for new contributors to Drupal back open (and get credit for it)

Planet Drupal - Tue, 2017-02-07 17:15

After years of giving a terrible initial experience to people who want to share their first project on Drupal.org, the Project Applications Process Revamp is a Drupal Association key priority for the first part of 2017.

A plan for incentivizing code review of every project, not just new ones, after the project applications revamp is open for suggestions and feedback.

Which makes it excellent timing that right now you can get credit on your Drupal.org profile and that of your organization, boosting marketplace ranking, for reviewing the year-old backlog of project applications awaiting review. The focus is on security review for these project applications, but if you want to give a thorough review and then share your thoughts on how project reviews (for any project that opts in to this quality marker) should be performed and rewarded going forward, now's the time and here's the pressing need.

Categories: FLOSS Project Planets

Craig Small: WordPress 4.7.2

Planet Debian - Tue, 2017-02-07 16:53

When WordPress originally announced their latest security update, there were three security fixes. While all security updates can be serious, they didn’t seem too bad. Shortly after, they updated their announcement with a fourth and more serious security problem.

I have looked after the Debian WordPress package for a while. This is the first time I have heard people actually having their sites hacked almost as soon as this vulnerability was announced.

If you are running WordPress 4.7 or 4.7.1, your website is vulnerable and there are bots out there looking for it. You should immediately upgrade to 4.7.2 (or, if there is a later 4.7.x version, to that). There are now updated Debian wordpress 4.7.2 packages for unstable, testing and stable backports.

For stable, you are on a patched version of 4.1 which doesn't have this specific vulnerability (it was introduced in 4.7), but you should be using 4.1+dfsg-1+deb8u12, which has the fixes found in 4.7.1 ported back to the 4.1 code.

Categories: FLOSS Project Planets

Bits from Debian: DebConf17: Call for Proposals

Planet Debian - Tue, 2017-02-07 16:00

The DebConf Content team would like to Call for Proposals for the DebConf17 conference, to be held in Montreal, Canada, from August 6 through August 12, 2017.

You can find this Call for Proposals in its latest form at: https://debconf17.debconf.org/cfp

Please refer to this URL for updates on the present information.

Submitting an Event

Submit an event proposal and describe your plan. Please note, events are not limited to traditional presentations or informal sessions (BoFs). We welcome submissions of tutorials, performances, art installations, debates, or any other format of event that you think would be beneficial to the Debian community.

Please include a short title, suitable for a compact schedule, and an engaging description of the event. You should use the field "Notes" to provide us information such as additional speakers, scheduling restrictions, or any special requirements we should consider for your event.

Regular sessions may either be 20 or 45 minutes long (including time for questions), other kinds of sessions (like workshops) could have different durations. Please choose the most suitable duration for your event and explain any special requests.

You will need to create an account on the site to submit a talk. We'd encourage Debian account holders (e.g. DDs) to use Debian SSO when creating an account. But this isn't required; you can also sign up with an e-mail address and password.

Timeline

The first batch of accepted proposals will be announced in April. If you depend on having your proposal accepted in order to attend the conference, please submit it as soon as possible so that it can be considered during this first evaluation period.

All proposals must be submitted before Sunday 4 June 2017 to be evaluated for the official schedule.

Topics and Tracks

Though we invite proposals on any Debian or FLOSS related subject, we have some broad topics on which we encourage people to submit proposals, including:

  • Blends
  • Debian in Science
  • Cloud and containers
  • Social context
  • Packaging, policy and infrastructure
  • Embedded
  • Systems administration, automation and orchestration
  • Security

You are welcome to either suggest more tracks, or become a coordinator for any of them; please refer to the Content Tracks wiki page for more information on that.

Code of Conduct

Our event is covered by a Code of Conduct designed to ensure everyone's safety and comfort. The code applies to all attendees, including speakers and the content of their presentations. For more information, please see the Code on the Web, and do not hesitate to contact us at content@debconf.org if you have any questions or are unsure about certain content you'd like to present.

Video Coverage

Providing video of sessions amplifies DebConf achievements and is one of the conference goals. Unless speakers opt-out, official events will be streamed live over the Internet to promote remote participation. Recordings will be published later under the DebConf license, as well as presentation slides and papers whenever available.

DebConf would not be possible without the generous support of all our sponsors, especially our Platinum Sponsor Savoir-Faire Linux. DebConf17 is still accepting sponsors; if you are interested, or think you know of others who would be willing to help, please get in touch!

In case of any questions, or if you wanted to bounce some ideas off us first, please do not hesitate to reach out to us at content@debconf.org.

We hope to see you in Montreal!

The DebConf team

Categories: FLOSS Project Planets

Jonathan McDowell: GnuK on the Maple Mini

Planet Debian - Tue, 2017-02-07 14:34

Last weekend, as a result of my addiction to buying random microcontrollers to play with, I received some Maple Minis. I bought the Baite clone direct from AliExpress - so just under £3 each including delivery. Not bad for something that’s USB capable, is based on an ARM and has plenty of IO pins.

I’m not entirely sure what my plan is for the devices, but as a first step I thought I’d look at getting GnuK up and running on it. Only to discover that chopstx already has support for the Maple Mini and it was just a matter of doing a ./configure --vidpid=234b:0000 --target=MAPLE_MINI --enable-factory-reset ; make. I’d hoped to install via the DFU bootloader already on the Mini but ended up making it unhappy so used SWD by following the same steps with OpenOCD as for the FST-01/BusPirate. (SWCLK is D21 and SWDIO is D22 on the Mini). Reset after flashing and the device is detected just fine:

usb 1-1.1: new full-speed USB device number 73 using xhci_hcd
usb 1-1.1: New USB device found, idVendor=234b, idProduct=0000
usb 1-1.1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 1-1.1: Product: Gnuk Token
usb 1-1.1: Manufacturer: Free Software Initiative of Japan
usb 1-1.1: SerialNumber: FSIJ-1.2.3-87155426

And GPG is happy:

$ gpg --card-status
Reader ...........: 234B:0000:FSIJ-1.2.3-87155426:0
Application ID ...: D276000124010200FFFE871554260000
Version ..........: 2.0
Manufacturer .....: unmanaged S/N range
Serial number ....: 87155426
Name of cardholder: [not set]
Language prefs ...: [not set]
Sex ..............: unspecified
URL of public key : [not set]
Login data .......: [not set]
Signature PIN ....: forced
Key attributes ...: rsa2048 rsa2048 rsa2048
Max. PIN lengths .: 127 127 127
PIN retry counter : 3 3 3
Signature counter : 0
Signature key ....: [none]
Encryption key....: [none]
Authentication key: [none]
General key info..: [none]

While GnuK isn’t the fastest OpenPGP smart card implementation this certainly seems to be one of the cheapest ways to get it up and running. (Plus the fact that chopstx already runs on the Mini provides me with a useful basis for other experimentation.)

Categories: FLOSS Project Planets

Olivier Berger: Making Debian stable/jessie images for OpenStack with bootstrap-vz and cloud-init

Planet Debian - Tue, 2017-02-07 13:09

I’m investigating the creation of VM images for different virtualisation solutions.

Among the target platforms is a destop as a service platform based on an OpenStack public cloud.

We’ve been working with bootstrap-vz for creating VMs for Vagrant+VirtualBox so I wanted to test its use for OpenStack.

There are already pre-made images available, including official Debian ones, but I like to be able to re-create things instead of depending on some external magic (which also means to be able to optimize, customize and avoid potential MitM, of course).

It appears that bootstrap-vz can be used with cloud-init provided that some bits of config are specified.

In particular, the cloud_init plugin of bootstrap-vz requires a metadata_source set to "NoCloud, ConfigDrive, OpenStack, Ec2". Note we explicitly spell it 'OpenStack' and not 'Openstack', as was mistakenly done in the default Debian cloud images (see https://bugs.debian.org/854482).

The following snippet of a manifest provides the necessary bits:

---
name: debian-{system.release}-{system.architecture}-{%Y}{%m}{%d}
provider:
  name: kvm
  virtio_modules:
    - virtio_pci
    - virtio_blk
bootstrapper:
  workspace: /target
  # create or reuse a tarball of packages
  tarball: true
system:
  release: jessie
  architecture: amd64
  bootloader: grub
  charmap: UTF-8
  locale: en_US
  timezone: UTC
volume:
  backing: raw
  partitions:
    #type: gpt
    type: msdos
    root:
      filesystem: ext4
      size: 4GiB
    swap:
      size: 512MiB
packages:
  # change if another mirror is closer
  mirror: http://ftp.fr.debian.org/debian/
plugins:
  root_password:
    password: whatever
  cloud_init:
    username: debian
    # Note we explicitely spell it 'OpenStack' and not 'Openstack' as done in the default Debian cloud images (see https://bugs.debian.org/854482)
    metadata_sources: NoCloud, ConfigDrive, OpenStack, Ec2
  # admin_user:
  #   username: Administrator
  #   password: Whatever
  minimize_size:
    # reduce the size by around 250 Mb
    zerofree: true

I've tested this with the bootstrap-vz version in stretch/testing (0.9.10+20170110git-1) for creating jessie/stable images, which were booted on the OVH OpenStack public cloud. YMMV.

Hope this helps

Categories: FLOSS Project Planets

InternetDevels: Drupal 8 SEO Checklist module: your reliable website optimization adviser

Planet Drupal - Tue, 2017-02-07 12:18

To do: improve your Drupal website's SEO. This sounds like a pretty big task when it's written out like that on your list! ;) Big tasks are easier to cope with when they are divided into clear, smaller steps. Great news! You can have a smart adviser, the SEO Checklist module, which can give you this list of steps and prompt you on how to complete them.

Read more
Categories: FLOSS Project Planets

Asankha Perera: AdroitLogic Announces Major New Release of UltraESB with an Enterprise Integration Platform and Developer Studio

Planet Apache - Tue, 2017-02-07 11:19
A short while ago, we announced a major update to the UltraESB Enterprise Service Bus, called the UltraESB-X release. This is a rewrite of our high performance UltraESB run-time on top of a new modular framework code-named Project-X. The new release retains the performance benefits introduced by the UltraESB, supporting HTTP/S exchanges with zero-copy data transfers and non-blocking IO (NIO). This performance edge prompted Walmart to license our technology in 2013 to power its new integration platform, and Kuoni GTA to support over half a billion requests per day with its B2B travel API since 2014.
We also released UltraStudio, a rich integration solution development studio to increase developer productivity. It enriches the end-to-end experience of integration development with powerful extensions and plugins for the popular IntelliJ IDEA IDE. UltraStudio allows users to develop integration flows with a drag-and-drop palette of connectors and processors, and helps to test, debug and trace integration flows through the execution path, viewing properties at each stage. The palette of connectors and processors keeps growing, and it is a simple effort for end users to develop custom connectors and message processors. The YouTube video linked from Sajith's blog shows this in action.

The new Integration Platform announced today offers integration capabilities as a service, over private, public or hybrid cloud based Docker containers, utilizing the Kubernetes framework for managing the run-time and availability. The Integration Platform becomes available as an on-demand service to multiple organizational units within an enterprise, allowing users to target different UltraESB-X deployments with varying availability requirements over a set of shared physical resources. The platform manages the deployments and automatically repairs failed instances, guaranteeing availability and stability. In the event of a container failure, the incarnation details of the instance are retained to allow troubleshooting of the root cause, so container instances retain state in an otherwise stateless container deployment to facilitate management. The metrics of each UltraESB-X runtime are reported to ElasticSearch, which then powers a consolidated management and monitoring dashboard for the platform.

Try out the new products from our website, and the integration resources, samples and documentation from the new Developer site at http://developer.adroitlogic.org
Categories: FLOSS Project Planets

Flocon de toile | Freelance Drupal: Create an action for custom mass updates with Drupal 8

Planet Drupal - Tue, 2017-02-07 11:00

Drupal 8 makes it possible to carry out certain mass actions on a site's content, such as publishing or unpublishing many items at once, promoting them to the top of lists, etc. It can be useful to offer certain user profiles custom actions tailored to the specifics of their site, such as highlighting certain taxonomy terms or changing the value of a specific field, thus sparing users heavy and tedious update operations on each piece of content to be modified.

Categories: FLOSS Project Planets

Coding Diet: Minor Refactorings

Planet Python - Tue, 2017-02-07 10:28

Looking through the code of an open-source Python project I came across code that amounts to the following:

host_name = self.connection.getsockname()[0]
if self.server.host_name is not None:
    host_name = self.server.host_name

This looks odd because, if the condition is True, we forget about the host_name derived from the call to self.connection.getsockname. However, it could be that that call has some side-effect. Assuming that it does not, we can re-write the above code as:

if self.server.host_name is not None:
    host_name = self.server.host_name
else:
    host_name = self.connection.getsockname()[0]

Or even, if you prefer:

host_name = self.server.host_name if self.server.host_name is not None else self.connection.getsockname()[0]

Or:

host_name = self.server.host_name
if host_name is None:
    host_name = self.connection.getsockname()[0]

Or finally:

host_name = self.server.host_name or self.connection.getsockname()[0]

Note that this last version has slightly different semantics in that it will use the call to getsockname if self.server.host_name is at all falsey, in particular if it is ''.
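To make the difference concrete, here is a small, hypothetical illustration (the variable names are made up and not taken from the project in question):

server_host_name = ''                    # falsey, but deliberately set
fallback = 'name-from-getsockname'

# The `or` version silently replaces the empty string with the fallback.
print(server_host_name or fallback)      # prints: name-from-getsockname

# The explicit None check keeps the empty string.
print(server_host_name if server_host_name is not None else fallback)  # prints an empty line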

I love finding minor refactoring opportunities like this. I enjoy going through code and finding such cases. I believe that if a project's source code has many such opportunities, it hints that the authors do not engage in routine refactoring. However, this post is more about how such a refactoring interacts with Python's `@property` decorator.

The @property decorator

In this case we call getsockname, so when considering our refactoring we have to be careful that there are no significant side-effects. However even if the code had been:

host_name = self.connection.host_name
if self.server.host_name is not None:
    host_name = self.server.host_name

We would still have to go and check the code to make sure that neither connection nor host_name is implemented as a property, or, if one of them is, that it does not have significant side-effects.
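As a hypothetical sketch (these names are invented, not taken from the project), an apparently plain attribute read can hide a side-effect behind @property:

class Connection:
    def __init__(self, raw_host_name):
        self._raw_host_name = raw_host_name
        self.lookups = 0

    @property
    def host_name(self):
        # Looks like a plain attribute read, but mutates state on every
        # access -- a stand-in for heavier side-effects such as I/O.
        self.lookups += 1
        return self._raw_host_name

conn = Connection('example.org')
print(conn.host_name)   # prints: example.org
print(conn.lookups)     # prints: 1 -- the read had a side-effect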

One question arises: what to do if it does have significant side-effects? Leaving it as is means subsequent readers of the code are likely to go through the same thought process. The code, as it stands, looks suspicious.

First off, if I could, I'd refactor self.connection.host_name so that it does not have side-effects; perhaps the side-effects could be moved into a separate method that is called in any case:

self.connection.setup_host_name()
if self.server.host_name is not None:
    host_name = self.server.host_name
else:
    host_name = self.connection.host_name

Assuming I don't have the ability to modify self.connection.host_name, I could still refactor my own code to have the side-effects separately. Something like:

class Whatever(...):
    def setup_host_name(self):
        self._host_name = self.connection.host_name

    ....

    self.setup_host_name()
    if self.server.host_name is not None:
        host_name = self.server.host_name
    else:
        host_name = self._host_name

I think that is better than the easy opt-out of a comment. Obviously in this case we have some control over the class in question, since we're referencing self. But imagine that self were some other variable whose class we don't necessarily have the ability to modify; then a more lightweight solution would be something like:

if other.server.host_name is not None:
    other.connection.host_name
    host_name = other.server.host_name
else:
    host_name = other.connection.host_name

This of course assumes that the side-effects in question are not related to other.server.host_name. An alternative in either case:

connection_host_name = other.connection.host_name
if other.server.host_name is not None:
    host_name = other.server.host_name
else:
    host_name = connection_host_name

I'm not sure that either really conveys that other.connection.host_name is evaluated for its side-effects. Hence, in this case, I would opt for a comment, but you may disagree.

Conclusion

The @property decorator in Python is very powerful and I love it and use it regularly. However, we should note that it comes with the drawback that it can make code more difficult to evaluate locally.

A comprehensive test-suite would allay most fears of making such a minor refactoring.

Finally, finding such code in a code-base is well worth refactoring. The odd bit of code like this is always likely to sneak in unnoticed, and can often be the result of several other simplifications. However, if you find many of these in a project's source code, I believe it is symptomatic of at least one of two things: either the authors don't engage in much refactoring or general code maintenance, or there is a lack of a decent test-suite, which makes such refactorings more of a chore.

Final irrelevant aside

In a previous post I made a suggestion about if-as syntax. I'm still against this idea, however this is the sort of situation it would (potentially) benefit. Our second syntax possibility could have been:

host_name = self.server.host_name as hn if hn is not None else self.connection.getsockname()[0]

Whilst this is shorter and reduces some repetition, it's unclear to me if this is more readable.

Categories: FLOSS Project Planets

Drupal Association blog: Drupal in Europe - Community Survey

Planet Drupal - Tue, 2017-02-07 10:17

TL;DR If you are a European community member, please take our community survey about Drupal in Europe.

After 6+ years working at the Drupal Association and knowing so many members around the world, it’s easy for me to think I know what is going on with the Project. But, it is a big world and each region, country, and local market has unique, evolving needs.

To avoid assuming the best way to help the community, I am asking for your input. I'm gathering insight one region at a time. I’ll share the feedback with staff and the Drupal Association Board to refine how we serve the community.

I’m starting first with our European community. This is so it's well timed with our DrupalCon Europe planning. In fact, the Drupal Association Board meets on 23 and 24 February where we will strategize how we can best support the European community. We’ll use your input to drive that discussion.

I’m collecting input in a few ways. Recently, I held roundtable discussions with various community organizers. Now I’m opening up the discussion to everyone who uses Drupal in Europe. Please tell me how the Drupal Association can best support Drupal by taking this community survey before February 16th.

Thanks for sharing your thoughts and needs. I look forward to hearing from you.

Categories: FLOSS Project Planets

2bits: How to configure Varnish Cache for Drupal with SSL Termination Using Pound or Nginx

Planet Drupal - Tue, 2017-02-07 08:49

Secure Socket Layer (SSL) is the protocol that allows web sites to serve traffic over HTTPS. This provides end-to-end encryption between the two end points (the browser and the web server). The benefit of using HTTPS is that traffic between the two end points cannot be deciphered by anyone snooping on the connection. This reduces the odds of exposing sensitive information such as passwords, or of getting the web site hacked by malicious parties. Google has also indicated that sites serving content exclusively over HTTPS will get a small bump in ranking.

Historically, SSL certificate issuers have also served a secondary purpose: identity verification. This is when the issuing authority vouches that a host or a domain is indeed owned by the entity that requests the SSL certificate for it. This is traditionally done by submitting paperwork, including government-issued documentation, incorporation certificates, etc.

Historically, SSL certificates were costly. However, with the introduction of the Let's Encrypt initiative, functional SSL certificates are now free, and anyone who wants to use them can do so, minus the identity verification part, at least for now.

Implementing HTTPS with Drupal can be straightforward for low-traffic web sites: the SSL certificate is installed in the web server, and that is about it. Larger web sites that handle a lot of traffic almost always have a caching layer, and that caching layer is often Varnish. Varnish does not handle SSL traffic, and just passes all HTTPS traffic straight to Drupal, which means a lot of CPU and I/O load.

This article will explain how to avoid this drawback, and how to have it all: caching in Varnish, plus serving all the site using HTTPS.

The idea is quite simple in principle: terminate SSL before Varnish, which will never know that the content is encrypted upstream. Then pass the traffic from the encryptor/decryptor to Varnish on port 81. From there, Varnish will pass it to Apache on port 8080.
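Put differently, the request path, with the ports used throughout this article, looks roughly like this:

Browser --(HTTPS, port 443)--> Pound or Nginx (SSL termination)
        --(HTTP,  port 81)---> Varnish (cache)
        --(HTTP,  port 8080)-> Apache + PHP + Drupal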

We assume you are deploying all this on Ubuntu 16.04 LTS, which uses Varnish 4.0, although the same can be applied to Ubuntu 14.04 LTS with Varnish 3.0.

Note that we use either one of two possible SSL termination daemons: Pound and Nginx. Each is better in certain cases, but for the large part, they are interchangeable.

One secondary purpose for this article is documenting how to create SSL bundles for intermediate certificate authorities, and to generate a combined certificate / private key. We document this because of the sparse online information on this very topic.

Install Pound

aptitude install pound

Preparing the SSL certificates for Pound

Pound does not allow the private key to be in a separate file or directory from the certificate itself. It has to be included with the main certificate, and with intermediate certificate authorities (if there are any).

We create a directory for the certificates:

mkdir /etc/pound/certs

cd /etc/pound/certs

We then create a bundle for the intermediate certificate authority. For example, if we are using NameCheap for domain registration, they use COMODO for certificates, and we need to do the following. The order is important.

cat COMODORSADomainValidationSecureServerCA.crt \
  COMODORSAAddTrustCA.crt \
  AddTrustExternalCARoot.crt >> bundle.crt

Then, as we said earlier, we need to create a host certificate that includes the private key.

cat example_com.key example_com.crt > host.pem

And we make sure the host certificate (which contains the private key as well) and the bundle, are readable only to root.

chmod 600 bundle.crt host.pem

Configure Pound

We then edit /etc/pound/pound.cfg

# We have to increase this from the default 128, since it is not enough
# for medium sized sites, where lots of connections are coming in
Threads 3000

# Listener for unencrypted HTTP traffic
ListenHTTP
  Address 0.0.0.0
  Port    80
 
  # If you have other hosts add them here
  Service
    HeadRequire "Host: admin.example.com"
    Backend
      Address 127.0.0.1
      Port 81
    End
  End
 
  # Redirect http to https
  Service
    HeadRequire "Host: example.com"
    Redirect "https://example.com/"
  End
 
  # Redirect from www to domain, also https
  Service
    HeadRequire "Host: www.example.com"
    Redirect "https://example.com/"
  End
End

# Listener for encrypted HTTP traffic
ListenHTTPS
  Address 0.0.0.0
  Port    443
  # Add headers that Varnish will pass to Drupal, and Drupal will use to switch to HTTPS
  HeadRemove      "X-Forwarded-Proto"
  AddHeader       "X-Forwarded-Proto: https"
 
  # The SSL certificate, and the bundle containing intermediate certificates
  Cert      "/etc/pound/certs/host.pem"
  CAList    "/etc/pound/certs/bundle.crt"
 
  # Send all requests to Varnish
  Service
    HeadRequire "Host: example.com"
    Backend
      Address 127.0.0.1
      Port 81
    End
  End
 
  # Redirect www to the domain
  Service
    HeadRequire "Host: www.example.com.*"
    Redirect "https://example.com/"
  End
End

Depending on the amount of concurrent traffic that your site gets, you may need to increase the number of open files for Pound. To do this, edit the file /etc/default/pound, and add the following lines:

# Increase the number of open files, so pound does not log errors like:
# "HTTP Acces: Too many open files"
ulimit -n 20000

Do not forget to change the 'startup' line from 0 to 1, otherwise pound will not start.

Configure SSL Termination for Drupal using Nginx

You may want to use Nginx instead of the simpler Pound in certain cases. For example, you may want to redirect plain HTTP URLs to their corresponding HTTPS URLs; Pound cannot do that, and redirects to the home page of the site instead.

Also, if you want to process your site's traffic using analysis tools, for example Awstats, you need to capture those logs. Although Pound can output logs in Apache combined format, it also outputs errors to the same log, at least on Ubuntu 16.04, and that makes these logs unusable by analysis tools.

First install Nginx:

aptitude install nginx

Create a new virtual host under /etc/nginx/sites-available/example.com, with this in it:

# Redirect www to no-www, port 80
server {
  server_name www.example.com;

  # Replace this line with: 'access_log off' if logging ties up the disk
  access_log /var/log/nginx/access-example.log;
 
  # Permanent redirect
  return 301 https://example.com$request_uri;
}

# Redirect plain HTTP to HTTPS for the main domain, port 80
server {
  listen 80 default_server;
  listen [::]:80 default_server ipv6only=on;

  server_name example.com;

  # Replace this line with: 'access_log off' if logging ties up the disk
  access_log /var/log/nginx/access-example.log;
 
  # Permanent redirect
  return 301 https://$host$request_uri;
}

server {
  listen 443 ssl default_server;
  listen [::]:443 ssl default_server ipv6only=on;

  server_name example.com;

  # We capture the log, so we can feed it to analysis tools, e.g. Awstats
  # This will be more comprehensive than what Apache captures, since Varnish
  # will end up removing a lot of the traffic from Apache
  #
  # Replace this line with: 'access_log off' if logging ties up the disk
  access_log /var/log/nginx/access-example.log;

  ssl on;

  # Must contain a bundle if it is a chained certificate. Order is important.
  # cat example.com.crt bundle.crt > example.com.chained.crt 
  ssl_certificate      /etc/ssl/certs/example.com.chained.crt;
  ssl_certificate_key  /etc/ssl/private/example.com.key;

  # Test certificate
  #ssl_certificate     /etc/ssl/certs/ssl-cert-snakeoil.pem;
  #ssl_certificate_key /etc/ssl/private/ssl-cert-snakeoil.key;

  # Restrict to secure protocols, depending on whether you have visitors
  # from older browsers
  ssl_protocols TLSv1 TLSv1.1 TLSv1.2;

  # Restrict ciphers to known secure ones
  ssl_ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256;

  ssl_prefer_server_ciphers on;
  ssl_ecdh_curve secp384r1;
  ssl_stapling on;
  ssl_stapling_verify on;

  add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";
  add_header X-Frame-Options DENY;
  add_header X-Content-Type-Options nosniff;

  location / {
    proxy_pass                         http://127.0.0.1:81;
    proxy_read_timeout                 90;
    proxy_connect_timeout              90;
    proxy_redirect                     off;

    proxy_set_header Host              $host;
    proxy_set_header X-Real-IP         $remote_addr;
    proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto https;
    proxy_set_header X-Forwarded-Port  443;
   
    proxy_buffers                      8 24k;
    proxy_buffer_size                  2k;
  }
}

Then link this to an entry in the sites-enabled directory

cd /etc/nginx/sites-enabled

ln -s /etc/nginx/sites-available/example.com

Then we add some performance tuning parameters, by creating a new file: /etc/nginx/conf.d/tuning. These will make sure that we handle higher traffic than the default configuration allows:

 
worker_processes       auto;

worker_rlimit_nofile   20000;

events {
  use epoll;
  worker_connections 19000;
  multi_accept       on;
}

http {
  sendfile           on;
  tcp_nopush         on;
  tcp_nodelay        on;
  keepalive_timeout  65;
  keepalive_requests 10000;
   
  client_body_buffer_size 128k;   
}
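Before relying on the new virtual host, it is worth checking that Nginx accepts the new configuration and then reloading it; for example:

nginx -t              # validate the configuration syntax
service nginx reload  # apply it without dropping existing connections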

We now have either Pound or Nginx in place, handling port 443 with SSL certificates, and forwarding the plain text traffic to Varnish.

Change Varnish configuration to use an alternative port

First, we need to make Varnish work on port 81.

On 16.04 LTS, we edit the file: /lib/systemd/system/varnish.service. If you are using Ubuntu 14.04 LTS, then the changes should go into /etc/default/varnish instead.

Change the 'ExecStart' line to set the following:

Port that Varnish will listen on (-a :81)
Varnish VCL Configuration file name (/etc/varnish/main.vcl)
Size of the cache (-s malloc,1536m)

You can also change the type of Varnish cache storage, e.g. to be on disk if it is too big to fit in memory (-s file,/var/cache/varnish/varnish_file.bin,200GB,8K). Make sure to create the directory and assign it the correct owner and permissions.

We use a different configuration file name so as not to overwrite the default one, and to make updates easier (no questions asked during package updates to resolve differences).
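For reference, on Ubuntu 16.04 the stock ExecStart line looks roughly like the first line below, and the edited version only changes the listen port, the VCL file name and the cache size. Treat the exact flags as an assumption and compare against the unit file shipped on your system:

# As shipped (approximately) by the Ubuntu 16.04 varnish package:
ExecStart=/usr/sbin/varnishd -j unix,user=vcache -F -a :6081 -T localhost:6082 -f /etc/varnish/default.vcl -S /etc/varnish/secret -s malloc,256m

# Edited for this setup:
ExecStart=/usr/sbin/varnishd -j unix,user=vcache -F -a :81 -T localhost:6082 -f /etc/varnish/main.vcl -S /etc/varnish/secret -s malloc,1536m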

In order to inform systemd that we changed a daemon startup unit, we need to issue the following command:

systemctl daemon-reload

Add Varnish configuration for SSL

We add the following section to the Varnish VCL configuration file. It folds the X-Forwarded-Proto header (set by Pound or Nginx) into the cache key, so HTTP and HTTPS variants of a page are cached separately; the header itself is still passed through to Drupal, which uses it to enforce HTTPS for that request.

# Routine used to determine the cache key if storing/retrieving a cached page.
sub vcl_hash {

  # This section is for Pound
  hash_data(req.url);

  if (req.http.host) {
    hash_data(req.http.host);
  }
  else {
    hash_data(server.ip);
  }

  # Use special internal SSL hash for https content
  # X-Forwarded-Proto is set to https by Pound
  if (req.http.X-Forwarded-Proto ~ "https") {
    hash_data(req.http.X-Forwarded-Proto);
  }
}

Change Apache's Configuration

If you had SSL enabled in Apache, you have to disable it so that only Pound (or Nginx) is listening on port 443. If you do not do this, Pound and Nginx will refuse to start with an error: Address already in use.

First disable the Apache SSL module.

a2dismod ssl

We also need to make Apache listen on port 8080, which Varnish will forward traffic to.

 
Listen 8080

And finally, your VirtualHost directives should listen on port 8080, as follows. It is also best to restrict listening to the localhost interface, so that outside connections cannot be made to the plain-text virtual hosts.

<VirtualHost 127.0.0.1:8080>
...
</VirtualHost>

The rest of Apache's configuration is detailed in an earlier article on Apache MPM Worker threaded server, with PHP-FPM.

Configure Drupal for Varnish and SSL Termination

We are not done yet. In order for Drupal to know that it should only use SSL for this page request, and not allow connections from plain HTTP, we have to add the following to settings.php:

// Force HTTPS, since we are using SSL exclusively
if (isset($_SERVER['HTTP_X_FORWARDED_PROTO'])) {
  if ($_SERVER['HTTP_X_FORWARDED_PROTO'] == 'https') {
    $_SERVER['HTTPS'] = 'on';
  }
}

If you have not already done so, you also have to enable page cache, and set the external cache age for cached pages. This is just a starting point, assuming Drupal 7.x, and you need to modify these accordingly depending on your specific setup.

// Enable page caching
$conf['cache'] = 1;
// Enable block cache
$conf['block_cache'] = 1;
// Make sure that Memcache does not cache pages
$conf['cache_lifetime'] = 0;
// Enable external page caching via HTTP headers (e.g. in Varnish)
// Adjust the value for the maximum time to allow pages to stay in Varnish
$conf['page_cache_maximum_age'] = 86400;
// Page caching without bootstraping the database, nor invoking hooks
$conf['page_cache_without_database'] = TRUE;
// Nor do we invoke hooks for cached pages
$conf['page_cache_invoke_hooks'] = FALSE;

// Memcache layer
$conf['cache_backends'][]    = './sites/all/modules/contrib/memcache/memcache.inc';
$conf['cache_default_class'] = 'MemCacheDrupal';
$conf['memcache_servers']    = array('127.0.0.1:11211' => 'default');
$conf['memcache_key_prefix'] = 'live';

And that is it. Now restart all the daemons:

service pound restart
service nginx restart # If you use nginx instead of pound
service varnish restart
service apache2 restart

Check that all daemons have indeed restarted, and that there are no errors in the logs. Then test for proper SSL recognition in the browser, and for correct redirects.
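A quick way to check the redirects and headers from the command line is with curl; a sketch (substitute your real domain):

# Plain HTTP should answer with a 301 to the HTTPS URL
curl -I http://example.com/

# HTTPS should return 200; repeated requests should show cache activity
# (e.g. an Age header greater than 0 once Varnish has cached the page)
curl -I https://example.com/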

For The Extreme Minimalist: Eliminating Various Layers

The above solution stack works trouble free, and has been tested with several sites. However, there is room for eliminating some layers. For example, instead of having Apache as the backend web server, this can be replaced with Nginx itself, listening on both port 443 (SSL) and 8080 (backend), with Varnish in between. In fact, it is possible to remove Varnish altogether and use the Nginx FastCGI cache instead. Nginx then listens on port 443, decrypts the connection, and passes the request to its own cache, which decides what is served from cache versus what gets passed through to Nginx itself, which hands it over to PHP and Drupal.
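As a rough illustration only (not a tested, drop-in configuration; the document root, cache path and PHP-FPM address are assumptions), the Varnish-less variant might look something like this:

# Cache zone for FastCGI responses (path and sizes are arbitrary examples)
fastcgi_cache_path /var/cache/nginx levels=1:2 keys_zone=drupal:100m inactive=60m;

server {
  listen 443 ssl;
  server_name example.com;
  root /var/www/drupal;
  # ssl_certificate / ssl_certificate_key as in the earlier virtual host

  location ~ \.php$ {
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass 127.0.0.1:9000;

    # Serve repeat requests from the FastCGI cache instead of hitting PHP
    fastcgi_cache drupal;
    fastcgi_cache_key "$scheme$request_method$host$request_uri";
    fastcgi_cache_valid 200 60m;
  }
}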

Don't let the words 'spaghetti' and 'incest' take over your mind! Eventually, all the oddities will be ironed out, and this will be a viable solution. There are certain things that are much better known in Apache for now in regards to Drupal, like URL rewriting for clean URLs. There are also other things that are handled in .htaccess for Apache that need to gain wider usage within the community before an Nginx-only solution becomes the norm for web server plus cache plus SSL.

Apache MPM Worker (multithreaded) with PHP-FPM is a very low-overhead, high-performance solution, and we will continue to use it until the Nginx-only approach matures and gains wider use and support within the Drupal community.

Categories: FLOSS Project Planets

Glowing Qt Charts

Planet KDE - Tue, 2017-02-07 08:35

Have you ever had the need to visualize data graphically and add some ‘wow’-effect to it? I’m currently helping out with the development of a demo application, where we have some charts to visualize data received from sensors. Naturally, the designer wants the charts to be visually appealing.

At the beginning we had basic graphs to show the data, one of them being temperature. We used the LineSeries QML type from Qt Charts for this, updating it dynamically.

Temperature represented as LineSeries.

Now, to make the graph more polished, we decided to hide the labels and the grid lines drawn on the chart and make the background transparent. This way only the series is drawn in the area reserved for the graph. To achieve this we modified our ChartView, with its two axes and a series, to the following:

ChartView {
    id: chartView
    backgroundColor: "transparent"
    legend.visible: false

    ValueAxis {
        id: valueAxisX
        min: 0
        max: maxNumOfTempReadings + 1
        visible: false
    }

    ValueAxis {
        id: valueAxisY
        min: minimum
        max: maximum
        visible: false
    }

    LineSeries {
        id: avgTempSeries
        axisX: valueAxisX
        axisY: valueAxisY
        color: chartColor
    }
}

But having just the series was no fun. With the main background of the application and the temperature values, we had a chart that looked like the following picture.

Temperature chart without effects.

As the design specification called for some glow on elements in the UI, we decided to try out some of the graphical effects QML has to offer, more precisely the Glow effect. We added the Glow element at the same level as the ChartView (the Glow and ChartView elements are siblings).

Glow {
    anchors.fill: chartView
    radius: 18
    samples: 37
    color: "#15bdff"
    source: chartView
}

With the above code it was very easy to add some glow to the chart series. The same chart as shown in the beginning, with the mentioned changes and some additional elements, ended up looking like the following picture.

Temperature graph with a glowing series.

Want to spice up your charts? Go and give it a try, it’s really easy.

The post Glowing Qt Charts appeared first on Qt Blog.

Categories: FLOSS Project Planets