Tim Janik

Tim Janik studied computer science at the University of Hamburg, is a Free Software and Open Source author, advocate, speaker and contributor to various open projects. For more information see the Biography of Tim Janik. You can hire Tim Janik for professional consulting around Free Software through the Lanedo Website.

Dec 12 2014

Miller, Gary – Wikimedia Commons

In the last months I finally completed and merged a long standing debt into Rapicorn. Ever since the Rapicorn GUI layout & rendering thread got separated from the main application (user) thread, referencing widgets (from the application via the C++ binding or the Python binding) worked mostly due to luck.

I investigated and researched several remote reference counting and distributed garbage collection schemes and many kudos go to Stefan Westerfeld for being able to bounce ideas off of him over time. In the end, the best solution for Rapicorn makes use of several unique features in its remote communication layer:

  1. Only the client (user) thread will ever make two way calls into the server (GUI) thread, i.e. send a function call message and block for a result.
  2. All objects are known to live in the server thread only.
  3. Remote messages/calls are strictly sequenced between the threads, i.e. messages will be delivered and processed sequentially and in order of arrival.

This allows the following scheme:

  1. Any object reference that gets passed from the server (GUI) thread into the client (user) thread enters a server-side reference-table to keep the object alive. I.e. the server thread assumes that clients automatically “ref” new objects that pass the thread boundary.
  2. For any object reference that’s received by a client thread, uses are counted separately on the client-side and once the first object becomes unused, a special message is sent back to the server thread (SEEN_GARBAGE).
  3. At any point after receiving SEEN_GARBAGE, the server thread may opt to initiate the collection of remote object references. The current code has no artificial delays built in and does so immediately (thresholds for delays may be added in the future).
  4. To collect references, the server thread swaps its reference-table for an empty one and sends out a GARBAGE_SWEEP command.
  5. Upon receiving GARBAGE_SWEEP, the client thread creates a list of all object references it received in the past and for which the client-side use count has dropped to zero. These objects are removed from the client’s internal bookkeeping and the list is sent back to the server as GARBAGE_REPORT.
  6. Upon receiving a GARBAGE_REPORT, corresponding to a previous GARBAGE_SWEEP command, the server thread has an exact list of references to purge from its previously detached reference-table. Remaining references are merged into the currently active table (the one that started empty upon GARBAGE_SWEEP initiation). That way, all object references that have been sent to the client thread but are now unused are discarded, unless they have meanwhile been added into the newly active reference-table.

So far, the scheme works really well. Swapping out the server side reference-tables copes properly with the most tricky case: A (new) object reference traveling from the server to the client (e.g. as part of a get_object() call result), while the client is about to report this very reference as unused in a GARBAGE_REPORT. Such an object reference will be received by the client after its garbage report creation and treated as a genuinely new object reference arriving, similarly to the result of a create_object() call. On the server side it is simply added into the new reference table, so it’ll survive the server receiving the garbage report and subsequent garbage disposal.
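The server side of this table swap can be sketched in a few lines of C++. This is only an illustration of the scheme described above, not Rapicorn's actual code: the names ObjectId, ServerRefs, send_to_client, garbage_sweep and garbage_report are made up for this sketch, and the message passing between the threads is left out entirely.

```cpp
#include <cstdint>
#include <set>
#include <vector>

using ObjectId = std::uint64_t;   // hypothetical stand-in for a remote object handle

// Server side bookkeeping for references that crossed into the client thread.
struct ServerRefs {
  std::set<ObjectId> active;      // table receiving references sent from now on
  std::set<ObjectId> detached;    // table swapped out by the last GARBAGE_SWEEP

  // Passing a reference to the client implicitly "refs" the object.
  void send_to_client (ObjectId id) { active.insert (id); }

  // Start a collection: detach the current table and continue with an empty one.
  void garbage_sweep()
  {
    detached.clear();
    detached.swap (active);
  }

  // Handle GARBAGE_REPORT: purge the reported ids from the detached table and
  // merge the remaining ids back. Returns the ids that are really released,
  // i.e. reported ids that were not sent to the client again in the meantime.
  std::vector<ObjectId> garbage_report (const std::vector<ObjectId> &trash)
  {
    std::vector<ObjectId> purged;
    for (ObjectId id : trash)
      if (detached.erase (id) && !active.count (id))
        purged.push_back (id);
    active.insert (detached.begin(), detached.end());
    detached.clear();
    return purged;
  }
};
```

With this sketch, sending ids 1, 2 and 3, calling garbage_sweep(), sending 2 again and then reporting { 1, 2 } purges only id 1. Id 2 survives because it entered the new table after the swap, which is the tricky case described above.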

The only thing left was figuring out how to automatically test that an object is collected/unreferenced, i.e. write code that checks that objects are gone…
Since we moved to std::shared_ptr for widget reference counting and often use std::make_shared(), there isn’t really any way to generically hook into the last unref of an object to install test code. The best effort test code I came up with can be found in testgc.py. It enables GC layer debug messages, triggers remote object creation + release and then checks the debugging output for corresponding collection messages. Example:

TestGC-Create 100 widgets... TestGC-Release 100 widgets...
GCStats: ClientConnectionImpl: SEEN_GARBAGE (aaaa000400000004)
GCStats: ServerConnectionImpl: GARBAGE_SWEEP: 103 candidates
GCStats: ClientConnectionImpl: GARBAGE_REPORT: 100 trash ids
GCStats: ServerConnectionImpl: GARBAGE_COLLECTED: \
  considered=103 retained=3 purged=100 active=3

The protocol is fairly efficient in saving bandwidth and task switches: ref-messages are implicit (never sent), unref-messages are sent only once and support batching (GARBAGE_REPORT). What’s left is the two tiny messages that initiate garbage collection (SEEN_GARBAGE) and synchronize reference counting (GARBAGE_SWEEP). As I hinted earlier, in the future we can introduce arbitrary delays between the two to reduce overhead and increase batching, if that ever becomes necessary.

While it’s not the most user-visible functionality implemented in Rapicorn, it presents an important milestone for reliable toolkit operation and a foundation for future developments like remote calls across process or machine boundaries.

Merge 14.10.0 – Includes remote reference counting.

Merge shared_ptr-reference-counting  – post-release.

Oct 22 2014

Poodle by Heather Hales

In my previous post Forward Secrecy Encryption for Apache, I’ve described an Apache SSLCipherSuite setup to support forward secrecy which allowed TLS 1.0 and up, avoided SSLv2 but included SSLv3.

With the new POODLE attack (Padding Oracle On Downgraded Legacy Encryption), SSLv3 (and earlier versions) should generally be avoided, which means the cipher configurations discussed previously need to be updated.

I’ll first recap the configuration requirements:

  • Use Perfect Forward Secrecy where possible.
  • Prefer known strong ciphers.
  • Avoid RC4, CRIME, BREACH and POODLE attacks.
  • Support browsing down to Windows XP.
  • Enable HSTS as a bonus.

The Windows XP point is a bit tricky, since IE6 as shipped with XP originally only supports SSLv3, but later service packs brought IE8 which at least supports TLS 1.0 with 3DES.

Here’s the updated configuration:

SSLEngine On
SSLProtocol All -SSLv2 -SSLv3
SSLHonorCipherOrder on
# Prefer PFS, allow TLS, avoid SSL, for IE8 on XP still allow 3DES
# Prevent CRIME/BREACH compression attacks
SSLCompression Off
# Commit to HTTPS only traffic for at least 180 days
Header add Strict-Transport-Security "max-age=15552000"

Last but not least, I have to recommend www.ssllabs.com again, which is a great resource to test SSL/TLS setups. In the SSL Labs test, the above configuration yields an A-rating for testbit.eu.

UPDATE: The above configuration also secures HTTPS connections against the FREAK (CVE-2015-0204) attack, as can be tested with the following snippet:

openssl s_client -connect testbit.eu:443 -cipher EXPORT

Connection attempts to secure sites should result in a handshake failure.

Aug 05 2014

Map Jan-Dec to 1-12

In a time critical section of a recent project, I came across having to optimize the conversion of three-letter US month abbreviations (as commonly found in log files) to integers in C++. That is, for “Jan” yield 1, for “Feb” yield 2, etc, for “Dec” yield 12.

In C++ the simplest implementation probably looks like the following:

std::string string; // input value
std::transform (string.begin(), string.end(), string.begin(), ::toupper);
if (string == "JAN") return 1;
if (string == "FEB") return 2;
// ...
if (string == "DEC") return 12;
return 0; /* mismatch */

In many cases this is fast enough. Its running time is linear in the number of months and depends on the actual value being looked up. But for an optimized inner loop I needed something faster, ideally with running time independent of the actual input value and avoiding branch misses where possible. I could take advantage of a constrained input set, which means the ‘mismatch’ case is never hit in practice.

To summarize:

  • Find an integer from a fixed set of strings.
  • Ideal runtime is O(1).
  • False positives are acceptable, false negatives are not.

That actually sounds a lot like using a hash function. After some bit fiddling, I ended up using a very simple and quick static function that yields a result instantly but may produce false positives. I also had to get rid of using std::string objects to avoid allocation penalties. This is the result:

static constexpr const char l3month_table[] = {
  12, 5, 0, 8, 0, 0, 0, 1, 7, 4, 6, 3, 11, 9, 0, 10, 2
}; // 17 elements

/// Lookup month from 3 letters, returns 0 for about 30% of invalid inputs.
static constexpr inline unsigned int
l3month (const char *l3str)
{
  // Hash the uppercased 2nd and 3rd letter into one of the 17 table slots.
  return l3month_table[((l3str[1] & ~0x20) + (l3str[2] & ~0x20)) %
                       (sizeof (l3month_table) / sizeof (l3month_table[0]))];
}

The hash function operates only on the last 2 ASCII letters of the 3-letter month abbreviations, as these two are sufficient to distinguish between all 12 cases and turn out to yield good hash values. The expression (letter & ~0x20) removes the lowercase ASCII bit, so upper and lower case letters are treated the same without using a potentially costly if-branch. Adding the uppercased ASCII letters modulo 17 yields unique results, so this simple hash value is used to index a 17 element hash table to produce the final result.

In effect, this perfectly detects all 12 month names in 3 letter form and has an almost 30% chance of catching invalid month names, in which case 0 is returned – useful for debugging or assertions if input contracts are broken.
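The table values can be double checked by rebuilding the table from the month names. The following is a small sketch for that purpose; the helper names l3hash, build_l3month_table and count_empty_slots are made up here and are not part of the original code. It fills a 17 slot table using the hash described above, reports a collision if two months land in the same slot, and counts the slots that stay 0.

```cpp
#include <array>
#include <cstddef>

// Hash as described above: sum of the uppercased 2nd and 3rd letter, modulo 17.
constexpr unsigned l3hash (const char *s)
{
  return ((s[1] & ~0x20) + (s[2] & ~0x20)) % 17;
}

// Fill the 17 slot table from the month names; returns false on a collision.
inline bool build_l3month_table (std::array<unsigned char, 17> &table)
{
  static const char *const months[12] = {
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
  };
  table.fill (0);
  for (unsigned m = 0; m < 12; m++)
    {
      unsigned char &slot = table[l3hash (months[m])];
      if (slot != 0)
        return false;           // two months share a hash value
      slot = static_cast<unsigned char> (m + 1);
    }
  return true;
}

// Count slots that stay 0, i.e. hash values that flag an invalid input.
inline unsigned count_empty_slots (const std::array<unsigned char, 17> &table)
{
  unsigned empty = 0;
  for (std::size_t i = 0; i < table.size(); i++)
    empty += table[i] == 0;
  return empty;
}
```

Building the table this way reproduces the 17 values listed above without collisions and leaves 5 of the 17 slots empty. Assuming invalid inputs hash roughly uniformly, that is where the figure of almost 30% (5/17) comes from.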

As far as I know, the function is as small as possible given the constraints. Can anyone simplify it further?

Apr 15 2014


The basic need to encrypt digital communication seems to be becoming common sense lately. It probably results from increased public awareness about the number of parties involved in providing the systems required (ISPs, backbone providers, carriers, sysadmins) and the number of parties these days taking an interest in digital communications and activities (advertisers, criminals, state authorities, voyeurs, …). How much to encrypt and to what extent seems to be harder to grasp though.

Default Encryption

A lot of reasons exist why it’s useful to switch to encryption of content and all data transfers by default (Ed). This has been covered elsewhere in more depth, just a very quick summary could be the following:

  • Not encrypting leads to exposing all online behavior to super easy surveillance by local, foreign and alien authorities, criminals or nosy operators.
  • Rare use of encryption makes it effectively an alert signal for sensitive messages (e.g. online banking) and an alert signal for interesting personalities (e.g. movement organizers).

Forward Secrecy

Even with good certificates in place and strong encryption algorithms like AES, there’re good reasons to enable use of perfect forward secrecy (PFS) for encrypted connections. One is that PFS means each connection uses a securely generated, separate new encryption key, so that recordings of the connection traffic cannot be deciphered in the future when the server certificate’s private key is acquired by an attacker. For the same reason PFS is also useful to mitigate security issues like Heartbleed, since the vast majority of generated connection keys are not affected by occasional memory leaks, in contrast to the permanent private key of the server certificate that is likely to be exposed by even rare memory leaks.

HTTPS On Subdomains

I investigated the administrative side of things for default encryption of all testbit.eu traffic, which also hosts other sites like rapicorn.org. It turned out a number of measures need to be taken to yield an even mildly acceptable result:

  • Many browsers out there still only support encrypted HTTPS connections to just one virtual host per IP address (SSL & vhosts).
  • The affordable (or free) certificates signed by certificate authorities usually just allow one subdomain per certificate (e.g. a free StartSSL certificate covers just www.example.com and example.com).

Site Arrangements

So here’s the list of steps I took to improve privacy on testbit.eu:

  1. I moved everything that was previously hosted on rapicorn.org or beast.testbit.eu into testbit.eu/... subdirectories (leaving a bunch of redirects behind).
  2. Then I obtained a website certificate from startssl.com. I could get my 4096bit certificate signed from them free of charge within a matter of hours. CAcert seems to be the only free alternative, but it’s not supported by major browsers and isn’t even packaged by Debian now.
  3. Having the certificate, I set up my Apache to a) reject known-to-be-broken SSL/TLS encryption settings; b) allow a few weak encryption variants to still support old XP browsers; c) give preference to strong encryption schemes with perfect forward secrecy.
  4. Last, I set up automatic redirection for all incoming traffic from HTTP to HTTPS on testbit.eu.

Apache PFS Configuration

The Apache configuration bits for TLS with PFS look like the following:

# SSL version 2 has been widely superseded by version 3 or TLS
SSLProtocol All -SSLv2
# Compression is rarely supported and vulnerable, see CRIME attack
SSLCompression Off
# Preferred Cipher suite selection favoring PFS
# Enable picking ciphers in the above order
SSLHonorCipherOrder on

Even stricter variants would be disabling 3DES and SSLv3. But it turns out that IE8 on XP still needs 3DES. Also Firefox-25 on Ubuntu-1304 still needs SSLv3. I still mean to support both for a few more months. This is the stricter configuration:

SSLProtocol All -SSLv3 -SSLv2

Apache HSTS Configuration

Sites that serve all content via HTTPS might also decide to enforce secure connections with HTTP Strict Transport Security (HSTS), a recent security policy mechanism that prevents security downgrades of web connections and hijacking of web sessions. Apache needs the headers module enabled and a simple configuration line:

# Commit to HTTPS only traffic for at least 180 days
Header add Strict-Transport-Security "max-age=15552000"

Testing Testbit

Using this setup, I’m now getting a green A- at SSL Labs for Testbit: SSL Test.
Further references and tips can be found at BetterCrypto.org.

I’m happy about additional input to improve encryption and its use throughout the web, so tell me what I’ve been missing!

UPDATE: A newer post addresses setup changes to avoid the POODLE attack: Apache SSLCipherSuite without POODLE.

Nov 13 2013

Reblogged from the Lanedo GmbH blog:

Tobin Daily Visits Screenshot

During recent weeks, I’ve started to create a new tool “Tobin” to generate website statistics for a number of sites I’m administrating or helping with. I’ve used programs like Webalizer, Visitors, Google Analytics and others for a long time, but there’re some correlations and relationships hidden in web server log files that are hard or close to impossible to visualize with these tools.

The JavaScript injections required by Google Analytics are not always acceptable and fall short of accounting for all traffic and transfer volume. Also some of the log files I’m dealing with are orders of magnitude larger than available memory, so the processing and aggregation algorithms used need to be memory efficient.

Here is what Tobin currently does:

  1. Input records are read and sorted on disk, inputs are filtered for a specific year.
  2. A 30 minute window is used to determine visits via unique IP-address and UserAgent.
  3. Hits, such as images, CSS files, WordPress resource files, etc are filtered to derive page counts.
  4. Statistics such as per hour accounting and geographical origin are collected.
  5. Top-50 charts and graphs are generated in an HTML report from the collected statistics.
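Step 2 of the list above can be illustrated with a short sketch. Tobin itself is written in Python, so the C++ below is not taken from its source; the Hit record and the count_visits function are hypothetical, and the sketch assumes that the 30 minute window is measured from the previous hit of the same IP-address and UserAgent pair.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Hit {                      // hypothetical, reduced log record
  std::int64_t time;              // seconds since the epoch
  std::string  ip, user_agent;
};

// Count visits in time-sorted hits: a hit opens a new visit unless the same
// (ip, user_agent) pair was seen within the last `window` seconds.
inline std::size_t count_visits (const std::vector<Hit> &sorted_hits,
                                 std::int64_t window = 30 * 60)
{
  std::map<std::pair<std::string, std::string>, std::int64_t> last_seen;
  std::size_t visits = 0;
  for (const Hit &hit : sorted_hits)
    {
      const auto key = std::make_pair (hit.ip, hit.user_agent);
      const auto it = last_seen.find (key);
      if (it == last_seen.end() || hit.time - it->second > window)
        visits++;                 // first hit, or the previous visit has expired
      last_seen[key] = hit.time;  // every hit extends the current visit
    }
  return visits;
}
```

Note that this sketch keeps one map entry per visitor in memory. For log files larger than available memory, entries older than the window would have to be dropped while scanning, which works because the input is sorted by time.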

There is lots of room for future improvements, e.g. creation of additional modules for new charts and graphs, possibly accounting across multiple years, use of intermediate files to speed up processing and more. In any case, the current state works well for giant log files and already provides interesting graphs. The code right now is alpha quality, i.e. ready for a technical preview but it might still have some quirks. Feedback is welcome.

The tobin source code and tobin issue tracker are hosted on Github. Python 2.7 and a number of auxiliary modules are required. Building and testing works as follows:

$ git clone https://github.com/tim-janik/tobin.git
$ cd tobin
$ make # create tobin executable
$ ./tobin -n MySiteName mysite.log
$ x-www-browser logreport/index.html

Leave me a comment if you have issues testing it out and let me know how the report generation works for you. 😉

Sep 02 2013

London Southwark Bridge

Next Friday I’ll be giving a talk on Open Source In Business at the Campus Party Europe conference in the O2 arena, London. The talk is part of the Free Software Track at 14:00 on the Archimedes stage. I’m there the entire week and will be happy to meet up, so feel free to drop me a line in case you happen to be around the venue as well.

The picture above shows the London Southwark Bridge viewed from the Millennium Footbridge with the London Tower Bridge in the background. Last week’s weather has been exceptional for sightseeing in London, so there were loads of great chances to visit the vast number of historical places and monuments present in London. My G+ stream has several examples of the pictures taken. I honestly hope the good weather continues and allows everyone to enjoy a great conference!

Update: The slides and video of the talk are now online here: Open Source In Business

Aug 06 2013

Reblogged from the Lanedo GmbH blog:

Documentation Tools

Would you want to invest hours or days into automake logic without a use case?

For two of the last software releases I did, I was facing this question. Let me give a bit of background. Recently the documentation generation for Beast and Rapicorn fully switched over to Doxygen. This has brought a number of advantages, such as graphs for the C++ inheritance of classes, build tests of documentation example code and integration with the Python documentation. What was left to improve, however, was simplifying the build process and the logic involved.

Generating polished documentation is time consuming

Maintaining the documentation builds has become increasingly complex. One thing adding to the complexity is the growing number of dependencies, as external tools are required for a number of features: Doxygen (and thus Qt) is required to build the main docs, dot is required for all sorts of graph generation, Python scripts are used to auto-extract some documentation bits, rsync is used for incremental updates of the documentation, Git is used to extract interesting meta information like the documentation build version number or the commit logs, etc.

More complexity for tarballs?

For the next release, I faced the task of looking into making the documentation generation rules work for tarballs, outside of the Git repository. That means building in an environment significantly different from the usual development setup and toolchain (of which Git has become an important part). At the very least, this was required:

  • Creating autoconf rules to check for Doxygen and all its dependencies.
  • Require users to have a working Qt installation in order to build Rapicorn.
  • Deriving a documentation build version id without access to Git.
  • Getting the build dependencies right so we auto-build when Git is around but don’t break when Git’s not around.

All of this just for the little gain of enabling the normal documentation re-generation for someone wanting to start development off a tarball release.
Development based on tarballs? Is this a modern use case?

Development happens in Git

During this year’s LinuxTag, I’ve taken the chance to enter discussions and get feedback on development habits in 2013. Development based on tarballs certainly was the norm when I started in Free Software & Open Source in 1996. It’s totally not the case these days. A large number of projects moved to Git or the likes. Rapicorn and Beast were moved to Git several years ago, we adopted the commit style of the Linux kernel, and a GNU-style ChangeLog plus commit hash ids is auto-generated from Git for the tarballs.

Utilizing the meta information for a project living in Git comes naturally as time passes and projects get more familiar with Git. Examples are signed tags, scripts around branch-/merge-conventions, history grepping or symbolic version id generation. Git also significantly improves spin-off developments, which is why development of Git hosted projects generally happens in Git branches or Git clones these days. Sites like GitHub encourage forking and pulling; going back to the inconveniences of tarball based development without any history would be a giant leap backwards. In fact, these days tarballs serve as little more than a transport container for a specific snapshot of a Git repository.

Shipping pre-built Documentation

Taking a step back, it’d seem easier to avoid the hassle of adapting all the documentation build logic to work both ways, with and without Git, by simply including a copy of the readily built result into the tarball. Like everything, this has a downside as well of course: tarball size will increase significantly. Just how significantly? The actual size can make or break the deal, e.g. if it changed by orders of magnitude. Let’s take a look:

  • 6.0M – beast-0.8.0.tar.bz2
  • 23.0M – beast-full-docs.tar-bz2

Uhhh, that’s a lot. All the documentation for the next release totals almost four times the size of the last tarball. That’s a bit excessive, can we do better?

It turns out that a large portion of the space in a full Doxygen HTML build is actually used up by images. Not the biggest chunk but a large one nevertheless, for the above example, we’re looking at:

  • 23M – du -hc full-docs/*png
  • 73M – du -hc full-docs/

So, 23 out of 73 MB for the images, that’s 32%. Doxygen doesn’t make it too hard to build without images, it just needs two configuration settings HAVE_DOT = NO and CLASS_DIAGRAMS = NO. Rebuilding the docs without any images also removes a number of image references, so we end up with:

  • 42M – slim-docs/

That’s a 42% reduction in documentation size. Actually that’s just plain text documentation now, without any pre-compressed PNG images. That means bzip2 could do a decent job at it, let’s give it a try:

  • 2.4M – beast-slim-docs.tar-bz2

Wow, that went better than expected, we’re just talking about 40% of the source code tarball at this point. Definitely acceptable, here’re the numbers for the release candidates in direct comparison, with and without pre-built documentation:

  • 6.1M – beast-no-docs-0.8.1-rc1.tar.bz2
  • 8.6M – beast-full-docs-0.8.1-rc1.tar.bz2

Disable Documentation generation rules in tarballs

Now that we’ve established that shipping documentation without graphs results in an acceptable tarball size increase, it’s easy to make the call to include full documentation with tarball releases. As a nice side effect, auto-generation of the documentation in tarballs can be disabled (not having the Git tree and other tools available, it’d be prone to fail anyway). The only thing to watch out for is a srcdir!=builddir case with automake, as in Git trees documentation is built inside builddir, while it’s shipped and available from within srcdir in tarballs.

Pros and Cons for shipping documentation

  • Con: Tarball sizes increase, but the size difference seems acceptable; practical tests show less than 50% increase in tarball sizes for documentation excluding generated graphics.
  • Con: Tarball source changes cannot be reflected in docs. This mostly affects packagers, it’d be nice to receive substantial patches in upstream as a remedy.
  • Pro: The build logic is significantly simplified, allowing a hard dependency on Git and skipping complex conditionals for tool availability.
  • Pro: Build time and complexity from tarballs is reduced. A nice side effect, considering the variety of documentation tools out there, sgml-tools, doxygen, gtk-doc, etc.

For me the pros in this case clearly outweigh the cons. I’m happy to hear about pros and cons I might have missed.

Prior Art?

Looking around the web for cases of other projects doing this didn’t turn up too many examples. There’s some probability that most projects don’t yet trade documentation generation rules for pre-generated documentation in tarballs.

If you know projects that turned to pre-generated documentation, please let me know about them.

I’m also very interested in end-user and packager’s opinions on this. Also, do people know about other materials that projects include pre-built in tarballs? And without the means to regenerate everything from just the tarballs?