Tim Janik

Tim Janik studied computer science at the University of Hamburg and is a Free Software and Open Source author, advocate, speaker and contributor to various open projects. For more information see the Biography of Tim Janik. You can hire Tim Janik for professional consulting around Free Software through the Lanedo website.

Oct 14, 2015

[Image: “build passing” CI status badge]

I’ve spent the last week setting up Rapicorn and Beast with travis-ci.org, a free continuous integration service for GitHub. Since Travis is only available for GitHub, the Beast Git repository (hosted on git.gnome.org) had to be mirrored (cloned) to GitHub.

Luckily, Git allows pushing to multiple remotes:

git remote add all git@github.com:tim-janik/beast.git
git remote set-url --add --push all git@github.com:tim-janik/beast.git
git remote set-url --add --push all ssh://timj@git.gnome.org/git/beast
git remote show all
* remote all
  Fetch URL: git@github.com:tim-janik/beast.git
  Push  URL: git@github.com:tim-janik/beast.git
  Push  URL: ssh://timj@git.gnome.org/git/beast
  HEAD branch: master
  Remote branch:
    master new (next fetch will store in remotes/all)
  Local ref configured for 'git push':
    master pushes to master (up to date)

Now the following push will update both repositories:

git push all master

Also, ‘git push’ can be configured to push to ‘all’ instead of ‘origin’ by default:

git checkout master && git branch -u all/master
git push 
 To git@github.com:tim-janik/beast.git
  038d442..22c807a master -> master
 To ssh://timj@git.gnome.org/git/beast
  038d442..22c807a master -> master

The repositories now contain a .travis.yml file with the complete build instructions; these need to be kept up to date whenever any of the build dependencies change.

By default, travis-ci sets up Ubuntu 12.04 boxes for the continuous builds, but that’s way too old for most dependencies. Luckily there’s a beta program available for Ubuntu 14.04 ‘trusty’, which can be selected with “dist: trusty”. The g++-4.8 compiler on trusty is still too old to build Beast, so the CI setup currently installs g++-5 from ppa:ubuntu-toolchain-r/test.

As a result, we now have automated test builds running on Travis for the Rapicorn and Beast repositories, triggered by each push. After each build, the build bot reports success to the #beast IRC channel, and the current status can also be found via the “Build Status” buttons on GitHub: Rapicorn, Beast.

Jul 02, 2015

Rapicorn 'visitor' branch

Trying to keep it up, here’s an update on recent developments in Rapicorn and Beast.

Git Branches

For now, Rapicorn and Beast are using Git branches the following way:

  • Topic branches are created for each change. Where possible, commits should compile and pass all tests (i.e. pass make check installcheck).
  • Once completed, topic branches are merged into the master branch. For intermediate merges of huge branches, I’ve recently been adding [ongoing] to the merge commit message. As an aside, merge commit messages should probably become more elaborate in the future, to make devlog articles easier to write and potentially more accurate.
  • The master branch must always compile and pass all tests.
  • OpenHub: The OpenHub repo links have been adjusted to point at Rapicorn’s and Beast’s master branch. Because of problems with spammers and a corresponding reimplementation, code statistics updates on the OpenHub platform are currently stalled, however.
    https://www.openhub.net/p/beast-bse
    https://www.openhub.net/p/rapicorn

Hello and goodbye clang++

Rapicorn C++11 code currently compiles with g++-4.7 and upwards. An initial attempt was made at making the C++11 code compile with clang++-3.4 but the incompatibilities are currently too numerous. A few good fixes have come out of this and are merged into master now, but further work on this branch probably has to wait for a newer clang++ version.

New Widgets

Rapicorn is growing more widgets that implement state rendering via SVG element matching. Recent additions are:

  • LayerPainter – A container that allows rendering widgets on top of each other.
  • ElementPainter – A container that displays state dependent SVG image elements.
  • FocusPainter – An ElementPainter that decorates its child according to focus changes.

IDL Improvements

Several changes around Rapicorn’s IDL compiler and support code made it into master recently:

  • The IDL layer got bind() and connect() methods (on the ObjectBroker interface). This models the IDL setup phase after the zeromq API. Beast makes use of this when setting up IDL interface layers in the UI and in BSE.
  • The Python binding was rewritten using Cython. Instead of invoking a heap of generated Python glue code and talking to the message passing interfaces directly, the Python binding now sits on top of the C++ binding. This makes the end result operate much faster, is less complex to maintain, and offers more functionality in the Python API. As an added bonus, it also eases testing of the C++ binding.
  • And just to prove the previous point, the new Cython port uncovered a major issue lurking in the C++ IDL handling of objects in records and sequences. At least since the introduction of remote reference counting, client side object handles and server side object references are implemented and treated in fundamentally different ways. This requires records (struct) and sequences (std::vector) to have separate implementation types on the client and server sides. Thus, the client and server types are now prefixed with ClnT_ and SrvT_ respectively. Newly generated typedef aliases are hiding the prefixes from user code.
  • IDL files don’t need ‘ = 0 ‘ postfixes for methods any more. After all, generating non-virtual methods wasn’t really used anyway.
  • The Enum introspection facilities got rewritten, so things like the Enum name are also accessible now. This area probably isn’t fully finished yet; for future Any integration, a more versatile API is still needed.
  • Auxiliary information for properties is now accessible through an __aida_aux_data__() method on generated interfaces.
  • Generated records now provide a template method __accept__<>(Visitor) to visit all record fields by value reference and name string (see the sketch below). Exemplary visitor implementations are provided to serialize/deserialize records to XML and INI file formats.
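
To illustrate the field-visiting pattern, here is a minimal hand-written sketch; the Person record and PrintVisitor are made-up examples, not actual generated code:

#include <iostream>
#include <string>

struct Person {
  std::string name;
  int         age;
  template<class Visitor> void
  __accept__ (Visitor &visitor)
  {
    visitor (name, "name");  // visit each field by value reference and name string
    visitor (age, "age");
  }
};

struct PrintVisitor {   // example visitor; a serializing visitor would write XML or INI instead
  template<class A> void
  operator() (A &field, const char *name)
  {
    std::cout << name << "=" << field << "\n";
  }
};

int
main ()
{
  Person p;
  p.name = "Ada";
  p.age = 36;
  PrintVisitor v;
  p.__accept__ (v);  // prints name=Ada and age=36
  return 0;
}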

BEAST Developments

For the most part, changes in Beast are driving or chasing Rapicorn at the moment. This means that often the tip of Rapicorn master is required to build Beast’s master branch. Here is why:

  • Beast now uses RAPIDRES(1) to embed compressed files. Rapicorn::Blob and Rapicorn::Res make these accessible.
  • Beast now makes use of Rapicorn’s IDL compiler to generate beastrc config structures and to add a new ‘Bse‘ IDL layer into libbse that allows the UI code to interface with Bse objects via C++ interfaces. Of course, lots of additional porting work is needed to complete this.
  • Beast procedures (a kind of ‘remote method’ implemented in C with lots of boilerplate code) are now migrated to C++ methods one by one, which greatly simplifies the code base, but also causes lots of laborious adaptations at the call sites, in the UI and in the undo system. An excursion into the changes this brings for the undo implementation is provided in DevLog: A day of templates.
  • The GParamSpec introspection objects for properties that Beast uses for GUI generation can now be constructed from __aida_aux_data__()  strings, which enabled the beastrc config structure migration.
  • An explanatory file HACKING.md was added which describes the ongoing migration efforts and provides help in accessing the object types involved.

What’s next?

For the moment, porting the object system in Beast from GObject to IDL-based C++11 interfaces, together with the related procedure, signal and property migrations, is keeping me more than busy. I’ll try to focus on completing the majority of work in this area first. But as for outlooks, adding a Python REPL might make a good followup step. 😉

Jun 27, 2015

Yesterday I spent some 14+ hours on getting a templated undo method wrapper going,
just to throw it all away this morning.

Here’s what I was trying to achieve, the C version of BEAST implements undo as follows:

// bse_track_remove_tick():
BseTrack *track;
uint tick;
BsePart *part;
bse_item_push_undo_proc (track, "insert-part", tick, part);

That is, it queues an undo step that, if executed, will call the “insert-part” procedure
on a BseTrack object, which inserts a BsePart object at a ‘tick’.
This all happens through a varargs interface with lots of magic behind the scenes. In
particular the reference to ‘part’ is tricky. Future modifications to the BseTrack (or
project) may cause the removal and destruction of the BsePart object involved here.
While the execution of future undo steps will re-create a BsePart to be inserted here
before the step at hand is executed, the ‘part’ object pointer will have to be changed
to the re-created one instead of the destroyed one.
To achieve this, bse_item_push_undo_proc() internally converts the ‘part’ pointer into
a serializable descriptor string that allows re-identifying the BsePart object, and the
undo machinery will resolve that descriptor before “insert-part” is called.

Now on to C++. I wanted the new counterpart in the C++ version of Beast to look like this:

// TrackImpl::remove_tick():
TrackImpl *this;
const uint tick;
PartImpl &part;
push_undo ("Remove Tick", *this, &TrackImpl::insert_part, tick, part);

But…

Under the hood that means push_undo() (which is a template method on ItemImpl, a base type of TrackImpl) needs to process its variable argument list to:

  • A) Put each argument into a wrapper structure and store away the argument list (i.e. std::tuple<Wrapper<Args>…>).
  • B) Special case the wrapper structure for objects to store a descriptor internally (i.e. template specialisation on Wrapper<Arg> for Arg=ItemImpl& or derived).
  • C) Copy the wrapped argument list into a closure to be called when the undo step is executed.
  • D) When the closure is called, “unwrap” each of the wrapped arguments to yield its original type (i.e. construct a std::tuple<Args…> from std::tuple<Wrapper<Args>…>).
  • E) When unwrapping an object, resolve the descriptor stored internally (i.e. put more magic into Wrapper<Arg> to yield a valid Arg& object).
  • F) Construct a variable argument call to &TrackImpl::insert_part(…) (i.e. apply a C++ argument pack).

In short, I got A, C, D and F working after significant efforts.
A is somewhat straightforward with C++11 variadic template arguments. C can be accomplished with a C++11 lambda capture list, and F involves copying over std::integer_sequence from the C++14 proposals and hacking its std::apply() template to support instance + method calls. Lastly, D can be implemented in a fashion related to F.
What’s left is B and E, i.e. writing a wrapper that will store and yield ordinary arguments such as int or std::string, and convert ItemImpl&-derived types back and forth between a string representation.
Probably laborious but doable — or so I thought.
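
For illustration, here is a rough sketch of point F using C++14’s std::index_sequence (my code copied std::integer_sequence from the C++14 proposals instead); the TrackImpl below is a stand-in, not the real Beast class:

#include <cstdio>
#include <tuple>
#include <utility>

// Expand a stored tuple into an instance + method call (point F).
template<class Obj, class Ret, class ...Args, std::size_t ...Is> Ret
call_unpacked (Obj &obj, Ret (Obj::*method) (Args...),
               std::tuple<Args...> &args, std::index_sequence<Is...>)
{
  return (obj.*method) (std::get<Is> (args)...);  // unpack tuple elements as arguments
}

template<class Obj, class Ret, class ...Args> Ret
apply_method (Obj &obj, Ret (Obj::*method) (Args...), std::tuple<Args...> args)
{
  return call_unpacked (obj, method, args, std::index_sequence_for<Args...>());
}

struct TrackImpl {      // stand-in for the real class
  void
  insert_part (unsigned tick, int part_id)
  {
    std::printf ("insert part %d at tick %u\n", part_id, tick);
  }
};

int
main ()
{
  TrackImpl track;
  // The tuple could be stored in a closure and applied later, as an undo step would.
  apply_method (track, &TrackImpl::insert_part, std::make_tuple (8u, 42));
  return 0;
}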

It turns out that because of all the argument and tuple packing hassle (template recursion, integer sequencing and more) involved in implementing A, D and F, it would be hard to pass the needed serialization context into Wrapper<>. And what’s much worse is that g++-4.9 started to choke on template errors during the Wrapper<> development, aborting with “confused by earlier errors” after pages and pages of template error messages. clang++-3.4 isn’t yet capable of processing the C++11 code used by Rapicorn, so it wasn’t of help here either (I plan another attempt at porting my C++11 code to be clang++ compatible once I get my hands on a newer clang++ version).
So in the end, I gave up after an overlong day in the middle of E, everything else having been accomplished. The g++-4.9 choking was a major letdown, but probably even more important: while I had the necessary state and mood to process multiple pages of template error messages yesterday, the same cannot be expected of every future push_undo() user whenever a push_undo() argument mismatches.

This morning, I threw away yesterday’s templating excess and within an hour got an alternative interface to work:

// undoing part removal needs an undo_descriptor because future
// deletions may invalidate and recreate the part object
TrackImpl *this;
const uint tick;
PartImpl &part;
UndoDescriptor<PartImpl> part_descriptor = undo_descriptor (part);
auto lambda = [tick, part_descriptor] (TrackImpl &self) {
  PartImpl &part = self.undo_resolve (part_descriptor);
  self.insert_part (tick, part);
};
push_undo ("Remove Tick", *this, lambda);

That is, this interface is fully type-safe, but the ‘part’ wrapping has to be done manually, which involves writing a small lambda around TrackImpl::insert_part(). If any argument of the lambda or the push_undo() call is erroneous, the compiler will point at a single failing variable assignment in the implementation of push_undo<>() and list the mismatching arguments.
That is much more digestible than multiple pages of template recursion errors, so it’s a plus for future maintenance.

The short version of push_undo<>() that takes a method pointer instead of a lambda is still available for implementing undo steps that don’t involve object references, incidentally covering the majority of uses.

May 26, 2015

A good while ago at a conference, I got into a debate over the usefulness of TLS (thread-local storage of variables) in performance-critical code. Allegedly, TLS was too slow for practical use, especially from shared libraries.

TLS can be quite useful for context-sensitive APIs; here’s a simple example:

push_default_background (COLOR_BLUE);
auto w = create_colorful_widget(); // gets blue background
pop_default_background();

For a single-threaded program, the above push/pop functions can keep the default background color for widget creation in a static variable. But to allow concurrent widget creation from multiple threads, that variable has to be managed per thread, so it needs to become a thread-local variable.
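
Here is a minimal sketch of the idea (not Rapicorn’s actual API), using the C++11 thread_local keyword so each thread keeps its own default-background stack:

#include <vector>

enum Color { COLOR_NONE, COLOR_BLUE };

static thread_local std::vector<Color> bg_stack;  // one instance per thread

void  push_default_background (Color c) { bg_stack.push_back (c); }
void  pop_default_background ()         { bg_stack.pop_back (); }
Color current_default_background ()
{
  return bg_stack.empty () ? COLOR_NONE : bg_stack.back ();
}

int
main ()
{
  push_default_background (COLOR_BLUE);
  Color c = current_default_background ();  // COLOR_BLUE, in this thread only
  pop_default_background ();
  return c == COLOR_BLUE ? 0 : 1;
}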

Another example is GSlice, a memory allocator that keeps per-thread allocation caches (magazines) for fast successive allocation and deallocation of equally sized memory chunks. While operating within the cache size, only thread-local data needs to be accessed to release and reallocate memory chunks, so no synchronization operations with other threads are needed that could degrade performance.
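
A greatly simplified sketch of such a per-thread cache might look as follows (hypothetical code in the spirit of GSlice magazines, not GLib’s implementation):

#include <cstddef>
#include <cstdlib>

struct Chunk { Chunk *next; };

static thread_local Chunk *magazine = nullptr;  // this thread's free list

void*
cached_alloc (size_t chunk_size)
{
  if (magazine)
    {                        // fast path: only this thread touches the
      Chunk *c = magazine;   // magazine, so no synchronization is needed
      magazine = c->next;
      return c;
    }
  return std::malloc (chunk_size >= sizeof (Chunk) ? chunk_size : sizeof (Chunk));
}

void
cached_release (void *mem)
{
  Chunk *c = static_cast<Chunk*> (mem);
  c->next = magazine;        // push the chunk back onto this thread's cache
  magazine = c;
}

int
main ()
{
  void *a = cached_alloc (32);
  cached_release (a);            // lands in this thread's magazine
  void *b = cached_alloc (32);   // reuses the cached chunk without malloc
  cached_release (b);
  return 0;
}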

GCC (I’m using 4.9.1 here), GLibc (2.19) et al. have seen a lot of improvements since, so I thought I’d dig out an old benchmark and evaluate how TLS performs nowadays. To test the shared library case in particular, I’ve written the benchmark as a patch against Rapicorn and posted it here: thread-local-storage-benchmark.diff.

The following table lists the best results from multiple benchmark runs. The numbers shown are the times for 2 million function calls that each fetch a (TLS) pointer of the respective kind (plus some benchmarking overhead), on a Core-i7 CPU @ 2.80GHz in 64-bit mode:

Local pointer access (no TLS):                0.003351 seconds
Shared library TLS pointer access:            0.003741 seconds
Static pointer access (no TLS):               0.004450 seconds
Executable global TLS pointer access:         0.004735 seconds
Executable function-local TLS pointer access: 0.004828 seconds

The greatest timing variation in these numbers is about thirty percent (30.6%). In realistic scenarios, the time needed for pointer accesses is influenced by other, more dominant factors, like code locality and data cache misses.

So while it may have been true that TLS had some performance impact in its infancy, with a modern toolchain on AMD64 Linux, performance is definitely not an issue with the use of thread-local variables.

Here is the breakdown in nanoseconds per pointer access call:

[Chart: TLS benchmark, nanoseconds per pointer access call]

Let me know if there are other platforms that don’t perform as well.

May 05, 2015

Giving in to persistent nagging from Stephen and Stefan about progress updates (thanks guys), I’ll cherry-pick some of the branches recently merged into Rapicorn devel for this post. We’ll see if I can keep posting updates more regularly in the future… 😉

Interactive Examples

Following an idea Pippin showed me for his FOSDEM talk, I’ve implemented a very simple script (merged with the ‘interactive-examples’ branch) that restarts an example program whenever any file in a directory hierarchy changes. This allows “live” demonstrations of widget tree modifications in source code, e.g.:

cd rapicorn/
misc/interactive.sh python ./docs/tutorial/tuthello.py &
emacs ./docs/tutorial/tuthello.py
# modify and save tuthello.py

Every time a modification is saved, tuthello.py is restarted, so the test window it displays “appears” to update itself.

Shared_ptr widgets

Last weekend, I also pushed the make_shared_widgets branch to Rapicorn.

A while ago, we started to use std::shared_ptr<> to maintain widget reference counts instead of the hand-crafted ref/unref functions that used atomic operations. After several cleanups, we can now also use std::make_shared() to allocate the same memory block for storing the reference count and widget data. Here is an image (originals by Herb Sutter) demonstrating it:

[Diagram: separate vs. combined allocations of object and reference count with std::make_shared; originals by Herb Sutter]

The hand-optimized atomic operations we used previously had some speed advantages, but using shared_ptr was needed to properly implement remote reference counting.
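
In code, the difference pictured above boils down to this (a simple sketch; the Widget type is just an example):

#include <memory>

struct Widget { int width, height; };

int
main ()
{
  std::shared_ptr<Widget> a (new Widget ());  // two allocations: Widget, then control block
  auto b = std::make_shared<Widget> ();       // one allocation holding object + reference count
  return 0;
}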

Resources

Since 2003 or so, Beast and later Rapicorn have had the ability to turn any resource file, e.g. PNG icons, into a stream of C char data to be compiled into a program data section for runtime access. The process was rather unordered and ad hoc though: any source file could include char data generated that way, but each case needed its own make rules and support code to access, uncompress and use that data. Lately I did a survey across other projects on how they go about integrating resource files, and simplified matters in Rapicorn based on the inspiration I got.
With the merge of the ‘Res’ branch, resource files like icons and XML files have now all been moved under the res/ directory. All files under this subdirectory are automatically compressed and compiled into the Rapicorn shared library, and are accessible through the ‘Res’ resource class. Example:

Blob data = Res ("@res icons/example.png");

Blob objects can be constructed from resources or memory-mapped files; they provide size() and data() methods and are automatically memory managed.
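
Pieced together, usage might look like this (a hedged sketch: size() and data() come from the description above, while the header name is an assumption):

#include <rapicorn.hh>  // assumed umbrella header
#include <cstdio>
using namespace Rapicorn;

int
main ()
{
  Blob blob = Res ("@res icons/example.png");                 // compiled-in resource
  std::printf ("%lu bytes\n", (unsigned long) blob.size ());  // Blob manages the memory
  return 0;
}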

New eval syntax

In the recently merged ‘factory-eval-syntax’ branch, we’ve changed the expression evaluation syntax for UI XML files to the following:

<label markup-text="@eval label_variable"></label>

Starting attribute values with ‘@’ has precedent on other platforms and is also useful in other contexts like resources, which allows us to reduce the number of syntax special cases for XML notations.

Additionally, the XML files now support property element syntax, e.g. to set the ‘markup_text’ property of a Label:

<Label>
    <Label.markup-text> Multiline <b>Text</b>... </Label.markup-text>
</Label>

This markup is much more natural for complex property values and also has precedent on other platforms.

What’s next

I’m currently knee-deep in the guts of new theming code, the majority of which has just started to work, but some important bits still need finishing. This also brings some interesting renovation of widget states, which I hope to cover here soon. As always, the Rapicorn Task List contains the most important things to be worked on next. Feedback on missing tasks or opinions on what to prioritize are always appreciated.

Dec 31, 2014

In the true tradition of previous years, this year’s 31C3 in Hamburg revealed another bummer about surveillance capacities:

The brief summary is that viable attacks are available to surveillance agencies for PPTP, IPSEC, SSL/TLS and SSH.
New papers reveal that as of 2012, OTR and PGP seem to have resisted decryption attempts.

A related “Spiegel” article provides more details and the leaked papers that contain this information: Inside the NSA War on Internet Security.

Several vulnerabilities regarding SSL/TLS have been discovered and fixed in the years since these papers were created. But at the very least, state agencies retain the possibility to decrypt individual connections with fake certificates via man-in-the-middle attacks.

The claimed decryption of SSH caught me by surprise though; it’s a tool deeply ingrained into my daily workflow.

At the conference, I got a chance to discuss this with Jacob after studying some of the Spiegel revelations, and since I’ve been asked about this so much, I’ll wrap it up here:

  • The cited papers put an emphasis on breaking other crypto protocols like PPTP and IPSEC. Those, and even SSL, enjoy much more focus than possible attacks on SSH.
  • Clearly, good attacks are possible against password protected sessions, given lots of computation power or (targeted) password collection databases.
  • Also, 768-bit RSA keys are nowadays probably breakable by surveillance agencies, and 1024-bit keys could be within reach, based on revelations about their processing capacities.
  • Even 2048-bit keys could become approachable given future advances in mathematical attacks, or weak random number generators used for key generation, as was the case in Debian in 2008 (CVE-2008-0166).
  • Additionally, there always remains the possibility of an undiscovered SSH implementation bug or protocol flaw that’s exploitable for agencies.

The fact is, we don’t yet know enough details about all the possible attack surfaces against SSH available to the agencies, and we badly need more information to know which infrastructure components remain safe and reliable for our day-to-day work. However, we do have an idea about the weak spots that should be avoided.

My personal takeaway is this:

  • Never allow password based SSH authentication ever:
    /etc/ssh/sshd_config: PasswordAuthentication no
  • Use only 4096-bit keys for SSH authentication; I have been doing this for more than 5 years and performance has not been a problem:
    ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_HOSTNAME -C account@HOSTNAME
  • Turn to PGP and OTR for useful encryption.

Have a happy new year everyone…

Dec 12, 2014
[Image: Miller, Gary – Wikimedia Commons]

In the last months I finally completed and merged a long-standing debt into Rapicorn. Ever since the Rapicorn GUI layout & rendering thread was separated from the main application (user) thread, referencing widgets (from the application via the C++ binding or the Python binding) worked mostly due to luck.

I investigated and researched several remote reference counting and distributed garbage collection schemes, and many kudos go to Stefan Westerfeld for letting me bounce ideas off him over time. In the end, the best solution for Rapicorn makes use of several unique features of its remote communication layer:

  1. Only the client (user) thread will ever make two way calls into the server (GUI) thread, i.e. send a function call message and block for a result.
  2. All objects are known to live in the server thread only.
  3. Remote messages/calls are strictly sequenced between the threads, i.e. messages will be delivered and processed sequentially and in order of arrival.

This allows the following scheme (a condensed code sketch follows the list):

  1. Any object reference that gets passed from the server (GUI) thread into the client (user) thread enters a server-side reference-table to keep the object alive. I.e. the server thread assumes that clients automatically “ref” new objects that pass the thread boundary.
  2. For any object reference that’s received by a client thread, uses are counted separately on the client-side and once the first object becomes unused, a special message is sent back to the server thread (SEEN_GARBAGE).
  3. At any point after receiving SEEN_GARBAGE, the server thread may opt to initiate the collection of remote object references. The current code has no artificial delays built in and does so immediately (thresholds for delays may be added in the future).
  4. To collect references, the server thread swaps its reference-table for an empty one and sends out a GARBAGE_SWEEP command.
  5. Upon receiving GARBAGE_SWEEP, the client thread creates a list of all object references it received in the past and for which the client-side use count has dropped to zero. These objects are removed from the client’s internal bookkeeping and the list is sent back to the server as GARBAGE_REPORT.
  6. Upon receiving a GARBAGE_REPORT corresponding to a previous GARBAGE_SWEEP command, the server thread has an exact list of references to purge from its previously detached reference-table. Remaining references are merged into the currently active table (the one that started empty upon GARBAGE_SWEEP initiation). That way, all object references that have been sent to the client thread but are now unused are discarded, unless they have meanwhile been added to the newly active reference-table.
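
The server-side half of this bookkeeping is small enough to sketch; the following condensed code (hypothetical types and names, not the actual Rapicorn sources) shows the table swap of step 4 and the merge of step 6:

#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

typedef uint64_t ObjectId;
typedef std::unordered_map<ObjectId, std::shared_ptr<void>> RefTable;

struct ServerConnection {
  RefTable active_refs;    // keeps remoted objects alive (step 1)
  RefTable sweeping_refs;  // detached table awaiting the GARBAGE_REPORT

  void
  start_garbage_sweep ()   // step 4: swap tables, then send GARBAGE_SWEEP
  {
    sweeping_refs.swap (active_refs);   // active_refs is empty afterwards
    // send_message (GARBAGE_SWEEP);    // message passing omitted in this sketch
  }
  void
  handle_garbage_report (const std::vector<ObjectId> &trash)  // step 6
  {
    for (ObjectId id : trash)
      sweeping_refs.erase (id);         // purge references reported as unused
    active_refs.insert (sweeping_refs.begin (),
                        sweeping_refs.end ());  // retain the still-used rest
    sweeping_refs.clear ();
  }
};

int
main ()
{
  ServerConnection server;
  server.active_refs[1] = std::make_shared<int> (42);
  server.start_garbage_sweep ();
  server.handle_garbage_report (std::vector<ObjectId> (1, 1));  // purges object 1
  return (int) server.active_refs.size ();  // 0 remaining
}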

So far, the scheme works really well. Swapping out the server-side reference-tables copes properly with the trickiest case: a (new) object reference traveling from the server to the client (e.g. as part of a get_object() call result) while the client is about to report this very reference as unused in a GARBAGE_REPORT. Such an object reference will be received by the client after its garbage report creation and treated as a genuinely new object reference arriving, similar to the result of a create_object() call. On the server side it is simply added to the new reference-table, so it’ll survive the server receiving the garbage report and the subsequent garbage disposal.

The only thing left was figuring out how to automatically test that an object is collected/unreferenced, i.e. write code that checks that objects are gone…
Since we moved to std::shared_ptr for widget reference counting and often use std::make_shared(), there isn’t really any way to generically hook into the last unref of an object to install test code. The best-effort test code I came up with can be found in testgc.py. It enables GC layer debug messages, triggers remote object creation + release, and then checks the debugging output for corresponding collection messages. Example:

TestGC-Create 100 widgets... TestGC-Release 100 widgets...
GCStats: ClientConnectionImpl: SEEN_GARBAGE (aaaa000400000004)
GCStats: ServerConnectionImpl: GARBAGE_SWEEP: 103 candidates
GCStats: ClientConnectionImpl: GARBAGE_REPORT: 100 trash ids
GCStats: ServerConnectionImpl: GARBAGE_COLLECTED: \
  considered=103 retained=3 purged=100 active=3

The protocol is fairly efficient at saving bandwidth and task switches: ref-messages are implicit (never sent), unref-messages are sent only once and support batching (GARBAGE_REPORT). What’s left are the two tiny messages that initiate garbage collection (SEEN_GARBAGE) and synchronize reference counting (GARBAGE_SWEEP). As I hinted earlier, in the future we can introduce arbitrary delays between the two to reduce overhead and increase batching, if that ever becomes necessary.

While it’s not the most user-visible functionality implemented in Rapicorn, it presents an important milestone for reliable toolkit operation and a foundation for future developments like remote calls across process or machine boundaries.

Merge 14.10.0 – includes remote reference counting.

Merge shared_ptr-reference-counting – post-release.