X11 security isolation

I previously wrote about methods for running untrusted code on a Linux workstation, with bare-metal performance and convenient local access to the build tree. Probably the best method for doing this is to use schroot. But by default, processes running under schroot still have access to the host’s X server, and can do things like keystroke logging and screenshot capture.

This is quite a nasty error on my part: it means the system I’ve been using for the last two years doesn’t actually meet one of my main security goals. So I think a post-mortem is in order.

Linux provides a concept of “abstract sockets”, which are named sockets that exist outside the filesystem. The set of abstract sockets is therefore shared between processes with different root filesystems, so a chroot does nothing to hide them.
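As a minimal sketch of why a chroot does not help here, the following Python snippet binds and connects to an abstract socket. The function name and the socket name are made up for illustration; the only Linux-specific detail is the leading NUL byte in the address, which selects the abstract namespace. Nothing is created on disk, so no filesystem permission or chroot boundary is ever consulted.

```python
import os
import socket


def abstract_roundtrip(name, payload):
    """Send payload through a Linux abstract socket.

    Returns (received_bytes, exists_on_disk).
    """
    # A leading NUL byte puts the address in the abstract namespace (Linux only).
    address = b"\0" + name.encode()
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(address)
        server.listen(1)
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
            # The kernel queues the connection, so this returns before accept().
            client.connect(address)
            conn, _ = server.accept()
            with conn:
                client.sendall(payload)
                received = conn.recv(len(payload))
        # Although the name looks like a path, no file was ever created.
        return received, os.path.exists(name)
```

The name only looks like a path. Any process in the same network namespace that knows the name can connect, whatever its root filesystem is.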

In March 2008, Adam Jackson added abstract socket support to the X.org server, based on a patch by Bill Crawford. The rationale for this was unclear at the time, but in September when client support was added, Adam Jackson explained to the Xcb mailing list that “the main advantages [of abstract sockets] are that they work without needing access to /tmp”.

So from the original introduction of the feature, it was acknowledged that the rationale was to bypass security controls.

In 2010, Jan Chadima applied the same rationale when he requested that the feature be added to OpenSSH’s X forwarding (bug #1789). He explained that “this is useful when the selinux rules prevents the /tmp directory”. Here we had the first critical evaluation of the security of the feature, from Damien Miller who wrote:

Isn’t the solution for SELinux rules breaking /tmp to fix the SELinux rules? Abstract sockets look like a complete trainwreck waiting to happen: a brand new, completely unstructured but shared namespace, with zero intrinsic security protections (not even filesystem permissions) where every consumer application must implement security controls correctly, rather than letting the kernel do it.

Well said.

In 2014, Keith Packard proposed to have the X.org server stop listening on regular UNIX sockets by default, relying entirely on abstract sockets. The problem of OpenSSH’s non-compliance was raised, so he suggested:

Perhaps someone with a clue about the security implications of using abstract sockets vs file system sockets might chime in and explain why using abstract sockets is safer than file system sockets…

Cue cricket chirping noise.

The issue is known to other people who have tried to sandbox processes on hosts with an X server. The developers of a browser-sandboxing system called Firejail wrote in February 2016:

The only way to disable the abstract socket @/tmp/.X11-unix/X0 is by using a network namespace. If for any reasons you cannot use a network namespace, the abstract socket will still be visible inside the sandbox. Hackers can attach keylogger and screenshot programs to this socket.

This, thankfully, does not appear to be true. You can use the X server command line parameter -nolisten local to prevent your X server from listening on the abstract socket. The UNIX socket transport (called “unix” in the X command line) will still be enabled, and all applications will use it instead.

For plain xinit, this means having a /etc/X11/xinit/xserverrc containing something like

#!/bin/sh

exec /usr/bin/X -nolisten tcp -nolisten local "$@"

For LightDM, you can create a file called e.g. /etc/lightdm/lightdm.conf.d/50-no-abstract.conf with contents:

[Seat:*]
xserver-command=X -nolisten local

At least, it works for me. You can test it within the chroot using:

socat ABSTRACT-CONNECT:/tmp/.X11-unix/X0 -

This should report “Connection refused”.
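The same check can be scripted. This is a rough Python equivalent of the socat test, assuming display :0; abstract_socket_accepts is a hypothetical helper name, not part of any X or socat tooling.

```python
import socket


def abstract_socket_accepts(name):
    """Return True if something is listening on the abstract socket `name`."""
    # The leading NUL byte selects the Linux abstract namespace.
    address = b"\0" + name.encode()
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as probe:
        try:
            probe.connect(address)
        except ConnectionRefusedError:
            # Nobody is listening on that abstract name.
            return False
        return True
```

Run inside the chroot, abstract_socket_accepts("/tmp/.X11-unix/X0") should return False once the X server has been started with -nolisten local.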

Dimmable night light

I’m not a big fan of using technology just for the sake of it. We used to have a wireless battery-powered door bell, but it was unreliable: once, a heavy-handed delivery driver pushed in the rubber button so far that it got stuck under the case. Then every caller for a week after that just pressed it anyway, and assumed it was working and that the lack of sound must be because the buzzer couldn’t be heard from outside the house. So I replaced it with a ship bell, which has the advantage of providing instant ear-splitting feedback to the user as to whether it is working or not. It has had 100% uptime over several years of operation.

So it is with dismay that I observe the trend towards push-button or touch controls on everything. My wife needs a dimmable bedside lamp: bright enough to help her breastfeed in bed at night, but not so bright as to interfere with sleeping. I went shopping and found a wide variety of inappropriate designs. For example, some have only a single button and need to be cycled through the brightest setting in order to turn them off. How can a designer make such a thing and still take pride in their work? I know potentiometers are expensive, costing $1 or so, but even at the top end of the market, with lamps costing $200, the best they can do is put half a dozen touch switches on them, giving you bidirectional control over brightness and cycling through colour temperature settings. But they are still harder to use than an old-fashioned knob, and with an inappropriate minimum brightness for my application.

So I made my own.

Hazel threw an LED torch down the stairs and broke it. Angela asked me to fix it. Well, it only cost $4 from the Reject Shop in the first place, so my repairs weren’t very careful. I opened up the aluminium case lengthwise with a Dremel cut-off wheel and found the problem — the circuit board had been soldered to the case, and that solder joint had broken. Oh well, every failure is an opportunity, right?

For days, as I walked around the house, I looked at every item for its potential as a lamp. Eventually I settled on the acrylic case for an old iPod Shuffle. I put the iPod itself in the bin.

It provides indirect lighting: the upward-facing LEDs put a spot on the ceiling, lighting up the room without excessive glare for people lying in bed.

Bill of materials:

  • Salvaged LED array
  • 10kΩ linear potentiometer
  • 100Ω resistor
  • BC548 NPN transistor
  • 1N914-type signal diode
  • iPod shuffle case
  • 2.1mm DC barrel jack
  • Universal switch-mode power supply set to 9V

All items were from my stock or salvaged.

dimmable lamp

To dim an LED with a potentiometer, you need to control the current rather than the voltage. If you control the voltage, then you’ll get nothing at all until it reaches a certain threshold, and then the brightness will rise exponentially until something overheats. So I adapted a simple voltage-controlled current source from Horowitz & Hill (2nd edition) to provide roughly linear current control as you turn the knob. Biasing the lower end with a diode brings the zero current point to approximately the zero position of the potentiometer. In practice, at the lowest setting, there is a very slight glow from the LED array which is only visible in a very dark room. I’ll call that a feature, to help you find the knob at night.
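To put rough numbers on the difference, here is a small Python sketch. It is not a model of the actual lamp. The Shockley equation is only a crude approximation for an LED, the saturation current and ideality factor are placeholders, and I am assuming the textbook arrangement in which the 100Ω resistor sits in the emitter leg and the transistor stays out of saturation. Base current drawn from the potentiometer is ignored.

```python
import math

THERMAL_VOLTAGE = 0.02585  # volts, kT/q at about 27 °C


def diode_current(v_forward, i_sat=1e-18, ideality=2.0):
    """Shockley estimate of diode current (A) for a forward voltage (V).

    i_sat and ideality are placeholder values, not measurements.
    """
    return i_sat * (math.exp(v_forward / (ideality * THERMAL_VOLTAGE)) - 1.0)


def source_current_ma(v_base, r_emitter=100.0, v_be=0.65):
    """Current (mA) that a base voltage sets through an emitter resistor.

    Assumes an ideal emitter follower: the emitter sits v_be below the base,
    and the collector (LED) current equals the emitter current.
    """
    return max(0.0, (v_base - v_be) / r_emitter) * 1000.0
```

With voltage control, each extra 0.1V multiplies the current by a constant factor, which is why the usable range of the knob would be tiny. With the current source, each extra volt at the base adds the same 10mA, and a base voltage of about one diode drop gives roughly zero current, which is what the biasing diode is for.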

Battery power?

Update: A friend asked me about battery power. The torch the LED array came from used 3 AAA batteries, so about 4.5V, but with this inefficient current source the supply voltage needs to be doubled, since at full brightness, the voltage drop across the resistor R2 is as much as across the LED. And even if you used a current source that could go from rail to rail, you would still waste up to half the power.

A better solution is to power it from its original 4.5V, but with PWM. No doubt something could be cooked up with 555 timers, but they’re not really my style, and I don’t have them in stock. I do have microcontrollers, and a microcontroller solution for this would have some nice advantages.

So I would use an ATtiny44, with a circuit very similar to my season clock (which, by the way, is still running on its original AA batteries, almost 3 years later). I would measure the potentiometer voltage with the microcontroller’s ADC, and when it drops below a certain voltage, go into sleep mode, waking say once every 100ms. I would power the potentiometer from a digital output, just turning it on long enough to measure it. That saves 450µA in sleep mode, which is the current that 4.5V would otherwise drive through the 10kΩ potentiometer continuously. Maximum DC output for this chip is 40mA, so an outboard transistor may be needed, depending on choice of the LED array.

Lightweight isolation for software build and test

About two years ago, I decided that it’s not a good idea to constantly download unreviewed software written by untrusted individuals and to run that software with full privileges on my laptop. If I was telling a child how to avoid getting viruses on their Windows machine, this would be obvious and normal. But because I am talking about developing open source software on Linux, I find myself some distance from the beaten track.

I like things to be fast and cheap, and so I like the idea of a local build system. But my laptop is used for all sorts of things that should not necessarily be shared with the developers of the software I am patching. I want bare metal performance, but I want to restrict access to sensitive files, such as:

  • The SSH agent socket
  • The X display, which allows keyboard logging and a variety of other attacks
  • Password manager databases
  • Emails (such as password reset emails)
  • Write access to /tmp, which allows race attacks on various services
  • The application’s own source tree…

At first, I used a Debian package called schroot, which automates a lot of the work involved in setting up a permanent chroot. And in the last few weeks, I have been trialling LXC, a very similar technology which has recently matured substantially.

My first setup used a read-only source directory, but I kept encountering cases where build systems want to write to their own source trees. MediaWiki now uses Composer extensively, Parsoid uses npm, and HHVM’s build system subtly depends on the build directory being the same as the source directory. So in all these cases, it’s convenient to give the build system its own writeable source tree. It’s possible to do this without giving the application the ability to write to the copy of the source tree which is destined for a git commit.

The solution I’ve settled on for this is aufs, which is a kind of union filesystem. It is really a joy to use. I can edit source files in my GUI editor, and as soon as I hit “save”, the changes are visible in the build environment. The build system can edit or delete its own source files, but those modifications are not visible in the host environment. And if the build system screws up its own source tree beyond easy rectification (which happens surprisingly often), I can just wipe the whole overlay, instantly reverting the build tree to the state seen in the host environment. It is like git clean except that you can put unversioned files into the host source tree without any risk of accidental deletion.
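For readers who haven’t used a union filesystem, here is a toy Python model of how a two-branch union decides which file you see. It is only a sketch of the idea, not how aufs is implemented: the real thing lives in the kernel and handles directories, hard links and much more. The function name is made up; the “.wh.” prefix is the naming aufs uses for whiteout files, which mark a file as deleted in the writeable branch.

```python
import os

WHITEOUT_PREFIX = ".wh."  # aufs marks deletions with ".wh.<name>" files


def union_lookup(relpath, rw_branch, ro_branch):
    """Return the real path a two-branch union would expose, or None.

    Simplified: only a whiteout for the file itself is considered,
    not whiteouts of its parent directories.
    """
    directory, name = os.path.split(relpath)
    whiteout = os.path.join(rw_branch, directory, WHITEOUT_PREFIX + name)
    if os.path.lexists(whiteout):
        # Deleted inside the build environment: hidden from the union view,
        # but the read-only host copy is untouched.
        return None
    # The writeable overlay takes priority over the read-only host tree.
    for branch in (rw_branch, ro_branch):
        candidate = os.path.join(branch, relpath)
        if os.path.lexists(candidate):
            return candidate
    return None
```

In this model, wiping the overlay directory removes every modified copy and every whiteout at once, so each lookup falls through to the host tree again.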

If I need to generate a source file with the build system and then transfer it to the host system for commit, it is just a simple file move:

mv /srv/chroot/build-overlay/mw/core/autoload.php .

Comparison of LXC and schroot

Feature                               schroot                        LXC
localhost                             Shared                         Isolated
Comprehensible error messages         Yes                            No
Automatic mount point setup           Yes                            No
Automatic user ID sharing             Yes (setup.nssdatabases)       No
Unprivileged start and login          Yes (schroot.conf “users”)     No (lxc-start, lxc-attach must run as root)
systemd inside container?             No                             Yes
SysV init scripts inside container?   Yes                            Yes
Root filesystem storage options       Diverse                        Limited

The lack of network isolation in schroot could be a problem if you have sensitive services bound to TCP on localhost. It’s possible to bring up network services inside the container — even ones that are duplicated in the host, as long as you use a different port number or IP address. It’s a little-known fact that 127.0.0.1 is only one IP address in a subnet of 16.8 million — you can bind local services in the container to say 127.0.0.2.
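A quick Python sketch of the loopback point, using only the standard library. The helper name loopback_listener is hypothetical, and 127.0.0.2 is just an example address.

```python
import ipaddress
import socket

LOOPBACK = ipaddress.ip_network("127.0.0.0/8")


def loopback_listener(address="127.0.0.2", port=0):
    """Return a TCP socket listening on an address inside 127.0.0.0/8.

    port=0 asks the kernel to pick a free port.
    """
    if ipaddress.ip_address(address) not in LOOPBACK:
        raise ValueError("%s is not a loopback address" % address)
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((address, port))
    server.listen(1)
    return server
```

LOOPBACK.num_addresses is 16777216, which is where the 16.8 million comes from. On Linux, a service bound this way in the container can use the same port number as a service on the host’s 127.0.0.1 without conflict.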

schroot is generally less buggy and easier to use than LXC. That’s partly maturity and partly the greater level of difficulty involved in the implementation of LXC. For example, schroot is able to process fstab files by just running mount(8), whereas LXC is forced to reimplement significant amounts of code from mount(8). It parses fstab-like syntax itself and has special-case support for many different filesystems.

In LXC it’s normal to run the whole system as a daemon, starting with /bin/init; this is the default behaviour of lxc-start. There are lots of components that make this work, each with its own log file. Often, when I made a configuration error, lxc-start would print no error but the container would fail to start. I would then have to hunt around in the logs, turning on logging where necessary, to figure out what went wrong.

In schroot, by contrast, a persistent session is simply a collection of mounts; there does not need to be any process running inside the container for it to exist. So session start is synchronous and error propagation is trivial. Session termination is implemented as a shell script that iterates through /proc/*/root, killing all processes that appear to be running under the session in question.
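The /proc scan is simple enough to sketch in a few lines of Python. This is my own illustration of the idea, not a translation of schroot’s script, and it deliberately only lists processes rather than killing them. The function name and the proc_dir parameter are made up; the parameter exists so the logic can be tried against a fake directory tree.

```python
import os


def pids_rooted_at(session_root, proc_dir="/proc"):
    """Return sorted PIDs whose root directory is session_root or below it."""
    session_root = os.path.normpath(session_root)
    found = []
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            # /proc/<pid>/root is a symlink to that process's root directory.
            root = os.readlink(os.path.join(proc_dir, entry, "root"))
        except OSError:
            continue  # the process exited, or we may not inspect it
        root = os.path.normpath(root)
        if root == session_root or root.startswith(session_root + os.sep):
            found.append(int(entry))
    return sorted(found)
```

Reading the root link of other users’ processes generally requires root privileges, so an unprivileged run will only see your own processes.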

schroot has some great options for root filesystem storage. For example, you can store the whole root filesystem in a .tar.gz file. When schroot starts a session, it will unpack the archive for you, which only takes a couple of seconds for a base system. By default such a root filesystem operates as a snapshot, but it can optionally update the tar file for you on session shutdown.

How to do it

I previously wrote some notes about how to set up MediaWiki under schroot.

The procedure to set up an AUFS-based build system is almost identical in schroot and LXC. Let’s say we’re making a container called “parsoid” with a union mount for the parsoid source tree. My host source tree is in ~tstarling/src/wmf/mediawiki/services/parsoid, and I create an empty overlay directory writeable by tstarling in /srv/chroot/build-overlay/parsoid.

For schroot, you would have /etc/schroot/chroot.d/parsoid containing:

[parsoid]
type=directory
description=Parsoid build and test
directory=/srv/chroot/parsoid
setup.fstab=parsoid/fstab

And in /etc/schroot/parsoid/fstab:

none /srv/parsoid aufs br=/srv/chroot/build-overlay/parsoid=rw:/home/tstarling/src/wmf/mediawiki/services/parsoid=ro

The container root (/srv/chroot/parsoid) is created by directly invoking debootstrap.

For LXC, you create the container with lxc-create, and then add the mount to /var/lib/lxc/parsoid/config:

lxc.mount.entry = none srv/parsoid aufs br=/srv/chroot/build-overlay/parsoid=rw:/home/tstarling/src/wmf/mediawiki/services/parsoid=ro

X11 isolation

Updated June 6: When using schroot, you need to configure your X server to not use “abstract sockets”, which have a global namespace (within each netns) independent of the current root filesystem. If you are using lightdm, create a file called /etc/lightdm/lightdm.conf.d/50-no-abstract.conf with contents:

[Seat:*]
xserver-command=X -nolisten local

If your Linux distribution runs xinit directly, you would need a /etc/X11/xinit/xserverrc file containing something like:

#!/bin/sh
exec /usr/bin/X -nolisten tcp -nolisten local "$@"

For more information on X isolation, see my followup blog post.

SSL implementations compared

I reviewed several SSL implementations for coding style: OpenSSL, NSS, GnuTLS, JSSE, Botan, MatrixSSL and PolarSSL. I looked at how buffers are handled in parsers and writers. Of all of them, I think only JSSE, i.e. pure Java, can be trusted to be free of buffer overflows. This suggests that a good webserver for security-critical applications would be Tomcat, without native extensions.

In OpenSSL, the Heartbleed patch itself is a good example of what not to do:

    /* Read type and payload length first */
    if (1 + 2 + 16 > s->s3->rrec.length)
        return 0; /* silently discard */
    hbtype = *p++;
    n2s(p, payload);
    if (1 + 2 + payload + 16 > s->s3->rrec.length)
        return 0; /* silently discard per RFC 6520 sec. 4 */
    pl = p;

Bounds checks are rolled up into an obscure calculation, then the code proceeds to memcpy() straight out of the buffer.

NSS has a similar style. For example, ssl3_ComputeDHKeyHash() has:

    bufLen = 2*SSL3_RANDOM_LENGTH + 2 + dh_p.len + 2 + dh_g.len + 2 + dh_Ys.len;

In GnuTLS, a similar example of precalculation can be found in this Heartbleed-inspired patch:

    response = gnutls_malloc(1 + 2 + data_size + DEFAULT_PADDING_SIZE);

PolarSSL was also similar in style.

An embedded, open-core SSL library called MatrixSSL shows that there are alternatives to precalculation. It has bounds checks distributed throughout the parser. Before each read, the remaining length is calculated from the cursor position and compared with the read length:

    if (end - c < SSL2_HEADER_LEN) {
        return SSL_PARTIAL;
    }
    if (ssl->majVer != 0 || (*c & 0x80) == 0) {
        if (end - c < ssl->recordHeadLen) {
            return SSL_PARTIAL;
        }
        ssl->rec.type = *c; c++;
        ssl->rec.majVer = *c; c++;
        ssl->rec.minVer = *c; c++;
        ssl->rec.len = *c << 8; c++;
        ssl->rec.len += *c; c++;
    } else {
        ssl->rec.type = SSL_RECORD_TYPE_HANDSHAKE;
        ssl->rec.majVer = 2;
        ssl->rec.minVer = 0;
        ssl->rec.len = (*c & 0x7f) << 8; c++;
        ssl->rec.len += *c; c++;
    }

This makes auditing easier, and should be commended. However, it’s still a long way short of actually following CERT security guidelines by using a safe string library. And since this is an embedded library, it would have been nice to see a backend free of dynamic allocation, to allow verification tools like Polyspace Code Prover to provide strong guarantees.
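To show the distributed-check style in isolation, here is a small Python sketch of a cursor that refuses every read that would run past the end of the record, applied to a heartbeat message. None of this is MatrixSSL code; the class and function names are invented. Python raises on an out-of-range index anyway, but slices truncate silently, so the explicit check in take() is doing real work.

```python
class ShortRead(Exception):
    """Raised when a read would go past the end of the record."""


class Cursor:
    def __init__(self, data):
        self._data = data
        self._pos = 0

    def remaining(self):
        return len(self._data) - self._pos

    def take(self, n):
        # The bounds check sits right next to the read it protects.
        if n < 0 or n > self.remaining():
            raise ShortRead("wanted %d bytes, have %d" % (n, self.remaining()))
        chunk = self._data[self._pos:self._pos + n]
        self._pos += n
        return chunk

    def u8(self):
        return self.take(1)[0]

    def u16(self):
        return int.from_bytes(self.take(2), "big")


MIN_PADDING = 16  # RFC 6520 requires at least 16 bytes of padding


def parse_heartbeat(record):
    """Return (type, payload) or raise ShortRead for a malformed record."""
    cur = Cursor(record)
    msg_type = cur.u8()
    payload_len = cur.u16()
    payload = cur.take(payload_len)  # a lying length field fails here
    if cur.remaining() < MIN_PADDING:
        raise ShortRead("padding shorter than %d bytes" % MIN_PADDING)
    return msg_type, payload
```

There is no arithmetic like 1 + 2 + payload + 16 to get right. A Heartbleed-style record whose length field claims more payload than was sent is rejected by the same check as any other short read.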

Botan is written in C++ by a single author who claims to be aware of security issues, which sounded very promising. In C++, implementing bounds checks should be trivial. However, the code didn’t live up to my expectations. It does pass around std::vector<byte> instead of char* (often by value!), but the author makes extensive use of a memcpy() wrapper to actually read and write those vectors. For example:

std::vector<byte> Heartbeat_Message::contents() const
   {
   std::vector<byte> send_buf(3 + m_payload.size() + 16);
   send_buf[0] = m_type;
   send_buf[1] = get_byte<u16bit>(0, m_payload.size());
   send_buf[2] = get_byte<u16bit>(1, m_payload.size());
   copy_mem(&send_buf[3], &m_payload[0], m_payload.size());
   // leave padding as all zeros

   return send_buf;
   }

The functions are short, with short code paths between length determination and buffer use, which gives it similar auditability to MatrixSSL. But the potential security advantages of using C++ over C were squandered, and frequent copying and small dynamic allocations mean that performance will not be comparable to the C libraries.

So it seems that in every case, authors of C and C++ SSL libraries have found unbounded memory access primitives like memcpy() to be too tempting to pass up. Thus, if we want to have an SSL library with implicit, pervasive bounds checking, apparently the only option is to use a library written in a language which forces bounds checking. The best example of this is surely JSSE, also known as javax.net.ssl. This SSL library is written in pure Java code — the implementation can be found here. As I noted in my introduction, it is used by Tomcat as long as you don’t use the native library. The native library gives you “FIPS 140-2 support for TLS/SSL”; that is to say, it links to a library that probably has undiscovered buffer overflow vulnerabilities.