Device emulator: Still work in progress

In a previous post I explained the reasons why all devices should have their source code available under a license that at least permits modifying it. For devices that are meant to be integrated with third party software, an emulator should also be available.

I see this problem all the time: VoIP vendors obviously want their phones integrated into whatever software VoIP providers are deploying, but they do not provide any reasonable solution for repeatedly testing the integration of these devices. It starts at development time: a developer can certainly get on their desk the minimum of 3 phones needed to test most call scenarios, but it quickly gets annoying to have to pick up and answer these phones over and over to test the code. Even if all the annoyances that come with this did not matter (noise, cost, desk space, etc.), we live in a world where automated regression testing is (or should be) mandatory, and there is not much of it that can be done with real phones.

In the beginning of VoIP we only had VoIP adapters (into which you plug a Plain Old Telephone Service – POTS), and it was relatively easy to use a voice modem in place of the POTS to build an emulator, but now this kind of adapter is only used for fax machines and all the other devices are VoIP phones. To automate the use of these phones we need a way to press buttons, and to inject and extract audio (and sometimes video) streams, without any user intervention. Some people develop hardware solutions based on various micro-controllers, Arduino and others, but the point is that providing a reliable emulator should be the responsibility of the hardware vendor.

And these days it is not as difficult as it used to be to do so. More and more embedded designs are built around Embedded Linux, which has its own emulator, QEMU. For example, all the Android devices provided by Google can be emulated this way, making it easy to repeatedly test an application for these devices. A few years back I used this to show a demo that involved three Android phones, all emulated on a laptop; real devices were only used to demo the quality of the audio. That made the demo easier to follow, and far easier to set up.

But the point of an emulator is that it should run the exact same software as the real device, so the first thing a vendor must do is add its hardware to the QEMU emulator. That is unfortunately a task that can easily be done only by the vendor, as a lot of the time the datasheet for the hardware is only available under NDA. For the Nephelion project – which will be distributed with an emulator and the source code for its device – we were not able to build an emulator that runs the exact same binary, for this very reason. We have plans to switch to hardware that will be easier to emulate, but there is still another major issue that will prevent us from getting an emulator that is 100% identical to the real hardware:

The Nephelion device is a USB gadget, but QEMU does not make it possible to take a gadget emulated inside the guest and make it visible to the host as if it were a real device. There were some experimental patches doing this a long time ago, around the Openmoko project, but it seems that these patches were never integrated into QEMU.

So we are using usbip, which again makes our device emulator different from the real thing, as the code inside the device now has to detect that it is running on an emulator so it can start the usbip daemon and bind the dummy hcd to it. And that is without even mentioning the fact that usbip does not yet support USB 3.0.

So for now we have an emulator that is probably good enough for development and testing, but there are too many differences from the real device to consider it a complete replacement that can detect the same range of problems.

A better analogy in Oracle vs Google: API is jargon

As a software developer I know that in Oracle vs Google, Google and Judge Alsup are right and Oracle and the White House are wrong. But perhaps we need a better analogy to explain to non-developers that APIs are not copyrightable.

A word of caution: I am definitely not a lawyer, and you will not find even the beginning of a legal argument here. This is just an attempt at explaining by analogy what I think APIs are, not only as someone who makes a living from using them, implementing them and, when forced to do so, inventing new ones, but also as someone who will greatly suffer if the minefield that is software development is made even more dangerous by APIs being deemed copyrightable.

When I read some of the arguments from Oracle, and in particular the book chapter titles analogy, I felt it was contrived and explained only superficially what APIs are – the analogy breaks down quickly when you try to look at it more deeply. So I came up with something a little bit better, which is that an API is a jargon.

Jargon, as defined by The Concise Oxford Dictionary, is “words or expressions used by a particular profession or group that are difficult for others to understand.” Note that it is not a set of new words that are specific to a group or profession, but existing words that are used with a different meaning for the purpose of improving the communication of ideas inside that group or profession.

As an example taken from the Computing group or profession, the word “protocol” has a very specific meaning. In the same dictionary it has this separate definition: “Computing; a set of rules governing the exchange or transmission of data between devices.” Note also that even inside a group or profession, sub-groups can develop their own jargon, making it difficult for them to communicate with each other but greatly improving communication inside these sub-groups: the IETF and the ITU are a good example of that.

An API is very close to that: ordinary words or arrangements of words (expressions) that have a different meaning. If we take the java.net package as an example, each of the words used (verbs, nouns, etc.) has a very different sense than in general usage: Socket.isConnected() does not mean testing that an electrical socket has a plug in it; it means something different in this jargon (namely that an identifier is associated with some state resulting from a successful protocol exchange).
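To make the point concrete, here is a tiny Java fragment using that jargon (the host name is only a placeholder); every word in it is ordinary English, yet none of it is about electricity, plugs or wall sockets:

import java.io.IOException;
import java.net.Socket;

class JargonExample {
  public static void main(String[] args) throws IOException {
    // "Socket", "connected" and "close" all carry their java.net jargon
    // meaning here, not their everyday one.
    try (Socket socket = new Socket("www.example.org", 80)) {
      System.out.println("connected: " + socket.isConnected());
      }
    }
  }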

And if we want to push the analogy a bit further, one can imagine a dictionary that explains each of the words and expressions used in a jargon (The New Hacker’s Dictionary is a good example of that). The definition of each word or expression would be copyrightable, and is equivalent to the implementation of an API. Someone could write a different dictionary for the same jargon, but could not copy word for word the definitions from the first one – that would be the equivalent of what Google did: taking the jargon known as the standard Java library (developed by Sun and many others through the JCP) and writing a new implementation of it.

As I said, there is no legal argument here, but the question can now be asked this way: although it is perfectly clear that in a jargon dictionary the text of the definitions is copyrighted, is the list of words and expressions defined by this dictionary also copyrighted? I believe that the same answer is applicable to APIs.

Improving standard compliance with transclusion

Inserting fragments of a standard specification – an IETF RFC or other – as comments in the source code that implements it seems to be a simple way to ensure good conformance. Unfortunately, doing so can create legal issues if not done carefully.

These days I am spending a lot of time implementing IETF (and other SDO) protocols for the Nephelion Project. I am no stranger to network protocol implementation, as this is mostly what I have been doing for more than 25 years, but this time the very specific code needed for this project is required to be as close as possible to the standard. So I am constantly referring to the text inside the various RFCs to verify that my code is conformant. Obviously, copying the text fragments as comments would greatly simplify development and, in the end, make the translation between the English text used to describe the protocol and the programming language I use to implement it a lot more faithful.

At this point I should insert the usual IANAL but, at least to my understanding, that is something that is simply not possible. My intent is to someday release this code under a Free Software license, but even if that were not the case, I believe that all software should be built with the goal of licensing it in the future, whether under a commercial license or a FOSS one. The issue here is that the RFCs are copyrighted and that modifying them is simply not permitted by the IETF Trust and, in my opinion, rightly so, as a standard that anybody can freely modify is not much of a standard. But publishing my code under a FOSS license would give everyone the right to modify it (under the terms of the license), and that right would also apply to the RFC fragments inserted in the source code.

So the solution I use to keep the specification and the implementation as close as possible, while not having to worry about code licensing, is transclusion. Here is an example of a comment in the source code of the UDP module:

% @transclude file:///home/petithug/rsync/ietf/rfc/rfc768.txt#line=48,51

The syntax follows the Javadoc (and Pldoc, and Scaladoc) conventions. The @transclude tag indicates that the text referenced by the URL must be inserted into the source code, but only when it is displayed in a text editor. Here is what the same code looks like when loaded in VIM (the fragment from RFC 768 is reproduced here under fair use):

% @transclude file:///home/petithug/rsync/ietf/rfc/rfc768.txt#line=48,51
% {@transcluded
% Source Port is an optional field, when meaningful, it indicates the port
% of the sending process, and may be assumed to be the port to which a
% reply should be addressed in the absence of any other information. If
% not used, a value of zero is inserted.
% @@@}

(I chose this example because, until a few days ago, I did not even know that using a UDP source port of 0 was conformant.)

The @transcluded inline tag is dynamically generated by a VIM plugin, but this tag will never appear anywhere other than in the VIM buffer, even after saving the file to disk. The fragment syntax is from RFC 5147 and permits selecting the lines that must be copied (an RFC never changes, so hardcoding the line numbers in the source code cannot break in the future).
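For readers curious about how little machinery this needs, here is a rough Java sketch of resolving such a line fragment (the real plugin is VIM script, not Java, and the exact line-counting rules are those of RFC 5147 and of the plugin, which this sketch does not claim to reproduce precisely):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

class Transclude {
  // Return the lines selected by a "#line=<from>,<to>" fragment.
  static List<String> fragment(Path file, int from, int to) throws IOException {
    List<String> lines = Files.readAllLines(file);
    return lines.subList(from, Math.min(to, lines.size()));
    }

  public static void main(String[] args) throws IOException {
    // Print the selected fragment, prefixed the same way as the comments above.
    for (String line : fragment(Paths.get("rfc768.txt"), 48, 51)) {
      System.out.println("% " + line);
      }
    }
  }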

The plugin can be installed from my Debian repository with the usual “apt-get install vim-transclusion”. The plugin is kind of rough for now: only the #line=<from>,<to> syntax is supported, hardcoding the full path is not very friendly, curl is required, etc. But that is still a huge improvement over having to keep the specification and the implementation separate.

What I did these last 4 months: Project Nephelion

Between the end of my employment at Jive Communications and the Christmas break, I did a lot of research on an idea I got four years ago after an especially frustrating network protocol debugging session. I put some notes in my lab book, wrote some code, assigned a codename (Project Nephelion) and forgot about it until I had, at Jive, yet another frustrating debugging session – although for completely different reasons. So I started analyzing everything that makes this kind of debugging difficult and, after talking with friends who are in the same line of work, I decided to spend a few months looking for solutions to improve the situation and to see if I could build a business around it.

During these 4 months I designed the architecture for a device and its associated software and, even though I was not able to get answers to all my questions, at the beginning of 2015 I decided to build a prototype of the device and work on a first version of the software. The goal is to bring the prototype with me to the IETF meeting in Dallas (March 22 – 27, 2015) to show it to fellow protocol designers and implementers and gather feedback.

There is still a lot to do before I can explain in detail what this product will do (I still have to file at least one more patent application), but I just posted a description of the problems that this product may (or may not) solve on the website of the project:

The art of debugging network protocol problems

Meanwhile, and as a teaser, here’s the list of books that I bought and read specifically for this project during the exploratory phase back in 2014:

An Undecidable Problem in SIP

A few years back, one of my colleagues at 8×8 made an interesting suggestion. We were at the time discussing a recurring problem of SIP call loops in the Packet8 service, and his suggestion was to write a program that would analyze all the various forwarding rules installed in the system and simply remove those that were the cause of the loops. I wish I remembered what I responded at the time, and that I had had the insight to say that writing such a program is, well, impossible, but that is probably not what happened.

Now “impossible” is a very strong word, and I must admit that I have spent most of my career in computers thinking that nothing was impossible to code and that, worst case, I just needed a better computer. It just happens that there is a whole class of problems that are impossible to code – not problems that are merely difficult to code, or for which the best possible code would take forever to return an answer without using a quantum computer, but problems for which it is impossible to write a program that always returns correct answers. The SIP call loop problem is one of them.

To make sense of this we will need to define a lot of concepts, so let's start with a SIP call. SIP is, for better or for worse, the major session establishment protocol for VoIP. A SIP client (e.g. a phone or a PSTN gateway) establishes a call more or less like a Web browser contacts a web site, with the difference that in the case of SIP the relationship with the server lasts for the duration of the call. One of the (deeply broken) features of SIP is that there may exist intermediate network elements called SIP proxies that can, during the establishment of a call, redirect the call to a new destination. In this case the server, instead of answering the call itself, creates a new connection to a different destination, which can itself create a new connection, and so on. This mechanism obviously can create loops, especially when the forwarding rules in these servers are under the control of the end-user – which was the problem we encountered at 8×8.

There are many mechanisms a SIP proxy can use to decide if, when, and where to redirect a call, but for the sake of simplicity we will consider only a subset of all these possibilities, and assume that all the SIP proxies involved use the Call Processing Language (CPL), a standard XML-based language designed to permit end-users to control, among other things, how a SIP proxy forwards calls on their behalf. Jive's users can think of a CPL script as equivalent to the dialplan editor, but for just one user and with the additional constraint that it is not possible to create loops inside a CPL script.

Here is an example of a CPL script, taken from the standard (RFC 3880), that forwards incoming calls to the user's voicemail if she does not answer them on her computer:

<cpl xmlns="urn:ietf:params:xml:ns:cpl" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:ietf:params:xml:ns:cpl cpl.xsd">
  <incoming>
    <location url="sip:jones@jonespc.example.com">
      <proxy>
        <redirection>
          <redirect/>
        </redirection>
        <default>
          <location url="sip:jones@voicemail.example.com">
            <proxy/>
          </location>
        </default>
      </proxy>
    </location>
  </incoming>
</cpl>

One interesting feature of CPL is that it is extensible, so even if only a subset of all the capabilities of a standard-compliant SIP proxy can be implemented using baseline CPL, it is possible to add extensions to be able to use any legal feature of the SIP standard. As an example, the following CPL script (also taken from RFC 3880) rejects calls from specific callers by using such an extension:

<cpl xmlns="urn:ietf:params:xml:ns:cpl" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:ietf:params:xml:ns:cpl cpl.xsd">
  <incoming>
    <address-switch field="origin" subfield="user" xmlns:re="http://www.example.com/regex">
      <address re:regex="(.*.smith|.*.jones)">
        <reject status="reject" reason="I don't want to talk to Smiths or Joneses"/>
      </address>
    </address-switch>
  </incoming>
</cpl>

For the rest of this discussion, we will consider that we can only use CPL extensions that are legal in a standard SIP proxy.

Now that we have established the perimeter of our problem, we can formulate the question as follows: if we consider a VoIP system made only of endpoints (phones) and SIP proxies that run CPL scripts, and with complete knowledge of all the (legal) CPL extensions in use, is it possible to write a program that takes as input all these scripts and an initial call destination (known as a SIP URI) and that can tell us whether the resulting call will loop or not?

That seems simple enough: try to build a directed graph of all the forwarding rules and, if that graph is not a directed acyclic graph, then the call is looping.
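Here is what that naive approach looks like, sketched in Java under the (crucial, and ultimately untenable) assumption that each script can be statically reduced to a fixed set of possible next destinations; the names are mine, not part of any real implementation:

import java.net.URI;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class NaiveLoopDetector {
  // Assumed to have been extracted, somehow, from the CPL scripts.
  final Map<URI, Set<URI>> nextHops;

  NaiveLoopDetector(Map<URI, Set<URI>> nextHops) {
    this.nextHops = nextHops;
    }

  boolean isLooping(URI initial) {
    return visit(initial, new HashSet<URI>());
    }

  private boolean visit(URI current, Set<URI> path) {
    if (!path.add(current)) {
      return true; // already on the current path: a cycle
      }
    for (URI next : nextHops.getOrDefault(current, Collections.<URI>emptySet())) {
      if (visit(next, path)) {
        return true;
        }
      }
    path.remove(current);
    return false;
    }
  }

Of course, real CPL scripts cannot be reduced to such a static map once they are allowed to compute their next destination, which is where the real function signature below comes in.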

Let’s say that my former colleague took up the challenge and wrote this program. Using Java, he would have written something like this:

boolean isLooping(Map<URI, Document> configuration, URI initial) {
  // some clever code here...
  }

The “configuration” parameter carries the whole configuration of our SIP system as a list of mappings between an Address-Of-Record (AOR, the identifier for a user) and the CPL script that is run for this user. The “initial” parameter represents the initial destination for a call (which may contain the AOR of a user). The Java function returns true if a call to this destination will loop, or false if the call is guaranteed to reach someone or something, other than the originator of the call, without looping.

Now let’s change the subject for a little bit and talk about something called the Cyclic Tag System (CTS). CTS is a mechanism to create new bit strings from an initial bit string and an ordered list of bit strings called productions, using the following rules:

1. Remove the leftmost bit of the current bit string.
2. If the removed bit was equal to 1, concatenate the current production to the current bit string.
3. Move to the next production in the list (restarting at the first production when all are used) and continue at rule (1), unless the current bit string is empty.

The following example from Wikipedia uses an initial bit string of 11001 and a list of productions of { 010, 000, 1111 }. Following the rules above, we generate the following bit strings:

11001
1001010
001010000
01010000
1010000
010000000
10000000
...

Simple enough.
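The rules are short enough that a direct Java sketch fits in a few lines (this is my own illustration, not part of the post's system); running it with the initial bit string and productions above prints exactly the seven strings listed:

import java.util.Arrays;
import java.util.List;

class CyclicTagSystem {
  // Print the successive bit strings produced by a cyclic tag system.
  static void run(String current, List<String> productions, int steps) {
    int production = 0; // productions are used in cyclic order, one per step
    for (int i = 0; i < steps && !current.isEmpty(); i++) {
      System.out.println(current);
      char removed = current.charAt(0);
      current = current.substring(1);
      if (removed == '1') {
        current = current + productions.get(production);
        }
      production = (production + 1) % productions.size();
      }
    }

  public static void main(String[] args) {
    run("11001", Arrays.asList("010", "000", "1111"), 7);
    }
  }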

Now, let's say that we want to implement CTS in SIP. The CPL language is not powerful enough for that, but we can define a very simple extension (still conforming to the SIP standard) that permits copying a character substring into a parameter of the next destination of a call, e.g.:

<location m:url="sip:c.org;p={destination.string:2:10}" />

Here we extract the “string” parameter from the original destination and copy ten characters, starting at the second position, into the new destination. With this new extension, we can now write three CPL scripts that implement the CTS example above:

Script attached to user `sip:p1@jive.com`:

<address-switch field="destination">
  <address contains=";bitstring=1">
    <location m:url="sip:p2@jive.com;bitstring={destination.bitstring:1}010" />
  </address>

  <otherwise>
    <address-switch field="destination">
      <address contains=";bitstring=0">
        <location m:url="sip:p2@jive.com;bitstring={destination.bitstring:1}" />
      </address>

      <otherwise>
        <reject />
      </otherwise>
    </address-switch>
  </otherwise>
</address-switch>

Script attached to user `sip:p2@jive.com`:


<address-switch field="destination">
  <address contains=";bitstring=1">
    <location m:url="sip:p3@jive.com;bitstring={destination.bitstring:1}000" />
  </address>

  <otherwise>
    <address-switch field="destination">
      <address contains=";bitstring=0">
        <location m:url="sip:p3@jive.com;bitstring={destination.bitstring:1}" />
      </address>

      <otherwise>
        <reject />
      </otherwise>
    </address-switch>
  </otherwise>
</address-switch>

Script attached to user `sip:p3@jive.com`:

<address-switch field="destination">
  <address contains=";bitstring=1">
    <location m:url="sip:p1@jive.com;bitstring={destination.bitstring:1}1111" />
  </address>

  <otherwise>
    <address-switch field="destination">
      <address contains=";bitstring=0">
        <location m:url="sip:p1@jive.com;bitstring={destination.bitstring:1}" />
      </address>

      <otherwise>
        <reject />
      </otherwise>
    </address-switch>
  </otherwise>
</address-switch>

In our Java function isLooping(), these three scripts would go in the first parameter (indexed by the AOR of each user) and the initial parameter would contain “`sip:p1@jive.com;bitstring=11001`”.

It is trivial to write a program that can generate the CPL scripts needed to implement any list of productions. There is no need to even write a program: a simple XSLT stylesheet can do the job.
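For illustration only, here is a rough Java version of such a generator (the post suggests XSLT; this sketch simply reuses the shape, domain and parameter names of the hand-written scripts above):

import java.util.Arrays;
import java.util.List;

class CplGenerator {
  // Emit one script per production, each forwarding to the next user in the cycle.
  static String script(int index, int count, String production) {
    String next = "sip:p" + (index % count + 1) + "@jive.com";
    return
      "<address-switch field=\"destination\">\n" +
      "  <address contains=\";bitstring=1\">\n" +
      "    <location m:url=\"" + next + ";bitstring={destination.bitstring:1}" + production + "\" />\n" +
      "  </address>\n" +
      "  <otherwise>\n" +
      "    <address-switch field=\"destination\">\n" +
      "      <address contains=\";bitstring=0\">\n" +
      "        <location m:url=\"" + next + ";bitstring={destination.bitstring:1}\" />\n" +
      "      </address>\n" +
      "      <otherwise>\n" +
      "        <reject />\n" +
      "      </otherwise>\n" +
      "    </address-switch>\n" +
      "  </otherwise>\n" +
      "</address-switch>\n";
    }

  public static void main(String[] args) {
    List<String> productions = Arrays.asList("010", "000", "1111");
    for (int i = 1; i <= productions.size(); i++) {
      System.out.println("Script attached to user sip:p" + i + "@jive.com:");
      System.out.println(script(i, productions.size(), productions.get(i - 1)));
      }
    }
  }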

The reason we implemented CTS in SIP is that in 2000, Matthew Cook published a proof that CTS is Turing-Complete. Turing-Complete is a fancy expression meaning computationally equivalent to a computer, which basically means that any computation that can be done on a computer can also be done by the CTS system – obviously multiple orders of magnitude slower, but both a computer (or a quantum computer, or a Turing machine, or lambda calculus, or any other Turing-Complete system) and the CTS system can compute exactly the same things. Because only a Turing-Complete system can simulate another Turing-Complete system, by showing that we can implement any CTS production list in SIP we just proved that SIP is equivalent to a computer. In turn that means that it is possible to convert any existing program into a list of CPL scripts (with our small extension) and an initial SIP URI. So knowing this, and knowing that Java bytecode is also Turing-Complete (it is trivial to implement CTS in Java), we can now write this new function:


Map<URI, Document> convert(Runnable runnable) {
  // complex, but definitively implementable
  }

The function takes a Java class (i.e. a list of bytecodes) as a parameter and returns a list of { SIP URI, CPL script } mappings as a result. The class to convert implements Runnable so that we have a unique entry point into it (i.e. its run() method), an entry point that by convention becomes the SIP URI of the first mapping returned by the function.

Now that we have these two basic functions, let’s write a complete Java class that is using them:

class Paradox implements Runnable {
  boolean isLooping(Map<URI, Document> configuration, URI initial) {
    // some clever code here
    }

  Map<URI, Document> convert(Runnable runnable) {
    // complex, but definitively implementable
    }

  public void run() {
    Map<URI, Document> program = convert(this);
    if (!isLooping(program, program.keySet().iterator().next())) {
      while (true);
      }
    }
  }

What we are doing here is converting the whole class into a SIP proxy configuration and passing it to the code that can predict whether this program will loop or not. Then we do something a bit tricky with the result, which is to exit immediately if we detect that it will loop. But because this is exactly the code that was under the scrutiny of the isLooping() implementation, exiting immediately means the code does not loop, so isLooping() should have returned false, which would have executed the “while (true);” statement. But executing the “while (true);” statement means that the code is looping, which again contradicts the result we got.

Both cases are impossible, which can have only one explanation: the isLooping() implementation is unable to determine whether this program loops or not. Thus, we have shown that it is possible to write a Java program for which isLooping() cannot correctly detect whether it is looping. And because we previously proved that SIP with CPL is Turing-Complete, we know that if it is possible to write such a program in Java, it is theoretically possible to do so as a SIP configuration. Because of this, we now know that it is impossible to write a program that can reliably predict whether a SIP configuration loops or not. More precisely, for any implementation of isLooping(), it is always possible to find an instance of the “configuration” and “initial” parameters for which this version of isLooping() will return an incorrect response.

Now that we have the answer to our question, let's take a better look at what really happens in a VoIP system. A SIP call never really loops forever (well, unless the PSTN is involved, but that is a different story), because there is a mechanism to prevent that. Each call contains a counter (Max-Forwards) that is decremented each time it traverses a SIP proxy, and the call ends when this counter reaches zero. Max-Forwards is a little like a CPU quota. It does not improve the quality of a program; rather it just prevents it from making things worse. Putting aside the fact that we are in the business of establishing communications between people, not finding fancy ways to prevent that, the Turing-Completeness of SIP still gets in the way: for the same reasons as before, it is also impossible to write a program that will reliably predict whether a call will fail because the Max-Forwards value reaches zero.

It is a sad state of affairs that the only way to reliably predict a possible failure is to let this failure happen, but the point of this article is that this is not really the fault of the programmer; it is just a consequence of the limitations of computation in this universe.

Note that at least some SIP systems are not necessarily Turing-Complete – here we had to add an extension to our very limited system based on CPL to make it Turing-Complete. But it is far easier to prove that something is Turing-Complete than to prove that it is not, so even if I could not find a way to prove that a pure CPL system is Turing-Complete, that does not prove that it is not. Worse, we saw that there is not much difference between a Turing-Complete system and one that is not, so even a small and seemingly irrelevant modification to a system can make it Turing-Complete. So basically, writing a program that tries to reach definitive conclusions about the behavior of a moderately complex system – like SIP – is an exercise in futility.

Many thanks to Matt Ryan for his review of this article.

A simple scheme for software version numbers

There are as many opinions on how software version numbers should be structured as there are developers. It is difficult to design a scheme that is simple and that stays consistent for the whole lifetime of a product – one good example of a product that periodically changes the meaning of its version number is the Linux kernel: at one time odd version numbers meant development code and even numbers production code; now the rule seems to be that the major component of the version is incremented whenever the project leader feels that the minor component is too large.

My own schemes were always plagued by one annoying inconsistency: I always start the numbering at 0.1, meaning that the software is still in its design phase, postponing the switch to 1.0 until the API (whatever this means) is stable enough that it can be guaranteed not to require corrections. The inconsistency becomes visible when trying to go through the next iteration of versions after 1.0, an iteration that will be concluded by version 2.0. Some people use very high numbers (1.99, and so on) for this purpose, but that never looked right to me.

So I finally found and adopted a simple scheme that (I think) clearly indicates which phase of the development process a version belongs to. All version numbers follow the <major>.<minor>.<correction> scheme, starting at 0.1.0. A minor value of 0 always means that the API is stable (i.e. that this API will be maintained forever), and any other value means that this is a different API and that it is still under development (i.e. that users of this API should be prepared to modify their code). It will be simpler to understand with some examples:

  • 0.1.0: This is the first version; the API is still under development.
  • 0.1.1: Same API as before, but bugs in the implementation were fixed.
  • 0.2.0: API modified, but still not stable.
  • 1.0.0: First version using a stable API.
  • 1.0.1: Bug fixes on the stable version.
  • 1.1.0: A new development cycle started, with a different API.
  • 1.1.1: Same API as before, but bugs in the implementation were fixed.
  • 1.0.2: New bugs fixed in the stable API.
  • 1.2.0: New development version with a different API.
  • 2.0.0: Second version with a stable API.
  • 2.0.1: Bug fixes on the second stable version.
  • 1.0.3: New bugs fixed in the first stable API.
  • 2.1.0: Beginning of a new cycle of development, and so on…

One may ask why the numbering does not start at 0.0.0. In this case the minor part is 0, which would mean that the API is stable, but the only reasonable design for a stable API this early in the process would be the absence of an API, which would require releasing a first Debian package that contains nothing. So it seems reasonable to skip this step and start directly with version 0.1.0. But note how this is reminiscent of the way unit testing is supposed to be done, i.e. that tests should be written before the actual code that permits them to succeed.
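To make the convention concrete, here is a small Java sketch (mine, not part of any released tool) that parses a version following this scheme and reports whether it belongs to a stable cycle:

class Version {
  final int major;
  final int minor;
  final int correction;

  Version(String text) {
    String[] parts = text.split("\\.");
    major = Integer.parseInt(parts[0]);
    minor = Integer.parseInt(parts[1]);
    correction = Integer.parseInt(parts[2]);
    }

  boolean isStable() {
    return minor == 0; // minor == 0 means the API is frozen
    }

  public String toString() {
    return major + "." + minor + "." + correction
        + (isStable() ? " (stable API)" : " (API under development)");
    }

  public static void main(String[] args) {
    for (String v : new String[] { "0.1.0", "1.0.1", "1.1.0", "2.0.0" }) {
      System.out.println(new Version(v));
      }
    }
  }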

A configuration and enrollment service for RELOAD implementers

As people quickly discover, implementing RELOAD is not an easy task – the specification is complex, covers multiple network layers, and is both extensible and flexible; the fact that security is mandatory creates even more challenges at debugging time. This is why new implementations generally focus on having the minimum set of features working between nodes using the same software.

Making two different RELOAD implementations interoperate requires a lot more work, mostly because connecting to a RELOAD overlay is not as simple as providing an IP address and port to connect to. Because of the extensibility of RELOAD, all the nodes in an overlay must use the same set of parameters, parameters that are collected and distributed in an XML document that needs to be cryptographically signed. In addition to this, all nodes must communicate over (D)TLS links, using both client and server certificates signed by a CA that is local to the overlay. The configuration file and certificates must be distributed to each node and, when two or more implementations want to participate in the same overlay, ad hoc methods to provision these elements are no longer adequate. The standard way to do that is through a configuration and enrollment server, but unfortunately that is probably the part of the RELOAD specification to which most implementers would assign the lowest priority, thus creating a higher barrier to interoperability testing than one would expect.

This is why, during the last RELOAD interoperability testing event in Paris, I volunteered to provide configuration and enrollment servers as a service to RELOAD implementers, so they do not have to worry about this part. I already had my own configuration and enrollment servers, but I had to rewrite them from scratch because of two additional requirements: they had to work with any set of parameters, even some that my own implementation of RELOAD does not support yet, and it had to be possible to host servers for multiple overlays on the same physical server (virtual server). A first set of servers is now deployed and in use by the participants of the last RELOAD interoperability event, so it is now time to open it to a larger set of participants.

First, what this service is not: it is not meant to host commercial services, and it is not meant to showcase implementations. The service is free for RELOAD implementers (up to 5 overlays per implementer) for the explicit purpose of letting other implementers connect to your RELOAD implementation, which means that you are supposed to provision a username/password for any other implementer on request, on a reciprocity basis. Contact me directly if you are interested in a usage that does not fit this description.

The enrollment for the service is simple: send me an email containing the X500 names that will be used to provision your servers. Here’s an example to provision a fictional overlay named “my-overlay-reload.implementers.org”:

C=US, ST=California, L=Saratoga, O=Impedance Mismatch, LLC, OU=R&D,
CN=my-overlay-reload.implementers.org

The C=, ST=, L=, O= and OU= components should describe your organization (not you). The CN= component contains the name requested for your overlay. Note that the “-reload.implementers.org” part is mandatory, but you can choose to use whatever name before this suffix, as long as it is not already taken, that it follows the DNS label rules and that it does not contain a dot (wildcard certificates do not support sub-subdomains).

With this information I will provision the following:

  • The DNS RR as described in the RELOAD draft
  • A configuration server.
  • An enrollment server, with its CA certificate
  • A secure Operation, Administration and Management (OAM) server.

The DNS server permits retrieving the IP addresses and ports that can be used to connect to the configuration server. If we reuse our example above, the following command will retrieve the DNS name and port:

$ host -t SRV _reload-config._tcp.my-overlay-reload.implementers.org
_reload-config._tcp.my-overlay-reload.implementers.org has SRV record 40 0 443
my-overlay-reload.implementers.org.

Note that the example uses the new service and well-known URL name that were agreed on in the Vancouver meeting, but the current name (p2psip-enroll) will be supported until the updated specification is published.

The DNS name can then be resolved (the IPv6 address is functional):

$ host my-overlay-reload.implementers.org
my-overlay-reload.implementers.org has address 173.246.102.69
my-overlay-reload.implementers.org has IPv6 address
2604:3400:dc1:41:216:3eff:fe5b:8240

Then the configuration file can be retrieved by following the rules listed in the specification:

$ curl --resolve my-overlay-reload.implementers.org:443:173.246.102.69
https://my-overlay-reload.implementers.org/.well-known/reload-config

The returned configuration file will contain a root-cert element containing the CA certificate that was created for this overlay, and will be signed with a configuration signer that is maintained by the configuration server. Basically the configuration server will automatically renew the configuration signer and re-sign the configuration file every 30 days, or sooner if you upload a new configuration file (more on this later). Note that, to ensure that there is no lapse in the rollover of signer certificates, the configuration file must be retrieved periodically (the expiration attribute contains the expiration date of the signer certificate, so retrieving the configuration document one or two days before this date will guarantee that any configuration file can be used to validate the next one in sequence). This feature frees implementers from developing their own signing tools (a future version will permit implementers to maintain their own signer and to upload a signed configuration file).

The configuration file also contains an enrollment-server element, pointing to the enrollment server itself, which can be used to create certificates as described in the specification. The enrollment server requires a valid username/password to create a certificate and, in any case, the default configuration document returned is filled with only the minimum parameters required, so it is useless as is to run a real overlay. Modifying the configuration document and managing the users that can request a certificate (and so join the overlay) is the responsibility of the OAM server.

Because the OAM server uses a client certificate for authentication, it uses a different domain name than the configuration and enrollment servers. The domain name uses the “-oam-reload.implementers.org” suffix, and a separate CA is used to create the client certificate, so a user of the overlay cannot use their certificate to change the configuration (it would be a good idea to define a new X.509 extended key usage purpose for RELOAD to test for this).

The OAM server uses a RESTful API to manage the configuration and enrollment servers (well, as RESTful as possible, because the API is in fact auto-generated from a JMX API, and I did not find a better solution than to map a JMX operation to a POST. But more on this in a future blog entry). Here are the commands to add a new user, change a user's password, list the users and remove a user:

$ curl --cert client.crt --key client.key --data "name=myname&password=mypassword"
https://my-overlay-oam-reload.implementers.org/type=Enrollment/addUser
$ curl --cert client.crt --key client.key --data "name=myname&password=mypassword"
https://my-overlay-oam-reload.implementers.org/type=Enrollment/modifyUser
$ curl --cert client.crt --key client.key https://my-overlay-oam-reload.implementers.org/type=Enrollment/Users
$ curl --cert client.crt --key client.key --data "name=myname"
https://my-overlay-oam-reload.implementers.org/type=Enrollment/removeUser

The password is stored as a bcrypt hash, so it is safe as long as you do not use weak passwords.

The last step is to modify the configuration, probably to add a bootstrap element. Currently the OAM server manages what is called a naked configuration, which is a configuration document stripped of all signatures. The current naked configuration can be retrieved with the following command:

$ curl --cert client.crt --key client.key https://my-overlay-oam-reload.implementers.org/type=Configuration/NakedConfiguration > config.relo

The file can then be freely modified with the following constraints:

  • The file must be valid XML and must conform to the schema in the specification (including the use of namespaces).
  • The sequence attribute value must be increased by exactly one, modulo 65535 (see the sketch after this list).
  • The instance-name attribute must not be modified.
  • The expiration attribute must not be modified.
  • A node-id-length element must not be added.
  • The root-cert element must not be removed or modified, and a new one must not be added.
  • The enrollment-server element must not be removed or modified, and a new one must not be added.
  • The configuration-signer element must not be removed or modified, and a new one must not be added.
  • A shared-secret element must not be added.
  • A self-signed-permitted element with a value of “true” must not be added.
  • A kind-signer element must not be added.
  • A kind-signature or signature element must not be added
  • Additional configuration elements must not be added.

Then the file can be uploaded into the configuration server:

$ curl --cert client.crt --key client.key -T config.relo https://my-overlay-oam-reload.implementers.org/type=Configuration/NakedConfiguration

If there is a problem in the configuration, an HTTP error 4xx should be returned, hopefully with a text explaining the problem (please send me an email if you think that the text is not helpful, or if a 5xx error is returned).

Jarc: Generated service

It is not permitted for an annotation processor to modify a Java source file, so a processor willing to add code to an existing class is left with only two solutions (if we exclude method instrumentation): Generating a superclass or generating a subclass.

Generating a superclass has the advantage that the constructors of the annotated class can be used directly. Let's say that we have an annotation processor that is designed to help implement class composition, as described in Effective Java, item #16. Instead of writing the whole ForwardingSet class, an annotation processor could generate it automatically from this code fragment:

@Forwarding(ForwardingSet.class)
public class InstrumentedSet<E> extends ForwardingSet<E> {
  InstrumentedSet(Set<E> s) {
    super(s);
  }
}

But generating a superclass is not always possible. For example, let's imagine an annotation processor that generates the JMX boilerplate necessary to export attributes. An existing class with such an annotation could look something like this:

public class MyData {
  @Attribute int counter;
  }

In this case the processor for the @Attribute annotation will generate a JMX interface (let's call it MyDataMXBean) that declares the getcounter and setcounter methods, and a class extending MyData and implementing the JMX interface (let's call it MyDataImpl).

The generated code would take care of the boring stuff, like synchronization and so on, which is certainly an improvement over writing and maintaining it by hand. But the problem with subclasses is that we do not know the name of the class that was generated. Note that for a superclass we have no choice but to know its name, because we have to inherit from it. For subclasses it is better to let the processor choose the name, but then we need a way to instantiate the generated class without knowing this name (in our example, to register it in the MBean server).

The obvious way of doing this is to use a ServiceLoader. We can add a factory method in the MyData class to instantiate the generated class, something like this:

static MyData newInstance() {
  return ServiceLoader.load(MyData.class).iterator().next();
  }

But for this technique to work, we need to describe the service in the jar file. Using the method explained in a previous post does not help in this case, because we still do not know what the name of the generated class will be.

Version 0.2.30 of jarc provides a solution to this problem. This new version contains a new annotation, @Service, that can be used to annotate a generated class. A processor integrated in jarc will read this annotation at compile time and automatically generate the service entry in the built jar file, as if an X-Jarc-Service attribute had been added to the manifest file. This works because this processor will be invoked after the @Attribute processor, and so it knows the name of the class that has been generated. Here is, for example, the code fragment that the code generator would have generated for MyDataImpl:

@Service(MyData.class)
@MXBean
class MyDataImpl extends MyData implements MyDataMXBean {

Note that classes used as services require an empty constructor, and that can be a problem if the class they extend does not have an empty constructor itself. The solution in this case is to define an additional factory class as the service.

First we define our factory as an abstract class:

abstract class Factory {
  abstract MyData newInstance(int init);
  }

We adjust our factory method accordingly:

static MyData newInstance(int init) {
  return ServiceLoader.load(Factory.class).iterator().next().newInstance(init);
  }

The @Attribute processor must generate an additional class that extends the factory class, and this is the class that is declared as a service:

@MXBean
class MyDataImpl extends MyData implements MyDataMXBean {
  @Service(Factory.class)
  static class FactoryImpl extends Factory {
  MyDataImpl newInstance(int init) {
    return new MyDataImpl(init);
    }
  }

  MyDataImpl(int init) {
    super(init);
  }
}

Decrypting SSL sessions in Wireshark

Debugging secure communication programs is no fun. One thing I learned during all these years developing VoIP applications is to never, ever trust the logs to tell the truth, and especially the ones I put in the code myself. The truth is on the wire, so capturing the packets and being able to analyze them, for example with Wireshark, is one of the most important tools a communication developer can use. But the need for secure communications makes this tool difficult to use.

As said before, the first solution would be to display the packets directly in the logs before they are encrypted or after they are decrypted, but as it is probably the same idiot who wrote the code to be debugged who will write the code that generates the logs, there is little chance of getting something useful as a result.

Another solution is to have a switch that permits using the software under test in a debug mode that does not encrypt and decrypt the communications. A first problem is that this increases the probability of forgetting to turn the switch off after debugging, or of using the wrong setting in production. Another problem is that the secure and insecure modes may very well use different code paths, and so may behave differently. And lastly, some communication protocols, especially the modern ones, do not have an insecure setting (for example RELOAD and RTCWEB), and for very good reasons.

A slightly better solution that can reduce the difference between code paths and work with secure-only protocols is to use a null cipher, but for security reasons both sides must agree beforehand to use it. That in fact probably increases the probability that someone forgets to switch the null cipher off after a test.

So the only remaining solution is to somehow decrypt the packets in Wireshark. The standard way of doing that is to install the private RSA key in Wireshark, which immediately creates even more problems:

  • This cannot be used to debug programs in production, because the sysadmin will never agree to hand over the private key.
  • This does not work if the private key is stored in a hardware token, like a smartcard, as, by design, it is impossible to read the private key from these devices.
  • Even with the private key, the SSL modes that can be used are limited. For example, a Diffie–Hellman key exchange cannot be decrypted by Wireshark.

Fortunately, since version 1.6 Wireshark can use SSL session keys instead of the private key. Session keys are keys that are used for only a specific session, and disclosing them does not disclose the private key. This solves most of the problems listed above:

  • Sysadmins can disclose only the session keys related to a specific session.
  • Session keys are available even if the private key is stored in a hardware token.
  • Session keys are the result of, e.g., a Diffie–Hellman key exchange, so there is no need to restrict the SSL modes for debugging.

Now we just need the program under test to store the session keys somewhere so Wireshark can use them. For example, the next version of NSS (the security module used by Firefox, Thunderbird and Chrome) will have an environment variable that can be used to generate a file that can be directly used by Wireshark (see this link for more details).

Adding support for this format in Java would require maintaining a modified build of Java, which can be inconvenient. A simpler solution is to process the output of the -Djavax.net.debug=ssl,keygen debug option. The just-uploaded Debian package named keygen2keylog contains a program that does this for you. After installation, start Wireshark in capture mode with the name of the SSL session key file that will be generated as a parameter, something like this:

$ wireshark -o ssl.keylog_file:mykeys.log -k -i any -f "host implementers.org"

(Remember that you do not need to run Wireshark as root to capture packets if you run the following command after each update of the package: sudo setcap 'CAP_NET_RAW+eip CAP_NET_ADMIN+eip' /usr/bin/dumpcap)

Then you just need to pipe the debug output of your Java program to keygen2keylog to see the packets being decrypted in Wireshark, e.g.:

$ java -Djavax.net.debug=ssl,keygen -jar mycode.jar | keygen2keylog mykeys.log

And the beauty of this technique is that the packets are decrypted as they are captured.

NAT64 discovery

Last week I volunteered to review draft-ietf-behave-nat64-discovery-heuristic, an IETF draft that describes how an application can discover a NAT64 prefix that can be used to synthesize IPv6 addresses for embedded IPv4 addresses that cannot be automatically synthesized by a DNS64 server (look here for a quick overview of NAT64/DNS64).

I am not a DNS or IPv6 expert, so I had to do a little bit of research before starting to understand that draft, and it looked interesting enough that I decided to write an implementation, which is probably the best way to find problems in a draft (and seeing how often I find bugs in published RFCs, that should be a mandatory step, but that is another discussion). I installed a PC with the Linux Live CD of ecdysis, and configured it to use a /96 subnet of my /64 IPv6 subnet. After this I just had to add a route on my development computer to be able to use NAT64. I did not want to change my DNS configuration, so I forced the nameserver in the commands I used. With that configuration I was able to retrieve a synthesized IPv6 address for a server that does not have an IPv6 address, then ping6 it:

$ host -t AAAA server.implementers.org 192.168.2.133
server.implementers.org has IPv6 address 2001:470:1f05:616:1:0:4537:e15b

$ ping6 2001:470:1f05:616:1:0:4537:e15b
PING 2001:470:1f05:616:1:0:4537:e15b(2001:470:1f05:616:1:0:4537:e15b) 56 data bytes
64 bytes from 2001:470:1f05:616:1:0:4537:e15b: icmp_seq=1 ttl=49 time=49.4 ms

As said above, the goal of NAT64 discovery is to find the list of IPv6 prefixes. The package nat64disc, which can be found at the usual place in my Debian/Ubuntu repository, contains one command, nat64disc, that can be used to find the list of prefixes:

$ nat64disc -d ipv4only.implementers.org -n 192.168.2.133 -l
Prefix: 2001:470:1f05:616:1:0:0:0/96 (connectivity check: nat64.implementers.org.)

When the draft is published, the discovery mechanism will use the domain “ipv4only.arpa.” by default, but this zone is not populated yet, so I added the necessary record to ipv4only.implementers.org so the tool can be used immediately. This domain name must be passed with the -d option on the command line.

As explained above, I did not want to modify my DNS configuration, so I had to force the address of the nameserver (i.e. the DNS64 server) on the command line, with the -n option. Interestingly, this triggered a bug in Java: when forcing the nameserver, the resolver sends an ANY request, which is not processed by DNS64. People interested in the workaround can look in the source code, as usual (note that there is another workaround in the code, also related to a resolver bug, a bug that prevents using IPv6 addresses in /etc/resolv.conf).

I also provisioned a connectivity server for my prefix, as shown in the result. If the tool finds a connectivity server associated with a prefix, it will use it to check the connectivity and remove the prefix from the list of prefixes if the check fails.

The tool can also be used to synthesize an IPv6 address:

$ nat64disc -d ipv4only.implementers.org -n 192.168.2.133 69.55.225.91
69.55.225.91 ==> 2001:470:1f05:616:1:0:4537:e15b

and to verify that an IPv6 address is synthetic:

$ nat64disc -d ipv4only.implementers.org -n 192.168.2.133 2001:470:1f05:616:1:0:4537:e15b
2001:470:1f05:616:1:0:4537:e15b is synthetic
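For the /96 prefix used in these examples, the synthesis itself is nothing more than embedding the four IPv4 bytes in the last 32 bits of the IPv6 address (other prefix lengths, defined in RFC 6052, are more involved and not shown here). A minimal Java sketch, not taken from nat64disc:

import java.net.InetAddress;
import java.net.UnknownHostException;

class Nat64Synth {
  // Replace the last 32 bits of a /96 prefix with the IPv4 address.
  static InetAddress synthesize(InetAddress prefix, InetAddress ipv4)
      throws UnknownHostException {
    byte[] address = prefix.getAddress(); // 16 bytes
    byte[] v4 = ipv4.getAddress();        // 4 bytes
    System.arraycopy(v4, 0, address, 12, 4);
    return InetAddress.getByAddress(address);
    }

  public static void main(String[] args) throws UnknownHostException {
    System.out.println(synthesize(
        InetAddress.getByName("2001:470:1f05:616:1:0:0:0"),
        InetAddress.getByName("69.55.225.91")).getHostAddress());
    }
  }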

The tool does not process DNSSEC records yet, and I will probably not spend time on this (unless, obviously, someone pays me to do that).