Cautionary note: UUIDs generally do not meet security requirements

Some time ago, I was doing a web application penetration test, and noticed that the authentication tokens were UUIDs. This I knew was risky, and in fact if you read the UUID rfc, section 6 of it warns:

Do not assume that UUIDs are hard to guess; they should not be used as security capabilities (identifiers whose mere possession grants access), for example.

As a penetration tester, I was salivating over the thought that I could possibly find the ultimate attack on the application: being able to hijack sessions by predicting past and future UUIDs given a single UUID that I got for my authentication. But this of course is only possible if I knew how the UUIDs were being generated, and there is no fixed standard on their implementation. So what could I do? Ask the developers of course!

Discussion with one of the developers revealed that the UUIDs were generated on the server side in JavaScript using code called randomUUID(). JavaScript on the server side? I thought it must be Nodejs, but the developer said it was not. Instead, he insisted that they were using both Java and JavaScript on the server side, but the UUIDs are generated in JavaScript. Hmmm, I thought I should just take his word for it.

So I Google for randomUUID JavaScript, and this turned up:

function randomUUID() {
  var s = [], itoh = '0123456789ABCDEF';

  // Make array of random hex digits. The UUID only has 32 digits in it, but we
  // allocate an extra items to make room for the '-'s we'll be inserting.
  for (var i = 0; i < 36; i++) s[i] = Math.floor(Math.random()*0x10);

  // Conform to RFC-4122, section 4.4
  s[14] = 4;  // Set 4 high bits of time_high field to version
  s[19] = (s[19] & 0x3) | 0x8;  // Specify 2 high bits of clock sequence

  // Convert to hex chars
  for (var i = 0; i < 36; i++) s[i] = itoh[s[i]];

  // Insert '-'s
  s[8] = s[13] = s[18] = s[23] = '-';

  return s.join('');
}

Ahhh, good! It uses Math.random( ), which almost certainly is not cryptographically secure, so now it was just a matter of finding out how Math.random( ) was implemented in JavaScript. More Googling revealed this C equivalent of the function.  Further research revealed that the stack overflow post was not quite correct: a division by 2^32 was necessary at the end to convert the number between 0 and 1. Thus the code looks like this:

uint32_t V8::Random() {
    // Random number generator using George Marsaglia's MWC algorithm.
    static uint32_t hi = 0;
    static uint32_t lo = 0;

    // Initialize seed using the system random(). If one of the seeds
    // should ever become zero again, or if random() returns zero, we
    // avoid getting stuck with zero bits in hi or lo by reinitializing
    // them on demand.
    if (hi == 0) hi = random();
    if (lo == 0) lo = random();

    // Mix the bits.
    hi = 36969 * (hi & 0xFFFF) + (hi >> 16);
    lo = 18273 * (lo & 0xFFFF) + (lo >> 16);
    return ((hi << 16) + (lo & 0xFFFF)) / Math.pow(2, 32);
}

As a cryptographer, it didn’t take me very long to see that this pseudo random number generator (prng) has a problem, and thus I was at this point overly excited that I could easily break the UUID generator. I spent a couple hours implementing my attack, and sadly, it turned out not to work. Why?

I went back to the developer and asked him “are you sure you are using JavaScript to generate the UUIDs?” He double-checked, and found out that in fact that they were calling Java’s uuid.randomUUID(). So I looked into that, and found:

The UUID is generated using a cryptographically strong pseudo random number generator.

At this point I gave up.

A recent blog

I regularly read reddit’s netsec subredit, and a couple days ago there was an extremely popular blog posted there based upon the problems with the above prng: TIFU by using Math.random().

This author was mainly interested in the prng for the lack of collisions property, but not cryptographic purposes. He explained in fine detail the problem with the prng. I read through the blog thinking to myself “Gee, I noticed that a while back. Maybe I should have written my own blog about this!”

So today I decided to just add a little extra example of what’s wrong with that prng, this time focusing on its use for cryptographic purposes. I illustrate that it is absolutely trivial to break the JavaScript randomUUID() when it is used for something like authentication tokens. This also fits in well with my previous blog, which showed how to break a number of amateur cryptographic designs.

Breaking the randomUUID() function

If you have not read the blog post above on the fault with the prng, then here’s my own short description. We will work from the C code implementation of Math.random( ) above.

The state consists of (hi, lo), which are both 32-bit unsigned words. It would seem that one needs to try 2^64 combinations to get the state, however it turns out that in our case, we don’t need to. Look at the return value:

return ((hi << 16) + (lo & 0xFFFF)) / Math.pow(2, 32);

Let’s break it down:

  • (hi << 16) is shifting hi 16-bits to the left. The highest 16-bits get shifted off, the least significant bits get moved to most significant bits positions.
  • (lo & 0xFFFF) means adding on the least significant 16-bits onto the above value.

So the net result is that the (l0 & 0xFFFF) does not interfere with the (hi << 16). After the division by 2^32, we get a fractional number between 0 and 1.  If we think about the binary expansion of this fractional number, the first 16-bits after the decimal point come from hi, the next 16-bits come from lo.

The other observation about the Math.random( ) function is that this is the only place where the two state words get mixed. So this tells us that if we only care about the most significant bits from the output, then the lo value does not get mixed in. And that means that we can brute force the part of the prng state that we care about by only concentrating on the hi value, which has 2^32 possibilities. That’s quick and easy to do on a desktop computer.

As you might have seen by now, the randomUUID() function above only uses a few most significant bits. Each nibble of the UUID string is calculated from:

s[i] = Math.floor(Math.random()*0x10);

The ‘0x10’ is the hexadecimal value of 16: in other words, we are shifting the random value left by 4 bits. This means that only the most significant 4 bits from hi get pushed past the decimal point, and then Math.floor( ) drops the remaining bits.

Implementing the attack

The attack is actually quite trivial. We get one UUID, and then we brute force (2^32 possibilities) the hi value that could have caused it. Note: if you are a cryptographer, you can likely find a more clever way than brute forcing hi, but with only 2^32 possibilities, why bother?

Once we have the hi value, we can compute future outputs of the prng that will produce the same UUIDs regardless of what lo is. That means that you can set lo to whatever you want: the attack will work with any value.

So in C, the search would look something like this:

static unsigned int hi = 0, lo = 0;     // initialise with dummy values
int main( int argc, char** argv ) {
    unsigned char uuid[37], *target_uuid;
    int i;
    unsigned int candidate_hi;

    if (argc < 2) {
            printf("usage: %s uuid\n\n", argv[0]);
            return -1;
    }

    target_uuid = argv[1]; // input validation omitted for brevity
    printf("Computing...\n");
    candidate_hi = 0; uuid[36] = '\0';
    // main loop
    do {
        hi = candidate_hi;
        // generate randomUUID using candidate_hi as the seed,
        // store result in the string uuid.
        randomUUID( uuid );
        if (!strncmp( uuid, target_uuid, 36 ))
            break;
        ++candidate_hi;
    } while (candidate_hi != 0);

    if (candidate_hi != 0) {
        int i;
        printf("\nThe value of hi is %u,", hi );
        printf(" and the next 10 UUIDs are:\n");
        for (i=0; i < 10; ++i) {
            randomUUID( uuid );
            printf("\t%s\n", uuid );
        }
    }
    else
        printf("No solution found.\n");
    return 0;
}

Here we have made the (hi, lo) global variables used by randomUUID( ). The main( ) takes in a UUID from the command line, that we call target_uuid. We loop through all possible values for hi and for each one, we compute a UUID. We then check to see if it matches the target, and if so, done.

Complete Source Code

The complete source code for the search as described above is here.  It is a bit slow, but it is not hard to make it faster (see next section).  To test it, you will want to verify that you can break some UUIDs generated in JavaScript.  If you are using the Chrome browser, then see this simple JavaScript code here, which runs directly in your browser.  That code calls Math.random() to generated the randomness, just as described above.  However, the attack code will not work if the UUID is generated in other browsers because different browsers implement Math.random() in different ways.  For that reason, I have also another JavaScript implementation that works in all browsers by emulating Chrome’s Math.random(). You can download that here.

Making it faster

There are two ways to make it faster, algorithmically or improving the implementation. Here we focus only on the latter.

The code currently computes a whole UUID for a candidate_hi, which means calling Math.random( ) 36 times. However, if we have a wrong candidate, there is a good chance that you the newly computed UUID will be wrong in the first few nibbles. Why not check the UUID that you are computing while it is being computed against the target and abort early if it does not match? This gives a huge speedup. On my work Macbook Pro, I can perform the brute force within 1 minute.

Concluding remarks

Are you a developer relying on UUIDs for security? If so, you better double-check the implementation under the hood. If you are not sure what is there, then I highly recommend you move away from UUIDs and use something that was designed for cryptographic purposes. UUIDs were not designed for this purpose! If your UUID generator happens to have this property, then you should feel fortunate.

If you’re a pentester and you find some system that relies on UUIDs for security, then I’d be appreciative if you can find out what is implemented under the hood. Often times these things can be broken trivially, as demonstrated above.