Stateless CSRF Tokens

UPDATE: I posted improved versions of these functions in Better Stateless CSRF Tokens.

I’ve been thinking about CSRF tokens lately. If you are using the built-in $_SESSION feature of PHP, a common pattern goes something like this ( similar to what Chris Shiflett describes ):

[sourcecode lang="php"]
session_start();

// When rendering the form: create a token and remember it in the session.
$token = base64_encode( openssl_random_pseudo_bytes( 32 ) );
$_SESSION['token'] = $token;
$_SESSION['token_time'] = time();

echo '<input type="hidden" name="token" value="' . $token . '" />';

// When handling the submitted form: compare against the stored token.
if (
    isset( $_POST['token'] )
    && $_POST['token'] === $_SESSION['token']
) {
    if ( ( time() - $_SESSION['token_time'] ) <= 300 ) {
        // valid token, within time limit
    }
}
[/sourcecode]

A few notes about this approach. First, use openssl_random_pseudo_bytes instead of mt_rand ( suggested by Kevin Schroeder ) when possible. Second, be sure to only use === when comparing the token value. You want to avoid automatic type juggling.

Why worry about automatic type juggling when comparing CSRF tokens? Try this script:

[sourcecode lang="php"]
$token = 'abc123';
$form_token = 0;

if ( $form_token == $token ) {
    echo "Valid token\n";
} else {
    echo "Invalid token\n";
}
[/sourcecode]

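Even though $token and $form_token clearly don’t contain the same values, on PHP versions before 8.0 this script will display ‘Valid token’ because of automatic type juggling: the string ‘abc123’ is converted to the integer 0 before the comparison. ( PHP 8.0 changed how numbers are compared with non-numeric strings, so this particular case no longer matches there, but strict comparison is still the safer habit. ) That makes your CSRF token basically useless, as an attacker can set the token to zero and it will be considered valid. Switching to === will display the expected ‘Invalid token’.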
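A minimal sketch of a stricter check, assuming PHP 7.0 or later. The helper names session_token_new and session_token_matches are made up for this example. Form fields normally arrive as strings or arrays, so the helper rejects anything that is not a string before comparing, and it uses hash_equals, which compares strings without type juggling.

```php
<?php
// Hypothetical helper names, not from the original post.

// Build a random token. random_bytes() requires PHP 7.0 or later.
function session_token_new() {
    return base64_encode( random_bytes( 32 ) );
}

// Compare a stored token with a submitted value without type juggling.
function session_token_matches( $expected, $submitted ) {
    // Reject integers, arrays, null and empty tokens up front.
    if ( ! is_string( $expected ) || ! is_string( $submitted ) || '' === $expected ) {
        return false;
    }

    // hash_equals() only ever compares strings, byte for byte.
    return hash_equals( $expected, $submitted );
}

$token = session_token_new();
var_dump( session_token_matches( $token, $token ) ); // bool(true)
var_dump( session_token_matches( 'abc123', 0 ) );    // bool(false)
```

The is_string checks matter as much as the comparison itself: they turn an unexpected type into a plain failure instead of a warning or an exception.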

This is all fine and good until you want to avoid using PHP sessions. Perhaps you have several web servers and don’t want to deal with shared session storage. Or you have servers in multiple data centers and don’t want to try to sync state across them. Whatever the reason, popping a token into $_SESSION isn’t an option in this case. In short, you want some sort of stateless CSRF token.

One method is to generate a token that is based on known values that won’t change and that lasts for a given period of time. This is what WordPress does. You can see the WordPress implementation in the inaccurately named wp_create_nonce and wp_verify_nonce functions ( WordPress nonces aren’t really nonces, they can be used more than once ). The high-level version is that WordPress takes a known set of values like the user id, NONCE_KEY, descriptive action text, and current time, then runs them through an MD5 HMAC.
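The general idea of a time-limited token derived from known values can be sketched roughly as follows. This is a simplified illustration, not WordPress’s actual code: the function names, the $now parameter ( added so the behavior can be tested ), and the layout of the hashed string are all assumptions made for the example.

```php
<?php
// Hypothetical helpers illustrating a time-bucketed HMAC token.
// A simplified sketch, NOT the WordPress implementation.

function tick_token_create( $user_id, $action, $key, $life = 86400, $now = null ) {
    $now = ( null === $now ) ? time() : $now;

    // The "tick" changes only once every $life / 2 seconds.
    $tick = (int) ceil( $now / ( $life / 2 ) );

    return hash_hmac( 'md5', "$tick|$action|$user_id", $key );
}

function tick_token_verify( $token, $user_id, $action, $key, $life = 86400, $now = null ) {
    if ( ! is_string( $token ) ) {
        return false;
    }

    $now  = ( null === $now ) ? time() : $now;
    $tick = (int) ceil( $now / ( $life / 2 ) );

    // Accept the current tick and the previous one, so a token
    // stays valid for somewhere between $life / 2 and $life seconds.
    foreach ( array( $tick, $tick - 1 ) as $candidate ) {
        $expected = hash_hmac( 'md5', "$candidate|$action|$user_id", $key );
        if ( hash_equals( $expected, $token ) ) {
            return true;
        }
    }

    return false;
}
```

Because the time only enters the hash as a coarse tick, nothing about the token has to be stored on the server. The trade-off is that the lifetime is fixed by the code on both sides rather than carried in the token.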

By default the tokens are good for 24 hours. You can adjust the time a WordPress nonce value is valid for by filtering nonce_life. That isn’t very flexible though. If you want to use different HMAC keys and timeouts across various requests then you end up having to filter both sides ( create and verify ).

I got to wondering what a more flexible approach to stateless CSRF tokens would look like. Here is what I’m thinking of:

[sourcecode lang="php"]
function request_token_generate( $data_str, $key, $timeout = 900 ) {
    $now = microtime( true );

    // Sign the data together with the creation time and the timeout.
    $hash = hash_hmac( 'sha256', "$data_str|$now|$timeout", $key );

    return base64_encode( $hash ) . "|$now|$timeout";
}

function request_token_verify( $token, $data_str, $key ) {
    // A token is three fields: hash|creation time|timeout.
    list( $hash, $hash_time, $timeout ) = explode( '|', $token, 3 );
    if ( empty( $hash ) || empty( $hash_time ) || empty( $timeout ) ) {
        return false;
    }

    // Reject expired tokens.
    if ( microtime( true ) > $hash_time + $timeout ) {
        return false;
    }

    // Recompute the HMAC from the values carried in the token.
    $hash = base64_decode( $hash );
    $check_hash = hash_hmac(
        'sha256', "$data_str|$hash_time|$timeout", $key
    );

    if ( $check_hash === $hash ) {
        return true;
    }

    return false;
}
[/sourcecode]

For request_token_generate you’ll need to provide a string containing unique data about the user and request, an HMAC key value, and an optional timeout. One example of the data string would be a combination of the internal user id, descriptive text about the action being taken, timestamp of when the password on the account was last changed, and the last 6 characters of the hashed password. Depending on the flow of the request you might want to include the URL that the form is expected to be on and the remote client IP address.

There is no specific limit on what you could include in the data string. Anything likely to be fairly unique to that user and request would be a good candidate. For that matter, you could generate an additional unique key for each user at signup time and include that as well.

[sourcecode lang="php"]
$key = '5up3R53cr3T!';
$data_str = '45873' . 'delete_post_345'
    . '2013-05-01 14:45:32' . '2dH6hi';
$request_token = request_token_generate( $data_str, $key );

echo "Token: $request_token\n";
[/sourcecode]

That will produce a token that looks something like:

MTljOWZmNzhmOTY5Y2Y5Y2IxNjdkNzQ5YzVkYTcwNzMyMzNjMjhmNTdmZTFhZjVkNmEwNTAyMmFjMjBmMTExMQ==|1373336147.4861|900

This is really three values separated by |. The first is the base64-encoded HMAC ( using SHA256 instead of MD5 ), the second is the timestamp for when the token was generated, and the third is how long the token is good for in seconds. Verifying the token is easy enough:

[sourcecode lang="php"]
if ( request_token_verify( $request_token, $data_str, $key ) ) {
    echo "Valid token\n";
} else {
    echo "! INVALID ! token\n";
}
[/sourcecode]

The verification side needs to have access to the same values that went into building $data_str and the HMAC key. You don’t need to know the timeout for the token because the timeout is included as part of the token value.
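Since both sides have to build exactly the same data string, it may help to do that in one place. Here is a minimal sketch, assuming a hypothetical helper called request_token_data ( not part of the functions above ) that takes the values as a keyed array. Encoding the fields with json_encode keeps the field boundaries explicit, so user id 4587 followed by ‘3delete’ cannot produce the same string as user id 45873 followed by ‘delete’, which plain concatenation would allow.

```php
<?php
// Hypothetical helper, not from the original post.
function request_token_data( array $parts ) {
    // Cast everything to strings so 45873 and '45873' encode the same way.
    $parts = array_map( 'strval', $parts );

    // Sort by key so the order the caller used does not matter.
    ksort( $parts );

    // json_encode() quotes and delimits each field.
    return json_encode( $parts );
}

$data_str = request_token_data( array(
    'user_id'            => 45873,
    'action'             => 'delete_post_345',
    'password_changed'   => '2013-05-01 14:45:32',
    'password_hash_tail' => '2dH6hi',
) );

echo "$data_str\n";
```

The resulting string can be passed as $data_str when generating and when verifying, as long as both sides call the helper with the same values.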

This approach prevents tampering by including the timestamp and the timeout values as part of the HMAC call ( as does WordPress ). Testing this is easy enough:

[sourcecode lang="php"]
$key = '5up3R53cr3T!';
$data_str = '45873' . 'delete_post_345' . '2013-05-01 14:45:32' . '0dH6hi';
$token = request_token_generate( $data_str, $key, 15 );

// confirm original works
echo "Should be valid: ";
if ( request_token_verify( $token, $data_str, $key ) ) {
    echo "Valid token\n";
} else {
    echo "! INVALID ! token\n";
}

// turn back time
list( $hash, $hash_time, $timeout ) = explode( '|', $token, 3 );
$hash_time = $hash_time - 100;
$fake_token = "$hash|$hash_time|$timeout";

echo "Should be INVALID: ";
if ( request_token_verify( $fake_token, $data_str, $key ) ) {
    echo "Valid token\n";
} else {
    echo "! INVALID ! token\n";
}

// alter timeout
list( $hash, $hash_time, $timeout ) = explode( '|', $token, 3 );
$fake_token = "$hash|$hash_time|10000";

echo "Should be INVALID: ";
if ( request_token_verify( $fake_token, $data_str, $key ) ) {
    echo "Valid token\n";
} else {
    echo "! INVALID ! token\n";
}
[/sourcecode]

The first test checks that the unmodified token is valid. The second test attempts to set the timestamp back in time. The third attempts to increase the timeout value. The two altered tokens fail because the hash values no longer match.
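A few edge cases are worth thinking about on the verify side: a token with fewer than three fields, non-numeric time values, and how the hashes are compared. The following is one possible sketch of a stricter verifier for the same token format, assuming PHP 5.6 or later for hash_equals. The name request_token_verify_strict is made up for this example, and this is not the improved version mentioned at the top of the post. The generate function is repeated only so the block runs on its own.

```php
<?php
// Copied from the post so this sketch is self-contained.
function request_token_generate( $data_str, $key, $timeout = 900 ) {
    $now  = microtime( true );
    $hash = hash_hmac( 'sha256', "$data_str|$now|$timeout", $key );

    return base64_encode( $hash ) . "|$now|$timeout";
}

// Hypothetical stricter verifier for the same hash|time|timeout format.
function request_token_verify_strict( $token, $data_str, $key ) {
    if ( ! is_string( $token ) ) {
        return false;
    }

    // Require exactly three fields instead of assuming they exist.
    $parts = explode( '|', $token );
    if ( 3 !== count( $parts ) ) {
        return false;
    }
    list( $hash, $hash_time, $timeout ) = $parts;

    // The time must be numeric and the timeout a plain integer.
    if ( ! is_numeric( $hash_time ) || ! ctype_digit( $timeout ) ) {
        return false;
    }

    // Check the signature first, using a constant-time comparison.
    $given    = base64_decode( $hash, true );
    $expected = hash_hmac( 'sha256', "$data_str|$hash_time|$timeout", $key );
    if ( false === $given || ! hash_equals( $expected, $given ) ) {
        return false;
    }

    // Only trust the time values once the signature has matched.
    return microtime( true ) <= (float) $hash_time + (int) $timeout;
}
```

Checking the signature before the expiry means the time and timeout values are only used after they have been shown to be untampered.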

Overall I’m happy with this approach to stateless CSRF tokens.

10 Comments

  1. Fun read — thanks for taking the time to write it up.

  2. It was a fun exercise thinking through the issues involved.

  3. Useful but not stateless. If you need user specific data (and you definitely want to for this) that’s “state”.

  4. “…as an attacker can set the token to zero and it …”
    How, can an attacker pass an INT? 😉

  5. True, but not additional state. Perhaps that would have been a better way to describe it.

    You should be validating user accounts along with CSRF tokens anyway.

  6. Depends on what other processes submitted data goes through.

  7. I blogged about this in January 2012: http://appsandsecurity.blogspot.se/2012/01/stateless-csrf-protection.html
    … and later that year I did a presentation on even further research, proposing a triple submit to countermeasure HttpOnly cookie overwrites via subdomain XSS and cookie jar overflow: http://www.slideshare.net/johnwilander/stateless-anticsrf

  8. I didn’t want to get into cookies for CSRF tokens. If you have multiple forms on a page you would potentially want a unique CSRF token for each form. While you could still do that using cookies, it would be a bit more management.

  9. It looks like this might be vulnerable to timing attacks, because of the == string comparison to check the hash in request_token_verify. You should use a constant-time string comparison to prevent this.

  10. There is a === comparison in that function, is that what you meant to refer to?

    I also need to update this post to point to an improved version of the code – https://josephscott.org/archives/2013/08/better-stateless-csrf-tokens/.


© 2019 Joseph Scott