Physically Unclonable Functions in Practice

Introduction

According to Wikipedia, a Physically Unclonable Function (PUF) is a physically defined “digital fingerprint” that serves as a unique identity for a semiconductor device such as a microprocessor. PUFs are based on unique physical variations that occur naturally during semiconductor manufacturing. A PUF is a physical entity embodied in a physical structure.

To keep things simple, it is a way to use physical imperfections created during the manufacturing process to generate unique values. Since these values are random by nature, they are called unclonable (which is not totally true, more on that later).

There are two main types of PUFs: weak and strong. Both of them are based on a challenge-response scheme, where the response to a challenge depends on the device itself. In the case of a weak PUF, the number of available challenges is considerably lower (and can be 1, which will be the case here) than on a strong PUF.

Many types of PUFs exist and are used in different ways, but the easiest one to implement is the SRAM PUF. SRAM PUFs rely on the fact that an SRAM cell is in an undefined state when powered up. It turns out that each cell tends to settle to the same value at every power-up, which gives us a value that is both random and unique to our chip. SRAM is available in a wide range of microcontrollers. As an example, we’ll take the NUCLEO-F429ZI and use its SRAM as a weak PUF.

Creating the PUF

Characterization

Not all SRAM cells hold the same value across every reboot. Some bits flip from time to time, and we need to exclude those bits from the calculation in order to get the same value every time. Multiple methods exist to deal with this problem, such as error-correcting codes (ECC). For this demonstration, we will use a simpler method and generate a bitmask to remove the flipping bits from the final value.

Within an empty mbed IDE project, we created a simple firmware that reads 64 bytes from SRAM3 (starting at address 0x20020000) and prints them on the serial port:

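#include "mbed.h"

int main()
{
    int i;
    // Start of SRAM3, the region used as the PUF source
    unsigned char *ram_buffer = (unsigned char *) 0x20020000;

    // Dump the first 64 bytes as hex on the serial port
    for(i=0; i< 64; i++){
        printf("%02x ", ram_buffer[i]);
    }
    printf("\r\n");
    return 0;
}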

Then, a Python script reads this value, compares it with the previous one, and removes all flipped bits. The procedure is pretty simple:

Capture the SRAM buffer
Apply the current mask on the capture
XOR the resulting value with the previous capture, so that all differing bits appear as ones
Update the mask with the newly found flipping bits

To make sure that our value is stable, we will repeat this procedure 1000 times. Hopefully, our mask will remove all flipping bits and the final value should still contain some consistent bits.
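The mask update described above can be sketched on its own, without any hardware. This is a minimal illustration, assuming captures are handled as Python bytes objects; the update_mask helper and the two-byte sample captures are hypothetical and only serve to show how flipping bits get cleared from the mask.

```python
def update_mask(mask: bytes, previous: bytes, current: bytes) -> bytes:
    """Clear from `mask` every bit that differs between two captures."""
    return bytes(
        # (c & m) ^ (p & m) has a 1 wherever a still-trusted bit changed
        m & ~((c & m) ^ (p & m)) & 0xFF
        for m, p, c in zip(mask, previous, current)
    )

# Hypothetical two-byte captures, for illustration only.
captures = [
    bytes([0b10110010, 0b01000001]),
    bytes([0b10110011, 0b01000001]),  # bit 0 of byte 0 flipped
    bytes([0b10110010, 0b01100001]),  # bit 5 of byte 1 flipped
]

mask = b"\xff" * 2
for previous, current in zip(captures, captures[1:]):
    mask = update_mask(mask, previous, current)

print(mask.hex())  # fedf
```

Once a bit has been cleared from the mask it never comes back, so the mask can only shrink as more captures are processed.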

Managing the power source

Since we need to physically remove power from the MCU a thousand times and won’t do it manually, we will need to automate this. Thankfully, our lab is equipped with manageable power supplies. With the Instrumental library, it was easy to control the power:

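import time

powerSupply = GPD_3303S(visa_address='ASRL/dev/ttyUSB1::INSTR')
powerSupply.output = False      # cut power to the board
powerSupply.current1 = "0.5A"   # channel 1 current setting
time.sleep(0.5)                 # keep the board unpowered for a moment
powerSupply.output = True       # power the board back up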

For more constrained budgets, power supplies like the RD Tech DPS modules offer the same kind of control through the excellent Sigrok tool suite. Using these supplies is a bit more complicated but still manageable in Python:

import time
from sigrok.core.classes import *

context = Context.create()
driver = context.drivers['rdtech-dps']
conf = ConfigKey.get_by_identifier('conn').parse_string('/dev/ttyUSB3')
# Scan once and keep the first matching device
device = driver.scan(conn=conf)[0]

def reset_device():
  device.open()
  device.config_set(ConfigKey.ENABLED, ConfigKey.ENABLED.parse_string('off'))
  device.config_set(ConfigKey.VOLTAGE_TARGET, ConfigKey.VOLTAGE_TARGET.parse_string('3.3V'))
  time.sleep(0.5)  # keep the board unpowered for a moment, as in the previous example
  device.config_set(ConfigKey.ENABLED, ConfigKey.ENABLED.parse_string('on'))
  device.close()

Putting it all together, we came up with this prototype script:

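# read_from_device() is expected to return the 64 captured bytes as a bytes object
MASK = bytearray(b'\xff' * 64)
previous_capture = read_from_device()
reset_device()
for _ in range(1000):
    current_capture = read_from_device()
    for i in range(64):
        # Apply the current mask, then XOR with the previous capture:
        # bits that changed show up as ones
        tmp = current_capture[i] & MASK[i]
        tmp = tmp ^ (previous_capture[i] & MASK[i])
        # Remove the flipping bits from the mask
        MASK[i] &= ~tmp & 0xFF
    previous_capture = current_capture
    reset_device()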
Results

After a thousand iterations, the output values were stable enough for our PUF to be usable. Out of the maximum 512 bits, we were able to find a pretty good number of stable bits:

board ID	Number of valid bits
#1	292
#2	291
#3	312
#4	268
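The number of valid bits for a board is simply the number of ones left in its mask. A minimal sketch of that count, assuming the mask is available as a Python bytes object (the count_stable_bits helper and the sample mask are hypothetical, not measurements from our boards):

```python
def count_stable_bits(mask: bytes) -> int:
    """Return the number of bits still set to one in the mask."""
    return sum(bin(byte).count("1") for byte in mask)

# Hypothetical 4-byte mask, for illustration only: 8 + 4 + 0 + 2 bits set.
example_mask = bytes([0xFF, 0x0F, 0x00, 0x81])
print(count_stable_bits(example_mask))  # 14
```

An untouched 64-byte mask of 0xff bytes gives 512, the maximum mentioned above.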

Using the PUF value

To illustrate the use of a PUF, we created an example secure “cloud” connection that uses the PUF value as a private key, then sets up a TLS connection to a server using a client certificate as an authentication mechanism.

To start, we took the Hello HTTPS client basic example from the mbed examples and customized it to fit our needs.

Elliptic curve cryptography

One of the signature algorithms supported in TLS is ECDSA, which stands for Elliptic Curve Digital Signature Algorithm. The elliptic curve we used for this algorithm is called P-256, or prime256v1 in OpenSSL. One of the public parameters of the curve is the generator $G$. The main advantage of elliptic curve cryptography is that it offers public-key cryptography, like RSA does, but with smaller keys. The private key $d$ of this algorithm is a number, and the algorithm is considered safe by cryptographers as long as $d$ has 256 bits of entropy. Since the masked PUF value contains more than 256 stable bits, we hashed it with SHA-256 to obtain a 256-bit value which is used directly as the ECDSA private key. The corresponding public key, which we compute later, is the point $d \cdot G$.
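One detail worth checking when a hash output is used directly as a private key is that the resulting number is a valid scalar for the curve, that is, $1 \le d < n$ where $n$ is the order of $G$. The sketch below illustrates the derivation on the host side. It is not part of our firmware: the private_scalar_from_puf helper is hypothetical, the order of P-256 is hardcoded as a constant, and the all-zero input is only a placeholder for a masked PUF value.

```python
import hashlib

# Order n of the P-256 generator G (public curve parameter).
P256_ORDER = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551

def private_scalar_from_puf(masked_puf: bytes) -> int:
    """Hash the masked PUF value and interpret the digest as a big-endian scalar."""
    digest = hashlib.sha256(masked_puf).digest()
    d = int.from_bytes(digest, "big")
    # A valid ECDSA private key must satisfy 1 <= d < n.
    if not 1 <= d < P256_ORDER:
        raise ValueError("digest is not a valid P-256 private scalar")
    return d

# All-zero placeholder, not a real PUF value.
d = private_scalar_from_puf(bytes(64))
print(f"{d:064x}")
```

Since $n$ is very close to $2^{256}$, a SHA-256 output falls outside the valid range only with negligible probability, but the check costs nothing.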

Generating a device certificate

Generating a private key

To generate a PUF value, we used the characterization steps above, then used the resulting mask like so:

static const unsigned char mask[] = {0x59, 0x3d, 0x32, 0xfe, 0x47, 0xa5, 0x4a, 0x85, 0x88, 0x35, 0x4e, 0x27, 0x63, 0x49, 0x37, 0xb6, 
                                     0xff, 0x1b, 0xbe, 0xc2, 0xce, 0x63, 0x95, 0xab, 0x30, 0x3f, 0x77, 0x9d, 0x59, 0xd3, 0xe2, 0x75, 
                                     0xdd, 0xff, 0x1e, 0x03, 0x2e, 0xf1, 0xee, 0xe1, 0x52, 0xe8, 0xaa, 0x8b, 0x0e, 0x9d, 0xfa, 0xea, 
                                     0x4e, 0x3d, 0x79, 0x0c, 0xd7, 0xeb, 0xbd, 0x7e, 0x73, 0x35, 0x9e, 0x5b, 0xbe, 0x5d, 0x42, 0xd7};
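/* Requires "mbedtls/sha256.h" */
int compute_key(unsigned char *output)
{
    int i;
    unsigned char tmp[64];
    unsigned char *ram_buffer = (unsigned char *) 0x20020000;

    // Keep only the bits that were stable during characterization
    for(i=0; i< 64; i++){
        tmp[i] = ram_buffer[i] & mask[i];
    }
    // Hash the masked value down to 256 bits (last argument 0 selects SHA-256 rather than SHA-224)
    mbedtls_sha256(tmp, 64, output, 0);
    return 0;
}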

This returns a value that can be used as the private key. To use it with standard tools, we generated a private key in PEM format with the following Python script:

# pip3 install --user pycryptodome

>>> from Crypto.PublicKey import ECC
>>> d = 0x8e140886f96ef269e736cb1fe24ea12627df6971f32d6c15b6cbc2810af19382 # PUF value
>>> ECC.construct(curve="prime256v1", d=d).export_key(format='PEM')
'-----BEGIN PRIVATE KEY-----\nMIGHAgEAMBMGByqGSM49AgEGCCqGSM49AwEHBG0wawIBAQQgjhQIhvlu8mnnNssf\n4k6hJiffaXHzLWwVtsvCgQrxk4KhRANCAAQ6ezsYJBMBjVBhWLjPzcZjSv7/z4WQ\nZI/820/RgryR+phEx6oY8EE8+EVA5+JgXuhIoTvirMKnhWkHBu+NtNNL\n-----END PRIVATE KEY-----'

Generating a CSR and signing the certificate

From this private key, it is easy to generate a certificate signing request (CSR) using OpenSSL:

openssl req -key puf.key -new -out puf.csr

The CSR is then signed by a CA using the following command:

openssl x509 -req -in puf.csr -CA ca.cert.pem -CAkey ca.key.pem -CAcreateserial -out puf.pem -days 365 -sha256

This will create a puf.pem file that contains the certificate for this particular device. This certificate can then be inserted into the device firmware as a means of authentication.

Using the certificate

Importing the certificate and private key

To use client authentication, mbedtls provides a function called mbedtls_ssl_conf_own_cert(), which takes an SSL configuration, the certificate and an mbedtls_pk_context structure. This structure can be generated like this:

#include "mbedtls/pk.h"
#include "mbedtls/ecp.h"

int generate_pk_context(mbedtls_pk_context *pk, unsigned char *secret_key, unsigned int len)
{
    int ret;
    mbedtls_ecp_keypair *eck;
    const mbedtls_pk_info_t *pk_info = mbedtls_pk_info_from_type( MBEDTLS_PK_ECKEY );

    mbedtls_pk_init( pk );
    if( ( ret = mbedtls_pk_setup( pk, pk_info ) ) != 0 )
        return ret;
    eck = mbedtls_pk_ec( *pk );
    if( ( ret = mbedtls_ecp_group_load( &eck->grp, MBEDTLS_ECP_DP_SECP256R1 ) ) != 0 )
        return ret;
    // Load the PUF-derived value as the private scalar d (big-endian)
    if( ( ret = mbedtls_mpi_read_binary( &eck->d, secret_key, len ) ) != 0 )
        return ret;
    // Compute the public key Q = d*G
    return mbedtls_ecp_mul( &eck->grp, &eck->Q, &eck->d, &eck->grp.G, NULL, NULL );
}

Here, secret_key is a pointer to our previously generated PUF value and len is its length in bytes.

Mbedtls SSL context setup

To add certificate authentication to the example, the following calls have to be added:

ret = mbedtls_x509_crt_parse(&clicert,
                    reinterpret_cast<const unsigned char *>(TLS_CLIENT_CERT),
                    strlen(TLS_CLIENT_CERT) + 1);

unsigned char puf_key[32];
compute_key(puf_key);
generate_pk_context(&pk, puf_key, 32);

ret = mbedtls_x509_crt_parse(&cacert,
                    reinterpret_cast<const unsigned char *>(TLS_PEM_CA),
                    strlen(TLS_PEM_CA) + 1);

mbedtls_ssl_conf_ca_chain(&ssl_conf, &cacert, NULL);

ret = mbedtls_ssl_conf_own_cert(&ssl_conf, &clicert, &pk);

These calls parse the client certificate stored in TLS_CLIENT_CERT, generate the PUF value, create the mbedtls_pk_context and configure the SSL context.

Server setup

For the server, a simple nginx configuration is enough to require a client certificate and validate it against the CA:

[...]
http {
    
    ssl_certificate     /etc/nginx/nginx.pem;
    ssl_certificate_key /etc/nginx/nginx.key;
    ssl_ciphers         EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH;
    ssl_protocols       TLSv1.2;

    ssl_client_certificate /etc/nginx/ca.cert.pem;
    ssl_verify_client   on;

    [...]

}

Challenging the implementation

To challenge this design, and introduce people to the concept of PUFs, we used the same code to create a challenge for the Insomni’hack CTF. The challenge was to reverse a firmware to locate the PUF function and the mask. Then, by abusing the fact that the debug interface was still available, contestants were able to dump the SRAM buffer and recompute the private key to impersonate the device and access the server from their computer.
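The last step of the intended solution, recomputing the key from a RAM dump, can be sketched on the host side as follows. This is only an illustration of the idea: the recompute_puf_key helper is hypothetical, and the mask and dumps below are placeholder values rather than data from the challenge board. It mirrors what the firmware does: mask the 64 dumped bytes, then hash them with SHA-256.

```python
import hashlib

def recompute_puf_key(sram_dump: bytes, mask: bytes) -> bytes:
    """Mask the dumped SRAM bytes and hash them with SHA-256."""
    if len(sram_dump) != len(mask):
        raise ValueError("dump and mask must have the same length")
    masked = bytes(s & m for s, m in zip(sram_dump, mask))
    return hashlib.sha256(masked).digest()

# Placeholder values for illustration only, not data from a real board.
mask = bytes([0xF0, 0x0F] * 32)
dump_a = bytes([0xA5, 0x3C] * 32)
dump_b = bytes([0xA7, 0x7C] * 32)  # differs from dump_a only in masked-out bits

key_a = recompute_puf_key(dump_a, mask)
key_b = recompute_puf_key(dump_b, mask)
print(key_a.hex())
print(key_a == key_b)  # True: masked-out bits do not affect the key
```

The two dumps yield the same key because they only differ in bits that the mask discards, which is exactly why the mask makes the PUF value reproducible across reboots.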

Reversing the firmware takes some time, especially since all the function names were stripped out of the resulting binary, but we left some pretty obvious debug messages inside of the firmware. Using these messages, it was possible to reconstruct a fair amount of the logic and find references to the mbedtls_ssl_conf_own_cert() function.

Another thing that proved tedious for us as authors was keeping the SWD debugging interface available while restricting it enough that the challenge could not be bypassed completely. For this, we set the readout protection on the STM32 to RDP level 1. This level prevents any debugger access to the flash memory but still allows access to the contents of the RAM, which was the intended way to recover the PUF data. During development, we found some ways to break the challenge, so we implemented a (dumb) anti-debugging technique: a thread in Mbed OS monitors the DHCSR register and resets the board if a debugger is attached.

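void dbg_thread(void)
{
    // Debug Halting Control and Status Register; volatile so that it is re-read on every loop
    static const volatile uint32_t *DHCSR = (const volatile uint32_t *) 0xE000EDF0;
    while(1) {
        // Bit 0 (C_DEBUGEN) is set when a debugger has enabled halting debug
        if(*DHCSR&1) {
            printf("Debugger detected !\r\n");
            // Wipe the client state before resetting the board
            memset(client, 0, sizeof(HelloHttpsClient));
            NVIC_SystemReset();
        } else {
            wait(0.1);
        }
    }
}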

In case these protections were bypassed anyway, the flag returned by the server was encrypted using AES-256 with the PUF key.

Unfortunately, no team was able to solve it in time. One team was able to access the server and get the encrypted version of the flag, but did not have enough time to decrypt the flag. You can find a writeup for the CTF challenge here.

Conclusion

Using an SRAM PUF is easily doable with off-the-shelf microcontrollers, and it can serve multiple purposes like the one presented here. This technique makes it easy to hide a secret value from potential attackers and to safely identify a device. However, further hardening techniques must still be used in order to prevent an attacker from retrieving this value from the SRAM and easily cloning the PUF value. The use of a strong PUF is preferred in this case, as it is much more complicated to clone (but not impossible).

Sylvain & Nicolas