GPG Memory Forensics

Pretty Good Privacy (PGP) and the open source implementation GNU Privacy Guard (GPG) are encryption solutions following the OpenPGP standard. Even if GPG has been criticized in the past, it is widely used and deployed and has been publicly reviewed for many years. For example, GnuPG VS-Desktop has been recently approved by the German government to secure data at the VS-NfD (restricted) level. Thus it is used in practice to protect sensitive data, for example, to sign and encrypt emails, sign commits, or authenticate packages for several Linux distributions. To decrypt a message, the following command can be used:

$ gpg -d clear.gpg
gpg: encrypted with 3072-bit RSA key, ID 6F2C79C919966FC1, created 2021-07-21
      "Test key <[email protected]>"
Hello GPG

The first time the decryption is called, the system asks the user for their passphrase to decrypt the private key needed to decrypt the file.

Then for future decryption attempts, the passphrase is not asked but read from cache. The same mechanism is used for symmetric-key encryption. The cache time to live has a default value of 10 minutes. After the time to live elapsed, the cached item is cleared from memory.

To avoid having key material directly in the clear in memory, GPG encapsulates such key material before storing it in memory. The idea of that is that if a TPM is available, then the encapsulation key can be stored in a safe memory area. However, TPMs are usually not used, and the encapsulation key stays in regular memory.

A cached item is stored in the ITEM structure. We find this structure in gnupg/agent/cache.c (GPG version 2.3.4):

Cached item linked list

The ITEM structure is a linked list containing the cached item address and additional data. Among them, there are the time of creation and time of last access. These values are unix epoch times, the number of seconds elapsed since January 1, 1970. Then there is the time to live (ttl) field which is the number of seconds gpg-agent has to keep the item in cache. By default it is set to 10 minutes (0x0258 seconds). If gpg-agent found that the last accessed time is older than the time to live, the item is cleared from cache. After that we have the address of a secret_data_s structure. The secret_data_s structure contains the length of the cached item in bytes followed by the encrypted item with AES in key wrap mode. GnuPG uses AES key wrap mode of operation to encapsulate key material in memory as defined in RFC 3394. The AES key wrap algorithm is used to encrypt (wrap) keys or secrets with another key. It is used in several solutions like Apple FileVault 2 or Cryptomator. The mode uses two 64-bit blocks concatenated together as input of AES encryption:

First iteration of AES key wrap

This operation is iterated 6 times, and the final R values obtained are outputted as the ciphertext. Decryption works in exactly the same way but in reverse order. The initialization vector (IV) can be any value but the RFC default value is 0xa6a6a6a6a6a6a6a6. It allows for verification of the integrity of the decrypted key after the decryption. If the last decrypted block yields a value that starts with the IV, decryption is considered correct as long as no other error is returned. GPG uses the implementation of AES key wrap provided by the Libgcrypt library. In GPG, the cached item encryption is used to prevent attackers from simply grepping for passphrases in memory as commented in gnupg/agent/cache.c (GPG version 2.3.4):

However, as we will see later, someone who has access to the memory can also retrieve the encryption key and decrypt cached items anyway. To avoid having sensitive values left in memory after processing, Libgcrypt deletes those values when they are not used anymore. For example, a variable is wiped with the function wipememory and the stack is cleaned with the function _gcry_burn_stack. In Libgcrypt, the function _gcry_cipher_aeswrap_decrypt libgcrypt/cipher/cipher-aeswrap.c on line 81) did not clean a temporary variable containing the last decrypted block. Suppose we use GPG to decrypt some cleartext using a passphrase. At the end of the cached item decryption, the temporary variable contains the IV value 0xa6a6a6a6a6a6a6a6 followed by the first 8 bytes of the passphrase. For example, if a dump of gpg-agent’s memory using gcore or a dump of the whole system memory using LiME can be obtained, then, we should retrieve the constant 0xa6a6a6a6a6a6a6a6 in memory, next to the first 8 bytes of the passphrase. We reported this problem to GPG maintainers. This was quickly corrected in Libgcrypt 1.8.9 and future versions should not be affected by this issue.

Nevertheless, we investigated more to understand if it is possible to decrypt completely the passphrase in cache. We saw that each cached entry has two timestamps created and accessed of type time_t. This information can be leveraged to search for such patterns in memory and retrieve the location of cache_item_s instances. If we can estimate the time of the creation of the cached item, we can search for masked timestamps concatenated in memory followed by the time to live value. For example, the following regular expression matches all timestamps created after July 27, 2021 12:45:52 PM and before February 6, 2022 5:06:08 PM with a time to live of 10 minutes:

.{3}\x61\x00\x00\x00\x00.{3}\x61\x00\x00\x00\x00\x58\x02

To further reduce the number of false positives during the search, for each match, the created time can be checked to be less than or equal to the accessed time. Then, as soon as the ITEM structure has been found, we can access the secret_data_s structure.

To recover the encryption key, we used a known method to recover AES keys in memory. In AES, before data is encrypted, the AES key schedule is used to derive 11 subkeys that are used later in the algorithm.

AES key schedule algorithm

These subkeys are usually stored consecutively in RAM and there are relations between each consecutive subkeys. Thus, we scan the memory byte per byte trying to see if at this position the AES key schedule relation are verified. If it is the case for two consecutive rounds, there is a very high probability that there are subkeys stored at this location. We scan the process memory until we find a 128-bit expanded AES key. Then we use this key to decrypt the cached item. If the integrity of AES key wrap is verified we know we have properly decrypted the cached item and recovered the passphrase.

At this point, we had everything we needed to implement our solution. Volatility is a widely used forensics framework. It is used to analyze volatile memory dump artifacts to extract data. For example, it has been used in the past to recover Bitlocker volume encryption keys while they are in RAM or to solve challenges of previous SSTIC editions. The framework is highly customizable and allows writing plugins in Python to fit specific needs. Therefore, we developed two plugins for Volatility3. The first plugin retrieves partial (or complete, up to 8 characters) passphrases from memory by searching in gpg-agent‘s memory the constant IV of aes-wrap. This plugin would not work on versions of GPG later than version 2.3.4.

$ vol -f ubuntu-21.10-vm-gpg.raw -s ./volatility-gpg/symbols/ -p ../gpg-mem-forensics/volatility-gpg/ linux.gpg_partial
Volatility 3 Framework 2.0.0
Progress:  100.00               Stacking attempts finished
Offset  Partial GPG passphrase (max 8 chars)

0x7fb73d53a2a0  my_passp

The first 8 bytes of the passphrase were found in the clear in memory. The second plugin retrieves cached items and cache encryption keys from memory and therefore helps recover plaintexts. Here is an example of usage on an Ubuntu 21.10 VM dump:

$ vol -f ubuntu-21.10-vm-gpg.raw -s ./volatility-gpg/symbols/ -p ../gpg-mem-forensics/volatility-gpg/ linux.gpg_full
Volatility 3 Framework 2.0.0
Progress:  100.00               Stacking attempts finished
Offset  Private key     Secret size     Plaintext

0x7fb738002658  0b78497b0d26239211b8841c59e943f7        32      my_passphrase

The plugin successfully found the entire passphrase my_passphrase in memory. The first plugin execution took 6.4 seconds and the second 59.7 seconds on an Intel Core i7-7600U CPU for a 1GB RAM dump.

Memory forensics may be used during a criminal investigation to analyze memory dumps obtained during a search and seizure. After the data has been copied, the investigator may need to obtain the passphrases stored encrypted in cached items to further decrypt conversations. These techniques may also be applied to investigate virtual machines stored remotely in servers that are seized.

Ransomware is a common problem nowadays. These malicious software encrypt files on an infected machine and ask the owner for a ransom so that they can recover their files. It happens that some ransomware rely on well studied tools to encrypt a user’s data. Ransomware such as KeyBTC, VaultCrypt or Qwerty use GPG to encrypt files. In the event that a victim of such a ransomware just noticed what happened when they get infected, the victim or the incident response team could make a memory dump of the whole system and later retrieve the password or decryption key from memory. Thus, thwarting the ransomware threat and retrieving the original files without having to pay a ransom at all. Note that the ransomware would have to rely on symmetric encryption for this to work (gpg --symmetric) and, so far, we have not found any ransomware relying on GPG symmetric encryption.

To conclude, we have analyzed a bug in Libgcrypt where after cleaning memory, it was still possible to read 8 bytes of the passphrase in the clear from a memory dump. We have further analyzed how to decrypt cached items stored in GPG memory. Our work highlights that GPG is a solid encryption solution, but it should be used in conjunction with a TPM or a secure enclave solution to harden the security against physical attacks.

Leave a Reply