For the Forum EPFL 2014 job fair, we created a crypto challenge—in the spirit of CTF hacking contests—and offered to award a quadcopter drone to the first student to solve it and Bus Pirate devices to the next 2nd to 10th.
After about 10 days, no one solved it, although the difficulty was well below that of a competitive crypto challenge at (say) DEFCON qualifications.
Given the password-protected ZIP archive https://www.kudelskisecurity.com/kschallenge.zip, a first step was to recover the password: 1234. OK, next step:
The ZIP archive contained:
- a text file named ciphertext and containing the string “407747220c5f2754c6fd192502c8f597814bc95ce897eb6a3c45ba4a7150fe86d054d1607e94bf0fc90790ddba8925b50c8c6159b091651442779e86febfc6b1bd736d77d28d89d96ebee2d4038f26cb77a1b843afdb1c7bf3c01518a77b6f87d25b8071cafbfc30ee3f569f72342dcc” (224 characters, thus 112 bytes); one was supposed to deduce that this was the hexadecimal representation of a ciphertext (encrypted message).
- a command-line Python script named challenge.py, taking as parameter a file name and encrypting the file with CBC-AES-128 (from Python’s Crypto.Cipher module) using a key generated by a custom PRNG. The script is copied below:
#!/usr/bin/env python # Copyright (c) 2014 Nagravision SA, all rights reserved import os import sys import hashlib from Crypto.Cipher import AES ZEROBLOCK = '\x00'*16 def encrypt(key, iv, plaintext): assert len(key) == 16, 'key isnt 16-byte' assert len(iv) == 16, 'iv isnt 16-byte' assert len(plaintext) % 16 == 0, 'plaintext length isnt multiple of 16' cipher = AES.new(key, AES.MODE_CBC, iv) return cipher.encrypt(plaintext) def xor(xs, ys): return ''.join(chr(ord(x) ^ ord(y)) for x, y in zip(xs, ys)) class PRNG(object): def __init__(self): self.state_bytes = 64 self.key = ZEROBLOCK self.iv = ZEROBLOCK # get low-entropy string from the environment entropy = ''.join(os.listdir('/proc')) # hash to make a higher-entropy string seed = int(hashlib.sha256(entropy).hexdigest(), 16) # ensure distinct processes have distinct seeds pid = os.getpid() new_seed = self.__diversify(pid) * seed # hash again to get a 32-byte string final_seed = hashlib.sha256(str(new_seed)).digest() # initialize state self.state = final_seed + '\x00'*(self.state_bytes - len(final_seed)) # fill state with pseudorandom bytes # + proof-of-work, against bruteforce for i in range(10000): self.__update() def __print_state(self): print self.state.encode('hex') def __diversify(self, x): return pow(3, x, 65537) & 0xffff def __update(self): mask = encrypt(self.key, self.iv, self.state) self.state = xor(mask, self.state) def get_bytes(self, nbbytes): randbytes = self.state[-nbbytes:] self.__update() return randbytes def main(): prng = PRNG() plaintext = open(sys.argv).read() key = prng.get_bytes(16) iv = ZEROBLOCK ciphertext = encrypt(key, iv, plaintext) print ciphertext.encode('hex') if __name__ == '__main__': main()
The PRNG is seeded with
- a snapshot of the content of /proc, which is highly unpredictable on a typical Linux system.
- the “diversified” PID, which on a Linux system ranges from 1 (init) up to 32768 (see /proc/sys/kernel/pid_max)
These two elements are then multiplied together on the line
new_seed = self.__diversify(pid) * seed. Reminder: for all x, we have x×0=0.
In the above PRNG, one can observe a function
__diversify(), which doesn’t seem to add entropy, let alone to “diversify” the PID, but rather to deterministically map a PID x to the value (3x mod 65537) mod 65536, since the AND mask
0xffff is equivalent to a reduction modulo 1+
0xffff. Now, observe that
pow(3, 32768, 65537) equals 65536, which is equivalent to 0 modulo 65536.
Bottom line: there is a PID, namely 32768, for which the key does not depend on the highly unpredictable content of /proc. With this PID, one obtained the key “622c4f56d20f0825e22751e0f29b38f5”. One could then decrypt the ciphertext, and notice that it looked like a gzip’d file stripped of its first bytes (1f 8b 08 08). The rest is left as an exercise :-)
Good Challenge! Thanks a lot!
The last bit, for me, was the last one: undestanding that few bytes were ripped off from the plaintext. How should one understand a modified gzip file was hidden in the plaintext?
I would like to train more on this kind of challenges, so I hope a new one will be online (quite) soon!
One were supposed to guess that the file obtained was compressed data based on the high-entropy content. The start of the file (with file name, etc.) gave hint that it was a GZ rather than another compressed format, and 1F 8B 08 08 is a typical GZ signature :-)