Implementing a ZK-focused authenticated-encryption scheme

In the last few years, several practitioners have proposed zk-focused cryptographic constructions such as hashing and encryption primitives that operate on binary and prime fields and that have a low multiplicative complexity. They are also called arithmetization-oriented primitives. Some examples are the Reinforced Concrete, Rescue and MiMC hashing functions. These constructions typically target zkSNARKs and STARKs -based applications.

In this post, we focus on Ciminion, an authenticated-encryption scheme proposed by Dobraunig et al. and presented at EUROCRYPT 2021 and describe how it can be implemented using a domain specific language (DSL) for writing zkSNARKs circuits. Then, we compare our implementation with the MiMC cipher in Feistel mode (which uses a permutation that is widely used in circuits), and show how our Ciminion implementation outperforms it for a large number of plaintexts.

Introduction

zkSNARKs make it possible to convince another party that a particular statement is true restricting other information about the statement to a minimum. In this case, the statement consists on a deterministic arithmetic circuit with inputs and outputs. The recent availability of domain specific languages (DSLs) for creating zk-focused applications and circuits (some examples are circom2 and Leo) allows to speed-up the creation of privacy- friendly applications.

There are certain categories of applications that can be easily described using zkSNARKs:

Digital economy : Via zkSNARKs it is possible to prove that a person’s funds are higher than a certain value without revealing the exact amount and that this value belongs to a specific interval and has certain properties (think about a range proof with a greater degree of expressiveness). Further, by implementing a digital signature scheme such as ECDSA or EdDSA as a zkSNARK circuit, we can prove that a signing operation, described as a set of constraints, has been performed correctly.
Private authentication and authorization: Via zkSNARKs we can prove that we own a particular official identification document and that a certain field contains a specific value or lies in a particular interval and not revealing the rest of the fields of the document. In a way, we can obtain the same privacy capabilities that an attribute-based credential system provides. Moreover, proving that a value belong to a list of candidates without revealing it is easy to implement in a circuit and makes it possible to implement group-based authentication systems.

Finally, both zkSNARKs and STARKs are typically used in Layer-2 constructions and have been proposed in a wide range applications such as network traffic compliance and verifiable machine learning.

In this article, we focus on zkSNARKs DSLs, which provide a direct path for building privacy-friendly applications. We refer the reader to different resources for learning the specifics of zkSNARKs, together with recent advances such as PlonKup, elastic SNARKs and Nova.

Describing circuits with a DSL

Using a DSL such as circom2, one can describe a zkSNARK as an arithmetic circuit with inputs (which can be private, public or a mix of both) and outputs. The circom2 compiler transforms the circuit description into a Rank-1 Constraint System (R1CS). Other tools such as snarkjs rely on the output of the circom2 compiler to provide the verifier and prover components that can be used to build applications based on zkSNARKs.

ZK-focused cryptographic primitives

Arithmetic operations in the circuit are always performed modulo a large prime and in the case of circom2, within the scalar field of the pairing-friendly curve BN254¹ by default (whose security level was decreased to 103 bits in https://eprint.iacr.org/2019/885). However, it can be compiled to use the BLS-12-381 curve.

A zk-application might need to perform a hash operation, to encrypt a sensitive value or to derive an authentication tag of a message. Even if typical constructions such as SHA-3 and an AES modes could be implemented in a DSL, the performance in terms of number of constraints of the resulting circuits would be affected by schemes that rely on bitwise operations. For this reason, in the last few years several practitioners have proposed constructions such as the MiMC, Poseidon, and Rescue hashing algorithm and ciphers such as GMiMC and Ciminion. These constructions mainly operate on prime and binary fields and try to reduce the number of expensive arithmetic operations such as multiplications.

Ciminion

Ciminion is an authenticated-encryption scheme that was presented at EUROCRYPT 2021 by Dobraunig et al. It has been designed to target zero-knowledge proof applications such as zkSNARKs, STARKs and multi-party computation. It requires a limited number of field multiplication and it can be implemented over a prime field.

Ciminion reduces the amount of field multiplications by using the Toffoli gate and the Farfalle construction in order to minimize the number of rounds. In contrast, to other schemes such as MiMC that uses the power mapping f(x) = x^3, the Toffoli gate transforms a triple (a, b, c) into a triple (a, b, ab + c).

Farfalle is a permutation-based construction for a pseudorandom function (PRF) that focuses on parallelization and was designed by the Keccak Team in 2016. It typically uses a family of permutations with different number of rounds, and three layers: a mask derivation function, a compression layer (pC) and a expansion layer (pE). The Farfalle construction was designed with versatility in mind and can be transformed into an AE construction.

Ciminion structure and components

Ciminion receives the following input parameters: a nonce N and two subkeys (K_1 and K_2). It first applies the permutation pC to them and ouputs an intermediate state. Then, this state is transformed by the pE permutation. Two of the resulting elements are used to encrypt the first two plaintext elements P_1 and P_2. The rest of plaintext elements P_2i and P_2i+1 are processed as follows: another pair of subkey elements are added to the intermediate state. Then, the rolling function and the pE permutation are applied to obtain another two field elements that are finally used to encrypt the subsequent pair of plaintext field elements.

The Ciminion structure, adapted from the original paper

Both the pE and pC permutations transform a triple (a, b, c) using the round function with different number rounds:

The Ciminion round function, adapted from the original paper

RC are the fixed round constants. They are generated using SHAKE-256 and can be precomputed before performing the authenticated-encryption operation. Finally, the rolling function performs the Toffoli gate operation:

The Ciminion rolling function, adapted from the original paper

In Ciminion, a set of subketys needs to be generated for performing authenticated-encryption. This is performed using two master keys, MK1 and MK2 and a public IV. Ciminion expands the master key elements using a sponge construction that relies on the pC permutation:

SubKey generation in Ciminion, adapted from the original paper

Implementing Ciminion in circom2

For a given circuit written in circom2, the compiler generates the proof and a R1CS version of the circuit in the form of (A * B) – C = 0. Further, a zkSNARK implementation such as snarkjs can use either Groth16 or PLONK to verify the proof.

In circom2, each circuit is considered a component that can be imported in another circuit in a way that it is possible to generate a library of reusable gadgets such as circomlib. Constrains in the circom2 library are described using the “===” operator and by default, all the signals are private.

We refer the reader to the circom2 documentation for installation instructions and language details. We have found hardhat-circom and the circom2 syntax plugins for vscode and vim very useful.

The rolling function

In this section, we explain how to implement the Ciminion components in circom2.

First, we start with the smaller components and we start composing them in a bottom-up fashion til arriving at the Farfalle-like construction of Ciminion. The rolling function it is the simplest component operation of Ciminion. It just performs the Toffoli gate operation. This is a component with 3 inputs and 3 outputs, being the triple transformed the output:

pragma circom 2.0.3;

template Rolling() {
    signal input s0;
    signal input s1;
    signal input s2;

    signal output b0;
    signal output b1;
    signal output b2;

    b0 <== s2 + s0*s1;    
    b1 <== s0;    
    b2 <== s1;    
}

The pC and pE permutations

The Ciminion round function involves the following inputs: the triple a, b, c (e.g. a_0, b_0, c_0), four round constants (e.g. RC_0, RC_1, RC_2, RC_3) and 3 outputs: a_1, b_1, c_1, that is,
the output triple after performing the transformation operation. Then, we can instantiate the permutation (round function f_i) to create the iterated permutations pC and pE:

pragma circom 2.0.3;

template Permutation() {
    signal input a0;
    signal input b0;
    signal input c0;

    signal input RC0;
    signal input RC1;
    signal input RC2;
    signal input RC3;

    signal output a1;
    signal output b1;
    signal output c1;

    signal a0_mul_b0_plus_c0_plus_b0;

    a0_mul_b0_plus_c0_plus_b0 <== a0*b0 + c0 + b0;

    a1 <== c0 + a0*b0 + RC2;
    b1 <== a0 + RC3*(a0_mul_b0_plus_c0_plus_b0) + RC0;
    c1 <== a0_mul_b0_plus_c0_plus_b0 + RC1;
}

Once we have the permutation implemented, we only need to iterate during a specific a number of rounds and the corresponding list of round constants:

template IteratedPermutationN() {    
    signal input a0;    
    signal input b0;    
    signal input c0;    

    signal output a1;    
    signal output b1;    
    signal output c1;    

    var c[536] = [    
          292334644102394411537362212093572878140,    
          250282066453690133315708418387534650045,    
          25210252274841859825108081164557115781,    
          16246747660448753010909888379129524409,    
          131669148524554050620166690622542333908,    
          73395003443371763039182051680716568403,    
    [...]

Given a number of rounds nRounds per permutation, we instantiate one Permutation component per iteration (in circom2, a value can be assigned only once to a signal). The outputs of a permutation component i are redirected to the input of a permutation component i+1 and the final triple consists of the output of the permutation component nRounds – 1:

    component perm[nRounds];                        

    for (var i=0; i<nRounds; i++) {

       perm[i] = Permutation();                        

       RC0 = c[4*i];                                  
       RC1 = c[4*i + 1];                              
       RC2 = c[4*i + 2];                               
       RC3 = c[4*i + 3];                              

       perm[i].RC0 <== RC0;                            
       perm[i].RC1 <== RC1;                            
       perm[i].RC2 <== RC2;                            
       perm[i].RC3 <== RC3;                            

       if (i == 0) {                                   
          perm[i].a0 <== a0;                          
          perm[i].b0 <== b0;                          
          perm[i].c0 <== c0;                          
       } else {                                       
         perm[i].a0 <== perm[i-1].a1;                
         perm[i].b0 <== perm[i-1].b1;                
         perm[i].c0 <== perm[i-1].c1;
      }
    }

    a1 <== perm[nRounds - 1].a1;
    b1 <== perm[nRounds - 1].b1;
    c1 <== perm[nRounds - 1].c1;
}

The key generation operation

The key generation operation in Ciminion takes as input the IV and two master keys. It iterates the permutation pC (permutation N in the Ciminion reference code) on this input in order to generate the subkeys K_i. The output corresponds to the first value of the output triple (a1, b1, c1), that is a_i for subkey K_i. We can generate a component that is instantiate according to the number of sub keys e.g. nKeys:

pragma circom 2.0.3;

include "permutation.circom";

template KeySchedule(nKeys) {
    signal input s0;
    signal input s1;
    signal input s2;

    signal output keys[nKeys];
    component p_n[nKeys];

    for (var i=0; i<nKeys; i++) {
        p_n[i] = IteratedPermutationN();

        if (i == 0) {
            p_n[i].a0 <== s0;
            p_n[i].b0 <== s1;
            p_n[i].c0 <== s2;            
        } else {
            p_n[i].a0 <== p_n[i-1].a1;
            p_n[i].b0 <== p_n[i-1].b1;
            p_n[i].c0 <== p_n[i-1].c1;
       }
       keys[i] <== p_n[i].a1;
    }
}

The Farfalle-like structure

Finally, we have all the components to generate the authenticated-encryption
primitive.

This component will use:

the key generation component to generate the nSubKeys subkeys.
the permutations pC and pE
the rolling function
the MAC generation algorithm for a given ciphertext

We receive as inputs: the nonce and the two master key elements, nPairs
of plaintexts and we output: a TAG element that authenticates nPairs
of ciphertexts:

template CiminionEnc(nPairs) {    

    var nSubKeys = 2*nPairs + 3;    

    signal input MK_0;    
    signal input MK_1;    
    signal input nonce;    

    signal input PT[nPairs*2];    
    signal output CT[nPairs*2];    
    signal output TAG;

The first part of the encryption operation starts the key schedule:

    component key_schedule = KeySchedule(nSubKeys);    

    key_schedule.s0 <== 1;     
    key_schedule.s1 <== MK_0;    
    key_schedule.s2 <== MK_1;

Then, the permutation pC is applied to the supplied nonce and to the first pair of subkeys.
Then, for each pair of plaintexts, the rolling and pE permutation are applied:

    for (var i = 0; i < nPairs; i++) {
        // roll

        rolls[i] = Rolling();

        if (i == 0) {
            rolls[i].s0 <== p_n.a1 + key_schedule.keys[2*i + 4];
            rolls[i].s1 <== p_n.b1 + key_schedule.keys[2*i + 3];
            rolls[i].s2 <== p_n.c1;
        } else {   
            rolls[i].s0 <== rolls[i-1].b0 + key_schedule.keys[2*i + 4];
            rolls[i].s1 <== rolls[i-1].b1 + key_schedule.keys[2*i + 3];
            rolls[i].s2 <== rolls[i-1].b2;
        }

        // second permutation pR

        p_rs[i] = IteratedPermutationR();

        p_rs[i].a0 <== rolls[i].b0;
        p_rs[i].b0 <== rolls[i].b1;
        p_rs[i].c0 <== rolls[i].b2;

Afterwards, ciphertext pairs are generated and the MAC is updated:

        CT[2*i] <== p_rs[i].a1 + PT[2*i];
        CT[2*i + 1] <== p_rs[i].b1 + PT[2*i + 1];

        // MAC generation

        if (i == 0) {
            acc_1[i] <== CT[2*i] * key_schedule.keys[0];
            acc_2[i] <== acc_1[i] + CT[2*i + 1];
            MAC[i] <== acc_2[i] * key_schedule.keys[0];
        } else {
            acc_1[i] <== (MAC[i-1] + CT[2*i]) * key_schedule.keys[0];
            acc_2[i] <== acc_1[i] + CT[2*i + 1];
            MAC[i] <== acc_2[i] * key_schedule.keys[0];
        }
    }

The output, corresponds to the ciphertext pairs and the tag:

    component p_r2 = IteratedPermutationR();

    p_r2.a0 <== inner_upper_branches_0;
    p_r2.b0 <== inner_upper_branches_1;
    p_r2.c0 <== inner_upper_branches_2;

    TAG <== MAC[nPairs - 1] + p_r2.a1;

Unit testing

circom2 allows to debug and test a circuit implementation via Mocha. In this section, we show how we could test the Ciminion implementation that we have created. First, we’ll need to declare a main component in the circuit description that we want to test. We start with the Ciminion rolling function:

component main = Rolling();

Via the ffjavascript package we can perform finite field arithmetic operations within the BN254 curve scalar field and can compare the results. For instance, we can generate random inputs for the Ciminion rolling function and execute the function in JavaScript:

const F1Field = require("ffjavascript").F1Field;
const Scalar = require("ffjavascript").Scalar;
const crypto = require('crypto').webcrypto;

exports.p = Scalar.fromString("21888242871839275222246405745257275088548364400416034343698204186575808495617");
const Fr = new F1Field(exports.p);


        let a = new BigUint64Array(1);
        let r1 = (crypto.getRandomValues(a)[0]);
        let r2 = (crypto.getRandomValues(a)[0]);
        let r3 = (crypto.getRandomValues(a)[0]);

        const s0 = Scalar.fromString(r1);
        const s1 = Scalar.fromString(r2);
        const s2 = Scalar.fromString(r3);

        const b0 = s2 + s0*s1;    
        const b1 = s0;    
        const b2 = s1;

We obtain the witness values from the rolling circuit:

        const circuit = await wasm_tester(path.join(__dirname, "circuits", "rolling.circom"));
     let witness;
        witness = await circuit.calculateWitness({ "s0": s0, "s1": s1, "s2":s2}, true);

and compare:

      await circuit.assertOut(witness, {"b0": b0});
      await circuit.assertOut(witness, {"b1": b1});
      await circuit.assertOut(witness, {"b2": b2});

From command line, we invoke Moch:

$ mocha rolling.js

  Rolling function
    ✔ Conformance test (47ms)


  1 passing (52ms)

Finally, we can also test that encryption was performed correctly by decrypting the result. In that case, we can first write a circuit that decrypts a set of ciphertexts and call both circuits from mocha.

In order to reverse the encryption operation, we need to obtain the plaintexts as:

     PT[2*i] <==  CT[2*i] - p_rs[i].a1;
     PT[2*i + 1] <== CT[2*i + 1] - p_rs[i].b1 ;

We import both encryption and decryption circuits and check the authentication
tag and the decryption values:

/* Decryption test for Ciminion circuit */

const chai = require("chai");
const path = require("path");
const F1Field = require("ffjavascript").F1Field;
const Scalar = require("ffjavascript").Scalar;
exports.p = Scalar.fromString("21888242871839275222246405745257275088548364400416034343698204186575808495617");
const Fr = new F1Field(exports.p);

const wasm_tester = require("circom_tester").wasm;

const assert = chai.assert;

describe("Ciminion - encryption operation", function ()  {

    this.timeout(100000);

    it("Authentication tag check", async() => {

        const ciminion_input = {
            "MK_0": "0",
            "MK_1": "0",
            "nonce": "1",
            "IV": "1",
            "PT": ["21888242871839275222246405745257275088548364400416034343692024637573625142322", "21888242871839275222246405745257275088548364400416034343693920155242279081918", "21888242871839275222246405745257275088548364400416034343692024637573625142322", "21888242871839275222246405745257275088548364400416034343693920155242279081918"]

          };

          const ciminion_input_dec = {
            "MK_0": "0",
            "MK_1": "0",
            "nonce": "1",
            "IV": "1",
            "CT": ["21592519839218542425120198614531742298033892440087867998118713380820756220718","7889798674627961413366316750795654309310714845357364960283444849787781529858","21049421697506414118152249991945313702405695293538082623907752021251428302407","5714257272097132615426035247399194719164619087183836862864633319728296164225"]
          };

        const circuit = await wasm_tester(path.join(__dirname, "circuits", "ciminion_enc.circom"));
        const circuit_dec = await wasm_tester(path.join(__dirname, "circuits", "ciminion_dec.circom"));

        let witness;
        let witness_dec;

        witness = await circuit.calculateWitness(ciminion_input, true);

        await circuit.assertOut(witness, {"TAG": "1300322832108596540141310981879129316384285895603221372961580627161106587830"});
        await circuit.assertOut(witness, {"CT": ["21592519839218542425120198614531742298033892440087867998118713380820756220718","7889798674627961413366316750795654309310714845357364960283444849787781529858","21049421697506414118152249991945313702405695293538082623907752021251428302407","5714257272097132615426035247399194719164619087183836862864633319728296164225"]});

    });

    it("Decryption check", async() => {

      const ciminion_input = {
          "MK_0": "0",
          "MK_1": "0",
          "nonce": "1",
          "IV": "1",
          "PT": ["21888242871839275222246405745257275088548364400416034343692024637573625142322", "21888242871839275222246405745257275088548364400416034343693920155242279081918", "21888242871839275222246405745257275088548364400416034343692024637573625142322", "21888242871839275222246405745257275088548364400416034343693920155242279081918"]

        };

        const ciminion_input_dec = {
          "MK_0": "0",
          "MK_1": "0",
          "nonce": "1",
          "IV": "1",
          "CT": ["21592519839218542425120198614531742298033892440087867998118713380820756220718","7889798674627961413366316750795654309310714845357364960283444849787781529858","21049421697506414118152249991945313702405695293538082623907752021251428302407","5714257272097132615426035247399194719164619087183836862864633319728296164225"]
        };



      const circuit_dec = await wasm_tester(path.join(__dirname, "circuits", "ciminion_dec.circom"));
      let witness_dec;
      witness_dec = await circuit_dec.calculateWitness(ciminion_input_dec, true);

      await circuit_dec.assertOut(witness_dec, {"PT": ["21888242871839275222246405745257275088548364400416034343692024637573625142322", "21888242871839275222246405745257275088548364400416034343693920155242279081918", "21888242871839275222246405745257275088548364400416034343692024637573625142322", "21888242871839275222246405745257275088548364400416034343693920155242279081918"]});

  });

});

Finally, we can also test a circuit that just recompute the MAC of a given ciphertext and import it:

  it("MAC recheck", async() => {

      const ciminion_input_dec = {
        "MK_0": "0",
        "MK_1": "0",
        "nonce": "1",
        "IV": "1",
        "CT": ["21592519839218542425120198614531742298033892440087867998118713380820756220718","7889798674627961413366316750795654309310714845357364960283444849787781529858","21049421697506414118152249991945313702405695293538082623907752021251428302407","5714257272097132615426035247399194719164619087183836862864633319728296164225"]
      };

    const circuit_mac = await wasm_tester(path.join(__dirname, "circuits", "ciminion_mac.circom"));

    let witness_mac;

    witness_mac = await circuit_mac.calculateWitness(ciminion_input_dec, true);

    await circuit_mac.assertOut(witness_mac, {"TAG": "1300322832108596540141310981879129316384285895603221372961580627161106587830"});

});

Number of constraints, code and examples

We have compared the number of constraints required for implementing Ciminion with those of MiMC in encryption mode (particulary Feistel MiMC-2n/n). The MiMC construction is typically used in circom2 circuits. We see that Ciminion scales better for larger plaintexts, due to the utilization of the Farfalle construction in comparison to the MiMC construction based on the Feistel structure.

We have released the code and tests utilized in this article in our GitHub repository.

Acknowledgments

Part of this work was done during the 2nd 0xPARC Learning Group.

Notes

¹The BN254 curve security level has been estimated in around 103 bits at the time of writing this post. See https://eprint.iacr.org/2022/586.

Last update: 14th July 2022