I. INTRODUCTION
AZERO-knowledge proof is a protocol that enables a prover to convince a verifier of knowledge of witness without any unnecessary leakage of the witness [1]. Specifically, zero-knowledge succinct non-interactive argument of knowledge (zk-SNARKs) is, literally, a zero-knowledge proof which is one-round protocol whose proof size is small. Since its introduction, zk-SNARKs have drawn vast attention due to their versatility and diverse applications including cryptocurrency [2]–[4], deep learning [5] and database queries [6]. In addition, there is an attempt to standardize zero-knowledge proofs, named ZKProof [7], to apply them to the industry, and many famous companies such as Google and Microsoft take part in this workshop.
Most constructions proposed so far mainly depend on pre-quantum primitives and hardness assumptions such as pairing [8], [9], hidden order group [10], and discrete logarithm problem [11], [12]. There exist hash-based constructions [13]–[15] which are secure under the quantum computer, however, it takes relatively high verification cost and storage cost than group-based constructions since they contain many hash function iterations. On the other hand, lattice-based constructions have been proposed as promising candidates for post-quantum zk-SNARKs. However, the proposed latticebased constructions are inefficient compared to group-based constructions [8]–[12] in all aspects, especially, in the proof size: while the group-based scheme [9] has 131 bytes of proof (with BN-128 curve), the best one [16] among lattice-based (designated) SNARKs requires 270 kilobytes of proof.
Here, we pay attention that the best one [16] and all other zk-SNARKs (except [17], [18]) from lattices exploit encoding schemes based on learning with errors (LWE) problem. However, several lattice-based constructions [19], [20] have shown that lattice hard problems with algebraic structures such as Ring LWE (RLWE) [21] or NTRU [22] have mathematical structures with which one can improve the efficiency of schemes. Therefore, replacing LWE by RLWE (in scheme construction) is one of the widely used techniques for improving the efficiency of a lattice-based encryption scheme, and we can raise the following natural and meaningful question:
Is it possible to enhance the efficiency of the lattice-based zk-SNARK with hard problems from algebraic lattices?
Related Work. Coming to the quantum revolution, building post-quantum zk-SNARKs with lattice-based cryptographic primitives has been highlighted as one of the challenging problems in the area of cryptography and security. We review some previous work [16]–[18], [23], [24] proposing designated verifier zk-SNARKs based on lattices.
Boneh et al. [17], [18] proposed the first lattice-based SNARG from linear-only encoding assumption on the encryption scheme based on (R)LWE problem. Specifically, the latter achieved the quasi-optimality in prover’s cost via the linearonly vector encryption scheme over rings and linear PCP [25] with multiple provers. While those work provide the best asymptotic cost among others, authors left the construction of a zk-SNARK from lattices as an open problem.
On the other hand, Gennaro et al. [23] proposed zk- SNARK with square span program (SSP) [26] assuming that an encoding scheme from LWE problem also satisfies the similar classical hardness assumptions — q-power knowledge exponent (PKE), q-power Diffie Hellman (PDH) — from finite groups (previously, those assumptions are exploited to construct zk-SNARKs from pairing groups, e.g., [27], [28]). Similarly, Nitulescu [16] presented a lattice-based zk-SNARG with square arithmetic programs (SAP) assuming that an encoding scheme from LWE problem satisfies the ‘lineartargeted malleability’ assumption which is a slightly weaker assumption than the linear-only assumption. It also has the advantage that the size of the proof is smaller than the aforementioned lattice-based zk-SNARKs, with the proof consisting of only two LWE encodings. Recently, Naganuma et al. [24] also proposed, via similar approach as above, a latticebased zk-SNARK from quadratic arithmetic programs (QAP) and then compared their result to the previous work [16], [23] with implementation. As expected in theory, while SAP-based zk-SNARK [16] has smaller proof size and less verification cost, their QAP-based one [24] is better in other aspects: setup time, prover’s cost, and the size of common reference strings. A concurrent and independent work [29] proposed a ring-variant of Pinocchio, named Rinocchio based on the quadratic ring programs (QRP) similarly as our ring-QAP. However, their construction is focused on SNARK which does not provide zero-knowledge property. For more details about the differences, we refer to Section III-B.
All of those works, including ours, provide zk-SNARKs with designated verifiers (i.e., the verification requires a private verification key) only, and constructing a publicly verifiable zk-SNARK from lattice is still an open problem.
A. Our Approach
We propose a new lattice-based zk-SNARK from RLWE problem, linear-only encoding assumption over this ring, and the notion of ring-QAP. Moreover, we provide a tight analysis on conventional noise flooding technique to reduce the size of RLWE encodings based on the Hellinger distance and due to this analysis, we can reduce not only the size of a proof in amortized sense but also the size of a single encoding.
Previously, only an LWE-based encoding scheme was exploited [16], [24] to construct zk-SNARK from lattices. To enhance the efficiency by leveraging the ring structures, we extend QAPs over [TeX:] $$\mathbb{Z}_p$$ to a ring-QAPs over a polynomial ring [TeX:] $$R_p=\mathbb{Z}_p[X] /\left(X^N+1\right)$$ with the generalized Schwartz-Zippel Lemma over [TeX:] $$R_p$$ then employ an RLWE-based encoding scheme having an element of ring [TeX:] $$R_p$$ as a message. It gives a zk-SNARK for arithmetic circuits over a ring [TeX:] $$R_p$$, to which one can apply the traditional message packing method, then we significantly reduce the proof size in amortized sense.
More precisely, when N is a power of 2 and p = 1 mod 2N, [TeX:] $$R_p$$ is isomorphic to [TeX:] $$\mathbb{Z}_p^N$$, and a single ring element has one-to-one correspondence with an N dimension vector over [TeX:] $$\mathbb{Z}_p$$, which enables to pack multiple field elements to one ring element. Then, we can outsource N computations to an untrusted prover and reduce the computational complexity of the prover and the verifier as well as the proof size in the amortized sense.
In addition, to shorten the proof size, we provide a new analysis on the parameters of zk-SNARKs using the Hellinger distance rather than the statistical distance from the previous construction. For zero-knowledgeness, all conventional latticebased zk-SNARKs [16], [23], [24] from square span programs (SSPs), SAPs, and QAPs must exploit a noise flooding technique to hide the error term in final encodings which will be disclosed to a verifier. In other words, for the error term e given in the final encodings, we must guarantee that no one can distinguish e+D from D where D is a certain distribution. To this end, previous work chose D as a uniform distribution on a large interval and employed the statistical distance as a measure to show the closeness of the above two distributions.
Unfortunately, the previous analysis with the statistical distance — providing a rough upper bound on adversaries success probability — requires that if [TeX:] $$\kappa \approx \lambda,$$ where is the security parameter and [TeX:] $$\kappa$$ is the -log of statistical distance between two distributions. In contrast, the closeness derived from Hellinger distance provides more tight analysis on the success probability of adversaries on (decision) security game, thereby requiring relaxed requirement, e.g., [TeX:] $$\kappa^{\prime} \approx \lambda / 2$$ where [TeX:] $$\kappa^{\prime}$$ is -log of Hellinger distance.
As a result, our protocol can also reduce the size of a single encoding in both asymptotic and concrete settings. Specifically, the size of single proof is about 276.5 KB when = 110, which is much smaller than that of the previous work in the lattice-based zk-SNARKs [16], [23], [24]. In addition, under 128 bit security parameter, our proof size is 156 bytes with an amortization cost and it is comparable to 138 bytes of Groth [9], the shortest proof size among all zk-SNARKs.
Concurrent Works. There are two independent and concurrent works1 that improves the lattice-based SNARKs.
1 The first version of our draft was submitted in Feb 9 2021 while Rinocchio was published in ePrint in 10 Mar 2021; [30] was published during the review period.
Ganesh et al. [29] propose a new SNARK (without ZK property) called Rinocchio for general ring arithmetic computations. To satisfy the soundness in the ring setting, they also employ the generalized Schwartz-Zippel lemma (Lemma 4) and the Ring-LWE encodings against quantum adversaries. On the other hand, Rinocchio is slightly different from our ring- QAP based zk-SNARK, similarly as the Pinocchio [27] is different from Groth’s work [9]. For example, Rinocchio requires 9 RLWE encodings to describe the proof of arguments, but the proof of our zk-SNARK only consists of 3 RLWE encodings. In addition, [29] uses q-PDH and q-PKE assumptions over rings that are weaker than ‘linear-only’ encoding assumption that we use. While their work focused on the generality, we focus on better efficiency exploiting specific case with a ring of the form [TeX:] $$\mathbb{Z}_p^n$$. Furthermore, we provide a tighter analysis on noise flooding technique for zero-knowledgeness (while [29] does not) as will be described in the following subsection. Our analysis could be applicable to Rinocchio for building a lattice based zk-SNARK with a shorter proof.
Another work by Yuval Ishai et al. [30] also proposes zk- SNARKs for reducing the proof size. Their construction is built on Bitansky et al. [25] compiler with linear-only vector encryption suggested by Boneh et al. [17]. To minimize the proof size, they employ several methods including modulus switching2 on the proof encoding, exploiting a linear PCPs and vector encryptions on quadratic extension field (of a base finite field [TeX:] $$\mathbb{Z}_p$$), etc. While they can achieve the smallest proof size among all zk-SNARKs based on lattice assumption, their construction does not support batching multiple proofs in contrasts to ours. As a quick comparison, our construction utilizes an extension ring [TeX:] $$R_p=\mathbb{Z}_p[X] /\left(X^N+1\right)$$ with high degree N targeting the smallest amortized proof size while their optimization utilizes quadratic extension fields [TeX:] $$\mathbb{Z}_p[X] /\left(X^2+1\right)$$ to reduce the single proof size. Thus, our proposal still remains the lattice-based zk-SNARKs having the smallest proof size in amortized sense.
2 A widely employed technique in fully homomorphic encryption to reduce the modulus of a ciphertext without modifying the underlying messages.
B. Application — Verifiable Machine Learning Training
As an interesting application scenario of our proposed zk- SNARK, we present the verification of machine learning (ML) training. The ML training phase is composed of many computation steps where the portion of input data is used to update the model parameters. Assume that a client outsources to a server a training phase of ML model with data to be trained on. However, since the training phase is composed of many steps of computations on large data, a server may miss some portion of training steps and/or the training data. Therefore, both a client and a server have an incentive to verify and prove that the final output model is trained correctly with the given data. This is possible with zk-SNARKs where the client and the server act as a verifier and a prover, respectively, by generating and verifying the proof of training computation; see Fig. 1.
Verifying ML training phase with zk-SNARK.
While one can use any zk-SNARKs in this scenario, our zk- SNARK — with reduced amortized proof size — can provide smaller overall proof size than the previous work. For detail, assume that the training phase is composed of many training steps each of which can be described as follows:
where [TeX:] $$\vec{W}_i$$ and [TeX:] $$\vec{W}_{i+1}$$ are the model parameters before and after the i-th step [TeX:] $$f_i$$, respectively, and [TeX:] $$D_i$$ is the (portion of) data used in each step. Then, the entire training phase can be verified by verifying every zk-SNARK proof for each step [TeX:] $$f_i$$3 given that the prover sends all intermediate [TeX:] $$\vec{W}_i$$'s along with the proof to the verifier; hence, it requires CRS for the computation of each [TeX:] $$f_i$$'s only (which will be identical in most cases). In contrasts, if we consider proving the entire training phase with initial and final parameters only, it requires problematically large amount of CRS requirement due to the huge size of entire computation. In the former case with affordable CRS size, our zk-SNARK whose proof is capable of proving and verifying many computations simultaneously provides much smaller proof size than the previous zk-SNARKs. Specifically, if there are n training steps, previous zk-SNARKs require n proofs while ours does only [TeX:] $$\lceil n / N\rceil$$ proofs where N is the maximum proof capability of ours in one proof encoding. Note that the verifier has all intermediate [TeX:] $$\vec{W}_i$$'s and arranges them correctly as input and output of parallel circuits then verifies correct computation of all [TeX:] $$\vec{W}_i$$'s with the zk-SNARK proof.
3 3In usual ML training [TeX:] $$f_i$$'s are almost the same for all steps. Our zk-SNARK can also handle different [TeX:] $$f_i$$'s given that the circuit size of them are bounded.
Note, in addition, that the number n of training steps are usually bigger or comparable to the amortization capability N = 2, 048 in our zk-SNARK construction, realizing the best possible amortized proof size in most cases. On the other hand, the server may not want to disclose some of hyper-parameters — the external values used to control the learning process, e.g., learning rate, batch size, number of iterations, etc. — since it comprises a secret know-how for getting good training results. This can be kept secret by zero-knowledge property of zk-SNARK.
C. Application — Merkle Proof with Smaller Proof Size
Merkle trees, proposed by Ralph Merkle [31], is a binary tree in which the leaf node is a cryptographic hash of a data block, and non-leaf node is a cryptographic hash value of its child nodes. This technique is used to prove efficiently that some data block received from the other belongs to the tree. Therefore, Merkle trees are widely used in many applications, especially, peer-to-peer systems such as Git and BitTorrent and recently, many cryptocurrencies also employ Merkle trees to verify the data block received from other nodes.
For Merkle proofs, the prover provides a sequence of hash value needed to compute the hash value of its parent from the leaf node to the root node. Then, the verifier climbs Merkle trees and ensure the validity of the proof when the computed root hash value coincides with the public root value. To be more concrete, given a data D belonging to a binary tree of depth [TeX:] $$\ell$$, the Merkle proof for D is [TeX:] $$\pi=\left(\pi_1, \cdots, \pi_{\ell}\right)$$, where [TeX:] $$\pi_i$$ is the hash value for level i and [TeX:] $$\pi_1$$ is the publicly known Merkle root. For a cryptographic hash function H, [TeX:] $$h_{\ell}=H(D)$$, and for each [TeX:] $$i \in\{2,3, \cdots, \ell\} \text {, }$$ the verifier goes to level i − 1 nodes from level i nodes by checking if [TeX:] $$h_{i-1}$$ is [TeX:] $$H\left(\pi_i, h_i\right)$$ or [TeX:] $$H\left(h_i, \pi_i\right)$$ depending on the Merkle path of D to the root. Lastly, he can obtain [TeX:] $$h_1$$ and accepts the proof only if [TeX:] $$h_1=\pi_1$$. In other words, the verification circuit can be represented by multiple evaluations of the same hash function H as follows: for [TeX:] $$i \in\{2,3, \cdots, \ell\}$$,
Now, for the application of zk-SNARK, we assume, similarly as the previous application example, that the prover sends all [TeX:] $$\pi_i$$'s, [TeX:] $$h_i$$'s, and the information of the Merkle path along with a SNARK proof to a verifier. Then, the verifier arranges each input for each evaluation appropriately (as [TeX:] $$\pi_i$$, [TeX:] $$h_i$$ or [TeX:] $$h_i$$, [TeX:] $$\pi_i$$) and verifies above computation with the SNARK proof. Since the verifier has the intermediate hash values, this process enables the verifier to check the dependency between levels. In this case, with our zk-SNARK, the size of proof can be reduced considerably since we need only [TeX:] $$\lceil\ell / N\rceil$$ proofs while previous one requires [TeX:] $$\ell$$ proofs. Ours will be also beneficial if one needs to prove/verify many Merkle proofs simultaneously. Moreover, with zk-SNARK, the computational complexity for the verifier can be less than that for the original Merkle verification where she needs to evaluate many hash functions.
We tried to implement Merkle proof to prove the efficiency. Even though libsnark provides a gadget for SHA-256, it does not fit in SEAL library. For this reason, we generate a random circuit with [TeX:] $$2^{15}$$ multiplicative gates and [TeX:] $$2^{5}$$ the number of inputs instead. As far as we know, a circuit representing SHA-256 also has about [TeX:] $$2^{15}$$ multiplicative gates. Our experiment result is summarized in Table III in Section V. According to our implementation, the verification time is fast enough. Moreover, a proof contains many independent instances and it implies that it is beneficial to prove many instances at the same time such as Merkle proof. Additionally, our scheme is designed based on Groth’s zk-SNARK scheme [9] whose one of key features is that complexity for the verifier is independent on the circuit. In this context, we believe that the current experiment indicates that verification using zk-SNARK is useful when the number of hash functions is sufficiently large.
D. Organization
Section II provides preliminaries about discrete Gaussian distributions, definitions of RLWE and zk-SNARK. Section III provides our main protocol, zk-SNARK from RLWE, which consists of ring-QAP, zk-SNARK from RLWE, security proofs and better noise flooding using Rènyi divergence. Section IV provides the size of the proof with various security parameters and comparison between ours and the previous work.
II. PRELIMINARIES
Let [TeX:] $$\mathbb{Z}, \mathbb{Q}, \mathbb{R}$$ be the integers, rational, real, respectively, and [TeX:] $$\mathbb{Z}_q=\mathbb{Z} / q \mathbb{Z}$$ the set of integers modulo q represented as integers from ([TeX:] $$-q / 2, q / 2] \cap \mathbb{Z} \text {, and } \mathbb{Z}[X]$$ be the set of all polynomials with integer coefficients. Throughout this paper, we denote N for a power of two integer so that [TeX:] $$X^N+1$$ is a cyclotomic polynomial, [TeX:] $$R=\mathbb{Z}[X] /\left(X^N+1\right)$$ and [TeX:] $$R_q=R / q R=\mathbb{Z}_q[X] /\left(X^N+1\right)$$. We use left-arrow notations in the following two cases: For a finite set S, [TeX:] $$s \leftarrow S$$ denotes that s is uniformly sampled from S. For a distribution D, [TeX:] $$s \leftarrow \mathcal{D}$$, denotes that s is sampled from the distribution D. A statistical distance between two discrete distributions [TeX:] $$D_1$$ and [TeX:] $$D_2$$, denoted by SD ([TeX:] $$D_1$$, [TeX:] $$D_2$$), is [TeX:] $$\sum_{x \in X} \frac{1}{2} \operatorname{Pr}\left|D_1(x)-D_2(x)\right|.$$ We denote [TeX:] $$\operatorname{Pr}[s \leftarrow \mathcal{D} \mid A]$$ as the probability that an event A occurs when [TeX:] $$s \leftarrow \mathcal{D} .$$
A. Lattices and Discrete Gaussian Distribution
A lattice L is defined as an additive discrete subgroup of [TeX:] $$\mathbb{R}^n$$ and is represented by integral linear combinations of a basis [TeX:] $$\mathbf{B} \in \mathbb{R}^{n \times r} \text {, i.e., } \mathcal{L}=\mathbf{B} \mathbb{Z}^r \text {. }$$ We first recall the definition and some properties of a Discrete Gaussian distribution over a lattice L.
Definition 1 (Discrete Gaussian Distribution): Let L be a lattice contained in [TeX:] $$\mathbb{R}^n$$. Then, for any positive real number , we define a function ρ as follows.
Then, we define a discrete Gaussian distribution [TeX:] $$\chi_{\mathcal{L}, \sigma}$$ with a standard deviation whose probability density function is [TeX:] $$\rho_\sigma(\mathbf{x}) / \rho_\sigma(\mathcal{L})$$ where [TeX:] $$\rho_\sigma(\mathcal{L})$$ is the sum of all points [TeX:] $$\mathbf{x} \in \mathcal{L}, \text { i.e., } \rho_\sigma(\mathcal{L})=\sum_{\mathbf{x} \in \mathcal{L}} \rho_\sigma(\mathbf{x}) \text {. }$$
Lemma 1 (Tail Bounds): For [TeX:] $$\sigma \gt 0$$ and [TeX:] $$T \gt 0$$, it holds that
Lemma 2 ( [32], Tail Bounds of Inner Product): For [TeX:] $$\sigma \gt 0$$ and [TeX:] $$T \gt 0$$, it holds that
B. Linear-Only Encoding Scheme from RLWE
We introduce an encoding scheme which is a building block of our zk-SNARK construction. The encoding scheme has the ring [TeX:] $$R_p$$ and [TeX:] $$R_q$$ as the message space and the encoding space, respectively for some integers p and q.
Definition 2 (RLWE Encoding Scheme): Our RLWE Encoding scheme is composed of three algorithms KeyGen, Enc, Dec as follows: ([TeX:] $$\Delta=\lfloor q / p\rfloor, \chi_\sigma$$ denotes the discrete Gaussian distribution with a standard deviation )
KeyGen[TeX:] $$\left(1^\lambda\right) \rightarrow \mathrm{sk}$$: Sample [TeX:] $$\mathbf{s} \leftarrow R_q.$$ Output sk = s as a secret key.
Enc[TeX:] $$(\mathrm{sk}, \mathbf{m}) \rightarrow \mathrm{ct}$$: To encrypt [TeX:] $$\mathbf{m} \in R_p$$, sample [TeX:] $$\mathbf{a} \leftarrow R_q$$ and [TeX:] $$\mathbf{e} \leftarrow \chi_\sigma^N$$ for a discrete Gaussian distribution [TeX:] $$\chi_\sigma$$. Then, compute [TeX:] $$\mathbf{b}=\mathbf{a} \cdot \mathbf{s}+\mathbf{e}+\Delta \mathbf{m}$$ mod q, and output a ciphertext [TeX:] $$\mathrm{ct}:=(\mathbf{a}, \mathbf{b})$$.
Eval[TeX:] $$\left(\boldsymbol{c}, d,\{\mathbf{c t}\}_{i=1}^I\right) \rightarrow \mathrm{ct}$$: For a given vector [TeX:] $$c \in R_p^I$$ and I ciphertexts [TeX:] $$\mathrm{ct}_i=\left(a_i, b_i\right),$$ output the ciphertext [TeX:] $$\mathrm{ct}:=\left(\boldsymbol{c} \cdot\left(a_1, a_2, \cdots, a_I\right), \boldsymbol{c} \cdot\left(b_1, b_2, \cdots, b_I\right)+\Delta d\right)$$
Dec[TeX:] $$(\text {sk, ct}) \rightarrow \mathbf{m}:$$ To decrypt [TeX:] $$\mathrm{ct}=(\mathbf{a}, \mathbf{b})$$, compute [TeX:] $$\mathbf{d}=\mathbf{b}-\mathbf{a s}$$ mod q and output [TeX:] $$\left\lfloor\frac{p \mathbf{d}}{q}\right\rceil \bmod p$$
This encoding scheme is indeed a symmetric key encryption from [33] and is semantically secure under the RLWE assumption (Definition 4 given below). We also remark that an addition of and a scalar multiplication on ciphertexts homomorphically correspond to those operations on the underlying messages, or more precisely: for all [TeX:] $$\left(m_i\right)_{i \in I} \in R_p^I$$, [TeX:] $$c \in R_p^I, \text { and } d \in R_p$$, the following probability is bigger than [TeX:] $$1-\operatorname{negl}(\lambda)$$:
given that [TeX:] $$p I\|\mathbf{e}\| \lt q / 2 p.$$ This allows a party (without sk) to output a ciphertext whose underlying message is an affine combination of the underlying messages of given ciphertexts.
Essential to our construction of zk-SNARK is the assumption that above encoding scheme is a linear-only encoding scheme. Roughly, it assumes that the only way for a PPT adversary to generate a valid new ciphertext is linearly combining the given ciphertexts. The formal definition is as follows, and a generalized version of this assumption (with messages composed of vectors) was also exploited in [17], [18] to construct SNARG from (R)LWE assumptions.
Definition 3 (Linear-Only Encoding [17], [25]): Fix a security parameter . An encoding scheme [TeX:] $$\mathcal{E}=(\text {KeyGen, Enc, Dec})$$ over a ring [TeX:] $$\mathfrak{R}$$ is a linear-only encoding scheme if for any PPT adversary [TeX:] $$\mathcal{A}$$, there exists an efficient extractor [TeX:] $$\mathcal{X}_{\mathcal{A}}$$ such that for all auxiliary inputs [TeX:] $$z \in\{0,1\}^\lambda,$$ and any plaintext generation algorithm [TeX:] $$\mathcal{M}$$ (which outputs some elements from [TeX:] $$\mathcal{R}$$), we have that for [TeX:] $$\text { sk } \leftarrow \text { KeyGen }\left(1^\lambda\right),$$ [TeX:] $$\left(a_1, a_2, \cdots, a_m\right) \leftarrow \mathcal{M}\left(1^\lambda\right),$$ [TeX:] $$\mathrm{ct}_i \leftarrow \operatorname{Enc}\left(\mathrm{sk}, a_i\right)$$ for all [TeX:] $$i \in[m], \mathrm{ct}^{\prime} \leftarrow \mathcal{A}\left(\left\{\mathrm{ct}_i\right\}_{i \in[m]} ; z\right),$$ [TeX:] $$(\boldsymbol{\pi}, b) \leftarrow \mathcal{X}_{\mathcal{A}}\left(\left\{\mathrm{ct}_i\right\}_{i \in[m]} ; z\right), a^{\prime} \leftarrow\left(a_1, a_2, \cdots, a_m\right) \cdot \boldsymbol{\pi}+b,$$
Remark 1: The fact that our encoding scheme resembles usual fully homomorphic encryption (FHE) schemes does not contradict our assumption that it is a Linear-Only Encoding. In FHE, it is necessary for the ciphertext after non-scalar multiplication to be decrypted with specified secret key [33] or to be performed key-switching procedure [34] for correct decryption, both of which are not allowed in our encoding scheme.
We finally describe the ring learning with errors (RLWE) assumption as follows.
Definition 4 (Ring LWE assumption): Let R be [TeX:] $$\mathbb{Z}[X] /\left(X^N+1\right)$$ with a power of two integer N, and [TeX:] $$R_q=R / q R.$$ Then, a decision ring LWE (RLWE) assumption is hard to distinguish the two distributions
[TeX:] $$\left.\left\{\left(\mathbf{a}_i, \mathbf{b}_i=\mathbf{a}_i \mathbf{s}+\mathbf{e} \bmod q\right): \mathbf{a}_i, \mathbf{s} \leftarrow R_q \text { and } \mathbf{e} \leftarrow \chi_\sigma^N\right\}\right\}$$
[TeX:] $$\left\{(\mathbf{a}, \mathbf{u}): \mathbf{a}, \mathbf{u} \leftarrow R_q\right\}$$
where [TeX:] $$\chi_\sigma$$ is a discrete Gaussian distribution with a standard deviation defined on [TeX:] $$\mathbb{Z}$$4
4 Formal definition of RLWE is different to the definition 4, but if N is power of two, then the definition of RLWE is the same as definition 4. We refer to [35].
The following lemma is a corollary from [36].
Lemma 3 (Hardness of RLWE [36]): Let N be a power of two integer, and R be the ring [TeX:] $$\mathbb{Z}[X] /\left(X^N+1\right)$$ and [TeX:] $$R_q=R / q R \text {, }$$ where q = 1 mod 2N. If [TeX:] $$\sigma=\alpha q \gt \sqrt{N}(N m / \log (N m))^{1 / 4} \cdot \omega(\sqrt{\log N}),$$ then the RLWE problem with respects to parameters N, q, m, [TeX:] $$\chi$$ where [TeX:] $$\chi$$ a discrete Gaussian with standard deviation , is quantumly at least as hard as approximate the shortest vector problem within a factor [TeX:] $$\widetilde{O}\left(N(N m / \log (N m))^{1 / 4} / \alpha\right)$$ in an ideal of the ring [TeX:] $$\mathbb{Z}\left[\zeta_{2 N}\right]$$ where [TeX:] $$\zeta_{2 N}$$ is the 2N-root of the unity.
Remark 2 (SIMD Operation): When p is a prime such that [TeX:] $$p \equiv 1 \bmod 2 N,$$ the message space [TeX:] $$R_p$$ of the above encoding scheme is isomorphic as a ring to [TeX:] $$\mathbb{Z}_p^N.$$ Therefore, we can simultaneously encode N messages from [TeX:] $$\mathbb{Z}_p$$ to a single encoding, and a single operation (addition or scalar multiplication) on the ciphertexts or messages of [TeX:] $$R_p$$ corresponds to those on many messages of [TeX:] $$\mathbb{Z}_p$$, which is called a single instruction multiple data (SIMD) operation.
C. Succinct Non-Interactive Arguments
In this section, we provide definitions of the notion of succinct non-interactive zero-knowledge argument of knowledge (zk-SNARK).
Definition 5 (Non-interactive proof system): Let [TeX:] $$\mathcal{R}$$ be a relation which comprises pairs [TeX:] $$(\phi, \omega) \in \mathcal{R}.$$ We call the statement and the witness. A non-interactive proof system for a relation [TeX:] $$\mathcal{R}$$ comprises three algorithms as follows:
Setup([TeX:] $$\mathcal{R}$$): The setup algorithm provides a common reference string crs and a simulation trapdoor for the relation [TeX:] $$\mathcal{R}$$.
Prove([TeX:] $$\mathcal{R}\text {.crs, } \phi, \omega$$): The prove algorithm outputs a proof .
Verify([TeX:] $$\mathcal{R}\text {.crs, } \phi, \pi$$): The verify algorithm provides 0 (reject) or 1 (accept).
If a verifier can only verify a proof using the verifiable common string vrs which contains secret information, the proof system is called the ‘designated’ non-interactive proof system.
Completeness roughly says that an honest prover can convince an honest verifier. We formally describe the completeness.
Definition 6 (Completeness): (Setup, Prove,Verify, Sim) algorithms are said to have completeness if they satisfy that for all
is bigger than [TeX:] $$1-\operatorname{negl}(\lambda)$$, where [TeX:] $$\operatorname{negl}(\lambda)$$ is a negligible function in .
The knowledge soundness says that if a prover can provide a valid proof, then there is an efficient algorithm for extracting the witness for the given statement with the same inputs and random coins. We formally describe the knowledge soundness.
Definition 7 (Knowledge soundness): For any polynomial time adversary [TeX:] $$\mathcal{A},$$ there exists a polynomial time extractor [TeX:] $$\mathcal{X}_{\mathcal{A}}$$ such that
where [TeX:] $$\operatorname{negl}(\lambda)$$ is a negligible function in .
The zero-knowledge roughly says that a prover cannot leak any information except for the truth of the statement. We also formally describe the zero-knowledge.
Definition 8: [TeX:] $$\text {[} \epsilon_{z k} \text {-Zero-Knowledge] }$$ For all [TeX:] $$(\phi, \omega) \in \mathcal{R}$$ and an adversary [TeX:] $$\mathcal{A}$$, the following equality holds.
where means that the difference of two probability is bounded by [TeX:] $$\epsilon_{z k} \ll 1$$.
Now we present the definition of succinctness and finally, zk-SNARK.
Definition 9 (Succinctness): A non-interactive proof system is succinct if the proof size and verification time is polynomial in the security parameter and [TeX:] $$|\phi|+\log |w|$$ where and w are input and witness of the relation [TeX:] $$\mathcal{R}$$, respectively.
Definition 10 (zk-SNARK): If a non-interactive proof system satisfies the completeness, knowledge soundness, zeroknowledge, and succinctness, then it is called zk-SNARK. In addition, if it requires a secret information for a verifier (to Verify), it is called the designated zk-SNARK.
III. ZK-SNARK FROM RLWE
In this section, we propose a zk-SNARK from RLWE. Here, we use a ring [TeX:] $$R_p=\mathbb{Z}_p[X] /\left(X^N+1\right)$$ as a message space with a power-of-two N and a prime p such that p = 1 mod 2N to fully exploit slot-wise computations. Indeed, the ring isomorphism [TeX:] $$R_p \cong \mathbb{Z}_p^N$$ allows N slot-wise computations. Then, we can simultaneously verify at most N possibly distinct circuits (having the same bound on the size) in a single decryption process. Previously, to verify circuits with N pairs of input and output, a verifier must perform N decryption processes.
Intuition of zk-SNARK construction: Groth [9] and ours.
Since our zk-SNARK construction resembles that of Groth [9], we briefly overview the intuition behind the construction of [9] and the distinguished aspect of ours. Both constructions are based on the QAP (detail will be given below) where the divisibility of [TeX:] $$v_{C, i}(x) \cdot w_{C, i}(x)-y_{C, i, o}(x) \text { by } t_C(x)$$ is equivalent to the correct evaluation of a circuit C with an input i and output o. The divisibility is checked by letting a prover to provide the quotient [TeX:] $$h_{C, i, o}(x)$$ along with (the part of) polynomials [TeX:] $$v_{C, i}(x), w_{C, i}(s), y_{C, i, o}(x)$$ satisfying the divisibility condition with [TeX:] $$t_C(x)$$. On the other hand, for efficient verification and succinct proof, instead of working directly on those polynomials, a verifier (or a trusted party) encodes the random point r (and its corresponding powers) with an linearly homomorphic encoding scheme so that a prover can generate a proof without knowing r (necessary for soundness). Two significant differences of our construction from Groth [9] are (i) We use RLWE encoding scheme for better amortized proof size, which additionally requires noise flooding technique for zero-knowledge; (ii) We exploit generalized version of QAP for a finite ring (instead of field) to deal with the messages space [TeX:] $$R_p$$ of the RLWE encoding scheme.
A. Ring-Quadratic Arithmetic Program (Ring-QAP)
Previously, QAP has been used to confirm arithmetic circuit satisfiability over finite field [TeX:] $$\mathbb{F}$$, so every element which appears at the above definition is contained in [TeX:] $$\mathbb{F}$$. However, since the message space of the RLWE based encoding is not a field, but a ring, the existing QAP definition cannot capture the case. Thus, the necessity for ring-QAP is natural, which is the generalization of the previous QAPs from a finite field [TeX:] $$\mathbb{F}$$ to a ring [TeX:] $$\mathfrak{R}$$. We first introduce a definition of ring-QAP.
Definition 11 (Ring-QAP; adapted from [27]): A QAP [TeX:] $$\mathcal{Q}$$ over a ring [TeX:] $$\mathfrak{R}$$ comprises three sets of m + 1 polynomials [TeX:] $$\mathcal{V}=\left\{v_k(x)\right\}, \mathcal{W}=\left\{w_k(x)\right\}, \mathcal{Y}=\left\{y_k(x)\right\}$$ (over [TeX:] $$\mathfrak{R}$$), for [TeX:] $$k \in\{0,1, \ldots, m\}$$, and a target polynomial [TeX:] $$t(x) \in \mathfrak{R}[x]$$. Suppose C : [TeX:] $$\Re^n \rightarrow \mathfrak{R}^{n^{\prime}}$$ is an arithmetic circuit that takes as input n elements of [TeX:] $$\mathfrak{R}$$ and outputs n′ elements, for a total of n + n′ I/O elements. Then we say that [TeX:] $$\mathcal{Q}$$ computes C if: [TeX:] $$\left(a_1, a_2, \ldots, a_{n+n^{\prime}}\right) \in \Re^{n+n^{\prime}}$$ is a valid assignment of C’s inputs and outputs, if and only if there exist coefficients [TeX:] $$\left(a_{n+n^{\prime}+1}, a_{n+n^{\prime}+2}, \ldots, a_m\right)$$ such that [TeX:] $$t(x)$$ divides [TeX:] $$p(x)$$, where:
Remark 3 (Description of [TeX:] $$\mathcal{V}, \mathcal{W}, \mathcal{Y}$$): Recall that, in the original QAP [27] over a finite field, the target polynomial [TeX:] $$t(x)$$ is defined by [TeX:] $$\prod_g\left(x-r_g\right)$$ with distinct roots [TeX:] $$r_g$$'s, each corresponding to each multiplication gate. Then, polynomials [TeX:] $$\mathcal{V}, \mathcal{W}, \text{ and } \mathcal{Y}$$ are constructed in a way that their evaluation values on [TeX:] $$r_g$$, i.e., [TeX:] $$\left(v_0\left(r_g\right)+\sum_{i=1}^m a_i v_i\left(r_g\right),\left(w_0\left(r_g\right)+\sum_{i=1}^m a_i w_i\left(r_g\right)\right.\right.$$, and [TeX:] $$\left(y_0\left(r_g\right)+\sum_{i=1}^m a_i y_i\left(r_g\right)\right.$$ are respectively, left input, right input, and output of the multiplication gate corresponding to [TeX:] $$r_g$$. In our Ring-QAP, the target polynomial [TeX:] $$t(x),$$ along with [TeX:] $$\mathcal{V}, \mathcal{W}, \text{ and } \mathcal{Y}$$ are defined in the same way as those of the original QAP, but with a caution in choosing [TeX:] $$r_g$$'s due to the following Schwartz-Zippel lemma on the ring.
For the soundness of zk-SNARKs, the Schwartz-Zippel lemma should be required. The original lemma only provides an upper bound of the probability that the evaluation of nonzero multivariate polynomials at a random point from some finite set is zero. Thus, it does not also capture a polynomial ring case, but fortunately Schwartz [37] and Bishonoi et al. [38] deal with the ring variant of Schwartz-Zippel lemma as follows.
Lemma 4 (Generalized Schwartz-Zippel Lemma [37], [38]): Let [TeX:] $$\mathfrak{R}$$ be a finite ring, and let [TeX:] $$S \subseteq \mathfrak{R}$$ be a set satisfying that
for all [TeX:] $$x, y \in S$$ such that [TeX:] $$x \neq y$$, [TeX:] $$x-y$$ is invertible.5
Then, for all n-variate nonzero polynomial [TeX:] $$f: \Re^n \rightarrow \mathfrak{R}$$ of total degree D,
Example 1 (Set S with Maximal Cardinality): For a prime p, when a ring [TeX:] $$R_p=\mathbb{Z}_p[X] /\left(X^N+1\right)$$ is isomorphic to [TeX:] $$\mathbb{Z}_p^N$$, a set [TeX:] $$S:=\left\{(a, a, \ldots, a): a \in \mathbb{Z}_p\right\} \subseteq R_p$$ satisfies the desired condition of the above lemma with [TeX:] $$\mathfrak{R}=R_p$$. Note that [TeX:] $$|S|=p$$ and S has the maximal cardinality among all such subsets: if a set S′ has cardinality bigger than p, then by pigeon hole principle, there exist distinct [TeX:] $$x, y \in S^{\prime}$$ having the same value in at least one of its coordinate (hence, [TeX:] $$x-y$$ is not invertible).
To exploit the above lemma in our case, we choose [TeX:] $$S:=\left\{a \cdot 1^N: a \in \mathbb{Z}_p\right\}$$, where 1 is a vector of ones; all coefficients are one. Thus, we directly obtain the Theorem 1 that the probability that f can be vanished at a random point can be described as follows.
Theorem 1 (Rp-QAP with maximal cardinality): For a ring [TeX:] $$R_p \cong \mathbb{Z}_p^N$$ (with prime p) and any arithmetic circuit C : [TeX:] $$R_p^n \rightarrow R_p^{n^{\prime}}$$ of fan-in 2 with m wires and d multiplication gates, if p d, then there exists a QAP [TeX:] $$\mathcal{Q}=\left(\mathcal{V}=\left\{v_k(x)\right\}_{k=0}^m, \mathcal{W}=\right.\left.\left\{w_k(x)\right\}_{k=0}^m, \mathcal{Y}=\left\{y_k(x)\right\}_{k=0}^m, t(x)\right)$$ computing C. More precisely,
and [TeX:] $$\mathcal{V}, \mathcal{W}, \mathcal{Y}$$ can be defined by combining [TeX:] $$\left\{\lambda_j(x)\right\}_{j=0}^{d-1}$$ where
for distinct roots [TeX:] $$r_0, \cdots, r_{d-1} \in A:=\left\{a \cdot \mathbf{1}^N: a \in \mathbb{Z}_p\right\} \subseteq R_p.$$
B. Our Designated zk-SNARK from RLWE
We now describe our zk-SNARK with RLWE-based linearonly encoding (Section II-B), which is composed of three algorithms: (Setup, Prove,Verify). Roughly speaking, the protocol is a natural conversion from a DL-based encoding into the RLWE-based linear-only encoding with Rp-QAP with maximal cardinality where [TeX:] $$R_p=\mathbb{Z}_p[x] /\left(x^N+1\right)$$ and N is a power of 2. We assume [TeX:] $$p, q \equiv 1 \bmod 2 N$$ which implies that [TeX:] $$R_p \cong \mathbb{Z}_p^N \text { and } R_q \cong \mathbb{Z}_q^N.$$
Let [TeX:] $$\mathcal{R}$$ be a relation that a prover wants to prove, which is represented by an arithmetic circuit with n inputs, n′ outputs, and m wires (composed of input and output of the circuit, output of multiplication gates, and constant addition and multiplication gates). [TeX:] $$\text { Let } \mathcal{V}=\left\{v_k(x)\right\}, \mathcal{W}=\left\{w_k(x)\right\} \text {, }\mathcal{Y}=\left\{y_k(x)\right\}$$, and [TeX:] $$t(x)$$ be the ring-QAP (Definition 11) corresponding to this arithmetic circuit. Then, for the valid statements [TeX:] $$\left(a_1, \ldots, a_{n+n^{\prime}}\right)$$ and witnesses [TeX:] $$\left(a_{n+n^{\prime}+1}, \ldots, a_m\right)$$ of the relation [TeX:] $$\mathcal{R}$$, it holds that
for some polynomial [TeX:] $$h(x)$$ of degree at most the number of multiplication gates.
[TeX:] $$(\text {crs, vrs}) \leftarrow \operatorname{Setup}(\mathcal{R})$$. This algorithm receives a relation [TeX:] $$\mathcal{R}$$ as an input and outputs a common refernce string crs. In addition, our scheme only supports a designated verifier and Setup outputs an additional information, called vrs. The trusted third party (TTP) chooses random elements [TeX:] $$\alpha, \beta, \delta, r \leftarrow R_p$$, and generates the master secret key of RLWE encoding [TeX:] $$\mathrm{s} \leftarrow R_q .$$ Then, TTP computes crs and vrs as follows:
Here Enc denotes an encoding algorithm for RLWE as defined in Section II-B and [TeX:] $$\operatorname{Enc}\left(0_j\right)^{\prime} \mathrm{s}$$'s are encodings of zero. TTP makes public crs, however, vrs is sent to the designated verifier and it should be kept secret.
[TeX:] $$\pi \leftarrow \operatorname{Prove}\left(\text { crs }, a_1, \ldots, a_m\right).$$ To generate a proof , a prover executes Prove algorithm which receives crs, statements and witnesses as an input. He chooses random elements [TeX:] $$\gamma_u, \gamma_v \leftarrow R_p$$ generates three encodings Enc(A(r)), Enc(B(r)), and Enc(C(r)) through homomorphic summations and scalar multiplications where
and for [TeX:] $$\mathbf{r}_A, \mathbf{r}_B, \mathbf{r}_C \leftarrow\{0,1\}^{N \log q+2 \lambda}$$ and [TeX:] $$I \in\{A, B, C\},$$ computes re-randomized encodings
where [TeX:] $$\mathbf{e}_I^*$$ is sampled from a distribution that outputs large elements to smudge the error terms in RLWE encodings. We will formally describe how to sample [TeX:] $$\mathbf{e}_I^*$$ in the next Section III-C.
[TeX:] $$1 / 0 \leftarrow \operatorname{Verify}\left(\pi, v r s, a_1, \ldots, a_{n+n^{\prime}}\right).$$ A (designated) verifier who has vrs can check the validity of = (Enc(A(r)), Enc(B(r)), Enc(C(r))). The verifier can obtain a tuple (A,B,C) through executing a decryption algorithm from . Then, he tests
and accepts the proof if the test passes.
C. Noise Flooding with Optimized Parameters
A verifier in our protocol decrypts the RLWE ciphertexts using a secret key to obtain messages. The decryption process of the RLWE based encoding gives a verifier the error terms as well as corresponding message. Due to the construction of RLWE ciphertexts, error terms may contain some information about affine computations which are conducted on encrypted data and thus information about the error term must be hidden. To overcome this restriction, previous works [16], [23], [24] introduced a noise flooding technique where one adds a large values to hide an existing error term.
Noise flooding in the previous work. In previous work, prover injects a sufficiently large error [TeX:] $$\mathrm{e}^*$$ to a proof ciphertext [TeX:] $$(\mathbf{a}, \mathbf{b}=\mathbf{a} \cdot \mathbf{s}+\mathbf{e})$$ so that the added ciphertext [TeX:] $$\left(\mathbf{a}, \mathbf{b}+\mathbf{e}^* \bmod \right. q R)$$ has an error [TeX:] $$\mathbf{e}+\mathrm{e}^* \bmod q R$$ that is statistically close to [TeX:] $$\mathrm{e}^*$$. Then, following lemma guarantees that no adversary can obtain any significant information on the error term [TeX:] $$\mathrm{e}$$ from the decryption of a proof ciphertext.
Lemma 5 ( [23]): Let [TeX:] $$B_1, B_2$$ be positive numbers and x be a fixed number in an interval [TeX:] $$\left[-B_1, B_1\right]$$. Let Y be the uniform distribution defined on an interval [TeX:] $$\left[-B_2, B_2\right]$$. Then, the statistical distance between a distribution Y and Y + x is bounded by [TeX:] $$B_1 / B_2 \text {. }$$
Specifically, the lemma implies that [TeX:] $$B_2=B_1 \times 2^\kappa$$ bounds the statistical distance between two distributions to be [TeX:] $$2^{-\kappa}$$. Then, from the probability preservation property of statistical distance, it gives that the scheme with noise flooding satisfies the zero-knowledge property (Definition 8) with [TeX:] $$\epsilon_{z k}=2^{-\kappa}$$.
Our noise flooding with tighter parameters. In this paper, we propose to investigate the computational costs of distinguishing two distributions with the notion of Hellinger distance as recently proposed by [39]. From this, by computing the cost more tightly, we can use better parameters while providing the same zero-knowledge property. More specifically, we compute more tight lower bound of the computational costs required for an adversary to break the zero-knowledge then set the parameter accordingly.
We remark that, conventional argument based on the statistical distance (as above) requires that a new error to be larger than the initial error in ratio exponential to the statistical parameter. To circumvent this limitation, there have been several approaches (especially lattice-based cryptography) proposed to use closeness measures other than statistical distance on distributions [39]–[46]. Our approach can be seen as an adaptation of [39] for the zero-knowledge property in latticebased encoding scheme.
At first, we introduce the Hellinger distance and its property, a key ingredient for our better noise flooding technique.
Definition 12 (Hellinger distance): Let [TeX:] $$D_1, D_2$$ be two discrete distributions over a domain X. The Hellinger distance between [TeX:] $$D_1$$ and [TeX:] $$D_2$$ is defined by
If [TeX:] $$H\left(D_1, D_2\right)$$ is smaller than [TeX:] $$2^{-t}$$, we say that a pair [TeX:] $$(D_1, D_2)$$ is [TeX:] $$2^{-t}$$-Hellinger close pair.
[39] recently showed that replacing a distribution [TeX:] $$D_1$$ with the other distribution [TeX:] $$D_2$$ in the security game for the decision problem loss only a few bit security if [TeX:] $$(D_1, D_2)$$ is [TeX:] $$2^{-\kappa / 2}$$-Hellinger close pair. More formally, they proved the following lemma.
Lemma 6 (Theorem 5 in [39]): Let [TeX:] $$\Pi^{D_1}$$ be a cryptographic primitive with black box access to a distribution [TeX:] $$D_1 \text { and } G^{D_1}$$ be a decision security game regarding [TeX:] $$\Pi^{D_1}$$. Suppose that [TeX:] $$(D_1, D_2)$$ is [TeX:] $$2^{-\kappa / 2}$$-Hellinger close pair. Then, if [TeX:] $$\Pi^{D_1}$$ achieves [TeX:] $$\kappa \text {-bit }$$ security, then [TeX:] $$\Pi^{D_2}$$ satisfies [TeX:] $$\kappa-6.847 \text {-bit }$$ security.
Now, with this lemma, we can show that adding a noise from appropriate discrete Gaussian distribution achieves the goal of noise flooding technique as follows: Let [TeX:] $$D_1$$ and [TeX:] $$D_2$$ be discrete Gaussian with the standard deviation [TeX:] $$\sigma^{\prime}$$ centered at zero and [TeX:] $$\mathbf{e}$$, respectively. Then, from the above lemma with [TeX:] $$\Pi^{D_i}$$ as a designated zk-SNARK from RLWE with black box access to [TeX:] $$D_i$$, it suffices to show that [TeX:] $$H\left(D_1, D_2\right) \leq 2^{-\kappa / 2}$$, which results in [TeX:] $$G^{D_2}$$, a security game for the zero-knowledge (Definition 8), is [TeX:] $$\kappa \text {-bit }$$ secure, i.e., the advantage of adversary is less than [TeX:] $$2^{-\kappa}$$ (note that [TeX:] $$G^{D_1}$$ is already [TeX:] $$\geq \kappa+6.847$$ secure since it does not have any information on the error term [TeX:] $$\mathbf{e}$$).
The following lemma provides a sufficient size of [TeX:] $$\sigma^{\prime}$$ in the above argument given that [TeX:] $$\|\mathbf{e}\| \leq B \text {. }$$
Lemma 7: Let P and Q be discrete Gaussian distributions with the standard deviation [TeX:] $$\sigma^{\prime}$$ centered at zero and y, respectively, such that [TeX:] $$|y| \leq B$$. Then, it satisfies that
Proof: We will regard P,Q as continuous Gaussian distributions because [TeX:] $$\sigma^{\prime}$$ that we will use is sufficiently large. Then, from the definition of Gaussian distribution and Hellinger distance,
The integral can be converted as follows.
Using the fact that [TeX:] $$\int_{\mathbb{R}} \exp \left(-c u^2\right) d u=(c / \pi)^{-1 / 2}$$ for all c > 0, we obtain
Substituting this to the first equation gives the claim.
Now, to satisfy [TeX:] $$\kappa \text {-bit }$$ security in zero-knowledge (i.e., [TeX:] $$\epsilon_{z k}=2^{-\kappa}$$ in Definition 8), it requires that
In other words,
Parameters. Let [TeX:] $$\kappa$$ be the statistical parameter and B the size of the error term in the final encodings (before noise flooding). To achieve the [TeX:] $$\kappa \text {-bit }$$ security, it suffices to set [TeX:] $$\sigma^{\prime}$$ as above (4). Then, the remaining part is to choose q such that q/2p is bigger than [TeX:] $$\Omega\left(\sigma^{\prime}\right)$$ to achieve the correctness. On the other hand, according to the previous analysis that exploited the statistical distance as a measure of closeness of two distributions, [TeX:] $$\sigma^{\prime}$$ is approximately set to [TeX:] $$\Omega\left(2^\kappa B\right)$$, which implies that [TeX:] $$q / 2 p \geq \Omega\left(2^\kappa B\right)$$. Consequently, in our tight analysis based on the Hellinger distance, q is polynomial in [TeX:] $$\kappa$$ rather than [TeX:] $$\exp (\kappa)$$ as in the conventional analysis.
More specifically, in later Section IV, we will present improved concrete parameters due to the analysis with the Hellinger distance, and estimate the size of proof based on the improved parameters.
D. Security Proofs
Theorem 2: Let [TeX:] $$\kappa$$ be the statistical security parameters, and be the security parameter. Let [TeX:] $$N=N(\lambda), q=q(\lambda)$$ and [TeX:] $$\sigma=\sigma(\lambda)$$ be RLWE parameters in Lemma 3 satisfying that [TeX:] $$B=8 p \sigma \sqrt{m+N \log q+2 \lambda+3},$$ where m is the number of wires in target circuit C. Assume that our RLWE encoding scheme (Definition 2) is a Linear-Only Encoding scheme (Definition 3). Then, for the circuit C, the scheme described in Section III-B is a designated zk-SNARK (Definition 10).
Clearly, it is straightforward to prove the completeness. Moreover, our scheme consists of three RLWE encodings which are polynomial size in , and the verification procedure takes polynomial time in , so the succinctness also holds.
We now introduce a leftover hash lemma [47] which is necessary to prove zero-knowledge property.
Lemma 8 (Specialized leftover hash lemma): For nonnegative integers n, q, 2, t and real number [TeX:] $$\epsilon \text {, }$$ if [TeX:] $$\mathbf{A} \leftarrow \mathbb{Z}_q^{n \times t}$$ and [TeX:] $$\mathrm{r} \leftarrow \mathbb{Z}_2^t,$$ then we have
where [TeX:] $$\mathbf{A} \cdot \mathbf{r}$$ is computed in [TeX:] $$\mathbb{Z}_q$$ and [TeX:] $$\mathbf{u} \leftarrow \mathbb{Z}_q^n$$. Thus, if [TeX:] $$t\gt N \log q+2 \log (1 / \epsilon),$$ then two distributions are [TeX:] $$\operatorname{SD}((\mathbf{A}, \mathbf{A}\mathbf{r}),(\mathbf{A}, \mathbf{u})) \leq \epsilon.$$
Lemma 9 (Zero-knowledge): The protocol has zero knowledge under the parameters in Theorem 2.
Proof: We build a simulated proof [TeX:] $$\pi^{\prime}$$ that follows the same distribution as a proof . The algorithm comprises two steps; constructing elements and generating RLWE ciphertexts using crs. First, choose [TeX:] $$A^{\prime}, B^{\prime} \leftarrow R_p$$, and compute
Then, using crs, we can generate three RLWE ciphertexts [TeX:] $$\operatorname{Enc}\left(A^{\prime}\right), \operatorname{Enc}\left(B^{\prime}\right), \text { and } \operatorname{Enc}\left(C^{\prime}\right)$$ and output a proof [TeX:] $$\pi^{\prime}=(\operatorname{Enc}(A), \operatorname{Enc}(B), \operatorname{Enc}(C)).$$ Then, the simulated proof can pass the verification (3).
As the last step, we need to prove that , and [TeX:] $$\pi^{\prime}$$ are statistically or computationally indistinguishable. Each encoding in , and [TeX:] $$\pi^{\prime}$$ consists of a pair ([TeX:] $$\mathbf{a}, \mathbf{b}$$ mod qR) with [TeX:] $$\mathbf{b}=\mathbf{a} \cdot \mathbf{s}+\mathbf{e}+\frac{q}{p} \mathbf{m}$$. By the leftover hash lemma 8, the first component of any encoding looks like a random element in [TeX:] $$R_q \cong \mathbb{Z}_q^N$$. More precisely, every element in [TeX:] $$R_q$$ can be regarded as a vector in [TeX:] $$\mathbb{Z}_q^N$$, so we apply the lemma 8 to randomize the first component of each encoding when [TeX:] $$N \log q+2 \lambda$$ encodings of zero are provided.
On the other hand, the noise flooding technique in Lemma 5 shows that [TeX:] $$\mathrm{e}$$ is independent of any witness since the error term looks like a random element. Therefore, two proofs are indistinguishable.
Lemma 10 (Knowledge soundness): The protocol has knowledge soundness under the parameters in Theorem 2.
Proof: Suppose that there exists an adversary [TeX:] $$\mathcal{A}$$ which can break knowledge soundness with a non-negligible probability. We will construct a knowledge extractor [TeX:] $$\mathcal{X}$$ based on [TeX:] $$\mathcal{A}$$.
Let [TeX:] $$\pi=(A(r), B(r), C(r))$$ be a tuple of RLWE ciphertexts. Then, [TeX:] $$\mathcal{A}$$ which allows affine computations can obtain follows.
where [TeX:] $$A_\alpha, A_\beta, A_\delta,\left\{A_i\right\}_{i=n+n^{\prime}+1}^m$$ are scalars in [TeX:] $$R_p$$ and [TeX:] $$A(r), A_{h(r)}$$ are polynomials of degree d with coefficients in [TeX:] $$R_p$$. Similarly we obtain representations about B and C.
Our construction allows slot-wise computations by ring operations, and the verification (3) can be considered as slotwise computations, i.e., independent computations on each [TeX:] $$\mathbb{Z}_p.$$ Note that a verifier in our protocol outputs accept when the equation holds for all slots, and it is enough to show slot-wise knowledge soundness.
Since [TeX:] $$\mathcal{A}$$ can break the soundness, [TeX:] $$\mathcal{A}$$ can pass the verification equation on each slot. For simplicity, we use a notation tilde to denote slot-wise results. Then, for each slot, the verification equation is considered as follows.
Moreover, after [TeX:] $$\mathcal{A}$$ computes affine operations rings, [TeX:] $$\mathcal{A}$$ can obtain equations
where tilded elements are included in a finite field [TeX:] $$\mathbb{Z}_p$$ for some prime p. Similarly, [TeX:] $$\mathcal{A}$$ obtains representations about [TeX:] $$\widetilde{B} \text { and } \widetilde{C}$$.
Now, we reconsider the random elements [TeX:] $$\widetilde{\alpha}, \widetilde{\beta}, \widetilde{\delta}$$ as formal variables. Then, [TeX:] $$\widetilde{A} \widetilde{B}$$ contains formal variables [TeX:] $$\widetilde{\alpha}^2, \widetilde{\beta}^2$$ and [TeX:] $$1 / \widetilde{\delta}^2$$, but they are not included in the right-hand side of the verification (5) in each slot. Thus, for passing the verification process, [TeX:] $$\widetilde{A}_\alpha \widetilde{B}_\alpha \alpha^2$$ must be the zero, which implies [TeX:] $$\widetilde{A}_\alpha \text { or } \widetilde{B}_\alpha$$ is zero. Without loss of generality, we assume that [TeX:] $$\widetilde{B}_\alpha=0$$. Similarly, we compare the coefficients of [TeX:] $$\widetilde{\alpha} \widetilde{\beta} \text { with } \widetilde{\beta}^2 \text {. }$$
Thus, it holds that [TeX:] $$\widetilde{A}_\alpha \widetilde{B}_\beta+\widetilde{A}_\beta \widetilde{B}_\alpha=\widetilde{A}_\alpha \widetilde{B}_\beta=1$$ and [TeX:] $$\widetilde{A}_\beta \widetilde{B}_\beta=0.$$ Without loss of generality, we also assume that [TeX:] $$\widetilde{A}_\alpha=\widetilde{B}_\beta=1$$ and [TeX:] $$\widetilde{B}_\beta=0.$$ For a coefficient of [TeX:] $$1 / \delta^2$$, we observe that
Moreover, for the coefficients of [TeX:] $$\widetilde{\alpha} / \widetilde{\delta} \text { and } \widetilde{\beta} / \widetilde{\delta}$$, we observe that
Since [TeX:] $$\widetilde{A}_\alpha=\widetilde{B}_\beta=1$$ and [TeX:] $$\widetilde{A}_\beta=\widetilde{B}_\alpha=0 \text {, }$$ each component of a coefficient term of [TeX:] $$1 / \widetilde{\delta}^2$$ is zero. Hence, [TeX:] $$\widetilde{A} \text { and } \widetilde{B}$$ are rewritten as follows.
Moreover, it holds that
Thus, the verification equation (5) implies that
since [TeX:] $$\widetilde{\delta}$$ are considered as a formal variable. Moreover, since [TeX:] $$\widetilde{\alpha} \text { and } \widetilde{\beta}$$ are also formal variables, we also observe that
We set [TeX:] $$\tilde{a}_i=C_i$$ for [TeX:] $$i \in\left\{n+n^{\prime}+1, \cdots, m\right\} \text {. }$$ Then, it holds that [TeX:] $$\widetilde{B}(r)=\sum_{i=0}^m \tilde{a}_i \widetilde{v}_i(r)$$ and [TeX:] $$\widetilde{A}(r)=\sum_{i=0}^m \widetilde{a}_i \widetilde{u}_i(r).$$ Moreover, for a variable r, we also observe that [TeX:] $$\widetilde{A}(r) \widetilde{B}(r)=\sum_{i=0}^m \widetilde{a}_i \widetilde{w}_i(r)+\widetilde{h}(r) \widetilde{t}(r) \text {, }$$ which implies that for each slot, the set [TeX:] $$\left\{\widetilde{a}_i\right\}_{i=n+n^{\prime}+1}^m=\left\{\widetilde{C}_i\right\}_{i=n+n^{\prime}+1}^m$$ is a witness of the statement [TeX:] $$\left\{\widetilde{a}_i\right\}_{i=1}^{n+n^{\prime}}.$$ The slot-wise knowledge soundness completes the knowledge soundness of our construction.
IV. PROOF SIZE ESTIMATION
We now estimate the size of proof of our designated zk- SNARK from RLWE. First, we provide concrete parameters of our protocol for circuits with [TeX:] $$2^{16}$$ gates for achieving the 110, 128, and 164-bit security, respectively. Due to the fancy analysis with respect to the Hellinger distance (equivalently, Rènyi divergence of order 1/2), concrete parameters improve considerably. Specifically, we describe the size of the proof of our scheme and then compare it with that of previous works.
Concrete parameters. We set the parameters to satisfy the following.
Our designated zk-SNARK has 164-bit security estimated by Albrecht et al’s LWE security estimator [48] with the reduction_cost_model=BKZ.sieve cost model.6 With this model, the parameters of the previous work only satisfy 110-bit security, but not 164-bit security that was claimed by authors. Thus, we provide several types of parameter suggestions as follows.
6 After we submitted this paper, a new estimator, called lattice-estimator, was published. However, we still use a previous estimator, named LWE-estimator because of the consistency of this paper.
- For fair comparison, the bit-size of the message space, and other parameters related to circuits are the same as previous work [23], [24].
- We provide new parameters satisfying 164-bit security that previous work desired.
- We also provide a parameter achieving the 128-bit security to compare our amortized proof size and the smallest proof size of the group based zk-SNARK [9].
To make a fair and easy comparison with previous work, we follow the way of selecting parameters in the previous papers as much as possible.
Let N, q and [TeX:] $$\sigma=\alpha q$$ be parameters of RLWE instances, and p be a 32-bit prime such that p = 1 mod 2N. A tight analysis based on the Hellinger distance instead of statistical distance loss 6.847 bit security [39]. In other words, to satisfy 32-bit statistical security that is the same as previous one, we need to consider parameters which require 39-bit statistical robust.
More precisely, for fair comparisons with previous work, we consider an arithmetic circuit with at most [TeX:] $$2^{16}$$ gates and [TeX:] $$d=2^{15}$$ multiplication gates, which can cover many example applications such as the SHA-256 evaluation. Then, setting a tailcut parameter [TeX:] $$T=8, B=|e|$$ is [TeX:] $$8 p \sigma \sqrt{m+t+3}$$ where m is the number of wires in ring-QAP and [TeX:] $$t=N \log q+2 \lambda \text {. }$$ Furthermore, [TeX:] $$\sigma^{\prime}$$ should satisfy that
Finally, we set [TeX:] $$m=2^{22}$$ as in [24] so that [TeX:] $$\sigma^{\prime} \approx 2^{19}$$. B and [TeX:] $$8 \sigma^{\prime}\lt q / 2 p$$ for the correctness (of encoding scheme). Then, it holds that
For readability, we list the parameters of our zk-SNARK in Table I with various security parameter .
CONCRETE PARAMETERS OF OUR DESIGNATED ZK-SNARK FROM RLWE. HERE d IS THE NUMBER OF MULTIPLICATIVE GATES. WE FIX T = 8 AND [TeX:] $$d=2^{15}$$ FOR FAIR COMPARISON.
Interestingly, the Hellinger distance provides a significant practical improvement independent of our introduction of RLWE encoding. Moreover, with our RLWE encoding and ring-QAP, our protocol is much more efficient in the amortized sense than previous zk-SNARKs from SSPs and QAPs. For the same circuit satisfiability, Gennaro et al. [23] and Naganuma et al. [24] chose LWE parameters ([TeX:] $$N, \log \alpha, \log q$$) as (1400,−180, 736) for 110-bit security. On the other hand, we choose RLWE parameters (2048,−98, 160) for achieving the same security. Thus, we reduce not only the size of an encoding in amortized sense but also the size of a single encoding.
Proof size. We can now estimate the size of the proof for our scheme. Our proof comprises three RLWE encodings, and the size of each encoding is about 2N log q bits because of [TeX:] $$R_q=\mathbb{Z}[X] /\left(X^N+1\right)$$ is the encoding space. Then, our encoding has the size [TeX:] $$2048 \times 260 \text { bits } \approx 113.1 \mathrm{~KB}$$ and the proof size is about 399.4 KB under 164-bit security since the proof consists of three encodings. On 110-bit security, we can see that our scheme has about 276.5 KB of proof size which is smaller than all previous work [16], [23], [24] from lattices, e.g., Nitulescu [16] has 270 KB of proof, which is the smallest among previous lattice-based work.
If we consider the amortized proof size, our scheme is even comparable to the best result from the previous zk-SNARKs (without post-quantum security). More precisely, since our scheme allows N verification simultaneously for each proof and our proof size is about 284.2 KB under 128-bit security, the size of amortized proof is only 156 bytes with N = 2048 and it is almost the same as 138 bytes of Groth [9] which has the shortest proof size among all zk-SNARKs. The proof size for each scheme is summarized in Table 2.
COMPARISON OF PROOF SIZE OF EACH ZK-SNARKS.
7 In original proposal, [16] is relied on linear-targeted malleability assumption, a weaker assumption than linear-only assumption. However, to achieve zk-SNARK, it also requires linear-only assumption or a similar one with efficient extractor.
8 Here, we assume the evaluation circuit of SHA-256, which corresponds to an arithmetic circuit with [TeX:] $$2^{16}$$ gates or less [23].
9 With bn-128 curve, https://github.com/zcash/zcash/issues/2465
Size of common reference string. In lattice-based zk- SNARK, the common reference string (crs) is composed of encodings for proving circuit evaluation and for leftover hash lemma (for zero-knowledge). In our proposal, the number of encodings in crs is the same as that in [24] which built a lattice-based zk-SNARK from QAP as ours. One difference is that our encoding from RLWE has 2N log q-bits which can be reduced to [TeX:] $$N \log q=2048 \cdot 180 \text {-bits }$$ with pseudorandom generator, while the one from LWE in [24] has [TeX:] $$\log q^{\prime}=736$$ bits (when reduced similarly with pseudorandom generator). When we consider the size of crs in amortized sense (with N amortization), however, the size of each encoding in our crs can be [TeX:] $$\approx \log q=180 \text { bits }$$ bits which is much smaller than that of [24].
Prover complexity. While our focus is on reducing the (amortized) proof size of the zk-SNARK as other work in the literature, we can also compare the prover/verifier complexity of our work with the previous works. Note that our SNARK requires ring multiplications over [TeX:] $$\mathbb{Z}_q[X] /\left(X^N+1\right)$$ which may cost [TeX:] $$\Theta\left(N^2\right)$$ operations over [TeX:] $$\mathbb{Z}_q$$ while previous SNARKs from LWE requires constant multiplications over [TeX:] $$Z_q^{N^{\prime}}$$ which costs [TeX:] $$\Theta\left(N^{\prime}\right)$$ only. We remark that this problem can be mitigated by applying Number Theoretic Transform to our solution, which can reduce the cost to be [TeX:] $$\Theta(N \log N)$$ (in this case, we must take the ciphertext modulus [TeX:] $$q \equiv 1 \bmod N$$ so that the ciphertext space [TeX:] $$R_q \cong \mathbb{Z}_q^N$$). Then, our prover/verifier complexity can be roughly logN in amortized sense — it is now better than the previous work having N — given that we utilize the full batch N for the proof.
Extension to other circuits. We believe that our conversion and analysis can be applied to previous zk-SNARKs from SSP and SAP beyond the QAP. In particular, if someone wants to convert a SAP based zk-SNARK from LWE to RLWE assumption for achieving more smaller proof rather than our QAP based zk-SNARK, then, under the 128-bit security, he/she can obtain that 213 KB proof size, and it could be regarded as 104 bytes proof size for a single verification due to the amortized sense. However, as Naganuma et al. [24] mentioned, the scheme might be less efficient than QAP based zk-SNARK.
V. EXPERIMENTAL RESULTS
Experimental setup. We implement our new lattice-based designated zk-SNARK and present the experiment results for our protocol. On implementation, we adopted libnsark library [49] for the zk-SNARK part and Microsoft SEAL library [50] for the RLWE encoding part then integrated them.10 Our experiments were conducted on Linux Ubuntu 22.04.01 LTS with AMD EPYC 7502 CPU and 32 GB memory.
10 To this end, we made some minor changes in each library, e.g., SEAL only supports a maximum 54-bit coefficient modulus space for N = 2048, while we require at least 180-bit.
In our experiment, we generated a random circuit with [TeX:] $$2^{15}$$ multiplicative gates and [TeX:] $$2^5$$ number of inputs, which can also cover the SHA-256 evaluation. Then, we measured the proof generation and verification time under the various security parameters given in Table I.
TIMING RESULTS WITH [TeX:] $$T=8, d=2^{15},$$ AND NUMBER OF INPUTS [TeX:] $$2^5$$
Prover time. Table III presents the proof generation time for each parameter. In our implementation, the main operation for prover is a linear combination between RLWE encoding and a ring element (instead of multi-exponentiation in other zk- SNARKs). In Table III, it takes about 7 seconds to generate a proof under the parameter with = 128. For simplicity, we measured the time for generating a proof with only one instance, while an RLWE encoding supports batching multiple proofs by nature. More specifically, the RLWE encoding with N = 2048 and log q = 208 can have 2048 messages simultaneously, and thus the amortized time for generating a proof for one instance is about 3.5 milliseconds.
Verifier time. Table III presents the verification time. As expected by the (3) (in Section III-B), the verifier complexity is independent of the circuit size and only depends on the number of inputs. According to our experiment, it takes about 11ms to verify a proof with the number of inputs [TeX:] $$2^5$$, and amortized time for verifying a proof is about 0.005 ms.