This is the server implementing the second revision of the Cloudflare blinded tokens protocol. For a description of the original protocol and motivations, see the challenge bypass specification or our talk at Real World Crypto 2017.
The protocol is based on a variant of an OPRF password management scheme by Jarecki, Kiayias, Krawczyk and Xu. When adapted to our needs, this scheme allows us to achieve the same goals using faster primitives, less bandwidth, and simpler secret-key operational logistics compared to the earlier RSA-based protocol.
- George Tankersley ([email protected])
- Alex Davidson ([email protected])
- Nick Sullivan ([email protected])
- Filippo Valsorda ([email protected])
We'd like to thank Dan Boneh for suggesting OPRFs in the first place; Ian Goldberg for his extensive advice and the batch proof; and Brian Warner, Zaki Manian, Tony Arcieri, Isis Lovecruft, Henry de Valence, Trevor Perrin, and several anonymous others for their valuable help, input, and review.
To run the server:
go run server/main.go --key testdata/p256-key.pem --comm testdata/test-p256-commitment
To demo token issuance:
cat testdata/bl_sig_req | nc localhost 2416
For a full client implementation, and further details on required message formatting and technical considerations, see the browser extension.
- Signs blinded tokens and generates valid batch DLEQ proof
- Verifies redemption tokens
A message authentication code (MAC) on a message is a keyed authentication tag that can be only be created and verified by the holder of the key.
A pseudorandom function is a function whose output cannot be efficiently distinguished from random output. This is a general class of functions; concrete examples include hashes and encryption algorithms.
An oblivious pseudorandom function (OPRF) is a two-party protocol between sender S and receiver R for securely computing a pseudorandom function f_x(·) on key x contributed by S and input t contributed by R, in such a way that receiver R learns only the value f_x(t) while sender S learns nothing from the interaction.
In this protocol, the Cloudflare edge is the "sender" holding x and the inputs t are the tokens. So the clients don't learn our key and we don't learn the token values.
The core difference is that where we previously relied on asymmetric cryptography for signatures, we can instead construct a symmetric exchange in which a secret key known only to the edge is used to create per-token MAC keys for each redemption request in a way that prevents the server from learning the token values and the client from learning the secret key. A full technical overview of this protocol can be read here.
Given a group setting and three hashes H_1, H_2, H_3 we build a commitment to a random token per request using a secret key x held by the edge servers. H_1 and H_2 are hash functions onto, respectively, the group and {0, 1}^λ where λ is a security parameter. The server also publishes publically a generator g along with a commitment (or 'public key') y = g^x, this is used for generating discrete-log equivalence proofs (DLEQs). We provide a brief overview of the how DLEQs are generated and verified below.
- Client generates random token
x
and a blinding factorr
. - Client calculates
p = H_1(t)^r
and sendsp
to the edge along with a CAPTCHA solution. - Edge validates the solution and computes
q = p^x = H_1(t)^(rx)
and a DLEQ proofπ
dependent on the output ofH_3
overg,y,p,q
and other group elements -- see below for more details. - The server returns
(q,π)
to the client. - Client unblinds b to retrieve
n = q^(1/r) = H_1(t)^x
. Now both the server and the client can calculateH_2(t, n)
as a shared key for the MAC. - When the client wants to redeem a token it presents
(t, MAC(request-binding-data))
whererequest-binding-data
is made of information observable by the edge that is unique(ish) to that particular request. - The server uses
t
as a double-spend index and recalculatesn
usingx
. Then it can validate the MAC using the shared key. - We know that a matching commitment value is valid because generating it requires access to
x
.
In step (3.) above, we call for a zero-knowledge proof of the equality of a discrete logarithm (our edge key) with regard to the returned token q
that the client receives. This allows the client to verify that the same key x
is being used to 'sign' all tokens that are sent to the edge.
The protocol naturally provides q = p^x in the edge response. To ensure that the edge has not used unique x value to tag users, we require them to publish a public key, y = g^x, as mentioned above. If p, q are orthogonal we can use a Chaum-Pedersen proof [CP93] to prove in zero knowledge that log_p(y) == log_p(q). We note this as DLEQ(q/p == y/g).
The proof follows the standard non-interactive Schnorr pattern. For a group of prime order Q with orthogonal generators p, g, public key t, and point q:
-
Prover chooses a random nonce
k <--$-- Z/QZ
-
Prover commits to the nonce with respect to both generators
a = g^k, b = p^k
-
Prover constructs the challenge
c = H_3(g,y,p,q,a,b)
-
Prover calculates response
s = k - cx (mod q)
-
Prover sends (c, s) to the verifier
-
Verifier recalculates commitments
a' = g^s + y^c b' = p^s + q^c
-
Verifier hashes
c' = H_3(g,y,p,q,a',b')
and checks that c == c'.
If all users share a consistent view of the tuple (y, g) for each key epoch, they can all prove that the tokens they have been issued share the same anonymity set with respect to k. One way to ensure this consistent view is to pin a key in each copy of the client and use software update mechanisms for rotation. A more flexible way is to pin a reference that allows each client to fetch the latest version of the key from a trusted location. We currently use the former method but plan to migrate to the latter in the near future.
In practice, the issuance protocol operates over sets of tokens rather than just one. A system parameter, m, determines how many tokens a user is allowed to request per valid CAPTCHA solution. Consequently, users generate (t_1, t_2, ... , t_m) and (b_1, b_2, ... , b_m); send (p_1, p_2, ... , p_m) to the edge; and receive (q_1, q_2 ... , q_m) in response.
Generating an independent proof of equality for each point implies excess overhead in both computation and bandwidth consumption. Therefore, we employ a batch proof to show consistent key usage for an entire set of tokens at once. The proof is a parallelized Schnorr protocol for the common-exponent case taken from [Hen14] and adapted for non-interactivity:
Given (g, y, q); (p_1,...,p_m), (q_1, ... ,q_m); q_i = k(p_i) for i = 1...m
-
Prover calculates a seed using a Fiat-Shamir transform:
z = H_3(g, y, p_1, ... , p_m, q_1, ... , q_m)
-
Prover initializes PRNG(z) and samples from it to non-interactively generate
c_1, ... , c_m <--$-- Z/qZ.
-
Prover generates composite group elements M and Z
M = (c_1*p_1) + (c_2*p_2) + ... + (c_m*p_m) Z = (c_1*q_1) + (c_2*q_2) + ... + (c_m*q_m)
-
Prover sends proof
(c, s) <-- DLEQ(M/Z == y/g)
-
Verifier recalculates the PRNG seed from protocol state, generates the composite elements, and checks that c' == c as in the single-element proof above.
We can see why this works in a reduced case.
For (p_1, p_2), (q_1, q_2), and (c_1, c_2):
q_1 = x(p_1)
q_2 = x(p_2)
(c_1*q_1) = c_1(x*p_1) = x(c_1*p_1)
(c_2*q_2) = c_2(x*p_2) = x(c_2*p_2)
(c_1*q_1) + (c_2*q_2) = x[(c_1*p_1) + (c_2*p_2)]
So the composite points will have the same discrete log relation x as the underlying individual points.
- 10x savings in token size (~256 bits instead of ~2048)
- Simpler & faster primitives (also: available in-browser via SJCL)
- No need for public-key encryption at all, since the derived shared key used to calculate each MAC is never transmitted and cannot be calculated without knowledge of the edge key or the client's blinding factor.
- The only secret to be managed is a 32-byte scalar.
- Easier key rotation. Instead of managing RSA certificates with pinning or transparency, we can publish/pin the commitment component of a DLEQ proof to allow clients to positively verify they're in the same anonymity set with regard to k as everyone else. Alternatively or additionally, if we publish historical k values then auditors who save their b results can check our honesty.
See the Cloudflare blog post (to be written)
See the spec here.