(Un)linkable Pseudonyms for Governmental Databases

Jan Camenisch, IBM Research – Zurich

[email protected]

Anja Lehmann, IBM Research – Zurich

[email protected]

ABSTRACT

When data maintained in a decentralized fashion needs to be synchronized or exchanged between different databases, related data sets usually get associated with a unique identifier. While this approach facilitates cross-domain data exchange, it also comes with inherent drawbacks in terms of controllability. As data records can easily be linked, no central authority can limit or control the information flow. Worse, when records contain sensitive personal data, as is for instance the case in national social security systems, such linkability poses a massive security and privacy threat. An alternative approach is to use domain-specific pseudonyms, where only a central authority knows the cross-domain relation between the pseudonyms. However, current solutions require the central authority to be a fully trusted party, as otherwise it can provide false conversions and exploit the data it learns from the requests. We propose an (un)linkable pseudonym system that overcomes those limitations, and enables controlled yet privacy-friendly exchange of distributed data. We prove our protocol secure in the UC framework and provide an efficient instantiation based on discrete-logarithm related assumptions.

Categories and Subject Descriptors

D.4.6 [Security and Protection]: Cryptographic control; H.2.7 [Database Administration]: Security, integrity, and protection; K.4.1 [Public Policy Issues]: Privacy

Keywords

pseudonyms; unique identifier; unlinkability; databases

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]’15, October 12–16, 2015, Denver, Colorado, USA. Copyright is held by the owner/author(s). Publication rights licensed to ACM. ACM 978-1-4503-3832-5/15/10 ...$15.00. DOI: http://dx.doi.org/10.1145/2810103.2813658.

1. INTRODUCTION

When data is collected and maintained on a large scale, that data often does not reside in a single database but is distributed over several databases and organisations, each being responsible for a particular aspect in the overall system. To still allow for collaborative operations and data exchange among the different entities, related data is then indexed with an identifier that is unique in the entire system.

An important example of such a distributed setting is a state-controlled social security system maintaining various sets of personal data. Therein, users can interact with different entities, such as different health care providers, public and private pension funds, or tax authorities. All these entities can act autonomously in their respective domains and keep individual records for the users they interact with. In certain scenarios, the different entities also have to exchange data about particular users. For instance, assume a health care provider offers special discounts for people with low income and tax authorities store information about users’ salaries. Then, to verify whether a user is eligible for a discount, the health care system together with the tax authority should be able to check if the user satisfies the criteria.

Global Identifiers.

Currently, probably the most prominent approach to enable such decentralized data management is to use unique global identifiers among all entities. In the context of social security systems, this is for instance implemented in the US, Sweden, and Belgium. Here, each citizen gets assigned a unique and nation-wide social security number. The advantage of this approach is that it naturally allows all entities within the system to correlate their individually maintained records. However, having such a unique identifier for each user is also a significant privacy threat: when data is lost or stolen, any adversary obtaining the data can use the unique identifier to link all the different data sets together. Also, interactions of users with different entities become easily traceable.

Thus, the impact of security breaches is rather severe, which in turn makes the data maintained by the individual entities a lucrative target for data thieves. In addition, as all entities can trivially link their records, the data exchange can hardly be controlled and authorized. However, in particular in the case of a social security system, a certain control to supervise and, if necessary, limit the data flow is usually desired. For instance, in the Belgian system a central authority called the “Crossroads Bank for Social Security” (CBSS) [13] currently serves as the hub for all data exchange. Whenever social and private entities want to exchange data based on the global identification number, they have to request explicit authorization from the CBSS, as enforced by national law. From a privacy point of view, though, this added controllability makes the system even worse, as now a central authority learns which requests are made for which user. In a social security system, those requests can themselves reveal quite sensitive information. For instance, in the example outlined above, the central authority would learn from the requests which persons suffer from health issues and probably have low or no income, even if it has no access to the health and tax records itself. Also, in terms of security, this approach still assumes that all entities behave honestly and do not correlate their records without approval of the central authority.

Pseudonyms & Controlled Conversion.

Having such a central authority actually allows for a more privacy-friendly solution. Namely, a central authority (that we call converter) could derive and distribute entity-specific identifiers (aka pseudonyms), in a way that pseudonyms known by different entities can only be linked with the help of the converter. Thus, it would then even be technically enforced that different entities have to request permission from the converter, as without its help they would not be able to connect their records anymore. Of course, the latter argument only holds if the data sets maintained by the entities do not contain other unique identifying information which allows linkage without using the pseudonyms.

Such a pseudonymous identification system clearly improves the controllability of the data exchanges and also avoids imposing a unique identifier that makes the user traceable by default. Both are significant advantages compared with the solution where only a single global identifier is used throughout the entire system. However, as the converter is now required in every request, it becomes a powerful entity that still must be trusted not to exploit the information it gathers.

Existing Solutions.

In existing solutions, the need to fully trust the converter seems in fact inherent. A similar pseudonymous framework using a central converter is for instance described by Galindo and Verheul [20]. Therein, the converter computes a pseudonym $nym_{i,A}$ based on main identifier $uid_i$ and for server $S_A$, as $nym_{i,A} = \mathrm{Enc}(k_A, uid_i)$, where $\mathrm{Enc}$ is a block cipher and $k_A$ a symmetric key that the converter has chosen for $S_A$, but is only known to the converter. When an entity $S_A$ then wishes to request some data for $nym_{i,A}$ from another entity $S_B$, it sends the pseudonym to the converter. The converter then decrypts $nym_{i,A}$ to obtain $uid_i$ and derives the local pseudonym $nym_{i,B}$ by computing $nym_{i,B} = \mathrm{Enc}(k_B, uid_i)$ for the key $k_B$ it had chosen for $S_B$. Thus, here the converter is necessarily modeled as a trusted third party, as it always learns the generated pseudonyms, the underlying $uid_i$ and also has full control over the translations it provides (i.e., a corrupt converter could transform pseudonyms arbitrarily).
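To make the trust problem concrete, the following is a minimal Python sketch of a converter in the style just described. It is not the construction of [20] itself: the toy four-round Feistel cipher built from HMAC-SHA256 merely stands in for the block cipher $\mathrm{Enc}$, and the names `TrustedConverter`, `nymgen` and `convert` are hypothetical. The point to observe is that `convert` necessarily recovers $uid_i$ in the clear.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes; uids are zero-padded to this length


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Feistel round function: HMAC-SHA256 truncated to half a block.
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[: BLOCK // 2]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def enc(key: bytes, block: bytes) -> bytes:
    # Toy stand-in for Enc(k, uid); NOT a vetted cipher.
    left, right = block[: BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def dec(key: bytes, block: bytes) -> bytes:
    # Inverse of enc: undo the rounds in reverse order.
    left, right = block[: BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


class TrustedConverter:
    def __init__(self, server_keys: dict[str, bytes]):
        self._keys = server_keys  # k_A, k_B, ...: known only to the converter

    def nymgen(self, uid: str, server: str) -> bytes:
        raw = uid.encode()
        if len(raw) > BLOCK:
            raise ValueError("uid does not fit into one block")
        return enc(self._keys[server], raw.ljust(BLOCK, b"\0"))

    def convert(self, nym: bytes, src: str, dst: str) -> bytes:
        uid_block = dec(self._keys[src], nym)  # the converter sees uid_i here
        return enc(self._keys[dst], uid_block)


conv = TrustedConverter({"S_A": b"key-for-server-A", "S_B": b"key-for-server-B"})
nym_a = conv.nymgen("uid-0042", "S_A")
nym_b = conv.convert(nym_a, "S_A", "S_B")
```

Conversion is consistent with generation, but only because the converter decrypts every pseudonym it is asked to translate.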

Another example is the Austrian eID system [12], which is one of the few eID solutions that allows one to derive entity-specific pseudonyms from the unique social security number. However, it currently only supports that unlinkable pseudonyms are created by the users themselves, but it does not consider a central authority that can provide a conversion service on a large scale. It is easy to imagine, though, how such a converter could be realized. Roughly, a pseudonym $nym_{i,A}$ is computed as $H(\mathrm{Enc}(k, uid_i) \,\|\, S_A)$, i.e., the encrypted main user identifier $uid_i$ and the identifier of the respective entity $S_A$ are concatenated and the hash value of both yields the pseudonym. Here, the key $k$ is a global key that is used for all pseudonyms, but is again only known to the converter. In order to enable conversions between pseudonyms, the converter could simply keep a table with the related hash values and then perform the conversion based on looking up the corresponding value.

Hereby, the trust requirements for the converter can actually be reduced if one considers pseudonym generation and conversion as two different tasks. Then, only the entity responsible for pseudonym generation would have to know the key $k$ under which the user identifiers are encrypted, whereas the converter merely keeps the hash table with the related pseudonyms. The converter would then only know which pseudonyms belong together, but cannot determine which particular user they stand for. Thus, also during conversion, a malicious converter no longer learns the particular user for which a conversion is requested, but only his pseudonym.

However, the converter can still link all requests that are made for the same (unknown) user. As each query usually leaks some context information itself, being able to link all that information together might still allow the converter to fully identify the concrete user behind a pseudonym. For instance, regular queries for the same pseudonym to the pension fund might indicate that the person behind the pseudonym is older than 60 years, and queries to entities that are associated with a certain region such as local municipalities further reveal the place that person might live in.
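The split between pseudonym generation and table-based conversion, and the linkability that remains, can be sketched as follows. This is only an illustration under stated assumptions: HMAC-SHA256 stands in for the deterministic encryption $\mathrm{Enc}(k, uid_i)$, SHA-256 plays the role of $H$, and `TableConverter`, `register` and `request_log` are hypothetical names that do not come from the Austrian eID specification.

```python
import hashlib
import hmac


def derive_nym(k: bytes, uid: str, server: str) -> str:
    # nym = H(Enc(k, uid) || S); HMAC is an assumed stand-in for Enc(k, uid).
    token = hmac.new(k, uid.encode(), hashlib.sha256).digest()
    return hashlib.sha256(token + server.encode()).hexdigest()


class TableConverter:
    """Holds only the lookup table; it never sees k or any uid."""

    def __init__(self):
        self._rows = []        # row index -> {server: nym}
        self._row_of = {}      # nym -> row index
        self.request_log = []  # what a curious converter could record

    def register(self, nyms_by_server: dict) -> None:
        self._rows.append(dict(nyms_by_server))
        for nym in nyms_by_server.values():
            self._row_of[nym] = len(self._rows) - 1

    def convert(self, nym: str, dst: str) -> str:
        row = self._row_of[nym]
        self.request_log.append((row, dst))  # same row means same (unknown) user
        return self._rows[row][dst]


# The generation authority knows k; the converter only receives the table rows.
k = b"generation-authority-key"
servers = ["health", "tax", "pension"]
conv = TableConverter()
for uid in ["uid-1", "uid-2"]:
    conv.register({s: derive_nym(k, uid, s) for s in servers})

nym_health = derive_nym(k, "uid-1", "health")
nym_tax = conv.convert(nym_health, "tax")
nym_pension = conv.convert(nym_health, "pension")
```

Although the converter never learns `uid-1`, both entries in `request_log` carry the same row index, which is exactly the linkage that lets context information accumulate.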

Via the comparable CBSS authority in Belgium, several hundred million messages are exchanged every year, with a peak of 806 million messages in 2009. Using those values as a reference for the social security use case, one has to assume that a converter learning “only” the requests and their relation would still obtain a significant amount of context data. How context information and meta data can be leveraged to fully de-anonymize pseudonymized data sets was recently impressively demonstrated for “anonymized” credit card transactions [15] and in the Netflix and AOL incidents [21, 5].

Thus, from a privacy and a security perspective it is clearly desirable to minimize the information a converter can collect as much as possible. This means that the converter should not even learn which requests relate to which pseudonyms.

Other Related Work.

There exists a line of work on reversible pseudonymization of data records, in particular in the eHealth context, aiming at de-sensitizing patient records [1, 22, 14, 8, 24]. The main focus in these works is to derive pseudonyms from unique patient identifiers, such that the pseudonyms do not reveal any information about the patient anymore, yet allow de-anonymization by a trusted party (or a combination of several semi-trusted parties). However, in all solutions, pseudonym generation must be repetitively unambiguous to preserve the correlation between all pseudonymized records. Consequently, data exchange is trivial and does not require a converter. Thus, pseudonyms are linkable by default, whereas our approach is the opposite: pseudonyms should be unlinkable by default, yet preserve the correlation, which allows the linkage to be re-established only if necessary via a (potentially untrusted) converter.

Our Contribution.

In this paper we tackle the challenge of enabling privacy-friendly yet controlled data exchange in a decentralized system. That is, we propose an (un)linkable pseudonym system where a converter serves as central hub to ensure controllability. The converter establishes individual pseudonyms for each server derived from a unique main identifier that every user has, but without learning the derived pseudonyms. The converter is still the only authority that can link different pseudonyms together, but it does not learn the particular user or pseudonym for which such a translation is requested. The converter cannot even tell if two data exchanges were done for the same pseudonym or for two different ones. Thus, the only information the converter still learns is that a server $S_A$ wants to access data from a server $S_B$. We consider this to be the right amount of information to balance control and privacy. For instance, for the use case of a social security system, it might be allowed that the health care provider can request data from the tax authority but should not be able to access the criminal records of its registered users. Thus, there is no need to learn for which particular user a request is made, or whether several requests belong together. In our system, the converter is able to provide such access control but does not learn any additional information from the queries.

We start by formally defining the functional and security properties such an (un)linkable pseudonym system should ideally provide. Our definition is formulated in the universal composability framework, and thus comes with strong guarantees when composed with other UC secure protocols. We then describe our system using generic building blocks.

The idea of our solution is to build pseudonyms by adding several layers of randomness to the user identifier, such that they allow for consistent (and blind) conversions yet hide the contained identifier towards the servers. Roughly, to generate a pseudonym $nym_{i,A}$ for a user $uid_i$ on server $S_A$, the converter first applies a verifiable PRF on $uid_i$ and then raises the derived value to a secret exponent that it assigns for each server. The trick thereby is that those secret keys are known only to the converter, but not to the servers.

Now, consider the blind conversion procedure. It can of course be realized with a generic multiparty protocol, where the first server $S_A$ inputs the pseudonym to be converted and the converter inputs all its secret keys, and the output of the second server $S_B$ would be the converted pseudonym, provided that the input by $S_A$ was indeed a valid pseudonym. However, such a computation would be rather inefficient. We therefore aim to construct a specific protocol that achieves this efficiently.

We propose a blind conversion protocol that performs the conversion on encrypted pseudonyms, using a homomorphic encryption scheme. To transform a pseudonym from one server to another, the converter then exponentiates the encrypted pseudonym with the quotient of the secret keys of the two servers. The challenge is to make that entire process verifiable, ensuring that the conversion is done in a consistent way but without harming the privacy properties.
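The algebra behind this idea can be illustrated with a small Python sketch. Everything concrete in it is an assumption made for illustration: a toy safe-prime group ($p = 2039$, $q = 1019$) that is far too small for real use, HMAC-SHA256 squared into the subgroup as a stand-in for the verifiable PRF, ElGamal as the homomorphic encryption scheme, and a ciphertext formed under the receiving server's public key. The proofs and signatures that make the real protocol verifiable are omitted entirely.

```python
import hashlib
import hmac
import secrets

P, Q, G = 2039, 1019, 4  # toy group: P = 2Q + 1, G generates the order-Q subgroup


def prf_to_group(key: bytes, uid: str) -> int:
    # Assumed stand-in for the verifiable PRF: hash, then square into the subgroup.
    counter = 0
    while True:
        digest = hmac.new(key, f"{uid}|{counter}".encode(), hashlib.sha256).digest()
        elem = pow(int.from_bytes(digest, "big") % P, 2, P)
        if elem not in (0, 1):
            return elem
        counter += 1


def nymgen(prf_key: bytes, x_server: int, uid: str) -> int:
    # nym_{i,A} = PRF(uid_i)^{x_A}, where x_A is the converter's secret for S_A.
    return pow(prf_to_group(prf_key, uid), x_server, P)


def elgamal_keygen():
    sk = secrets.randbelow(Q - 1) + 1
    return sk, pow(G, sk, P)


def elgamal_encrypt(pk: int, m: int):
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (m * pow(pk, r, P)) % P


def elgamal_decrypt(sk: int, ct) -> int:
    c1, c2 = ct
    return (c2 * pow(c1, -sk, P)) % P  # modular inverse via negative exponent


def blind_convert(ct, x_a: int, x_b: int):
    # Raise both ciphertext components to x_B / x_A (mod Q); the plaintext
    # nym_{i,A} = PRF^{x_A} becomes PRF^{x_B} without ever being decrypted.
    delta = (x_b * pow(x_a, -1, Q)) % Q
    c1, c2 = ct
    return pow(c1, delta, P), pow(c2, delta, P)


prf_key = b"converter-prf-key"   # converter's PRF key (hypothetical value)
x_a, x_b = 123, 456              # converter's per-server exponents (hypothetical)
sk_b, pk_b = elgamal_keygen()    # key pair of the receiving server S_B

nym_a = nymgen(prf_key, x_a, "uid-42")
ciphertext = elgamal_encrypt(pk_b, nym_a)            # done by S_A
converted = blind_convert(ciphertext, x_a, x_b)      # done by the converter
nym_b = elgamal_decrypt(sk_b, converted)             # done by S_B
```

The converter only handles ciphertexts, yet the value $S_B$ decrypts equals the pseudonym that direct generation for $S_B$ would have produced.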

To ensure controllability in the sense that a server can only request conversions for pseudonyms it legitimately obtained via the converter, we also make use of a novel building block which we call dual-mode signatures. Those allow one to obtain signatures on encrypted messages, which can then be “decrypted” to a signature on the underlying plaintext message. We also provide a concrete construction for those signatures based on the recent signature scheme by Abe et al. [2], which might be of independent interest. Our dual-mode signatures can be seen as a specialized variant of commuting signatures [19], and therefore allow for more efficient schemes.

Finally, we prove that our protocol realizes our ideal functionality based on the security of the building blocks. We also provide concrete instantiations for all generic building blocks used in our construction which already come with optimizations and enhance the efficiency of our solution. When instantiated with the suggested primitives, our protocol is secure based on discrete-logarithm related assumptions.

2. SECURITY DEFINITION

In this section we first informally discuss the main entities and procedures in our (un)linkable pseudonym system and then define the desired security properties by describing how an ideal functionality would handle that task.

For the sake of simplicity, we will speak about user identifier $uid_i$, whenever we mean a unique identifier to which several distributed data sets should be related. However, it should not be misunderstood that our system is restricted to user data, as it can handle arbitrary related data sets distributed over several servers. The main entities in our system are a converter $X$ and a set of servers $S = \{S_A, S_B, \ldots\}$.

The converter $X$ is the central authority that blindly derives and converts the (un)linkable pseudonyms. More precisely, for a user identifier $uid_i$ and server identifier $S_A$, the converter can establish the server-specific pseudonym $nym_{i,A}$. However, this must be done in a way that only $S_A$ is privy to the resulting $nym_{i,A}$.

All generated pseudonyms can also be verified by the servers. In particular, if a server $S_A$ does know the underlying $uid_i$, and the converter allows for verification, it can check that $nym_{i,A}$ is indeed derived from $uid_i$. This is crucial to allow for a secure migration from an existing indexing system based on unique $uid_i$’s to our pseudonymous system. However, such a verification must be explicitly allowed by the converter. Without its approval, a server, even when knowing some $uid_i$, could not verify whether it belongs to a certain pseudonym $nym_{i,A}$ or not.

A server $S_A$ can then maintain data for some user $uid_i$ who is known to it as $nym_{i,A}$. If $S_A$ wants to access some data for the same underlying user from another server $S_B$, it must initiate a conversion request via the converter. The converter is the only entity that can transform a pseudonym $nym_{i,A}$ into $nym_{i,B}$. However, $X$ executes the conversion function in a blind manner, i.e., without learning $nym_{i,A}$, $nym_{i,B}$, the underlying $uid_i$ or even if two requests are made for the same pseudonym or not. If a conversion is granted by $X$, only $S_B$ will learn the converted pseudonym $nym_{i,B}$. The subsequent data exchange between $S_A$ and $S_B$ can be handled using the query identifier $qid$ that is used in the request and is mapped to $nym_{i,A}$ on $S_A$’s and to $nym_{i,B}$ on $S_B$’s domain.

Apart from all the privacy features, it is of course crucial that pseudonyms are generated and converted in a consistent way. More precisely, the generated pseudonyms $nym_{i,A}$ must be unique for each server domain $S_A$ and the conversion must be transitive and consistent with the pseudonym generation.

2.1 Ideal Functionality

We now formally define such an (un)linkable pseudonym system with blind conversion by describing an ideal functionality in the universal composability (UC) framework [11], which is a general framework for analyzing the security of cryptographic protocols. Roughly, a protocol is said to securely realize a certain ideal functionality $F$, if an environment cannot distinguish whether it is interacting with the real protocol or with $F$ and a simulator. A protocol that is proven to be secure in the UC framework then enjoys strong security guarantees even under arbitrary composition with other (UC secure) protocols. At the end of the section we will also discuss how the aforementioned (informal) properties are enforced by our functionality.

In this paper, we assume static corruptions, meaning that the adversary decides upfront which parties are corrupt and makes this information known to the functionality. The UC framework allows us to focus our analysis on a single protocol instance with a globally unique session identifier $sid$. Here we use session identifiers of the form $sid = (sid', X, S, U, N)$, for some converter and server identifiers $X$, $S = \{S_A, S_B, \ldots\}$ and a unique string $sid' \in \{0,1\}^*$. Further, it must hold that $|N| \geq |U|$, where $U$ denotes the space of user identifiers and $N$ the pseudonym space. We also assume unique query identifiers $qid = (qid', S_A, S_B)$ for each conversion request, containing the identities of the communicating servers $S_A$ and $S_B$. Those unique session and query identifiers can be established, e.g., by exchanging random nonces between all involved parties and using the concatenation of all nonces as $sid'$ and $qid'$ respectively.
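As a small illustration of the nonce-based approach, the sketch below builds $sid'$ and $qid'$ by concatenating one nonce per party. The fixed sort order by party name, the 16-byte nonce length, and the helper names are assumptions of this example (the text only says that the nonces are concatenated), and the spaces $U$ and $N$ are represented merely by their sizes.

```python
import secrets


def fresh_nonce(num_bytes: int = 16) -> bytes:
    return secrets.token_bytes(num_bytes)


def combine_nonces(nonces: dict[str, bytes]) -> bytes:
    # Concatenate in a fixed order (sorted by party name, an assumption)
    # so that every party derives the same string.
    return b"".join(nonces[party] for party in sorted(nonces))


def make_sid(nonces, converter, servers, user_space_size, nym_space_size):
    # sid = (sid', X, S, U, N); U and N are modelled by their sizes only.
    if nym_space_size < user_space_size:
        raise ValueError("need |N| >= |U|")
    return (combine_nonces(nonces), converter, frozenset(servers),
            user_space_size, nym_space_size)


def make_qid(nonces, server_a, server_b):
    # qid = (qid', S_A, S_B)
    return (combine_nonces(nonces), server_a, server_b)


session_nonces = {name: fresh_nonce() for name in ("X", "S_A", "S_B")}
sid = make_sid(session_nonces, "X", {"S_A", "S_B"}, 2**32, 2**128)
qid = make_qid({"S_A": fresh_nonce(), "S_B": fresh_nonce()}, "S_A", "S_B")
```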

The definition of our ideal functionality $F_{nym}$ is presented in detail in Figure 1. For simplicity, we refer to $F_{nym}$ as $F$ from now on. We also use the following writing conventions in order to reduce repetitive notation:

• At each invocation, $F$ checks that $sid$ has the form $sid = (sid', X, S, U, N)$, with $|N| \geq |U|$. When we say that $F$ receives input from or provides output to $S_A$ or $X$, we mean the particular $X$ or $S_A \in S$ specified in the $sid$.

• For the CONVERT and PROCEED interfaces, $F$ checks that $qid = (qid', S_A, S_B)$ and only considers the first message for each pair $(sid, qid)$. Subsequent messages for the same $(sid, qid)$ are ignored.

• When we say that $F$ outputs a message to a party, this happens directly, i.e., the adversary neither sees the message nor can delay it.

• When we say that $F$ sends a message $m$ to the adversary $\mathcal{A}$ and waits for $m'$ from $\mathcal{A}$, we mean that $F$ chooses a unique execution identifier, saves the local variables and other relevant information for the current interface invocation, and sends $m$ together with the identifier to $\mathcal{A}$. When $\mathcal{A}$ then invokes a dedicated resume interface with a message $m'$ and an execution identifier, $F$ looks up the information associated to the identifier and continues processing the request for input $m'$.

• When we say that $F$ proceeds only under a certain condition, we implicitly assume that a failure message is sent to the caller whenever that condition is not fulfilled.

We now describe the behaviour of all interfaces also in a somewhat informal manner to clarify the security properties that our functionality provides.

Pseudonym Generation.

The NYMGEN interface allows a converter to trigger the generation of a pseudonym $nym_{i,A}$ for user $uid_i$ and server $S_A$. If no pseudonym for that combination of $uid_i$, $S_A$ exists in $F$, a new one is created. Thereby, if $X$ or $S_A$ are honest, the new pseudonym is chosen at random from $N$. (In Figure 1 this is denoted by $nym_{i,A} \xleftarrow{\$} N$.) Only if both the converter and the server are corrupt can the adversary provide the pseudonym $nym^*_{i,A}$. All generated pseudonyms are stored within $F$ as $(\mathsf{nym}, sid, uid_i, S_A, nym_{i,A})$, i.e., the records also include the underlying $uid_i$.

The generated pseudonym $nym_{i,A}$ is then output directly to $S_A$. Thus, while the converter is the crucial entity to establish a server-specific pseudonym, it does not learn the pseudonym itself. The converter can additionally specify whether the server output shall consist solely of the pseudonym, or come in a verifiable manner. Verifiable means that the server $S_A$ receives a pseudonym $nym_{i,A}$ together with an underlying $uid_i$, assuring that $nym_{i,A}$ indeed belongs to $uid_i$. Such verification is indicated with the flag $anon = 0$, whereas $anon = 1$ will hide the $uid_i$ from $S_A$. The reason to include the option $anon = 0$, and thus the “leakage” of $uid_i$, is that a server might already know and use the $uid_i$ and thus should be able to verify to which particular user a new pseudonym belongs (and ideally delete the $uid_i$ afterwards). Allowing this non-privacy-friendly option might appear counter-intuitive at first glance. However, without the possibility to verify whether a pseudonym indeed belongs to a certain $uid_i$, the pseudonyms would not have much meaning. Thus, we consider the option $anon = 0$ crucial for bootstrapping such a system, but of course it should be used with care. We discuss further interesting strategies for pseudonym provisioning in Section 6.

When both $X$ and $S_A$ are corrupt, we also allow the generation of pseudonyms without assigning a proper $uid_i \in U$ yet. Instead, the pseudonyms are stored for a “dummy” identifier $uid_i \notin U$. However, such unassigned pseudonyms are only allowed as long as $X$ does not wish to provide a verifiable pseudonym, i.e., where $anon = 1$.

Assign UID.

The ASSIGN interface is only available when the converter is corrupt. It allows the adversary to replace a “dummy” identifier $uid_i \notin U$ in all records with a proper $uid'_i \in U$, if $uid'_i$ is not used in any other pseudonym record. After a pseudonym has been assigned a “proper” identifier $uid'_i$, the converter can also distribute the pseudonyms for $uid'_i$ in a verifiable manner via the NYMGEN interface.

This reflects that, as long as no honest server has verified the connection of a pseudonym to a particular user identifier, all $F$ can guarantee is that pseudonyms that were derived from each other all belong together (including transitive relations). However, the relation to a particular $uid$ might still be unassigned. Only when the converter provides a verifiable pseudonym $nym_{i,A}$, i.e., it links a pseudonym to its underlying $uid$, does this connection between $nym_{i,A}$ and $uid_i$ become known, and it must be guaranteed by the ideal functionality from then on. This is exactly what this interface does.

Conversion Request.

The CONVERT interface allows a server $S_A$ to initiate a conversion for some pseudonym $nym_{i,A}$ towards another server $S_B$, associated with query identifier $qid$. The request will only be processed if $nym_{i,A}$ is registered within $F$. To ask for the converter’s approval, $X$ is then notified about the request. However, $X$ only learns that $S_A$ wants to run a conversion towards $S_B$, but nothing else, in particular not the pseudonym $nym_{i,A}$ the request was initiated for.

1. Pseudonym Generation. On input of $(\mathsf{NYMGEN}, sid, uid_i, S_A, anon)$ from converter $X$:

• If $X$ is honest, only proceed if $uid_i \in U$, where $U$ is taken from $sid$.

• If $X$ is corrupt, also proceed if $uid_i \notin U$, but only if $anon = 1$.

• Send $(\mathsf{NYMGEN}, sid, S_A)$ to $\mathcal{A}$ and wait for $(\mathsf{NYMGEN}, sid, S_A, nym^*_{i,A})$ from $\mathcal{A}$.

• If a pseudonym record $(\mathsf{nym}, sid, uid_i, S_A, nym_{i,A})$ for $uid_i, S_A$ exists, retrieve $nym_{i,A}$, otherwise create a new record where $nym_{i,A}$ is determined as follows:

– if $X$ or $S_A$ are honest, set $nym_{i,A} \xleftarrow{\$} N$,

– if $X$ and $S_A$ are corrupt, and no other pseudonym record for $nym^*_{i,A}, S_A$ exists, set $nym_{i,A} \leftarrow nym^*_{i,A}$. Abort otherwise.

• If $anon = 1$, output $(\mathsf{NYMGEN}, sid, nym_{i,A}, \bot)$ to $S_A$ and output $(\mathsf{NYMGEN}, sid, nym_{i,A}, uid_i)$ otherwise.

2. Assign UID. On input of $(\mathsf{ASSIGN}, sid, uid_i, uid'_i)$ from adversary $\mathcal{A}$:

• Proceed only if $X$ is corrupt, $uid_i \notin U$ and $uid'_i \in U$.

• If no pseudonym record for $uid'_i$ exists yet, replace the current “dummy” $uid_i$ with the “real” $uid'_i$ in all records $(\mathsf{nym}, sid, uid_i, S_A, nym_{i,A})$, abort otherwise.

• Send $(\mathsf{ASSIGN}, sid)$ to $\mathcal{A}$.

3. Conversion Request. On input of $(\mathsf{CONVERT}, sid, qid, nym_{i,A}, S_B)$ from server $S_A$:

• Proceed only if a record $(\mathsf{nym}, sid, uid_i, S_A, nym_{i,A})$ for $nym_{i,A}, S_A$ exists.

• Send $(\mathsf{CONVERT}, sid, qid)$ to $\mathcal{A}$ and wait for response $(\mathsf{CONVERT}, sid, qid)$ from $\mathcal{A}$.

• Create a conversion record $(\mathsf{convert}, sid, qid, uid_i, S_A, S_B)$, where $uid_i$ is taken from the pseudonym record for $nym_{i,A}, S_A$.

• Output $(\mathsf{CONVERT}, sid, qid, S_A, S_B)$ to $X$.

4. Conversion Response. On input of $(\mathsf{PROCEED}, sid, qid)$ from converter $X$:

• Proceed only if a conversion record $(\mathsf{convert}, sid, qid, uid_i, S_A, S_B)$ for $qid$ exists.

• Send $(\mathsf{PROCEED}, sid, qid)$ to $\mathcal{A}$ and wait for $(\mathsf{PROCEED}, sid, qid, nym^*_{i,B})$ from $\mathcal{A}$.

• If a pseudonym record $(\mathsf{nym}, sid, uid_i, S_B, nym_{i,B})$ for $uid_i, S_B$ exists, retrieve $nym_{i,B}$, otherwise create a new record where $nym_{i,B}$ is determined as follows:

– if $X$ or $S_B$ are honest, set $nym_{i,B} \xleftarrow{\$} N$,

– if $X$ and $S_B$ are corrupt, and no other pseudonym record for $nym^*_{i,B}, S_B$ exists, set $nym_{i,B} \leftarrow nym^*_{i,B}$. Abort otherwise.

• Output $(\mathsf{CONVERTED}, sid, qid, S_A, nym_{i,B})$ to $S_B$.

Figure 1: Ideal Functionality $F_{nym}$ with $sid = (sid', X, S, U, N)$
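The bookkeeping in Figure 1 can be mirrored by a short Python class, which may help when reading the interface descriptions. This is a simplified model, not a UC functionality: the "send to $\mathcal{A}$ and wait" steps are collapsed into an optional `adv_nym` argument, the session identifier is left implicit, failed checks and repeated query identifiers raise an exception instead of being ignored, and the pseudonym space $N$ is assumed to be the set of `nym_bits`-bit integers. The class and method names are hypothetical.

```python
import secrets
from dataclasses import dataclass, field


class Abort(Exception):
    """Raised where Figure 1 aborts or a 'proceed only if' check fails."""


@dataclass
class Fnym:
    users: set                                        # U: valid user identifiers
    nym_bits: int = 128                               # N = {0, ..., 2**nym_bits - 1}
    corrupt: set = field(default_factory=set)         # static corruptions, e.g. {"X", "S_B"}
    nyms: dict = field(default_factory=dict)          # (uid, server) -> nym
    conversions: dict = field(default_factory=dict)   # qid -> (uid, S_A, S_B)

    def __post_init__(self):
        if 2 ** self.nym_bits < len(self.users):
            raise ValueError("need |N| >= |U|")

    def _lookup_or_create(self, uid, server, adv_nym):
        if (uid, server) in self.nyms:
            return self.nyms[(uid, server)]
        if {"X", server} <= self.corrupt:
            # Only if X and the server are both corrupt may the adversary choose.
            taken = any(s == server and n == adv_nym for (_, s), n in self.nyms.items())
            if adv_nym is None or taken:
                raise Abort("adversarial pseudonym missing or already in use")
            nym = adv_nym
        else:
            nym = secrets.randbelow(2 ** self.nym_bits)  # nym <-$ N
        self.nyms[(uid, server)] = nym
        return nym

    def nymgen(self, uid, server, anon, adv_nym=None):
        """Called by X. Returns the output delivered to `server` only."""
        if uid not in self.users and not ("X" in self.corrupt and anon == 1):
            raise Abort("uid outside U")
        nym = self._lookup_or_create(uid, server, adv_nym)
        return (nym, None) if anon == 1 else (nym, uid)

    def assign(self, dummy_uid, real_uid):
        """Called by the adversary; only available when X is corrupt."""
        if "X" not in self.corrupt or dummy_uid in self.users or real_uid not in self.users:
            raise Abort("ASSIGN not permitted")
        if any(u == real_uid for (u, _) in self.nyms):
            raise Abort("real uid already has pseudonym records")
        for (u, s) in [key for key in self.nyms if key[0] == dummy_uid]:
            self.nyms[(real_uid, s)] = self.nyms.pop((u, s))

    def convert(self, qid, nym, server_a, server_b):
        """Called by S_A. Returns what X is told: no pseudonym, no uid."""
        owner = [u for (u, s), n in self.nyms.items() if s == server_a and n == nym]
        if not owner or qid in self.conversions:
            raise Abort("unknown pseudonym or repeated qid")
        self.conversions[qid] = (owner[0], server_a, server_b)
        return (qid, server_a, server_b)

    def proceed(self, qid, adv_nym=None):
        """Called by X. Returns the output delivered to S_B only."""
        if qid not in self.conversions:
            raise Abort("no such conversion request")
        uid, server_a, server_b = self.conversions[qid]
        return (qid, server_a, self._lookup_or_create(uid, server_b, adv_nym))


F = Fnym(users={"alice", "bob"})
nym_a, shown_uid = F.nymgen("alice", "S_A", anon=1)   # S_A gets nym only
told_to_x = F.convert("q1", nym_a, "S_A", "S_B")      # X learns only (qid, S_A, S_B)
_, _, nym_b = F.proceed("q1")                         # S_B gets the converted nym
```

Because the model keys every record by the underlying `uid`, a later `nymgen("alice", "S_B", anon=0)` returns the same `nym_b`, which is the consistency between generation and conversion that the functionality is meant to guarantee.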

Conversion Response.

The PROCEED interface allows a converter to blindly complete a conversion request towards $S_B$. The converted pseudonym $nym_{i,B}$ is either retrieved from an existing record using the internal knowledge of the underlying $uid_i$ of the requested $nym_{i,A}$, or generated from scratch and stored together with $uid_i$ in $F$. Again, unless both $X$ and $S_B$ are corrupt, the new pseudonym is a random value in $N$. Finally, $S_B$ (and only $S_B$) receives the converted pseudonym $nym_{i,B}$. As $F$ performs the conversion based on the underlying $uid_i$, the desired consistency properties are naturally guaranteed.

Discussion.

Overall, our ideal functionality defined in Figure 1 guarantees the following security and privacy properties even in the presence of corrupted entities.

Security against corrupt $S_A$, $S_B$: The pseudonyms received by the servers do not leak any information about the underlying user identifier $uid_i$, and can only be established via the converter. That is, even if a server $S_A$ is corrupt and knows a user identifier $uid_i$, it cannot predict the server-local pseudonym $nym_{i,A}$ itself. This is enforced by $F$ as it generates pseudonyms only when requested or allowed (in a conversion) by $X$ and produces pseudonyms that are merely random values in $N$.

Further, for pseudonyms $nym_{i,A}$ and $nym_{i,B}$ held by two corrupt servers $S_A$ and $S_B$, the servers cannot tell – without the help of the converter – whether they belong to the same $uid_i$ or not (of course only if the servers do not both know the underlying $uid_i$ from a verifiable pseudonym, as otherwise linkage is trivial). This follows again from the randomness of the pseudonyms. If only one server $S_A$ or $S_B$ is corrupt, then the corrupt server cannot use a conversion to learn any information about the corresponding pseudonym of the other honest server, even if the converter is corrupt too: our functionality does not give any output to $S_A$, and $S_B$ only receives $(qid, nym_{i,B})$, but not the initial $nym_{i,A}$.

Security against corrupt $X$: If the converter is corrupt, it can trigger pseudonyms for $uid_i$’s and servers $S_A$ of its choice; however, $X$ cannot determine or predict the pseudonym values whenever they are generated for an honest server (neither via pseudonym generation nor conversion). This is guaranteed by our definition as $F$ generates new pseudonyms as random values in $N$ and outputs them directly to the respective server, i.e., without the adversary seeing them.

Further, in a conversion request between two honest servers $S_A$ and $S_B$, a corrupt converter does not learn any information about the pseudonym $nym_{i,A}$ or $nym_{i,B}$, or even whether two requests were made for the same pseudonym or not. This follows clearly from $F$, as the only information $X$ gets is that $S_A$ requested a conversion towards $S_B$. If the converter and one of the servers are corrupt, the adversary can of course learn the pseudonym of the corrupted server. If both $S_A$ and $S_B$ are corrupt as well, then the adversary obviously learns all involved pseudonyms, but this is unavoidable.

Our functionality also guarantees consistency, even in the presence of a corrupt converter. That is, even when generated or converted by a corrupt $X$, honest servers are ensured that pseudonym generation is injective, conversion is transitive and both procedures generate consistent pseudonyms. This is naturally enforced by our functionality as $F$ is aware of the underlying $uid_i$ and uses that knowledge to ensure consistent conversions and a unique pseudonym for each $(uid_i, S_A)$ combination.

3. BUILDING BLOCKS

Here, we introduce the building blocks for our construction. Apart from standard proof protocols, (verifiable) pseudorandom functions and homomorphic encryption, we also need a new primitive which we call dual-mode signatures. We provide a formal definition for those signature schemes and also detail an instantiation based on the structure-preserving signature scheme by Abe et al. [2].

3.1 Bilinear Maps

Let $G$, $\tilde{G}$ and $G_t$ be groups of prime order $q$. A map $e : G \times \tilde{G} \to G_t$ must satisfy bilinearity, i.e., $e(g^x, \tilde{g}^y) = e(g, \tilde{g})^{xy}$; non-degeneracy, i.e., for all generators $g \in G$ and $\tilde{g} \in \tilde{G}$, $e(g, \tilde{g})$ generates $G_t$; and efficiency, i.e., there exists an efficient algorithm $\mathcal{G}(1^\tau)$ that outputs the bilinear group $(q, G, \tilde{G}, G_t, e, g, \tilde{g})$ and an efficient algorithm to compute $e(a, b)$ for any $a \in G$, $b \in \tilde{G}$. If $G = \tilde{G}$ the map is symmetric and otherwise asymmetric.

3.2 Proof Protocols

When referring to the zero-knowledge proofs of knowledge of discrete logarithms and statements about them, we will follow the notation introduced by Camenisch and Stadler [10] and formally defined by Camenisch, Kiayias, and Yung [9].

For instance, $PK\{(a, b, c) : y = g^a h^b \wedge \tilde{y} = \tilde{g}^a \tilde{h}^c\}$ denotes a “zero-knowledge Proof of Knowledge of integers $a$, $b$ and $c$ such that $y = g^a h^b$ and $\tilde{y} = \tilde{g}^a \tilde{h}^c$ holds,” where $y, g, h, \tilde{y}, \tilde{g}$ and $\tilde{h}$ are elements of some groups $G = \langle g \rangle = \langle h \rangle$ and $\tilde{G} = \langle \tilde{g} \rangle = \langle \tilde{h} \rangle$. Given a protocol in this notation, it is straightforward to derive an actual protocol implementing the proof [9]. $SPK$ denotes a signature proof of knowledge, that is, a non-interactive transformation of a proof with the Fiat-Shamir heuristic [18].
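To make the notation concrete, the following is a minimal sketch of the simplest such statement, $SPK\{(x) : y = g^x\}$, turned non-interactive with the Fiat-Shamir heuristic. The toy group parameters, the SHA-256 based challenge derivation and the function names are assumptions chosen for illustration; they are not taken from the paper, and the group is far too small to be secure.

```python
import hashlib
import secrets

# Toy Schnorr group (assumption, illustration only): P = 2*Q + 1, G has order Q.
P, Q, G = 2039, 1019, 4

def _challenge(*parts):
    """Hash the transcript to a challenge in Z_Q (hypothetical encoding)."""
    data = "|".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def spk_prove(x, context=""):
    """Prove knowledge of x with y = G^x, bound to a context string."""
    y = pow(G, x, P)
    k = secrets.randbelow(Q)          # prover's ephemeral randomness
    t = pow(G, k, P)                  # commitment
    c = _challenge(G, y, t, context)  # Fiat-Shamir challenge
    s = (k + c * x) % Q               # response
    return y, (c, s)

def spk_verify(y, proof, context=""):
    """Recompute the commitment as G^s * y^(-c) and re-derive the challenge."""
    c, s = proof
    t = pow(G, s, P) * pow(y, -c, P) % P
    return c == _challenge(G, y, t, context)
```

The context argument plays the role of the values that the paper attaches to a proof, such as $sid$ or $qid$, so that a proof cannot be replayed in another session.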

Often we use a more abstract notation for proofs, e.g., by $\mathsf{NIZK}\{(w) : statement(w)\}$ we denote any zero-knowledge proof protocol of knowledge of a witness $w$ such that $statement(w)$ is true. The idea is that when we use $SPK$ we have the concrete realization in mind, whereas with $\mathsf{NIZK}$ we mean any non-interactive zero-knowledge proof. Sometimes we need witnesses to be online-extractable, which we make explicit by denoting with $\mathsf{NIZK}\{(\underline{w_1}, w_2) : statement(w_1, w_2)\}$ the proof of witnesses $w_1$ and $w_2$, where $w_1$ can be extracted.

3.3 (Verifiable) Pseudorandom Functions

To generate pseudonyms and verify their correct generation, we require a pseudorandom function $\mathsf{PRF}$ that allows for a proof that it was correctly computed. Informally, a pseudorandom function $\mathsf{PRF}(x, i)$ with key generation $(x, y) \leftarrow_{\$} \mathsf{PRFKGen}(1^\tau)$ is verifiable if it allows for an efficient proof that a value $z$ is a proper PRF output for input $i$ and secret key $x$: $\pi_z \leftarrow_{\$} \mathsf{NIZK}\{(x) : z = \mathsf{PRF}(x, i)\}(i, z)$.

Dodis and Yampolskiy [16] have proposed such a function, $\mathsf{PRF}_G(x, i) = g^{1/(x+i)}$, which works in a cyclic group $G = \langle g \rangle$ of order $q$. Its pseudorandomness is based on the $q$-Decisional Diffie-Hellman Inversion problem [7]. The algorithms for it are as follows (here we deviate from their algorithms in the way we define the proof, as we require that the proof algorithm be zero-knowledge).

The key generation $\mathsf{PRFKGen}_G(1^\tau)$ generates a random secret key $x \in \mathbb{Z}_q$ with corresponding public key $y \leftarrow g^x$. The proof $\pi_z$ of correct computation of the PRF, i.e., $z = \mathsf{PRF}_G(\log_g y, i)$, does not need to be online extractable in our construction, and thus is as follows: $\pi_z \leftarrow_{\$} SPK\{(x) : y = g^x \wedge g/z^i = z^x\}(y, g, i, z)$.
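The relation proven in $\pi_z$ follows from $z^{x+i} = g$, which rearranges to $g/z^i = z^x$. The sketch below evaluates the Dodis-Yampolskiy function in a toy group and checks both equations with the witness $x$ in the clear; it does not implement the zero-knowledge proof itself. The group parameters and function names are assumptions made for illustration only.

```python
# Toy Schnorr group (assumption, illustration only): P = 2*Q + 1, G has order Q.
P, Q, G = 2039, 1019, 4

def prf_keygen(x):
    """Return (secret key x, public key y = G^x) for a caller-chosen x."""
    x %= Q
    return x, pow(G, x, P)

def prf(x, i):
    """Dodis-Yampolskiy PRF G^(1/(x+i)); undefined if x + i = 0 mod Q."""
    e = (x + i) % Q
    if e == 0:
        raise ValueError("x + i must be invertible modulo Q")
    return pow(G, pow(e, -1, Q), P)

def relation_holds(y, i, z, x):
    """Check y = G^x and G / z^i = z^x, the statement proven in pi_z."""
    return y == pow(G, x, P) and G * pow(z, -i, P) % P == pow(z, x, P)
```

A verifier in the actual protocol never sees $x$; it only checks the proof against the public values $(y, g, i, z)$.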

We will also need a standard (i.e., non-verifiable) pseudorandom permutation, which consists of a key generation $k \leftarrow_{\$} \mathsf{PRPKGen}_G(1^\tau)$, a function $z \leftarrow \mathsf{PRP}_G(k, i)$ and its efficiently computable inverse $i \leftarrow \mathsf{PRP}^{-1}_G(k, z)$. For simplicity, we assume $\mathsf{PRP}_G$ to work in a group $G$ as well.

3.4 Homomorphic Encryption Schemes

We require an encryption scheme $(\mathsf{EncKGen}_G, \mathsf{Enc}_G, \mathsf{Dec}_G)$ that is semantically secure and that has a cyclic group $G$ as message space. It consists of a key generation algorithm $(epk, esk) \leftarrow_{\$} \mathsf{EncKGen}_G(1^\tau)$, where $\tau$ is a security parameter, an encryption algorithm $C \leftarrow_{\$} \mathsf{Enc}_G(epk, m)$, with $m \in G$, and a decryption algorithm $m \leftarrow \mathsf{Dec}_G(esk, C)$. Sometimes we will make the randomness used in the encryption process explicit, in which case we will write $C \leftarrow \mathsf{Enc}_G(epk, m, r)$, where $r$ encodes all the randomness, i.e., $\mathsf{Enc}_G(\cdot, \cdot, \cdot)$ is a deterministic algorithm.

We require further that the encryption scheme has an appropriate homomorphic property, namely that there is an efficient operation $\odot$ on ciphertexts such that, if $C_1 \in \mathsf{Enc}_G(epk, m_1)$ and $C_2 \in \mathsf{Enc}_G(epk, m_2)$, then $C_1 \odot C_2 \in \mathsf{Enc}_G(epk, m_1 \cdot m_2)$. We will also use exponents to denote the repeated application of $\odot$, e.g., $C^2$ to denote $C \odot C$.

ElGamal Encryption (with a CRS Trapdoor).

We use the ElGamal encryption scheme, which is homomorphic and chosen-plaintext secure. Semantic security is sufficient for our construction, as the parties always prove to each other that they formed the ciphertexts correctly. Let $(G, g, q)$ be system parameters available as CRS such that the DDH problem is hard w.r.t. $\tau$, i.e., $q$ is a $\tau$-bit prime.

$\mathsf{EncKGen}_G(1^\tau)$: Pick random $x$ from $\mathbb{Z}_q$, compute $y \leftarrow g^x$, and output $esk \leftarrow x$ and $epk \leftarrow y$.

$\mathsf{Enc}_G(epk, m)$: To encrypt a message $m \in G$ under $epk = y$, pick $r \leftarrow_{\$} \mathbb{Z}_q$ and output the ciphertext $(C_1, C_2) \leftarrow (y^r, g^r m)$.

$\mathsf{Dec}_G(esk, C)$: On input the secret key $esk = x$ and a ciphertext $C = (C_1, C_2) \in G^2$, output $m' \leftarrow C_2 \cdot C_1^{-1/x}$.

In our concrete instantiation we will use a variation of ElGamal encryption with a CRS trapdoor, which allows proofs for correct ciphertexts to be made efficiently online extractable. That is, we assume that the CRS additionally contains a public key $\hat{y}$. For encryption, each ciphertext gets extended with an element $C_0 \leftarrow \hat{y}^r$, which will be ignored in normal decryption. In our security proof of the overall scheme, the simulator will be privy to $\hat{x} = \log_g \hat{y}$, as it can set the CRS appropriately, and thus is able to decrypt as $m' \leftarrow C_2 \cdot C_0^{-1/\hat{x}}$.
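The following sketch exercises the ElGamal variant described above, with ciphertexts of the form $(y^r, g^r m)$, its homomorphic operation, and the extra trapdoor component $C_0$. It assumes that the homomorphic operation is component-wise multiplication of ciphertexts and uses a toy group; the parameters and all function names are illustrative assumptions, not part of the paper.

```python
import secrets

# Toy Schnorr group (assumption, illustration only): P = 2*Q + 1, G has order Q.
P, Q, G = 2039, 1019, 4

def enc_keygen():
    x = secrets.randbelow(Q - 1) + 1       # nonzero, so 1/x exists mod Q
    return pow(G, x, P), x                 # (epk, esk)

def enc(epk, m, r=None):
    if r is None:
        r = secrets.randbelow(Q)
    return (pow(epk, r, P), pow(G, r, P) * m % P)   # (C1, C2) = (y^r, g^r * m)

def dec(esk, c):
    c1, c2 = c
    g_r = pow(c1, pow(esk, -1, Q), P)      # C1^(1/x) = g^r
    return c2 * pow(g_r, -1, P) % P        # C2 * C1^(-1/x)

def c_mul(c, d):
    """Homomorphic operation: component-wise product of two ciphertexts."""
    return tuple(a * b % P for a, b in zip(c, d))

def c_pow(c, k):
    """Repeated application of the homomorphic operation (exponentiation)."""
    return tuple(pow(a, k, P) for a in c)

def enc_crs(epk, y_hat, m):
    """Variant with trapdoor component C0 = y_hat^r prepended."""
    r = secrets.randbelow(Q)
    return (pow(y_hat, r, P),) + enc(epk, m, r)

def dec_trapdoor(x_hat, c):
    """Simulator-style decryption using x_hat = log_g(y_hat) and C0."""
    c0, _, c2 = c
    return dec(x_hat, (c0, c2))
```

Messages must lie in the order-$q$ subgroup, e.g. `m = pow(G, 5, P)`; decrypting `c_mul(enc(epk, m1), enc(epk, m2))` then yields the product of the two messages.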

3.5 Signature Schemes

We require two different kinds of signature schemes. One signature scheme is needed for server $\mathcal{S}_A$ to sign a request to the converter so that later a server $\mathcal{S}_B$ can verify that what it gets from the converter indeed stems from server $\mathcal{S}_A$. For this, any standard signature scheme $(\mathsf{SigKGen}, \mathsf{Sign}, \mathsf{Vf})$ is sufficient. Such a scheme consists of a key generation algorithm $(spk, ssk) \leftarrow_{\$} \mathsf{SigKGen}(1^\tau)$, a signing algorithm $\sigma \leftarrow_{\$} \mathsf{Sign}(ssk, m)$, with $m \in \{0, 1\}^*$, and a signature verification algorithm $\{0, 1\} \leftarrow \mathsf{Vf}(spk, \sigma, m)$. The security definitions are standard and we thus do not repeat them here.

The second signature scheme we require is for the converter to sign pseudonyms. This scheme needs to support the signing of plain pseudonyms as well as encrypted pseudonyms. Also, it needs to allow for (efficient) proofs of knowledge of a signature on a pseudonym that is encrypted. Commuting signatures [19] would fit the bill here. However, because of their generality, their use would make our construction much less efficient than what we present. The reason is that almost all inputs and outputs in the construction come with non-interactive proofs that they are well defined. As the definitions for commuting signatures also include these proofs, we cannot use (a subset of) these either. Blazy et al. [6] define signature schemes that can sign (randomizable) ciphertexts. Such schemes are a special case of commuting signatures and much closer to what we need. However, the security definition they give requires that the keys for the encryption scheme be honestly generated and the decryption key be available in the security game. This means that when using such a scheme in a construction, the decryption keys need to be extractable from adversarial parties and correct key generation enforced, which would lead to less efficient schemes. We therefore need to provide our own definition that does not suffer from the drawbacks discussed. We call this a dual-mode signature scheme, as it allows one to sign messages in the plain as well as when they are contained in an encryption.

Finally, we point out that dual-mode signatures are similar to blind signature schemes, where the signer also signs “encrypted” messages. The typical security definition for blind signatures requires only that an adversary not be able to produce more signatures than he ran signing protocols with the signer. That kind of definition would not be good enough for us: for our construction we need to be sure that the signer indeed only signs the message that is contained in the encryption. Further, the setting for which we will use dual-mode signatures would not be realizable by blind signatures: in our protocol a server $\mathcal{S}_A$ encrypts a message under a public key of a server $\mathcal{S}_B$, the converter then signs a derivation of the ciphertext, and $\mathcal{S}_B$ finally decrypts the signature.

Dual-Mode Signature Schemes.

A dual-mode signature scheme consists of the algorithms $(\mathsf{SigKGen}_G, \mathsf{Sign}_G, \mathsf{EncSign}_G, \mathsf{DecSign}_G, \mathsf{Vf}_G)$ and also uses an encryption scheme $(\mathsf{EncKGen}_G, \mathsf{Enc}_G, \mathsf{Dec}_G)$ that has the group $G$ as message space. In particular, the algorithms working with encrypted messages or signatures also get the keys $(epk, esk) \leftarrow_{\$} \mathsf{EncKGen}_G(1^\tau)$ of the encryption scheme as input.

$\mathsf{SigKGen}_G(1^\tau)$: On input the security parameter and being parameterized by $G$, this algorithm outputs a public verification key $spk$ and secret signing key $ssk$.

$\mathsf{Sign}_G(ssk, m)$: On input a signing key $ssk$ and a message $m \in G$, outputs a signature $\sigma$.

$\mathsf{EncSign}_G(ssk, epk, C)$: On input a signing key $ssk$, a public encryption key $epk$, and ciphertext $C = \mathsf{Enc}_G(epk, m)$, outputs an “encrypted” signature $\overline{\sigma}$ of $C$.

$\mathsf{DecSign}_G(esk, spk, \overline{\sigma})$: On input an “encrypted” signature $\overline{\sigma}$, secret decryption key $esk$ and public verification key $spk$, outputs a standard signature $\sigma$.

$\mathsf{Vf}_G(spk, \sigma, m) \to \{0, 1\}$: On input a public verification key $spk$, signature $\sigma$ and message $m$, outputs 1 if the signature is valid and 0 otherwise.

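For correctness, we require that for all $(spk, ssk) \leftarrow_{\$} \mathsf{SigKGen}_G(1^\tau)$, all $(epk, esk) \leftarrow_{\$} \mathsf{EncKGen}_G(1^\tau)$, all $m \in G$, and all random choices in $\mathsf{Sign}_G(\cdot, \cdot)$, in $\mathsf{Enc}_G(\cdot, \cdot)$, and in $\mathsf{EncSign}_G(\cdot, \cdot, \cdot)$, we have that $\mathsf{Vf}_G(spk, \mathsf{Sign}_G(ssk, m), m) = 1$ and $\mathsf{Vf}_G(spk, \mathsf{DecSign}_G(esk, spk, \mathsf{EncSign}_G(ssk, epk, \mathsf{Enc}_G(epk, m))), m) = 1$.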

In terms of security, we extend the standard unforgeability definition to allow the adversary to also get signatures on encrypted messages. Thereby, the oracle $\mathcal{O}_{\mathsf{EncSign}}$ will only sign correctly computed ciphertexts, which is modeled by providing an additional encryption oracle $\mathcal{O}_{\mathsf{Enc}}$ and only signing ciphertexts that were generated via $\mathcal{O}_{\mathsf{Enc}}$. When using the scheme, this can easily be enforced by asking the signature requester for a proof of correct ciphertext computation, and, indeed, in our construction such a proof is needed for other reasons as well. Note that we do not require that the “encrypted” signature output by $\mathsf{EncSign}_G$ does not leak any information about the signature contained in it.

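Experiment $\mathsf{Exp}^{\mathsf{DMSIG\text{-}forge}}_{\mathcal{A}, \mathsf{DMSIG}, \mathsf{Enc}_G}(G, \tau)$:

  $(spk, ssk) \leftarrow_{\$} \mathsf{SigKGen}(1^\tau)$
  $\mathbf{L} \leftarrow \emptyset$; $\mathbf{C} \leftarrow \emptyset$
  $(m^*, \sigma^*) \leftarrow_{\$} \mathcal{A}^{\mathcal{O}_{\mathsf{Sign}}(ssk, \cdot),\, \mathcal{O}_{\mathsf{Enc}}(\cdot, \cdot),\, \mathcal{O}_{\mathsf{EncSign}}(ssk, \cdot, \cdot)}(spk)$
    where $\mathcal{O}_{\mathsf{Sign}}$ on input $(m_i)$:
      add $m_i$ to the list of queried messages $\mathbf{L} \leftarrow \mathbf{L} \cup m_i$
      return $\sigma \leftarrow_{\$} \mathsf{Sign}_G(ssk, m_i)$
    where $\mathcal{O}_{\mathsf{Enc}}$ on input $(epk_i, m_i)$:
      run $C_i \leftarrow_{\$} \mathsf{Enc}_G(epk_i, m_i)$ and add $(epk_i, C_i, m_i)$ to $\mathbf{C}$
      return $C_i$
    where $\mathcal{O}_{\mathsf{EncSign}}$ on input $(epk_i, C_i)$:
      retrieve $(epk_i, C_i, m_i)$ from $\mathbf{C}$, abort if it doesn't exist;
      add $m_i$ to the list of queried messages $\mathbf{L} \leftarrow \mathbf{L} \cup m_i$
      return $\overline{\sigma} \leftarrow_{\$} \mathsf{EncSign}_G(ssk, epk_i, C_i)$
  return 1 if $\mathsf{Vf}_G(spk, \sigma^*, m^*) = 1$ and $m^* \notin \mathbf{L}$

Figure 2: Unforgeability experiment for dual-mode signatures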

Definition 3.1. (Unforgeability of Dual-Mode Signatures). We say a dual-mode signature scheme is unforgeable if for any efficient algorithm $\mathcal{A}$ the probability that the experiment given in Figure 2 returns 1 is negligible (as a function of $\tau$).

AGOT+ (Dual-Mode) Signature Scheme.

To instantiate the building block of dual-mode signatures we will use an extension of the structure-preserving signature scheme by Abe et al. [2], which we denote as the AGOT+ scheme. First, we recall the original AGOT scheme $(\mathsf{SigKGen}_G, \mathsf{Sign}_G, \mathsf{Vf}_G)$, slightly adapted to our notation, and then we describe how to instantiate the additional algorithms $\mathsf{EncSign}_G$ and $\mathsf{DecSign}_G$ with respect to a homomorphic encryption scheme $(\mathsf{EncKGen}_G, \mathsf{Enc}_G, \mathsf{Dec}_G)$.

AGOT assumes the availability of system parameters $crs = (q, G, \tilde{G}, G_t, e, g, \tilde{g}, x)$ consisting of $(q, G, \tilde{G}, G_t, e, g, \tilde{g}) \leftarrow_{\$} \mathcal{G}(1^\tau)$ and an additional random group element $x \leftarrow_{\$} G$. That is, the key generation is split in two parts, one that generates the public parameters and one that generates the public and secret keys for the signer. For our construction, the former part will also generate the group $G$ that serves as the message space of the encryption scheme. Thus, $\mathsf{SigKGen}_G$ becomes the second part of the AGOT key generation; abusing notation, we give it the public parameters as input instead of the security parameter $\tau$. For all other algorithms, we assume that the public parameters, in particular the group $G$, are given as implicit input.

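$\mathsf{SigKGen}_G(q, G, \tilde{G}, G_t, e, g, \tilde{g}, x)$: Choose a random $v \leftarrow_{\$} \mathbb{Z}_q$, compute $\tilde{y} \leftarrow \tilde{g}^v$, and return $spk = \tilde{y}$ and $ssk = v$.

$\mathsf{Sign}_G(ssk, m)$: On input a message $m \in G$ and key $ssk = v$, choose a random $u \leftarrow_{\$} \mathbb{Z}_q^*$, and output the signature $\sigma = (r, s, t, w)$ including the randomization token $w$, where:

  $r \leftarrow \tilde{g}^u, \quad s \leftarrow (m^v \cdot x)^{1/u}, \quad t \leftarrow (s^v \cdot g)^{1/u}, \quad w \leftarrow g^{1/u}$.

$\mathsf{Vf}_G(spk, \sigma, m)$: Parse $\sigma = (r, s, t, w')$ and $spk = \tilde{y}$ and accept if and only if $m, s, t \in G$, $r \in \tilde{G}$, and

  $e(s, r) = e(m, \tilde{y}) \cdot e(x, \tilde{g}), \quad e(t, r) = e(s, \tilde{y}) \cdot e(g, \tilde{g})$.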

Note that for notational simplicity, we consider $w$ part of the signature, i.e., $\sigma = (r, s, t, w)$, but that the verification equation does not perform any check on $w$. As pointed out by Abe et al., a signature $\sigma = (r, s, t)$ can be randomized using the randomization token $w$ to obtain a signature $\sigma' = (r', s', t')$ by picking a random $u' \leftarrow_{\$} \mathbb{Z}_q^*$ and computing

  $r' \leftarrow r^{u'}, \quad s' \leftarrow s^{1/u'}, \quad t' \leftarrow (t\, w^{(u'-1)})^{1/u'^2}$.

This randomization feature is useful to efficiently prove knowledge of a signature on an encrypted message, which is needed in our protocol. We show in Section 5 how such a proof can be constructed.
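As a sanity check on the randomization formulas, the following worked derivation (added here as an illustration, writing $\tilde{g}$ for the generator of the second pairing group, which is a notational assumption) shows that $(r', s', t')$ has exactly the form of a fresh signature on $m$ with signing randomness $u u'$.

```latex
\begin{align*}
r' &= r^{u'} = \tilde{g}^{\,u u'}, \\
s' &= s^{1/u'} = (m^v \cdot x)^{1/(u u')}, \\
t' &= \bigl(t \, w^{u'-1}\bigr)^{1/u'^2}
    = \bigl(s^{v/u}\, g^{1/u}\, g^{(u'-1)/u}\bigr)^{1/u'^2}
    = \bigl(s^{v/u}\, g^{u'/u}\bigr)^{1/u'^2} \\
   &= \bigl(s^{1/u'}\bigr)^{v/(u u')} \, g^{1/(u u')}
    = \bigl((s')^{v} \cdot g\bigr)^{1/(u u')}.
\end{align*}
```

Both verification equations therefore hold for $(r', s', t')$ by the same argument as for an honestly generated signature.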

Now, we present the additional algorithms that allow one to obtain signatures on encrypted messages $M$.¹

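$\mathsf{EncSign}_G(ssk, epk, M)$: On input a proper encryption $M = \mathsf{Enc}_G(epk, m)$ of a message $m \in G$ under $epk$, and secret key $ssk = v$, choose a random $u \leftarrow_{\$} \mathbb{Z}_q^*$, and output the (partially) encrypted signature $\overline{\sigma} = (r, S, T, w)$:

  $r \leftarrow \tilde{g}^u, \quad S \leftarrow (M^v \odot \mathsf{Enc}_G(epk, x))^{1/u},$

  $T \leftarrow (S^v \odot \mathsf{Enc}_G(epk, g))^{1/u}, \quad w \leftarrow g^{1/u}$.

$\mathsf{DecSign}_G(esk, spk, \overline{\sigma})$: Parse $\overline{\sigma} = (r, S, T, w)$, compute $s \leftarrow \mathsf{Dec}_G(esk, S)$, $t \leftarrow \mathsf{Dec}_G(esk, T)$ and output $\sigma = (r, s, t, w)$.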

It is not hard to see that $\sigma = (r, s \leftarrow \mathsf{Dec}_G(esk, S), t \leftarrow \mathsf{Dec}_G(esk, T), w)$ is a valid signature on $m \leftarrow \mathsf{Dec}_G(esk, M)$, and that the distribution of these values is the same as when $m$ was signed directly. More formally, we prove that the AGOT scheme extended with the above algorithms $\mathsf{EncSign}_G$, $\mathsf{DecSign}_G$ yields an unforgeable dual-mode signature scheme. The proof is given in the full paper.
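For the ElGamal scheme of Section 3.4, this can be checked by hand. The derivation below is an added illustration that assumes the homomorphic operation and the exponentiation of ciphertexts act component-wise, with $epk = y$ and encryption randomness $\rho, \rho_1, \rho_2$ (symbols introduced here only for this calculation).

```latex
\begin{align*}
M &= (y^{\rho},\; g^{\rho} m), \qquad
\mathsf{Enc}_G(epk, x) = (y^{\rho_1},\; g^{\rho_1} x), \\
S &= \bigl(M^v \odot \mathsf{Enc}_G(epk, x)\bigr)^{1/u}
   = \bigl(y^{\rho_S},\; g^{\rho_S} (m^v x)^{1/u}\bigr),
   \qquad \rho_S = (v\rho + \rho_1)/u, \\
T &= \bigl(S^v \odot \mathsf{Enc}_G(epk, g)\bigr)^{1/u}
   = \bigl(y^{\rho_T},\; g^{\rho_T} (s^v g)^{1/u}\bigr),
   \qquad \rho_T = (v\rho_S + \rho_2)/u, \\
\mathsf{Dec}_G(esk, S) &= (m^v x)^{1/u} = s, \qquad
\mathsf{Dec}_G(esk, T) = (s^v g)^{1/u} = t.
\end{align*}
```

The decrypted pair $(s, t)$ thus coincides with what $\mathsf{Sign}_G$ would output on $m$ for the same $u$.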

¹ In the AGOT+ scheme, we write $M$ to denote the encryption of a message $m$, instead of $C$. Likewise, capital letters $S$, $T$ denote the encrypted versions of the values $s$, $t$ that would be computed in a standard AGOT signature.

Theorem 3.2 (Unforgeability of AGOT+). If the AGOT signature scheme $(\mathsf{SigKGen}_G, \mathsf{Sign}_G, \mathsf{Vf}_G)$ is an unforgeable signature scheme then, together with the algorithms $\mathsf{EncSign}_G$, $\mathsf{DecSign}_G$ described above, the AGOT+ scheme $(\mathsf{SigKGen}_G, \mathsf{Sign}_G, \mathsf{EncSign}_G, \mathsf{DecSign}_G, \mathsf{Vf}_G)$ is an unforgeable dual-mode signature scheme.

For our construction, we also require the signer to prove that it computed the signature on an encrypted message correctly. In Section 5 we describe how such a proof can be done. (Intuitively, one would think that one could just decrypt and then verify whether the result is a valid signature. However, we cannot do this in the security proof of our pseudonym scheme where we reduce to the security of the homomorphic encryption scheme, as then we don't have a decryption oracle.)

4. OUR PROTOCOL

In this section we present our protocols for an (un)linkable pseudonym system. We first give a high-level idea and then explain the detailed construction.

Roughly, the computation of pseudonyms is done in several layers, each adding randomness to the process such that the final pseudonym $nym_{i,A}$ is indistinguishable from a random value (unless both $\mathcal{X}$ and $\mathcal{S}_A$ are corrupt), as required by our ideal functionality. At the same time, the pseudonyms must still have some (hidden) structure, which allows the consistent transformation of pseudonyms by the converter.

The main idea is to let the converter first derive a pseudorandom “core identifier” $z_i \leftarrow \mathsf{PRF}_G(x_X, uid_i)$ from $uid_i$ and for secret key $x_X$. From the unique core identifier $z_i$, the converter then derives its pseudonym contribution using a secret exponent $x_A$ that it chooses for each server $\mathcal{S}_A \in \mathbf{S}$, but never reveals to them.

For the blind conversion, we use homomorphic encryption so that the first server $\mathcal{S}_A$ can encrypt the pseudonym for the second server $\mathcal{S}_B$ and hand this encryption to the converter, who, using the homomorphic properties of the encryption scheme, raises the encrypted pseudonym to the quotient of the two servers' secret keys, thereby transforming the encrypted pseudonym.
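The core of the blind conversion can be sketched in a few lines. The example below uses the ElGamal variant of Section 3.4 in a toy group and covers only the inner encryption layer under the target server's key; the outer encryption under the converter's key, all signatures and all proofs are deliberately left out. Group parameters and function names are assumptions made for illustration.

```python
import secrets

# Toy Schnorr group (assumption, illustration only): P = 2*Q + 1, G has order Q.
P, Q, G = 2039, 1019, 4

def enc(epk, m):
    """ElGamal variant from Section 3.4: (C1, C2) = (y^r, g^r * m)."""
    r = secrets.randbelow(Q)
    return (pow(epk, r, P), pow(G, r, P) * m % P)

def dec(esk, c):
    c1, c2 = c
    g_r = pow(c1, pow(esk, -1, Q), P)
    return c2 * pow(g_r, -1, P) % P

def convert(c, x_a, x_b, epk_b):
    """Converter side: re-randomize with an encryption of 1, then raise the
    ciphertext to Delta = x_b / x_a mod Q without ever decrypting it."""
    delta = x_b * pow(x_a, -1, Q) % Q
    one = enc(epk_b, 1)
    mixed = (c[0] * one[0] % P, c[1] * one[1] % P)
    return (pow(mixed[0], delta, P), pow(mixed[1], delta, P))

# Example run with arbitrary illustrative keys.
esk_b = 321                                   # decryption key of server B
epk_b = pow(G, esk_b, P)
z_i = pow(G, pow((17 + 5) % Q, -1, Q), P)     # core identifier g^(1/(x_X + uid_i))
x_a, x_b = 101, 202                           # converter's per-server exponents
xnym_a = pow(z_i, x_a, P)                     # inner pseudonym held by server A
xnym_b = dec(esk_b, convert(enc(epk_b, xnym_a), x_a, x_b, epk_b))
```

Here `xnym_b` equals `pow(z_i, x_b, P)`, i.e., the inner pseudonym that a direct generation for server B would have produced, while the converter only ever handled ciphertexts.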

The tricky part is to make this whole pseudonym generation and conversion process verifiable and consistent, but without harming the unlinkability and blindness properties. In particular, for pseudonym generation a server $\mathcal{S}_A$ must be assured that it receives correctly formed pseudonyms. For conversion, $\mathcal{S}_A$ needs to prove to the converter that it encrypted a valid pseudonym, and the converter needs to prove to the server $\mathcal{S}_B$ that it applied the conversion correctly. This is achieved by a careful composition of nested encryption, dual-mode signatures which allow signing of plain and encrypted messages, and zero-knowledge proofs.

In the following we give the detailed description of our protocol and also provide some intuition for the protocol design.

4.1 Detailed Description

We now describe our protocol assuming that a certificate authority functionality $\mathcal{F}_{CA}$, a secure message transmission functionality $\mathcal{F}_{SMT}$ (enabling authenticated and encrypted communication), and a common reference string functionality $\mathcal{F}_{CRS}$ are available to all parties. For details of those functionalities we refer to [11]. $\mathcal{F}_{CRS}$ provides all parties with the system parameters, consisting of the security parameter $\tau$ and a cyclic group $G = \langle g \rangle$ of order $q$ (which is a $\tau$-bit prime). In the description of the protocol, we assume that parties call $\mathcal{F}_{CA}$ to retrieve the necessary key material whenever they use a public key of another party. Further, if any of the checks in the protocol fails, the protocol ends with a failure message.

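Converter Setup:

  $(epk_X, esk_X) \leftarrow_{\$} \mathsf{EncKGen}_G(1^\tau)$
  $(x_X, y_X) \leftarrow_{\$} \mathsf{PRFKGen}_G(1^\tau)$
  for each server $\mathcal{S}_A \in \mathbf{S}$:
    $(spk_{X,A}, ssk_{X,A}) \leftarrow_{\$} \mathsf{SigKGen}_G(1^\tau)$
    choose a random $x_A \leftarrow_{\$} \mathbb{Z}_q$ and compute $y_A \leftarrow g^{x_A}$
  store $sk_X \leftarrow (esk_X, x_X, \{x_A, ssk_{X,A}\}_{\forall \mathcal{S}_A \in \mathbf{S}})$
  register $pk_X \leftarrow (epk_X, y_X, \{y_A, spk_{X,A}\}_{\forall \mathcal{S}_A \in \mathbf{S}})$ with $\mathcal{F}_{CA}$

Server Setup (by each server $\mathcal{S}_A \in \mathbf{S}$):

  $(epk_A, esk_A) \leftarrow_{\$} \mathsf{EncKGen}_G(1^\tau)$
  $(spk_A, ssk_A) \leftarrow_{\$} \mathsf{SigKGen}(1^\tau)$
  $k_A \leftarrow_{\$} \mathsf{PRPKGen}_G(1^\tau)$
  store $sk_A \leftarrow (esk_A, ssk_A, k_A)$
  register $pk_A \leftarrow (spk_A, epk_A)$ with $\mathcal{F}_{CA}$

Figure 3: Setup of Converter and Servers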

Setup.

Before starting a new instance of our (un)linkable pseudonym system, we assume that the converter and all servers use standard techniques [11, 4] to agree on a session identifier $sid = (sid', \mathcal{X}, \mathbf{S}, U, N)$, where $sid'$ is a fresh and unique string, $\mathcal{X}$ and $\mathbf{S} = \{\mathcal{S}_A, \mathcal{S}_B, \ldots\}$ denote the identities of the communicating parties, and $U = \mathbb{Z}_q$ and $N = G$ define the domain of user identifiers and pseudonyms, respectively. Then, whenever a new $sid$ has been agreed on, all specified entities $\mathcal{X}$ and $\mathbf{S}$ generate their keys as described in Figure 3. For simplicity, we assume that the converter setup is trusted and discuss in the full paper how this assumption can be relaxed (e.g., by adding NIZKs proving knowledge of the secret keys, or distributed key generation).

Pseudonym Generation.

A pseudonym $nym_{i,A}$ for main identifier $uid_i$ and server $\mathcal{S}_A$ is jointly computed by the server and the converter $\mathcal{X}$, as depicted in Figure 4. The generation is initiated by the converter and starts by applying a pseudorandom function to $uid_i$, obtaining a secret “core identifier” $z_i \leftarrow \mathsf{PRF}_G(x_X, uid_i)$. As $x_X$ is a secret key known only to the converter, the servers are not privy to the mapping between $uid_i$ and $z_i$. From the core identifier $z_i$, which is the same for all servers in $\mathbf{S}$, the converter then derives a server-specific “inner pseudonym” $xnym_{i,A} \leftarrow z_i^{x_A}$ for a secret conversion value $x_A$ that $\mathcal{X}$ chooses internally for every server $\mathcal{S}_A$, but never reveals to them. By using a verifiable PRF and proving correctness of the computation in $\pi_{nym}$, the entire process of deriving the inner pseudonym $xnym_{i,A}$ can be verified by the server. If $anon = 0$, i.e., the pseudonym should be verifiably derived from a particular $uid_i$ that is also given to the server, the proof is done w.r.t. that $uid_i$, whereas for $anon = 1$, $\pi_{nym}$ only shows that the pseudonym was formed correctly for some $uid_i$. In the latter case, the proof actually shows that the pseudonym is of the correct form $xnym_{i,A} = z_i^{x_A}$ for some $z_i \in G$ and also allows for extraction of $z_i$, as this will be required in the security proof.

The inner pseudonym $xnym_{i,A}$ is also accompanied by a server-specific signature $\sigma_{nym}$ generated by the converter (using a dedicated signing key for each server). This signature will be crucial in a conversion request to ensure that only the server $\mathcal{S}_A$ for which the pseudonym was intended can subsequently use it in a conversion. We use the dual-mode signature for that purpose, as the converter needs to sign pseudonyms also in a blind way when they are generated via a conversion request.

When receiving a correctly signed and derived $xnym_{i,A}$, the server $\mathcal{S}_A$ then adds the final pseudonym layer by applying a pseudorandom permutation to $xnym_{i,A}$ for secret key $k_A$ as $nym_{i,A} \leftarrow \mathsf{PRP}_G(k_A, xnym_{i,A})$. This ensures that the server's output $nym_{i,A}$ cannot be linked to $xnym_{i,A}$ or $uid_i$ by a corrupt converter.

Conversion Request.

When a server $\mathcal{S}_A$ wishes to convert a pseudonym $nym_{i,A}$ towards a server $\mathcal{S}_B$, it sends a conversion request to $\mathcal{X}$, as described in Figure 5. Each request also comes with a unique query identifier $qid$ (which can be established through the same standard techniques as $sid$). To achieve blindness of the request towards $\mathcal{X}$, the server encrypts the unwrapped inner pseudonym $xnym_{i,A}$ under $\mathcal{S}_B$'s public key. We also add a second layer of encryption using $\mathcal{X}$'s public key. This nested encryption is necessary to allow $\mathcal{X}$ to later prove correctness of a conversion towards the target server $\mathcal{S}_B$, but without $\mathcal{S}_B$ learning the value $xnym_{i,A}$. The signature $\sigma_C$ of $\mathcal{S}_A$ on the nested encryption serves the same purpose. Both the signature and the proof are thereby bound to the query identifier $qid$, such that a corrupt $\mathcal{X}$ cannot reuse the values in a different session.

Finally, we also want to ensure that $\mathcal{S}_A$ can only trigger conversions of correct pseudonyms that “belong” to the server. Therefore, $\mathcal{S}_A$ has to prove in $\pi_A$ that the ciphertext sent in the request contains a pseudonym $xnym_{i,A}$ that is signed under the correct key of the converter (but without revealing the signature).

When the converter $\mathcal{X}$ receives such a request, it first verifies the signature $\sigma_C$ and proof $\pi_A$. If both are valid, $\mathcal{X}$ asks the environment whether it shall proceed. This is the hook to some external procedure which decides if the conversion from $\mathcal{S}_A$ to $\mathcal{S}_B$ shall be granted or not.

Conversion Response.

If the converter gets the approval to proceed, $\mathcal{X}$ and $\mathcal{S}_B$ complete the conversion as depicted in Figure 5. First, $\mathcal{X}$ uses the homomorphic property of the encryption scheme and raises the encrypted pseudonym to the quotient $x_B/x_A$ of the two servers' secret conversion keys, thereby blindly transforming the encrypted inner pseudonym into $xnym_{i,B}$. The converter also re-randomizes the ciphertext by multiplying it with an encryption of “1”, which is crucial for proving unlinkability. To allow $\mathcal{S}_B$ to subsequently use the obtained pseudonym, the converter also “blindly” signs the encrypted $xnym_{i,B}$ with the dual-mode signature scheme. To ensure consistency of a conversion (even in the presence of a corrupt converter), $\mathcal{X}$ proves correctness of the transformation in $\pi_X$. The converter then sends the encrypted inner pseudonym $xnym_{i,B}$ as $C''$ with encrypted signature $\overline{\sigma}_{nym}$ to $\mathcal{S}_B$, and also forwards the received tuple $(C, \sigma_C, \pi_A)$.

Step 1. Upon input $(\mathsf{NYMGEN}, sid, uid_i, \mathcal{S}_A, anon)$, converter $\mathcal{X}$ generates its pseudonym contribution:

  a) Check that $uid_i \in \mathbb{Z}_q$, and if so compute $xnym_{i,A} \leftarrow \mathsf{PRF}_G(x_X, uid_i)^{x_A}$ and $\sigma_{nym} \leftarrow \mathsf{Sign}_G(ssk_{X,A}, xnym_{i,A})$.

  b) Prove correctness of the pseudonym generation in the proof $\pi_{nym}$:

     if $anon = 0$: $\pi_{nym} \leftarrow_{\$} \mathsf{NIZK}\{(x_A, x_X, z_i) : xnym_{i,A} = z_i^{x_A} \wedge y_A = g^{x_A} \wedge z_i = \mathsf{PRF}_G(x_X, uid_i)\}(sid)$.

     if $anon = 1$: $\pi_{nym} \leftarrow_{\$} \mathsf{NIZK}\{(x_A, \underline{z_i}) : xnym_{i,A} = z_i^{x_A} \wedge y_A = g^{x_A} \wedge z_i \in G\}(sid)$, and set $uid_i \leftarrow \bot$.

  c) Send $(sid, xnym_{i,A}, \sigma_{nym}, \pi_{nym}, uid_i)$ via $\mathcal{F}_{SMT}$ to $\mathcal{S}_A$, end with no output.

Step 2. Upon receiving $(sid, xnym_{i,A}, \sigma_{nym}, \pi_{nym}, uid_i)$ from $\mathcal{X}$, $\mathcal{S}_A$ verifies the input and derives the final pseudonym:

  a) Verify that $\mathsf{Vf}_G(spk_{X,A}, \sigma_{nym}, xnym_{i,A}) = 1$ and that the proof $\pi_{nym}$ is correct w.r.t. $y_A$ and $y_X$, $uid_i$ (if $uid_i \neq \bot$).

  b) Compute $nym_{i,A} \leftarrow \mathsf{PRP}_G(k_A, xnym_{i,A})$, store $(sid, nym_{i,A}, \sigma_{nym})$ and end with output $(\mathsf{NYMGEN}, sid, nym_{i,A}, uid_i)$.

Figure 4: Pseudonym Generation

When $\mathcal{S}_B$ receives a conversion request, it first checks that $\sigma_C$ and $\pi_A$ are valid, ensuring that the request indeed was triggered by $\mathcal{S}_A$ for query $qid$ and for the pseudonym contained in $C$. Once $\mathcal{S}_B$ has also verified the correctness of the conversion via $\pi_X$, it decrypts $xnym_{i,B}$ and the corresponding signature $\sigma_{nym}$. The final pseudonym $nym_{i,B}$ is again derived using $\mathsf{PRP}_G$. The server stores the pseudonym and signature, and outputs $nym_{i,B}$ together with the query identifier $qid$.

4.2 Security and Efficiency

Our protocol described above securely realizes the ideal functionality $\mathcal{F}_{nym}$ defined in Section 2. The detailed proof of the following theorem is given in the full paper.

Theorem 4.1. The (un)linkable pseudonym system described in Section 4 securely implements the ideal functionality $\mathcal{F}_{nym}$ defined in Section 2 in the $(\mathcal{F}_{CA}, \mathcal{F}_{CRS}, \mathcal{F}_{SMT})$-hybrid model, provided that

– $(\mathsf{EncKGen}_G, \mathsf{Enc}_G, \mathsf{Dec}_G)$ is a semantically secure homomorphic encryption scheme,

– $(\mathsf{SigKGen}_G, \mathsf{Sign}_G, \mathsf{EncSign}_G, \mathsf{DecSign}_G, \mathsf{Vf}_G)$ is an unforgeable dual-mode signature scheme (as defined in Def. 3.1),

– $(\mathsf{SigKGen}, \mathsf{Sign}, \mathsf{Vf})$ is an unforgeable signature scheme,

– $(\mathsf{PRFKGen}_G, \mathsf{PRF}_G)$ is a secure and verifiable pseudorandom function,

– $(\mathsf{PRPKGen}_G, \mathsf{PRP}_G)$ is a secure pseudorandom permutation,

– the proof system used for $\mathsf{NIZK}$ is zero-knowledge, simulation-sound and online-extractable (for the underlined values), and

– the DDH assumption holds in group $G$.

When instantiated with the ElGamal encryption scheme for $(\mathsf{EncKGen}_G, \mathsf{Enc}_G, \mathsf{Dec}_G)$, with Schnorr signatures [25, 23] for $(\mathsf{SigKGen}, \mathsf{Sign}, \mathsf{Vf})$, with the AGOT+ dual-mode signature scheme for $(\mathsf{SigKGen}_G, \mathsf{Sign}_G, \mathsf{EncSign}_G, \mathsf{DecSign}_G, \mathsf{Vf}_G)$, with the Dodis-Yampolskiy PRF [16] for $(\mathsf{PRFKGen}_G, \mathsf{PRF}_G)$, and with the proof protocols and “lazy” $\mathsf{PRP}_G$ described in Section 5, then by the security of the underlying building blocks we have the following corollary:

Corollary 4.1. The (un)linkable pseudonym system described in Section 4 and instantiated as described above securely realizes $\mathcal{F}_{nym}$ in the $(\mathcal{F}_{CA}, \mathcal{F}_{CRS}, \mathcal{F}_{SMT})$-hybrid model under the Symmetric eXternal Decision Diffie-Hellman (SXDH) assumption [3], the $q$-Decisional Diffie-Hellman Inversion assumption [7], and the unforgeability of the AGOT scheme (which holds in the generic group model).

Efficiency.

With the primitives instantiated as stated above, we obtain the following efficiency figures, where $\exp_G$ denotes an exponentiation in group $G$ (likewise $\exp_{\tilde{G}}$ and $\exp_{G_t}$) and pair stands for a pairing computation. Many of these exponentiations can be merged into multi-base exponentiations, which allows the computational complexity to be optimized substantially.

Converter $\mathcal{X}$
  Pseudonym Generation:  $10(+1)\,\exp_G + 1\,\exp_{\tilde{G}}$
  Conversion Request:    $7\,\exp_G + 4\,\exp_{G_t} + 8$ pair
  Conversion Response:   $34\,\exp_G + 2\,\exp_{\tilde{G}}$

Server ($\mathcal{S}_A$ or $\mathcal{S}_B$)
  Pseudonym Generation:  $4(+1)\,\exp_G + 4$ pair
  Conversion Request:    $8\,\exp_G + 4\,\exp_{G_t} + 8$ pair
  Conversion Response:   $30\,\exp_G + 5\,\exp_{G_t} + 12$ pair

4.3 Honest-but-Curious Converter

Our protocol achieves very strong security against active attacks, tolerating even a fully corrupt converter. One might argue though that a converter in our system is at most of the honest-but-curious type, i.e., the converter will always perform the protocol correctly but might aim at exploiting the information it sees, or might lose its data. Then, it will be sufficient to consider a weaker model where the converter will either be non-corrupted or of such honest-but-curious type. Regarding servers, however, considering active attacks is less debatable. Indeed, our pseudonym system can be used by a multitude of servers, possibly from private and public domains, and thus security should hold against servers that behave entirely maliciously (as in our notion).

If one is willing to assume the weaker honest-but-curious adversary model for the converter, one can easily derive a more light-weight version from our protocol. Roughly, all parts where the converter proves correctness of its computations can be omitted. We now briefly sketch the necessary changes to our protocol and their impact on the efficiency numbers.

Pseudonym Generation.

In the pseudonym generation, the generation of the proof $\pi_{nym}$ by the converter and the verification of $\pi_{nym}$ and of the received signature $\sigma_{nym}$ by the server $\mathcal{S}_A$ can be omitted. This reduces the complexity of the converter's part to $6\,\exp_G + 1\,\exp_{\tilde{G}}$, and $\mathcal{S}_A$ no longer has to perform any exponentiation or pairing.

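ConversionRequest: The server $\mathcal{S}_A$ requests a conversion of pseudonym $nym_{i,A}$ towards server $\mathcal{S}_B$.

Step 1. Upon input $(\mathsf{CONVERT}, sid, qid, nym_{i,A}, \mathcal{S}_B)$, server $\mathcal{S}_A$ computes and sends the request:

  a) Retrieve $(sid, nym_{i,A}, \sigma_{nym})$ for $nym_{i,A}$ and abort if no such record exists.

  b) Compute $xnym_{i,A} \leftarrow \mathsf{PRP}^{-1}_G(k_A, nym_{i,A})$, $C \leftarrow_{\$} \mathsf{Enc}_G(epk_X, \mathsf{Enc}_G(epk_B, xnym_{i,A}))$, and $\sigma_C \leftarrow \mathsf{Sign}(ssk_A, (sid, qid, C))$.

  c) Prove knowledge of a converter's signature $\sigma_{nym}$ on the underlying $xnym_{i,A}$ and under key $spk_{X,A}$:

     $\pi_A \leftarrow_{\$} \mathsf{NIZK}\{(xnym_{i,A}, \sigma_{nym}) : \mathsf{Vf}_G(spk_{X,A}, \sigma_{nym}, xnym_{i,A}) = 1 \wedge C = \mathsf{Enc}_G(epk_X, \mathsf{Enc}_G(epk_B, xnym_{i,A}))\}(sid, qid)$.

  d) Send $(sid, qid, C, \pi_A, \sigma_C, \mathcal{S}_B)$ via $\mathcal{F}_{SMT}$ to $\mathcal{X}$ and end with no output.

Step 2. Upon receiving $(sid, qid, C, \pi_A, \sigma_C, \mathcal{S}_B)$ from $\mathcal{S}_A$, $\mathcal{X}$ verifies the request and asks for permission to proceed:

  a) Verify that $\mathsf{Vf}(spk_A, \sigma_C, (sid, qid, C)) = 1$ and $\pi_A$ is correct w.r.t. $spk_{X,A}$ and the received ciphertext $C$.

  b) Store $(\mathsf{convert}, sid, qid, C, \pi_A, \sigma_C, \mathcal{S}_A, \mathcal{S}_B)$ and output $(\mathsf{CONVERT}, sid, qid, \mathcal{S}_A, \mathcal{S}_B)$.

ConversionResponse: The converter $\mathcal{X}$ and server $\mathcal{S}_B$ blindly convert the encrypted pseudonym into $nym_{i,B}$.

Step 1. Upon input $(\mathsf{PROCEED}, sid, qid)$, $\mathcal{X}$ blindly derives the encrypted pseudonym $xnym_{i,B}$ for $\mathcal{S}_B$:

  a) Retrieve the conversion record $(\mathsf{convert}, sid, qid, C, \pi_A, \sigma_C, \mathcal{S}_A, \mathcal{S}_B)$ for $qid$ and abort if no such record exists.

  b) Compute $C' \leftarrow \mathsf{Dec}_G(esk_X, C)$ and $C'' \leftarrow_{\$} (C' \odot \mathsf{Enc}_G(epk_B, 1))^{\Delta}$ where $\Delta \leftarrow x_B/x_A \pmod{q}$.

  c) Sign the encrypted pseudonym using the secret key $ssk_{X,B}$ for $\mathcal{S}_B$ as $\overline{\sigma}_{nym} \leftarrow_{\$} \mathsf{EncSign}_G(ssk_{X,B}, epk_B, C'')$.

  d) Prove correctness of the computation of $C''$ and $\overline{\sigma}_{nym}$ in $\pi_X$:

     $\pi_X \leftarrow_{\$} \mathsf{NIZK}\{(\Delta, C', ssk_{X,B}, esk_X) : \overline{\sigma}_{nym} = \mathsf{EncSign}_G(ssk_{X,B}, epk_B, C'') \wedge C' = \mathsf{Dec}_G(esk_X, C) \wedge C'' = (C' \odot \mathsf{Enc}_G(epk_B, 1))^{\Delta} \wedge y_A^{\Delta} = y_B\}(sid, qid)$.

  e) Send $(sid, qid, C, C'', \sigma_C, \overline{\sigma}_{nym}, \pi_A, \pi_X, \mathcal{S}_A)$ via $\mathcal{F}_{SMT}$ to $\mathcal{S}_B$.

Step 2. Upon receiving $(sid, qid, C, C'', \sigma_C, \overline{\sigma}_{nym}, \pi_A, \pi_X, \mathcal{S}_A)$ from $\mathcal{X}$, $\mathcal{S}_B$ derives its local pseudonym $nym_{i,B}$:

  a) Verify that $\mathsf{Vf}(spk_A, \sigma_C, (sid, qid, C)) = 1$, $\pi_A$ is correct w.r.t. $spk_{X,A}$, $C$, and $\pi_X$ is correct w.r.t. $C''$.

  b) Compute $xnym_{i,B} \leftarrow \mathsf{Dec}_G(esk_B, C'')$ and $\sigma_{nym} \leftarrow \mathsf{DecSign}_G(esk_B, spk_{X,B}, \overline{\sigma}_{nym})$.

  c) Verify that $\mathsf{Vf}_G(spk_{X,B}, \sigma_{nym}, xnym_{i,B}) = 1$ and derive the final pseudonym as $nym_{i,B} \leftarrow \mathsf{PRP}_G(k_B, xnym_{i,B})$.

  d) Store $(sid, nym_{i,B}, \sigma_{nym})$ and end with output $(\mathsf{CONVERTED}, sid, qid, \mathcal{S}_A, nym_{i,B})$.

Figure 5: Conversion Request and Response Protocol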

Conversion Request.

The changes to the conversion protocol are slightly more complex. When $\mathcal{S}_A$ prepares its request, we can remove the outer encryption layer of $C$ and omit the signature $\sigma_C$. Both allowed $\mathcal{X}$ to forward $\mathcal{S}_A$'s request in a blind yet verifiable manner to $\mathcal{S}_B$. Relying on an honest-but-curious converter, this is not needed anymore. Overall, the complexity in the conversion request decreases to $5\,\exp_G + 4\,\exp_{G_t} + 8$ pair for $\mathcal{S}_A$ and $3\,\exp_G + 4\,\exp_{G_t} + 8$ pair for $\mathcal{X}$.

Conversion Response.

When the converter computes its conversion response, we can omit the proof $\pi_X$. Further, $\mathcal{X}$ does not have to forward the proof $\pi_A$ to $\mathcal{S}_B$, as this was needed in the security proof only when the converter was corrupt. Also, the ciphertext $C$ no longer needs to be forwarded to $\mathcal{S}_B$ and in fact should not be forwarded, as we just changed $C$ to be a direct encryption of $nym_{i,A}$ under $\mathcal{S}_B$'s key. Overall, the only values sent from $\mathcal{X}$ to $\mathcal{S}_B$ are now $(sid, qid, C'', \overline{\sigma}_{nym}, \mathcal{S}_A)$. Consequently, the part of the receiving server $\mathcal{S}_B$ also gets more light-weight: it no longer has to verify $\pi_A$, $\pi_X$, or $\sigma_C$. $\mathcal{S}_B$'s verification of the decrypted signature $\sigma_{nym}$ on the converted pseudonym becomes obsolete, too. This significantly reduces the overall complexity of the response protocol to $23\,\exp_G + 1\,\exp_{\tilde{G}}$ for $\mathcal{X}$ and $3\,\exp_G$ for $\mathcal{S}_B$.

5. CONCRETE INSTANTIATIONS

In this section we describe how to instantiate the different proofs used in our protocol, assuming that the ElGamal encryption scheme [17] is used for $(\mathsf{EncKGen}_G, \mathsf{Enc}_G, \mathsf{Dec}_G)$, AGOT+ signatures (as defined in Section 3.5) for $(\mathsf{SigKGen}_G, \mathsf{Sign}_G, \mathsf{EncSign}_G, \mathsf{DecSign}_G, \mathsf{Vf}_G)$, and the Dodis-Yampolskiy construction [16] for the verifiable pseudorandom function $(\mathsf{PRFKGen}_G, \mathsf{PRF}_G)$. The concrete instantiations of the standard signature and the PRP have no influence on the proofs, as they do not appear in any of the proven statements. We also describe some optimisations for computing the nested encryption $C$ and deriving the ciphertext $C''$, which enhance the efficiency of our scheme.

Our instantiation requires that $\mathcal{F}_{\mathrm{CRS}}$ provides all parties, instead of the single group $G$, with three groups $G = \langle g \rangle$, $\tilde{G} = \langle \tilde{g} \rangle$, $G_t$ of prime order $q$, and a bilinear map $e : G \times \tilde{G} \to G_t$. Those are generated as $(q, G, \tilde{G}, G_t, e, g, \tilde{g}) \leftarrow_{\$} \mathcal{G}(1^{\tau})$. For the system parameters of the AGOT(+) scheme, an additional random group element $x \leftarrow_{\$} G$ is included in the CRS. Finally, to achieve online extractability for the NIZK proofs, we require that the CRS further contains a random public key $y \in G$. In the security proof, the simulator will choose a random $x \in \mathbb{Z}_q$ and set $y \leftarrow g^{x}$, which allows it to efficiently extract the necessary values as described in the preliminaries.

Overall, the CRS in our scheme then has the form $\mathit{crs} = (q, G, \tilde{G}, G_t, e, g, \tilde{g}, x, y)$. The converter's keys for the dual-mode signature (for each server $\mathcal{S}_A \in \mathbf{S}$) have the form $(\mathit{spk}_{\mathcal{X},A} = y_{\mathcal{X},A},\ \mathit{ssk}_{\mathcal{X},A} = v_{\mathcal{X},A})$, and for the ElGamal encryption we have $(\mathit{epk}_{\mathcal{X}} = y_{\mathcal{X}},\ \mathit{esk}_{\mathcal{X}} = x_{\mathcal{X}})$.

5.1 Pseudonym Generation

In $\pi_{nym}$ the converter has to prove that it has generated its pseudonym contribution $\mathit{xnym}_{i,A}$ correctly. If the pseudonym is not anonymous ($\mathit{anon} = 0$), the proof also shows that the core identifier $z_i = \mathsf{PRF}_G(x_{\mathcal{X}}, \mathit{uid}_i)$ was computed correctly.

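If the flag $\mathit{anon} = 1$, the proof $\pi_{nym}$ is instantiated as follows: first compute an ElGamal encryption of $z_i$ under the CRS key as $Z = (Z_1, Z_2) \leftarrow (y^{r}, z_i g^{r})$ with a randomly chosen $r \leftarrow_{\$} \mathbb{Z}_q$. Then compute the proof $\pi'_{nym}$:

\[
\pi'_{nym} \leftarrow_{\$} \mathsf{SPK}\{(x'_A, r) : Z_1 = y^{r} \ \wedge\ Z_2 = g^{r}\, \mathit{xnym}_{i,A}^{x'_A} \ \wedge\ g = y_A^{x'_A}\}(\mathit{sid}, g, y_A, \mathit{xnym}_{i,A}, Z_1, Z_2)
\]

and output $\pi_{nym} \leftarrow (Z, \pi'_{nym})$. For the analysis of this proof, notice that $z_i = \mathit{xnym}_{i,A}^{1/x_A} = \mathit{xnym}_{i,A}^{x'_A}$.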
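As a minimal sketch of how a single relation of such a signature proof of knowledge could be realized, the Python code below proves knowledge of $x'_A$ with $g = y_A^{x'_A}$ using a Schnorr-style proof [25] made non-interactive with a hash-based challenge [18]. The toy group, the SHA-256 challenge derivation, and the names `prove` and `verify` are assumptions made for illustration; the full proof $\pi'_{nym}$ additionally covers the relations on $Z_1$ and $Z_2$.

```python
import hashlib
import secrets

# Toy parameters (assumption): the order-Q subgroup of Z_P^* with P = 2Q + 1.
P, Q, G = 2039, 1019, 4


def challenge(*parts):
    """Hash the statement and the commitment into a challenge in Z_Q."""
    data = "|".join(str(part) for part in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(sid, y_a, x_inv):
    """Prove knowledge of x_inv such that G = y_a^x_inv."""
    k = secrets.randbelow(Q)
    t = pow(y_a, k, P)                 # commitment
    c = challenge(sid, G, y_a, t)      # Fiat-Shamir challenge
    s = (k + c * x_inv) % Q            # response
    return t, s


def verify(sid, y_a, proof):
    """Accept iff y_a^s = t * G^c for the recomputed challenge c."""
    t, s = proof
    c = challenge(sid, G, y_a, t)
    return pow(y_a, s, P) == t * pow(G, c, P) % P


x_a = 1 + secrets.randbelow(Q - 1)     # converter's secret exponent for S_A
y_a = pow(G, x_a, P)                   # public value y_A = g^{x_A}
x_inv = pow(x_a, -1, Q)                # witness x'_A = 1/x_A
proof = prove("sid-1", y_a, x_inv)
```

Verification succeeds because $y_A^{s} = y_A^{k} (y_A^{x'_A})^{c} = t \cdot g^{c}$. With a group this small the proof carries no real soundness; the sketch only illustrates the algebra.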

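If the flag $\mathit{anon} = 0$, then $\pi_{nym}$ is instantiated as:

\[
\begin{aligned}
\pi_{nym} \leftarrow_{\$} \mathsf{SPK}\{(x'_A, x'_{\mathcal{X}}) :\ & 1 = g^{x'_{\mathcal{X}}}\, y_{\mathcal{X}}^{-x'_A} \ \wedge\ g = y_A^{x'_A} \ \wedge \\
& g = (\mathit{xnym}_{i,A}^{\mathit{uid}_i})^{x'_A}\, \mathit{xnym}_{i,A}^{x'_{\mathcal{X}}} \}(\mathit{sid}, g, y_A, \mathit{xnym}_{i,A}, y_{\mathcal{X}}, \mathit{uid}_i)
\end{aligned}
\]

where $y_{\mathcal{X}}$ is part of the converter's public key. Let us analyse the latter proof. The first term establishes that $x'_{\mathcal{X}} = x'_A x_{\mathcal{X}}$, the second that $x'_A = 1/x_A$, and the third that

\[
\mathit{xnym}_{i,A} = (g^{1/(\mathit{uid}_i + x_{\mathcal{X}})})^{x_A}.
\]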
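The three relations of the $\mathit{anon} = 0$ proof can be checked numerically. The following Python sketch, over an assumed toy group, computes the core identifier in the Dodis-Yampolskiy form $g^{1/(\mathit{uid}_i + x_{\mathcal{X}})}$, derives $\mathit{xnym}_{i,A}$, and evaluates each relation with the witnesses $x'_A = 1/x_A$ and $x'_{\mathcal{X}} = x'_A x_{\mathcal{X}}$. The concrete `uid` value and all identifiers are hypothetical.

```python
import secrets

# Toy parameters (assumption): the order-Q subgroup of Z_P^* with P = 2Q + 1.
P, Q, G = 2039, 1019, 4


def inv(a):
    """Inverse of an exponent modulo the group order Q."""
    return pow(a, -1, Q)


def prf(x_x, uid):
    """Core identifier z_i = g^{1/(uid + x_X)}."""
    return pow(G, inv((uid + x_x) % Q), P)


uid = 42                                   # hypothetical unique identifier
while True:
    x_x = 1 + secrets.randbelow(Q - 1)     # converter's PRF key x_X
    if (uid + x_x) % Q != 0:               # the PRF is undefined otherwise
        break
x_a = 1 + secrets.randbelow(Q - 1)         # converter's exponent for S_A

y_x = pow(G, x_x, P)
y_a = pow(G, x_a, P)
z = prf(x_x, uid)
xnym = pow(z, x_a, P)                      # xnym_{i,A} = z_i^{x_A}

# Witnesses used in the proof.
xa_w = inv(x_a)                            # x'_A = 1/x_A
xx_w = xa_w * x_x % Q                      # x'_X = x'_A * x_X

rel1 = pow(G, xx_w, P) * pow(y_x, -xa_w % Q, P) % P == 1
rel2 = pow(y_a, xa_w, P) == G
rel3 = pow(pow(xnym, uid, P), xa_w, P) * pow(xnym, xx_w, P) % P == G
```

The third relation holds because the exponent of $\mathit{xnym}_{i,A}$ collapses to $x'_A(\mathit{uid}_i + x_{\mathcal{X}})$, which cancels the exponent $x_A/(\mathit{uid}_i + x_{\mathcal{X}})$ hidden in the pseudonym.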

5.2 Conversion Request

In a conversion request, the server $\mathcal{S}_A$ has to prove that it knows a converter's signature on the inner pseudonym $\mathit{xnym}_{i,A}$, which it provides in double-encrypted form $C = \mathsf{Enc}_G(\mathit{epk}_{\mathcal{X}}, \mathsf{Enc}_G(\mathit{epk}_B, \mathit{xnym}_{i,A}))$.

We start with the description of how the double encryption $C = \mathsf{Enc}_G(\mathit{epk}_{\mathcal{X}}, \mathsf{Enc}_G(\mathit{epk}_B, \mathit{xnym}_{i,A}))$ is instantiated. We already apply some optimizations and extend the ciphertext such that it allows for the online extraction of the pseudonym and its signature in the proof $\pi_A$.

Let $\mathit{esk}_{\mathcal{X}} = x_{\mathcal{X}}$ and $\mathit{epk}_{\mathcal{X}} = y_{\mathcal{X}}$ be the encryption key pair of the converter, and $\mathit{esk}_B = x_B$ and $\mathit{epk}_B = y_B$ that of server $\mathcal{S}_B$. Let $y$ be the public key in the CRS. Then $C$ is computed as an extended ElGamal encryption as

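\[
C := (C_0, C_1, C_2, C_3) \leftarrow (y^{r_1 + r_2},\ y_B^{r_1},\ y_{\mathcal{X}}^{r_2},\ g^{r_1 + r_2}\, \mathit{xnym}_{i,A})
\]

with $r_1, r_2 \leftarrow_{\$} \mathbb{Z}_q$. Let $(\tilde{r}, s, t, w)$ be the converter's signature on $\mathit{xnym}_{i,A}$.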

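Then, the proof $\pi_A$ is realized as follows. First, the server $\mathcal{S}_A$ randomizes the signature $\sigma = (\tilde{r}, s, t, w)$ for key $\mathit{spk}_{\mathcal{X},A} = y_{\mathcal{X},A}$ by picking a random $u' \leftarrow_{\$} \mathbb{Z}_q^{*}$ and computing $\sigma' = (\tilde{r}', s', t', w')$ as

\[
\tilde{r}' \leftarrow \tilde{r}^{u'}, \qquad s' \leftarrow s^{1/u'}, \qquad t' \leftarrow (t\, w^{(u'-1)})^{1/u'^{2}}.
\]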

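Then it computes the proof

\[
\begin{aligned}
\pi'_A \leftarrow_{\$} \mathsf{SPK}\{(r_1, r_2, \nu_1, \nu_2) :\ & C_0 = y^{r_1 + r_2} \ \wedge\ C_1 = y_B^{r_1} \ \wedge\ C_2 = y_{\mathcal{X}}^{r_2} \ \wedge\ S_1 = y^{\nu_1} \ \wedge\ T_1 = y^{\nu_2} \ \wedge \\
& e(x, \tilde{g})\, e(C_3, y_{\mathcal{X},A}) / e(S_2, \tilde{r}') = e(g, y_{\mathcal{X},A})^{r_1 + r_2}\, e(g, \tilde{r}')^{-\nu_1} \ \wedge \\
& e(g, \tilde{g})\, e(S_2, y_{\mathcal{X},A}) / e(T_2, \tilde{r}') = e(g, y_{\mathcal{X},A})^{\nu_1}\, e(g, \tilde{r}')^{-\nu_2} \\
\}(\mathit{sid}, \mathit{qid}, \mathit{crs}, & C, \tilde{r}', S, T, w, y_{\mathcal{X},A}, y_{\mathcal{X}}, y_B),
\end{aligned}
\]

where $S = (S_1, S_2) = (y^{\nu_1}, g^{\nu_1} s')$ and $T = (T_1, T_2) = (y^{\nu_2}, g^{\nu_2} t')$ are (ordinary) ElGamal encryptions under the CRS key that make this proof online extractable. It outputs $\pi_A = (\pi'_A, S, T, \tilde{r}')$. The analysis of this proof is given in the full version of the paper.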

5.3 Conversion Response

Let us now detail how the converter computes the ciphertext $C'' = (C''_1, C''_2)$ and the proof $\pi_{\mathcal{X}}$ in which $\mathcal{X}$ proves that it derived and signed $C''$ (containing the translated pseudonym) correctly.

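Given the ciphertext $C = (C_0, C_1, C_2, C_3)$, the converter computes $C'_2 \leftarrow C_3 / C_2^{1/x_{\mathcal{X}}}$, $C''_2 \leftarrow (C'_2)^{x_B/x_A}\, g^{r}$, and $C''_1 \leftarrow C_1^{x_B/x_A}\, y_B^{r}$, with $r \leftarrow_{\$} \mathbb{Z}_q$. Let $\Delta = x_B/x_A \pmod{q}$. Notice that $(C_1, C'_2)$ is an encryption of $\mathit{xnym}_{i,A}$ under $y_B$ (provided $\mathcal{S}_A$ computed $C$ honestly) and that we have $C''_2 = (C_3 / C_2^{1/x_{\mathcal{X}}})^{\Delta}\, g^{r}$. Thus, $C'' = (C''_1, C''_2)$ is an encryption of $\mathit{xnym}_{i,B}$ under $y_B$.

Now, the converter computes the signature on the ciphertext $(C''_1, C''_2)$ for signing key $\mathit{ssk}_{\mathcal{X},B} = v_{\mathcal{X},B}$ (with public key $\mathit{spk}_{\mathcal{X},B} = y_{\mathcal{X},B}$). Choose random $u, \rho_1, \rho_2 \leftarrow_{\$} \mathbb{Z}_q^{*}$, and compute the (partially) encrypted signature $\bar{\sigma} = (\tilde{r}, S, T, w)$: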

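\[
\begin{aligned}
\tilde{r} &\leftarrow \tilde{g}^{u}, & w &\leftarrow g^{1/u}, \\
S_1 &\leftarrow {C''_1}^{\,v_{\mathcal{X},B}/u}\, y_B^{\rho_1}, & S_2 &\leftarrow ({C''_2}^{\,v_{\mathcal{X},B}}\, x)^{1/u}\, g^{\rho_1}, \\
T_1 &\leftarrow S_1^{\,v_{\mathcal{X},B}/u}\, y_B^{\rho_2}, & T_2 &\leftarrow (S_2^{\,v_{\mathcal{X},B}}\, g)^{1/u}\, g^{\rho_2}.
\end{aligned}
\]

Output $\bar{\sigma} = (\tilde{r}, (S_1, S_2), (T_1, T_2), w)$, where $(S_1, S_2)$ and $(T_1, T_2)$ are encryptions under $\mathcal{S}_B$'s public key $y_B$.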

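Then, the proof $\pi_{\mathcal{X}}$ that $\mathcal{X}$ computed and signed $(C''_1, C''_2)$ correctly is as follows:

\[
\begin{aligned}
\pi_{\mathcal{X}} \leftarrow_{\$} \mathsf{SPK}\{(u', v', \rho_1, \rho_2, \Delta, r, p) :\ & \tilde{g} = \tilde{r}^{u'} \ \wedge\ w = g^{u'} \ \wedge\ 1 = y_{\mathcal{X},B}^{-u'}\, \tilde{g}^{v'} \ \wedge \\
& S_1 = {C''_1}^{\,v'}\, y_B^{\rho_1} \ \wedge\ S_2 = {C''_2}^{\,v'}\, x^{u'}\, g^{\rho_1} \ \wedge \\
& T_1 = S_1^{\,v'}\, y_B^{\rho_2} \ \wedge\ T_2 = S_2^{\,v'}\, g^{u'}\, g^{\rho_2} \ \wedge\ y_B = y_A^{\Delta} \ \wedge \\
& C''_1 = C_1^{\Delta}\, y_B^{r} \ \wedge\ C''_2 = C_3^{\Delta}\, C_2^{p}\, g^{r} \ \wedge\ 1 = g^{\Delta}\, y_{\mathcal{X}}^{p} \\
\}(\mathit{sid}, \mathit{qid}, \mathit{crs}, & \tilde{r}, S, T, w, C''_1, C''_2, C, y_{\mathcal{X},B}, y_{\mathcal{X}}, y_B).
\end{aligned}
\]

Here the witnesses satisfy $u' = 1/u$ and $v' = v_{\mathcal{X},B}/u$. The last term establishes that $p = -\Delta/x_{\mathcal{X}}$. The last four terms show that the ciphertext $(C''_1, C''_2)$ was computed correctly from $(C_0, C_1, C_2, C_3)$, whereas all other terms show that the "encrypted" signature was computed correctly.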
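The ciphertext derivation of this section, leaving aside the signature and the proof, can be replayed on a toy group. The Python sketch below builds the extended ciphertext $C$ as $\mathcal{S}_A$ would, lets the converter derive $C''$, and lets $\mathcal{S}_B$ decrypt. Because the text uses $x_B$ both for the converter's per-server exponent and for $\mathcal{S}_B$'s decryption key, the sketch keeps them apart as `x_b` and `esk_b`; this naming, the parameters, and all function names are assumptions made for illustration.

```python
import secrets

# Toy parameters (assumption): the order-Q subgroup of Z_P^* with P = 2Q + 1.
P, Q, G = 2039, 1019, 4


def rnd():
    """Random non-zero exponent in Z_Q."""
    return 1 + secrets.randbelow(Q - 1)


def inv_q(a):
    """Inverse of an exponent modulo the group order."""
    return pow(a, -1, Q)


def div(a, b):
    """Division of group elements modulo P."""
    return a * pow(b, -1, P) % P


def encrypt_request(y, y_b, y_x, xnym_a):
    """Extended ciphertext C = (C0, C1, C2, C3) built by S_A."""
    r1, r2 = rnd(), rnd()
    return (pow(y, r1 + r2, P), pow(y_b, r1, P), pow(y_x, r2, P),
            pow(G, r1 + r2, P) * xnym_a % P)


def convert(c, esk_x, y_b, delta):
    """Converter's derivation of C'' = (C''_1, C''_2) from C."""
    _, c1, c2, c3 = c
    r = rnd()
    c2_prime = div(c3, pow(c2, inv_q(esk_x), P))   # strips X's layer
    c1_out = pow(c1, delta, P) * pow(y_b, r, P) % P
    c2_out = pow(c2_prime, delta, P) * pow(G, r, P) % P
    return c1_out, c2_out


def decrypt(esk_b, c):
    """S_B's ElGamal decryption: c2 / c1^(1/esk_b)."""
    c1, c2 = c
    return div(c2, pow(c1, inv_q(esk_b), P))


y = pow(G, rnd(), P)                          # CRS key
esk_x, esk_b = rnd(), rnd()                   # decryption keys of X and S_B
y_x, y_b = pow(G, esk_x, P), pow(G, esk_b, P)
x_a, x_b = rnd(), rnd()                       # converter's per-server exponents
delta = x_b * inv_q(x_a) % Q

z = pow(G, rnd(), P)                          # stand-in for z_i
xnym_a, xnym_b = pow(z, x_a, P), pow(z, x_b, P)

c = encrypt_request(y, y_b, y_x, xnym_a)
c_second = convert(c, esk_x, y_b, delta)
result = decrypt(esk_b, c_second)

# Exponent p from the last term of the proof: 1 = g^delta * y_X^p.
p = -delta * inv_q(esk_x) % Q
```

The component $C_0$ is never used by the converter; it only serves the online extraction in $\pi_A$.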

5.4 Instantiating $\mathsf{PRP}_G$ via Lazy Sampling

Our scheme makes use of a pseudorandom permutation $\mathsf{PRP}_G$ to let each server derive its final pseudonym $\mathit{nym}_{i,A}$ as $\mathit{nym}_{i,A} \leftarrow \mathsf{PRP}_G(k_A, \mathit{xnym}_{i,A})$ and also re-obtain $\mathit{xnym}_{i,A}$ via $\mathsf{PRP}_G^{-1}$ in a conversion request. However, we mainly introduced $\mathsf{PRP}_G$ for notational convenience. In fact, it is sufficient to choose a random $\mathit{nym}_{i,A} \leftarrow_{\$} G$ whenever a fresh $\mathit{xnym}_{i,A}$ is received, and to keep a list $L_{nym}$ of the mappings $(\mathit{nym}_{i,A}, \mathit{xnym}_{i,A})$. Then, whenever $\mathit{xnym}_{i,A}$ appears again, $\mathcal{S}_A$ simply retrieves $\mathit{nym}_{i,A}$ from $L_{nym}$, and vice versa.
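A minimal Python sketch of such a lazily sampled mapping is shown below. The class name `LazyPRP`, the two dictionaries standing in for the list $L_{nym}$, and the toy group are assumptions. The resampling on collision is an addition for the tiny group used here; in a group of cryptographic size a collision occurs only with negligible probability.

```python
import secrets

# Toy parameters (assumption): the order-Q subgroup of Z_P^* with P = 2Q + 1.
P, Q, G = 2039, 1019, 4


class LazyPRP:
    """Random injective mapping xnym <-> nym, filled in on first use."""

    def __init__(self):
        self._to_nym = {}    # xnym -> nym
        self._to_xnym = {}   # nym -> xnym

    def nym(self, xnym):
        """Return the pseudonym for xnym, sampling a fresh one if needed."""
        if xnym not in self._to_nym:
            while True:
                candidate = pow(G, secrets.randbelow(Q), P)
                if candidate not in self._to_xnym:   # keep the map injective
                    break
            self._to_nym[xnym] = candidate
            self._to_xnym[candidate] = xnym
        return self._to_nym[xnym]

    def xnym(self, nym):
        """Inverse direction, as needed in a conversion request."""
        return self._to_xnym[nym]


prp = LazyPRP()
x1, x2 = pow(G, 5, P), pow(G, 9, P)    # two example pseudonym contributions
n1 = prp.nym(x1)
n2 = prp.nym(x2)
```

Calling `prp.nym(x1)` again returns the stored value, and `prp.xnym(n1)` recovers `x1`, which is all the protocol requires from the permutation and its inverse.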

6. CONCLUSION AND EXTENSIONS

We have presented a protocol for maintaining and exchanging data in a decentralized manner, based on pseudonyms which are per se unlinkable but can be transformed from one server to another with the help of a central converter. Our protocol overcomes the typical privacy bottleneck of such a system, as it performs the pseudonym generation and conversion only in a blind way. It also provides strong guarantees in terms of consistency and controllability even if the converter is corrupt.

An interesting area for future work is to detail the different approaches to securely provisioning the pseudonyms. For instance, one possibility would be to combine our system with privacy-enhancing credentials that contain the unique identifier $\mathit{uid}_i$ and are given to the users. That could allow a user to obtain a particular pseudonym contribution $\mathit{xnym}_{i,A}$ from the converter, and later prove towards $\mathcal{S}_A$ that she is indeed the correct "owner" of $\mathit{xnym}_{i,A}$.

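Roughly, the idea would be to modify the pseudonym generation to also output a commitment $\mathit{com}$ to $\mathit{uid}_i$, e.g., $\mathit{com} = g^{\mathit{uid}_i} h^{r}$ for random opening information $r$. The proof $\pi_{nym}$ (for $\mathit{anon} = 1$) that is generated by the converter to ensure correctness of the pseudonym would be modified accordingly to

\[
\begin{aligned}
\pi_{nym} \leftarrow_{\$} \mathsf{NIZK}\{(x_{\mathcal{X}}, x_A, \mathit{uid}_i, r) :\ & z_i = \mathsf{PRF}_G(x_{\mathcal{X}}, \mathit{uid}_i) \ \wedge \\
& \mathit{xnym}_{i,A} = z_i^{x_A} \ \wedge\ y_A = g^{x_A} \ \wedge\ \mathit{com} = g^{\mathit{uid}_i} h^{r} \}(\mathit{sid}, \mathit{com}).
\end{aligned}
\]

Then, a user could register with a server $\mathcal{S}_A$ by providing $\mathit{xnym}_{i,A}$ and the proof $\pi_{nym}$, and then prove to the server that she owns a credential with the same $\mathit{uid}_i$ that is contained in the commitment $\mathit{com}$, without revealing $\mathit{uid}_i$.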
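The commitment $\mathit{com} = g^{\mathit{uid}_i} h^{r}$ has the shape of a Pedersen commitment. As a minimal sketch, assuming a toy group and a second generator `H` that is fixed here purely for illustration, it could be computed and opened as follows. In a real deployment the discrete logarithm of $h$ to base $g$ must be unknown to the committer; the hard-coded exponent below violates that on purpose to keep the example short.

```python
import secrets

# Toy parameters (assumption): the order-Q subgroup of Z_P^* with P = 2Q + 1.
P, Q, G = 2039, 1019, 4
H = pow(G, 777, P)   # illustrative second generator; its log must be secret in practice


def commit(uid, r=None):
    """Return (com, r) with com = G^uid * H^r for opening information r."""
    if r is None:
        r = secrets.randbelow(Q)
    return pow(G, uid % Q, P) * pow(H, r, P) % P, r


def opens_to(com, uid, r):
    """Check that (uid, r) is a valid opening of com."""
    return com == pow(G, uid % Q, P) * pow(H, r, P) % P


uid = 42                      # hypothetical unique identifier
com, r = commit(uid)
```

The random $r$ hides $\mathit{uid}_i$, while changing the committed value without changing $r$ alters $\mathit{com}$ by a factor of $g$, so a different identifier no longer opens it.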

Another interesting extension is audit capabilities. In a pseudonym system with a fully trusted converter, the converter could keep a log file of all server requests and allow the user (or a trusted auditor) to monitor which entities correlated or exchanged their data. With the blind conversions in our system, such a central audit is no longer immediately possible. It is an interesting open problem how to add such audit capabilities without harming the privacy properties of our system.

In a similar vein, it would be desirable to combine our pseudonym system with policy enforcement tools in a privacy-preserving manner, that is, to allow the user to specify which data exchanges are permitted and to enable the converter to blindly check whether a received conversion request violates any user constraints.

Acknowledgements

This work was supported by the European Commission through the Seventh Framework Programme, under grant agreements #321310 for the PERCY grant and #318424 for the project FutureID.

7. REFERENCES

[1] H. Aamot, C. D. Kohl, D. Richter, and P. Knaup-Gregori. Pseudonymization of patient identifiers for translational research. BMC Medical Informatics and Decision Making 13:75, 2013.

[2] M. Abe, J. Groth, M. Ohkubo, and M. Tibouchi. Unified, minimal and selectively randomizable structure-preserving signatures. TCC 2014.

[3] G. Ateniese, J. Camenisch, S. Hohenberger, and B. de Medeiros. Practical group signatures without random oracles. ePrint Archive, Report 2005/385.

[4] B. Barak, Y. Lindell, and T. Rabin. Protocol initialization for the framework of universal composability. ePrint Archive, Report 2004/006.

[5] M. Barbaro and T. Zeller. A face is exposed for AOL searcher no. 4417749. New York Times, 2006.

[6] O. Blazy, G. Fuchsbauer, D. Pointcheval, and D. Vergnaud. Signatures on randomizable ciphertexts. PKC 2011.

[7] D. Boneh and X. Boyen. Efficient selective-ID secure identity based encryption without random oracles. EUROCRYPT 2004.

[8] B. Elger, J. Iavindrasana, L. Iacono, H. Muller, N. Roduit, P. Summers, and J. Wright. Strategies for health data exchange for secondary, cross-institutional clinical research. Comput Methods Programs Biomed: 99(3), 2010.

[9] J. Camenisch, A. Kiayias, and M. Yung. On the portability of generalized Schnorr proofs. EUROCRYPT 2009.

[10] J. Camenisch and M. Stadler. Efficient group signature schemes for large groups. CRYPTO 1997.

[11] R. Canetti. Universally composable security: A new paradigm for cryptographic protocols. ePrint Archive, Report 2000/067.

[12] Austrian Citizen Card. http://www.a-sit.at/de/dokumente_publikationen/flyer/buergerkarte_en.php.

[13] Belgian Crossroads Bank for Social Security. http://www.ksz.fgov.be/.

[14] F. de Meyer, G. de Moor, and L. Reed-Fourquet. Privacy protection through pseudonymisation in ehealth. Stud Health Technol Inform: 141, 2008.

[15] Y.-A. de Montjoye, L. Radaelli, V. K. Singh, and A. Pentland. Unique in the shopping mall: On the reidentifiability of credit card metadata. Science 30:347, 2015.

[16] Y. Dodis and A. Yampolskiy. A verifiable random function with short proofs and keys. PKC 2005.

[17] T. ElGamal. A public key cryptosystem and a signature scheme based on discrete logarithms. CRYPTO 1984.

[18] A. Fiat and A. Shamir. How to prove yourself: Practical solutions to identification and signature problems. CRYPTO 1986.

[19] G. Fuchsbauer. Commuting signatures and verifiable encryption. EUROCRYPT 2011.

[20] D. Galindo and E. R. Verheul. Microdata sharing via pseudonymization. Joint UNECE/Eurostat work session on statistical data confidentiality, 2007.

[21] A. Narayanan and V. Shmatikov. Robust de-anonymization of large sparse datasets. IEEE Symposium on Security and Privacy, 2008.

[22] T. Neubauer and J. Heurix. A methodology for the pseudonymization of medical data. Int J Med Inform: 80(3), 2010.

[23] D. Pointcheval and J. Stern. Security proofs for signature schemes. EUROCRYPT 1996.

[24] K. Pommerening, M. Reng, P. Debold, and S. Semler. Pseudonymization in medical research - the generic data protection concept of the TMF. GMS Medizinische Informatik 1:17, 2005.

[25] C.-P. Schnorr. Efficient identification and signatures for smart cards. CRYPTO 1990.
