

Effects of Differential Privacy and Data Skewness on Membership Inference Vulnerability

Stacey Truex, Ling Liu, Mehmet Emre Gursoy, Wenqi Wei, Lei Yu
School of Computer Science, Georgia Institute of Technology, Atlanta, Georgia, USA

Abstract—Membership inference attacks seek to infer the membership of individual training instances of a privately trained model. This paper presents a membership privacy analysis and evaluation system, called MPLens, with three unique contributions. First, through MPLens, we demonstrate how membership inference attack methods can be leveraged in adversarial machine learning. Second, through MPLens, we highlight how the vulnerability of pre-trained models under membership inference attack is not uniform across all classes, particularly when the training data itself is skewed. We show that risk from membership inference attacks is routinely increased when models use skewed training data. Finally, we investigate the effectiveness of differential privacy as a mitigation technique against membership inference attacks. We discuss the trade-offs of implementing such a mitigation strategy with respect to the model complexity, the learning task complexity, the dataset complexity and the privacy parameter settings. Our empirical results reveal that (1) minority groups within skewed datasets display increased risk for membership inference and (2) differential privacy presents many challenging trade-offs as a mitigation technique to membership inference risk.

Index Terms—membership inference, differential privacy, data skewness, machine learning

I. INTRODUCTION

Machine learning-as-a-service (MLaaS) has seen an explosion of interest with the development of cloud platform services. Many cloud service providers, such as Amazon [1], Google [2], IBM [3], and Microsoft [4], have launched such MLaaS platforms. These services allow consumers and application companies to leverage powerful machine learning and artificial intelligence technologies without requiring in-house domain expertise. Most MLaaS platforms offer two categories of services. (1) Machine learning model training. This type of service allows users and application companies to upload their datasets (often sensitive) and perform task-specific analysis including private machine learning and data analytics. The ultimate goal of this service is to construct one or more trained predictive models. (2) Hosting service for pre-trained models. This service provides pre-trained models with a prediction API. Consumers are able to select and query such APIs to obtain task-specific data analytic results on their own query data.

With the exponential growth of digital data in governments, enterprises, and social media, there has also been a growing demand for data privacy protections, leading to legislation such as HIPAA [5], GDPR [6], and the 2018 California privacy law [7]. Such legislation puts limits on the sharing and transmission of the data analyzed by these platforms and used to train predictive models. All MLaaS providers and platforms are therefore subject to compliance with such privacy regulations.

With the new opportunities of MLaaS and the growing attention on privacy compliance, we have seen a rapid increase in the study of potential vulnerabilities involved in deploying MLaaS platforms and services. Membership inference attacks and adversarial examples represent two specific vulnerabilities against deep learning models trained and deployed using MLaaS platforms. With more mission-critical cyber applications and systems using machine learning algorithms as a critical functional component, such vulnerabilities are a major and growing threat to the safety of cyber-systems in general and the trust and accountability of algorithmic decision making in particular.

Membership Inference. Membership inference refers to the ability of an attacker to infer the membership of training examples used during model training. We call a membership inference attack a black-box attack if the attacker only has access to the prediction API of a privately trained model hosted by an MLaaS provider. A black-box attacker therefore does not have any knowledge of either the private training process or the privately trained model.

Consider a financial institution with a large database of previous loan applications. This financial institution leverages its large quantity of data in conjunction with an MLaaS platform to develop a predictive model. The goal of constructing such a model is to provide a prediction when given an individual's personal financial data as input, such as the likelihood of being approved for a loan under multiple credit evaluation categories. This privately trained predictive model is then deployed with an MLaaS API at different offices of this financial institution. When potential applicants provide their own data to the query API, they receive some prediction statistics on their chance of receiving the desired loan and perhaps even the ways in which they may improve their likelihood of loan approval. In the context of the loan application approval model, a membership inference attack considers a scenario wherein a user of the prediction service is an attacker. This attacker provides data of a target individual X and, based on the model's output, the attacker tries to infer if X had applied for a loan at the given financial institution.

Both the financial institution and individual X have an interest in protecting against such membership inference attacks. Loan applicants (such as X) consider their applications and financial data to be sensitive and private. They do not want their loan to be public knowledge. The financial institution also owns the training dataset and the privately trained model and likely considers their training data to not only be confidential for consumer privacy but also an organizational asset [8]. It is therefore a high priority for MLaaS providers and companies to protect their private training datasets against membership inference risks for maintaining and increasing their competitive edge.

Adversarial Machine Learning. The second category of vulnerability is the adversarial input attacks against modern machine learning models, also referred to as adversarial machine learning (ML). Adversarial ML attacks are broadly classified into two categories: (1) evasion attacks, wherein attackers aim to mislead pre-trained models and cause inaccurate output; and (2) poisoning attacks, which aim to generate a poisoned model by manipulating its construction during the training process. Such poisoned models will misbehave at prediction time and can be deceived by the attacker. Adversarial deep learning research to date has been primarily centered on the generation of adversarial examples by injecting the minimal amount of perturbation into benign examples required to either (1) cause a pre-trained classification model to misclassify the query examples with high confidence or (2) cause a training algorithm to produce a learning model with inaccurate or toxic behavior.

The risks of adversarial ML have triggered a flurry of attention and research efforts on developing defense methods against such deception attacks. As predictive models take a critical role in many sensitive or mission-critical systems and application domains, such as healthcare, self-driving cars, and cyber manufacturing, there is a growing demand for privacy-preserving machine learning which is secure against these attacks. For example, the ability of adversarial ML attacks to trick a self-driving car into identifying a stop sign as a speed limit sign poses a significant safety risk. Given that most of the prediction models targeted are hosted by MLaaS providers and kept private with only a black-box access API, one common approach in developing adversarial example attacks is to use a substitute model of the target prediction model. This substitute model can be constructed in two steps. First, use membership inference methods to infer the training data of the target model and its distribution. Then, utilize this data and distribution information to train a substitute model.

Interestingly, most of the research efforts to date have been centered on defense methods against already developed adversarial examples [9], but few efforts have been dedicated to countermeasures against membership inference risks and an attacker's ability to develop an effective substitute model.

In this paper, we focus on investigating two key problems regarding model vulnerability to membership inference attacks. First, we are interested in understanding how skewness in training data may impact the membership inference threat. Second, we are interested in understanding a frequently asked question: can differentially private model training mitigate membership inference vulnerability? This includes several related questions, such as when such mitigation might be effective and the reasons why differential privacy may not always be the magic bullet to fully conquer all membership inference threats.

Machine Learning with Differential Privacy. Differential privacy provides a formal mathematical framework, which bounds the impact of individual instances on the output of a function when this function is constructed in a differentially private manner. In the context of deep learning, a deep neural network model is said to be differentially private if its training function is differentially private, thereby guaranteeing the privacy of its training data. Thus, conceptually, differential privacy provides a natural mitigation strategy against membership inference threats. If training processes could limit the impact that any single individual instance may have on the model output, then the differential privacy theory [10] would guarantee that an attacker would be incapable of identifying with high confidence that an individual example is included in the training dataset. Additionally, recent research has indicated that differential privacy also has a connection to model robustness against adversarial examples [11].

Unfortunately, differential privacy can be challenging to implement efficiently in deep neural network training for a number of reasons. First, it introduces a substantial number of parameters into the machine learning process, which already has an overwhelming number of hyper-parameters for performance tuning. Second, existing differentially private deep learning methods tend to have a high cost in prolonged training time and lower training and testing accuracy. The effort for improving deep learning with differential privacy has therefore been centered on improving training efficiency and maintaining high training accuracy [12], [13]. We argue that balancing privacy, security, and utility remains an open challenge for supporting differential privacy in the context of machine learning in general, and deep neural network model training in particular.

Contributions of the paper. In this paper, we present a privacy analysis and compliance evaluation system, called MPLens, which investigates Membership Privacy through a multi-dimensional Lens. MPLens aims to expose membership inference vulnerabilities, including those unique to varying distributions of the training data. We also leverage MPLens to investigate differential privacy as a mitigation technique for membership inference risk in the context of deep neural network model training. Our privacy analysis system can serve both MLaaS providers and data scientists in conducting privacy analysis and privacy compliance evaluation. This paper presents our initial design and implementation of MPLens, which makes three original contributions.

First, through MPLens, we demonstrate how membership inference attack methodologies can be leveraged in adversarial ML. Datasets developed by using the model prediction API for MLaaS not only reveal private information about the training dataset, such as the underlying distributions of the private training data, but they can also be used in developing and validating the adverse utility of adversarial examples.

Second, MPLens identifies and highlights that the vulnerability of pre-trained models to the membership inference attack is not uniform when the training data itself is skewed. We show that risk from membership inference attacks is routinely increased when models use skewed training data. This vulnerability variation becomes particularly acute in federated learning environments wherein participants are likely to hold information representing different subsets of the population and therefore may incur different vulnerability to attack due to their participation. We argue that an in-depth understanding of such disparities in privacy risks represents an important aspect for promoting fairness and accountability in machine learning.

Finally, we investigate the effectiveness of differential privacy as a mitigation technique against membership inference attacks, with a focus on deep learning models. We discuss the trade-offs of implementing such a mitigation strategy for preventing membership inference and the impact of differential privacy on different classes when deep neural network (DNN) models are trained using skewed training datasets.

II. MEMBERSHIP INFERENCE ATTACKS

Attackers conducting membership inference attacks seek to identify whether or not an individual is a member of the dataset used to train a particular target machine learning model. We discuss the definition and the generation of membership inference attacks in this section, which will serve as the basic reference model of membership inference.

A. Attack Definition

In studying membership inference attacks, there are two primary sets of processes at play: (1) the training, deployment, and use of the machine learning model which the attacker is targeting for inference and (2) the development and use of the membership inference attack. Each of these two elements has guiding pre-defined objectives impacting respective outputs.

1) Machine Learning Model Training and Prediction: The training of and prediction using the machine learning model which the attacker is targeting may be formalized as follows. Consider a dataset D comprised of n training instances (x_1, y_1), (x_2, y_2), ..., (x_n, y_n) with each instance x_i containing m features, denoted by x_i = (x_{i,1}, x_{i,2}, ..., x_{i,m}), and a class value y_i ∈ Z_k, where k is a finite integer ≥ 2. Let F_t : R^m → R^k be the target model trained using this dataset D. F_t is then deployed as a service such that users can provide a feature vector x ∈ R^m, and the service will then output a probability vector p ∈ R^k of the form p = (p_1, p_2, ..., p_k), where p_i ∈ [0, 1] for all i and ∑_{i=1}^{k} p_i = 1. The prediction class label y according to F_t for a feature vector x is the class with the highest probability value in p; therefore y = arg max_{i ∈ Z_k} F_t(x).

2) Membership Inference Definition: Given some level of access to the trained model F_t, the attacker conducts his or her own training to develop a binary classifier which serves as the membership inference attack model. The most limited access environment in which an attacker may conduct the membership inference attack is the black-box access environment, that is, an environment wherein the attacker may only query the target model F_t through some machine-learning-as-a-service API and receive only the corresponding prediction vectors.

Let us consider an attacker with such black-box access to F_t. Given only a query input x and output F_t(x) from some target model F_t trained using a dataset D, the membership inference attacker attempts to identify whether or not x ∈ D.
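The black-box interface can be illustrated with the following minimal Python sketch. Here, query_target_api and attack_model are hypothetical placeholders standing in for the MLaaS prediction API and the attacker's binary classifier described later in this section; sorting the probability vector is a simplification used by this sketch, not a detail prescribed by the attack literature.

    # Hedged sketch of the black-box membership inference setting: the attacker
    # only observes the prediction vector F_t(x) returned by the service.
    import numpy as np

    def query_target_api(x: np.ndarray) -> np.ndarray:
        """Stand-in for the MLaaS prediction API: returns a probability vector p
        with p_i in [0, 1] and sum(p) = 1. A real attacker would call the hosted
        service here instead of this random placeholder."""
        logits = np.random.randn(10)              # placeholder for F_t(x)
        exp = np.exp(logits - logits.max())
        return exp / exp.sum()

    def infer_membership(x: np.ndarray, attack_model) -> int:
        """Return 1 if the attack model believes x was in the target training set D."""
        p = query_target_api(x)                   # black-box access only
        p_sorted = np.sort(p)[::-1]               # order-invariant representation (sketch choice)
        return int(attack_model.predict(p_sorted.reshape(1, -1))[0])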

Dataset          Accuracy of Membership Inference (%)
Adult            59.89
MNIST            61.75
CIFAR-10         90.44
Purchases-10     82.29
Purchases-20     88.98
Purchases-50     93.71
Purchases-100    95.74

TABLE I: Attack accuracies targeting decision tree models. The baseline accuracy against which to compare results is 50%.

Many different datasets and model types have demonstrated vulnerability to membership inference attacks in black-box settings. Table I reports accuracy results for black-box attackers targeting decision tree models for problems ranging from binary classification (Adult) to 100-class classification (Purchases-100). We note that all experiments evaluated the attack model against an equal number of instances in the target training dataset D as those not in D. The baseline membership inference accuracy is therefore 50%. We refer readers to [14] for more details on these datasets and the experimental setup.

These results demonstrate both the viability of membership inference attacks as well as the variation in vulnerability between datasets. This accentuates the need for practitioners to evaluate their system's specific vulnerability.

Recently, researchers showed similar membership inference vulnerability in settings where attackers have white-box access to the target model, including the output from the intermediate layers of a pre-trained neural network model or the gradients for the target instance [15]. Interestingly, this study showed that the intermediate layer outputs, in most cases, do not lead to significant improvements in attack accuracy. For example, with the CIFAR-100 dataset and AlexNet model structure, a black-box attack achieves 74.6% accuracy while the white-box attack achieves 75.18% accuracy. This result further supports the understanding that attackers can gain sufficient knowledge from only black-box access to pre-trained models, which is common in MLaaS platforms. Attackers do not require either full or even partial knowledge of the pre-trained target model, as black-box attacks remain the primary source of membership inference vulnerability.

B. Attack Generation

The attack generation process can vary significantly based on the power of the attacker. For example, the attack proposed in [16] requires knowledge of the training error of F_t. The attack technique proposed in [17], however, requires computational power and involves the training of multiple machine learning models. The techniques proposed in [18] are different still in that they require the attacker to develop effective threshold values. Figure 1 gives a workflow sketch of the membership inference attack generation algorithm. We use the shadow model technique documented in [17] and [14] to describe the attack generation process of membership inference attacks, while noting that many of the processes may be applicable to other attack generation techniques.

Fig. 1: Workflow of the membership inference attack generation.

1) Generating Shadow Data and Substitute Models: In the shadow model technique, an attacker must first generate or access a shadow dataset, a synthetic labeled dataset D′ to mirror the data in D. While [17] and [14] both outline potential approaches to generating such a synthetic dataset from scratch, we would like to note that in many cases, attackers may also have examples of their own which can be used as seeds for the shadow data generation process or to bootstrap their shadow dataset. Consider our example of the financial institution. A competitor to the target institution may in fact have their own customer data, which could be leveraged to bootstrap a shadow dataset.

Once the attacker has developed the shadow dataset D′, the next phase of the membership inference attack is to leverage D′ to train and observe a series of shadow models. Specifically, the shadow dataset is used to train multiple shadow models, each of which is designed to emulate the behavior of the target model. Each shadow model is trained on a subset of the shadow dataset D′. As the attacker knows which portion of D′ was provided to each shadow model, the attacker may then observe the shadow models' behavior in response to instances which were in their training set versus behavior in response to those that were held out.
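The following Python sketch illustrates this phase under simplifying assumptions (scikit-learn models and a single 50/50 in/out split per shadow model); train_shadow_models is a hypothetical helper written for this illustration, not part of any published attack implementation.

    # Hedged sketch of the shadow-model phase: train each shadow model on half of
    # the shadow dataset D', then record its prediction vectors together with the
    # known "in"/"out" membership label for every shadow record.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier  # any model type emulating F_t

    def train_shadow_models(X_shadow, y_shadow, num_shadows=5):
        records = []  # tuples of (prediction vector, true class, membership label)
        for _ in range(num_shadows):
            X_in, X_out, y_in, y_out = train_test_split(X_shadow, y_shadow, test_size=0.5)
            shadow = DecisionTreeClassifier(max_depth=10).fit(X_in, y_in)
            for X, y, member in ((X_in, y_in, 1), (X_out, y_out, 0)):
                probs = shadow.predict_proba(X)        # shadow model's outputs
                records.extend(zip(probs, y, [member] * len(y)))
        return records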

2) Generating Attack Datasets and Models: Attackers use the observations of the shadow models to develop an attack dataset D* which captures the difference between the output generated by the shadow models for instances included in the training data and those previously unseen by the models.

Once the attack dataset D* has been developed, D* is used to generate a binary classifier F_a which provides predictions on whether an instance was previously known to a model based on the model's output for that instance. At attack time, this binary classifier may then be deployed against the target model service in a black-box setting. The attack model takes as input prediction vectors of the same structure as those provided by the shadow models and contained within D*, and produces as output a prediction of 0 or 1 representing "out" and "in" respectively, with the former indicating an instance that was not in the training dataset of the target model and the latter indicating an instance that was included.
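Continuing the hypothetical train_shadow_models sketch above, the following minimal example builds D* and trains F_a. For brevity it trains a single class-agnostic attack classifier over sorted prediction vectors, whereas [17] trains one attack model per class; both simplifications are choices of this sketch.

    # Sketch of the attack-model phase: the shadow records become the attack
    # dataset D*, and a binary classifier F_a maps a prediction vector to
    # "in" (1) or "out" (0).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def build_attack_model(records):
        # D*: features are the shadow models' prediction vectors, labels are in/out.
        X_attack = np.array([np.sort(p)[::-1] for p, _, _ in records])
        y_attack = np.array([member for _, _, member in records])
        return LogisticRegression(max_iter=1000).fit(X_attack, y_attack)

    # At attack time, the target model's output for a candidate record x is fed to
    # F_a; a prediction of 1 means "member of D", 0 means "non-member".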

The totality of these two phases, (1) generating shadow data and substitute models and (2) generating attack datasets and models, constitutes the primary process for constructing the membership inference attack.

III. CHARACTERIZATION OF MEMBERSHIP INFERENCE

A. Impact of Model Based Factors on Membership Inference

The most widely acknowledged factor impacting vulnerability to membership inference attacks is the degree of overfitting in the trained target model. Shokri et al. [17] demonstrate that the more over-fitted a DNN model is, the more it leaks under membership inference attacks. Yeom et al. [16] investigated the role of overfitting from both the theoretical and the experimental perspectives. While their results confirm that models become more vulnerable as they overfit more severely, the authors also state that overfitting is not the only factor leading to model vulnerability under the membership inference attack. Truex et al. [14] further demonstrate that several other model-based factors also play important roles in causing model vulnerability to membership inference, such as classification problem complexity, in-class standard deviation, and the type of machine learning model targeted.

B. Impact of Attacker Knowledge on Membership Inference

Another category of factors that may cause model vulnerability to membership inference attacks is the type and scope of knowledge which attackers may have about the target model and its training parameters. For example, Truex et al. [14] identified the impact that attacker knowledge with respect to both the training data of the target model and the target data has on the accuracy of the membership inference attack. This was evaluated by varying the degree of noise in the shadow dataset and target data used by the attacker. Table II shows the experimental results on four datasets with four types of learning tasks. The datasets include the CIFAR-10 dataset, which contains 32×32 color images of 10 different classes of objects, while the Purchases datasets were developed from the Kaggle Acquire Valued Shoppers Challenge dataset containing the shopping history of several thousand individuals. Each instance in the Purchases datasets then represents an individual and each feature represents a particular product. If an individual has a purchase history with this product in the Kaggle Acquire Valued Shoppers Challenge dataset, there will be a 1 for the feature and otherwise a 0. The instances are then clustered into different shopping profile types which are treated as the classes. Table II reports results for Purchases datasets considering 10, 20, and 50 different shopping profile types.

                   Attack Accuracy (%) with Noisy Target Data           Attack Accuracy (%) with Noisy Shadow Data
Degree of Noise    CIFAR-10   Purchases-10   Purchases-20   Purchases-50   CIFAR-10   Purchases-10   Purchases-20   Purchases-50
0                  67.49      66.69          80.70          88.52          67.49      66.69          80.70          88.52
0.1                65.37      68.72          80.40          86.38          66.85      67.37          80.23          88.81
0.2                63.88      66.01          77.47          85.85          65.36      66.86          81.20          88.35
0.3                60.43      62.90          73.93          84.66          64.74      67.46          80.32          88.84
0.4                60.48      60.07          68.23          83.21          62.64      66.91          80.14          88.63
0.5                58.33      58.29          64.73          79.12          60.94      67.61          80.36          88.17
0.6                57.53      57.58          61.51          73.50          60.09      66.68          80.08          88.54
0.7                55.97      54.94          59.78          70.43          58.92      67.67          80.83          88.58
0.8                55.35      54.44          58.21          67.16          58.66      66.73          80.49          88.54
0.9                54.07      54.03          57.91          65.21          57.57      68.06          80.52          87.84
1.0                53.95      52.72          56.02          62.44          56.55      67.32          80.43          87.70

TABLE II: Impact of noisy target data and noisy shadow data.

The experiments in Table II demonstrate the impact of the attacker's knowledge of the target data points by evaluating how adding varying degrees of noise to data features impacts the success rate of membership inference attacks. Noise was uniformly sampled from [0, σ], σ ≤ 1, and added to features normalized within [0, 1]. Given a level of uncertainty about x or inaccuracy in D′ on the part of the attacker, represented by a corresponding degree of noise σ, Table II evaluates how effective the attacker remains in launching a membership inference attack to identify whether x ∈ D or x ∉ D. The results in Table II are reported for four logistic regression models, each trained on a different dataset, with gradually increasing σ values (degree of noise).
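As an illustration of this noise model, the following sketch perturbs attacker-held records, assuming features already normalized to [0, 1]; clipping the result back into [0, 1] is an assumption of the sketch rather than a detail stated in [14].

    # Illustrative sketch of the noise model used for Table II: noise drawn
    # uniformly from [0, sigma] is added to each feature of the attacker's data.
    import numpy as np

    def perturb_attacker_data(X: np.ndarray, sigma: float, seed: int = 0) -> np.ndarray:
        rng = np.random.default_rng(seed)
        noise = rng.uniform(0.0, sigma, size=X.shape)   # degree of noise sigma <= 1
        return np.clip(X + noise, 0.0, 1.0)             # keep features in [0, 1] (sketch choice)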

We make two interesting observations from Table II. First, for all four datasets, the more accurate (the less noise) the attacker knowledge about x is, the higher the model vulnerability (attack success rate) to membership inference. This shows the accuracy of attacker knowledge about the targeted examples is an important factor in determining model vulnerability and attack success rate (in terms of attack accuracy). Second, in comparison to the noisy target data, adding noise to the shadow dataset results in a less severe drop in accuracy. Similar trends are however still observed, with slightly higher attack success rates under smaller σ values for all four datasets. This set of experiments demonstrates that attackers with different knowledge and different levels of resources may have different success rates in launching a membership inference attack. Thus, model vulnerability should be evaluated by taking into account potential or available attacker knowledge.

C. Transferability of Membership Inference

Inspired by the transferability of adversarial examples [19], [20], [21], [22], membership inference attacks are also shown to be transferable. That is, an attack model F_a trained on an attack dataset D* containing the outputs from a set of shadow models is effective not only when the shadow models and the target model are of the same type but also when the shadow model type varies. This property further opens the door to black-box attackers who do not have any knowledge of the target model.

Table III demonstrates this property of membership inference attacks. It reports the membership inference attack accuracy for various attack configurations against a decision tree model trained on the Purchases-20 dataset. It shows the transferability of membership inference attacks for different combinations of four different model types used as the attack model type F_a (rows) and the shadow model type (columns). In this experiment, the target prediction model is a decision tree.

Purchases-20        Shadow Model Type
Attack Model     DT       k-NN     LR       NB
DT               88.98    87.49    72.08    81.84
k-NN             88.23    72.57    84.75    74.27
LR               89.02    88.11    88.99    83.57
NB               88.96    78.60    89.05    66.34

TABLE III: Accuracy (%) of membership inference attack against a decision tree target model trained on the Purchases-20 dataset.

We observe that while using a decision tree as the shadow model results in the most consistent membership inference attack success compared to other combinations, multiple combinations with both k-NN and logistic regression (LR) shadow models achieve attack success within 5% of the most successful attack configuration using decision tree shadow models. Table III also shows that multiple types of models can be successful as the binary attack classifier F_a. This set of experiments also shows that the worst attack performances against the DT target prediction model are seen when the shadow models are trained using Naïve Bayes (NB), with the worst performance reported when NB is the model type of the attack model as well. These results indicate that (1) the same strategy used for selecting the shadow model type may not be optimal for the attack model and (2) shadow models of types other than the target model type may still lead to successful membership inference attacks. We refer readers to [14] for additional detail.

The transferability study in Table III indicates an attacker does not always need to know the exact target model configuration to launch an effective membership inference attack, as attack models can be transferable from one target model type to another. And although finding the most effective attack strategy can be a challenging task for attackers, vulnerability to membership inference attack remains serious even with suboptimal attack configurations, with almost all configurations reporting attack accuracy of about 70% and many above 85%.

D. Training Data Skewness on Membership Attacks

The fourth important dimension of membership inference vulnerability is the risk imbalance across different prediction classes when the training data is skewed. Even when the overall membership inference vulnerability appears limited, with attack success close to the 50% baseline (in or out random guess), there may be subgroups within the training data which display significantly more vulnerability.

Fig. 2: Vulnerability of the Adult dataset highlighting the minority (>50K) class with decision trees.

For example, Figure 2 illustrates the impact of data skewness on membership inference vulnerability. In this set of experiments, we measure the membership inference attack accuracy for a decision tree target model trained on the publicly available Adult dataset [23]. The Adult dataset contains 48,842 instances, each with 14 different features, and presents a binary classification problem wherein one wishes to identify if an individual's yearly salary is >$50K or ≤$50K. The class distribution, however, is skewed with less than 25% of instances being labeled >$50K. As overfitting is widely considered a key factor in membership inference vulnerability, we simulate overfitting by increasing the depth of the target decision tree model. Figure 2 shows the impact of overfitting (increasing along the X-axis) on both the aggregate membership inference vulnerability in terms of membership inference attack accuracy (accuracy over all classes) and the minority membership inference vulnerability (attack accuracy over the minority class). In this case, aggregate vulnerability is the accuracy of the membership inference attack evaluated on an equal number of randomly selected examples seen by the target model ("in") as unseen ("out"), while the minority vulnerability reports the membership inference attack accuracy evaluated on only the subset of the previously selected examples whose class is >$50K (the minority class).
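For illustration, the aggregate and minority vulnerability reported in Figure 2 can be computed as in the following sketch, where membership_pred, membership_true, and class_labels are hypothetical arrays holding the attack decisions, the ground-truth membership labels, and the true class of each record in the balanced evaluation set.

    # Hedged sketch: attack accuracy over the full balanced in/out evaluation set
    # (aggregate vulnerability) and over only the minority-class subset of it.
    import numpy as np

    def attack_accuracy(membership_pred, membership_true):
        return float(np.mean(np.asarray(membership_pred) == np.asarray(membership_true)))

    def per_class_vulnerability(membership_pred, membership_true, class_labels, minority_class):
        class_labels = np.asarray(class_labels)
        aggregate = attack_accuracy(membership_pred, membership_true)
        mask = class_labels == minority_class          # e.g. the ">50K" class for Adult
        minority = attack_accuracy(np.asarray(membership_pred)[mask],
                                   np.asarray(membership_true)[mask])
        return aggregate, minority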

We observe from Figure 2 that the minority class has an increased risk under the membership inference attack as the model over-fits more severely. This follows the intuition that minority class members have fewer other instances amongst whom they can hide in the training set and thus are more easily exposed under membership inference attacks. This aligns, to some extent, with the observation that smaller training dataset sizes can lead to a greater overall risk for membership inference [17]. We argue that it is important for both the data owners for model training and the MLaaS providers to consider vulnerability not just for the entire training dataset, but also the level of risk for minority populations specifically when evaluating privacy compliance.

IV. MITIGATION STRATEGIES AND ALGORITHMS

A. Differential Privacy

Differential privacy is a formal privacy framework with a theoretical foundation and rigorous mathematical guarantees when effectively employed [10]. A machine learning algorithm is defined to be differentially private if and only if the inclusion of a single instance in the training dataset will cause only statistically insignificant changes to the output of the algorithm. Theoretical limits are set on such output changes in the definition of differential privacy, which is given formally as follows:

Definition 1 (Differential Privacy [10]): A randomized mechanism K provides (ε, δ)-differential privacy if for any two neighboring databases D_1 and D_2 that differ in only a single entry, ∀S ⊆ Range(K),

Pr(K(D_1) ∈ S) ≤ e^ε Pr(K(D_2) ∈ S) + δ     (1)

If δ = 0, K is said to satisfy ε-differential privacy.

In the remainder of the paper, we focus on ε-differential privacy for presentation convenience. To achieve ε-differential privacy (DP), noise defined by ε is added to the algorithm's output. This noise is proportional to the sensitivity of the output. Sensitivity measures the maximum change of the output due to the inclusion of a single data instance.

Definition 2 (Sensitivity [10]): For f : D → R^k, the sensitivity of f is

∆ = max_{D_1, D_2} ||f(D_1) − f(D_2)||_2     (2)

for all D_1, D_2 differing in at most one element.

The noise mechanism which is used is therefore bounded by both the sensitivity of the function f, S_f, and the privacy parameter ε. For example, consider the Gaussian mechanism defined as follows:

Definition 3 (Gaussian Noise Mechanism): M(D) ≜ f(D) + N(0, S_f^2 σ^2), where N(0, S_f^2 σ^2) is the normal distribution with mean 0 and standard deviation S_f σ. A single application of the Gaussian mechanism in Definition 3 to a function f with sensitivity S_f satisfies (ε, δ)-differential privacy if δ ≥ (5/4) exp(−(σε)^2 / 2) and ε < 1 [24].
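As a worked example of this condition, the following sketch computes the smallest admissible δ for a chosen noise multiplier σ and a target ε < 1; the numbers in the comments are only illustrative.

    # Sketch: minimum delta admitted by the bound delta >= (5/4) * exp(-(sigma*eps)^2 / 2).
    import math

    def min_delta(sigma: float, epsilon: float) -> float:
        assert epsilon < 1, "this bound is stated for epsilon < 1"
        return 1.25 * math.exp(-((sigma * epsilon) ** 2) / 2.0)

    # Example: sigma = 4, epsilon = 0.5 gives delta >= 1.25 * exp(-2)   ~ 0.169,
    # while   sigma = 10, epsilon = 0.5 gives delta >= 1.25 * exp(-12.5) ~ 4.7e-6,
    # illustrating how a larger noise multiplier buys a much smaller delta.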

Additionally, there exist several nice properties of differential privacy for either multiple iterations of a differentially private function f or the combination of multiple different functions f_1, f_2, ..., wherein each f_i satisfies a corresponding ε_i-differential privacy. These composition properties are important for machine learning processes, which often involve multiple passes over the training dataset D. The formal composition properties of differential privacy include the following:

Definition 4 (Composition properties [24], [25]): Let f_1, f_2, ..., f_n be n algorithms, such that for each i ∈ [1, n], f_i satisfies ε_i-DP. Then, the following properties hold:


• Sequential Composition: Releasing the outputs f_1(D), f_2(D), ..., f_n(D) satisfies (∑_{i=1}^{n} ε_i)-DP.

• Parallel Composition: Executing each algorithm on a disjoint subset of D satisfies max_i(ε_i)-DP.

• Immunity to Post-processing: Computing a function of the output of a differentially private algorithm does not deteriorate its privacy, e.g., publicly releasing the output of f_i(D) or using it as an input to another algorithm does not violate ε_i-DP.
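As a minimal illustration of how these composition properties translate into privacy accounting, consider the following sketch; it applies only basic composition, not the tighter accounting methods discussed later in the paper.

    # Sketch: basic budget accounting under sequential and parallel composition.
    from typing import Sequence

    def sequential_budget(epsilons: Sequence[float]) -> float:
        """n mechanisms run over the same dataset D: the budgets add up."""
        return sum(epsilons)

    def parallel_budget(epsilons: Sequence[float]) -> float:
        """n mechanisms run on disjoint subsets of D: the worst epsilon dominates."""
        return max(epsilons)

    # Example: five mechanisms, each 0.2-DP, cost 1.0 in total over the full dataset,
    # but only 0.2 when each runs on a disjoint shard of the data.
    print(sequential_budget([0.2] * 5), parallel_budget([0.2] * 5))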

Differential privacy can be applied to different types of machine learning models. Due to the space constraint, in the rest of the paper, we specifically focus on differentially private training of deep neural network (DNN) models.

B. Mechanisms for Differentially Private Deep Learning

DNNs are complex, sequentially stacked neural networks containing multiple layers of interconnected nodes. Each node represents the dataset in a unique way and each layer of networked nodes processes the input from previous layers using learned weights and a pre-defined activation function. The objective in training a DNN is to find the optimal weight values for each node in the multi-tier networks. This is accomplished by making multiple passes over the entire dataset, with each pass constituting one epoch. Within each epoch, the entire dataset is partitioned into many mini-batches of equal size and the algorithm processes these batches sequentially, each including only a subset of the data. When processing one batch, the data is fed forward through the network using the existing weight values. A pre-defined loss function is computed for the errors made by the neural network learner with respect to the current batch of data. An optimizer, such as stochastic gradient descent (SGD), is then used to propagate these errors backward through the network. The weights are then updated according to the errors and the learning rate set by the training algorithm. The higher the learning rate value, the larger the update made in response to the backward propagation of errors.

Differentially private deep learning can be implemented by adding a small amount of noise to the updates made to the network such that there is only a marginal difference between the following two scenarios: (1) when a particular individual is included within the training dataset and (2) when the individual is absent from the training dataset. The noise added to the updates is sampled from a Gaussian distribution with scale determined by an appropriate noise parameter σ corresponding to a desired level of privacy and controlled sensitivity. That is, the privacy budget ε, according to Definition 1, should constrain the value of σ at each epoch. Let n denote the number of epochs for the DNN training, a pre-defined hyper-parameter set in the training configuration as the termination condition. Let one epoch satisfy ε_i-differential privacy given σ_i from the Gaussian mechanism in Definition 3. Then, a traditional accounting using the composition properties of differential privacy (Definition 4) would dictate that n epochs would result in an overall privacy guarantee of ε if ε_i = ε/n and each epoch employed the Gaussian mechanism with the same value σ_i. We refer to this approach as the fixed noise perturbation method [12]. An alternative approach proposed by Yu et al. in [13] advocates a variable noise perturbation approach, which uses a decaying function to manage the total privacy budget ε and defines a variable noise scale σ based on different settings of ε_i for each epoch i (1 ≤ i ≤ n), aiming to add a variable amount of noise to the i-th epoch in a decreasing manner as the training progresses. Thus, we have ε_i ≠ ε_j for i ≠ j, 1 ≤ i, j ≤ n, and σ_i for a given epoch i is bounded by its allocated privacy budget ε_i. The same overall privacy guarantee is met when ∑_{i=1}^{n} ε_i = ε for the differentially private DNN training of n epochs.

C. Differentially Private Deep Learning with Fixed σ

The first differentially private approach to training deep neural networks was proposed by Abadi et al. [12] and implemented on the TensorFlow deep learning framework [26]. A summary of their approach is given in Algorithm 1. To apply differential privacy, the sensitivity of each epoch is bounded by a clipping value C, specifying that an instance may impact weight updates by at most the value C. To achieve differential privacy, weight updates at the end of each batch include noise injection according to the sensitivity defined by C and the scale of noise σ. The choice of σ is directly related to the overall privacy guarantee.

Algorithm 1 Differentially Private Deep Learning: Fixed σ

Input: Dataset D containing training instances x_1, ..., x_N, loss function L(θ) = (1/N) ∑_i L(θ, x_i), learning rate η_t, noise scale σ, batch size L, norm bound C, number of epochs E
Initialize θ_0 randomly
Set T = E * N / L
for t ∈ [T] do
    Set sampling probability q = L/N
    Take a random sample L_t from D
    Compute gradient: for each x_i ∈ L_t, compute g_t(x_i) ← ∇_{θ_t} L(θ_t, x_i)
    Clip gradient: g_t(x_i) ← g_t(x_i) / max(1, ||g_t(x_i)||_2 / C)
    Add noise: g_t ← (1/L) (∑_i g_t(x_i) + N(0, σ^2 C^2 I))
    Descent: θ_{t+1} ← θ_t − η_t g_t
Output θ_T and compute the overall privacy cost (ε, δ) using a privacy accounting method.
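For concreteness, the following numpy sketch illustrates one noisy batch update of Algorithm 1. It is a simplified illustration rather than the implementation of [12]: per_example_grads is a hypothetical input holding one flattened per-example gradient per row, and no privacy accounting is performed here.

    # Hedged sketch of a single DP-SGD step: per-example clipping to norm C,
    # Gaussian noise with scale sigma*C, and averaging over the sampled lot.
    import numpy as np

    def dp_sgd_step(theta, per_example_grads, lr, C, sigma, rng=None):
        rng = np.random.default_rng(0) if rng is None else rng
        L = per_example_grads.shape[0]                              # lot size
        norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
        clipped = per_example_grads / np.maximum(1.0, norms / C)    # clip each g_t(x_i)
        noise = rng.normal(0.0, sigma * C, size=theta.shape)        # N(0, sigma^2 C^2 I)
        g_tilde = (clipped.sum(axis=0) + noise) / L                 # noisy average gradient
        return theta - lr * g_tilde                                 # descent step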

Let σ = √(2 log(1/δ)) / ε; then, according to differential privacy theory [24], each step (processing of a batch) is (ε, δ)-differentially private. If L_t is randomly sampled from D, then additional properties of random sampling [27], [28] may be applied. Each step then becomes (O(qε), qδ)-differentially private. The moments accountant privacy accounting method is also introduced in [12] to prove that Algorithm 1 is (O(qε√T), δ)-differentially private given appropriate parameter settings.

Page 8: Effects of Differential Privacy and Data Skewness on ...an attacker’s ability to develop an effective substitute model. In this paper, we focus on investigating two key problems

We refer to Algorithm 1 as the fixed noise perturbation approach, as each epoch is treated equally by introducing the same noise scale to every parameter update.

D. Differentially Private Deep Learning with Variable σ

The variable noise perturbation approach to differentially private deep learning is proposed by Yu et al. [13]. It extends the fixed noise scale of σ over the total n epochs in [12] by introducing two new capabilities. First, Yu et al. in [13] pointed out a limitation of the approach outlined in Algorithm 1. Namely, Algorithm 1 specifically calls for random sampling, wherein each batch is selected randomly with replacement. However, the most popular implementation for partitioning a dataset into mini-batches in many deep learning frameworks is random shuffling, wherein the dataset is shuffled and then partitioned into evenly sized batches. In order to develop a differentially private DNN model under random shuffling, Yu et al. [13] extend Algorithm 1 of [12] by introducing a new privacy accounting method.

Additionally, Yu et al. [13] analyze the problem of using a fixed noise scale (fixed σ values), and propose employing different noise scales for the weight updates at different stages of the training process. Specifically, Yu et al. [13] propose a set of methods for privacy budget allocation, which improve model accuracy by progressively reducing the noise scale as the training progresses. The variable σ noise scale approach is inspired by two observations. First, as the training progresses, the model begins to converge, causing the noise being introduced to the updates to potentially become more impactful. This slows down the rate of model convergence and causes later epochs to no longer increase model accuracy compared to non-private scenarios. Second, the research on improving training accuracy and convergence rate of DNN training has led to a new generation of learning rate functions that replace the constant learning rate baseline by decaying learning rate functions and cyclic learning rates [29], [30]. Yu et al. [13] employed a similar set of decay functions to add noise at a decreasing scale. That is, the noise defined by σ_i at the i-th epoch is less than σ_j at the j-th epoch given 1 ≤ j < i ≤ n. The performance of four different types of decay functions to introduce a variable noise scale by partitioning σ over n epochs was evaluated.
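For illustration only, the following sketch shows one possible decaying schedule; the exponential decay is an assumption of this sketch rather than the exact set of schedules evaluated in [13].

    # Sketch of a variable noise scale: sigma_i decays exponentially across epochs,
    # so later epochs (closer to convergence) receive less noise than earlier ones.
    def decaying_sigma(sigma_0: float, decay_rate: float, num_epochs: int):
        """Return [sigma_1, ..., sigma_n] with sigma_i = sigma_0 * decay_rate**(i-1)."""
        return [sigma_0 * (decay_rate ** i) for i in range(num_epochs)]

    # Example: sigma_0 = 8 and decay_rate = 0.9 over 10 epochs decay from 8.0 to ~3.1;
    # each sigma_i must still be mapped to a per-epoch epsilon_i so that the epsilon_i
    # sum to the overall budget epsilon.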

E. Important Implementation Factors

1) Choosing Epsilon: In differentially private algorithms, the ε value dictates the amount of noise which must be introduced into the DNN model training and therefore the theoretical privacy bound. Choosing the correct ε value requires a careful balance between tolerable privacy leakage given the practical setting as well as the tolerable utility loss.

For example, Naldi and D'Acquisto propose a method in [31] for finding the optimal ε value to meet a certain accuracy for Laplacian mechanisms. Lee and Clifton alternatively employ the approach in [32] to analyze a particular adversarial model. Hsu et al. [33], on the other hand, take an individual's perspective on data privacy through opt-in incentivization. Kohli and Laskowski [34] also promote choosing an ε based on individual privacy preferences.

Despite these existing approaches, determining the "right" ε value remains a complex problem and is likely to be highly dependent on the privacy policy of the organization that owns the model and the dataset, the vulnerability of the model, the sensitivity of the data, and the tolerance to utility loss in the given setting. Additionally, there might be scenarios, such as the healthcare setting, in which even small degrees of utility loss are intolerable and are combined with stringent privacy constraints given highly sensitive data. In these cases, it may be hard or even impossible to find a good ε value for existing differentially private DNN training techniques.

2) The Role of Transfer Learning: Another key consideration is the use of transfer learning for dealing with more complex datasets [12], [13]. For example, model parameters may be initialized by training on a non-private, similar dataset. The private dataset is then only used to further hone a subset of the model parameters. This helps to reduce the number of parameters affected by the noise addition required by exercising differential privacy. The use of transfer learning, however, relies on a strong assumption that such a non-private, similar dataset exists and is available. In many cases, this assumption may be unrealistic.

3) Parameter Optimization: In addition to the privacy budget parameters (ε, δ), differentially private deep learning introduces influential privacy parameters, such as the clipping value C that bounds the sensitivity for each epoch and the noise scale approach (including potential decay parameters) for σ. The settings of these privacy parameters may impact both the training convergence rate, and thus the training time, and the training and testing accuracy. However, tuning these privacy parameters becomes much more complex as we need to take into account the many learning hyper-parameters already present in DNN training, such as the number of epochs, the batch size, the learning rate policy, and the optimization algorithm. These hyper-parameters need to be carefully configured for high performance training of deep neural networks in a non-private setting. For deep learning with differential privacy, one needs to configure the privacy parameters by considering the side effects on other hyper-parameters and re-configure the previously tuned learning hyper-parameters to account for the privacy approach.

For example, for a fixed total privacy budget (ε, δ), too small a number of epochs may result in reduced accuracy due to insufficient time to learn. A higher number of epochs, however, will result in a higher σ value required each epoch (recall Algorithm 1). Therefore, a carefully tuned hyper-parameter for a non-private setting, such as the optimal number of epochs, may no longer be effective when differentially private deep learning is enabled.
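This tension can be illustrated with a toy calculation, assuming the simple sequential split ε_i = ε/n and the per-step rule σ = √(2 log(1/δ))/ε stated for Algorithm 1; a moments-accountant analysis would yield tighter values.

    # Toy sketch: more epochs -> smaller per-epoch budget -> larger per-epoch noise scale.
    import math

    def per_epoch_sigma(total_epsilon: float, delta: float, num_epochs: int) -> float:
        eps_i = total_epsilon / num_epochs               # naive sequential split
        return math.sqrt(2 * math.log(1 / delta)) / eps_i

    for n in (10, 50, 100):
        print(n, round(per_epoch_sigma(total_epsilon=4.0, delta=1e-5, num_epochs=n), 1))
    # e.g. 10 epochs -> sigma ~ 12.0, while 100 epochs -> sigma ~ 120.0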

A number of challenging questions remain open problems in differentially private DNN training, such as: at what point do more epochs lead to accuracy loss under a particular privacy setting? What is the right noise decaying function for effective deep learning with differential privacy? Can we learn privately with high accuracy given complex datasets? The balance of the many parameters in a differentially private deep learning system presents new challenges to practitioners.

With these questions in mind, we develop MPLens, a membership privacy analysis system, which facilitates the evaluation of model vulnerability against membership inference attacks. Through MPLens, we investigate the effectiveness of differential privacy as a mitigation technique against membership inference attacks, including the trade-offs of implementing such a mitigation strategy for preventing membership inference and the impact of differential privacy on different classes for DNN models trained using skewed training datasets. We also highlight how the vulnerability of pre-trained models under the membership inference attack is not uniform when the training data itself is skewed with minority populations. We show how this vulnerability variation may cause increased risks in federated learning systems.

V. MPLENS: SYSTEM OVERVIEW

MPLens is designed as a privacy analysis and privacy compliance evaluation system for both data scientists and MLaaS providers to understand the membership inference vulnerability risk involved in their model training and model prediction processes. Figure 3 provides an overview of the system architecture. The system allows providers to specify a set of factors that are critical to privacy analysis. Example factors include the data used to train their model, what data might be held by the attacker, what attack technique might be used, the degree of data skewness in the training set, whether the prediction model is constructed using differentially private model training, what configurations are used for the set of differential privacy parameters, and so forth. Given the model input, the MPLens evaluation system reports the overall statistics on the vulnerability of the model, the per-class vulnerability, as well as the vulnerability of any sensitive populations such as specific minority groups. Example statistics include attack accuracy, precision, recall, and F1 score. We also include attacker confidence for true positives, false positives, true negatives, and false negatives, the average distance from both the false positives and the false negatives to the training data, and the time required to execute the attack.
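As an illustration of these reported statistics, the following minimal sketch (not the MPLens implementation itself) computes attack accuracy, precision, recall, and F1 score with scikit-learn from ground-truth membership labels and attack decisions.

    # Sketch: per-attack report statistics over a membership evaluation set,
    # where y_true holds ground-truth membership labels (1 = in D, 0 = out)
    # and y_pred holds the attack model's decisions for the same records.
    from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

    def attack_report(y_true, y_pred):
        return {
            "accuracy": accuracy_score(y_true, y_pred),
            "precision": precision_score(y_true, y_pred),
            "recall": recall_score(y_true, y_pred),
            "f1": f1_score(y_true, y_pred),
        }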

A. Target Model Training

Over-fitting is the first factor that MPLens measures for conducting membership vulnerability analysis. MPLens specifically highlights the overfitting analysis by reporting the Accuracy Difference between the target model training accuracy and testing accuracy. This enables MLaaS providers and domain-specific data scientists to understand whether their vulnerability might be linked to overfitting. As previously indicated, while overfitting is strongly correlated with membership inference vulnerability, it is not the only source of vulnerability. Thus, when MPLens indicates undesirable vulnerability in the absence of significant overfitting, analysis may be triggered to investigate whether the vulnerability is linked to other model or data characteristics, such as those discussed in Section III.

B. Attacker Knowledge

Our MPLens system is by design customizable to understand multiple attack scenarios. For instance, users may specify the shadow data, which is used by the attacker. This allows the user to consider a scenario in which the attacker has access to some subset of the target model's training data, one where the attacker has access to some data that are drawn from the same distribution as the training data of the target model, or one where the attacker has noisy and inaccurate data, such as that evaluated in Table II and possibly generated through black-box probing [14], [17].

The MPLens system is additionally customizable with respect to the attack method, including the shadow-model-based attack techniques [17] and the threshold-based attack techniques [16]. Furthermore, when using a threshold-based attack, our MPLens system can either accept pre-determined values representing attacker knowledge of the target model error or determine good threshold values through shadow model training.
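Below is a minimal sketch of a loss-threshold attack in the spirit of [16]: an example is predicted to be a training member when its prediction loss falls below a threshold, which can either encode assumed attacker knowledge of the target model's training error or be estimated through shadow model training. The helper names are illustrative.

    import numpy as np

    def per_example_loss(probs, labels, eps=1e-12):
        """Cross-entropy loss of the target model's predicted probabilities per example."""
        return -np.log(np.clip(probs[np.arange(len(labels)), labels], eps, None))

    def threshold_attack(probs, labels, threshold):
        """Predict 'member' (1) when the example's loss falls below the threshold."""
        return (per_example_loss(probs, labels) < threshold).astype(int)

    # The threshold can be the target model's (assumed known) average training loss,
    # or the average training loss of a shadow model trained on attacker data, e.g.:
    # threshold = per_example_loss(shadow_probs_train, shadow_labels_train).mean()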

These customizations allow MLaaS providers and users of MPLens to specify the types of attackers they wish to analyze, analyze their model vulnerability against such attackers, and evaluate the privacy compliance of their model training and model prediction services.

C. Transferability

Another aspect of privacy analysis is related to the specific model training methods used to generate membership inference attacks and whether different methods result in significant variations in membership inference vulnerability. Considering the attack method from [17], the user can specify not only the attacker's data but also the shadow model training algorithm and the membership inference binary classifier training algorithm. Each element is customizable as a system parameter when configuring MPLens for a specific privacy risk evaluation. The MPLens system makes no assumption on how the attacker develops the shadow dataset, what knowledge is included in the data, whether the attacker has knowledge of the target model algorithm, or what attack technique is used. This flexibility allows MPLens to support evaluation across various transferable attack configurations.

VI. EXPERIMENTAL RESULTS AND ANALYSIS

A. Datasets

All experiments reported in this section were conducted using the following four datasets.

a) CIFAR-10: The CIFAR-10 dataset contains 60,000 color images [35] and is publicly available. Each image is formatted to be 32 x 32. The CIFAR-10 dataset contains 10 classes with 6,000 images each: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. The problem is therefore a 10-class classification problem where the task is to identify which class is depicted in a given image.


Fig. 3: Overview of System to Evaluate Membership Inference Vulnerability

Fig. 4: Training Data; Fig. 5: Predicted In; Fig. 6: Predicted Out. Comparison of 2-D PCA for CIFAR-10 dog and truck classes. Plots show (1) training data, (2) data predicted as training data, and (3) data predicted as non-training data by the membership inference attack.

b) CIFAR-100: Also publicly available, CIFAR-100 similarly contains 60,000 color images [35] formatted to 32 x 32. 100 classes are represented, ranging from various animals to pieces of household furniture and types of vehicles. Each class has 600 available images. The problem is therefore a 100-class image classification problem.

c) MNIST: MNIST is a publicly available dataset containing 70,000 images of handwritten digits [36]. Each image is formatted to be 32 x 32 and processed such that the digit is at the center of the image. The MNIST dataset constitutes a 10-class classification problem where the task is to identify which digit between 0 and 9, inclusive, is contained within a given image.

d) Labeled Faces in the Wild: The Labeled Faces in the Wild (LFW) database contains face photographs for unconstrained face recognition with more than 13,000 images of faces collected from the web. Each face has been labeled with the name of the person pictured. 1,680 of the people pictured have two or more distinct photos in the dataset. Each person is then labeled with a gender and race (including mixed races). Data is then selected for the top 22 classes which were represented with a sufficient number of data points.

B. Membership Inference Risk: Adversarial Examples

Given a target model and its training data, the membership inference attack, using black-box access to the model prediction API, can be used to create a representative dataset. This representative dataset can be leveraged to generate substitute models for the given target model. One can then use such substitute models to generate adversarial examples using different adversarial attack methods [37].
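The following sketch outlines this pipeline under simple assumptions: candidate data are scored through the target prediction API, the membership inference attack filters them into a representative set, a substitute model is fit on that set, and adversarial examples are crafted against the substitute (here with a basic FGSM step rather than any particular method from [37]); query_api and mi_attack are placeholder callables.

    import numpy as np
    import tensorflow as tf

    def build_representative_set(query_api, mi_attack, candidates):
        """Keep only the candidates the membership inference attack predicts as 'in'."""
        probs = query_api(candidates)            # black-box prediction API output
        labels = probs.argmax(axis=1)            # labels inferred from the API itself
        in_mask = mi_attack(probs) == 1          # binary membership predictions
        return candidates[in_mask], labels[in_mask]

    # A substitute model is then trained on the representative set
    # (e.g., substitute.fit(x_rep, y_rep)) before crafting adversarial examples.

    def fgsm(substitute, x, y, eps=0.03):
        """Craft adversarial examples against the substitute model (basic FGSM)."""
        x = tf.convert_to_tensor(x, dtype=tf.float32)
        with tf.GradientTape() as tape:
            tape.watch(x)
            loss = tf.keras.losses.sparse_categorical_crossentropy(y, substitute(x))
        return (x + eps * tf.sign(tape.gradient(loss, x))).numpy()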

Figures 4-6 provide visualization plots for the comparison of 2-D PCA given the images of the dog and truck classes in CIFAR-10. The plots are divided relative to the membership inference attack prediction output, including the true target model training data (Figure 4), the data predicted by the membership inference attack as training data (Figure 5), and the data predicted as non-training data by the membership inference attack (Figure 6).

These plots illustrate the accuracy of the distribution of instances predicted as in the target model's training dataset through membership inference. They clearly demonstrate how, even with the inclusion of false positives, an attacker can create a good representation of the training data distribution, particularly compared to those instances not predicted to be in the target training data. An attacker can therefore easily train a substitute model on this representative dataset, which then enables the generation of adversarial examples by attacking this substitute model. Examples developed to successfully attack a substitute model trained on the instances in Figure 5 are likely to also be successful against the target model trained on the instances in Figure 4.

Fig. 7: Minority Class Percentage vs Membership Inference Vulnerability. Experiments are conducted with the CIFAR-10 dataset with automobile as the minority class.

C. Membership Inference Risk: Data Skewness

To date, the membership inference attack has been studied either using training datasets with uniform class distribution or without specific consideration of the impact of any data skewness. However, as we demonstrated earlier, the risk of membership inference vulnerability can vary when class representation is skewed. Minority classes can display increased risk to membership inference attack as models struggle to generalize effectively beyond the training data when fewer instances are given. In this section, we focus on studying the impact of data skewness on vulnerability to membership inference attacks.

In Figure 7 we investigate the impact of data skewness on membership inference vulnerability by controlling the representation of a single class. We reduce the automobile images from the CIFAR-10 dataset to only 1% of the data and then increase the representation until the dataset is again balanced, with automobiles representing 10% of the training images. We then plot the aggregate membership inference vulnerability, which is the overall membership inference attack accuracy evaluated across all classes, as well as the vulnerability of just the automobile class.
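A minimal sketch of how such a skewed training set can be constructed, assuming the standard Keras CIFAR-10 loader; the target fraction denotes the minority class's share of the resulting training set.

    import numpy as np
    import tensorflow as tf

    def skew_dataset(x, y, minority_class=1, fraction=0.05, seed=0):
        """Downsample one class (CIFAR-10 class 1 = automobile) so it makes up
        `fraction` of the returned training set; other classes are left untouched."""
        y = y.flatten()
        rng = np.random.default_rng(seed)
        minority_idx = np.where(y == minority_class)[0]
        other_idx = np.where(y != minority_class)[0]
        # If the minority class is a fraction f of the final set and the other classes
        # contribute n_other examples, the minority keeps n = f / (1 - f) * n_other.
        n_keep = int(round(fraction / (1.0 - fraction) * len(other_idx)))
        keep = rng.choice(minority_idx, size=min(n_keep, len(minority_idx)), replace=False)
        idx = rng.permutation(np.concatenate([other_idx, keep]))
        return x[idx], y[idx]

    (x_train, y_train), _ = tf.keras.datasets.cifar10.load_data()
    x_skew, y_skew = skew_dataset(x_train, y_train, minority_class=1, fraction=0.05)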

Figure 7 demonstrates that in cases where the automobile class constitutes 5% or less of the total training dataset, i.e., the automobile class has fewer than half as many instances as each of the other classes, this skewness will result in the automobile class displaying more severe vulnerability to membership inference attack.

Interestingly, the automobile class displays lower membership inference vulnerability than the average vulnerability reported on the CIFAR-10 dataset when the dataset is balanced. However, when the automobile class becomes a minority class with fewer than half the instances of each of the other classes, the vulnerability shifts to be greater than that reported by the overall model. This gap widens as further decreases in representation result in further increases in vulnerability for the automobile class.

Target Population        Attack Accuracy (%)
Aggregate                70.14
Male Images              68.18
Female Images            76.85
White Race Images        62.77
Racial Minority Images   89.90

TABLE IV: Vulnerability to membership inference attacks for different subsets of the LFW dataset.

Table IV additionally shows the vulnerability of a DNN target model trained on the LFW dataset to membership inference attacks. We analyze this vulnerability by breaking down the aggregated vulnerability across the top 22 classes into four different (non-disjoint) subsets of the LFW dataset: Male, Female, White Race, and Racial Minority. We observe that the training examples of racial minorities experience the highest attack success rate (89.90%) and are thus highly vulnerable to membership inference attacks compared to images of white individuals. Similarly, female images, which represent less than 25% of the training data, demonstrate higher average vulnerability (76.85%) compared with images of males (68.18%).
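A short sketch of this subgroup breakdown, assuming per-example attack predictions, ground-truth membership labels, and boolean metadata masks for the (non-disjoint) subsets; the mask names are illustrative.

    import numpy as np

    def subgroup_vulnerability(is_member, pred_member, groups):
        """Attack accuracy overall and for each (possibly overlapping) subgroup mask."""
        report = {"Aggregate": float((is_member == pred_member).mean())}
        for name, mask in groups.items():
            report[name] = float((is_member[mask] == pred_member[mask]).mean())
        return report

    # Example grouping for LFW-style metadata (masks are hypothetical arrays):
    # groups = {"Male": is_male, "Female": ~is_male,
    #           "White": is_white, "Racial Minority": ~is_white}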

To provide deeper insight and a more intuitive illustration of the increased vulnerability of minority groups under membership inference attacks, Table V provides 7 individual examples of images in the LFW dataset targeted by the membership inference attack. That is, given a query with each example image, the target model predicts the individual's race and gender (22 separate classes), which the attack model, a binary classifier, uses to predict whether that image was “in” or “out” of the target model's training dataset. The last row reports the ground truth as to whether or not the image was in the target model training set.

Through Table V, we highlight how minority populations are more likely to be identified by an attacker with a higher degree of confidence. We next discuss each example from left to right to articulate the impact of data skewness on model vulnerability to the membership inference attack.

For the 1st image, the target model is highly confident in its prediction and its prediction is indeed correct. Using the membership inference attack model, the attacker predicts with high confidence that the model must have seen this example, and it succeeds in the membership inference attack.

For the 2nd image, the target model is less confident in its prediction, although the prediction outcome is correct. The attacker succeeds in the membership inference attack because it correctly predicts that this example is not in the training set, though the attacker's confidence in this membership inference is much less certain (close to 50%) compared to that of the attack on the 1st image. We conjecture that the relatively low prediction confidence of the target model likely contributes to the attacker's inability to obtain high confidence for its membership inference attack.


✓ = correct prediction, ✗ = wrong prediction

                          1st       2nd       3rd       4th       5th       6th       7th
Target Confidence (%)     ✓ 99.99   ✓ 65.81   ✓ 72.56   ✗ 62.30   ✓ 99.99   ✗ 99.63   ✓ 98.38
Attacker Confidence (%)   ✓ 86.10   ✓ 50.49   ✗ 61.85   ✓ 72.06   ✓ 56.40   ✓ 99.88   ✗ 53.29
In Training Data?         in        out       out       out       in        out       out

TABLE V: Examples of target and attack model performance and confidence on various images from the LFW dataset.

The 3rd image is predicted by the target model correctly with a confidence of 72.56%, which is about 11.5% more confident than the prediction for the 2nd image. However, the attacker wrongly predicts that the example is in the training set when the ground truth shows that this example is not in the training set. The same logic as with the 1st image, i.e., that the confidence and accuracy of the target model's prediction may indicate the image was in the training dataset, could have misled the attacker.

For the 4th image, the target model has an incorrect prediction with a confidence of 62.30%. The attacker correctly predicts that this example is not in the training set. It is clear that a somewhat confident yet incorrect prediction by the target model is likely to result in high attacker confidence that this minority individual has not been seen during training.

These four images highlight the compounding downfall for minority populations. Models are more likely to overfit these populations. This leads to poor test accuracy for these populations and makes them more vulnerable to attack. As the 3rd example shows, the way to fool attackers is to have an accurate target model that can show reasonable confidence when classifying minority test images.

We next compare the results from the previous four images, which represented minority classes, with results from three images representing the majority class (white male images). For the 5th image in Table V, the target model predicts correctly with high confidence and the attacker is able to correctly predict that the image was in the training data. This result can be interpreted through comparison with the attacker performance on the 1st image. For the 5th query image, which is from the majority group, the attacker predicts correctly that the image has been seen in training with barely over 50% confidence, showing relatively high uncertainty compared with the 1st image from a minority class. This indicates that model accuracy and confidence are weaker indicators of membership inference vulnerability for the majority class.

For the 6th image, the target model produces an incorrect prediction with high confidence. The attacker is very confident that the query image is not in the training dataset, which is indeed the truth. This demonstrates a rare potential vulnerability for the majority class: when the target model has high confidence in an inaccurate prediction, an attack model is able to confidently succeed in the membership inference attack. Through this example and the above analysis, we see that the majority classes have two advantages compared to the minority classes: (1) it is less common for the target model to demonstrate this vulnerability of misclassification with high confidence; and (2) for majority classes, accuracy and privacy are aligned rather than competing objectives.

For the 7th image, the target model makes a correct prediction with high confidence. The attacker makes the incorrect prediction that the example was in the training dataset of the target model, but the truth is that the example is not in the training set. This failed membership inference attack is the flip side of the 5th image, with both reporting low attack confidence. Again, comparison with the 1st image of a minority class shows how model confidence and accuracy may lead to membership inference vulnerability for the minority classes in a way that is not true for the majority classes.

D. Mitigation with Differentially Private Training

The second core component of our experimental analysis is to use MPLens to investigate the effectiveness of differential privacy applied to deep learning models as a countermeasure for membership inference mitigation.

To define utility loss we follow [38] and consider loss = 1 − (dp-acc / acc), where acc represents the accuracy in a non-private setting and dp-acc is the accuracy when differential privacy is employed for the same model and data.
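In code, this utility-loss measure is simply:

    def utility_loss(acc, dp_acc):
        """Relative accuracy loss from [38]: loss = 1 - dp_acc / acc."""
        return 1.0 - dp_acc / acc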

1) Model and Problem Complexity: Table VI compares three different datasets (first column) and learning tasks (third column) using different model structures (second column). We measure the utility loss incurred when differential privacy is introduced, including both the training loss and the test loss, as well as the vulnerability when non-private training is used. For MNIST, we follow the architecture outlined in [12] by first applying PCA with 60 components to reduce data dimensionality before feeding the data into a simple neural network with one hidden ReLU layer containing 1,000 units.
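A sketch of this MNIST configuration, assuming scikit-learn for the PCA step and Keras for the network; with 60 input components, one hidden layer of 1,000 ReLU units, and a 10-way softmax, the model has 60·1000 + 1000 + 1000·10 + 10 = 71,010 trainable parameters, matching Table VI.

    import tensorflow as tf
    from sklearn.decomposition import PCA

    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train = x_train.reshape(len(x_train), -1) / 255.0
    x_test = x_test.reshape(len(x_test), -1) / 255.0

    # Reduce dimensionality to 60 components before training, as in [12].
    pca = PCA(n_components=60).fit(x_train)
    x_train_p, x_test_p = pca.transform(x_train), pca.transform(x_test)

    # One hidden ReLU layer with 1,000 units followed by a 10-way softmax.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(60,)),
        tf.keras.layers.Dense(1000, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="rmsprop",
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])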

For CIFAR-10 and CIFAR-100 we test two separate training approaches and architectures. First, we implement the architecture used in [38] for both CIFAR-10 and CIFAR-100. This includes no transfer learning but does include a PCA transformation prior to training to reduce dimensionality to 50 features. The network used in this case consists of two fully connected layers with 256 units each and a softmax layer for the output.

Dataset                # Trainable Parameters   Classes   Train Loss w/ DP (%)   Test Loss w/ DP (%)   Non-Private Vulnerability (%)
MNIST                  71,010                   10        5.09                   3.51                  53.11
CIFAR-10               83,978                   10        55.90                  14.34                 72.58
CIFAR-100              107,108                  100       82.66                  49.34                 74.04
CIFAR-10 (with TL)     1,036,810                10        51.00                  26.59                 72.94
CIFAR-100 (with TL)    1,071,460                100       85.60                  58.39                 89.08

TABLE VI: Problem complexity vs membership inference vulnerability and differential privacy utility loss ((ε, δ) = (10, 10⁻⁵)).

We also follow the approach from [12], which uses two ReLU convolutional layers with 64 channels, 5 × 5 convolutions, and a stride of 1. Each convolutional layer is then followed by a 2 × 2 max pool. There are two fully connected layers with 384 units before the final softmax layer. Prior to training the network, transfer learning (TL) is used to initialize the model weights. For CIFAR-10, transfer learning is done with the CIFAR-100 dataset while the reverse is done for the CIFAR-100 dataset. The convolutional layers are fixed so that only the fully connected and softmax layers are updated during training. Finally, the data is cropped to 24 × 24 to further reduce dimensionality.
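A sketch of this convolutional architecture, assuming Keras and "same" padding (a detail not stated above); under these assumptions the trainable parameters of the unfrozen head come to 1,036,810 for CIFAR-10 and 1,071,460 for CIFAR-100, matching Table VI. The transfer-learning initialization of the convolutional base from the other CIFAR dataset is indicated but not reproduced in full.

    import tensorflow as tf

    def build_cifar_model(num_classes, input_shape=(24, 24, 3)):
        """Two 5x5 conv layers (64 channels, stride 1), each followed by 2x2 max pooling,
        then two fully connected layers with 384 units and a softmax output, as in [12]."""
        conv = tf.keras.Sequential([
            tf.keras.layers.Input(shape=input_shape),
            tf.keras.layers.Conv2D(64, 5, strides=1, padding="same", activation="relu"),
            tf.keras.layers.MaxPooling2D(2),
            tf.keras.layers.Conv2D(64, 5, strides=1, padding="same", activation="relu"),
            tf.keras.layers.MaxPooling2D(2),
        ], name="conv_base")
        head = tf.keras.Sequential([
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(384, activation="relu"),
            tf.keras.layers.Dense(384, activation="relu"),
            tf.keras.layers.Dense(num_classes, activation="softmax"),
        ], name="head")
        # Transfer learning: conv_base weights are initialized from a model trained on
        # the other CIFAR dataset and then frozen; only the head is updated.
        conv.trainable = False
        return tf.keras.Sequential([conv, head])

    model = build_cifar_model(num_classes=100)  # CIFAR-100 with TL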

For the non-private setting, we fix learning to 100 epochs and use the RMSprop optimizer with the learning rate set to 0.001, decay to 1e−6, and the batch size set at 32. For the differentially private setting we use some of the default parameter values from the tensorflow privacy repository's MNIST example [39]: setting the learning rate to 0.15, the batch size to 256, the number of microbatches to 256, and the clipping value to 1.0. We then set the noise multiplier to the correct scale according to the privacy budget (ε, δ) and 100 epochs of learning. In Table VI, (ε, δ) = (10, 10⁻⁵). We employ the Rényi accounting approach [40] and leverage the tensorflow DPGradientDescentGaussianOptimizer.
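A minimal sketch of this differentially private training setup, assuming the TF1-era tensorflow_privacy API used by the MNIST example [39]; the module paths and the noise multiplier value shown here are assumptions to be checked against the installed version and tuned until the accountant reports the target (ε, δ).

    import tensorflow as tf
    from tensorflow_privacy.privacy.optimizers.dp_optimizer import (
        DPGradientDescentGaussianOptimizer,
    )
    from tensorflow_privacy.privacy.analysis.compute_dp_sgd_privacy import (
        compute_dp_sgd_privacy,
    )

    batch_size = 256
    microbatches = 256
    l2_norm_clip = 1.0
    learning_rate = 0.15
    epochs = 100
    noise_multiplier = 1.1   # assumption: adjusted until epsilon ~= 10 at delta = 1e-5

    optimizer = DPGradientDescentGaussianOptimizer(
        l2_norm_clip=l2_norm_clip,
        noise_multiplier=noise_multiplier,
        num_microbatches=microbatches,
        learning_rate=learning_rate,
    )

    # Per-example (non-reduced) loss is required so the optimizer can clip each
    # microbatch gradient before adding Gaussian noise.
    loss = tf.keras.losses.SparseCategoricalCrossentropy(
        from_logits=True, reduction=tf.keras.losses.Reduction.NONE)

    # Renyi-DP accountant: report the epsilon spent for the chosen settings.
    n_train = 50000  # e.g., CIFAR-10 training set size
    eps, _ = compute_dp_sgd_privacy(
        n=n_train, batch_size=batch_size,
        noise_multiplier=noise_multiplier, epochs=epochs, delta=1e-5)
    print("epsilon spent:", eps)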

In Table VI we compare the five scenarios (rows) with respect to utility loss and vulnerability to membership inference attacks. The results demonstrate that there is an unfortunate trade-off between the vulnerability and the usability of differential privacy as a mitigation technique. More complex models and classification problems display more significant utility loss when differential privacy is employed during DNN model training. The MNIST dataset represents a less complex dataset than the CIFAR-10 and CIFAR-100 datasets. In MNIST the images are black and white, with MNIST images of the same class being more similar than the colored object images in the CIFAR datasets. Therefore, the MNIST dataset is more easily learned by a small DNN model architecture with reasonable accuracy in the presence of a differentially private optimizer, which adds noise perturbation into each training epoch. Not surprisingly, for the same reason, the MNIST model has reduced vulnerability to membership inference attacks as the training examples are more similar to one another.

The CIFAR-10 dataset, on the other hand, is more complex and requires more complex models with a greater number of trainable parameters. In this case the models demonstrate a higher vulnerability to membership inference attack than the MNIST dataset. Unfortunately, this complexity also increases the loss in utility when a differentially private optimizer is employed in the model training.

Additionally, comparing the CIFAR-10 dataset with the CIFAR-100 dataset, CIFAR-100 represents a more complex learning problem with 100 possible classes rather than 10. This again results in not only a more significant loss for the CIFAR-100 dataset in the presence of differentially private deep learning, but also higher vulnerability to membership inference attacks.

Comparing the two models trained on the most complex dataset, CIFAR-100, we observe the impact of model complexity on the vulnerability to membership inference attack. The CIFAR-100 model with transfer learning (last row) has a (24, 24, 3) image input compared to the 50-feature input from the PCA approach using CIFAR-100 (third row). After processing the image through the fixed convolutional layers, the model with transfer learning still has significantly more parameters (compare row 5 with row 3). We note that the more complex model structure (over 1 million parameters) has increased membership inference vulnerability for the CIFAR-100 dataset, with 89.08% attack accuracy compared with the vulnerability of 74.04% when using the less complex model with reduced input dimensionality (107,108 trainable parameters). However, the more complex model also reports a greater test loss of 58.39% compared to the loss of 49.34% for the less complex model.

Our experimental results from Table VI underscore an important observation. Using differential privacy as effective mitigation for membership inference remains an open challenge. One primary reason is that even with differentially private deep learning, the most vulnerable DNN models (and datasets) are unfortunately also those which experience the greatest utility loss.

2) Implications for Skewed Datasets: An additional challenge in the deployment of differential privacy is its impact on skewed datasets. Using the approach and model structure from [38], we experiment with different levels of data skewness in the CIFAR-10 dataset. We again gradually decrease the representation of the automobile class from 10% (even distribution) to 2% of the training data for a fixed ε of 1000. While this is a large ε value when considering differential privacy theory, the results in [38] indicate that more common theoretical ε values, such as 1, result in no meaningful learning at all. Our experiments also validate this observation. We conduct this set of experiments by varying the data skewness in the presence of the largest ε reported in [38]. In addition, we also measure and report the results for smaller ε with the automobile class set to 2% of the training data. Figure 8 shows the results of these two sets of experiments, measured using overall model accuracy loss and automobile F-1 score loss. We compare the differentially private deep learning setting with the non-private setting using the same model architecture and data distribution. We report the percentage of utility loss according to these scores.

Fig. 8: Impact of differentially private model training on minority classes.

In the statistical analysis of binary classification, the F-1 score measures accuracy by considering both the precision and the recall of the test. The F-1 score is the harmonic mean of precision and recall; it reaches its best value at 1 (perfect precision and recall) and its worst at 0. Let p be the number of correct positive results divided by the number of all positive results returned by the classifier, and r be the number of correct positive results divided by the number of all relevant samples (all samples that should have been identified as positive). We have F-1 = 2(p × r)/(p + r). To isolate the utility loss for the automobile class, we therefore report the loss in F-1 score calculated using the precision and recall of only the automobile class, where automobile is considered positive and all other classes negative.
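For example, the per-class F-1 comparison can be computed directly with scikit-learn, treating the automobile class (label 1) as the positive class; the prediction arrays are placeholders.

    from sklearn.metrics import f1_score

    def minority_f1_loss(y_true, y_pred_nonprivate, y_pred_dp, positive_class=1):
        """Relative loss in one class's F-1 score when moving from non-private to
        differentially private training, analogous to the accuracy-loss measure."""
        f1_np = f1_score(y_true == positive_class, y_pred_nonprivate == positive_class)
        f1_dp = f1_score(y_true == positive_class, y_pred_dp == positive_class)
        return 1.0 - f1_dp / f1_np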

In the left sub-figure of Figure 8, we compare the accuracy loss of the overall model with the loss in F-1 score of the automobile class (class 1). We vary the data skewness of the automobile class in the training set and report the results as the percentage decrease in F-1 score when using the differentially private deep learning setting compared to the non-private deep learning setting. This left sub-figure shows that as the automobile representation increases from 2% to 10% (X-axis), i.e., as the skewness decreases, the gap between the private setting and the non-private setting becomes smaller. When the data skewness is high for the automobile class (class 1), the minority group is more vulnerable, as seen in Figure 7, but unfortunately also shows the greatest utility loss for both training and test when differential privacy is introduced.

In the right sub-figure of Figure 8, we again compare the accuracy loss of the overall model with the loss in F-1 score of the automobile class (class 1). This time we vary the privacy parameter ε from 10 to 1000 and report the measurement results as the percentage of utility loss when using a differentially private deep learning setting compared to a non-private deep learning setting. We observe that the minority group has much higher utility loss under the privacy setting, even when the privacy budget parameter ε is set to larger (and theoretically almost meaningless) values. This confirms the observation that differential privacy as a mitigation technique presents a catch-22: the minority class is more likely to be successfully targeted by a membership inference attack but also suffers the most under differential privacy.

In summary, Figure 8 highlights how differential privacy as a mitigation technique also leads to unfortunate outcomes for minority classes. That is, the data which is most vulnerable to attack is also the most likely to experience intolerable accuracy loss under the differentially private setting compared to the non-private setting. This mirrors the challenge wherein complex datasets and complex model architectures are more vulnerable to membership inference attacks and, at the same time, are also more likely to experience significant accuracy loss with differentially private deep learning.

VII. IMPLICATIONS FOR FEDERATED LEARNING SYSTEMS

In response to legislative restrictions on data sharing, federated learning has gained increased attention from both academia and industry. In a federated learning environment, multiple institutions (or individuals) may leverage their collective data by sharing model parameters rather than sharing raw data. For example, rather than a group of financial institutions sharing their data with one central authority for model training, each can learn a federated model by sharing model parameters and keeping their training data private. Such a federated model should be more accurate in prediction than any single participant's model. There are a number of ways one can build such a federated learning model. First, one can share the local training parameters during each epoch or each iteration of the local model training in a federated learning system. This allows cross-model learning in an iterative manner. Alternatively, one can also build a federated learning model by sharing the model parameters or the locally trained model after each participant has completed its local training over its private training dataset. In this scenario, the federated learning model can be viewed as an ensemble of its participants' trained models over a horizontally partitioned dataset, which is the union of its participants' training datasets.
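A minimal sketch of the first, iterative style of parameter sharing (federated averaging over a toy least-squares model); the equal-weight mean aggregation and the NumPy model representation are illustrative assumptions, not the protocol of any cited system.

    import numpy as np

    def local_update(w, X, y, lr=0.01, epochs=1):
        # One or more epochs of plain gradient descent on a local least-squares objective.
        for _ in range(epochs):
            grad = X.T @ (X @ w - y) / len(y)
            w -= lr * grad
        return w

    def federated_round(global_w, participants):
        # Each participant starts from the shared parameters, trains on its private
        # data, and returns only the updated parameters; the server averages them.
        local_ws = [local_update(global_w.copy(), X, y) for X, y in participants]
        return np.mean(local_ws, axis=0)

    # Toy usage: three participants holding horizontally partitioned data.
    rng = np.random.default_rng(0)
    parts = [(rng.normal(size=(100, 5)), rng.normal(size=100)) for _ in range(3)]
    w = np.zeros(5)
    for _ in range(10):
        w = federated_round(w, parts)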

Recent work on membership inference has shown that federated learning exposes participants to significant privacy risk [14], [15], [41]. In response to such membership privacy vulnerability, proposals for privacy-preserving federated learning have been put forward, including adding differentially private noise prior to the parameter updates by each party [42]–[44]. However, preliminary results have indicated that each individual participant is likely to have different privacy risks [42]. Also, when taking into account the privacy policy of a participating institution, the sensitivity of the data held by that institution, and the distribution of its data, one institution may demand a higher degree of privacy than another participant. We argue that membership privacy analysis and membership inference mitigation through differential privacy can be a valuable component in the research and development of federated machine learning systems for real-world mission critical applications.

VIII. RELATED WORK

Existing related research can be broadly classified into three categories: (1) membership inference, (2) the connection between membership inference and adversarial machine learning, and (3) differential privacy as a mitigation technique.

Membership Inference Attacks. The membership inference attack against machine learning models was first presented in [17], where the authors proposed the shadow model attack. [14] characterized attack vulnerability with respect to different model types and datasets and introduced the vulnerability of loosely federated systems, while the authors in [18] relax adversarial assumptions and generate a model- and data-independent attacker. Hayes et al. demonstrate the membership inference vulnerability of generative models [45], while Melis et al. highlight the potential for feature leakage in collaborative learning environments [46]. Finally, in [47] the authors articulate the power of black-box attacks compared with white-box attacks. We expand on the foundation from these works to offer new insights into the membership inference attack, particularly focusing on skewed datasets and differentially private mitigation techniques.

Membership Inference and Adversarial Machine Learning. There have been some recent works analyzing the connection between membership inference and adversarial machine learning. Song et al. [48] and Mejia et al. [49] demonstrate that existing adversarial defense methods cause an increase in membership inference vulnerability. Our membership inference evaluation system highlights the orthogonal risk: how membership inference can inform adversarial machine learning attackers.

Differential Privacy to Mitigate Membership Inference. Both [38] and [50] investigate differential privacy as a mitigation technique for membership inference attacks. Both indicate that existing differential privacy techniques do not display viable accuracy and privacy protection trade-offs. That is, models either show large accuracy losses or display high membership inference vulnerability. We extend this line of work to investigate the implications of DNN model complexity, learning task complexity, and data skewness on membership inference vulnerability and on the effectiveness of differentially private learning as a mitigation strategy.

IX. CONCLUSION

Membership inference attacks seek to infer membership of individual training instances of a privately trained model through black-box access to the prediction API of a MLaaS provider. This paper provides an in-depth characterization of membership inference vulnerability from multiple perspectives. We develop MPLens to expose membership inference vulnerability and perform privacy analysis and privacy compliance evaluation. We demonstrate how membership inference attack methods can be used in the development of representative datasets which not only signify a privacy leakage but also facilitate the generation of substitute models with respect to the attack target model, therefore heightening the risk of adversarial attacks. We investigate training data skewness and its impact on membership inference vulnerability. We also evaluate differential privacy as a mitigation technique for membership inference against deep learning models. Our empirical results show that (1) minority groups within skewed datasets display increased risk for membership inference and (2) differential privacy presents many trade-offs as a mitigation technique to membership inference risk.

Acknowledgement

The authors are partially sponsored by NSF CISE SaTC (1564097), and the first author acknowledges the IBM PhD Fellowship award received in September 2019. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or other funding agencies and companies mentioned above.

REFERENCES

[1] Amazon Machine Learning: Developer Guide. Amazon Web Services, 2018.

[2] Cloud Machine Learning Engine Documentation. Google, 2018.

[3] IBM Data Science and Machine Learning. IBM, 2018.

[4] Marshall Copeland, Julian Soh, Anthony Puca, Mike Manning, and David Gollob. Microsoft Azure: Planning, Deploying, and Managing Your Data Center in the Cloud. Apress, Berkeley, CA, USA, 1st edition, 2015.

[5] Accountability Act. Health insurance portability and accountability act of 1996. Public law, 104:191, 1996.

[6] General Data Protection Regulation. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46. Official Journal of the European Union (OJ), 59(1-88):294, 2016.

[7] Assembly Bill 375 (2017-2018 Reg. Sess.). CA 2018: California Legislature.

[8] Peter Lake and Paul Crowther. Data, an organisational asset. In Concise Guide to Databases, pages 3–19. Springer, 2013.

[9] Ling Liu, Wenqi Wei, Ka-Ho Chow, Margaret Loper, Emre Gursoy, Stacey Truex, and Yanzhao Wu. Deep neural network ensembles against deception: Ensemble diversity, accuracy and robustness. arXiv preprint arXiv:1908.11091, 2019.

[10] Cynthia Dwork. Differential privacy: A survey of results. In International Conference on Theory and Applications of Models of Computation, pages 1–19. Springer, 2008.

[11] Mathias Lecuyer, Vaggelis Atlidakis, Roxana Geambasu, Daniel Hsu, and Suman Jana. Certified robustness to adversarial examples with differential privacy. In IEEE Symposium on Security and Privacy (SP). IEEE, 2019.

[12] Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pages 308–318. ACM, 2016.

[13] Lei Yu, Ling Liu, Calton Pu, Mehmet Emre Gursoy, and Stacey Truex. Differentially private model publishing for deep learning. In IEEE Symposium on Security and Privacy (SP). IEEE, 2019.

[14] Stacey Truex, Ling Liu, Mehmet Emre Gursoy, Lei Yu, and Wenqi Wei. Demystifying membership inference attacks in machine learning as a service. IEEE Transactions on Services Computing, 2019.

[15] Milad Nasr, Reza Shokri, and Amir Houmansadr. Comprehensive privacy analysis of deep learning: Stand-alone and federated learning under passive and active white-box inference attacks. 2019.

[16] Samuel Yeom, Irene Giacomelli, Matt Fredrikson, and Somesh Jha. Privacy risk in machine learning: Analyzing the connection to overfitting. In IEEE 31st Computer Security Foundations Symposium (CSF), pages 268–282. IEEE, 2018.


[17] Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov. Membership inference attacks against machine learning models. In IEEE Symposium on Security and Privacy (SP), pages 3–18. IEEE, 2017.

[18] Ahmed Salem, Yang Zhang, Mathias Humbert, Mario Fritz, and Michael Backes. Ml-leaks: Model and data independent membership inference attacks and defenses on machine learning models. arXiv preprint arXiv:1806.01246, 2018.

[19] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.

[20] Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z Berkay Celik, and Ananthram Swami. Practical black-box attacks against deep learning systems using adversarial examples. arXiv preprint, 2016.

[21] Nicolas Papernot, Patrick McDaniel, and Ian Goodfellow. Transferability in machine learning: From phenomena to black-box attacks using adversarial samples. arXiv preprint arXiv:1605.07277, 2016.

[22] Andras Rozsa, Manuel Gunther, and Terrance E Boult. Are accuracy and robustness correlated. In 15th IEEE International Conference on Machine Learning and Applications (ICMLA), pages 227–232. IEEE, 2016.

[23] Dua Dheeru and Efi Karra Taniskidou. UCI machine learning repository, 2017.

[24] Cynthia Dwork, Aaron Roth, et al. The algorithmic foundations of differential privacy. Foundations and Trends® in Theoretical Computer Science, 9(3–4):211–407, 2014.

[25] Frank D McSherry. Privacy integrated queries: an extensible platform for privacy-preserving data analysis. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, pages 19–30. ACM, 2009.

[26] Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dandelion Mane, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viegas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.

[27] Shiva Prasad Kasiviswanathan, Homin K Lee, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. What can we learn privately? SIAM Journal on Computing, 40(3):793–826, 2011.

[28] Amos Beimel, Shiva Prasad Kasiviswanathan, and Kobbi Nissim. Bounds on the sample complexity for private learning and private data release. In Theory of Cryptography Conference, pages 437–454. Springer, 2010.

[29] Yanzhao Wu, Ling Liu, Juhyun Bae, Ka-Ho Chow, Arun Iyengar, Calton Pu, Wenqi Wei, Lei Yu, and Qi Zhang. Demystifying learning rate policies for high accuracy training of deep neural networks. arXiv preprint arXiv:1908.06477, 2019.

[30] Francois Chollet. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1251–1258, 2017.

[31] Maurizio Naldi and Giuseppe D'Acquisto. Differential privacy: An estimation theory-based method for choosing epsilon. arXiv preprint arXiv:1510.00917, 2015.

[32] Jaewoo Lee and Chris Clifton. How much is enough? Choosing ε for differential privacy. In International Conference on Information Security, pages 325–340. Springer, 2011.

[33] Justin Hsu, Marco Gaboardi, Andreas Haeberlen, Sanjeev Khanna, Arjun Narayan, Benjamin C Pierce, and Aaron Roth. Differential privacy: An economic method for choosing epsilon. In IEEE 27th Computer Security Foundations Symposium, pages 398–410. IEEE, 2014.

[34] Nitin Kohli and Paul Laskowski. Epsilon voting: Mechanism design for parameter selection in differential privacy. In IEEE Symposium on Privacy-Aware Computing (PAC), pages 19–30. IEEE, 2018.

[35] Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. 2009.

[36] Yann LeCun, Corinna Cortes, and CJ Burges. MNIST handwritten digit database. AT&T Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, 2, 2010.

[37] Wenqi Wei, Ling Liu, Margaret Loper, Ka-Ho Chow, Emre Gursoy, Stacey Truex, and Yanzhao Wu. Cross-layer strategic ensemble defense against adversarial examples. arXiv preprint arXiv:1910.01742, 2019.

[38] Bargav Jayaraman and David Evans. Evaluating differentially private machine learning in practice. In 28th USENIX Security Symposium (USENIX Security 19). Santa Clara, CA: USENIX Association, 2019.

[39] Galen Andrew, Steve Chien, and Nicolas Papernot. tensorflow privacy, 2019.

[40] Joseph Geumlek, Shuang Song, and Kamalika Chaudhuri. Renyi differential privacy mechanisms for posterior sampling. In Advances in Neural Information Processing Systems, pages 5289–5298, 2017.

[41] Zhibo Wang, Mengkai Song, Zhifei Zhang, Yang Song, Qian Wang, and Hairong Qi. Beyond inferring class representatives: User-level privacy leakage from federated learning. In IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, pages 2512–2520. IEEE, 2019.

[42] Stacey Truex, Nathalie Baracaldo, Ali Anwar, Thomas Steinke, Heiko Ludwig, and Rui Zhang. A hybrid approach to privacy-preserving federated learning. arXiv preprint arXiv:1812.03224, 2018.

[43] Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, and Karn Seth. Practical secure aggregation for privacy-preserving machine learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pages 1175–1191. ACM, 2017.

[44] Lingchen Zhao, Qian Wang, Qin Zou, Yan Zhang, and Yanjiao Chen. Privacy-preserving collaborative deep learning with unreliable participants. IEEE Transactions on Information Forensics and Security, 2019.

[45] Jamie Hayes, Luca Melis, George Danezis, and Emiliano De Cristofaro. Logan: Membership inference attacks against generative models. Proceedings on Privacy Enhancing Technologies, 2019(1):133–152, 2019.

[46] Luca Melis, Congzheng Song, Emiliano De Cristofaro, and Vitaly Shmatikov. Exploiting unintended feature leakage in collaborative learning. In IEEE Symposium on Security and Privacy (SP). IEEE, 2019.

[47] Alexandre Sablayrolles, Matthijs Douze, Yann Ollivier, Cordelia Schmid, and Herve Jegou. White-box vs black-box: Bayes optimal strategies for membership inference. arXiv preprint arXiv:1908.11229, 2019.

[48] Liwei Song, Reza Shokri, and Prateek Mittal. Privacy risks of securing machine learning models against adversarial examples. arXiv preprint arXiv:1905.10291, 2019.

[49] Felipe A Mejia, Paul Gamble, Zigfried Hampel-Arias, Michael Lomnitz, Nina Lopatina, Lucas Tindall, and Maria Alejandra Barrios. Robust or private? Adversarial training makes models more vulnerable to privacy attacks. arXiv preprint arXiv:1906.06449, 2019.

[50] Md Atiqur Rahman, Tanzila Rahman, Robert Laganiere, Noman Mohammed, and Yang Wang. Membership inference attack against differentially private deep learning model. Transactions on Data Privacy, 11(1):61–79, 2018.