Incentive compatible Assured Data Sharing & Mining

Murat Kantarcioglu
UT DALLAS Erik Jonsson School of Engineering & Computer Science


Page 1: Incentive compatible Assured Data Sharing & Mining

UT DALLAS Erik Jonsson School of Engineering & Computer Science

FEARLESS engineering

Incentive compatible Assured Data Sharing & Mining

Murat Kantarcioglu

Page 2: Incentive compatible Assured Data Sharing & Mining

Incentives and Trust in Assured Information Sharing

• Combining intelligence through a loose alliance
– Bridges gaps due to sovereign boundaries
– Maximizes yield of resources
– Discovery of new information through correlation, analysis of the ‘big picture’
– Information exchanged privately between two participants

• Drawbacks to sharing
– Misinformation
– Freeloading

Goal: Create means of encouraging desirable behavior within an environment which lacks or cannot support a central governing agent

Page 3: Incentive compatible Assured Data Sharing & Mining

Possible Scenarios

• You may verify the shared data, and issue fines if the data is wrong
– This is easy

• You may verify the shared data but cannot issue fines
– A little bit harder

• You may only verify some aggregate result
– Hardest

Page 4: Incentive compatible Assured Data Sharing & Mining

Game Matrix

[Payoff matrix: rows are Agent i's actions (Play: Truth, Play: Lie, Do Not Play); columns are Agent j's actions (Play: Truth, Play: Lie, Do Not Play). Every cell in which either agent chooses Do Not Play has payoff 0 for both agents. The payoff expressions in the four Play/Play cells did not survive extraction.]

Terms annotated on the matrix: value of information, minimal verification probability, cost of verification, trust value, agent type

Page 5: Incentive compatible Assured Data Sharing & Mining

Behaviors Analyzed in Data Sharing Simulations

Name | Strategy | Verification? | Punishment? | Comments
Honest | Truth | No | No | Optimistic, maximizes returns
Dishonest | Lie | No | No | Takes advantage of other players, trumps Honest in 1 on 1
Random | Truth, Lie | No | No | Chaotic, chooses either with equal probability
Tit-for-Tat | Truth, Lie | Always | Special | Mirrors other players’ actions, starts by selecting Truth
LivingAgent | Truth | Trust-based | No trading | Verifies activity according to trust ratings, will cease activity for a number of rounds with a player who is caught lying
Liar | Truth, Lie | Trust-based | No trading | Identical to LivingAgent but lies with small probability
SubtleLie | Truth, Lie | Trust-based | No trading | Identical to Liar, except lies whenever information value reaches a certain threshold
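The four simplest behaviors in the table can be sketched in a few lines of Python. This is only an illustration of the strategy descriptions, not the simulator used in the experiments: the class names, the `choose` method, and the `play_rounds` helper are hypothetical, payoffs are left out, and each player is assumed to observe the other's past moves perfectly (which corresponds to "always verify" for Tit-for-Tat). The trust-based behaviors are omitted because the slides do not give the trust update rule.

```python
import random

TRUTH, LIE = "truth", "lie"


class Honest:
    """Always shares truthful data."""
    def choose(self, opponent_history):
        return TRUTH


class Dishonest:
    """Always lies."""
    def choose(self, opponent_history):
        return LIE


class RandomPlayer:
    """Chooses Truth or Lie with equal probability."""
    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def choose(self, opponent_history):
        return self.rng.choice([TRUTH, LIE])


class TitForTat:
    """Starts with Truth, then mirrors the opponent's previous move."""
    def choose(self, opponent_history):
        return opponent_history[-1] if opponent_history else TRUTH


def play_rounds(a, b, rounds):
    """Return the list of (a_move, b_move) pairs for the given number of rounds.

    Assumption: each player sees the full, correct history of the other's moves.
    """
    a_hist, b_hist = [], []
    for _ in range(rounds):
        a_move = a.choose(b_hist)
        b_move = b.choose(a_hist)
        a_hist.append(a_move)
        b_hist.append(b_move)
    return list(zip(a_hist, b_hist))
```

For example, `play_rounds(TitForTat(), Dishonest(), 3)` has Tit-for-Tat tell the truth once and then lie in every later round.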

Page 6: Incentive compatible Assured Data Sharing & Mining

Simulation Results

We set δmin = 3, δmax = 7, CV = 2

The lie threshold is set to 6.9

Honest behavior wins 97% of the time if all behaviors exist.

Experiments show that without LivingAgent behavior, Honest behavior cannot flourish.

Please see the following paper for more details:

“Incentive and Trust Issues in Assured Information Sharing,” Ryan Layfield, Murat Kantarcioglu, and Bhavani Thuraisingham, International Conference on Collaborative Computing, 2008

Page 7: Incentive compatible Assured Data Sharing & Mining

Verifying Final Result: Our Model

• Players P1, ..., Pn:
– Each has some data (x1, ..., xn)
– Goal: compute a data mining function, D(x1, ..., xn), that maximizes the sum of the participants’ valuation functions

• Player Pt: mediator between parties, computes the function securely, and has test data xt

• Players value privacy, correctness, exclusivity

• Problem: How do we ensure that players share data truthfully?

Page 8: Incentive compatible Assured Data Sharing & Mining

Assumption

• The best model, that is, the one that maximizes the sum of the valuation functions, is the model built by using the submitted input data.

• Formally: given submitted valuation functions and submitted data,
– $D(x) = \arg\max_{m \in M} \sum_{k} v_k(m)$ for any set of players

Page 9: Incentive compatible Assured Data Sharing & Mining

Mechanism

• Reservation utility normalized to 0

• $u_i(m) = v_i(m) - p_i(v_i, v_{-i})$
– [u = utility] [v = valuation] [p = payment]

• $p_i(v_i, v_{-i}) = \max_{m' \in M} \sum_{k \neq i} v_k(m') - \sum_{k \neq i} v_k(m)$

• $v_i(m) = \max\{0, \mathrm{acc}(m) - \mathrm{acc}(D(x_i))\} - c(D)$
– c is the cost of computation, acc is accuracy
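A minimal sketch of how the payment and utility could be computed over a small, finite set of candidate models. It reads the payment as the standard Clarke pivot rule: the best total valuation the other players could obtain, minus the total valuation they obtain under the chosen model. The function names and the dictionary layout (`valuations[player][model]`) are assumptions made for this illustration, and the numbers in the usage note are invented.

```python
def choose_model(models, valuations):
    """Return the model that maximizes the sum of all reported valuations."""
    return max(models, key=lambda m: sum(v[m] for v in valuations.values()))


def vcg_payment(i, models, valuations):
    """Clarke pivot payment for player i.

    Best welfare the other players could reach over all models,
    minus their welfare under the model actually chosen.
    """
    others = [v for k, v in valuations.items() if k != i]

    def welfare_of_others(m):
        return sum(v[m] for v in others)

    chosen = choose_model(models, valuations)
    best_for_others = max(welfare_of_others(m) for m in models)
    return best_for_others - welfare_of_others(chosen)


def utility(i, models, valuations):
    """u_i(m) = v_i(m) - p_i for the chosen model m."""
    chosen = choose_model(models, valuations)
    return valuations[i][chosen] - vcg_payment(i, models, valuations)
```

With `models = ["m1", "m2"]` and valuations `P1: {m1: 5, m2: 1}`, `P2: {m1: 1, m2: 3}`, `P3: {m1: 1, m2: 2}`, model `m1` is chosen (total 7 versus 6). P1 pays 3, because the others would have obtained 5 under `m2` but obtain only 2 under `m1`; P2 and P3 pay 0.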

Page 10: Incentive compatible Assured Data Sharing & Mining

Mechanism

• We compute $p_i$ using the independent test set held by $P_t$

• Intuitively, the mechanism rewards players based on their contribution to the overall model

• This is a VCG mechanism, proved incentive compatible under our assumption
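The incentive-compatibility claim follows the usual VCG argument, sketched here under the assumption that the payment takes the Clarke pivot form (best welfare of the others minus their welfare under the chosen model). Here $m$ denotes the model the mechanism selects from the submitted reports.

```latex
\begin{align*}
u_i(m) &= v_i(m) - p_i(v_i, v_{-i}) \\
       &= v_i(m) + \sum_{k \neq i} v_k(m) - \max_{m' \in M} \sum_{k \neq i} v_k(m') \\
       &= \underbrace{\sum_{k} v_k(m)}_{\text{total welfare at } m}
          \;-\; \underbrace{\max_{m' \in M} \sum_{k \neq i} v_k(m')}_{\text{independent of player } i\text{'s report}}
\end{align*}
```

Player $i$ can influence only the first term, and only through the choice of $m$. Because the mechanism picks the $m$ that maximizes the sum of reported valuations, reporting truthfully makes it maximize exactly the quantity player $i$ cares about, so no misreport yields a higher utility.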

Page 11: Incentive compatible Assured Data Sharing & Mining

Experiments

• Does this assumption hold for normal data?

• Methodology
– 4 data sets from UCI Repository
– 3-party vertical partitioning, naïve-Bayes classifiers
– Determine accuracy and payouts
– Payouts estimated by acc(classifier) - acc(classifier without player i’s data) - constant cost
– Once with all players truthful
– Once for each player and for each amount of perturbation (1%, 2%, 4%, 8%, 16%, 32%, 64%, 100%)
– 50 runs on each
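The payout estimate and the perturbation levels from the methodology can be sketched as follows. The slides do not say how a player's data was perturbed, so the `perturb` function below is an assumption made for illustration: it redraws a fixed fraction of a column's entries uniformly from a given domain. All function and variable names are hypothetical.

```python
import random

# Perturbation levels listed in the methodology, as fractions.
PERTURBATION_LEVELS = [0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.00]


def perturb(values, fraction, domain, rng):
    """Return a copy of `values` with round(fraction * len(values)) entries
    redrawn uniformly from `domain`.

    Assumption: this is one plausible way to "lie" about a fraction of the
    data; a redrawn entry may happen to equal the original value.
    """
    n_changed = round(fraction * len(values))
    out = list(values)
    for idx in rng.sample(range(len(values)), n_changed):
        out[idx] = rng.choice(domain)
    return out


def estimated_payout(acc_all, acc_without_i, cost):
    """acc(classifier) - acc(classifier without player i's data) - constant cost."""
    return acc_all - acc_without_i - cost
```

A full experiment would train one classifier on all parties' attributes and one without player i's attributes for each level in `PERTURBATION_LEVELS`, then pass the two test-set accuracies to `estimated_payout`.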

Page 12: Incentive compatible Assured Data Sharing & Mining

Census-Income (Adult)

[Figure: Overall Accuracy (y-axis, 0 to 1) at each perturbation level T, L(1%), L(2%), L(4%), L(8%), L(16%), L(32%), L(64%), L(100%); series: Player 1 Lying, Player 2 Lying, Player 3 Lying]

Page 13: Incentive compatible Assured Data Sharing & Mining

Census-Income (Adult)

[Figure: Payouts based on Overall Accuracy (y-axis, -0.6 to 0) at each perturbation level T, L(1%), L(2%), L(4%), L(8%), L(16%), L(32%), L(64%), L(100%); series: Player 1 Lying, Player 2 Lying, Player 3 Lying]

Page 14: Incentive compatible Assured Data Sharing & Mining

Census-Income (Adult)

[Figure: Payouts - Overall Accuracy - Player 1 Lying (y-axis, -0.6 to 0.6) at each perturbation level T, L(1%), L(2%), L(4%), L(8%), L(16%), L(32%), L(64%), L(100%); series: Player 1, Player 2, Player 3]

Page 15: Incentive compatible Assured Data Sharing & Mining

Census-Income (Adult)

[Figure: Payouts - Overall Accuracy - Player 2 Lying (y-axis, -0.6 to 0.4) at each perturbation level T, L(1%), L(2%), L(4%), L(8%), L(16%), L(32%), L(64%), L(100%); series: Player 1, Player 2, Player 3]

Page 16: Incentive compatible Assured Data Sharing & Mining

Census-Income (Adult)

[Figure: Payouts - Overall Accuracy - Player 3 Lying (y-axis, -0.6 to 0.4) at each perturbation level T, L(1%), L(2%), L(4%), L(8%), L(16%), L(32%), L(64%), L(100%); series: Player 1, Player 2, Player 3]

Page 17: Incentive compatible Assured Data Sharing & Mining

Breast-Cancer-Wisconsin

[Figure: Overall Accuracy (y-axis, 0.91 to 0.97) at each perturbation level T, L(1%), L(2%), L(4%), L(8%), L(16%), L(32%), L(64%), L(100%); series: Player 1 Lying, Player 2 Lying, Player 3 Lying]

Page 18: Incentive compatible Assured Data Sharing & Mining

Conclusions

• Does the assumption hold?
– Not always, but it is very close, and would work as a practical assumption

• If a better model is found through lying, does this hurt or help?
– Consideration: change the goal; not to prevent lying but to build the most accurate classifier
– Finding the “right” lie may take too much computation to be profitable