Incentive Compatible Privacy-Preserving Data Analysis M.V.Rupa Sri 310204120033



DESCRIPTION

Nowadays, data management applications have evolved from pure storage and retrieval of information to finding interesting patterns and associations in large amounts of data. With the advancement of Internet and networking technologies, more and more computing applications, including data mining programs, must run across multiple data sources scattered over different locations and jointly carry out a computation to reach a common result. However, because of legal constraints and competitive concerns, privacy issues arise in distributed data mining, which has attracted interest from the data mining research community. In this project, each party participates in a protocol to learn the output of some function f over the joint inputs of the parties. We focus mainly on the Deterministic Non-Cooperative Computation (DNCC) model rather than a probabilistic extension. DNCC still needs to be extended to include the possibility of collusion.


Page 1: Incentive Compatible Privacy Preserving Data Analysis

Incentive Compatible Privacy-Preserving Data Analysis

M.V.Rupa Sri

310204120033


ABSTRACT

• In many cases, competing parties who have private data may collaboratively conduct privacy-preserving distributed data analysis (PPDA) tasks to learn beneficial data models or analysis results. The field of privacy has seen rapid advances in recent years because of the increased ability to store data. In particular, recent advances in the data mining field have led to increased concerns about privacy.

• It is often highly valuable for organizations to have their data analyzed by external agents. However, any program that computes on potentially sensitive data risks leaking information through its output. Differential privacy provides a theoretical framework for processing data while protecting the privacy of individual records in a dataset.


EXISTING SYSTEM

• SECURE MULTIPARTY COMPUTATION

• Definition:

In the existing system, participating parties are generally assumed to provide truthful inputs. This assumption is usually justified by the fact that learning the correct data analysis models or results is in the best interest of all participating parties. A party that does not want to learn the data models and analysis results should not participate in the protocol.
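
The truthful-input assumption can be illustrated with a minimal sketch of a secure sum built on additive secret sharing, one common SMC building block. This is not claimed to be the protocol used in this project; the names `MODULUS`, `share`, and `secure_sum` are hypothetical, and the whole exchange is simulated in one process rather than over a network.

```python
import secrets

# Hypothetical public modulus agreed on by all parties (an assumption of this sketch).
MODULUS = 2**61 - 1

def share(value, n_parties):
    """Split value into n additive shares that sum to value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]

def secure_sum(private_inputs):
    """Simulate a secure sum: no single share reveals a party's input."""
    n = len(private_inputs)
    # Row i holds the shares that party i sends out, one per recipient.
    all_shares = [share(v, n) for v in private_inputs]
    # Party j adds the shares it received and publishes only that partial sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    return sum(partial_sums) % MODULUS

truthful = secure_sum([10, 20, 30])
# The third party misreports its input before the protocol starts.
with_lie = secure_sum([10, 20, 999])
```

The protocol returns the exact sum of whatever values it is given, so a misreported input passes through undetected. That is the gap the truthful-input assumption covers.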


PROPOSED SYSTEM

• The term incentive compatible means that participating parties have the incentive, or motivation, to provide their actual inputs when they compute a functionality. Although SMC-based privacy-preserving data analysis protocols (under the malicious adversary model) can prevent participating parties from modifying their inputs once the protocols are initiated, they cannot prevent the parties from modifying their inputs before execution. On the other hand, parties are expected to provide their true inputs in order to correctly evaluate a function that satisfies the NCC model. Therefore, any functionality that satisfies the NCC model is inherently incentive compatible, under the assumption that participating parties prefer to learn the function result correctly and, if possible, exclusively. The question now is which functionalities or data analysis tasks satisfy the NCC model.
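
For very small input domains, the question of which functionalities qualify can be explored by brute force. The sketch below uses a simplified reading of the NCC idea, which is an assumption of this example rather than the formal definition: a function fails the check if some party can report a false input that changes the result for at least one combination of the other inputs, and can still recover the true result in every case from the observed output and its own true input. The function name `is_dncc_sketch` is hypothetical.

```python
from itertools import product

def is_dncc_sketch(f, n_parties, domain):
    """Return True if no single party can lie and still always recover the true result.

    Simplified, exhaustive check for small finite domains (an assumption of this sketch).
    """
    others_space = list(product(domain, repeat=n_parties - 1))
    for i in range(n_parties):
        for true_v, lie_v in product(domain, repeat=2):
            if true_v == lie_v:
                continue
            recover = {}  # observed (false) output -> true output
            changes_result = False
            consistent = True
            for others in others_space:
                true_in = others[:i] + (true_v,) + others[i:]
                lie_in = others[:i] + (lie_v,) + others[i:]
                true_out, lie_out = f(true_in), f(lie_in)
                if true_out != lie_out:
                    changes_result = True
                # A deterministic recovery rule must map each observed output
                # to exactly one true output.
                if recover.setdefault(lie_out, true_out) != true_out:
                    consistent = False
                    break
            if changes_result and consistent:
                return False  # party i has a profitable lie
    return True

xor_ok = is_dncc_sketch(lambda v: v[0] ^ v[1], 2, (0, 1))
and_ok = is_dncc_sketch(lambda v: v[0] & v[1], 2, (0, 1))
majority_ok = is_dncc_sketch(lambda v: int(sum(v) >= 2), 3, (0, 1))
```

Under this reading, two-party XOR and AND fail the check, because a party can flip its bit and still work out the true result. Three-party majority passes.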


ADVANTAGES IN PROPOSED SYSTEM

• Each of these deals with the problem of ensuring truthfulness in data mining. However, each one requires the ability to verify the data after the calculation.

• Although verification-based techniques are very useful, there are cases where verification is not feasible because of legal, social, and privacy concerns.


MODULES

• User Interface Design

• Create Multiple Organizations

• Data Analysis and Integration

• Inputs Computation Model

• Association Data Mining


Module Description

• USER INTERFACE DESIGN:

• In this module we create a user page with a graphical user interface (GUI), which connects the user to the server. Through this web page the client sends requests to the server and the server sends responses back to the client, establishing communication between the two.

• A GUI is a program interface that takes advantage of the computer's graphics capabilities to make the program easier to use. Well-designed graphical user interfaces can free the user from learning complex command languages. On the other hand, many users find that they work more effectively with a command-driven interface, especially if they already know the command language. The goal of a GUI is to enhance the efficiency and ease of use of the underlying logical design of a stored program. The user interacts with information by manipulating visual widgets that allow for interactions appropriate to the kind of data they hold. The widgets of a well-designed interface are selected to support the actions necessary to achieve the goals of the user.


Module Description(contd..)

• CREATE MULTIPLE ORGANIZATIONS:

This is the second module of our project. Here we create a number of parties (organizations). Each party has its own database, in which it stores its information as either a horizontal or a vertical partition. All n parties send their inputs to a single Data Analysis module, which stores the inputs as horizontal or vertical partitions.


Module Description(contd..)

• DATA ANALYSIS AND INTEGRATION:

This is the third module of our project. The Data Analysis module is designed using cryptographic techniques. Data are generally assumed to be either vertically or horizontally partitioned. With horizontally partitioned data, different sites collect the same set of information about different entities. With vertically partitioned data, different sites collect different information about the same set of entities. A party can store its input data as either a vertical or a horizontal partition. If the parties choose horizontal partitioning, each party's input data covers different individuals; if they choose vertical partitioning, each party's input data covers different attributes of the same individuals.
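
The two partitioning schemes can be sketched on a toy table. The records, attribute names, and helper functions below are hypothetical and chosen only for illustration. The sketch ignores the cryptographic layer and shows only how the data are split.

```python
# Toy data set; every value here is made up for illustration.
records = [
    {"id": 1, "age": 34, "purchase": "bread"},
    {"id": 2, "age": 51, "purchase": "milk"},
    {"id": 3, "age": 27, "purchase": "bread"},
    {"id": 4, "age": 45, "purchase": "eggs"},
]

def horizontal_partition(rows, n_sites):
    """Each site holds all attributes for a disjoint subset of entities."""
    return [rows[i::n_sites] for i in range(n_sites)]

def vertical_partition(rows, key, attribute_groups):
    """Each site holds a subset of attributes, plus the shared key, for every entity."""
    return [
        [{key: r[key], **{a: r[a] for a in attrs}} for r in rows]
        for attrs in attribute_groups
    ]

def merge_vertical(sites, key):
    """Join vertical partitions back together on the shared key."""
    merged = {}
    for site in sites:
        for row in site:
            merged.setdefault(row[key], {}).update(row)
    return [merged[k] for k in sorted(merged)]

h_sites = horizontal_partition(records, 2)
v_sites = vertical_partition(records, "id", [["age"], ["purchase"]])
```

Here `h_sites` contains two sites with two full records each, while `v_sites` contains two sites that each know one attribute of all four entities.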


Module Description(contd..)

• Inputs computation model

• This is the fourth module of our project. This model is designed to compute over the truthful inputs of all participating parties. It rests on two assumptions: first, the top priority of every participating party is to learn the correct result; second, if possible, every participating party prefers to learn the correct result exclusively.


Module Description(contd..)

• ASSOCIATION DATA MINING

• This is the last module of our project. Here we summarize association rule mining and analyze whether it can be done in an incentive compatible manner over a horizontally or vertically partitioned database. When a query is received, the module determines whether the requested data are located in a horizontal or a vertical partition, retrieves the result from that partition, and sends it to the requesting party.
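
The arithmetic behind association rule mining over horizontally partitioned data can be sketched as follows. Each site counts locally, and the global support and confidence of a rule follow from the summed counts. This sketch is not privacy-preserving: in a real PPDA protocol the counts would be combined with a secure computation. The function names and transactions are hypothetical.

```python
def local_counts(transactions, antecedent, consequent):
    """Count, at one site, the transactions containing the antecedent and the full rule."""
    a = set(antecedent)
    both = a | set(consequent)
    n_a = sum(1 for t in transactions if a <= set(t))
    n_both = sum(1 for t in transactions if both <= set(t))
    return len(transactions), n_a, n_both

def global_rule_stats(sites, antecedent, consequent):
    """Combine per-site counts into the global support and confidence of the rule."""
    totals = [local_counts(s, antecedent, consequent) for s in sites]
    n = sum(t[0] for t in totals)
    n_a = sum(t[1] for t in totals)
    n_both = sum(t[2] for t in totals)
    support = n_both / n
    confidence = n_both / n_a if n_a else 0.0
    return support, confidence

# Two sites holding different transactions (horizontal partitioning).
site_a = [["bread", "milk"], ["bread"], ["milk", "eggs"]]
site_b = [["bread", "milk", "eggs"], ["bread", "milk"], ["eggs"]]

support, confidence = global_rule_stats([site_a, site_b], ["bread"], ["milk"])
```

For the rule bread => milk, three of the six transactions contain both items and four contain bread, giving a support of 0.5 and a confidence of 0.75.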


TECHNIQUE USED

ASSOCIATION RULE MINING ALGORITHM

The above definition states which functions can be computed deterministically in the NCC setting (i.e., the computed result is correct with probability one): no party can compute the correct result once it lies about its input in a way that changes the original function result. In other words, if party $i$ replaces its true input $v_i$ with $v'_i$ and $f(v'_i, v_{-i}) \neq f(v_i, v_{-i})$, then party $i$ should not be able to calculate the correct $f(v_i, v_{-i})$ from $f(v'_i, v_{-i})$ and $v_i$. A strategy $(t_i, g_i)$ specifies how the input is modified, denoted by $t_i$, and how the output is calculated, denoted by $g_i$. Here $t_i$ can be thought of as choosing a value different from the actual input, and $g_i$ as the way the correct $\mu$ and $s^2$ are computed. Another implication of the definition is that for any $t_i$, the corresponding $g_i$ should be deterministic, because each party wants to compute the correct result exactly.
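
A strategy $(t_i, g_i)$ can be made concrete with a minimal sketch for the plain mean, under the assumption that the number of inputs n is publicly known. Party i reports a false value (the input modification t_i) and then corrects the published mean using its own true value (the output calculation g_i). This illustrates the kind of manipulation the definition rules out; it makes no claim about how the source classifies mean or variance computations. The function name `lie_and_recover` is hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

def lie_and_recover(true_inputs, i, lie):
    """Party i misreports its input, then corrects the published mean privately."""
    n = len(true_inputs)
    reported = list(true_inputs)
    reported[i] = lie                    # t_i: replace the true input with a false one
    observed = mean(reported)            # the (wrong) result every party receives
    # g_i: undo the effect of the lie using only the observed mean, n, and v_i.
    recovered = observed + (true_inputs[i] - lie) / n
    return observed, recovered

inputs = [4.0, 8.0, 6.0, 2.0]
observed, recovered = lie_and_recover(inputs, i=0, lie=12.0)
```

With these inputs the published mean is 7.0, but the lying party recovers the true mean of 5.0 and so learns the correct result exclusively.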


• A two-party protocol is proposed to securely compute JC. The protocol consists of two stages


SYSTEM ARCHITECTURE

[Figure: system architecture diagram. Components shown: Parties, User login, DB, Validate, Data analysis, Vertical partition, Horizontal partition, NCC Model, TTP, Rule mining.]


System Architecture Description

• The diagram above contains Client Login, Database, Work Allocation, Worker Page, Computing, Reporting, and Work Grouping components. First, the computation node starts running. Each party node then enters a user name and password, which the compatible node validates. Next, the computation node assigns work to the data mining nodes. Each data mining node finishes its work and reports back to the compatible node. The TTP collects the parties' inputs and groups them for the particular task presented by the party nodes.


 USE CASE DIAGRAM

[Figure: use case diagram. Actors: party1, party2, party3, TTP. Use cases: input, private inputs, compute the input data, function over the joint inputs, vertical partition, horizontal partition, NCC model, Data mining.]


CLASS DIAGRAM

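[Figure: class diagram with four classes.]

Parties: attributes create, data, store; operations create(), database(), vertical(), horizontal()

Input computation model: attributes update, data, compute; operations receive(), compute(), correct(), exclusive()

NCC_Model: attributes create, input, data, store, retrieve; operations inputs(), datamining(), ttp(), vertical(), horizontal()

Data Mining: attributes store, update, input; operations horizontal(), vertical(), compute()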


SEQUENCE DIAGRAM 

[Figure: sequence diagram. Lifelines: parties, data analysis, NCC Model, Rule mining. Messages: to store data; either vertical or horizontal; sending the inputs; all the inputs are computed; different inputs of parties stored; sending requested data to NCC; response.]


ACTIVITY DIAGRAM

[Figure: activity diagram. Nodes: parties, NCC Model, Data Mining, vertical partition, horizontal partition.]


LOGIN FORM


Organization Login


ORGANIZATION’S INFORMATION:


Participating parties:


Data Sharing:


Conclusion

• Even though privacy-preserving data analysis techniques guarantee that nothing other than the final result is disclosed, whether or not participating parties provide truthful input data cannot be verified. In this paper, we have investigated what kinds of PPDA tasks are incentive compatible under the NCC model. Based on our findings, several important PPDA tasks are incentive driven. Table II classifies the common data analysis tasks studied in this paper into DNCC or Non-DNCC categories. Often, the data partitioning scheme can make a difference in determining whether a task is classified as DNCC or Non-DNCC.