www.ijatir.org
ISSN 2348–2370
Vol.08,Issue.12,
September-2016,
Pages:2308-2311
Copyright @ 2016 IJATIR. All rights reserved.
Supporting Secure File-Level and Block-Level Data De-Duplication in Cloud Computing
K. CHANDRASHEKAR¹, M. GANESH KUMAR²
¹PG Scholar, Dept of CSE, Malla Reddy Institute of Engineering & Technology, Maisammaguda, RR(Dt), Telangana, India.
²Asst Prof, Dept of CSE, Malla Reddy Institute of Engineering & Technology, Maisammaguda, RR(Dt), Telangana, India.
Abstract: Data reliability is a critical issue in a de-duplication storage system, because only one copy of each file is kept on the server and that copy is shared by all of its owners. Encryption mechanisms are used to protect data confidentiality before the data is outsourced to the cloud. To improve storage performance in the cloud, duplicate copies of data must be eliminated. De-duplication removes these redundant copies, improving storage utilization while preserving the privacy of sensitive data that users outsource to the cloud. This paper formalizes the notion of a reliable de-duplication system. We propose a new de-duplication system with higher reliability, in which data chunks are distributed across multiple cloud servers. The system combines secret sharing with file-level and block-level de-duplication and file compression; we implemented it and demonstrated it in realistic environments.
Keywords: Secret Sharing, De-duplication Storage, File-Level De-duplication System, Block-Level De-duplication System, De-duplication System with Tag Consistency.
I. INTRODUCTION
Cloud-storage service providers such as Dropbox and others perform de-duplication to save space by storing only one copy of each uploaded file. Single-instance storage techniques identify and eliminate the need to store repeated instances of identical files, and additionally reduce upload bandwidth. The fundamental idea of this paper is to eliminate duplicate copies of stored data, so that only one copy of a file is held in the cloud even if that file is used by many users. This poses a challenge for the privacy of sensitive data that users outsource to the cloud. To address it, we use the notion of a distributed reliable de-duplication system: we propose a new distributed system with higher reliability, in which data chunks are distributed across multiple cloud servers. The de-duplication system eliminates duplicate data by keeping just one physical copy and referring other redundant data to that copy. The security requirements of data confidentiality and tag consistency are achieved by establishing a deterministic secret sharing scheme in the distributed cloud storage system. Our security analysis shows that the de-duplication system is secure in terms of the proposed security definitions.
We implement the proposed system and demonstrate that the incurred overhead is very limited in realistic environments. With the rapidly increasing quantity of stored data, de-duplication techniques are widely used to back up data and to reduce network and storage overhead by eliminating redundant data. High compression and de-duplication ratios allow better usage of the storage provider's resources and lower cost for the user. Data de-duplication is the process by which a provider stores only one copy of a file, however many clients upload it. De-duplication systems have been proposed with various strategies, such as client-side or server-side de-duplication and file-level or block-level de-duplication. In terms of granularity there are two types: file-level de-duplication, which discovers redundancies between different files and removes them to reduce the capacity demand, and block-level de-duplication, which discovers and removes redundancies between data blocks using fixed- or variable-sized chunks. Data reliability is a critical issue in a de-duplication storage system, because only one copy of each file is stored on the server and shared by its owners.
Encryption mechanisms are used to shield data confidentiality before the data is outsourced to the cloud. In this paper we describe file-level and block-level de-duplication and the downloading of a file or block from the cloud. The paper also shows how to design secure de-duplication systems with higher reliability. Distributed cloud storage servers offer higher fault tolerance. To protect data confidentiality, the system uses secret sharing, which is compatible with distributed storage systems: a file is split into a number of blocks, encoded using the secret sharing technique, and the resulting shares are distributed across multiple independent storage servers.
To support de-duplication, a short cryptographic hash value of the content is used as a tag. Another distinctive contribution of the paper is data integrity, which includes tag consistency. Traditional de-duplication methods cannot be directly extended and applied to distributed, multi-server systems, and they offer no resistance to collusion attacks launched by multiple servers.
II. RELATED WORK
Data de-duplication removes redundant copies of data. Here, reliability means the validity and consistency of results: a reliable system produces consistent results. Earlier techniques focused on plaintext files without an encryption mechanism, and did not consider reliable de-duplication over ciphertext (encrypted data). In 1997, Bellare et al. explained the idea of security notions and schemes for symmetric encryption; they gave different notions of security, analyzed the reductions among them, and provided methods of encryption using a block cipher. Their work has two goals: first, to study notions of security for symmetric encryption, and second, to offer a concrete security analysis of fixed symmetric encryption schemes. Convergent encryption provides data security in de-duplication. Bellare et al. described message-locked encryption and its application to secure outsourced storage; encryption is used to achieve data privacy, and encrypted data is referred to as ciphertext. Li et al. addressed the key-management problems of block-level de-duplication by distributing keys across multiple servers. Bellare et al. also showed how to protect personal data by converting predictable messages into unpredictable ones; in their system, a third-party key server was introduced to produce the file tags used to detect duplicate copies. Stanek et al. achieved better efficiency and security of data storage by offering different levels of security for different kinds of data.
III. FRAMEWORK
The proposed system stores data reliably in the cloud while achieving confidentiality and integrity. The main goal is to enable de-duplication and the distributed storage of data across multiple servers. It supports file-level and block-level uploading and downloading of files using the following components: secret sharing, the file-level distributed de-duplication system, and the block-level distributed de-duplication system. The secret sharing scheme consists of two algorithms, Share and Recover. The Share algorithm partitions the secret S into k - r pieces of equal size, produces r random pieces of the same size, and encodes the k pieces using a non-systematic k-of-n erasure code into n shares of the same size. The secret is extracted with the Recover algorithm: given any k of the n shares, Recover outputs the original secret S. The system uses two tag-generation algorithms, TagGen and TagGen'. TagGen maps the original data copy F to a tag T(F); this tag is generated by the user and used for the duplicate check with the server.
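The Share/Recover pair described above can be sketched as follows. For illustration only, this is a simplified XOR-based n-of-n splitting, where every share is needed for recovery, not the k-of-n ramp erasure code the paper actually uses; the function names are ours.

```python
import os

def share(secret: bytes, n: int) -> list[bytes]:
    """Split `secret` into n same-size shares (simplified n-of-n XOR scheme)."""
    # n - 1 uniformly random pieces of the same size as the secret.
    shares = [os.urandom(len(secret)) for _ in range(n - 1)]
    # Final share is the secret XORed with all random pieces.
    last = secret
    for s in shares:
        last = bytes(a ^ b for a, b in zip(last, s))
    shares.append(last)
    return shares

def recover(shares: list[bytes]) -> bytes:
    """XOR all shares back together to reconstruct the original secret."""
    out = bytes(len(shares[0]))
    for s in shares:
        out = bytes(a ^ b for a, b in zip(out, s))
    return out
```

Any single share in isolation is uniformly random and reveals nothing about the secret, which is the property the distributed storage servers rely on.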
TagGen' takes as input F and an index j and outputs a tag; it is generated by the user as a proof of ownership for F. A message authentication code (MAC) is also used alongside secret sharing to authenticate the message and provide integrity and authenticity assurance. The MAC protects the integrity of the outsourced stored file: it is constructed with a keyed hash function that takes a secret key and the file as input and outputs a MAC value. Users holding the same key can regenerate the MAC, verify its correctness, and detect whether the file has been changed.
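The keyed-hash MAC described above can be illustrated with Python's standard hmac module; this is a sketch, since the paper does not fix a particular MAC construction.

```python
import hmac
import hashlib

def compute_mac(key: bytes, data: bytes) -> bytes:
    # Keyed hash over the outsourced file contents.
    return hmac.new(key, data, hashlib.sha256).digest()

def verify_mac(key: bytes, data: bytes, tag: bytes) -> bool:
    # Constant-time comparison; any modification of the file changes the MAC.
    return hmac.compare_digest(compute_mac(key, data), tag)
```

Only users who hold the same secret key can produce or verify the MAC, which is how modification of the stored file is detected.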
Fig1. System Architecture.
A. File-Level De-duplication System
The file-level distributed de-duplication system supports efficient duplicate checks. For each file, the user computes a tag and sends it to the Secure Cloud Storage Service Providers (S-CSPs). Collusion attacks are prevented because the tags are held at different storage servers that are independent of one another. Assume the file is uploaded to n S-CSPs, denoted by id1, id2, ..., idn. The system defines the security parameters of the secret sharing scheme and the tag generation algorithm TagGen. While uploading file F, the user interacts with the S-CSPs for de-duplication: the user sends the file tag QF = TagGen(F) to the S-CSPs for the duplicate check. To retrieve the file later, the user downloads the shares of F from k out of the n storage servers and reconstructs F using the Recover algorithm.
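The file-level duplicate check can be sketched with a toy in-memory server. Here TagGen is taken to be SHA-256 over the file contents (a common choice; the paper only requires a short cryptographic hash), and the class and function names are ours.

```python
import hashlib

def tag_gen(data: bytes) -> str:
    # Short cryptographic hash of the content serves as the file tag T(F).
    return hashlib.sha256(data).hexdigest()

class DedupServer:
    """Toy stand-in for a storage server keeping one physical copy per tag."""

    def __init__(self) -> None:
        self.store: dict[str, bytes] = {}   # tag -> stored file

    def upload(self, data: bytes) -> bool:
        """Return True if the file was new, False if it was de-duplicated."""
        tag = tag_gen(data)
        if tag in self.store:
            return False      # duplicate: only a reference is recorded
        self.store[tag] = data
        return True
```

However many users upload the same file, the server stores it once; later uploads are answered by the duplicate check alone.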
B. Block-Level De-duplication System:
The block-level distributed de-duplication system first performs a file-level duplicate check before uploading the file. If no duplicate of the whole file is found, the file is divided into a number of chunks and block-level de-duplication is performed; each block is uploaded in the same manner as in file-level uploading. The file is split into a set of fragments Bi, and for every block a duplicate check is performed using TagGen(Bi). To retrieve the file, the user downloads each block's shares from k out of the n storage servers, gathers all the blocks, and reconstructs the file F = (B1, ..., Bm) using the Recover algorithm.
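Block-level de-duplication with fixed-size chunks can be sketched as follows. The chunk size and function names are illustrative; the paper allows fixed- or variable-sized chunks.

```python
import hashlib

CHUNK = 4   # toy fixed chunk size, in bytes

def split_blocks(data: bytes) -> list[bytes]:
    """Split the file into fixed-size fragments B1, ..., Bm."""
    return [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]

def dedup_blocks(data: bytes, store: dict[str, bytes]) -> list[str]:
    """Store only unseen blocks; return the file's recipe of block tags."""
    recipe = []
    for block in split_blocks(data):
        tag = hashlib.sha256(block).hexdigest()   # TagGen(Bi)
        store.setdefault(tag, block)              # upload only if new
        recipe.append(tag)
    return recipe

def reassemble(recipe: list[str], store: dict[str, bytes]) -> bytes:
    # Reconstruct F = (B1, ..., Bm) by fetching each block by its tag.
    return b"".join(store[t] for t in recipe)
```

Repeated blocks within or across files are stored once; the per-file recipe of tags is all that is needed to reassemble the original data.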
C. De-duplication System with Tag Consistency:
To prevent maliciously generated ciphertext-replacement attacks, a tag consistency mechanism is used. It takes the data copy C and produces a tag T(C) that is used for the duplicate check with the server. Tag consistency guarantees that no party can derive the same tag from two different messages, and so protects against duplicate-faking attacks, in which a message is undetectably replaced by a fake one. If the tag is computed by the data owner from the plaintext file, it cannot be verified by the storage server; a malicious owner could then upload a file different from the one corresponding to the tag, and only a user who had already passed the duplicate check would be able to extract the exact file. To overcome this drawback, the tag is computed directly from the ciphertext using a hash function. This prevents the ciphertext-replacement attack, because the cloud storage server can compute and verify the tag by itself.
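The server-side verification described above can be sketched as follows: because the tag is a hash of the ciphertext itself, the server can recompute it from what the client actually sent and reject any mismatch. Function names are ours.

```python
import hashlib

def tag_of_ciphertext(ciphertext: bytes) -> str:
    # T(C): hash of the ciphertext, recomputable by anyone holding C.
    return hashlib.sha256(ciphertext).hexdigest()

def server_accept(ciphertext: bytes, claimed_tag: str) -> bool:
    """The storage server verifies the tag itself, so a maliciously
    substituted ciphertext that does not match the tag is rejected."""
    return tag_of_ciphertext(ciphertext) == claimed_tag
```

A plaintext-derived tag could not be checked this way, since the server never sees the plaintext; deriving T from C is what makes the check possible.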
IV. EXPERIMENTAL RESULTS
In our experiments, any number of users can register and log in to the system. Authorized users can upload files into the cloud, where the uploaded files are stored in chunk format. Each upload is checked for duplicates in two ways, file-level and block-level; if a duplicate is found at either level, the file is not uploaded again, and a tag-consistency notification for that file is returned to the user. The file can still be downloaded by the data owner as well as by other data users. In the chart below we can observe the difference between the total execution time and the RSSS execution time.
Fig2.
We can observe that the total execution time is higher than the RSSS execution time; the difference illustrates the advantage of file compression. Through our implementation we can store a large file in chunk format, detect duplicate files, and increase the effective storage space of the cloud with file compression using the Ramp Secret Sharing mechanism.
V. CONCLUSION
We proposed an enhanced storage system with improved data reliability that achieves confidentiality, integrity, and de-duplication of users' data. It supports file-level and block-level data de-duplication with file compression. Integrity is achieved through the tag consistency notion, which also reduces cloud storage space and upload bandwidth. The system was implemented using the Ramp Secret Sharing scheme and demonstrated for file-level and block-level uploading and downloading.
VI. REFERENCES
[1] M. Bellare, S. Keelveedhi, and T. Ristenpart, "DupLESS: Server-aided encryption for deduplicated storage," in USENIX Security Symposium, 2013.
[2] M. Bellare, S. Keelveedhi, and T. Ristenpart, "Message-locked encryption and secure deduplication," in EUROCRYPT, 2013, pp. 296-312.
[3] G. R. Blakley and C. Meadows, "Security of ramp schemes," in Advances in Cryptology: Proceedings of CRYPTO '84, ser. Lecture Notes in Computer Science, G. R. Blakley and D. Chaum, Eds. Springer-Verlag Berlin/Heidelberg, 1985, vol. 196, pp. 242-268.
[4] A. D. Santis and B. Masucci, "Multiple ramp schemes," IEEE Transactions on Information Theory, vol. 45, no. 5, pp. 1720-1728, Jul. 1999.
[5] M. O. Rabin, "Efficient dispersal of information for security, load balancing, and fault tolerance," Journal of the ACM, vol. 36, no. 2, pp. 335-348, Apr. 1989.
[6] P. Golle, S. Jarecki, and I. Mironov, "Cryptographic primitives enforcing communication and storage complexity," in Financial Cryptography '02, vol. 2357 of LNCS, pp. 120-135. Springer, 2003.
[7] A. Juels and B. S. Kaliski, Jr., "PORs: Proofs of retrievability for large files," in ACM CCS '07, pp. 584-597. ACM, 2007.
[8] H. Shacham and B. Waters, "Compact proofs of retrievability," in ASIACRYPT '08, pp. 90-107. Springer-Verlag, 2008.
[9] A. D. Santis and B. Masucci, "Multiple ramp schemes," IEEE Trans. Inf. Theory, vol. 45, no. 5, pp. 1720-1728, Jul. 1999.
[10] G. R. Blakley and C. Meadows, "Security of ramp schemes," in Proc. Adv. CRYPTO, vol. 196, Lecture Notes in Computer Science, G. R. Blakley and D. Chaum, Eds., 1985, pp. 242-268.
[11] M. O. Rabin, "Efficient dispersal of information for security, load balancing, fault tolerance," J. ACM, vol. 36, no. 2, pp. 335-348, Apr. 1989.
[12] G. R. Blakley and C. Meadows, "Security of ramp schemes," in Proc. Adv. Cryptol., 1985, vol. 196, pp. 242-268.
[13] J. Li, X. Chen, M. Li, J. Li, P. Lee, and W. Lou, "Secure deduplication with efficient and reliable convergent key management," IEEE Trans. Parallel Distrib. Syst., vol. 25, no. 6, pp. 1615-1625, Jun. 2014.
[14] J. Stanek, A. Sorniotti, E. Androulaki, and L. Kencl, "A secure data deduplication scheme for cloud storage," Tech. Rep., 2013.
[15] Gore Swapnali, Gore Supriya, Tengale Kanchan, Tengale Varsha, "Modern Secure Distributed De-duplication System With Improved Reliability," IJRCS, vol. 5, Oct. 2015, ISSN: 2277-128X.
[16] W. K. Ng, Y. Wen, and H. Zhu, "Private data deduplication protocols in cloud storage," in Proc. 27th Annu. ACM Symp. Appl. Comput., 2012, pp. 441-446.