Privacy Concealment and Data Security over Outsourced Cloud Environments

Sunit Ranjan Poddar1, Sanika Sarang Kamtekar2, Ameya Ravindra Sawant3, Diksha PawanKumar Jangid4 and Prof. Mandar Mokashi5

Dept. of Computer Engineering, Dr. D. Y. Patil School of Engineering, Lohegaon, Pune, India1,2,3,4,5 [email protected], [email protected], [email protected],

[email protected], [email protected]

Abstract — In the era of wireless internet, the use of mobile devices and cell phones has grown enormously. In this new digital world, the availability of data over the internet plays an important role, and dynamic storage systems are preferred for storing huge amounts of diverse data. Cloud servers are capable of processing and storing any amount of data required; furthermore, they are flexible and scalable to a great extent. The information on these servers is of huge importance to the data owners and thus requires protective strategies. We therefore propose a system consisting of a group of techniques that together provide secure data storage and data integrity. We intend to use the RSA algorithm for encrypting and decrypting data and for digital signatures, and SHA-1 for generating hash keys. These hash keys support our redundancy checks as well as data integrity. Our system also supports de-duplication of user data, which promises efficient bandwidth utilization without harming the privacy of the user. Load balancing is another feature we include for better performance and storage utilization. We strongly believe that existing systems do not provide the combination of features included in our proposed framework, and that our system offers improved time and space complexity.

Keywords: Data Integrity, De-duplication, Load Balancing, Hash Keys, RSA, Message Digest Algorithm, Secure Hash Algorithm, Third Party Auditor, Public Auditing

1. Introduction

Cloud storage is an on-demand data outsourcing service model. It has established itself quite firmly in the field of distributed systems and is attracting huge interest due to its flexibility and low maintenance cost. Cloud storage providers are third parties responsible for providing the storage facility to users. But the data outsourced to cloud servers is susceptible to insider and outsider attacks, as well as prone to corruption.

The benefit of cloud storage is that the user can store data for a very long time without having to be concerned about its maintenance. In some cases, the user stores data on the cloud but does not access it for a long time, or reads it very seldom. Integrity and security of such data is very important, as there is always a probability of server crashes and other disasters. If the data on the outsourced server somehow gets corrupted, the system should be able to detect the corruption, repair the corrupted data, and restore the original data in the cloud. If we store our data on a single server, the system becomes vulnerable to a single point of failure; if we instead share the storage load across the cloud, the data remains available even when one of the servers fails. It is also possible that the data gets tampered with while being processed, or while it is stored at some location. At that point we are probably unable to retrieve the untampered original data, but we can at least detect whether the data has been tampered with. If the file is tampered with in any way, the respective client gets notified and can thus safeguard the file.

Security over cloud environments is an important factor in our system, and to provide such shielding we use cryptographic techniques. One such technique is the Message Digest algorithm, henceforth MD5. It is a widely used cryptographic hash function that generates 128-bit (16-byte) hash values, generally expressed in text as a 32-digit hexadecimal number. MD5 is also used for verification of data integrity (a short code sketch of both hash functions follows the SHA-1 discussion below).

Another technique used for safety measures is the Secure Hash algorithm, hereafter SHA-1, which we use for generating digital signatures as well as hash keys. This hash key generation enables us to maintain data integrity. Hash keys map data of arbitrary size to data of a fixed size; hash functions used for cryptographic purposes are called cryptographic hash functions. Such a function allows verifying whether some arbitrary input maps to a certain hash value or not. This requires collision resistance, meaning it should be difficult to find two different inputs that produce the same hash value. SHA-1 generates 160-bit values and is one of the most widely used cryptographic hash functions.
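To make the hash-key idea concrete, the following is a minimal sketch using Python's standard hashlib module; the function name and the integrity-check flow are our illustration, not code from the paper's implementation.

```python
import hashlib

def hash_keys(data: bytes) -> dict:
    """Generate the MD5 (128-bit) and SHA-1 (160-bit) hash keys of a blob.

    The hex digests are 32 and 40 hexadecimal characters long, respectively.
    """
    return {
        "md5": hashlib.md5(data).hexdigest(),    # 32 hex digits
        "sha1": hashlib.sha1(data).hexdigest(),  # 40 hex digits
    }

# Integrity check: the keys stored at upload time must match the keys
# recomputed later; any tampering with the content changes both digests.
at_upload = hash_keys(b"file contents")
at_audit = hash_keys(b"file contents")
assert at_upload == at_audit
```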

Even though cloud servers have a considerably large amount of space, saving the same data again and again in different places is wasteful. This data redundancy affects the performance of the system. Using de-duplication techniques we can overcome this problem: if a client uploads a file whose content has previously been uploaded, the upload is linked to the location of the previous file and the data is not actually uploaded anywhere on the cloud again.
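As a concrete illustration of this linking idea, here is a minimal sketch under our own assumptions: an in-memory index maps each content hash to the single stored copy, and a repeated upload only records a link. The data structures and names are hypothetical, not the paper's actual server code.

```python
import hashlib

storage = {}  # content hash -> file bytes, stored exactly once
catalog = {}  # (owner, filename) -> content hash, i.e. a link, not a copy

def upload(owner: str, filename: str, data: bytes) -> bool:
    """Store the file only if its content is new; otherwise just link it.

    Returns True if bytes were actually stored, False if de-duplicated.
    """
    key = hashlib.sha1(data).hexdigest()
    catalog[(owner, filename)] = key  # the owner still "has" the file
    if key in storage:
        return False                  # duplicate content: nothing uploaded
    storage[key] = data
    return True

assert upload("alice", "report.txt", b"same content") is True
assert upload("bob", "copy.txt", b"same content") is False  # linked only
```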

To improve the performance of the system, it is important to divide the data over the network and process it where it is stored, which is called load balancing. We divide the file into chunks of data, and these chunks are saved on cloud servers at different locations; this simple divide-and-conquer strategy helps us perform well on the data provided. A further advantage of load balancing is protection against data deterioration: since the file is divided into chunks and spread over the network at different locations, tampering with one or more chunks does not alter the whole data at once.
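A minimal sketch of the chunking step, assuming fixed-size chunks distributed round-robin over the available servers; the chunk size and placement policy are our illustrative choices, not specified by the paper.

```python
CHUNK_SIZE = 4  # bytes per chunk; unrealistically small, for illustration only

def split_and_place(data: bytes, servers: list) -> list:
    """Split data into fixed-size chunks and place them round-robin.

    Returns an ordered placement map [(server_index, chunk), ...], so that
    a tampered chunk can be located and replaced individually.
    """
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    return [(i % len(servers), chunk) for i, chunk in enumerate(chunks)]

def reassemble(placement: list) -> bytes:
    """Merge the chunks back into the original file on retrieval."""
    return b"".join(chunk for _, chunk in placement)

placement = split_and_place(b"outsourced file data", ["s1", "s2", "s3"])
assert reassemble(placement) == b"outsourced file data"
```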

For communication among the server, the client and the data owner, we need security measures for secure transfer and retrieval of messages. Our resolve is to use a system in which the encryption key is kept public, while the decryption key, which differs from it, is kept private. One technique that meets this demand is RSA, which is used to encrypt and decrypt messages. RSA is an asymmetric cryptography algorithm, meaning that the encryption and decryption keys are different; it is also known as public-key cryptography.
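A minimal sketch of RSA encryption and decryption, using the third-party Python cryptography package; the 2048-bit key size and OAEP padding are our assumptions, since the paper does not specify these parameters.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# The decryption (private) key stays secret with its owner; the matching
# encryption (public) key can be handed out freely.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

message = b"message between client, server and data owner"
ciphertext = public_key.encrypt(message, oaep)     # anyone can encrypt
plaintext = private_key.decrypt(ciphertext, oaep)  # only the owner decrypts
assert plaintext == message
```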

Our system includes a Third Party Auditor (TPA), an entity that keeps track of the integrity of the data and its redundancy, and notifies the data owner if the individual's data has been tampered with by any outsider or insider. In such cases the TPA is responsible for notifying the data owner about the tampering; the TPA will also abandon an ongoing transaction if it identifies that either of the parties is not truthful to the other.

Considering all the points discussed in this section, we believe our framework promises enhanced performance, as the de-duplication technique prevents data redundancy from occurring, saving valuable time, cost and space.

2. Literature Survey

Jian Liu et al. [1] propose a public auditing scheme for regenerating-code-based cloud storage. In this framework, the encryption of data is carried out in the early stages rather than being applied later in the auditing techniques. A semi-trusted proxy is introduced to handle the repair and authentication of data blocks, keeping in mind that the user cannot always be online just to observe that the data is not tampered with. BLS signatures are used for authentication, along with encoding procedures. This framework promises to be flexible and applicable to regenerating-code-based cloud environments.

Henry C. Chen et al. [2] deal with data integrity protection (DIP) in a multi-server environment. They propose a DIP scheme for FMSR codes (FMSR-DIP), which makes the system fault tolerant in nature. The client is able to verify any subset of his data for integrity and consistency against malicious attacks.

Existing systems provide many privacy concealment techniques. One such system, proposed by Raluca Ada Popa et al. [3], presents the first order-preserving encoding scheme that achieves ideal security. Mutable ciphertexts are the main technique used: the sorted order of ciphertexts matches the sorted order of the corresponding plaintexts. This allows databases and other applications to efficiently process queries that involve order over encrypted data. The system achieves ideal IND-OCPA security, in which an adversary learns nothing about the values beyond the order of the elements.

Stavros Papadopoulos et al. [4] propose a system that provides strong location privacy with the help of private information retrieval (PIR) functionality. The paper has a benchmark solution based upon an existing PIR-based technique, BNC, for preserving location privacy, and introduces the AHG (Aggregate Hilbert Grid) technique to tackle the drawbacks of that former system. AHG overcomes the drawbacks of BNC by eliminating empty database blocks (dummy entries). Results show that the retrieval of data blocks by PIR is done in a secure way, which prevents the LBS from identifying the particular block. Furthermore, block access obfuscation is provided, as the blocks follow a common query plan. This system outperforms its competitor BNC, has enhanced response time, and can be applied to systems in need of strict location privacy.

Similarly, Haibo Hu et al. [5] propose a system comprising a secure traversal framework and an encryption technique based on privacy homomorphism. The authors study the problem of processing private queries on indexed data for the protection of mutual privacy in a cloud environment, and show the system to be feasible to implement, efficient, and robust under various parameter settings. As future work, they plan to extend the approach to other query types, including top-k queries, skyline queries, and multi-way joins.

Yongjun Ren et al. [6] suggest a framework that promises data integrity for multimedia files uploaded to cloud servers. They propose a data storage auditing (DSA) scheme to test that multimedia data is stored on the cloud in a proper way. Digital watermarking is used to secure the multimedia auditing service, and the system also enables copyright protection of image files through self-embedding technology. The system promises to reduce resource consumption and has applications in related fields.

Giuseppe Ateniese et al. [7] propose a system that verifies whether a file is secure on an untrusted cloud server. It is essentially a provable data possession (PDP) model. The framework uses homomorphic verifiable tags, which make it possible to verify data possession without having access to the data itself. The system uses a sampling scheme at the server so that the sampled data can be verified efficiently, and checking is carried out remotely. The model permits the server to be probed on small portions of a file, which helps in deciding whether the stored data has retained its authenticity and consistency.

3. Proposed System

In the existing systems, there is no provision for any system or individual monitoring the transactions between the parties, which can lead to scams, leakage of data and even corruption. Also, no measures are taken for balancing the load on the servers, which often leads to resource exhaustion and sometimes to server crashes. To mitigate server crashes, the existing systems store the same data elsewhere, i.e. they rely on data redundancy; this redundant data is usually of no use, as a server crash is a rare occurrence.

The proposed system consists of techniques and measures to overcome the existing systems' problems. The framework includes the following entities:

• TPA
• Outsourced Cloud Server
• Data Owner
• Database
• Client Parties

We introduce a public auditing technique that acts as a monitoring entity in our proposed system. The TPA is responsible for maintaining data integrity and for monitoring the transactions between the client parties and the databases. Files and data in cloud space often get tampered with; the TPA therefore also periodically tracks data consistency on the cloud space and, if tampering is detected, triggers a notification alarm to the data owner. This lets the data owner take precautions: the tampered file can be deleted from the server and replaced with the original one.

The use of cloud servers has increased exponentially, and they are gaining more and more significance when it comes to data storage and online data availability. In our system, the cloud space is provided to the users, i.e. the data owners, with the benefit of data encryption. The data is divided into chunks and spread over the network of servers to provide load balancing, which improves system performance and data consistency: the part of the file that gets tampered with can be identified and replaced individually instead of the whole file being inspected, saving a significant amount of time and cost. During retrieval, these chunks are merged together and sent as a query result to the respective data owner.

The authentication of digital signatures and block recognition is carried out by the database with the help of hash keys, which are generated by the web servers. The algorithms that generate the hash keys are cryptographic algorithms that, in combination with encryption techniques, help us provide better security and data integrity over the cloud space.
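To illustrate the signature flow, here is a minimal sketch, again with the Python cryptography package: the data owner signs a block with the RSA private key and the verifier checks it against the public key. The choice of PKCS#1 v1.5 padding is our assumption, and SHA-1 is used only because the paper names it; it is considered weak by today's standards.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

owner_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
block = b"one chunk of the outsourced file"

# The data owner signs the block; the signature travels with the block.
signature = owner_key.sign(block, padding.PKCS1v15(), hashes.SHA1())

# The database verifies the block against the owner's public key;
# verify() raises InvalidSignature if block or signature was altered.
try:
    owner_key.public_key().verify(signature, block,
                                  padding.PKCS1v15(), hashes.SHA1())
    print("block authentic")
except InvalidSignature:
    print("block tampered")
```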

Our proposed framework also applies de-duplication of files. It often happens that two or more different data owners store files with the same content on the cloud, consuming unnecessary space. Using de-duplication, storing the same file twice is avoided on the server side by linking the new file to the old one rather than actually storing it again.

The web service plays a vital role in this system. It enables us to map the data flow from the data owner to the server, authenticate it for further processing, implement the business logic, and check for de-duplication of the file. The web service is the central agent that routes the data flow and other controls from one place to another, as shown in Fig. System Architecture.

Fig. System Architecture (components: Data Owner; Web Service with authentication and block recognition, business logic and de-duplication; Application Server with load balancing, auditing and security message flow; Servers 1–3; Database)

The database introduced in the system stores the data needed for secure transmission and redundancy checks, and the identifiers the TPAs use to check whether the data has been tampered with. The TPAs periodically compare the identifier stored in the database with the data stored on the cloud space; if a TPA identifies any mismatch, it notifies the data owner.
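A minimal sketch of this periodic audit pass, under our own assumptions about the interfaces: the database holds one SHA-1 identifier per file, the cloud copy is fetched and re-hashed, and any mismatch triggers a notification to the owner. All names here are hypothetical.

```python
import hashlib

def tpa_audit(db_identifiers, cloud_fetch, notify_owner):
    """One periodic TPA pass over all files.

    db_identifiers: {(owner, file_id): sha1_hex stored at upload time}
    cloud_fetch:    callable returning the current cloud bytes of a file
    notify_owner:   callable invoked on any mismatch (tamper alarm)
    """
    for (owner, file_id), stored in db_identifiers.items():
        current = hashlib.sha1(cloud_fetch(owner, file_id)).hexdigest()
        if current != stored:
            notify_owner(owner, file_id)  # owner can now replace the file

# Illustrative run with an in-memory "cloud" holding one tampered file:
cloud = {("alice", "f1"): b"TAMPERED CONTENT"}
db = {("alice", "f1"): hashlib.sha1(b"original content").hexdigest()}
tpa_audit(db, lambda o, f: cloud[(o, f)],
          lambda o, f: print(f"ALERT: {o}/{f} appears tampered"))
```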

Comparing our system with the existing systems, we see that it provides various security measures over the cloud server with better performance in terms of time and space complexity.

Our system can be applied in various fields, from individual data owners to big MNCs. Data owners can use it to store their data in a secure place from which it can easily be retrieved. It can also be used when two or more MNCs collaborate: during collaboration, these MNCs have to keep their data at some location from which both parties can use it, and process it if they have permission to. The TPA plays an important role in transactions between such companies; being a machine and not a human, the TPA can take precautionary measures if it identifies that either party is attempting to scam or trick the other.

Problem Statement: To develop a web system that, as a third party, performs data integrity checking and failure repair on data uploaded to a cloud without disturbing the privacy of the user, with de-duplication giving efficient bandwidth utilization.

4. Conclusion

In our proposed framework, we have aimed to build a system that can store data efficiently, without redundancies. We have also taken care of the security factors that may leave data or files vulnerable to threats, scams or tampering, and have included a third party auditor, a public auditing entity providing security over the cloud space against transaction threats; this TPA notifies the respective data owner in case of tampered files. For better performance, files are divided into chunks that are merged back together into the whole file on retrieval. We conclude that our system improves on the existing systems in terms of both functionality and performance.

5. Acknowledgement

We would like to thank Dr. D. Y. Patil School of Engineering for providing us with all the required amenities. We are also grateful to Prof. S. S. Das, Head of the Computer Engineering Department, DYPSOE, Lohegaon, Pune, for the indispensable support, suggestions and motivation during the entire course of the project.

6. References

[1] Jian Liu, Kun Huang, Hong Rong, Huimei Wang, and Ming Xian, "Privacy-Preserving Public Auditing for Regenerating-Code-Based Cloud Storage", IEEE Transactions on Information Forensics and Security, vol. 10, no. 7, July 2015.

[2] Henry C.H. Chen and Patrick P.C. Lee, "Enabling Data Integrity Protection in Regenerating-Coding-Based Cloud Storage: Theory and Implementation", IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 2, February 2014.

[3] Raluca Ada Popa, Frank H. Li, and Nickolai Zeldovich, "An Ideal-Security Protocol for Order-Preserving Encoding", IEEE Symposium on Security and Privacy (S&P), 2013.

[4] Stavros Papadopoulos, Spiridon Bakiras and Dimitris Papadias, "Nearest Neighbor Search with Strong Location Privacy".

[5] Haibo Hu, Jianliang Xu, Chushi Ren and Byron Choi, "Processing Private Queries over Untrusted Data Cloud through Privacy Homomorphism", ICDE '14.

[6] Yongjun Ren, Jian Shen, Jin Wang, Jiang Xu and Liming Fang, "Security Data Auditing based on Multifunction Digital Watermark for Multimedia File in Cloud Storage", IJMUE, vol. 9, no. 9, 2014, pp. 231-240. http://dx.doi.org/10.14257/ijmue.2014.9.9.24



[7] Giuseppe Ateniese, Randal Burns, Reza Curtmola, Joseph Herring, Lea Kissner, Zachary Peterson and Dawn Song, "Provable Data Possession at Untrusted Stores", 14th ACM Conference on Computer and Communications Security, 2007.

[8] Anuradha R. and Y. Vijayalatha, "A Distributed Storage Integrity Auditing for Secure Cloud", International Journal of Advanced Research in Computer Science and Software Engineering, vol. 3, issue 8, August 2013, ISSN: 2277-128X.

[9] Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy Katz, Andy Konwinski, Gunho Lee, David Patterson, Ariel Rabkin, Ion Stoica, and Matei Zaharia, "Above the Clouds: A Berkeley View of Cloud Computing", http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.html, February 10, 2009.

[10] Pankaj Arora, Rubal Chaudhary Wadhawan and Er. Satinder Pal Ahuja, "Cloud Computing Security Issues in Infrastructure as a Service", International Journal of Advanced Research in Computer Science and Software Engineering, vol. 2, issue 1, January 2012, ISSN: 2277-128X.
