Data Security in the Cloud (2)

Glasgow Caledonian University (GCU)

School of Engineering and Built Environment

Research and Project Methods 1

MMG412775

Session 2013-2014

Coursework Assignment

Securing Data Using Cloud Computing

Lecturer: Dr. Ali Shahrabi

Student Name: Rakendu Indus Rathy

Student ID: S1346027

Programme: MSc Advanced Computer Networking

Contents

1. Introduction

1.1. Background

1.2. Problem Description

2. Literature Review

3. Research Question, Objectives and Technical Route

3.1. Research Question

3.2. Objectives

3.3. Knowledge / Skills

3.4. Projects Expected Outcomes

4. Technical Route

5. Scheduling and Risks

5.1. Gantt Chart

5.2. Risks in the Research

Indicative References

Table of Figures:

Figure 1 Simulation of Multi Distributed System of Cloud Storage

Figure 2 First Threat Model

Figure 3 Second Threat Model

Figure 4 Distributed Storage of Data in the Cloud

1. Introduction

This section provides the background to the research and a description of the problem.

1.1. Background

Cloud computing has been defined by Gellman (2009) as a model for enabling anytime,

anywhere network access to shared, configurable computing systems including networks,

servers, data warehouse devices and services. Cloud computing is a new, revolutionary

technology that brings with it numerous benefits. According to Cavoukian (2008) these

include rapid elasticity, which is the flexible provisioning and release of unlimited computing

facilities; service on demand, which enables users to access a variety of services on demand

without the intervention of a service provider; unlimited network access; location

independence and measured service. Cloud computing offers major cost savings in

infrastructure as well as access to a host of functionalities and services which would not have

been otherwise available to organizations and individuals.

Despite its manifold benefits, however, the very nature of cloud computing raises questions about

the safety and the integrity of user data stored in the cloud. Cloud computing services are

offered through the three basic models of Software as a Service (SaaS), Platform as a

Service (PaaS) and Infrastructure as a Service (IaaS). Jensen et al. (2009) pointed out that

every model differs from the others in terms of facilities offered and security requirements. In

all cases these requirements are defined in terms of responsibility shared between the owners

of the data and the service providers of the cloud services. The SaaS model provides the least

extensibility to users but the highest security which is provided by the service provider

(Jensen et al., 2009). PaaS provides more extensibility to users than SaaS but it is the users

who are conjointly responsible for maintaining the safety of the data. IaaS systems offer the

greatest extensibility to users but the least security as the onus is on the consumer to secure

the data (Oliveira et al., 2010). Thus it may be observed that the more users utilize the

cloud and its services, the more responsible they become for the security of data stored on the cloud. This

can be problematic as the capacity of users to secure themselves from threats can be limited.

Much research has been done on the issue of the various types of attacks data in the cloud

might be subject to and how to guard against these attacks. This includes such recent research

as Arrington (2012), Browne (2012) and Dijk and Juels (2010). All data security in the cloud

boils down to confidentiality, integrity, availability and traceability of the data (Arrington,

2012). Gruschka and Jensen (2010) said that these in turn may be reduced to securing the

data while it is at rest, in transit between systems and in use by the

customer, as well as controlling access to the database itself. So that data is not stolen or

corrupted, safety mechanisms have to be put in place that protect data as it is transferred

from one database to another. Singh et al. (2011) suggested that in order to ensure the

confidentiality of the data, the data stored in the cloud has to be always encrypted. To ensure

that the integrity of the database is maintained and the data is not tampered with, access to the data

stored in the cloud needs to be monitored and controlled at all times. Various methods have

been developed to achieve these security requirements. These include key management,

access control techniques, encryption methodologies, remote checking of data integrity

and proof of ownership (Shin and Kobara, 2010). Numerous standard security protocols have

been developed to protect data in the cloud. These include HTTPS, key verifications and

SSH.

1.2. Problem Description

Despite these security measures, however, the author believes that cloud security systems

ignore the most vulnerable point in the entire cloud computing chain which is the service

provider. This is particularly manifest in the fact that standards for the protection of data

stored in service provider data warehouses are yet to be formulated. However, the problem is

serious because when clients contact service providers of cloud computing systems for

various services, they actually lose or surrender their physical control of the data warehouses

where the data is stored. These are in turn almost exclusively controlled by the service

provider. This leaves open the data warehouse to tampering and possible theft of data by the

service providers themselves or by their employees. Unauthorized access or manipulation of

confidential data can have disastrous consequences for users. Even if the users discontinue

use of the cloud and erase stored data, there is no assurance that the data is in fact deleted.

Users believe that the presence of a service level agreement (SLA) is enough to pre-empt

any unauthorized access, theft or manipulation of their data. Nevertheless, this cannot be

prevented in cases where there is premeditated or conscious effort to access the data on the

part of the service provider. Despite this almost self-evident threat, however, very little

research has been conducted as to how users might protect their data from the service

provider. It is this gap that this research will address.

2. Literature Review

The security problems surrounding cloud computing systems stem from their service models

and the ways in which these models are implemented. The infrastructure as a service

model allows users to rent computing systems including hardware and internet facilities in

order to deploy their applications. Here the service provider owns the equipment and bears

the responsibility of running the underlying systems (Itani et al., 2009). The platform as a

service model provides the users with more functionalities such as application development

and messaging. Here again, users do not have to worry about the underlying infrastructure

which is managed by the service provider (Moreno-Vozmediano et al., 2011). In the case of

the software as a service model, users utilize the applications that run on the service provider’s

cloud infrastructure. In this model as well, the user has no control over the infrastructure that

includes network connections, data warehouses and servers.

In all these cases, it may be noted that the provisioning of services is dependent on the service

provider. In addition, cloud computing comprises four main models of deployment.

Gellman (2009) said that these include private clouds where infrastructure is provided for

exclusive use by one organization only, public clouds where cloud systems are available for

use by the general public, community clouds where cloud networks are provisioned for use

by a dedicated community of users and hybrid clouds which are a mixture of the

aforementioned cloud configurations. In all these cases again, it is the service provider who is

responsible for the proper functioning of the underlying infrastructure, provisioning of

services and storage of data. Invariably, the rights and responsibilities of service providers

vis-à-vis their clients are captured in Service Level Agreements (SLAs). These documents

spell out what service levels customers can expect from the cloud providers, and what levels

of security the service providers are expected to provide. In this scenario, the service provider

is most critical to ensure the safety and security of data stored in the cloud and for the

integrity of computations provided on the cloud.

This dependence on the service provider for securing cloud systems has been ignored in the

literature which instead focuses on how to protect data in the cloud from various attacks such

as hijacking, insecure interfaces, denial of service, malware attacks etc. This is because of the

frequency and virulence of these attacks in recent times. In June 2005, MasterCard reported

that almost forty million of its customers risked loss of their credit card data due to data

leakages from the computer storage systems of a credit card processing firm (Du et al., 2011).

In 2009, Heartland, which was a credit card payment processing firm for a quarter of a

million businesses, reported that malware infections in its data storage computers put millions

of transactions at risk (Arrington, 2012). A similar complaint was made by Hannaford

Brothers in 2008 when 1800 of its credit card holders were subject to a phishing attack due to

compromised servers on those cloud systems where the firm had stored their data (Arrington,

2012).

All of this highlights one of the biggest drawbacks of cloud computing, which is that users

lose control of the physical storage devices where data is stored on the cloud, with responsibility

and control passing exclusively to the service provider. The onus of

securing the integrity and the privacy of data stored in the cloud is particularly important

given that service providers are market agents separate from the client firms. However, data

stored in the cloud is particularly vulnerable to tampering and misuse by the service provider,

notwithstanding the provisioning of the SLA. This is also known as attack by malicious

insiders (Browne, 2012). Typically such an attack happens when an insider, such as a service

provider, gains access for malicious purposes to a cloud system where data is stored. Here

privacy and security of access to data gets compromised. Gellman (2009) points out that

service providers themselves or even their employees can use their access to systems to read

or even manipulate stored data. Jensen et al. (2009) indicated that when the geographic distance

between clients and their service providers is large, this problem of securing the cloud system

from tampering by the service provider is exacerbated. In addition, in eventualities such as

bankruptcy of the service provider, buyout of the service provider by another company, or

migration by the client from one service provider to another, there is no guarantee of the

safety of stored data (Dijk and Juels, 2010). There is no assurance that data stored in service

provider data warehouses is in fact completely erased or that there have been no data leakages

during migration of data storage from one service provider to another (Singh et al., 2011).

Even though service providers are equipped with safety rules and regulations as well as

strong infrastructure that can provide for customers’ data privacy and greater availability,

several privacy breaches have been reported in recent years. In 2011, a suit was filed

against Dropbox Inc., a provider of cloud backup services based in the

United States (Arrington, 2012). The complainant indicated that despite assurances provided by

the service provider, the data files of his firm had been tampered with and that best practices

for ensuring safety on the cloud were not maintained. In a survey conducted in the US in 2011,

43% of the firms interviewed reported security lapses in the cloud services they had used.

40% of the respondents indicated that their IT security requirements were not being met by

their service providers (Browne, 2012). Research conducted by found that obtaining data

from third-party service providers was far easier than obtaining data from the

clients themselves.

Gruschka and Jensen (2010) showed that malicious insiders are particularly harmful because of their

ability to bypass all possible detection and prevention systems installed in the cloud. This

includes prevention of physical access, internal audits, log charts and use of cryptograms.

Gruschka and Jensen’s (2010) research indicated that malicious insiders work through

compromising passwords, breaking cryptographic keys and accessing files which store

passwords and then using these passwords to access files.

It is not that the scenario of possible abuse by service providers is not recognized. Research

conducted by Itani et al., (2009) is an indication of this. These researchers developed more

advanced versions of conventional cryptographic functions that were otherwise applied in

centralized data storage systems to maintain privacy of data. The cryptographic approaches

developed by these authors were exclusively for hiding customer data from their service

providers.

Various methods have been used by other researchers as well. This includes the masquerade

trap-based detection system developed by Oliveira et al. (2001), the profiling strategy

implemented by Singh et al. (2011) and the fog computing approach developed by Browne

(2009). The masquerade system developed by Oliveira et al. (2001) used trap-based systems

to detect intrusions conducted by malicious insiders. However the disadvantage with this

method is possible losses of data and data leakage. The user profiling method suffers from the

disadvantage that it is cumbersome and laborious to detect any intrusions. Fog computing is a

niche and complicated technique with limited ability to pre-empt attacks on cloud data. In

research conducted by Shin and Kobara (2010), the customer’s identity was detached from

data stored and available only to the user and not to the service provider. Nevertheless all of

these studies focused on a single service provider only, which threatens to become a

bottleneck for cloud services. Research conducted by Gruschka and Jensen (2010) indicated

that cryptographic measures, which are the most common method in use today for securing

data in the cloud, are in fact insufficient for protecting data. They argue for hybrid models that

combine privacy, distribution of computing facilities and building of trust ecosystems to

properly secure data stored in the cloud from tampering by service providers. Du et al.

(2010) indicated that one of the biggest challenges in cloud security today is ensuring that the

service provider does not retain the user data even after the end users migrate to other service

providers. Such data then becomes susceptible to misuse and tampering and even decryption

providing meaningful information to the service providers who can potentially misuse it.

These are called passive attacks where customers who have migrated to other service

providers are clueless about attacks carried out by their previous service providers.

Complicated cryptograms that secure data from attack are expensive and can be unaffordable

to the majority of clients.

It may be inferred here that a more distributed form of provisioning and utilization of cloud

services might be more effective than the traditional, single-service-provider systems being

used today. In addition, such a system must be affordable to the large mass of users of cloud

computing services as well.

3. Research Question, Objectives and Technical Route

3.1. Research Question

How can data be distributed across multiple clouds and networks to secure it from being misused

by the service provider?

3.2. Objectives

The main objectives of the dissertation are given below:

To study the threat that single service providers pose to data security in the cloud.

To implement a distributed multi-cloud storage system that will provide customers with better security of their data stored in the cloud.

To implement a distributed multi-cloud storage system that is cost effective and provides the best quality of service.

3.3. Knowledge / Skills

This project will follow a simulation approach using C Sharp DotNet. Hence a knowledge of C

Sharp DotNet will be necessary. The author will need to learn how to program in C Sharp

DotNet, generate scenarios, subject them to simulated attacks and then analyse the results.

3.4. Projects Expected Outcomes

The main purpose of the project will be to expose the vulnerability of a cloud computing

system in which the client depends on only one cloud. The greater resilience of a multi-distributed

cloud system to malicious attacks by the service provider, or even to system failure, will

be demonstrated.

4. Technical Route

The development of the multi-distributed system will be conducted in the form of a

simulation of a model in C Sharp DotNet. This model is indicated in figure 1.

Figure 1 Simulation of Multi Distributed System of Cloud Storage

Here, storage services for data in the cloud are modelled between the cloud users,

designated U, and the service providers, designated SP. Since cloud services are priced

on the quantity of data stored and the length of time of storage, the model is assumed to hold

the data for the same time period. There are p service providers, each

associated with a particular quality of service factor, designated QoS. The cost of

providing storage services is denoted C. Each SP has a different level of

QoS and a different value of C. Therefore any user of the cloud service

(p1, p2, p3, ..., pN) can use more than one SP according to the desired level of security and

the available budget.
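The selection described in this model can be sketched in a few lines. The proposal plans its simulation in C Sharp DotNet; Python is used here only to keep the illustration short. The `Provider` type, the `choose_providers` function, the QoS scale and all numeric values are hypothetical and are not taken from the model in Figure 1. The sketch simply picks the set of providers with the highest combined QoS whose combined cost C fits the user's budget.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Provider:
    name: str
    qos: float   # quality of service factor, higher is better (assumed scale)
    cost: float  # cost C of storing the data for the fixed period (assumed units)


def choose_providers(providers, count, budget):
    """Return the `count` providers with the highest total QoS whose
    combined cost fits the budget, or None if no such subset exists."""
    best = None
    for subset in combinations(providers, count):
        total_cost = sum(p.cost for p in subset)
        if total_cost > budget:
            continue  # this combination is not affordable
        total_qos = sum(p.qos for p in subset)
        if best is None or total_qos > best[0]:
            best = (total_qos, subset)
    return None if best is None else list(best[1])


# Hypothetical providers, for illustration only.
providers = [
    Provider("SP1", qos=0.95, cost=5.0),
    Provider("SP2", qos=0.7, cost=2.0),
    Provider("SP3", qos=0.8, cost=3.0),
    Provider("SP4", qos=0.6, cost=1.0),
]
chosen = choose_providers(providers, count=2, budget=6.0)
print([p.name for p in chosen])
```

The exhaustive search is adequate for a handful of providers; a simulation with many providers would probably need a cheaper heuristic.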

This research will also implement a threat model which will then be tested in C Sharp

DotNet. Two types of threat models are considered. The first is single point of failure which

impacts availability of data. This is a very realistic scenario if any server provided by the

service provider should crash. This would make data retrieval from the service provider by

the client very difficult. A schematic representation of this first threat model is indicated in

figure 2.

Figure 2 First Threat Model

The second type of threat model considers the possibility of attack from service

providers who collude to siphon data from the cloud and then misuse it. This model

is represented in Figure 3.

Figure 3 Second Threat Model

As a solution for these hypothetical attacks, the author will devise a system that will

distribute data amongst nine storage clouds. This is indicated in Figure 4.

Figure 4 Distributed Storage of Data in the Cloud

The reasoning here is that if the data is stored in more than one cloud, then even if one cloud

breaks down or is compromised, data may still be retrieved from the other eight clouds. In

addition, while collusion amongst two cloud service providers may be feasible, collusion

amongst all nine appears to be remote. Moreover, even if one or two service providers turn

rogue and access data from one or more clouds, they will not be able to make sense of such

data because they would need access to the pieces of data that are stored in the remaining

systems, which they do not control. In this way, a distributed storage cloud system secures the

integrity of data from possible malicious tampering by the service providers.
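The proposal does not specify how the data is to be divided among the clouds. One technique that gives both properties argued for above, recovery when some clouds are unavailable and no useful information for a small group of colluding providers, is a threshold scheme such as Shamir's secret sharing. The sketch below is an illustration under that assumption, not the author's method. The function names, the prime and the choice of k = 5 out of n = 9 are all hypothetical.

```python
import secrets

# A Mersenne prime; the field must be larger than any secret that is shared.
PRIME = 2**127 - 1


def split_secret(secret: int, n: int, k: int):
    """Split `secret` into n shares so that any k of them reconstruct it."""
    if not (1 <= k <= n) or not (0 <= secret < PRIME):
        raise ValueError("need 1 <= k <= n and 0 <= secret < PRIME")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# One share per storage cloud; k = 5 is an assumed threshold.
data = int.from_bytes(b"top secret", "big")
shares = split_secret(data, n=9, k=5)
assert recover_secret(shares[:5]) == data
```

With these assumed parameters, up to four clouds can fail and the data is still recoverable, while two colluding providers hold too few shares to learn anything about it. The trade-off is that every cloud stores a share as large as the original data.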

A total of four iterations are planned in this experiment. Each iteration will consist of four

steps.

The first step will test the performance of the cloud computing system with two service

providers under the first threat, which is a single point of failure. The second step will

test the performance of the cloud computing system with ten service providers under the

same threat.

The third step will test the performance of the system under the second threat, which is two

service providers colluding. The fourth step will test the performance of a distributed

system of ten service providers when two of them collude. The results will

be tabulated, presented graphically and explained in relation to the literature review.
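Before running the simulation, the four steps can be approximated analytically. The sketch below is an illustration in Python rather than the planned C Sharp DotNet code, and it rests on assumptions the proposal does not state: each provider fails independently with probability `p_fail`, and the data can be rebuilt from the pieces held by any `k` of the `n` providers. Both function names and all parameter values are hypothetical.

```python
from math import comb


def availability(n: int, k: int, p_fail: float) -> float:
    """Probability that at least k of n providers are reachable, assuming
    each provider fails independently with probability p_fail."""
    p_up = 1.0 - p_fail
    return sum(
        comb(n, up) * p_up**up * p_fail ** (n - up)
        for up in range(k, n + 1)
    )


def collusion_succeeds(colluders: int, k: int) -> bool:
    """Colluding providers can rebuild the data only if they hold k pieces."""
    return colluders >= k


# Two providers versus ten providers, with two colluders in each case.
for n, k in [(2, 2), (10, 2), (10, 5)]:
    print(n, k, round(availability(n, k, p_fail=0.1), 4), collusion_succeeds(2, k))
```

Under these assumptions the threshold `k` carries the trade-off: a low `k` improves availability but lets a small group of colluders rebuild the data, while a high `k` does the opposite.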

5. Scheduling and Risks

This section presents the Gantt chart and the risks the author foresees in this project.

5.1. Gantt Chart

Time periods (chart columns): 15th – 31st Mar; 1st – 30th April; 1st – 7th May; 8th – 15th May; 16th – 23rd May; 24th – 31st May; 1st – 15th June; 16th – 20th June

Tasks (chart rows):

Complete Proposal and obtain sign-off

Write Introduction, Literature Review and Methodology and obtain sign-off

Run 1st round of simulations and tabulate results

Run 2nd round of simulations and tabulate results

Run 3rd round of simulations and tabulate results

Run 4th round of simulations and tabulate results

Write Discussion and Analysis chapters with reference to Literature Review

Write Conclusion and Future Scope of Study

5.2. Risks in the Research

The main risk of this research is the author’s limited competency

in C Sharp DotNet, which is the software that will be used to conduct the

experiment. The author will mitigate this by extensive practice, referring to the IT

department of the college when in doubt to solve any problems or confusion that might arise.

The author surmises that distributing data amongst a host of cloud data warehouses might

solve the problem of possible tampering by any individual service provider. However, this

increases the complexity of access, the costs associated with multiple service

providers and the need to piece together data laboriously once it is received piecemeal from

multiple systems. Whether this will actually happen needs to be tested; if it does, there

will be a trade-off between enhanced security and ease of operation, and it will be up to the end

user to decide. Complexity of access can potentially stymie any advantage that might be

gained through enhanced security, as on a day-to-day basis nobody wants a complicated style

of operation. This possibility is a risk, as it could nullify any gains from distributed or

multiple cloud storage systems that this research might highlight.

Indicative References

1. Arrington, A. (2012) Gmail Disaster: Reports of mass email deletions. New York, McGraw-Hill.

2. Browne, P.S. (2012) Data privacy and integrity: an overview. In Proceedings of SIGFIDET ’12, ACM SIGFIDET.

3. Du, J., Wei, W., Gu, X. and Yu, T. (2010) RunTest: assuring integrity of dataflow processing in cloud computing infrastructures. In Proceedings of the 5th ACM Symposium on Information, Computer and Communications Security (ASIACCS ’10), ACM, New York, NY, USA, 293-304.

4. Gellman, R. (2009) Privacy in the clouds: Risks to privacy and confidentiality from cloud computing. Prepared for the World Privacy Forum, Feb 2009, online at http://www.worldprivacyforum.org/pdf/WPF%20Cloud%20Privacy%20Report.pdf

5. Gruschka, N. and Jensen, M. (2010) Attack surfaces: A taxonomy for attacks on cloud services. 2010 IEEE 3rd International Conference on Cloud Computing (CLOUD), 5-10 July 2010.

6. Itani, W., Kayssi, A. and Chehab, A. (2009) Privacy as a Service: Privacy-Aware Data Storage and Processing in Cloud Computing Architectures. Eighth IEEE International Conference on Dependable, Autonomic and Secure Computing, Dec 2009.

7. Jensen, M., Schwenk, J., Gruschka, N. and Lo Iacono, L. (2009) On Technical Security Issues in Cloud Computing. IEEE International Conference on Cloud Computing (CLOUD II 2009), Bangalore, India, September 2009, 109-116.

8. Dijk, M. van and Juels, A. (2010) On the Impossibility of Cryptography Alone for Privacy-Preserving Cloud Computing. Boston, Macmillan Publications.

9. Oliveira, P.F., Lima, L., Vinhoza, T.T.V., Barros, J. and Médard, M. (2010) Trusted storage over untrusted networks. IEEE GLOBECOM 2010, Miami, FL, USA.

10. Singh, Y., Kandah, F. and Zhang, W. (2011) Secured cost effective multi-cloud data storage in cloud computing. IEEE INFOCOM Workshop on Cloud Computing, 2011.

11. Moreno-Vozmediano, R., Montero, R.S. and Llorente, I.M. (2011) Multicloud Deployment of Computing Clusters for Loosely Coupled MTC Applications. IEEE Transactions on Parallel and Distributed Systems, Vol. 22, No. 6, June 2011.

12. Browne, P.S. (1971) Data privacy and integrity: an overview. In Proceedings of SIGFIDET ’71, ACM SIGFIDET (now SIGMOD), 1971.

13. Cavoukian, A. (2008) Privacy in clouds. Identity in the Information Society, Dec 2008.

14. Shin, S.H. and Kobara, K. (2010) Towards secure cloud storage. Demo for CloudCom2010, Dec 2010.
