

Journal of Digital Forensics, Security and Law

Volume 13 | Number 3 Article 7

9-30-2018

A Forensic Enabled Data Provenance Model for Public Cloud

Shariful Haque
University of Alabama, [email protected]

Travis Atkison
University of Alabama, [email protected]

Follow this and additional works at: https://commons.erau.edu/jdfsl

Part of the Computer Law Commons, and the Information Security Commons

This Article is brought to you for free and open access by the Journals at Scholarly Commons. It has been accepted for inclusion in Journal of Digital Forensics, Security and Law by an authorized administrator of Scholarly Commons. For more information, please contact [email protected], [email protected]. (c)ADFSL

Recommended Citation
Haque, Shariful and Atkison, Travis (2018) "A Forensic Enabled Data Provenance Model for Public Cloud," Journal of Digital Forensics, Security and Law: Vol. 13 : No. 3 , Article 7.
DOI: https://doi.org/10.15394/jdfsl.2018.1570
Available at: https://commons.erau.edu/jdfsl/vol13/iss3/7


A Forensic Enabled Data Provenance Model for Public Cloud JDFSL V13N3

A FORENSIC ENABLED DATA PROVENANCE MODEL FOR PUBLIC CLOUD

Md. Shariful Haque, Travis Atkison∗

University of Alabama
Tuscaloosa, AL, USA

[email protected], [email protected]

ABSTRACT

Cloud computing is a newly emerging technology where storage, computation and services are extensively shared among a large number of users through virtualization and distributed computing. This technology makes the process of detecting the physical location or ownership of a particular piece of data even more complicated. As a result, improvements in data provenance techniques have become necessary. Provenance refers to the record describing the origin and other historical information about a piece of data. An advanced data provenance system will give forensic investigators a transparent view of the data’s lineage, and help to resolve disputes over controversial pieces of data by providing digital evidence. In this paper, the challenges of cloud architecture are identified, their effect on existing forensic analysis and provenance techniques is discussed, and a model for efficient provenance collection and forensic analysis is proposed.

Keywords: data provenance, conceptual model, cloud architecture, digital forensic, digital evidence, data confidentiality, forensic requirements, provenance challenges

1. INTRODUCTION

In recent years, advances in data processing and communication technology and the abundance of digital storage capacity have made it possible to couple multiple computing resources to manage large amounts of data. The concept of cloud computing was developed to offer this computing power and storage virtually as a service. In such a utility-based business model, a consumer can use the offered services on demand following the “pay-as-you-go” approach (Voorsluys, Broberg, & Buyya, 2011). However, efficient data management must include some other parameters to maintain the quality of the data and enable its reuse. Data provenance, or lineage, is a form of metadata that stores the origin of a piece of data, keeps track of its ownership, and manages the history of the computational processing the data goes through in its lifetime. This data management technique is useful not only as a source of data regeneration or as a component for identifying errors through backward tracking; it also serves regulatory purposes and helps resolve disputes by facilitating a proper investigation in digital forensics. These aspects of provenance records make it an essential topic to be discussed in cloud architecture.

∗Corresponding Author

Clouds use multi-tenancy and virtualization to ensure efficient utilization of available resources. Cloud computing identifies and solves some challenges of large scale data processing, such as on-demand resource allocation based on computational requirements, distribution and coordination of jobs among different machines, automatic recovery management, dynamic scaling of operations based on workloads, and releasing the machines when all the jobs are complete (Amazon Web Services, 2008). These features make cloud computing a popular choice for small and medium scale industries, and research says this market will expand with a 30% CAGR (Compound Annual Growth Rate) to reach $270 billion by 2020 (Market Research Media, 2016). As cloud computing grows in popularity, so do concerns about security, compliance, privacy and legal matters (Chung & Hermans, 2010). Storing data in a location with unknown ownership records and thinner boundaries, and having data backed up by an untrusted third party, are a couple of the notable security concerns with cloud architecture (Hashizume, Rosado, Fernandez-Medina, & Fernandez, 2013). These issues can eventually affect the trustworthiness of the data, as well as the metadata associated with it, making it difficult to find accurate evidence to apply in forensics. Though there are several digital forensic tools for general IT scenarios, most of them have not proven successful on cloud architecture. Thus, new digital forensic methods must be developed to meet the challenges of a cloud architecture.

In this paper, a provenance model to increase the capability of digital forensics in a cloud environment will be examined. While this model can be tried on other cloud deployment models, the public cloud architecture will be the focus, as it introduces more challenges than the other models. The rest of the paper is organized as follows: In Section 2, basic concepts of public cloud, digital forensics and data provenance, along with their challenges and requirements, will be discussed. In Section 3, previous work in related areas will be explored. In Section 4, the proposed data provenance model will be discussed along with additional design considerations. Finally, conclusions and future work will be discussed in Section 5.

2. BACKGROUND

2.1 Public Cloud

Mell, Grance, et al. (2011) defined cloud computing as a “model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.” With many services offered through advanced technologies, cloud computing is redefining computing technology. Its “pay-as-you-go” and “provision-as-you-go” approaches make it the most suitable choice for storing personal information, maintaining shared documents with a large group of users, and hosting large applications that require higher computational power. As cloud computing offers a wide range of services based on demand, from a single user to a large organization, different service and deployment models have been designed. One of the widely used deployment models of cloud service is public cloud, which Jansen and Grance (2011) describe as an “infrastructure and computational resource” that is “owned and operated by an outside party that delivers to the general public via multi-tenant platform”. Figure 1 depicts a general architecture of the public cloud model where cloud consumers are accessing the infrastructure of a cloud provider.

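Figure 1. Public Cloud Model (Bohn et al., 2011). [Figure labels: cloud consumers accessing the cloud over the network; cloud consumers accessing the cloud from within the enterprise network.]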

2.1.1 Challenges

As its definition suggests, the public cloud model is designed to offer individuals and enterprises the opportunity to minimize cost with on-demand infrastructure and computation support in a shared environment. However, the shift of control over the data and application to a different administrative entity raises some security and privacy concerns. Even a cloud provider’s reliable and powerful infrastructure is often vulnerable to threats like media failures, malicious attacks or software bugs. The provider’s ability to access a user’s data with malicious intent is another type of privacy issue found in the public cloud model. Ren, Wang, and Wang (2012) identified data owners’ inability to monitor and define the access control policy, and their lack of control over the records relevant to cloud resource consumption, as possible security challenges.

Bohn et al. (2011) and Ren et al. (2012) considered hardware virtualization another critical privacy and security issue. Hardware virtualization leads to a multi-tenant architecture where the physical infrastructure is shared among multiple consumers with logically separated control over the resources. An attacker might be able to overcome this logical separation using configuration errors or software bugs to gain illegitimate access. Accessing services over the internet also increases vulnerability through various network threats (Bohn et al., 2011).

When an organization manages its own resources and records in its secured computing environment, it can clearly arrange the necessary protective measures and have a detailed idea about the location and structure of the data. In contrast, cloud computing services are dispersed, and data is duplicated over multiple locations to ensure availability of the records. Cloud providers keep this information to themselves to eliminate possible security breaches. However, the inability to locate the actual position of the data is a major compliance concern for an organization when handing its business control over to the cloud (Kandukuri, Rakshit, et al., 2009).

2.2 Digital Forensics

With the expanding use of technological devices and the widespread adoption of communication networks to share data, it has become necessary to develop technologies that can store and examine records from different sources. Digital forensics is a logical approach to identifying, collecting, and examining data while ensuring that integrity and the chain of custody are properly preserved and maintained. Kent, Chevalier, Grance, and Dang (2006) describe several applications of digital forensics based on a variety of data sources, such as investigating cybercrime and violations of internal policy, reconstructing security incidents, troubleshooting operational problems and recovering from system damage.

Zawoad, Hasan, and Skjellum (2015) provide an elaborate description of the steps of the digital forensic process based on the formal definition. Figure 2 demonstrates the digital forensic process flow. The identification process consists of two steps where the incident and most relevant evidence are identified along with their possible correlation with other incidents in the system. In the collection step, all digital evidence related to an incident is accumulated from different sources, preserving its integrity. Later, the records are properly organized by extracting and investigating the data characteristics. In the final stage, a well formulated document is prepared by the investigator to present to the court.

Figure 2. Process Flow of Digital Forensic (Zawoad et al., 2015)

2.2.1 Challenges

Applying the digital forensic process in a simple computing architecture is complex by nature; when a cloud architecture is taken into consideration, the process becomes even more challenging. The distributed nature of cloud infrastructure and some of its features affect the credibility of the collected artifacts and eventually increase the complexity of the forensic process in each of its steps.

O’Shaughnessy and Keane (2013) illustrated some of the issues relevant to the cloud forensic process while describing how it differs from the general approach of digital forensics, based on the “Integrated Digital Investigation Process Model” of Carrier, Spafford, et al. (2003). Carrier illustrated the relationship between digital and physical investigation and described the digital investigation model with five phases: preservation, survey, search and collection, reconstruction, and presentation. However, Alqahtany, Clarke, Furnell, and Reich (2015) identified the challenges based on the general process model of forensics: identification, collection, organization, and presentation. Forensic challenges in cloud computing are described in this paper with reference to both of these models.

Any digital investigation must begin with the collection of the digital device to preserve the artifacts related to an occurrence. In cloud computing, the volatile memory of a virtual machine instance and the unavailability of the virtual images demand the creation of advanced technologies to preserve evidence. Logs, a valuable container of evidence, are maintained by the cloud providers, but the integrity of this data might be affected by unlawful actions. A cloud environment also imposes restrictions on the control of data and related information (e.g., location of data), which reduces the opportunity for effective data preservation.
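One common way to make later changes to preserved evidence detectable is to record a cryptographic digest at acquisition time and recompute it whenever the evidence is handled. The following is a minimal sketch under that assumption; it is not a technique taken from the works cited above, and the function names `acquire` and `verify`, the record fields, and the sample log line are all hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def acquire(evidence: bytes, collector: str) -> dict:
    """Record a SHA-256 digest of the evidence at collection time."""
    return {
        "sha256": hashlib.sha256(evidence).hexdigest(),
        "collector": collector,  # hypothetical custody field
        "collected_at": datetime.now(timezone.utc).isoformat(),
    }


def verify(evidence: bytes, record: dict) -> bool:
    """Return True if the evidence still matches the recorded digest."""
    return hashlib.sha256(evidence).hexdigest() == record["sha256"]


# Hypothetical log snapshot obtained from a provider.
log_snapshot = b"2018-09-30T10:00:00Z vm-17 login user=alice\n"
record = acquire(log_snapshot, collector="investigator-1")

print(verify(log_snapshot, record))                # unchanged copy
print(verify(log_snapshot + b"tampered", record))  # altered copy
```

A digest of this kind only shows that a copy has not changed since acquisition; it says nothing about whether the provider's log was already altered before it was collected, which is the harder problem raised above.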

Once the necessary evidence is preserved and ready for initial analysis to build a satisfactory theory, the incident identification process can begin. O’Shaughnessy and Keane (2013) described the effect of cloud service models in this phase. In both Software as a Service (SaaS) and Platform as a Service (PaaS), the consumer has limited access to the cloud infrastructure, which leaves the investigator with limited options to identify any occurrence directly in the server instance. Infrastructure as a Service (IaaS) provides better access than the other two models, as it allows clients to configure their servers with the necessary logs. The different levels of access in the different service models ultimately hinder the process of building a unified model for incident identification and documentation.

Data collection involves accumulating data from different sources such as storage devices, the client’s browser history, the client’s communication with the service provider, access information of the cloud, and other network level data. Distributed cloud architectures introduce various challenges for the investigator in this phase. They require extensive knowledge of the cloud architecture, data storage techniques, and extraction mechanisms for preserving data integrity and maintaining the privacy of cloud consumers. Also, shared storage for managing the logging information of multiple enterprises raises privacy concerns if an incident needs thorough investigation and requires collecting the complete logging information. Data collection might also be affected by the integration of inefficient provenance techniques, lack of time synchronization, and inappropriate maintenance of the chain of custody.

Collected records are further used in a controlled environment to reconstruct an incident and solidify theory building in the reconstruction phase. This requires arranging records from different sources in a unified structure and identifying correlations between different events to successfully recreate the incident. The unavailability of proper forensic tools and utility applications in a cloud environment introduces critical challenges in this phase.
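As an illustration of arranging records from different sources in a unified structure, the sketch below merges per-source logs into a single timeline. It assumes each source's log is already in time order and that all clocks are synchronized, which is itself one of the open problems noted earlier. The log format, the source names, and the function names `parse` and `unified_timeline` are hypothetical.

```python
import heapq
from datetime import datetime


def parse(source: str, line: str) -> tuple:
    """Split 'ISO-timestamp message' into (time, source, message)."""
    stamp, message = line.split(" ", 1)
    return (datetime.fromisoformat(stamp), source, message)


def unified_timeline(logs: dict) -> list:
    """Merge per-source logs (each already time-ordered) into one timeline."""
    streams = [[parse(src, ln) for ln in lines] for src, lines in logs.items()]
    return list(heapq.merge(*streams))


# Hypothetical records from two sources in a cloud environment.
logs = {
    "hypervisor": [
        "2018-09-30T10:00:05+00:00 vm-17 started",
        "2018-09-30T10:02:00+00:00 vm-17 stopped",
    ],
    "storage": [
        "2018-09-30T10:00:30+00:00 object report.doc read",
        "2018-09-30T10:01:10+00:00 object report.doc deleted",
    ],
}

for when, source, message in unified_timeline(logs):
    print(when.isoformat(), source, message)
```

`heapq.merge` is used because it combines inputs that are already sorted without re-sorting everything; if a source's clock drifts, the merged order can be wrong, which is why time synchronization matters for reconstruction.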

A successful forensic investigation ends with the submission of well-formulated documentation of an incident to the court, with proper digital and physical artifacts to defend the report. The distributed nature of the cloud might require a detailed explanation for proper understanding by the jury, the location of data might make it difficult to determine the judicial boundary, and the sources and the data collection process might be questioned for lack of credibility; hence, all these issues need to be overcome during this phase.

2.2.2 Requirements

Bohn et al. (2011), in their conceptual reference model of cloud architecture, described five major cloud actors with their defined responsibilities and offered services. Figure 3 illustrates a cloud architecture focusing on the cloud actors, based on the high-level architecture shown in Bohn et al. (2011). Each actor has sole or shared responsibilities to execute different business transactions or operations in a cloud environment. In this scenario, cloud providers offer services which are negotiated by the brokers, connected and transported by the carrier, used by the cloud consumers, and evaluated independently for performance and security by auditors.

Figure 3. Conceptual Architecture of Cloud. [Figure labels: Cloud Consumer, Cloud Provider, Cloud Broker, Cloud Auditor, Cloud Carrier.]

The contribution of these actors was later used by Ruan and Carthy (2012) when discussing the different types of digital investigations. While an investigator is responsible for external investigations, cloud actors play a vital role in internal investigations by taking various security measures into consideration. Based on the work of the Cloud Security Alliance (2011), Ruan and Carthy (2012) presented a list of responsibilities of different cloud actors in digital forensics. These responsibilities are identified as required criteria of cloud forensics.

Ensuring data ownership with stewardship, data retention and disposal, facility security, clock synchronization, and integrating audit log and intrusion detection are identified as the sole responsibilities of the cloud provider [23]. Data ownership with stewardship helps in determining ownership of the records and maintaining a chain of custody in the forensic investigation. With a proper backup facility and redundant data storage policy, the provider can ensure successful data retention. Strict data disposal operations also need to be integrated to ensure complete removal of data from all the sources without leaving any option to recover them. Backups are the only possible means of reconstructing any event or incident.

Cloud providers also need to ensure there are controlled access and authorization checks for any entry to the facility and during data relocation to a different facility. In digital forensics, identifying events in proper order helps in incident reconstruction. Cloud providers need to ensure that all processes run in a synchronized environment. Integrating different logging functionality to record every single operation is another important responsibility of cloud providers, which undoubtedly plays the most vital role in digital forensics.

Ruan and Carthy (2012) also listed some shared responsibilities of providers and consumers based on the work of the Cloud Security Alliance (2011). Defining a meaningful taxonomy, depending on data type, origin, legal or contractual constraints, and sensitivity, helps in forensic investigations. All the important assets need to be recorded with ownership information, and different data-related operations need to be logged with common forensic related terms. To facilitate regulatory, statutory, and contractual requirements for compliance mapping, elements might be assigned legislative domains and jurisdictions. Other shared responsibilities of the provider and consumer include authorization, multi-factor authentication, establishment of policy and procedure as a part of incident management, and preservation of data and evidence integrity.

2.3 Data Provenance

Data provenance has proved its applicability and necessity in different application domains, data processing systems and representation models. For e-science, it can be used for transformation and easy derivation of data. In warehousing, it can be used to analyze and represent data. For Service Oriented Architectures (SOA), it can be used to execute complex computations. In document management and software development tools, it can be used to identify the lifecycle of a document. As the nature of this large range of applications differs, so do their techniques for managing the provenance records.

Data provenance has been classified into several categories based on the techniques used to trace the records and the way these are preserved. The techniques for data tracing are classified under two approaches: “lazy” and “eager” (W. C. Tan, 2004). In the “lazy” approach, provenance records are computed only when they are required, while in the “eager” approach provenance records are carried with the data. Sequence-of-Delta and Timestamping are two alternative approaches to archiving provenance records (W. C. Tan, 2004). The Sequence-of-Delta approach stores the reference version and a sequence of forward or backward deltas between versions. Timestamping, on the other hand, uses versions and times to identify data at various steps of its lifetime.
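The difference between the two archiving approaches can be sketched in a few lines. The code below is a simplified reading of them, not an implementation from W. C. Tan (2004): versions are modeled as flat dictionaries, deltas are forward only, and timestamps are plain integers committed in increasing order. All class and function names are hypothetical.

```python
_MISSING = object()  # sentinel: "key absent in this version"


def make_delta(old: dict, new: dict) -> dict:
    """Forward delta from old to new: keys set or changed, and keys removed."""
    return {
        "set": {k: v for k, v in new.items() if old.get(k, _MISSING) != v},
        "removed": [k for k in old if k not in new],
    }


def apply_delta(version: dict, delta: dict) -> dict:
    """Apply one forward delta to a version and return the next version."""
    result = {k: v for k, v in version.items() if k not in delta["removed"]}
    result.update(delta["set"])
    return result


class DeltaArchive:
    """Sequence-of-Delta: keep one reference version plus forward deltas."""

    def __init__(self, reference: dict):
        self.reference = dict(reference)
        self.deltas = []
        self._latest = dict(reference)

    def commit(self, new_version: dict) -> None:
        self.deltas.append(make_delta(self._latest, new_version))
        self._latest = dict(new_version)

    def checkout(self, n: int) -> dict:
        """Rebuild version n (0 is the reference) by replaying n deltas."""
        version = dict(self.reference)
        for delta in self.deltas[:n]:
            version = apply_delta(version, delta)
        return version


class TimestampArchive:
    """Timestamping: each element records the times at which its value changed."""

    def __init__(self):
        self.history = {}  # key -> list of (timestamp, value or _MISSING)

    def commit(self, timestamp: int, version: dict) -> None:
        for key in set(self.history) | set(version):
            value = version.get(key, _MISSING)
            changes = self.history.setdefault(key, [])
            if not changes or changes[-1][1] != value:
                changes.append((timestamp, value))

    def at(self, timestamp: int) -> dict:
        """Return the data as it existed at the given time."""
        snapshot = {}
        for key, changes in self.history.items():
            value = _MISSING
            for stamp, recorded in changes:
                if stamp <= timestamp:
                    value = recorded
            if value is not _MISSING:
                snapshot[key] = value
        return snapshot


v0 = {"owner": "alice", "size": 10}
v1 = {"owner": "alice", "size": 12}
v2 = {"owner": "bob"}

deltas = DeltaArchive(v0)
deltas.commit(v1)
deltas.commit(v2)

stamps = TimestampArchive()
for t, v in [(100, v0), (200, v1), (300, v2)]:
    stamps.commit(t, v)

print(deltas.checkout(1))
print(stamps.at(250))
```

In this sketch the delta archive has to replay changes from the reference to reach a given version, while the timestamp archive answers "what did the data look like at time t" directly from the per-element histories.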

Provenance can also be discussed in terms of the effect of the source data on the existence of data records, or the locations from which those records are fetched (Buneman, Khanna, & Wang-Chiew, 2001). These concepts are termed why-provenance and where-provenance, respectively. From the aspect of data processing architecture, another classification was proposed by Simmhan, Plale, and Gannon (2005). This approach was later adopted by Muniswamy-Reddy, Holland, Braun, and Seltzer (2006) and elaborated with more categories. They extended the database-oriented approach of Simmhan et al. (2005) to include a file and file-system oriented approach called Provenance Aware Storage System (PASS). A grid-based solution comprises the service-oriented architecture in which software-development tools are part of a scripting-architecture. Environment architecture was the fourth category included by Muniswamy-Reddy et al. (2006).
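Returning to why-provenance and where-provenance, a small relational join makes the distinction concrete. The sketch below is an informal illustration only: it records a single witness per output row rather than the full formal definition of Buneman et al. (2001), and the table names, column names, and the function `join_with_provenance` are hypothetical.

```python
def join_with_provenance(employees: list, departments: list) -> list:
    """Join on 'dept' and annotate each output row with its provenance."""
    results = []
    for i, emp in enumerate(employees):
        for j, dep in enumerate(departments):
            if emp["dept"] != dep["dept"]:
                continue
            results.append({
                "row": {"name": emp["name"], "city": dep["city"]},
                # why: source rows whose presence makes this output row exist
                "why": {("employees", i), ("departments", j)},
                # where: the source cell each output value was copied from
                "where": {
                    "name": ("employees", i, "name"),
                    "city": ("departments", j, "city"),
                },
            })
    return results


employees = [{"name": "Ann", "dept": "D1"}, {"name": "Raj", "dept": "D2"}]
departments = [{"dept": "D2", "city": "Tuscaloosa"}]

for result in join_with_provenance(employees, departments):
    print(result["row"], result["why"], result["where"])
```

Here the "why" annotation answers which source records caused an output record to appear, and the "where" annotation answers which source location each output value was copied from.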

In a domain-specific approach to storing provenance, data and provenance are loosely coupled, as data is managed by the file system and provenance is stored in database systems (Muniswamy-Reddy et al., 2006). Lineage File System (LinFS) used a third party database to store provenance records while tracking them automatically at the file system level (Sar & Cao, 2005). The obvious issue with separating file and provenance was addressed in the Flexible Image Transport System by Wells, Greisen, and Harten (1981), where the file header contains the additional attributes as a provenance record. Another approach to tight coupling between data and provenance was later introduced in PASS. PASS tracks provenance automatically, like LinFS, and manages its database directly in the kernel through its file system, PASTA (Muniswamy-Reddy et al., 2006). PASS proved its superiority by ensuring better synchronization between data and provenance; it also provides security over the provenance and features to query those records. File Provenance System (FiPS) is another provenance system that collects provenance records automatically, operating at the system level and below the Virtual File System (VFS) layer (Sultana & Bertino, 2013).

2.3.1 Challenges

As with digital forensics, the dynamic nature of cloud computing and its architecture introduces several challenges to establishing a successful provenance technique. Researchers have highlighted the heterogeneous architecture of cloud infrastructure, granularity, virtualization, availability and data integrity as the prime challenges in identifying a better provenance approach in the cloud (Muniswamy-Reddy, Macko, & Seltzer, 2010; Sakka, Defude, & Tellez, 2010; Zhang, Kirchberg, Ko, & Lee, 2011; Katilu, Franqueira, & Angelopoulou, 2015; Abbadi, Lyle, et al., 2011).

Muniswamy-Reddy et al. (2010) identified the lack of extensibility of existing provenance systems in the area of cloud computing, and their inability to support availability and scalability, as some of the critical areas. Incompatible availability features between the provenance system and the cloud architecture may affect the principal goal of the provenance architecture. In addition, the lack of scalability becomes clearly visible when a database is used to store provenance in the cloud architecture. A database makes provenance records queryable, but in a distributed architecture it introduces a deadlock or scalability bottleneck. Introducing a parallel distributed database, a possible solution to this problem, is often criticized because of its maintenance burden and cost.

Sakka et al. (2010) highlighted challenges that arise from the variety in cloud architectures. Differences in the policies these architectures apply to identify objects weaken the ability to establish a unique object identification technique in the cloud. Also, there is no efficient and unified architecture for an interoperable provenance system that could manage the heterogeneous policies followed by different entities and multiple levels of granularity.

The virtualization feature of cloud architecture and its fault tolerance also impose some technical issues in provenance collection. Zhang et al. (2011) discussed the necessity of maintaining a record of operating systems, server locations and memory management of virtual and physical environments to provide more accurate information to the consumers. They also prioritized the maintenance of provenance information about data migration, which takes place in case of hardware failure.

Abbadi et al. (2011) presented a taxonomy of cloud infrastructures with a description of the dynamic nature of this structure, and identified the challenges in the area of logging and auditing. It is imperative for a system administrator to identify the location and reason of an error through auditing and logging. But the cloud’s multilayer architecture makes it difficult to identify the origin of data from the enormous volume of resources collected from a large number of diversified sources. Abbadi et al. (2011) defined the role of the administrator in this scenario as identifying the time intervals when physical layers and virtual layers are used by the virtual resources and applications respectively, and combining these records with other relevant log files. They also identified some shortcomings, such as the lack of protection in maintaining logging and auditing records and the lack of trustworthiness of the provider maintaining the cloud infrastructure.

2.3.2 Requirements

In this section, requirements of data provenance will be discussed from two different aspects: the mandatory properties to fulfill the general criteria of a provenance record and the required features for a successful provenance technique in the cloud.

Zhao, Bizer, Gil, Missier, and Sahoo (2010) categorized the requirements of provenance by areas of concern, such as content, management and usage, which are essential for an effective Resource Description Framework (RDF) model. They presented an elaborate discussion of each criterion. The content information includes the identity of the content, the evolution of the content over time, and records of the processes or entities that contributed to its evolution, including the justification and design choices behind that process. From the management perspective of the provenance records, a well-formatted representation language and the accessibility of the provenance records are also considered important attributes.

Moreau et al. (2011) also identified a similar set of integral components in their Open Provenance Model (OPM). Basic information about an object, the processes and agents involved in its evolution, the interdependencies between different entities, and the actual role of the agents are the most important tokens of a provenance record. This information later needs to be recorded in an easily representable language supported by heterogeneous systems, and should allow customization to include additional information. Additionally, the recorded information should be easily accessible with a simple query.
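The components listed above can be pictured as a small graph of objects, processes and agents linked by role-labelled dependencies. The sketch below is only an illustration of that idea, not an implementation of the OPM specification: the edge kinds borrow names from the OPM vocabulary, while the classes, the example nodes and the `lineage` query are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    kind: str       # e.g. "used", "wasGeneratedBy", "wasControlledBy"
    source: str     # effect side of the dependency
    target: str     # cause side of the dependency
    role: str = ""  # role played by the target in this dependency


@dataclass
class ProvenanceGraph:
    nodes: dict = field(default_factory=dict)  # id -> "artifact" | "process" | "agent"
    edges: list = field(default_factory=list)

    def add_node(self, node_id: str, node_type: str) -> None:
        self.nodes[node_id] = node_type

    def add_edge(self, kind: str, source: str, target: str, role: str = "") -> None:
        self.edges.append(Edge(kind, source, target, role))

    def lineage(self, node_id: str) -> set:
        """Return every node that node_id directly or indirectly depends on."""
        seen, stack = set(), [node_id]
        while stack:
            current = stack.pop()
            for edge in self.edges:
                if edge.source == current and edge.target not in seen:
                    seen.add(edge.target)
                    stack.append(edge.target)
        return seen


g = ProvenanceGraph()
g.add_node("raw.csv", "artifact")
g.add_node("report.pdf", "artifact")
g.add_node("render", "process")
g.add_node("alice", "agent")
g.add_edge("used", "render", "raw.csv", role="input")
g.add_edge("wasGeneratedBy", "report.pdf", "render", role="output")
g.add_edge("wasControlledBy", "render", "alice", role="operator")

print(sorted(g.lineage("report.pdf")))
```

The `lineage` call stands in for the "simple query" requirement: given one object, it walks the dependencies back to every process, agent and source object involved.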

In addition to the necessary attributes mentioned above, a provenance system also needs to follow some standards in terms of data maintenance, dependency on the associated application, level of granularity, and data integrity and presentation (Muniswamy-Reddy et al., 2010; Katilu et al., 2015; Taha, Chaisiri, & Ko, 2015; Y. S. Tan, Ko, & Holmes, 2013).

The data-coupling feature confirms the association between the data and provenance records. Appropriate data coupling eliminates the chance of misleading data that might contain new provenance information for old data or vice versa. This coupling must also include the causal ordering among different entities while recording the provenance information (Muniswamy-Reddy et al., 2010).

A provenance tracking system should enable tracking a particular object from every host in a distributed environment, which is an essential property to detect an incident and ensure accountability across a system. That way, if multiple hosts contribute to any operation on an object, the provenance records should maintain a record for each responsible host. A provenance system should also be independent of the application and platform so that no additional configuration is required for tracking if a new application is introduced in a system or if the kernel of a system is modified (W. C. Tan, 2004).

A provenance system should strictly maintain the integrity of the records by making them tamper-evident and should ensure the persistence of the records. Data integrity assures the accuracy or correctness of the records over time, while tamper-evidence confirms reliability, authenticity, admissibility, believability, and completeness (Taha et al., 2015). Persistence, on the other hand, confirms the existence of provenance information even when the actual object is erased from the system, as long as the object has a descendant. The nature of provenance information also demands confidentiality at every level of data management, e.g., data access, data storage, and data collection.

Capturing provenance records from multiple levels is also considered an essential property of a provenance system. Multi-layer granularity provides a complete picture of any incident that occurred with an object at different levels of a system (Katilu et al., 2015). An advanced interface to represent the provenance records, along with an efficient data retrieval technique, is another measure that defines the success of a provenance system. This sort of interface, with an efficient query mechanism, helps to identify any occurrence quickly and to understand the patterns and characteristics of the provenance records (Katilu et al., 2015).

3. RELATED WORKS

Digital provenance in the cloud environment has been studied for a long time in order to meet the critical requirements imposed by the architecture and to mitigate the challenges that revolve around digital forensics. Some researchers have proposed different provenance techniques, while others have identified efficient approaches for effective forensic solutions (Muniswamy-Reddy et al., 2010; Zhang et al., 2011). Several techniques have also been proposed in which the provenance architecture is discussed specifically with the requirements of digital forensics in mind (Li, Chen, Huang, & Wong, 2014; P. M. Trenwith & Venter, 2014; Lu, Lin, Liang, & Shen, 2010).

(Muniswamy-Reddy et al., 2010) extended PASS, which was initially used in local file systems and network-attached storage, to the cloud architecture. They implemented three provenance recording protocols based on PASS using Amazon Web Services (AWS): cloud storage alone; cloud storage with a cloud database; and cloud storage with a cloud database and a messaging service. The cloud messaging service in the third architecture ensured its superiority over the other two by confirming provenance data-coupling through transactions and providing the fastest service.

Cloud providers usually track every operation executed inside a cloud environment to ensure an immediate response to any incident. Flogger is one such logging mechanism, which can track file provenance in both virtual and physical machines (Ko, Jagadpramana, & Lee, 2011). DataPROVE, a data provenance technique, was proposed based on records generated by Flogger in collaboration with other system monitoring tools such as User, Process, Event Tracker, Change Tracker and Network Traffic Tracker (Zhang et al., 2011). It collects provenance information in application-layer services such as PaaS, SaaS, and IaaS, rather than just virtual and physical machine information. It also captures other essential cloud-related provenance information, such as client identification records and the transfer of information between different virtual and physical machines placed in different locations. Additionally, DataPROVE provides an efficient data API for indexing and query operations.

S2Logger is another approach to maintaining secure data provenance through logging at the block level and the file level. It is implemented based on Flogger (Suen, Ko, Tan, Jagadpramana, & Lee, 2013). In this mechanism, (Suen et al., 2013) addressed features such as capturing the mapping between virtual and physical resources, properly maintaining the provenance chain, and preserving data integrity and confidentiality. S2Logger also introduced an end-to-end data tracing mechanism that is capable of capturing detailed provenance information from all nodes connected in the cloud environment.

(Lu et al., 2010) designed a provenance mechanism that enables forensic features in the cloud environment. The mechanism was designed based on a bilinear pairing technique and considers some critical aspects of the cloud, such as data confidentiality, anonymous user access, and dispute resolution. This mechanism introduced computation and communication overhead at the data owner’s end. This issue was later identified by (Li et al., 2014), who introduced a broadcast encryption method to overcome it. They also ensured the anonymity of the user by implementing an attribute-based signature (ABS) scheme for access control, and ensured privacy by designing an anonymous key-issuing protocol.

(P. M. Trenwith & Venter, 2014) discussed another source of log information that may allow the investigator to determine the physical location of the data. They adopted a provenance technique in their proposed system in which transport layer protocol information is recorded to identify the location of data. They also considered application layer data to answer additional queries, such as the who, when, what, and how information required for a successful forensic investigation. The provenance record is proposed to be stored in a centralized logging server, which can overcome the problem of strong coupling between data and its metadata and can also ensure data integrity. They later proposed another model for provenance with features such as digital object tracking and location identification at any point in the lifetime of a particular digital object (P. Trenwith & Venter, 2015). In this model, they introduced the concept of a wrapper object, which wraps a file together with its embedded provenance records, while the location of the wrapper object is maintained in a separate centralized server.

4. FORENSIC ENABLED PROVENANCE MODEL

4.1 Additional Design Consideration

Designing an efficient and forensic-enabled provenance mechanism in a cloud environment requires consideration of several other areas that contribute greatly to the success of the process. Identifying the possible sources of provenance records, deciding on a unified model to store those records, choosing data storage with the features required for efficient data management, and ensuring the security and integrity of the data for forensic operations are important factors that must be considered for a provenance mechanism.


A general cloud architecture can be divided into several layers: the physical resource layer, the resource abstraction layer, and the service layer, as shown in Figure 4. (Ruan & Carthy, 2012) identified the sources of forensic artifacts from each of the layers. Among these, it is necessary to identify the actual sources that provide the necessary provenance-related information and help in forensic analysis. Hard disks, network logs, and access records are important sources from the physical layer. Event logs and virtual resources are necessary from the abstraction layer, and access logs, event logs, rigid information about the virtual operating system, and application logs are vital sources from the service layer. (Imran & Hlavacs, 2013) identified provenance information in different sub-layers of the service layer and developed an integrated provenance data collection process. In our conceptual approach to collecting provenance data, we will also consider the criteria featured in these proposed systems that can be helpful for the forensic process.

Figure 4. Cloud System Architecture (Bohn et al., 2011)

The format and properties used for logging provenance data collected from different sources are important in a provenance collection system. (Marty, 2011) has identified some important properties of provenance data, such as the timestamp, application, user, session id, category, and the reason and severity of the event. Applications running in the service layer mostly contribute this type of information, which can be collected from different application and event logs. A distributed file system, like the Google File System (GFS), maintains a separate set of information that is vital for provenance and forensics. The GFS master usually keeps information on the file and chunk (part of a file) namespaces, the mapping from files to chunks, and the locations of the chunk replicas (Ghemawat, Gobioff, & Leung, 2003). The Hadoop Distributed File System (HDFS) also maintains a transaction log, called EditLog, in its NameNode, which contains every modification recorded in the file system metadata (Borthakur et al., 2008). Google Cloud maintains region- and zone-related information when creating a virtual machine instance; this record contains provenance data along with other information that may guide an investigator in forensic analysis. The heterogeneous characteristics of these records make it very hard to choose a particular format. Marty has suggested logging the information as “key-value” pairs, which is considered the ideal logging syntax in our design approach. We also include a unified naming approach for the “key” to ensure the consistency of the same properties coming from different sources.
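As a minimal sketch of the key-value logging and unified key naming described above, the following maps source-specific field names onto a shared vocabulary before a record is serialized. The alias table `KEY_ALIASES`, the function names, and the sample records are assumptions made for illustration; the paper does not prescribe a concrete naming scheme.

```python
# Hypothetical mapping from source-specific field names to unified keys.
KEY_ALIASES = {
    "ts": "timestamp", "time": "timestamp",
    "uid": "user", "username": "user",
    "app": "application", "sid": "session_id",
}


def normalize(record):
    """Return a copy of a log record whose keys follow the unified naming."""
    return {KEY_ALIASES.get(k.lower(), k.lower()): v for k, v in record.items()}


def to_line(record):
    """Serialize a record as space-separated key=value pairs, keys sorted."""
    return " ".join(f"{k}={v}" for k, v in sorted(record.items()))


# Two sources that describe the same properties under different names.
app_log = {"TS": "2018-09-30T10:15:00Z", "uid": "alice", "app": "editor"}
vm_log = {"time": "2018-09-30T10:15:02Z", "username": "alice", "severity": "info"}

for rec in (app_log, vm_log):
    print(to_line(normalize(rec)))
```

After normalization, both records expose `timestamp` and `user`, so a single query on those keys reaches entries from either source. Values containing spaces would need quoting or escaping, which this sketch omits.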

Provenance data records must be easily searchable; therefore, a data repository that can provide the best performance and an efficient query interface must be chosen. (Muniswamy-Reddy, 2006) explored different database architectures to choose the best data repository, considering aspects such as the format for maintaining file dependencies and different approaches to storing the annotation metadata. The study compared Berkeley DB as a schemaless database, MySQL as an RDBMS, XMLDB as an XML database, and OpenLDAP as a representative of an LDAP architecture, and found that Berkeley DB outperformed the other database architectures in terms of standard database operations in this scenario. While implementing PASS in the cloud, (Muniswamy-Reddy et al., 2010) also used SimpleDB and Amazon S3 to store provenance objects.

[Figure 4 labels: Service Layer (SaaS, PaaS, IaaS); Resource Abstraction & Control Layer; Physical Resource Layer (Hardware, Facility)]

Provenance records logged in different layers are maintained by different actors of the cloud. Cloud consumers are usually responsible for maintaining the records of the service layer, while providers manage the logging activities of the abstraction and physical layers. Provenance records in these layers may be affected by collusion between the consumer and provider, which can mislead the forensic investigator. The investigator can also act as an accomplice of either of these parties, which ultimately undermines the whole approach of data provenance and forensic analysis. To ensure the integrity of the provenance data and the trustworthiness of the collection system, it is necessary to include a verification process that can be used to validate the provenance information. (Zawoad et al., 2015) proposed a Proof Publisher Method (PPM) in their Open Cloud Forensics (OCF) model, which can be used by the court authority to verify the forensic reports provided by the investigators. They also proposed a reliable way of collecting evidence through secured read-only APIs accessible only by the investigator and court authority.

4.2 Proposed Model

Figure 5 shows the proposed Forensic Enabled Cloud Provenance Model. This section describes the proposed model and surveys its requirements for provenance, forensic analysis, and the necessary design considerations. The concept of the general actors of cloud operations and forensic analysis has been utilized in this model; however, the contribution of each cloud actor and the mechanisms by which the investigator and court authority access the provenance records have been redefined. It is also necessary to introduce a unified logging mechanism in the different layers of a cloud environment to facilitate data manipulation while recording provenance information in a persistent database.

Figure 5. Forensic Enabled Cloud Provenance Model

In a cloud model, consumers and providers are the two main contributors to the cloud operation, while a cloud auditor analyzes the cloud services and provides feedback. In the proposed model, the cloud auditor is given an additional responsibility. Usually, cloud services such as SaaS and PaaS are designed to give the consumer less control over the system, while IaaS allows the consumer to configure the necessary applications. In these scenarios, the cloud auditor’s additional responsibility will be to monitor whether the services used by the consumers are capable of logging every operation, and whether the provider enables the logging features for each application to which the consumer has limited access. The provider’s additional responsibility will be to introduce policies in the SLA that clearly instruct the clients about installing the necessary auditing features. The provider should also redesign the access policies so that the client has enough options for customization, and should be capable of monitoring every operation performed by the client. In an ideal scenario, the provider should install firewalls and other applications to detect intrusions or any form of malicious activity.

Database architecture is an important aspect of designing an ideal provenance model. The database is required not only to accommodate a variety of information and manage large volumes of data, but also to ensure availability and scalability. Considering these facts, the proposed system uses a schema-less distributed database to store the provenance records arriving in heterogeneous formats from different sources. The distributed nature of the database architecture will ensure availability and scalability.

Data will be stored in a key-value format in which the key names follow a consistent naming convention, maintained through a unified data modeling policy, to facilitate query operations. Each transaction will be logged separately with a reference to the dependent object. Keeping a single instance of the provenance record would require updating that instance very frequently and would hence reduce the expected performance of the system. When recording provenance for a file, it is recommended that a copy of the content be maintained and logged in the repository. The provenance record of a data item should also maintain relationship information with the processes or environment variables that affected it at any point in its lifetime.
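The storage policy above (one appended record per transaction, each carrying a reference to the object it depends on) can be sketched as follows. This is an in-memory illustration under stated assumptions, not the proposed distributed database: the class `ProvenanceStore`, its method names, and the record keys are hypothetical, and each object is assumed to have at most one direct parent.

```python
class ProvenanceStore:
    """Append-only store: every transaction becomes a new key-value record."""

    def __init__(self):
        self._records = []

    def log(self, object_id, operation, depends_on=None, content=None):
        # Append a separate record per transaction instead of updating
        # a single per-object instance.
        record = {
            "seq": len(self._records),
            "object_id": object_id,
            "operation": operation,
            "depends_on": depends_on,  # reference to the dependent object
            "content": content,        # optional copy of the file content
        }
        self._records.append(record)
        return record

    def history(self, object_id):
        """All transactions recorded for one object, in logging order."""
        return [r for r in self._records if r["object_id"] == object_id]

    def lineage(self, object_id):
        """Follow depends_on references back to the earliest ancestor."""
        chain, seen = [], set()
        current = object_id
        while current is not None and current not in seen:
            seen.add(current)
            chain.append(current)
            parents = [r["depends_on"] for r in self.history(current)
                       if r["depends_on"] is not None]
            current = parents[0] if parents else None
        return chain


store = ProvenanceStore()
store.log("a.txt", "create", content="v1")
store.log("a.txt", "write", content="v2")
store.log("b.txt", "copy", depends_on="a.txt")
store.log("c.txt", "copy", depends_on="b.txt")
print(store.lineage("c.txt"))
```

Because records are only appended, the history of `a.txt` keeps both the `create` and `write` transactions, and the lineage of `c.txt` can be traced back through `b.txt` to `a.txt`.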

A timestamp for every transaction is a vital field. Loggers in different layers will need to maintain strict time synchronization to capture the sequence of events properly. To maintain the proper chronology of data from different systems, each transaction will be saved with timestamp information from both the source and the data storage system. This will help in identifying unsynchronized nodes or sources.
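One way the two timestamps might be used to flag unsynchronized sources is sketched below. The record keys (`source`, `source_time`, `stored_at`), the function name, and the five-second tolerance are assumptions for illustration; the sketch also treats the whole difference as clock skew and ignores transmission delay.

```python
from datetime import datetime, timedelta


def unsynchronized_sources(records, tolerance=timedelta(seconds=5)):
    """Return sources whose clock differs from the storage clock by more
    than the tolerance in at least one transaction."""
    flagged = set()
    for rec in records:
        skew = abs(rec["stored_at"] - rec["source_time"])
        if skew > tolerance:
            flagged.add(rec["source"])
    return sorted(flagged)


base = datetime(2018, 9, 30, 10, 0, 0)
records = [
    # vm-1 reports a time one second before the storage system saw it.
    {"source": "vm-1", "source_time": base,
     "stored_at": base + timedelta(seconds=1)},
    # vm-2 reports a time roughly three minutes behind the storage clock.
    {"source": "vm-2", "source_time": base - timedelta(minutes=3),
     "stored_at": base + timedelta(seconds=2)},
]
print(unsynchronized_sources(records))
```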

A trustworthy data storage policy is another important concern. It is necessary to ensure the integrity of the data and to employ techniques that easily identify any forged data and possible miscreants. To ensure this, the database should be designed to capture the necessary audit information. Audit information will include the identity of the application or user that stores or updates any information. For every data item that is saved, an additional attribute named provenance will be added. This attribute will contain the hash value of the data item, which will be used later to verify the information.
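The hash attribute can be illustrated with a short sketch. The paper does not name a hash function or a serialization; SHA-256 over a key-sorted JSON encoding is assumed here, and the function names and the `stored_by` audit field are hypothetical.

```python
import hashlib
import json


def with_provenance_hash(item, stored_by):
    """Attach the audit identity and a SHA-256 digest of the data item."""
    payload = json.dumps(item, sort_keys=True).encode("utf-8")
    return {
        "data": item,
        "stored_by": stored_by,  # application or user that stored the item
        "provenance": hashlib.sha256(payload).hexdigest(),
    }


def verify(entry):
    """Recompute the digest and compare it with the stored attribute."""
    payload = json.dumps(entry["data"], sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest() == entry["provenance"]


entry = with_provenance_hash(
    {"object_id": "a.txt", "operation": "write"}, stored_by="logger-1")
print(verify(entry))                    # digest matches the untouched data

entry["data"]["operation"] = "delete"   # simulate a forged record
print(verify(entry))                    # digest no longer matches
```

A plain digest stored next to the data only detects changes made by someone who cannot also rewrite the digest; keeping the digests in a separately controlled location would be one possible way to strengthen this, which the sketch leaves open.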

If an issue is reported by any of the other three actors of cloud services, the process of forensic analysis starts. In this case, an investigator will conduct the preliminary investigation by collecting provenance records from the provenance storage and will finally prepare a report eligible for submission to the court authority. The court authority will verify the report and provide the verdict. To facilitate this process, query and verification interfaces for the investigator and court authority have been introduced in the proposed system. These services will be provided by the service providers.

4.3 Discussion

This section briefly demonstrates how the proposed model of provenance collection overcomes the challenges already discussed in Section 2 and how efficiently this model can facilitate the process of forensic analysis. In Table 1, we also present a comparative analysis with the existing approaches in terms of the challenges of provenance management for forensic analysis in the public cloud addressed by those models.

One of the primary concerns about cloud computing is the shift of control over data or files to a different authority. Data might be affected by malicious attacks, software bugs, or system failure. The distributed nature of cloud architecture can ensure the availability of data in these scenarios. Any unlawful access from the provider’s end can be easily identified through the provenance information, as every access to any file or data is recorded by the provenance system.

Supporting multi-tenancy through virtualization and identifying the location of data are the most notable challenges in a public cloud environment. Collecting snapshots of virtual machine logs with additional information to identify the virtual machine instance can solve this problem. File system metadata contains details about file status and location. Integrating the necessary measures already discussed in Section 4.1 can lead to an efficient solution to these issues. Maintaining provenance records with location information will help investigators identify the location of an object and collect the necessary artifacts accordingly.

Forensic investigation is often hampered by the lack of synchronization between stored records and by variations in the records coming from different sources. The proposed model introduces a unified model to identify the attributes of an entity and the relevant operations. This will facilitate the process of analysis, as investigators can easily identify and differentiate between the entities of a particular piece of information and their effect on the provenance record or forensic analysis.

Managing a large collection of provenance data is a difficult job. The large volume of information, the variety of data formats, and the high velocity at which the volume grows make the provenance collection process a highly challenging task. To ensure efficient forensic analysis, a provenance collection technique cannot compromise on any of these concerns. Hence, a sound data management policy is recommended. The proposed model suggests a unified naming convention for “keys” coming from different sources. A well-chosen naming convention helps to identify the exact information in a search and can also identify relations in log information coming from different sources.

Regeneration of data or of an incident from the lineage information is another focus of a data provenance system. Provenance records maintained with the actual content can ease this process, as an investigator can extract any verified content from the provenance storage and apply the operations that occur in the other version of the data to check whether these operations generate the same result. This forensic enabled data provenance model also considers storing the data along with its metadata in the provenance log.
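The regeneration check described above can be sketched as a replay of logged operations on a verified earlier copy. The operation vocabulary (`append`, `replace`), the function name, and the sample content are assumptions for illustration only; real provenance logs would record far richer operations.

```python
def replay(content, operations):
    """Apply recorded operations, in order, to a verified earlier content."""
    for op, arg in operations:
        if op == "append":
            content = content + arg
        elif op == "replace":
            old, new = arg
            content = content.replace(old, new)
        else:
            raise ValueError(f"unknown operation: {op}")
    return content


verified = "draft report"                       # content taken from the store
ops = [("append", " v2"), ("replace", ("draft", "final"))]
claimed = "final report v2"                     # later version under dispute

# True when the logged operations reproduce the claimed version.
print(replay(verified, ops) == claimed)
```

A mismatch between the replayed result and the later version would suggest that an unrecorded operation took place or that one of the two versions was altered.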

The proposed provenance model collects information from different layers of a cloud architecture. It also considers the measures necessary to include location information and other attributes required to identify the related entities or actions on data. This large volume of information provides ample opportunity for proper analysis.

A loose coupling between data and provenance records is maintained, as provenance information is stored separately in a database. The separation between these two records ensures the existence of provenance information even if the actual data is missing. By nature, provenance information is immutable and persistent. If any provenance is intentionally deleted from the database, a database audit can provide the necessary hints to identify the miscreants.

The proposed model also prioritizes the necessity of a query and verification interface for the investigator and court authority. A well-designed interface with features for executing customized requests to extract the required information from the provenance database can prove the efficiency of a provenance model. Effective data visualization is another important aspect of these interfaces. A detailed overview of an event in chronological order, providing information about a file with its location and the list of processes that contributed to the evolution of the file, can help an investigator define the legal boundary when requesting actual artifacts from the providers, without intruding on other users’ information, and identify the actual lineage of the file. A step-by-step verification can also help a jury or court authority to detect possible symptoms of collusion between the consumer and cloud service provider, consumer and investigator, or provider and investigator.

Table 1. Comparative analysis of the proposed approaches

Proposed Approaches: (Muniswamy-Reddy et al., 2010); (Zhang et al., 2011); (Li et al., 2014); (P. M. Trenwith & Venter, 2014); (Lu et al., 2010); Our Approach

Public Cloud
Virtualization: ✓ ✓
Security & Privacy: ✓ ✓ ✓ ✓ ✓
Data Location: ✓ ✓

Digital Forensic
Data Preservation: ✓ ✓ ✓ ✓
Identification: ✓ ✓ ✓ ✓ ✓
Collection: ✓ ✓ ✓ ✓
Scenario Reconstruction: ✓ ✓ ✓ ✓ ✓
Presentation: ✓

Provenance
Data Coupling: ✓ ✓ ✓
Cross-Host Tracking: ✓
Application Independence: ✓ ✓
Multilayer Granularity: ✓ ✓ ✓
Data Integrity: ✓ ✓ ✓ ✓ ✓
Confidentiality: ✓ ✓ ✓ ✓
Query-able: ✓ ✓ ✓
Persistence: ✓ ✓
Causal Ordering: ✓ ✓ ✓

5. CONCLUSION

Maintaining the provenance information of data is as important as the data itself. In modern computing, it is an integral part of data that ensures its authenticity and integrity. In this paper, we have identified the challenges of cloud architecture, discussed how they affect existing forensic analysis and provenance techniques, and discussed a model for efficient provenance collection and forensic analysis. We have also briefly highlighted the necessity of provenance data visualization and explained why and how it can be helpful for modern forensic analysis.

REFERENCES

Abbadi, I. M., Lyle, J., et al. (2011). Challenges for provenance in cloud computing. In Tapp.

Alqahtany, S., Clarke, N., Furnell, S., & Reich, C. (2015). Cloud forensics: a review of challenges, solutions and open problems. In Cloud computing (iccc), 2015 international conference on (pp. 1–9).

Amazon Web Services. (2008, September). Building greptheweb in the cloud, part 1: Cloud architectures. (http://developer.amazonwebservices.com/connect/ entry.jspa?externalID=1632 [September 30 2016])

Bohn, R. B., Messina, J., Liu, F., Tong, J., & Mao, J. (2011). Nist cloud computing reference architecture. In Services (services), 2011 ieee world congress on (pp. 594–596).

Borthakur, D., et al. (2008). Hdfs architecture guide. Hadoop Apache Project, 53.

Buneman, P., Khanna, S., & Wang-Chiew, T. (2001). Why and where: A characterization of data provenance. In International conference on database theory (pp. 316–330).

Carrier, B., Spafford, E. H., et al. (2003). Getting physical with the digital investigation process. International Journal of digital evidence, 2(2), 1–20.

Chung, M., & Hermans, J. (2010). From hype to future: Kpmg’s 2010 cloud computing survey. KPMG, 8–28.

Cloud Security Alliance. (2011, September). Cloud controls matrix v1.2 - cloud controls matrix working group. (https://cloudsecurityalliance.org/download/cloud-controls-matrix-v1-2/)

Ghemawat, S., Gobioff, H., & Leung, S.-T. (2003). The google file system (Vol. 37) (No. 5). ACM.

Hashizume, K., Rosado, D. G., Fernandez-Medina, E., & Fernandez, E. B. (2013). An analysis of security issues for cloud computing. Journal of internet services and applications, 4(1), 5.

Imran, M., & Hlavacs, H. (2013). Layering of the provenance data for cloud computing. In International conference on grid and pervasive computing (pp. 48–58).

Jansen, W., & Grance, T. (2011). Sp 800-144. guidelines on security and privacy in public cloud computing.

Kandukuri, B. R., Rakshit, A., et al. (2009). Cloud security issues. In Services computing, 2009. scc’09. ieee international conference on (pp. 517–520).

Katilu, V. M., Franqueira, V. N., & Angelopoulou, O. (2015). Challenges of data provenance for cloud forensic investigations. In Availability, reliability and security (ares), 2015 10th international conference on (pp. 312–317).

Kent, K., Chevalier, S., Grance, T., & Dang, H. (2006). Guide to integrating forensic techniques into incident response. NIST Special Publication, 10, 800–86.

Ko, R. K., Jagadpramana, P., & Lee, B. S. (2011). Flogger: A file-centric logger for monitoring file access and transfers within cloud computing environments. In Trust, security and privacy in computing and communications (trustcom), 2011 ieee 10th international conference on (pp. 765–771).

Li, J., Chen, X., Huang, Q., & Wong, D. S. (2014). Digital provenance: Enabling secure data forensics in cloud computing. Future Generation Computer Systems, 37, 259–266.

Lu, R., Lin, X., Liang, X., & Shen, X. S. (2010). Secure provenance: the essential of bread and butter of data forensics in cloud computing. In Proceedings of the 5th acm symposium on information, computer and communications security (pp. 282–292).

Market Research Media. (2016, September). Global cloud computing market forecast 2015-2020. (https://www.marketresearchmedia.com/?p=839)

Marty, R. (2011). Cloud application logging for forensics. In Proceedings of the 2011 acm symposium on applied computing (pp. 178–184).

Mell, P., Grance, T., et al. (2011). The nist definition of cloud computing.

Moreau, L., Clifford, B., Freire, J., Futrelle, J., Gil, Y., Groth, P., . . . others (2011). The open provenance model core specification (v1.1). Future generation computer systems, 27(6), 743–756.

Muniswamy-Reddy, K.-K. (2006). Deciding how to store provenance.

Muniswamy-Reddy, K.-K., Holland, D. A., Braun, U., & Seltzer, M. I. (2006). Provenance-aware storage systems. In Usenix annual technical conference, general track (pp. 43–56).

Muniswamy-Reddy, K.-K., Macko, P., & Seltzer, M. I. (2010). Provenance for the cloud. In Fast (Vol. 10, pp. 15–14).

O’shaughnessy, S., & Keane, A. (2013). Impact of cloud computing on digital forensic investigations. In Ifip international conference on digital forensics (pp. 291–303).

Ren, K., Wang, C., & Wang, Q. (2012). Security challenges for the public cloud. IEEE Internet Computing, 16(1), 69–73.

Ruan, K., & Carthy, J. (2012). Cloud computing reference architecture and its forensic implications: A preliminary analysis. In International conference on digital forensics and cyber crime (pp. 1–21).

Sakka, M. A., Defude, B., & Tellez, J. (2010). Document provenance in the cloud: constraints and challenges. In Meeting of the european network of universities and companies in information and communication engineering (pp. 107–117).

Sar, C., & Cao, P. (2005). Lineage file system. Online at http://crypto.stanford.edu/ cao/lineage. html, 411–414.

Simmhan, Y. L., Plale, B., & Gannon, D. (2005). A survey of data provenance in e-science. ACM Sigmod Record, 34(3), 31–36.

Suen, C. H., Ko, R. K., Tan, Y. S., Jagadpramana, P., & Lee, B. S. (2013). S2logger: End-to-end data tracking mechanism for cloud data provenance. In Trust, security and privacy in computing and communications (trustcom), 2013 12th ieee international conference on (pp. 594–602).

Sultana, S., & Bertino, E. (2013). A file provenance system. In Proceedings of the third acm conference on data and application security and privacy (pp. 153–156).

Taha, M. M. B., Chaisiri, S., & Ko, R. K. (2015). Trusted tamper-evident data provenance. In Trustcom/bigdatase/ispa, 2015 ieee (Vol. 1, pp. 646–653).

Tan, W. C. (2004). Research problems in data provenance. IEEE Data Eng. Bull., 27(4), 45–52.

Tan, Y. S., Ko, R. K., & Holmes, G. (2013). Security and data accountability in distributed systems: A provenance survey. In High performance computing and communications & 2013 ieee international conference on embedded and ubiquitous computing (hpcc euc), 2013 ieee 10th international conference on (pp. 1571–1578).

Trenwith, P., & Venter, H. (2015). Locating and tracking digital objects in the cloud. In Ifip international conference on digital forensics (pp. 287–301).

Trenwith, P. M., & Venter, H. S. (2014). A digital forensic model for providing better data provenance in the cloud. In Information security for south africa (issa), 2014 (pp. 1–6).

Voorsluys, W., Broberg, J., & Buyya, R. (2011). Introduction to cloud computing. Cloud computing: Principles and paradigms, 1–41.

Wells, D., Greisen, E., & Harten, R. (1981). Fits-a flexible image transport system. Astronomy and Astrophysics Supplement Series, 44, 363.

Zawoad, S., Hasan, R., & Skjellum, A. (2015). Ocf: an open cloud forensics model for reliable digital forensics. In Cloud computing (cloud), 2015 ieee 8th international conference on (pp. 437–444).

Zhang, O. Q., Kirchberg, M., Ko, R. K., & Lee, B. S. (2011). How to track your data: The case for cloud computing provenance. In Cloud computing technology and science (cloudcom), 2011 ieee third international conference on (pp. 446–453).

Zhao, J., Bizer, C., Gil, Y., Missier, P., & Sahoo, S. (2010). Provenance requirements for the next version of rdf. In W3c workshop rdf next steps.


Publication Information

The Journal of Digital Forensics, Security and Law (JDFSL) is a publication of the Association of Digital Forensics, Security and Law (ADFSL). The Journal is published on a non-profit basis. In the spirit of the JDFSL mission, individual subscriptions are discounted. However, we do encourage you to recommend the journal to your library for wider dissemination.

The Journal is published in electronic form under the following ISSN:

ISSN: 1558-7223

The Journal was previously published in print form under the following ISSN:

ISSN: 1558-7215

The office of the Association of Digital Forensics, Security and Law (ADFSL) is located at the following address:

Association of Digital Forensics, Security and Law

4350 Candlewood Lane

Ponce Inlet, Florida 32127

Tel: 804-402-9239

Fax: 804-680-3038

E-mail: [email protected]

Website: http://www.adfsl.org
