Dr. Bhavani Thuraisingham The University of Texas at Dallas (UTD) October 2010 Secure Cloud Computing and Cloud Forensics


Page 1

Dr. Bhavani Thuraisingham
The University of Texas at Dallas (UTD)

October 2010

Secure Cloud Computing and Cloud Forensics

Page 2

Cloud Computing: NIST Definition
• Cloud computing is a pay-per-use model for enabling available, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and is composed of five key characteristics, three delivery models, and four deployment models.

• Key Characteristics: On-demand self-service, Location-independent resource pooling, Rapid elasticity, Pay per use

• Delivery Models: Cloud Software as a Service (SaaS), Cloud Platform as a Service (PaaS), Cloud Infrastructure as a Service (IaaS)

• Deployment Models: Private cloud, Community cloud, Public cloud, Hybrid cloud

• Our goal is to demonstrate policy-based assured information sharing on clouds

Page 3

Security Challenges for Clouds
• Policy
  – Access Control and Accountability
• Data Security and Privacy Issues
  – Third-party publication of data; security challenges associated with data outsourcing
  – Data at the different sites have to be protected, with the end results being made available; querying encrypted data
  – Secure query processing/updates in the cloud
• Secure Storage
• Security Related to Virtualization
• Cloud Monitoring
• Protocol and Network Security for Clouds
• Identity Management
• Cloud Forensics
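The slides list querying encrypted data as a challenge without saying how it is done. One standard technique is equality search via keyed tokens: the server stores a deterministic HMAC of each value rather than the plaintext, so it can match equality predicates without decrypting. A minimal sketch, with a made-up key and records (a real deployment would also store AES ciphertext and use a vetted searchable-encryption scheme):

```python
import hashlib
import hmac
from collections import defaultdict

KEY = b"tenant-secret-key"  # hypothetical per-tenant key, kept client-side

def search_token(value: str) -> str:
    """Deterministic keyed token: equal plaintexts yield equal tokens."""
    return hmac.new(KEY, value.encode(), hashlib.sha256).hexdigest()

# Client-side indexing: only tokens and record ids reach the server.
server_index = defaultdict(list)
for rec_id, value in [(1, "alice"), (2, "bob"), (3, "alice")]:
    server_index[search_token(value)].append(rec_id)

def server_lookup(token: str) -> list:
    """Server-side equality match over tokens only; no plaintext needed."""
    return server_index.get(token, [])

# The client queries for "alice" by sending only its token.
print(server_lookup(search_token("alice")))  # -> [1, 3]
```

The trade-off is that deterministic tokens leak equality patterns; richer schemes pay more to hide them.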

Page 4

Layered Framework

[Figure 2. Layered Framework for Assured Cloud — layers: Application (Law Enforcement); Hadoop/MapReduce/Storage; HIVE/SPARQL/Query; XEN/Linux/VMM; Secure Virtual Network Monitor. Cross-cutting: Policies (XACML); Risks/Costs; QoS/Resource Allocation; Cloud Monitors.]

Approach: Study the problem with current principles and technologies and then develop principles for secure cloud computing

Page 5

Secure Query Processing with Hadoop/MapReduce

• We have studied clouds based on Hadoop
• Query rewriting and optimization principles defined and implemented for two types of data:
  (i) Relational data: secure query processing with HIVE
  (ii) RDF data: secure query processing with SPARQL
• Demonstrated with XACML policies (content, temporal, association)
• Joint demonstration with King's College and the University of Insubria
  – First demo (2010): each party submits their data and policies; our cloud will manage the data and policies
  – Second demo (2011): multiple clouds

Page 6

Principles of Secure Query Optimization

• Query optimization principles defined and strategies implemented in the 1970s and 1980s for relational data (IBM System R/DB2 and Ingres)
  – Query rewriting, query evaluation procedures, search strategy, cost functions
• Secure query optimization principles defined and strategies implemented in the 1980s and 1990s (Honeywell, MITRE)
• Extended secure query optimization for the cloud environment
  – Query optimization for RDF data
  – Secure query optimization for RDF data
  – Secure query optimization for RDF data in a cloud environment
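The cost-function idea from the System R era can be sketched as enumerate-and-price: generate candidate join orders, estimate each with a cost model, and keep the cheapest. The table names, row counts, and join selectivity below are invented for illustration, and real optimizers use far richer statistics:

```python
from itertools import permutations

# Hypothetical table statistics: rows per table.
STATS = {"employees": 10_000, "departments": 50, "projects": 2_000}

def plan_cost(order):
    """Crude cost model for a left-deep join plan: sum of intermediate
    result sizes, with a fixed assumed join selectivity of 1/10,000."""
    cost, rows = 0, STATS[order[0]]
    for table in order[1:]:
        rows = rows * STATS[table] // 10_000
        cost += rows
    return cost

def best_plan(tables):
    """Exhaustively enumerate join orders and pick the cheapest."""
    return min(permutations(tables), key=plan_cost)

order = best_plan(["employees", "departments", "projects"])
print(order, plan_cost(order))  # smallest tables joined first win here
```

A secure query optimizer adds a step before this: policy predicates are folded into the query, and only then are plans priced, so the chosen plan enforces the policy at the cheapest cost.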

Page 7

Fine-grained Access Control with Hive

Hive is a data warehouse infrastructure built on top of Hadoop that provides tools for easy data summarization, ad hoc querying, and analysis of large datasets stored in Hadoop files. It provides a mechanism to put structure on this data, along with a simple query language called HiveQL, which is based on SQL and enables users familiar with SQL to query the data.

Policies include content-dependent access control, association-based access control, and time-dependent access control.

Table/view definition and loading: users can create tables and load data into them, and can upload XACML policies for the tables they are creating. Users can define views only if they have permissions on all tables referenced in the query used to create the view, and they can likewise specify or create XACML policies for the views they are defining.
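Content-dependent access control of this kind is often realized as query rewriting: before a HiveQL query runs, a filter predicate derived from the user's policy is pushed into the WHERE clause. A toy sketch with made-up users, tables, and predicates (not the actual UTD implementation, and a stand-in for real XACML evaluation):

```python
# Hypothetical policy store: (user, table) -> predicate the user may see.
# None means the user is unrestricted on that table.
POLICIES = {
    ("analyst", "incidents"): "region = 'TX'",
    ("admin", "incidents"): None,
}

def rewrite(user: str, table: str, query: str) -> str:
    """Push the user's policy predicate into the query's WHERE clause."""
    if (user, table) not in POLICIES:
        raise PermissionError(f"{user} may not read {table}")
    pred = POLICIES[(user, table)]
    if pred is None:
        return query  # unrestricted
    glue = " AND " if " WHERE " in query.upper() else " WHERE "
    return query + glue + pred

q = "SELECT id FROM incidents"
print(rewrite("analyst", "incidents", q))
# -> SELECT id FROM incidents WHERE region = 'TX'
```

Because the rewritten query is ordinary HiveQL, enforcement rides on Hive's existing execution path rather than a separate filter pass.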

Page 8

Fine-grained Access Control with Hive: System Architecture (figure)

Page 9

SPARQL Query Optimizer for Secure RDF Data Processing

• Developed a secure query optimizer and query rewriter for RDF data with XACML policies and implemented it on top of JENA
• Storage support
  – Built a storage mechanism for very large RDF graphs for JENA
  – Integrated the system with Hadoop for the storage of large amounts of RDF data (e.g., a billion triples)
  – Need to incorporate the secure storage strategies developed in FY09
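Policy-filtered RDF querying can be illustrated with a toy in-memory triple store: SPARQL-style triple patterns (None as wildcard) are answered only over triples whose predicate the requesting role is cleared to read. The triples, roles, and visibility rules are invented, and this is not the JENA/Hadoop system described above:

```python
# Tiny RDF graph as (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "worksFor", "utd"),
    ("bob", "worksFor", "utd"),
    ("alice", "salary", "90000"),
]

# Hypothetical policy: the predicates each role may read.
VISIBLE = {"public": {"worksFor"}, "hr": {"worksFor", "salary"}}

def query(role, s=None, p=None, o=None):
    """Match a triple pattern (None = wildcard), filtered by role policy."""
    allowed = VISIBLE.get(role, set())
    return [(ts, tp, to) for ts, tp, to in TRIPLES
            if tp in allowed
            and s in (None, ts) and p in (None, tp) and o in (None, to)]

print(query("public", s="alice"))  # the salary triple is filtered out
print(query("hr", s="alice"))      # hr sees both triples about alice
```

In the real system the same filtering happens at SPARQL rewrite time, so disallowed triples are never even scanned.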

Page 10

System Architecture

[Figure: Web Interface front end; Data Preprocessor (N-Triples Converter, Prefix Generator, Predicate Based Splitter, Predicate Object Based Splitter); MapReduce Framework (Parser, Query Validator & Rewriter, XACML PDP, Query Rewriter By Policy, Plan Generator, Plan Executor); Server Backend. Inputs: new data and queries; output: query answers.]

Page 11

Security for Amazon S3
• Many organizations are using cloud services like Amazon S3 for data storage. A few important questions arise here:
  – Can we use S3 to store the data sources used by BLACKBOOK? Is the data we store on S3 secure? Is it accessible by any user outside our organization? How do we restrict access to files to the users within the organization?
  – BLACKBOOK is a semantic-web-based tool used by analysts within the Intelligence Community. The tool federates queries across data sources. These data sources are databases or applications located either locally or remotely on the network. BLACKBOOK allows analysts to make logical inferences across the data sources, add their own knowledge, and share that knowledge with other analysts using the system.

• We use Amazon S3 to store the data sources used by BLACKBOOK.

• To keep our data secure, we encrypt the data using AES (Advanced Encryption Standard) before uploading the data files to Amazon S3.

• To restrict access to the files to users within the organization, we implemented RBAC policies using XACML.
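The RBAC restriction might be sketched as users mapping to roles, roles mapping to permitted S3 key prefixes, and everything else denied by default. The users, roles, and prefixes below are hypothetical, and real enforcement would go through an XACML PDP rather than a dictionary lookup:

```python
# Hypothetical user -> role and role -> permitted-key-prefix assignments.
ROLE_OF = {"alice": "analyst", "bob": "auditor"}
GRANTS = {
    "analyst": {"datasources/"},
    "auditor": {"datasources/", "logs/"},
}

def may_read(user: str, key: str) -> bool:
    """Permit only if some prefix granted to the user's role matches;
    unknown users and unmatched keys are denied by default."""
    prefixes = GRANTS.get(ROLE_OF.get(user, ""), set())
    return any(key.startswith(p) for p in prefixes)

print(may_read("alice", "datasources/orgs.rdf"))  # True
print(may_read("alice", "logs/access.log"))       # False: not granted
print(may_read("eve", "datasources/orgs.rdf"))    # False: unknown user
```

Default deny is the important property: adding a user or bucket grants nothing until a role assignment says otherwise.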

Page 12

XACML Design Implementation in Hadoop

• Until July 2010, little security in Hadoop
• We have designed XACML for Hadoop
• Use of the in-line reference monitor concept is being explored
• Examining current Hadoop security (released July 2010); will complete the XACML implementation by December 2010
• Also examining accountability for Hadoop (with Purdue)

Page 13

Secure VMM: Xen Architecture

Xen Hypervisor: the hypervisor runs just on top of the hardware and traps all calls by VMs to access the hardware.

Domain 0 (Dom0): Domain 0 is a modified version of Linux that is used to manage the other VMs.

Domain U (DomU): Domain U is the user domain in Xen. DomU is where all of the untrusted guest OSs reside.

Page 14

Virtual Machines

DomU is broken into two parts: Para-Virtualized Domains (PV) and Hardware-Assisted Virtualized Domains (HVM).

Para-Virtualized Domain (PV): a para-virtualized domain runs a modified operating system that is aware that it is a virtual machine. It can achieve near-native performance.

Hardware-Assisted Virtualized Machine Domain (HVM): HVMs are VMs that run operating systems that have not been modified to work with Dom0. This allows closed-source operating systems like Windows.

Memory: PVs are given read-only access to their page tables, and any updates are controlled by the hypervisor. HVMs are given a shadow page table because they do not know how to work with non-contiguous physical address spaces.

I/O Management: I/O management is controlled by Dom0. PVs share memory with Dom0 through which they can pass messages. Dom0 runs the QEMU daemon to emulate devices for the HVMs.
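The shadow page table's job, translating the guest's contiguous view of physical memory onto the hypervisor's non-contiguous machine frames, can be illustrated conceptually. The frame numbers here are made up, and real shadow paging mirrors multi-level page tables in hardware-specific formats:

```python
PAGE = 4096  # 4 KiB pages, the common case

# Hypervisor-maintained map: guest-physical frame -> machine frame.
# Note the machine frames are non-contiguous even though the guest
# believes it owns frames 0, 1, 2 of a contiguous physical space.
shadow = {0: 7, 1: 42, 2: 3}

def translate(guest_addr: int) -> int:
    """Resolve a guest-physical address to a machine address, faulting
    (as the hypervisor would) on unmapped frames."""
    gfn, offset = divmod(guest_addr, PAGE)
    if gfn not in shadow:
        raise RuntimeError(f"page fault at guest address {guest_addr:#x}")
    return shadow[gfn] * PAGE + offset

print(hex(translate(0x1010)))  # guest frame 1 maps to machine frame 42
```

A PV guest skips this layer by manipulating real page tables directly (read-only, with hypercalls for updates), which is where its performance advantage over shadow paging comes from.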

Page 15

Security Issues

Access Control: at the moment access control is discretionary. Fine-grained multilevel controls are needed (Integrity Lock architecture).

Secure Boot: the boot process needs to be secured. Proper attestation methods need to be developed.

Component Isolation: Dom0 supports networking, disk I/O, VM boot loading, hardware emulation, workload balancing, etc. Dom0 needs to be decomposed into components.

Logging: more robust logging is needed to develop a clear view of the chain of events.

Introspection: introspection is a security technique in which a virtual machine running security software is allowed to look inside the memory of another VM. Software such as IPSs and antiviruses using introspection should be safe from tampering if the monitored VM is exploited.

Page 16

Overall Architecture of Accountable Grid Systems (Purdue)

• Accountability agents
• Strategies for accountability data collection
• Exchange of information among accountability agents to generate alarms

Page 17

Data Collection Approaches

Job-flow based approach
  – Jobs flow across different organizational units
  – Long computations are often divided into many sub-jobs to be run in parallel
  – A possible approach is to employ point-to-point agents which collect data at each node that the job traverses

Grid-node based approach
  – Focuses on a given location in the flow, at a given instant of time, for all jobs
  – The viewpoint is fixed

The combination of the two approaches allows us to collect complementary information

Page 18

Detection at the victim node (e.g., gatekeeper, head node)

• From the data obtained with the grid-node based strategy, the agent detects anomalies in resource consumption using methodologies such as statistical modeling and entropy-based approaches

• However, such approaches are often not accurate and result in a high rate of false detections

• By using data on a job's flow collected with the job-flow based strategy, agents cooperate through three alarm levels (light, moderate, and critical) to further detect attacks

• Upon receiving a critical alarm, the agent takes proper actions such as increasing the priority of jobs identified as legal, or killing malicious jobs, including jobs that may potentially perform bad operations
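The entropy-based detection with three alarm levels might look like the following sketch: a node whose resource usage is concentrated on very few job types has low entropy, which raises the alarm level. The thresholds and counts are invented for illustration and are not from the Purdue work:

```python
import math

def entropy(counts):
    """Shannon entropy (bits) of a usage distribution given as counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

def alarm(counts):
    """Map entropy to the three alarm levels; lower entropy means usage
    is dominated by few jobs, which we treat as more suspicious."""
    h = entropy(counts)
    if h < 0.5:
        return "critical"
    if h < 1.0:
        return "moderate"
    return "light"

print(alarm([25, 25, 25, 25]))  # balanced usage across 4 job types: light
print(alarm([97, 1, 1, 1]))     # one job dominating the node: critical
```

This is exactly the kind of signal the slide flags as error-prone on its own, which is why the corroborating job-flow data is needed before acting on a critical alarm.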

Page 19

Cloud Forensics (Keyun Ruan, University College Dublin)

• Forensic Readiness
  – General forensic readiness; synchronization of data; location anonymity; identity management; encryption and key management; log format
• Challenges Unique to the Cloud
  – Multi-tenancy and resource sharing; multiple jurisdictions; electronic discovery
• Challenges Exacerbated by the Cloud
  – The velocity-of-attack factor; malicious insiders; data deletion; hypervisor-level investigation; proliferation of endpoints
• Opportunities with the Cloud
  – Cost-effectiveness; robustness; scalability and flexibility; forensics as a cloud service; standards and policies

Page 20

Current and Future Research
• Secure VMM (Virtual Machine Monitor)
  – Exploring the XEN VMM and examining security issues
• Demonstration using the secure cloud with North Central Texas Fusion System data (with ADB Consulting)
• Coalition demonstration (with London and Italy)
• Integrate secure storage algorithms into the storage system developed (2011)
• Identity management (2011, with Purdue)
• Secure virtual network monitor (future, 2012)
• Cloud forensics

Page 21

Education Program
• We offer a course in cloud computing (industry adjunct professor; Spring 2009)
• A course planned for Spring 2012 that incorporates the research results (Building and Securing the Cloud)
  – Topics covered will include technologies, techniques, tools, and trends