
MediaPlatform Application Security and Continuity Plan

MediaPlatform, Inc.
8383 Wilshire Boulevard, Suite 750
Los Angeles, CA 90211
(310) 909-8410
www.mediaplatform.com

 

Introduction
MediaPlatform System Security

General Security
Instance Isolation
Data at Rest Security
Network Security
Facility

System Configuration Management
Security Policy and Standards for the Technical Team

Purpose
Scope
Policy
General Configuration Requirements
Security Incident Response Protocol
Enforcement

Software Development Methodology
Client-Specific Customization

Change Management
Procedures

Identify the change or potential change
Change request analyzed
Change request evaluated for go/no-go decision
Change planning
Change implementation
Release change
Client verification/change closed
Critical patch releases ("hot-fixes")

Redundancy and Disaster Recovery
Overview
Redundancy and Failover

Physical Server Redundancy
Application Redundancy

Backups
Local File System
Shared File System
Database

Monitoring

Disaster Recovery
Change Management
Quarterly Audits
Customer Communication
Maintenance Windows
"Minor" System Failures
"Major" System Failures

Sample Performance Reports
Uptime

AWS
Server

Load and Performance
Metrics for EC2
Metrics for EBS
Metrics for ELB
Metrics for RDS
Sample Reports

 

Introduction

MediaPlatform is a company with a distinguished track record of excellence in engineering and service delivery. When you join the MediaPlatform technical team as a developer, architect, or system administrator, you are assuming responsibility for the valuable technology assets we create and the customers we serve. This System Policy Guide is intended to give you a high-level view of how we manage our systems, code base, hosting environments, and global content distribution network (CDN) partnerships. It is not the final word on all matters relating to system and code management, but it should provide a framework for our regular work. As of version 2.2 of this guide, these managers own the following domains:

Software development code issues: Brett Law, VP of Application Development
Architectural issues and cloud platform management: Greg Pulier, President
JIRA priority, release schedule, program management, agile methodology: Denis Khoo, CTO
Product management issues: Jared Hawkins, Product Manager

 

MediaPlatform System Security

MediaPlatform's production systems support video and webcasting needs, including mission-critical executive communications needs, of some of the largest corporations in the world. The following section describes our approach to security and how we handle the main security issues that arise in our business.

MediaPlatform hosts its applications at Amazon Web Services (AWS). AWS provides a highly reliable, scalable, low-cost infrastructure platform in the cloud that powers hundreds of thousands of businesses in 190 countries around the world. Companies such as Ericsson, the Guardian newspapers, Newsweek, Reddit, and Hitachi Systems rely on AWS for critical hosting needs. AWS has passed a SAS 70 audit.

General Security

Each of our application environments exists behind a separate firewall. Access to any of these machines is controlled by highly encrypted keys that are only available to high-level system administrators. In addition to server-level firewalls, our platform relies on an additional firewall that protects the network as a whole. This firewall is under constant monitoring for a wide variety of potential security threats.

Security within Amazon EC2 is provided on multiple levels: the operating system (OS) of the host system, the virtual instance operating system or guest OS, a firewall, and signed API calls. Each of these items builds on the capabilities of the others. The goal is to ensure that data contained within Amazon EC2 cannot be intercepted by non-authorized systems or users and that Amazon EC2 instances themselves are as secure as possible without sacrificing the flexibility in configuration that customers demand. Further details are provided below:

Host Operating System: AWS administrators with a business need are required to use their individual cryptographically strong SSH keys to gain access to a bastion host. These bastion hosts are specifically built systems that are designed and configured to protect the management plane of the cloud. Once connected to the bastion, authorized administrators are able to use a privilege escalation command to gain access to an individual host. All such accesses are logged and routinely audited. When an AWS employee no longer has a business need to administer EC2 hosts, their privileges on and access to the bastion hosts are revoked.

Guest Operating System: Virtual instances are completely controlled by the customer. They have full root access and all administrative control over additional accounts, services, and applications. AWS administrators do not have access to customer instances, and cannot log into the guest OS. Customers should disable password-based access to their hosts and utilize token or key-based authentication to gain access to unprivileged accounts. Further, customers should employ a privilege escalation mechanism with logging on a per-user basis. For example, if the guest OS is Linux, utilize SSH with keys to access the virtual instance, enable shell command-line logging, and use the 'sudo' utility for privilege escalation. Customers should generate their own key pairs in order to guarantee that they are unique, and not shared with other customers or with AWS.

Firewall: Amazon EC2 provides a complete firewall solution; this mandatory inbound firewall is configured in a default deny mode and the Amazon EC2 customer must explicitly open any ports to allow inbound traffic. The traffic may be restricted by protocol, by service port, as well as by source IP address (individual IP or CIDR block).

The firewall can be configured in groups, permitting different classes of instances to have different rules. Consider, for example, the case of a traditional three-tiered web application. The group for the web servers would have port 80 (HTTP) and port 443 (HTTPS) open to the world. The group for the application servers would have port 8000 (application specific) accessible only to the web server group. The group for the database servers would have port 3306 (MySQL) open only to the application server group. All three groups would permit administrative access on port 22 (SSH), but only from the customer's corporate network. Highly secure applications can be deployed using this expressive mechanism.

The firewall is controlled not by the host/instance itself, but requires the customer's X.509 certificate and key to authorize changes, thus adding an extra layer of security. Within EC2, the host administrator and cloud administrator can be separate people, permitting two-man rule security policies to be enforced. In addition, iptables can be used on each instance for more granular controls. This can restrict both inbound and outbound traffic on each instance.

The level of security afforded by the firewall is a function of which ports are opened by the customer, and for what duration and purpose. The default state is to deny all incoming traffic, and developers should plan carefully what they will open when building and securing their applications. Well-informed traffic management and security design is still required on a per-instance basis.

API: Calls to launch and terminate instances, change firewall parameters, and perform other functions are all signed by an X.509 certificate or the customer's Amazon Secret Access Key. Without access to the customer's Secret Access Key or X.509 certificate, Amazon EC2 API calls cannot be made on their behalf. In addition, API calls can be encrypted in transit with SSL to maintain confidentiality. Amazon recommends always using SSL-protected API endpoints.
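The idea of a signed API call can be illustrated with a minimal sketch. This is not Amazon's actual signing scheme: the canonical string format and the function names `sign_request` and `verify_request` are assumptions made purely for illustration. It only shows why a caller without the secret key cannot produce a valid request, and why tampering with a signed request is detectable.

```python
import hashlib
import hmac


def sign_request(secret_key: bytes, action: str, params: dict) -> str:
    """Return a hex HMAC-SHA256 signature over the action and its parameters.

    The canonical form (action, newline, sorted key=value pairs) is a
    hypothetical format chosen for this sketch, not the AWS format.
    """
    canonical = action + "\n" + "&".join(
        f"{name}={params[name]}" for name in sorted(params)
    )
    return hmac.new(secret_key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_request(secret_key: bytes, action: str, params: dict, signature: str) -> bool:
    """Recompute the signature on the receiving side and compare in constant time."""
    expected = sign_request(secret_key, action, params)
    return hmac.compare_digest(expected, signature)


# Illustrative values only; the key and instance ID are made up.
key = b"example-secret-access-key"
request = {"InstanceId": "i-0123456789"}
signature = sign_request(key, "TerminateInstances", request)
```

A request whose parameters are altered after signing, or one signed with a different key, fails `verify_request`, which is the property the text relies on.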

Instance Isolation

Amazon EC2 currently utilizes a highly customized version of the Xen Hypervisor, taking advantage of paravirtualization. Because paravirtualized guests rely on the Hypervisor to provide support for operations that normally require privileged access, it is possible to run the guest OS with no elevated access to the CPU. This explicit virtualization of the physical resources leads to a clear separation between guest and hypervisor, resulting in strong security separation between the two.

Different instances running on the same physical machine are isolated from each other utilizing the Xen Hypervisor. Amazon is an active participant and contributor within the Xen community, which ensures awareness of potential pending issues. In addition, the aforementioned firewall resides within the hypervisor layer, between the physical interface and the instance's virtual interface. All packets must pass through this layer, thus an instance's neighbors have no additional access to that instance, and can be treated as if they are on separate physical hosts. The physical RAM is separated using similar mechanisms.

Customer instances have no access to raw disk devices, but instead are presented with virtualized disks. The AWS proprietary disk virtualization layer automatically wipes every block of storage used by the customer, and guarantees that one customer's data is never exposed to another.

Data at Rest Security

MediaPlatform utilizes Amazon's S3 storage service as its primary file share and storage for content. S3 is highly scalable, reliable, secure, and fast. To ensure scalability and reliability, S3 automatically manages data replication across multiple data centers. Amazon S3 supports server-side encryption for data at rest. Amazon S3's Server Side Encryption handles the encryption, decryption, and key management. Keys are stored in hosts that are separate and distinct from those used to store the data. Data is encrypted using 256-bit AES encryption, also known as AES-256, one of the strongest block ciphers available. According to Amazon, "the entire encryption, key management, and decryption process is inspected and verified internally on a regular basis as part of [their] existing auditing process."

Network Security

The AWS network provides significant protection against traditional network security issues and the customer can implement further protection. The following are a few examples:

Distributed Denial Of Service (DDoS) Attacks: AWS API endpoints are hosted on the same Internet-scale, world-class infrastructure that supports the Amazon.com retail site. Standard DDoS mitigation techniques such as SYN cookies and connection limiting are used. To further mitigate the effect of potential DDoS attacks, Amazon maintains internal bandwidth which exceeds its provider-supplied Internet bandwidth.

Man In the Middle (MITM) Attacks: All of the AWS APIs are available via SSL-protected endpoints which provide server authentication. Amazon EC2 AMIs automatically generate new SSH host keys on first boot and log them to the console. Customers can then use the secure APIs to call the console and access the host keys before logging into the instance for the first time. Customers are encouraged to use the SSL endpoints for all of their interactions with AWS.

IP Spoofing: Amazon EC2 instances cannot send spoofed traffic. The Amazon-controlled, host-based firewall infrastructure will not permit an instance to send traffic with a source IP or MAC address other than its own.

Port Scanning: Port scans by Amazon EC2 customers are a violation of the Amazon EC2 Acceptable Use Policy (AUP). Violations of the AUP are taken seriously, and every reported violation is investigated. When port scanning is detected it is stopped and blocked. Port scans of Amazon EC2 instances are generally ineffective because, by default, all inbound ports on Amazon EC2 instances are closed.

The customer's strict management of security groups can further mitigate the threat of port scans. If the customer configures the security group to allow traffic from any source to a specific port, then that specific port will be vulnerable to a port scan. In these cases, the customer must use appropriate security measures to protect listening services that may be essential to their application from being discovered by an unauthorized port scan. For example, a web server must clearly have port 80 (HTTP) open to the world, and the administrator of this server is responsible for ensuring the security of the HTTP server software, such as Apache.
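As a rough illustration of auditing security groups for scan exposure, the sketch below models the three-tier example described under General Security and lists which ports each group opens to any source. The `Rule` class, the `GROUPS` table, and the corporate address range are all hypothetical; this is a plain data model, not a call into any AWS API.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    port: int
    source: str  # a CIDR block, or the name of another group in GROUPS


# Hypothetical corporate range, taken from the reserved documentation block.
CORPORATE_CIDR = "203.0.113.0/24"

# Default deny: a port is reachable only if a rule lists it explicitly.
GROUPS = {
    "web": [Rule(80, "0.0.0.0/0"), Rule(443, "0.0.0.0/0"), Rule(22, CORPORATE_CIDR)],
    "app": [Rule(8000, "web"), Rule(22, CORPORATE_CIDR)],
    "db": [Rule(3306, "app"), Rule(22, CORPORATE_CIDR)],
}


def is_open_to_world(rule: Rule) -> bool:
    """True when the rule admits traffic from any source address."""
    if rule.source in GROUPS:
        return False  # source is another group, not an address range
    return ipaddress.ip_network(rule.source).prefixlen == 0


def world_open_ports(groups: dict) -> dict:
    """Map each group name to the sorted ports it exposes to any source."""
    return {
        name: sorted(rule.port for rule in rules if is_open_to_world(rule))
        for name, rules in groups.items()
    }


exposure = world_open_ports(GROUPS)
```

In this model only the web tier exposes ports to arbitrary sources, so it is the only tier whose listening services need hardening against discovery by a port scan.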


Packet sniffing by other tenants: It is not possible for a virtual instance running in promiscuous mode to receive or "sniff" traffic that is intended for a different virtual instance. While customers can place their interfaces into promiscuous mode, the hypervisor will not deliver any traffic to them that is not addressed to them. This includes two virtual instances that are owned by the same customer, even if they are located on the same physical host. Attacks such as ARP cache poisoning do not work within EC2. While Amazon EC2 does provide ample protection against one customer inadvertently or maliciously attempting to view another's data, as a standard practice customers should encrypt sensitive traffic.

 

Facility

Our applications are hosted in the cloud across multiple secured data centers. Only the cloud provider (Amazon is our current provider of choice) has knowledge of the geographical location of the data centers and physical access.

Amazon has many years of experience in designing, constructing, and operating large-scale data centers. This experience has been applied to the AWS platform and infrastructure. AWS data centers are housed in nondescript facilities, and critical facilities have extensive setback and military-grade perimeter control berms (earthen barriers) as well as other natural boundary protection. Physical access is strictly controlled both at the perimeter and at building ingress points by professional security staff utilizing video surveillance, state-of-the-art intrusion detection systems, and other electronic means. Authorized staff must pass two-factor authentication no fewer than three times to access data center floors. All visitors and contractors are required to present identification and are signed in and continually escorted by authorized staff.

Amazon only provides data center access and information to employees who have a legitimate business need for such privileges. When an employee no longer has a business need for these privileges, his or her access is immediately revoked, even if they continue to be an employee of Amazon or Amazon Web Services. All physical and electronic access to data centers by Amazon employees is logged and audited routinely.

System Configuration Management

MediaPlatform strictly maintains all system configurations using a robust configuration management system. MediaPlatform's configuration management system is based on Puppet, a very popular open source tool. The configuration management system manages all settings and installations on MediaPlatform's servers, including installed software, software configuration, SSL certificates, etc. As a best practice, changes are never made directly on the server; they are controlled and deployed using MediaPlatform's configuration management system. Each server checks for updates every ten minutes and applies any changes as directed by MediaPlatform's configuration management system. As such, MediaPlatform has all server configurations centralized and maintained in an orderly fashion. This provides dramatically improved visibility and control, especially when dealing with a large number of individual servers as MediaPlatform does. This also allows MediaPlatform to instantiate new servers with very predictable configurations, and allows MediaPlatform to scale very quickly.
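The periodic check-and-apply cycle described above can be pictured with a small sketch. It is not how Puppet is implemented and does not use Puppet's own language; the function names and the setting keys are hypothetical. It only shows the core idea: compare the centrally declared state with the server's actual state, apply the difference, and do nothing when the two already match.

```python
def plan_changes(desired: dict, actual: dict) -> dict:
    """Return the settings that must change for `actual` to match `desired`."""
    return {key: value for key, value in desired.items() if actual.get(key) != value}


def converge(desired: dict, actual: dict) -> dict:
    """Apply the planned changes to `actual` in place and report what changed."""
    changes = plan_changes(desired, actual)
    actual.update(changes)
    return changes


# Hypothetical central declaration and a server that has drifted from it.
desired_state = {"nginx_version": "1.24", "ssl_cert": "cert-2024", "ntp_enabled": True}
server_state = {"nginx_version": "1.22", "ntp_enabled": True}

first_run = converge(desired_state, server_state)   # applies the two differences
second_run = converge(desired_state, server_state)  # nothing left to change
```

Because a run against an already compliant server changes nothing, repeating the check every ten minutes is safe, and a newly instantiated server reaches the same configuration as its peers after its first run.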

MediaPlatform generally ties system configuration changes to its software releases. This allows for system configuration changes to be tested in DEV and QA first before being deployed to PROD. In certain cases, it may not be feasible to tie a system configuration change to a software release schedule, and a system configuration update may be pushed to PROD outside of a release schedule.

Security Policy and Standards for the Technical Team

In addition to common sense security practices, and all HR department-issued policies regarding equipment use, confidentiality and conduct, to maintain a high level of information and infrastructure security, we require adherence to the following security standards and policies by members of the technical team:


Purpose

This policy establishes information security requirements for MediaPlatform Technical Teams to ensure that MediaPlatform confidential information and technologies are not compromised, and that production services and other MediaPlatform interests are protected from Technical Teams activities.

Scope

This policy applies to all Technical Teams, MediaPlatform employees and third parties who access MediaPlatform's infrastructure. All existing and future equipment that falls under the scope of this policy must be configured according to the referenced documents.

Policy

1. The CTO of MediaPlatform is the primary point of contact (POC) for all security matters. The VP of Application Development is a back-up POC for the Technical Teams.
2. Technical Teams are responsible for security and any impact on the corporate production network and any other networks. Team managers are responsible for adherence to this policy and associated processes. Where policies and procedures are undefined, Technical Teams managers must do their best to safeguard MediaPlatform from security vulnerabilities.
3. Team managers are responsible for the Technical Teams' compliance with all MediaPlatform security policies. The following are particularly important: Password Policy for networking devices and hosts, Wireless Security Policy, Anti-Virus Policy, and physical security.
4. The CTO is responsible for controlling Technical Teams access. Access to any given Technical Teams will only be granted by the CTO or designee, to those individuals with an immediate business need within the Technical Teams, either short-term or as defined by their ongoing job function. This includes continually monitoring the access list to ensure that those who no longer require access to the Technical Teams have their access terminated.
5. The System Admin must maintain a firewall device between the corporate production network and all Technical Teams equipment.
6. The System Admin and/or CTO reserve the right to interrupt Technical Teams connections that impact the corporate production network negatively or pose a security risk.
7. The System Admin must record all Technical Teams IP addresses, which are routed within MediaPlatform networks, in the Enterprise Address Management database along with current contact information for that Technical Teams.
8. Any Technical Teams that wants to add an external connection must provide a diagram and documentation to the CTO with business justification, the equipment, and the IP address space information. The CTO will review for security concerns and must approve before such connections are implemented.
9. All user passwords must comply with MediaPlatform's Password Policy. In addition, individual user accounts on any Technical Teams device must be deleted within three (3) days when no longer authorized. Group account passwords on Technical Teams computers (Linux, Mac, Windows, etc.) must be changed quarterly (once every 3 months). For any Technical Teams device that contains MediaPlatform proprietary information, group account passwords must be changed within three (3) days following a change in group membership.
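The deadlines in item 9 lend themselves to a simple automated check. The sketch below is one possible reading, not a tool the policy prescribes: the function names are hypothetical, and treating "quarterly" as 90 days is an assumption made for the example.

```python
from datetime import date, timedelta
from typing import Optional

ACCOUNT_REMOVAL_DAYS = 3      # deauthorized accounts: deleted within three days
MEMBERSHIP_CHANGE_DAYS = 3    # proprietary devices: rotate within three days
GROUP_ROTATION_DAYS = 90      # assumption: "once every 3 months" taken as 90 days


def account_removal_overdue(deauthorized_on: date, today: date) -> bool:
    """True once a deauthorized account has outlived the three-day window."""
    return today > deauthorized_on + timedelta(days=ACCOUNT_REMOVAL_DAYS)


def group_password_due(
    last_changed: date,
    membership_changed_on: Optional[date] = None,
    proprietary: bool = False,
) -> date:
    """Return the date by which a group account password must next be changed."""
    due = last_changed + timedelta(days=GROUP_ROTATION_DAYS)
    # A membership change on a device holding proprietary information
    # pulls the deadline forward if the password has not been changed since.
    if proprietary and membership_changed_on and membership_changed_on >= last_changed:
        due = min(due, membership_changed_on + timedelta(days=MEMBERSHIP_CHANGE_DAYS))
    return due
```

For instance, a group password last changed on 1 January 2024 on a proprietary device, followed by a membership change on 10 February 2024, would be due by 13 February 2024 under these assumptions.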

General Configuration Requirements

1. All traffic between the corporate production network and the Technical Teams network must go through a firewall maintained by the System Admin. Team network devices (including wireless) must not cross-connect the Technical Teams and production networks.
2. Original firewall configurations and any changes thereto must be reviewed and approved by the CTO. The CTO may require security improvements as needed.
3. The Teams are prohibited from engaging in port scanning, network auto-discovery, traffic spamming/flooding, and other similar activities that negatively impact the corporate network and/or non-MediaPlatform networks. These activities must be restricted within the Technical Teams.
4. Traffic between production networks and Technical Teams networks, as well as traffic between separate Technical Teams networks, is permitted based on business needs and as long as the traffic does not negatively impact other networks. Labs must not advertise network services that may compromise production network services or put Technical Teams confidential information at risk.
5. The company reserves the right to audit all Technical Teams-related data and administration processes at any time, including but not limited to, inbound and outbound packets, firewalls and network peripherals.
6. Company-owned gateway devices are required to comply with all MediaPlatform product security advisories and must authenticate against our authentication servers.
7. The enable password for all Technical Teams-owned gateway devices must be different from all other equipment passwords in the Technical Teams. The password must be in accordance with MediaPlatform's Password Policy. The password will only be provided to those who are authorized to administer the Technical Teams network.
8. In Technical Teams where non-MediaPlatform personnel have physical access (e.g., training), direct connectivity to the corporate production network is not allowed except through dedicated public interfaces such as the WebCaster account of that client or the training account. Connectivity for authorized personnel from these Technical Teams can be allowed to the corporate production network only if authenticated against the Corporate Authentication servers, temporary access lists (lock and key), SSH, client VPNs, or similar technology approved by InfoSec.
9. Infrastructure devices (e.g. IP Phones) needing corporate network connectivity must adhere to the Open Areas Policy.
10. All Technical Teams external connection requests must be reviewed and approved by InfoSec. Analog or ISDN lines must be configured to only accept trusted call numbers. Strong passwords must be used for authentication.
11. All Technical Teams networks with external connections must not be connected to MediaPlatform corporate production network or any other internal network directly or via a wireless connection, or via any other form of computing equipment.

Security Incident Response Protocol

1. The person who discovers the incident will call the CTO or President immediately, regardless of time of day.
2. The person reporting the incident must log the following information in JIRA immediately:
   a. The nature of the incident.
   b. What equipment or persons are involved.
   c. Location of equipment or persons involved.
   d. How was the incident detected?
   e. When was the event first noticed that supported the idea that the incident occurred?
   f. Is the equipment/software/network affected business critical?
   g. What is the severity of the potential impact?
   h. Name of system being targeted, along with operating system, IP address, and location.
   i. IP address and any information about the origin of the attack.
3. Senior management, including the CTO, will meet or discuss the situation over the telephone and determine a response strategy.
   a. Is the incident real or perceived?
   b. Is the incident still in progress?
   c. What data or property is threatened and how critical is it?
   d. What is the impact on the business should the attack succeed? Minimal, serious, or critical?
   e. What system or systems are targeted, and where are they located physically and on the network?
   f. Is the incident inside the trusted network?
   g. Is the response urgent?
   h. Can the incident be quickly contained?
   i. Will the response alert the attacker and do we care?
   j. What type of incident is this? Ex: virus, worm, intrusion, abuse, damage.
4. An incident ticket will be created in JIRA. The incident will be categorized into the highest applicable level of one of the following categories:
   a. Category one - A threat to sensitive data or a client instance
   b. Category two - A threat to MediaPlatform systems
   c. Category three - A disruption of services
5. Team members will establish and follow one of the following procedures, basing their response on the incident assessment.
   a. Worm response procedure
   b. Virus response procedure
   c. System failure procedure
   d. Active intrusion response procedure - Is critical data at risk?
   e. Inactive intrusion response procedure
   f. System abuse procedure
   g. Property theft response procedure
   h. Website denial of service response procedure
   i. Database or file denial of service response procedure
   j. Spyware response procedure

The team may create additional procedures which are not foreseen in this document. If there is no applicable procedure in place, the team must document what was done and later establish a procedure for the incident.

1. Team members will use forensic techniques including reviewing system logs, looking for gaps in logs, reviewing intrusion detection logs, and interviewing witnesses and the incident victim to determine how the incident was caused. Only authorized people should be performing interviews or examining evidence, and the authorized personnel may vary by situation and the organization.
2. Team members will recommend changes to prevent the occurrence from happening again or infecting other systems.
3. Upon management approval, the changes will be implemented.
4. Team members will restore the affected system(s) to the uninfected state. They may do any one or more of the following:
   a. Re-install the affected system(s) from scratch and restore data from backups if necessary. Preserve evidence before doing this.
   b. Make users change passwords if passwords may have been sniffed.
   c. Be sure the system has been hardened by turning off or uninstalling unused services.
   d. Be sure the system is fully patched.
   e. Be sure real-time virus protection and intrusion detection are running.
   f. Be sure the system is logging the correct events and to the proper level.
5. Documentation - The following shall be documented:
   a. How the incident was discovered.
   b. The category of the incident.
   c. How the incident occurred, whether through email, through firewall, etc.
   d. Where the attack came from, such as IP addresses and other related information about the attacker.
   e. What the response plan was.
   f. What was done to respond.
   g. Whether the response was effective.
6. Evidence Preservation - Make copies of logs, email, and other documentable communication. Keep lists of witnesses. Keep evidence as long as necessary to complete prosecution and beyond in case of an appeal.
7. Notify proper external agencies - Notify the police and other appropriate agencies if prosecution of the intruder is possible. List the agencies and contact numbers here.
8. Assess damage and cost - Assess the damage to the organization and estimate both the damage cost and the cost of the containment efforts.
9. Review response and update policies - Plan and take preventative steps so the intrusion can't happen again.
   a. Consider whether an additional policy could have prevented the intrusion.
   b. Consider whether a procedure or policy was not followed which allowed the intrusion, then consider what could be changed to be sure the procedure or policy is followed in the future.
   c. Was the incident response appropriate? How could it be improved?
   d. Was every appropriate party informed in a timely manner?
   e. Were the incident response procedures detailed, and did they cover the entire situation? How can they be improved?
   f. Have changes been made to prevent a re-infection of the current infection? Are all systems patched, systems locked down, passwords changed, anti-virus updated, email policies set, etc.?
   g. Have changes been made to prevent a new and similar infection?
   h. Should any security policies be updated?
   i. What lessons have been learned from this experience?
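The categorization step in the protocol above ("the highest applicable level") can be sketched as a small decision function. Two things here are assumptions rather than statements from the policy: that category one is the highest level, and the three yes/no inputs used to describe an incident. The names are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class Category(IntEnum):
    ONE = 1    # a threat to sensitive data or a client instance
    TWO = 2    # a threat to MediaPlatform systems
    THREE = 3  # a disruption of services


def categorize(
    threatens_sensitive_data: bool,
    threatens_systems: bool,
    disrupts_services: bool,
) -> Optional[Category]:
    """Return the highest applicable category, assuming ONE outranks TWO and THREE."""
    if threatens_sensitive_data:
        return Category.ONE
    if threatens_systems:
        return Category.TWO
    if disrupts_services:
        return Category.THREE
    return None  # nothing applies; the incident may be perceived rather than real
```

Under this reading, an incident that both disrupts services and threatens a client instance is filed as category one, since only the highest applicable level is recorded on the ticket.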

Enforcement

Any employee found to have violated this policy may be subject to disciplinary action, up to and including termination of employment and personal legal liability.

Software Development Methodology

MediaPlatform has multiple software development teams, organized around the company's product portfolio. MediaPlatform's software engineering methodology closely resembles Extreme Programming, which is considered one of the "Agile" methods. Extreme Programming acknowledges the shifting nature of software requirements and enables us to adapt quickly to our fast-changing industry without burdening ourselves with the rigidity of defining all requirements at the beginning of the project.

We aim for product releases every 2 months so we can react quickly to customer requests and changes in the industry. There are two main planning steps that we employ to keep us on our release schedules:

Release Planning: This is the process of getting time estimates from developers on the desired features that have been defined by customer feedback and the marketplace. These 'cost' estimates are then prioritized and compiled into a project plan or product roadmap. Release plans are designed to get more accurate with time.

Iterative Planning: This is the process of continually (re)prioritizing the features and breaking a subset of them down into time-estimated tasks. The implementation of these tasks allows us to do very regular (measured in weeks, not months) internal development builds.

The internal development builds are moved to our development servers, where both QA and Development testing is done on the build. Once the build is stable, it is moved to our external-facing site, where other MediaPlatform personnel and customers can use the latest features in order to provide feedback. This feedback is rolled into the next release or re-prioritized if necessary. We have found that this iterative process allows us to deliver and make updates in a very responsive and calculated way to both customers and the industry.

Client-Specific Customization

When a client engagement calls for software customization, company management will assign the customization project to a project lead, who in turn works with the development teams responsible for the products involved in the customization. The project lead interacts with the client in a standard discovery-requirements-development-approval cycle:

Following in-depth discussions with the client regarding requested customization, the project lead will develop a draft set of requirements to describe the customization for client review and discussion. The requirements are revised until the client is satisfied and signs off on them.

At the same time, the project lead prepares a timeline and milestone plan.

Upon client sign-off of requirements, the project lead works with the development teams to implement the requirements specified for the customization.

MediaPlatform uses a software development methodology based on the Agile approach. As the customization is developed, successive builds are presented to engineering leadership for review.


MediaPlatform's founder, Greg Pulier, is also the company's top technology architect. Mr. Pulier personally oversees the engineering function at the company and is involved in any significant client customization effort.

After testing and QA, MediaPlatform delivers the first version of the customization to the client for review.

After a review and final revision process, MediaPlatform releases the customization to the client and oversees implementation.

 


Change Management

MediaPlatform has devised a well-managed and controlled change management process. It is a rigorous process that includes a SaaS upgrade and rollback plan, release schedules and critical patch processes.

We schedule releases approximately every 2 months. Prior to any release, we test it for about 1-2 weeks in our QA environment. For any customized environment or hybrid SaaS setup, we work with the client to test the new release in their setup. Once our QA manager has signed off, we release the next version. After release, we post all new features on our client wiki and individually notify clients who submitted requests for enhancements or identified bugs. We also email all our clients about our new features through our regular client communication, "Infostream".

We notify clients one week prior to any updates. If a client has a major event, we will adjust the release date.

Releases and maintenance are typically done on Friday night from 8pm EST - 10pm EST.

From time to time we will conduct hot fixes. These are rare and only done if a client has an immediate need. In all cases, they need approval by the CTO.

All bugs and change requests must be logged into JIRA.

JIRA is a project tracking tool for teams building great software. JIRA sits at the center of your development team, connecting the team and the work being done. Track bugs and defects, link issues to related source code, plan agile development, monitor activity, report on project status, and more.

NOTE: Customer support tickets are logged in Salesforce.com. The customer support team is responsible for transferring technical support tickets into JIRA issues if the issue affects a systemic aspect of the application that needs to be addressed by the technical team.
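The release timing rules described in this section (one week of client notice, releases on Friday night from 8pm to 10pm EST) can be illustrated with a short sketch. The function names are hypothetical, and treating "EST" as a fixed UTC-5 offset is an assumption made here for simplicity; this is not MediaPlatform's actual scheduling tooling.

```python
from datetime import date, datetime, time, timedelta, timezone

# Assumption: "EST" is treated as a fixed UTC-5 offset (no daylight saving).
EST = timezone(timedelta(hours=-5), "EST")
FRIDAY = 4  # date.weekday(): Monday is 0, Friday is 4


def next_release_window(today: date) -> tuple[datetime, datetime]:
    """Return the start and end of the next Friday 8pm-10pm EST window.

    Only Fridays at least 7 days after `today` qualify, so that clients
    can still be given one week of notice.
    """
    earliest = today + timedelta(days=7)
    days_ahead = (FRIDAY - earliest.weekday()) % 7
    friday = earliest + timedelta(days=days_ahead)
    start = datetime.combine(friday, time(20, 0), tzinfo=EST)
    end = datetime.combine(friday, time(22, 0), tzinfo=EST)
    return start, end


def notice_deadline(window_start: datetime) -> datetime:
    """Latest moment at which the one-week client notice can be sent."""
    return window_start - timedelta(days=7)
```

If a client has a major event in the returned window, the same function can be called again with a later `today` to find the following Friday.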

Procedures

The following are the high-level steps of the change management workflow: identifying the change, analyzing the change, evaluating the change (go/no-go decision), planning the change, implementing the change, releasing the change (making it public), and verifying/closing the change.

Identify the change or potential change

Customers/users submit a request for change/enhancement (RFE) because either they have a new requirement (new functionality) or they have encountered a problem with existing functionality. This request is input into our project tracking tool (JIRA) for internal tracking and prioritizing.

Change request analyzed

This step is driven by the product manager with help from the development manager. The goal of this step is to determine two things: 1) the technical feasibility and 2) rough estimates of the effort (costs) involved in implementing the request. As a part of this process, the change is both prioritized (for urgency) and classified (minor, significant, major).

Change request evaluated for go/no-go decision

This is a critical step in the process, and the goal is to determine whether the requested change will be implemented. This decision is made by a group of people which, depending on the request, can include the Product Manager, VP of Engineering, and CTO. The criteria for approval are fluid but, in general, a request will be approved if it clearly adds a benefit to both the users requesting it and the majority of other users. If the change is not approved, the requester is notified. If it is approved, it moves on to the planning stage.

Change planning

The goal of this step is to create and gather all the information needed to implement the change. Depending on the change, this can include a Requirements Specification ("spec"), wireframes, and Functional Specifications. A part of this step is "impact analysis" to determine whether the change will make another part of the system 'break' or make something incompatible. This is a type of risk analysis, and changes that are determined to be 'high impact' are accepted very carefully and only after it is determined that the change is essential and worth the impact to other areas of the application. For high-impact changes, this may also require an analysis of the impact of the change on the larger system, which would be done by the VP of Engineering. Change planning goes hand in hand with our release planning.

Change implementation

This is an iterative set of steps that can be generalized as the following:

1. The change is developed (programmed).

2. The developer tests the change to confirm that it meets the requirements outlined. This is usually done on an internal development environment and is an iterative process with step one.

3. Test Change (QA): The change is moved to our QA environment and tested by the QA department. In addition to testing the specific change to make sure it meets the requirements, QA also does a full set of regression tests in the functional area(s) where the change took place. The result of this step is either to send the change back to step one for more work by the developer or to approve it. It should be noted that in some instances, the customer who requested the change may take part in this testing phase in conjunction with us on the QA environment. This is particularly helpful for compliance-type testing of the change.

4. Documentation is updated.

Release change

This is done after both the Product Manager and the QA manager sign off on the change. We notify clients at least one week prior to any updates. In general, changes are done in conjunction with other changes which, when taken together, constitute a release. The change/release is then moved to the Production environment (made public) during pre-scheduled release and maintenance windows. As a policy, this is done after business hours.

Client verification/change closed

Depending on the change and the requirements of the customer who requested the change, verification that the requirements were met is done during this step. After verification, the change is 'closed'.
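The workflow described in this section can be sketched as a small state machine. The stage names and the `advance` helper are hypothetical and chosen only for illustration; the transitions follow the steps above, including the no-go outcome of the evaluation step and the loop from QA back to development.

```python
from enum import Enum


class Stage(Enum):
    IDENTIFIED = "identified"
    ANALYZED = "analyzed"
    APPROVED = "approved"            # "go" decision
    REJECTED = "rejected"            # "no-go" decision; requester is notified
    PLANNED = "planned"
    IN_DEVELOPMENT = "in development"
    IN_QA = "in QA"
    RELEASED = "released"
    CLOSED = "closed"


# Allowed transitions between stages of the change management workflow.
ALLOWED = {
    Stage.IDENTIFIED: {Stage.ANALYZED},
    Stage.ANALYZED: {Stage.APPROVED, Stage.REJECTED},
    Stage.APPROVED: {Stage.PLANNED},
    Stage.PLANNED: {Stage.IN_DEVELOPMENT},
    Stage.IN_DEVELOPMENT: {Stage.IN_QA},
    # QA either sends the change back to development or approves it for release.
    Stage.IN_QA: {Stage.IN_DEVELOPMENT, Stage.RELEASED},
    Stage.RELEASED: {Stage.CLOSED},
    Stage.REJECTED: set(),
    Stage.CLOSED: set(),
}


def advance(current: Stage, target: Stage) -> Stage:
    """Move a change to `target`, rejecting transitions that skip a step."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Encoding the transitions as data makes it straightforward to check that no change reaches production without passing through analysis, approval, planning, and QA.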

Critical patch releases ("hot-fixes")

Occasionally, there is the need to release a critical patch to the production environment. In these cases, we continue to follow the change management procedures outlined above with some slight modifications. First, the process is usually abbreviated and sped up. This is especially true of the first 4 steps. Second, special care is taken by development to 'branch' the code with the goal of having a very controlled set of changes and associated impacts. In other words, additional planning is done by development in order to take the steps necessary to isolate the change with the goal of minimizing risk (this is overseen and signed off on by the VP of Engineering and the CTO). Once the change has been implemented and thoroughly tested by development, test cases and sometimes a test harness are handed to QA for further compliance and regression testing.

Redundancy and Disaster Recovery

MediaPlatform's product is utilized by large enterprises across the globe. Global enterprises rely on MediaPlatform's products and services on a daily basis for their critical media delivery needs. This includes live delivery, which leaves very little room for error. As such, MediaPlatform must focus on high availability and overcome all possible vulnerabilities at all levels.

Overview

This document covers MediaPlatform's service continuity plan for its products and services, namely Webcaster and Primetime. The overarching goal is to provide maximum uptime of these products. This is achieved by implementing best practices at all levels of the technology stack and security, and extends to customer communication policy and processes.

Redundancy and Failover

Wherever possible, MediaPlatform's systems are designed to eliminate any single points of failure. MediaPlatform utilizes advancements in cloud computing to maintain an extremely high level of redundancy and automatic failover. In addition, MediaPlatform's application framework is based on a distributed computing architecture, which promotes both scalability and automatic failover.

Physical Server Redundancy

By utilizing the Amazon EC2 service, MediaPlatform is able to leverage the hardware redundancy of Amazon's virtual server cloud environment. All servers are "virtual" and are not tied to any single physical machine or to the failure points associated with a single server. Similarly, by utilizing the Amazon S3 service for file sharing services and the Amazon RDS service for database services, MediaPlatform has the same hardware redundancy benefits for these other two fundamental computing services. Both Amazon S3 and RDS support automatic geographic redundancy, meaning these virtualized services span multiple "availability zones" within Amazon, taking server redundancy to another level.

Application Redundancy

The core MediaPlatform application runs on Amazon EC2 services. There are two possible availability issues that may arise when running the MediaPlatform application on Amazon's EC2:

1. Amazon's EC2 cannot automatically be set up for multiple "availability zones." If the "availability zone" where the EC2 virtual instance exists goes down, the EC2 instance will go down as well.

2. The MediaPlatform application may become dysfunctional due to software issues at the OS or application level.

To protect against both of these possible issues, MediaPlatform utilizes a distributed computing architecture. All MediaPlatform applications should be set up behind load balancers and have multiple EC2 instances running in distinct availability zones. Each application server has a health check system which continuously monitors its own health, and the server will be instantly taken out of the load balancer if the health check fails.

The health check is controlled at the application level and checks many critical conditions, such as DB connectivity, local storage availability (must be > 90%), fileshare mount availability, and local application responsiveness. The system will send an alert to the System Administrator if disk capacity falls below 60%.
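As an illustration of an application-level health check like the one described above, the following sketch runs a set of named checks and reports which ones failed. All names here (`HealthReport`, `run_health_check`, `storage_available`) are hypothetical, and the storage threshold is left as a parameter rather than taken from the figures above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class HealthReport:
    healthy: bool
    failed: List[str]  # names of the checks that did not pass


def run_health_check(checks: Dict[str, Callable[[], bool]]) -> HealthReport:
    """Run every check; a check that raises an exception counts as failed."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return HealthReport(healthy=not failed, failed=failed)


def storage_available(free_bytes: int, total_bytes: int,
                      min_free_fraction: float) -> bool:
    """True when the free share of the volume meets the required fraction."""
    return total_bytes > 0 and free_bytes / total_bytes >= min_free_fraction
```

In a real deployment the checks would wrap actual probes, for example a database ping, `os.path.ismount` on the fileshare path, and `shutil.disk_usage` for local storage. A load balancer would poll an endpoint that returns a failure status whenever `healthy` is false.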

Backups

The MediaPlatform application may persist data in three possible locations. They include:

1. Local file system (EC2)

2. Shared file system

3. Database (RDS)

Local File System

For all data stored on the local machine instance (on the local file system), "snapshots" of critical EC2 instances are routinely taken through Amazon Web Services. Each snapshot is automatically stored on Amazon's high-redundancy and high-availability storage system, which spans multiple availability zones.


Shared File System

MediaPlatform's shared file system is built on top of Amazon's S3 service. This allows the shared file system to leverage S3's high scalability, high availability, and low latency.

Database

MediaPlatform utilizes Amazon's RDS service to ensure high availability of its database. For production environments, RDS is configured to automatically span multiple availability zones. RDS is also configured to keep a 5-day history, so data can easily be reverted if the need arises. On top of all this, snapshots of the production RDS instance are routinely taken and automatically stored on Amazon's high-availability storage system.
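The RDS configuration described above can be sketched with the boto3 SDK. This is an illustration under stated assumptions, not MediaPlatform's actual tooling: it assumes the 5-day history corresponds to the RDS automated backup retention period, the instance identifier is a placeholder, and the helper names are hypothetical. The request parameters are built as plain dictionaries so they can be reviewed before anything is sent to AWS.

```python
from datetime import date


def rds_backup_settings(instance_id: str, retention_days: int = 5) -> dict:
    """Parameters for rds.modify_db_instance: Multi-AZ plus backup retention."""
    return {
        "DBInstanceIdentifier": instance_id,
        "MultiAZ": True,
        "BackupRetentionPeriod": retention_days,  # assumption: the "5-day history"
        "ApplyImmediately": True,
    }


def snapshot_request(instance_id: str, taken_on: date) -> dict:
    """Parameters for rds.create_db_snapshot, named after the snapshot date."""
    return {
        "DBInstanceIdentifier": instance_id,
        "DBSnapshotIdentifier": f"{instance_id}-{taken_on.isoformat()}",
    }


def apply_backup_settings(settings: dict) -> None:
    """Send the settings to AWS (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the builders work without it
    boto3.client("rds").modify_db_instance(**settings)


def take_snapshot(request: dict) -> None:
    """Request a manual snapshot (requires boto3 and AWS credentials)."""
    import boto3
    boto3.client("rds").create_db_snapshot(**request)
```

For example, `take_snapshot(snapshot_request("example-db", date.today()))` would request a manual snapshot of a hypothetical instance named `example-db`.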

Monitoring

Continuous monitoring occurs at two levels. The first level of monitoring and alerting occurs at the Amazon Web Services level using CloudWatch. CloudWatch is configured to monitor various metrics for various Amazon Web Services. A few examples of alerts that have been set up include health check failure of MediaPlatform's application servers, CPU utilization above a specified threshold, and low storage availability of RDS.

The second level of monitoring occurs from a system external to Amazon Web Services. A tool known as Zenoss is used to monitor the MediaPlatform systems for availability and other metrics which CloudWatch is unable to monitor, such as EC2 free storage and network connectivity.
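Two of the CloudWatch alerts mentioned above (high CPU utilization and low RDS storage) might be defined along the following lines with boto3. This is a sketch only: the thresholds, evaluation periods, alarm names, and SNS topic are assumptions, since the document does not specify them, and the helper names are hypothetical.

```python
def cpu_alarm(instance_id: str, threshold_percent: float,
              sns_topic_arn: str) -> dict:
    """Parameters for cloudwatch.put_metric_alarm on EC2 CPU utilization."""
    return {
        "AlarmName": f"{instance_id}-high-cpu",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,             # seconds per datapoint (assumed)
        "EvaluationPeriods": 2,    # consecutive breaches before alarming (assumed)
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def rds_low_storage_alarm(db_instance_id: str, min_free_bytes: int,
                          sns_topic_arn: str) -> dict:
    """Parameters for cloudwatch.put_metric_alarm on RDS free storage."""
    return {
        "AlarmName": f"{db_instance_id}-low-storage",
        "Namespace": "AWS/RDS",
        "MetricName": "FreeStorageSpace",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": min_free_bytes,
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def create_alarm(alarm: dict) -> None:
    """Register the alarm with CloudWatch (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the builders work without it
    boto3.client("cloudwatch").put_metric_alarm(**alarm)
```

Keeping the alarm definitions as data makes them easy to review during audits of the monitoring setup.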

Disaster Recovery

MediaPlatform has planned for possible catastrophic failure of Amazon cloud services within an entire region. As some quick background, Amazon runs its cloud services in various regions around the globe, including Virginia, Oregon, N. California, Ireland, Singapore, etc. Within each region, there are multiple availability zones, which are theoretically isolated from one another. MediaPlatform's architecture already protects against a failure within a particular availability zone. However, in the event that an entire region (Virginia) were to go down (perhaps due to a major catastrophic event such as an earthquake), MediaPlatform's disaster recovery plan is to fall back to a different region (N. California). The specific procedures are documented in MediaPlatform's Disaster Recovery Planning document. The expected recovery time is approximately 6 hours.
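The region fallback decision can be sketched as follows. The AWS region codes for N. Virginia (`us-east-1`) and N. California (`us-west-1`) are standard; the `choose_region` helper and the way region health is determined are hypothetical, and the actual procedures are those in the Disaster Recovery Planning document.

```python
from typing import Callable, Sequence

PRIMARY_REGION = "us-east-1"   # AWS region code for N. Virginia
FALLBACK_REGION = "us-west-1"  # AWS region code for N. California


def choose_region(
    region_is_up: Callable[[str], bool],
    preference: Sequence[str] = (PRIMARY_REGION, FALLBACK_REGION),
) -> str:
    """Return the first region in preference order that is reported as up."""
    for region in preference:
        if region_is_up(region):
            return region
    raise RuntimeError("no region available")
```

For example, `choose_region(lambda r: r != "us-east-1")` returns `"us-west-1"`, modeling a full outage of the Virginia region.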

Change Management

There is always the possibility with any software that changes in the application may result in bug-related downtime. This is mitigated by following a strict change management process. Please see MediaPlatform's Change Management document.

Quarterly Audits

As a policy, MediaPlatform audits its system redundancy, disaster recovery procedures, and change management processes on a quarterly basis. The purpose is to identify:

1. If MediaPlatform is following the defined processes and protocols.

2. If all systems are set up properly for redundancy.

3. If any improvement can be made.

Customer Communication

As a policy, customers are to be notified of the following events:


Maintenance windows (known future downtime)

Any "minor" system failure that affects a running event

Any "major" system failure

Maintenance Windows

As described in our Change Management section, clients are notified in advance of all maintenance windows. If a client has a planned event, we will reschedule. Maintenance windows are used for routine maintenance and for upgrades.

"Minor" System Failures

Minor system failures are any failures that may affect the availability or performance of MediaPlatform's products for a short period of time. In such an event, only affected customers are contacted. Client Services will contact affected customers as quickly as possible to work through any issues.

"Major" System Failures

A major system failure is any failure that lasts substantially longer than a minor failure. This includes a possible disaster (see Disaster Recovery above). MediaPlatform will communicate with clients, providing information about the issue and estimates of how long the downtime may last. In addition, MediaPlatform will work with clients to support any critical needs on a case-by-case basis to accelerate the availability of backup systems.

Sample Performance Reports

 

Uptime

It is critical to track uptime, which comprises monitoring data at the AWS level and at the server level. A sample AWS uptime report is provided below.

AWS


Historical uptime data is also available.
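For reference, an uptime percentage over a reporting period is the share of the period during which the service was available. The sketch below uses a hypothetical helper name and is not tied to any particular report format.

```python
from datetime import timedelta


def uptime_percent(period: timedelta, downtime: timedelta) -> float:
    """Percentage of `period` during which the service was up."""
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    if not timedelta(0) <= downtime <= period:
        raise ValueError("downtime must be between zero and the period length")
    return 100.0 * ((period - downtime) / period)
```

As a worked case, a single 6-hour outage in a 30-day period leaves 714 of 720 hours available, or roughly 99.17% uptime.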


Server

Our application server has a specialized health check that constantly monitors its own health. The system is configured to send alerts based on performance thresholds. This entails making sure all the critical components of the system are functional, such as DB connectivity, available file storage space, etc. CloudWatch constantly monitors each server and will send an alert when it detects a health check failure. Elastic Load Balancer will remove the server from the load balancer if the health check fails. Below is a sample graph of CloudWatch monitoring of the Webcaster application servers.

Load and Performance

Using CloudWatch, MediaPlatform is able to monitor server load and usage for EC2, EBS, ELB, and RDS. A range of metrics is available for each service.

Metrics for EC2

Below is a list of the metrics that may be reported for EC2:

CPU Utilization
Disk Read Bytes
Disk Read Ops
Disk Write Bytes
Disk Write Ops
Network In
Network Out

Metrics for EBS

Below is a list of the metrics that may be reported for EBS:

Volume Idle Time
Volume Queue Length
Volume Read Bytes
Volume Read Ops
Volume Total Read Time
Volume Total Write Time
Volume Write Bytes
Volume Write Ops

Metrics for ELB

Below is a list of the metrics that may be reported for ELB:

Healthy Host Count
HTTP Code Backend 2XX
HTTP Code Backend 3XX
HTTP Code Backend 4XX
HTTP Code Backend 5XX
HTTP Code ELB 4XX
Latency
Request Count
Unhealthy Host Count

Metrics for RDS

Below is a list of the metrics that may be reported for RDS:

CPU Utilization
Database Connections
Freeable Memory
Free Storage Space
Read IOPS
Read Latency
Read Throughput
Swap Usage
Write IOPS
Write Latency
Write Throughput

Sample Reports

Below is a sampling of some key reports leveraged from the AWS CloudWatch service. The reports can be analyzed according to different points of view and layers of detail, e.g. machine instance, metric, and timeframe. The System Administrator generates a weekly summary report for review. These reports are rolled up into a comprehensive performance report for customer viewing.
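As an illustration of how data for a weekly summary might be pulled from CloudWatch, the sketch below builds a one-week query for EC2 CPU utilization using boto3's `get_metric_statistics`. The hourly granularity, the choice of statistics, and the helper names are assumptions; the document does not describe how the weekly report is actually generated.

```python
from datetime import datetime, timedelta


def weekly_cpu_query(instance_id: str, week_end: datetime) -> dict:
    """Parameters for cloudwatch.get_metric_statistics covering 7 days."""
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": week_end - timedelta(days=7),
        "EndTime": week_end,
        "Period": 3600,  # one datapoint per hour (assumed granularity)
        "Statistics": ["Average", "Maximum"],
    }


def fetch_datapoints(query: dict) -> list:
    """Run the query and return datapoints in time order.

    Requires boto3 and AWS credentials.
    """
    import boto3  # third-party; imported here so the builder works without it
    response = boto3.client("cloudwatch").get_metric_statistics(**query)
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

The same pattern applies to the other services by changing the namespace, metric name, and dimension, for example `AWS/RDS` with `FreeStorageSpace` and `DBInstanceIdentifier`.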
