Symantec Analytics Platform
Symantec Corporation
• Symantec
  – Symantec is the world leader in providing security software for both enterprises and end users
  – 300 million devices (PCs, tablets and phones) rely on Symantec for their security needs
  – Thousands of enterprises also rely on Symantec to help secure their assets from attacks, including their data centers, emails and other sensitive data
• Cloud Platform Engineering (CPE)
  – The Cloud Platform Engineering (CPE) organization at Symantec is responsible for building the next-generation cloud platform for Symantec
  – We use OpenStack to build the infrastructure cloud
  – We use Hadoop/Storm/Kafka/Spark to build the analytics cloud
About us
• Karthik Karuppaiya
Karthik is a Principal Cloud Platform Engineer, leading the effort to architect and implement Symantec's next-generation Big Data Analytics Platform. He has been designing and engineering large-scale distributed systems on Big Data technologies since 2010.
https://www.linkedin.com/in/karthikkrk
https://twitter.com/karthikkrk
• Raghavendra Nandagopal
Raghavendra is a Principal Cloud Platform Engineer with extensive experience architecting and engineering distributed systems in the Big Data space. He is also a contributor to the Apache Storm project.
https://www.linkedin.com/in/speaktoraghav
https://twitter.com/speaktoraghav
Agenda
• Analytics Platform Overview
• Real-time Streaming Architecture
• Lessons Learned
• Cluster Deployment Overview
• Performance Metrics Collection
• Monitoring
• Self Service Analytics Cluster
Analytics Platform Overview
[Figure: platform architecture stack. Recoverable labels: Nodes (bare metal, OpenStack VMs); HDFS (Hadoop Distributed File System); YARN (Cluster Resource Management); Kafka; Analytics Engines (Pig, Oozie, Hive, Storm); Gateway Services (Knox, BDSE, SPaaS); Hue; MFC; Monitoring & Alerting Services (QueryX, LMM, OpsView, Ganglia); Deployment Automation (Ambari, Puppet).]
Real-time Streaming Architecture
[Figure: streaming pipeline. Recoverable labels: Security Events (Kafka producers); Streaming Cluster (Kafka, Storm, Kafka); Alert Events (Kafka consumers); Logstash; collectd; LMM; Upload MetaData File.]
Lessons Learned
• Kafka's lack of rack awareness
  – With a replication factor of 3, all 3 replicas of a partition may reside on the same rack
• Kafka's JBOD limitations
  – A Kafka broker shuts down when a disk fails
  – E.g. if a broker has 10 disks configured, a single disk failure makes all 10 disks unavailable
  – The time taken to replicate the data after the broker restarts will be longer
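The rack-awareness problem above can be checked mechanically. The sketch below is illustrative only: the broker-to-rack map, topic name and function names are hypothetical and are not taken from Kafka or from the deck. It flags partitions whose replicas all sit on a single rack.

```python
from typing import Dict, List, Set


def racks_for_partition(replicas: List[int], broker_rack: Dict[int, str]) -> Set[str]:
    """Return the set of racks that host the given replica brokers."""
    return {broker_rack[broker_id] for broker_id in replicas}


def single_rack_partitions(
    assignment: Dict[str, List[int]], broker_rack: Dict[int, str]
) -> List[str]:
    """List partitions whose replicas all live on one rack (a rack failure loses them all)."""
    return sorted(
        partition
        for partition, replicas in assignment.items()
        if len(racks_for_partition(replicas, broker_rack)) == 1
    )


# Hypothetical layout: brokers 1-3 share rack-a, broker 4 is on rack-b.
broker_rack = {1: "rack-a", 2: "rack-a", 3: "rack-a", 4: "rack-b"}
assignment = {"events-0": [1, 2, 3], "events-1": [1, 2, 4]}
print(single_rack_partitions(assignment, broker_rack))  # ['events-0']
```

A check like this could run after every partition reassignment, since the broker itself would not prevent the single-rack placement described above.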
Lessons Learned
• Choosing Storm worker slots for a cluster
  – Rule of thumb used for sizing, based on recommendations from the Storm community
  – (M) Total number of supervisors = 12
  – (C) Total number of CPU cores per machine = 32
  – (X) I/O-CPU-bound factor: a value from 1 (CPU bound) to 100 (I/O bound) = 10 (uses regex)
  – (W) Number of workers in a topology = 33
  – (P) Parallelism units = (M * C * X) - W = (12 * 32 * 10) - 33 = 3807
P is a rough estimate of how many parallelism units we have. We can then distribute that number among the components in the topology as parallelism hints.
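The rule of thumb above is plain arithmetic, sketched here in Python. The formula and the input values come from the slide. The `distribute` helper, the component names and the weights are assumptions added for illustration, since the deck does not say how P was split.

```python
from typing import Dict


def parallelism_units(supervisors: int, cores_per_machine: int,
                      io_cpu_factor: int, workers: int) -> int:
    """P = (M * C * X) - W, with X between 1 (CPU bound) and 100 (I/O bound)."""
    if not 1 <= io_cpu_factor <= 100:
        raise ValueError("io_cpu_factor must be between 1 and 100")
    return supervisors * cores_per_machine * io_cpu_factor - workers


def distribute(units: int, weights: Dict[str, int]) -> Dict[str, int]:
    """Hypothetical helper: split units across components in proportion to weights."""
    total = sum(weights.values())
    return {name: max(1, units * weight // total) for name, weight in weights.items()}


p = parallelism_units(supervisors=12, cores_per_machine=32, io_cpu_factor=10, workers=33)
print(p)  # 3807

# Component names and weights are made up for the example.
print(distribute(p, {"kafka_spout": 1, "parse_bolt": 3, "alert_bolt": 1}))
# {'kafka_spout': 761, 'parse_bolt': 2284, 'alert_bolt': 761}
```

Integer division rounds each share down, so the hints sum to at most P.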
Cluster Facts
• Kafka nodes
  – 10 nodes
  – Each with 12 disks of 4 TB each
  – Total 48 TB * 10 = 480 TB capacity
• Storm nodes
  – 12 Supervisors and 1 Nimbus
  – 128 GB RAM and 32 cores
  – 96 worker slots total
• Processing 300,000 events/sec
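A short sanity check of these figures, as a sketch. The node, disk, slot and event counts are from the slide. The replication factor of 3 is borrowed from the Lessons Learned slide and is an assumption about this cluster, as is the even spread of events across brokers.

```python
def kafka_raw_capacity_tb(nodes: int, disks_per_node: int, disk_tb: int) -> int:
    """Raw disk capacity across all Kafka nodes, in TB."""
    return nodes * disks_per_node * disk_tb


raw_tb = kafka_raw_capacity_tb(nodes=10, disks_per_node=12, disk_tb=4)

# Assumption: every partition is stored 3 times (replication factor 3).
unique_data_tb = raw_tb / 3

# 96 worker slots spread over 12 supervisors.
slots_per_supervisor = 96 // 12

# Assumption: events are spread evenly over the 10 Kafka nodes.
events_per_broker = 300_000 / 10

print(raw_tb, unique_data_tb, slots_per_supervisor, events_per_broker)
# 480 160.0 8 30000.0
```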
Cluster Deployment Overview
• Goals set out for deployment
  – Fully automated
  – Use the same deployment scripts for all environments, to keep deployments consistent
  – Easy deployment of Dev clusters to enable fast adoption
  – Use only open source tools
  – Use existing tools as much as possible and fill the gaps where needed
Cluster Deployment Overview
[Figure: deployment flow. Recoverable labels: Deployment Automation Framework (DAO); Puppet; Ambari Server; Ambari Server/API Node; Kafka/ZK 1..N; Storm 1..N; HDFS 1..N. Step labels: Provision Hardware; Install Ambari Server and Agents; Apply Blueprint.]
Cluster Deployment Overview (Ambari Blueprint)
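The blueprint shown on this slide did not survive extraction. As a rough illustration of what an Ambari blueprint looks like, the sketch below builds an abbreviated one and prepares (without sending) the REST call that registers it. The host group names, blueprint name, stack values and server URL are placeholders, not the authors' configuration. A deployable blueprint would also need the remaining master components (for example a NameNode) and authentication.

```python
import json
import urllib.request


def build_blueprint(name: str, stack_name: str, stack_version: str) -> dict:
    """Abbreviated blueprint with one host group per tier in the deck's diagram."""
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": stack_name,
            "stack_version": stack_version,
        },
        "host_groups": [
            {"name": "kafka_zk",
             "components": [{"name": "KAFKA_BROKER"}, {"name": "ZOOKEEPER_SERVER"}]},
            {"name": "storm_master", "components": [{"name": "NIMBUS"}]},
            {"name": "storm_worker", "components": [{"name": "SUPERVISOR"}]},
            {"name": "hdfs", "components": [{"name": "DATANODE"}]},
        ],
    }


def blueprint_request(ambari_url: str, blueprint: dict) -> urllib.request.Request:
    """Prepare the POST that registers the blueprint; the caller decides when to send it."""
    name = blueprint["Blueprints"]["blueprint_name"]
    return urllib.request.Request(
        url=f"{ambari_url}/api/v1/blueprints/{name}",
        data=json.dumps(blueprint).encode("utf-8"),
        headers={"X-Requested-By": "ambari", "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values; the deck does not state the stack or version used.
bp = build_blueprint("streaming", "HDP", "2.2")
req = blueprint_request("http://ambari.example:8080", bp)
print(req.get_method(), req.full_url)
```

Keeping the blueprint as data in version control fits the goal of using the same deployment scripts for every environment.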
Performance Metrics Collection
• Easy to run and collect metrics
• Easy to test multiple configurations
• No-op Bolts in Storm
• Primarily geared towards testing Kafka read/write performance
  – Kafka Write Throughput Topology
  – Kafka Read Throughput Topology
  – Kafka Read/Write Throughput Topology
• Generate as many events as possible
• Use Ganglia to collect metrics
• The tool will be open sourced soon
Monitoring
• OpsView
  – Host level monitoring
    • CPU, Memory, Disk, Network/Ports
  – Service level monitoring
• QueryX
  – Functional Validation/Monitoring
    • Validation from inside/outside the cloud
• Kafka JMX/Consumer Lag Monitoring
Monitoring (QueryX Dashboard)
Monitoring
• Kafka JMX Metrics
  – We have a collectd client that pulls metrics from Kafka JMX
  – Runs every minute and pushes the metrics to LMM
• LMM
  – Homegrown tool for collecting logs and metrics
  – Uses most of the technologies that SPaaS is built on
  – Logstash/Storm/Kafka/InfluxDB/ElasticSearch/Kibana/Grafana
  – Easy to collect metrics and create dashboards
Monitoring (Kafka JMX Dashboard)
Monitoring (Kafka Consumer Lag Tool)
• Why?
  – Kafka monitoring tools are available only for traditional Kafka consumers
  – One tool to track both traditional and Kafka spout consumers
• What?
  – Built into our API layer, so it is easy to deploy and manage
  – Provides both JSON and HTML output
  – Stats are sent automatically through a "statsd" client
  – The statsd client pushes the metrics to LMM
  – We have a Grafana dashboard built on top of these metrics
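The tool's code is not shown in the deck. As a minimal sketch of the idea, consumer lag per partition is the log end offset minus the consumer's committed offset, which can then be rendered as JSON or as statsd gauge lines (`name:value|g`). The metric naming scheme, topic and group names below are hypothetical, and how the offsets are fetched (ZooKeeper or the brokers) is left out.

```python
import json
from typing import Dict, List


def consumer_lag(log_end_offsets: Dict[int, int],
                 committed_offsets: Dict[int, int]) -> Dict[int, int]:
    """Lag per partition; a partition with no commit is treated as offset 0 (assumption)."""
    return {
        partition: max(0, end - committed_offsets.get(partition, 0))
        for partition, end in log_end_offsets.items()
    }


def to_statsd_gauges(topic: str, group: str, lag: Dict[int, int]) -> List[str]:
    """Render each partition's lag as a statsd gauge line."""
    return [f"kafka.lag.{group}.{topic}.{p}:{v}|g" for p, v in sorted(lag.items())]


def to_json(topic: str, group: str, lag: Dict[int, int]) -> str:
    """Render the same data as the JSON output option."""
    return json.dumps(
        {"topic": topic, "group": group,
         "partitions": {str(p): v for p, v in lag.items()},
         "total_lag": sum(lag.values())},
        sort_keys=True,
    )


lag = consumer_lag({0: 1500, 1: 900}, {0: 1200, 1: 900})
print(lag)  # {0: 300, 1: 0}
print(to_statsd_gauges("security-events", "alerts", lag))
print(to_json("security-events", "alerts", lag))
```

Because the calculation only needs two offsets per partition, the same function can serve both traditional consumers and Kafka spout consumers once their committed offsets are read from wherever each stores them.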
Monitoring (Kafka Consumer Lag Dashboard)
Self Service Analytics Cluster
• Why?
  – How do you enable thousands of engineers to write applications for the Big Data Analytics Platform?
  – They need a safe place to experiment and learn
• What?
  – A cluster for each engineer
  – Easy-to-deploy cluster
  – One-click deployment
  – Takes only a few minutes to deploy the cluster
  – Engineers can build and destroy clusters at will
  – Manage resources and quotas for each engineer
Self Service Analytics Cluster
• SSA API: POST /cluster/create/5NodeTemplate
  – Authenticate against the OpenStack / Keystone Identity API and receive an auth token
  – Validate token / check quota
  – Call Nova / Neutron APIs to spin up VMs and networks
  – Install Ambari using Puppet
  – Apply Ambari Blueprint
  – Return Ambari URL
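The request flow above can be sketched as a small orchestration function. Everything here is hypothetical scaffolding: the `ClusterSteps` callables stand in for the Keystone, Nova/Neutron, Puppet and Ambari calls, whose real signatures are not given in the deck, and the fake implementations exist only to make the example runnable.

```python
from dataclasses import dataclass
from typing import Callable, List


class QuotaExceeded(Exception):
    """Raised when the user may not create the requested number of VMs."""


@dataclass
class ClusterSteps:
    validate_token: Callable[[str], str]         # auth token -> user id (Keystone)
    quota_remaining: Callable[[str], int]        # user id -> VMs the user may still create
    boot_vms: Callable[[str, int], List[str]]    # (user, count) -> host names (Nova/Neutron)
    install_ambari: Callable[[List[str]], str]   # hosts -> Ambari URL (Puppet)
    apply_blueprint: Callable[[str, str], None]  # (Ambari URL, template name)


def create_cluster(steps: ClusterSteps, token: str, template: str, node_count: int) -> str:
    """Run the steps in the order shown on the slide and return the Ambari URL."""
    user = steps.validate_token(token)
    if steps.quota_remaining(user) < node_count:
        raise QuotaExceeded(f"{user} cannot create {node_count} more VMs")
    hosts = steps.boot_vms(user, node_count)
    ambari_url = steps.install_ambari(hosts)
    steps.apply_blueprint(ambari_url, template)
    return ambari_url


# Fake steps for illustration only.
applied = []
fake_steps = ClusterSteps(
    validate_token=lambda token: "alice",
    quota_remaining=lambda user: 5,
    boot_vms=lambda user, count: [f"vm-{i}" for i in range(count)],
    install_ambari=lambda hosts: f"http://{hosts[0]}:8080",
    apply_blueprint=lambda url, template: applied.append((url, template)),
)
print(create_cluster(fake_steps, "token", "5NodeTemplate", 5))  # http://vm-0:8080
```

Checking the quota before any VM is booted keeps a rejected request from leaving partial resources behind.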
Thank you!
Copyright © 2013 Symantec Corporation. All rights reserved. Symantec and the Symantec Logo are trademarks or registered trademarks of Symantec Corporation or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners. This document is provided for informational purposes only and is not intended as advertising. All warranties relating to the information in this document, either express or implied, are disclaimed to the maximum extent allowed by law. The information in this document is subject to change without notice.
We are hiring..!