© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Asaf Yigal, VP Product and Co-Founder, Logz.io

May 2017

ELK in the wild – Real life log analysis on AWS

Who Am I?

Asaf Yigal – VP Product, Logz.io

1,000 companies from 80 different countries use Logz.io

Agenda

• Why log analysis is important
• Introducing ELK
• Security at British Airways
• DDoS attack detection at Dyn

Online user behavior

IoT analytics

Monitoring & system troubleshooting

Security and compliance

Security devices

App server

Network

Machine Logs Big Data

Fundamental to Understanding Machines

Open Source ELK +/-

Simple and beautiful
It’s simple to get started and play with ELK, and the UI is just beautiful.

Open Source
The largest user base, with a vibrant open source community that supports and improves the product.

Fast. Very fast.
Built on the Elasticsearch search engine, ELK provides blazing quick responses even when searching through millions of documents.

Hard to Scale
Data piles up and organizations experience usage bursts. It’s super-complex to build elastic ELK deployments that can scale up and down.

Poor Security
Logs include sensitive data, and open source ELK offers no real security solution, from authentication to role-based access.

Not Production Ready
Building a production-ready ELK deployment is a great challenge organizations face. With hundreds of different configurations and a support matrix, making sure it’s always up is difficult.

Simple and beautiful Open Source/Flexible Fast. Very fast.

ELK Stack: 500,000+ companies

20K companies

ELK Stack 2017

Proprietary Software

*Research done by Logz.io

1. No logs should be dropped (trivial, ah)
2. Highly available
3. Secure, which means encryption and access control
4. Index management, shard allocation
5. Data should be parsed and mapping configured
6. Data should be retained for x days
7. Configuration management and monitoring
8. Data spikes should be handled up to 10x normal capacity
9. Visualization and dashboards
10. Archive for long retention
11. Alerts

Production Requirements
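
The retention requirement ("data should be retained for x days") can be sketched as a small helper that picks out daily indices older than the retention window. This is only an illustration, not how Logz.io or any particular ELK deployment does it: the function name `expired_indices`, the `logs-` prefix, and the `YYYY.MM.DD` date suffix are all assumptions made for the example.

```python
from datetime import date, timedelta


def expired_indices(index_names, today, retention_days, prefix="logs-"):
    """Return daily indices whose date suffix is older than the retention window.

    Assumes (hypothetically) that indices are named '<prefix>YYYY.MM.DD'.
    """
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # not one of the indices this policy manages
        try:
            year, month, day = name[len(prefix):].split(".")
            index_day = date(int(year), int(month), int(day))
        except ValueError:
            continue  # name does not follow the assumed date pattern
        if index_day < cutoff:
            expired.append(name)
    return sorted(expired)


indices = [
    "logs-2017.05.01",
    "logs-2017.05.10",
    "logs-2017.05.14",
    "metrics-2017.01.01",
    "logs-bad",
]
to_delete = expired_indices(indices, today=date(2017, 5, 15), retention_days=7)
print(to_delete)
```

The helper only decides what is eligible for deletion; actually removing or archiving the indices is left to whatever tooling the deployment uses.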

Security at British Airways

Challenges

Why Logz.io

ELB Health

Understanding Weekly trends

Understanding who is crawling the site

Understanding traffic

Understanding Client Location

DDoS attack detection at Dyn

Numerous methods of detection

Understand Normal

● We leverage monitoring to define normal.

● We alert in reasonable ways when critical metrics become abnormal

● Too many alerts and your “teams tasked with reactive reliability” will get burned out.

● Normal shouldn’t be subjective. Socialize your key performance indicators!
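
One simple way to make "normal" non-subjective is to derive it from recent history and alert only when a metric strays far from that baseline. The sketch below uses a mean and standard deviation threshold; this is a swapped-in textbook technique, not the method Dyn described, and the function name `is_abnormal` and the multiplier `k` are assumptions for illustration.

```python
from statistics import mean, stdev


def is_abnormal(history, value, k=3.0):
    """Flag a value that lies more than k standard deviations from the baseline.

    history must contain at least two samples so a spread can be computed.
    """
    baseline = mean(history)
    spread = stdev(history)
    return abs(value - baseline) > k * spread


# Hypothetical requests-per-second samples from a quiet period.
history = [100, 102, 98, 101, 99]
print(is_abnormal(history, 103))
print(is_abnormal(history, 150))
```

A larger `k` produces fewer alerts, which is one lever for avoiding the alert fatigue mentioned above.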

Netflow

Fast breakdown of SRC & DST port, proto, ASN, Site, etc.
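
A breakdown like this amounts to grouping flow records by one field and summing traffic per group. A minimal sketch, assuming flow records have already been parsed into dictionaries; the field names (`dst_port`, `proto`, `bytes`) and the function name `breakdown` are hypothetical, not a real Netflow schema.

```python
from collections import Counter


def breakdown(flows, field, top=3):
    """Sum bytes per distinct value of `field` and return the largest groups."""
    counts = Counter()
    for flow in flows:
        counts[flow[field]] += flow["bytes"]
    return counts.most_common(top)


# Hypothetical parsed flow records.
flows = [
    {"src_port": 53000, "dst_port": 53, "proto": "udp", "bytes": 500},
    {"src_port": 53001, "dst_port": 53, "proto": "udp", "bytes": 700},
    {"src_port": 40000, "dst_port": 443, "proto": "tcp", "bytes": 300},
]
print(breakdown(flows, "dst_port"))
print(breakdown(flows, "proto"))
```

In an ELK setup the same grouping would typically be done by an aggregation on the indexed data rather than in application code.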

Quick sort and analysis of v4 and v6 IPs
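
Sorting addresses as plain strings puts "10.0.0.10" before "10.0.0.2", so mixed v4 and v6 lists are easier to handle once parsed. A minimal sketch using Python's standard `ipaddress` module; the function name `sort_ips` and the choice to list IPv4 before IPv6 are assumptions for the example.

```python
import ipaddress


def sort_ips(addresses):
    """Sort IPv4 and IPv6 address strings numerically, IPv4 first."""
    parsed = [ipaddress.ip_address(a) for a in addresses]
    # Addresses of different versions cannot be compared directly,
    # so sort on (version, integer value) instead.
    parsed.sort(key=lambda ip: (ip.version, int(ip)))
    return [str(ip) for ip in parsed]


ordered = sort_ips(["10.0.0.10", "2001:db8::1", "10.0.0.2", "::1"])
print(ordered)
```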

Examples of attack

• Lots of resources online
• Try the Logz.io blog for detailed guides, benchmarks and troubleshooting guides on building ELK stacks
• @logzio
• @asafyigal

How to Learn More

Asaf Yigal (@asafyigal)
Logz.io (@logzio)