Troubleshooting Basics - ChinaNetCloud Training

Copyright ChinaNetCloud 2015

ChinaNetCloud: Running All the World's Internet Servers

ChinaNetCloud Training

Troubleshooting Basics

By ChinaNetCloud

Pioneers in OaaS – Operations-as-a-Service

May, 2014

www.ChinaNetCloud.com

Introduction

● We all troubleshoot
● Important in life
● Really important in ops
● Most important skill (with Engineering & Mgmt)

Born or Made?

● Some people are better
● Logical thinking important
● Path running – Think it out
● Rule in, rule out
● Understand all systems
● Understand this system

Overview of Process

● Gather facts
● Gather history
● Quick think likely issues
● Few quick things
● Rule out things
● Deeper dives
● Exotic rule outs
● Question Facts

Key Points

● Know the system
● Think, think, think
● All about testing/proving
● Rule OUT, not IN
● Dive shallow
● Dive deep diversely
● Dive deep systematically
● Question everything
● Get help & perspective
● Keep notes & records

Gather facts

● What do you know
● What might be true
● Symptoms

Gather history

● Old history of issues
● Recent history of issues
● Recent changes (& hidden ones)
● What led to the problem?
● Can we reproduce it?
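
When the recent changes are known and the problem can be reproduced, one way to find the change that led to it is to bisect the change list. The following is only a sketch under stated assumptions: the change names are hypothetical, the system was healthy before the first change, and the problem persists once the culprit is applied. The `is_broken_after` callback stands in for whatever reproduction test is available.

```python
from typing import Callable, Sequence


def first_bad_change(
    changes: Sequence[str],
    is_broken_after: Callable[[int], bool],
) -> str:
    """Return the earliest change after which the problem reproduces.

    Assumes the system is broken after the last change and stays
    broken once the culprit has been applied.
    """
    lo, hi = 0, len(changes) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_broken_after(mid):
            hi = mid        # culprit is at mid or earlier
        else:
            lo = mid + 1    # culprit is after mid
    return changes[lo]


# Hypothetical change log, oldest first.
changes = [
    "kernel patch",
    "nginx config edit",
    "new cron job",
    "app deploy 4.2",
    "firewall rule",
]

# Stand-in for a real reproduction test: the problem starts at index 2.
culprit = first_bad_change(changes, lambda i: i >= 2)
print(culprit)
```

Bisection needs about log2(n) reproduction runs instead of n, which matters when each run is slow. It does not help with hidden changes that never made it into the list.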

Likely issues

● Educated guess of top 3-5 likely ideas
● More diverse the better
● Ideally easy to test
● Goal is fast diagnosis & fix
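
One possible way to order the first guesses is by likelihood per minute of testing, so that likely and cheap ideas come first. This is a sketch, not a prescribed method: the hypothesis names, the likelihood figures, and the time estimates are all made-up illustrations, and the scoring rule itself is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    likelihood: float    # rough guess between 0 and 1
    test_minutes: float  # rough time needed to prove or disprove it


def rank(hypotheses, limit=5):
    """Order guesses by likelihood per minute of testing; keep the top few."""
    ordered = sorted(
        hypotheses,
        key=lambda h: h.likelihood / h.test_minutes,
        reverse=True,
    )
    return ordered[:limit]


# Illustrative guesses only; real values come from knowing the system.
candidates = [
    Hypothesis("disk full", 0.30, 1),
    Hypothesis("database connection pool exhausted", 0.40, 10),
    Hypothesis("bad deploy", 0.30, 5),
    Hypothesis("DNS failure", 0.05, 1),
]

ranked_names = [h.name for h in rank(candidates)]
for name in ranked_names:
    print(name)
```

With these numbers the unlikely but one-minute DNS check is tested before the likelier ten-minute database check, which matches the goal of fast diagnosis.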

Less Likely but Easy to Test

● Quickly rule out major things
● Power, cables, networks
● Major services & issues
● Weird things
● All easy to test
● Goal to rule out things

Rule out things

● All about what the problem is not
● Very important to moving forward
● Testability, how can we know
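
Ruling things out works best when each suspect has a concrete test and the results are written down. The sketch below assumes each check is a function returning True when the component is healthy. The component names and the lambda checks are placeholders for real probes, and treating a failed probe as inconclusive is a design choice, not a rule from the slides.

```python
from typing import Callable, Dict, List, Tuple


def rule_out(
    checks: Dict[str, Callable[[], bool]],
) -> Tuple[List[str], List[str]]:
    """Run each check and split components into ruled out vs. still suspect.

    A check returns True when the component is healthy. A check that
    raises is treated as inconclusive, so the component stays suspect.
    """
    ruled_out: List[str] = []
    suspects: List[str] = []
    for name, check in checks.items():
        try:
            healthy = check()
        except Exception:
            healthy = False
        (ruled_out if healthy else suspects).append(name)
    return ruled_out, suspects


def broken_probe() -> bool:
    # Placeholder for a probe that fails to complete.
    raise TimeoutError("probe did not answer")


# Placeholder checks; real ones would ping hosts, query services, etc.
checks = {
    "power": lambda: True,
    "network link": lambda: True,
    "database": lambda: False,
    "cache": broken_probe,
}

ruled_out, suspects = rule_out(checks)
print("ruled out:", ruled_out)
print("still suspect:", suspects)
```

Keeping the two lists as notes shows what the problem is not, and leaves a record to re-check later if the facts come into question.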

Deeper dives - Diverse

● Less common things, deeper looking
● Ideally easy to test
● Diversity and broad problem coverage
● Don't spend too long on each

Deeper Dives - Systematic

● Go through everything
● Systematic and organized
● Don’t miss things
● Takes a long time

Exotic Rule Outs

● Check the really weird or unusual things
● Including amazing recent stupidity
● Takes lots of time

Question Facts

● When nothing makes sense
● When you have covered everything
● Always assume some facts are wrong
    ● They often are
● Re-check facts
● Re-check what you've done
● Usually a fact is not true
● Or you missed something, checked the wrong systems, etc.

About ChinaNetCloud

www.ChinaNetCloud.com – +86-21-6422-1946 – Sales@ChinaNetCloud.com

ChinaNetCloud is a Shanghai-based, full-service Internet managed services provider (MSP). We architect, build, optimize, and manage large-scale systems for e-commerce, games, apps, mobile, media, and more.

We deliver Reliability, Performance, Scale, Security, and Cost-Savings via our Operations-as-a-Service (OaaS) platform, which includes 7x24 operations, deep predictive monitoring, networking, security scanning, backups, databases, upgrades, rapid troubleshooting, configuration changes, and much more.

Our OaaS platform is state-of-the-art with a wide variety of sophisticated tools ranging from deep design to audit, migration, management, monitoring, backups, CMDB, load testing, capacity planning, performance analysis, portals, and much more.

Over six years, we've helped hundreds of internet companies improve their systems, focusing on Reliability, Performance, Scalability, Security, and Cost-Savings.

Let us help you today!

Contact ChinaNetCloud

Silicon Valley Office:

440 North Wolfe Road

Sunnyvale, 94085 USA

www.ChinaNetCloud.com

Sales@ChinaNetCloud.com

Shanghai Headquarters:

X2 Space 10601

1238 Xietu Lu

Shanghai, 200032 China

Beijing Office:

Lee World Business Building #305

57 Middle Xingfu Village Rd., Chaoyang

Beijing, 100027 China

T: +86-21-6422-1946
