
Network Administration Research and Analysis Week-7


Theory of Network Admin
Burgess – Ch.11

Science vs. Technology
Studying Complex Systems
Purpose of Observation
Evaluation Methods and Problems
Evaluating a Hierarchical System
Deterministic and Stochastic Behaviour
Observational Errors
Strategic Analyses

A Scientific Basis for System Administration

System admin has always involved experimentation
Development of networks has led to an exponential increase in system complexity and a corresponding increase in the difficulty of management
A purely mechanical approach may no longer be adequate: time for a theoretical basis…
World-wide interest, encouraged by professional organisations (SAGE, USENIX, ACM, IEEE, ACS)

Network Admin Research: Science vs Technology

System Admin studies are mostly “Applied Research”, which results in the development of a specialised toolset that solves a local/specific problem
Some workers have attempted to collate results to form a more general technology of more permanent or global value.
But this is not Science!

What is “Science”? The Scientific Method

Knowledge advanced by series of studies that either verify/falsify a hypothesis
Study may be theoretical or practical but all contribute to a larger on-going discussion that leads to progress
A single study is rarely the end of the discussion
Each study is usually repeated and verified or challenged by other researchers
Reproducibility is very important

Scientific Method

Motivation – statement of context and objectives
Appraisal of problems
Theoretical Model – used to understand or solve problems and provide a framework for comparison and measurement
Design an experiment – the Approach
Perform an Experiment – obtain Results
Evaluation or Verification of Approach and Results

Scientific Method

Science is a dialog of Theories
Science proceeds by Experiment
Need Theory to interpret observations
Need observations to disprove Theory

Network Admin Research: Studying Complex Systems

Areas of study in System Admin have been Technical and/or Behavioural and include:
– Reliability studies
– Finding and evaluating methods for system integrity
– Observations which apply to non-linear behaviour
– Issues related to strategy and planning
Mostly Empirical or Qualitative case study

Purpose of Observation

Gather Info about a Problem to enable development of a Technology which solves it
To evaluate the Technology for effectiveness (i.e. whether it fulfils its design goals)
But evaluation of SysAdmin experiments is difficult due to Vested Interests and lack of clearly defined metrics

Evaluation Methods and some Problems

Ideally there should be a repeatable test yielding measurements
The trouble is that, while a good system administrator could do this heuristically:
– Heuristics are very difficult to quantify
– Different SysAdmins work in different ways
– There is extreme variability in systems and users

Some Research Topics

Efficiency & Automation
Network Administration methods/models
Reliability Studies
– Fault management
– Metrics
– Patterns of events
prediction & performance

A Common Research topic and the problems

Ways to relieve Administrators of tedious work, so they can use their talents better in other ways. What sort of experiment is needed?
E.g.
– Measure time spent working on a system
  but the time required usually expands to occupy the time available!
– Record actions of an automatic system and compare with those of a human administrator
  but this depends on the person: different people do things in different ways

Network Admin Research: effect of Vested Interests…

SysAdmins require tools…
Such tools often acquire a dedicated following of users who grow to like them regardless of what the tools allow them to achieve
Marketing skills of one software vendor might be better than others and create a bias in the marketplace that affects the perceived usefulness of a particular tool
So one cannot estimate the effectiveness of a tool based just on the number of those who use it

Evaluating Hierarchical System

What level of detailed decomposition of levels within the hierarchy is appropriate?
Building a model of the hierarchy is often the best way to address complexity – focus on what’s important or practical
Experiments based on this model might then involve
– Measurements
– Simulations
– Case studies
– User surveys

Faults

IEEE classify software anomalies as:
O/S crash
Program hang
Program crash
Input problem
Output problem
Failed required performance
Perceived total failure
System error message
Service Degraded
Wrong output
No output

Most common faults for SysAdmin are:

Input Problem
– Missing or inappropriate configuration
Failed performance
– Usually through loss of resources
Software problems can be eliminated by re-evaluation of individual software components

Reliability and Redundancy

Average (Mean) time before failure
$$R = \frac{\text{Mean Uptime}}{\text{Total Elapsed Time}}$$

With parallel or redundant components
$$\frac{1}{R_{\text{parallel}}} = \frac{1}{R_1} + \frac{1}{R_2} + \dots + \frac{1}{R_n}$$

With serial or dependent components
$$R_{\text{series}} = R_1 + R_2 + R_3 + \dots$$

Probability of Failure
$$P(t) = \exp(-Rt)$$
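
A minimal sketch of how combination rules of this kind might be applied. It assumes one particular reading of the slide: that each R is a failure rate (failures per unit time), that rates add for serial or dependent components, that reciprocals of rates add for parallel or redundant components, and that the exponential term is exp(-Rt). The function names and the sample rates are hypothetical and are not taken from Burgess.

```python
import math


def series_rate(rates):
    """Combined rate for serial (dependent) components: the rates add."""
    return sum(rates)


def parallel_rate(rates):
    """Combined rate for parallel (redundant) components: reciprocals add."""
    return 1.0 / sum(1.0 / r for r in rates)


def exp_term(rate, t):
    """Exponential term exp(-R t) for rate R and elapsed time t."""
    return math.exp(-rate * t)


# Hypothetical failure rates in failures per hour (illustrative values only).
disk_rates = [0.002, 0.002]

r_series = series_rate(disk_rates)      # service depends on both disks
r_parallel = parallel_rate(disk_rates)  # either disk can carry the service

print("series rate:  ", r_series)
print("parallel rate:", r_parallel)
print("exp(-Rt) at t = 100 h:", exp_term(r_series, 100.0), exp_term(r_parallel, 100.0))
```

Under these assumptions the dependent arrangement has a higher combined rate than either disk alone and the redundant arrangement a lower one, which is the qualitative point of the slide.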

MTBF and Computers

Computer system MTBF doesn’t account for:
– Dependency – Not all systems have same attachments
– Fail-over and Latency of service
  Systems may fail, then recover after a single delay
  this may occur repeatedly!
– Patterns of usage
  User behaviour may bias the outcome
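
The fail-over point can be illustrated with a short sketch that summarises an outage log. It assumes the common definition of MTBF as total uptime divided by the number of failures; the function name, the log format and the sample outages are hypothetical.

```python
def summarise_outages(total_elapsed, outages):
    """Summarise a log of non-overlapping (start, end) outages, times in hours."""
    downtime = sum(end - start for start, end in outages)
    uptime = total_elapsed - downtime
    failures = len(outages)
    return {
        "uptime_fraction": uptime / total_elapsed,
        # MTBF taken here as total uptime divided by the number of failures.
        "mtbf": uptime / failures if failures else float("inf"),
        "mean_recovery_delay": downtime / failures if failures else 0.0,
    }


# Hypothetical log: one week (168 h) with three short fail-over events.
log = [(10.0, 10.5), (50.0, 50.5), (120.0, 121.0)]
summary = summarise_outages(168.0, log)

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A single MTBF figure says nothing about how long each recovery took or which users were active at the time, so reporting the recovery delay alongside it is one way to keep that information.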

Some Metrics

Net
– Total number of packets
– Amount of IP fragmentation
– Density of Broadcast messages
– Number of Collisions
– Number of Sockets (TCP) in and out
– Number of malformed packets

Some Metrics

Storage
– Disk Usage in Bytes
– Disk Operations per Second
– Paging rate (free memory and thrashing)

Fig 11.2 Daily paging data

Error bars exceed variation of data!

Fig 11.3 Weekly paging data

Also showing extreme variation
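
What it means for error bars to exceed the variation of the data can be sketched with a few lines of Python. The samples below are invented for illustration and are not the data behind Fig 11.2 or Fig 11.3; the error bar is taken to be the sample standard deviation at each hour, which is an assumption about how the figures were drawn.

```python
import statistics

# Hypothetical paging-rate readings: hour of day -> readings on different days.
samples = {
    0: [12, 40, 5, 60],
    6: [15, 55, 8, 48],
    12: [30, 90, 10, 70],
    18: [25, 75, 12, 64],
}

# Mean and error bar (sample standard deviation) for each hour.
means = {hour: statistics.mean(values) for hour, values in samples.items()}
errors = {hour: statistics.stdev(values) for hour, values in samples.items()}

# How much the hourly averages move over the day, versus a typical error bar.
spread_of_means = max(means.values()) - min(means.values())
typical_error = statistics.mean(list(errors.values()))

for hour in sorted(samples):
    print(f"hour {hour:2d}: mean {means[hour]:6.2f} +/- {errors[hour]:.2f}")
print("spread of hourly means:", spread_of_means)
print("typical error bar:     ", round(typical_error, 2))
```

When the typical error bar is larger than the spread of the hourly means, as it is for these invented numbers, the averaged curve alone cannot be used to claim a daily pattern.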

Some Metrics

Processes
– Number of privileged processes
– Number of non-privileged processes
– Maximum percentage CPU used in processes

Some Metrics

Users
– Number logged on
– Total Number
– Average time spent logged on per user
– Load Average
– Disk Usage rise per session per user per hour
– Latency of Services

Distributions

Delta – constant X
Uniform – constant Y
Gaussian or Random
Normal – “bell curve”
Black-Body or Planck – approx exponential
Poisson – random arrival with mean rate
Pareto – Power Law
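
A small sketch, using only Python's standard `random` module, of how three of these distributions might be sampled when building a simulation. The helper name `poisson_counts`, the seed and all parameter values are assumptions chosen for illustration.

```python
import random
import statistics


def poisson_counts(rate, windows, rng):
    """Count random arrivals in unit-length windows.

    Gaps between arrivals are drawn from an exponential distribution with the
    given mean rate, which makes the count per window Poisson distributed.
    """
    counts = []
    for _ in range(windows):
        t, n = rng.expovariate(rate), 0
        while t < 1.0:
            n += 1
            t += rng.expovariate(rate)
        counts.append(n)
    return counts


rng = random.Random(42)  # fixed seed so that runs are repeatable

arrivals = poisson_counts(rate=5.0, windows=2000, rng=rng)         # Poisson
bell = [rng.gauss(100.0, 15.0) for _ in range(2000)]               # Gaussian
power_law = [rng.paretovariate(1.5) for _ in range(2000)]          # Pareto

print("mean arrivals per window:", statistics.mean(arrivals))
print("Gaussian max / median:   ", max(bell) / statistics.median(bell))
print("Pareto max / median:     ", max(power_law) / statistics.median(power_law))
```

The ratio of the largest sample to the median is one rough way to see the difference between a bell curve, where extreme values are rare, and a power law, where a few very large values are expected.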

Theory of System Admin

(end)