A Six Sigma Analysis of Mobile Data Usage


2016 WCQI Session W10

Brandon Theiss, PE
Brandon.Theiss@gmail.com

Motivation

Is my current mobile data plan with Republic Wireless optimal given my data usage?

Learning Objectives
• Apply the Six Sigma Methodology to Non-Traditional Applications
• Utilize Monte Carlo simulations to make predictions
• Utilize Non-Parametric Hypothesis testing
• Utilize Process Capability to determine specification limitations for non-normal data

4 Major Mobile Phone Carriers

Plans Offered By Verizon

20% of Verizon customers charged overages in past year*

Plans Offered By AT&T

28% of AT&T customers charged overages in past year*

Plans Offered By T-Mobile

12% of T-Mobile customers charged overages in past year*

Plans Offered By Sprint

5% of Sprint customers charged overages in past year*

Plans Offered By Republic Wireless

Chart of Total Data Usage

Bill Number   Total Usage (MB)
1             4254.40
2             4612.40
3             3784.20
4             3039.10
5             5905.40
6             4846.80
7             4517.40
8             3224.90
9             2674.30
10            3203.80
11            1911.60
12            3095.80

The Data Set

Data was collected from March 23, 2015 through March 24, 2016

[Figure: Time Series Plot of Small Plans. Series: Verizon (1GB), ATT (2GB), T-Mobile (2GB), Sprint (1GB), Republic (2GB). X axis: Bill Number (1 to 12). Y axis: Billed Amount ($40 to $130).]

Comparison of Carriers Small Data Plans

Data Speed Potentially Decreased

[Figure: Time Series Plot of Medium Plans. Series: Verizon (3GB), ATT (2GB), T-Mobile (2GB), Sprint (3GB), Republic (3GB). X axis: Bill Number (1 to 12). Y axis: Billed Amount ($50 to $120).]

Comparison of Carriers Medium Data Plans

Data Speed Potentially Decreased

[Figure: Time Series Plot of Large Plans. Series: Verizon (6GB), ATT (5GB), T-Mobile (6GB), Sprint (6GB), Republic (5GB). X axis: Bill Number (1 to 12). Y axis: Billed Amount ($65 to $90).]

Comparison of Carriers Large Data Plans

Comparison of Carriers X-Large Data Plans

[Figure: Time Series Plot of XL Plans. Series: Verizon (12GB), ATT (15GB), T-Mobile (10GB), Sprint (12GB), Republic (Not Offered). X axis: Index (1 to 12). Y axis: Data (0 to 140).]

[Figure: Chart of Annual Cost. Y axis: Annual cost ($0.00 to $1,600.00). X axis: Plan, in the order shown on the chart: ATT (15GB), Verizon (12GB), Verizon (1GB), Sprint (1GB), ATT (2GB), Republic (5GB), Verizon (3GB), Sprint (12GB), T-Mobile (10GB), Verizon (6GB), ATT (5GB), Sprint (3GB), Sprint (6GB), T-Mobile (6GB), Republic (3GB), T-Mobile (2GB), Republic (2GB).]

How Much Would Each Plan have cost for the Year?

Summary Report for Total Monthly Usage

Anderson-Darling Normality Test
  A-Squared      0.25
  P-Value        0.687

Mean             3755.8
StDev            1107.0
Variance         1225527.2
Skewness         0.314666
Kurtosis         -0.123559
N                12

Minimum          1911.6
1st Quartile     3053.3
Median           3504.6
3rd Quartile     4588.6
Maximum          5905.4

95% Confidence Interval for Mean      3052.5   4459.2
95% Confidence Interval for Median    3054.0   4587.4
95% Confidence Interval for StDev     784.2    1879.6

A First Statistical Approach (monthly data)

[Figure: Probability Plot for Total Usage, Normal - 95% CI. X axis: Total Usage (0 to 8000). Y axis: Percent (1 to 99).]

Goodness of Fit Test
Normal   AD = 0.248   P-Value = 0.687

Is The Data Normally Distributed?
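The Anderson-Darling statistic reported for the monthly totals can be reproduced by hand. The sketch below is a minimal illustration, not the Minitab implementation: it assumes the usual A-squared formula evaluated against a normal distribution with the sample mean and sample standard deviation, and the function name `anderson_darling_normal` is hypothetical. The twelve values are the monthly totals labelled on the Chart of Total Data Usage.

```python
import math
from statistics import NormalDist, mean, stdev

# Monthly totals (MB) as labelled on the Chart of Total Data Usage.
usage = [3095.80, 1911.60, 3203.80, 2674.30, 3224.90, 4517.40,
         4846.80, 5905.40, 3039.10, 3784.20, 4612.40, 4254.40]

def anderson_darling_normal(data):
    """A-squared against a normal fitted with the sample mean and StDev."""
    x = sorted(data)
    n = len(x)
    dist = NormalDist(mean(x), stdev(x))
    total = 0.0
    for i in range(1, n + 1):
        f_low = dist.cdf(x[i - 1])   # F of the i-th smallest value
        f_high = dist.cdf(x[n - i])  # F of the i-th largest value
        total += (2 * i - 1) * (math.log(f_low) + math.log(1.0 - f_high))
    return -n - total / n

a_squared = anderson_darling_normal(usage)
print(round(a_squared, 3))
```

The result should land close to the AD value of 0.248 shown on the probability plot. The p-value is not computed here.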

I-MR Chart of Total Monthly Usage

Individual Value chart (Observation 1 to 12):   X-bar = 3756   UCL = 6424   LCL = 1088
Moving Range chart (Observation 1 to 12):       MR-bar = 1003  UCL = 3278   LCL = 0

Is The Data In Statistical Control?
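The I-MR limits can be recomputed from the twelve monthly totals. This is a sketch under stated assumptions: the values are taken in bill order 1 to 12 as read from the Chart of Total Data Usage, and the constants d2 = 1.128 and d3 = 0.8525 are the standard control chart constants for a moving range of span 2. The helper name `imr_limits` is hypothetical.

```python
from statistics import mean

# Monthly totals (MB), assumed to be in bill order 1..12.
usage = [4254.40, 4612.40, 3784.20, 3039.10, 5905.40, 4846.80,
         4517.40, 3224.90, 2674.30, 3203.80, 1911.60, 3095.80]

D2 = 1.128   # standard constant for moving ranges of span 2
D3 = 0.8525  # standard constant for moving ranges of span 2

def imr_limits(values):
    """Center lines and 3-sigma limits for an individuals / moving range chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    sigma_within = mr_bar / D2
    return {
        "x_bar": x_bar,
        "x_ucl": x_bar + 3 * sigma_within,
        "x_lcl": x_bar - 3 * sigma_within,
        "mr_bar": mr_bar,
        "mr_ucl": mr_bar + 3 * D3 * sigma_within,
        "mr_lcl": 0.0,
        "sigma_within": sigma_within,
    }

limits = imr_limits(usage)
for name, value in limits.items():
    print(f"{name}: {value:.1f}")
```

The within-subgroup sigma returned here (MR-bar / d2) is the same quantity the capability reports list as StDev(Within).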

Process Capability Report for Total Usage (1GB)

Process Data
LSL              *
Target           *
USL              1000
Sample Mean      3755.84
Sample N         12
StDev(Overall)   1107.04
StDev(Within)    889.313

Overall Capability:             Pp *   PPL *   PPU -0.83   Ppk -0.83   Cpm *
Potential (Within) Capability:  Cp *   CPL *   CPU -1.03   Cpk -1.03

Performance   Observed   Expected Overall   Expected Within
% < LSL       *          *                  *
% > USL       100.00     99.36              99.90
% Total       100.00     99.36              99.90

Plan Annual Cost

ATT (2GB) $ 1,065.00

Sprint (1GB) $ 1,065.00

Is a 1GB (1,000MB) Limit Appropriate?
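With only an upper specification limit, Ppk equals PPU and Cpk equals CPU, so each capability report reduces to one formula: (USL - mean) / (3 × StDev). The sketch below recomputes the indices and the expected percent above USL for every plan limit in the deck, using the sample mean and the two standard deviations from the Process Data block. It assumes a normal model for the expected percentages. The function name `upper_capability` is hypothetical.

```python
from statistics import NormalDist

SAMPLE_MEAN = 3755.84
STDEV_OVERALL = 1107.04
STDEV_WITHIN = 889.313

def upper_capability(usl, mean, sd):
    """One-sided (upper) capability index and expected % above USL (normal model)."""
    index = (usl - mean) / (3.0 * sd)
    pct_above = 100.0 * (1.0 - NormalDist(mean, sd).cdf(usl))
    return index, pct_above

results = {}
for usl in (1000, 2000, 3000, 5000, 6000, 10000, 12000):
    ppk, pct_overall = upper_capability(usl, SAMPLE_MEAN, STDEV_OVERALL)
    cpk, pct_within = upper_capability(usl, SAMPLE_MEAN, STDEV_WITHIN)
    results[usl] = (ppk, cpk, pct_overall, pct_within)
    print(f"USL {usl:>5}: Ppk {ppk:5.2f}  Cpk {cpk:5.2f}  "
          f"% > USL overall {pct_overall:6.2f}  within {pct_within:6.2f}")
```

A negative index simply means the mean usage already sits above the limit, which is why the 1GB, 2GB, and 3GB plans show most months over the cap.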

Process Capability Report for Total Usage (2GB)

Process Data
LSL              *
Target           *
USL              2000
Sample Mean      3755.84
Sample N         12
StDev(Overall)   1107.04
StDev(Within)    889.313

Overall Capability:             Pp *   PPL *   PPU -0.53   Ppk -0.53   Cpm *
Potential (Within) Capability:  Cp *   CPL *   CPU -0.66   Cpk -0.66

Performance   Observed   Expected Overall   Expected Within
% < LSL       *          *                  *
% > USL       91.67      94.36              97.58
% Total       91.67      94.36              97.58

Is a 2GB (2,000MB) Limit Appropriate?

Plan Annual Cost

Republic (2GB) $ 480.00

T-Mobile (2GB) $ 600.00

ATT (2GB) $ 1,065.00

Process Capability Report for Total Usage (3GB)

Process Data
LSL              *
Target           *
USL              3000
Sample Mean      3755.84
Sample N         12
StDev(Overall)   1107.04
StDev(Within)    889.313

Overall Capability:             Pp *   PPL *   PPU -0.23   Ppk -0.23   Cpm *
Potential (Within) Capability:  Cp *   CPL *   CPU -0.28   Cpk -0.28

Performance   Observed   Expected Overall   Expected Within
% < LSL       *          *                  *
% > USL       83.33      75.26              80.23
% Total       83.33      75.26              80.23

Plan Annual Cost

Republic (3GB) $ 660.00

Sprint (3GB) $ 840.00

Verizon (3GB) $ 1,020.00

Is a 3GB (3,000MB) Limit Appropriate?

Process Capability Report for Total Usage (5GB)

Process Data
LSL              *
Target           *
USL              5000
Sample Mean      3755.84
Sample N         12
StDev(Overall)   1107.04
StDev(Within)    889.313

Overall Capability:             Pp *   PPL *   PPU 0.37   Ppk 0.37   Cpm *
Potential (Within) Capability:  Cp *   CPL *   CPU 0.47   Cpk 0.47

Performance   Observed   Expected Overall   Expected Within
% < LSL       *          *                  *
% > USL       8.33       13.05              8.09
% Total       8.33       13.05              8.09

Plan Annual Cost

ATT (5GB) $ 915.00

Republic (5GB) $ 1,020.00

ATT (5GB) $ 1,500.00

Is a 5GB (5,000MB) Limit Appropriate?

Process Capability Report for Total Usage (6GB)

Process Data
LSL              *
Target           *
USL              6000
Sample Mean      3755.84
Sample N         12
StDev(Overall)   1107.04
StDev(Within)    889.313

Overall Capability:             Pp *   PPL *   PPU 0.68   Ppk 0.68   Cpm *
Potential (Within) Capability:  Cp *   CPL *   CPU 0.84   Cpk 0.84

Performance   Observed   Expected Overall   Expected Within
% < LSL       *          *                  *
% > USL       0.00       2.13               0.58
% Total       0.00       2.13               0.58

Plan Annual Cost

T-Mobile (6GB) $ 780.00

Sprint (6GB) $ 780.00

Verizon (6GB) $ 960.00

Is a 6GB (6,000MB) Limit Appropriate?

Process Capability Report for Total Usage (10GB)

Process Data
LSL              *
Target           *
USL              10000
Sample Mean      3755.84
Sample N         12
StDev(Overall)   1107.04
StDev(Within)    889.313

Overall Capability:             Pp *   PPL *   PPU 1.88   Ppk 1.88   Cpm *
Potential (Within) Capability:  Cp *   CPL *   CPU 2.34   Cpk 2.34

Performance   Observed   Expected Overall   Expected Within
% < LSL       *          *                  *
% > USL       0.00       0.00               0.00
% Total       0.00       0.00               0.00

Plan Annual Cost

T-Mobile (10GB) $ 960.00

Is a 10GB (10,000MB) Limit Appropriate?

~6 Sigma!

Process Capability Report for Total Usage (12GB)

Process Data
LSL              *
Target           *
USL              12000
Sample Mean      3755.84
Sample N         12
StDev(Overall)   1107.04
StDev(Within)    889.313

Overall Capability:             Pp *   PPL *   PPU 2.48   Ppk 2.48   Cpm *
Potential (Within) Capability:  Cp *   CPL *   CPU 3.09   Cpk 3.09

Performance   Observed   Expected Overall   Expected Within
% < LSL       *          *                  *
% > USL       0.00       0.00               0.00
% Total       0.00       0.00               0.00

Plan Annual Cost

Sprint (12GB) $ 960.00

Verizon (12GB) $ 1,200.00

Is a 12GB (12,000MB) Limit Appropriate?

Greater than 6 Sigma!

[Figure: Time Series Plot of Data Usage. X axis: Date (tick labels from 3/24/2015 to 2/19/2016). Y axis: Data Usage (0 to 1200).]

A Second Statistical Approach (daily data)

Summary Report for Data Usage

Anderson-Darling Normality Test
  A-Squared      27.78
  P-Value        <0.005

Mean             123.14
StDev            102.64
Variance         10535.65
Skewness         3.9407
Kurtosis         26.1682
N                366

Minimum          0.00
1st Quartile     69.13
Median           96.70
3rd Quartile     138.00
Maximum          1100.00

95% Confidence Interval for Mean      112.59   133.69
95% Confidence Interval for Median    88.25    102.97
95% Confidence Interval for StDev     95.71    110.67

Descriptive Statistics On Daily Usage

[Figure: Probability Plot for Data Usage, four panels: Logistic - 95% CI, Loglogistic - 95% CI, 3-Parameter Loglogistic - 95% CI, Normal - 95% CI (after Johnson transformation). Y axis: Percent.]

Goodness of Fit Test
Logistic                   AD = 13.251   P-Value < 0.005
Loglogistic                AD = 9.501    P-Value < 0.005
3-Parameter Loglogistic    AD = 1.975    P-Value = *
Johnson Transformation     AD = 0.171    P-Value = 0.932

If The Data Is Not Normal What Approximates The Data?

Johnson Transformation for Data Usage

Probability Plot for Original Data:      N 366   AD 27.776   P-Value <0.005
Probability Plot for Transformed Data:   N 366   AD 0.171    P-Value 0.932

Select a Transformation (plot of P-Value for AD test versus Z Value, with Ref P marked; P-Value = 0.005 means ≤ 0.005)
P-Value for Best Fit: 0.931848
Z for Best Fit: 0.38
Best Transformation Type: SU
Transformation function equals
-0.996951 + 0.885314 × Asinh( ( X - 59.1002 ) / 25.8392 )

The Johnson Transformation of the Data
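The fitted Johnson SU transformation is a one-line function of the daily usage X. The sketch below applies the coefficients reported in the Minitab output. The function name `johnson_su` is hypothetical, and the sample inputs are the minimum, quartiles, and maximum from the daily summary report.

```python
import math

def johnson_su(x):
    """Johnson SU transform with the coefficients reported for Data Usage."""
    return -0.996951 + 0.885314 * math.asinh((x - 59.1002) / 25.8392)

# Minimum, quartiles, and maximum from the daily summary report.
for x in (0.00, 69.13, 96.70, 138.00, 1100.00):
    print(f"{x:8.2f} -> {johnson_su(x):7.3f}")
```

The median of the raw data (96.70) maps to a value near zero, and the long right tail is pulled in toward a roughly standard normal scale.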

I-MR Chart of Transformed Data Usage

Individual Value chart (by Billing Cycle):   X-bar = -0.003   UCL = 2.430   LCL = -2.436
Moving Range chart (by Billing Cycle):       MR-bar = 0.915   UCL = 2.989   LCL = 0

Several points on the chart carry the marker "1".

Is the Data In Statistical Control?

Boxplot of Data Usage (by Billing Cycle, Y axis 0 to 1200)

Billing Cycle   Mean Data Usage
1               137.239
2               153.747
3               122.071
4               101.303
5               190.497
6               156.348
7               150.58
8               104.029
9               89.1433
10              103.348
11              61.6645
12              106.752

A Third Statistical Approach


[Figure: Residual Plots for Data Usage, four panels: Normal Probability Plot (Percent versus Residual), Versus Fits (Residual versus Fitted Value), Histogram (Frequency versus Residual), Versus Order (Residual versus Observation Order).]

One-way ANOVA: Data Usage versus Billing Cycle

Method
Null hypothesis          All means are equal
Alternative hypothesis   At least one mean is different
Significance level       α = 0.05

Equal variances were assumed for the analysis.

Factor Information
Factor          Levels   Values
Billing Cycle   12       1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

Analysis of Variance
Source          DF    Adj SS    Adj MS   F-Value   P-Value
Billing Cycle   11    429109    39010    4.04      0.000
Error           354   3416405   9651
Total           365   3845514

Model Summary
S         R-sq     R-sq(adj)   R-sq(pred)
98.2388   11.16%   8.40%       4.99%

Is There A Statistically Significant Difference Between The Months?

But ANOVA Requires The Data to be Normal
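The F-value and model summary in the ANOVA output follow directly from the sums of squares and degrees of freedom. A minimal arithmetic check, using only numbers printed in the table (variable names are illustrative):

```python
import math

# Values copied from the Analysis of Variance table.
ss_factor, df_factor = 429109.0, 11    # Billing Cycle
ss_error, df_error = 3416405.0, 354    # Error
ss_total, df_total = ss_factor + ss_error, df_factor + df_error

ms_factor = ss_factor / df_factor
ms_error = ss_error / df_error

f_value = ms_factor / ms_error
s = math.sqrt(ms_error)                           # pooled standard deviation
r_sq = ss_factor / ss_total
r_sq_adj = 1.0 - ms_error / (ss_total / df_total)

print(f"F = {f_value:.2f}, S = {s:.4f}, "
      f"R-sq = {100 * r_sq:.2f}%, R-sq(adj) = {100 * r_sq_adj:.2f}%")
```

The low R-sq shows that billing cycle explains only a small share of the day-to-day variation, even though the F test is significant.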

Kruskal-Wallis Test: Data Usage versus Billing Cycle

Kruskal-Wallis Test on Data Usage

Billing Cycle   N     Median   Ave Rank   Z
1               31    108.80   217.0      1.84
2               30    130.50   249.8      3.58
3               31    88.40    187.4      0.21
4               30    85.60    160.9      -1.22
5               31    137.90   265.5      4.51
6               31    129.40   234.7      2.82
7               30    88.15    182.3      -0.07
8               31    93.80    187.9      0.24
9               30    75.70    135.9      -2.57
10              31    75.00    148.9      -1.90
11              31    62.50    86.3       -5.35
12              29    73.20    142.6      -2.17
Overall         366            183.5

H = 82.19   DF = 11   P = 0.000
H = 82.19   DF = 11   P = 0.000 (adjusted for ties)

A First Non-Parametric Approach
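The Kruskal-Wallis H statistic can be rebuilt from the group sizes and average ranks in the output, using H = 12 / (N(N+1)) × Σ n_i (R̄_i - (N+1)/2)². The sketch below is an illustration without the tie correction. The function name `kruskal_wallis_h` is hypothetical, and the inputs are the N and Ave Rank columns as printed.

```python
# (N, Ave Rank) per billing cycle, copied from the Kruskal-Wallis output.
groups = [(31, 217.0), (30, 249.8), (31, 187.4), (30, 160.9),
          (31, 265.5), (31, 234.7), (30, 182.3), (31, 187.9),
          (30, 135.9), (31, 148.9), (31, 86.3), (29, 142.6)]

def kruskal_wallis_h(groups):
    """H statistic from group sizes and average ranks (no tie correction)."""
    n_total = sum(n for n, _ in groups)
    grand_mean_rank = (n_total + 1) / 2
    weighted = sum(n * (rank - grand_mean_rank) ** 2 for n, rank in groups)
    return 12.0 / (n_total * (n_total + 1)) * weighted

h = kruskal_wallis_h(groups)
df = len(groups) - 1
print(f"H = {h:.2f}, DF = {df}")
```

Because the average ranks are rounded to one decimal in the output, the result lands slightly below the reported H = 82.19 rather than matching it exactly.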

[Figure: Histogram of Data Usage. Panel variable: Billing Cycle (panels 1 to 12). X axis: Data Usage (0 to 1050). Y axis: Frequency (0 to 20).]

Kruskal-Wallis Test Requires The Distributions To Have Similar Shapes

Mood Median Test: Data Usage versus Billing Cycle

Mood median test for Data Usage
Chi-Square = 70.53   DF = 11   P = 0.000

Billing Cycle   N≤   N>   Median   Q3-Q1
1               10   21   109      68
2               4    26   131      45
3               17   14   88       59
4               19   11   86       46
5               5    26   138      156
6               8    23   129      81
7               16   14   88       78
8               17   14   94       44
9               21   9    76       44
10              22   9    75       46
11              26   5    63       36
12              18   11   73       83

Overall median = 97

(Individual 95.0% CIs for each billing cycle were plotted on a scale from 60 to 240.)

A Second Non-Parametric Approach
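Mood's median test is a chi-square test on the counts of observations at or below versus above the overall median. The sketch below recomputes the statistic from the N≤ and N> columns of the output. The function name `mood_chi_square` is hypothetical.

```python
# (N<=, N>) per billing cycle, copied from the Mood median test output.
counts = [(10, 21), (4, 26), (17, 14), (19, 11), (5, 26), (8, 23),
          (16, 14), (17, 14), (21, 9), (22, 9), (26, 5), (18, 11)]

def mood_chi_square(counts):
    """Pearson chi-square on the 2 x k table of counts around the overall median."""
    total_le = sum(le for le, _ in counts)
    total_gt = sum(gt for _, gt in counts)
    grand_total = total_le + total_gt
    chi2 = 0.0
    for le, gt in counts:
        row_total = le + gt
        for observed, col_total in ((le, total_le), (gt, total_gt)):
            expected = row_total * col_total / grand_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2, len(counts) - 1

chi2, df = mood_chi_square(counts)
print(f"Chi-Square = {chi2:.2f}, DF = {df}")
```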

A Fourth Statistical Approach

Boxplot of Data Usage (by Day Of Week, Y axis 0 to 1200)

Day Of Week   Mean Data Usage
Sunday        163.094
Monday        127.687
Tuesday       87.934
Wednesday     116.76
Thursday      125.612
Friday        117.36
Saturday      124.35

A Fifth Statistical Approach (by days of the week)

[Figure: Histogram of Data Usage. Panel variable: Day Of Week (Sunday to Saturday). X axis: Data Usage (0 to 1050). Y axis: Frequency (0 to 20).]

What Do The Distributions Of Each Day Look Like?

SUNDAY

Summary Report for Data Usage

Anderson-Darling Normality Test
  A-Squared      4.65
  P-Value        <0.005

Mean             163.09
StDev            169.77
Variance         28821.13
Skewness         3.7420
Kurtosis         18.3156
N                52

Minimum          0.00
1st Quartile     74.32
Median           120.15
3rd Quartile     197.35
Maximum          1100.00

95% Confidence Interval for Mean      115.83   210.36
95% Confidence Interval for Median    84.71    152.41
95% Confidence Interval for StDev     142.27   210.53

Sunday Descriptive Statistics

[Figure: Probability Plot for Data Usage, four panels: Exponential - 95% CI, 2-Parameter Exponential - 95% CI, Weibull - 95% CI, 3-Parameter Weibull - 95% CI. Y axis: Percent.]

Goodness of Fit Test
Exponential                AD = 4.176   P-Value < 0.003
2-Parameter Exponential    AD = 1.614   P-Value = 0.017
Weibull                    AD = 0.728   P-Value = 0.053
3-Parameter Weibull        AD = 0.355   P-Value = 0.475

What Distribution Models Sunday?

Histogram of Data Usage, 3-Parameter Weibull (X axis: Data Usage, 0 to 960; Y axis: Frequency, 0 to 40)

Shape # 1.369   Scale # 125.2   Thresh # 26.73   N 50

# This estimated historical parameter is used in the calculations.

A 3-Parameter Weibull Models Sunday Data

Red Bars indicate outliers that were excluded from parameter determination
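A 3-parameter Weibull is an ordinary Weibull shifted right by a threshold. The sketch below evaluates its CDF and mean for the Sunday parameters (shape 1.369, scale 125.2, threshold 26.73). The function names are hypothetical, and the formulas are the standard ones, not output copied from Minitab.

```python
import math

def weibull3_cdf(x, shape, scale, thresh):
    """P(X <= x) for a Weibull shifted right by `thresh`."""
    if x <= thresh:
        return 0.0
    return 1.0 - math.exp(-(((x - thresh) / scale) ** shape))

def weibull3_mean(shape, scale, thresh):
    """Mean of the shifted Weibull: thresh + scale * Gamma(1 + 1/shape)."""
    return thresh + scale * math.gamma(1.0 + 1.0 / shape)

SUNDAY = (1.369, 125.2, 26.73)  # shape, scale, threshold from the histogram

sunday_mean = weibull3_mean(*SUNDAY)
p_below_200 = weibull3_cdf(200.0, *SUNDAY)
print(f"model mean = {sunday_mean:.1f}, P(usage <= 200) = {p_below_200:.3f}")
```

The model mean comes out below the raw Sunday mean of 163.09, which is consistent with the note that outliers were excluded when the parameters were determined.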

MONDAY

Summary Report for Data Usage

Anderson-Darling Normality Test
  A-Squared      5.23
  P-Value        <0.005

Mean             127.69
StDev            119.76
Variance         14342.75
Skewness         2.55234
Kurtosis         7.19780
N                52

Minimum          0.00
1st Quartile     67.92
Median           89.85
3rd Quartile     134.92
Maximum          619.30

95% Confidence Interval for Mean      94.34    161.03
95% Confidence Interval for Median    78.71    108.15
95% Confidence Interval for StDev     100.37   148.52

Monday Descriptive Statistics

[Figure: Probability Plot for Data Usage, four panels: Exponential - 95% CI, 2-Parameter Exponential - 95% CI, Weibull - 95% CI, 3-Parameter Weibull - 95% CI. Y axis: Percent.]

Goodness of Fit Test
Exponential                AD = 6.080   P-Value < 0.003
2-Parameter Exponential    AD = 6.124   P-Value < 0.010
Weibull                    AD = 2.383   P-Value < 0.010
3-Parameter Weibull        AD = 0.398   P-Value = 0.342

What Distribution Models Monday?

Histogram of Data Usage, 3-Parameter Weibull (X axis: Data Usage, 120 to 600; Y axis: Frequency, 0 to 35)

Shape # 1.916   Scale # 74.12   Thresh # 29.30   N 48

# This estimated historical parameter is used in the calculations.

A 3-Parameter Weibull Models Monday Data

Red Bars indicate outliers that were excluded from parameter determination

TUESDAY

Summary Report for Data Usage

Anderson-Darling Normality Test
  A-Squared      1.76
  P-Value        <0.005

Mean             87.934
StDev            45.017
Variance         2026.544
Skewness         2.02797
Kurtosis         7.44336
N                53

Minimum          0.000
1st Quartile     61.250
Median           81.400
3rd Quartile     105.600
Maximum          289.700

95% Confidence Interval for Mean      75.526   100.342
95% Confidence Interval for Median    72.217   89.345
95% Confidence Interval for StDev     37.785   55.699

Tuesday Descriptive Statistics

[Figure: Probability Plot for Data Usage, four panels: Exponential - 95% CI, 2-Parameter Exponential - 95% CI, Weibull - 95% CI, 3-Parameter Weibull - 95% CI. Y axis: Percent.]

Goodness of Fit Test
Exponential                AD = 10.303   P-Value < 0.003
2-Parameter Exponential    AD = 3.239    P-Value < 0.010
Weibull                    AD = 0.382    P-Value > 0.250
3-Parameter Weibull        AD = 0.203    P-Value > 0.500

What Distribution Models Tuesday?

Histogram of Data Usage, 3-Parameter Weibull (X axis: Data Usage, 60 to 300; Y axis: Frequency, 0 to 25)

Shape # 1.882   Scale # 57.02   Thresh # 34.60   N 51

# This estimated historical parameter is used in the calculations.

A 3-Parameter Weibull Models Tuesday Data

Red Bars indicate outliers that were excluded from parameter determination

WEDNESDAY

Summary Report for Data Usage

Anderson-Darling Normality Test
  A-Squared      2.07
  P-Value        <0.005

Mean             116.76
StDev            70.53
Variance         4974.95
Skewness         1.10549
Kurtosis         0.67508
N                53

Minimum          0.00
1st Quartile     69.00
Median           97.10
3rd Quartile     154.00
Maximum          321.50

95% Confidence Interval for Mean      97.32   136.20
95% Confidence Interval for Median    77.27   113.79
95% Confidence Interval for StDev     59.20   87.27

Wednesday Descriptive Statistics

[Figure: Probability Plot for Data Usage, four panels: Exponential - 95% CI, 2-Parameter Exponential - 95% CI, Weibull - 95% CI, 3-Parameter Weibull - 95% CI. Y axis: Percent.]

Goodness of Fit Test
Exponential                AD = 5.427   P-Value < 0.003
2-Parameter Exponential    AD = 2.310   P-Value < 0.010
Weibull                    AD = 1.186   P-Value < 0.010
3-Parameter Weibull        AD = 0.618   P-Value = 0.113

What Distribution Models Wednesday?

Histogram of Data Usage, 3-Parameter Weibull (X axis: Data Usage, 40 to 320; Y axis: Frequency, 0 to 20)

Shape 1.430   Scale 104.1   Thresh 24.66   N 52

A 3-Parameter Weibull Models Wednesday Data

THURSDAY

Summary Report for Data Usage

Anderson-Darling Normality Test
  A-Squared      1.99
  P-Value        <0.005

Mean             125.61
StDev            73.13
Variance         5347.80
Skewness         2.03570
Kurtosis         6.42894
N                52

Minimum          42.10
1st Quartile     74.90
Median           102.00
3rd Quartile     163.27
Maximum          449.50

95% Confidence Interval for Mean      105.25   145.97
95% Confidence Interval for Median    81.45    134.75
95% Confidence Interval for StDev     61.29    90.69

Thursday Descriptive Statistics

[Figure: Probability Plot for Data Usage, four panels: Exponential - 95% CI, 2-Parameter Exponential - 95% CI, Weibull - 95% CI, 3-Parameter Weibull - 95% CI. Y axis: Percent.]

Goodness of Fit Test
Exponential                AD = 6.944   P-Value < 0.003
2-Parameter Exponential    AD = 1.454   P-Value = 0.025
Weibull                    AD = 0.904   P-Value = 0.020
3-Parameter Weibull        AD = 0.324   P-Value > 0.500

What Distribution Models Thursday?

Histogram of Data Usage, 3-Parameter Weibull (X axis: Data Usage, 100 to 400; Y axis: Frequency, 0 to 25)

Shape # 1.364   Scale # 85.54   Thresh # 40.89   N 52

# This estimated historical parameter is used in the calculations.

A 3-Parameter Weibull Models Thursday Data

Red Bars indicate outliers that were excluded from parameter determination

FRIDAY

Summary Report for Data Usage

Anderson-Darling Normality Test
  A-Squared      4.30
  P-Value        <0.005

Mean             117.36
StDev            81.84
Variance         6697.42
Skewness         2.21566
Kurtosis         5.35910
N                52

Minimum          10.70
1st Quartile     67.70
Median           100.95
3rd Quartile     122.95
Maximum          435.30

95% Confidence Interval for Mean      94.58   140.14
95% Confidence Interval for Median    84.26   105.70
95% Confidence Interval for StDev     68.58   101.49

Friday Descriptive Statistics

[Figure: Probability Plot for Data Usage, four panels: Exponential - 95% CI, 2-Parameter Exponential - 95% CI, Weibull - 95% CI, 3-Parameter Weibull - 95% CI. Y axis: Percent.]

Goodness of Fit Test
Exponential                AD = 9.088   P-Value < 0.003
2-Parameter Exponential    AD = 2.477   P-Value < 0.010
Weibull                    AD = 0.607   P-Value = 0.111
3-Parameter Weibull        AD = 0.392   P-Value = 0.404

What Distribution Models Friday?

Histogram of Data Usage, 3-Parameter Weibull (X axis: Data Usage, 100 to 400; Y axis: Frequency, 0 to 35)

Shape # 1.670   Scale # 61.32   Thresh # 39.17   N 51

# This estimated historical parameter is used in the calculations.

A 3-Parameter Weibull Models Friday Data

Red Bars indicate outliers that were excluded from parameter determination

SATURDAY

Summary Report for Data Usage

Anderson-Darling Normality Test
  A-Squared      4.52
  P-Value        <0.005

Mean             124.35
StDev            100.17
Variance         10033.47
Skewness         2.79744
Kurtosis         9.97571
N                52

Minimum          0.00
1st Quartile     69.73
Median           101.85
3rd Quartile     137.40
Maximum          597.70

95% Confidence Interval for Mean      96.46   152.24
95% Confidence Interval for Median    82.46   121.30
95% Confidence Interval for StDev     83.94   124.22

Saturday Descriptive Statistics

[Figure: Probability Plot for Data Usage, four panels: Exponential - 95% CI, 2-Parameter Exponential - 95% CI, Weibull - 95% CI, 3-Parameter Weibull - 95% CI. Y axis: Percent.]

Goodness of Fit Test
Exponential                AD = 7.494   P-Value < 0.003
2-Parameter Exponential    AD = 1.317   P-Value = 0.037
Weibull                    AD = 1.262   P-Value < 0.010
3-Parameter Weibull        AD = 0.441   P-Value = 0.310

What Distribution Models Saturday?

Histogram of Data Usage, 3-Parameter Weibull (X axis: Data Usage, 120 to 600; Y axis: Frequency, 0 to 35)

Shape # 1.246   Scale # 69.33   Thresh # 44.41   N 50

# This estimated historical parameter is used in the calculations.

A 3-Parameter Weibull Models Saturday Data

Red Bars indicate outliers that were excluded from parameter determination

THE SIMULATION

The Simulation Equation

Each bill is the sum of its daily usages, one term per calendar day. For Bill 1 (31 days):

$$
\text{Bill 1} = \sum_{i=1}^{4}\text{Sunday}_i + \sum_{i=1}^{4}\text{Monday}_i + \sum_{i=1}^{5}\text{Tuesday}_i + \sum_{i=1}^{5}\text{Wednesday}_i + \sum_{i=1}^{5}\text{Thursday}_i + \sum_{i=1}^{4}\text{Friday}_i + \sum_{i=1}^{4}\text{Saturday}_i
$$

Bill     Sunday   Monday   Tuesday   Wednesday   Thursday   Friday   Saturday   Total
Bill1    4        4        5         5           5          4        4          31
Bill2    4        4        4         4           4          5        5          30
Bill3    5        5        5         4           4          4        4          31
Bill4    4        4        4         5           5          4        4          30
Bill5    5        4        4         4           4          5        5          31
Bill6    4        5        5         5           4          4        4          31
Bill7    4        4        4         4           5          5        4          30
Bill8    5        5        4         4           4          4        5          31
Bill9    4        4        5         5           4          4        4          30
Bill10   4        4        4         4           5          5        5          31
Bill11   5        5        5         4           4          4        4          31
Bill12   4        4        4         5           4          4        4          29

The Simulation Parameters
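The simulation can be sketched as a Monte Carlo loop: draw one usage value per calendar day from that weekday's fitted 3-parameter Weibull, then add the draws up to a bill total. The code below is an illustration under stated assumptions, not the tool used for the slides: it uses Python's standard `random.weibullvariate(scale, shape)` plus the threshold, treats days as independent, simulates only the Bill 1 day mix, and uses an arbitrary seed and run count. The names `WEIBULL`, `BILL_1_DAYS`, and `simulate_bill` are hypothetical.

```python
import random

# (shape, scale, threshold) per weekday, from the fitted 3-parameter Weibulls.
WEIBULL = {
    "Sunday": (1.369, 125.2, 26.73),
    "Monday": (1.916, 74.12, 29.30),
    "Tuesday": (1.882, 57.02, 34.60),
    "Wednesday": (1.430, 104.1, 24.66),
    "Thursday": (1.364, 85.54, 40.89),
    "Friday": (1.670, 61.32, 39.17),
    "Saturday": (1.246, 69.33, 44.41),
}

# Number of each weekday in Bill 1, from the simulation parameters table.
BILL_1_DAYS = {"Sunday": 4, "Monday": 4, "Tuesday": 5, "Wednesday": 5,
               "Thursday": 5, "Friday": 4, "Saturday": 4}

def simulate_bill(day_counts, rng):
    """One simulated bill total (MB): sum of independent daily draws."""
    total = 0.0
    for day, count in day_counts.items():
        shape, scale, thresh = WEIBULL[day]
        for _ in range(count):
            total += thresh + rng.weibullvariate(scale, shape)
    return total

rng = random.Random(2016)  # arbitrary seed, for repeatability only
bills = [simulate_bill(BILL_1_DAYS, rng) for _ in range(10_000)]

mean_bill = sum(bills) / len(bills)
share_over_3gb = sum(b > 3000 for b in bills) / len(bills)
print(f"mean bill = {mean_bill:.0f} MB, share over 3GB = {share_over_3gb:.1%}")
```

Swapping `BILL_1_DAYS` for another row of the parameters table simulates a bill with a different day mix.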

The Simulation Results


ASSESSING CAPABILITY FROM SIMULATION RESULTS

Is a 1GB (1,000MB) Limit Appropriate?

Is a 2GB (2,000MB) Limit Appropriate?

Is a 3GB (3,000MB) Limit Appropriate?

Is a 4GB (4,000MB) Limit Appropriate?

Is a 5GB (5,000MB) Limit Appropriate?

Is a 6GB (6,000MB) Limit Appropriate?

Is a 10GB (10,000MB) Limit Appropriate?

Is a 12GB (12,000MB) Limit Appropriate?

Data Usage (GB)   <1        1-2       2-3       3-4       4-5       5-6       >6        Expected Monthly Charge
Probability       0.000%    0.330%    29.190%   52.890%   15.850%   1.650%    0.090%

Sprint (1GB)      $40.00    $0.05     $4.38     $7.93     $2.38     $0.25     $0.01     $55.00
Sprint (3GB)                          $50.00    $7.93     $2.38     $0.25     $0.01     $60.57
Sprint (6GB)                                                        $65.00              $65.00
VZ (1GB)          $50.00    $0.05     $4.38     $7.93     $2.38     $0.25     $0.01     $65.00
ATT (2GB)                   $55.00    $4.38     $7.93     $2.38     $0.25     $0.01     $69.95
ATT (5GB)                                                 $75.00    $0.25     $0.01     $75.26
VZ (3GB)                              $65.00    $7.93     $2.38     $0.25     $0.01     $75.57
Sprint (12GB)                                                                 $80.00    $80.00
VZ (6GB)                                                            $80.00    $0.01     $80.01
VZ (12GB)                                                                     $100.00   $100.00
ATT (15GB)                                                                    $125.00   $125.00

Plan Selection Based on Simulation
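The expected monthly charge is the plan's base price plus an overage charge weighted by the simulated probability of exceeding the plan limit. The sketch below reproduces several rows of the table. It assumes a flat $15.00 overage fee per month over the limit, which is inferred from the table (each overage cell equals 15 times the band probability) and is not stated in the source. The function name `expected_charge` is hypothetical.

```python
# Simulated share of bills in each usage band (GB): <1, 1-2, 2-3, 3-4, 4-5, 5-6, >6
BAND_LOWER_GB = [0, 1, 2, 3, 4, 5, 6]
BAND_PROB = [0.0, 0.0033, 0.2919, 0.5289, 0.1585, 0.0165, 0.0009]

OVERAGE_FEE = 15.00  # assumption: flat fee inferred from the table, not from the source

def expected_charge(base_price, plan_gb):
    """Base price plus the flat overage fee times P(usage exceeds the plan limit)."""
    p_over = sum(p for lower, p in zip(BAND_LOWER_GB, BAND_PROB) if lower >= plan_gb)
    return base_price + OVERAGE_FEE * p_over

plans = {"Sprint (1GB)": (40.00, 1), "Sprint (3GB)": (50.00, 3),
         "ATT (2GB)": (55.00, 2), "ATT (5GB)": (75.00, 5),
         "VZ (3GB)": (65.00, 3), "VZ (6GB)": (80.00, 6)}

charges = {name: expected_charge(price, gb) for name, (price, gb) in plans.items()}
for name, charge in sorted(charges.items(), key=lambda item: item[1]):
    print(f"{name:13s} ${charge:6.2f}")
```

The Sprint (6GB) row in the table shows no overage amount, so it is left out of this check.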

         Measured              Simulation
Limit    Ppk      % > USL      Ppk       % > USL
1GB      -0.83    99.36%       -1.22     100%
2GB      -0.53    94.36%       -0.7047   99.67%
3GB      -0.23    75.26%       -0.1984   70.48%
5GB      0.37     13.05%       0.81      1.74%
6GB      0.68     2.13%        1.31      0.09%
10GB     1.88     0.00%        3.33      0.00%
12GB     2.48     0.00%        4.35      0.00%

Comparison of Simulated and Measured Capability

Conclusion
• Mobile phone data usage can be analyzed using:
  – Descriptive Statistics
  – Run Charts
  – Probability Plots
  – Control Charts
  – Process Capability
• Non-normal data requires different hypothesis tests, including:
  – Kruskal-Wallis
  – Mood Median
• A stochastic simulation model can be created by:
  – Determining a distribution that characterizes each factor
  – Specifying a mathematical relationship between the factors
• A process capability analysis on simulated data can be used to determine specification limits

Questions?

Contact Information:
Brandon R. Theiss, PE
Rutgers School of Law - Camden
Brandon.Theiss@Rutgers.edu
