Towards a Theory of Programming


Roadmap

• Early Concepts & Thoughts
• Metrics
• UML Process
• Time is Money


Abstraction – Underground Map


The Babylonian Tower Principle

Programming languages still fail, and probably always will fail, to produce abstraction mechanisms suitable for large software modules.


Babylonian Tower (cont.)

• Each language has a hierarchy of abstractions

• E.g., Java has 5 levels:
  – Methods
  – Classes
  – Files
  – Packages
  – Jar files

• Number of children (at any level) should be 7±2

• => Total number of methods in a manageable Java program should be < 10^5
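The bound can be checked with a few lines of arithmetic. The sketch below assumes a full tree in which every one of the five levels has the same fan-out; the function name `max_leaves` is made up for illustration.

```python
# Upper bound on the number of methods, assuming each of the five levels
# (jar files > packages > files > classes > methods) has the same fan-out.
LEVELS = ["methods", "classes", "files", "packages", "jar files"]

def max_leaves(levels: int, max_children: int) -> int:
    """Number of leaves in a full tree with `levels` levels of fan-out."""
    return max_children ** levels

low = max_leaves(len(LEVELS), 7 - 2)   # 5 children per level
high = max_leaves(len(LEVELS), 7 + 2)  # 9 children per level
print(low, high)  # 3125 59049
```

Even at the generous end (9 children everywhere) the count stays below 10^5.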


Brooks: MMM

• X9 Factor
• Surgical Team
• Adding people to a late project makes it later
• People and Months are not interchangeable
• Flow diagrams are obsolete
• No silver bullet


Recap: High Quality Design (interim definition)

• A design that
  – Minimizes the number of bugs
  – Minimizes the effort for adding new features


OOP Metrics

• Chidamber & Kemerer 1994

• Some require full source code

• Others require only the relationships between classes

• Motivation: Objective measurement of quality from the program itself


MCC: McCabe Cyclomatic complexity

• # branches in the method (strictly: decision points + 1)
• Typical value: ~5
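A rough way to compute this number is to count decision points in the syntax tree and add one. The sketch below does that for Python source using the standard `ast` module; which node types count as decisions is an assumption of this sketch, and real tools differ on the details.

```python
import ast

# Node types treated as decision points (an assumption of this sketch;
# real tools differ in what they count).
_DECISIONS = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: decision points + 1."""
    tree = ast.parse(source)
    count = 1
    for node in ast.walk(tree):
        if isinstance(node, _DECISIONS):
            count += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra short-circuit branches
            count += len(node.values) - 1
    return count

SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in (2, 3):
        if n % d == 0 and n > d:
            return "composite"
    return "other"
'''
print(cyclomatic_complexity(SAMPLE))  # 5
```

The sample has two `if`s, one `for` and one `and`, so it lands right at the typical value.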


NOC: Number Of Children

• # direct subclasses
• Typical value: unbounded
  – E.g.: java.util.Iterator

• High value means
  – Reuse (good)
  – Coupling (bad)

• Low value means the opposite


DIT: Depth of Inheritance Tree

• # ancestors
• Typical value: 1-2

• High value means
  – Hard to understand
  – High cohesion
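NOC and DIT only need the inheritance relationships, not the full source. A minimal sketch using Python class introspection follows; the `Shape` hierarchy is hypothetical, and leaving the implicit root `object` out of the depth is an assumption of this sketch.

```python
class Shape: pass
class Polygon(Shape): pass
class Circle(Shape): pass
class Square(Polygon): pass

def noc(cls: type) -> int:
    """Number Of Children: direct subclasses only."""
    return len(cls.__subclasses__())

def dit(cls: type) -> int:
    """Depth of Inheritance Tree: longest path up to a root class.
    `object` is not counted as an ancestor here (an assumption)."""
    parents = [b for b in cls.__bases__ if b is not object]
    return 0 if not parents else 1 + max(dit(p) for p in parents)

print(noc(Shape), noc(Square))  # 2 0
print(dit(Shape), dit(Square))  # 0 2
```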


CBO: Coupling Between Objects

• #Classes on which a class directly depends

• Typical value: 30

• Low value means
  – Low coupling (good)
  – The code is using mostly primitive types (bad)


LCOM: Lack of Cohesion of Methods

• For each class build an undirected graph
  – A node for each method and each field
  – An edge between a method and a field if the method accesses the field
  – An edge between two methods if one of them calls the other
  – LCOM = # connected components in this graph

• Good value: 1
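The graph construction above can be sketched directly. The input format (lists of names plus access and call pairs) and the example class members are hypothetical; a real tool would extract them from the source or bytecode.

```python
from collections import defaultdict

def lcom(methods, fields, accesses, calls):
    """Count connected components of the undirected method/field graph.

    accesses: iterable of (method, field) pairs
    calls:    iterable of (caller, callee) pairs
    """
    graph = defaultdict(set)
    for a, b in list(accesses) + list(calls):
        graph[a].add(b)
        graph[b].add(a)

    seen, components = set(), 0
    for node in list(methods) + list(fields):
        if node in seen:
            continue
        components += 1          # found an unvisited component
        stack = [node]
        while stack:             # depth-first walk over the component
            cur = stack.pop()
            if cur in seen:
                continue
            seen.add(cur)
            stack.extend(graph[cur] - seen)
    return components

methods = ["get_x", "set_x", "area", "log"]
fields = ["x", "w", "h"]
accesses = [("get_x", "x"), ("set_x", "x"), ("area", "w"), ("area", "h")]

print(lcom(methods, fields, accesses, calls=[]))  # 3
print(lcom(methods, fields, accesses,
           calls=[("log", "area"), ("area", "get_x")]))  # 1
```

In the first call the class splits into three unrelated groups ({get_x, set_x, x}, {area, w, h}, {log}); the two extra calls tie them into one.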


Shortcomings of Metrics

• Easy to game the system

• No correlation to quality
  – Because quality cannot be measured

• Not normalized


Package Cycles: FindBugs 0.72


Package Cycles: FindBugs 1.35


Package Cycles: Ant


Package Cycles: Antlr


Package Cycles: Summary I

• Destructive rather than constructive
  – Based on negative points

• Blind spots

• => The more blind spots you have, the better your score is!?


Package Cycles: Summary II

• Works only on statically typed languages

• Negative points

• => Dynamic languages will score very high


Hard Data: Defect Fixing Costs

Relative cost of fixing a defect, by time introduced (rows) and time detected (columns):

Time Introduced   Req.   Design   Impl.   System Test   Post-Release
Req.              1      3        5-10    10            10-100
Design            -      1        10      15            25-100
Impl.             -      -        1       10            10-25

• Source: Code Complete II (McConnell)


Think Ahead


Approach: UML Process

• Philosophy:
  – Software has a top-down structure
  – An optimal solution at stage n requires careful examination of all factors at stage n-1
  – Human readable documents are less prone to errors than source code
  – A picture is worth a thousand words

• Values
  – Measure twice cut once
  – Strive to prevent future defects

• Principles
  – Top-down
  – Divide & Conquer via careful design of interfaces
  – Abstraction: Each stage concentrates on a specific kind of information

• Practices
  – Analysis: Requirement gathering/Use cases
  – Architecture
  – Design
  – Implementation
  – Testing


Discussion: UML Process

• Distinguishes between design and programming

• Promoted formats for describing programs
  – Documents: SRS, TDD
  – Visual models: UML, BON

• These representations abstract away statement-level details
  – Considered to have a minor effect on overall quality

• “Waterfall”: Measure twice cut once


UML Tools

• Diagrams
  – Class
  – Part
  – State-chart
  – Activity
  – Sequence
  – Deployment
  – Use case

• Code generation from the UML model
  – Round-tripping

• Highlights:
  – Multiple abstractions
  – Use case diagrams are a formal description of the informal notion of requirements


Iterative Waterfall

• Motivation: Requirements change

• Develop part of the program using the UML process
  – All stages: Analysis, design, impl., …

• Repeat for some other part of the program

• Challenge: which part to choose in each iteration?


Design Documents: The Manufacturing Analogy

• Customer needs something new
  – Medicine, airplane, yogurt, mobile phone, …

• Experts prepare a rough sketch

• Engineers prepare a detailed blueprint

• Workers manufacture the product by executing the blueprint

• Analogy
  – Engineers are the software designers
  – The blueprint is the UML model/design documents
  – Workers are the programmers


Software is not Manufactured

• A medicine will be reproduced billions of times
• Each instance must be identical to the others
• => There’s a need for a precise blueprint

• Programmers need to “manufacture” a program only once
  – Reproduction is automatic (copying the executable)

• A fully detailed blueprint is not really needed
  – If the programmer understands the designer’s intent, a simple phone call is enough
  – Formalization may be a waste of time

• The code is the blue print

• The executable is the product


UML: The Building Architecture Analogy

• Customer wants to build a house

• Meets an architect, explains their needs

• Architect prepares a model

• Customer approves

• Engineer prepares a construction plan
  – Addresses lower level issues, e.g., drainage, structures, material

• Contractor executes the construction plan


Software is not a Building

• In building, the costs of “undoing” are prohibitive
  – Hence “measure twice, cut once”

• In software, “measure twice” may be more expensive than “cut twice”

• The building model provides the customer with a faithful description of the building

• In software, use case diagrams and req. documents do not come close to a faithful description of the final system
  – Customer cannot provide effective feedback
  – Chances of developing the wrong program are high


Criticism of the UML Process

• How do you know when to stop?
  – Even UML supporters agree it is not adequate for coding methods and low-level classes
  – => There is a level where a plain-old compiler is better
  – => Optimal results require a mixture
  – => How do you know where the break-even point is?

• How do you know which classes you need?
  – You start implementing in your head
  – Is it really more cost effective than implementing the real code?

• Much easier to express classes than state
  – Tendency to yield designs w/ many similar classes even if their differences can be easily expressed via state

• Over-engineering
  – Build a lot of flexibility into the software
  – To prevent going back to the early stages (see next slide)

• Traceability


Over-Engineering

• Simple:
  – A class that traverses (pre-order) a tree of files/folders
  – Computes total size of all files

• Over-engineering
  – Compute something else
  – Iterate in a different order
  – Ignore certain files
  – Iterate over something other than files
  – Iterate over something that is not hierarchical

• YAGNI: You Are Not Going to Need It
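For scale, the "simple" version fits in about a dozen lines. The sketch below uses a plain function rather than a class, and the name `total_size` and the decision to skip symbolic links are choices of this sketch, not part of the slide.

```python
import os

def total_size(root: str) -> int:
    """Pre-order traversal: visit a folder, then recurse into its entries.

    Symbolic links are skipped (an assumption of this sketch).
    """
    total = 0
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                total += total_size(entry.path)
            elif entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
    return total
```

Each item in the over-engineering list (pluggable computation, pluggable order, filters, non-file sources) would add a parameter or an abstraction layer that this version simply does not have.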


The Mathematics of YAGNI

• A tree of height 3, degree 3
  – Each third child is redundant (incl. subtree)
  – Total nodes: 13
  – Redundant nodes: 1+1+4=6 (46%)

• A tree of height h, degree d
  – Total nodes: s(h) = d*s(h-1) + 1, s(0) = 0
  – Redundant nodes: r(h) = s(h-1) + (d-1)*r(h-1), r(0) = 0
  – => r(h) is O(s(h))
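The two recurrences can be evaluated directly to confirm the numbers for the height-3, degree-3 tree. The base case r(0) = 0 is an assumption of this sketch (an empty tree has no redundant nodes); the function names are illustrative.

```python
def total_nodes(h: int, d: int) -> int:
    """s(h) = d*s(h-1) + 1, with s(0) = 0."""
    return 0 if h == 0 else d * total_nodes(h - 1, d) + 1

def redundant_nodes(h: int, d: int) -> int:
    """r(h) = s(h-1) + (d-1)*r(h-1), with r(0) = 0 (assumed base case)."""
    if h == 0:
        return 0
    return total_nodes(h - 1, d) + (d - 1) * redundant_nodes(h - 1, d)

s, r = total_nodes(3, 3), redundant_nodes(3, 3)
print(s, r, f"{r / s:.0%}")  # 13 6 46%

# How the redundant share develops as the tree gets deeper (degree 3):
for h in range(1, 9):
    print(h, redundant_nodes(h, 3) / total_nodes(h, 3))
```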


The Agile Manifesto

…we have come to value:

Individuals and interactions over processes and tools

Working software over comprehensive documentation

Customer collaboration over contract negotiation

Responding to change over following a plan


The Importance of Time

• A hypothetical programming task

• Approach one: 5 days
• Approach two: 1 day
  – Same interface
  – Code is not well structured (high coupling, low cohesion)

• First approach
  – Minimal effort: 5 days

• Second approach
  – Effort: 1 (best case) to 6 (worst case) days

• Prefer the second approach
  – Tests will stay
  – Other team members can work on their parts
  – Sad scenario: you lost 1 day
  – Happy scenario: you earned at least 4 days

• So time is a key factor. Can we estimate development time?


Time Estimates: Physician Appointments

• My physician has an accurate schedule for at least in advance

• Method: Compute time per appointment
  – Evidence based estimation
  – Based on gathered statistical data, law of large numbers

• Properties of appointments
  – Countable
  – Identifiable end
  – Abundance


Time Estimates: A Software Project

• Time per class?
  – Not countable

• Time per sub-system?
  – Not abundant

• Features?
  – Countable (breakdown of the big task)
  – Identifiable end (write tests)
  – Abundant (by definition)
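Because features are countable and have an identifiable end, the physician's method carries over: average the observed time per finished feature and multiply by the number still open. A minimal sketch with made-up numbers (the function name and the history are hypothetical):

```python
from statistics import mean

def estimate_remaining_days(days_per_finished_feature, features_left):
    """Average observed time per feature times the number still open."""
    if not days_per_finished_feature:
        raise ValueError("no finished features to learn from yet")
    return mean(days_per_finished_feature) * features_left

history = [2.0, 3.0, 1.0, 2.0]   # hypothetical: days spent on finished features
print(estimate_remaining_days(history, features_left=10))  # 20.0
```

The estimate only becomes trustworthy once the history is long enough for the average to settle, which is the "abundant" requirement.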


Burn Charts

• Time is important
  – (As shown in previous slide)
  – So, let’s describe our progress vs. time

• Vertical axis: tasks completed
• Horizontal axis: time line

• Two variants: burn-up, burn-down
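Both variants are derived from the same raw data, the number of tasks finished per period. A small sketch with hypothetical weekly counts:

```python
from itertools import accumulate

def burn_up(completed_per_period):
    """Cumulative number of tasks completed after each period."""
    return list(accumulate(completed_per_period))

def burn_down(total_tasks, completed_per_period):
    """Number of tasks still open after each period."""
    return [total_tasks - done for done in burn_up(completed_per_period)]

done_per_week = [3, 2, 4, 1]          # hypothetical data
print(burn_up(done_per_week))         # [3, 5, 9, 10]
print(burn_down(12, done_per_week))   # [9, 7, 3, 2]
```

Burn-up plots the first list against time; burn-down plots the second.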


Burn Down


Burn Up


Burn Up Example


Quality in Software (new definition)

• High-quality software is software whose burn curve is linear

• Similar to Big-O notation of algorithms

• Does not distinguish between two linear curves
  – Differences in domain, languages, …

• States that a flattening is the #1 risk
  – Can be experienced even in student assignments

• Result oriented
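One possible way to make "linear" measurable is the R² of a straight line fitted to the burn-up curve: 1.0 for a perfectly steady pace, lower when the curve flattens. This is only one candidate measure, chosen for this sketch; the slides do not prescribe one, and both data series are hypothetical.

```python
def r_squared(ys):
    """R^2 of a least-squares line fitted to ys against x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("need at least two periods and some progress")
    return (sxy * sxy) / (sxx * syy)

steady = [2, 4, 6, 8, 10, 12]        # hypothetical burn-up curves
flattening = [4, 7, 9, 10, 10, 10]
print(r_squared(steady) > r_squared(flattening))  # True
```

Like the definition itself, the measure ignores the slope: two teams with different but steady paces both score 1.0.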


Summary

• Time to completion is a key factor
• Time estimation by features is practical
• Burn up charts show progress
• Quality: Linear burn curve