Henrik Bengtsson, hb@maths.lth.se

(MSc Computer Science, PhD candidate in Statistics)

Mathematical Statistics

Centre for Mathematical Sciences

Lund University, Sweden

Object-oriented programming and programming style guidelines for R


Outline

• Objects and Classes

• Concepts of object-oriented programming

• A complete example in [R] – Shapes

• References in [R]

• [R] Programming Style Guidelines with a few coding conventions.


Part I:

Object-oriented programming in [R]


Objects and Classes

Class name: MicroarrayData

Fields:
  layout: Layout
  R: double[][]
  G: double[][]
  Rb: double[][]
  Gb: double[][]

Methods:
  nbrOfSlides(): int
  nbrOfSpots(): int
  swapDyes(...)
  append()
  as.data.frame(): data.frame
  getLayout(): Layout
  setLayout(layout)
  subtractBackground(...)
  normalizeWithinSlide(...)
  normalizeAcrossSlides(...)
  plot(...)
  plotSpatial(...)
  boxplot(...)
  hist(...)
  static read(...): MicroarrayData
  write(...)

[Figure: objects of different classes, labelled MicroarrayData, Layout, MicroarrayData, MicroarrayData]

A class is a data type; an object is an instance of a class.

A class is the recipe for a certain cake...

...and the objects are the actual cakes of that kind.
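The recipe/cake distinction can be sketched in a few lines of base R. This is only an illustration using plain S3, not the author's setClassS3() machinery, and the field names nbrOfRows and nbrOfColumns are assumptions made up for the example.

```r
# Constructor: the "recipe". Each call creates a new object of class "Layout".
# The fields nbrOfRows and nbrOfColumns are illustrative assumptions.
Layout <- function(nbrOfRows, nbrOfColumns) {
  structure(
    list(nbrOfRows = nbrOfRows, nbrOfColumns = nbrOfColumns),
    class = "Layout"
  )
}

# A generic function and its method for class "Layout"
nbrOfSpots <- function(object, ...) UseMethod("nbrOfSpots")
nbrOfSpots.Layout <- function(object, ...) {
  object$nbrOfRows * object$nbrOfColumns
}

# Two objects (cakes) made from the same class (recipe)
smallLayout <- Layout(nbrOfRows = 4, nbrOfColumns = 6)
largeLayout <- Layout(nbrOfRows = 20, nbrOfColumns = 30)

class(smallLayout)        # "Layout"
nbrOfSpots(smallLayout)   # 24
nbrOfSpots(largeLayout)   # 600
```

Both objects share the same fields and methods but carry their own values.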


Encapsulation, Inheritance, and Polymorphism

Encapsulation means that a group of related properties, methods, and other members are treated as a single unit or object. Objects can control how properties are changed and methods are executed.

Why: Makes it easier to change your implementation at a later date by letting you hide implementation details of your objects, a practice called data hiding.
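One possible way to get data hiding in base R is a closure: the field lives in the constructor's environment and can only be reached through the methods. This is a sketch of the idea, not the mechanism used by the author's Object class; the Circle constructor and its method names here are hypothetical.

```r
# Hypothetical Circle whose radius is hidden inside the constructor's
# environment and can only be changed through setRadius().
Circle <- function(radius = 1) {
  .radius <- radius                       # private field
  list(
    getRadius = function() .radius,
    setRadius = function(newRadius) {
      # The object controls how the property is changed
      stopifnot(is.numeric(newRadius), newRadius >= 0)
      .radius <<- newRadius
      invisible(NULL)
    },
    getArea = function() pi * .radius^2
  )
}

circle <- Circle(radius = 2)
circle$setRadius(3)
circle$getRadius()                        # 3

# An invalid value is rejected and the state is left untouched
result <- try(circle$setRadius(-1), silent = TRUE)
inherits(result, "try-error")             # TRUE
circle$getRadius()                        # 3
```

Because callers only see getRadius(), setRadius() and getArea(), the internal representation could later change (for example, to store the diameter) without breaking their code.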


Encapsulation, Inheritance, and Polymorphism

Inheritance describes the ability to create new classes based on an existing class. The new class inherits all the properties, methods, and events of the base class, and can be customized with additional properties and methods.

Why: Promotes code reuse, since the code for the methods inherited by the subclasses does not need to be rewritten.
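In plain S3, inheritance is expressed through the class vector. The sketch below assumes a small Shape/Rectangle/Square hierarchy similar to the one used later in the slides; the field names and getArea() are illustrative, not taken from the author's code.

```r
# Base class: Rectangle, which is also a Shape
Rectangle <- function(width, height) {
  structure(list(width = width, height = height),
            class = c("Rectangle", "Shape"))
}

# Subclass: Square reuses the Rectangle constructor and prepends its own class
Square <- function(side) {
  object <- Rectangle(width = side, height = side)
  class(object) <- c("Square", class(object))
  object
}

# getArea() is written once, for Rectangle only
getArea <- function(shape, ...) UseMethod("getArea")
getArea.Rectangle <- function(shape, ...) shape$width * shape$height

square <- Square(side = 3)
getArea(square)             # 9, computed by the inherited Rectangle method
inherits(square, "Shape")   # TRUE
```

No getArea.Square() is needed: dispatch walks the class vector until it finds getArea.Rectangle().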


Encapsulation, Inheritance, and Polymorphism

Polymorphism means that you can have multiple classes that can be used interchangeably, even though each class implements the same properties or methods in different ways. Polymorphism is essential to object-oriented programming because it allows you to use items with the same names, no matter what type of object is in use at the moment.

Why: Inheritance becomes more flexible. Subclasses can keep some methods inherited from their superclasses and override others.
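A minimal base R sketch of polymorphism: the same call, getArea(shape), works on every element of a list although each class implements it differently. The classes and fields are assumptions for illustration only.

```r
# Two unrelated implementations behind the same method name
Circle <- function(radius) {
  structure(list(radius = radius), class = c("Circle", "Shape"))
}
Rectangle <- function(width, height) {
  structure(list(width = width, height = height),
            class = c("Rectangle", "Shape"))
}

getArea <- function(shape, ...) UseMethod("getArea")
getArea.Circle <- function(shape, ...) pi * shape$radius^2
getArea.Rectangle <- function(shape, ...) shape$width * shape$height

# A method defined once for Shape and kept by both subclasses
describe <- function(shape, ...) UseMethod("describe")
describe.Shape <- function(shape, ...) {
  sprintf("%s with area %.2f", class(shape)[1], getArea(shape))
}

allShapes <- list(Circle(radius = 1), Rectangle(width = 2, height = 3))

# The caller does not need to know which class each element has
areas <- vapply(allShapes, getArea, numeric(1))
descriptions <- vapply(allShapes, describe, character(1))
descriptions   # "Circle with area 3.14" "Rectangle with area 6.00"
```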


Overloading and Overriding

Overloaded members are used to provide different versions of a property or method that have the same name, but that accept a different number of parameters, or parameters with different data types. Currently not supported in [R].

Overridden properties and methods are used to replace an inherited property or method that is not appropriate in a derived class. Overridden members must accept the same data type and number of arguments (not enforced in [R]). Derived classes inherit overridden members.
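The sketch below illustrates both ideas in base R S3, under the assumption of a small Rectangle/Square hierarchy. Overriding is done by defining a method for the subclass; overloading is only emulated here, with one function that inspects its arguments, which is a common workaround rather than true overloading. All names are hypothetical.

```r
Rectangle <- function(width, height) {
  structure(list(width = width, height = height),
            class = c("Rectangle", "Shape"))
}
Square <- function(side) {
  structure(list(width = side, height = side),
            class = c("Square", "Rectangle", "Shape"))
}

# Overriding: Square replaces the method inherited from Rectangle,
# but can still call the inherited version through NextMethod()
describe <- function(shape, ...) UseMethod("describe")
describe.Rectangle <- function(shape, ...) {
  sprintf("Rectangle %gx%g", shape$width, shape$height)
}
describe.Square <- function(shape, ...) {
  paste("Square, i.e.", NextMethod())
}

describe(Square(3))   # "Square, i.e. Rectangle 3x3"

# Emulated overloading: one setXY() accepting either c(x, y) or x and y
Point <- function(x = 0, y = 0) structure(list(x = x, y = y), class = "Point")
setXY <- function(point, x, y = NULL) {
  if (is.null(y)) {             # called as setXY(point, c(x, y))
    stopifnot(length(x) == 2)
    y <- x[2]
    x <- x[1]
  }
  point$x <- x
  point$y <- y
  point                         # copy-by-value: return the modified copy
}

fromVector <- setXY(Point(), c(1, 2))
fromScalars <- setXY(Point(), 1, 2)
identical(fromVector, fromScalars)   # TRUE
```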


Unified Modeling Language (UML) class diagram

[Figure: UML class diagram. Legend: multiplicities (1, 0..*), abstract, static, private, association (“using”), inheritance (“is a”)]


Interactive example

# Create different Shape objects and store them in a list
allShapes <- list(
  Rectangle(Point(0,0), width=5, height=8, color="blue"),
  Square(Point(-2,-5), side=3, color="red"),
  Triangle(Point(3,3), width=10, height=12, color="orange"),
  Triangle(Point(-4,-2.5), width=12, height=3, color="purple"),
  Circle(Point(-4,4), radius=5, color="green")
)

# Plot all shapes
for (shape in allShapes)
  paint(shape)

# Get first mouse click
click <- getFromClick(Point)

# isInsidePlotRegion() is a hypothetical helper standing in for the slide's
# pseudocode condition "while click is inside plot region"
while (isInsidePlotRegion(click)) {
  # Check with all shapes whether they contain the click coordinates
  for (shape in allShapes) {
    if (contains(shape, click))
      paint(click, col=getColor(shape), style="disc")
    else
      paint(click, style="circle")
  }

  # Get another mouse click
  click <- getFromClick(Point)
}

Callouts on the slide:

• polymorphism: paint(shape) and contains(shape, click) work for every class of shape
• static method call: getFromClick(Point)
• Either shape$contains(click) or contains(shape, click)


Code for a class

# Class (constructor); the fields .x and .y are private
setClassS3("Point", function(x=0, y=0) {
  extend(Object(), "Point",
    .x = x,  # private
    .y = y   # private
  );
})

# Methods
setMethodS3("getX", "Point", function(this) {
  this$.x;
})

setMethodS3("getY", "Point", function(this) {
  this$.y;
})

setMethodS3("getXY", "Point", function(this) {
  c(this$.x, this$.y);
})

setMethodS3("setX", "Point", function(this, newX) {
  this$.x <- newX;  # Using reference!
})

setMethodS3("setY", "Point", function(this, newY) {
  this$.y <- newY;
})

setMethodS3("setXY", "Point", function(this, newXY) {
  this$.x <- newXY[1];
  this$.y <- newXY[2];
})

# Static method
setMethodS3("getFromClick", "Point", function(this) {
  xy <- locator(n=1);  # Ask for one mouse click
  Point(x=xy$x, y=xy$y);
})

setMethodS3("print", "Point", function(this) {
  print(sprintf("%s at (%.3f,%.3f).", getClass(this), this$.x, this$.y));
})

setMethodS3("paint", "Point", function(this, ...) {
  points(this$.x, this$.y, ...);
})


References

• One reference can only refer to one object.
• One object can have one or several references referring to it.

c <- Circle(Point(0,0), radius=2);
c1 <- c;
setRadius(c1, 4);
getRadius(c);  # will give 4!

• Results in more user-friendly packages for the end user!
• Makes design and implementation much easier.
• Saves memory.
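The difference between a copied value and a shared reference can be seen with base R alone, because environments are not copied on assignment. This is a sketch of the behaviour only; that the author's classes rely on environments is an assumption, and the variable names are made up.

```r
# Ordinary R values: the assignment behaves like a copy
circleValue <- list(radius = 2)
copiedValue <- circleValue
copiedValue$radius <- 4
circleValue$radius            # 2, the original is unchanged

# Environments: both names refer to the same object
circle <- new.env()
circle$radius <- 2
circle1 <- circle
circle1$radius <- 4
circle$radius                 # 4, changed through the other reference
identical(circle, circle1)    # TRUE, one object with two references
```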


References, how?

• References are not supported by [R] since everything is copy-by-value. A function has to return the new instance:

setValue <- function(list, value) {
  list$value <- value;
  return(list);
}

• What you really want to do:

setValue <- function(list, value) {
  list$value <- value;
}

• Why: For example, each of the Shape objects can use (refer to) the same Point object to specify its position. By moving the Point object, all Shape objects will then move along. This is not possible without references.

• However, references can be emulated by encapsulating such functionality in a root class Object, from which all classes must be derived.

Contact me to get the code for Object, setClassS3() and setMethodS3().
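As a rough illustration of the shared-Point scenario, the sketch below emulates references with base R environments. It is not the author's Object, setClassS3() or setMethodS3() code; the constructors and field names are hypothetical and kept deliberately simple.

```r
# A Point stored in an environment, so that it is passed by reference
Point <- function(x = 0, y = 0) {
  this <- new.env()
  this$x <- x
  this$y <- y
  this
}

# Modifies the Point in place; nothing has to be returned and reassigned
setXY <- function(point, x, y) {
  point$x <- x
  point$y <- y
  invisible(point)
}

# Shapes are plain lists that refer to a Point for their position
Circle <- function(position, radius) list(position = position, radius = radius)
Square <- function(position, side) list(position = position, side = side)

origin <- Point(0, 0)
circle <- Circle(origin, radius = 2)
square <- Square(origin, side = 3)

# Moving the shared Point moves every shape that refers to it
setXY(origin, 5, -1)
c(circle$position$x, circle$position$y)   # 5 -1
c(square$position$x, square$position$y)   # 5 -1
```

With ordinary lists instead of environments, each shape would hold its own copy of the position and setXY() would have no visible effect on them.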


Part II:

[R] Programming Style Guidelines


Programming Style Guidelines

• 80% of the lifetime cost of a piece of software goes to maintenance.

• Hardly any software is maintained for its whole life by the original author.

• Code conventions improve the readability of the software, allowing programmers to understand new code more quickly and thoroughly.

• If you ship your source code as a product, you need to make sure it is as well packaged and clean as any other product you create.


[R] Coding Convention

• Currently there is no [R] Coding Convention (RCC); people invent their own conventions or use none at all.

• We suggest adopting a modified version of the Java coding convention, which has proved to be successful and is a de facto standard.


Class names

Names representing classes must be nouns and written in mixed case starting with upper case.

Shape, Rectangle, Point, MicroarrayData, Layout

Avoid . (period) in class names, because it might lead to ambiguities, e.g. my.very.own.class is not a good name.


Field and variable names

Variable and field names must be in mixed case starting with lower case.

x, y, nbrOfSlides, locus

To maintain readability of the code, do not shorten variable names, e.g. nbrOfGrids (or ngrids) is much better than ngr.

Avoid using . (period) in variable names to make names more consistent with other naming conventions. However, private fields, e.g. layout., may contain periods for improving readability.


Method names

Names representing methods (functions) must be verbs and written in mixed case starting with lower case.

getLayout(), normalize(method, slides)

To maintain readability of the code, do not shorten method names, e.g. normalizeWithinSlides() is much better than normWSl().

For the same reasons as before, avoid using . (period) in method names, e.g. get.layout() is not good.
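The naming rules for classes, variables and methods can be seen together in a short base R sketch. The class and method names are borrowed from the MicroarrayData class diagram, but the implementation is an assumption written for illustration. Note that in plain S3 the period in swapDyes.MicroarrayData is required by the dispatch mechanism; it separates the method name from the class name and is not part of either.

```r
# Class name: a noun, mixed case starting with upper case.
# The fields R and G follow the author's class diagram.
MicroarrayData <- function(R, G) {
  structure(list(R = R, G = G), class = "MicroarrayData")
}

# Method name: a verb, mixed case starting with lower case, not shortened
swapDyes <- function(object, ...) UseMethod("swapDyes")
swapDyes.MicroarrayData <- function(object, ...) {
  swappedData <- object            # variable name: lower-case mixed case
  swappedData$R <- object$G
  swappedData$G <- object$R
  swappedData
}

microarrayData <- MicroarrayData(R = c(1, 2, 3), G = c(4, 5, 6))
swappedData <- swapDyes(microarrayData)
swappedData$R                      # 4 5 6
```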


File names

Classes should be declared in individual files with the file name matching the class name.

Point.R, GenePixData.R

This results in a well-organized file structure and also gives quick access to the source code. Listing all *.R files in a source directory will give you an overview of all the classes.

For stand-alone functions one may adopt the same policy:

intToHex.R, col2rgb.R
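With one class or function per file, the overview mentioned above can be produced directly from R. The sketch below creates a temporary directory with empty example files so that it is self-contained; the directory name and the extra README.txt are assumptions for the demonstration.

```r
# Set up a throw-away source directory with one file per class/function
sourceDir <- file.path(tempdir(), "classFilesExample")
dir.create(sourceDir, showWarnings = FALSE)
fileNames <- c("Point.R", "GenePixData.R", "intToHex.R", "README.txt")
created <- file.create(file.path(sourceDir, fileNames))

# List all *.R files and strip the extension to get class/function names
sourceFiles <- list.files(sourceDir, pattern = "[.]R$")
definedNames <- sub("[.]R$", "", sourceFiles)
print(definedNames)
```

The same listing could be passed to source() in a loop to load every class of a project.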


Where to start

Tutorials and source code:

• R Programming Style Guidelines
• Programming with References in R
• Implementing support for references in R

http://www.maths.lth.se/help/R/