Object-Oriented (OO) data types



“Object-oriented programming (OOP) is a programming paradigm based on concept of objects, which can contain data and code:

  • data in the form of fields (often known as attributes or properties),

  • and code, in the form of procedures (often known as methods)“

Wikipedia


Object-oriented systems are made of class and method.

Illustration by Sue Looxkwood

Illustration by Sue Looxkwood


A class defines the behaviour of objects by describing their (1) attributes, their (2) methods, and their (3) relationship to other classes.


Classes are usually organised in a hierarchy

If a method does not exist for a child, then the parent’s method is used instead; the child inherits behaviour from the parent.


in R everything is treated like as an object.


The Class function

Many of the objects that are created within an R session have attributes associated with them. One common attribute associated with an object is its class.

The class is a vector which allows an object to inherit from multiple classes

It allows you to specify the order of inheritance for complex classes.

#create my vector
my_vector <- c(1,2,3)
my_vector
## [1] 1 2 3
# looking for the class of my vector
class(my_vector)
## [1] "numeric"
# give the SPC class to my vector
class(my_vector) <- append(class(my_vector),"SPC")
class(my_vector)
## [1] "numeric" "SPC"

My vector know inherits all the attributes and methods of the SPC class


The UseMethod function

One way to define a method for a class is to use the UseMethod command to define a hierarchy of functions that will react appropriately.

The UseMethod command will tell R to look for a function whose prefix matches the current function.

R searches then for the method with the suffix of class.

#create my vector
my_vector <- list(first="one", second="two", third="third")
#give classe to my vector
class(my_vector) <- append(class(my_vector),"SPC")

# Generic method ("prefix" (FUNCTION) )
GetFirst <- function(x){
  
 UseMethod("GetFirst",x)

}

# Method of the SPC class ("prefix + suffix" (FUNCTION+METHODS))
GetFirst.SPC <- function(x){
  return(x$first)
}

GetFirst(my_vector)
## [1] "one"


R is an OO programming language


R has three object oriented systems

OO Data types Description
S3 Generic-function OO implementing message-passing OO. With message-passing, methods are sent to objects and the object determines which function to call. It has NO FORMAL DEFINITION OF CLASSES.
S4 Similar to S3. S4 has formal class definitions, which describe the class, its inheritance, and methods.
Reference classes (RC) Different from S3 and S4, in RC methods belong to classes not functions. $ is used to separate objects and methods, so method calls look like class$method(params).




S3


An S3 object is a base type with at least a class attribute.

Most R’s objects that you encounter are S3s.

library(pryr)
# Useful tools to pry back the covers of R and understand the language at a deeper level.
df <- data.frame(x = 1:10, y = letters[1:10])
otype(df)  # A data frame is an S3 class
## [1] "S3"
otype(df$x)  # Avector isn't an S3 class
## [1] "base"


In S3 METHODS belong to FUNCTIONS, called generic functions or generics.

S3 methods do not belong to objects or classes.



mean
#> function (x, ...) 
#> UseMethod("mean")
#> <bytecode: 0x557b15d41900>
#> <environment: namespace:base>
ftype(mean)
#> [1] "s3"      "generic"


You can assess all the methods that belong to a generic with methods():

methods("mean")
## [1] mean.Date        mean.default     mean.difftime    mean.POSIXct     mean.POSIXlt    
## [6] mean.quosure*    mean.vctrs_vctr*
## see '?methods' for accessing help and source code
methods("t.test")
## [1] t.test.default* t.test.formula*
## see '?methods' for accessing help and source code


You can also list all generics that have a method for a given class:

methods(class = "ts")
##  [1] [             [<-           aggregate     as.data.frame cbind         coerce       
##  [7] cycle         diff          diffinv       filter        initialize    kernapply    
## [13] lines         Math          Math2         monthplot     na.omit       Ops          
## [19] plot          print         show          slotsFromS3   t             time         
## [25] window        window<-     
## see '?methods' for accessing help and source code


Defining classes


Classes can be created using the function structure or setting it directly with class

# Create and assign class in one step
foo <- structure(list(), class = "foo")

# Create, then set class
foo <- list()
class(foo) <- "foo"

You can determine if an object inherits from a specific class using inherits(x, "classname").

class(foo)
#> [1] "foo"
inherits(foo, "foo")
#> [1] TRUE

Most S3 classes provide a constructor function:

foo <- function(x) {
  if (!is.numeric(x)) stop("X must be numeric")
  structure(list(x), class = "foo")
}


Creating Methods

R possesses a simple generic function mechanism which can be used for an object-oriented style of programming. Method dispatch takes place based on the class of the first argument to the generic function, or of the object supplied as an argument to UseMethod().

To add a new generic create a function that calls UseMethod().

An R object is a data object which has a class attribute.

A class attribute is a character vector giving the names of the classes from which the object inherits.

If the object does not have a class attribute, it has an implicit class.

When a function calling UseMethod("fun") is applied to an object with class attribute c(“first”, “second”), the system searches for a function called fun.first and, if it finds it, applies it to the object. If no such function is found a function called fun.second is tried. If no class name produces a suitable function, the function fun.default is used, if it exists, or an error results.

foo <- function(x) UseMethod("foo")

A generic isn’t useful without some methods.

To add a method to a class, you have to create a regular function with the correct (generic.class) name:

my_func <- function(x) UseMethod("my_func")

my_func.my_class <- function(x) "Hallo?"

my_class <- structure(list(), class = "my_class")
class(my_class)
## [1] "my_class"
my_func(my_class)
## [1] "Hallo?"

S3 method can be dispatched

my_func <- function(x) UseMethod("my_func") # generic

my_func.my_class <- function(x) "my_class"

my_func.default <- function(x) "unknown_class"

what = structure(list(), class = "my_class")
my_func(what)
## [1] "my_class"
# No method for b class, so uses method for a class
my_func(structure(list(), class = c("my_class", "a")))
## [1] "my_class"
# No method for c class, so falls back to default
my_func(structure(list(), class = "c"))
## [1] "unknown_class"



S4

S4 adds formality to S3.

Methods still belong to functions (not to classes), but:

  1. Classes have formal definitions which describe their fields and inheritance structures.
  1. There is a special operator, @, for extracting slots (fields).
  1. Method dispatch can be based on multiple arguments to a generic function, not just one.

All S4 related code is stored in the methods.


Recognising objects and methods

You can identify an S4 object because str() describes it as a “formal” class, isS4() returns TRUE, and pryr::otype() returns “S4”

# From example(mle) Estimate parameters by the method of maximum likelihood.
library(stats4)
y <- c(26, 17, 13, 12, 20, 5, 9, 8, 5, 4, 8)
nLL <- function(lambda) -sum(dpois(y, lambda, log = TRUE))
fit <- mle(nLL, start = list(lambda = 5), nobs = length(y))

# An S4 object
isS4(fit)
## [1] TRUE
otype(fit)
## [1] "S4"
# An S4 generic
isS4(nobs)
## [1] TRUE
# function type
ftype(nobs)
## [1] "s4"      "generic"


Defining classes

In S4 you must define the representation of a class with setClass() and create a new object with new().

An S4 class has three properties

  1. A name: an alpha-numeric identifier.

  2. A named list of slots, which defines slot names and permitted classes. Slots can be S4 classes, S3 classes registered with setOldClass(), or base types.

  3. A string giving the class it inherits from, or that it contains.

Here an example:

setClass("Person",
  slots = list(name = "character",
               age = "numeric")
  )

setClass("Employee",
  slots = list(boss = "Person"),
  contains = "Person" # <-- inheritance from 
  )

alice <- new("Person", name = "Alice", age = 40)
john  <- new("Employee", name = "John", age = 20, 
             boss = alice
             )

To access slots of an S4 object use @ or slot():

alice@age
## [1] 40
slot(john, "boss")
## An object of class "Person"
## Slot "name":
## [1] "Alice"
## 
## Slot "age":
## [1] 40

If an S4 object contains an S3 class or a base type, it will have a special .Data slot which contains the underlying type:

setClass("RangedNumeric",
  contains = "numeric",
  slots = list(min = "numeric", max = "numeric"))

rn <- new("RangedNumeric", 1:10, min = 1, max = 10)
rn@min
## [1] 1
rn@.Data
##  [1]  1  2  3  4  5  6  7  8  9 10


Creating new methods

S4 provides special functions for creating new generics and methods.

  • setGeneric() creates a new generic or converts an existing function into a generic.

  • setMethod() takes (1) the name of the generic, (2) the classes the method should be associated with, and (3) a function that implements the method. For example, we could take union(), which usually just works on vectors, and make it work with data frames:

setGeneric("union")
## [1] "union"
setMethod("union",
  c(x = "data.frame", y = "data.frame"),
  function(x, y) {
    unique(rbind(x, y))
  }
)


Dispaching a method

If an S4 generic dispatches on a single class with a single parent then S4 method dispatch is the same as S3 dispatch.

Method dispatch becomes considerably more complicated if you dispatch on multiple arguments or if your classes use multiple inheritance. The rules are described in ?Methods,

To find which method gets called given the specification of a generic call:

selectMethod("nobs", list("mle"))
#> function (object, ...) 
#> if ("nobs" %in% slotNames(object)) object@nobs else NA_integer_
#> <bytecode: 0x7f991b0b4008>
#> <environment: namespace:stats4>
#> 
#> Signatures:
#>         object
#> target  "mle" 
#> defined "mle" 

Referece Class (RC)

They are fundamentally different to S3 and S4 because:

  • RC methods belong to objects not functions

  • RC objects are mutable

These properties make RC objects behave more like objects do in most other programming languages, e.g., Python, Ruby, Java, and C#.

Reference classes are implemented using R code: they are a special S4 class that wraps around an environment.


Defining Classes

RC classes are best used for describing objects that change over time.

  • setRefClass() create a new RC class. The only required argument is an alphanumeric name.

  • new() create new RC objects.

Account <- setRefClass("Account")
Account$new()
## Reference class object of class "Account"

setRefClass() also accepts a list of name-class pairs that define class fields

Additional named arguments passed to new() will set initial values of the fields.

You can get and set field values with $:

Account <- setRefClass("Account",
  fields = list(balance = "numeric")
  )

a <- Account$new(balance = 100)
a$balance
## [1] 100
a$balance <- 200
a$balance
## [1] 200

Note that RC objects are mutable (they have reference semantics, and are not copied-on-modify):

b <- a
b$balance
## [1] 200
a$balance <- 0
b$balance
## [1] 0

For this reason, RC objects come with a copy() method that allow you to make a copy of the object:

c <- a$copy()
c$balance
## [1] 0
a$balance <- 100
c$balance
## [1] 0


RC methods are associated with a class and can modify its fields in place.

Account <- 
  setRefClass("Account",
              fields  = list(balance = "numeric"),
              methods = list(
                withdraw = function(x) { balance <<- balance - x },
                deposit  = function(x) { balance <<- balance + x }
                )
              )

You call an RC method in the same way as you access a field:

a <- Account$new(balance = 100)
a$deposit(100) # <-- Method that modify a field
a$balance
## [1] 200

The final argument to setRefClass() is contains, or the name of the parent RC class to inherit behaviour from.

NoOverdraft <- 
  setRefClass("NoOverdraft",
              contains = "Account", # INHERITTED from
              methods = list(
                withdraw = function(x) {
                  if (balance < x) print("Not enough money")
                  balance <<- balance - x
                  }
                )
)

# new account 
accountJohn <- NoOverdraft$new(balance = 100)

# deposit cash
accountJohn$deposit(50)
accountJohn$balance
## [1] 150
# asking for more than I got
accountJohn$withdraw(200)
## [1] "Not enough money"


Method dispatch

Method dispatch is very simple in RC because methods are associated with classes not functions.

When you call x$f(), R will look for a method f in the class of x, then in its parent, then its parent’s parent, and so on.

From within a method, you can call the parent method directly with callSuper(...).




A work by Matteo Cereda and Fabio Iannelli