forked from hadley/adv-r
Commit
Showing 2 changed files with 156 additions and 264 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -64,181 +64,6 @@ method(arg1, arg2, arg3) | |
|
||
(In fact this message is so powerful that I've talked to programmers who moved to R from JavaScript and it took them a while to figure out that they're not calling the `frame` method of the `data` object.) | ||
|
||
## S4 {#s4} | ||
|
||
S4 works in a similar way to S3, but it adds formality and rigour. Methods still belong to functions, not classes, but: \index{objects!S4|see{S4}} \index{S4} | ||
|
||
* Classes have formal definitions which describe their fields and | ||
inheritance structures (parent classes). | ||
|
||
* Method dispatch can be based on multiple arguments to a generic function, | ||
not just one. | ||
|
||
* There is a special operator, `@`, for extracting slots (aka fields) | ||
from an S4 object. | ||
|
||
All S4-related code is stored in the `methods` package. This package is always available when you're running R interactively, but may not be available when running R in batch mode. For this reason, it's a good idea to include an explicit `library(methods)` whenever you're using S4. | ||
|
||
S4 is a rich and complex system. There's no way to explain it fully in a few pages. Here I'll focus on the key ideas underlying S4 so you can use existing S4 objects effectively. To learn more, some good references are: | ||
|
||
* [S4 system development in Bioconductor](http://www.bioconductor.org/help/course-materials/2010/AdvancedR/S4InBioconductor.pdf) | ||
|
||
* John Chambers' [_Software for Data Analysis_](http://amzn.com/0387759352?tag=devtools-20) | ||
|
||
* [Martin Morgan's answers to S4 questions on Stack Overflow](http://stackoverflow.com/search?tab=votes&q=user%3a547331%20%5bs4%5d%20is%3aanswe) | ||
|
||
### Recognising objects, generic functions, and methods | ||
|
||
Recognising S4 objects, generics, and methods is easy. You can identify an S4 object because `str()` describes it as a "formal" class, `isS4()` returns `TRUE`, and `pryr::otype()` returns "S4". S4 generics and methods are also easy to identify because they are S4 objects with well defined classes. | ||
|
||
There aren't any S4 classes in the commonly used base packages (stats, graphics, utils, datasets, and base), so we'll start by creating an S4 object from the built-in stats4 package, which provides some S4 classes and methods associated with maximum likelihood estimation: | ||
|
||
```{r} | ||
library(stats4) | ||
library(pryr) | ||
# From example(mle) | ||
y <- c(26, 17, 13, 12, 20, 5, 9, 8, 5, 4, 8) | ||
nLL <- function(lambda) - sum(dpois(y, lambda, log = TRUE)) | ||
fit <- mle(nLL, start = list(lambda = 5), nobs = length(y)) | ||
# An S4 object | ||
isS4(fit) | ||
otype(fit) | ||
# An S4 generic | ||
isS4(nobs) | ||
ftype(nobs) | ||
# Retrieve an S4 method, described later | ||
mle_nobs <- method_from_call(nobs(fit)) | ||
isS4(mle_nobs) | ||
ftype(mle_nobs) | ||
``` | ||
|
||
Use `is()` with one argument to list all classes that an object inherits from. Use `is()` with two arguments to test if an object inherits from a specific class. | ||
|
||
```{r} | ||
is(fit) | ||
is(fit, "mle") | ||
``` | ||
|
||
You can get a list of all S4 generics with `getGenerics()`, and a list of all S4 classes with `getClasses()`. This list includes shim classes for S3 classes and base types. You can list all S4 methods with `showMethods()`, optionally restricting selection either by `generic` or by `class` (or both). It's also a good idea to supply `where = search()` to restrict the search to methods available in the global environment. | ||
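
A minimal sketch of these introspection helpers, assuming the stats4 package is installed (it ships with R). The results depend on which packages are attached, so no output is shown:

```r
library(methods)
library(stats4)

# All S4 generics currently known to the session
generics <- getGenerics()
head(as.character(generics))

# All S4 classes, including the shim classes for S3 classes and base types
classes <- getClasses()
"mle" %in% classes

# Methods for a single generic, then all methods that involve one class
showMethods("coef")
showMethods(classes = "mle")
```

The `where` argument is left at its default here to keep the sketch short.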
|
||
### Defining classes and creating objects | ||
|
||
In S3, you can turn any object into an object of a particular class just by setting the class attribute. S4 is much stricter: you must define the representation of a class with `setClass()`, and create a new object with `new()`. You can find the documentation for a class with a special syntax: `class?className`, e.g., `class?mle`. \index{S4!classes} \index{classes!S4} | ||
|
||
An S4 class has three key properties: | ||
|
||
* A __name__: an alpha-numeric class identifier. By convention, S4 class names | ||
use UpperCamelCase. | ||
|
||
* A named list of __slots__ (fields), which defines slot names and | ||
permitted classes. For example, a person class might be represented by a | ||
character name and a numeric age: `list(name = "character", age = "numeric")`. | ||
\index{slots} | ||
|
||
* A string giving the class it inherits from, or, in S4 terminology, | ||
that it __contains__. You can provide multiple classes for multiple | ||
inheritance, but this is an advanced technique which adds much | ||
complexity. | ||
|
||
In `slots` and `contains` you can use S4 classes, S3 classes registered | ||
with `setOldClass()`, or the implicit class of a base type. In `slots` | ||
you can also use the special class `ANY` which does not restrict the input. | ||
|
||
S4 classes have other optional properties like a `validity` method that tests if an object is valid, and a `prototype` object that defines default slot values. See `?setClass` for more details. | ||
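
As a rough illustration of `validity` and `prototype`, the sketch below defines a hypothetical `Account` class (the class and slot names are assumptions made for this example, not part of the text above):

```r
library(methods)

# Hypothetical class, for illustration only
setClass("Account",
  slots = list(owner = "character", balance = "numeric"),
  # Default slot values used when new() is not given a value
  prototype = list(balance = 0),
  # Return TRUE if the object is valid, otherwise a string describing the problem
  validity = function(object) {
    if (length(object@balance) != 1 || object@balance < 0) {
      "balance must be a single non-negative number"
    } else {
      TRUE
    }
  }
)

# balance falls back to the prototype value
acct <- new("Account", owner = "Alice")
acct@balance

# new() rejects an invalid object; capture the error message instead of stopping
res <- tryCatch(
  new("Account", owner = "Bob", balance = -5),
  error = function(e) conditionMessage(e)
)
res
```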
|
||
The following example creates a Person class with fields name and age, and an Employee class that inherits from Person. The Employee class inherits the slots and methods from the Person, and adds an additional slot, boss. To create objects we call `new()` with the name of the class, and name-value pairs of slot values. \indexc{setClass()} \indexc{new()} | ||
|
||
```{r} | ||
setClass("Person", | ||
slots = list(name = "character", age = "numeric")) | ||
setClass("Employee", | ||
slots = list(boss = "Person"), | ||
contains = "Person") | ||
alice <- new("Person", name = "Alice", age = 40) | ||
john <- new("Employee", name = "John", age = 20, boss = alice) | ||
``` | ||
|
||
Most S4 classes also come with a constructor function with the same name as the class: if that exists, use it instead of calling `new()` directly. | ||
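
One way such a constructor can come about is sketched below, using a hypothetical `Point` class: `setClass()` invisibly returns a generator function, and a hand-written wrapper around `new()` can add defaults or argument checks. The names `Point` and `new_point` are assumptions for this example.

```r
library(methods)

# Assigning the result of setClass() gives a generator that acts as a constructor
Point <- setClass("Point", slots = list(x = "numeric", y = "numeric"))
p1 <- Point(x = 1, y = 2)

# A hand-written constructor can supply defaults and check arguments
# before delegating to new()
new_point <- function(x = 0, y = 0) {
  stopifnot(is.numeric(x), is.numeric(y))
  new("Point", x = x, y = y)
}
p2 <- new_point(3)
p2@y
```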
|
||
To access slots of an S4 object use `@` or `slot()`: \index{subsetting!S4} \index{S4!subsetting} | ||
|
||
```{r} | ||
alice@age | ||
slot(john, "boss") | ||
``` | ||
|
||
(`@` is analogous to `$`, and `slot()` to `[[`.) | ||
|
||
If an S4 object contains (inherits from) an S3 class or a base type, it will have a special `.Data` slot which contains the underlying base type or S3 object: \indexc{.Data} | ||
|
||
```{r} | ||
setClass("RangedNumeric", | ||
contains = "numeric", | ||
slots = list(min = "numeric", max = "numeric")) | ||
rn <- new("RangedNumeric", 1:10, min = 1, max = 10) | ||
rn@min | ||
rn@.Data | ||
``` | ||
|
||
Since R is an interactive programming language, it's possible to create new classes or redefine existing classes at any time. This can be a problem when you're interactively experimenting with S4. If you modify a class, make sure you also recreate any objects of that class, otherwise you'll end up with invalid objects. | ||
|
||
### Creating new methods and generics | ||
|
||
S4 provides special functions for creating new generics and methods. `setGeneric()` creates a new generic or converts an existing function into a generic. `setMethod()` takes the name of the generic, the classes the method should be associated with, and a function that implements the method. For example, we could take `union()`, which usually just works on vectors, and make it work with data frames: \index{S4!generics} \index{S4!methods} \index{generics!S4} \index{methods!S4} | ||
|
||
```{r} | ||
setGeneric("union") | ||
setMethod("union", | ||
c(x = "data.frame", y = "data.frame"), | ||
function(x, y) { | ||
unique(rbind(x, y)) | ||
} | ||
) | ||
``` | ||
|
||
If you create a new generic from scratch, you need to supply a function that calls `standardGeneric()`: | ||
|
||
```{r} | ||
setGeneric("myGeneric", function(x) { | ||
standardGeneric("myGeneric") | ||
}) | ||
``` | ||
|
||
`standardGeneric()` is the S4 equivalent to `UseMethod()`. | ||
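
Putting the pieces together, a from-scratch generic with two methods might look like the sketch below. The generic `area` and the classes `Circle` and `Rect` are hypothetical names chosen for illustration:

```r
library(methods)

# Hypothetical classes, for illustration only
setClass("Circle", slots = list(r = "numeric"))
setClass("Rect", slots = list(w = "numeric", h = "numeric"))

# The generic only dispatches; the real work lives in the methods
setGeneric("area", function(shape) {
  standardGeneric("area")
})

setMethod("area", "Circle", function(shape) pi * shape@r^2)
setMethod("area", "Rect", function(shape) shape@w * shape@h)

rect_area <- area(new("Rect", w = 2, h = 3))
rect_area

# Check that the generic and a method were registered
isGeneric("area")
existsMethod("area", "Circle")
```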
|
||
### Method dispatch | ||
|
||
If an S4 generic dispatches on a single class with a single parent, then S4 method dispatch is the same as S3 dispatch. The main difference is how you set up default values: S4 uses the special class `ANY` to match any class and `missing` to match a missing argument. Like S3, S4 also has group generics, documented in `?S4groupGeneric`, and a way to call the "parent" method, `callNextMethod()`. \index{S4!method dispatch rules} | ||
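
The sketch below illustrates the `ANY` and `missing` pseudo-classes and `callNextMethod()` with single dispatch. The generic `describe` and the classes `Animal` and `Dog` are hypothetical names used only for this example:

```r
library(methods)

# Hypothetical classes: Dog inherits from Animal
setClass("Animal", slots = list(name = "character"))
setClass("Dog", contains = "Animal")

setGeneric("describe", function(x) standardGeneric("describe"))

# Fallback for any class without a more specific method
setMethod("describe", "ANY", function(x) "something")
# Used when the argument is not supplied at all
setMethod("describe", "missing", function(x) "nothing")
setMethod("describe", "Animal", function(x) paste("an animal named", x@name))
# The child method extends the parent's result via callNextMethod()
setMethod("describe", "Dog", function(x) {
  paste("a dog, which is", callNextMethod())
})

describe(1L)
describe()
describe(new("Dog", name = "Rex"))
```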
|
||
Method dispatch becomes considerably more complicated if you dispatch on multiple arguments, or if your classes use multiple inheritance. The rules are described in `?Methods`, but they are complicated and it's difficult to predict which method will be called. For this reason, I strongly recommend avoiding multiple inheritance and multiple dispatch unless absolutely necessary. | ||
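
For a simple case where the outcome is easy to predict, dispatch on two arguments can be sketched as follows. The generic `describe_pair` is a hypothetical name, and only base types are used so that no inheritance ambiguity arises:

```r
library(methods)

setGeneric("describe_pair", function(x, y) standardGeneric("describe_pair"))

# One method per combination of argument classes
setMethod("describe_pair", c(x = "numeric", y = "numeric"),
  function(x, y) "numeric and numeric")
setMethod("describe_pair", c(x = "numeric", y = "character"),
  function(x, y) "numeric and character")
# Fallback when no more specific combination matches
setMethod("describe_pair", c(x = "ANY", y = "ANY"),
  function(x, y) "fallback")

describe_pair(1, 2)
describe_pair(1, "a")
describe_pair("a", 1)
```

With deeper class hierarchies, several methods can match at once, which is where the prediction problem described above comes from.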
|
||
Finally, there are two functions that find which method gets called given the specification of a generic call: | ||
|
||
```{r, eval = FALSE} | ||
# From methods: takes generic name and class names | ||
selectMethod("nobs", list("mle")) | ||
# From pryr: takes an unevaluated function call | ||
method_from_call(nobs(fit)) | ||
``` | ||
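
To see how `selectMethod()` follows inheritance, here is a small self-contained sketch. The names `Parent`, `Child`, and `info` are hypothetical; `existsMethod()` is used for contrast because it only reports methods defined directly for a signature:

```r
library(methods)

# Hypothetical classes: Child inherits from Parent and defines no methods itself
setClass("Parent", slots = list(v = "numeric"))
setClass("Child", contains = "Parent")

setGeneric("info", function(x) standardGeneric("info"))
setMethod("info", "Parent", function(x) "parent method")

# No method is defined directly for Child
direct <- existsMethod("info", "Child")
direct

# selectMethod() returns the method that dispatch would pick for a Child object
m <- selectMethod("info", "Child")
is(m, "MethodDefinition")
m(new("Child", v = 1))
```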
|
||
### Exercises | ||
|
||
1. Which S4 generic has the most methods defined for it? Which S4 class | ||
has the most methods associated with it? | ||
|
||
1. What happens if you define a new S4 class that doesn't "contain" an | ||
existing class? (Hint: read about virtual classes in `?setClass`.) | ||
|
||
1. What happens if you pass an S4 object to an S3 generic? What happens | ||
if you pass an S3 object to an S4 generic? (Hint: read `?setOldClass` | ||
for the second case.) | ||
|
||
## Picking a system {#picking-a-system} | ||
|
||
Three OO systems is a lot for one language, but for most R programming, S3 suffices. In R you usually create fairly simple objects and methods for pre-existing generic functions like `print()`, `summary()`, and `plot()`. S3 is well suited to this task, and the majority of OO code that I have written in R is S3. S3 is a little quirky, but it gets the job done with a minimum of code. \index{objects!which system?} | ||
|