Introduction to R

Martin Morgan (mtmorgan@fhcrc.org), Fred Hutchinson Cancer Research Center, Seattle, WA, USA.
24 August 2014

Outline

Part I

  • Vectors (data)
  • Functions
  • Help!

Part II

  • Classes (objects)
  • Generics & methods
  • Help!

Part III

  • Packages
  • Help!

Part I: Vectors (data)

1                # vector of length 1
[1] 1
c(1, 1, 2, 3, 5) # vector of length 5
[1] 1 1 2 3 5

Part I: Vectors (data)

  • logical: c(TRUE, FALSE); integer; numeric; complex; character: c("A", "beta")
  • list: list(c(TRUE, FALSE), c("A", "beta"))
  • Statistical concepts: factor, NA
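
A minimal sketch of how these types can be inspected. The variable names (lst, sex, age) are made up for illustration; the functions are all base R.

```r
## atomic vector types, inspected with typeof()
typeof(TRUE)            # "logical"
typeof(1L)              # "integer" (the L suffix requests an integer)
typeof(1)               # "double"; class(1) is "numeric"
typeof(1+2i)            # "complex"
typeof("A")             # "character"

## a list can hold elements of different types
lst <- list(c(TRUE, FALSE), c("A", "beta"))
length(lst)             # 2

## factor: categorical data with a fixed set of levels
sex <- factor(c("Female", "Male", "Female"))
levels(sex)             # "Female" "Male"
table(sex)              # counts per level

## NA: a missing value, propagated by most computations
age <- c(23, NA, 31)
is.na(age)              # FALSE TRUE FALSE
mean(age)               # NA
mean(age, na.rm = TRUE) # 27
```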

Assignment and names

x <- c(1, 1, 2, 3, 5)
y = c(5, 5, 3, 2, 1)
z <- c(Female=12, Male=3)
  • = and <- both assign at top level; <- is the usual convention
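
A short sketch of working with names, and of the one place where = and <- behave differently (inside a function call). The variable w is made up for illustration.

```r
z <- c(Female=12, Male=3)
names(z)                  # "Female" "Male"
z["Female"]               # subset by name; the result keeps its name
z[["Male"]]               # 3, a single element without its name

## names can be replaced after creation
names(z) <- c("F", "M")
z

## inside a call, = matches an argument by name;
## <- assigns first, then passes the value
median(x = c(1, 5, 9))    # 5
median(w <- c(1, 5, 9))   # 5, and 'w' now exists in the workspace
w                         # 1 5 9
```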

Part I: Vectors (data)

Operations

x + y        # vectorized
[1] 6 6 5 5 6
x / 5        # ...recycling
[1] 0.2 0.2 0.4 0.6 1.0
x[c(3, 1)]   # subset
[1] 2 1
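
A few more operations in the same spirit, as a sketch; r is a made-up name. When the longer length is not a multiple of the shorter one, R still recycles but issues a warning.

```r
x <- c(1, 1, 2, 3, 5)
y <- c(5, 5, 3, 2, 1)

x * y                    # element-wise: 5 5 6 6 5
r <- 1:6 + c(10, 20)     # shorter vector recycled three times
r                        # 11 22 13 24 15 26

x > 1                    # logical vector: FALSE FALSE TRUE TRUE TRUE
x[x > 1]                 # logical subsetting: 2 3 5
x[-1]                    # negative index drops an element: 1 2 3 5
x[2:4]                   # a range of positions: 1 2 3
```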

Part I: Functions

Examples: c(), concatenate values; rnorm(), generate random normal deviates; plot()

x <- rnorm(1000)    # 1000 normal deviates
y <- x + rnorm(1000, sd = 0.5)
  • Optional, named arguments; positional matching
args(rnorm)
function (n, mean = 0, sd = 1) 
NULL
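
A sketch of how argument matching plays out for rnorm(), followed by a small user-defined function. The seed is arbitrary, and rescale() is a made-up name, not part of R.

```r
set.seed(123)                        # arbitrary seed, for reproducibility
a <- rnorm(5, 0, 2)                  # positional: n, mean, sd
set.seed(123)
b <- rnorm(5, sd = 2)                # mean omitted, so its default (0) is used
set.seed(123)
d <- rnorm(sd = 2, n = 5)            # named arguments can appear in any order
identical(a, b) && identical(b, d)   # TRUE

## a user-defined function with an optional argument
rescale <- function(v, center = mean(v)) v - center
rescale(c(1, 2, 3))                  # -1 0 1
rescale(c(1, 2, 3), center = 1)      # 0 1 2
```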

Part I: Functions

plot(x, y)

[Figure: scatter plot of y against x]

  • formula: another way to specify the plot, plot(y ~ x)

Part I: Help!

Within R

?rnorm

RStudio

  • “Help” tab, search for “rnorm”

Main sections

  • Title, Description, Usage, Arguments, Details, Value (result), See also, Examples
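
Other routes to the same help pages, as a sketch. All of these functions ship with R; the search string is only an example.

```r
help("rnorm")                        # same as ?rnorm
help.search("normal distribution")   # search titles and keywords; same as ??"normal distribution"
apropos("rnorm")                     # objects whose names match a pattern
example("rnorm")                     # run the code in the page's Examples section
```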

Part II: Classes (objects)

Motivation: manipulate complicated data

  • e.g., x and y from previous example are related to one another – same length, element i of y is a transformation of element i of x

Solution: a “data frame” to coordinate access

df <- data.frame(X=x, Y=y)
head(df, 3)
        X       Y
1 -1.3692 -1.2625
2  1.9072  2.6103
3 -0.5395 -0.5987
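
A sketch of some common ways to look at a data frame. The seed is an arbitrary choice for reproducibility, so the values differ from those shown on the slides.

```r
set.seed(1)                       # arbitrary seed
x <- rnorm(1000)
y <- x + rnorm(1000, sd = 0.5)
df <- data.frame(X = x, Y = y)

nrow(df); ncol(df)                # 1000 rows, 2 columns
names(df)                         # "X" "Y"
str(df)                           # compact summary of structure
summary(df)                       # per-column summaries
df[3, ]                           # third row, all columns
df[3, "Y"] == y[3]                # TRUE: rows keep x and y aligned
```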

Part II: Generics & methods

class(df) # plain function
[1] "data.frame"
dim(df)   # generic & method for data.frame
[1] 1000    2
head(df$X, 4)  # column access
[1] -1.3692  1.9072 -0.5395 -1.3264

Part II: Generics & methods

## create or update 'Z'
df$Z <- sqrt(abs(df$Y))
## subset rows and / or columns
head(df[df$X > 0, c("X", "Z")])
         X      Z
2  1.90720 1.6156
5  0.02705 0.5804
6  0.18376 0.5624
8  0.04149 0.2101
9  0.96177 0.3850
14 0.48720 1.0353
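
The same operations on a small, made-up data frame d, so the results are easy to check by eye. subset() and order() are base R; this is a sketch of alternatives, not the only way to do it.

```r
d <- data.frame(X = c(-1, 2, 0.5, -3), Y = c(-4, 9, 0.25, 1))
d$Z <- sqrt(abs(d$Y))               # 2 3 0.5 1

d[d$X > 0, ]                        # rows 2 and 3, all columns
d[d$X > 0, c("X", "Z")]             # ...and only columns X, Z
subset(d, X > 0, select = c(X, Z))  # same rows and columns, via subset()
d[order(d$X), ]                     # rows sorted by X
```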

Part II: Generics & methods

plot(Y ~ X, df) # Y ~ X, values from 'df'
## lm(): linear model, returns class 'lm'
fit <- lm(Y ~ X, df)
abline(fit)  # plot regression line

[Figure: scatter plot of Y against X with the fitted regression line]
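
A sketch of pulling results out of the fitted model with standard accessor functions. The data are re-simulated with an arbitrary seed, so the numbers differ from the slides.

```r
set.seed(1)                       # arbitrary seed
x <- rnorm(1000)
y <- x + rnorm(1000, sd = 0.5)
df <- data.frame(X = x, Y = y)
fit <- lm(Y ~ X, df)

coef(fit)                         # intercept near 0, slope near 1 for data simulated this way
summary(fit)$r.squared            # proportion of variance explained
head(residuals(fit), 3)           # first three residuals
predict(fit, data.frame(X = c(-1, 0, 1)))   # fitted values at new X
```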

Part II: Generics & methods

anova(fit)  
Analysis of Variance Table

Response: Y
           Df Sum Sq Mean Sq F value Pr(>F)
X           1   1042    1042    4417 <2e-16 ***
Residuals 998    235       0
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Part II: Generics & methods

  • fit: object of class lm
  • anova(): generic, with a method for class lm
methods(anova)
[1] anova.glm*     anova.glmlist*
[3] anova.lm*      anova.lmlist* 
[5] anova.loess*   anova.mlm*    
[7] anova.nls*    

   Non-visible functions are asterisked
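
To make the generic / method idea concrete, here is a minimal sketch of the same (S3) mechanism with a made-up generic, area(), and two made-up classes. None of these names exist in R itself.

```r
## the generic dispatches on the class of its first argument
area <- function(shape, ...) UseMethod("area")

## methods are named generic.class
area.circle <- function(shape, ...) pi * shape$r^2
area.square <- function(shape, ...) shape$side^2
area.default <- function(shape, ...)
    stop("no area() method for class ", class(shape)[1])

circ <- structure(list(r = 1), class = "circle")
sq <- structure(list(side = 2), class = "square")
area(circ)       # 3.141593
area(sq)         # 4
methods(area)    # lists the methods defined above
```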

Part II: Help!

## class of object
class(fit)

## method discovery
methods(class=class(fit))
methods(anova)

## help on generic, and specific method
?anova
?anova.lm
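
Non-visible methods can still be inspected. A sketch, using the cars data set that ships with R so the example stands alone:

```r
fit <- lm(dist ~ speed, cars)   # 'cars' is in the datasets package
class(fit)                      # "lm"
inherits(fit, "lm")             # TRUE

## retrieve a method that is not exported from its package
getS3method("anova", "lm")      # the function itself
getAnywhere("anova.lm")         # the function, plus where it was found
```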

Part III: Packages

Installed

  • Base & recommended
  • Additional packages
length(rownames(installed.packages()))
[1] 227
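
A sketch of separating base and recommended packages from additional ones, using the Priority column of the installed.packages() matrix; ip is a made-up name. Counts depend on the installation.

```r
ip <- installed.packages()
dim(ip)                                    # one row per installed package
table(ip[, "Priority"], useNA = "ifany")   # base, recommended, NA (additional)
```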

Available

Part III: Packages

'Attached' (installed and available for use):

search()            # attached packages
ls("package:stats") # functions in 'stats'

Attaching (make installed package available for use)

library(ggplot2)
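
A sketch of two related idioms: using a function from an installed package without attaching it, and checking that a package is installed before calling library().

```r
"package:stats" %in% search()    # TRUE: stats is attached by default
stats::sd(c(1, 2, 3))            # pkg::name works without attaching; result is 1

## attach ggplot2 only if it is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
    library(ggplot2)
} else {
    message("ggplot2 is not installed; see install.packages()")
}
```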

Installing CRAN or Bioconductor packages

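source("http://bioconductor.org/biocLite.R")  # installer current in 2014, since retired
biocLite("GenomicRanges")
## current Bioconductor releases: BiocManager::install("GenomicRanges")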

Part III: Help!

Packages

Part III: Help!

Best bet

  • Other R users you know!

R

Bioconductor

Acknowledgements

Funding

  • US NIH / NHGRI 2U41HG004059; NSF 1247813

People

  • Seattle Bioconductor team: Sonali Arora, Marc Carlson, Nate Hayden, Valerie Obenchain, Hervé Pagès, Dan Tenenbaum
  • Vincent Carey, Robert Gentleman, Rafael Irizarry, Sean Davis, Kasper Hansen, Michael Lawrence, Levi Waldron