---
title: "Lab 02 - Analysis in RMark"
author: "WILD 502 - Jay Rotella"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
### Introduction
In lab this week, we have known fate data for 1 interval of time and the input file also contains 4 individual covariates (Area, Sex, Mass, & Length). Here, we'll use RMark and work with the same models that you evaluated directly in MARK for lab 2.
### Import and process the data
```{r}
library(RMark)
library(ggplot2)
# bring in the data file
fawns <- convert.inp("http://www.montana.edu/rotella/documents/502/lab02-fawns.inp",
covariates = c("area", "sex", "mass", "length"))
# add "length.sq" variable for use in quadratic models
fawns$length.sq <- (fawns$length)^2
head(fawns)
# convert area & sex variables to factors so that we can treat them as
# grouping variables in the analysis to ease working with output later
fawns$area <- factor(fawns$area,
levels = c(0, 1),
labels = c("control", "treatment"))
fawns$sex <- factor(fawns$sex,
levels = c(0, 1),
labels = c("female", "male"))
# Store versions of fate for use in final plotting
# first a factor version that has nice labels
fawns$fate <- factor(as.numeric(fawns$ch),
levels = c(11, 10),
labels = c("died", "lived"))
# second, a numeric version that's 0 = dead, 1 = lived
fawns$Fate <- 2 - as.numeric(fawns$fate)
summary(fawns)
# we know from other work on this dataset that there is 1 record with
# a very high value for mass, which we want to drop
fawns2 = fawns[fawns$mass < 70, ]
```
### Process the data for use in RMark
```{r}
# Process data
fawns.processed = process.data(fawns2,
model = "Known",
groups = c("area", "sex"))
#
# Create default design data
fawns.ddl = make.design.data(fawns.processed)
```
### Build function for creating models
Here, we write a function that establishes which models are to be run in MARK.
```{r}
# setup a function
run.fawns <- function() {
# Define range of models for S
S.dot = list(formula = ~ 1)
S.area = list(formula = ~ area)
S.mass = list(formula = ~ mass)
S.length = list(formula = ~ length)
S.length.sq = list(formula = ~ length + length.sq)
S.sex = list(formula = ~ sex)
S.area.length = list(formula = ~ area + length)
S.length.mass = list(formula = ~ length + mass)
# sex-specific intercepts & common slope
S.sex.length = list(formula = ~ sex + length)
# 1 intercept & sex-specific slopes
S.length.by.sex = list(formula = ~ sex:length)
# sex-specific intercepts & slopes
S.sex.x.length = list(formula = ~ sex * length)
# Create model list
model.list = create.model.list("Known")
# NOTE: to avoid having all the output for each model appear when you
# call the function, add ', output=FALSE' after 'ddl=fawns.ddl' below.
# Here, I don't do that so you can see the output for each model,
# but this might not be desired if you have many models.
fawns.results = mark.wrapper(model.list, data = fawns.processed,
ddl = fawns.ddl, invisible = TRUE, threads = 2)
# Return model table and list of models
return(fawns.results)
}
```
### Run the models
It's worth noting that here, the output from each of the models appears in the console where it can be reviewed. One can also examine model-specific output in other ways as shown below. And, if you don't want to see the output from every model appear in the output here, see the note above for the use of the "mark.wrapper" command where one has the option of suppressing the output.
```{r}
fawns.results = run.fawns()
```
### Look at the model-selection table
```{r}
fawns.results
```
### Work with the top model
You would not need to re-run the top model since we've already run it. But, by doing it this way, you can see how you can to run a model of interest without using a function as was done earlier.
```{r}
# here's how to run a model outside the function
top <- mark(fawns.processed,ddl = fawns.ddl,
model = "Known",
model.parameters = list("S" = list(formula = ~ sex * length)))
# look at beta-hats
top$results$beta
# create values of length to use for predictions
min.length = min(fawns2$length)
max.length = max(fawns2$length)
length.values = seq(from = min.length, to = max.length, length = 100)
# determine which parameter indices go with males and females
fawns.ddl
# make predictions for rows of 'fawns.ddl' associated with
# females (par.index=1) and males (par.index=3)
pred.top <- covariate.predictions(top,
data = data.frame(length = length.values,
length.sq = length.values ^ 2),
indices = c(1, 3))
# store values of sex in pred.top
female.rows <- which(pred.top$estimates$par.index == 1)
male.rows <- which(pred.top$estimates$par.index == 3)
pred.top$estimates$sex <- NA
pred.top$estimates$sex[female.rows] <- "female"
pred.top$estimates$sex[male.rows] <- "male"
head(pred.top$estimates)
```
### Working with output: Plotting
Here, we plot estimates of "S" for male and female fawns of different lengths. We also want to plot CI's for the values. To do so, we use the predicted values we created in the last step. Here, I use the ggplot2 package, which makes nice graphs.
```{r fig.width=7, fig.height=6, warning=FALSE}
pred <- pred.top$estimates
# build and store the plot in object 'p'
p <- ggplot(pred, aes(x = length, y = estimate, group = sex)) +
geom_line(size = 1.5, aes(colour = sex)) +
geom_ribbon(aes(ymin = lcl, ymax = ucl), alpha = 0.2) +
scale_colour_brewer(palette = "Set1") +
theme(legend.position = c(0.15, 0.8),
legend.justification = c(1, 0)) +
xlab("Body Length (cm)") +
ylab("Estimated Survival Rate") +
geom_jitter(data = fawns2, aes(y = Fate, color = sex),
height = 0.025)
# print the plot
p
# if you're running this code, it's wise to cleanup files
# you can use the next 2 lines without comments to do that
#rm(list=ls(all=TRUE))
#cleanup(ask = FALSE)
```