---
title: "Lab 08 - Analyses in RMark"
author: "WILD 502 - Jay Rotella"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

This markdown document runs the POPAN models for data on Coho salmon that you analyzed directly in MARK in lab 8. Here, we're dealing with 4 types of parameters: 'Phi', 'p', 'pent', and 'N'.

### Bring in the Data

```{r, warning=FALSE}
library(RMark)

coho <- convert.inp("http://www.montana.edu/rotella/documents/502/chaseAD.inp",
                 group.df = NULL)
head(coho)
```

### Process the data

```{r}
# Process data
# NOTE: for this data set, we need to provide the time
# intervals because they are not all the same length
coho.pr <- process.data(coho, begin.time = 1, model = "POPAN",
                        time.intervals = c(1.5, 1, 1, 1, 1, 1, 1.5))
#  
# Create default design data
coho.ddl = make.design.data(coho.pr)

# examine the design data 
coho.ddl
```

### Build function for creating models

Here, we set up a function that contains the structures for 'Phi', 'p', 'pent' and 'N'. It will run all combinations of the structures for each parameter type when executed.


```{r}
run.coho <- function() {
  #
  # Define range of models for each parameter type. Here,
  # we only have a few models as the labs focused more on how
  # how work with the output than on setting up a wide variety
  # of competing models
  
  # Define model for Phi
  Phi.time = list(formula =  ~ time)
  
  #  Define range of models for p
  p.dot = list(formula =  ~ 1)
  p.time = list(formula =  ~ time)
  
  #  Define model for pent (probability of entry)
  pent.time = list(formula =  ~ time)
  
  # Define model for N
  N.dot = list(formula =  ~ 1)
  
  # Run assortment of models
  Phi.time.p.dot.pent.time.N = mark(
    coho.pr,
    coho.ddl,
    model.parameters = list(
      Phi = Phi.time,
      p = p.dot,
      pent = pent.time,
      N = N.dot))
  
  # Note this initial run will penalize for too many parameters because
  #  we are not adjusting the parameter count for confounded parameters
  Phi.time.p.time.pent.time.N = mark(
    coho.pr,
    coho.ddl,
    model.parameters = list(
      Phi = Phi.time,
      p = p.time,
      pent = pent.time,
      N = N.dot))
    
    # Here, we adjust the parameter count to the correct value
    Phi.time.p.time.pent.time.N =
    adjust.parameter.count(Phi.time.p.time.pent.time.N, 21)
  
  # Return model table and list of models
  return(collect.models())
}
```

### Run the Models and Examine the Output

It's very simple to then run the function and each of the models. As implemented here, the output from each of the models appears in the console where it can be reviewed when the function written above is called. If you don't want the output from every model to appear automatically, see the note above the "mark.wrapper" command for how to set the option of suppressing the output. One can also examine model-specific output in other ways as shown below.

```{r}
coho.results <- run.coho()
```


### Examine Model-Selection Table

Once the models are run, we can examine the model-selection table. We can also examine model-specific output.

```{r}
options(width = 150)
coho.results
```

### Look at output from top model

```{r}
# View the model names
names(coho.results)

# examine the output from top-ranked model (#1) and
# store top model in object with short name 
top <- coho.results$Phi.time.p.dot.pent.time.N

# Check which link functions were used for the parameters
# Notice that RMark used parameter-specific link functions that 
# make sense here, i.e., the logit link for Phi & p, the mlogit for pent,
# and a log link for N.
top$links

# look at summary of top model's output
summary(top, showall = FALSE)

# store and examine estimates of Phi, p, pent, and N

# First examine estimates of the real parameters
# to see how results are stored. We drop 'fixed' & 'note' columns, so
# we can use the 'round' command
round(top$results$real[ , 1:4], 5)

# for Phi in top model there are 7 estimates
top.Phi <- top$results$real[1:7, ]
top.Phi

# for p in top model there is 1 estimate
# and it's in the 8th row of output
top.p <- top$results$real[8, ]
top.p

# Store and examine the estimates of pent
top.pent <- top$results$real[10:16, ]
round(top.pent[ ,1:4], 5)

# Store estimate for N
top.N <- top$results$real[9, ]
top.N
```

### Calculate the estimated number of births

To calculate the estimates of B[i], you need to multiply N by b[i]. Remember, to get b[0], you subtract the sum of all the b[i] for i = 1, 2, ..., 7 from 1.

```{r}
# calculate the proportion of the population present at the start. 
# To do this we need the b[i] for each occasion
b <- top.pent$estimate
b0 <- 1 - sum(top.pent$estimate)
#
# Examine the b[i] including b[0]
round(c(b0, b), 4)

# Next, calculate the net births by occasion
B.occ <- b*top.N$estimate 
round(B.occ, 4)

# B[0] is simply the proportion of the superpopulation that was 
# present at the start of the study (b[0]*N-hat)
(B0 <- b0*top.N$estimate)

# Sum up all the B[i] including B0
sum(B0, B.occ)
```

### Work out N[i] values

For this, we have to deal with the losses on capture and the fact that not all time intervals are the same between the various occasions. The details are provided on page 12-20 of the C&W chapter.

```{r}
# review values for B[i]
round(c(B0, B.occ), 4)

# calculate and store values of N[i] for each occasion iteratively 
# set up vector for containing N by occ
N.occ <- vector("numeric", 8)
# Store starting number
N.occ[1] <- B0

# Store time intervals for use in calculations
time.intervals <- c(1.5, 1, 1, 1, 1, 1, 1.5)
# Store the losses on capture (see Table 12.6 on page 12-15 of C&W)
losses <- c(0, 1, 11, 2, 8, 8, 6, 10)

# Calculate the N.occ values
for (i in 2:8) {
  N.occ[i] = (N.occ[i - 1] - losses[i - 1]) * 
    (top.Phi$estimate[i - 1]^time.intervals[i - 1]) +
    B.occ[i - 1]
}

# examine estimated numbers 
round(cbind(N.occ = N.occ, B.occ = c(B.occ, NA)), 4)
```


### Further work that can be done 

Chapter 12 goes through more modeling that can be done including additional models of the adults and combined modeling for adults and jacks. Also, you could work with the delta method to obtain estimates of the SE's for the quantities we derived above for B.occ and N.occ.