---
title: "Multinomials and the CJS Model"
author: "WILD 502 - Jay Rotella"
output: pdf_document
---
To gain familiarity with the multinomial, it is helpful to work with drawing values from it. We use the function 'rnultinom' to generate several datasets. We start using the example from the textbook for passes, where the probabilities are for a completion, incompletion, and interception. The code generates 10 replications of releases (trials) of 100 individuals on occasion 1.
```{r}
x <- rmultinom(10, size = 100, prob = c(0.6, 0.38, (1 - 0.6 - 0.38)))
row.names(x) <- c("complete", "incomplete", "intercept")
x
```
## CJS Example Using Multinomial
We'll set constant parameter values for $\phi_i$ and $p_i$ and use them to (1) calculate the probability of encounter history that's possible in a 3-year study, (2) see 2 versions of probability statements for encounter histories that end in 0's, and (3) check that the probabilities sum to 1.0. Examine the probability statement for the 100 encounter history: an animal can have this history by living through 2, 1, or 0 of the intervals, i.e., this type of history doesn't provide a lot of information. If you have low values of $p$, you can end up with a lot of these, and your precision will suffer.
```{r}
phi <- 0.8
p <- 0.3
(pr111 <- phi * p * phi * p)
(pr110 <- phi * p * ((1 - phi) + phi * (1 - p)))
(pr110.alt <- phi * p * (1 - (phi * p)))
(pr101 <- phi * (1 - p) * phi * p)
(pr100 <- (phi * (1 - p) * phi * (1 - p) +
phi * (1 - p) * (1 - phi) +
(1 - phi)))
(pr100.alt <- 1 - (pr111 + pr110 + pr101))
round(c(pr111, pr110, pr101, pr100, sum(pr111, pr110, pr101, pr100)), 3)
```
## Generate CJS Capture Histories with Random Multinomials
You can quickly generate summarized capture history information for a CJS study using the `rmultinom` function.
```{r}
set.seed(502) # you can set the seed to make the results repeatable
(rm <- rmultinom(10, size = 250, prob = c(pr111, pr110, pr101, pr100)))
# Check proportions
round(rm/250, 3)
# compare average proportions against original data
round(apply(rm, 1, mean)/250,3)
round(c(pr111, pr110, pr101, pr100),3)
```
## Analyze Data from Each Simulation in MARK
```{r, message=FALSE, echo=FALSE}
library(RMark)
library(knitr)
out <- data.frame(phi = NA, se.phi = NA, p = NA, se.p = NA)
for (i in 1:ncol(rm)) {
dat <- data.frame(ch = c('111', '110', '101', '100'),
freq = rm[, i])
dat$ch <- as.character(dat$ch)
dat.pr <- process.data(dat, model = "CJS")
Phi.dot.p.dot <- mark(dat.pr, output=FALSE)$results$real
out[i, 1:2] <- Phi.dot.p.dot[1, 1:2]
out[i, 3:4] <- Phi.dot.p.dot[2, 1:2]
}
row.names(out) <- 1:nrow(out)
kable(out, digits = 3)
# Clean up RMark files
cleanup(ask = FALSE)
```
## Summarize Results of Many Simulated Datasets
The above example only used 10 simulated datasets, which is nice for displaying the steps but inadequate for properly assessing the expected outcomes, which would require many simulated datasets, e.g., thousands. Ignoring the fact that we've only done 10 simulations, let's check the average performance and the variability in performance of the CJS model for the data and estimates generated above. Notice that here because we have multiple estimates of *phi* and *p*, we can empirically estimate the SE of each by calculating the SD of all the estimates of *phi* and *p*.
```{r, message=FALSE, echo=FALSE}
library(dplyr)
library(tidyverse)
out2 <- out %>%
gather(parameter, estimate)
out_summary <- out2 %>%
group_by(parameter) %>%
summarise_all(funs(mean, sd, min, max))
kable(out_summary[c(2, 4, 1, 3), ], digits = 3, row.names = FALSE)
```
## Simulations in Program MARK
You can conduct simulations for various types of capture-recapture studies in Program MARK. [Appendix A](http://www.phidot.org/software/mark/docs/book/pdf/app_1.pdf) of the on-line Cooch and White book has really helpful material on the topic, which we can review later in the semester if you like. Conducting simulations can be very helpful in understanding how models operate, evaluating the impacts of violating assumptions, and designing studies. Having simulation code becomes especially helpful when you have many occasions where working out all the probability statements can be tedious at best, and/or when covariates are involved.