For your homework for this week, please complete the following by the beginning of class next Tuesday.
Use the code supplied in “CJS-simulator-simple.R” to simulate and store a set of encounter histories for 6 occasions using the parameters provided in the code. Use the code to answer the following questions.
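To make concrete what the simulator is doing, here is a minimal, illustrative CJS simulation sketch in R. This is not the course script: the parameter values, cohort sizes, and object names below are placeholder assumptions chosen only to show the mechanics (release, survival between occasions, detection when alive).

```r
# Minimal CJS simulation sketch (NOT "CJS-simulator-simple.R"; all values
# below are assumed for illustration only)
set.seed(1)
n.occ <- 6      # sampling occasions
n.rel <- 30     # new releases per occasion, occasions 1-5 (assumed)
phi   <- 0.8    # apparent survival, constant over time (assumed)
p     <- 0.5    # detection probability, constant over time (assumed)

first <- rep(1:(n.occ - 1), each = n.rel)    # release occasion of each animal
ch <- matrix(0, length(first), n.occ)        # encounter-history matrix
for (i in seq_along(first)) {
  ch[i, first[i]] <- 1                       # marked and released
  alive <- TRUE
  for (t in (first[i] + 1):n.occ) {
    alive <- alive && (runif(1) < phi)       # survive the interval?
    if (alive && runif(1) < p) ch[i, t] <- 1 # detected only if alive
  }
}
head(apply(ch, 1, paste, collapse = ""))     # histories such as "110010"
```

Each row of `ch` is one animal's encounter history; the real script additionally formats these histories into a MARK `.inp` file.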
How many new individuals were released on each occasion for occasions 1 through 5?
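In a CJS data set, "new individuals released on occasion t" are simply the animals whose first 1 appears in column t of the encounter-history matrix. A small sketch, using a made-up 0/1 matrix `ch` (one row per animal, one column per occasion) to illustrate the counting:

```r
# Sketch: count new releases per occasion from a capture-history matrix.
# The matrix below is invented for illustration, not the homework data.
ch <- rbind(c(1, 1, 0, 0, 1, 0),
            c(0, 1, 0, 1, 0, 0),
            c(0, 1, 1, 0, 0, 0),
            c(0, 0, 0, 1, 0, 1))
first <- apply(ch, 1, function(h) which(h == 1)[1])  # occasion of first capture
table(factor(first, levels = 1:5))                   # new releases, occasions 1-5
```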
What were the true underlying parameter values?
Based on the true underlying parameter values, what is the true underlying model structure?
Use Program MARK to run models \(\phi_t, p_t\); \(\phi_t, p_.\); \(\phi_., p_t\); and \(\phi_., p_.\), using either the PIMs directly or the PIM chart. Then use Program MARK to answer the following.
Provide an AIC table like the one shown in the Results Browser. We will cover model selection soon, and Chapter 4 of CW has much useful information on the topic. For now, just consider these quotes from section 4.3 of CW regarding model selection and AIC:
“In simplest terms, we might express our objective as trying to determine the best model from the set of approximating models we’ve fit to the data. How would we identify such a ‘best model’? An intuitive answer would be to select the model that ‘fits the data the best’ …”
“However, there is a problem with this approach – the more parameters you put into the model, the better the fit … . As such, if you use a simple measure of ‘fit’ as the criterion for selecting a ‘best’ model, you’ll invariably pick the one with the most parameters.”
“How can we find a good, defensible compromise between the two? One approach is to make use of something called the AIC.”
“The AIC is calculated for a particular model as \(AIC = -2\ln\mathcal{L}(\hat\theta \mid data) + 2K\), where \(\mathcal{L}\) is the model likelihood (\(\hat\theta\) represents the vector of the various parameter estimates given the data), and \(K\) is the number of parameters in the model.”
“… as the fit of the model goes up, the likelihood of the model (given the data) goes up (and thus \(-2\ln\mathcal{L}\) goes down). However, … the greater the number of parameters, the greater the parameter uncertainty … . Thus, as the fit of the model increases, … for a given number of parameters, the AIC declines. Conversely, for a given fit, if it is achieved with fewer parameters (lower K), then the calculated AIC is lower. The 2K term, then, is the penalty for the number of parameters. As K goes up, \(-2\ln\mathcal{L}\) goes down, but this is balanced by the penalty of adding the term 2K.”
“… one strictly utilitarian interpretation of the AIC is that the model with the lowest AIC is the ‘best’ model because it is most parsimonious given the data – best fit with fewest parameters.”
“when the difference in AIC between two models (\(\Delta AIC\)) is \(<2\), then we are reasonably safe in saying that both models have approximately equal weight in the data. If \(2 < \Delta AIC < 7\), then there is considerable support for a real difference between the models, and if \(\Delta AIC > 7\), then there is strong evidence to support the conclusion of differences between the models.”
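The AIC and \(\Delta AIC\) arithmetic from the quotes above can be sketched in a few lines of R. The \(-2\ln\mathcal{L}\) and \(K\) values below are made-up illustrations, not results from the homework data (and note that MARK's Results Browser typically reports the small-sample correction AICc rather than plain AIC):

```r
# Sketch: AIC = -2 ln L + 2K and delta-AIC by hand.
# All numbers are invented for illustration.
neg2lnL <- c(phit.pt = 350.2, phit.pdot = 351.0, phidot.pt = 352.5, phidot.pdot = 353.1)
K       <- c(phit.pt = 10,    phit.pdot = 6,     phidot.pt = 6,     phidot.pdot = 2)
AIC  <- neg2lnL + 2 * K        # fit term plus 2K penalty
dAIC <- AIC - min(AIC)         # delta-AIC relative to the best model
round(data.frame(K, AIC, dAIC)[order(dAIC), ], 2)
```

In this toy table the simplest model wins: its slightly worse fit (larger \(-2\ln\mathcal{L}\)) is more than offset by its much smaller 2K penalty, which is exactly the parsimony trade-off CW describes.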
Based on the AIC table in the results browser, which model structure(s) was(were) most supported by the data and how does that compare to the true underlying model structure?
How do the parameter estimates from the best-supported model compare to the parameter values in the true underlying model structure?
How do the parameter estimates from a model whose structure matches that of the true underlying model compare to the parameter values in the true underlying model structure?
Open “CJS-simulator-simple.R” and change the code as follows.
+ line 8: n.mark <- 180
+ line 66: write.table(input.file, file="cjs_data_hw02-bigN.inp",
Once those changes are made, restart Program MARK and analyze “cjs_data_hw02-bigN.inp” with the same 4 models. Once you’ve set up the problem and brought in the data on the opening screens, feel free to use the “Pre-defined model(s)” option under the Run menu to run all 4 models quickly. To do so, click “Run” at the top of the screen, choose the “Pre-defined model(s)” option, click “Select Models” on the next screen, choose both the “.” and the “t” structure for each parameter, and finally click “OK to Run”.
How do model-selection results differ from what you obtained from the earlier data set that had a more-modest sample size?
How does the precision of estimates from model \(\phi_., p_.\) compare between results for the 2 different input files?
Do the point estimates from model \(\phi_., p_.\) differ between results for the 2 different input files? If so, why is that, given that the underlying true parameters used in “CJS-simulator-simple.R” remained the same?
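The phenomenon behind this last question can be demonstrated without MARK at all: two data sets simulated from identical true parameters generally yield different point estimates, purely through sampling variation. A simple binomial detection sketch (assumed p = 0.5; this stands in for, and is much simpler than, the CJS likelihood):

```r
# Sketch: identical truth, different data, different estimates.
# p.true and the sample size are assumed for illustration only.
set.seed(1)
p.true <- 0.5
est1 <- rbinom(1, 180, p.true) / 180   # estimate from simulated data set 1
est2 <- rbinom(1, 180, p.true) / 180   # estimate from simulated data set 2
c(est1 = est1, est2 = est2)            # both near 0.5, but generally not equal
```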