Logistic regression on beetles

Get the beetle3.csv data set here

  1. Create a plot displaying the data; use stat_sum (with ggplot) or plotrix::sizeplot() so that the graph shows the number of data values at each point. It’s up to you whether to distinguish between series="I" and series="II" in the data.
  2. Use aggregate (base R) or group_by + summarise (dplyr) to compute the proportion killed for each unique dosage value/series combination. Optionally, add another column with the total number of individuals for each dosage value/series combination.
  3. Create a plot showing these aggregated values; add a smooth line showing the general trend. If you’re feeling ambitious, make the size of the points proportional to the total number of individuals.
  4. (Use the original, disaggregated data from here on.) Fit a logistic model to the data that includes the predictors series and log10(dose) and their interaction.
  5. Explain the meaning of the four parameters in words, as they relate to the expected survival, the effects of dose on survival, and the differences in these quantities between series.
  6. Test the null hypothesis that the two series have identical dose-response curves. Explain whether you are using a Wald test or a likelihood ratio test, and what that means. Is there evidence that the intercepts differ, that the slopes differ, or neither?
  7. Fit a model that uses only log10(dose), ignoring series.
  8. Compute and compare Wald, likelihood profile, and bootstrap confidence intervals for the dose effect.
  9. Compute and display quantile residual-based diagnostics: what do you conclude?
  10. Compute predicted survival probabilities and confidence intervals for the minimum, mean, and maximum values of log10(dose).
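  11. The LD50 (dose that is expected to kill 50% of individuals) is defined as the point where the log-odds of survival are equal to zero, i.e. \(x_{0.5} = -\beta_0/\beta_1\), where \(\beta_0\) and \(\beta_1\) are the intercept and slope of the fitted model. Because the predictor is log10(dose), \(x_{0.5}\) is on the log10 scale; the LD50 in the original dose units is \(10^{x_{0.5}}\). Compute the LD50 based on your fit.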
  12. Compute confidence intervals for the LD50 using (1) the delta method and (2) bootstrapping.
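
As a hint for the delta-method part of item 12, here is a minimal sketch of the calculation. It assumes the single-predictor fit from item 7, with intercept \(\beta_0\) and slope \(\beta_1\). The symbols \(g\), \(\hat V\), and \(\hat\sigma_{ij}\) are notation introduced here for illustration only: \(\hat V\) is the estimated covariance matrix of \((\hat\beta_0, \hat\beta_1)\) (in R, what vcov() returns for the fitted model), with entries \(\hat\sigma_{00}\), \(\hat\sigma_{01}\), \(\hat\sigma_{11}\).

\[
g(\beta_0, \beta_1) = -\frac{\beta_0}{\beta_1}, \qquad
\nabla g = \left( -\frac{1}{\beta_1},\ \frac{\beta_0}{\beta_1^2} \right)^{\top}
\]

\[
\widehat{\mathrm{Var}}(\hat x_{0.5}) \approx \nabla g^{\top} \, \hat V \, \nabla g
= \frac{\hat\sigma_{00}}{\hat\beta_1^{2}}
- \frac{2 \hat\beta_0 \hat\sigma_{01}}{\hat\beta_1^{3}}
+ \frac{\hat\beta_0^{2} \hat\sigma_{11}}{\hat\beta_1^{4}}
\]

\[
\hat x_{0.5} \pm 1.96 \sqrt{\widehat{\mathrm{Var}}(\hat x_{0.5})}
\]

The last line is an approximate 95% interval based on a normal approximation. It is on the log10(dose) scale; raising 10 to the power of each endpoint gives an interval in the original dose units. The approximation can be poor when \(\hat\beta_1\) is close to zero, which is one reason to compare it with the bootstrap interval.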