Licensed under the Creative Commons attribution-noncommercial license. Please share & remix noncommercially, mentioning its origin.

Basic criteria for data presentation

The keynote talk by John Rauser (2016), "How Humans See Data", is strongly recommended.

Visual perception of quantitative information: the Cleveland hierarchy (Cleveland and McGill 1984, 1987; Cleveland 1993)

[Figure: Cleveland hierarchy]

Techniques for multilevel data

ggplot2 makes it fairly easy to do a simple two-stage analysis on the fly, e.g. with the CBPP data discussed below:

geom_smooth(aes(colour=herd,weight=size),
            method="glm",
            method.args=list(family=binomial),
            se=FALSE)

(ignore glm.fit warnings if you try this)
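To make the "two-stage" idea concrete, here is a minimal sketch of the same analysis done by hand: stage one fits a separate binomial GLM to each herd, stage two summarizes the per-herd coefficients. The helper name `fit_herd`, the cutoff of at least three observations per herd, and the use of the median as the second-stage summary are all choices made for this illustration, not part of the original analysis.

```r
## Two-stage analysis by hand (illustrative sketch)
data("cbpp", package = "lme4")
cbpp2 <- transform(cbpp, period = as.numeric(as.character(period)))

## Stage 1: one binomial GLM per herd (fit_herd is a hypothetical helper)
fit_herd <- function(d) {
    glm(incidence/size ~ period, weights = size,
        family = binomial, data = d)
}
by_herd <- split(cbpp2, cbpp2$herd)
## assumption: drop herds with fewer than 3 observations,
## since a slope is not meaningfully estimated from 1 or 2 points
by_herd <- Filter(function(d) nrow(d) >= 3, by_herd)
## glm.fit warnings (e.g. fitted probabilities of 0 or 1) are suppressed here
fits <- suppressWarnings(lapply(by_herd, fit_herd))

## one row per herd: intercept and slope on the logit scale
coefs <- t(sapply(fits, coef))

## Stage 2: summarize across herds (the median is less sensitive than the
## mean to herds with extreme estimates; this choice is an assumption)
apply(coefs, 2, median, na.rm = TRUE)
```

The `geom_smooth()` call above performs stage one graphically: it draws one fitted logistic curve per herd and leaves the second stage to the eye.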

ggplot

ggplot intro

mappings + geoms

See Karthik Ram’s ggplot intro or my intro for disease ecologists, among many others.
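As a quick illustration of "mappings + geoms", the sketch below uses the built-in `mtcars` data (chosen only because it ships with R; it is not one of the data sets discussed in these notes). The object names `g_base`, `g_pts` and `g_both` are made up for this example.

```r
library("ggplot2")

## mapping: which variables drive which aesthetics (x, y, colour)
g_base <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl)))

## geoms: how the mapped data are drawn; layers are added with "+"
g_pts <- g_base + geom_point()
g_both <- g_pts + geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

## typing the object name (or print(g_both)) at the console draws the plot
```

Because the colour mapping is set in `ggplot()`, both layers inherit it, so `geom_smooth()` draws one line per cylinder class.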

Example/exercise

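library("ggplot2"); theme_set(theme_bw())
library("ggalt")             ## provides geom_encircle(), used below
source("../R/geom_cstar.R")  ## local helper script; assumed to define stat_centseg(), used below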

cbpp data set

Contagious bovine pleuropneumonia (CBPP): from Lesnoff et al. (2004), via the lme4 package. See ?lme4::cbpp for details.

data("cbpp",package="lme4")
## make period *numeric* so lines will be connected/grouping won't happen
cbpp2 <- transform(cbpp,period=as.numeric(as.character(period)))
g0 <- ggplot(cbpp2,aes(period,incidence/size))

spaghetti plot

g1 <- g0+geom_line(aes(colour=herd))+geom_point(aes(size=size,colour=herd))

Do we need the colours?

g2 <- g0+geom_line(aes(group=herd))+geom_point(aes(size=size,group=herd))

Facet instead:

g4 <- g1+facet_wrap(~herd)

Order the herds by average proportional incidence, using the %+% trick (which swaps a new data frame into an existing plot):

cbpp2R <- transform(cbpp2,herd=reorder(herd,incidence/size))
g4 %+% cbpp2R
## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
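The reordering works because `reorder()` sorts the levels of a factor by a summary of a second variable, using the mean by default. A small check of this, with `herd_means` as a name introduced only for this sketch:

```r
data("cbpp", package = "lme4")

## reorder() with its default FUN = mean: herd levels are sorted by
## increasing mean proportional incidence
cbpp2R <- transform(cbpp, herd = reorder(herd, incidence/size))

## per-herd means, listed in the new level order
herd_means <- with(cbpp2R, tapply(incidence/size, herd, mean))
round(herd_means, 3)

## the means are non-decreasing across levels
stopifnot(!is.unsorted(herd_means))
```

Since `facet_wrap(~herd)` lays out panels in level order, the facets then run from the lowest to the highest average incidence.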

gopher tortoise mycoplasma data

Gopher tortoise data (from Ozgul et al. 2009; see the ecostats chapter)

Plot the density of shells from freshly dead tortoises (shells/Area) as a function of mycoplasmal prevalence (%, prev); you may want to consider site, year of collection, or population density as well.

load("../data/gopherdat2.RData")
g5 <- ggplot(Gdat,aes(prev,shells/Area))+geom_point()
g5+geom_encircle(aes(group=Site))
g5+stat_centseg(aes(group=Site),cfun=mean)
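`stat_centseg()` comes from the local helper file, whose source is not shown here. Assuming it draws a segment from each point to the centre of its group, a rough equivalent using only ggplot2 might look like the sketch below. The data frame `toy` is made-up data standing in for `Gdat`, and the names `cent`, `toy2` and `g_star` are hypothetical.

```r
library("ggplot2")

## made-up stand-in for the tortoise data (NOT real values)
set.seed(101)
toy <- data.frame(Site = rep(c("A", "B", "C"), each = 5),
                  prev = runif(15, min = 0, max = 80),
                  dens = runif(15, min = 0, max = 2))

## group centres: mean of x and y within each site
cent <- aggregate(cbind(prev, dens) ~ Site, data = toy, FUN = mean)
names(cent) <- c("Site", "prev_mean", "dens_mean")

## attach each point's group centre, then draw point-to-centre segments
toy2 <- merge(toy, cent, by = "Site")
g_star <- ggplot(toy2, aes(prev, dens, colour = Site)) +
    geom_point() +
    geom_segment(aes(xend = prev_mean, yend = dens_mean))
```

Swapping `FUN = mean` for `FUN = median` gives a more outlier-resistant centre, analogous to changing the `cfun` argument in the call above.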

clipping data

Data from

load("../data/Banta.RData")
dat.tf$ltf1 <- log(dat.tf$total.fruits+1)
ggplot(dat.tf,aes(nutrient,ltf1))+geom_point()+
    facet_wrap(~amd)
## mean of log(1 + total fruits) for each popu/gen/amd/nutrient combination
aggdat <- aggregate(ltf1~popu+gen+amd+nutrient,FUN=mean,data=dat.tf)

sessionInfo()
## R Under development (unstable) (2018-07-26 r75007)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.5 LTS
## 
## Matrix products: default
## BLAS: /usr/local/lib/R/lib/libRblas.so
## LAPACK: /usr/local/lib/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_CA.UTF8       LC_NUMERIC=C             
##  [3] LC_TIME=en_CA.UTF8        LC_COLLATE=en_CA.UTF8    
##  [5] LC_MONETARY=en_CA.UTF8    LC_MESSAGES=en_CA.UTF8   
##  [7] LC_PAPER=en_CA.UTF8       LC_NAME=C                
##  [9] LC_ADDRESS=C              LC_TELEPHONE=C           
## [11] LC_MEASUREMENT=en_CA.UTF8 LC_IDENTIFICATION=C      
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ggalt_0.4.0   ggplot2_3.0.0 knitr_1.20   
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.18       RColorBrewer_1.1-2 pillar_1.3.0      
##  [4] compiler_3.6.0     plyr_1.8.4         bindr_0.1.1       
##  [7] tools_3.6.0        extrafont_0.17     digest_0.6.15     
## [10] evaluate_0.11      tibble_1.4.2       gtable_0.2.0      
## [13] pkgconfig_2.0.1    rlang_0.2.1        yaml_2.2.0        
## [16] bindrcpp_0.2.2     Rttf2pt1_1.3.7     withr_2.1.2       
## [19] dplyr_0.7.6        stringr_1.3.1      maps_3.3.0        
## [22] rprojroot_1.3-2    grid_3.6.0         tidyselect_0.2.4  
## [25] glue_1.3.0         R6_2.2.2           rmarkdown_1.10    
## [28] extrafontdb_1.0    purrr_0.2.5        magrittr_1.5      
## [31] backports_1.1.2    scales_0.5.0.9000  htmltools_0.3.6   
## [34] MASS_7.3-50        proj4_1.0-8        assertthat_0.2.0  
## [37] colorspace_1.3-2   labeling_0.3       KernSmooth_2.23-15
## [40] ash_1.0-15         stringi_1.2.4      lazyeval_0.2.1    
## [43] munsell_0.5.0      crayon_1.3.4

References

Cleveland, William. 1993. Visualizing Data. Summit, NJ: Hobart Press.

Cleveland, William S., and Robert McGill. 1984. “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods.” Journal of the American Statistical Association 79 (387): 531–54. doi:10.2307/2288400.

———. 1987. “Graphical Perception: The Visual Decoding of Quantitative Information on Graphical Displays of Data.” Journal of the Royal Statistical Society. Series A (General) 150 (3): 192–229. doi:10.2307/2981473.

Rauser, John. 2016. “How Humans See Data.” https://www.youtube.com/watch?v=fSgEeI2Xpdc.

Lesnoff, Matthieu, Géraud Laval, Pascal Bonnet, Sintayehu Abdicho, Asseguid Workalemahu, Daniel Kifle, Armelle Peyraud, Renaud Lancelot, and François Thiaucourt. 2004. “Within-Herd Spread of Contagious Bovine Pleuropneumonia in Ethiopian Highlands.” Preventive Veterinary Medicine 64 (1): 27–40. doi:10.1016/j.prevetmed.2004.03.005.

Ozgul, Arpat, Madan K Oli, Benjamin M Bolker, and Carolina Perez-Heydrich. 2009. “Upper Respiratory Tract Disease, Force of Infection, and Effects on Survival of Gopher Tortoises.” Ecological Applications 19 (3): 786–98. http://www.ncbi.nlm.nih.gov/pubmed/19425439.