Licensed under the Creative Commons attribution-noncommercial license. Please share & remix noncommercially, mentioning its origin.
If you’re at all interested in this topic, the talk by John Rauser (2016) is strongly recommended.
Visual perception of quantitative information: the Cleveland hierarchy (Cleveland and McGill 1984, 1987; Cleveland 1993)
plotly::ggplotly
stat_ellipse, ggalt::geom_encircle, stat_centseg (from ../../R/geom_cstar.R)
ggplot2 makes it fairly easy to do a simple two-stage analysis on the fly using geom_smooth, e.g. with the CBPP data discussed below.
The ggalt, gridExtra, ggExtra, cowplot, and directlabels packages are useful.
coord_flip(), ggstance package
forcats::fct_reorder(), forcats::fct_infreq()
RColorBrewer/ColorBrewer, IWantHue: respect dichromats and B&W printouts
ggrepel, directlabels packages
ggplot(my_data,aes(x=age,y=rootgrowth,colour=phosphate))
geom_point, geom_line
load("../../data/gopherdat2.RData")
library("ggplot2"); theme_set(theme_bw())
(ggplot(Gdat,aes(x=year,y=shells/Area,colour=Site))
+ geom_point()
)
geom_boxplot, geom_smooth
facet_wrap (free-form wrapping of subplots), facet_grid (two-D grid of subplots)
See Karthik Ram’s ggplot intro or my intro for disease ecologists, among many others.
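As a rough illustration of the difference between the two faceting functions, the sketch below uses the mpg data set that ships with ggplot2 (not one of the data sets used in these notes). The choice of variables is purely illustrative.

```r
library(ggplot2)

## base plot: highway mileage vs. engine displacement
p <- ggplot(mpg, aes(displ, hwy)) + geom_point()

## facet_wrap: one panel per level of `class`, wrapped into rows as needed
p_wrap <- p + facet_wrap(~class)

## facet_grid: rows indexed by `drv`, columns by `cyl`;
## combinations with no data appear as empty panels
p_grid <- p + facet_grid(drv ~ cyl)

print(p_wrap)
print(p_grid)
```

facet_wrap is usually the better fit for a single grouping variable with many levels; facet_grid makes sense when two variables are genuinely crossed.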
library("ggalt")
source("../../R/geom_cstar.R")
Contagious bovine pleuropneumonia (CBPP): from Lesnoff et al. (2004), via the lme4 package. See ?lme4::cbpp for details.
data("cbpp",package="lme4")
## make period *numeric* so lines will be connected/grouping won't happen
cbpp2 <- transform(cbpp,period=as.numeric(as.character(period)))
g0 <- ggplot(cbpp2,aes(period,incidence/size)) ## plot template (no geom)
g1 <- (g0
+geom_line(aes(colour=herd))
+geom_point(aes(size=size,colour=herd))
)
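The as.numeric(as.character(...)) idiom in the transform() call matters because calling as.numeric() directly on a factor returns the internal integer codes rather than the labels. For cbpp the two happen to coincide (the period levels are 1 to 4), but in general they do not. A small sketch with a made-up factor:

```r
## hypothetical factor whose labels are not 1, 2, 3, ...
f <- factor(c("10", "20", "40", "20"))

as.numeric(f)                 ## integer codes: 1 2 3 2
as.numeric(as.character(f))   ## the labels as numbers: 10 20 40 20
```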
Do we need the colours?
g2 <- (g0
+geom_line(aes(group=herd))
+geom_point(aes(size=size,group=herd))
)
Facet instead:
g4 <- g1+facet_wrap(~herd)
Order herds by average proportional incidence, using the %+% trick (which swaps new data into an existing plot):
cbpp2R <- transform(cbpp2,herd=reorder(herd,incidence/size))
g4 %+% cbpp2R
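The same reordering can be written with forcats::fct_reorder(). One difference to be aware of: reorder() summarises with the mean by default, whereas fct_reorder() defaults to the median, so the sketch below passes .fun = mean explicitly. It uses a small made-up data frame (toy, with hypothetical columns herd and prop) rather than the CBPP data.

```r
library(forcats)

## made-up data: three herds, two proportions each
toy <- data.frame(
  herd = factor(c("a", "a", "b", "b", "c", "c")),
  prop = c(0.30, 0.50, 0.05, 0.15, 0.20, 0.20)
)

## base R: order levels by mean of prop (reorder's default summary)
lev_base <- levels(reorder(toy$herd, toy$prop))

## forcats: same ordering, but the mean must be requested explicitly
lev_forcats <- levels(fct_reorder(toy$herd, toy$prop, .fun = mean))

lev_base      ## "b" "c" "a"
lev_forcats   ## "b" "c" "a"
```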
Two-stage analysis:
(g0
+ geom_point(aes(size=size,group=herd))
+ geom_smooth(aes(group=herd,weight=size),
method="glm",
method.args=list(family=binomial),
se=FALSE))
## `geom_smooth()` using formula = 'y ~ x'
(Ignore glm.fit warnings if you try this.)
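The geom_smooth() call above fits a separate binomial GLM within each herd, which is the first stage of a two-stage analysis; the second stage summarises the per-herd estimates. A minimal sketch of doing this by hand, assuming the lme4 package is installed (for the cbpp data) and using illustrative object names (fits, slopes) that are not from the original notes:

```r
data("cbpp", package = "lme4")
cbpp2 <- transform(cbpp, period = as.numeric(as.character(period)))

## stage 1: one binomial GLM per herd
## (proportion response, weighted by the number of animals at risk)
fits <- lapply(split(cbpp2, cbpp2$herd), function(d) {
  suppressWarnings(   ## small herds can trigger glm.fit warnings
    glm(incidence / size ~ period, family = binomial,
        weights = size, data = d)
  )
})

## per-herd slope on the logit scale (NA if a herd has too few periods)
slopes <- sapply(fits, function(m) coef(m)[["period"]])

## stage 2: summarise the slopes across herds
summary(slopes)
median(slopes, na.rm = TRUE)
```

Herds with very few observations or with all-zero incidence can produce extreme slope estimates, which is one reason the median may be a safer summary here than the mean.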
Gopher tortoise data (from Ozgul et al. (2009), see ecostats chapter)
Plot density of shells from freshly dead tortoises (shells/Area) as a function of mycoplasmal prevalence (%, prev): you may want to consider site, year of collection, or population density as well.
load("../../data/gopherdat2.RData")
g5 <- ggplot(Gdat,aes(prev,shells/Area))+geom_point()
g5+geom_encircle(aes(group=Site))
g5+geom_encircle(aes(group=Site),s_shape=1,expand=0) ## convex hulls
## connect points to center
g5+stat_centseg(aes(group=Site),cfun=mean)
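stat_centseg() comes from a local file (geom_cstar.R) that is not reproduced here. As a rough sketch of the same idea with stock ggplot2, one can compute each group's centroid and draw segments to it with geom_segment(). The data frame below is made up (hypothetical columns x, y, site); this illustrates the idea and is not the implementation in geom_cstar.R.

```r
library(ggplot2)

## made-up data: two sites, three points each
toy <- data.frame(
  site = rep(c("A", "B"), each = 3),
  x = c(1, 2, 3, 6, 7, 8),
  y = c(1, 3, 2, 5, 4, 6)
)

## per-site centroid, repeated on every row of that site
toy$cx <- ave(toy$x, toy$site, FUN = mean)
toy$cy <- ave(toy$y, toy$site, FUN = mean)

p_cent <- ggplot(toy, aes(x, y, colour = site)) +
  geom_segment(aes(xend = cx, yend = cy)) +   ## spokes from each point to its centroid
  geom_point()

print(p_cent)
```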
Data from Banta, Stevens, and Pigliucci (2010):
Easier if there is one data point per group (connect with lines), but
load("../../data/Banta.RData")
## dat.tf$ltf1 <- log(dat.tf$total.fruits+1)
g6 <- ggplot(dat.tf,aes(nutrient,total.fruits,colour=gen))+
geom_point()+
scale_y_continuous(trans="log1p")+
facet_wrap(~amd)+
## `fun` replaces `fun.y`, which was deprecated in ggplot2 3.3.0
stat_summary(fun=mean,aes(group=interaction(popu,gen)),
geom="line")
If stat_summary is used with fun.data=, it can also compute confidence intervals. Try "mean_cl_boot" or "mean_cl_normal" (see ?mean_cl_boot).
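A function passed to fun.data= receives the y values for one group and returns a data frame with columns y, ymin, and ymax. The built-in "mean_cl_boot" and "mean_cl_normal" helpers rely on the Hmisc package; the sketch below instead defines a hand-rolled t-based interval (mean_ci_t, a hypothetical name) and applies it to the built-in mtcars data, so it runs with ggplot2 alone.

```r
library(ggplot2)

## hypothetical helper: mean with a t-based confidence interval
mean_ci_t <- function(y, level = 0.95) {
  n <- length(y)
  m <- mean(y)
  half <- qt((1 + level) / 2, df = n - 1) * sd(y) / sqrt(n)
  data.frame(y = m, ymin = m - half, ymax = m + half)
}

p_ci <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  geom_point(alpha = 0.3) +
  ## one mean and interval per cylinder class
  stat_summary(fun.data = mean_ci_t, geom = "pointrange", colour = "red")

print(p_ci)
```

Swapping fun.data = mean_ci_t for fun.data = "mean_cl_boot" gives bootstrap intervals instead, provided Hmisc is installed.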
Dynamic graphics:
library(plotly)
ggplotly(g6)
Pick a data set from the list available on the web page (or use your own) and create two plots that indicate the grouping in different ways.
Banta, Joshua A., Martin H. H. Stevens, and Massimo Pigliucci. 2010. “A Comprehensive Test of the ’Limiting Resources’ Framework Applied to Plant Tolerance to Apical Meristem Damage.” Oikos 119 (2): 359–69. https://doi.org/10.1111/j.1600-0706.2009.17726.x.
Cleveland, William. 1993. Visualizing Data. Summit, NJ: Hobart Press.
Cleveland, William S., and Robert McGill. 1984. “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods.” Journal of the American Statistical Association 79 (387): 531–54. https://doi.org/10.2307/2288400.
———. 1987. “Graphical Perception: The Visual Decoding of Quantitative Information on Graphical Displays of Data.” Journal of the Royal Statistical Society. Series A (General) 150 (3): 192–229. https://doi.org/10.2307/2981473.
Rauser, John. 2016. “How Humans See Data.” https://www.youtube.com/watch?v=fSgEeI2Xpdc.
Lesnoff, Matthieu, Géraud Laval, Pascal Bonnet, Sintayehu Abdicho, Asseguid Workalemahu, Daniel Kifle, Armelle Peyraud, Renaud Lancelot, and François Thiaucourt. 2004. “Within-Herd Spread of Contagious Bovine Pleuropneumonia in Ethiopian Highlands.” Preventive Veterinary Medicine 64 (1): 27–40. https://doi.org/10.1016/j.prevetmed.2004.03.005.
Ozgul, Arpat, Madan K Oli, Benjamin M Bolker, and Carolina Perez-Heydrich. 2009. “Upper Respiratory Tract Disease, Force of Infection, and Effects on Survival of Gopher Tortoises.” Ecological Applications 19 (3): 786–98. http://www.ncbi.nlm.nih.gov/pubmed/19425439.
Wickham, Hadley. 2009. ggplot2: Elegant Graphics for Data Analysis. 2nd Printing. Springer.
Wilkinson, L. 1999. The Grammar of Graphics. New York: Springer.