Licensed under the Creative Commons attribution-noncommercial license. Please share & remix noncommercially, mentioning its origin.
Exploration
Diagnostics (plot.lm)
Presentation
geom_smooth
Much of what I have to say here is also said very nicely by John Rauser (2016).
Visual perception of quantitative information: Cleveland hierarchy (Cleveland and McGill 1984, 1987; Cleveland 1993)
Data presentation scales with data size
Colour palettes (RColorBrewer/ColorBrewer, IWantHue): respect dichromats and B&W printouts
library(ggplot2)
data("cbpp",package="lme4")
## make period *numeric* so lines will be connected/grouping won't happen
cbpp2 <- transform(cbpp,period=as.numeric(as.character(period)))
g0 <- ggplot(cbpp2,aes(period,incidence/size))
## spaghetti plot
g1 <- g0+geom_line(aes(colour=herd))+geom_point(aes(size=size,colour=herd))
g2 <- ggplot(cbpp2,aes(period,incidence/size,colour=herd))
(g3 <- g2 + geom_line()+geom_point(aes(size=size)))
## facet instead
(g4 <- g1+facet_wrap(~herd))
## order by average prop. incidence
g1 %+% transform(cbpp2,herd=reorder(herd,incidence/size))
g4 %+% transform(cbpp2,herd=reorder(herd,incidence/size))
## also consider colouring by incidence/order ...
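Following up on the last comment, here is a minimal sketch of colouring herds by their incidence order. It swaps in ggplot2's viridis scale (not mentioned above) because ColorBrewer's qualitative palettes top out at 12 classes and cbpp has 15 herds; viridis is also designed to stay readable for dichromats and in B&W printouts. The names cbpp3 and g5 are made up for this example.

```r
library(ggplot2)
data("cbpp", package="lme4")
cbpp2 <- transform(cbpp, period=as.numeric(as.character(period)))
## order herds by average prop. incidence (reorder() uses the mean by default)
cbpp3 <- transform(cbpp2, herd=reorder(herd, incidence/size))
## colour now follows the incidence ordering of the herds
g5 <- ggplot(cbpp3, aes(period, incidence/size, colour=herd)) +
    geom_line() +
    geom_point(aes(size=size)) +
    ## assumption: ggplot2 >= 3.0.0, which provides scale_colour_viridis_d()
    scale_colour_viridis_d()
print(g5)
```

Because the palette is sequential, low-incidence and high-incidence herds end up at opposite ends of the colour range, so colour carries information instead of just acting as a label.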
ggplot makes it fairly easy to do a simple two-stage analysis on the fly:
g0+geom_point(aes(size=size,colour=herd))+
    geom_smooth(aes(colour=herd,weight=size),
                method="glm",
                method.args=list(family=binomial),
                se=FALSE)
(ignore glm.fit warnings if you try this)
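The smooths above fit a separate binomial GLM to each herd. If you want the numbers rather than just the picture, the same two-stage idea can be sketched by hand. This is only a rough illustration: the helper fit_one, the cutoff of three periods per herd, and the use of t.test in the second stage are assumptions made for this sketch, not part of the analysis above.

```r
data("cbpp", package="lme4")
cbpp2 <- transform(cbpp, period=as.numeric(as.character(period)))

## stage 1: one binomial GLM per herd (logit incidence vs. period)
## assumption: herds observed in fewer than 3 periods are skipped
by_herd <- split(cbpp2, cbpp2$herd)
by_herd <- by_herd[sapply(by_herd, nrow) >= 3]
fit_one <- function(d) {
    ## cbind(successes, failures) gives the same fit as incidence/size
    ## with weights=size, but without the non-integer warnings
    coef(glm(cbind(incidence, size - incidence) ~ period,
             family=binomial, data=d))
}
herd_coefs <- t(sapply(by_herd, fit_one))

## stage 2: treat the per-herd slopes as data and summarize them
slopes <- herd_coefs[, "period"]
summary(slopes)
t.test(slopes)  ## crude check of whether the average slope differs from zero
```

A mixed model would pool information across herds more sensibly; the point here is only that the per-herd fits behind the plot are easy to reproduce.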
Possible solutions: pch=".", stat_sum; beeswarm plots
ggplot2
lattice, ggplot, http://d3js.org/, ggvis
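As a quick illustration of stat_sum from the list of possible solutions, here is a sketch that assumes the problem being solved is overplotting (many observations landing on exactly the same location). The data frame dd is simulated purely for illustration.

```r
library(ggplot2)
set.seed(101)
## hypothetical discrete data with many exact ties
dd <- data.frame(x=rpois(500, lambda=3), y=rpois(500, lambda=3))
## naive scatterplot: coincident points hide how many observations there are
p_naive <- ggplot(dd, aes(x, y)) + geom_point()
## stat_sum counts the observations at each (x,y) location
## and maps the count to point size
p_sum <- ggplot(dd, aes(x, y)) + stat_sum(alpha=0.5)
print(p_sum)
```

For continuous data with few exact ties, stat_sum does little; that is where tiny plotting symbols (pch=".") or beeswarm plots are more likely to help.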
The gridExtra, ggExtra, cowplot, and directlabels packages may be handy
mappings + geoms
Data: specified explicitly as part of a ggplot call:
library(mlmRev)
head(Oxboys)
##   Subject     age height Occasion
## 1       1 -1.0000  140.5        1
## 2       1 -0.7479  143.4        2
## 3       1 -0.4630  144.8        3
## 4       1 -0.1643  147.1        4
## 5       1 -0.0027  147.7        5
## 6       1  0.2466  150.2        6
library(ggplot2)
ggplot(Oxboys)
But that isn’t quite enough: we need to specify a mapping between variables (columns in the data set) and aesthetics (elements of the graphical display: x-location, y-location, colour, size, shape …)
ggplot(Oxboys,aes(x=age,y=height))
But (as you can see) that’s still not quite enough. We need to specify some geometric objects (called geoms) such as points, lines, etc., that will embody these aesthetics. The weirdest thing about ggplot syntax is that these geoms get added to the existing ggplot object that specifies the data and aesthetics; unless you explicitly specify other aesthetics, they are inherited from the initial ggplot call.
ggplot(Oxboys,aes(x=age,y=height))+geom_point()
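To make the inheritance rule concrete, here is a small sketch with three layers on the same Oxboys plot. Each layer inherits x and y from the initial ggplot call; only the line layer adds an aesthetic of its own. The object name p and the choice of layers are just for illustration.

```r
library(ggplot2)
data("Oxboys", package="mlmRev")
p <- ggplot(Oxboys, aes(x=age, y=height)) +
    ## inherits x and y; adds a grouping aesthetic for this layer only,
    ## so one grey line is drawn per boy
    geom_line(aes(group=Subject), colour="grey") +
    ## inherits x and y only (no grouping), so a single
    ## regression line is fitted across all boys
    geom_smooth(method="lm", formula=y~x, se=FALSE) +
    ## inherits x and y unchanged
    geom_point()
print(p)
```

Note that colour="grey" sits outside aes(): it is a fixed setting for that layer, not a mapping from a variable, so it does not generate a legend.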
geom_smooth, stat_sum
See Karthik Ram’s ggplot intro or my intro for disease ecologists, among many others.
sessionInfo()
## R Under development (unstable) (2018-04-16 r74611)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.4 LTS
##
## Matrix products: default
## BLAS: /usr/local/lib/R/lib/libRblas.so
## LAPACK: /usr/local/lib/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_CA.UTF8 LC_NUMERIC=C
## [3] LC_TIME=en_CA.UTF8 LC_COLLATE=en_CA.UTF8
## [5] LC_MONETARY=en_CA.UTF8 LC_MESSAGES=en_CA.UTF8
## [7] LC_PAPER=en_CA.UTF8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_CA.UTF8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] mlmRev_1.0-6 lme4_1.1-17 Matrix_1.2-14
## [4] ggplot2_2.2.1.9000 knitr_1.20
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.16 magrittr_1.5 MASS_7.3-49
## [4] splines_3.6.0 munsell_0.4.3 lattice_0.20-35
## [7] colorspace_1.3-2 rlang_0.2.0.9001 minqa_1.2.4
## [10] stringr_1.3.0 plyr_1.8.4 tools_3.6.0
## [13] grid_3.6.0 nlme_3.1-137 gtable_0.2.0
## [16] withr_2.1.2 htmltools_0.3.6 yaml_2.1.18
## [19] lazyeval_0.2.1 rprojroot_1.3-2 digest_0.6.15
## [22] tibble_1.4.2 nloptr_1.0.4 evaluate_0.10.1
## [25] rmarkdown_1.9 labeling_0.3 stringi_1.1.7
## [28] compiler_3.6.0 pillar_1.2.1 scales_0.5.0.9000
## [31] backports_1.1.2
Cleveland, William. 1993. Visualizing Data. Summit, NJ: Hobart Press.
Cleveland, William S., and Robert McGill. 1984. “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods.” Journal of the American Statistical Association 79 (387): 531–54. doi:10.2307/2288400.
———. 1987. “Graphical Perception: The Visual Decoding of Quantitative Information on Graphical Displays of Data.” Journal of the Royal Statistical Society. Series A (General) 150 (3): 192–229. doi:10.2307/2981473.
Gelman, Andrew, and Antony Unwin. 2013. “Infovis and Statistical Graphics: Different Goals, Different Looks.” Journal of Computational and Graphical Statistics 22 (1): 2–28. doi:10.1080/10618600.2012.761137.
Gelman, Andrew, Cristian Pasarica, and Rahul Dodhia. 2002. “Let’s Practice What We Preach: Turning Tables into Graphs.” The American Statistician 56 (2): 121–30. http://www.tandfonline.com/doi/abs/10.1198/000313002317572790.
Rauser, John. 2016. “How Humans See Data.” https://www.youtube.com/watch?v=fSgEeI2Xpdc.
Tufte, Edward. 2001. The Visual Display of Quantitative Information. 2d ed. Graphics Press.
Tufte, Edward R. 1995. Envisioning Information. Cheshire, Conn.: Graphics Press.
———. 1997. Visual Explanations: Images and Quantities, Evidence and Narrative. Cheshire, Conn.: Graphics Press.
———. 2006. Beautiful Evidence. Cheshire, Conn.: Graphics Press.
Wickham, Hadley. 2009. ggplot2: Elegant Graphics for Data Analysis. 2nd Printing. Springer.
Wilkinson, L. 1999. The Grammar of Graphics. New York: Springer.