Follow Hadley Wickham’s style guide, adapted from Google’s style guide (HW’s link to Google is broken). Some of the differences between Google and HW are:
- HW uses snake_case for identifiers; Google likes dot-separated or camel-case names (variable.name (or variableName), FunctionName, kConstantName)

You can use whatever naming conventions you want, but be consistent. (This holds more generally, e.g. for the choice of tidyverse pipe %>% or native pipe |>, etc.)
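As a minimal sketch of what consistency looks like in practice, the snippet below uses snake_case for every identifier and a single pipe style throughout. All names here (body_mass_kg, calc_bmi, and so on) are hypothetical and chosen only for illustration; the native pipe assumes R 4.1 or later.

```r
# Hypothetical names, for illustration only.

# snake_case for variables, arguments, and functions alike
body_mass_kg <- 70
height_m <- 1.75
calc_bmi <- function(mass_kg, height_m) {
  return(mass_kg / height_m^2)
}
bmi <- calc_bmi(body_mass_kg, height_m)

# One pipe style throughout: here the native pipe (assumes R >= 4.1)
root_sum <- c(1, 4, 9) |> sqrt() |> sum()

print(bmi)
print(root_sum)
```

The same code written with dotted or camel-case names would work equally well; the point is only that one convention is used everywhere.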
In addition:
- Don't use the names of built-in objects (data, but also: sd, t, dt, df, I, …) for your own variables. fortunes::fortune(77):

  Firstly, don’t call your matrix ‘matrix’. Would you call your dog ‘dog’? Anyway, it might clash with the function ‘matrix’. (Barry Rowlingson, R-help (October 2004))
- Don't use absolute paths (e.g. C:\\Joe's Computer\\Important Stuff)
- Don't use special characters (&, #, $, parentheses) in file names (_ and . are OK)
- Don't leave View(), head(), str() in your code (unless commented out)
- Don't leave install.packages() in a script (unless it’s commented out)
- If you load tidyverse, you don’t need to explicitly load any of the contained packages (dplyr, tidyr, ggplot2, etc.)
- Use library() instead of require() to load packages
- Don't put rm(list=ls()) at the top of your code (Bryan 2017)
- To get a clean session, restart R instead (Session > Restart R or via hotkey in RStudio)
- Don't put setwd(...) at the head of your file (instead, assume that the user has set their working directory correctly; this can be done (1) by hand with setwd(); (2) in RStudio, via the Session menu; (3) automatically in RStudio, by using an R project (home directory is stored in the .Rproj file); (4) using the here package)
- Use TRUE and FALSE rather than T/F (is this in the other style guides already?)
- Use mean(dd$x) instead of dd %>% pull(x) %>% mean()
- Use count() instead of group_by(..)+summarize(count=n()), or use base-R table (which also spreads the results): with(your_data,table(var1,var2))
- … dplyr::rename())
- Use the data= argument whenever possible (e.g. lm())
- Use the across() function in tidyverse (in conjunction with mutate and summarise) to transform multiple columns
- Use stopifnot() (or the assertthat package from the extended hadleyverse) to test conditions
- … mutate() steps
- Use explicit print() statements rather than relying on objects to self-print
- Use explicit return() statements rather than relying on R’s implicit “return value is the last statement in the function” rule
- Don't wrap expressions in an unnecessary c() (e.g. c(1:30)). Lean toward seq() and seq_along(), but OK (?) to use :
- Put each piece of a ggplot specification on a separate line
- Enclose multi-line expressions chained with binary operators (+, %>%) in parentheses. For example,

    thing <- (thing %>%
      mutate(foo=x^2)
    )
rather than
    thing <- thing %>%
      mutate(foo=x^2)
Consider moving the operator to the next line:
    thing <- (thing
      %>% mutate(foo=x^2)
    )
This makes it easier to comment out unwanted lines temporarily.
- Similarly, for complicated multi-argument expressions, put the comma at the start of the following line to make commenting/deleting arguments easier (JD)
    thing <- (thing
      %>% mutate(foo=x^2
        , bar=x^3
        , bletch=x^4
      )
    )
rather than
    thing <- thing %>%
      mutate(foo=x^2,
             bar=x^3,
             bletch=x^4)
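Several of the recommendations in the list above can be tried out in base R alone. The following is a rough sketch, not a definitive implementation: the data frame dd, the helper col_mean(), and the other variable names are hypothetical and exist only for this illustration. It shows why TRUE/FALSE are safer than T/F, why seq_along() is safer than 1:length(x), and how stopifnot(), explicit return(), explicit print(), and with(..., table(...)) fit together.

```r
# Hypothetical data and names, for illustration only.
dd <- data.frame(
  var1 = c("a", "a", "b"),
  var2 = c("x", "y", "x"),
  x = c(1, 2, 6)
)

# T and F are ordinary variables that can be reassigned;
# TRUE and FALSE are reserved words and cannot.
T <- 0
t_is_true <- isTRUE(T)  # FALSE after the reassignment above
rm(T)                   # remove our copy so T refers to the built-in value again

# seq_along() handles zero-length input correctly, whereas
# 1:length(x) counts down from 1 to 0 and yields two elements.
empty <- numeric(0)
n_iter_colon <- length(1:length(empty))
n_iter_seq <- length(seq_along(empty))

# stopifnot() tests conditions up front; return() makes the result explicit.
col_mean <- function(dat, col) {
  stopifnot(
    is.data.frame(dat),
    col %in% names(dat),
    is.numeric(dat[[col]])
  )
  return(mean(dat[[col]]))
}
x_mean <- col_mean(dd, "x")

# Base-R cross-tabulation, in the form with(your_data, table(var1, var2))
tab <- with(dd, table(var1, var2))

# Explicit print() rather than relying on objects to self-print
print(x_mean)
print(tab)
```

Calling col_mean(dd, "nope") stops with an error from stopifnot() instead of silently returning a missing value, which is the main reason to test conditions at the top of a function.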
Bryan, Jenny. 2017. “Project-Oriented Workflow.” Tidyverse. https://www.tidyverse.org/blog/2017/12/workflow-vs-script/.