Follow Hadley Wickham’s style guide, adapted from Google’s style guide (HW’s link to Google is broken). Some of the differences between Google and HW are:
- HW prefers `snake_case` for identifiers; Google likes dot-separated or camel case (`variable.name` (or `variableName`), `FunctionName`, `kConstantName`)

You can use whatever naming conventions you want, but be consistent. (This holds more generally, e.g. for the choice of tidyverse pipe `%>%` or native pipe `|>`, etc.)
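
As a small illustration of what "consistent" can look like, the sketch below sticks to snake_case names and the native pipe throughout. The data and the helper `scale_to_unit()` are hypothetical, and the native pipe assumes R 4.1 or later.

```r
# Hypothetical data; every identifier uses snake_case
plant_heights <- c(12.1, 15.3, 9.8, 14.0)

# Hypothetical helper: rescale a numeric vector to the range [0, 1]
scale_to_unit <- function(x) {
    return((x - min(x)) / (max(x) - min(x)))
}

# Native pipe used throughout (requires R >= 4.1);
# switching to %>% halfway through a script would be the inconsistency to avoid
scaled_heights <- plant_heights |> scale_to_unit() |> round(2)
print(scaled_heights)
```

The same script written entirely with `%>%` and camel case would be equally acceptable; the point is only that one convention is used from start to finish.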
In addition:
- Don’t use the names of built-in R objects (`data`, but also: `sd`, `t`, `dt`, `df`, `I`, …) for your own variables. `fortunes::fortune(77)`: "Firstly, don’t call your matrix ‘matrix’. Would you call your dog ‘dog’? Anyway, it might clash with the function ‘matrix’." (Barry Rowlingson, R-help (October 2004))
- Don’t use absolute paths (e.g. `C:\\Joe's Computer\\Important Stuff`)
- Don’t use special characters (`&`, `#`, `$`, parentheses) in file names (`_` and `.` are OK)
- Don’t leave `View()`, `head()`, `str()` in your code (unless commented out)
- Don’t put `install.packages()` in a script (unless it’s commented out)
- If you load `tidyverse`, you don’t need to explicitly load any of the contained packages (`dplyr`, `tidyr`, `ggplot2`, etc.)
- Use `library()` instead of `require()` to load packages
- Don’t put `rm(list=ls())` at the top of your code (Bryan 2017)
  (instead, restart R: `Session > Restart R` or via hotkey in RStudio)
- Don’t put `setwd(...)` at the head of your file. Instead, assume that the user has set their working directory correctly; this can be done (1) by hand with `setwd()`; (2) in RStudio, via the `Session` menu; (3) automatically in RStudio, by using an R project (the home directory is stored in the `.Rproj` file); (4) using the `here` package
- Use `TRUE` and `FALSE` rather than `T`/`F` (is this already in the other style guides?)
- Use `mean(dd$x)` instead of `dd %>% pull(x) %>% mean()`
- Use `count()` instead of `group_by(..)` followed by `summarize(count=n())`, or use base-R `table()` (which also spreads the results): `with(your_data, table(var1, var2))`
- […] (`dplyr::rename()`)
- Use the `data=` argument whenever possible (e.g. `lm()`)
- Use the `across()` function in tidyverse (in conjunction with `mutate` and `summarise`) to transform multiple columns
- Use `stopifnot()` (or the `assertthat` package from the extended hadleyverse) to test conditions
- […] `mutate()` steps
- Use explicit `print()` statements rather than relying on objects to self-print
- Use explicit `return()` statements rather than relying on R’s implicit rule that a function returns the value of its last evaluated expression
- Avoid unnecessary `c()` (e.g. `c(1:30)`). Lean toward `seq()` and `seq_along()`, but OK (?) to use `:`
- Put each component of a `ggplot` specification on a separate line
- Wrap multi-line expressions joined by operators (`+`, `%>%`) in parentheses. For example,

      thing <- (thing %>%
          mutate(foo=x^2)
      )

  rather than

      thing <- thing %>%
          mutate(foo=x^2)

  Consider moving the operator to the next line:

      thing <- (thing
          %>% mutate(foo=x^2)
      )

  This makes it easier to comment out unwanted lines temporarily.
- Similarly, for complicated multi-argument expressions, put the comma on the following line to make commenting out or deleting arguments easier (JD)

      thing <- (thing
          %>% mutate(foo=x^2
                   , bar=x^3
                   , bletch=x^4
                     )
      )

  rather than

      thing <- thing %>%
          mutate(foo=x^2,
                 bar=x^3,
                 bletch=x^4)

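
Several of the recommendations in the list above can be combined in one short base-R script. This is only a sketch: it uses the built-in `mtcars` dataset, and the names `car_data`, `col_mean()`, `cyl_by_gear` and `col_means` are hypothetical, chosen for illustration.

```r
# Built-in example dataset, copied under a name that does not mask an R object
car_data <- mtcars

# Hypothetical helper: test conditions with stopifnot() and return explicitly;
# TRUE is spelled out rather than abbreviated to T
col_mean <- function(dd, col_name, drop_na = TRUE) {
    stopifnot(
        is.data.frame(dd),
        col_name %in% names(dd),
        is.numeric(dd[[col_name]])
    )
    return(mean(dd[[col_name]], na.rm = drop_na))
}

# Use the data= argument rather than car_data$mpg ~ car_data$wt
fit <- lm(mpg ~ wt, data = car_data)

# Base-R cross-tabulation of two variables
cyl_by_gear <- with(car_data, table(cyl, gear))

# seq_along() rather than 1:length(x), which misbehaves when x is empty
col_names <- c("mpg", "wt", "hp")
col_means <- numeric(length(col_names))
for (i in seq_along(col_names)) {
    col_means[i] <- col_mean(car_data, col_names[i])
}
names(col_means) <- col_names

# Explicit print() so the output also appears when the script is source()d
print(coef(fit))
print(cyl_by_gear)
print(col_means)
```

Because nothing here depends on the working directory or on objects left over from an earlier session, the script should run unchanged after a fresh restart of R.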
Bryan, Jenny. 2017. “Project-Oriented Workflow.” Tidyverse. https://www.tidyverse.org/blog/2017/12/workflow-vs-script/.