version control all the things



Source control is important because it provides:

There are two types of source control systems: centralised and distributed.

Centralised means that there is a single storage location and, typically, a file must be exclusively checked out before you can work on it. Distributed systems involve taking a copy of the code base, making changes, and pushing these back to the primary storage location.
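The distributed workflow described above can be sketched with the Git command line. This is a minimal sketch under stated assumptions, not a prescribed setup: the names `primary.git` and `my-copy`, the file `analysis.R`, and the user name and email are all hypothetical, and a local bare repository stands in for the primary storage location.

```shell
set -e

# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# A bare repository stands in for the primary storage location (hypothetical name).
git init --quiet --bare primary.git

# Take a copy of the code base (the empty-repository warning is silenced).
git clone --quiet primary.git my-copy 2>/dev/null
cd my-copy

# Make a change and record it locally.
echo 'summary(cars)' > analysis.R
git add analysis.R
git -c user.name="Example User" -c user.email="user@example.com" \
    commit --quiet -m "Add first analysis script"

# Push the change back to the primary storage location.
git push --quiet origin HEAD

# The commit is now visible in the primary repository.
git --git-dir="$workdir/primary.git" log --oneline --all
```

Nothing is ever locked here: every copy is a full repository, and changes only meet when they are pushed.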

Both have their disadvantages, but with distributed source control you never get the situation where someone has left a file checked out while they go on holiday and no one else can use it. For that reason, I'm a big fan of distributed source control systems.


Git is a distributed source control system.

It integrates neatly into RStudio, making it easy to source control your analysis.

The git conceptual model (\@cthydng)

[Figure: git model]

Git glossary

There are more terms. For a friendly glossary, see GitHub's Git glossary; for an extensive, technical glossary, see the official Git glossary.


The package git2r supports a source control workflow directly within R. This means you can continue to use RStudio for even complex Git tasks. And of course there's always the shell option in RStudio.
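If you do reach for the shell option, the day-to-day commands are few. The following is a rough sketch of a local workflow, assuming Git is installed and on the path; the temporary project directory, the file `report.Rmd`, and the identity passed with `-c` are hypothetical placeholders.

```shell
set -e

# A throwaway project directory (hypothetical) to experiment in.
project=$(mktemp -d)
cd "$project"

# Turn the directory into a Git repository.
git init --quiet

# Create a file, then ask Git what it has noticed.
echo '# Sales report' > report.Rmd
git status --short    # report.Rmd is listed as untracked (??)

# Stage the file and record a first commit.
git add report.Rmd
git -c user.name="Example User" -c user.email="user@example.com" \
    commit --quiet -m "Start sales report"

# Edit the file, review the change, and commit it.
echo 'More analysis to come.' >> report.Rmd
git diff --stat
git -c user.name="Example User" -c user.email="user@example.com" \
    commit --quiet -am "Note planned analysis"

# Review the history, newest commit first.
git log --oneline
```

In a real project you would normally set your name and email once with `git config` rather than passing them on every commit.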

For a handy Git cheatsheet, check out this GitHub one.

The git2r documentation is pretty good. It's easier to use, though, once you've been using the RStudio GUI for a bit and dabbling with the command line.