RStudio Projects

Final set up

So that Git works properly with RStudio, and so that your commits are attributed to your GitHub account, open Git Bash (Windows) or Terminal (macOS) and enter:

git config --global user.name "Jane Doe"
git config --global user.email "jane@example.com"

Substitute your own name and the email address associated with your GitHub account. The user.name you give does not have to match your GitHub username; it can be your actual first and last name.
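To check that the settings were saved, you can ask Git to print them back. This is a minimal sketch: the messages after `||` are our own wording (not Git output) and are only shown if a value has not been set.

```shell
# Print the globally configured name and email.
# 'git config --get' exits with an error when the key is missing,
# in which case the fallback message after '||' is printed instead.
git config --global --get user.name  || echo "user.name is not set"
git config --global --get user.email || echo "user.email is not set"
```

If either value looks wrong, re-run the matching git config command with the corrected value; the new value replaces the old one.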

Generate a personal access token

Log in to your GitHub account. Select your profile picture in the top right corner and select ‘Settings’. Scroll to the bottom of the menu bar on the left and select ‘<> Developer settings’ -> ‘Personal access tokens’ -> ‘Tokens (classic)’.

From here, select ‘Generate new token’ -> ‘Generate new token (classic)’.

Under ‘Note’, give your token a name that reminds you what it is used for, for example tom-work-laptop. You can then set an expiration date.

If you would like to use the same token indefinitely, so that you don’t have to return to GitHub to generate a new one, give it a fairly generic name and set the expiration to ‘No expiration’.

For the ‘scopes’, we recommend selecting ‘repo’, ‘workflow’ and ‘user’. Scopes define what type of access a token grants, and they are one important reason to create different personal access tokens for different purposes. ‘Fine-grained tokens’ take this even further by allowing you to specify different access types for different repositories. When you are done, click ‘Generate token’ and copy the token straight away, because GitHub displays it only once.

Create GitHub repository & RStudio project

We’re going to work in an RStudio project connected to a GitHub repository.

First we create the repository in GitHub:

  1. Go to github.com, sign in, then click the green ‘New’ button.
  2. Name it “r-data-science”, leave the rest as default, then click ‘Create repository’.
  3. Copy the github.com address shown in the ‘Quick setup’ box.
  4. Open RStudio, select File -> New Project -> Version Control -> Git, and paste the address into the ‘Repository URL’ box.

In the box called ‘Create project as subdirectory of:’ you need to select a location on your machine. It’s a good idea to create a folder called something like ‘projects’ in your user folder and select it every time, so that all your repositories are grouped in the same place on your machine. Let’s do that now: click ‘Browse’, navigate to your user folder, create a new folder called ‘projects’, select it, and click ‘Create Project’.
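If you prefer, the ‘projects’ folder can be created from Git Bash or Terminal before you browse to it in RStudio. This is a small sketch, assuming that `$HOME` points at your user folder (which is normally the case in both Git Bash and the macOS Terminal).

```shell
# Create a 'projects' folder inside your user (home) folder.
# -p means no error is raised if the folder already exists.
mkdir -p "$HOME/projects"

# Print the folder's path to confirm that it now exists.
ls -d "$HOME/projects"
```

You can then select this folder in the ‘Create project as subdirectory of:’ box as described above.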

RStudio opens a new project called r-data-science. An RStudio project is essentially the .Rproj file: this file tells RStudio that the folder is home to the project, and to use the folder as the ‘working directory’, the place where R looks when you ask it to load data or save an output.

Set up folders

Create the following folders from within RStudio:

  • data_raw: holds the raw, untouched data.
  • data: holds cleaned data.
  • figures: holds figures and plots.
  • src: holds analysis script(s).

There are many ways of setting up project folders, but the idea is to pick a convention and stick to it across different projects to stay organised and save time.
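The same folders can also be created in one go from a terminal, for example the Terminal tab in RStudio. This sketch assumes the terminal is already in the project folder (the one containing the .Rproj file); if not, change into it first.

```shell
# Create the four project folders in the current directory.
# -p means no error is raised if a folder already exists.
mkdir -p data_raw data figures src

# List the folders to confirm they were created.
ls -d data_raw data figures src
```

Because the command is safe to repeat, it is a quick way to set up the same layout in every new project.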

Download this zipped dataset, move it into data_raw, and unzip it.

Create a new R script (Ctrl+Shift+N, or Cmd+Shift+N on macOS), name it analysis.R, and save it in the src folder.