library(readr)
library(dplyr)
library(tidyr)
library(ggplot2)
library(purrr)
library(car)
library(rstatix)
Version Control
When working on your own projects, or collaborating with others, you’re likely to run into several problems:
- Work may be lost or accidentally overwritten
- Keeping track of which version of a file is current becomes harder and harder as projects grow
- It’s difficult to see who did what and for what reason
- It’s difficult to revert to previous versions if something goes wrong
Using a proper version control system solves these problems and makes collaborating on plain text documents like code much cleaner and simpler. We’ll be using Git (installed locally on our machines) alongside GitHub (a cloud service) to demo a simple collaborative workflow.
Committing with Git
Now that we’ve set up our project folder, it’s a good time to make our first ‘commit’ to Git.
Select the Git tab in RStudio and tick the ‘staged’ box for all the files (or select one file, press Ctrl+A, and then tick one of the selected files to tick all). Click Commit and type a commit message detailing the changes (let’s go with “set up project” for now) and select the ‘Commit’ button.
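For reference, the staging and commit steps above correspond roughly to the following Git commands in a terminal. This is a minimal sketch run in a throwaway folder: the temporary directory, the contents of `analysis.R`, and the user name and email are placeholders, not part of the course project.

```shell
# Work in a throwaway folder so the real project is untouched (assumption).
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/main    # name the first branch 'main'

# Placeholder identity, set for this repository only.
git config user.name "Example User"
git config user.email "user@example.com"

# A stand-in project file.
echo "# analysis script" > analysis.R

git add --all                            # like ticking every 'Staged' box
git commit --quiet -m "set up project"   # like pressing the 'Commit' button
git log --oneline                        # lists the new commit
```

In your own project you would skip the setup lines and run only `git add` and `git commit` inside the existing project folder.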
You may now be presented with a pop-up asking you to connect to GitHub. Here you will need to paste in your GitHub token.
You can now ‘push’ your commit up to GitHub, so that the GitHub repository reflects the current state of the repository on your machine.
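The push step can be sketched on the command line as well. So that the example runs without a GitHub account, a local bare repository stands in for GitHub here; that stand-in, the paths, and the identity are all assumptions. In the real project, `origin` would point at your GitHub repository instead.

```shell
work="$(mktemp -d)"
git init --quiet --bare "$work/remote.git"   # stand-in for the GitHub repository
git init --quiet "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/main        # name the first branch 'main'
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "# analysis script" > analysis.R
git add --all
git commit --quiet -m "set up project"

git remote add origin "$work/remote.git"
git push --quiet -u origin main   # -u links local 'main' to 'origin/main'

# Both commands print the same commit hash once the push has succeeded.
git rev-parse main
git rev-parse origin/main
```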
Working with branches
Git uses branches to make it easier to separate out new work and have more control over merging new work into the existing project. Let’s create a new branch in the Git pane by clicking the button with the purple shapes next to ‘main’. Let’s call it load-packages, and leave everything as default. Usually you would do a bit more work in a new branch before merging it into the main branch, but for our purposes we’ll just add some library() calls up the top of our script to load the packages we’ll need.
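Creating the branch from a terminal instead of the Git pane might look like the sketch below. It again uses a throwaway repository with a placeholder file and identity; only the branch name `load-packages` comes from the text above.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/main    # name the first branch 'main'
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "# analysis script" > analysis.R
git add --all
git commit --quiet -m "set up project"

git checkout --quiet -b load-packages    # create the branch and switch to it
git branch                               # the * marks the current branch
```

A new branch starts out pointing at the same commit as the branch it was created from, so nothing differs between `main` and `load-packages` until you commit.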
Open analysis.R and enter the library() calls listed at the start of this section up the top of the script.
Save the file and open the Git pane in RStudio. You should see the script file listed there. Let’s stage it, click Commit, enter a commit message such as “add package loading”, commit, and push.
We’ve made this commit on the load-packages branch locally, and have now also pushed it up to the load-packages branch in our GitHub repository. Now we need to initiate what is called a ‘pull request’ in GitHub.
Merging with pull requests
A pull request is a proposal to merge a set of changes from one branch into another. In a pull request, collaborators can review and discuss the proposed set of changes before they integrate the changes into the main codebase. Pull requests display the differences, or diffs, between the content in the source branch and the content in the target branch.
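The diff a pull request displays can be previewed locally before opening one. The sketch below builds a throwaway repository with a `main` and a `load-packages` branch (file contents and identity are placeholders) and then compares them. The three-dot form `main...load-packages` shows what changed on the branch since it split from `main`.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/main    # name the first branch 'main'
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "# analysis script" > analysis.R
git add --all
git commit --quiet -m "set up project"

# Make a change on a separate branch.
git checkout --quiet -b load-packages
printf 'library(readr)\nlibrary(dplyr)\n# analysis script\n' > analysis.R
git commit --quiet -am "add package loading"

# Lines starting with + were added on load-packages.
git diff main...load-packages
# One-line-per-file summary of the same comparison.
git diff --stat main...load-packages
```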
Head to github.com, sign in if you aren’t already, click your profile icon top right and select ‘repositories’. Your ‘r-data-science’ repo should be up the top as it has just had changes pushed to it. Select the repository and click the big green ‘Compare & pull request’ button up the top. You should then see a page with an overview of the two branches and a green ‘Create pull request’ button. After clicking that, you’ll see a review page that shows a high-level overview of the changes between your branch (the compare branch) and the repository’s base branch. You can add a summary of the proposed changes, review the changes made by commits, add labels, milestones, and assignees, and @mention individual contributors or teams. After you’re happy with the proposed changes, you can merge the pull request.
In this example, you should be able to go ahead and merge the pull request straight away. But when working with other people, it’s fairly common to run into merge conflicts. Merge conflicts occur when people make different changes to the same line of the same file, or when one person edits a file and another person deletes the same file. You must resolve all merge conflicts before you can merge a pull request on GitHub. If you have a merge conflict between the compare branch and base branch in your pull request, you can view a list of the files with conflicting changes above the Merge pull request button. The Merge pull request button is deactivated until you’ve resolved all conflicts between the compare branch and base branch.
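To see how a conflict arises, it can help to provoke one locally. The following is a sketch under stated assumptions: a throwaway repository, a placeholder identity, and a one-line `analysis.R` that is edited differently on two branches. It reproduces the mechanism with a local `git merge` rather than a GitHub pull request.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/main    # name the first branch 'main'
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "x <- 1" > analysis.R
git add --all
git commit --quiet -m "set up project"

# Change the same line differently on each branch.
git checkout --quiet -b load-packages
echo "x <- 2" > analysis.R
git commit --quiet -am "set x to 2"
git checkout --quiet main
echo "x <- 3" > analysis.R
git commit --quiet -am "set x to 3"

# Git cannot pick a winner, so the merge stops and reports a conflict.
if git merge --no-edit load-packages; then
  echo "merged cleanly"
else
  echo "merge conflict"
  git status --short        # lists analysis.R as unmerged (UU)
fi

# Resolve by writing the version to keep, then stage and commit.
echo "x <- 2" > analysis.R
git add analysis.R
git commit --quiet -m "merge load-packages, resolving conflict"
```

While the merge is stopped, the conflicted file contains both versions separated by `<<<<<<<`, `=======`, and `>>>>>>>` markers; resolving means replacing that region with the content you want.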
Collaborating with Git
You can invite collaborators to give them access to your repo. This allows other people to pull from and push to the repo, and perform a variety of other tasks.
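The pull and push round trip between two people can be sketched with two local clones. Everything here is illustrative: a local bare repository stands in for GitHub, and the folder names `you` and `collaborator`, the identities, and the file contents are hypothetical.

```shell
work="$(mktemp -d)"
git init --quiet --bare "$work/remote.git"                  # stand-in for GitHub
git -C "$work/remote.git" symbolic-ref HEAD refs/heads/main # default branch 'main'

# Your copy: create the project and push it.
git init --quiet "$work/you"
cd "$work/you"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo "# analysis script" > analysis.R
git add --all
git commit --quiet -m "set up project"
git remote add origin "$work/remote.git"
git push --quiet -u origin main

# The collaborator clones, commits a change, and pushes it.
git clone --quiet "$work/remote.git" "$work/collaborator"
cd "$work/collaborator"
git config user.name "Example Collaborator"
git config user.email "collaborator@example.com"
echo "library(readr)" >> analysis.R
git commit --quiet -am "load readr"
git push --quiet origin main

# Back in your copy, pull to receive the collaborator's commit.
cd "$work/you"
git pull --quiet --ff-only
git log --oneline           # now shows both commits
```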
Version control with Git and GitHub is a complex topic and not something we can spend too much time on here, so please see this online introductory Git lesson if you’d like to learn more.