15  Local branching with Git.

16 Warmup exercise

Use the terminal to create a new git repo so that when you run our git viz command you see this output:

branching_warmup % git log --oneline --abbrev-commit --all --graph --decorate --color                         
* 50590c5 (HEAD -> main) Wednesday
* 39f89db Tuesday
* 1b2162e Monday

Remember that for a new git repo you may have to provide a username and email address. These commands will help (you don’t need to change the names)

git config --global user.email "you@example.com"
git config --global user.name "Your Name"

The git viz command is long, so you can create a short cut for it on the commandline:

git config alias.viz 'log --oneline --abbrev-commit --all --graph --decorate --color'

Then you will only need to type:

git viz

17 Branching with local git

Branching (and merging) is a useful way to keep different tasks organized and synchronized, even when working locally with git. Below we will work through a scenario for git branching, showing the commands and two different vizualizations. Eventually branching becomes a key part of working with github and shared repositories, but it is useful to approach it separately first.

17.1 A scenario

Imagine you are running a coffee roaster. You have code that produces a daily report that is emailed to your team. Each day the point of sale system makes new data available. Each night your computer executes a script (report_script.Rmd) to access the data, create, and email the report.

We are going to manage a situation where we want to:

  1. experiment and break things
  2. keep things running

It is very hard to do these things at the same time. Branching helps us to do this.

One option might be to maintain different versions of files using different file names (report_script.Rmd and new_report_script.Rmd. This becomes very messy when we have many files and folders involved (/images and /new_images???). This is even worse when we have multiple experiments and attempts going on in parallel and entirely hopeless when we move online and are working with groups of people.

Recall, though, that git enables us to “swap out” our working directory. Until now we’ve kept this linear, each new version added after the last. But branching enables us to keep parallel lines of commits (as though we had two lines of trays with on our paper plane repository table).

17.2 Coordinating new work while keeping old work running

Currently the report is pretty simple, it’s just a table of sales divided up between in-store and online. Each day’s sales are added as a new row in the table.

cd ~
mkdir scratch_branching
cd scratch_branching
git init
Initialized empty Git repository in scratch_branching/.git/

We can now add the reporting script and the data for monday. To simulate editing a file using an editor we can use some fancy syntax so you can copy and paste it.

echo "# Initial code for table report" >> reporting_script.Rmd
git add reporting_script.Rmd
git commit -m "starting setup"
[master (root-commit) 24bcda4] starting setup
 2 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 data_monday.csv
 create mode 100644 reporting_script.Rmd

The scripts runs normally on Monday night.

One Tuesday morning you decide that the data would be better presented as a chart. You have some ideas but rightly decide it will take more than a day or two to get that working.

In the meantime you still have to produce the report. So you create a branch called towards_chart and begin working there, leaving the master branch untouched to produce the Tuesday night report.

git branch towards_chart
git checkout towards_chart

Again we can simulate editing the file.

echo "# Work towards charts" >> reporting_script.Rmd 
git status
On branch towards_chart
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
    modified:   reporting_script.Rmd

no changes added to commit (use "git add" and/or "git commit -a")
git add reporting_script.Rmd
git commit -m "Worked towards charts"
[towards_chart acc3f42] Worked towards charts
 1 file changed, 1 insertion(+)

On Tuesday you see that the report ran fine (using the code in master) and you continue to work on the charts.

echo "# More work towards charts" >> reporting_script.Rmd 
git add reporting_script.Rmd
git commit -m "More work towards charts"
[towards_chart acc3f42] Worked towards charts
 1 file changed, 1 insertion(+)

On Wednesday morning, though, you get an error from your reporting script. The online sales system updated and the data files now include a new column saying whether a sale was cash or credit. You know how to get things working again, but you still aren’t ready to launch your chart system. You don’t want to wait until your chart work is finished to get the report going again.

So you switch to master, edit the reporting file to handle the new column, then add and commit the change.

git checkout master
Switched to branch 'master'

Now we are back on the master branch and we won’t see our work towards the charts at all. So over on the towards_chart branch we can work away without upsetting the working code on master.

Check the current content of the script (will show nothing about charts)

cat reporting_script.Rmd
Initial code for table report

So now we can, without involving our work towards the chart, make the bugfix on master.

echo "# Fix to match new data format" >> reporting_script.Rmd
git add reporting_script.Rmd
git commit -m "A fix to match new data format"

Now if we run our git viz command:

git log --oneline --abbrev-commit --all --graph --decorate --color

we will see our branching starting to show up:

jlh5498@educcomp04:~/scratch_branching$ git log --oneline --abbrev-commit --all --graph --decorate --color
* 48afd7c (HEAD -> master) A fix to match new data format
| * 5c7fc76 (towards_chart) More work towards charts
| * c72d745 Worked towards charts
|/  
* c434df6 starting setup

Happily the report runs fine on Wednesday night.

Thursday morning you switch back to the towards_chart branch and are pleased to get things working.

git checkout towards_chart
echo "# Code to finalize the charts" >> reporting_script.Rmd 
git add reporting_script.Rmd
git commit -m "Finished up the charts"

You are ready to add the chart into the report by moving the code to the master branch.

But if you just merge the code back to master you may find that the code for the chart doesn’t work with the change to handle the updated data files. So you may have some merging to do. But you don’t know if you can get that done before the report has to run, and you don’t want to get caught fiddling with master because if the report tries to run you could end up with nothing going out that night.

So you first merge master over to your towards_chart branch, and ensure that things work well and the two pieces of work done in parallel work well together (charts and dealing with the new column).

First confirm which branch you are in, this command lists the branches and the one highlighted with the * is the current branch. git status can also show you.

git branch -v 
git branch -v
  master          576bc90 A fix to match new data format
* towards_chart bef7b72 Finished up the charts

Then merge over the master branch into the towards_chart branch.

git merge master
Auto-merging reporting_script.Rmd
CONFLICT (content): Merge conflict in reporting_script.Rmd
Automatic merge failed; fix conflicts and then commit the result.

Ah, good thing we did this on the branch because we do end up with a conflict. Git can resolve some conflicts but not all. Git shows conflicts by adding special lines of text into the file (using >>>>>> and <<<<< as indicators. To resolve them we remove those lines and leve the file the way we want it to be.

# Initial code for table report
<<<<<<< HEAD
# Work towards charts
# More work towards charts
# Code to finalize the charts
=======
# Fix to match new data format
>>>>>>> master

The parts separated by ======== show edits that conflict. Looking at this we can see that we want all the lines, so we edit the file to show:

# Initial code for table report
# Fix to match new data format
# Work towards charts
# More work towards charts
# Code to finalize the charts

Now we can save that file. Git knows that we are fixing a conflict, so git status shows:

On branch towards_chart
You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)

Unmerged paths:
  (use "git add <file>..." to mark resolution)
    both modified:   reporting_script.Rmd

no changes added to commit (use "git add" and/or "git commit -a")

And now we can add and commit to finish the merge.

git add reporting_script.Rmd
git commit -m "merged bug fix with charts code"
[towards_chart e420a05] merged bug fix with charts code

(Git can only flag syntactical conflicts, not semantic conflicts, so one would also try running some test reports to make sure things are all working).

So once you’ve resolved all issues, then you can merge the towards_chart branch back to master, signalling that you are ready to launch your new report with charts.

git checkout master
git merge towards_chart
Switched to branch 'master'
Updating 576bc90..e420a05
Fast-forward
 reporting_script.Rmd | 3 +++
 1 file changed, 3 insertions(+)

Finally we can visualize this branching, editing, and merging in two ways. First we can use this handy command to see a visualization in the command line.

git log --oneline --abbrev-commit --all --graph --decorate --color               

Which will show us this (using an image here because the color doesn’t copy):

We can also see this visually using learngitbranching, see the short video below.

17.2.1 Exercises (in-class):

  1. Go to “Learn Git Branching” site: https://learngitbranching.js.org/. (Note that this site is multi-locale, see the small planet icon in the bottom right?)
  2. Do exercises 2 and 3 (“Branching in Git” and “Merging in Git”)
  3. Shift back to the terminal and replicate the “Merging in Git” lesson. You will need to start with these commands (to create a new, empty, repo):
cd ~
mkdir replicate_learngitbranching
cd replicate_learngitbranching
git init

Remember that when we are working in the terminal it is different from learngitbranching because we have to actually create, edit, and save files before we can commit. We also have to use git commit -m 'some message' rather than just git commit.