Summary and Schedule
Welcome to the Introduction to Git and GitHub training! This page contains learning outcomes and setup that must be completed before working through the lesson.
Version control is fundamental to good quality assurance of science and code. Using a Version Control System is better than mailing files back and forth:
Nothing that is committed to version control is ever lost, unless you work really, really hard at losing it. Since all old versions of files are saved, it’s always possible to go back in time to see exactly who wrote what on a particular day, or what version of a program was used to generate a particular set of results.
As we have this record of who made what changes when, we know who to ask if we have questions later on, and, if needed, revert to a previous version, much like the “undo” feature in an editor.
When several people collaborate in the same project, it’s possible to accidentally overlook or overwrite someone’s changes. The version control system automatically notifies users whenever there’s a conflict between one person’s work and another’s.
Teams are not the only ones to benefit from version control: lone researchers can benefit immensely. Keeping a record of what was changed, when, and why is extremely useful for all researchers if they ever need to come back to the project later on (e.g., a year later, when memory has faded).
Version control is the lab notebook of the digital world: it’s what professionals use to keep track of what they’ve done and to collaborate with other people. Every large software development project relies on it, and most programmers use it for their small jobs as well. And it isn’t just for software: books, papers, small data sets, and anything that changes over time or needs to be shared can and should be stored in a version control system.
Prerequisites
No knowledge of Git and GitHub is required. In this lesson we use Git from the command line. Some previous experience with the command line is expected, but isn’t mandatory. Access to the command line and a browser connected to the Internet are required.
This lesson is platform-independent and can be taken on Linux, Windows, and MacOS. This lesson has been fully tested on Linux. If you encounter any problems using either Windows or MacOS please reach out for support.
Setup Instructions | Download files required for the lesson | |
Duration: 00h 00m | 1. Automated Version Control | What is version control and why should I use it? |
Duration: 00h 05m | 2. Setting Up Git | How do I get set up to use Git? |
Duration: 00h 10m | 3. Creating a Repository | Where does Git store information? |
Duration: 00h 20m | 4. Branches | Understand how branches are created. Learn the key commands to view and manipulate branches. |
Duration: 00h 50m | 5. Tracking Changes | How do I record changes in Git? How do I check the status of my version control repository? How do I record notes about what changes I made and why? |
Duration: 01h 10m | 6. Exploring History | How can I identify old versions of files? How do I review my changes? |
Duration: 01h 30m | 7. Reverting Changes | How can I recover old versions of files? |
Duration: 01h 55m | 8. Ignoring Things | How can I tell Git to ignore files I don’t want to track? |
Duration: 02h 00m | 9. Break | |
Duration: 02h 00m | 10. Remotes in GitHub | How do I share my changes with others on the web? |
Duration: 02h 45m | 11. Exploring GitHub | How do I search a repository? |
Duration: 03h 05m | 12. Exploring History on GitHub | How can I identify old versions of files on GitHub? How do I review my changes on GitHub? |
Duration: 03h 15m | 13. Pull Requests | What are pull requests for? How can I make a pull request? |
Duration: 04h 00m | 14. Configuring GitHub | How do I edit my GitHub profile? How do I change my notification preferences? How do I change my organisation membership visibility, and team memberships? |
Duration: 04h 15m | 15. End | |
Duration: 04h 15m | 16. Open Science | How can version control help me make my work more open? |
Duration: 04h 25m | 17. Licensing | What licensing information should I include with my work? |
Duration: 04h 30m | 18. Citation | How can I make my work easier to cite? |
Duration: 04h 32m | 19. Hosting | Where should I host my version control repositories? |
Duration: 04h 42m | 20. Using Git from RStudio | How can I use Git with RStudio? |
Duration: 04h 52m | Finish |
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
How to get help
- Attend a Git and GitHub Surgery.
- Email the support mailbox.
Details of surgeries can be found on the Science Git Migration Project Comms Site.
Lesson Outcomes
This lesson is split into two parts. Episodes in the first half before the break focus on Git. Episodes in the second half after the break focus on GitHub.
By the end of the Git section you will be able to:
- describe the importance of Version Control.
- describe the similarities and differences between SVN and Git.
- configure Git.
- initialise new repositories using Git.
- develop changes on a branch.
- view differences between changes and explore your repository’s history.
By the end of the GitHub section you will be able to:
- set up important GitHub settings and SSH access.
- describe the similarities and differences between Trac and GitHub.
- explain the differences between private, internal and public repositories.
- set up a repository, including how to protect the main branch and enable wiki pages, discussions, and GitHub projects.
- view a repository’s history and differences between changes.
- view Issues and Pull Requests.
- contribute changes to a repository using a Pull Request.
And much more!
Pre-lesson Survey
Please remember to fill out the pre-lesson survey prior to the start of the lesson. This information is vital for us to keep improving the lesson for other learners.
Installing Git
Creating a GitHub Account
You will need an account for GitHub to follow the GitHub episodes of this lesson.
- Go to https://github.com and follow the “Sign up” link at the top-right of the window.
- Follow the instructions to create an account.
- Verify your email address with GitHub.
- Configure multi-factor authentication and/or a passkey (see below).
There is no fixed guidance for choosing your GitHub username; however, you should ensure it is suitable for work. At the Met Office a common pattern for usernames is: mo-{first name initial}{surname}. So if your name is Eleanor Ormerod your username would be: mo-eormerod
Multi-factor Authentication
In 2023, GitHub introduced a requirement for all accounts to have multi-factor authentication (MFA) configured for extra security. Several options exist for setting up MFA, which are summarised here:
- If you already use an authenticator app, like Google Authenticator or Duo Mobile on your smartphone for example, add GitHub to that app.
- If you have access to a smartphone but do not already use an authenticator app, install one and add GitHub to the app.
- If you do not have access to a smartphone or do not want to install an authenticator app, you have two options:
The GitHub documentation provides more details about configuring MFA.
Passkeys
To completely avoid having authentication for work purposes on a personal device you may choose to set up a passkey. Your instructor or organisation will be able to provide guidance on suitable passkey providers and password managers. At the Met Office the KeePass password manager is available. Search for “Using KeePass for one-time passwords” on SharePoint for setup instructions.
SSH Setup
We recommend you move to using SSH keys instead of a Personal Access Token (PAT); instructions are below. This material will still work using a PAT: please see the Note for Personal Access Token Users dropdown in the first GitHub episode.
Before you can connect to a repository on GitHub, you need to set up a way for your computer to authenticate with GitHub.
We are going to set up the method that is commonly used by many different services to authenticate access on the command line. This method is called Secure Shell Protocol (SSH). SSH is a cryptographic network protocol that allows secure communication between computers using an otherwise insecure network.
SSH uses what is called a key pair: two keys that work together to validate access. One key is publicly known and is called the public key; the other, called the private key, is kept private.
You can think of the public key as a padlock, and only you have the key (the private key) to open it. You use the public key where you want a secure method of communication, such as your GitHub account. You give this padlock, or public key, to GitHub and say “lock the communications to my account with this so that only computers that have my private key can unlock communications and send git commands as my GitHub account.”
What we will do now is the minimum required to set up an SSH key and add the public key to a GitHub account.
Keeping your keys secure
Don’t forget about your SSH keys once they are set up, since they keep your account secure. It’s good practice to audit your SSH keys every so often, especially if you are using multiple computers to access your account.
Run the list command to check what key pairs already exist on your computer.
$ ls -al ~/.ssh
Your output is going to look a little different depending on whether or not SSH has ever been set up on the computer you are using.
If you have not set up SSH on your computer, you will see
OUTPUT
ls: cannot access '~/.ssh': No such file or directory
If SSH has been set up on the computer you’re using, the public and private key pairs will be listed. The file names are either id_ed25519/id_ed25519.pub or id_rsa/id_rsa.pub depending on how the key pairs were set up.
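As a rough illustration, the check for these two common file-name pairs could be scripted as below. The helper name check_key_pairs is hypothetical (it is not part of Git or OpenSSH), and the sketch assumes the default file names listed above.

```shell
# Report which of the common key pairs exist in a directory (default: ~/.ssh).
# check_key_pairs is a hypothetical helper name used only for illustration.
check_key_pairs() {
    ssh_dir="${1:-$HOME/.ssh}"
    for name in id_ed25519 id_rsa; do
        # A usable pair needs both the private key and the matching .pub file.
        if [ -f "$ssh_dir/$name" ] && [ -f "$ssh_dir/$name.pub" ]; then
            echo "found key pair: $name"
        fi
    done
}

# With no argument the helper looks in ~/.ssh.
check_key_pairs
```

If nothing is printed, no key pair with a default name was found and you can go on to create one.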
Create an SSH key pair
To create an SSH key pair use the following command, where the -t option specifies which type of algorithm to use:
$ ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_github -C "your_email@example.com"
The -f flag specifies a path to a file to store the key in.
The -C flag attaches a comment to the key. The comment has no effect on your key; you may place anything here to help you remember what the key is for. It makes no difference whether you use a public email or your no-reply private GitHub email in the comment.
If you are using a legacy system that doesn’t support the Ed25519 algorithm, use:
$ ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_ed25519_github -C "your_email@example.com"
OUTPUT
Generating public/private ed25519 key pair.
Enter passphrase (empty for no passphrase):
Now you will be prompted for a passphrase. If the computer you work on is shared between multiple users you should set a passphrase. Be sure to use something memorable or save your passphrase somewhere, as there is no “reset my password” option. If you do not share your computer there is no need to set a passphrase so just press Enter.
Note that, when typing a passphrase on a terminal, there won’t be any visual feedback of your typing. This is normal: your passphrase will be recorded even if you see nothing changing on your screen.
OUTPUT
Enter same passphrase again:
After entering the same passphrase a second time, we receive the confirmation
OUTPUT
Your identification has been saved in ~/.ssh/id_ed25519_github
Your public key has been saved in ~/.ssh/id_ed25519_github.pub
The key fingerprint is:
SHA256:SMSPIStNyA00KPxuYu94KpZgRAYjgt9g4BA4kFy3g1o e.ormerod@mo-weather.uk Azure SPICE
The key's randomart image is:
+--[ED25519 256]--+
|^B== o. |
|%*=.*.+ |
|+=.E =.+ |
| .=.+.o.. |
|.... . S |
|.+ o |
|+ = |
|.o.o |
|oo+. |
+----[SHA256]-----+
The “identification” is actually the private key. You should never share it. The public key is appropriately named. The “key fingerprint” is a shorter version of a public key.
Now that we have generated the SSH keys, we will find the SSH files when we check.
$ ls -al ~/.ssh
OUTPUT
drwxr-xr-x 1 Eleanor 197121 0 Jul 16 14:48 ./
drwxr-xr-x 1 Eleanor 197121 0 Jul 16 14:48 ../
-rw-r--r-- 1 Eleanor 197121 419 Jul 16 14:48 id_ed25519_github
-rw-r--r-- 1 Eleanor 197121 106 Jul 16 14:48 id_ed25519_github.pub
Copy the public key to GitHub
Now we have an SSH key pair, and we can run this command to check whether GitHub can read our authentication.
$ ssh -T git@github.com
OUTPUT
The authenticity of host 'github.com (192.30.255.112)' can't be established.
RSA key fingerprint is SHA256:nThbg6kXUpJWGl7E1IGOCspRomTxdCARLviKw6E5SY8.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? y
Please type 'yes', 'no' or the fingerprint: yes
Warning: Permanently added 'github.com' (RSA) to the list of known hosts.
git@github.com: Permission denied (publickey).
Now we need to give GitHub the public key.
Ideally, before connecting to a new host, like github.com in the output above, you would check that the RSA key fingerprint matches the expected value. GitHub publishes their public SSH key fingerprints for you to check against.
First, we need to copy the public key. Be sure to include the .pub at the end, otherwise you’re looking at the private key.
$ cat ~/.ssh/id_ed25519_github.pub
OUTPUT
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDmRA3d51X0uu9wXek559gfn6UFNF69yZjChyBIU2qKI e.ormerod@mo-weather.uk
Now, go to GitHub.com and click on your profile icon in the top-right corner to open the drop-down menu. Click “Settings”, then on the settings page click “SSH and GPG keys” in the “Access” menu on the left. Click the “New SSH key” button on the right. Now you can add a title (normally an ID for the computer storing the keys, such as “Work Linux”), paste your SSH key into the field, and click the “Add SSH key” button to complete the setup.
Single sign-on (SSO)
If you are part of an organisation that requires single sign-on (SSO) to access their GitHub organisation you will need to authorise the key for use in the organisation.
Next to the newly created SSH key in the GitHub settings click on “Configure SSO”. Find the organisation in the list and click on “Authorise”.
Now check your authentication again from the command line.
$ ssh -T git@github.com
OUTPUT
Hi Eleanor! You've successfully authenticated, but GitHub does not provide shell access.
This output confirms that the SSH key works as intended.
If your new key failed to connect you may need to alter your SSH config.
- Create the ~/.ssh/config file if it doesn’t exist
- Add the following to the file:
Host github.com
IdentityFile ~/.ssh/id_ed25519_github
IdentitiesOnly yes
This explicitly states which key to use for github.com and is needed if you have many SSH keys already for other hosts. IdentityFile along with IdentitiesOnly yes ensures that only your GitHub SSH key is offered to github.com. This prevents failures caused by trying more key files than the GitHub server accepts.
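The two steps above could also be scripted. The sketch below is one possible way to do it, not part of the lesson itself: the function name add_github_ssh_config is hypothetical, and the chmod calls assume you want the usual owner-only permissions on the SSH directory and config file.

```shell
# Hypothetical helper: append the GitHub host entry to an SSH config file
# unless an entry for github.com is already present.
add_github_ssh_config() {
    ssh_dir="${1:-$HOME/.ssh}"
    config_file="$ssh_dir/config"

    # Create the directory and file if they do not exist yet.
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$config_file"

    # Skip the append if a "Host github.com" entry already exists.
    if ! grep -q '^Host github\.com$' "$config_file"; then
        printf '%s\n' \
            'Host github.com' \
            '    IdentityFile ~/.ssh/id_ed25519_github' \
            '    IdentitiesOnly yes' >> "$config_file"
    fi

    # Keep the config readable and writable by the owner only.
    chmod 600 "$config_file"
}
```

Calling add_github_ssh_config with no argument targets ~/.ssh; passing a directory lets you try it on a scratch copy first.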
Optional: Git Autocomplete
Git provides a script which lets you display the version control status in your terminal prompt. The following instructions have been tested on Linux. If you are using MacOS or Windows please consult the Git autocomplete instructions at the top of the linked file and reach out for support if you require help. To enable this script add the following to a new ~/.bashrc.d/prompt.bash file:
BASH
# Only set up the Git prompt in interactive shells
if [[ $- =~ i ]]; then
    GIT_PROMPT_PATH=/usr/share/git-core/contrib/completion/git-prompt.sh
    if [[ -r "${GIT_PROMPT_PATH}" ]]; then
        . "${GIT_PROMPT_PATH}" >&2
        export GIT_PS1_SHOWDIRTYSTATE=1 # this can potentially slow down the prompt
        export GIT_PS1_SHOWSTASHSTATE=1
        export GIT_PS1_SHOWUPSTREAM="auto"
        export GIT_PS1_SHOWCOLORHINTS=1
        export GIT_PS1_SHOWUNTRACKEDFILES=1 # this can potentially slow down the prompt
        export PS1='[\u@\h:\w]$(__git_ps1 "(%s)"):\$ ' # style to your taste
    #else # optional, if you need to style the default prompt without Git
    #    export PS1='[\u@\h:\w] \$ '
    fi
fi
You may use your preferred editor to create this file; these lesson materials use the nano editor when a file needs creating or modifying. This is followed by the cat command in the material to show the file contents after the change. You do not have to use the cat command when following the lesson material.
Make sure your ~/.bashrc file includes:
BASH
# User specific aliases and functions
if [ -d ~/.bashrc.d ]; then
    for rc in ~/.bashrc.d/*; do
        if [ -f "$rc" ]; then
            . "$rc"
        fi
    done
fi
unset rc
These lines should only be added to the ~/.bashrc file. Do not add them to your prompt.bash file.
GIT_PROMPT_PATH
The path in the code above is correct for Met Office systems. If you are not using Met Office systems please consult your institution’s IT services or download your own copy of the git-prompt.sh script. Download the latest version from the Git repository contrib directory. Ensure the GIT_PROMPT_PATH matches where you decide to store the git-prompt.sh file.
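A quick way to confirm that the path is usable is sketched below. The function name check_git_prompt_path is hypothetical, and the default path is simply the one used in prompt.bash; substitute your own location if you downloaded the script elsewhere.

```shell
# Hypothetical check: report whether a git-prompt.sh path is readable.
check_git_prompt_path() {
    if [ -r "$1" ]; then
        echo "readable: $1"
    else
        echo "not found or not readable: $1"
    fi
}

# The default below is the path used in prompt.bash; change it if you
# stored your own copy somewhere else.
check_git_prompt_path "${GIT_PROMPT_PATH:-/usr/share/git-core/contrib/completion/git-prompt.sh}"
```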
To see the changes to your terminal prompt run:
$ source ~/.bashrc
You might not notice much of a change until you are in a directory containing a Git repository.
If your ~/.bashrc file, or any file in the ~/.bashrc.d/ directory, already modifies your prompt using the PS1 variable you can export PROMPT_COMMAND instead. Replace the export PS1 line with:
BASH
export PROMPT_COMMAND=("${PROMPT_COMMAND[@]}" '__git_ps1 "${CONDA_PROMPT_MODIFIER}[\u@\h:\w]:" "\$ " "(%s)"')
If your version of Bash is less than 5.1 or you are using MacOS you might need to use:
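Before Bash 5.1, PROMPT_COMMAND is read as a single string rather than an array, so one possible form is the plain-string assignment sketched here. It reuses the __git_ps1 arguments from the array version above and is an assumption rather than a tested configuration. Note that it replaces any existing PROMPT_COMMAND value instead of appending to it.

```shell
# Plain-string form for Bash versions older than 5.1 (assumed equivalent).
# This overwrites any PROMPT_COMMAND that was already set.
export PROMPT_COMMAND='__git_ps1 "${CONDA_PROMPT_MODIFIER}[\u@\h:\w]:" "\$ " "(%s)"'
```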
You might find that with long paths and usernames your prompt takes up the entire width of the terminal; there are several ways to reduce the prompt length:
Removing \u and \h
If adding in your username, \u, and hostname, \h, makes the terminal prompt too long you can remove the \u and/or \h from the PROMPT_COMMAND or PS1 lines.
Trim long directory paths
Just before the final fi line you may add:
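A minimal sketch of such a line, assuming a value of 3 so that the current directory and its two parents are kept. PROMPT_DIRTRIM is a standard Bash variable, but the exact value is a choice:

```shell
# Keep only the last three directory components when \w is expanded;
# the removed leading components are replaced with an ellipsis.
export PROMPT_DIRTRIM=3
```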
This trims long directory paths to only show the current and two parent directories. You can change this value to show more or fewer directories.
OUTPUT
/Desktop/A/Really/Long/Path $ # without PROMPT_DIRTRIM
.../Really/Long/Path $ # with PROMPT_DIRTRIM
Add in a newline
Just before the \$ symbol in the PS1 or PROMPT_COMMAND lines you may add \n. This will add in a newline before the $ symbol, separating your prompt from your terminal commands.
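For instance, the PS1 line from prompt.bash could become the following. This is only a sketch that assumes you kept the PS1 style shown earlier; adjust it to match your own prompt.

```shell
# Same prompt style as in prompt.bash, with \n inserted just before \$
export PS1='[\u@\h:\w]$(__git_ps1 "(%s)"):\n\$ '
```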
Before:
OUTPUT
[~/Documents/git-novice]:(branch_name) $ _
After:
OUTPUT
[~/Documents/git-novice]:(branch_name)
$ _
To see the changes to your terminal prompt run:
$ source ~/.bashrc