Introduction to Git and GitHub for Beginners

Harika Bonthu 30 Jan, 2024 • 9 min read

This article was published as a part of the Data Science Blogathon

Introduction

Welcome to the world of collaborative coding! Ever wondered how multiple people can work on the same project without chaos? Enter Git and GitHub, the dynamic duo of version control. Learn the basics of Git, a Version Control System (VCS) that tracks changes in code. Discover GitHub, a platform that takes collaboration to the next level. This article explores the differences between Git and GitHub, walks you through Git installation, and introduces key operations and commands to empower your coding journey. Let’s dive in!

 

What is Git?

Git is a tool that helps software developers work together on projects. It keeps track of changes made to the code, allowing multiple people to collaborate without messing up each other’s work. Git also lets developers create separate branches to work on specific tasks and then merge their changes back together. It’s like a smart way to save and organize different versions of a project. Popular platforms like GitHub make it easier for developers to share their code and collaborate using Git.

What is GitHub and What does it do?

GitHub is a web-based platform that uses Git for version control. In simpler terms, it’s like a place on the internet where people can store and share their code with others. Here’s what GitHub does:

  • Code Hosting:
    • GitHub stores your code in the cloud, making it accessible from anywhere with an internet connection.
  • Version Control:
    • It uses Git to keep track of changes in your code over time, allowing you to see who made changes and when.
  • Collaboration:
    • Multiple people can work on the same project simultaneously, and GitHub helps manage their contributions.
  • Pull Requests:
    • When you want to add new features or fix issues, you can suggest changes through a “pull request.” Others can review and discuss your proposed changes before they are officially added.
  • Issues and Bug Tracking:
    • GitHub provides a way to report and track issues or bugs in your code, making it easier for teams to collaborate on problem-solving.
  • Wikis and Documentation:
    • You can create documentation for your projects, making it simpler for others to understand and contribute.
  • GitHub Pages:
    • GitHub allows you to host simple websites directly from your GitHub repository using a feature called GitHub Pages.
  • Community and Social Coding:
    • GitHub is a social platform, and developers can follow projects, collaborate with others, and contribute to open-source projects.
  • Visibility:
    • You can choose whether your projects are public (visible to everyone) or private (accessible only to selected collaborators).
  • Integration:
    • GitHub can integrate with various tools and services, making it a central hub for software development.

In essence, GitHub is a hub for hosting and collaborating on software projects. It makes it easier for individuals and teams to work together, share code, and contribute to open-source projects.

Difference Between Git and GitHub

Git:

  • Distributed version control system (DVCS).
  • Tracks changes in source code locally on a developer’s machine.
  • Allows users to commit changes, create branches, and manage code history locally.
  • Operates through commands like git commit, git pull, and git push.
  • Does not require an internet connection for basic operations.

GitHub:

  • Web-based platform for hosting and collaborating on Git repositories.
  • Provides remote repository hosting in the cloud.
  • Adds collaboration features such as pull requests, issues, and discussions.
  • Facilitates teamwork and code review through a user-friendly web interface.
  • Requires an internet connection for interacting with the remote repositories on GitHub.

What is a Version Control System (VCS)?

A Version Control System (VCS) is a software tool for tracking and managing the changes made to source code during project development. It keeps a record of every change made to the code and lets us go back to a previous version if a mistake is made in the current one. Without a VCS in place, it would be very difficult to monitor how a project develops.

Types of VCS

The three types of VCS are:

  1. Local Version Control System
  2. Centralized Version Control System
  3. Distributed Version Control System

Local Version Control System

A Local Version Control System keeps the version history only on your local machine. If that machine crashes, the files cannot be retrieved and all the information is lost. Also, if anything happens to a single version, all the versions made after it are lost.

A Local Version Control System also offers no way to collaborate with other developers.

[Figure: Local version control system (Image 2)]

To make collaboration with developers on other systems possible, Centralized Version Control Systems were developed.

Centralized Version Control System

In a Centralized Version Control System, a single central server contains all the files related to the project, and collaborators check out files from this server (each of them has only a working copy). The problem is that if the central server crashes, almost everything related to the project is lost.

[Figure: Centralized version control system (Image 3)]

To overcome the above problems, Distributed Version Control Systems were developed.

Distributed Version Control System

In a Distributed Version Control System, there are one or more servers and many collaborators, similar to the centralized system. The difference is that collaborators do not just check out the latest version: each of them has an exact copy (a mirror) of the main repository, including its entire history, on their local machine.

Each user has their own repository and a working copy. This is very useful because even if the server crashes, we do not lose everything: copies of the repository reside on several other computers.
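
The idea that every clone carries the full history can be tried out locally. The sketch below is only an illustration under assumptions: a bare repository in a temporary directory stands in for the server, and the directory names (server.git, alice, bob), the file notes.txt and the commit identity are hypothetical.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"

# A bare repository plays the role of the shared server.
git init -q --bare server.git

# First collaborator: clone the (still empty) server, commit twice, publish.
git clone -q server.git alice 2>/dev/null
cd alice
git config user.name "Alice"
git config user.email "alice@example.com"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "First version"
echo "v2" > notes.txt
git commit -q -am "Second version"
git push -q origin HEAD
cd ..

# Second collaborator: cloning copies the whole history, not just the latest files.
git clone -q server.git bob

# Simulate a server crash.
rm -rf server.git

# Count the commits that are still available in bob's copy.
commits_in_bob=$(git -C bob rev-list --count HEAD)
echo "commits in bob's clone: $commits_in_bob"
```

With the two commits made above, the count printed at the end should be 2 even though the server directory no longer exists.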

[Figure: Distributed version control system (Image 4)]

5 Git operations and commands

Before diving into Git operations and commands, create a GitHub account if you don’t already have one.

Create repositories

Create a remote central repository on GitHub.

https://docs.github.com/en/get-started/quickstart/create-a-repo

Create a local repository using Git (I am using Git on Windows 10).

Open your file explorer, navigate to the working directory, right-click and select “Git Bash Here”. This opens the Git terminal. To create a new local repository, use the command git init; it creates a hidden folder named .git.

git init to create a new Git repository

$ git init

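(master), shown in the Git Bash prompt, is the default branch of the local repository. Depending on how Git is configured, a newer installation may name this branch main instead.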
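
As a minimal sketch of what git init leaves behind, the commands below create a repository in a temporary directory (an assumption made so the example does not touch your real working directory) and print the name of the initial branch.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"

git init -q

# The hidden .git folder holds the repository data.
ls -a

# Print the name of the initial branch (master or main, depending on configuration).
initial_branch=$(git symbolic-ref --short HEAD)
echo "initial branch: $initial_branch"
```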

Next, we need to connect the local repository to the central one.

git remote add to add a new remote repository.

To get the URL of the central repo, open your repository in GitHub and copy the link.


Run the command below:

$ git remote add origin "https://github.com/harika-bonthu/git-github-tutorial.git"

Here, origin is the conventional short name for the remote repository we are connecting to.
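
To check that the remote was registered, the stored URL can be read back. This is a small sketch that assumes a fresh repository in a temporary directory; adding a remote only writes to the local configuration and does not contact GitHub.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

# Register the remote under the conventional name "origin" (no network access happens here).
git remote add origin "https://github.com/harika-bonthu/git-github-tutorial.git"

# List the configured remotes with their fetch and push URLs.
git remote -v

# Read the stored URL back from the repository configuration.
origin_url=$(git config --get remote.origin.url)
echo "origin -> $origin_url"
```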

After adding, we need to pull the files from the remote repo.

git pull to download all the content from the remote repo

$ git pull origin main

(main is the branch in our central/remote repository. Check the branch name on GitHub before pulling.)


Just adding the origin does not give us any files. After pulling from the main branch, we now have a README.md file in the local repository.

Now, if you again try to pull, it says “Already up to date.”

$ git pull origin main
From https://github.com/harika-bonthu/git-github-tutorial
 * branch            main       -> FETCH_HEAD
Already up to date.
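
The init, remote add and pull sequence can also be rehearsed without a network connection. In the sketch below, a bare repository in a temporary directory stands in for the GitHub repository; the directory names (remote.git, seed, local), the README text and the commit identity are assumptions made for illustration.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"

# Stand-in for the GitHub repository: a bare repo whose main branch holds a README.md.
git init -q --bare remote.git
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/main      # name the first branch main
git config user.name "Example User"
git config user.email "user@example.com"
echo "# git-github-tutorial" > README.md
git add README.md
git commit -q -m "Add README"
git push -q ../remote.git main
cd ..

# The steps from the article: create a local repo, add the remote, pull main.
git init -q local
cd local
git remote add origin "$tmp/remote.git"
git pull -q origin main

# README.md has now been downloaded into the local repository.
ls
```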

Next, if you want to check if any files are modified or to be committed, use the below command.

git status to check the status of the working directory and the staging area.

Working directory – It is the place where we make changes to the existing files or create new files.

Staging area – It is the place where the files are ready to be committed.

$ git status
On branch master
nothing to commit, working tree clean

Since the last pull, we haven’t made any changes in the working directory, so it says “nothing to commit, working tree clean”.

Now the question is: how do we add files to the staging area?

git add to add files to the index or the staging area.

To demonstrate with an example, I am modifying the README.md file and creating two more text files, “file1.txt” and “file2.txt”.

If you wish to use the command line for creating or modifying files, please refer to the video: https://www.youtube.com/watch?v=UeF4ZhnPzZQ

After making changes in the working directory, once again check the status using the command git status.

$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   README.md

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        file1.txt
        file2.txt

no changes added to commit (use "git add" and/or "git commit -a")

It shows that the files file1.txt, file2.txt are untracked and README.md is modified.
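
For a more compact view of the same information, git status has a short format. The sketch below recreates the situation in a throwaway repository; the temporary directory, file contents and commit identity are assumptions made for illustration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "# demo" > README.md
git add README.md
git commit -q -m "Add README"

# Reproduce the situation above: one modified tracked file, two new files.
echo "more text" >> README.md
touch file1.txt file2.txt

# Short format: " M" marks a modified, unstaged file; "??" marks an untracked file.
short_status=$(git status --short)
echo "$short_status"
```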

Next, we will see how to add the README.md file to the staging area.

$ git add README.md

After README.md is added to the staging area, git status lists it under “Changes to be committed”, while file1.txt and file2.txt are still shown as untracked.
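
One way to see exactly what sits in the staging area is to list the staged file names. The following is a sketch in a throwaway repository; the temporary directory, file contents and commit identity are assumptions.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "# demo" > README.md
git add README.md
git commit -q -m "Add README"

# Modify the tracked file and create two new ones, as in the walkthrough.
echo "more text" >> README.md
touch file1.txt file2.txt

# Stage only README.md.
git add README.md

# Names of the files whose changes are staged for the next commit.
staged=$(git diff --cached --name-only)
echo "staged: $staged"

# The untracked files are still reported with "??".
git status --short
```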

The next step is to commit these changes to the local repository. 

git commit to save the changes to the local repository.

$ git commit -m "Initial commit"

-m in the above command stands for the message. The message lets other developers know what changes have been made.

Don’t forget that we still have two untracked files in the working directory that are yet to be committed.

Now, I am going to modify the file1.txt and file2.txt files using the “nano” editor.

To add all new and modified files to the staging area at once, we can use the -A flag of the git add command.

$ git add -A

Then check the status and commit them.

$ git status
$ git commit -m "Committed txt files"

Now, what if you want to undo staging? Let’s see how it is done.

For that, I am creating another file named “file3.txt”, adding it to the staging area, and checking the status.

$ touch file3.txt
$ git add file3.txt
$ git status

To undo it, use the command below.

$ git restore --staged file3.txt
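
The effect of unstaging can be verified by listing the staged files before and after. This sketch uses a throwaway repository with assumed file contents and commit identity. Because git restore was introduced in Git 2.23, it falls back to git reset HEAD, which unstages a file in the same way on older versions.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "# demo" > README.md
git add README.md
git commit -q -m "Initial commit"

# Create and stage a new file.
touch file3.txt
git add file3.txt
staged_before=$(git diff --cached --name-only)
echo "staged before: $staged_before"

# Unstage it; the file itself stays in the working directory.
# Fallback for Git versions older than 2.23, which lack git restore.
git restore --staged file3.txt 2>/dev/null || git reset -q HEAD -- file3.txt

staged_after=$(git diff --cached --name-only)
echo "staged after: $staged_after"
```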

To see all the commits that are made till now, check the log.

git log to see all the commits

$ git log
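
The full log can get long, so a one-line-per-commit view is often handy. Below is a sketch in a throwaway repository that reuses the two commit messages from this walkthrough; the file contents and the commit identity are assumptions.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "# demo" > README.md
git add README.md
git commit -q -m "Initial commit"

echo "one" > file1.txt
echo "two" > file2.txt
git add -A
git commit -q -m "Committed txt files"

# One line per commit: abbreviated hash followed by the commit message.
git log --oneline

# Number of commits reachable from the current branch.
commit_count=$(git rev-list --count HEAD)

# Message of the most recent commit only.
last_subject=$(git log -1 --format=%s)
echo "$commit_count commits, latest: $last_subject"
```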

Once you are familiar with the concepts discussed so far, we can move on to branches.

A branch in Git is an independent line of work (technically, a pointer to a specific commit). It allows users to branch off from the original code (the master branch) and isolate their work.

git branch to create a new branch

$ git branch branch1

To see all the branches, use git branch -a.

In its output, master is highlighted and marked with an asterisk because we are currently working in the master branch. To switch to another branch, we need to check it out.

git checkout to switch to another branch

$ git checkout branch1
$ git branch -a

branch1 has all the files of the master branch because it was created from master.

$ ls
README.md  file1.txt  file2.txt  file3.txt

In branch1, I would like to make changes to file1.txt and create another text file named file4.txt.

Now add these files to the staging area and commit. If you now check the master branch, these changes are not yet made there.

To make these changes to the master branch, we need to merge branch1 with master.

$ git checkout master

$ git merge branch1
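
The whole branch, commit and merge cycle can be sketched end to end in a throwaway repository. The file contents and commit identity below are assumptions, and the name of the starting branch is read from Git rather than hard-coded, since it may be master or main depending on configuration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Remember the starting branch (master or main).
base=$(git symbolic-ref --short HEAD)

echo "# demo" > README.md
echo "one" > file1.txt
git add -A
git commit -q -m "Initial commit"

# Create branch1, switch to it, and do some isolated work.
git branch branch1
git checkout -q branch1
echo "changed on branch1" >> file1.txt
echo "four" > file4.txt
git add -A
git commit -q -m "Work on branch1"

# Back on the starting branch, file4.txt does not exist yet.
git checkout -q "$base"
before_merge=$([ -e file4.txt ] && echo present || echo absent)

# Merge branch1 into the starting branch.
git merge -q branch1
after_merge=$([ -e file4.txt ] && echo present || echo absent)

echo "file4.txt before merge: $before_merge, after merge: $after_merge"
```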


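To restore a file to the state it had in a particular commit, we can use the first 8 characters of that commit’s hexadecimal hash (shown by git log). In the command below, 8digitcode stands for that abbreviated hash.

$ git checkout 8digitcode file1.txt

$ git checkout f3c0884b file1.txt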
Updated 1 path from 32610ca
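
Here is a sketch of the same operation in a throwaway repository, where the abbreviated hash is captured in a variable instead of being typed by hand. The file contents and commit identity are assumptions made for illustration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first version" > file1.txt
git add file1.txt
git commit -q -m "Add file1"

# Remember the abbreviated hash (at least 8 characters) of this commit.
old_commit=$(git rev-parse --short=8 HEAD)

echo "second version" > file1.txt
git commit -q -am "Update file1"

# Bring back file1.txt as it was in the older commit.
# The restored content is placed in the working directory and the staging area.
git checkout -q "$old_commit" -- file1.txt

restored=$(cat file1.txt)
echo "file1.txt now contains: $restored"
```

The branch itself does not move: only file1.txt changes, and the restored version still has to be committed if you want to keep it.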

Once we are done working, we need to push all these code files to the central/remote repository.

git push to send all files to the remote repository.

$ git push origin main
If the push is rejected, for example because the local branch is named master while the remote branch is named main, use the command below.

$ git push origin HEAD:main
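
The difference between the two push commands can be reproduced offline. In this sketch a bare repository in a temporary directory stands in for GitHub, and the local branch is deliberately named master; the directory names, file contents and commit identity are assumptions.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"

# Stand-in for the GitHub repository.
git init -q --bare remote.git

git init -q local
cd local
git symbolic-ref HEAD refs/heads/master    # make sure the local branch is called master
git config user.name "Example User"
git config user.email "user@example.com"
echo "one" > file1.txt
git add file1.txt
git commit -q -m "Add file1"
git remote add origin "$tmp/remote.git"

# Pushing "main" fails here because no local branch has that name.
if git push -q origin main 2>/dev/null; then
  push_main=ok
else
  push_main=failed
fi
echo "git push origin main: $push_main"

# HEAD:main sends the current commit to the branch main on the remote.
git push -q origin HEAD:main

local_head=$(git rev-parse HEAD)
remote_main=$(git --git-dir="$tmp/remote.git" rev-parse refs/heads/main)
echo "local HEAD:  $local_head"
echo "remote main: $remote_main"
```

The part before the colon names what to send (HEAD, the current commit) and the part after it names the remote branch to update.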

Now go to your GitHub repository and, ta-da, your files are hosted on the central repository.

Conclusion

In conclusion, Git is a version control system that tracks changes in code, while GitHub is a platform for hosting and collaborating on Git repositories. A Version Control System (VCS) manages code versions; Git does this locally on your machine, and GitHub hosts the remote copy of the repository. A handful of key commands (init, add, commit, pull, push, branch, merge) is enough for efficient day-to-day version control.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
