Case Study on GitHub

Chetan Lohkare
13 min read · Jan 5, 2022

GitHub is a web-based Git repository hosting service that primarily offers the distributed version control and source code management functionality of Git.
Here are some interesting facts about GitHub:
■ GitHub was originally known as Logical Awesome LLC
■ As of April 2016, GitHub reported having more than 14 million users and more than 35 million repositories, making it the largest host of source code in the world.
■ The trademark mascot of GitHub is called Octocat, a personified cat with octopus limbs portrayed in manga style.

A Friendly Introduction

When we work on a very straightforward code project (say, writing a simple bash script), there are only two points in the development timeline: start and finish. We write the code, then finalize and ship it. Most projects, however, accumulate many more points in their timeline because of feature requests, bug fixes and the occasional revert. When starting out as a web developer, it is easy to get lost in the multitude of languages, tools and platforms available today. However, many would argue that GitHub is an essential platform for web developers at every level.

History

Development of the GitHub platform began on 1 October 2007. The site was launched in April 2008 by Tom Preston-Werner, Chris Wanstrath, and PJ Hyett, after being available for a few months as a beta release. When Chris and Tom started working on GitHub in late 2007, Git was largely unknown as a version control system. There were no commercial Git hosting options.

The Cat Logo

Simon Oxley designed the octopus, as well as the white bird Twitter used before it had a proper logo, as part of his usual routine of producing images for iStock. GitHub saw the image and wanted it, presumably on the notion that it represents how complex code combines to create peculiar things, much like the octopus. The CEO of GitHub called it an Octocat, and it has been the Octocat ever since.

Slowly, GitHub became the new Facebook for coders: instead of posting pictures and life events, people post code for projects, and fellow developers comment, request features and fork the code to suit their needs. Brian Doll, GitHub’s vice president of strategy, says: “If you look at the top 100 sites, you’ve got a handful of social sites, thirty flavors of Google with national footprints, a lot of media outlets — and GitHub.”

Why GitHub?

1. GitHub makes version control easier

If a project has many points in its development timeline, we really need a version control system (VCS). VCS tools let users manage their development paths (versions, features, patches or, technically, branches) and development histories without too much effort. GitHub really helps out in this regard.

2. Graphical Interface

Git is very powerful, but it is primarily a command-line tool, which can be daunting for many developers, especially new ones.
GitHub provides an intuitive and powerful graphical interface on top of the Git versioning system. You can easily see your repositories and browse through their lists of commits. If you want to see the changes made in one of your commits, simply click on the commit in the list and GitHub will present you with the diff. This is much simpler than typing commands in your terminal and deciphering their results.

3. Gist

Pastebin applications allow users to store plain text. Developers commonly use them to store and share small scripts and snippets of code. GitHub created Gist as a Pastebin-style application, but Gists also benefit from version control.
Each Gist is essentially a mini-project with its own Git repository, which allows users to store multiple snippets of code and to track changes within their Gist without needing to commit them manually.

4. Collaboration tools

GitHub also introduces many collaboration tools, some of which are listed below, which make it easier for developers to work together on a project.

■ GitHub allows you to create access rights to your code which means, for example, you could designate certain users who are allowed to freely push code to your repository.

■ GitHub’s Forking feature allows a user to create their own copy of a repository under their GitHub account, which they can work on without affecting the main repository. They can modify the code and then request to have their changes merged into the main repository using another feature called Pull Requests.

■ GitHub also has Issues for repositories. Issues are a great way to keep track of bugs in your code, but you can also use them to keep track of tasks and enhancements you would like to implement.

5. GitHub is the place to be for open source

With so many great tools available to developers, GitHub has become the place to be for open source software. Some of the biggest open source projects are hosted on GitHub, such as Ruby on Rails, AngularJS, Bootstrap and many more. Even big tech companies, like Microsoft, maintain code repositories on GitHub.

How Does Git Actually Work?

Git is one of those wonderful, elegant tools that does an amazing job of abstracting the underlying mechanism from the front-end workings. To pull changes from the remote down to the local, you execute git pull. To commit your changes in your local repository, you execute git commit. To push commits from your local repository to the remote repository, you execute git push. The front end does an excellent job of mirroring the mental model of what’s happening to your code.

But as you would expect, a lot is going on underneath. The nice thing about Git is that you could spend your entire career not knowing how the Git internals work, and you’d get along quite well. But being aware of how Git manages your repository will help cement that mental model and give a little more insight into why Git does what it does.

Everything is a hash!

Git refers to all commits by their SHA-1 hashes. You’ve seen that many times over in your personal and professional work with Git. The hash is the key that points to a particular commit in the repository, and it’s pretty clear to see that it’s just a type of unique ID. One ID references one commit. There’s no ambiguity there.

But if you dig down a little bit, the commit hash doesn’t reference everything that has to do with a commit. In fact, a lot of what Git does is create references to references in a tree-like structure to store and retrieve your data, and its metadata, as quickly and efficiently as possible.

To see this in action, you’ll dissect the “secret” files underneath the .git directory and see what’s inside of each.
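Before opening those files, it may help to see where an object hash comes from. The following is a minimal sketch, assuming the default SHA-1 object format: Git hashes a short header (the object type, the content length and a NUL byte) followed by the content itself. The function name git_blob_id is hypothetical and chosen only for this illustration.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 ID Git would assign to a blob with this content."""
    # Header format: "blob <size in bytes>" followed by a single NUL byte.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


# The empty blob has a well-known ID in every SHA-1 repository.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the ID depends only on the type and the content, two files with identical content share a single object, whatever their names are.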

The repository flow

The inner workings of Git —

Change to your terminal program and navigate to the main directory of your repository. Once you’re there, navigate into the .git directory of your repository:

cd .git

Now, pull up a directory listing of what’s in the .git directory, and have a look at the directories there. You should, at a minimum, see the following directories:

info/
objects/
hooks/
logs/
refs/

The directory you’re interested in is the objects directory. In Git, the most common objects are:

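  • Commits: Structures that hold metadata about your commit (author, committer, message), as well as pointers to the parent commit(s) and to the tree that records the files in that commit.
  • Trees: Directory listings that map file names to blobs and subdirectories to other trees; together they describe all the files contained in a commit.
  • Blobs: The compressed contents of individual files in the tree.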
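To make the "references to references" idea concrete, here is a rough sketch of how the three object types point at each other, assuming the SHA-1 object format. The helper names (object_id, tree_body), the file name hello.txt and the author details are all made up for illustration, and the tree builder only handles plain files in a single directory.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    """Hash an object the way Git does: '<type> <size>' + NUL + body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Serialize (mode, name, hex_id) entries into a tree object's body.

    Simplification: entries are sorted by name, which matches Git only
    when every entry is a plain file.
    """
    out = b""
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1]):
        # Each entry is "<mode> <name>", a NUL byte, then the raw 20-byte ID.
        out += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(hex_id)
    return out


# A blob holds file content only; it does not know its own file name.
blob_id = object_id("blob", b"hello\n")

# The tree gives the blob a name and a mode by referencing its ID.
tree_id = object_id("tree", tree_body([("100644", "hello.txt", blob_id)]))

# The commit references the tree and carries the metadata.
commit_body = (
    f"tree {tree_id}\n"
    "author Example Author <author@example.com> 0 +0000\n"
    "committer Example Author <author@example.com> 0 +0000\n"
    "\n"
    "Initial commit\n"
).encode()
commit_id = object_id("commit", commit_body)

print("blob  ", blob_id)
print("tree  ", tree_id)
print("commit", commit_id)
```

Changing a single byte of the file changes the blob ID, which changes the tree ID, which in turn changes the commit ID. That chain is why one commit hash is enough to pin down an entire snapshot.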

Start by navigating into the objects directory:

cd objects

Pull up a directory listing to see what’s inside, and you’ll be greeted with the following puzzling list of directories:

02 14 39 55 6e 84 ad c5 db f8
05 19 3a 56 72 88 b4 c8 e0 f9
06 1a 3b 57 73 8b b5 ca e6 fb
0a 1c 3d 59 75 99 b8 ce e7 fe
0b 24 3e 5d 76 9d b9 cf eb ff
0c 29 43 5f 78 9f ba d2 ec info
0d 2c 45 62 7a a0 bb d3 ed pack
0e 33 47 65 7d a1 be d7 ee
0f 35 4e 67 7f a4 bf d8 f1
11 36 50 69 81 ab c0 d9 f4
12 37 54 6c 83 ac c4 da f5

It’s clear that this is a lookup system of some sort, but what does that two-character directory name mean?

The Git object repository structure —

When Git stores objects, instead of dumping them all into a single directory, which would get unwieldy in rather short order, it structures them neatly into a tree. Git takes the first two characters of your object’s hash, uses that as the directory name, and then uses the remaining 38 characters as the object identifier.

Here’s an example of the Git object directory structure, from my repository, that shows this hierarchy:

objects
├── 02
│ ├── 1f10a861cb8a8b904aac751226c67e42fadbf5
│ └── 8f2d5e0a0f99902638039794149dfa0126bede
├── 05
│ └── 66b505b18787bbc710aeef2c8981b0e13810f9
├── 06
│ └── f468e662b25687de078df86cbc9b67654d938b
├── 0a
│ └── 795bccdec0f85ebd9411e176a90b1b4dfe2002
├── 0b
│ └── 2d0890591a57393dc40e2155bff8901acafbb6
├── 0c
│ └── 66fedfeb176b467885ccd1a1ec70849299eeac
├── 0d
│ └── dfac290832b19d1cf78284226179a596bf5825
├── 0e
│ └── 066e61ce93bf5dfaa9a6eba812aa62038d7875
├── 0f
│ └── a80ee6442e459c501c6da30bf99a07c0f5624e
├── 11
│ ├── 06774ed5ad653594a848631f1f2786a76a776f
│ ├── 92339da7c0831ba4448cb46d40e1b8c2bed12c
│ └── c1a7373df5a0fbea20fa8611f41b4a032b846f
.
.
.

To find the object associated with a commit, take a full commit hash from your repository (for example, from the output of git log). This walkthrough uses the following one:

d83ab2b104e4addd03947ed3b1ca57b2e68dfc85

Decompose that into a directory name and an object identifier:

  • Directory: d8
  • Object identifier: 3ab2b104e4addd03947ed3b1ca57b2e68dfc85
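The decomposition above can be sketched in a few lines. This is an illustration only: the function name loose_object_path is hypothetical, and it assumes a full 40-character SHA-1 hash and an object that is stored loose rather than inside a packfile.

```python
from pathlib import PurePosixPath

HEX_DIGITS = set("0123456789abcdef")


def loose_object_path(object_id: str) -> PurePosixPath:
    """Map a full 40-character hash to its location under .git/objects."""
    if len(object_id) != 40 or not set(object_id) <= HEX_DIGITS:
        raise ValueError("expected a 40-character lowercase hex hash")
    # First two characters: directory name. Remaining 38: file name.
    return PurePosixPath(".git/objects") / object_id[:2] / object_id[2:]


print(loose_object_path("d83ab2b104e4addd03947ed3b1ca57b2e68dfc85"))
# .git/objects/d8/3ab2b104e4addd03947ed3b1ca57b2e68dfc85
```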

Now you know that the object you want to look at is inside the d8 directory. Navigate into that directory and pull up another listing to see the files inside:

.
.
.
d7
├── c33fdd7d35372cba78386dfe5928f1ba8dfb70
└── e92f9daeec6cd217fda01c6b726cb07866728c
d8
└── 3ab2b104e4addd03947ed3b1ca57b2e68dfc85
d9
└── 809bc1dafdec03f0d60f41f6c7f6cfc3228c80
da
├── 967ae1f60e59d2a223e37301f63050dca0cf6f
└── fe823560ecc5694151c37187f978b5cf3d5cf1
.
.

In my case, I only see one file: 3ab2b104e4addd03947ed3b1ca57b2e68dfc85. You may see other files in there, and that’s to be expected in a moderately busy repository.

You can’t look at this object directly, though, as objects in Git are compressed. If you try to view it using cat 3ab2b104e4addd03947ed3b1ca57b2e68dfc85 or similar, you’ll probably see a pile of gibberish like the following, along with a few chirps from your computer as the terminal tries to interpret control characters in the binary object:

xu?Ko?0??̯?51??Ԯ
yB
??f?y?cBɯo?{ݝ?|ҌFL?:?@??_?0Td5?D2Br?D$??f?B??b?5W?HÁ?H*?&??(fbꒉ
dC!DV%?????D@?(???u0??8{?w????0?IULC1????@(<?s '
mO????????ƶe?S????>?K8 89_vxm(#?jxOs?u?b?5m????=w\l?
%?O??[V?t]?^??????G6.n?Mu?%
?̉?X??֖Xv??x?EX???:sys???G2?y??={X?Ռe?X?4u???????4o'G??^"qݠ???$?Ccu?ml???vB_)?I?6?$?(?E9?z??nUmV?Em]?p??3?`??????q?Ţqjw????VR?O? q?.r???e|lN?p??Gq?)?????#???85V?W6?????
)|Wc*??8?1a?b?=?f*??pSvx3??;??3??^??O?S}??Z4?/?%J?
`??*ގF?of??O

Viewing Git objects —

Git provides a way to look at the contents of a compressed Git object: git cat-file. You pass it a short or long hash, and it decompresses the object and writes its contents to your console in a human-readable form.

So take a look at the uncompressed form of the object file with the following command, substituting in the short or long hash from the commit that you want to look at:

git cat-file -p d83ab2b

The -p option here tells Git to figure out what type of object it’s dealing with and to provide appropriately formatted output.
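To get a feel for what git cat-file does with a loose object, here is a rough sketch, not Git’s actual implementation. A loose object is zlib-compressed data consisting of a header (type, size, NUL byte) followed by the body. The function name read_loose_object is hypothetical, and the sample object is built in memory rather than read from a real repository, so the snippet runs anywhere.

```python
import zlib
from typing import Tuple


def read_loose_object(raw: bytes) -> Tuple[str, bytes]:
    """Decompress a loose object and split it into (type, body)."""
    data = zlib.decompress(raw)
    # The header and the body are separated by the first NUL byte.
    header, _, body = data.partition(b"\x00")
    kind, size = header.decode("ascii").split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return kind, body


# Stand-in for the bytes of a file under .git/objects/<xx>/<38 chars>.
body = b"hello\n"
raw = zlib.compress(b"blob " + str(len(body)).encode("ascii") + b"\x00" + body)

kind, content = read_loose_object(raw)
print(kind, content)  # blob b'hello\n'
```

Objects that Git has already packed live under the pack directory in a different format, so this sketch only applies to loose objects; git cat-file handles both.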

GitHub as Teaching Tool —

Many colleges and universities have recently adopted GitHub as a teaching tool, as it helps in managing students’ work.

Our case study further investigates the benefits and drawbacks associated with using GitHub in educational settings.

In one qualitative study, the authors set out to understand how GitHub is being used in education, and the motivations, benefits and challenges it brings. The study analyzed online posts describing personal experiences of using GitHub in the classroom, along with interviews with faculty who used GitHub to support teaching or learning. The analysis revealed that GitHub is mainly used in classrooms as a submission and hosting platform. The transparency of GitHub encouraged students to participate and contribute more to the hosted course material. However, several limitations were reported, including barriers to entry, a long learning curve, and a lack of direct support for popular educational formats.

Another case study examined students’ experience of GitHub as an educational platform, where it was used for material dissemination, lab work submission, and student project hosting. The study design included direct interviews with the students followed by a validation survey. The results showed that GitHub promoted student cooperation and cross-team collaboration, making students more involved in the course. In addition, students were able to develop and demonstrate industry-relevant skills. However, students raised several concerns about having their work publicly available, their unfamiliarity with Git and GitHub, and the general lack of educational features to support grading and assignment management.

git push --force is considered harmful: it unconditionally overwrites the remote repository with whatever you have locally. Use it wisely…

Student Benefits of Using GitHub —

Previous studies show that GitHub offers numerous advantages in the context of the learning process. Our case study, however, focuses on the experience of students when they manage their projects with GitHub.

Gaining and Demonstrating Industry Relevant Skills and Practices:

To succeed in modern software development, students need to be familiar with best practices (e.g., peer review, cross-team collaboration) and commonly used tools (e.g., continuous integration tools, distributed version control systems). Many of the interviewees mentioned that using GitHub in their courses provided a good introduction to the tool and to relevant practices: “I think when you go and work in software development too, you should get used to [having] lots of eyes being all over your work; that’s just the way it’s gonna be, so it’s practice before real life.”

Contribution of GitHub in Open Source —

Contributing to a poorly documented open source library is a very tedious job: we have to understand the source code before we can make the required changes. Documentation is therefore a crucial part of open source management, as it helps your users succeed with your software.

Apart from memes, GitHub is the biggest blessing to developers in recent times. Owing to its simple documentation, contributing to open-source software has become much easier than it ever was. Exploring the features of GitHub makes the open source journey much more interesting.

◾ Makes projects visible to fellow developers who can help you work on and upgrade them. Makes it easy to contribute to open-source projects. You just need to know how to use GitHub.

◾ GitHub eases collaboration and you always have a backup of your code.

◾ GitHub has simple documentation which makes it easier to work with.

◾ GitHub serves as a platform to showcase your work. It enhances visibility among recruiters and also acts as a shoutout to fellow developers.

◾ When it comes to open-source projects where a large number of contributions happen continuously, tracking changes in code becomes a necessity. GitHub takes care of this problem in the most efficient way.

◾ GitHub can integrate with numerous platforms and services.

GitHub in Marketing —

The benefits of GitHub are not limited to development; it can help with marketing too.

Assume you have a development team. Because GitHub makes it easy to implement new changes, your team doesn’t have to roll so many updates into one new release.

So if one team member is working on a small update that only impacts existing customers, another is working on a new user interface, and your whole team is wrapping up a giant, game-changing feature, you can create separate announcements for each.

For instance, you could send out an email newsletter about the small change, a blog post to announce the new user interface updates, and develop a full PR campaign for the game-changer.

GitHub Pricing —

GitHub offers plans from Free to $21 per user per month.

Here’s what each plan includes (higher tiers include the features of lower tiers):

Free

◾ Personal Account
◾ Unlimited Collaborators
◾ Unlimited Public Repositories Hosted on GitHub

Developer — $7 per Month

◾ Unlimited Private Repositories

Team — $9 per User/per Month (Starts at $25/Month for First 5 Users)

◾ Organization Account
◾ Team and User Permissions

Business — $21 per User/per Month

◾ GitHub Business Hosting
◾ SAML Single Sign-On
◾ Access Provisioning
◾ 24/5 Support with 8-Hour Response Time
◾ 99.95% Uptime SLA

Enterprise — $21 per User/per Month (Sold in Packs of 10)

◾ GitHub Enterprise Hosting (On Your Servers, Cloud)
◾ Multiple Organization Accounts
◾ SAML, LDAP, and CAS
◾ Access Provisioning
◾ 24/7 Support
◾ Advanced Auditing

Conclusion

Git is an open-source distributed version control system. It is designed to handle projects from small to very large with speed and efficiency, and to coordinate work among developers. Version control allows us to track changes and work together with our team members on the same codebase. Git is the foundation of many services like GitHub and GitLab, but we can also use Git without any hosting service. Git can be used privately and publicly. Linus Torvalds created Git in 2005 for the development of the Linux kernel. It is also an important distributed version control tool for DevOps. Git performs fast, and its basics are easy to learn. Many developers consider it superior to other SCM tools like Subversion, CVS, Perforce, and ClearCase.

Chetan Lohkare

Student at Vishwakarma Institute of Technology, Pune