Git Repository Metrics


Git repositories are more than just a way to keep track of your source code; they also provide a history of software development activity that can be studied. In software engineering research, mining Git repositories is a common method for understanding how a development team collaborates on a project.

In this activity, we will do some basic mining of Git repositories. We will work with the standard output of Git's log command, which you can produce with this command:

git log

Note: If you run this by itself on the command line on nitron within your repository, you can scroll with the up and down arrow keys or with Page Up and Page Down. Press "q" to quit. If you redirect or pipe the command's standard output, you don't need to worry about scrolling.


Download git_metrics.rb and the accompanying unit test test_git_metrics.rb.

We also provide two test data sets, taken from the Ruby Progress Bar gem. Download the two test files: ruby-progressbar-short.txt and the longer ruby-progressbar-full.txt. Examine these logs to see what the format looks like so you can parse it properly.
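As a starting point for examining the format, here is a minimal sketch of pulling fields out of log text. It assumes the files use the default `git log` layout (a "commit" line, an "Author:" line, a "Date:" line, then an indented message). The sample text, the names in it, and the helper `field_values` are invented for illustration; they are not taken from the test data or from git_metrics.rb.

```ruby
require 'date'

# Invented sample in the default `git log` layout (not from the test data).
SAMPLE_LOG = <<~LOG
  commit 1111111111111111111111111111111111111111
  Author: Ada Example <ada@example.com>
  Date:   Tue Mar 4 10:15:00 2014 -0500

      Add a title option

  commit 2222222222222222222222222222222222222222
  Author: Bob Example <bob@example.com>
  Date:   Sat Mar 1 09:00:00 2014 -0500

      Initial commit
LOG

# Hypothetical helper: return the text that follows a field label
# such as "Author:" on every line that starts with that label.
def field_values(lines, label)
  lines.select { |line| line.start_with?(label) }
       .map { |line| line.delete_prefix(label).strip }
end

lines   = SAMPLE_LOG.lines.map(&:chomp)
authors = field_values(lines, 'Author:')

# Convert each date string into a Date; the time of day and the
# UTC offset are parsed but do not affect the resulting calendar date.
dates = field_values(lines, 'Date:').map do |text|
  Date.strptime(text, '%a %b %d %H:%M:%S %Y %z')
end

puts authors.inspect
puts dates.map(&:to_s).inspect
```

Matching on the start of the line matters because commit messages are indented, so a message that happens to contain the word "Author:" or "commit" is not mistaken for a header line.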

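Using our unit testing technique and the test data given, write the three given methods. As the sample output below shows, together they report the number of commits, the number of developers, and the number of days in development.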

For your unit tests, we have given you some tests to demonstrate what some methods should expect and return. Write at least five more unit tests (minimally two per method) for git_metrics.rb.
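To give a feel for what additional tests might look like, here is a small sketch. It assumes Minitest as the test framework, which may differ from what test_git_metrics.rb actually uses, and it tests a hypothetical stand-in method, `count_lines_with_prefix`, rather than any of the real methods in git_metrics.rb.

```ruby
require 'minitest/autorun'

# Hypothetical stand-in for a method under test; the real methods in
# git_metrics.rb may have different names and signatures.
def count_lines_with_prefix(lines, prefix)
  lines.count { |line| line.start_with?(prefix) }
end

class TestCountLinesWithPrefix < Minitest::Test
  # Edge case: no input at all.
  def test_empty_input_gives_zero
    assert_equal 0, count_lines_with_prefix([], 'commit ')
  end

  # Typical case: only header lines are counted.
  def test_counts_only_matching_lines
    lines = ['commit abc', 'Author: Ada Example <ada@example.com>', 'commit def']
    assert_equal 2, count_lines_with_prefix(lines, 'commit ')
  end

  # Tricky case: an indented commit message that mentions "commit".
  def test_prefix_must_be_at_start_of_line
    lines = ['    Revert commit abc']
    assert_equal 0, count_lines_with_prefix(lines, 'commit ')
  end
end
```

A useful pattern is to cover an empty input, a typical input, and at least one input designed to trip up a naive parser.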

Running the full program from the command line with the short test data set looks like this:

  $ ruby git_metrics.rb < ruby-progressbar-short.txt
  Number of commits: 3
  Number of developers: 2
  Number of days in development 4

For the longer data set the output looks like this:

  $ ruby git_metrics.rb < ruby-progressbar-full.txt
  Number of commits: 430
  Number of developers: 33
  Number of days in development 1806

Analyze Your Own Repo

For fun, if you want to test this out on your own repository, use this command:

git log | ruby git_metrics.rb
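Both the `<` redirection shown earlier and this pipe deliver the log to the program on standard input, so the same reading code can serve either case. The sketch below illustrates the idea with a hypothetical helper, `read_log_lines`; how git_metrics.rb actually reads its input may differ. A StringIO stands in for standard input so the example runs on its own.

```ruby
require 'stringio'

# Hypothetical helper: read every line from an IO-like object and
# drop the trailing newline from each one.
def read_log_lines(io)
  io.each_line.map(&:chomp)
end

# In a real script you would pass $stdin here; the StringIO below
# holds invented text so the sketch does not wait for input.
fake_stdin = StringIO.new("commit abc\nAuthor: Ada Example <ada@example.com>\n")
lines = read_log_lines(fake_stdin)

puts "Read #{lines.length} lines"
```

Taking the input source as a parameter also makes the reading code easy to unit test, since a test can pass in a StringIO instead of a real file or pipe.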


As usual, fill out an Activity Journal. Submit via Git to the GitMetrics folder.