Software Archeology @ RIT

[ar·che·ol·o·gy] n. the study of people by way of their artifacts
Week 14 Status Update

28 Nov 2014

Team Work Summary

As finals week approaches, we are still chugging along, trying to wrap up our projects for this semester. We wrote code ownership scripts and got them aggregating data on OWNERS files. We have gathered data on the contents of the OWNERS files at each commit, and also on when each developer was first added to an OWNERS file.
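As a rough illustration of the "first added" data, here is a minimal sketch rather than our actual script. It assumes we already have the text of an OWNERS file at each commit, ordered oldest first, and it treats any non-comment line that looks like a bare email address as an owner. The names `parse_owners` and `first_added` are made up for this example, and directives such as `per-file` rules are simply ignored.

```python
def parse_owners(text):
    """Return the set of email-like entries in one OWNERS file snapshot.

    Simplifying assumption: an owner is a line (after stripping '#' comments)
    that contains '@' and has no spaces or '='. Other directives are skipped.
    """
    owners = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "@" in line and " " not in line and "=" not in line:
            owners.add(line)
    return owners


def first_added(snapshots):
    """Map each owner to the date of the first snapshot that lists them.

    `snapshots` is an iterable of (date, owners_text) pairs, oldest first.
    """
    first = {}
    for date, text in snapshots:
        for owner in parse_owners(text):
            # setdefault keeps the earliest date and ignores later sightings.
            first.setdefault(owner, date)
    return first
```

Because the snapshots are walked oldest first, a developer who is removed and later re-added still keeps the date of their first appearance.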

We have also continued to explore NLP for the Chromium data. At this point we have a unique “technical” vocabulary set for the comments and messages. Each term in the vocab has been mapped back to the developers who used it and the code reviews it appears in.
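The mapping step can be pictured as an inverted index. The sketch below is only an illustration under assumptions: it takes the vocabulary set as given (how the "technical" terms are chosen is not shown), it assumes each message is a `(review_id, author, text)` tuple, and it uses a simple regex tokenizer. The function name `index_vocabulary` is hypothetical.

```python
import re


def index_vocabulary(messages, vocabulary):
    """Map each vocabulary term to the developers and reviews that use it.

    `messages` is an iterable of (review_id, author, text) tuples.
    """
    index = {term: {"developers": set(), "reviews": set()} for term in vocabulary}
    for review_id, author, text in messages:
        # Lowercase and split into identifier-like tokens; each token counts once per message.
        tokens = set(re.findall(r"[a-z_][a-z0-9_]*", text.lower()))
        for token in tokens:
            if token in index:
                index[token]["developers"].add(author)
                index[token]["reviews"].add(review_id)
    return index
```

Tokens are matched whole, so a term like `mutex` would not be counted inside `mutex_lock` with this tokenizer.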

We are still looking into interactive churn on the Chromium src data, but collection is currently too slow.

Scraping Chromium data

To find whether a developer is a major or minor contributor, we plan to calculate the aggregated churn from the number of non-trivial commits the developer has made to a file. To define what a non-trivial commit is, we are first aggregating churn over each commit. Once we analyze the churn, we will determine what a non-trivial commit looks like. I have been working on getting a script that searches a repo for churn over commits working for our project.
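To make the plan concrete, here is a minimal sketch of per-commit churn aggregation, not the script we are actually running. It assumes the input is the text produced by `git log --numstat --format=@%H%x09%ae`, so each commit starts with an `@`-prefixed line holding the hash and author email separated by a tab, followed by `added<TAB>deleted<TAB>path` lines. Churn here means lines added plus lines deleted. The `threshold` parameter is a placeholder, since we have not yet decided what counts as non-trivial, and both function names are invented for this example.

```python
from collections import defaultdict


def churn_per_commit(log_text):
    """Return {(commit, author): {path: added + deleted}} from numstat output."""
    churn = defaultdict(dict)
    current = None
    for line in log_text.splitlines():
        if line.startswith("@"):
            # Header line written by our custom --format string.
            commit, author = line[1:].split("\t", 1)
            current = (commit, author)
        elif line.strip() and current is not None:
            added, deleted, path = line.split("\t", 2)
            if added == "-":
                # git reports "-" for binary files; they have no line churn.
                continue
            churn[current][path] = int(added) + int(deleted)
    return dict(churn)


def nontrivial_commit_counts(churn, threshold):
    """Count, per (author, path), the commits whose churn meets the threshold.

    `threshold` is a hypothetical cutoff; the real value is still to be decided.
    """
    counts = defaultdict(int)
    for (_commit, author), files in churn.items():
        for path, lines in files.items():
            if lines >= threshold:
                counts[(author, path)] += 1
    return dict(counts)
```

Keeping the parsing separate from the counting means the threshold can be changed and the counts recomputed without walking the repository again, which matters while collection is slow.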
