Publications

Analyzing Security Data Andrew Meneely The Art and Science of Analyzing Software Data pp. 213–227 2015
Security is a challenging and strange property of software. Security is not about understanding how a customer might use the system; security is about ensuring that an attacker cannot abuse the system. Instead of defining what the system should do, security is about ensuring that the system does not do something malicious. As a result, applying traditional software analytics to security leads to some unique challenges and caveats. In this chapter, we will discuss four "gotchas" of analyzing security data, along with vulnerabilities and severity scoring. We will describe a method commonly used for collecting security data in open source projects. We will also describe some of the state of the art in analyzing security data today.
 @incollection{MeneelyASD2015,
  author = {Meneely, Andrew},
  title = {Analyzing Security Data},
  booktitle = {The Art and Science of Analyzing Software Data},
  year = {2015},
  publisher = {Elsevier},
  pages = {213--227},
  doi = {},
  abstract = {Security is a challenging and strange property of software. Security is not about understanding how a customer might use the system; security is about ensuring that an attacker cannot abuse the system. Instead of defining what the system should do, security is about ensuring that the system does not do something malicious. As a result, applying traditional software analytics to security leads to some unique challenges and caveats. In this chapter, we will discuss four "gotchas" of analyzing security data, along with vulnerabilities and severity scoring. We will describe a method commonly used for collecting security data in open source projects. We will also describe some of the state of the art in analyzing security data today.}
}
 
An Insider Threat Activity in a Software Security Course Daniel E. Krutz, Andrew Meneely, & Samuel A. Malachowsky 2015 IEEE Frontiers in Education Conference (FIE) to appear
Software development teams face a critical threat to the security of their systems: insiders. A malicious insider is a person who violates an authorized level of access in a software system. Unfortunately, when creating software, developers do not typically account for insider threats. Students learning software development are unaware of the impacts of malicious actors and are far too often untrained in prevention methods against them. A few of the defensive mechanisms to protect against insider threats include eliminating system access once an employee leaves an organization, enforcing the principle of least privilege, code reviews, and constant monitoring for suspicious activity. At the Department of Software Engineering at the Rochester Institute of Technology, we require a course titled Engineering of Secure Software and have created an activity designed to prepare students for the problem of insider threats. At the beginning of this activity, student teams are given the task of designing a moderately sized secure software system; unbeknownst to the rest of the team, one member acts as a malicious insider. The goal of this insider is to manipulate the team into creating a flawed system design that would allow attackers to perform malicious activities once the system has been created. When the insider is revealed at the conclusion of the project, students discuss countermeasures regarding the malicious actions the insiders were able to plan or complete, along with methods of prevention that may have been employed by the team to detect the malicious developer. In this paper, we describe the activity along with the results of a survey. We discuss the benefits and challenges of the activity with the goal of giving other instructors the tools they need to conduct this activity at their institution. While many institutions do not offer courses in computer security, this self-contained activity may be used in any computing course to reinforce the importance of protecting against insider threats.
 @article{KrutzFIE2015,
  author = {Krutz, Daniel E. and Meneely, Andrew and Malachowsky, Samuel A.},
  title = {An Insider Threat Activity in a Software Security Course},
  journal = {2015 IEEE Frontiers in Education Conference (FIE)},
  pages = {to appear},
  doi = {},
  abstract = {Software development teams face a critical threat to the security of their systems: insiders. A malicious insider is a person who violates an authorized level of access in a software system. Unfortunately, when creating software, developers do not typically account for insider threats. Students learning software development are unaware of the impacts of malicious actors and are far too often untrained in prevention methods against them. A few of the defensive mechanisms to protect against insider threats include eliminating system access once an employee leaves an organization, enforcing the principle of least privilege, code reviews, and constant monitoring for suspicious activity. At the Department of Software Engineering at the Rochester Institute of Technology, we require a course titled Engineering of Secure Software and have created an activity designed to prepare students for the problem of insider threats. At the beginning of this activity, student teams are given the task of designing a moderately sized secure software system; unbeknownst to the rest of the team, one member acts as a malicious insider. The goal of this insider is to manipulate the team into creating a flawed system design that would allow attackers to perform malicious activities once the system has been created. When the insider is revealed at the conclusion of the project, students discuss countermeasures regarding the malicious actions the insiders were able to plan or complete, along with methods of prevention that may have been employed by the team to detect the malicious developer. In this paper, we describe the activity along with the results of a survey. We discuss the benefits and challenges of the activity with the goal of giving other instructors the tools they need to conduct this activity at their institution. While many institutions do not offer courses in computer security, this self-contained activity may be used in any computing course to reinforce the importance of protecting against insider threats.}
}
 
Do Bugs Foreshadow Vulnerabilities? A Study of the Chromium Project Felivel Camilo, Andrew Meneely, & Meiyappan Nagappan 2015 International Working Conference on Mining Software Repositories to appear
ACM Distinguished Paper
Best Paper MSR 2015
As developers face ever-increasing pressure to engineer secure software, researchers are building an understanding of security-sensitive bugs (i.e. vulnerabilities). Research into mining software repositories has greatly increased our understanding of software quality via empirical study of bugs. However, conceptually vulnerabilities are different from bugs: they represent abusive functionality as opposed to wrong or insufficient functionality commonly associated with traditional, non-security bugs. In this study, we performed an in-depth analysis of the Chromium project to empirically examine the relationship between bugs and vulnerabilities. We mined 374,686 bugs and 703 post-release vulnerabilities over five Chromium releases that span six years of development. Using logistic regression analysis, we examined how various categories of pre-release bugs (e.g. stability, compatibility, etc.) are associated with post-release vulnerabilities. While we found statistically significant correlations between pre-release bugs and post-release vulnerabilities, we also found the association to be weak. Number of features, SLOC, and number of pre-release security bugs are, in general, more closely associated with post-release vulnerabilities than any of our non-security bug categories. In a separate analysis, we found that the files with highest defect density did not intersect with the files of highest vulnerability density. These results indicate that bugs and vulnerabilities are empirically dissimilar groups, warranting the need for more research targeting vulnerabilities specifically.
 @article{CamiloMSR2015,
  author = {Camilo, Felivel and Meneely, Andrew and Nagappan, Meiyappan},
  title = {Do Bugs Foreshadow Vulnerabilities? A Study of the Chromium Project},
  journal = {2015 International Working Conference on Mining Software Repositories},
  pages = {to appear},
  award = { ACM Distinguished Paper Award, MSR 2015 Best Paper},
  doi = {},
  abstract = {As developers face ever-increasing pressure to engineer secure software, researchers are building an understanding of security-sensitive bugs (i.e. vulnerabilities). Research into mining software repositories has greatly increased our understanding of software quality via empirical study of bugs. However, conceptually vulnerabilities are different from bugs: they represent abusive functionality as opposed to wrong or insufficient functionality commonly associated with traditional, non-security bugs. In this study, we performed an in-depth analysis of the Chromium project to empirically examine the relationship between bugs and vulnerabilities. We mined 374,686 bugs and 703 post-release vulnerabilities over five Chromium releases that span six years of development. Using logistic regression analysis, we examined how various categories of pre-release bugs (e.g. stability, compatibility, etc.) are associated with post-release vulnerabilities. While we found statistically significant correlations between pre-release bugs and post-release vulnerabilities, we also found the association to be weak. Number of features, SLOC, and number of pre-release security bugs are, in general, more closely associated with post-release vulnerabilities than any of our non-security bug categories. In a separate analysis, we found that the files with highest defect density did not intersect with the files of highest vulnerability density. These results indicate that bugs and vulnerabilities are empirically dissimilar groups, warranting the need for more research targeting vulnerabilities specifically.}
}
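For readers who want a feel for the kind of analysis described above, the sketch below (not the paper's actual scripts) fits a logistic regression relating per-file pre-release bug counts to post-release vulnerability status using pandas and statsmodels; the input file and every column name are hypothetical placeholders.

# Hypothetical sketch only: the CSV and column names are assumptions, not
# artifacts from the study above.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# One row per source file in a given release (made-up schema).
df = pd.read_csv("chromium_release_files.csv")

# Binary response: did the file have at least one post-release vulnerability?
df["vulnerable"] = (df["post_release_vulns"] > 0).astype(int)

# Logistic regression with pre-release bug-category counts and size as predictors.
model = smf.logit(
    "vulnerable ~ stability_bugs + compatibility_bugs + security_bugs + sloc",
    data=df,
).fit()

print(model.summary())       # coefficients and p-values
print(np.exp(model.params))  # odds ratios for easier interpretation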
 
An Empirical Investigation of Socio-technical Code Review Metrics and Security Vulnerabilities Andrew Meneely, Alberto C. Rodriguez Tejeda, Brian Spates, Shannon Trudeau, Danielle Neuberger, Katherine Whitlock, Christopher Ketant, & Kayla Davis Proceedings of the 6th International Workshop on Social Software Engineering pp. 37–44 2014
One of the guiding principles of open source software development is to use crowds of developers to keep a watchful eye on source code. Eric Raymond declared Linus' Law as "many eyes make all bugs shallow", with the socio-technical argument that high quality open source software emerges when developers combine together their collective experience and expertise to review code collaboratively. Vulnerabilities are a particularly nasty set of bugs that can be rare, difficult to reproduce, and require specialized skills to recognize. Does Linus' Law apply to vulnerabilities empirically? In this study, we analyzed 159,254 code reviews, 185,948 Git commits, and 667 post-release vulnerabilities in the Chromium browser project. We formulated, collected, and analyzed various metrics related to Linus' Law to explore the connection between collaborative reviews and vulnerabilities that were missed by the review process. Our statistical association results showed that source code files reviewed by more developers are, counter-intuitively, more likely to be vulnerable (even after accounting for file size). However, files are less likely to be vulnerable if they were reviewed by developers who had experience participating on prior vulnerability-fixing reviews. The results indicate that lack of security experience and lack of collaborator familiarity are key risk factors in considering Linus' Law with vulnerabilities.
 @inproceedings{MeneelySSE2014,
  author = {Meneely, Andrew and Tejeda, Alberto C. Rodriguez and Spates, Brian and Trudeau, Shannon and Neuberger, Danielle and Whitlock, Katherine and Ketant, Christopher and Davis, Kayla},
  title = {An Empirical Investigation of Socio-technical Code Review Metrics and Security Vulnerabilities},
  booktitle = {Proceedings of the 6th International Workshop on Social Software Engineering},
  series = {SSE 2014},
  year = {2014},
  isbn = {978-1-4503-3227-9},
  location = {Hong Kong, China},
  pages = {37--44},
  numpages = {8},
  url = {http://doi.acm.org/10.1145/2661685.2661687},
  doi = {10.1145/2661685.2661687},
  acmid = {2661687},
  publisher = {ACM},
  address = {New York, NY, USA},
  keywords = {code review, socio-technical, vulnerability},
  abstract = {One of the guiding principles of open source software development is to use crowds of developers to keep a watchful eye on source code.  Eric Raymond declared Linus' Law as "many eyes make all bugs shallow", with the socio-technical argument that high quality open source software emerges when developers combine together their collective experience and expertise to review code collaboratively. Vulnerabilities are a particularly nasty set of bugs that can be rare, difficult to reproduce, and require specialized skills to recognize. Does Linus' Law apply to vulnerabilities empirically? In this study, we analyzed 159,254 code reviews, 185,948 Git commits, and 667 post-release vulnerabilities in the Chromium browser project. We formulated, collected, and analyzed various metrics related to Linus' Law to explore the connection between collaborative reviews and vulnerabilities that were missed by the review process. Our statistical association results showed that source code files reviewed by more developers are, counter-intuitively, more likely to be vulnerable (even after accounting for file size). However, files are less likely to be vulnerable if they were reviewed by developers who had experience participating on prior vulnerability-fixing reviews. The results indicate that lack of security experience and lack of collaborator familiarity are key risk factors in considering Linus' Law with vulnerabilities. }
}
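As a rough illustration of the socio-technical review metrics discussed above (not the authors' tooling), the Python sketch below counts, for each file, how many distinct reviewers it had and how many of those reviewers had participated in a vulnerability-fixing review; the review records and their fields are invented for the example.

# Illustrative only: the review records and their schema are assumptions.
from collections import defaultdict

# Each record: (file_path, reviewer_id, review_fixes_a_vulnerability)
reviews = [
    ("net/socket.cc", "alice", False),
    ("net/socket.cc", "bob", True),
    ("ui/button.cc", "alice", False),
    ("ui/button.cc", "carol", False),
]

reviewers_per_file = defaultdict(set)
security_experienced = set()   # reviewers seen on any vulnerability-fixing review

for path, reviewer, fixes_vuln in reviews:
    reviewers_per_file[path].add(reviewer)
    if fixes_vuln:
        security_experienced.add(reviewer)

# Note: a faithful replication would only count experience gained *before*
# each review; this sketch ignores chronology for brevity.
for path, reviewers in sorted(reviewers_per_file.items()):
    experienced = reviewers & security_experienced
    print(f"{path}: {len(reviewers)} reviewers, "
          f"{len(experienced)} with vulnerability-fix review experience")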
 
When a patch goes bad: Exploring the properties of vulnerability-contributing commits Andrew Meneely, Harshavardhan Srinivasan, Ayemi Musa, Alberto Rodriguez Tejeda, Matthew Mokary, & Brian Spates Proceedings of the 2013 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement pp. 65–74 Oct. 2013
Security is a harsh reality for software teams today. Developers must engineer secure software by preventing vulnerabilities, which are design and coding mistakes that have security consequences. Even in open source projects, vulnerable source code can remain unnoticed for years. In this paper, we traced 68 vulnerabilities in the Apache HTTP server back to the version control commits that contributed the vulnerable code originally. We manually found 124 Vulnerability-Contributing Commits (VCCs), spanning 17 years. In this exploratory study, we analyzed these VCCs quantitatively and qualitatively with the over-arching question: "What could developers have looked for to identify security concerns in this commit?" Specifically, we examined the size of the commit via code churn metrics, the degree to which developers overwrite each other's code via interactive churn metrics, exposure time between VCC and fix, and dissemination of the VCC to the development community via release notes and voting mechanisms. Our results show that VCCs are large: more than twice as much code churn on average as non-VCCs, even when normalized against lines of code. Furthermore, a commit was twice as likely to be a VCC when the author was a new developer to the source code. The insight from this study can help developers understand how vulnerabilities originate in a system so that security-related mistakes can be prevented or caught in the future.
 @inproceedings{MeneelyESEM2013,
  title = {When a patch goes bad: Exploring the properties of vulnerability-contributing commits},
  booktitle = {Proceedings of the 2013 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement},
  series = {ESEM '13},
  location = {Baltimore, MD, USA},
  year = {2013},
  month = oct,
  pages = {65--74},
  numpages = {10},
  doi = {10.1109/ESEM.2013.19},
  author = {Meneely, Andrew and Srinivasan, Harshavardhan and Musa, Ayemi and Tejeda, Alberto Rodriguez and Mokary, Matthew and Spates, Brian},
  keywords = {vulnerability, churn, socio-technical, empirical},
  abstract = {Security is a harsh reality for software teams today. Developers must engineer secure software by preventing vulnerabilities, which are design and coding mistakes that have security consequences. Even in open source projects, vulnerable source code can remain unnoticed for years. In this paper, we traced 68 vulnerabilities in the Apache HTTP server back to the version control commits that contributed the vulnerable code originally. We manually found 124 Vulnerability-Contributing Commits (VCCs), spanning 17 years. In this exploratory study, we analyzed these VCCs quantitatively and qualitatively with the over-arching question: "What could developers have looked for to identify security concerns in this commit?" Specifically, we examined the size of the commit via code churn metrics, the degree to which developers overwrite each other's code via interactive churn metrics, exposure time between VCC and fix, and dissemination of the VCC to the development community via release notes and voting mechanisms. Our results show that VCCs are large: more than twice as much code churn on average as non-VCCs, even when normalized against lines of code. Furthermore, a commit was twice as likely to be a VCC when the author was a new developer to the source code. The insight from this study can help developers understand how vulnerabilities originate in a system so that security-related mistakes can be prevented or caught in the future.},
  month_numeric = {10}
}
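The sketch below shows one plausible way (under assumed repository conventions, not the study's actual scripts) to compute per-commit code churn from `git log --numstat` and to flag commits whose author had not appeared earlier in the history, echoing the churn and new-author signals examined above.

# Assumption-laden sketch: run inside a Git working copy; output format is invented.
import subprocess
from collections import defaultdict

log = subprocess.run(
    ["git", "log", "--numstat", "--pretty=format:commit %H %ae"],
    capture_output=True, text=True, check=True,
).stdout

churn = defaultdict(int)   # commit sha -> lines added + deleted
author = {}                # commit sha -> author email
current = None

for line in log.splitlines():
    if line.startswith("commit "):
        _, sha, email = line.split(" ", 2)
        current = sha
        author[sha] = email
    elif line.strip():
        added, deleted, _path = line.split("\t", 2)
        if added.isdigit() and deleted.isdigit():   # "-" marks binary files
            churn[current] += int(added) + int(deleted)

# Walk oldest-to-newest and flag authors not seen before each commit.
# (The paper's metric is finer-grained: new to the *file*, not the project.)
seen = set()
for sha in reversed(list(author)):
    is_new_author = author[sha] not in seen
    seen.add(author[sha])
    print(sha[:10], "churn:", churn.get(sha, 0), "new author:", is_new_author)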
 
Teaching web engineering using a project component Daniel E. Krutz & Andrew Meneely 2013 IEEE Frontiers in Education Conference (FIE) vol. 0 pp. 1366–1368 2013
Web applications are an integral part of today's world. Everything from banking to checking our Facebook status may now be done through the use of web applications. The pressure of creating secure web applications on time and on budget is felt by innumerable software developers. Late or buggy software may have adverse financial implications, while an insecure application could expose the private information of countless individuals. Today's students need to balance numerous concerns in order to create a web application that is robust, on time and on budget. Software engineering techniques are well suited to web applications due to their changing requirements, diverse development teams and iterative nature. Unfortunately, the majority of students lack these skills upon graduation. Web engineering is the application of software engineering techniques to web technologies. While Web engineering has recently begun to emerge as its own field of engineering, it still lacks the maturity of legacy software engineering fields. At the Department of Software Engineering at the Rochester Institute of Technology, we created a course called Web Engineering. As part of this course, we developed an innovative project component which focused on students following software engineering development principles such as elicitation, requirements generation, testing and deployment. Some of the requirements for this project included the creation of a custom calendar, several tie-ins with the Facebook API and creation of custom web services. An additional differentiating factor for this project was the cross-course collaboration with a simultaneous security class. As part of this partnership between the two classes, students from the security course would routinely examine the web engineering project. Security students provided feedback to the web engineering students, and the web engineering students were expected to react. To our knowledge, no other curriculum uses such a project-driven approach to emphasize engineering techniques. Furthermore, we encourage security through the collaboration with the security course in our department. The student reaction to this course has been overwhelmingly positive. The initial offering of this course was in the Spring of 2012 with a second iteration in the Spring of 2013. This experience report will describe the project, the general course structure, as well as various obstacles and lessons learned.
 @article{KrutzFIE2013,
  title = {Teaching web engineering using a project component},
  author = {Krutz, Daniel E. and Meneely, Andrew},
  journal = {2013 IEEE Frontiers in Education Conference (FIE)},
  volume = {0},
  issn = {0190-5848},
  year = {2013},
  pages = {1366--1368},
  doi = {10.1109/FIE.2013.6685055},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
  series = {FIE '13},
  numpages = {3},
  keywords = {web, engineering, education, security},
  abstract = {Web applications are an integral part of today's world. Everything from banking to checking our Facebook status may now be done through the use of web applications. The pressure of creating secure web applications on time and on budget is felt by innumerable software developers. Late or buggy software may have adverse financial implications, while an insecure application could expose the private information of countless individuals. Today's students need to balance numerous concerns in order to create a web application that is robust, on time and on budget. Software engineering techniques are well suited to web applications due to their changing requirements, diverse development teams and iterative nature. Unfortunately, the majority of students lack these skills upon graduation. Web engineering is the application of software engineering techniques to web technologies. While Web engineering has recently begun to emerge as its own field of engineering, it still lacks the maturity of legacy software engineering fields. At the Department of Software Engineering at the Rochester Institute of Technology, we created a course called Web Engineering. As part of this course, we developed an innovative project component which focused on students following software engineering development principles such as elicitation, requirements generation, testing and deployment. Some of the requirements for this project included the creation of a custom calendar, several tie-ins with the Facebook API and creation of custom web services. An additional differentiating factor for this project was the cross-course collaboration with a simultaneous security class. As part of this partnership between the two classes, students from the security course would routinely examine the web engineering project. Security students provided feedback to the web engineering students, and the web engineering students were expected to react. To our knowledge, no other curriculum uses such a project-driven approach to emphasize engineering techniques. Furthermore, we encourage security through the collaboration with the security course in our department. The student reaction to this course has been overwhelmingly positive. The initial offering of this course was in the Spring of 2012 with a second iteration in the Spring of 2013. This experience report will describe the project, the general course structure, as well as various obstacles and lessons learned.}
}
 
Vulnerability of the day: concrete demonstrations for software engineering undergraduates Andrew Meneely & Samuel Lucidi Proceedings of the 2013 International Conference on Software Engineering pp. 1154–1157 2013
Software security is a tough reality that affects the many facets of our modern, digital world. The pressure to produce secure software is felt particularly strongly by software engineers. Today's software engineering students will need to deal with software security in their profession. However, these students will also not be security experts; rather, they need to balance security concerns with the myriad of other draws on their attention, such as reliability, performance, and delivering the product on-time and on-budget. At the Department of Software Engineering at the Rochester Institute of Technology, we developed a course called Engineering Secure Software, designed for applying security principles to each stage of the software development lifecycle. As a part of this course, we developed a component called Vulnerability of the Day, which is a set of selected example software vulnerabilities. We selected these vulnerabilities to be simple, demonstrable, and relevant so that the vulnerability could be demonstrated in the first 10 minutes of each class session. For each vulnerability demonstration, we provide historical examples, realistic scenarios, and mitigations. With student reaction being overwhelmingly positive, we have created an open source project for our Vulnerabilities of the Day, and have defined guiding principles for developing and contributing effective examples.
 @inproceedings{MeneelyICSESEE2013,
  author = {Meneely, Andrew and Lucidi, Samuel},
  title = {Vulnerability of the day: concrete demonstrations for software engineering undergraduates},
  booktitle = {Proceedings of the 2013 International Conference on Software Engineering},
  series = {ICSE '13},
  year = {2013},
  isbn = {978-1-4673-3076-3},
  location = {San Francisco, CA, USA},
  pages = {1154--1157},
  numpages = {4},
  url = {http://dl.acm.org/citation.cfm?id=2486788.2486948},
  acmid = {2486948},
  publisher = {IEEE Press},
  address = {Piscataway, NJ, USA},
  abstract = {Software security is a tough reality that affects the many facets of our modern, digital world. The pressure to produce secure software is felt particularly strongly by software engineers. Today's software engineering students will need to deal with software security in their profession. However, these students will also not be security experts; rather, they need to balance security concerns with the myriad of other draws on their attention, such as reliability, performance, and delivering the product on-time and on-budget. At the Department of Software Engineering at the Rochester Institute of Technology, we developed a course called Engineering Secure Software, designed for applying security principles to each stage of the software development lifecycle. As a part of this course, we developed a component called Vulnerability of the Day, which is a set of selected example software vulnerabilities. We selected these vulnerabilities to be simple, demonstrable, and relevant so that the vulnerability could be demonstrated in the first 10 minutes of each class session. For each vulnerability demonstration, we provide historical examples, realistic scenarios, and mitigations. With student reaction being overwhelmingly positive, we have created an open source project for our Vulnerabilities of the Day, and have defined guiding principles for developing and contributing effective examples.}
}
 
Validating software metrics: A spectrum of philosophies Andrew Meneely, Ben Smith, & Laurie Williams ACM Transactions on Software Engineering and Methodology (TOSEM) vol. 21, no. 4 pp. 24:1–24:28 Feb. 2013
Context: Researchers proposing a new metric have the burden of proof to demonstrate to the research community that the metric is acceptable in its intended use. This burden of proof is provided through the multi-faceted, scientific, and objective process of software metrics validation. Over the last 40 years, however, researchers have debated what constitutes a "valid" metric. Aim: The debate over what constitutes a valid metric centers on software metrics validation criteria. The objective of this paper is to guide researchers in making sound contributions to the field of software engineering metrics by providing a practical summary of the metrics validation criteria found in the academic literature. Method: We conducted a systematic literature review that began with 2,288 papers and ultimately focused on 20 papers. After extracting 47 unique validation criteria from these 20 papers, we performed a comparative analysis to explore the relationships amongst the criteria. Results: Our 47 validation criteria represent a diverse view of what constitutes a valid metric. We present an analysis of the criteria's categorization, relationships, advantages, and philosophical motivations behind the validation criteria. We then present a step-by-step process for selecting appropriate metrics validation criteria based on a metric's intended use. Conclusions: The diversity of motivations and philosophies behind the 47 validation criteria indicates that metrics validation is complex. Researchers proposing new metrics should consider the applicability of the validation criteria in terms of our categorization and analysis. Rather than arbitrarily choosing validation criteria for each metric, researchers should choose criteria that can confirm that the metric is appropriate for its intended use by inspecting the advantages that different criteria provide. We conclude that metrics validation criteria provide answers to questions that researchers have about the merits and limitations of a metric.
 @article{MeneelyTOSEM2013,
  author = {Meneely, Andrew and Smith, Ben and Williams, Laurie},
  title = {Validating software metrics: A spectrum of philosophies},
  journal = {ACM Transactions on Software Engineering and Methodology (TOSEM)},
  issue_date = {November 2012},
  volume = {21},
  number = {4},
  month = feb,
  year = {2013},
  issn = {1049-331X},
  pages = {24:1--24:28},
  articleno = {24},
  numpages = {28},
  url = {http://doi.acm.org/10.1145/2377656.2377661},
  doi = {10.1145/2377656.2377661},
  acmid = {2377661},
  publisher = {ACM},
  address = {New York, NY, USA},
  keywords = {Software metrics, systematic literature review, validation criterion},
  abstract = {Context: Researchers proposing a new metric have the burden of proof to demonstrate to the research community that the metric is acceptable in its intended use. This burden of proof is provided through the multi-faceted, scientific, and objective process of software metrics validation. Over the last 40 years, however, researchers have debated what constitutes a "valid" metric.
  Aim: The debate over what constitutes a valid metric centers on software metrics validation criteria. The objective of this paper is to guide researchers in making sound contributions to the field of software engineering metrics by providing a practical summary of the metrics validation criteria found in the academic literature.
  Method: We conducted a systematic literature review that began with 2,288 papers and ultimately focused on 20 papers. After extracting 47 unique validation criteria from these 20 papers, we performed a comparative analysis to explore the relationships amongst the criteria.
  Results: Our 47 validation criteria represent a diverse view of what constitutes a valid metric. We present an analysis of the criteria's categorization, relationships, advantages, and philosophical motivations behind the validation criteria. We then present a step-by-step process for selecting appropriate metrics validation criteria based on a metric's intended use.
  Conclusions: The diversity of motivations and philosophies behind the 47 validation criteria indicates that metrics validation is complex. Researchers proposing new metrics should consider the applicability of the validation criteria in terms of our categorization and analysis. Rather than arbitrarily choosing validation criteria for each metric, researchers should choose criteria that can confirm that the metric is appropriate for its intended use by inspecting the advantages that different criteria provide. We conclude that metrics validation criteria provide answers to questions that researchers have about the merits and limitations of a metric.},
  month_numeric = {2}
}
 
Evaluating Complexity, Code Churn, and Developer Activity Metrics as Indicators of Software Vulnerabilities Yonghee Shin, A. Meneely, L. Williams, & J.A. Osborne IEEE Transactions on Software Engineering vol. 37, no. 6 pp. 772–787 2011
Security inspection and testing require experts in security who think like an attacker. Security experts need to know code locations on which to focus their testing and inspection efforts. Since vulnerabilities are rare occurrences, locating vulnerable code locations can be a challenging task. We investigated whether software metrics obtained from source code and development history are discriminative and predictive of vulnerable code locations. If so, security experts can use this prediction to prioritize security inspection and testing efforts. The metrics we investigated fall into three categories: complexity, code churn, and developer activity metrics. We performed two empirical case studies on large, widely used open-source projects: the Mozilla Firefox web browser and the Red Hat Enterprise Linux kernel. The results indicate that 24 of the 28 metrics collected are discriminative of vulnerabilities for both projects. The models using all three types of metrics together predicted over 80 percent of the known vulnerable files with less than 25 percent false positives for both projects. Compared to a random selection of files for inspection and testing, these models would have reduced the number of files and the number of lines of code to inspect or test by over 71 and 28 percent, respectively, for both projects.
 @article{ShinTSE2012,
  title = {Evaluating Complexity, Code Churn, and Developer Activity Metrics as Indicators of Software Vulnerabilities},
  volume = {37},
  issn = {0098-5589},
  doi = {10.1109/TSE.2010.81},
  abstract = {Security inspection and testing require experts in security who think like an attacker. Security experts need to know code locations on which to focus their testing and inspection efforts. Since vulnerabilities are rare occurrences, locating vulnerable code locations can be a challenging task. We investigated whether software metrics obtained from source code and development history are discriminative and predictive of vulnerable code locations. If so, security experts can use this prediction to prioritize security inspection and testing efforts. The metrics we investigated fall into three categories: complexity, code churn, and developer activity metrics. We performed two empirical case studies on large, widely used open-source projects: the Mozilla Firefox web browser and the Red Hat Enterprise Linux kernel. The results indicate that 24 of the 28 metrics collected are discriminative of vulnerabilities for both projects. The models using all three types of metrics together predicted over 80 percent of the known vulnerable files with less than 25 percent false positives for both projects. Compared to a random selection of files for inspection and testing, these models would have reduced the number of files and the number of lines of code to inspect or test by over 71 and 28 percent, respectively, for both projects.},
  number = {6},
  journal = {{IEEE} Transactions on Software Engineering},
  author = {Shin, Yonghee and Meneely, A. and Williams, L. and Osborne, {J.A.}},
  year = {2011},
  keywords = {Charge coupled devices, code churn, Complexity theory, developer activity metrics, Fault diagnosis, fault prediction, Linux, Mozilla Firefox Web browser, online front-ends, open-source projects, Predictive models, program testing, public domain software, Red Hat enterprise Linux kernel, security inspection, software fault tolerance, software metrics, software security, software vulnerabilities, source code, vulnerability prediction., vulnerable code locations},
  pages = {772--787}
}
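To make the prediction setup above concrete, here is a hedged sketch (not the paper's models or data) that trains a classifier on per-file complexity, churn, and developer-activity metrics and reports recall and false positive rate on held-out files; the CSV name, columns, and split are placeholders.

# Illustrative sketch; metric names, input file, and model choice are assumptions.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

df = pd.read_csv("file_metrics.csv")   # hypothetical per-file metrics
features = ["cyclomatic_complexity", "code_churn", "num_developers", "sloc"]
X, y = df[features], (df["num_vulnerabilities"] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
tn, fp, fn, tp = confusion_matrix(y_test, clf.predict(X_test)).ravel()

print("recall (vulnerable files found):", tp / (tp + fn))
print("false positive rate:", fp / (fp + tn))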
 
Interactive churn metrics: socio-technical variants of code churn Andrew Meneely & Oluyinka Williams SIGSOFT Software Engineering Notes vol. 37, no. 6 pp. 1–6 Nov. 2012
A central part of software quality is finding bugs. One method of finding bugs is by measuring important aspects of the software product and the development process. In recent history, researchers have discovered evidence of a "code churn" effect whereby the degree to which a given source code file has changed over time is correlated with faults and vulnerabilities. Computing the code churn metric comes from counting source code differences in version control repositories. However, code churn does not take into account a critical factor of any software development team: the human factor, specifically who is making the changes. In this paper, we introduce a new class of human-centered metrics, interactive churn metrics, as variants of code churn. Using the git blame tool, we identify the most recent developer who changed a given line of code in a file prior to a given revision. Then, for each line changed in a given revision, we determined whether the revision author was changing his or her own code (self churn) or was changing code last modified by somebody else (interactive churn). We derive and present several metrics from this concept. Finally, we conducted an empirical analysis of these metrics on the PHP programming language and its post-release vulnerabilities. We found that our interactive churn metrics are statistically correlated with post-release vulnerabilities and only weakly correlated with code churn metrics and source lines of code. The results indicate that interactive churn metrics are associated with software quality and are distinct from code churn and source lines of code.
 @article{MeneelyWoSQ2012,
  author = {Meneely, Andrew and Williams, Oluyinka},
  title = {Interactive churn metrics: socio-technical variants of code churn},
  journal = {SIGSOFT Software Engineering Notes},
  issue_date = {November 2012},
  volume = {37},
  number = {6},
  month = nov,
  year = {2012},
  issn = {0163-5948},
  pages = {1--6},
  numpages = {6},
  url = {http://doi.acm.org/10.1145/2382756.2382785},
  doi = {10.1145/2382756.2382785},
  acmid = {2382785},
  publisher = {ACM},
  address = {New York, NY, USA},
  keywords = {code churn, interactive churn, socio-technical},
  abstract = {A central part of software quality is finding bugs. One method of finding bugs is by measuring important aspects of the software product and the development process. In recent history, researchers have discovered evidence of a "code churn" effect whereby the degree to which a given source code file has changed over time is correlated with faults and vulnerabilities. Computing the code churn metric comes from counting source code differences in version control repositories. However, code churn does not take into account a critical factor of any software development team: the human factor, specifically who is making the changes. In this paper, we introduce a new class of human-centered metrics, interactive churn metrics, as variants of code churn. Using the git blame tool, we identify the most recent developer who changed a given line of code in a file prior to a given revision. Then, for each line changed in a given revision, we determined whether the revision author was changing his or her own code (self churn) or was changing code last modified by somebody else (interactive churn). We derive and present several metrics from this concept. Finally, we conducted an empirical analysis of these metrics on the PHP programming language and its post-release vulnerabilities. We found that our interactive churn metrics are statistically correlated with post-release vulnerabilities and only weakly correlated with code churn metrics and source lines of code. The results indicate that interactive churn metrics are associated with software quality and are distinct from code churn and source lines of code.},
  month_numeric = {11}
}
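The sketch below illustrates the interactive churn idea from the abstract for a single commit and file: it uses `git diff` hunk headers to find the parent-side lines the commit touched, and `git blame` on the parent to see who last modified them, classifying each as self churn or interactive churn. The helper function, its arguments, and the repository layout are assumptions for illustration, not the authors' implementation.

# Simplified, assumption-heavy sketch of self vs. interactive churn for one
# commit and one file; run inside a Git working copy.
import re
import subprocess

def git(*args):
    return subprocess.run(["git", *args], capture_output=True,
                          text=True, check=True).stdout

def classify_churn(commit, path):
    author = git("show", "-s", "--format=%ae", commit).strip()

    # Parent-side line numbers this commit touched, from unified-diff hunk headers.
    diff = git("diff", "-U0", f"{commit}^", commit, "--", path)
    touched = []
    for m in re.finditer(r"^@@ -(\d+)(?:,(\d+))?", diff, flags=re.M):
        start, count = int(m.group(1)), int(m.group(2) or "1")
        touched.extend(range(start, start + count))

    # Who last modified each touched line, according to blame on the parent.
    blame = git("blame", "--line-porcelain", f"{commit}^", "--", path)
    line_author, lineno = {}, 0
    for raw in blame.splitlines():
        header = re.match(r"^[0-9a-f]{40} \d+ (\d+)", raw)
        if header:
            lineno = int(header.group(1))
        elif raw.startswith("author-mail "):
            line_author[lineno] = raw.split(" ", 1)[1].strip("<>")

    self_churn = sum(1 for n in touched if line_author.get(n) == author)
    return self_churn, len(touched) - self_churn   # (self, interactive)

# Example with a hypothetical commit hash (file must exist in the parent):
# print(classify_churn("a1b2c3d", "ext/standard/string.c"))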
 
Developing an Applied, Security-Oriented Computing Curriculum Marcin Lukowiak, James Vallino, Christopher Wood, & Andrew Meneely Proceedings of the 2012 American Society for Engineering Education (ASEE) Annual Conference Jun. 2012
Software and hardware security is a reality that all stakeholders must face, from hardware engineers to software developers to customers. As a direct result, the technology industry is facing a growing need for engineers who understand security principles at varying levels of abstraction. These engineers will need security-oriented perspectives stemming from both theoretical and practical disciplines, including software engineering, computer engineering, and computer science. Unfortunately, in traditional academic settings, secure software and hardware are typically taught independently despite being intertwined in practice. Consequently, the objective of this initiative is to prepare students to apply a security-oriented awareness to a broad range of hardware and software systems by developing a multi-disciplinary curriculum involving three departments. Our efforts at Rochester Institute of Technology focus on integrating security into software design and implementations, hardware design and implementations, and hardware-software co-design. In the cluster of courses described in this paper, we use cryptographic applications as the motivating security focus. We describe changes made to an existing introductory cryptography course, report on a recently-developed course entitled Hardware and Software Design for Cryptographic Applications, and present our plans for a Secure Software Engineering course.
 @inproceedings{LukowiakASEE2012,
  address = {San Antonio, Texas, {USA}},
  title = {Developing an Applied, Security-Oriented Computing Curriculum},
  author = {Lukowiak, Marcin and Vallino, James and Wood, Christopher and Meneely, Andrew},
  booktitle = {Proceedings of the 2012 American Society for Engineering Education (ASEE) Annual Conference},
  month = jun,
  year = {2012},
  abstract = {Software and hardware security is a reality that all stakeholders must face, from hardware engineers to software developers to customers. As a direct result, the technology industry is facing a growing need for engineers who understand security principles at varying levels of abstraction. These engineers will need security-oriented perspectives stemming from both theoretical and practical disciplines, including software engineering, computer engineering, and computer science. Unfortunately, in traditional academic settings, secure software and hardware are typically taught independently despite being intertwined in practice. Consequently, the objective of this initiative is to prepare students to apply a security-oriented awareness to a broad range of hardware and software systems by developing a multi-disciplinary curriculum involving three departments. Our efforts at Rochester Institute of Technology focus on integrating security into software design and implementations, hardware design and implementations, and hardware-software co-design. In the cluster of courses described in this paper, we use cryptographic applications as the motivating security focus. We describe changes made to an existing introductory cryptography course, report on a recently-developed course entitled Hardware and Software Design for Cryptographic Applications, and present our plans for a Secure Software Engineering course.},
  month_numeric = {6}
}
 
Does adding manpower also affect quality? an empirical, longitudinal analysis Andrew Meneely, Pete Rotella, & Laurie Williams Foundations of Software Engineering pp. 81–90 2011
With each new developer to a software development team comes a greater challenge to manage the communication, coordination, and knowledge transfer amongst teammates. Fred Brooks discusses this challenge in The Mythical Man-Month by arguing that rapid team expansion can lead to a complex team organization structure. While Brooks focuses on productivity loss as the negative outcome, poor product quality is also a substantial concern. But if team expansion is unavoidable, can any quality impacts be mitigated? Our objective is to guide software engineering managers by empirically analyzing the effects of team size, expansion, and structure on product quality. We performed an empirical, longitudinal case study of a large Cisco networking product over a five year history. Over that time, the team underwent periods of no expansion, steady expansion, and accelerated expansion. Using team-level metrics, we quantified characteristics of team expansion, including team size, expansion rate, expansion acceleration, and modularity with respect to department designations. We examined statistical correlations between our monthly team-level metrics and monthly product-level metrics. Our results indicate that increased team size and linear growth are correlated with later periods of better product quality. However, periods of accelerated team expansion are correlated with later periods of reduced software quality. Furthermore, our linear regression prediction model based on team metrics was able to predict the product's post-release failure rate within a 95% prediction interval for 38 out of 40 months. Our analysis provides insight for project managers into how the expansion of development teams can impact product quality.
 @inproceedings{MeneelyFSE2011,
  author = {Meneely, Andrew and Rotella, Pete and Williams, Laurie},
  title = {Does adding manpower also affect quality? an empirical, longitudinal analysis},
  booktitle = {Foundations of Software Engineering},
  series = {ESEC/FSE '11},
  year = {2011},
  isbn = {978-1-4503-0443-6},
  location = {Szeged, Hungary},
  pages = {81--90},
  numpages = {10},
  url = {http://doi.acm.org/10.1145/2025113.2025128},
  doi = {10.1145/2025113.2025128},
  acmid = {2025128},
  publisher = {ACM},
  address = {New York, NY, USA},
  keywords = {brooks law, developer, linear regression, longitudinal analysis, modularity, team expansion metric},
  abstract = {With each new developer to a software development team comes a greater challenge to manage the communication, coordination, and knowledge transfer amongst teammates. Fred Brooks discusses this challenge in The Mythical Man-Month by arguing that rapid team expansion can lead to a complex team organization structure. While Brooks focuses on productivity loss as the negative outcome, poor product quality is also a substantial concern. But if team expansion is unavoidable, can any quality impacts be mitigated? Our objective is to guide software engineering managers by empirically analyzing the effects of team size, expansion, and structure on product quality. We performed an empirical, longitudinal case study of a large Cisco networking product over a five year history. Over that time, the team underwent periods of no expansion, steady expansion, and accelerated expansion. Using team-level metrics, we quantified characteristics of team expansion, including team size, expansion rate, expansion acceleration, and modularity with respect to department designations. We examined statistical correlations between our monthly team-level metrics and monthly product-level metrics. Our results indicate that increased team size and linear growth are correlated with later periods of better product quality. However, periods of accelerated team expansion are correlated with later periods of reduced software quality. Furthermore, our linear regression prediction model based on team metrics was able to predict the product's post-release failure rate within a 95\% prediction interval for 38 out of 40 months. Our analysis provides insight for project managers into how the expansion of development teams can impact product quality.}
}
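As a toy illustration of the team-level modeling described above (with entirely made-up monthly data, column names, and model form), the sketch below fits an ordinary least squares model of post-release failure rate on team size and expansion acceleration and checks how many observations fall inside the 95% prediction interval.

# Toy sketch; the data file, columns, and model form are assumptions.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("monthly_team_metrics.csv")   # hypothetical monthly metrics

model = smf.ols("failure_rate ~ team_size + expansion_accel", data=df).fit()

# 95% prediction interval for each month, then check in-sample coverage.
frame = model.get_prediction(df).summary_frame(alpha=0.05)
inside = df["failure_rate"].between(frame["obs_ci_lower"], frame["obs_ci_upper"])
print(f"{inside.sum()} of {len(df)} months fall inside the 95% prediction interval")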
 
Socio-technical developer networks: should we trust our measurements? Andrew Meneely & Laurie Williams Proceedings of the 33rd International Conference on Software Engineering pp. 281–290 2011
Software development teams must be properly structured to provide effective collaboration to produce quality software. Over the last several years, social network analysis (SNA) has emerged as a popular method for studying the collaboration and organization of people working in large software development teams. Researchers have been modeling networks of developers based on socio-technical connections found in software development artifacts. Using these developer networks, researchers have proposed several SNA metrics that can predict software quality factors and describe the team structure. But do SNA metrics measure what they purport to measure? The objective of this research is to investigate if SNA metrics represent socio-technical relationships by examining if developer networks can be corroborated with developer perceptions. To measure developer perceptions, we developed an online survey that is personalized to each developer of a development team based on that developer's SNA metrics. Developers answered questions about other members of the team, such as identifying their collaborators and the project experts. A total of 124 developers responded to our survey from three popular open source projects: the Linux kernel, the PHP programming language, and the Wireshark network protocol analyzer. Our results indicate that connections in the developer network are statistically associated with the collaborators whom the developers named. Our results substantiate that SNA metrics represent socio-technical relationships in open source development projects, while also clarifying how the developer network can be interpreted by researchers and practitioners.
 @inproceedings{MeneelyICSE2011,
  author = {Meneely, Andrew and Williams, Laurie},
  title = {Socio-technical developer networks: should we trust our measurements?},
  booktitle = {Proceedings of the 33rd International Conference on Software Engineering},
  series = {ICSE '11},
  year = {2011},
  isbn = {978-1-4503-0445-0},
  location = {Waikiki, Honolulu, HI, USA},
  pages = {281--290},
  numpages = {10},
  url = {http://doi.acm.org/10.1145/1985793.1985832},
  doi = {10.1145/1985793.1985832},
  acmid = {1985832},
  publisher = {ACM},
  address = {New York, NY, USA},
  keywords = {developer network, developers, social network analysis},
  abstract = {Software development teams must be properly structured to provide effective collaboration to produce quality software. Over the last several years, social network analysis (SNA) has emerged as a popular method for studying the collaboration and organization of people working in large software development teams. Researchers have been modeling networks of developers based on socio-technical connections found in software development artifacts. Using these developer networks, researchers have proposed several SNA metrics that can predict software quality factors and describe the team structure. But do SNA metrics measure what they purport to measure? The objective of this research is to investigate if SNA metrics represent socio-technical relationships by examining if developer networks can be corroborated with developer perceptions. To measure developer perceptions, we developed an online survey that is personalized to each developer of a development team based on that developer's SNA metrics. Developers answered questions about other members of the team, such as identifying their collaborators and the project experts. A total of 124 developers responded to our survey from three popular open source projects: the Linux kernel, the PHP programming language, and the Wireshark network protocol analyzer. Our results indicate that connections in the developer network are statistically associated with the collaborators whom the developers named. Our results substantiate that SNA metrics represent socio-technical relationships in open source development projects, while also clarifying how the developer network can be interpreted by researchers and practitioners.}
}
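To show how a developer network of the sort studied above can be assembled and measured (with invented contribution data, not the survey data from the paper), the sketch below links developers who changed the same file and computes betweenness centrality with networkx.

# Hedged sketch: the contribution mapping is made up; only the graph-building
# and centrality steps are the point.
from itertools import combinations
import networkx as nx

# Hypothetical mapping: file -> set of developers who committed to it.
contributions = {
    "net/ipv4/tcp.c": {"alice", "bob", "carol"},
    "drivers/usb/core.c": {"bob", "dave"},
    "fs/ext4/inode.c": {"carol", "dave"},
}

G = nx.Graph()
for devs in contributions.values():
    for a, b in combinations(sorted(devs), 2):
        G.add_edge(a, b)

# Betweenness centrality: developers who sit between otherwise-separate groups.
for dev, score in sorted(nx.betweenness_centrality(G).items(), key=lambda kv: -kv[1]):
    print(dev, round(score, 3))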
 
The iTrust Electronic Health Records System Andrew Meneely, Ben Smith, & Laurie Williams Software Systems Traceability no. XVII 2012
Electronic health record (EHR) systems present a formidable "trustworthiness" challenge because people's health records, which are transmitted and protected by these systems, are just as valuable to a myriad of attackers as they are to health care practitioners. Major initiatives in EHR adoption and increased sharing of health information raise significant challenges for protecting the privacy of patients' health information. The United States is pursuing the vision of the National Health Information Network (NHIN) in which the electronic health records of the American people are passed between sometimes-competing health care providers. The American Recovery and Reinvestment Act of 2009 (ARRA) [1] provides $34 billion of incentives to health care providers to deploy a government-approved EHR. The ARRA will, by 2014, impose penalties on those who do not. As a result, the use of EHR systems is likely to proliferate in the US in the next four years. Dr. Laurie Williams created iTrust in 2005 as a course project for undergraduates in North Carolina State University's Software Engineering course. iTrust is intended as a patient-centric application for maintaining an EHR. An ideal health care system combines medical information from multiple sources to provide a summary or detail view of the history of a particular patient in a way that is useful to the health care practitioner. iTrust is not intended to fulfill the requirements set forth to be approved by the government, nor is it intended for use by practitioners in the field of medicine. The primary goal for the project is to provide software engineering students with a project with real-world relevance and enough depth and complexity as to mimic industrial systems that students may encounter while working in the software industry. Additionally, iTrust provides an educational testbed for understanding the importance of security and privacy requirements. iTrust is particularly focused on maintaining the privacy standards set forth in the HIPAA Security and Privacy Rules.
 @incollection{MeneelySST2011,
  series = {Software Engineering},
  title = {The {iTrust} Electronic Health Records System},
  isbn = {978-1-4471-2238-8},
  abstract = {Electronic health record ({EHR}) systems present a formidable "trustworthiness" challenge because people's health records, which are transmitted and protected by these systems, are just as valuable to a myriad of attackers as they are to health care practitioners. Major initiatives in {EHR} adoption and increased sharing of health information raise significant challenges for protecting the privacy of patients' health information. The United States is pursuing the vision of the National Health Information Network ({NHIN}) in which the electronic health records of the American people are passed between sometimes-competing health care providers. The American Recovery and Reinvestment Act of 2009 ({ARRA}) [1] provides \$34 billion of incentives to health care providers to deploy a government-approved {EHR}. The {ARRA} will, by 2014, impose penalties on those who do not. As a result, the use of {EHR} systems is likely to proliferate in the {US} in the next four years. Dr. Laurie Williams created {iTrust} in 2005 as a course project for undergraduates in North Carolina State University's Software Engineering course. {iTrust} is intended as a patient-centric application for maintaining an {EHR}. An ideal health care system combines medical information from multiple sources to provide a summary or detail view of the history of a particular patient in a way that is useful to the health care practitioner. {iTrust} is not intended to fulfill the requirements set forth to be approved by the government, nor is it intended for use by practitioners in the field of medicine. The primary goal for the project is to provide software engineering students with a project with real-world relevance and enough depth and complexity as to mimic industrial systems that students may encounter while working in the software industry. Additionally, {iTrust} provides an educational testbed for understanding the importance of security and privacy requirements. {iTrust} is particularly focused on maintaining the privacy standards set forth in the {HIPAA} Security and Privacy Rules.},
  number = {{XVII}},
  booktitle = {Software Systems Traceability},
  publisher = {Springer},
  author = {Meneely, Andrew and Smith, Ben and Williams, Laurie},
  year = {2012}
}
 
On the Use of Issue Tracking Annotations for Improving Developer Activity Metrics Andrew Meneely & Laurie Williams Advances in Software Engineering vol. 2010, no. 273080 2010
Understanding and measuring how teams of developers collaborate on software projects can provide valuable insight into the software development process. Currently, researchers and practitioners measure developer collaboration with social networks constructed from version control logs. Version control change logs, however, do not tell the whole story. The collaborative problem-solving process is also documented in the issue tracking systems that record solutions to failures, feature requests, or other development tasks. We propose two annotations to be used in issue tracking systems: solution originator and solution approver. We annotated which developers were originators or approvers of the solution to 602 issues from the OpenMRS healthcare system. We used these annotations to augment the version control logs and found 47 more contributors to the OpenMRS project than the original 40 found in the version control logs. Using social network analysis, we found that approvers are likely to score high in centrality and hierarchical clustering. Our results indicate that our two issue tracking annotations identify project collaborators that version control logs miss. Thus, issue tracking annotations are an improvement in developer activity metrics that strengthen the connection between what we can measure in the project development artifacts and the team's collaborative problem-solving process.
 @article{MeneelyASE2010,
  title = {On the Use of Issue Tracking Annotations for Improving Developer Activity Metrics},
  volume = {2010},
  doi = {10.1155/2010/273080},
  number = {273080},
  url = {http://www.hindawi.com/journals/ase/2010/273080/cta/},
  journal = {Advances in Software Engineering},
  publisher = {Hindawi},
  author = {Meneely, Andrew and Williams, Laurie},
  year = {2010},
  numpages = {9},
  abstract = {Understanding and measuring how teams of developers collaborate on software projects can provide valuable insight into the software development process. Currently, researchers and practitioners measure developer collaboration with social networks constructed from version control logs. Version control change logs, however, do not tell the whole story. The collaborative problem-solving process is also documented in the issue tracking systems that record solutions to failures, feature requests, or other development tasks. We propose two annotations to be used in issue tracking systems: solution originator and solution approver. We annotated which developers were originators or approvers of the solution to 602 issues from the OpenMRS healthcare system. We used these annotations to augment the version control logs and found 47 more contributors to the OpenMRS project than the original 40 found in the version control logs. Using social network analysis, we found that approvers are likely to score high in centrality and hierarchical clustering. Our results indicate that our two issue tracking annotations identify project collaborators that version control logs miss. Thus, issue tracking annotations are an improvement in developer activity metrics that strengthen the connection between what we can measure in the project development artifacts and the team's collaborative problem-solving process.},
  keywords = {developer network, centrality, socio-technical}
}
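 
A note for readers who want to try this kind of analysis themselves: the two annotations are simple enough to prototype against any issue-tracker export. The Python sketch below is illustrative only, not the tooling used in the paper; the commit and issue records, their field names, and the choice of networkx are all assumptions. It merges contributors found in version control logs with the originators and approvers recorded on issues, builds a developer network linking people who changed the same file, and reports betweenness centrality for the approvers.
# Minimal sketch (not the paper's tooling): combine version control contributors
# with issue-tracker "solution originator"/"solution approver" annotations and
# compute centrality in a developer network. All input records are hypothetical.
from itertools import combinations
import networkx as nx

commits = [  # hypothetical version control log: one record per commit
    {"author": "alice", "files": ["Patient.java", "Visit.java"]},
    {"author": "bob",   "files": ["Patient.java"]},
    {"author": "carol", "files": ["Visit.java", "Report.java"]},
]
issues = [  # hypothetical issue-tracker export with the two proposed annotations
    {"id": 101, "originator": "dave", "approver": "alice"},
    {"id": 102, "originator": "bob",  "approver": "carol"},
]

vcs_contributors = {c["author"] for c in commits}
annotated = {i["originator"] for i in issues} | {i["approver"] for i in issues}
print("contributors missed by version control logs alone:",
      sorted(annotated - vcs_contributors))

# Developer network: link two developers when they changed the same file.
changed_by = {}
for c in commits:
    for f in c["files"]:
        changed_by.setdefault(f, set()).add(c["author"])
G = nx.Graph()
G.add_nodes_from(vcs_contributors | annotated)
for devs in changed_by.values():
    G.add_edges_from(combinations(sorted(devs), 2))

centrality = nx.betweenness_centrality(G)
for issue in issues:
    print("approver", issue["approver"], "betweenness:",
          round(centrality[issue["approver"]], 2))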
 
Improving developer activity metrics with issue tracking annotations Andrew Meneely, Mackenzie Corcoran, & Laurie Williams Proceedings of the 2010 ICSE Workshop on Emerging Trends in Software Metrics pp. 75–80 2010
Understanding and measuring how groups of developers collaborate on software projects can provide valuable insight into software quality and the software development process. Current practices of measuring developer collaboration (e.g. with social network analysis) usually employ metrics based on version control change log data to determine who is working on which part of the system. Version control change logs, however, do not tell the whole story. Information about the collaborative problem-solving process is also documented in the issue tracking systems that record solutions to failures, feature requests, or other development tasks. To enrich the data gained from version control change logs, we propose two annotations to be used in issue tracking systems: solution originator and solution approver. We examined the online discussions of 602 issues from the OpenMRS healthcare web application, annotating which developers were the originators of the solution to the issue, or were the approvers of the solution. We used these annotations to augment the version control change logs and found 47 more contributors to the OpenMRS project than the original 40 found in the version control change logs. Applying social network analysis to the data, we found that central developers in a developer network have a high likelihood of being approvers. These results indicate that our two issue tracking annotations identify project collaborators that version control change logs miss. However, in the absence of our annotations, developer network centrality can be used as an estimate of the project's solution approvers. This improvement in developer activity metrics provides a valuable connection between what we can measure in the project development artifacts and the team's problem-solving process.
 @article{MeneelyWETSOM2010,
  author = {Meneely, Andrew and Corcoran, Mackenzie and Williams, Laurie},
  title = {Improving developer activity metrics with issue tracking annotations},
  booktitle = {Proceedings of the 2010 ICSE Workshop on Emerging Trends in Software Metrics},
  series = {WETSoM '10},
  year = {2010},
  isbn = {978-1-60558-976-3},
  location = {Cape Town, South Africa},
  pages = {75--80},
  numpages = {6},
  url = {http://doi.acm.org/10.1145/1809223.1809234},
  doi = {10.1145/1809223.1809234},
  acmid = {1809234},
  publisher = {ACM},
  address = {New York, NY, USA},
  keywords = {collaboration, developer, metric, network analysis},
  abstract = {Understanding and measuring how groups of developers collaborate on software projects can provide valuable insight into software quality and the software development process. Current practices of measuring developer collaboration (e.g. with social network analysis) usually employ metrics based on version control change log data to determine who is working on which part of the system. Version control change logs, however, do not tell the whole story. Information about the collaborative problem-solving process is also documented in the issue tracking systems that record solutions to failures, feature requests, or other development tasks. To enrich the data gained from version control change logs, we propose two annotations to be used in issue tracking systems: solution originator and solution approver. We examined the online discussions of 602 issues from the OpenMRS healthcare web application, annotating which developers were the originators of the solution to the issue, or were the approvers of the solution. We used these annotations to augment the version control change logs and found 47 more contributors to the OpenMRS project than the original 40 found in the version control change logs. Applying social network analysis to the data, we found that central developers in a developer network have a high likelihood of being approvers. These results indicate that using our two issue tracking annotations identify project collaborators that version control change logs miss. However, in the absence of our annotations, developer network centrality can be used as an estimate of the project's solution approvers. This improvement in developer activity metrics provides a valuable connection between what we can measure in the project development artifacts and the team's problem-solving process.}
}
 
Strengthening the empirical analysis of the relationship between Linus’ Law and software security Andrew Meneely & Laurie Williams Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement pp. 9:1–9:10 2010
Open source software is often considered to be secure because large developer communities can be leveraged to find and fix security vulnerabilities. Eric Raymond states Linus' Law as "many eyes make all bugs shallow", reasoning that a diverse set of perspectives improves the quality of a software product. However, at what point does the multitude of developers become "too many cooks in the kitchen", causing the system's security to suffer as a result? In a previous study, we quantified Linus' Law and "too many cooks in the kitchen" with developer activity metrics and found a statistical association between these metrics and security vulnerabilities in the Linux kernel. In the replication study reported in this paper, we performed our analysis on two additional projects: the PHP programming language and the Wireshark network protocol analyzer. We also updated our Linux kernel case study with 18 additional months of newly-discovered vulnerabilities. In all three case studies, files changed by six developers or more were at least four times more likely to have a vulnerability than files changed by fewer than six developers. Furthermore, we found that our predictive models improved on average when combining data from multiple projects, indicating that models can be transferred from one project to another.
 @article{MeneelyESEM2010,
  author = {Meneely, Andrew and Williams, Laurie},
  title = {Strengthening the empirical analysis of the relationship between Linus' Law and software security},
  booktitle = {Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement},
  series = {ESEM '10},
  year = {2010},
  isbn = {978-1-4503-0039-1},
  location = {Bolzano-Bozen, Italy},
  pages = {9:1--9:10},
  articleno = {9},
  numpages = {10},
  url = {http://doi.acm.org/10.1145/1852786.1852798},
  doi = {10.1145/1852786.1852798},
  acmid = {1852798},
  publisher = {ACM},
  address = {New York, NY, USA},
  keywords = {contribution network, developer network, metric, vulnerability},
  abstract = { Open source software is often considered to be secure because large developer communities can be leveraged to find and fix security vulnerabilities. Eric Raymond states Linus' Law as "many eyes make all bugs shallow", reasoning that a diverse set of perspectives improves the quality of a software product. However, at what point does the multitude of developers become "too many cooks in the kitchen", causing the system's security to suffer as a result? In a previous study, we quantified Linus' Law and "too many cooks in the kitchen" with developer activity metrics and found a statistical association between these metrics and security vulnerabilities in the Linux kernel. In the replication study reported in this paper, we performed our analysis on two additional projects: the PHP programming language and the Wireshark network protocol analyzer. We also updated our Linux kernel case study with 18 additional months of newly-discovered vulnerabilities. In all three case studies, files changed by six developers or more were at least four times more likely to have a vulnerability than files changed by fewer than six developers. Furthermore, we found that our predictive models improved on average when combining data from multiple projects, indicating that models can be transferred from one project to another.}
}
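 
For readers curious about the shape of the headline result, the comparison is a relative risk between two groups of files: those changed by six or more developers and those changed by fewer. Below is a minimal Python sketch of that computation; the file names, developer counts, and vulnerability labels are invented placeholders, not data from the paper.
# Sketch of the relative-risk comparison behind "files changed by six developers
# or more were at least four times more likely to have a vulnerability".
# Each record: (file, distinct developers who changed it, vulnerability found?).
# All values below are invented for illustration.
files = [
    ("net/ipv4/tcp.c",  9, True),
    ("net/core/dev.c",  7, False),
    ("fs/ext3/inode.c", 6, True),
    ("lib/string.c",    2, False),
    ("kernel/sched.c",  3, True),
    ("mm/slab.c",       1, False),
]
THRESHOLD = 6  # "six developers or more"

def vuln_rate(records):
    # Fraction of files in this group with a known vulnerability.
    return sum(1 for _, _, vuln in records if vuln) / len(records)

many = [r for r in files if r[1] >= THRESHOLD]
few = [r for r in files if r[1] < THRESHOLD]
relative_risk = vuln_rate(many) / vuln_rate(few)
print("rate (>= 6 devs):", round(vuln_rate(many), 2))
print("rate (<  6 devs):", round(vuln_rate(few), 2))
print("relative risk:", round(relative_risk, 1))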
 
Challenges for protecting the privacy of health information: required certification can leave common vulnerabilities undetected Ben Smith, Andrew Austin, Matt Brown, Jason T King, Jerrod Lankford, Andrew Meneely, & Laurie Williams Proceedings of the second annual workshop on Security and privacy in medical and home-care systems pp. 1–12 2010
The use of electronic health record (EHR) systems by medical professionals enables the electronic exchange of patient data, yielding cost and quality of care benefits. The United States American Recovery and Reinvestment Act (ARRA) of 2009 provides up to $34 billion for meaningful use of certified EHR systems. But, will these certified EHR systems provide the infrastructure for secure patient data exchange? As a window into the ability of current and emerging certification criteria to expose security vulnerabilities, we performed exploratory security analysis on a proprietary and an open source EHR. We were able to exploit a range of common code-level and design-level vulnerabilities. These common vulnerabilities would have remained undetected by the 2011 security certification test scripts from the Certification Commission for Health Information Technology, the most widely used certification process for EHR systems. The consequences of these exploits included, but were not limited to: exposing all users' login information, the ability of any user to view or edit health records for any patient, and creating a denial of service for all users. Based upon our results, we suggest that an enhanced set of security test scripts be used as entry criteria to the EHR certification process. Before certification bodies spend the time to certify that an EHR application is functionally complete, they should have confidence that the software system meets a basic level of security competence.
 @article{SmithSPIMACS2010,
  address = {New York, {NY}, {USA}},
  series = {{SPIMACS} '10},
  title = {Challenges for protecting the privacy of health information: required certification can leave common vulnerabilities undetected},
  isbn = {978-1-4503-0094-0},
  shorttitle = {Challenges for protecting the privacy of health information},
  doi = {10.1145/1866914.1866916},
  abstract = {The use of electronic health record ({EHR)} systems by medical professionals enables the electronic exchange of patient data, yielding cost and quality of care benefits. The United States American Recovery and Reinvestment Act ({ARRA)} of 2009 provides up to \$34 billion for meaningful use of certified {EHR} systems. But, will these certified {EHR} systems provide the infrastructure for secure patient data exchange? As a window into the ability of current and emerging certification criteria to expose security vulnerabilities, we performed exploratory security analysis on a proprietary and an open source {EHR.} We were able to exploit a range of common code-level and design-level vulnerabilities. These common vulnerabilities would have remained undetected by the 2011 security certification test scripts from the Certification Commission for Health Information Technology, the most widely used certification process for {EHR} systems. The consequences of these exploits included, but were not limited to: exposing all users' login information, the ability of any user to view or edit health records for any patient, and creating a denial of service for all users. Based upon our results, we suggest that an enhanced set of security test scripts be used as entry criteria to the {EHR} certification process. Before certification bodies spend the time to certify that an {EHR} application is functionally complete, they should have confidence that the software system meets a basic level of security competence.},
  booktitle = {Proceedings of the second annual workshop on Security and privacy in medical and home-care systems},
  publisher = {{ACM}},
  author = {Smith, Ben and Austin, Andrew and Brown, Matt and King, Jason T and Lankford, Jerrod and Meneely, Andrew and Williams, Laurie},
  year = {2010},
  note = {{ACM} {ID:} 1866916},
  keywords = {attack, cchit, design},
  pages = {1--12}
}
 
Protection Poker: The New Software Security "Game" Laurie Williams, Andrew Meneely, & Grant Shipley IEEE Security and Privacy vol. 8 pp. 14–20 May 2010
Tracking organizations such as the US CERT show a continuing rise in security vulnerabilities in software. But not all discovered vulnerabilities are equal: some could cause much more damage to organizations and individuals than others. In the inevitable absence of infinite resources, software development teams must prioritize security fortification efforts to prevent the most damaging attacks. Protection Poker is a collaborative means of guiding this prioritization. A case study of a Red Hat IT software maintenance team demonstrates Protection Poker's potential for improving software security practices and team software security knowledge.
 @article{MeneelyIEEEPS2010,
  title = {Protection Poker: The New Software Security {"Game"}},
  volume = {8},
  issn = {1540-7993},
  shorttitle = {Protection Poker},
  url = {http://dx.doi.org/10.1109/MSP.2010.58},
  doi = {10.1109/MSP.2010.58},
  abstract = {Tracking organizations such as the {US} {CERT} show a continuing rise in security vulnerabilities in software. But not all discovered vulnerabilities are equal-some could cause much more damage to organizations and individuals than others. In the inevitable absence of infinite resources, software development teams must prioritize security fortification efforts to prevent the most damaging attacks. Protection Poker is a collaborative means of guiding this prioritization. A case study of a Red Hat {IT} software maintenance team demonstrates Protection Poker's potential for improving software security practices and team software security knowledge.},
  urldate = {2011-04-07},
  journal = {{IEEE} Security and Privacy},
  author = {Williams, Laurie and Meneely, Andrew and Shipley, Grant},
  month = may,
  year = {2010},
  note = {{ACM} {ID:} 1830041},
  keywords = {protection mechanisms, management, measurement, documentation, design, security, verification, risk assessment, risk estimation, delphi estimation, wideband delphi estimation},
  pages = {14--20},
  month_numeric = {5}
}
 
Secure open source collaboration: an empirical study of Linus' law Andrew Meneely & Laurie Williams Proceedings of the 16th ACM conference on Computer and communications security pp. 453–462 2009
Open source software is often considered to be secure. One factor in this confidence in the security of open source software lies in leveraging large developer communities to find vulnerabilities in the code. Eric Raymond declares Linus' Law "Given enough eyeballs, all bugs are shallow." Does Linus' Law hold up ad infinitum? Or, can the multitude of developers become "too many cooks in the kitchen", causing the system's security to suffer as a result? In this study, we examine the security of an open source project in the context of developer collaboration. By analyzing version control logs, we quantified notions of Linus' Law as well as the "too many cooks in the kitchen" viewpoint into developer activity metrics. We performed an empirical case study by examining correlations between the known security vulnerabilities in the open source Red Hat Enterprise Linux 4 kernel and developer activity metrics. Files developed by otherwise-independent developer groups were more likely to have a vulnerability, supporting Linus' Law. However, files with changes from nine or more developers were 16 times more likely to have a vulnerability than files changed by fewer than nine developers, indicating that many developers changing code may have a detrimental effect on the system's security.
 @article{MeneelyCCS2009,
  author = {Meneely, Andrew and Williams, Laurie},
  title = {Secure open source collaboration: an empirical study of Linus' law},
  booktitle = {Proceedings of the 16th ACM conference on Computer and communications security},
  series = {CCS '09},
  year = {2009},
  isbn = {978-1-60558-894-0},
  location = {Chicago, Illinois, USA},
  pages = {453--462},
  numpages = {10},
  url = {http://doi.acm.org/10.1145/1653662.1653717},
  doi = {10.1145/1653662.1653717},
  acmid = {1653717},
  publisher = {ACM},
  address = {New York, NY, USA},
  keywords = {contribution network, developer network, linus' law, metric, vulnerability},
  abstract = { Open source software is often considered to be secure. One factor in this confidence in the security of open source software lies in leveraging large developer communities to find vulnerabilities in the code. Eric Raymond declares Linus' Law "Given enough eyeballs, all bugs are shallow." Does Linus' Law hold up ad infinitum? Or, can the multitude of developers become "too many cooks in the kitchen", causing the system's security to suffer as a result? In this study, we examine the security of an open source project in the context of developer collaboration. By analyzing version control logs, we quantified notions of Linus' Law as well as the "too many cooks in the kitchen" viewpoint into developer activity metrics. We performed an empirical case study by examining correlations between the known security vulnerabilities in the open source Red Hat Enterprise Linux 4 kernel and developer activity metrics. Files developed by otherwise-independent developer groups were more likely to have a vulnerability, supporting Linus' Law. However, files with changes from nine or more developers were 16 times more likely to have a vulnerability than files changed by fewer than nine developers, indicating that many developers changing code may have a detrimental effect on the system's security. }
}
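 
One way to picture the "otherwise-independent developer groups" measure from this study: among the developers who changed a given file, count how many separate clusters they form in the project-wide developer network. The Python sketch below implements that reading on an invented network using networkx; it is an illustrative interpretation, not the paper's exact metric definition or data.
# Sketch of a "developer groups per file" measure: among a file's contributors,
# count how many distinct clusters of the project-wide developer network they
# come from. The network and contributor list are invented for illustration.
import networkx as nx

G = nx.Graph()  # edge = two developers changed some file together
G.add_edges_from([
    ("alice", "bob"), ("bob", "carol"),   # one collaboration cluster
    ("dave", "erin"),                     # a second, unrelated cluster
])
G.add_node("frank")                       # a contributor with no collaborators

def developer_groups(graph, contributors):
    # Count how many connected components of the developer network
    # the file's contributors fall into.
    component_of = {}
    for i, component in enumerate(nx.connected_components(graph)):
        for dev in component:
            component_of[dev] = i
    return len({component_of[d] for d in contributors})

file_contributors = ["alice", "carol", "dave", "frank"]
print(developer_groups(G, file_contributors))  # 3 otherwise-independent groups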
 
On preparing students for distributed software development with a synchronous, collaborative development platform Andrew Meneely & Laurie Williams Proceedings of the 40th ACM technical symposium on Computer science education pp. 529–533 2009
Working remotely is becoming the norm for both professionals and students alike. Software development has become a global industry due to outsourcing, teleworking, flex time, and companies' desire to use the best and/or most economical talent regardless of where that talent is located. Professionals are not alone because students usually work from home despite having sufficient resources on campus. In this paper we share our experiences from using Jazz, a synchronous, collaborative development platform, with our inevitably distributed software engineering students. Eleven students optionally used the tool while working on a five-week team project. Students primarily used the version control, chat, and work item features in Jazz. We collected their reactions in retrospective essays and found that all Jazz students supported using Jazz in future semesters of the course. We also examined grade differences and found that the students who used Jazz were more successful than those who did not use Jazz.
 @article{MeneelySIGCSE2009,
  address = {Chattanooga, {TN}, {USA}},
  title = {On preparing students for distributed software development with a synchronous, collaborative development platform},
  isbn = {978-1-60558-183-5},
  url = {http://portal.acm.org/citation.cfm?id=1508865.1509047},
  doi = {10.1145/1508865.1509047},
  abstract = {Working remotely is becoming the norm for both professionals and students alike. Software development has become a global industry due to outsourcing, teleworking, flex time, and companies' desire to use the best and/or most economical talent regardless of where that talent is located. Professionals are not alone because students usually work from home despite having sufficient resources on campus. In this paper we share our experiences from using Jazz, a synchronous, collaborative development platform, with our inevitably distributed software engineering students. Eleven students optionally used the tool while working on a five-week team project. Students primarily used the version control, chat, and work item features in Jazz. We collected their reactions in retrospective essays and found that all Jazz students supported using Jazz in future semesters of the course. We also examined grade differences and found that the students who used Jazz were more successful than those who did not use Jazz.},
  urldate = {2010-09-07},
  booktitle = {Proceedings of the 40th {ACM} technical symposium on Computer science education},
  publisher = {{ACM}},
  author = {Meneely, Andrew and Williams, Laurie},
  year = {2009},
  keywords = {collaboration, distributed development, software engineering},
  pages = {529--533}
}
 
ROSE: a repository of education-friendly open-source projects Andrew Meneely, Laurie Williams, & Edward F. Gehringer Proceedings of the 13th annual conference on Innovation and technology in computer science education pp. 7–11 2008
Open-source project artifacts can be used to inject realism into software engineering courses or lessons on open-source software development. However, the use of open-source projects presents challenges for both educators and for students. Educators must search for projects that meet the constraints of their classes, and often must negotiate the scope and terms of the project with project managers. For students, many available open-source projects have a steep learning curve that inhibits them from making significant contributions to the project and benefiting from a "realistic" experience. To alleviate these problems and to encourage cross-institution collaboration, we have created the Repository for Open Software Education (ROSE) and have contributed three open-source projects intended for an undergraduate computer science or software engineering course. The projects in ROSE are education-friendly in terms of a manageable size and scope, and are intended to be evolved over many semesters. All projects have a set of artifacts covering all aspects of the development process, from requirements, design, code, and test. We invite other educators to contribute to ROSE and to use projects found on ROSE in their own courses.
 @article{MeneelyITICSE2008,
  address = {Madrid, Spain},
  title = {{ROSE:} a repository of education-friendly open-source projects},
  isbn = {978-1-60558-078-4},
  shorttitle = {{ROSE}},
  url = {http://portal.acm.org/citation.cfm?id=1384271.1384276},
  doi = {10.1145/1384271.1384276},
  abstract = {Open-source project artifacts can be used to inject realism into software engineering courses or lessons on open-source software development. However, the use of open-source projects presents challenges for both educators and for students. Educators must search for projects that meet the constraints of their classes, and often must negotiate the scope and terms of the project with project managers. For students, many available open-source projects have a steep learning curve that inhibits them from making significant contributions to the project and benefiting from a "realistic" experience. To alleviate these problems and to encourage cross-institution collaboration, we have created the Repository for Open Software Education ({ROSE)} and have contributed three open-source projects intended for an undergraduate computer science or software engineering course. The projects in {ROSE} are education-friendly in terms of a manageable size and scope, and are intended to be evolved over many semesters. All projects have a set of artifacts covering all aspects of the development process, from requirements, design, code, and test. We invite other educators to contribute to {ROSE} and to use projects found on {ROSE} in their own courses.},
  urldate = {2010-09-07},
  booktitle = {Proceedings of the 13th annual conference on Innovation and technology in computer science education},
  publisher = {{ACM}},
  author = {Meneely, Andrew and Williams, Laurie and Gehringer, Edward F.},
  year = {2008},
  keywords = {open-source software repository, software engineering curriculum},
  pages = {7--11}
}
 
Protection Poker: Structuring Software Security Risk Assessment and Knowledge Transfer Laurie Williams, Michael Gegick, & Andrew Meneely Proceedings of the 1st International Symposium on Engineering Secure Software and Systems pp. 122–134 2009
Discovery of security vulnerabilities is on the rise. As a result, software development teams must place a higher priority on preventing the injection of vulnerabilities in software as it is developed. Because the focus on software security has increased only recently, software development teams often do not have expertise in techniques for identifying security risk, understanding the impact of a vulnerability, or knowing the best mitigation strategy. We propose the Protection Poker activity as a collaborative and informal form of misuse case development and threat modeling that plays off the diversity of knowledge and perspective of the participants. An excellent outcome of Protection Poker is that security knowledge is passed around the team. Students in an advanced undergraduate software engineering course at North Carolina State University participated in a Protection Poker session conducted as a laboratory exercise. Students actively shared misuse cases, threat models, and their limited software security expertise as they discussed vulnerabilities in their course project. We observed students relating vulnerabilities to the business impacts of the system. Protection Poker led to a more effective software security learning experience than in prior semesters. A pilot of the use of Protection Poker with an industrial partner began in October 2008. The first security discussion structured via Protection Poker caused two requirements to be revised for added security fortification; led to the immediate identification of one vulnerability in the system; initiated a meeting on the prioritization of security defects; and instigated a call for an education session on preventing cross site scripting vulnerabilities.
 @article{MeneelyESSoS2009,
  address = {Berlin, Heidelberg},
  series = {{ESSoS} '09},
  title = {Protection Poker: Structuring Software Security Risk Assessment and Knowledge Transfer},
  isbn = {978-3-642-00198-7},
  shorttitle = {Protection Poker},
  url = {http://dx.doi.org/10.1007/978-3-642-00199-4_11},
  doi = {10.1007/978-3-642-00199-4_11},
  abstract = {Discovery of security vulnerabilities is on the rise. As a result, software development teams must place a higher priority on preventing the injection of vulnerabilities in software as it is developed. Because the focus on software security has increased only recently, software development teams often do not have expertise in techniques for identifying security risk, understanding the impact of a vulnerability, or knowing the best mitigation strategy. We propose the Protection Poker activity as a collaborative and informal form of misuse case development and threat modeling that plays off the diversity of knowledge and perspective of the participants. An excellent outcome of Protection Poker is that security knowledge passed around the team. Students in an advanced undergraduate software engineering course at North Carolina State University participated in a Protection Poker session conducted as a laboratory exercise. Students actively shared misuse cases, threat models, and their limited software security expertise as they discussed vulnerabilities in their course project. We observed students relating vulnerabilities to the business impacts of the system. Protection Poker lead to a more effective software security learning experience than in prior semesters. A pilot of the use of Protection Poker with an industrial partner began in October 2008. The first security discussion structured via Protection Poker caused two requirements to be revised for added security fortification; led to the immediate identification of one vulnerability in the system; initiated a meeting on the prioritization of security defects; and instigated a call for an education session on preventing cross site scripting vulnerabilities.},
  urldate = {2011-04-07},
  booktitle = {Proceedings of the 1st International Symposium on Engineering Secure Software and Systems},
  publisher = {Springer-Verlag},
  author = {Williams, Laurie and Gegick, Michael and Meneely, Andrew},
  year = {2009},
  note = {{ACM} {ID:} 1532745},
  keywords = {planning poker, protection poker, software security, wideband delphi},
  pages = {122--134}
}
 
Jazz Sangam: A real-time tool for distributed pair programming on a team development platform John Vijay Sena Devide, Andrew Meneely, Chih-Wei Ho, Laurie Williams, & Michael Devetsikiotis In Proc. of IRCSE 2008
Pair programming has proven to be a useful technique for developing high quality code while sharing knowledge throughout a team. Rapid global dispersion of software development teams, however, makes co-located pair programming a challenge, motivating the need for development tools tailored specifically for distributed pair programming. Previously, the Sangam Eclipse plug-in was developed to support distributed pair programming. More recently, the Jazz collaborative software development platform was built to support team communication and the sharing of life-cycle resources and to integrate a variety of disparate tools used by team members. We have ported Sangam to the Jazz platform to enable teams to pair program within their integrated team environment. In this paper, we describe Jazz Sangam, highlight the choices that lead to Sangam's current design, and discuss how Jazz Sangam can improve the distributed pair programming experience.
 @article{MeneelyIRECOSE2008,
  author = {Devide, John Vijay Sena and Meneely, Andrew and Ho, Chih-Wei and Williams, Laurie and Devetsikiotis, Michael},
  title = {Jazz Sangam: A real-time tool for distributed pair programming on a team development platform},
  booktitle = {In Proc. of IRCSE},
  year = {2008},
  abstract = {Pair programming has proven to be a useful technique for developing high quality code while sharing knowledge throughout a team. Rapid global dispersion of software development teams, however, makes co-located pair programming a challenge, motivating the need for development tools tailored specifically for distributed pair programming. Previously, the Sangam Eclipse plug-in was developed to support distributed pair programming. More recently, the Jazz collaborative software development platform was built to support team communication and the sharing of life-cycle resources and to integrate a variety of disparate tools used by team members. We have ported Sangam to the Jazz platform to enable teams to pair program within their integrated team environment. In this paper, we describe Jazz Sangam, highlight the choices that lead to Sangam's current design, and discuss how Jazz Sangam can improve the distributed pair programming experience.}
}
 
Predicting failures with developer networks and social network analysis Andrew Meneely, Laurie Williams, Will Snipes, & Jason Osborne Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of software engineering pp. 13–23 2008
Software fails and fixing it is expensive. Research in failure prediction has been highly successful at modeling software failures. Few models, however, consider the key cause of failures in software: people. Understanding the structure of developer collaboration could explain a lot about the reliability of the final product. We examine this collaboration structure with the developer network derived from code churn information that can predict failures at the file level. We conducted a case study involving a mature Nortel networking product of over three million lines of code. Failure prediction models were developed using test and post-release failure data from two releases, then validated against a subsequent release. One model's prioritization revealed 58% of the failures in 20% of the files compared with the optimal prioritization that would have found 61% in 20% of the files, indicating that a significant correlation exists between file-based developer network metrics and failures.
 @article{MeneelyFSE2008,
  author = {Meneely, Andrew and Williams, Laurie and Snipes, Will and Osborne, Jason},
  title = {Predicting failures with developer networks and social network analysis},
  booktitle = {Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of software engineering},
  series = {SIGSOFT '08/FSE-16},
  year = {2008},
  isbn = {978-1-59593-995-1},
  location = {Atlanta, Georgia},
  pages = {13--23},
  numpages = {11},
  url = {http://doi.acm.org/10.1145/1453101.1453106},
  doi = {10.1145/1453101.1453106},
  acmid = {1453106},
  publisher = {ACM},
  address = {New York, NY, USA},
  keywords = {developer network, failure prediction, logistic regression, negative binomial regression, social network analysis},
  abstract = {Software fails and fixing it is expensive. Research in failure prediction has been highly successful at modeling software failures. Few models, however, consider the key cause of failures in software: people. Understanding the structure of developer collaboration could explain a lot about the reliability of the final product. We examine this collaboration structure with the developer network derived from code churn information that can predict failures at the file level. We conducted a case study involving a mature Nortel networking product of over three million lines of code. Failure prediction models were developed using test and post-release failure data from two releases, then validated against a subsequent release. One model's prioritization revealed 58% of the failures in 20% of the files compared with the optimal prioritization that would have found 61% in 20% of the files, indicating that a significant correlation exists between file-based developer network metrics and failures.}
}
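 
The evaluation style in this paper (fit a model on earlier releases, validate on a later one, then ask how many failures land in the top 20% of files ranked by predicted risk) is easy to outline in code. The sketch below uses scikit-learn logistic regression on synthetic per-file developer-network features; the feature names, the data, and the library choice are illustrative assumptions, not the Nortel data or the paper's exact models.
# Sketch of the evaluation setup: train a logistic regression on per-file
# developer-network metrics from earlier releases, rank the files of a later
# release by predicted risk, and count failures caught in the top 20%.
# All data here is synthetic; real inputs would come from churn and failure logs.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def synthetic_release(n_files):
    # Per-file features: [distinct developers, a centrality-style score].
    X = np.column_stack([
        rng.integers(1, 15, size=n_files),
        rng.random(n_files),
    ])
    # Failure odds rise with both features (purely illustrative).
    p = 1.0 / (1.0 + np.exp(-(0.3 * X[:, 0] + 2.0 * X[:, 1] - 4.0)))
    y = (rng.random(n_files) < p).astype(int)
    return X, y

X_train, y_train = synthetic_release(400)  # stands in for two earlier releases
X_test, y_test = synthetic_release(200)    # stands in for the subsequent release

model = LogisticRegression().fit(X_train, y_train)
risk = model.predict_proba(X_test)[:, 1]
top20 = np.argsort(risk)[::-1][: len(risk) // 5]
caught = y_test[top20].sum() / max(y_test.sum(), 1)
print(f"failures found in the top 20% of files: {caught:.0%}")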
 
Fifteen compilers in fifteen days Jeremy D. Frens & Andrew Meneely Proceedings of the 37th SIGCSE technical symposium on Computer science education pp. 92–96 2006
Traditional approaches to semester-long projects in compiler courses force students to implement the early stages of a compiler in depth; since many students fall behind, they have little opportunity to implement the back end. Consequently, students have a deep knowledge of early material and no knowledge of later material. We propose an approach based on incremental development and test-driven development; this approach solves the emphasis problem, provides experience with useful tools, and allows such a course to be taught in three or four weeks.
 @article{MeneelySIGCSE2006,
  address = {Houston, Texas, {USA}},
  title = {Fifteen compilers in fifteen days},
  isbn = {1-59593-259-3},
  url = {http://portal.acm.org/citation.cfm?id=1121341.1121372},
  doi = {10.1145/1121341.1121372},
  abstract = {Traditional approaches to semester-long projects in compiler courses force students to implement the early stages of a compiler in depth; since many students fall behind, they have little opportunity to implement the back end. Consequently, students have a deep knowledge of early material and no knowledge of latter material. We propose an approach based on incremental development and test-driven development; this approach solves the emphasis problem, provides experience with useful tools, and allows for such a course to be taught in a three or four weeks.},
  urldate = {2010-09-07},
  booktitle = {Proceedings of the 37th {SIGCSE} technical symposium on Computer science education},
  publisher = {{ACM}},
  author = {Frens, Jeremy D. and Meneely, Andrew},
  year = {2006},
  keywords = {compiler course, incremental development, refactoring, test-driven development, unit testing},
  pages = {92--96}
}