Books

[Bird 2015]: Christian Bird, Tim Menzies and Thomas Zimmermann. The Art and Science of Analyzing Software Data. Morgan Kaufmann, 2015. RIT e-book.
[Carrington 2005]: Peter J. Carrington, John Scott, and Stanley Wasserman, eds. Models and methods in social network analysis. Vol. 28. Cambridge University Press, 2005.
[Manning 2008]: Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. Introduction to information retrieval. Cambridge University Press, 2008.
[Menzies 2016]: Tim Menzies, Laurie Williams, and Thomas Zimmermann. Perspectives on Data Science for Software Engineering. Morgan Kaufmann, 2016. RIT e-book.
[Tan 2006]: Pang-Ning Tan, Michael Steinbach, and Vipin Kumar. Introduction to Data Mining. Addison-Wesley Longman, Boston, 2006.

Research Papers

[Bhattacharya 2012]: Pamela Bhattacharya, Marios Iliofotou, Iulian Neamtiu, and Michalis Faloutsos. 2012. Graph-based analysis and prediction for software evolution. In Proceedings of the 34th International Conference on Software Engineering (ICSE '12). 419-429.
[Chen 2014]: Ning Chen, Jialiu Lin, Steven C. H. Hoi, Xiaokui Xiao, and Boshen Zhang. 2014. AR-miner: mining informative reviews for developers from mobile app marketplace. In Proceedings of the 36th International Conference on Software Engineering (ICSE 2014). 767-778.
[Ciurumelea 2017]: Adelina Ciurumelea, Andreas Schaufelbühl, Sebastiano Panichella and Harald C. Gall. 2017. Analyzing reviews and code of mobile apps for better release planning. IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER '17). 91-102.
[Gharehyazie 2015]: Mohammad Gharehyazie, Daryl Posnett, Bogdan Vasilescu, and Vladimir Filkov. 2015. Developer initiation and social interactions in OSS: A case study of the Apache Software Foundation. Empirical Software Engineering. 20(5):1318-1353.
[Han 2009]: Sangmok Han, David R. Wallace, and Robert C. Miller. 2009. Code Completion from Abbreviated Input. In Proceedings of the 2009 IEEE/ACM International Conference on Automated Software Engineering (ASE '09). 332-343.
[Hassan 2008]: Ahmed E. Hassan. 2008. The road ahead for Mining Software Repositories. In Proceedings of the 2008 Frontiers of Software Maintenance. 48-57.
[Jiang 2008]: Zhen Ming Jiang, Ahmed E. Hassan, Gilbert Hamann and Parminder Flora. 2008. Automatic identification of load testing problems. IEEE International Conference on Software Maintenance (ICSM 2008). 307-316.
[Joblin 2015]: Mitchell Joblin, Wolfgang Mauerer, Sven Apel, Janet Siegmund, and Dirk Riehle. 2015. From developer networks to verified communities: A fine-grained approach. In Proceedings of the 37th International Conference on Software Engineering (ICSE 2015). 563-573.
[Kalliamvakou 2014]: Eirini Kalliamvakou, Georgios Gousios, Kelly Blincoe, Leif Singer, Daniel M. German, and Daniela Damian. 2014. The promises and perils of mining GitHub. In Proceedings of the 11th Working Conference on Mining Software Repositories (MSR 2014). 92-101.
[Kalliamvakou 2016]: Eirini Kalliamvakou, Georgios Gousios, Kelly Blincoe, Leif Singer, Daniel M. German, and Daniela Damian. 2016. An in-depth study of the promises and perils of mining GitHub. Empirical Software Engineering, 21(5):2035-2071.
[Laurent 2007]: Paula Laurent, Jane Cleland-Huang, and Chuan Duan. 2007. Towards automated requirements triage. In Proceedings of the 15th IEEE International Requirements Engineering Conference (RE '07). 131-140.
[Lessmann 2008]: Stefan Lessmann, Bart Baesens, Christophe Mues, and Swantje Pietsch. 2008. Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings. IEEE Transactions on Software Engineering, 34(4):485-496.
[Lo 2009]: David Lo, Hong Cheng, Jiawei Han, Siau-Cheng Khoo, and Chengnian Sun. 2009. Classification of software behaviors for failure detection: a discriminative pattern mining approach. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '09). 557-566.
[Maalej 2015]: Walid Maalej and Hadeer Nabil. 2015. Bug report, feature request, or simply praise? On automatically classifying app reviews. In Proceedings of the IEEE 23rd International Requirements Engineering Conference (RE '15). 116-125.
[McCabe 1976]: Thomas J. McCabe. 1976. A complexity measure. IEEE Transactions on Software Engineering. 2(4):308-320.
[Meneely 2008]: Andrew Meneely, Laurie Williams, Will Snipes, and Jason Osborne. 2008. Predicting failures with developer networks and social network analysis. In Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of software engineering (FSE '08). 13-23.
[Mirarab 2007]: Siavash Mirarab, Alaa Hassouna, and Ladan Tahvildari. 2007. Using Bayesian Belief Networks to Predict Change Propagation in Software Systems. In Proceedings of the 15th IEEE International Conference on Program Comprehension (ICPC '07). 177-188.
[Pandita 2012]: Rahul Pandita, Xusheng Xiao, Hao Zhong, Tao Xie, Stephen Oney, and Amit Paradkar. 2012. Inferring method specifications from natural language API descriptions. In Proceedings of the 34th International Conference on Software Engineering (ICSE '12). 815-825.
[Pendharkar 2005]: Parag C. Pendharkar, Girish H. Subramanian, and James A. Rodger. 2005. A probabilistic model for predicting software development effort. IEEE Transactions on Software Engineering. 31(7):615-624.
[Petrić 2016]: Jean Petrić, David Bowes, Tracy Hall, Bruce Christianson, and Nathan Baddoo. 2016. Building an Ensemble for Software Defect Prediction Based on Diversity Selection. In Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). Article 46, 10 pages.
[Schröter 2006]: Adrian Schröter, Thomas Zimmermann, and Andreas Zeller. 2006. Predicting component failures at design time. In Proceedings of the 2006 ACM/IEEE International Symposium on Empirical Software Engineering (ISESE '06). 18-27.
[Stewart 2002]: B. Stewart. 2002. Predicting project delivery rates using the Naive-Bayes classifier. Journal of Software Maintenance and Evolution: Research and Practice. 14(3):161-179.
[Thummalapenta 2009]: Suresh Thummalapenta and Tao Xie. 2009. Mining exception-handling rules as sequence association rules. In Proceedings of the 31st International Conference on Software Engineering (ICSE '09). 496-506.
[Tufano 2015]: Michele Tufano, Fabio Palomba, Gabriele Bavota, Rocco Oliveto, Massimiliano Di Penta, Andrea De Lucia, and Denys Poshyvanyk. 2015. When and why your code starts to smell bad. In Proceedings of the 37th International Conference on Software Engineering - Volume 1 (ICSE '15). 403-414.
[Villarroel 2016]: Lorenzo Villarroel, Gabriele Bavota, Barbara Russo, Rocco Oliveto, and Massimiliano Di Penta. 2016. Release planning of mobile apps based on user reviews. In Proceedings of the 38th International Conference on Software Engineering (ICSE '16). 14-24.
[Zanetti 2013]: Marcelo Serrano Zanetti, Ingo Scholtes, Claudio Juan Tessone, and Frank Schweitzer. 2013. Categorizing bugs with social networks: A case study on four open source software communities. In Proceedings of the 2013 International Conference on Software Engineering (ICSE '13). 1032-1041.

Research Papers (Optional)

[Bowring 2004]: James F. Bowring, James M. Rehg, and Mary Jean Harrold. 2004. Active learning for automatic classification of software behavior. In Proceedings of the 2004 ACM SIGSOFT international symposium on Software testing and analysis (ISSTA '04). 195-205.
[Cheung 2008]: Leslie Cheung, Roshanak Roshandel, Nenad Medvidovic, and Leana Golubchik. 2008. Early prediction of software component reliability. In Proceedings of the 30th international conference on Software engineering (ICSE '08). 111-120.
[Rabiner 1990]: Lawrence R. Rabiner. 1990. A tutorial on hidden Markov models and selected applications in speech recognition. In Readings in speech recognition, Alex Waibel and Kai-Fu Lee (Eds.). Morgan Kaufmann Publishers Inc., San Francisco. 267-296.
[Shepperd 1988]: Martin Shepperd. 1988. A critique of cyclomatic complexity as a software metric. Software Engineering Journal. 3(2):30-36.
[Singh 2011]: Param Vir Singh, Yong Tan, and Nara Youn. 2011. A Hidden Markov Model of Developer Learning Dynamics in Open Source Software Projects. Information Systems Research. 22(4):790-807.
[Song 2018]: Liyan Song, Leandro L. Minku, and Xin Yao. 2018. A novel automated approach for software effort estimation based on data augmentation. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2018). 468-479.
[Turhan 2009]: Burak Turhan and Ayse Bener. 2009. Analysis of Naive Bayes' assumptions on software fault data: An empirical study. Data & Knowledge Engineering. 68(2):278-290.