Data Science Methods in Software Engineering

SWEN 789-01 (Graduate Special Topics)

Pradeep K. Murukannaiah

Email: pkmvse at rit-domain
Office hours: MW 2:00–3:00 PM, and by email
Office: Golisano 70-1521




Semester Project

This is the main project for this class, and it contributes 30% toward the course grade. In line with the theme of this course, the project should involve both data science and software engineering.

Proposal: March 25
Intermediate progress report: April 15
Final report: April 29

Implementation Project (Solo or Group of Two)

You will implement a set of data science methods to solve a specific SE problem. You will need a dataset to get started. You can choose a publicly available dataset or collect the data yourself. Data collection can be nontrivial, so if you choose to collect data yourself, please discuss it with me before you start. In the resources section below, I point to many publicly available datasets and specific problems you can address with those datasets.

Important: The effort involved must be nontrivial. You are not required to implement a novel method, although I encourage it. However, your work must involve significant effort in preprocessing the data, implementing an algorithm, or building a pipeline that combines multiple methods. It is not acceptable to take a publicly available dataset and simply run it through an off-the-shelf data science method.
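To make "a pipeline combining multiple methods" concrete, here is a minimal sketch in Python of what such a pipeline might look like. Everything in it is an assumption made for illustration: the issue titles and labels are made up, and the function names (tokenize, fit_idf, tfidf, fit_centroids, predict) are hypothetical. It chains a preprocessing step, a hand-written TF-IDF feature step, and a simple centroid-based classifier. It only shows the shape of a pipeline and is not a template for, or a measure of, the effort expected in a project.

```python
import math
import re
from collections import Counter

# Hypothetical toy data: made-up issue titles with made-up labels.
TRAIN = [
    ("App crashes with null pointer on startup", "bug"),
    ("Crash when saving file with long name", "bug"),
    ("Add dark mode to settings page", "feature"),
    ("Add support for exporting reports as PDF", "feature"),
]

STOP_WORDS = {"a", "as", "for", "on", "to", "when", "with", "the"}


def tokenize(text):
    """Step 1 (preprocessing): lowercase, keep letter runs, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def fit_idf(token_lists):
    """Step 2a (features): smoothed inverse document frequency per term."""
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}


def tfidf(tokens, idf):
    """Step 2b (features): L2-normalized TF-IDF vector stored as a sparse dict."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def fit_centroids(vectors, labels):
    """Step 3 (model): average the feature vectors of each class."""
    sums, sizes = {}, Counter(labels)
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, Counter())
        for t, v in vec.items():
            acc[t] += v
    return {label: {t: v / sizes[label] for t, v in acc.items()}
            for label, acc in sums.items()}


def predict(text, idf, centroids):
    """Pick the class whose centroid has the largest dot product with the text."""
    vec = tfidf(tokenize(text), idf)

    def score(label):
        return sum(v * centroids[label].get(t, 0.0) for t, v in vec.items())

    return max(centroids, key=score)


# Wire the three steps together on the toy data.
tokens = [tokenize(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = fit_idf(tokens)
centroids = fit_centroids([tfidf(t, idf) for t in tokens], labels)
print(predict("Crash on startup after update", idf, centroids))
```

Writing each stage as its own function keeps the stages independently testable and replaceable, which is where most of the real work in a pipeline tends to go: a project would substitute a real dataset, more careful preprocessing, and a proper evaluation.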

Software

Proposal

Treat your proposal as a working document for your final deliverable. Ideally, the proposal should include the following details.

  1. A working title (required)
  2. Team members (required)
  3. Problem description (required)
  4. Motivation
  5. Datasets to be used (required)
  6. Tools and techniques to be used
  7. Relevant prior literature
  8. Preliminary results

Deliverables

Resources