SWEN-440

Project 2: Services Architecture: Analysis and Selection

Creating an architecture is based on understanding the purpose of your application and the components that will be used in constructing the application. Along the way, there are choices you will make to construct the application. The collection of those choices will drive your architecture.

During this project, you will propose an architecture for a new integrated application that uses existing web services. You will be required to investigate and analyze the services so you can decide on your proposed architecture.

Format

This will be a team-based project. Teams will be assigned by the instructor. You are expected to collaborate much like modern teams do, using all the technology at your disposal.

The project involves analysis, reporting of results, technology implementation (coding), measurement, architecture proposals, and determination of tradeoffs.

Purpose

Forming a software architecture is not just writing lines of code.
It involves discovery and investigation, analysis (and measurement), proposal, and review. This project will provide experience in those areas.

Typical software programs are created by integrating multiple components.
Each component has its own quirks, and each choice in how to use the component has implications and tradeoffs.
Important analytic and design skills for software engineers include learning how to investigate and characterise/measure a component's behaviour, how the component fits in your larger application, and how to create an architecture that takes advantage of its capabilities AND addresses its shortfalls (no component is a perfect fit!).
You will explore these areas as you make your architecture proposal.
You will first create a plan to analyse the various aspects of the intended software solution.
You will then execute the plan, gather the data, and start understanding what the data tells you.
Finally, using all the information you have gathered, you will create an architecture proposal and a short presentation.

Background

You are a software design and development team. Your customer needs a software system to process applications - forms that are filled in by external users - which arrive as documents (think about applying for insurance, or a loan as an example). The documents are received as paper - in the mail - and are scanned in. Software is required to extract data from the scanned document. The extracted information is used to enter information in an ongoing list of the key data from the submitted applications.

There is an existing workflow already in place, and it is very manual. Using the new software system, you are expected to modernize the workflow - albeit with some restrictions on what you can change.

Your software team does not have to worry about how the paper documents are received and scanned; your software solution processes the documents after they have been converted to digital form. However, you need to understand the current workflow. You can review the workflow description document.

It is important that the applications be processed quickly - which means performance is an important consideration, but it is NOT the only consideration. How the forms processors (who are users of the final application) can get their work done efficiently will also be key. Which APIs you pick to use (or not use) will govern the complexity of your final application. You will need to think about all these things (and more).

Some software services have been put in place as foundational components for this application (see below), but an over-arching application has yet to be designed, i.e. the overall application that integrates the services to create the “Forms Processing” Application. Designing it is your task.


The services (foundational components) you will use can be found at the website listed below:

The document below provides some further information on the services available:
Processing Services Description

Deliverables
All documentation must be professional quality, with clean and clear sections, paragraphs and formatting. You will be evaluated on both content and quality of the document. Professional language, correct grammar and overall writing style are expected in all deliverables.
Part 2 requires writing test code to perform analysis. All of your group's code must be in RIT gitlab (https://kgcoe-git.rit.edu). Refer to the gitlab instructions.

    Part 1: Document how you will approach investigating the existing services and determining your path forward for the architecture.
  • This should be a 3-5 page proposal. This plan will be the guide toward your activities for Deliverable 2, and will affect the quality of your final architecture proposal in Deliverable 3.
  • The plan should include at least the following:
    • An overview of the assumptions around how users will interact with/ use the final application
      • Keep this high level for now. You will expand it later.
    • A description (summary), listing and diagram of the main services and the available APIs/ interfaces, and your understanding of the functions.
    • A high-level explanation of your evaluation plan for the services (tests to be run, rationale etc.)
    • A rough project plan for doing the work (who does what activity, estimated task time).
    • What data you will collect and why, how you will collect it, and how you will use it to determine the architecture and design.
        NOTE: One area you should NOT spend much time on is the QUALITY of the OCR output, i.e. the accuracy of the image->text conversion.
        This is an 'academic' version of OCR, so the quality isn't that great.
    Note that there are multiple APIs available for similar functions. Make sure you investigate each of them!

    This is a plan, so conclusions are not expected in this deliverable. Use the template provided as an outline for your document. You may add sections if needed.

    Templates: Template for Analysis Plan

    Part 2: You will execute your analysis plan, collect the data, and document the results. Your document must be well organized and professional quality. Your document will include at least the following:
  • Sequence diagram(s), showing the system components, APIs, data exchanges. Since multiple interfaces/ options are available, you may have more than one sequence diagram
    • Use this as the baseline for referencing the measurements you took to analyze the services
  • An explanation of the results of your data gathering and analysis, consistent with the above diagrams
    Summarize the data and show any conclusions you draw from observing the results.
    • If there are differences in performance/ data based on which API is used, you should provide your thoughts on why there are differences.
  • A reflection on your project plan results compared to the plan you put together in Part 1.

    e.g. Plan vs. actual on time spent; completion dates

  • All assumptions, caveats and conclusions must be clearly documented in your results.
  • If, as you run your tests, you find you need to deviate from the original plan, write that information in this document as well, including why you decided to change.
    NOTES:
  • To complete this part, you WILL need to write software to exercise the services, gather the data, and organize, summarize and display the results in a coherent form. Usage of tables, graphs and formulas may help in organizing and reporting the data.
    Measurements should have sufficient iterations of data (i.e. multiple runs). Your goal is to have enough data to be able to show useful results.
    You do not need hundreds of data points, but you should have enough to calculate simple metrics. (In fact, you should avoid overloading the test server, so keep your tests simple and small.)
    You will not be graded on the code quality of the software you write to generate the data; however, you are expected to use good practices so the services are exercised correctly (otherwise your data will not be valid). The sample test app provided will not be sufficient to gather all the data; it is there to help you get started.
  • Sign up for time on the server to run your tests! See instructor for the signup sheet.
    Please adhere to the schedule for testing. Be respectful of others.
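
    The measurement notes above (multiple runs, simple metrics, small tests) could be approached along the lines of the sketch below. It is only an illustration: `time_calls`, `summarize` and `fake_service_call` are hypothetical names, and `fake_service_call` is a stand-in that sleeps briefly instead of contacting any real service. You would replace it with a function that issues one request to the API you are measuring.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_calls(call: Callable[[], object], runs: int = 10,
               pause_s: float = 0.0) -> List[float]:
    """Invoke `call` `runs` times and return the elapsed seconds of each run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
        if pause_s:
            time.sleep(pause_s)  # optional gap so the shared server is not hammered
    return samples


def summarize(samples: List[float]) -> Dict[str, float]:
    """Reduce raw timings to the simple metrics a results table needs."""
    return {
        "runs": len(samples),
        "mean_s": statistics.mean(samples),
        "median_s": statistics.median(samples),
        # Sample standard deviation needs at least two data points.
        "stdev_s": statistics.stdev(samples) if len(samples) > 1 else 0.0,
        "min_s": min(samples),
        "max_s": max(samples),
    }


def fake_service_call() -> None:
    """Hypothetical stand-in for one request to a service API (assumption)."""
    time.sleep(0.01)


if __name__ == "__main__":
    stats = summarize(time_calls(fake_service_call, runs=5))
    for name, value in stats.items():
        print(f"{name}: {value:.4f}")
```

    Running the same harness once per API, with the same inputs and the same number of runs, gives comparable rows for a results table. Reporting the median and the spread alongside the mean helps show whether a difference between APIs is larger than the run-to-run noise.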
    Part 3: You will create an Architecture document that includes (but is not restricted to) the items listed below. Note that in this deliverable, you will be making a choice on the architecture approach based on the data and analysis you performed in Deliverable 2. You will also create a short presentation to share with the class. Your proposal should include at least the following (feel free to add more to enhance your proposal):
  • An overview of the system - including its purpose and how the architecture fulfills the purpose
  • An updated workflow diagram and description (use the existing information, and update as appropriate)
  • A basic use case description. This should lay out the main flow for the user when going through a typical usage scenario. Describe how it is different from the current workflow, and highlight the improvements.
  • A summary of the technical constraints on the architecture.
  • A System diagram, showing the main components, existing and new
  • An updated sequence diagram, showing interfaces to be used (likely a subset of the total APIs available), data exchanged etc.
  • Any additional architectural views that help describe the architecture and the set of architectural decisions made
  • Architecture/ Design patterns and/ or tactics used in your proposal and how they are used to meet the requirements
  • Options and tradeoffs for deciding the architectural and design approach
    Include your rationale for your choices for the architecture
    • i.e. you must explain your conclusions based on the data, tradeoffs, and decision factors, and therefore your choice of APIs/ methods
  • A model to describe the performance of the system as a function of system load. Calculations showing the expected performance under different loading scenarios (i.e. light load, heavy load, average load). Explain the factors to be considered for each category. Create a model (a simple equation or calculation that shows the main variables and their effect on system performance as they change). Don’t make this too complicated!
    See below for examples of scenarios for low, medium, high load …
    • Low Load: You receive a few hundred applications each week.
    • Medium Load: You receive a few thousand applications each week.
    • High load: You receive tens of thousands of applications each week, or there is a steady stream of input. It never stops!!
  • Implications to stakeholders:
    • Users
      • Consider the user workflow model
        i.e. How would a user perform the tasks of taking the documents that are ready to be processed and running them through the application? What would you do in your architecture to make the users' work easier?
          A preferred way to describe this would be a use case
    • Developers (including recommendations/ cautions)
    • IT (Deployment)
    • Business stakeholders (are there any arch. decisions based on, or affected by financial impacts?)
      • Consider your choice(s) in the OCR engine. What are the performance/ behaviour differences between std. and pro.?
        What are the cost implications of that choice? How do you justify your choice?

  • NOTE: Your architecture must be fast, but also cost efficient (throwing more and more CPU and servers at scalability is not affordable and not a sufficient architectural approach). “Use autoscaling on AWS” is not an answer.
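
    One possible shape for the load model requested above is a utilization calculation: arrival rate times per-document service time, divided by the number of parallel workers. The sketch below is an illustration under stated assumptions, not a required approach. The function names, the 40-hour processing week, the 30-second service time, the 70% utilization target and the specific weekly volumes (300, 3,000 and 30,000, chosen to match "a few hundred", "a few thousand" and "tens of thousands") are all hypothetical. Substitute the service times you measured in Deliverable 2.

```python
import math


def utilization(apps_per_week: int, service_time_s: float, workers: int,
                work_hours_per_week: float = 40.0) -> float:
    """Fraction of available processing time consumed: rho = lambda * S / c."""
    arrival_rate = apps_per_week / (work_hours_per_week * 3600)  # applications per second
    return arrival_rate * service_time_s / workers


def min_workers(apps_per_week: int, service_time_s: float,
                target_utilization: float = 0.7,
                work_hours_per_week: float = 40.0) -> int:
    """Smallest worker count that keeps utilization at or below the target."""
    single = utilization(apps_per_week, service_time_s, 1, work_hours_per_week)
    return max(1, math.ceil(single / target_utilization))


if __name__ == "__main__":
    SERVICE_TIME_S = 30.0  # assumed per-document processing time, not a measured value
    scenarios = [("low", 300), ("medium", 3_000), ("high", 30_000)]  # assumed volumes
    for name, volume in scenarios:
        rho = utilization(volume, SERVICE_TIME_S, workers=1)
        needed = min_workers(volume, SERVICE_TIME_S)
        print(f"{name}: single-worker utilization {rho:.2f}, workers needed {needed}")
```

    A utilization above 1 means a single worker can never keep up and the backlog grows without bound. With these assumed numbers, the high-load scenario is the only one that needs more than one worker, which is where reducing the per-document service time (for example through the choice of API) matters more than adding hardware.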

  • Your presentation should be roughly 10 minutes. Slide show format, summarizing the project work and conclusions. Try to have all team members participate in the presentation in class.
Submission

Submit your code to gitlab and your written assignments and presentation to myCourses.

Grading
  • Part 1: 25 points
  • Part 2: 50 points
  • Part 3: 75 points
Also refer to the rubric in myCourses.