Lab Report Format
Lab reports are intended to read like a paper. Structure them as prose, not in a question-and-answer format. Think of this as more of a presentation than filling out a form. Your lab report must include the following:
- Broad Questions. I will give these to you.
- Research questions. Refine my broad questions down to answerable, specific questions. It's okay to modify your questions after you have done your work - sometimes you realize you're answering a different question than you set out to!
- Workload Specification. What is the data set you are testing this on? And how does a potential use case make this workload realistic?
- Technology details. Provide details of the environment, hardware, languages, etc. used, including specific version numbers where appropriate for repeatability.
- Implementation details
- Document the decisions you made in constructing this experiment. Be systematic in how you approach this!
- Did you implement this yourself?
- If you did not, provide proof that the implementation matches what you expect (e.g. link to their docs, link to their source code, point to unit tests you wrote)
- What aspects of your implementation could impact the performance of this code?
- Please include relevant code snippets from your experiment as figures. You MUST provide syntax highlighting and you MUST use a fixed-width font. I recommend an online syntax highlighter such as https://tohtml.com/.
- Measurement methodology
- What metrics are you using? Refine any from what we specify, or add any metrics you believe are relevant.
- What measurement tools are you using? How are you using them?
- Results and discussion. Present the most convincing case for the conclusions you are trying to make.
- Limitations. Have a thorough(!) discussion of the threats to validity your study has. What are the factors that impact your conclusions? What can be done about those limitations?
Grading Rubric
Below is the rubric we use for grading the lab reports:
- 10pts. Completed Assignment. Did the student finish? Did they do what was asked for? (Or, reasonably within what was asked for?)
- 5pts. Questions Refined. Are the questions refined to the degree that they are useful distillations of the methodology? Are the questions refined to a point where they can be sufficiently answered by the results?
- 5pts. Decisions Documented. Are the methodology and implementation documented as a narrative? Is it documenting the decisions you made? Are the decisions reasonable and systematic?
- 5pts. Clarity in Writing. Is the writing clear? Are the results clearly presented enough to be understood upon a quick read? Is the style and voice consistent?
- 5pts. Thoughtful Limitations. Do you document potential issues that are outside of your control that limit the conclusions? Are the limitations thoughtful? Are there methodological limitations? Are there technological limitations?
- 5pts. Realistic Situation. Are your simulations reasonably realistic in light of an engineering scenario? This includes both data and implementation choices.
- 5pts. Believable Results. Are the results believable? This is a catch-all for the overall evidence evaluated.
- 5pts. Providing Feedback. Are you providing feedback on your labmates' documents? Did you show up to the lab meeting? (We prefer you be present for the lab meeting, but if you are absent then we ask for GoogleDoc comments. Being present at the in-person meeting is enough to get credit.)
Total: 40 points
Common feedback
Here's a listing of various general comments we've made to lab reports in the past.
Clarity is king. The single most important goal in technical writing is to be clear. Your writing means nothing if it's not understood, and technical writing often conveys complex nuance. Don't think about being formal; think about being systematic and clear.
Use Active Voice. Despite what you may have learned in high school, most readers of academic research expect active voice. In fact, in almost all technical writing, active voice is preferred. Passive voice often lends itself to ambiguity because it omits who is doing what (e.g. "The code was implemented" vs. "I implemented the code"). Instead, you can use "I" or "We".
Magic numbers. Like in code, arbitrary numbers stick out like a sore thumb. Any time you write a hard number, make sure the reason for its existence is clearly defined. ("Why did you run your experiment 27 times and not 28?").
Limitations are not a retrospective. A limitation in research is a factor that can influence your results. This can be the result of all kinds of choices you made, from the validity of your measurements and the realism of your data sets to the implementation choices you made. In the scientific community, it is understood that all studies have limitations - lacking limitations is more a sign of laziness on your part than of the quality of your work. Here are some examples that ARE limitations:
- Running your experiments concurrently means that wall-clock time might be off because of context switches or processor scheduling.
- Memory management in an interpreted language might lead to pre-allocated memory, so the OS-level measurements of memory may not change for small problems.
- "I could have done better"
- "The results are not what I expected"
- "I didn't have enough time to do this"
- "Python" (an entire language is not a limitation, idiosyncracies of how they implement things in light of your measurement methodology can be the limitation!)
Be reproducible! Are your decisions documented well enough that someone could reproduce your work with most of the same parameters along the way?
If you don't believe your choices, who will? Stand by your work! Describe what doesn't work, but make your work as believable as possible. All science has limitations, but if you are open about those limits, someone else may see your evidence and accept it even when you hesitate to - they can be more objective about your work than you are.
Representative over Random. Random is not always the best type of input! The world is not as random as we think it is. Focus on a realistic scenario to test instead of just randomizing all your inputs.
Write like someone needs your results. Always think about how someone could use your study. Will they use it in a meta-study? Will it come up in a meeting of software engineers? This means that decisions that seemed obvious to you might still need to be documented so that your study can be compared to other studies.
Figures & Tables should convey a message quickly. Don't make your reader do numerical comparisons in their head across your table. Draw reader's eye to the message of your chart quickly. Graphs are useful for getting a quick gestalt of how variables behave, but also make sure they are clear in their axes and legends.
Averages are sensitive to outliers. Any time you are reporting averages, know that computers are quirky, and sometimes one experimental run can skew your results. Always report standard deviation along with averages at the very least. Report medians too. Graphs help convey outliers as well.
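For example, here is a minimal Python sketch of reporting spread alongside the average (the run times are made up to show how one outlier distorts the mean):

```python
import statistics

run_times = [1.02, 0.98, 1.05, 0.99, 4.87]  # seconds; note the outlier

print(f"mean:   {statistics.mean(run_times):.2f} s")   # pulled up by the outlier
print(f"stdev:  {statistics.stdev(run_times):.2f} s")  # a large stdev flags it
print(f"median: {statistics.median(run_times):.2f} s") # robust to the outlier
```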
Submitting Labs
Labs are due on the day of your Lab Meeting (usually Mondays, but check the schedule for exact days). You will need to have your lab report done by the lab meeting and be ready for feedback. Your team will discuss your work and provide GoogleDoc feedback. Everyone will be graded on the quality of feedback they provide to others (see below).
- Your instructor will invite you to a class-wide Google Team Drive. Create a folder on that drive called "X Lab Reports" where X is your first and last name.
- In that folder, your labs will be in their own folders called "Lab 1 Sorting", "Lab 2 Recursion", etc.
- Once you have your lab group (you find your own group of 3-4 people), move your folder into a group folder
- For code, create one repository for the whole semester on GitLab at https://kgcoe-git.rit.edu
- Add your instructor and TA as a "Reporter" on the repository
- Don't share this code with teammates, but feel free to bring up your code in the lab meeting if necessary. Feel free to help teammates with their labs; just make sure they write their own code.
- Don't share this code in a public forum either, even after this class is over.
- Put each lab's code in a separate folder in the root of the repo
Providing Feedback on Labs
You will need to provide feedback to your labmates each week. If you are in the meeting, make sure you ask questions and help them make their report better. For grading purposes, we consider your presence in the lab meeting enough. If you miss the lab meeting, we will be looking for comments on your labmates' GoogleDoc that night so they can improve their report.
A high-quality comment is insightful, actionable, and constructive. We do not place a number on how many comments you are to give - only on how helpful and insightful they are. Your comments should fall into the following categories:
- Grammar, Spelling, or Style: These comments should be trivially correctable. Don't just look for low-hanging fruit here - think about what would make the writing smoother and easier for you to read. For example, "Use active voice in this paragraph instead of passive voice."
- Methodology or other substantive comments: These types of comments concern the content itself. For example, "I think your results might be impacted by the overhead of starting a new process for each run".
Lab Topics
Lab #1: Sorting Algorithms on Arrays
- Technology: any programming language you desire.
- Implementation: You may implement your own sorting algorithms. You may use built-in libraries as long as you provide proof that they are the algorithm you expect to be testing (e.g. unit tests)
- Comparisons:
- Insertion sort
- Merge sort
- Quick sort
- Pick one more sorting algorithm of your choice
- Workload. In the real world, not all data arrives in perfect random order! Create 3 realistic workloads: "nearly sorted", "random", and "evil sorted". For "evil sorted", arrange the data in such a way that it might slow down some algorithms. Specify how you approached this (one way to generate these workloads is sketched after this list).
- Measurement methodology. Report the following, and add any other metrics you feel are relevant to your experiment.
- Array size
- Time to completion (minimize confounding factors!)
- Number of swaps
- Number of repetitions (n) - more is always better, as time allows
- Memory usage (optional) - if you have time, try to measure the memory usage in any way you can
- Scalability: At what length of the array does each sorting algorithm demonstrate improvement over the others? Examine at least 5 different array sizes.
- Factor summary: Thus, this experiment has 1 language x 4 algorithms x 3 workloads x 5 sizes x n repetitions = 60*n runs.
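Here is one way you might generate the three workloads, sketched in Python (the 5% swap fraction is an illustrative magic number - justify your own choice!):

```python
import random

def nearly_sorted(n, swap_fraction=0.05):
    """Sorted data with a small fraction of random adjacent swaps."""
    data = list(range(n))
    for _ in range(int(n * swap_fraction)):
        i = random.randrange(n - 1)
        data[i], data[i + 1] = data[i + 1], data[i]
    return data

def random_order(n):
    data = list(range(n))
    random.shuffle(data)
    return data

def evil_sorted(n):
    """Reverse order is one adversarial arrangement; a true worst case
    depends on the algorithm (e.g. already-sorted input is evil for a
    quicksort that always picks the first element as its pivot)."""
    return list(range(n, 0, -1))
```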
Lab #2: Recursion
- Broad Research Question: What are the performance benefits and/or costs to using recursion? Explore the alternatives to using recursion to solve the same problems.
- Technology: any programming language you want for this lab.
- Comparisons
- Choose two different programming languages
- Choose two different programming problems that involve recursion. Implement each problem twice: once using recursion and once without. There are two main alternatives to recursion: implementing your own stack and dynamic programming. Be sure to discuss your implementation choices.
- Thus, you should have 2 x 2 x 2 = 8 programs to compare (2 languages, 2 problems, 2 solutions to each problem in each language).
- Compare each program against varying sizes of the problem as well. The scale is up to you. Thus your results will be 8n data points, where n is the number of different scales you gave each program.
- Measurement methodology. You must measure speed and memory footprint. The specific metrics are up to you, and you should define them.
- Background research. You will need to research how your chosen languages implement recursion and how that can impact your results. Two topics you must cover are memoization and tail recursion (a sketch of both ideas follows this list). Be sure to cite any sources you use for this.
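To make those topics concrete, here is a Python sketch (Fibonacci is just one illustrative problem choice) showing a naive recursive version, a memoized version, and an iterative dynamic-programming version:

```python
from functools import lru_cache

def fib_naive(n):
    # Plain recursion: recomputes subproblems, exponential time.
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    # Memoization: each subproblem is computed once, then cached.
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)

def fib_iter(n):
    # Dynamic programming: no recursion, constant stack depth.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Note: CPython performs no tail-call optimization, so even a
# tail-recursive form still grows the call stack. Whether YOUR chosen
# languages optimize tail calls is exactly what you should research.
```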
Lab #3: Native library bindings
Interpreted languages today are quite a marvel. They allow programmers to prototype their code rapidly, write very readable code, and perform complex operations with concise syntax. But one of their (oft-touted) disadvantages is an inherent performance degradation that comes with on-the-fly parsing and interpretation. In most modern production-quality interpreted programming languages, such as Python, Ruby, and JavaScript, most of the performance-demanding work is actually done in C extensions. (This is pretty humorous when you think about it: programmers love to have endless debates about these languages, yet most of the time everyone is just binding to the exact same C libraries under the hood.)
In this lab, you will be testing out the performance benefits and drawbacks of binding to a native language. This involves some C programming and some interpreted language programming.
- Broad Research Questions. What is the performance overhead of native library binding? What are the performance benefits and drawbacks of using C versus an interpreted language?
- Technology. You MUST use one interpreted language and one native language for this. (You may use Java as your interpreted language; although it's technically not an interpreted language, the questions still apply to it.)
- Nervous about getting this to work? We have an example in Ruby + C that will work on Nitron. You are welcome to adapt from that, and you are welcome to run your experiments on Nitron. (Just be mindful that other people use this server, so don't run tests that are longer than a few minutes.) Be sure to also check out the links we provided that have documentation on C extensions in Ruby.
- Want some extra credit? If you create your own example in a language pairing not seen in the repo, and you get your pull request approved, then your instructor will give you an extra 5 points on your lab report. Examples that run on Nitron are especially desired, but instructions for Windows and Mac are also welcome.
- Experiments. This is a 6-way experiment: three solutions to each of two different problems.
- Three solutions. Implement the solution to the problem purely in your chosen interpreted language, purely in your chosen native language, and then in the interpreted language binding to native code. The pure-native version can use the exact same code as the interpreted+native version, just with a main function that calls it directly.
- Multiplication. Take two numbers and multiply them. Assume that they are 32-bit signed integers.
- Hamming distance between two 32-bit signed integers. The Hamming distance is defined as the number of positions at which two strings of equal length differ (a restricted form of edit distance). Assume that your "strings" are the binary representations of two 32-bit signed integers. For example, the Hamming distance between 4 (100) and 5 (101) is 1 because only 1 digit is different when you compare them like strings. As another example, the Hamming distance between 4 (0100) and 8 (1000) is 2. In interpreted languages, you will most likely need to convert your numbers to strings and then do the comparison - but if you find a purely-interpreted approach that is faster, that's fine too (just document what you did!). In C, you should be able to XOR the two integers to mark the differences, and then use the __builtin_popcount function to count the number of 1's (this is often considered the fastest technique, depending on your hardware). A sketch of one purely-interpreted approach appears at the end of this lab's description.
- Measurement Methodology.
- Please measure the speed of each function in "iterations per second". That is, count the number of seconds it takes to do a fixed number of iterations, then divide.
- You may assume that the numbers are uniformly distributed random within the 32-bit space.
- Timeline
- First week. You must make sure that you understand native bindings. Either come up with a working "Hello, World" example in your favorite languages, or test out our example on Nitron. Place your choice in the Research Plan of your lab report. Second, write some limitations that you can see with this experiment.
- Second week. Full lab report due.
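For the interpreted side of the Hamming distance problem, here is a minimal Python sketch that also shows the "iterations per second" measurement (the iteration count is an illustrative magic number - justify your own):

```python
import time

def hamming(a, b):
    # XOR marks the differing bits; masking to 32 bits handles negative
    # (two's complement) inputs. bin(...).count("1") tallies the ones,
    # mirroring what __builtin_popcount does on the C side.
    return bin((a ^ b) & 0xFFFFFFFF).count("1")

ITERATIONS = 1_000_000
start = time.perf_counter()
for _ in range(ITERATIONS):
    hamming(4, 8)
elapsed = time.perf_counter() - start
print(f"{ITERATIONS / elapsed:,.0f} iterations per second")
```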
Lab #4: Multiprocessing
For 40+ years, Moore's Law was a reliable phenomenon. But today, the era of Moore's Law has largely ended, which has ushered in multiprocessing and multi-core processors. In this lab, you'll be exploring the benefits and drawbacks of parallelization.
- Broad Research Questions. What is the performance overhead of multiprocessing? What are the performance benefits and drawbacks of using multiprocessing?
- Technology. You are welcome to choose your favorite programming language. In particular, here are the things to watch out for:
- We are testing multi-processing, not multi-threading.
- Does your language have a Global Interpreter Lock? If so, then use a library to bypass this and create separate sub-processes. For example, Python's multiprocessing module makes this relatively straightforward (see the sketch at the end of this lab's description).
- Document as much as you can about your language's usage of multi-processing
- Experiments. This is a 4-way experiment.
- You will be testing two parallelization problems: finding a median via the "quick select" algorithm and another problem of your choice. You can find parallelization problems, including median quick select, on this CMU site. You may choose whatever algorithms you like, EXCEPT a sorting algorithm (we've done too many of those already). Be sure to test the validity of your implementation to make sure it's getting the correct answers. UPDATE Feb 18: the parallel version of the given quick select algorithm does not appear to actually parallelize well (although parallel quickselect algorithms are mentioned many times in the research literature). For the purposes of this lab, discuss this issue in your limitations.
- Run each algorithm in parallel and sequentially. Try to keep your sequential algorithm as close in implementation to your parallel algorithm as you can. Ideally, your setup should allow you to just create 1 worker instead of multiple.
- To recap, for algorithms A and B, you will be running "Parallel A", "Sequential B", "Parallel B", and "Sequential A"
- Measurements. Report the speed of your system in throughput ("jobs per second"). Measure the utilization of the cores on your system to ensure that your system is truly running in parallel. You are not required to measure memory usage, but reporting it would be appreciated if it's easy for you.
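Here is a minimal sketch of the 1-worker-vs-many pattern, assuming Python's multiprocessing module (the squaring job and chunk sizes are placeholders for your chosen algorithm):

```python
import multiprocessing as mp

def work(chunk):
    # Placeholder CPU-bound job; substitute your chosen algorithm.
    return sum(i * i for i in chunk)

if __name__ == "__main__":  # required so child processes import cleanly
    chunks = [range(i, i + 250_000) for i in range(0, 1_000_000, 250_000)]
    for workers in (1, mp.cpu_count()):  # 1 worker ~= the sequential baseline
        with mp.Pool(processes=workers) as pool:
            results = pool.map(work, chunks)
        print(f"{workers} worker(s): total = {sum(results)}")
```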
Lab #5: Multithreading
Multithreading is a much more lightweight approach to parallelization when compared to multiprocessing.
- Broad Research Questions. What are the performance benefits and drawbacks of using multithreading when compared to multiprocessing?
- Technology. You are welcome to choose your favorite programming language. In particular, here are the things to watch out for:
- This time, we are testing multi-threading, not multi-processing.
- Does your language have a Global Interpreter Lock? (e.g. CPython) If so, then discuss how this impacts your results
- Document as much as you can about your language's usage of multi-threading
- Experiments. This is a 4-way experiment.
- You will be testing two parallelization problems:
- A CPU bound problem - similar to what we saw in the previous lab, these involve little IO
- An IO bound problem - come up with a realistic scenario where data must be read in from the hard drive and processed in parallel.
- Run each algorithm in parallel and sequentially. Try to keep your sequential algorithm as close in implementation to your parallel algorithm as you can. Ideally, you can set up a "pool" of threads and then just set the pool size to 1 worker instead of multiple (see the sketch at the end of this lab's description).
- To recap, there are four combinations: CPU-sequential, CPU-parallel, IO-parallel, IO-sequential
- Minimum runtime. Each of your four experiments must run for at least 60 seconds, so as to give you time to check the memory usage and the core usage. This also helps mitigate "warmup" time that some languages have.
- Measurements.
- Report the speed of your system in throughput ("jobs per second"). Measure the utilization of the cores on your system to ensure that your system is truly running in parallel. You are not required to measure memory usage, but reporting it would be appreciated if it's easy for you.
- Did your machine use multiple cores during the run of this experiment? Report this fact and discuss why.
- Report the maximum memory usage observed, in kilobytes.
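Here is a minimal sketch of the thread-pool pattern, assuming Python's concurrent.futures (both jobs are placeholders - swap io_job in for your IO-bound runs, and note the GIL caveat in the comments):

```python
from concurrent.futures import ThreadPoolExecutor

def cpu_job(n):
    # CPU-bound placeholder: under CPython's GIL, adding threads
    # will NOT speed this up - which is part of what you're testing.
    return sum(i * i for i in range(n))

def io_job(path):
    # IO-bound placeholder: threads can overlap while waiting on disk.
    with open(path, "rb") as f:
        return len(f.read())

for workers in (1, 8):  # a pool of 1 approximates the sequential case
    with ThreadPoolExecutor(max_workers=workers) as pool:
        totals = list(pool.map(cpu_job, [2_000_000] * 8))
    print(f"{workers} worker(s): {sum(totals)}")
```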
Meta Study #1
Science only advances when we can aggregate individual studies into systematic knowledge. Let's attempt to systematize the knowledge that we gained through our labs.
To do this, we need everyone to do the following ASAP. The sooner you do this, the easier time your teammates will have studying your work.
- Make your data clear for someone else to combine your data with others. Include tables as appendices if you only included charts, for example. Or make sure you have your data in the folder. Other people will be using your raw data, so anything that you can do to make it easier is great.
- From now on, we will be requiring relevant code snippets from your code when you document your methodology. This helps everyone better understand how you ran your experiment. For this meta-study, we need everyone to retroactively add code snippets to each lab report. You MUST provide syntax highlighting and you MUST use a fixed-width font. I recommend an online syntax highlighter such as https://tohtml.com/.
Next, we need a group to deliver a meta-study for each of the labs we've had. Everybody has been assigned a Meta Study Team, which should NOT have any members of your existing lab group. You can find the assignment on myCourses under Groups
- All labs are available on the team drive.
- Important. Every member of your team MUST read EVERY lab report you are assigned. Do not divide up the work by paper.
- As a group, write a 2-4 page paper on what conclusions can be drawn from aggregating the results from these studies. Use the stated research questions as guides for what can be answered.
- Make an attempt to combine data across case studies whenever it makes sense - for example, plotting the percentage increase of throughput from one factor to another. Note that raw data cannot simply be combined, since everyone ran their experiments on different machines, so you will have to define your own meta-metrics (one is sketched after this list).
- Aggregate limitations across all studies. Do some limitations apply to some studies and not others? Make that clear.
- Discuss any studies that don't quite fit due to their implementation decisions or other limitations.
- Provide at least two figures (more is better) that make a fair comparison across multiple studies.
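For example, percent change from each study's own baseline is one machine-neutral meta-metric (a minimal sketch with made-up numbers):

```python
def percent_change(baseline, treatment):
    # Normalizing to each study's own baseline removes the machine
    # differences that make raw numbers incomparable across studies.
    return 100.0 * (treatment - baseline) / baseline

# e.g. one study's sequential vs. parallel throughput in jobs/sec:
print(percent_change(baseline=120.0, treatment=410.0))  # ~241.7
```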
Lab #6: Relational Databases
Databases are extremely powerful systems with lots of options and features. Every design decision in a database ends up with various trade-offs. In this study we will be examining some basic choices you can make in a relational database system.
- Broad Research Question. What are the performance trade-offs in indexing choices and query structure? What configuration options impact the results the most?
- Technology choice. You must choose one widely-used relational database system that is ACID compliant and is NOT sqlite. Examples include PostgreSQL, MariaDB, MySQL, MS SQL Server, and DB2. (MongoDB and Redis do not qualify.)
- Data schema. Take our Blog example from the Introduction to Databases slides and create that schema. You are welcome to elaborate on it however you like, but be sure to document all of the decisions you made. Please show your schema creation SQL script as a figure.
- Workload. The data set you generate should be realistic, and significantly large. The overall size of your database should exceed 1 gigabyte; at smaller sizes, the effects of optimizations become harder to observe. I recommend writing a script to populate random data (one is sketched after this list).
- Reporting EXPLAIN output. Most relational database systems can generate a detailed query execution plan, usually with the EXPLAIN keyword. As part of documenting your decisions, run EXPLAIN on every query and report those outputs. These can be pretty verbose, so put them in an appendix at the end of your report. These can also be helpful in understanding whether your experiments are running the way you think they are.
- Experiments.
- Queries from Activity. Use the queries from our sqlite activity as your initial benchmarks.
- Indexing choices. There are many different types of indexes that you can apply to a given column. Test a no-index baseline against three different types of indexes (e.g. in PostgreSQL, you can do Hash, B-tree, and several others). Read up on what each of those algorithms is optimized for, and construct multiple queries that take advantage of those indexes. Test the differences between these options for each query. Report both the length of the query run and the size of the index built. Be sure to report the EXPLAINs.
- Effects of ANALYZE. Most relational database systems have an "ANALYZE" command that caches table sizes and other statistics that aid in query planning. What effect does ANALYZE have on your queries?
- The cost of sub-queries. Subqueries are known to be very expensive because they essentially force a nested-loop structure. Subqueries can often be rewritten to use joins. Write a complex query two ways: once with a subquery, and once without. Compare the results, and report the statistics from EXPLAIN.
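Here is a minimal population-and-EXPLAIN sketch, assuming PostgreSQL accessed from Python via psycopg2, with a hypothetical pared-down posts table standing in for the Blog schema - adapt the schema, connection details, and row counts to your setup, and loop the insert until you pass the 1 gigabyte target:

```python
import random
import string
import psycopg2

conn = psycopg2.connect(dbname="blog")  # connection details are illustrative
cur = conn.cursor()

cur.execute("CREATE TABLE IF NOT EXISTS posts (id serial PRIMARY KEY, title text, body text)")

def random_text(n):
    return "".join(random.choices(string.ascii_lowercase + " ", k=n))

rows = [(random_text(40), random_text(2000)) for _ in range(10_000)]
cur.executemany("INSERT INTO posts (title, body) VALUES (%s, %s)", rows)
conn.commit()

# One indexing variant (B-tree is PostgreSQL's default; Hash is another):
cur.execute("CREATE INDEX posts_title_btree ON posts (title)")
cur.execute("EXPLAIN ANALYZE SELECT * FROM posts WHERE title = %s", ("x",))
print("\n".join(row[0] for row in cur.fetchall()))
```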
Lab #7: Compiler Optimizations
Compilers can do a lot for us. But how realistic are these optimization situations?
- Broad Research Question. What kind of performance trade-offs are present when compiler optimizations are used?
- Technology choice. Choose one programming language to research. It can be an interpreted or compiled language. (Tip: compiled languages might be easier for this lab since you can inspect the binary.)
- Optimizations. Research a compiler/interpreter of your language of choice to determine what kinds of optimizations it provides. Look in two places: options to the compiler itself (e.g. like this one from GCC), and the changelog of the compiler. Find three optimizations that you will attempt to reproduce. (You may not choose tail-call optimization, as it was covered in a prior lab.)
- Workload. Test for (a) constant folding, and (b) your three additional optimizations. Write example code that SHOULD be optimized by the compiler. Demonstrate in your lab report that the two factors of your experiment show unoptimized code vs. optimized code. This can be via changing the compiler version, or by showing the difference in the generated assembly (e.g. gcc -S), or something else (a constant-folding sketch follows this list).
- Minimum runtime. Be sure to have enough trial repetitions for each experiment to last at least 60 seconds.
- In your lab report, be sure to include a detailed explanation of what is happening. Use figures and code snippets. Provide some real-world examples of how this optimization can happen and what kinds of benefits it provides.
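As one interpreted-language illustration: in CPython you can check for constant folding by inspecting the bytecode (for a compiled language, gcc -S plays the analogous role):

```python
import dis

def seconds_per_day():
    return 60 * 60 * 24

# Recent CPython versions fold the expression at compile time: the
# disassembly shows a single constant 86400 rather than two multiplies.
dis.dis(seconds_per_day)
```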
Lab #8: Revisit
- For this final lab, you must go back to one of your previous labs and revisit it. You may choose the lab.
- Improve the experiment. Are there more trials you can run? More factors you can explore? Better workloads you can use? Look at your limitations for inspiration. Document your choices and re-run your results.
- Add new questions. Decide upon 2-3 new research questions on the lab topic that you want to explore. Conduct those additional experiments and integrate them into your conclusions.
- Submission. Make a copy of your prior lab for this one and work from there.
- Grading. The grade for this one is just like the others - the grade on your old lab still stands. This is just an opportunity to go back and improve.