A common text format for structured tabular data is comma separated values (CSV). Most database systems and spreadsheets provide an option to save their data in this format. Simply put, a CSV file consists of a sequence of lines, where each line contains 1 or more fields separated by commas. Note that a line with N fields will have (N-1) commas.
Example #1:
Last,First,Email,NumGrade,Letter
Awesome,Abby,axa@foo.edu,95.6,A
Better,Bobby,bnb@foo.edu,82.3,B
Doofus,Donald,ddd@foo.edu,64.4,D
Example #2:
Food,Amount,Calories
Peanut Butter,tbsp,95
Whole Milk,cup,146
Ho Hos,serving,370
As these examples show, a CSV file often has a header line that labels the fields in the following lines. For this activity we will only consider CSV files with header lines.
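The relationship between commas and fields (N fields means N-1 commas) can be checked with a few lines of C. The sketch below is illustrative only: count_fields is a hypothetical helper that is not part of csv.zip, and it assumes the line is already held in a string rather than read from standard input.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the assignment): counts the fields in
   one CSV line held in a string, as (number of commas) + 1.
   Scanning stops at the end of the string or at the first newline. */
int count_fields(const char *line)
{
    int commas = 0;

    for (size_t i = 0; line[i] != '\0' && line[i] != '\n'; i++) {
        if (line[i] == ',') {
            commas++;
        }
    }
    return commas + 1;  /* even an empty line has one (empty) field */
}
```

For the header line Food,Amount,Calories this counts two commas and so reports three fields.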
Download csv.zip, which contains the code you will complete to parse and print a CSV file read from standard input.
The following constants and structure are used to represent each parsed CSV line:
#define MAX_FIELDS (15)
#define MAX_CHARS (20)
typedef char f_string[MAX_CHARS + 1] ;
typedef struct {
int nfields ;
f_string field[MAX_FIELDS] ;
} csv_line ;
A csv_line is a struct holding a field count, nfields, and up to MAX_FIELDS fields. Each field is of type f_string, which is an array of chars that can store at most MAX_CHARS characters plus a terminating NUL ('\0'). You must assume that a field may be empty; that is, it may contain only a NUL. This would be the case, for instance, if a line contains two consecutive commas or consists of only a newline character.
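To make the capacity limits concrete, here is a small sketch that fills a csv_line by hand. The add_field function is a hypothetical helper invented for this illustration, and truncating over-long text to MAX_CHARS characters is an assumption, since the handout does not say how over-long fields should be treated.

```c
#include <string.h>

#define MAX_FIELDS (15)
#define MAX_CHARS (20)

typedef char f_string[MAX_CHARS + 1];

typedef struct {
    int nfields;
    f_string field[MAX_FIELDS];
} csv_line;

/* Hypothetical helper: stores text as the next field of line.
   Text longer than MAX_CHARS is truncated (an assumption).
   Returns 1 on success, or 0 if the line already holds MAX_FIELDS fields. */
int add_field(csv_line *line, const char *text)
{
    if (line->nfields >= MAX_FIELDS) {
        return 0;
    }
    char *dst = line->field[line->nfields];
    strncpy(dst, text, MAX_CHARS);  /* copies at most MAX_CHARS chars */
    dst[MAX_CHARS] = '\0';          /* guarantee NUL termination */
    line->nfields++;
    return 1;
}
```

An empty field is stored as a single NUL, so add_field(&line, "") still increases nfields by one.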
You must complete the bodies of the following three functions in csv.c:
int get_field(f_string field) ;
Fills in the field array with the next field from standard input, ensuring the field is properly NUL terminated. A field ends when getchar() returns one of (1) a comma (,), (2) a newline ('\n'), or (3) EOF. The (provided) helper function is_end_of_field(int ch) returns true if and only if the character ch is one of these terminators. get_field() returns the character that ended the field it read.
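One possible shape for the field-reading loop is sketched below. It is not the required solution. To keep it testable, get_field_from is a hypothetical variant that reads from any FILE stream with fgetc() instead of calling getchar(); the is_end_of_field shown here is a stand-in for the provided helper; and silently discarding characters beyond MAX_CHARS is an assumption.

```c
#include <stdio.h>

#define MAX_CHARS (20)

typedef char f_string[MAX_CHARS + 1];

/* Stand-in for the provided helper of the same name. */
static int is_end_of_field(int ch)
{
    return ch == ',' || ch == '\n' || ch == EOF;
}

/* Hypothetical variant of get_field() that reads from the stream in.
   Stores at most MAX_CHARS characters; extra characters are consumed
   but dropped (an assumption). Returns the terminating character. */
int get_field_from(FILE *in, f_string field)
{
    int len = 0;
    int ch = fgetc(in);

    while (!is_end_of_field(ch)) {
        if (len < MAX_CHARS) {
            field[len++] = (char) ch;
        }
        ch = fgetc(in);
    }
    field[len] = '\0';  /* an empty field holds only the NUL */
    return ch;
}
```

Passing stdin as the stream gives the same behavior as reading with getchar(). Note that ch is an int, not a char, so that EOF can be distinguished from every valid character.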
csv_line get_line() ;
Reads the next CSV input line and splits it into its constituent fields, returning the resulting csv_line structure. The function works by repeatedly calling get_field() with the successive field arrays to be filled in, stopping when get_field() returns a newline or EOF. Note that any legal line, even an empty one, has at least one field; thus, end of file is indicated by setting nfields to 0 in the structure get_line() returns.
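The line-splitting loop and the end-of-file convention might look roughly like the following sketch. Several things here are assumptions rather than requirements from the handout: get_line_from and read_field are hypothetical names that read from a FILE stream so the code can be exercised without standard input, fields beyond MAX_FIELDS are read and discarded, and end of file is recognized as "the first field is empty and was terminated by EOF".

```c
#include <stdio.h>

#define MAX_FIELDS (15)
#define MAX_CHARS (20)

typedef char f_string[MAX_CHARS + 1];

typedef struct {
    int nfields;
    f_string field[MAX_FIELDS];
} csv_line;

/* Minimal stand-in for get_field(), reading from the stream in. */
static int read_field(FILE *in, f_string field)
{
    int len = 0;
    int ch = fgetc(in);

    while (ch != ',' && ch != '\n' && ch != EOF) {
        if (len < MAX_CHARS) {
            field[len++] = (char) ch;
        }
        ch = fgetc(in);
    }
    field[len] = '\0';
    return ch;
}

/* Hypothetical variant of get_line() that reads from the stream in. */
csv_line get_line_from(FILE *in)
{
    csv_line line = { 0 };
    int ch;

    do {
        if (line.nfields < MAX_FIELDS) {
            ch = read_field(in, line.field[line.nfields]);
            line.nfields++;
        } else {
            f_string extra;             /* too many fields: discard */
            ch = read_field(in, extra);
        }
    } while (ch == ',');                /* newline or EOF ends the line */

    /* Nothing was read before EOF: report end of file with nfields == 0. */
    if (ch == EOF && line.nfields == 1 && line.field[0][0] == '\0') {
        line.nfields = 0;
    }
    return line;
}
```

With this convention a final line that lacks a trailing newline is still returned normally, and only the following call reports nfields of 0.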
void print_csv(csv_line header, csv_line data) ;
Prints label / value pairs, where header is the parsed version of the first input line and data is the parsed version of one of the following lines. For instance, the first data line from the CSV file in Example #2 above would be printed as:
Food = Peanut Butter
Amount = tbsp
Calories = 95
If the header and data lines differ in the number of fields, then the number of pairs printed is the minimum of these two field counts; see the provided helper function min(int x, int y).
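The pairing rule can be sketched as a short loop. This is an illustration under stated assumptions, not the required implementation: print_pairs is a hypothetical variant of print_csv() that takes an output stream so its output can be captured, and the min shown here is a stand-in for the provided helper.

```c
#include <stdio.h>

#define MAX_FIELDS (15)
#define MAX_CHARS (20)

typedef char f_string[MAX_CHARS + 1];

typedef struct {
    int nfields;
    f_string field[MAX_FIELDS];
} csv_line;

/* Stand-in for the provided helper of the same name. */
static int min(int x, int y)
{
    return x < y ? x : y;
}

/* Hypothetical variant of print_csv() that writes to the stream out.
   Prints one "label = value" line per field, up to the smaller of
   the two field counts. */
void print_pairs(FILE *out, csv_line header, csv_line data)
{
    int n = min(header.nfields, data.nfields);

    for (int i = 0; i < n; i++) {
        fprintf(out, "%s = %s\n", header.field[i], data.field[i]);
    }
}
```

Calling print_pairs(stdout, header, data) with a three-field header and a two-field data line prints only two pairs; the unmatched header label is skipped.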
The (provided) main function reads the header line first, then reads and prints the successive data lines using print_csv().
To compile for unit tests:
gcc -o test -g -Wall csv.c unit_tests.c
To compile for normal execution:
gcc -o csv -g -Wall csv.c unit_tests.c
To run unit tests (all unit tests MUST pass for full credit):
./test
To run normally:
./csv <food.csv >actual.out
To compare output:
diff actual.out expected_food_csv.out
You can download the expected_food_csv.out file here
Note that these instructions are specific to the csv assignment. As you change assignments, you will need to modify the appropriate variables in the .gitlab-ci.yml file.
Place your completed files, along with your activity journal, in a directory named csv at the top level of your git repo.
Create a Makefile that builds the test version and the normal (csv) version for an additional 5%. If your Makefile also automatically executes the unit tests, you can earn 5% more, for a total possible bonus of 10%.
Submit your Makefile in the same csv directory.
Your Makefile must create both the test and csv executables, compile and link with the -g switch, and compile with the -Wall switch.
NOTE: to get full bonus credit you must accomplish this by just typing make with no parameters.