A common text format for structured tabular data is comma-separated values (CSV). Most database systems and spreadsheets provide an option to save their data in this format. Simply put, a CSV file consists of a sequence of lines, where each line contains one or more fields separated by commas. Note that a line with N fields will have (N-1) commas.
Example #1:
Last,First,Email,NumGrade,Letter
Awesome,Abby,axa@foo.edu,95.6,A
Better,Bobby,bnb@foo.edu,82.3,B
Doofus,Donald,ddd@foo.edu,64.4,D
Example #2:
Food,Amount,Calories
Peanut Butter,tbsp,95
Whole Milk,cup,146
Ho Hos,serving,370
As these examples show, a CSV file often has a header line that labels the fields in the following lines. For this activity we will only consider CSV files with header lines.
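The relationship between commas and fields noted above (N fields, N-1 commas) can be checked with a few lines of C. This is only an illustrative sketch: the function name count_fields is hypothetical and is not part of csv.c.

```c
#include <stddef.h>

/* Hypothetical helper (not part of csv.c): counts the fields in one
 * CSV line by counting commas up to the end of the line.
 * A line with N-1 commas has N fields, so an empty line has 1 field. */
int count_fields(const char *line) {
    int commas = 0;
    for (size_t i = 0; line[i] != '\0' && line[i] != '\n'; i++) {
        if (line[i] == ',') {
            commas++;
        }
    }
    return commas + 1;
}
```

For the header line of Example #1, which contains four commas, this returns 5.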
Download csv.c, which you will complete so that it parses and prints a CSV file read from standard input.
The following constants and structure are used to represent each parsed CSV line:
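#define MAX_FIELDS (15)   /* maximum number of fields stored per line */
#define MAX_CHARS  (20)   /* maximum number of characters stored per field */
typedef char f_string[MAX_CHARS+1] ;   /* one field, plus room for the terminating NUL */
typedef struct {
    int nfields ;                 /* number of fields in this line */
    f_string field[MAX_FIELDS] ;  /* the fields, in input order */
} csv_line ;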
A csv_line is a struct holding a field count, nfields, and up to MAX_FIELDS fields. Each field is of type f_string, which is an array of chars that can store at most MAX_CHARS characters plus a terminating NUL ('\0'). You must assume that a field may be empty, that is, contain only a NUL. This would be the case, for instance, if a line contains two consecutive commas or consists of only a newline character.
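To make the representation concrete, here is a small sketch that builds by hand the csv_line a parser might produce for the input line "Ho Hos,,370", whose middle field is empty. The helpers example_line and count_empty_fields are hypothetical names used only for illustration; the type definitions are repeated so the fragment compiles on its own.

```c
#include <string.h>

#define MAX_FIELDS (15)
#define MAX_CHARS  (20)

typedef char f_string[MAX_CHARS+1];

typedef struct {
    int nfields;
    f_string field[MAX_FIELDS];
} csv_line;

/* Hypothetical helper: hand-built representation of "Ho Hos,,370". */
csv_line example_line(void) {
    csv_line line = { 0 };
    line.nfields = 3;
    strcpy(line.field[0], "Ho Hos");
    line.field[1][0] = '\0';          /* empty field: only a NUL */
    strcpy(line.field[2], "370");
    return line;
}

/* Hypothetical helper: counts fields whose first char is the NUL. */
int count_empty_fields(const csv_line *line) {
    int empty = 0;
    for (int i = 0; i < line->nfields; i++) {
        if (line->field[i][0] == '\0') {
            empty++;
        }
    }
    return empty;
}
```

Two consecutive commas still count as delimiting a field, so nfields is 3 here even though only two fields contain text.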
You must complete the bodies of the following three functions in csv.c:
int get_field(f_string field) ;
Fills in the field array with the next field from standard input, ensuring the field is properly NUL terminated. A field ends when getchar() returns one of (1) a comma (','), (2) a newline ('\n'), or (3) EOF. The provided helper function is_end_of_field(int ch) returns true if and only if the character ch is one of these terminators. get_field() returns the character that ended the field it read.
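One possible shape for this logic is sketched below. It is not the required solution, and it makes several assumptions: the function is a hypothetical variant named get_field_from that reads from a FILE * argument with fgetc() instead of calling getchar() (so it can be exercised without standard input), is_end_of_field is re-implemented here from its description, and characters beyond MAX_CHARS are silently dropped, a policy the text above does not specify.

```c
#include <stdio.h>

#define MAX_CHARS (20)

typedef char f_string[MAX_CHARS+1];

/* Re-implemented from the description of the provided helper. */
static int is_end_of_field(int ch) {
    return ch == ',' || ch == '\n' || ch == EOF;
}

/* Hypothetical variant of get_field(): reads one field from `in`.
 * Returns the character (',', '\n', or EOF) that ended the field. */
int get_field_from(FILE *in, f_string field) {
    int ch;
    int len = 0;

    while (!is_end_of_field(ch = fgetc(in))) {
        if (len < MAX_CHARS) {
            field[len++] = (char)ch;
        }
        /* Assumption: characters past MAX_CHARS are discarded. */
    }
    field[len] = '\0';    /* always NUL terminate, even if empty */
    return ch;
}
```

Note that ch is an int rather than a char, so that EOF can be distinguished from every valid character value.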
csv_line get_line() ;
Reads the next CSV input line and splits it into its constituent fields, returning the resulting csv_line structure. The function works by repeatedly calling get_field() with the successive field arrays to be filled in, stopping when get_field() returns a newline or EOF. Note that any legal line, even an empty one, has at least one field; thus, end of file is indicated by setting nfields to 0 in the structure get_line() returns.
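The loop described above might be sketched as follows. As before, this is one possible approach under stated assumptions rather than the required solution: get_line_from and get_field_from are hypothetical FILE *-based variants of get_line() and get_field(), end of file is recognized when the very first field of a line is empty and ended by EOF, and fields beyond MAX_FIELDS are read and discarded (a policy the text does not specify).

```c
#include <stdio.h>

#define MAX_FIELDS (15)
#define MAX_CHARS  (20)

typedef char f_string[MAX_CHARS+1];

typedef struct {
    int nfields;
    f_string field[MAX_FIELDS];
} csv_line;

/* Hypothetical FILE*-based stand-in for get_field(). */
static int get_field_from(FILE *in, f_string field) {
    int ch, len = 0;
    while ((ch = fgetc(in)) != ',' && ch != '\n' && ch != EOF) {
        if (len < MAX_CHARS) {
            field[len++] = (char)ch;
        }
    }
    field[len] = '\0';
    return ch;
}

/* Hypothetical variant of get_line(): parses one line from `in`. */
csv_line get_line_from(FILE *in) {
    csv_line line = { 0 };
    f_string extra;    /* scratch space for surplus fields */
    int end;

    do {
        if (line.nfields < MAX_FIELDS) {
            end = get_field_from(in, line.field[line.nfields]);
            /* EOF before any input on this line: report end of file. */
            if (end == EOF && line.nfields == 0
                           && line.field[0][0] == '\0') {
                return line;            /* nfields is still 0 */
            }
            line.nfields++;
        } else {
            /* Assumption: fields past MAX_FIELDS are discarded. */
            end = get_field_from(in, extra);
        }
    } while (end == ',');

    return line;
}
```

With this approach a final line that lacks a trailing newline is still returned with its fields, and only the following call reports end of file with nfields set to 0.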
void print_csv(csv_line header, csv_line data) ;
Prints label / value pairs, where header is the parsed version of the first input line and data is the parsed version of one of the following lines. For instance, the first data line from the CSV file in Example #2 above would be printed as:
Food = Peanut Butter
Amount = tbsp
Calories = 95
If the header and data lines differ in the number of fields, then the number of pairs printed is the minimum of these two field counts; see the provided helper function min(int x, int y).
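A minimal sketch of the pairing logic follows, again as one possible approach rather than the required solution. It assumes a hypothetical variant print_csv_to that writes to a FILE * argument instead of standard output, re-implements min from its description, and uses the "label = value" layout shown in the sample output above.

```c
#include <stdio.h>

#define MAX_FIELDS (15)
#define MAX_CHARS  (20)

typedef char f_string[MAX_CHARS+1];

typedef struct {
    int nfields;
    f_string field[MAX_FIELDS];
} csv_line;

/* Re-implemented from the description of the provided helper. */
static int min(int x, int y) {
    return x < y ? x : y;
}

/* Hypothetical variant of print_csv(): writes one "label = value"
 * pair per line to `out`, stopping at the shorter of the two lines. */
void print_csv_to(FILE *out, csv_line header, csv_line data) {
    int n = min(header.nfields, data.nfields);
    for (int i = 0; i < n; i++) {
        fprintf(out, "%s = %s\n", header.field[i], data.field[i]);
    }
}
```

Calling print_csv_to(stdout, header, data) would give the behavior described for print_csv(); a data line with fewer fields than the header simply produces fewer pairs.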
The (provided) main function reads the first header line, then reads and prints the successive data lines using print_csv().
Place your completed csv.c file, along with your activity journal, in a directory named csv at the top level of your git repo.