One of the simplest but most useful utilities on Unix/Linux is the word
count (wc) program. This program simply reads one or more files and
prints the number of lines, words, and characters found in each file.
Without any command-line arguments, it reads its text from standard
input. For example, here's the result of running wc on the sample file
we'll be using in this activity:
11 64 375
This shows that the file has 11 lines, 64 words, and 375 characters.
Your task for this activity is to exactly duplicate the behavior of wc.
Create a new directory in your repository called CWordCount.
Use wget to download the zip file to this directory and unpack it.
In the browser right click on the zip link to copy the link.
Then use wget in the nitron window to get the zip file
(right click in the nitron window to paste the URL for the zip file).
You should see two files. Edit the source file, ritwc.c, and fill in
the body of the main program with code to duplicate the functionality
of wc. See the next section for some hints.
Compile your program using the GNU C Compiler (gcc).
When you get a clean compile, the executable program will have the name
given after -o (that's what the -o (output) option is for). To
test your program, execute the following commands:
The first line runs the wc command so you can see what's expected. The
second runs your program; the leading ./ forces the command language
interpreter (bash) to look in the current directory rather
than the directories for standard system commands.
In step 3 above your output must exactly match the output of the Linux wc utility.
Submit your source file ritwc.c and updated ActivityJournal.txt
in a directory named CWordCount to your Git pushbox.
To receive full credit for this activity you must do the following:
- Submit your work in a correctly named directory. This must be one of the top level directories in your repository.
- Use the exact filenames for both files: ritwc.c and ActivityJournal.txt.
- The program must compile without any warnings.
- The program must compile exactly as shown in Step 2 of the activity.
- You cannot use any other options.
- Do not use the -std=c99 option.
- The output must match the expected output including the exact formatting.
- Use good software style including consistent indentation and appropriate variable names.
- Use a reasonable amount of comments describing the purpose of each section of code.
- Complete the Activity Journal including the time estimate, plan, actual time, and observations.
What's in a word? More specifically, what is a word to wc?
A word is a sequence of non-whitespace characters terminated by a
whitespace character or end-of-file. Whitespace characters include the
obvious "space" itself, along with tabs, carriage-return, line-feed and
some other control characters. The following, from the sample
file, has 2 lines, 14 words, and 67 characters:
It was a dark and
the rain fell in torrents - except
The words are (quoted):
Note that (a) an empty line,
or one with only spaces, has no words on it, and (b) a line containing
words may begin with whitespace characters which are simply ignored.
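
The definition above can be sketched as a small two-state scan. This is
only an illustration, not the required solution: count_words is a
hypothetical helper name, it scans a string already in memory rather
than a file, and it assumes that isspace from <ctype.h> agrees with what
wc treats as whitespace.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not part of the activity files): counts the
 * words in a NUL-terminated string, where a word is a run of
 * non-whitespace characters. */
size_t count_words(const char *text)
{
    size_t words = 0;
    int in_word = 0;      /* 1 while the scan is inside a word */
    const char *p;

    for (p = text; *p != '\0'; p++) {
        /* The cast keeps isspace well defined for bytes above 127. */
        if (isspace((unsigned char)*p)) {
            in_word = 0;  /* whitespace ends any current word */
        } else if (!in_word) {
            in_word = 1;  /* first character of a new word */
            words++;
        }
    }
    return words;
}
```

Counting a word when it starts, rather than when it ends, means a final
word that runs up to end-of-input with no trailing whitespace is still
counted, and leading or repeated whitespace never produces an empty
word.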
Peruse on-line documentation
for the three libraries referenced via the #include directives.
Some of these library functions will make your life easier. In
particular, there is one that makes it trivial to tell whether or not a
character is whitespace.
Store each character you read in an int rather than a char; this is
important as EOF is actually a negative number (which cannot be the code
for a legal character).
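
The point about the end-of-file value being negative can be shown with
a short sketch. It assumes input is read one character at a time with
fgetc, which is a choice made here for illustration, and count_stream is
a hypothetical helper name that does not appear in the activity files.

```c
#include <stdio.h>

/* Hypothetical helper (not part of the activity files): counts the
 * characters and newline characters read from a stream. */
void count_stream(FILE *in, long *lines, long *chars)
{
    /* c is an int, not a char: fgetc returns either a byte value in
     * the range 0..255 or the negative value EOF. With a char, the
     * byte 0xFF could compare equal to EOF on platforms where char is
     * signed, ending the loop too early. */
    int c;

    *lines = 0;
    *chars = 0;
    while ((c = fgetc(in)) != EOF) {
        (*chars)++;
        if (c == '\n') {
            (*lines)++;
        }
    }
}
```

Calling count_stream(stdin, &lines, &chars) would count standard input;
any stream opened with fopen can be passed in the same way.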
This file has the first sentence from
Edward Bulwer-Lytton's excruciatingly bad 1830 novel Paul Clifford.
Its fame stems from Charles Schulz's comic Peanuts
and the many strips where Snoopy is writing a novel that starts "It was
a dark and stormy night." It is also the inspiration for the annual
Bulwer-Lytton Fiction Contest, sponsored by the English Department at
San Jose State University, a competition to create the worst opening
sentence for a novel.