Overview
One of the simplest yet most useful utilities on Unix/Linux is the word count (wc) program. This program simply reads one or more files and prints the number of lines, words, and characters found in each file. Without any command line arguments, wc reads its text from standard input. For example, here's a run of wc on the DarkAndStormyNight.txt file we'll be using in this activity:
bash-3.2$ wc < DarkAndStormyNight.txt
11 64 375
This shows that the file has 11 lines, 64 words, and 375 characters.
Your task for this activity is to exactly duplicate the behavior of wc shown above.
Setup
- Create a new directory in your repository called CWordCount.
- Use wget to download the ritwc.zip file to this directory and unpack it. In the browser, right-click the zip link to copy its URL. Then use wget in the hamilton window to get the zip file (right-click in the hamilton window to paste the URL for the zip file). You should see two files, DarkAndStormyNight.txt and ritwc.c.
The Activity
1. Edit the ritwc.c file and fill in the body of the main program with code to duplicate the functionality of wc. See the Hints & Suggestions section below for some hints.
2. On hamilton, compile your program using the GNU C Compiler (gcc) as follows:
    gcc -o ritwc ritwc.c
3. When you get a clean compile, the executable program will be named ritwc (that's what the -o (output) option is for). To test your program, execute the following commands:
    wc < DarkAndStormyNight.txt
    ./ritwc < DarkAndStormyNight.txt
4. The first command runs the standard wc command so you can see what's expected. The second runs your program; the ./ forces the command language interpreter (bash, or the Bourne-Again Shell) to look in the current directory rather than in the directories for standard system commands.
5. In step 3 above, your output must exactly match the output of the Linux wc utility.
Submission
Submit your source file ritwc.c and updated ActivityJournal.txt in a directory named CWordCount to your Git repo.
Grading Criteria
To receive full credit for this activity, you must do the following:
- Submit your work in a correctly named directory. This must be one of the top level directories in your repository.
- Use the exact filenames for both files: ritwc.c and ActivityJournal.txt.
- The program must compile without any warnings.
- The program must compile exactly as shown in Step 2 of the activity.
  - You cannot use any other options.
  - Do not use the -std=c99 option.
- The output must match the expected output including the exact formatting.
- Use good software style including consistent indentation and appropriate variable names.
- Use a reasonable amount of comments describing the purpose of each section of code.
- Complete the Activity Journal including the time estimate, plan, actual time, and observations.
Hints & Suggestions
- What's in a word? More specifically, what is a word to wc? A word is a sequence of non-whitespace characters terminated by a whitespace character or end-of-file. Whitespace characters include the obvious "space" itself, along with tabs, carriage-return, line-feed and some other control characters. The following, from the sample file, has 2 lines, 14 words, and 67 characters:
    It was a dark and stormy night;
    the rain fell in torrents - except
  The words are (quoted): 'It', 'was', 'a', 'dark', 'and', 'stormy', 'night;', 'the', 'rain', 'fell', 'in', 'torrents', '-', and 'except'.
- Note that (a) an empty line, or one with only spaces, has no words on it, and (b) a line containing words may begin with whitespace characters, which are simply ignored.
- Peruse the online documentation for the three libraries referenced via the #include directives: stdlib.h, stdio.h, and ctype.h. Some of these library functions will make your life easier. In particular, there is one that makes it trivial to tell whether or not a character is whitespace.
- Note that getchar() returns an int, not a char; this is important as EOF is actually a negative number (which cannot be the code for a legal ASCII character).
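The hints above can be combined into a single counting loop. What follows is only a minimal sketch, not the expected solution. The names struct counts and count_stream are hypothetical and do not come from ritwc.c. The sketch assumes the default C locale, so that isspace from ctype.h agrees with wc about what counts as whitespace. It reads with getc(in), which behaves like getchar() when in is stdin. Printing is left out on purpose, since the exact output format has to be checked against wc itself.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical container for the three totals; not part of ritwc.c. */
struct counts {
    long lines;
    long words;
    long chars;
};

/* Hypothetical helper: reads `in` until end-of-file and tallies its contents. */
struct counts count_stream(FILE *in)
{
    struct counts totals = {0, 0, 0};
    int in_word = 0;  /* 1 while the previous character belonged to a word */
    int c;            /* int, not char, so EOF stays distinct from every byte */

    while ((c = getc(in)) != EOF) {
        totals.chars++;
        if (c == '\n') {
            totals.lines++;       /* a line is counted at each newline */
        }
        if (isspace(c)) {
            in_word = 0;          /* whitespace ends any current word */
        } else if (!in_word) {
            in_word = 1;          /* first non-whitespace character starts a word */
            totals.words++;
        }
    }
    return totals;
}
```

Calling count_stream(stdin) would process redirected input such as `< DarkAndStormyNight.txt`. Because a word is counted when it starts rather than when it ends, a final word followed directly by end-of-file is still included, and leading spaces or blank lines add nothing to the word total.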
(*) This file has the first sentence from Edward Bulwer-Lytton's excruciatingly bad 1830 novel Paul Clifford. Its fame stems from Charles Schulz's comic Peanuts, and the many strips where Snoopy is writing a novel that starts "It was a dark and stormy night." It is also the inspiration for the annual Bulwer-Lytton Fiction Contest, sponsored by the English Department at San Jose State, a competition to create the worst opening sentence for a novel.
