SWEN-342 Concurrent & Distributed Software Systems
File Processing
The Problem
For this activity you will create a list of every unique word in a file. The words in the list
should be punctuation free and should all be in lower case. Additionally, a couple of metrics
about the file will be displayed.
Requirements
-
All of your code must be in the main
method of a file named FileProcessor.java.
-
You must use Java Streams and lambdas to implement the parsing.
-
Limit any lambdas to 5 lines or fewer. The goal is brevity. In general,
more stream operations are preferred over larger lambdas.
-
Each unique word in the file must be stored in a
List. Words should have no punctuation attached to them and be lower
case.
-
Print the number of words in the list. You must use streams to determine
the word count (don't simply print the list's size).
-
Print every word in the list (one per line) that matches the regular
expression ".*bit.*". Once again, the
work should be accomplished by using streams and lambdas.
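The requirements above can be sketched on a small in-memory sample rather than a file. This is only one possible shape, not the instructor solution. The class name WordListSketch and the helper uniqueWords are hypothetical and exist only to keep the sketch testable (the activity itself requires all code to be in main). Splitting on whitespace, treating "\\p{Punct}" as the definition of punctuation, and sorting the list are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class WordListSketch {
    // Hypothetical helper: turns lines of text into a list of unique,
    // lower-case, punctuation-free words.
    static List<String> uniqueWords(List<String> lines) {
        return lines.stream()
                // Assumption: words are separated by whitespace.
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                // Assumption: "punctuation" means ASCII punctuation (\p{Punct}).
                .map(token -> token.replaceAll("\\p{Punct}", ""))
                .map(String::toLowerCase)
                // Drop tokens that were empty or consisted only of punctuation.
                .filter(word -> !word.isEmpty())
                .distinct()
                // Assumption: alphabetical order, as the example output suggests.
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "The Rabbit took a bit.",
                "A bitter BIT, the rabbit said!");
        List<String> words = uniqueWords(lines);

        // Count with a stream instead of calling words.size().
        long count = words.stream().count();
        System.out.println("There are " + count + " words in the sample.");

        // Print each word matching the regular expression, one per line.
        System.out.println("Words that contain 'bit' in them:");
        words.stream()
                .filter(word -> word.matches(".*bit.*"))
                .forEach(System.out::println);
    }
}
```

Each lambda here is a single expression, which keeps the pipeline in line with the preference for more stream operations over larger lambdas.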
Example Output
Using the alice.txt file, your output should look similar to:
There are 3044 words in the file.
Words that contain 'bit' in them:
ambition
bit
bite
bitter
prohibition
rabbit
rabbits
Hints
-
The instructor solution used 5 stream operations to create the list.
The longest lambda was 3 lines. Yours does not have to match; this
is simply an example to give you some idea of scale.
-
You can use
Files.lines(new File("alice.txt").toPath()) to access the file
as a stream, one line at a time.
-
You are free to hard-code the file name and regular expression
mentioned in this document.
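As a small illustration of the Files.lines hint, the sketch below opens a file as a stream of lines and closes it when done. The class name FileLinesSketch and the helper countNonBlankLines are hypothetical. The sketch writes its own temporary sample file so it runs anywhere; in the activity you would pass new File("alice.txt").toPath() instead. The stream returned by Files.lines holds an open file, so wrapping it in try-with-resources is a reasonable habit.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

class FileLinesSketch {
    // Hypothetical helper: counts the lines that contain something
    // other than whitespace.
    static long countNonBlankLines(Path path) throws IOException {
        // try-with-resources closes the underlying file when the block exits.
        try (Stream<String> lines = Files.lines(path)) {
            return lines
                    .filter(line -> !line.trim().isEmpty())
                    .count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a temporary sample file stands in for alice.txt.
        Path sample = Files.createTempFile("sample", ".txt");
        Files.write(sample, List.of("first line", "", "third line"));

        System.out.println("Non-blank lines: " + countNonBlankLines(sample));

        Files.delete(sample);
    }
}
```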
Deliverables
Commit and push your solution to the GitHub repository by the due date.