SWEN-342 Eng of Concurrent & Distributed SW Systems
Bucket Hash Map
This activity will be done individually.
Introduction
A bucket hash map is a hash map implemented via an array of N buckets. Each bucket is associated with those keys whose Java hashCode (modulo N)
is the bucket's index in the array, and each bucket has a list of some
sort to hold the key/value pairs for its associated keys. One
reason for such a structure is to cut down on the search time for a key
- on average, each bucket holds only 1/N of the hash map entries. In
addition, distinct threads accessing different buckets can proceed
concurrently. The source code for a simplified hash map using native
Java synchronization is at SynchronizedBucketHashMap.java.
While this version works, it has two significant drawbacks:
- Access to each bucket is exclusive, even though buckets are read much more frequently than they are written, and read accesses could safely proceed concurrently.
- The computation of the hash map's size is inconsistent - it may reflect a size which the map never had. Since the size of each bucket is computed while holding only that bucket's lock, changes to other buckets can lead to an incorrect total (see the sketch after this list). For example, assume there are 3 buckets and the map has 50 key/value pairs in it:
- Bucket 0 is locked, its size added in, and bucket 0's lock is released.
- Bucket 1 is locked, its size added in, and bucket 1's lock is released.
- Two key/value pairs are removed from bucket 0 by another thread.
- Three key/value pairs are added to bucket 2 by another thread.
- Bucket 2 is locked, its size added in, and bucket 2's lock is released.
- At this point, the reported size accounts for the addition to bucket 2 but not the removal from bucket 0. The reported size of 53 is wrong - the actual size is 51 - and what is more, the map never had 53 elements in it!
- The problem could be addressed by acquiring locks on all the buckets before computing the size, but the basic Java synchronization mechanisms do not allow this in the general case.
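Continuing the sketch above, the inconsistent size() computation looks roughly like this. It is a hypothetical illustration of the pattern, not the code in the provided file.

    // Hypothetical illustration of the inconsistent pattern: each bucket's
    // lock is held only while that bucket is counted, so other threads can
    // change already-counted (or not-yet-counted) buckets before the sum
    // is complete.
    int size() {
        int total = 0;
        for (Bucket<K, V> bucket : buckets) {
            synchronized (bucket) {               // lock bucket i
                total += bucket.pairs.size();     // count it
            }                                     // release before moving on
        }
        return total;                             // may be a size the map never had
    }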
The Problem
Convert SynchronizedBucketHashMap to a ConcurrentBucketHashMap as follows:
- Copy over the provided file and change the class name globally.
- Add a private ReadWriteLock to the state of the internal Bucket class, and add interface methods to Bucket to lock and unlock the read and the write locks - four methods in all.
- Replace the basic Java synchronization with explicit locking, ensuring the maximum possible concurrency while maintaining overall state consistency (i.e., safety).
- Ensure that size() returns a consistent value by acquiring all of the read locks first, then, as each bucket's size is accumulated, releasing that bucket's read lock (see the sketch after this list).
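The following sketch illustrates one way these steps might fit together. The class and field names are assumptions based on the description above; it is not the provided solution, and your conversion should follow the actual structure of SynchronizedBucketHashMap.java.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.locks.ReadWriteLock;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    // Illustrative sketch only; names and details are assumptions.
    class ConcurrentBucketHashMapSketch<K, V> {

        private static class Pair<K, V> {
            final K key;
            V value;
            Pair(K key, V value) { this.key = key; this.value = value; }
        }

        // Bucket now carries its own ReadWriteLock plus the four
        // lock/unlock interface methods.
        private static class Bucket<K, V> {
            private final List<Pair<K, V>> pairs = new ArrayList<>();
            private final ReadWriteLock rwLock = new ReentrantReadWriteLock();

            void lockRead()    { rwLock.readLock().lock(); }
            void unlockRead()  { rwLock.readLock().unlock(); }
            void lockWrite()   { rwLock.writeLock().lock(); }
            void unlockWrite() { rwLock.writeLock().unlock(); }

            int size() { return pairs.size(); }
        }

        private final int numberOfBuckets;
        private final List<Bucket<K, V>> buckets = new ArrayList<>();

        ConcurrentBucketHashMapSketch(int n) {
            numberOfBuckets = n;
            for (int i = 0; i < n; i++) buckets.add(new Bucket<>());
        }

        private Bucket<K, V> bucketFor(K key) {
            return buckets.get(Math.abs(key.hashCode() % numberOfBuckets));
        }

        // Readers take the bucket's read lock, so concurrent reads of the
        // same bucket can overlap while writers are excluded.
        public V get(K key) {
            Bucket<K, V> bucket = bucketFor(key);
            bucket.lockRead();
            try {
                for (Pair<K, V> p : bucket.pairs) {
                    if (p.key.equals(key)) return p.value;
                }
                return null;
            } finally {
                bucket.unlockRead();
            }
        }

        // Writers take the bucket's write lock, excluding both readers and
        // other writers on that bucket only.
        public void put(K key, V value) {
            Bucket<K, V> bucket = bucketFor(key);
            bucket.lockWrite();
            try {
                for (Pair<K, V> p : bucket.pairs) {
                    if (p.key.equals(key)) { p.value = value; return; }
                }
                bucket.pairs.add(new Pair<>(key, value));
            } finally {
                bucket.unlockWrite();
            }
        }

        // Consistent size(): acquire every bucket's read lock first, then
        // add each bucket's size and release that bucket's lock as it is
        // counted.
        public int size() {
            for (Bucket<K, V> bucket : buckets) {
                bucket.lockRead();
            }
            int total = 0;
            for (Bucket<K, V> bucket : buckets) {
                total += bucket.size();
                bucket.unlockRead();
            }
            return total;
        }
    }

Holding all of the read locks before summing means the total corresponds to a state the map actually held, while releasing each lock as soon as its bucket is counted keeps writers blocked for as short a time as possible.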
You must, of course, test your implementation both to ensure proper functionality and to show that increased concurrency is possible. For this latter issue, you are free to instrument your ConcurrentBucketHashMap to (a) force thread switches and insert delays, and (b) print out trace information showing proper concurrent access by multiple client threads. Collect this trace information into a text file and annotate it to show that concurrent read access to a given bucket is possible, while read access to a bucket being written is impossible (actually, that it cannot be produced in a trace).
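One hypothetical form such instrumentation might take - to be removed before submitting ConcurrentBucketHashMap.java - is a delay plus a pair of trace prints inside the read path of the sketch above, which makes overlapping readers visible in the output:

    // Hypothetical instrumentation of get(): the sleep widens the window
    // in which a second reader can enter the same bucket, and the prints
    // record which thread is inside which bucket.
    public V get(K key) {
        Bucket<K, V> bucket = bucketFor(key);
        bucket.lockRead();
        try {
            System.out.println(Thread.currentThread().getName()
                    + " reading bucket for key " + key);
            Thread.sleep(100);            // force a window for other threads
            for (Pair<K, V> p : bucket.pairs) {
                if (p.key.equals(key)) return p.value;
            }
            return null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            System.out.println(Thread.currentThread().getName()
                    + " done reading bucket for key " + key);
            bucket.unlockRead();
        }
    }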
Deliverables
Submit your solution by the deadline in MyCourses by pushing it to the GitHub repository. At a minimum you must have two files named exactly ConcurrentBucketHashMap.java and Trace.txt. Note that ConcurrentBucketHashMap.java must be free of any instrumentation code.
The activity will be graded based on the course rubric.