|
The Java Specialists' Newsletter
Issue 015 2001-03-28
Category:
Performance
Java version: Implementing a SoftReference based HashMapby Dr. Heinz M. Kabutz
Welcome to the 15th issue of "The Java(tm) Specialists'
Newsletter", this week we are looking at an extremely useful
addition to the Java language, namely soft and weak references.
In a nutshell, the main difference between the two seems to be
that the SoftReference has some notion of remembering when last
it was accessed, which may be used by the GC (if it feels like
it) to release little-used SoftReferences first.
Next week I will be doing some training in Germany, so I will not
be able to send out a newsletter, as I have not decided on a list
server yet. Hopefully the week after that you'll have your
normal fix again. Please continue forwarding this newsletter to
your local JUG, friends and family who are interested in Java.
If you do send the newsletter to a closed user list, kindly tell
me, I like having an indication of how many souls are reading my
newsletter. Similarly, if you quote from my newsletter, be so
kind as to attribute the source.
Many thanks to Sydney Redelinghuys for his input and corrections
in this newsletter.
Once again I have seen the value in actually testing my code,
as there was a serious error in my code which I only found
through my "unit" test.
Thanks for reading this newsletter on our website. We also have a mailing list. That is where the real action takes place (webinars, free reports, etc.). Maybe subscribe today?
Advanced Java Courses on Crete:Java Specialists Master Course 18-21 June 2013 and
Concurrency Specialists Course 6-9 August 2013.
Implementing a SoftReference based HashMap
Java is slow. Java is a memory hog.
But, if I have to write a network application, I would not
hesitate for a second to use Java. Why? Because Java shines in
threaded, network-based applications and because over the years,
I have recognised the weaknesses of Java and have learnt (the
hard way) what I should do, or not do, to avoid running into too
serious problems. Recognising the problems takes a while to
learn, which is why most Java jobs require you to have at least 2
years of Java programming experience.
One of the most common places of memory wastage is in hash maps,
so SUN have provided a WeakHashMap to minimize memory usage in
caches implemented using maps. A WeakHashMap stores the keys
using WeakReference objects, which means in layman's terms that
as soon as the key is not referenced from somewhere else in
your program, the entry may be removed and is available for
garbage collection. (Have a look at the JavaDocs for the
java.util.WeakHashMap and java.lang.ref.WeakReference for more
information) Say you write a middle tier that provides an OO
view of your database via an application server. What you really
want is for the values to be automatically released, rather than
the keys. If you put all the objects into a normal HashMap, you
can easily run out of memory when you access many different
objects from different clients. But if you store the objects in
a WeakHashMap they are cleared as soon as your clients is not
referencing them anymore. What you really want, however, is to
only have the objects released when the VM is running low on
memory, since that is when you get problems.
Enter SoftReferences. As far as I understand, a SoftReference
will only be garbage collected when the VM is running low on
memory and the object it is pointing to is not accessed from a
normal (hard) reference. This is probably a better option to use
for the HashMap values than a WeakReference, but the default JDK
collections don't include a GC-friendly HashMap based on values
and neither does it provide a SoftReference based HashMap.
Before I show you a SoftHashMap implementation, based on ideas
by Sydney (what's up doc?) Redelinghuys, I need to explain some
ideas which will make our SoftHashMap more optimal.
- Each time we change the map (put, remove, clear) or ask for
its size, we first want to go through the map and throw away
entries for which the values have been garbage collected.
It is quite easy to find out which soft references have been
cleared. You can give the SoftReference a ReferenceQueue to
which it is added when it is garbage collected.
- We don't want our hash map to be bullied by the garbage
collector, so we provide the option of the map itself keeping
a hard link to the last couple of objects (typically 100).
- The SoftHashMap will use a variant of the Decorator pattern to
add this functionality to an internally kept
java.util.HashMap. I'm busy working on a Design Patterns
course based on the GOF book, let me know if you want further
information.
Without further ado, here comes the SoftHashMap:
//: SoftHashMap.java
import java.util.*;
import java.lang.ref.*;
public class SoftHashMap extends AbstractMap {
/** The internal HashMap that will hold the SoftReference. */
private final Map hash = new HashMap();
/** The number of "hard" references to hold internally. */
private final int HARD_SIZE;
/** The FIFO list of hard references, order of last access. */
private final LinkedList hardCache = new LinkedList();
/** Reference queue for cleared SoftReference objects. */
private final ReferenceQueue queue = new ReferenceQueue();
public SoftHashMap() { this(100); }
public SoftHashMap(int hardSize) { HARD_SIZE = hardSize; }
public Object get(Object key) {
Object result = null;
// We get the SoftReference represented by that key
SoftReference soft_ref = (SoftReference)hash.get(key);
if (soft_ref != null) {
// From the SoftReference we get the value, which can be
// null if it was not in the map, or it was removed in
// the processQueue() method defined below
result = soft_ref.get();
if (result == null) {
// If the value has been garbage collected, remove the
// entry from the HashMap.
hash.remove(key);
} else {
// We now add this object to the beginning of the hard
// reference queue. One reference can occur more than
// once, because lookups of the FIFO queue are slow, so
// we don't want to search through it each time to remove
// duplicates.
hardCache.addFirst(result);
if (hardCache.size() > HARD_SIZE) {
// Remove the last entry if list longer than HARD_SIZE
hardCache.removeLast();
}
}
}
return result;
}
/** We define our own subclass of SoftReference which contains
not only the value but also the key to make it easier to find
the entry in the HashMap after it's been garbage collected. */
private static class SoftValue extends SoftReference {
private final Object key; // always make data member final
/** Did you know that an outer class can access private data
members and methods of an inner class? I didn't know that!
I thought it was only the inner class who could access the
outer class's private information. An outer class can also
access private members of an inner class inside its inner
class. */
private SoftValue(Object k, Object key, ReferenceQueue q) {
super(k, q);
this.key = key;
}
}
/** Here we go through the ReferenceQueue and remove garbage
collected SoftValue objects from the HashMap by looking them
up using the SoftValue.key data member. */
private void processQueue() {
SoftValue sv;
while ((sv = (SoftValue)queue.poll()) != null) {
hash.remove(sv.key); // we can access private data!
}
}
/** Here we put the key, value pair into the HashMap using
a SoftValue object. */
public Object put(Object key, Object value) {
processQueue(); // throw out garbage collected values first
return hash.put(key, new SoftValue(value, key, queue));
}
public Object remove(Object key) {
processQueue(); // throw out garbage collected values first
return hash.remove(key);
}
public void clear() {
hardCache.clear();
processQueue(); // throw out garbage collected values
hash.clear();
}
public int size() {
processQueue(); // throw out garbage collected values first
return hash.size();
}
public Set entrySet() {
// no, no, you may NOT do that!!! GRRR
throw new UnsupportedOperationException();
}
}
And here comes some test code that demonstrates to a certain
degree that I'm not talking complete nonsense. Soft and weak references are quite
difficult to experiment with as there is a lot of freedom left to the writers of
the JVM of how they must implement them. I wish the implementation would hold
back longer before removing these references, that the JVM would wait until it
is really running low on memory before clearing them, but unfortunately I am
not the one who wrote the JVM. I have tried to question the authors of the
java.lang.ref package to find out what the strategy is for references in future
versions, but have not had any response yet.
//: SoftHashMapTest.java
import java.lang.ref.*;
import java.util.*;
public class SoftHashMapTest {
private static void print(Map map) {
System.out.println("One=" + map.get("One"));
System.out.println("Two=" + map.get("Two"));
System.out.println("Three=" + map.get("Three"));
System.out.println("Four=" + map.get("Four"));
System.out.println("Five=" + map.get("Five"));
}
private static void testMap(Map map) {
System.out.println("Testing " + map.getClass());
map.put("One", new Integer(1));
map.put("Two", new Integer(2));
map.put("Three", new Integer(3));
map.put("Four", new Integer(4));
map.put("Five", new Integer(5));
print(map);
byte[] block = new byte[10*1024*1024]; // 10 MB
print(map);
}
public static void main(String[] args) {
testMap(new HashMap());
testMap(new SoftHashMap(2));
}
}
Until next week, and please remember to forward this newsletter
and send me your comments.
Heinz
Performance Articles
Related Java Course
Discuss at The Java Specialist Club
|