The Java Specialists' Newsletter
Issue 133, 2006-09-29. Category: Performance. Java version: JDK 1.6 beta2

Safely and Quickly Converting EJB3 Collections

by Levent Yurtsever, Andreas Schmidt, Dr. Heinz M. Kabutz
Abstract:
When we query the database using EJB3, the Query object returns an untyped collection. In this newsletter we look at several approaches for safely converting this to a typed collection.

Welcome to the 133rd edition of The Java(tm) Specialists' Newsletter. We only have another three weeks left in sunny South Africa, before we move to Greece. We are enjoying the last weeks, sitting in the sun with some visitors from Germany.

We are privileged to have Andreas Schmidt from Karlsruhe with us, doing a practical at our company towards his degree. In addition, we did some private tutoring for Levent Yurtsever this week. As part of the private tutoring, we investigated some advanced EJB3 persistence issues together. Lots of fun, except that Levent made some German-style coffee on Monday, which kept me awake until 3:30 am!

This newsletter is the result of some of the discussions we had about generics, casting and EJB3. It is a group effort between Levent, Andreas and myself.

Safely and Quickly Converting EJB3 Collections

One of the quirks of EJB3 is that it does not want to take responsibility for ClassCastExceptions. If you select a list of entity beans, it comes back as an untyped list. One approach is to then simply cast the list to a typed list (e.g. List<Banner>) and hope that you do not get a ClassCastException further down the line. Here is a snippet of EJB3 code that would get a result list of Banner objects and cast it to a typesafe collection (with a warning):

Query q = em.createQuery("SELECT b FROM Banner b");
Collection<Banner> banners = q.getResultList(); // causes warning
for(Banner banner : banners) {
  System.out.println("banner = " + banner);
}
  

Levent's default approach, which he also used with Hibernate, was to copy the entire list into a new typed ArrayList, casting each object with Class.cast():

public static <T> List<T> copy(List<?> objects, Class<T> cls) {
  List<T> typedList = new ArrayList<T>(objects.size());
  for (Object o : objects) {
    typedList.add(cls.cast(o));
  }
  return typedList;
}
  

We were not entirely happy with this approach, because it meant that every collection would be copied. In addition, if the type of the collection was anything other than an ArrayList, that information would be lost in the copy.
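
Both points can be seen in a small standalone sketch. The class name CopyDemo and the sample data are made up for illustration; the copy() method is the one shown above, repeated so that the example compiles on its own. It shows that a LinkedList comes back as an ArrayList, and that a wrongly typed element is rejected while copying:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

public class CopyDemo {
  // same method as above, repeated so that the sketch is self-contained
  public static <T> List<T> copy(List<?> objects, Class<T> cls) {
    List<T> typedList = new ArrayList<T>(objects.size());
    for (Object o : objects) {
      typedList.add(cls.cast(o));
    }
    return typedList;
  }

  public static void main(String[] args) {
    List<Object> source = new LinkedList<Object>(Arrays.asList("a", "b"));
    List<String> typed = copy(source, String.class);
    // the LinkedList has silently become an ArrayList
    System.out.println(typed.getClass().getSimpleName());

    source.add(42); // sneak in an Integer
    try {
      copy(source, String.class);
    } catch (ClassCastException expected) {
      System.out.println("rejected during the copy");
    }
  }
}
```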

To correct this, we created three different strategies for converting the list. The first was Levent's original approach: simply copy the list and use the Class.cast() method to avoid warnings. If an element had the wrong type, we would get a ClassCastException. The second would iterate through the list and check the type of each element, throwing a ClassCastException if the type was wrong. However, it caused unchecked conversion warnings on compilation, which we then turned off with the @SuppressWarnings annotation. The third approach was based on how the Collections.checkedCollection() method verifies that the elements of a collection have the correct type. In this approach, we convert the entire collection to an array using the Collection.toArray() method. Since this can be done deep down in native code for some collections, we may end up with a faster check than with the other methods.

None of these approaches would be a good choice when we have a really large list that is being retrieved lazily. For performance critical code, I have added a fourth converter that simply does a cast of the collection without checking. You may get ClassCastExceptions later on upon using it.

We start with an abstract superclass that will contain a convert method for changing an untyped Collection or List to typed:

import java.util.*;
public abstract class Converter {
  public final static Converter TO_ARRAY = new ToArrayConverter();
  public final static Converter WITH_COPY = new WithCopyConverter();
  public final static Converter WITH_LOOP = new WithLoopConverter();
  public final static Converter UNSAFE = new UnsafeConverter();

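  public abstract <T> Collection<T> convert(Class<T> dest,
                                   Collection<?> objects);

  // convenience overload for lists; relies on every converter returning
  // either the list it was given or a new List
  @SuppressWarnings("unchecked")
  public final <T> List<T> convert(Class<T> dest, List<?> objects) {
    return (List<T>) convert(dest, (Collection<?>) objects);
  }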
}
  

The first Converter is based on Levent's earlier approach of copying the entire collection to a new list. It compiles without any unchecked conversion warnings, but it has to allocate and fill a new list on every call:

import java.util.*;
// no warnings generated here
public class WithCopyConverter extends Converter {
  public <T> Collection<T> convert(Class<T> dest,
                                   Collection<?> objects) {
    Collection<T> result = new ArrayList<T>(objects.size());
    for (Object obj : objects) {
      result.add(dest.cast(obj));
    }
    return result;
  }
}
  

The second converter is quite similar except that it does not copy the list. It simply goes through and checks the type of each element. It generates an unchecked conversion warning:

import java.util.*;
public class WithLoopConverter extends Converter {
  @SuppressWarnings("unchecked")
  public <T> Collection<T> convert(Class<T> dest,
                                   Collection<?> objects) {
    for (Object obj : objects) {
      if (obj != null && !dest.isInstance(obj)) {
        throw new ClassCastException();
      }
    }
    return (Collection<T>) objects; // this causes the warning
  }
}
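
To see what the in-place check buys us, here is a small standalone sketch. The class LoopCheckDemo and its method checkedView() are hypothetical names, not part of the Converter hierarchy; the loop itself is the same as in WithLoopConverter. The sketch shows that the very same instance comes back, that null elements pass, and that a wrongly typed element is rejected up front:

```java
import java.util.Collection;
import java.util.LinkedList;

public class LoopCheckDemo {
  // hypothetical helper using the same loop as WithLoopConverter
  @SuppressWarnings("unchecked")
  public static <T> Collection<T> checkedView(Class<T> dest,
                                              Collection<?> objects) {
    for (Object obj : objects) {
      // isInstance(null) is false, so null needs its own test
      if (obj != null && !dest.isInstance(obj)) {
        throw new ClassCastException(
            obj.getClass().getName() + " is not a " + dest.getName());
      }
    }
    return (Collection<T>) objects;
  }

  public static void main(String[] args) {
    Collection<Object> raw = new LinkedList<Object>();
    raw.add("hello");
    raw.add(null);
    Collection<String> typed = checkedView(String.class, raw);
    // same instance, so it is still a LinkedList and nothing was copied
    System.out.println((Object) typed == (Object) raw);

    raw.add(42);
    try {
      checkedView(String.class, raw);
    } catch (ClassCastException expected) {
      System.out.println(expected.getMessage());
    }
  }
}
```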
  

The third approach is the most interesting one. I discovered it by looking at the java.util.Collections.checkedCollection() method. This factory method takes an ordinary collection and a class, and returns a view that only lets you add elements of the correct type. When a whole collection is added to it with addAll(), it needs to check that all the incoming elements are of the correct type, and it does so by copying them into a typed array. Assuming that the author of the method had access to the brightest performance experts at Sun, we would expect them to have chosen the most efficient approach. As we will see in the results, there are stark differences between the server and client virtual machines. But first, the ToArrayConverter:

import java.lang.reflect.Array;
import java.util.Collection;
// warnings caused by the casts
public class ToArrayConverter extends Converter {
  @SuppressWarnings("unchecked")
  public <T> Collection<T> convert(Class<T> dest,
                                   Collection<?> objects) {
    try {
      objects.toArray((T[])Array.newInstance(dest, objects.size()));
    } catch (ArrayStoreException ase) {
      throw new ClassCastException();
    }
    return (Collection<T>) objects;
  }
}
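
The check works because arrays know their element type at runtime: storing an Integer into a String[] fails with an ArrayStoreException, regardless of how the collection fills the array. The following sketch isolates that mechanism. The class ArrayStoreDemo and the method fitsInto() are invented for this example, and it assumes that dest is a reference type (a primitive class such as int.class would not yield an Object[]):

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedList;

public class ArrayStoreDemo {
  // true if every element could be stored in an array of type dest[];
  // assumes dest is a reference type
  public static boolean fitsInto(Class<?> dest, Collection<?> objects) {
    Object[] target = (Object[]) Array.newInstance(dest, objects.size());
    try {
      objects.toArray(target); // the array itself rejects wrong elements
      return true;
    } catch (ArrayStoreException ase) {
      return false;
    }
  }

  public static void main(String[] args) {
    Collection<Object> list = new ArrayList<Object>();
    list.add("one");
    list.add("two");
    System.out.println(fitsInto(String.class, list)); // true

    list.add(3);
    System.out.println(fitsInto(String.class, list)); // false
    // a LinkedList fills the array element by element, same outcome
    System.out.println(
        fitsInto(String.class, new LinkedList<Object>(list))); // false
  }
}
```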
  

The fourth approach is to simply cast the collection to the correct generic type. This is unsafe in that it might contain incorrect elements. You would only notice that further down the line when you iterate over the collection:

import java.util.Collection;
// warnings caused by the cast
public class UnsafeConverter extends Converter {
  @SuppressWarnings("unchecked")
  public <T> Collection<T> convert(Class<T> dest,
                                   Collection<?> objects) {
    return (Collection<T>) objects;
  }
}
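
How late "further down the line" can be is easy to demonstrate. In this sketch (LateFailureDemo, unsafe() and firstBadIndex() are names made up for the example) the cast itself succeeds, and the ClassCastException only appears when the loop reaches the offending element:

```java
import java.util.ArrayList;
import java.util.Collection;

public class LateFailureDemo {
  // same idea as UnsafeConverter: cast without looking at the elements
  @SuppressWarnings("unchecked")
  public static <T> Collection<T> unsafe(Collection<?> objects) {
    return (Collection<T>) objects;
  }

  // returns the position at which iteration failed, or -1 if it did not
  public static int firstBadIndex(Collection<String> strings) {
    int index = 0;
    int totalLength = 0;
    try {
      for (String s : strings) { // compiler-inserted cast to String
        if (s != null) {
          totalLength += s.length();
        }
        index++;
      }
    } catch (ClassCastException e) {
      return index;
    }
    return -1;
  }

  public static void main(String[] args) {
    Collection<Object> raw = new ArrayList<Object>();
    raw.add("fine");
    raw.add(42);
    Collection<String> strings = unsafe(raw); // no exception here
    System.out.println(firstBadIndex(strings)); // 1
  }
}
```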
  

Performance Results

I would expect the ToArrayConverter to be the fastest of the first three for a java.util.ArrayList. The reason for this is that the ToArrayConverter essentially uses a System.arraycopy() native call when called on an ArrayList. However, it would make most sense for the WithLoopConverter to be the fastest for a LinkedList, since a LinkedList has to walk its nodes to fill the array, so toArray() does at least as much work as a plain loop.

A short explanation of the results: I tried collections of size 5, 50 and 500. Although I ran the tests with a whole bunch of collections, I show only the two most interesting ones, LinkedList and ArrayList. Each figure is the minimum JAMon reading, in milliseconds, over ten rounds of 20,000 conversions. I took the minimum rather than the average run length, the logic being that you cannot go faster than you can go.
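
The reasoning behind the minimum is that disturbances such as garbage collection or thread scheduling can only make a round slower, never faster, so the fastest round is the closest we get to the undisturbed cost. A minimal sketch of that idea follows. It uses System.nanoTime() instead of JAMon, and the class MinTimer and the method minNanos() are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;

public class MinTimer {
  // runs the task "count" times per round and returns the duration of
  // the fastest round in nanoseconds; expects rounds >= 1
  public static long minNanos(Runnable task, int rounds, int count) {
    long min = Long.MAX_VALUE;
    for (int r = 0; r < rounds; r++) {
      long start = System.nanoTime();
      for (int i = 0; i < count; i++) {
        task.run();
      }
      long elapsed = System.nanoTime() - start;
      min = Math.min(min, elapsed);
    }
    return min;
  }

  public static void main(String[] args) {
    final List<String> list = new ArrayList<String>();
    for (int i = 0; i < 500; i++) {
      list.add(Integer.toString(i));
    }
    Runnable task = new Runnable() {
      public void run() {
        list.toArray(new String[list.size()]);
      }
    };
    minNanos(task, 10, 20 * 1000); // warm-up, result ignored
    System.out.println("fastest round: "
        + minNanos(task, 10, 20 * 1000) + " ns");
  }
}
```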

java.util.LinkedList

Notice how for LinkedList, the ToArrayConverter is consistently the fastest with the Client VM. With the Server VM, the WithLoopConverter is consistently the fastest. The Server VM results coincide more readily with what we would expect based on code inspection.

Converter           Size  Client  Server
WithCopyConverter      5      37      13
WithLoopConverter      5      39       6
ToArrayConverter       5      32      33
WithCopyConverter     50     145     123
WithLoopConverter     50     205      57
ToArrayConverter      50     105     109
WithCopyConverter    500    1457     513
WithLoopConverter    500    2072     254
ToArrayConverter     500     583     598

java.util.ArrayList

Here the results are also interesting. We find that for the Client VM, the ToArrayConverter is almost always the fastest. However, with the Server VM, the WithLoopConverter is the fastest on average.

Converter           Size  Client  Server
WithCopyConverter      5      27       7
WithLoopConverter      5      81      11
ToArrayConverter       5      38      35
WithCopyConverter     50     213     121
WithLoopConverter     50     322      43
ToArrayConverter      50      93      74
WithCopyConverter    500    2163     489
WithLoopConverter    500    3262     431
ToArrayConverter     500     455     432

It is impossible to know for sure, but I suspect that the person writing the Collections.checkedCollection() method ran one quick test and noticed the performance improvements for the default (i.e. client) VM. We should never look only at the performance results; we also need a really solid explanation for them.

Lastly, I want to show you the performance test harness, written by Andreas Schmidt and Levent Yurtsever. It was great on Wednesday. I popped into my house to have a chat with a friend who had come for a visit, since we had finished the tutoring session for the day. When I emerged back in the office about 1.5 hours later, Andy and Levent had refactored my initial test into something that shows no signs of WET. For the timing we used JAMon, which is available on SourceForge.

import com.jamonapi.*;
import java.util.*;
import java.util.concurrent.ConcurrentLinkedQueue;
import static java.util.concurrent.TimeUnit.MILLISECONDS;

public class ConverterPerformanceTest {
  private static final int COUNT = 20 * 1000;
  private static final int REPEATS = 10;

  private static final int[] NO_OF_ELEMENTS = {5, 50, 500};

  private static final Converter[] converters = {
      Converter.WITH_COPY, Converter.WITH_LOOP, Converter.TO_ARRAY,
      Converter.UNSAFE
  };

  public static void main(String[] args) {
    test(new LinkedList());
    test(new ArrayList());
    test(new HashSet());
    test(new ConcurrentLinkedQueue());
  }

  private static void gcAndWait() {
    try {
      // this not only clears the memory, but offers a specific
      // GC signature that will allow us to determine which GC
      // activity was for which test.
      MILLISECONDS.sleep(500);
      System.gc(); System.gc(); System.gc();
      MILLISECONDS.sleep(5000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  private static void test(Collection valueContainer) {
    printHeading(valueContainer);
    for (int elementCount : NO_OF_ELEMENTS) {
      for (Converter converter : converters) {
        test(elementCount, valueContainer, converter);
      }
    }
    System.out.println();
  }

  private static void test(int elements, Collection valueContainer,
                           Converter converter) {
    prepareContainer(elements, valueContainer);
    gcAndWait();

    // make sure that the HotSpot Compiler has chance for its magic
    for (int i = 0; i < COUNT * REPEATS / 2; i++) {
      converter.convert(String.class, valueContainer);
    }

    String measurement = converter.getClass().getSimpleName() +
        "-" + valueContainer.getClass().getSimpleName()
        + "(" + elements + ")";
    Monitor mon = MonitorFactory.start(measurement);

    for (int j = 0; j < REPEATS; j++) {
      mon.start();
      for (int i = 0; i < COUNT; i++) {
        converter.convert(String.class, valueContainer);
      }
      mon.stop();
    }
    printResults(measurement, mon);
  }

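  private static void prepareContainer(int elements,
                                       Collection valueContainer) {
    // start from an empty container, otherwise elements from earlier
    // runs accumulate and the size no longer matches "elements"
    valueContainer.clear();
    for (int i = 0; i < elements; i++) {
      valueContainer.add(Integer.toString(i));
    }
  }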

  public static void printHeading(Collection valueContainer) {
    System.out.println(valueContainer.getClass());
    // columns match printResults(): the label, then the JAMon statistics
    System.out.println("Measurement,Avg,StdDev,Min,Max");
  }

  public static void printResults(String measurement, Monitor mon) {
    System.out.printf("%s,%.0f,%.0f,%.0f,%.0f%n", measurement,
        mon.getAvg(), mon.getStdDev(), mon.getMin(), mon.getMax());
  }
}
  

It was fun writing this newsletter with two co-authors, thanks Levent and Andy :)

Kind regards

Heinz

P.S. We heard some Egyptian Geese next door today. My neighbour used to farm in Zambia and started feeding the Guinea Fowls that prowl the neighbourhood, but word has spread amongst the wildlife. Tonight we had one of the most spectacular sunsets of all time, so we are in for a treat of a day tomorrow. Maybe we will even brave our pool?

P.P.S. WET stands for Write Every Time and is the opposite of the desirable DRY (Don't Repeat Yourself). My friend John Green thought that one up himself.


© 2010-2016 Heinz Kabutz - All Rights Reserved
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. JavaSpecialists.eu is not connected to Oracle, Inc. and is not sponsored by Oracle, Inc.