Running on Java 22-ea+27-2262 (Preview)
Home of The JavaSpecialists' Newsletter

254Big O Cost of Class.getMethod()

Author: Dr. Heinz M. KabutzDate: 2018-02-27Java Version: 1.4Category: Tips and Tricks
 

Abstract: We now look at why the best-case scenario for a getMethod() call is O(n), not O(1) as we would expect. We also discover that the throughput of getMethod() has doubled in Java 9.

 

Welcome to the 254th edition of The Java(tm) Specialists' Newsletter. As a pimply-faced youth, when I needed to do some deep thinking, I'd gather my books and wander down to Botany Bay. I would sit on the rocks, watching the kelp bob up and down in the surf, and think. Def Leppard's "Pour Some Sugar On Me" bellowed from walkman into eardrums. It was the 80s after all. There's something about staring at salty water that seems to stir my creative juices.

Fast-forward thirty years to Crete. Since I am a "polyteknos" -- I have too many children -- we drive a Mercedes Viano. We use it on Sundays to cart the family off to church and a delicious lunch at Mitsos afterwards. The rest of the week it gathers dust, lots of. But not anymore. It is now my mobile executive suite. It looks rather conspicuous, all black with darkened windows. The previous owner chauffeured famous rock stars and royalty. It makes the perfect thinking place. I drive it to random beaches and sit in the back surrounded by Java books. There I philosophy about how the design patterns of old have a place in modern Java.

I will tell you more about my adventures in future newsletters.

javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.

Big O Cost of Class.getMethod()

As I mentioned yesterday, I was searching for examples of the GoF Builder in the JDK. I thought that perhaps reflection would use it. I knew that the Method returned by Class.getMethod() was a protective copy. Turns out, this is rather an example of the Prototype. In order to understand how Class.getMethod() worked internally, I stepped into the code with IntelliJ. Whilst doing so, I noticed that even when we had found the method we were looking for, it kept on searching. (BTW, if you love IntelliJ or are curious what it can do, be sure to get my free IntelliJ Wizardry Course).

When we call Class.getMethod(name,parameterTypes), it does a linear search through an internal publicMethods array, lazily created. The methods are not sorted in any way and, as I discovered in my tests, they are also not always stored in the same order. This is why a linear search is necessary. Most of the time, the number of methods is going to be fairly small for one class, although we do include superclass methods too. We would expect it to be less than 100 though. Is that reasonable? I did some searching through the Java 8 rt.jar file. I found almost 500 classes that had 100 or more methods, including those from superclasses. The worst two offenders were from CORBA, as you probably expected. The next 38 offenders were Swing or UI classes. The average number of methods per class is 25, which includes methods from Object. If we use getDeclaredMethods(), the average number of methods per class decreases to just 7.35.

Now comes the clincher: Even when we find a matching method in our jumbled up array, we don't stop searching. Why?

Java 5 gave us covariant return types. This means that if class A has method Object foo(), then subclass B can have method Number foo(). Same signature, different return type. A further subclass C of B could contain method Integer foo():

public class A {
  public Object foo() {
    return new Object();
  }
}

public class B extends A {
  public Number foo() {
    return new Integer(42);
  }
}

public class C extends B {
  public Integer foo() {
    return new Integer(42);
  }
}

C.foo()'s return type must be assigneable from B. Thus we cannot make it return String. This also means that class C actually contains three methods foo():

import java.lang.reflect.*;

public class ReflectionTest {
  public static void main(String... args)
      throws NoSuchMethodException {
    System.out.println(A.class.getMethod("foo"));
    System.out.println(B.class.getMethod("foo"));
    System.out.println(C.class.getMethod("foo"));

    System.out.println("All methods in class C:");

    for (Method method : C.class.getMethods()) {
      System.out.println(method);
    }
  }
}

When we run this, we see the following output:

public java.lang.Object A.foo()
public java.lang.Number B.foo()
public java.lang.Integer C.foo()
All methods in class C:
public java.lang.Object C.foo()
public java.lang.Number C.foo()
public java.lang.Integer C.foo()
    ... and all the Object methods

When I saw Class.searchMethods() in operation, I realized why they were continuing their search, even when they had found a good candidate. The reason is that there might be another method further down in the array that had a more specific return type. If we call C.class.getMethod("foo"), we want public java.lang.Integer C.foo()

Since the average number of methods per class is small, they decided to not bother sorting the array. A binary search would find the desired method in O(log n) steps. However, sorting would take O(n * log n) steps, so unless we did a lot of lookups, we would never regain the performance loss from sorting. A similar argument would apply for a HashMap.

Usually, the best-case scenario for a search is O(1). For example, if we search through an unsorted list, we would start at the beginning and scan through until the end. If we were lucky, the first element would be the one we were looking for and we could stop. Average case for an unsorted list would be O(n). On average we would need n/2 lookups, but since it is a constant multiplier of 0.5, we would only consider the slowdown as n became larger. Since the average time would be in relation to the size of the list, it would also be written as O(n). The worst-case is what we consider most of the time for searching, and that is also O(n) for an unsorted list. If we have a sorted ArrayList, then best-case would be O(1) and the average and worst cases would be O(log n). If this is all a bit much for you, make sure to get our Data Structures in Java 9 Course, which explains complexity in a lot more detail.

In Class.searchMethods(), the best-case is O(n), as are the average and worst cases. However, this is only since Java 5, when covariant return types were introduced into the language. Prior to that, the computational complexities were the same as for an unsorted list.

How can we demonstrate this beyond eyeballing the JDK code?

Here is a ClassGenerator that produces classes with increasing numbers of methods. I wrote the code in Java 1.4 syntax so that I could run it on an old machine. No generics, no Java 8 streams, no try-with-resource, no varargs. It felt a bit weird. Here we go:

import java.io.*;

public class ClassGenerator {
  public static void main(String[] args) throws IOException {
    for (int methods = 4; methods < 65536; methods *= 5) {
      PrintStream out = new PrintStream(
          new FileOutputStream("Methods" + methods + ".java"));
      out.println("public class Methods" + methods + " {");
      for (int i = 0; i < methods; i++) {
        out.println("  public void foo" + i + "() {}");
      }
      out.println("}");
      out.close();
    }
  }
}

ClassGenerator spits out 7 classes with the largest containing 62500 methods. The JDK 1.4.0-b92 javac compiler choked with a StackOverflowError on the largest class, so I used the Java 8 compiler with -source 1.4 and -target 1.4. That worked.

To demonstrate the performance degradation post Java 4, I wrote the following class that tests how many times we can find the first, last and middle "foo" methods in our array.

import java.lang.reflect.*;

public class MethodLookupCost {
  public static void main(String[] args) throws Exception {
    System.out.println("Warmup");
    for (int i = 0; i < 2; i++) {
      test();
    }
    System.out.println();
    System.out.println("Proper run");
    for (int i = 0; i < 3; i++) {
      test();
    }
    System.out.println(blackhole);
  }

  private static void test() throws Exception {
    for (int methods = 4; methods < 65536; methods *= 5) {
      test(Class.forName("Methods" + methods));
    }
  }

  private static volatile Method blackhole;

  private static void test(Class clazz) throws Exception {
    System.out.println(clazz);
    System.out.println(clazz.toString().replaceAll(".", "-"));
    Method[] methods = clazz.getMethods();

    int firstIndex = fooIndex(methods, 0, 1);
    Method first = methods[firstIndex];
    int lastIndex = fooIndex(methods, methods.length - 1, -1);
    Method last = methods[lastIndex];
    int middleStart = (lastIndex - firstIndex) / 2;
    int middleIndex = fooIndex(methods, middleStart, 1);
    Method middle = methods[middleIndex];

    System.out.println("indexes = (" + firstIndex + ", " +
        middleIndex + ", " + lastIndex + ")");
    System.out.println("foos = (" + first.getName() + ", " +
        middle.getName() + ", " + last.getName() + ")");

    System.out.println("  first  = " + methodFinds(first));
    System.out.println("  middle = " + methodFinds(middle));
    System.out.println("  last   = " + methodFinds(last));
    System.out.println();
  }

  private static int fooIndex(Method[] methods,
                              int start, int inc) {
    int idx = start;
    while (!methods[idx].getName().startsWith("foo")) idx += inc;
    return idx;
  }

  private static long methodFinds(Method method)
      throws Exception {
    return methodFinds(method.getDeclaringClass(),
        method.getName());
  }

  /**
   * Counts how many times we can get the method in a second.
   */
  private static long methodFinds(Class clazz, String method)
      throws Exception {
    long time = System.currentTimeMillis();
    long methodFinds = 0;
    while (System.currentTimeMillis() - time < 1000) {
      blackhole = clazz.getMethod(method, (Class[]) null);
      methodFinds++;
    }
    return methodFinds;
  }
}

The purpose of this benchmark was to confirm my suspicion that prior to Java 5, the best-case lookup cost of Class.getMethod() was O(1) and is now O(n). However, we need to consider the context of the benchmark. It is of almost no relevance in real code, since we will seldom have more than 100 methods in a class.

Here is the output from Java 9.0.4:

class Methods4
--------------
indexes = (0, 1, 3)
foos = (foo0, foo1, foo3)
  first  = 10064862
  middle = 10276604
  last   = 10112908

class Methods20
---------------
indexes = (0, 9, 19)
foos = (foo13, foo8, foo4)
  first  = 7009813
  middle = 7290529
  last   = 7367027

class Methods100
----------------
indexes = (0, 49, 99)
foos = (foo25, foo74, foo4)
  first  = 2238002
  middle = 2166709
  last   = 3389455

class Methods500
----------------
indexes = (0, 249, 499)
foos = (foo25, foo249, foo499)
  first  = 585933
  middle = 399407
  last   = 407028

class Methods2500
-----------------
indexes = (0, 1249, 2499)
foos = (foo25, foo1649, foo499)
  first  = 72224
  middle = 51273
  last   = 61625

class Methods12500
------------------
indexes = (0, 6249, 12499)
foos = (foo2796, foo9045, foo2795)
  first  = 10671
  middle = 10813
  last   = 10770

class Methods62500
------------------
indexes = (0, 31249, 62499)
foos = (foo2796, foo31545, foo2795)
  first  = 1685
  middle = 1174
  last   = 1732

Interesting is the output for Java 8. It starts off as half the throughput of Java 9 on my machine. At about 100 methods, they have the same throughput and then Java 8 ends up faster than Java 9. I haven't researched this thoroughly, but I suspect that the reason Java 9 is faster is that they don't call intern() on the method name on every single call. I'm not sure why Java 8 is faster after we get to 100 methods, but considering the average number of methods per class, I think that's a reasonable compromise.

class Methods4
--------------
indexes = (0, 1, 3)
foos = (foo0, foo1, foo3)
  first  = 5711796
  middle = 5611266
  last   = 5786185

class Methods20
---------------
indexes = (0, 9, 19)
foos = (foo13, foo10, foo3)
  first  = 5242688
  middle = 5251910
  last   = 5254697

class Methods100
----------------
indexes = (0, 49, 99)
foos = (foo13, foo49, foo99)
  first  = 3716464
  middle = 3595559
  last   = 3628375

class Methods500
----------------
indexes = (0, 249, 499)
foos = (foo13, foo249, foo499)
  first  = 1400637
  middle = 1358047
  last   = 1380786

class Methods2500
-----------------
indexes = (0, 1249, 2499)
foos = (foo13, foo1733, foo499)
  first  = 228720
  middle = 223003
  last   = 223820

class Methods12500
------------------
indexes = (0, 6249, 12499)
foos = (foo13, foo6733, foo499)
  first  = 40339
  middle = 40360
  last   = 40553

class Methods62500
------------------
indexes = (0, 31249, 62499)
foos = (foo12505, foo44215, foo62224)
  first  = 5232
  middle = 5391
  last   = 5397

I also ran this with Java 1.4.0 on a VMWare Fusion instance with Windows XP. We can see two main differences between pre and post Java 1.4. First is that the best case performance stays the same, irrespective of the number of methods. Second is that it appears that the Method[] is ordered, probably according to where the methods appear in the source file. In the Java 8 output above, in Methods62500 the first method starting with "foo" was foo12505(). The middle was foo44215(). And the last was foo62224(). Java 9 is even stranger, with first foo2796(), middle foo31545() and last foo2795(). Here is the output from Java 1.4.0:

class Methods4
--------------
indexes = (0, 1, 3)
foos = (foo0, foo1, foo3)
  first  = 1243931
  middle = 1328329
  last   = 1372984

class Methods20
---------------
indexes = (0, 9, 19)
foos = (foo0, foo9, foo19)
  first  = 1410311
  middle = 1482650
  last   = 1301043

class Methods100
----------------
indexes = (0, 49, 99)
foos = (foo0, foo49, foo99)
  first  = 1305948
  middle = 1147514
  last   = 960986

class Methods500
----------------
indexes = (0, 249, 499)
foos = (foo0, foo249, foo499)
  first  = 1232767
  middle = 658452
  last   = 437830

class Methods2500
-----------------
indexes = (0, 1249, 2499)
foos = (foo0, foo1249, foo2499)
  first  = 1230703
  middle = 251969
  last   = 131139

class Methods12500
------------------
indexes = (0, 6249, 12499)
foos = (foo0, foo6249, foo12499)
  first  = 1177412
  middle = 56583
  last   = 29448

class Methods62500
------------------
indexes = (0, 31249, 62499)
foos = (foo0, foo31249, foo62499)
  first  = 1240943
  middle = 12028
  last   = 5926

As we saw, in Java 1.4, when we didn't have covariant return types, we could return from our search as soon as we found our first matching method. There could be only one, so there was no point searching further. The throughput to find the first method thus stays consistent, irrespective of the size of the Method[]. Classic O(1).

We can draw several conclusions from this. Firstly, the cost of finding any method in a class is dependent on how many methods are visible, not where in the class the method is declared. Secondly, it appears that Java 9 has doubled the throughput of Class.getMethod() by getting rid of intern(), although that would need further research to confirm. Thirdly, there was a hidden cost in covariant return types that I had not considered up to now.

Kind regards from Crete

Heinz

 

Comments

We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)

When you load these comments, you'll be connected to Disqus. Privacy Statement.

Related Articles

Browse the Newsletter Archive

About the Author

Heinz Kabutz Java Conference Speaker

Java Champion, author of the Javaspecialists Newsletter, conference speaking regular... About Heinz

Superpack '23

Superpack '23 Our entire Java Specialists Training in one huge bundle more...

Free Java Book

Dynamic Proxies in Java Book
Java Training

We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.

Java Consulting

We can help make your Java application run faster and trouble-shoot concurrency and performance bugs...