Abstract: Java's CopyOnWriteArrayList gives us a snapshot iterator for fast reading. However, its subList() is not snapshot, and will break if the list is changed. In this newsletter we look at a way to make a safe subList() copy, using the iterator and an AbstractList.
Welcome to the 336th edition of The Java(tm) Specialists' Newsletter, which I started writing at 30,000 feet en route back from Devoxx Poland. The first person I met at Devoxx was Jeff Chan, a friendly recruiter from Netflix Amsterdam. After hearing where I lived, he switched to perfect Greek. Jeff grew up in Athens. It would have been hilarious to watch: a "Heinz" and a "Chan" having an animated conversation in Greek about how to find good Java programmers. For sociable people, Greek is one of the best languages to learn. I've heard Greek spoken all over the world. And when Greeks hear you speak it, they are always happy to stop what they are doing and have a chat.
javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.
One of the many hats I wear is that of consultant, helping companies
with Java performance and concurrency
issues. A few months ago, I was chatting with the author of
JobRunr, and he told me
about a strange ConcurrentModificationException he was seeing with the
CopyOnWriteArrayList (Issue 1052).
CopyOnWriteArrayList has a snapshot iterator, which guarantees
that we never see this pesky exception during iteration. The only
place where the ConcurrentModificationException is mentioned in the
Javadoc
is where it says that it does not get thrown from this class'
iterator.
However, if we create a subList() from the CopyOnWriteArrayList, and
we change the list, then all the methods of subList() will subsequently
throw the ConcurrentModificationException. I mention this in my
detailed CopyOnWriteArrayList
Teardown Course, but it is not completely obvious from the Javadoc alone. Considering
that CopyOnWriteArrayList is supposed to be safe in a concurrent
context, I'm not sure that subList() is that useful.
The offending code looked something like this:
new ArrayList<>(list.subList(0, 2));
The ArrayList constructor calls toArray()
on the subList. If another thread has changed the
CopyOnWriteArrayList in the meantime, then the subList is
no longer valid and we get a ConcurrentModificationException.
Our solution was to make a copy of the entire CopyOnWriteArrayList
into an ArrayList, and to then get a subList() off that, like so:
new ArrayList<>(list).subList(0, 2);
This worked fine, because the CopyOnWriteArrayList was never very
large. However, if the list did become large, then we would have a lot
of unnecessary copying, just to then create a subList.
Here is a quick demo that demonstrates that we can get a
ConcurrentModificationException on a subList() if we change the
original CopyOnWriteArrayList.
import java.util.*;
import java.util.concurrent.*;
public class ConcurrentModificationOnCOWAList {
void main() {
var list = new CopyOnWriteArrayList<Integer>();
Collections.addAll(list, 3, 1, 4, 1, 5, 9);
var subList = list.subList(1, 4);
System.out.println("subList = " + subList); // OK
subList.add(1, 99);
System.out.println("subList = " + subList); // OK
list.add(1, 42);
System.out.println("list = " + list); // OK
System.out.println("subList = " + subList); // BOOM
}
}
Here is the output:
subList = [1, 4, 1]
subList = [1, 99, 4, 1]
list = [3, 42, 1, 99, 4, 1, 5, 9]
Exception in thread "main" java.util.ConcurrentModificationException
at java.base ... *snip*
The subList() method is really not very useful for the
CopyOnWriteArrayList, since it stops working if the
original list is changed.
However, the iterator of CopyOnWriteArrayList is
thread-safe, and is guaranteed to never throw a
ConcurrentModificationException. We can use that to
make a copy of the part of the CopyOnWriteArrayList
that we are interested in, like so:
import java.util.*;
import java.util.concurrent.*;
public class ImmutableSubList<E> extends AbstractList<E> {
private final Object[] elements;
public ImmutableSubList(CopyOnWriteArrayList<E> list,
int fromIndex, int toIndex) {
Objects.requireNonNull(list, "list");
if (fromIndex < 0 || fromIndex > toIndex)
throw new IndexOutOfBoundsException();
this.elements = new Object[toIndex - fromIndex];
var index = 0;
for (var iterator = list.listIterator(fromIndex);
iterator.hasNext() &&
index + fromIndex < toIndex; ) {
var next = iterator.next();
elements[index++] = next;
}
if (index < elements.length)
throw new IndexOutOfBoundsException();
}
@Override
public E get(int index) {
Objects.checkIndex(index, elements.length);
@SuppressWarnings("unchecked")
var element = (E) elements[index];
return element;
}
@Override
public int size() {
return elements.length;
}
}
Since we now rely on the ListIterator, we will never
cause a ConcurrentModificationException. In our
original code, we copied the subList into an
ArrayList, effectively creating a copy-on-read. Our
ImmutableSubList does something similar, copying the
contents of the iterator into an array.
Let's put this code to the test. For good measure, I
also asked AI to propose solutions. Most of their
suggestions were wrong, but the one that came up the
most often was to use a stream, like so:
list.stream().limit(2).toList(). It is
not exactly the same, as we could end up with a list
that is shorter than 2 elements, without seeing an
IndexOutOfBoundsException.
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.*;
import java.util.function.*;
import java.util.stream.*;
public class ImmutableSubListDemo {
void main() {
for (var i = 0; i < 10; i++) {
testAll();
}
}
private void testAll() {
test("broken", l -> new ArrayList<>(l.subList(0, 2)));
test("fixed", l -> new ArrayList<>(l).subList(0, 2));
test("fast", l -> new ImmutableSubList<>(l, 0, 2));
test("AI", l -> l.stream().limit(2).toList());
System.out.println("-".repeat(80));
}
/**
* Returns a subList of the first two elements from a
* {@link CopyOnWriteArrayList}.
*/
private interface Subber extends Function<
CopyOnWriteArrayList<Integer>, List<Integer>> {
}
private static final List<Integer> TINY =
List.of(60, 50);
private static final List<Integer> HUNDRED =
IntStream.range(0, 100).boxed().toList();
private static final List<Integer> THOUSAND =
IntStream.range(555, 1555).boxed().toList();
private void test(String experiment, Subber subber) {
System.out.println(experiment);
System.out.println(experiment.replaceAll(".", "="));
checkCorrectness(subber);
measure("tiny", TINY, subber);
measure("100", HUNDRED, subber);
measure("1000", THOUSAND, subber);
System.out.println();
}
private void measure(String set,
List<Integer> data,
Subber subber) {
var cow = new CopyOnWriteArrayList<Integer>(data);
System.out.print(set + ": ");
measure(() -> subber.apply(cow));
}
private volatile List<Integer> leak;
private void measure(Supplier<List<Integer>> task) {
var running = new AtomicBoolean(true);
ForkJoinPool.commonPool().schedule(
() -> running.set(false),
3, TimeUnit.SECONDS);
var repeats = 0L;
while (running.get()) {
leak = task.get();
repeats++;
}
System.out.printf("subLists = %,d%n", repeats);
}
private void checkCorrectness(Subber subber) {
System.out.print("Testing: ");
var cow = new CopyOnWriteArrayList<Integer>();
Collections.addAll(cow, 1, 2, 3, 4, 5);
var testing = new AtomicBoolean(true);
ForkJoinPool.commonPool().schedule(
() -> testing.set(false),
1, TimeUnit.SECONDS);
var thread = Thread.ofPlatform().start(() -> {
while (testing.get()) {
cow.addLast(42);
cow.removeLast();
cow.addFirst(99);
cow.removeFirst();
}
});
try {
while (testing.get()) {
subber.apply(cow);
}
try {
// test with empty input list
subber.apply(new CopyOnWriteArrayList<>());
System.out.println("Expected " +
"IndexOutOfBoundsException");
} catch (IndexOutOfBoundsException e) {
System.out.println("Probably correct");
}
} catch (ConcurrentModificationException e) {
System.out.println(e.getClass().getSimpleName());
}
}
}
Running this on my AMD Ryzen 5 3600 6-Core Processor
server, we see that my ImmutableSubList solution (fast)
is 2.3x faster than the original broken solution. We
also see that the fixed
new ArrayList<>(list).subList(0, 2);
did not scale well, since we were copying the entire
ArrayList every time. The AI suggested
solution of using Stream was 5x worse than
our ImmutableSubList.
broken ====== Testing: ConcurrentModificationException tiny: subLists = 115,246,431 100: subLists = 115,814,933 1000: subLists = 115,650,531 fixed ===== Testing: Probably correct tiny: subLists = 222,696,153 100: subLists = 39,702,459 1000: subLists = 4,439,176 fast ==== Testing: Probably correct tiny: subLists = 261,142,085 100: subLists = 260,684,422 1000: subLists = 260,695,443 AI == Testing: Expected IndexOutOfBoundsException tiny: subLists = 52,561,381 100: subLists = 52,059,400 1000: subLists = 52,291,818
I like our ImmutableSubList. It is very easy to
create a fully functional List by subclassing the
AbstractList. All we have to implement is
get(int) and size(). For our use-case,
we are making a copy anyway, so creating an array directly
seems like a good idea. And our solution is the fastest that we
tried. Plus it works.
Kind regards
Heinz
We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)
We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.