# Java subList() gotcha.

Recently I stumbled upon an issue of stuck thread. The thread seemed to be stuck with stack trace something like the one below:-

java.util.SubList$1.nextIndex(AbstractList.java:713) java.util.SubList$1.nextIndex(AbstractList.java:713) ...Repeated 100s of times...
java.util.SubList$1.nextIndex(AbstractList.java:713) java.util.SubList$1.hasNext(AbstractList.java:691)
java.util.SubList$1.next(AbstractList.java:695) java.util.SubList$1.next(AbstractList.java:696) ...Again repeated 100s of times...
java.util.SubList$1.next(AbstractList.java:696) java.util.AbstractList.hashCode(AbstractList.java:526) java.util.Collections$UnmodifiableList.hashCode(Collections.java:1152)

# Finding the maxima. Err… involves some simple maths

Try to run the Java code listed at the top of the post and try changing 5000 to 1000. Then try running that program multiple times, each time increasing the values of the rest two for loops in multiples of 100. You will notice the peak performance penalty comes at iteration of 700 for 1000 elements. You might be surprised at first but think about this. When you increase the iterations so much that SubList-n has zero elements then there will be no performance penalty. This is a limiting case. So there must be some peak point between the two extremes, i.e. when the SubList-n directly refers to the original list (when n=1) but it contains n-1 elements, and the other extreme being when SubList-n has only zero elements. So, we begin. After 1st iteration the number of elements in alist is n-1 and has 1 SubList to reach the original list. After 2nd iteration the number of elements in alist is n-2 and has 2 SubLists to reach the original list. … After $xth$ iteration the number of elements in alist is $n-x$ and has $x$ SubLists to reach the original list.

\begin{align} O(n) &= x (n-x)\\ &= xn – x^2 \end{align}

Now to find the maxima,

\begin{align} \frac{dO(n)}{dx} = 0\\ &\Rightarrow \frac{d(xn – x^2)}{dx} = 0\\ &\Rightarrow \frac{ndx}{dx} – \frac{d(x^2)}{dx} = 0\\ &\Rightarrow n – 2x = 0\\ &\Rightarrow n = 2x\\ &\Rightarrow x = \frac{n}{2} \end{align}

So from above it is clear that the maxima is reached when number of partitioning (iterations with alist=alist.subList()) is half the total amount of data. Practical experiments show that the actual maxima is slightly shifted further away from the half point.

# Summary. Phew! Finally!

If you use subList wrongly then a time will come when dealing with large data your thread will appear to be stuck. In reality the code is not really hung but is doing a hell lot of iterations. In fact if you take the snapshot of your stuck thread then you will see the stack is growing, if not then at least note the position of the line java.util.SubList\$1.hasNext(AbstractList.java:691). The position of this line would be changing rapidly in the stack.

^1 More accurately $n+n+(n-1)+(n-2)+...+1=\frac{n}{2}(3+n)$