i@yujinyan.me

Blog

Flatten Nested List Iterator

LeetCode

Given a nested list of integers, implement an iterator to flatten it.

Each element is either an integer, or a list — whose elements may also be integers or other lists.

public interface NestedInteger {
  public boolean isInteger();
  public Integer getInteger();
  public List<NestedInteger> getList();
}

Example:

Input: [[1,1],2,[1,1]]
Output: [1,1,2,1,1]

Explanation:

By calling next repeatedly until hasNext returns false, the order of elements returned by next should be: [1,1,2,1,1].

Answer scaffold:

public class NestedIterator implements Iterator<Integer> {
  public NestedIterator(List<NestedInteger> nestedList) {/**/}

  @Override
  public Integer next() {/**/}

  @Override
  public boolean hasNext() {/**/}
}

This is actually a very good design problem. It has some immediate application in common programming tasks. Instead of “nested integers”, think of scraping web pages, or more generically, tasks that spawn child tasks.

A straightforward solution would be pre-processing and flattening the nestedList in the constructor. Then, it boils down to simple tree traversal. Although it seems the fastest way to pass LeetCode’s test cases (according to my experiments), this approach defeats the purpose of an iterator.

An iterator needs some laziness in it. In other words, it should iterate the source data only as its client requests it. If the client just peeks a few items, the eager method would have wasted a lot of work in the constructor.

Laziness has other benefits. If the input tree takes time to traverse like scraping web pages, which involves a lot of I/O, a true lazy iterator helps us employ a producer-consumer scheme. Instead of waiting until the whole traversal is done, we can process the data concurrently as it becomes available.

While this problem represents the input data as a List, the type could be relaxed to be an Iterator. Then the input could be something like a database cursor, in which case it would be impractical to load all the data into memory.

Solution

 public class NestedIterator implements Iterator<Integer> {
   private Stack<Iterator<NestedInteger>> stack;

   public NestedIterator(List<NestedInteger> nestedList) {
     stack = new Stack<>();
     stack.push(nestedList.iterator());
   }

   @Override
   public Integer next() {
     hasNext();
     return stack.pop().next().getInteger();
   }

   @Override
   public boolean hasNext() {
     while (!stack.isEmpty()) {
       Iterator<NestedInteger> iter = stack.peek();
       // If the iterator is empty, clear it off the stack.
       if (!iter.hasNext()) {
         stack.pop();
         continue;
       }

       // Consume the next item.
       NestedInteger n = iter.next();
       // If it's a leaf node, wrap it in an iterator
       // and push back to the stack.
       if (n.isInteger()) {
         List<Integer> l = new ArrayList<>(1);
         l.add(n.getInteger());
         stack.add(l.iterator());
         return true;
       }
       stack.add(n.getList().iterator());
     }
     return false;
   }
}

Straightforward tree traversal is easy to implement with recursion. The official solution uses a stack, and does most of the work in hasNext. Inside the method, the algorithm adjusts the stack so that a leaf node (NestedInteger.isInteger() returns true) is always placed at the top of the stack. If it’s unable to do so, there is no more data to iterate.

🚨

It’s inadvisable to use the Stack class in JDK any more. I sometimes use it for LeetCode solutions because its API is simpler and semantically clearer.

Iterator.hasNext should be idempotent

At first blush, the solution looks wrong to me since it changes state in hasNext. Iterator.hasNext should be idempotent. Clients should be able to call hasNext as many times as they like and still get the right answer when they call next.

What will happen if we call hasNext multiple times with the official solution? The stack would have already been adjusted and the top of the stack is an iterator with one leaf-node item. After calling iter.hasNext, the iterator is consumed and will be cleared off the stack during the next iteration. So the implementation is correct, but has this undesirable effect that multiple hasNext calls may leave these empty iterators on the stack. This might be fixed by introducing some extra state to the class.

Generator pattern

The generator pattern is a perfect match for the problem. In Kotlin, we can use a sequence.

class NestedIterator(nestedList: List<NestedInteger>) {
  suspend tailrec fun SequenceScope<Int>.walk(n: List<NestedInteger>) {
    for (item in n) {
      if (item.isInteger()) yield(item.getInteger())
      else walk(item.getList())
    }
  }

  private val iter = sequence { walk(nestedList) }.iterator()
  fun next(): Int = iter.next()
  fun hasNext(): Boolean = iter.hasNext()
}

Note that we express the recursive traversal algorithm naturally just like we would have done using the eager approach, yielding an elegant solution. Laziness is handled by Kotlin language features. I talked about this in this post.