TryAlgo

PQ trees

Given a collection of sets over some ground set, find an order on the ground set such that every set consists of consecutive (contiguous) elements.

Problem names

This problem is called the Contiguous Ordering Problem. Another representation of the problem, known as Testing the Consecutive Ones Property consists of a boolean matrix, where we wish to order the columns such that every row has the pattern 0*1*0*. This problem has been studied for the first time by archaeologists where columns represents observed tombs and rows styles of objects that have been found in some tombs. The problem has also an application in graph drawing of planar graphs. Moreover in terms of graphs this problem consists in recognizing interval graphs.

PQ tree

This problem has been solved by Booth and Lueker in 1976, using an algorithm running in time $O(n+m+s)$. It is based on a dynamical data structure called PQ tree.

In short, a PQ-tree represents a collection of total orders over a ground set. It is a rooted tree. The leafs of the tree are the values of the ground set. Inner nodes are of type P or Q. P means all permutations of the children are allowed. Q means only the left to right or the right to left order of the children is allowed.

Typically P nodes are represented as circles and Q nodes as rectangles. The following tree

represents the following orders.

    A B C D E
    A B E D C
    A C D E B
    A E D C B
    B A C D E
    B A E D C
    B C D E A
    B E D C A
    C D E A B
    C D E B A
    E D C A B
    E D C B A

Overall structure

The idea is to start with a PQ tree representing all orderings, consisting of a single P node as the root with all elements of the universe attached to it. Then for every given set S we do a restriction of the PQ tree to orderings in which S is contiguous. Whenever this operation is not possible an exception is raised.

def consecutive_ones_property(sets, groundset):
    tree = P_tree([leaf(x) for x in groundset])
    try:
        for S in sets:
            tree.restrict(S)
        return tree.border()
    except IsNotC1P:
        return None

Before defining the restriction operation in detail we have to understand the effect of the restriction at the level of a node.

Marking nodes

Consider a fixed set S and a PQ tree T. We can mark nodes of T as empty, full or as partial. These marks propagate bottom up as follows.

For an example consider the following PQ-tree.

When marking it with respect to the set {B,D,E,G,J} the tree will look like this (black=full, while=empty, grey=partial).

Processing nodes

The algorithm processes the tree from bottom up, processing a node, once all its descendants have been processed. The processing consists of a long case analysis that recognizes different templates and does local changes accordingly. The case analysis is done as in this picture (just to say that it is tedious).

The following steps show the result of processing the red marked nodes from bottom up.

The template recognition for Q nodes is done with the help of a finite state automaton. The arrows in the labels indicate whether the descendants can be oriented from EMPTY to FULL or from FULL to EMPTY.

Implementation

Our implementation consists of the class PQ_tree, which contains a reference to the root of the tree and a list of its leafs. A node of the tree is represented as an instance of the class PQ_node, and has a shape (which can be P, Q or Leaf), a value in case of a leaf, a list of sons in case of a P or Q node, a reference to the parent node, a mark (EMPTY, FULL or PARTIAL) as well as the counters full_leafs and processed_sons. The former counts the number of full leafs in the subtree and permits to detect the key node. This is the node at which the process of the tree can be stopped. The counter processed_sons permits to detect when all descendants of a node have been processed so the node can be placed in the queue for its own processing.

Possible improvements

A first improvement of the implementation would be to make sure that P nodes have at least 2 descendants and Q nodes at least 3 descendants. This will help to keep the tree size proportional to the size of the universe. But more importantly the list of descendants should not be implemented with a Python list, but with a double linked list, permitting constant time insertions and removals.

The current implementation has a complexity in the order of n*m however an implementation in O(n+m+s) is possible, where n is the size of the universe, m the number of sets and s the total size over all sets.

A quick and incomplete bibliography

Booth and Lueker in 1976 introduced the PQ trees. Later in 2001 a more general, but yet simpler structure called PC-trees was introduced. The current implementation just raises an exception if the problem has no solution. In order to avoid this problem cases PQR-trees have been introduced and Ross M. McConnell proposed an algorithm that can generate a minimal forbidden sub-configuration in these cases. Michel Habib, Ross M. McConnell, Christophe Paul, Laurent Viennot in 2000 proposed a different approach to solve the all-consecutive-ones testing problem, in time O(n + s log n), using Lex-BFS and partition refinements. I think that this is the algorithm that one wants to implement for this problem. We refer to the book chapter by Hsu and McConnell for a complete survey.

References