Here I derive the elimination tree for the (right-looking) sparse Cholesky algorithm, which computes A = LL^T for a lower triangular L and a sparse symmetric positive definite A. This tree forms the foundation of most sparse factorization software, even when the underlying assumptions of Cholesky (symmetric and positive definite) do not apply. Ultimately this tree tells us two things: 1. where nonzeros appear in L even when they are not present in the original A (i.e. “fill-in”), and 2. the task dependency graph of the resulting factorization. Traditionally this concept is presented in the context of sparse triangular solves, which are then used as a building block for a Cholesky factorization. I wanted instead to work directly from a Cholesky factorization, which is what I do below.
Suppose we start with the plain dense right-looking Cholesky algorithm, in pseudocode below:
The implied order and loop structure result in a task DAG. When A is sparse, it turns out that the DAG, as well as the fill-in implied by step (3), can be completely determined from the initial nonzero pattern of A together with a simple tree data structure known as the column elimination tree. I will illustrate with a simple 5x5 matrix (showing only the lower triangular part to make it easier to read).
As expected, because A is sparse there are some operations here that we never need to do. If we prune these we get a much smaller graph.
Note the column grouping here doesn’t serve a practical purpose yet, but we may interpret a column group as meaning “once we have completed all operations in column group i, no future operation references column i”.
Eliminating all unnecessary operations, we get the following:
This pruned graph tells us every operation we need to do in order to fully Cholesky-factorize this sparse matrix. But to determine this graph I had to do a dense factorization and then prune the unnecessary calculations.
It turns out, however, that there is a simple O(n) data structure which, when combined with the initial nonzero pattern of A, can quickly tell us the final fill pattern as well as the pruned task graph. This is the column elimination tree.
Recalling step (3) from our dense Cholesky factorization
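That update, isolated as a standalone C sketch:

```c
/* Step (3) of right-looking Cholesky: after pivot column k has been
   scaled, subtract its outer product from the trailing submatrix.
   A[i][j] is modified exactly when A[i][k] and A[j][k] are both nonzero. */
void trailing_update(int n, int k, double A[n][n]) {
    for (int j = k + 1; j < n; j++)
        for (int i = j; i < n; i++)
            A[i][j] -= A[i][k] * A[j][k];
}
```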
It tells us a key structural fact: if k < j <= i, L[i][k] != 0 and L[j][k] != 0, then L[i][j] != 0.
Now when we look at the pruned task graph we can see some implied column dependencies like 0->1 and 0->2. Since 0 has two “parents”, this is a DAG pattern rather than a tree. But the structural rule above tells us that 0->2 is a redundant edge: 0->1 necessarily means L[1][0] != 0, and 0->2 necessarily means L[2][0] != 0, so we must have L[2][1] != 0, and thus there is an implied edge 1->2. The edge 0->2 is then redundant information, because we must proceed through 0->1 first anyway. If we remove all redundant edges from our column DAG, the result is a tree.
The elimination tree tells us how to generate the fill pattern of L from the initial nonzero pattern of A together with the structural rule: k < j <= i, L[i][k] != 0 and L[j][k] != 0 implies L[i][j] != 0.
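As a concrete, if naive, illustration, here is the structural rule applied literally to a dense boolean pattern (my sketch, not the efficient tree-based method): starting from the pattern of A's lower triangle, it produces the pattern of L.

```c
/* Naive symbolic Cholesky: S[i][j] != 0 marks a structural nonzero in
   the lower triangle. For every pivot k, the rule "S[j][k] and S[i][k]
   (k < j <= i) imply S[i][j]" is applied, mirroring step (3). */
void symbolic_cholesky(int n, int S[n][n]) {
    for (int k = 0; k < n; k++)
        for (int j = k + 1; j < n; j++)
            if (S[j][k])
                for (int i = j; i < n; i++)
                    if (S[i][k])
                        S[i][j] = 1;   /* fill-in if not already present */
}
```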
I will defer the calculation of the elimination tree to the end of this post and show now how we can use the elimination tree encoded as a parents array:
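For example (my sketch), the nonzero pattern of row k of L can be read straight off the tree: start from each nonzero column j of row k of A and walk up parents until reaching k or an already-visited node. Assuming the strictly lower triangular pattern of A is stored row by row in arrays Ap/Ai (my naming):

```c
/* Nonzero pattern of row k of L, via the elimination tree.
   parent[] : elimination tree as a parents array (-1 = root)
   Ap, Ai   : strictly lower triangular pattern of A, row by row
   mark[]   : scratch of size n, initialized to -1, reused across calls
   cols[]   : output, receives the nonzero column ids of row k
   Returns the number of off-diagonal nonzeros in row k of L. */
int row_pattern(int k, const int *parent, const int *Ap, const int *Ai,
                int *mark, int *cols) {
    int len = 0;
    mark[k] = k;                     /* stop when a walk reaches k */
    for (int p = Ap[k]; p < Ap[k + 1]; p++)
        for (int j = Ai[p]; mark[j] != k; j = parent[j]) {
            cols[len++] = j;         /* L[k][j] is structurally nonzero */
            mark[j] = k;             /* don't walk this subpath again */
        }
    return len;
}
```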
Once we have the nonzero pattern of L, writing the numeric factorization follows straightforwardly from the dense algorithm outline, using appropriate sparse matrix data structures. The key step is the one where fill-in is introduced, which I show in pseudo-C below
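A sketch of that step, assuming L is stored in compressed-sparse-column form (arrays Lp/Li/Lx, my naming) with every column's pattern precomputed, plus a scratch array pos mapping row ids to slots in the target column:

```c
/* One right-looking sparse update: pivot column k updates column j
   (step (3), restricted to the nonzeros of column k). Because the
   pattern of L was computed symbolically up front, every target
   entry L[i][j] already has a preallocated slot. */
void sparse_update(const int *Lp, const int *Li, double *Lx,
                   int k, int j, int *pos) {
    for (int q = Lp[j]; q < Lp[j + 1]; q++)
        pos[Li[q]] = q;              /* scatter: row id -> slot in col j */
    double ljk = 0.0;
    for (int p = Lp[k]; p < Lp[k + 1]; p++)
        if (Li[p] == j)
            ljk = Lx[p];             /* the multiplier L[j][k] */
    for (int p = Lp[k]; p < Lp[k + 1]; p++)
        if (Li[p] >= j)              /* rows i >= j of column k */
            Lx[pos[Li[p]]] -= Lx[p] * ljk;
}
```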
and since we precomputed the nonzero pattern, we need not check whether those numerical entries exist; they will have been preallocated.
To compute the elimination tree using only the starting nonzeros of A, without excessive extra work, we need to build the parents array incrementally. This means for each row r we need to try to find the path to r in this tree.
So if we iterate through the rows of A in order and look at the column ids, we get a candidate path c -> ... -> r. We save c -> r in ancestors[c] and iterate up the ancestors until we find a column we have not yet matched; that column’s parent becomes r. Because we iterate rows in order, we cannot later find a smaller r.
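In C (my sketch, again assuming the strictly lower triangular pattern of A stored row by row in arrays Ap/Ai):

```c
/* Build the elimination tree from the pattern of A.
   Ap, Ai     : strictly lower triangular pattern of A, row by row
   parent[]   : output, parent of each column (-1 = root / unset)
   ancestor[] : scratch, highest row currently known above each column */
void etree(int n, const int *Ap, const int *Ai, int *parent, int *ancestor) {
    for (int r = 0; r < n; r++) {
        parent[r] = -1;
        ancestor[r] = -1;
        for (int p = Ap[r]; p < Ap[r + 1]; p++) {
            /* walk up from column c until we fall off the known path */
            for (int c = Ai[p]; c != -1 && c < r; ) {
                int next = ancestor[c];
                ancestor[c] = r;     /* shortcut future walks up to r */
                if (next == -1)
                    parent[c] = r;   /* first row above c: its parent */
                c = next;
            }
        }
    }
}
```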
The presentation of elimination trees often involves some preliminary graph theory that always felt a little detached from the linear algebra. My aim here was not to replace the graph theory, but to ground it in the underlying algorithm. I proceeded directly from the right-looking Cholesky factorization to a task DAG, pruned that DAG, and then showed that the structural rule implied by step (3) of the algorithm yields a compressed representation of the task DAG, and that this compressed representation happens to be a tree.