ATMCS 6: Day 5 – Applied Topology

Ryan H. Lewis summarizes the morning talks, and Andrew Cooper the afternoon talks:

Christopher Hoffman gave the first talk, about recent advances in random topology. He began by discussing stopping times for monotone graph properties. For example, for a sequence of graphs $$\langle G_j \rangle$$ we define:
$$ \tau_{\textrm{connected}} = \min j \textrm{ such that } G_j \textrm{ is connected}$$
For instance, in the following example we have $$\tau_{\textrm{connected}} = 4$$.
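The stopping time can be computed by adding edges one at a time and tracking connected components. The sketch below is only an illustration: the function names, the union-find bookkeeping, and the four-vertex edge sequence are assumptions made here, not something taken from the talk.

```python
import random


def connectivity_stopping_time(n, edges):
    """Smallest j such that the first j edges connect all n vertices, else None."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    if components <= 1:
        return 0
    for j, (u, v) in enumerate(edges, start=1):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
            if components == 1:
                return j
    return None


def random_graph_process(n, rng):
    """All edges of the complete graph on n vertices, in uniformly random order."""
    edges = [(u, v) for u in range(n) for v in range(u + 1, n)]
    rng.shuffle(edges)
    return edges


# Hypothetical sequence on 4 vertices: the third edge only closes a triangle,
# so the graph first becomes connected when the fourth edge arrives.
demo_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
demo_tau = connectivity_stopping_time(4, demo_edges)

# One sample of the random process on 20 vertices.
sample_tau = connectivity_stopping_time(20, random_graph_process(20, random.Random(0)))
```

Here `demo_tau` is 4. For the random sample the value depends on the seed, but it always lies between $$n - 1$$ and $$\binom{n}{2}$$.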

He is interested in generalizing two results about monotone Erdős–Rényi random graphs to facts about Erdős–Rényi random simplicial complexes.

The Linial-Meshulam model is a sequence $$Y_i$$ of 2-dimensional simplicial complexes, where $$Y_0$$ is the complete graph on $$n$$ vertices and $$Y_i = Y_{i-1} \cup \{\textrm{a } 2\textrm{-cell} \}.$$

It turns out that the first 2-cycle either has 4 faces, with probability converging to $$c_0 = 0.909$$, or is larger than $$\frac{n}{\log{(n)}}$$, with probability converging to $$1 - c_0$$.

To generalize the second result to the study of isolated edges, one can study when $$H_1(Y, C)=0$$ for coefficients $$C = \mathbb{Z}, \mathbb{Z}_2,$$ and $$\mathbb{Q}$$, as well as when $$\pi_1(Y) = 0$$. He presented a series of results relating the stopping times for these events.
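For $$\mathbb{Z}_2$$ coefficients, both the first 2-cycle and the vanishing of $$H_1$$ can be tracked with linear algebra over the two-element field. The sketch below assumes the 2-cells arrive in a uniformly random order and stores each triangle boundary as a bitmask over the edges of the complete graph. The function names are made up for this illustration, and the $$\mathbb{Z}$$, $$\mathbb{Q}$$, and $$\pi_1$$ versions would need different tools.

```python
import itertools
import random


def linial_meshulam_times(n, triangles):
    """Return (first_cycle, h1_vanishes) for a sequence of 2-cells, with Z_2 coefficients.

    first_cycle: first step at which a triangle's boundary depends on earlier
                 boundaries, i.e. a Z_2 2-cycle appears (None if it never happens).
    h1_vanishes: first step at which H_1(Y; Z_2) = 0 (None if it never happens).
    """
    edge_index = {e: i for i, e in enumerate(itertools.combinations(range(n), 2))}
    # The cycle space of the complete graph has dimension C(n,2) - (n-1) = C(n-1,2);
    # H_1 vanishes exactly when the triangle boundaries span all of it.
    target_rank = (n - 1) * (n - 2) // 2
    basis = {}  # leading bit -> reduced boundary vector
    first_cycle = None
    h1_vanishes = 0 if target_rank == 0 else None
    for step, tri in enumerate(triangles, start=1):
        a, b, c = sorted(tri)
        vec = (1 << edge_index[(a, b)]) ^ (1 << edge_index[(a, c)]) ^ (1 << edge_index[(b, c)])
        # Reduce the new boundary against the current basis (XOR elimination).
        while vec:
            lead = vec.bit_length() - 1
            if lead not in basis:
                basis[lead] = vec
                break
            vec ^= basis[lead]
        if vec == 0 and first_cycle is None:
            first_cycle = step
        if h1_vanishes is None and len(basis) == target_rank:
            h1_vanishes = step
        if first_cycle is not None and h1_vanishes is not None:
            break
    return first_cycle, h1_vanishes


def random_triangle_order(n, rng):
    """All 2-cells on n vertices, in uniformly random order."""
    triangles = list(itertools.combinations(range(n), 3))
    rng.shuffle(triangles)
    return triangles


sample_cycle, sample_h1 = linial_meshulam_times(8, random_triangle_order(8, random.Random(1)))
```

On four vertices the answer does not depend on the order: any three triangles kill $$H_1$$, and the fourth closes up the boundary of a tetrahedron, which is the 4-face cycle mentioned above.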

In the future they want to use probabilistic methods to demonstrate the existence of complexes that are desired but have not been obtained by classical methods in topology.

Paul Villeuotrox (sp?) talked about using persistent homology to study a wide range of epithelial cells.

If we view such a structure as a cover of the plane, its nerve is a 2D topological space. A filtration of this space is given by assigning to each vertex its degree, and to each cell the maximum filtration value on its boundary.
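A minimal sketch of this filtration, assuming the nerve is given as a list of edges and a list of triangles (a hypothetical input format chosen for illustration). Since each edge takes the maximum over its endpoints, the value of a triangle is simply the largest degree among its three vertices.

```python
from collections import Counter


def degree_filtration(edges, triangles):
    """Map each simplex (as a sorted tuple of vertices) to its filtration value.

    Vertices get their degree in the 1-skeleton; every higher cell gets the
    maximum value on its boundary, which equals the maximum over its vertices.
    Assumes every vertex appears in at least one edge and every edge of a
    triangle is listed in `edges`.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    filtration = {(v,): d for v, d in degree.items()}
    for simplex in list(edges) + list(triangles):
        filtration[tuple(sorted(simplex))] = max(degree[v] for v in simplex)
    return filtration


# Hypothetical toy nerve: one filled triangle with a pendant edge attached.
toy = degree_filtration(
    edges=[(0, 1), (1, 2), (0, 2), (2, 3)],
    triangles=[(0, 1, 2)],
)
```

In the toy example vertex 2 has degree 3, so both the pendant edge `(2, 3)` and the triangle `(0, 1, 2)` enter the filtration at value 3, while the edge `(0, 1)` enters at value 2.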

He has found persistent homology useful for studying the structure of these cell networks.

He compares the barcodes produced by this pipeline to the barcodes of random complexes whose underlying graph is endowed with the degree distribution that has been observed empirically. He finds that while persistent $$H_0$$ appears to be similar between these two types of complexes, persistent $$H_1$$ seems to be very different.

Raul Rabadan talked about The Topology of Evolution.

The only figure in Darwin’s Origin of Species is a (mostly-binary) tree. This “Tree of Life” paradigm was used starting in the 1970s to analyze genomic sequences. The first major discovery using the tree paradigm was Carl Woese’s 1977 discovery of Domain Archaea.

Woese excluded viruses because they lacked some of the genes he used. But even if he had included them, he would have had trouble: viruses have a high level of horizontal gene transfer, so the choice of tree as a structure to represent the phylogeny is not very good for viruses.

Nor is it very good for bacteria and archaea. Nor is it very good for plants (even Darwin knew this). Nor is it very good for us: when you get gonorrhoea, it gets you! 10% of gonorrhoea genes are human. 8% of human genes are viral. Your genome is something like a “cemetery of past infections” rather than a list of all your ancestors.

If we can’t use a tree to model evolution, what can we use? Mathematically, the problem is:

Given a set of genomes and a way of comparing them, how do we represent their relationships without importing (too many) assumptions from biology?

We would like an answer which is statistical and incorporates the notion of scale. We’d also like to detect when clonal (descent) transfer happens, and when horizontal (non-descent) transfer happens. Answer: use persistent homology to detect the topology of the genetic data!

Persistence detects not just topology, but topology at scales. In 0th homology, scale represents taxa: as we increase the filtration value, we are collecting together more and more distantly related genomes. In 1st homology, a long bar represents a transfer between distantly-related taxa.
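In degree 0 the Vietoris-Rips barcode coincides with single-linkage clustering, so it can be sketched with a union-find pass over sorted pairwise distances. The example below assumes aligned, equal-length sequences compared by Hamming distance. Both the distance and the toy sequences are assumptions made for illustration, not the data or metric from the talk.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def h0_barcode(seqs):
    """Persistent H_0 bars (birth, death) of the Rips filtration on the sequences."""
    n = len(seqs)
    if n == 0:
        return []
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = sorted(
        (hamming(seqs[i], seqs[j]), i, j) for i, j in combinations(range(n), 2)
    )
    bars = []
    for dist, i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two clusters merge at this scale, so one component dies here.
            parent[ri] = rj
            bars.append((0, dist))
    bars.append((0, float("inf")))  # the component that never dies
    return bars


# Hypothetical toy genomes: three close relatives and one distant sequence.
toy_bars = h0_barcode(["AAAA", "AAAT", "AATT", "GGGG"])
```

The three related sequences merge at scale 1, and the distant one only joins at scale 4, so the longest finite bar marks the most distantly related group. Detecting horizontal transfer would additionally need the degree 1 bars, which this sketch does not compute.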

For example, though overall flu genes show a lot of cycles, if we restrict our attention to a particular segment there are almost no long-persisting cycles. HIV, on the other hand, has persistent cycles even when we focus on small suites of genes. Thus persistence detects the fact that gene transfer in flu occurs by trading whole segments, whereas in HIV it occurs by trading much smaller units of genetic material.

For human genomic data, persistence bars are about 2 centiMorgans-per-megabase long.

Michael Robinson talked about Morphisms between Logic Circuits.

Logic circuits are described by their truth tables. But computers take time to do computations: fast input switching yields the “wrong” output (as evidenced by the flickering screen on Dr. Robinson’s slides). How can we analyze the failures of circuits due to problems of timing? Use sheaves!

Sheaves allow local specifications (we are really good at understanding small circuits) to determine global behavior (what we need to get a handle on). Plus, sheaves whose stalks are vector spaces are ‘just’ linear-algebraic, so we can compute their cohomology using easy, well-known techniques.

As we try to associate a vector space to each logic gate, we encounter various aspects of engineering practice like one-hot encoding.

The zeroth cohomology of the switching sheaf detects the (synchronous) classical logic behavior of the system. The first cohomology of the switching sheaf detects stored information (hence, the possibility of a timing problem in the circuit).
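Because the stalks are vector spaces, the dimensions of $$H^0$$ and $$H^1$$ reduce to a rank computation on a coboundary matrix. The sketch below does this for a generic cellular sheaf on a directed graph. It is not the switching sheaf from the talk: the data layout (`vertex_dims`, and edges given as `(tail, head, restriction_from_tail, restriction_from_head)`) is hypothetical, and the example uses the constant sheaf only because its answer is well known.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (list of rows) over the rationals, by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            if rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def sheaf_cohomology_dims(vertex_dims, edges):
    """Return (dim H^0, dim H^1) of a sheaf of vector spaces on a directed graph.

    vertex_dims: dict vertex -> dimension of the stalk at that vertex.
    edges: list of (tail, head, r_tail, r_head); r_tail and r_head are the
           restriction matrices (lists of rows) from each endpoint's stalk
           to the edge stalk.
    """
    offset, total = {}, 0
    for v, d in vertex_dims.items():
        offset[v] = total
        total += d
    delta = []  # coboundary: (delta x)_e = r_head x_head - r_tail x_tail
    for tail, head, r_tail, r_head in edges:
        for row_t, row_h in zip(r_tail, r_head):
            row = [0] * total
            for k, x in enumerate(row_t):
                row[offset[tail] + k] -= x
            for k, x in enumerate(row_h):
                row[offset[head] + k] += x
            delta.append(row)
    rk = rank(delta)
    # H^0 is the kernel of delta, H^1 is its cokernel.
    return total - rk, len(delta) - rk


one = [[1]]  # identity on a 1-dimensional stalk
dims = {"a": 1, "b": 1, "c": 1}
path_dims = sheaf_cohomology_dims(dims, [("a", "b", one, one), ("b", "c", one, one)])
loop_dims = sheaf_cohomology_dims(
    dims, [("a", "b", one, one), ("b", "c", one, one), ("c", "a", one, one)]
)
```

For the constant sheaf, the path gives `(1, 0)` and the loop gives `(1, 1)`: in this toy case the nonzero first cohomology comes entirely from the cycle in the graph.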

But cohomology of the switching sheaf doesn’t tell us everything. The categorification approach says we should consider morphisms to get more information. Given a circuit we want to understand, we can construct a circuit with the same logical behavior. Then we can ask how many morphisms of the switching sheaves there are which cover the identity on inputs and outputs.

Sometimes there aren’t any such morphisms. Sometimes there are a few. Apparently there are never exactly three.
