Exercises in Materials Geometry and Topology

********************************************
Shape Up 2015
Exercises in Materials Geometry and Topology

Berlin (Germany), 14-18 September 2015
********************************************

Dear Scientists and Mathematicians,

We wish to announce our upcoming conference:

Shape Up 2015

Exercises in Materials Geometry and Topology

The conference will be an interdisciplinary discussion meeting on
patterns and geometry, and their role in biological and synthetic
microstructured materials and tissue. We invite contributions
from biology, chemistry, materials science, mathematics, physics
and related fields addressing the genesis, properties and function
of complex nano-scale geometries, as well as underlying geometric
and topological concepts for the study of complex structure and shape.

The key areas of interest include:

* Geometric and topological concepts in soft condensed matter
* Characterization of structure and function
* Topological data analysis and image reconstruction
* Self-assembly of bi- and polycontinuous phases
* Negatively curved surfaces and hyperbolic geometry
* Structure enumeration & network-like phases
* Function of complex nanostructures in biology
* Bio-inspired design of materials
* Knots and entanglements in physics
* Emergence of chirality, order and the role of disorder
* Symmetry, graphs and discrete geometry
* Tessellations, area-minimising foams and packings
* Three-dimensional topology of cellular microstructures
* Pattern formation and complex structures in soft materials

The conference will run from Monday morning 14 September 2015 to the
late afternoon of Friday 18 September and will feature an extended
poster session, 15-20 contributed talks, as well as invited lectures
by

* Simon Copar (U Ljubljana)
* Herbert Edelsbrunner (IST Austria)
* Karsten Grosse-Brauckmann (TU Darmstadt)
* Jemal Guven (U Nacional Autonoma de Mexico)
* Stephen Hyde (Australian National University)
* Motoko Kotani (Tohoku U)
* Rob Kusner (UMass Amherst)
* Jeremy Mason (Bogazici U)
* Elisabetta Matsumoto (Harvard U)
* Konstantin Mischaikow (Rutgers U)
* Piotr Pieranski (U Poznan)
* James Sethian (UC Berkeley)
* Ullrich Steiner (U Fribourg)
* John M. Sullivan (TU Berlin)
* Adam Squires (U Reading)
* Salvatore Torquato (U Princeton)
* Silvia Vignolini (U Cambridge)

The conference is hosted at the Technical University of Berlin on
Strasse des 17. Juni, which leads right up to the famous Brandenburg
Gate, next to the Reichstag, the seat of the German parliament. Museum
Island, the Berliner Philharmonie, the opera houses, Checkpoint Charlie,
the Berlin Television Tower and other famous sites are within a few
kilometers' walking distance and are easily accessible via the public
transport network that connects all of inner Berlin – from vibrant
Prenzlauer Berg to Gendarmenmarkt at the heart of the city. All lectures
and the poster session will take place at the Institute of Mathematics,
Strasse des 17. Juni 136.

We are inviting abstracts for contributed oral presentations and for
posters. We encourage you to use the LaTeX template on the website
(http://www.shape-up.academy/) and to include both attractive images
and references in your abstract. The deadline for abstract submission
is

***  31 May 2015  ***

Registration will open 15 June 2015.

We hope to attract attendees from a wide variety of scientific
and mathematical backgrounds. Any queries can be addressed
to the organising committee at shape-up@math.tu-berlin.de.

Best regards, and we hope to see you in Berlin,

The Organising Committee

Myfanwy Evans (TU Berlin), Andrew Kraynik (Sandia, (Ret.)), Frank Lutz
(TU Berlin), Gerd Schroeder-Turk (Murdoch) and Bodo Wilts (Fribourg)

Efficiently implementing the persistence algorithm

Around the time of my recent submission, "Multicore homology via Mayer-Vietoris," an interesting optimization was made that seems to have a nice impact on the running time of the persistence algorithm.

When computing persistence, the boundary matrix is usually stored sparsely. In particular, over $$\mathbb{Z}_2$$ each column stores only the indices of the rows that are nonzero in that column.

In the parlance of [ZC '05], each column is identified as:

$$x = \partial(\textrm{Cascade}[\sigma]).$$

The column $$x$$ is laid out in memory as a [dynamic] array of computer words corresponding to row indices, with the natural order of these words being the filtration order.
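
For concreteness, such a sparse column might be declared as follows (a minimal sketch, not the actual CTL or PHAT definitions):

    #include <cstddef>
    #include <vector>

    // One Z_2 column of the boundary matrix: the sorted indices of its
    // nonzero rows, with the oldest entry first and the youngest entry last.
    using index  = std::size_t;
    using column = std::vector<index>;

    // The sparse boundary matrix is then just one column per cell.
    using boundary_matrix = std::vector<column>;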

The persistence algorithm proceeds iteratively, column by column, adding previous columns to the current one. If $$x$$ is the current column being operated on, the next column we will add to it can be identified as:

$$y = \textrm{Cascade}[\textrm{Youngest}[\partial(\textrm{Cascade}[\sigma])]].$$

This continues until either the column is zeroed out or a unique pairing occurs, i.e. $$y = 0$$.

Now let me restate the definition of $$y$$ as a procedure:

Step 1) The youngest entry of $$x$$ is identified.

Step 2) The address of the column $$y$$ is computed.

Step 3) The column at that address is fetched into CPU cache and registers.
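
Here is a minimal sketch of this loop (my own illustration, not the actual CTL or PHAT source); the chain addition `add` is sketched after the next paragraph:

    #include <cstddef>
    #include <vector>

    using index  = std::size_t;
    using column = std::vector<index>;  // sorted row indices, oldest first

    void add(column& x, const column& y);  // x <- x + y over Z_2; sketched below

    // Reduce the columns of a sparse Z_2 boundary matrix in filtration order.
    void reduce(std::vector<column>& matrix) {
        const index none = static_cast<index>(-1);
        // paired_with[r]: the column whose youngest entry is row r, if any yet.
        std::vector<index> paired_with(matrix.size(), none);
        for (index cur = 0; cur != matrix.size(); ++cur) {
            column& x = matrix[cur];
            while (!x.empty()) {
                const index youngest = x.back();              // Step 1
                const index other    = paired_with[youngest]; // Step 2
                if (other == none) {                          // y = 0: unique pairing
                    paired_with[youngest] = cur;
                    break;
                }
                add(x, matrix[other]);                        // Step 3, then x <- x + y
            }
        }
    }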

Usually, this sequence of three steps occurs only after the previous addition has completed. Chain addition is generally implemented as a set_symmetric_difference from begin() to end() on each chain, i.e. from the oldest entry of each column to the youngest.
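
A front-to-back `add` along these lines might use std::set_symmetric_difference directly (again a sketch; `column` is the sorted-vector alias from the snippets above):

    #include <algorithm>
    #include <iterator>

    // x <- x + y over Z_2: entries present in both columns cancel, so the sum
    // is the symmetric difference, computed here from oldest entries to youngest.
    void add(column& x, const column& y) {
        column sum;
        sum.reserve(x.size() + y.size());
        std::set_symmetric_difference(x.begin(), x.end(),
                                      y.begin(), y.end(),
                                      std::back_inserter(sum));
        x.swap(sum);
    }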

In recent versions of both CTL and PHAT, a change was made to how these column additions are carried out: the columns are now added from their youngest elements to their oldest. This should have the advantage of better locality of reference.

This is because when inspecting the youngest entry we load it, along with some amount of the end of the chain, into cache. If we then proceed to add front to back, we might end up evicting this cache line in favor of the beginning of the array.

Adding backwards has even more benefits. When we add backwards, we know that the first pair of elements we inspect will necessarily agree, and it is possible that the two vectors agree over a significantly longer portion of their tails. We may therefore delay the allocation of the output buffer for the sum until we find the first mismatch between the two chains, which saves a small amount of memory. But there is even more.

Suppose that $$x$$ is not a positive column, i.e. a call to mismatch finds a pair of elements in disagreement; then the younger of these two elements is precisely the new youngest element of the sum. We may now compute the address of the next column to add to $$x$$ and request that the computer prefetch the column at that address.
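
Putting these pieces together, a backward addition with delayed allocation and prefetching might look like the following (again my own sketch, not the CTL or PHAT source; __builtin_prefetch is the GCC/Clang intrinsic, and the aliases and paired_with bookkeeping come from the earlier sketches):

    #include <algorithm>
    #include <iterator>
    #include <vector>

    // x <- x + y over Z_2, scanning from the youngest entries backwards.
    // Returns the new youngest row of x, or `none` if x was zeroed out.
    index add_backwards(column& x, const column& y,
                        const std::vector<column>& matrix,
                        const std::vector<index>& paired_with) {
        const index none = static_cast<index>(-1);
        auto xr = x.rbegin();
        auto yr = y.rbegin();
        // The youngest entries agree by construction; walk past the common
        // tail, which cancels, before allocating anything.
        while (xr != x.rend() && yr != y.rend() && *xr == *yr) { ++xr; ++yr; }
        if (xr == x.rend() && yr == y.rend()) { x.clear(); return none; }
        // First mismatch: the younger (larger) surviving entry is the new pivot.
        const index pivot = (xr == x.rend()) ? *yr
                          : (yr == y.rend()) ? *xr
                          : std::max(*xr, *yr);
        // Ask the hardware to start fetching the column the new pivot pairs
        // with, while we finish computing the sum.
        if (paired_with[pivot] != none)
            __builtin_prefetch(matrix[paired_with[pivot]].data());
        // Merge the remaining prefixes as usual; the canceled tail is dropped.
        column sum;
        std::set_symmetric_difference(x.begin(), xr.base(),
                                      y.begin(), yr.base(),
                                      std::back_inserter(sum));
        x.swap(sum);
        return pivot;
    }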

In other words, we get some added low-level parallelism: while we add the current pair of columns, we are simultaneously fetching the next column to add.

How well does all of this work? In my opinion, quite well. Here is a plot of the running time for the persistence algorithm showing the original running time, the running time when adding backwards, and the running time when adding backwards with prefetching, on the various datasets from the multicore paper.

Feel free to comment on and discuss the results of these experiments. I'll answer questions as best I can.

ATMCS 6: Day 5

Ryan H. Lewis summarizes the morning talks, and Andrew Cooper the afternoon talks:

Christopher Hoffman gave the first talk, about recent advances in random topology. He began by discussing stopping times for monotone graph properties. As an example, for a sequence of graphs $$\langle G_j \rangle$$ we define:
$$ \tau_{\textrm{connected}} = \min \{ j : G_j \textrm{ is connected} \} $$
In the example shown in the talk, we have $$\tau_{\textrm{connected}} = 4$$.
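
As a toy illustration of this definition (my own sketch, not from the talk), $$\tau_{\textrm{connected}}$$ can be computed incrementally with a union-find structure:

    #include <cstddef>
    #include <numeric>
    #include <utility>
    #include <vector>

    // Union-find with path compression, over n vertices.
    struct union_find {
        std::vector<std::size_t> parent;
        explicit union_find(std::size_t n) : parent(n) {
            std::iota(parent.begin(), parent.end(), 0);
        }
        std::size_t find(std::size_t v) {
            return parent[v] == v ? v : parent[v] = find(parent[v]);
        }
        bool unite(std::size_t a, std::size_t b) {  // true if components merged
            a = find(a); b = find(b);
            if (a == b) return false;
            parent[a] = b;
            return true;
        }
    };

    // tau_connected: the smallest j such that G_j, obtained by inserting the
    // first j edges into the empty graph on n >= 2 vertices, is connected.
    // Returns 0 if the graph never becomes connected.
    std::size_t tau_connected(std::size_t n,
                              const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
        union_find uf(n);
        std::size_t components = n;
        for (std::size_t j = 0; j < edges.size(); ++j)
            if (uf.unite(edges[j].first, edges[j].second) && --components == 1)
                return j + 1;
        return 0;
    }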

He is interested in generalizing two results about monotone properties of Erdos-Renyi random graphs to facts about Erdos-Renyi random simplicial complexes.

The Linial-Meshulam model is a collection $$Y_i$$ of 2-dimensional simplicial complexes where $$Y_0$$ is the complete graph on $$n$$ vertices and $$Y_i = Y_{i-1} \cup \{\textrm{a } 2\textrm{-cell}\}.$$

It turns out that the first 2-cycle either has exactly 4 faces, with probability converging to $$c_0 \approx 0.909$$, or is larger than $$\frac{n}{\log(n)}$$, with probability converging to $$1 - c_0$$.

To generalize the second result to the setting of isolated edges, one can study when $$H_1(Y; C) = 0$$ for coefficients $$C = \mathbb{Z}, \mathbb{Z}_2,$$ and $$\mathbb{Q}$$, as well as when $$\pi_1(Y) = 0$$. He presented a series of results relating the stopping times for these events.

In the future, they want to use probabilistic methods to demonstrate the existence of complexes that are sought but have not been obtained by classical methods in topology.

Paul Villeuotrox (sp?) talked about using persistent homology on a wide range of epithelial cells.

Viewing such a structure as a cover of the plane, its nerve is a 2D topological space. A filtration of this space is given by assigning to each vertex its degree, and to each cell the maximum filtration value of its boundary.
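
A sketch of that filtration assignment (my own illustration; the cell representation is hypothetical):

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // A cell of the nerve, recorded by the ids of the vertices on its boundary.
    struct cell { std::vector<std::size_t> vertices; };

    // Assign filtration values: each vertex carries its degree in the cell
    // network, and each higher-dimensional cell the maximum value over its
    // boundary vertices, so the result is a valid (monotone) filtration.
    std::vector<std::size_t> filtration_values(const std::vector<std::size_t>& vertex_degree,
                                               const std::vector<cell>& cells) {
        std::vector<std::size_t> value(cells.size(), 0);
        for (std::size_t i = 0; i < cells.size(); ++i)
            for (std::size_t v : cells[i].vertices)
                value[i] = std::max(value[i], vertex_degree[v]);
        return value;
    }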

He has found that persistent homology has proven useful for studying the structure of these cell networks.

He finds that, when comparing the barcodes produced by this pipeline to the barcodes of complexes built on random complexes whose underlying graph is endowed with the empirically observed degree distribution, persistent $$H_0$$ appears to be similar between the two types of complexes, while persistent $$H_1$$ seems to be very different.

Raul Rabadan talked about The Topology of Evolution.

The only figure in Darwin’s Origin of Species is a (mostly-binary) tree. This “Tree of Life” paradigm has been used since the 1970s to analyze genomic sequences. The first major discovery using the tree paradigm was Carl Woese’s 1977 discovery of the domain Archaea.

Woese excluded viruses because they lacked some of the genes he used. But even if he had included them, he would have had trouble: viruses have a high level of horizontal gene transfer, so the choice of tree as a structure to represent the phylogeny is not very good for viruses.

Nor is it very good for bacteria and archaea. Nor is it very good for plants (even Darwin knew this). Nor is it very good for us: when you get gonorrhoea, it gets you! 10% of gonorrhoea genes are human. 8% of human genes are viral. Your genome is something like a “cemetery of past infections” rather than a list of all your ancestors.

If we can’t use a tree to model evolution, what can we use? Mathematically, the problem is:

Given a set of genomes and a way of comparing them, how do we represent their relationships without importing (too many) assumptions from biology?

We would like an answer which is statistical and incorporates the notion of scale. We’d also like to detect when clonal (descent) transfer happens, and when horizontal (non-descent) transfer happens. Answer: use persistent homology to detect the topology of the genetic data!

Persistence detects not just topology, but topology at scales. In 0th homology, scale represents taxa: as we increase the filtration value, we are collecting together more and more distantly related genomes. In 1st homology, a long bar represents a transfer between distantly-related taxa.

For example, though overall flu genes show a lot of cycles, if we restrict our attention to a particular segment there are almost no long-persisting cycles. HIV, on the other hand, has persistent cycles even when we focus on small suites of genes. Thus persistence detects the fact that gene transfer in flu occurs by trading whole segments, whereas in HIV it occurs by trading much smaller units of genetic material.

For human genomic data, persistence bars are about 2 centiMorgans-per-megabase long.

Michael Robinson talked about Morphisms between Logic Circuits

Logic circuits are described by their truth tables. But computers take time to do computations: fast input switching yields the “wrong” output (as evidenced by the flickering screen on Dr. Robinson’s slides). How can we analyze the failures of circuits due to problems of timing? Use sheaves!

Sheaves allow local specifications (we are really good at understanding small circuits) to determine global behavior (what we need to get a handle on). Plus, sheaves whose stalks are vector spaces are ‘just’ linear-algebraic, so we can compute their cohomology using easy, well-known techniques.

As we try to associate a vector space to each logic gate, we encounter various aspects of engineering practice like one-hot encoding.

The zeroth cohomology of the switching sheaf detects the (synchronous) classical logic behavior of the system. The first cohomology of the switching sheaf detects stored information (hence, the possibility of a timing problem in the circuit).

But cohomology of the switching sheaf doesn’t tell us everything. The categorification approach says we should consider morphisms to get more information. Given a circuit we want to understand, we can construct a circuit with the same logical behavior. Then we can ask how many morphisms of the switching sheaves there are which cover the identity on inputs and outputs.

Sometimes there aren’t any such morphisms. Sometimes there are a few. Apparently there are never exactly three.

Exact sequence of a pair: Computing the connecting map

Given a pair of spaces $$(X, A)$$, with $$A \subset X$$, the short exact sequence of the pair is a relationship between the chains $$C_*(A)$$, $$C_*(X/A)$$ and $$C_*(X)$$. The sequence looks like this:

(1)   $$0 \longrightarrow C_*(A) \xrightarrow{i} C_*(X) \xrightarrow{j} C_*(X/A) \longrightarrow 0$$

It is short exact; in particular, $$\ker{j} = \mathrm{im}\,{i}$$. If you want to know more about exact sequences, I recommend reading this.

It turns out that on the level of homology this short exact sequence turns into a long exact sequence:

(2)   $$\ldots \longrightarrow H_*(A) \xrightarrow{i_*} H_*(X) \xrightarrow{j_*} H_*(X/A) \xrightarrow{\delta} H_{*-1}(A) \longrightarrow \ldots$$

and the map $$\delta$$ is called the connecting homomorphism.

The natural question to ask is: what is $$\delta([x])$$? The answer is that $$\delta([x]) = [\partial x]$$, suitably interpreted. However, if you read the aforementioned authoritative text on the subject, you will see that $$\delta$$ is not explicitly defined; its existence is merely proved.

However, given a particular $$X$$ and $$A$$, one is able to come up with an idea of what the map does. I’ll go over the basic diagram chase and then show how this works by example.

Since we are given a representation of $$H_*(X/A)$$, we can begin by taking a homology class $$\hat{c}$$ represented by a relative cycle. Any chain $$c$$ in $$X$$ such that $$j(c)$$ represents $$\hat{c}$$ in $$H_*(X/A)$$ is called a lift.

Assuming we have found a lift, we can compute $$z = \partial(c)$$, and it turns out that $$z \in Z_{*-1}(A)$$ (because $$\partial j = j \partial$$), which means that $$[z] \in H_{*-1}(A)$$. So our mapping sends $$\hat{c} \in H_*(X/A)$$ to $$[z] \in H_{*-1}(A)$$. You now just need to prove that this process is well defined, and this ends your diagram chase, as well as your interest in the details of this sequence. Unless, of course, you are Mr. Cooperman, or a computer scientist.

If you are a computer scientist, you might be dissatisfied, because it is not entirely clear how $$[z]$$ is presented. Indeed, $$z$$ itself is not unique, since it depends on our choice of lift; $$z$$ is likely not even a member of our given set of representatives for homology on $$H_{*-1}(A)$$.

The problem we have is that so far our cycle $$z$$ is written down as a linear combination of cells. We want an equivalent representation in terms of cycles. Noether says we can write down $$Z_{*}(A) \simeq H_{*}(A) \oplus B_{*}(A)$$.
This tells us that we can use our homology basis to form a basis for our cycle group, and it’s not too hard to see that we can extend the cycle basis to a basis for all chains. However, in our case we want to take a cycle expressed in the chain basis and write it as a linear combination of given cycles. This amounts to being able to write down a rectangular “change of basis” matrix between the cell basis and the given cycle basis on $$Z_{*-1}(A)$$.

The rows of this matrix are indexed by our given cycle basis, and the columns by the canonical cell basis for $$C_{*-1}(A)$$. We then look to “solve $$Ax = z$$” over $$\mathbb{Z}_p$$ (in this blog, $$p = 2$$), in the sense that we want to write down a linear combination of the cycles in $$Z_{*-1}(A)$$ which, after applying our matrix, ends up being the $$z$$ we computed.
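
A small sketch of this solve over $$\mathbb{Z}_2$$ (my own illustration, assuming the cycle rows are supplied in echelon form, i.e. with strictly increasing pivots, as in the example below):

    #include <cstddef>
    #include <vector>

    // Solve Ax = z over Z_2, where the rows of A are the given cycles written
    // in the cell basis and are supplied in echelon form (strictly increasing
    // pivots). Returns the Z_2 coefficient of each cycle in the expression of z.
    std::vector<int> express_in_cycle_basis(const std::vector<std::vector<int>>& rows,
                                            std::vector<int> z) {
        std::vector<int> coeff(rows.size(), 0);
        for (std::size_t i = 0; i < rows.size(); ++i) {
            // The pivot of row i is its first nonzero entry.
            std::size_t p = 0;
            while (p < z.size() && rows[i][p] == 0) ++p;
            if (p < z.size() && z[p] == 1) {
                coeff[i] = 1;
                for (std::size_t k = 0; k < z.size(); ++k)
                    z[k] ^= rows[i][k];  // z <- z + row_i over Z_2
            }
        }
        // If z was in the span of the rows, it has now been reduced to zero.
        return coeff;
    }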

Consider this space $$A$$:

[Figure: a simplicial complex on the vertices $$a, b, c, d$$, with edges $$ab, ac, bc, bd, cd$$]

which is homotopic to an annulus.

We can imagine that $$A \subset X$$, and that $$X$$ is obtained by gluing a disc onto the boundary of $$A$$, so that $$X$$ is itself homotopic to a disc; $$X/A$$ is then homotopic to a sphere, with exactly one non-trivial 2nd homology class. The boundary of this class would be the outermost cycle of $$A$$.

Let’s say that we are provided a basis for $$H_1(A) = \langle \alpha = bc + cd + db \rangle$$ and $$B_1 = \langle \beta = ab + bc + ca \rangle$$. If I want to express the outer bounding cycle $$ab + bd + dc + ca$$ in this basis, I begin by writing down the change of basis matrix; in order to solve $$Ax = z$$, I augment the matrix with the identity matrix on the right and $$z$$ on the bottom.

    \[ A' = \left( \begin{array}{c|ccccc|cc} & ab & ac & bc & bd & cd & \alpha & \beta \\ \hline \alpha & 0 & 0 & 1 & 1 & 1 & 1 & 0 \\ \beta & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\ \hline & 1 & 1 & 0 & 1 & 1 & 0 & 0 \end{array} \right) \]

Now you can see that, modulo our extra row for $$z$$, our matrix is in row echelon form, and it suffices to perform Gaussian elimination (over $$\mathbb{Z}_2$$) on this last row to achieve our result:

    \[ \left( \begin{array}{c|ccccc|cc} & ab & ac & bc & bd & cd & \alpha & \beta \\ \hline \alpha & 0 & 0 & 1 & 1 & 1 & 1 & 0 \\ \beta & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\ \hline z & 1 & 1 & 0 & 1 & 1 & 0 & 0 \end{array} \right) \]

    \[ z \rightarrow z + \beta \]

    \[ \left( \begin{array}{c|ccccc|cc} & ab & ac & bc & bd & cd & \alpha & \beta \\ \hline \alpha & 0 & 0 & 1 & 1 & 1 & 1 & 0 \\ \beta & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\ \hline z=\beta & 0 & 0 & 1 & 1 & 1 & 0 & 1 \end{array} \right) \]

    \[ z \rightarrow z + \alpha \]

    \[ \left( \begin{array}{c|ccccc|cc} & ab & ac & bc & bd & cd & \alpha & \beta \\ \hline \alpha & 0 & 0 & 1 & 1 & 1 & 1 & 0 \\ \beta & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\ \hline z=\alpha+\beta & 0 & 0 & 0 & 0 & 0 & 1 & 1 \end{array} \right) \]

Notice how the right-hand side now contains our vector $$x$$, and indeed $$Ax = z = \partial(c)$$.
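
For what it’s worth, the elimination sketch from earlier in this post reproduces this computation, provided the rows are passed in pivot order ($$\beta$$ before $$\alpha$$):

    // Cell basis order: ab, ac, bc, bd, cd.
    std::vector<std::vector<int>> rows = {
        {1, 1, 1, 0, 0},    // beta  = ab + ac + bc  (pivot ab)
        {0, 0, 1, 1, 1}};   // alpha = bc + bd + cd  (pivot bc)
    std::vector<int> z = {1, 1, 0, 1, 1};  // z = ab + ac + bd + cd
    auto x = express_in_cycle_basis(rows, z);
    // x == {1, 1}: indeed z = alpha + beta, as the elimination found.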