ATMCS 6: Day 3

Summaries for Day 3 are contributed by Rachael Phillips.

Jeff Erickson talked about efficiently hex-meshing things with topology. A hex mesh decomposes a volume into hexahedra, that is, polyhedra with six quadrilateral facets. The question is when a quadrilateral mesh of the boundary can be extended to a hexahedral mesh of the interior volume. This is possible exactly when the number of quadrilaterals is even and no odd cycle in the quadrilateral mesh bounds a surface inside the domain. If such a hex mesh exists for a polyhedron in 3-dimensional Euclidean space with quadrilateral facets, then it can be constructed in polynomial time. These results extend to domains that have disconnected boundaries, and they generalize work of Thurston, Mitchell, and Eppstein, where the odd-cycle criterion is trivial. The idea is to take a quadrilateral mesh of the surface and extend it to the interior. What matters is therefore not the shape of the figure, since we are looking not at its geometry but at its topology.

Jose Perea talked about Obstructions to Compatible Extensions of Mappings. From Betti numbers in 1994 to Zig-Zag persistence in 2009, there have been several classic invariants in algebraic topology. The basic recipe is to start from a point cloud, construct a filtration, and use that filtration to compute Betti numbers, which count the k-dimensional holes in a metric space. Beyond this, it would be useful to come up with new ways of encoding multi-scale information from data. The main goal is to fit our data to a model. Using extensions of sections and the retraction problem, Mumford data is used to fit the model. The question is, how far do you have to go for the model to be good? Going from local to global, the model exhibits death-like events, an example being compatible extensions, and birth-like events, where the filtration at each level extends to the next. Once such a model is found, compatibility carries over as the model is extended. The overall aim is to extend the previous invariant methods to new invariant methods for data analysis.
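As a toy illustration of the point cloud, filtration, Betti number pipeline mentioned in this summary (and not of the methods from the talk itself), here is a minimal Python sketch that counts connected components, the zeroth Betti number, of the neighborhood graph of a point cloud at a few scales. The point cloud, the scale values, and the function name `betti0` are all made up for illustration.

```python
import math
from itertools import combinations


def betti0(points, eps):
    """Number of connected components of the graph joining points at distance <= eps."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})


# Hypothetical point cloud: two well-separated clusters in the plane.
points = [(0, 0), (1, 0), (0, 1), (10, 0), (11, 0)]

# Increasing the scale merges components; this sequence is the simplest
# multi-scale summary one can read off a filtration.
for eps in (0.5, 1.5, 12.0):
    print(eps, betti0(points, eps))
```

Tracking at which scale each component merges into another, rather than only counting components per scale, is what turns this into degree-0 persistent homology.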

Donald Sheehy talked about Nested Dissection and (Persistent) Homology. Nested dissection is a way of solving systems of linear equations that improves on naive Gaussian elimination. Improving Gaussian elimination matters because it is a long process that takes a lot of computer memory. By building a filtered simplicial complex and computing its persistent homology, we can try to speed up the process of elimination. Normally, Gaussian elimination has a running time of $$O(n^3)$$, or $$O(n^{\log_2 7})$$ when Strassen's fast matrix multiplication is used. Nested dissection removes a random column in a matrix and separates the graph into two pieces. Separating the matrix into two pieces improves the matrix multiplication and makes use of the topology, while Gaussian elimination is done on the boundary. The nested dissection computes the persistent homology of the space of the matrix. Combining four ingredients (mesh filtrations, nested dissection, geometric separation, and an output-sensitive persistence algorithm) yields a theorem that improves the asymptotic running time of the persistence algorithm.

Shmuel Weinberger talked about Complex and Simple “Topological” invariants. In topological data analysis it is common to discuss the values of topological invariants, but calculating homology does not always give you enough information. The goal is not to look at the dynamics, but to learn about the dynamics. It is possible to look at the persistent homology of any data set; unfortunately, results require conditions on the noise, and “invariance” holds only with high probability. It is important to find examples of Probably Approximately Correct (PAC) computable ideas. Knowing when noise is a problem and when it is not was the main goal of this talk.

ATMCS 6: Day 2

For day 2, Sara Kalisnik and Andrew Blumberg are giving summaries of the talks we heard:

Gunnar Carlsson: Persistence barcodes are natural invariants of finite metric spaces, useful for studying point cloud data. However, they are not well adapted to standard machine learning methods. One approach to this problem is to equip the set of barcodes with a metric, but an even simpler one would be to provide coordinates for the set of barcodes. Gunnar Carlsson talked about coordinatizations of barcode spaces and their properties. He also proposed some ways in which they could be used to obtain information from multidimensional persistence profiles.

Vanessa Robins focused on on-going work about applications of discrete Morse theory and persistent homology to analyzing images of porous materials.  She explained specific connections between the physical structure of the material and the patterns of persistence diagrams.  The results presented were particularly exciting insofar as they represented a very serious and thorough application of computational topology to large quantities of real data.

Radmila Sazdanovic introduced categorification and provided several examples in pure mathematics, especially knot theory, including categorifications of the Jones and chromatic polynomials.

Sayan Mukherjee had two distinct themes. In the first part of the talk, he discussed results (with Boyer and Turner) on the “persistent homology transform” and “Euler characteristic transform”, which roughly speaking are invariants of an object in \(\mathbb{R}^2\) or \(\mathbb{R}^3\) obtained by taking the ensemble of persistent homology or Euler characteristic of slices (relative to some fixed orientation vector). It turns out these are sufficient statistics and moreover seem quite successful in classification problems (e.g., for primate bones). The second part of the talk focused on the problem of manifold learning in the context of mixtures of hyperplanes of different dimensions. The key insight is that a Grassmannian embedding due to Conway, Hardin, and Sloane allows the use of distributions on the sphere to carry out statistical procedures on spaces of hyperplanes.

ATMCS 6: Day 1

Day 1 of ATMCS 6 is now (mostly) over. Small groups of applied topologists are roaming the streets of Vancouver looking for sights, food or drink, while the less hardy of us have already eaten and retired to our rooms for a quiet night, or a night full of last-minute preparations.

Vin de Silva talked about his currently ongoing research into interesting new perspectives on persistence stability theorems and foundational models for persistence modules. One thing that really caught my attention was the idea of metric certificates: many metrics are defined as the supremum or infimum over a range of potential comparison points. The certificates idea summarizes all these approaches under a common header – a comparison point is a certificate, and the metric is produced by optimizing across certificates.

This is put to use to produce extension theorems of the form:
If \(A\) is a subspace of \(B\), and there is a 1-Lipschitz map \(A \to M\), then we can construct a 1-Lipschitz map \(B \to M\).
These theorems turn out to hold in a bunch of situations, and to be highly relevant for persistent homology.
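To make the certificate idea concrete, here is the classical McShane–Whitney formula for the special case \(M = \mathbb{R}\). This is a standard textbook construction given as an illustration under that assumption, not necessarily the one used in the talk. Given a nonempty subspace \(A\) of a metric space \((B, d)\) and a 1-Lipschitz map \(f \colon A \to \mathbb{R}\), define

```latex
F(b) \;=\; \inf_{a \in A} \bigl( f(a) + d(a, b) \bigr), \qquad b \in B.
```

Each point \(a \in A\) acts as a certificate giving the upper bound \(f(a) + d(a, b)\), and \(F\) is obtained by optimizing over certificates. For \(a' \in A\), the 1-Lipschitz property gives \(f(a) + d(a, a') \geq f(a')\) for every \(a\), with equality at \(a = a'\), so \(F\) agrees with \(f\) on \(A\). The triangle inequality \(d(a, b) \leq d(a, b') + d(b', b)\) gives \(F(b) \leq F(b') + d(b, b')\), so \(F\) is 1-Lipschitz on \(B\).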

Amit Patel talked about his work on Quillen 2-categories and their relationship to persistent homology. It turns out there are ways of talking about persistent homology that pull in some sheaf-theoretic perspectives and naturally produce a Quillen 2-category that encodes much of the structure.

Sarah Day talked about Conley index theory and symbolic dynamics, with some research geared towards using symbolic dynamics approximations of dynamical systems to discover models and pick out cycles and stable behaviors. Within this project, Conley indices turn out to be useful tools.

Tamal Dey talked about the Graph induced complex, and work with Fengtao Fan and Yusu Wang on data sparsification by building topological models and simplifying the computational complexity of generating topological inferences.

I’ll see if I can recruit volunteers from the audience to keep a stream of conference updates flowing here.

ICML Workshop: Topological Methods for Machine Learning

Description:

“This workshop will focus on the following question: Which promising directions in computational topology can mathematicians and machine learning researchers work on together, in order to develop new models, algorithms, and theory for machine learning? While all aspects of computational topology are appropriate for this workshop, our emphasis is on topology applied to machine learning — concrete models, algorithms and real-world applications.”

More here: http://topology.cs.wisc.edu

AIM Workshop: Generalized persistence and applications

This workshop will be devoted to generalizations of persistent homology with a particular emphasis on finding calculable algebraic invariants useful for applications. Applications of persistence (for example, signal processing, drug design, tumor identification, shape classification, and geometric inference) rely on the classification of persistence via barcodes, geometrization of the space of barcodes via metrics or as an algebraic variety, and on efficient algorithms. Accordingly, this workshop will bring together theoreticians, computer scientists, and the users of computational topology.

The main topics for the workshop are:

  • Generalizations of persistence: multidimensional persistence, well groups, (co)sheaves
  • Algorithms
  • Geometrization
  • Applications

The workshop will differ from typical conferences in some regards. Participants will be invited to suggest open problems and questions before the workshop begins, and these will be posted on the workshop website. These include specific problems on which there is hope of making some progress during the workshop, as well as more ambitious problems which may influence the future activity of the field. Lectures at the workshop will be focused on familiarizing the participants with the background material leading up to specific problems, and the schedule will include discussion and parallel working sessions.

Space and funding are available for a few more participants. If you would like to participate, please apply by filling out the on-line form no later than May 15, 2014. Applications are open to all, and we especially encourage women, underrepresented minorities, junior mathematicians, and researchers from primarily undergraduate institutions to apply.

http://aimath.org/workshops/upcoming/persistence/

Open Question: Lions and Contamination

I’d like to point out an open problem that I find interesting. A good reference is the paper “How many lions are needed to clear a grid?” by Florian Berger, Alexander Gilbers, Ansgar Grüne, and Rolf Klein [1].

Disclaimer:  I would classify this problem as more combinatorial than topological.

Suppose we have a graph which is an \(n \times n\) grid. This graph contains \(n^2\) vertices, and the case \(n=5\) is drawn below.

[Figure: the \(5 \times 5\) grid]

We have \(k\) lions moving on this grid. At each time step a lion occupies a vertex, and between adjacent time steps a lion either stays put or travels across one edge to an adjacent vertex.

We also need to define the subset of vertices \(W(t)\) which are “contaminated” at time \(t\). A lion can clean a contaminated vertex, but in the absence of lions, the contamination spreads. At starting time \(t=0\) every vertex not occupied by a lion is contaminated; this gives \(W(0)\). How does the contaminated set update as the lions move? A vertex \(v\) is in \(W(t+1)\) if \(v\) is not covered by a lion at time \(t+1\) and either

  • \(v\) belongs to \(W(t)\), or
  • \(v\) has a neighbor \(u\) in \(W(t)\) such that no lion travels from vertex \(v\) to \(u\) between times \(t\) and \(t+1\).

Suppose you are given \(k\) lions. You get to choose the lions’ starting vertices at time zero in the grid – all other vertices begin contaminated. You get to pick how each lion moves at each time step. Can you design a way to clear all contaminated vertices from the grid?

If \(k \geq n\) then this problem is easy. At time zero simply line up the lions along the left-hand side of the grid, from top to bottom. At each time step, sweep each lion one step to the right. At time \(t=n-1\) the lions will be on the right-hand side of the grid, and there will be no contaminated vertices.
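The update rule for \(W(t)\) and the left-to-right sweep can be checked with a short simulation. The following Python sketch is an illustration only: the function names (`neighbors`, `update`, `sweep`) and the representation of vertices as `(row, column)` pairs are my own choices, not taken from [1].

```python
def neighbors(n, v):
    """Yield the neighbors of vertex v = (row, col) in the n x n grid."""
    r, c = v
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < n and 0 <= cc < n:
            yield (rr, cc)


def update(n, contaminated, lions_before, lions_after):
    """Compute W(t+1) from W(t) and the lion positions at times t and t+1."""
    occupied = set(lions_after)
    # Ordered pairs (from, to), one per lion, for the step t -> t+1.
    moves = set(zip(lions_before, lions_after))
    new = set()
    for r in range(n):
        for c in range(n):
            v = (r, c)
            if v in occupied:
                continue  # a vertex covered by a lion is clean
            # v stays contaminated, or is recontaminated from a neighbor u
            # unless a lion travels from v to u during this step.
            if v in contaminated or any(
                u in contaminated and (v, u) not in moves
                for u in neighbors(n, v)
            ):
                new.add(v)
    return new


def sweep(n, rows):
    """Start one lion in column 0 of each given row and sweep right; return the final W."""
    lions = [(r, 0) for r in rows]
    contaminated = {(r, c) for r in range(n) for c in range(n)} - set(lions)
    for _ in range(n - 1):
        moved = [(r, c + 1) for r, c in lions]
        contaminated = update(n, contaminated, lions, moved)
        lions = moved
    return contaminated


print(len(sweep(5, range(5))))  # n lions, one per row
print(len(sweep(5, range(4))))  # same sweep with the last row left unguarded
```

With one lion per row the final contaminated set is empty. Leaving a row unguarded makes this particular sweep fail, which of course says nothing about whether some cleverer strategy could succeed with fewer lions.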

It is unknown whether \(k=n-1\) lions are sufficient to clear the grid or not. I would guess that most people think \(k=n-1\) lions are insufficient, but nobody has a proof!

An equivalent way to phrase this problem is to use a mobile evader instead of the set of contaminated vertices. Suppose our evader moves at the same speed as the lions: at each time step the evader occupies a vertex, and between adjacent time steps the evader either stays put or crosses one edge. The evader is caught if it occupies the same vertex or crosses the same edge as any lion. It is known that \(k\geq n\) lions can catch any such evader (say by sweeping from left to right), and it is unknown whether \(k=n-1\) lions are sufficient or not. To see the equivalence between the formulations using a mobile evader or contaminated vertices, note that \(W(t)\) is the set of all possible locations of a mobile evader at time \(t\).

This is one of those problems that is harder than it sounds. Upon first hearing it your reaction is that you will have a proof after one evening of hard work. A week later you still haven’t made much progress, and you’re a week behind on your normal research agenda. Consider yourself warned!

One reason why I classify this problem as more combinatorial than topological is that the details of the discretization matter. For example, see Figure 1 of [1] (Note – in this figure, a vertex of the \(n \times n\) grid is drawn as a square. This is their representation of a \(4 \times 4\) grid with 16 vertices, not a \(5 \times 5\) grid with 25 vertices). For a second example, see Figure 5 of [1]. In the 3d version of this problem, you might expect that \(n^2\) lions are necessary to clear an \(n \times n \times n\) grid. Figure 5 of [1] shows that this is false – 8 lions (fewer than \(3^2 = 9\)) are sufficient to clear the \(3 \times 3 \times 3\) grid.

References

[1] Florian Berger, Alexander Gilbers, Ansgar Grüne, and Rolf Klein. How many lions are needed to clear a grid? Algorithms 2009, 2, 1069-1086.

ICMS 2014: Session on “Software for Computational Topology”

The 4th International Congress on Mathematical Software (ICMS 2014) takes place in Seoul, Korea on Aug 5-9. This year, it will host a workshop session dedicated to Computational Topology. Contributions on state-of-the-art software for topological problems as well as applications of such software to other domains are welcome. See the dedicated webpage for more information.

How to contribute: Submit a short abstract of 200-500 words to the session organizer by March 31. You will get a notification about acceptance within one week and, upon positive evaluation, you will give a talk at ICMS. An extended abstract (due end of April) will appear in the conference proceedings. A special issue of Journal of Symbolic Computation will be organized immediately after the workshop.

ATMCS 6 open for registration

Dear ALGTOP-L,

Algebraic Topology: Methods, Computation and Science 6 (ATMCS6) is now open for registration. The conference takes place at PIMS, University of British Columbia, May 26-30.

Confirmed speakers include:

Amit Patel
Donald Sheehy
Jeff Erickson
Jose Perea
Liz Munch
Michael Robinson
Omer Bobrowski
Peter Bubenik
Radmila Sazdanovic
Sayan Mukherjee
Vanessa Robins
Yuliy Baryshnikov
Tamal Dey
Shmuel Weinberger
Raul Rabadan
Chris Hoffman
Vin de Silva

For more details, see:
http://www.pims.math.ca/scientific-event/140526-atmcs

Applied and computational topology refers to the adaptation of topological ideas and techniques to study problems in science and engineering. A particular focus is on using invariants and methods of algebraic topology to understand large high-dimensional data sets. The further development of topological techniques for use in applications and the creation of new areas of application in the subject are amongst the goals of this workshop.

The workshop will bring together leading researchers in this emerging discipline as well as providing an opportunity for young mathematicians to get involved in it. In past years, the ATMCS conference has been very successful in providing a forum for cutting-edge research to be disseminated; attendance tends to represent a broad swath of the diverse research community which works in this area.

Workshop Format:
The workshop will feature lectures and discussion in the morning and afternoon. Mid-morning and afternoon refreshments will be provided during the conference.

On behalf of the organizers
Andrew Blumberg, U Texas
Matthew Kahle, Ohio State
Mikael Vejdemo-Johansson, KTH / IMA / IJS

Source material for Topological Data Analysis

To start off the feature articles at appliedtopology.org, I figured it might be worthwhile collecting good entry points to the field. One of the most common questions I get about persistent homology and topological data analysis is how to get started with our techniques and ideas.

Overview articles and books

First off in the list of entry points is the written word. There are survey articles, overview articles and books written about topological data analysis as a whole, as well as focusing on specific parts.

Topology and Data by Gunnar Carlsson. This survey article came soon after Ghrist’s survey, and covers persistent homology, as well as Mapper for topological simplification and modeling. It also comes with a good discussion of the underlying philosophy of the field.
Start here.

Barcodes: the persistent topology of data by Robert Ghrist. This is the first major survey article to come out, and covers persistent homology and some of its applications.

Topology for computing by Afra Zomorodian. This is the first book format exposition of persistent homology for applied and computational topology. It is a good and self-contained introduction to the field, if ever so slightly dated: in particular, it does not cover anything about zigzag persistence or multi-dimensional persistence.

Computational Topology: an introduction by Herbert Edelsbrunner and John Harer. This book covers the state of the art as of 2010 of computational topology, with some focus on persistent homology: one third of the book is devoted to persistence and its applications. Throughout, the book discusses the underlying theory, the most obvious algorithm, and the fastest known algorithm.

Software packages

So you understand what the underlying ideas of the field are. Next up, you’ll want to try them out on your own data. There are some ways you can go to do this, and they all have their specific strengths and weaknesses.

Plex, jPlex, javaPlex: this sequence of libraries was developed in the Stanford group, with the explicit aim of always interoperating smoothly and easily with Matlab. Of the three, we currently recommend javaPlex unless it does not cover your exact use case, in which case some methods may exist in jPlex. Plex is written in C++, and connects to Matlab through a MEX interface, while jPlex and javaPlex are both Java libraries.

Dionysus: this library, written and maintained by Dmitriy Morozov, provides a platform for developing and experimenting with computational topology algorithms in C++ or in Python. It interfaces with CGAL for low dimensional geometric constructions, and has example applications provided for persistent homology, cohomology, vineyards, alpha shapes and numerous other common techniques.

Perseus: this package, developed by Vidit Nanda, provides a platform for computing persistent homology for cubical and simplicial complexes generated in a number of different ways. It specifically uses methods based on discrete Morse theory for speeding up computations.

pHat: this package, created by Ulrich Bauer, Michael Kerber and Jan Reininghaus, builds on results by the authors that speed up persistence computation through specific tricks that exploit structure in a persistence boundary matrix. It currently uses only Z/2-coefficients and does not construct the complex for you, but it seems to be the fastest publicly available package.

CHomP: this software package came out of the CHomP research project, and consists of a rich collection of tools to work persistently or statically with cubical complex data. For homology on image or voxel collection data, CHomP forms the fastest and most complete analysis system available right now.

We warmly appreciate suggestions for more papers, software, or other resources if you have anything to add to this list.