Applications of Topology and Geometry in Data Science

URBANCIC, ZIVA (2025) Applications of Topology and Geometry in Data Science. Doctoral thesis, Durham University.

Abstract

In recent decades we have experienced a boom in computing power, closely followed by the emergence of novel analytic methods that rely on more expensive computations. Part of this wave are methods that build models of the processes at play from the data collected during observation and view them through the lens of topology and geometry. The primary assumption of these methods is that topological and geometric properties of the model reflect the underlying structure in the data, so probing them can help uncover the laws governing the phenomena under study.
This thesis illustrates the versatility and wide range of use of these methods. We present four works of completely different flavors, covering topics from theory to practice and from mathematics and computer science to cell biology. In the first, we contribute new insights into persistent homology, a method whose output is an algebraic object called a persistence module, which counts and relates topological features of the data at different scales. Our results give guarantees for when two persistence modules are algebraically close enough that we can pair the entries encoding the same underlying feature at each scale and track the evolution of these pairs as the scale increases. We then switch to the setting of neural networks, where we evaluate whether their plasticity relates to how they partition the input space. In particular, we set out to answer whether choosing their initial parameters with the aim of obtaining finer partitions speeds up their learning and increases the accuracy of their final predictions. Next, we provide a framework for topological modeling of spaces that are characterized by both their metric and their directed structure and that appear naturally when the phenomenon under study has a non-reversible component. In particular, we provide two similarity measures that can be used to compare such spaces. Finally, we explore gene expression data obtained from a specific class of neurons, namely monoaminergic neurons, in the brains of fruit flies. Within the data set we identify structure in the form of denser subsets, which is related to the division of neurons into several subtypes. In addition, we uncover the genes that drive this division, and repeat the analysis on a subtype corresponding to dopaminergic neurons.

Item Type:	Thesis (Doctoral)
Award:	Doctor of Philosophy
Faculty and Department:	Faculty of Science > Mathematical Sciences, Department of
Thesis Date:	2025
Copyright:	Copyright of this thesis is held by the author
Deposited On:	28 May 2025 10:57
