On Hierarchical Encoding and Reasoning in Deep Transformer-based Generative Models

SLACK, DEAN, LEWIS (2025) On Hierarchical Encoding and Reasoning in Deep Transformer-based Generative Models. Doctoral thesis, Durham University.

PDF - Accepted Version
7Mb

Abstract

Recent advances in generative Transformer-based foundation models have driven remarkable progress in artificial intelligence, yet their internal mechanisms for representing complex hierarchical structures remain largely unknown, posing significant challenges for interpretability, safety, and robust generalisation. This thesis aims to make progress on these issues by systematically investigating how such models internalise hierarchical structures, the relationship between this learning and behaviours like generalisation versus memorisation, and how hierarchical principles can inform the development of safer, more accurate generative models. To this end, we first introduce novel probing techniques to map the layer-wise emergence of linguistic hierarchies in language models and extend this analysis to the visual domain by developing PSViT: a pixel-space Transformer with hierarchical decompositions of video image patches, shown to learn and generalise hierarchical physical dynamics from raw video data. We investigate memorisation during fine-tuning, establishing an n-gram based early warning signal for verbatim leakage and proposing scalable defences that promote structural generalisation over verbatim memorisation. Building on these insights, we further demonstrate that a unified next-frame prediction framework enables a single model to process text, images, audio, and video without modality-specific encoders, thereby learning shared hierarchical patterns across these diverse inputs. Collectively, our findings underscore that the capacity to learn and represent hierarchical structure is a fundamental characteristic of Transformer models, and that a focused analysis of these underpinnings is crucial for advancing more capable, interpretable, and safer artificial intelligence.

Item Type:	Thesis (Doctoral)
Award:	Doctor of Philosophy
Keywords:	Deep Learning, Machine Learning, Spatiotemporal Modelling, Hierarchical Reasoning, Natural Language Processing
Faculty and Department:	Faculty of Science > Computer Science, Department of
Thesis Date:	2025
Copyright:	Copyright of this thesis is held by the author
Deposited On:	04 Nov 2025 11:43
