GAJBHIYE, AMIT (2020) Enhancing the Reasoning Capabilities of Natural Language Inference Models with Attention Mechanisms and External Knowledge. Doctoral thesis, Durham University.
PDF (Amit_Gajbhiye_PhD_Thesis_2020) - Accepted Version
Available under License Creative Commons Attribution Non-commercial 3.0 (CC BY-NC).
Natural Language Inference (NLI) is fundamental to natural language understanding. The task captures natural language understanding capabilities within a simple formulation: determining whether a natural language hypothesis can be inferred from a given natural language premise. NLI requires an inference system to address the full complexity of linguistic as well as real-world commonsense knowledge; hence, the inferencing and reasoning capabilities of an NLI system are utilised in other complex language applications such as summarisation and machine comprehension. Consequently, NLI has received significant recent attention from both academia and industry. Despite extensive research, contemporary neural NLI models face challenges arising from their sole reliance on training data to comprehend all the linguistic and real-world commonsense knowledge. Further, the different attention mechanisms that are crucial to the success of neural NLI models could be better utilised when employed in combination. In addition, the NLI research field lacks a coherent set of guidelines for applying dropout, one of the most crucial regularisation hyper-parameters in RNN-based NLI models.
In this thesis, we present neural models that leverage attention mechanisms and models that utilise external knowledge to reason about inference. First, a combined attention model that leverages different attention mechanisms is proposed. Experimentation demonstrates that the proposed model is capable of better modelling the semantics of long and complex sentences. Second, to address the limitation of sole reliance on the training data, two novel neural frameworks utilising real-world commonsense and domain-specific external knowledge are introduced. Employing rule-based external knowledge retrieval from knowledge graphs, the first model takes advantage of convolutional encoders and factorised bilinear pooling to augment the reasoning capabilities of state-of-the-art NLI models. Utilising the significant advances in research on contextual word representations, the second model addresses, in unique ways, the existing crucial challenges of external knowledge retrieval, learning the encoding of the retrieved knowledge, and fusing the learned encodings with the NLI representations. Experimentation demonstrates the efficacy and superiority of the proposed models over previous state-of-the-art approaches. Third, to address the lack of dropout investigations, a coherent set of guidelines is introduced, formulated from exhaustive evaluation, analysis and validation on the proposed RNN-based NLI models.
|Item Type:|Thesis (Doctoral)|
|Award:|Doctor of Philosophy|
|Keywords:|Natural Language Inference; NLI; Generic NLI Model; Attention Mechanism; Commonsense Knowledge; Dropout; BERT; Knowledge Graph; ConceptNet; Aristo Tuple; Recurrent Neural Networks; External Knowledge; Deep Learning|
|Faculty and Department:|Faculty of Science > Computer Science, Department of|
|Copyright:|Copyright of this thesis is held by the author|
|Deposited On:|08 Jan 2021 10:14|