\chapter{Abstract}
Capturing the interrelations between concepts is fundamental to natural language understanding.
It constitutes a bridge between two historically separate approaches to artificial intelligence: the use of symbolic and distributed representations.
However, tackling this problem without human supervision poses several issues, and unsupervised models have difficulty echoing the expressive breakthroughs of supervised ones.
This thesis addresses two supervision gaps we identified: the problem of regularization of sentence-level discriminative models and the problem of leveraging relational information from dataset-level structures.

\smallskip

The first gap arises from the increased use of discriminative approaches, such as deep neural network classifiers, in the supervised setting.
These models tend to collapse without supervision.
To overcome this limitation, we introduce two relation distribution losses to constrain the relation classifier into a trainable state.
The second gap arises from the development of dataset-level (aggregate) approaches.
We show that unsupervised models can leverage a large amount of additional information from the structure of the dataset, even more so than supervised models.
We close this gap by adapting existing unsupervised methods to capture topological information using graph convolutional networks.
Furthermore, we show that we can exploit the mutual information between topological (dataset-level) and linguistic (sentence-level) information to design a new training paradigm for unsupervised relation extraction.