Publications

2019

  • C. Alt, M. Hübner, and L. Hennig, "Fine-tuning Pre-Trained Transformer Language Models to Distantly Supervised Relation Extraction," in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 2019, pp. 1388–1398.
    [BibTeX] [Abstract] [Download PDF]

    Distantly supervised relation extraction is widely used to extract relational facts from text, but suffers from noisy labels. Current relation extraction methods try to alleviate the noise by multi-instance learning and by providing supporting linguistic and contextual information to more efficiently guide the relation classification. While achieving state-of-the-art results, we observed these models to be biased towards recognizing a limited set of relations with high precision, while ignoring those in the long tail. To address this gap, we utilize a pre-trained language model, the OpenAI Generative Pre-trained Transformer (GPT) (Radford et al., 2018). The GPT and similar models have been shown to capture semantic and syntactic features, and also a notable amount of “common-sense” knowledge, which we hypothesize are important features for recognizing a more diverse set of relations. By extending the GPT to the distantly supervised setting, and fine-tuning it on the NYT10 dataset, we show that it predicts a larger set of distinct relation types with high confidence. Manual and automated evaluation of our model shows that it achieves a state-of-the-art AUC score of 0.422 on the NYT10 dataset, and performs especially well at higher recall levels.

    @inproceedings{alt_fine-tuning_2019,
    address = {Florence, Italy},
    title = {Fine-tuning {Pre}-{Trained} {Transformer} {Language} {Models} to {Distantly} {Supervised} {Relation} {Extraction}},
    url = {https://www.aclweb.org/anthology/P19-1134},
    abstract = {Distantly supervised relation extraction is widely used to extract relational facts from text, but suffers from noisy labels. Current relation extraction methods try to alleviate the noise by multi-instance learning and by providing supporting linguistic and contextual information to more efficiently guide the relation classification. While achieving state-of-the-art results, we observed these models to be biased towards recognizing a limited set of relations with high precision, while ignoring those in the long tail. To address this gap, we utilize a pre-trained language model, the OpenAI Generative Pre-trained Transformer (GPT) (Radford et al., 2018). The GPT and similar models have been shown to capture semantic and syntactic features, and also a notable amount of “common-sense” knowledge, which we hypothesize are important features for recognizing a more diverse set of relations. By extending the GPT to the distantly supervised setting, and fine-tuning it on the NYT10 dataset, we show that it predicts a larger set of distinct relation types with high confidence. Manual and automated evaluation of our model shows that it achieves a state-of-the-art AUC score of 0.422 on the NYT10 dataset, and performs especially well at higher recall levels.},
    booktitle = {Proceedings of the 57th {Annual} {Meeting} of the {Association} for {Computational} {Linguistics}},
    publisher = {Association for Computational Linguistics},
    author = {Alt, Christoph and Hübner, Marc and Hennig, Leonhard},
    month = jul,
    year = {2019},
    pages = {1388--1398}
    }
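
    The abstract above describes extending a pre-trained transformer to the multi-instance (bag-level) setting of distantly supervised relation extraction, where all sentences mentioning the same entity pair form a bag and one relation is predicted per bag. The following minimal PyTorch sketch illustrates only that bag-level idea; the tiny EmbeddingBag encoder and the label count are placeholders, not the paper's setup, which fine-tunes the OpenAI GPT as the sentence encoder.

    # Illustrative sketch only: a tiny bag-level relation classifier in the spirit of
    # multi-instance learning for distant supervision. The encoder and label count are
    # placeholders; the paper instead fine-tunes the OpenAI GPT as the sentence encoder.
    import torch
    import torch.nn as nn

    class BagClassifier(nn.Module):
        def __init__(self, vocab_size, hidden_dim, num_relations):
            super().__init__()
            self.encode = nn.EmbeddingBag(vocab_size, hidden_dim)  # stand-in sentence encoder
            self.attn = nn.Linear(hidden_dim, 1)                   # selective attention over sentences
            self.classify = nn.Linear(hidden_dim, num_relations)

        def forward(self, bag):
            # bag: list of 1-D LongTensors, one token-id sequence per sentence in the bag
            sents = torch.stack([self.encode(ids.unsqueeze(0)).squeeze(0) for ids in bag])
            weights = torch.softmax(self.attn(sents), dim=0)        # downweight noisy sentences
            bag_repr = (weights * sents).sum(dim=0)
            return self.classify(bag_repr)                          # one relation prediction per bag

    model = BagClassifier(vocab_size=1000, hidden_dim=64, num_relations=53)  # toy vocab; 53 NYT10 labels incl. NA
    bag = [torch.randint(0, 1000, (12,)), torch.randint(0, 1000, (9,))]      # two toy sentences for one entity pair
    loss = nn.functional.cross_entropy(model(bag).unsqueeze(0), torch.tensor([7]))
    loss.backward()                                                          # an optimizer step would follow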

  • C. Alt, M. Hübner, and L. Hennig, "Improving Relation Extraction by Pre-trained Language Representations," in Proceedings of AKBC 2019, Amherst, Massachusetts, 2019, pp. 1–18.
    [BibTeX] [Abstract] [Download PDF]

    Current state-of-the-art relation extraction methods typically rely on a set of lexical, syntactic, and semantic features, explicitly computed in a pre-processing step. Training feature extraction models requires additional annotated language resources, which severely restricts the applicability and portability of relation extraction to novel languages. Similarly, pre-processing introduces an additional source of error. To address these limitations, we introduce TRE, a Transformer for Relation Extraction, extending the OpenAI Generative Pre-trained Transformer [Radford et al., 2018]. Unlike previous relation extraction models, TRE uses pre-trained deep language representations instead of explicit linguistic features to inform the relation classification and combines it with the self-attentive Transformer architecture to effectively model long-range dependencies between entity mentions. TRE allows us to learn implicit linguistic features solely from plain text corpora by unsupervised pre-training, before fine-tuning the learned language representations on the relation extraction task. TRE obtains a new state-of-the-art result on the TACRED and SemEval 2010 Task 8 datasets, achieving a test F1 of 67.4 and 87.1, respectively. Furthermore, we observe a significant increase in sample efficiency. With only 20% of the training examples, TRE matches the performance of our baselines and our model trained from scratch on 100% of the TACRED dataset. We open-source our trained models, experiments, and source code.

    @inproceedings{alt_improving_2019,
    address = {Amherst, Massachusetts},
    title = {Improving {Relation} {Extraction} by {Pre}-trained {Language} {Representations}},
    url = {https://openreview.net/forum?id=BJgrxbqp67},
    abstract = {Current state-of-the-art relation extraction methods typically rely on a set of lexical, syntactic, and semantic features, explicitly computed in a pre-processing step. Training feature extraction models requires additional annotated language resources, which severely restricts the applicability and portability of relation extraction to novel languages. Similarly, pre-processing introduces an additional source of error. To address these limitations, we introduce TRE, a Transformer for Relation Extraction, extending the OpenAI Generative Pre-trained Transformer [Radford et al., 2018]. Unlike previous relation extraction models, TRE uses pre-trained deep language representations instead of explicit linguistic features to inform the relation classification and combines it with the self-attentive Transformer architecture to effectively model long-range dependencies between entity mentions. TRE allows us to learn implicit linguistic features solely from plain text corpora by unsupervised pre-training, before fine-tuning the learned language representations on the relation extraction task. TRE obtains a new state-of-the-art result on the TACRED and SemEval 2010 Task 8 datasets, achieving a test F1 of 67.4 and 87.1, respectively. Furthermore, we observe a significant increase in sample efficiency. With only 20\% of the training examples, TRE matches the performance of our baselines and our model trained from scratch on 100\% of the TACRED dataset. We open-source our trained models, experiments, and source code.},
    booktitle = {Proceedings of {AKBC} 2019},
    author = {Alt, Christoph and Hübner, Marc and Hennig, Leonhard},
    year = {2019},
    pages = {1--18}
    }
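
    For the sentence-level setting described in the TRE abstract, the core recipe is fine-tuning a pre-trained transformer with a relation-classification head on entity-marked input instead of hand-engineered features. The sketch below uses the Hugging Face transformers library with a generic BERT checkpoint, a tiny label set, and ad-hoc [E1]/[E2] markers as stand-ins; the paper itself builds on the OpenAI GPT and its own input representation.

    # Illustrative sketch only: fine-tuning a pre-trained transformer for relation
    # classification. The checkpoint, label set, and [E1]/[E2] markers are assumptions
    # for the example, not the authors' exact TRE setup.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    RELATIONS = ["no_relation", "per:employee_of", "org:founded_by"]  # tiny illustrative label set

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-cased", num_labels=len(RELATIONS)
    )

    # Mark the two entity mentions so the classifier can locate them in the sentence.
    text = "[E1] Steve Jobs [/E1] co-founded [E2] Apple [/E2] in 1976."
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    labels = torch.tensor([RELATIONS.index("org:founded_by")])

    outputs = model(**inputs, labels=labels)   # returns classification loss and logits
    outputs.loss.backward()                    # one fine-tuning step would follow with an optimizer
    print(RELATIONS[outputs.logits.argmax(dim=-1).item()])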