Publication

imec-ETRO-VUB at W-NUT 2020 Shared Task-3: A Multilabel BERT-based system for predicting COVID-19 events

Book Contribution - Book Chapter Conference Contribution

In this paper, we present our system for the W-NUT 2020 shared task on COVID-19 Event Extraction from Twitter. To mitigate the noisy nature of the Twitter stream, our system uses COVID-Twitter-BERT (CT-BERT), a language model pre-trained on a large corpus of COVID-19 related Twitter messages. The system is trained on the COVID-19 Twitter Event Corpus and identifies the text spans that answer pre-defined questions (i.e., slot types) for five COVID-19 related events (i.e., TESTED POSITIVE, TESTED NEGATIVE, CAN-NOT-TEST, DEATH and CURE & PREVENTION). We experimented with different architectures; our best-performing model places a multilabel classifier on top of CT-BERT and is trained jointly on all the slot types of a single event. Our experimental results indicate that our Multilabel-CT-BERT system outperforms the baseline methods by 7 percentage points in micro-averaged F1 score. Our model ranked 4th on the shared task leaderboard.
Book: Conference on Empirical Methods in Natural Language Processing (and forerunners) (2020)
Volume: Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)
Pages: 505-513
Number of pages: 9
Publication year: 2020
Accessibility: Open