
Publication

Value-added tax fraud detection with scalable anomaly detection techniques

Journal contribution - e-publication

The tax fraud detection domain is characterized by very little labelled data (known fraud/legal cases), and the labelled cases that exist are not representative of the population because of sample selection bias. We use unsupervised anomaly detection (AD) techniques, which are uncommon in tax fraud detection research, to deal with these domain issues. We analyse a unique dataset containing the VAT declarations and client listings of all Belgian VAT numbers pertaining to ten sectors. Our methodology consists of applying AD methods to firms belonging to the same sector, and it enables an efficient auditing strategy that can be adopted by tax authorities worldwide. The high lifts and hit rates observed in most sectors demonstrate the success of this approach. Sectoral differences exist because market conditions and legal requirements vary across sectors, and we show that the optimal AD method is sector dependent. We address three methodological questions on which the related literature falls short. (1) Can we design suitable input features? We develop new fraud indicators from specific fields of the VAT form and client listings and show the predictive value of the combination of these features. (2) Can we design fast algorithms to deal with the large data sizes that can occur in the tax domain? New methods are developed and we demonstrate their scalability both theoretically and empirically. (3) How should fraud detection performance be assessed? A new evaluation methodology is proposed that provides reliable performance indications and guarantees that fraud cases are effectively detected by the proposed methods.
Journal: Applied Soft Computing
ISSN: 1568-4946
Volume: 86
Pages: 1 - 20
Year of publication: 2020
Keywords: A1 Journal article
BOF-keylabel: yes
BOF-publication weight: 3
CSS-citation score: 2
Authors from: Higher Education
Accessibility: Open