Connecting morphosyntax and lexical semantics with Elastic Net regression

This project proposes to use regularization methods from machine
learning, more specifically Elastic Net regression (and its siblings
Ridge and Lasso), to investigate lexical semantic effects in
morphosyntactic alternations. These regularization techniques apply
shrinkage to the coefficients; Lasso and Elastic Net can shrink
coefficients to exactly zero and can thus be used for variable
selection, especially when the number of predictors is very large. In
variationist studies, this is often the case if one wishes to enter
lexemes associated with a construction into a regression model to
predict constructional variants. We combine the Elastic Net regularizer
with k-fold cross-validation - a standard procedure - to avoid
overfitting. Our approach mitigates the various drawbacks present in
alternative approaches that are currently used in variationist
linguistics, like random factors in mixed models and collostructional
analysis. We look at ten multifactorially driven alternations from
Dutch. The project offers a transparent pipeline that can easily be
extended to other case studies and to other languages.
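
The idea of entering lexemes as predictors and letting an Elastic Net penalty, tuned by k-fold cross-validation, decide which of them keep a non-zero coefficient can be illustrated with a minimal sketch. This is not the project's actual pipeline: the toy data, the lexeme names, the penalty grid and the simple proximal-gradient fitting routine are all assumptions made for illustration, and an established implementation (for instance glmnet in R) would normally be used instead. Each token is coded by the index of its lexeme; further predictors would simply be added as extra indices.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(x, t):
    """Proximal operator of the L1 penalty: shrink x towards zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def fit(rows, y, n_features, lam, alpha, n_iter=150):
    """Elastic Net logistic regression via proximal gradient descent.

    rows[i] lists the indices of the binary predictors equal to 1 for token i.
    Penalty: lam * (alpha * |w|_1 + (1 - alpha) / 2 * |w|_2^2), intercept unpenalized.
    A fixed number of iterations is used instead of a convergence check.
    """
    n = len(rows)
    # Step size 1/L, with L an upper bound on the curvature of the smooth part.
    step = 1.0 / (0.25 * (1 + max(len(r) for r in rows)) + lam * (1 - alpha))
    w, b = [0.0] * n_features, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for active, yi in zip(rows, y):
            err = (sigmoid(b + sum(w[j] for j in active)) - yi) / n
            grad_b += err
            for j in active:
                grad_w[j] += err
        b -= step * grad_b
        for j in range(n_features):
            # Gradient step on log-loss + ridge part, then soft-threshold for the L1 part.
            z = w[j] - step * (grad_w[j] + lam * (1 - alpha) * w[j])
            w[j] = soft_threshold(z, step * lam * alpha)
    return w, b


def log_loss(rows, y, w, b):
    """Mean negative log-likelihood of the observed variants."""
    total = 0.0
    for active, yi in zip(rows, y):
        p = min(max(sigmoid(b + sum(w[j] for j in active)), 1e-12), 1 - 1e-12)
        total -= math.log(p) if yi == 1 else math.log(1 - p)
    return total / len(rows)


def cross_validate(rows, y, n_features, grid, k=4, seed=0):
    """Return the (lam, alpha) pair with the lowest mean held-out log-loss."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    scores = {}
    for lam, alpha in grid:
        losses = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in idx if i not in held]
            w, b = fit([rows[i] for i in train], [y[i] for i in train],
                       n_features, lam, alpha)
            losses.append(log_loss([rows[i] for i in held_out],
                                   [y[i] for i in held_out], w, b))
        scores[(lam, alpha)] = sum(losses) / k
    return min(scores, key=scores.get), scores


# Hypothetical toy data: 10 lexemes x 40 tokens; 1 = variant A, 0 = variant B.
# Only the first two lexemes are biased towards one of the variants.
lexemes = [f"lexeme_{j}" for j in range(10)]
variant_a_counts = [36, 4] + [20] * 8
rows, y = [], []
for j, n_a in enumerate(variant_a_counts):
    for t in range(40):
        rows.append([j])
        y.append(1 if t < n_a else 0)

# Hypothetical penalty grid: overall strength lam and L1/L2 mixing weight alpha.
grid = [(lam, alpha) for lam in (0.001, 0.01, 0.1) for alpha in (0.5, 1.0)]
best, scores = cross_validate(rows, y, len(lexemes), grid)
w, b = fit(rows, y, len(lexemes), *best)
selected = {lex: round(wj, 2) for lex, wj in zip(lexemes, w) if wj != 0.0}
print("chosen (lam, alpha):", best)
print("lexemes with non-zero coefficients:", selected)
```

In this sketch, lexemes whose coefficient is shrunk to exactly zero are treated as having no detectable preference for either variant, while the sign of a surviving coefficient indicates which variant the lexeme favors.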

Date: 1 Jan 2022 → Today
Keywords: machine learning, regularization, alternation studies
Disciplines: Morphology, Dutch language, Corpus linguistics, Syntax, Computational linguistics