Marina Gomtsyan (LPSM, Sorbonne Université)
Variable selection methods in sparse GLARMA models
We propose novel variable selection methods for sparse GLARMA (Generalised Linear Autoregressive Moving Average) models, which can be used to model discrete-valued time series. These models introduce temporal dependence into a Generalised Linear Model (GLM). The key idea behind our estimation procedure is to first estimate the coefficients of the ARMA part of the GLARMA model and then use a regularised approach, namely the Lasso, to estimate the regression coefficients of the GLM part of the model. Furthermore, we establish a sign-consistency result for the estimator of the regression coefficients in a sparse Poisson model without time dependence. We assess the performance of our proposed methods in simulation studies under different frameworks and on several datasets from the field of molecular biology. Our approaches exhibit very good statistical performance, surpassing other methods in identifying non-null regression coefficients. In addition, their low computational burden enables their application to relatively large datasets. Our proposed methods are implemented in R packages, which are publicly available on the Comprehensive R Archive Network (CRAN).
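
To make the regularised step concrete, here is a minimal sketch of Lasso-penalised Poisson regression in the setting without time dependence, solved by proximal gradient descent with soft-thresholding. It is an illustration under stated assumptions, not the procedure implemented in the R packages: the function names (`soft_threshold`, `poisson_lasso`, `sample_poisson`), the fixed step size, the penalty level and the simulated data are all hypothetical choices, and the ARMA part of the GLARMA model is omitted entirely.

```python
import math
import random


def soft_threshold(z, t):
    """Proximal operator of t * |.|: shrink z towards zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def poisson_lasso(X, y, lam, step=0.05, n_iter=500):
    """Minimise (1/n) * sum_i [exp(x_i'b) - y_i * x_i'b] + lam * ||b||_1.

    Plain proximal gradient descent with a fixed step (an assumption made
    for brevity; a line search would be more robust).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        # Fitted Poisson means under the log link.
        mu = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]
        # Gradient of the averaged negative log-likelihood.
        grad = [sum((mu[i] - y[i]) * X[i][j] for i in range(n)) / n
                for j in range(p)]
        # Gradient step followed by soft-thresholding (the Lasso part).
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return beta


def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit = math.exp(-mean)
    k, prod = 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return k
        k += 1


if __name__ == "__main__":
    rng = random.Random(0)
    true_beta = [1.0, 0.0, 0.0, -1.0, 0.0]  # sparse: two non-null coefficients
    X = [[rng.uniform(-1.0, 1.0) for _ in true_beta] for _ in range(200)]
    y = [sample_poisson(math.exp(sum(b * x for b, x in zip(true_beta, row))), rng)
         for row in X]
    estimate = poisson_lasso(X, y, lam=0.05)
    # Indices of coefficients estimated as non-null (the selected variables).
    print([j for j, b in enumerate(estimate) if b != 0.0])
```

The soft-thresholding step is what sets coefficients exactly to zero, so the selected variables are simply those with a non-zero estimate; a larger `lam` selects fewer variables.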
