The Hessian Screening Rule
Predictor screening rules, which discard predictors from the design matrix before fitting a model, have had considerable impact on the speed with which l1-regularized regression problems, such as the lasso, can be solved. Current state-of-the-art screening rules, however, have difficulties in dealing with highly correlated predictors, often becoming too conservative. In this paper, we present a new screening rule to deal with this issue: the Hessian Screening Rule. The rule uses second-order information from the model to provide more accurate screening as well as higher-quality warm starts. The proposed rule outperforms all studied alternatives on data sets with high correlation for both l1-regularized least-squares (the lasso) and logistic regression. It also performs best overall on the real data sets that we examine.
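To make the idea of predictor screening concrete, below is a minimal sketch of a sequential screening check in the spirit of the well-known strong rule for the lasso, which discards predictor j at the next penalty value lambda_next when |x_j^T r| < 2*lambda_next - lambda_prev, with r the residual at the previous solution. This illustrates screening in general; it is not the Hessian Screening Rule proposed in the paper, and the function name and data below are hypothetical examples written in Python with NumPy.

# Illustrative sketch of sequential predictor screening for the lasso
# (a strong-rule style check; NOT the Hessian Screening Rule itself).
import numpy as np

def strong_rule_screen(X, residual, lambda_prev, lambda_next):
    """Return indices of predictors kept by a sequential strong-rule check.

    X           : (n, p) design matrix
    residual    : y - X @ beta_hat at the previous penalty value
    lambda_prev : previous value on the regularization path
    lambda_next : next (smaller) value on the path
    """
    # Gradient of the least-squares loss at the previous solution.
    correlations = np.abs(X.T @ residual)
    # Discard predictor j when |x_j^T r| < 2 * lambda_next - lambda_prev.
    keep = correlations >= 2 * lambda_next - lambda_prev
    return np.flatnonzero(keep)

# Hypothetical usage on simulated data.
rng = np.random.default_rng(0)
n, p = 100, 1000
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = rng.standard_normal(5)
y = X @ beta_true + rng.standard_normal(n)

# At lambda_max the lasso solution is zero, so the residual is simply y.
lambda_max = np.max(np.abs(X.T @ y))
kept = strong_rule_screen(X, y, lambda_max, 0.95 * lambda_max)
print(f"kept {kept.size} of {p} predictors at the next lambda")

As the abstract describes, the Hessian Screening Rule replaces this kind of first-order check with an estimate that also uses second-order (Hessian) information from the fitted model, which is what allows it to screen more accurately when predictors are highly correlated and to supply better warm starts.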
Citation
@inproceedings{larsson2022,
  author    = {Larsson, Johan and Wallin, Jonas},
  editor    = {Koyejo, S. and Mohamed, S. and Agarwal, A. and Belgrave, D. and Cho, K. and Oh, A.},
  publisher = {Curran Associates, Inc.},
  title     = {The {Hessian} Screening Rule},
  booktitle = {Advances in Neural Information Processing Systems},
  volume    = {35},
  pages     = {15823--15835},
  date      = {2022-12-06},
  address   = {Red Hook, NY, USA},
  url       = {https://papers.nips.cc/paper_files/paper/2022/hash/65a925049647eab0aa06a9faf1cd470b-Abstract-Conference.html},
  langid    = {en},
  abstract  = {Predictor screening rules, which discard predictors from the design matrix before fitting a model, have had considerable impact on the speed with which l1-regularized regression problems, such as the lasso, can be solved. Current state-of-the-art screening rules, however, have difficulties in dealing with highly-correlated predictors, often becoming too conservative. In this paper, we present a new screening rule to deal with this issue: the Hessian Screening Rule. The rule uses second-order information from the model to provide more accurate screening as well as higher-quality warm starts. The proposed rule outperforms all studied alternatives on data sets with high correlation for both l1-regularized least-squares (the lasso) and logistic regression. It also performs best overall on the real data sets that we examine.}
}