Coordinate Descent for SLOPE
An online talk for AISTATS 2023 presenting our work on coordinate descent for SLOPE.
The lasso is the most famous sparse regression and feature selection method. One reason for its popularity is the speed at which the underlying optimization problem can be solved. Sorted L-One Penalized Estimation (SLOPE) is a generalization of the lasso with appealing statistical properties. Despite this, the method has not yet attracted widespread interest. A major reason is that current software packages for fitting SLOPE rely on algorithms that perform poorly in high dimensions. To tackle this issue, we propose a new fast algorithm for solving the SLOPE optimization problem, which combines proximal gradient descent and proximal coordinate descent steps. We provide new results on the directional derivative of the SLOPE penalty and its related SLOPE thresholding operator, and we provide convergence guarantees for our proposed solver. In extensive benchmarks on simulated and real data, we show that our method outperforms a long list of competing algorithms.
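As a concrete illustration of the problem being solved, here is a minimal Python sketch of SLOPE with a plain proximal gradient baseline. This is not the hybrid solver from the paper; it only shows the objective and the proximal operator of the sorted ℓ1 norm, which reduces to a decreasing isotonic regression on the sorted magnitudes (computed here with scikit-learn). The names prox_slope and slope_pgd, the penalty sequence, and the synthetic data are illustrative choices, not the paper's implementation.

import numpy as np
from sklearn.isotonic import isotonic_regression

def prox_slope(v, lambdas):
    # Prox of the sorted-L1 penalty J(v) = sum_i lambdas[i] * |v|_(i),
    # assuming lambdas is sorted in nonincreasing order.
    sign = np.sign(v)
    v_abs = np.abs(v)
    order = np.argsort(v_abs)[::-1]      # indices sorting |v| in decreasing order
    w = v_abs[order] - lambdas           # shift each sorted magnitude by its penalty
    # Project onto the nonincreasing, nonnegative cone: a decreasing
    # isotonic regression followed by clipping at zero.
    u = np.clip(isotonic_regression(w, increasing=False), 0.0, None)
    out = np.empty_like(v)
    out[order] = u                       # undo the sort
    return sign * out

def slope_pgd(X, y, lambdas, n_iter=500):
    # Plain proximal gradient descent for
    #   min_beta 0.5 * ||y - X beta||^2 + sum_i lambdas[i] * |beta|_(i).
    L = np.linalg.norm(X, 2) ** 2        # Lipschitz constant of the gradient
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y)
        beta = prox_slope(beta - grad / L, lambdas / L)
    return beta

# Tiny usage example on synthetic data.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
beta_true = np.zeros(20)
beta_true[:3] = 1.0
y = X @ beta_true + 0.1 * rng.standard_normal(100)
lambdas = np.linspace(2.0, 0.5, 20)      # nonincreasing penalty sequence
beta_hat = slope_pgd(X, y, lambdas)

The solver proposed in the paper alternates full proximal gradient steps like the one above with proximal coordinate descent updates, which is what makes it fast in high dimensions.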
Citation
@unpublished{larsson2022,
  author = {Larsson, Johan},
  title = {Coordinate {Descent} for {SLOPE}},
  date = {2022-03-31},
  url = {https://youtu.be/Zq-jVFgHt5o},
  langid = {en}
}