Needle-in-a-haystack Regularized Regression

by dshin   Last Updated August 10, 2018 10:19 AM

I'm in a setting where I'm modeling a continuous output variable given ~100 predictors and ~100k datapoints. The signal-to-noise ratio is extremely low, and collinearity is very high. Among the predictors are many "needle-in-a-haystack" binary features: a feature $f$ for which $\Pr[f = 1]$ is small (~0.01), but where it is important for the model to be unbiased on the $f = 1$ subpopulation.

When I use OLS, the resulting model is properly unbiased when $f = 1$. However, it has undesirable characteristics stemming from the noise and collinearity.

When I try elastic-net regularization, the noise/collinearity problems go away. However, the regularization appears to make the model tolerate bias on the needle-in-a-haystack features: even when $f$ is selected by the model, the model generates unacceptably large residuals when $f = 1$.
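To make the effect concrete, here is a minimal synthetic simulation (not my real data; the hyperparameters are arbitrary) of the behavior I mean: a rare binary feature with a genuine effect gets shrunk toward zero by the elastic-net penalty, so the regularized model is badly biased on the $f = 1$ rows, while OLS stays roughly unbiased there:

```python
# Synthetic illustration: a rare binary feature with a real effect.
# OLS recovers it; elastic-net (at this penalty) shrinks it away,
# leaving a large mean residual on the f == 1 rows.
import numpy as np
from sklearn.linear_model import ElasticNet, LinearRegression

rng = np.random.default_rng(0)
n = 100_000
X = rng.normal(size=(n, 5))                # dense nuisance features
f = (rng.random(n) < 0.01).astype(float)   # rare binary feature, Pr[f = 1] ~ 0.01
X = np.column_stack([X, f])
y = 5.0 * f + rng.normal(scale=3.0, size=n)  # true effect of f is 5

ols = LinearRegression().fit(X, y)
enet = ElasticNet(alpha=1.0, l1_ratio=0.5).fit(X, y)

mask = f == 1
print("OLS  mean residual | f==1:", (y - ols.predict(X))[mask].mean())
print("Enet mean residual | f==1:", (y - enet.predict(X))[mask].mean())
```

Because $f$ is 1 on only ~1% of rows, its column has tiny variance, so the L1 threshold zeroes its coefficient long before the dense features are affected; that is exactly the residual blow-up I observe.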

I'm wondering how I can get the best of both worlds. Currently I train an elastic-net regularized model first, and then train a second OLS model that predicts its residuals from the needle-in-a-haystack features. This seems to work decently, but I'm wondering whether there is a more standard approach.
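For reference, my current two-stage workaround looks roughly like the sketch below (function names and hyperparameters are illustrative, not from any library): stage 1 fits an elastic-net on all features, stage 2 fits OLS on the stage-1 residuals using only the needle columns, and the final prediction is the sum of the two:

```python
# Sketch of the two-stage approach described above (illustrative names).
import numpy as np
from sklearn.linear_model import ElasticNet, LinearRegression

def two_stage_fit(X, y, needle_cols, alpha=1.0, l1_ratio=0.5):
    # Stage 1: regularized fit on all features.
    enet = ElasticNet(alpha=alpha, l1_ratio=l1_ratio).fit(X, y)
    # Stage 2: unregularized fit of the residuals on the needle columns,
    # which forces the combined model to be unbiased on those subgroups.
    ols = LinearRegression().fit(X[:, needle_cols], y - enet.predict(X))
    return enet, ols

def two_stage_predict(enet, ols, X, needle_cols):
    return enet.predict(X) + ols.predict(X[:, needle_cols])

# Toy check with one rare binary feature (column 5) carrying a real effect.
rng = np.random.default_rng(1)
n = 100_000
X = rng.normal(size=(n, 5))
f = (rng.random(n) < 0.01).astype(float)
X = np.column_stack([X, f])
y = 5.0 * f + rng.normal(scale=3.0, size=n)

enet, ols = two_stage_fit(X, y, needle_cols=[5])
resid = y - two_stage_predict(enet, ols, X, needle_cols=[5])
print("combined mean residual | f==1:", resid[f == 1].mean())
```

Since stage 2 is OLS with an intercept on the binary needle column, it matches the residual group means exactly, so the combined prediction has (in-sample) zero mean residual on both the $f = 1$ and $f = 0$ rows.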
