Is it possible to set up a self-learning system via incremental (online) learning?

by mlee_jordan   Last Updated September 21, 2018 14:19 PM

Self-learning and incremental learning are both new to me. I am trying to develop a system for one of my use cases. I have a data set (about 90K observations and 400 features) for a binary classification problem. I first select the important features via DRF and then model the data with GBM. Once the model is deployed, new data (one data point) will be provided on a daily (or weekly) basis. I'd like to use this new information to improve my model without re-training it.

From my searches, incremental (online) learning sounds suitable for my problem. However, the more I read, the more confused I get. For example, the related scikit-learn page says that incremental learning (via partial_fit) is meant for very large data sets that cannot fit in memory.
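To make the question concrete, here is a minimal sketch of the partial_fit pattern as I understand it. It rests on several assumptions: it swaps in a linear SGDClassifier in place of my GBM (scikit-learn's GradientBoostingClassifier has no partial_fit), it uses synthetic data from make_classification rather than my real data set, and the split sizes and variable names are only illustrative. Each new point is scored first and then used for a single-sample update.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the historical data (far smaller than 90K x 400).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_hist, y_hist = X[:1500], y[:1500]   # data available before deployment
X_new, y_new = X[1500:], y[1500:]     # points that "arrive" one at a time

# Scale using statistics from the historical data only.
scaler = StandardScaler().fit(X_hist)

# Initial fit. The first partial_fit call must be told every class label,
# because later single-point updates will only ever show one class at a time.
clf = SGDClassifier(random_state=0)  # assumption: linear model instead of GBM
clf.partial_fit(scaler.transform(X_hist), y_hist, classes=np.array([0, 1]))

# Simulate daily arrivals: predict first, then update with the new point.
correct = 0
for x_i, y_i in zip(X_new, y_new):
    x_row = scaler.transform(x_i.reshape(1, -1))
    correct += int(clf.predict(x_row)[0] == y_i)
    clf.partial_fit(x_row, np.array([y_i]))  # one incremental update, no full re-train

accuracy = correct / len(y_new)
print(f"accuracy measured before each update, over {len(y_new)} new points: {accuracy:.3f}")
```

Scoring each point before updating on it gives an estimate of how the model performs on data it has not yet seen, which is the quantity I would want to monitor after deployment.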

Would incremental learning be suitable for my problem? What suggestions or warnings do you have? Is there something similar in H2O?

Could you suggest a proper method for my problem? Any tutorials would be highly appreciated.

Thanks in advance!
