Self-learning and incremental learning are both new to me. I am trying to develop a system for one of my use cases. Briefly, I have a data set (about 90K observations and 400 features) for a binary classification problem. I first select the important features via DRF (distributed random forest) and then model them with GBM. Once the model is deployed, new data (one data point at a time) will arrive on a daily (or weekly) basis. I'd like to use this new information to improve the model without retraining it from scratch.
From my searches, incremental (online) learning sounds suitable for my problem. However, the more I read, the more confused I get. For example, scikit-learn's page on the topic says incremental learning (via `partial_fit`) is intended for data sets that are too large to fit in memory.
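To make sure I understand the API, here is a minimal sketch of what I think the `partial_fit` workflow would look like, using `SGDClassifier` (one of the scikit-learn estimators that supports it) on toy data standing in for my actual set (the shapes and random data here are just placeholders):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Toy stand-in for the 90K x 400 data set described above.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 400))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# SGDClassifier supports partial_fit, so it can be updated later
# without retraining from scratch.
clf = SGDClassifier(random_state=0)

# Initial fit on the historical data; all classes must be declared on
# the first call, since later batches may contain only one class.
clf.partial_fit(X, y, classes=np.array([0, 1]))

# Later: a single new observation arrives and updates the model in place.
x_new = rng.normal(size=(1, 400))
y_new = np.array([1])
clf.partial_fit(x_new, y_new)
```

If I understand correctly, this updates the model weights incrementally, but it also means giving up GBM for a linear model, which is part of my confusion.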
Would incremental learning be suitable for my problem? What are your suggestions or warnings? Is there something similar in H2O?
Could you suggest a proper method for my problem? Any tutorials would be highly appreciated.
Thanks in advance!