Using k-means to segment customers in the positive class

by Insu Q   Last Updated August 13, 2019 23:19 PM

I have some labeled data (0 = didn’t cancel, 1 = canceled) that I am building a model on for my marketing class.

On top of predicting who is likely to cancel, I’d like to explore the possibility of trying different proactive retention strategies. I was thinking of running k-means on the training data where label=1 to get, say, 4 clusters.

Is this the right way to go about this? I would basically end up with two models and run each customer through the binary classifier, and if it’s predicted to cancel, run the customer through the clustering model.
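To make the two-model idea concrete, here is a minimal sketch of what I have in mind, assuming scikit-learn and purely numeric features. The synthetic data, the logistic regression classifier, the scaling step, and the `route` helper are all placeholders of mine rather than my actual setup; only the label meaning and the 4 clusters come from the description above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer data (assumption: numeric features only).
X, y = make_classification(
    n_samples=1000, n_features=6, n_informative=4, random_state=0
)

# Stage 1: binary classifier trained on all customers
# (0 = didn't cancel, 1 = canceled). Logistic regression is a placeholder.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X, y)

# Stage 2: k-means fitted only on the customers who actually canceled.
segmenter = make_pipeline(
    StandardScaler(), KMeans(n_clusters=4, n_init=10, random_state=0)
)
segmenter.fit(X[y == 1])


def route(customers):
    """Return -1 for predicted non-cancellers, else a cluster id in 0..3."""
    customers = np.asarray(customers)
    segments = np.full(len(customers), -1, dtype=int)
    will_cancel = clf.predict(customers) == 1
    if will_cancel.any():
        # Only customers flagged by the classifier reach the clustering model.
        segments[will_cancel] = segmenter.predict(customers[will_cancel])
    return segments


segments = route(X[:20])
```

Each customer gets either -1 (not predicted to cancel) or one of the 4 cluster ids, and each cluster id would then map to one retention strategy.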

I’m not sure about this approach because k-means is an unsupervised learning method, and I’m sort of helping it by feeding it only the customers in the positive class.

Please share your thoughts on this approach and any suggestions.


