categorical variable

by memile   Last Updated October 19, 2019 23:19 PM

I am working on a project in which I am trying to predict a cost per location. There are 8 variables, and one of them is categorical: postal code, with over 300 levels across the provinces. Will that many levels mess up my predictive model, or is it better to use another method, such as binning, to reduce the number of levels? I am looking for advice, as I will be using decision tree, random forest, KNN, ANN, and logistic regression models. Postal code carries information about individuals, job categories, and average salary. My sample size is 3000 x 13, and I am answering several questions with the data set: estimating the loan amount requested and whether it will be repaid. Thank you.
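To make the binning idea concrete, here is a minimal sketch of two common ways to reduce a high-cardinality categorical: collapsing rare levels into a single bucket, and replacing each level with a smoothed mean of the target. With roughly 3000 rows and 300 postal codes, many levels would have only a handful of rows each, which is the case both approaches try to handle. The function names, the `OTHER` label, the thresholds, and the toy postal codes and costs are all made up for illustration; they are not taken from the actual data set.

```python
from collections import Counter


def collapse_rare_levels(values, min_count=30, other_label="OTHER"):
    """Replace levels seen fewer than min_count times with other_label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


def smoothed_target_encoding(levels, targets, smoothing=10.0):
    """Map each level to its target mean, shrunk toward the global mean.

    Levels with few rows are pulled strongly toward the global mean;
    levels with many rows stay close to their own mean.
    """
    global_mean = sum(targets) / len(targets)
    sums, counts = Counter(), Counter()
    for level, y in zip(levels, targets):
        sums[level] += y
        counts[level] += 1
    return {
        level: (sums[level] + smoothing * global_mean) / (counts[level] + smoothing)
        for level in counts
    }


# Hypothetical toy data: postal code prefixes and a cost per row.
postal_codes = ["A1A", "A1A", "A1A", "B2B", "B2B", "C3C", "D4D"]
costs = [100.0, 120.0, 110.0, 200.0, 220.0, 50.0, 300.0]

# Step 1: bin levels that appear fewer than 2 times into "OTHER".
binned = collapse_rare_levels(postal_codes, min_count=2)

# Step 2: turn the remaining levels into one numeric feature.
encoding = smoothed_target_encoding(binned, costs, smoothing=1.0)
encoded_feature = [encoding[level] for level in binned]

print(binned)
print(encoding)
```

A single numeric column like `encoded_feature` tends to suit distance-based and gradient-based models (KNN, ANN, logistic regression) better than 300 one-hot columns would. If the target encoding is used, the mapping should be computed on the training rows only and then applied to validation and test rows, otherwise the target leaks into the feature.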


