Which is more important: training or testing accuracy?

Here is my question:

I split my data into an 80% training set and a 20% testing set. Using the 80% split with 10-fold cross-validation, I build the model and get the training accuracy. Then I test the model on the 20% split and get the testing accuracy. My question is: which is more important, training or testing accuracy? If I run 10 different machine learning algorithms on the same split, which accuracy should guide me to the best algorithm, training or testing?

Answers 1

The held-out folds in your cross-validation mimic the situation of "true" testing data. So if what matters is how your model performs on new, previously unseen data, you should go by its performance on the CV held-out data rather than on the data it was fitted to. (I have a hard time picturing a situation where training-data performance is more important.)
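To make this concrete, here is a minimal pure-Python sketch on made-up synthetic data. Everything in it is an assumption for illustration, not your setup: the helper names (`make_data`, `fit_1nn`, `fit_threshold`, `cv_accuracy`), the 20% label noise, and the two toy "algorithms". It compares a 1-nearest-neighbour model, which memorises its training data, with a simple threshold rule, and picks between them by 10-fold CV accuracy on the 80% split. The 20% split is touched only once, after the choice is made.

```python
import random

def make_data(n, noise, rng):
    """Synthetic 1-D data: label is 1 when x > 0.5, flipped with probability `noise`."""
    rows = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        rows.append((x, y))
    return rows

def accuracy(predict, rows):
    """Fraction of rows whose label the model predicts correctly."""
    return sum(predict(x) == y for x, y in rows) / len(rows)

def fit_1nn(train):
    """1-nearest-neighbour: returns the label of the closest training point."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

def fit_threshold(train):
    """Picks the cut-off (on a coarse grid) with the best accuracy on `train`."""
    grid = [i / 20 for i in range(21)]
    best_t = max(grid, key=lambda t: accuracy(lambda x: int(x > t), train))
    return lambda x: int(x > best_t)

def cv_accuracy(fit, rows, k=10):
    """Mean accuracy over k held-out folds; each model is refit without its fold."""
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        train = [r for j, f in enumerate(folds) if j != i for r in f]
        scores.append(accuracy(fit(train), held_out))
    return sum(scores) / k

rng = random.Random(0)
data = make_data(200, 0.2, rng)
train, test = data[:160], data[160:]  # 80% / 20% split

candidates = {"1-NN": fit_1nn, "threshold": fit_threshold}
results = {
    name: {"train": accuracy(fit(train), train), "cv": cv_accuracy(fit, train)}
    for name, fit in candidates.items()
}

# Choose the algorithm by CV accuracy, then score it once on the 20% split.
best = max(results, key=lambda name: results[name]["cv"])
final_test_accuracy = accuracy(candidates[best](train), test)

for name, scores in results.items():
    print(name, scores)
print("chosen:", best, "test accuracy:", final_test_accuracy)
```

The 1-NN model scores perfectly on the data it was fitted to, because each training point is its own nearest neighbour, yet that number says nothing about how it handles unseen points. Ranking the candidates by training accuracy would therefore always favour the memoriser; ranking by CV accuracy compares them on data they have not seen.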

