Dear users and experts,
I am not clear about how the input data sample is used.
I have a CSV file with 5 fields and 20k entries (one per line).
When I run a classification training, are all 20k entries used? Or does the algorithm split this data sample randomly?
My final goal is to ensure that I am training on exactly the examples I provide, so that I can improve the training by adding new examples on which the algorithm gives the wrong result.
Best regards.