I am trying to build a linear regression model, but some of my features are not numerical (e.g. "Car color"), while others are (e.g. "Engine size"). For the non-numerical ones, I'm not sure how to represent them as input features. The only way I could think of is to assign each color a different value (e.g. red = 1, blue = 2, green = 3, ...), but this does not seem acceptable, because it implies an ordering in which green is "greater" than red.
Can someone help ... I am implementing this in Java, so I would appreciate algorithms expressed in that language, or language-independent ones.
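To make the problem concrete, here is a minimal Java sketch of the integer coding described above, next to one-hot (indicator) coding, which is a commonly used way to represent a category without implying an order. The class name, method names, and the fixed three-color list are hypothetical and chosen only for illustration.

```java
import java.util.List;

final class CategoricalEncoding {
    // Hypothetical, fixed list of categories; a real data set would supply its own.
    static final List<String> COLORS = List.of("red", "blue", "green");

    /** Integer coding from the question: red = 1, blue = 2, green = 3. */
    static int ordinal(String color) {
        return indexOf(color) + 1;
    }

    /** One-hot coding: one 0/1 indicator per category, so no order is implied. */
    static double[] oneHot(String color) {
        double[] indicators = new double[COLORS.size()]; // initialised to 0.0
        indicators[indexOf(color)] = 1.0;
        return indicators;
    }

    /** Builds one input row: the numeric feature followed by the color indicators. */
    static double[] featureRow(double engineSize, String color) {
        double[] indicators = oneHot(color);
        double[] row = new double[1 + indicators.length];
        row[0] = engineSize;
        System.arraycopy(indicators, 0, row, 1, indicators.length);
        return row;
    }

    // Looks up the position of a category and rejects values not in the list.
    private static int indexOf(String color) {
        int idx = COLORS.indexOf(color);
        if (idx < 0) {
            throw new IllegalArgumentException("Unknown color: " + color);
        }
        return idx;
    }
}
```

With this sketch, `CategoricalEncoding.featureRow(2.0, "blue")` yields the row `{2.0, 0.0, 1.0, 0.0}`. If the regression also fits an intercept term, the indicator columns sum to 1 in every row, so one of them is usually dropped to avoid perfect collinearity.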
Jlove