Variable Weights - Entropy Method
In fundamental handicapping for horse racing, we try to derive predictor variables by mining past-performance data. As ever, our starting point is Bill Benter's fundamental question of handicapping:
What additional variables (if any) explain a significant proportion of the variance in results to date that is not already accounted for by the public odds (Wisdom of Crowds)?
Assuming that we have already identified a number of such variables that appear to influence the outcome of races, how do we weight those variables? Do we weight them separately for different codes (Flat, Jumps), different types (Maiden, Handicap), or different distances (Sprints, Routes) of races?
Obviously, we could use some form of regression analysis to derive the necessary weights but, perhaps, a simpler option presents itself. In the Multiple Criteria Decision Analysis process TOPSIS, the Entropy Weight method is used to derive criteria (variable) weights objectively, based on the dispersion of scores across the alternatives being analysed. Translated into a handicapping scenario, the underlying assumption of this method is that the greater the difference in contestants' scores across multiple criteria, the greater the difference in predicted outcome for some future event. In other words, we operationalize the belief that it is the differences between horses on some key variables, and not their similarities (or the differences between race codes, types, distances, and so on), that best determine the winner. Note also that every race generates its own unique set of weights, and that there can be a mixture of positive (1,3,4,5,6) and negative (2,7) weights.
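To make the mechanics concrete, here is a minimal sketch of the standard entropy weight calculation. The function name, the toy field of four horses, and the three columns (say, a speed figure, a class rating, and a form score) are illustrative assumptions, not data from the article; the formula itself is the usual one: column-normalise the scores into proportions, compute each criterion's entropy, and weight each criterion by its divergence from maximum entropy.

```python
import numpy as np

def entropy_weights(scores: np.ndarray) -> np.ndarray:
    """Entropy weights for an (m alternatives x n criteria) score matrix.

    The more a criterion's scores differ across alternatives, the lower
    its entropy and the larger its weight. Assumes non-negative scores.
    """
    m, _ = scores.shape
    p = scores / scores.sum(axis=0)            # column-normalised proportions
    # By convention 0 * ln(0) = 0, so mask zeros to avoid log(0) warnings
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = np.where(p > 0, p * np.log(p), 0.0)
    e = -plogp.sum(axis=0) / np.log(m)         # entropy per criterion, in [0, 1]
    d = 1.0 - e                                # degree of divergence
    return d / d.sum()                         # weights sum to 1

# Hypothetical field: 4 horses scored on 3 variables
field = np.array([
    [80, 12, 0.9],
    [78, 12, 0.7],
    [55, 11, 0.8],
    [90, 12, 0.6],
])
w = entropy_weights(field)
```

In this toy field the second column is nearly identical across horses, so it receives a very small weight; the first column, where the horses differ most, dominates. This is exactly the "differences, not similarities" intuition from the paragraph above, and re-running the function on a different race produces that race's own unique set of weights.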
This method has some limitations (particularly relating to scores of zero and entropy values close to one). A number of solutions have been proposed to resolve these issues, and the following approach shows promise: New Entropy Weight-Based TOPSIS for Evaluation of Multi-objective Job-Shop Scheduling Solutions.
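As a generic illustration of the zero-score problem, and not the specific correction proposed in the paper above, one common workaround is to apply a small additive shift before normalising, so that no 0 * log(0) term arises and every alternative contributes to each criterion's entropy. The function name and the epsilon value are assumptions for this sketch.

```python
import numpy as np

def entropy_weights_shifted(scores: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Entropy weights with a small additive shift so zero scores do not
    produce log(0). This epsilon shift is a generic workaround, not the
    method from the job-shop scheduling paper referenced in the text."""
    shifted = scores + eps * scores.max()      # shift scaled to the data
    m = shifted.shape[0]
    p = shifted / shifted.sum(axis=0)
    e = -(p * np.log(p)).sum(axis=0) / np.log(m)
    d = 1.0 - e
    return d / d.sum()

# A matrix containing zeros now yields finite weights
raw = np.array([
    [1.0, 0.0],
    [2.0, 3.0],
    [3.0, 1.0],
])
w2 = entropy_weights_shifted(raw)
```

Note the second limitation remains: when all entropies sit very close to one (near-identical scores everywhere), the divergences d are tiny and the resulting weights become numerically unstable, which is the issue the referenced paper addresses directly.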