To guide us in selecting horse races to review, we have WCMI, and for identifying selections to evaluate within a specific race, we have TWPD. But what if we want to evaluate racing at a broader level of aggregation, such as a course or a festival? For example, which racecourses are better for favorites and which for longshots?
By comparing the SP Rank probabilities (Actual) with the Benford distribution (Expected), we can measure how close the two distributions are. The Kullback-Leibler Divergence (KLD) is the building block of the Jensen-Shannon Divergence (JSD), and the square root of the JSD gives the Jensen-Shannon Distance (JSDist). A small JSDist means that the actual winners follow the Benford pattern closely, which indirectly provides insight into the competitiveness of a racecourse.
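For reference, the standard definitions of these three quantities are sketched here, with $P$ standing for the Actual distribution and $Q$ for the Expected (Benford) one. The symbols and the choice of natural logarithm are assumptions made for illustration; a different log base only rescales the values.

$$
\mathrm{KLD}(P \parallel Q) = \sum_i P_i \ln\frac{P_i}{Q_i}, \qquad M = \tfrac{1}{2}(P + Q)
$$

$$
\mathrm{JSD}(P, Q) = \tfrac{1}{2}\,\mathrm{KLD}(P \parallel M) + \tfrac{1}{2}\,\mathrm{KLD}(Q \parallel M), \qquad \mathrm{JSDist}(P, Q) = \sqrt{\mathrm{JSD}(P, Q)}
$$

Unlike KLD, the JSD is symmetric in $P$ and $Q$ and always finite, and JSDist is zero only when the two distributions are identical.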
Our example below is a seven-year snapshot of all-weather races at Kempton racecourse, unfiltered apart from requiring a minimum of nine runners, and you can see that it passes the 'eyeball' test in terms of alignment. Note that both the Actual and Expected columns are scaled so that each sums to one, as the comparison metrics require probability distributions.
SP Rank | Runners | Winners | Actual | Expected |
---|---|---|---|---|
1 | 1541 | 431 | 0.30054 | 0.30103 |
2 | 1501 | 254 | 0.18184 | 0.17609 |
3 | 1501 | 171 | 0.12242 | 0.12494 |
4 | 1468 | 150 | 0.10980 | 0.09691 |
5 | 1457 | 127 | 0.09366 | 0.07918 |
6 | 1500 | 102 | 0.07307 | 0.06695 |
7 | 1443 | 55 | 0.04096 | 0.05799 |
8 | 1427 | 67 | 0.05045 | 0.05115 |
9 | 1379 | 35 | 0.02727 | 0.04576 |
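The two probability columns can be rebuilt from the raw counts. The sketch below assumes that Actual is the strike rate (winners divided by runners) at each SP Rank, rescaled so the nine values sum to one, and that Expected is the usual Benford probability log10(1 + 1/d). The function names are hypothetical, and the derivation of Actual is our reading of the table, not a stated method.

```python
import math

# SP Rank -> (runners, winners), copied from the Kempton (AW) table.
kempton = {
    1: (1541, 431), 2: (1501, 254), 3: (1501, 171),
    4: (1468, 150), 5: (1457, 127), 6: (1500, 102),
    7: (1443, 55), 8: (1427, 67), 9: (1379, 35),
}

def benford_expected(ranks):
    """Benford probability log10(1 + 1/d) for each rank d."""
    return [math.log10(1 + 1 / d) for d in ranks]

def normalized_strike_rates(table):
    """Winners / runners per rank, rescaled so the values sum to one.

    Assumption: this is how the Actual column is derived.
    """
    rates = [winners / runners for runners, winners in table.values()]
    total = sum(rates)
    return [rate / total for rate in rates]

ranks = list(kempton)
actual = normalized_strike_rates(kempton)
expected = benford_expected(ranks)

for rank, a, e in zip(ranks, actual, expected):
    print(f"{rank} | {a:.5f} | {e:.5f}")
```

For ranks 1 to 9 the Benford probabilities already sum to one, so only the Actual column needs rescaling.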
Using the Benford distribution as our baseline, we would expect that approximately 60% of winners would come from the top three in the betting (SP Ranks = 1, 2, and 3) and that is almost exactly what we find in the above example - 60.48% (Actual) vs 60.21% (Expected).
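The top-three comparison and the JSDist itself can be checked with a few lines of standard-library Python. This is a minimal sketch: the helper names are hypothetical, natural logarithms are assumed, and restricting the distance to the three favorites (renormalized to sum to one) is our assumption, chosen because it gives a value close to the 0.00682 reported for Kempton (AW) later on. Over all nine ranks the same function returns a noticeably larger distance, about 0.05.

```python
import math

# Actual and Expected columns from the Kempton (AW) table, SP Ranks 1 to 9.
actual = [0.30054, 0.18184, 0.12242, 0.10980, 0.09366,
          0.07307, 0.04096, 0.05045, 0.02727]
expected = [0.30103, 0.17609, 0.12494, 0.09691, 0.07918,
            0.06695, 0.05799, 0.05115, 0.04576]

def normalize(values):
    """Rescale values so they sum to one."""
    total = sum(values)
    return [v / total for v in values]

def kld(p, q):
    """Kullback-Leibler divergence KLD(P || Q) in nats; zero entries of P are skipped."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence via the midpoint distribution M = (P + Q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kld(p, m) + 0.5 * kld(q, m)

def jsdist(p, q):
    """Jensen-Shannon distance: square root of the JSD of the normalized inputs."""
    p, q = normalize(p), normalize(q)
    return math.sqrt(max(0.0, jsd(p, q)))  # guard against tiny negative rounding

top3_actual = sum(actual[:3])
top3_expected = sum(expected[:3])
print(f"Top three share: {top3_actual:.2%} (Actual) vs {top3_expected:.2%} (Expected)")

jsdist_top3 = jsdist(actual[:3], expected[:3])  # assumption: favorites only
jsdist_all = jsdist(actual, expected)
print(f"JSDist (top three): {jsdist_top3:.5f}")
print(f"JSDist (all nine):  {jsdist_all:.5f}")
```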
Another way to analyze the above example is through the use of the Mean Relative Difference (MRD). This metric allows bettors to compare observed frequencies (actual) against expected frequencies based on a reference distribution, such as the Benford Distribution.
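To calculate the MRD, we must first calculate the relative difference between the Actual ($A_i$) and Expected ($E_i$) values for each SP Rank $i$:

$$RD_i = \frac{A_i - E_i}{E_i}$$

and then take the average of the relative differences over the three favorites (SP Ranks 1 to 3):

$$MRD = \frac{1}{3}\sum_{i=1}^{3} RD_i$$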
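A minimal sketch of these two steps, assuming the relative difference is signed, (Actual - Expected) / Expected, and that the average runs over SP Ranks 1 to 3. Under those assumptions the Kempton figures give roughly 0.36%, in line with the interpretation table that follows. The function names and the `top_n` parameter are hypothetical.

```python
# Actual and Expected values for SP Ranks 1 to 3 from the Kempton (AW) table.
actual = [0.30054, 0.18184, 0.12242]
expected = [0.30103, 0.17609, 0.12494]

def relative_differences(actual, expected):
    """Signed relative difference (A - E) / E for each rank."""
    return [(a - e) / e for a, e in zip(actual, expected)]

def mean_relative_difference(actual, expected, top_n=3):
    """Average of the relative differences over the first top_n ranks.

    Assumption: the MRD is taken over the three favorites only.
    """
    diffs = relative_differences(actual[:top_n], expected[:top_n])
    return sum(diffs) / len(diffs)

mrd = mean_relative_difference(actual, expected)
print(f"MRD: {mrd:.2%}")
```

A positive MRD means the favorites win more often than Benford would suggest, and a negative MRD means they win less often.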
The following table provides a guideline on how to interpret the combination of JSDist and MRD calculations.
ID | Sample | JSDist | MRD | JSDist-View | MRD-View |
---|---|---|---|---|---|
Alpha | 1401 | 0.02434 | 2.50% | Balanced(+) | Favorites(+) |
Kempton (AW) | 1379 | 0.00682 | 0.36% | Balanced(+) | Favorites(-) |
Bravo | 858 | 0.04825 | 5.01% | Balanced(-) | Favorites(+) |
Charlie | 790 | 0.02971 | (1.08%) | Balanced(-) | Favorites(-) |
A 'Balanced(+)' JSDist-View implies that the MRD is spread approximately evenly across all three favorites. A 'Favorites(+)' MRD-View implies that the MRD advantage favors focusing on the three favorites against the field. To derive the 'Balanced / Favorites' inferences, we set an arbitrary threshold of 0.025 for both JSDist and MRD (2.5% in the case of MRD). We would probably need to experiment with these parameters over time to better reflect our preference for convex bets.
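The threshold logic can be sketched as follows. The comparison directions are inferred from the four rows in the table (a JSDist below 0.025 reads as 'Balanced(+)', an MRD at or above 0.025 reads as 'Favorites(+)'), the handling of values exactly on the threshold is an assumption, and '(1.08%)' is read as a negative value. Function and constant names are hypothetical.

```python
JSDIST_THRESHOLD = 0.025  # arbitrary threshold from the text
MRD_THRESHOLD = 0.025     # same value, i.e. 2.5%

def jsdist_view(jsdist):
    """'Balanced(+)' when Actual sits close to Benford, else 'Balanced(-)'."""
    return "Balanced(+)" if jsdist < JSDIST_THRESHOLD else "Balanced(-)"

def mrd_view(mrd):
    """'Favorites(+)' when the three favorites beat Benford by the threshold or more."""
    # Assumption: the boundary is inclusive, so that Alpha (2.50%) reads as (+).
    return "Favorites(+)" if mrd >= MRD_THRESHOLD else "Favorites(-)"

# (ID, JSDist, MRD as a decimal) taken from the interpretation table.
rows = [
    ("Alpha", 0.02434, 0.0250),
    ("Kempton (AW)", 0.00682, 0.0036),
    ("Bravo", 0.04825, 0.0501),
    ("Charlie", 0.02971, -0.0108),
]

for name, jsdist, mrd in rows:
    print(f"{name} | {jsdist_view(jsdist)} | {mrd_view(mrd)}")
```

Keeping the two thresholds as separate constants makes it easy to tune them independently later.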