Wednesday, April 26, 2023

Benford Racecourses (JSDist and MRD)


To guide us in selecting horse races to review, we have WCMI, and for identifying selections to evaluate within a specific race, we have TWPD. But what if we want to evaluate racing at a broader level, such as a course or festival? For example, which racecourses are better for favorites or for longshots?

By comparing the SP Rank win probabilities (Actual) with the Benford distribution (Expected), we can use the Kullback-Leibler Divergence (KLD) to build the Jensen-Shannon Divergence (JSD), and then take the square root of the JSD to obtain the Jensen-Shannon Distance (JSDist). JSDist reflects how close the two distributions are and, indirectly, provides insight into the competitiveness of racecourses.
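For reference, the standard definitions behind these three quantities are sketched here, with P the Actual distribution, Q the Expected distribution, and M their midpoint. The natural logarithm is an assumption; the post does not state a base, and a different base only rescales the result.

$$D_{KL}(P \parallel Q) = \sum_i P_i \ln\frac{P_i}{Q_i}, \qquad M = \tfrac{1}{2}(P + Q)$$

$$\mathrm{JSD}(P, Q) = \tfrac{1}{2} D_{KL}(P \parallel M) + \tfrac{1}{2} D_{KL}(Q \parallel M), \qquad \mathrm{JSDist}(P, Q) = \sqrt{\mathrm{JSD}(P, Q)}$$

Unlike KLD, JSD is symmetric in P and Q and always finite, which is why it is the more convenient base for a distance.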

Our example below is a seven-year snapshot of all-weather races at Kempton racecourse, unfiltered except for a minimum of nine runners, and you can see that it passes the 'eyeball' test in terms of alignment. Note that both the Actual and Expected columns are normalized so that each sums to one, as this is necessary for the calculation of the comparison metrics. The Actual figures are consistent with each rank's win rate (Winners / Runners) rescaled in this way.

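| SP Rank | Runners | Winners | Actual | Expected |
|---|---|---|---|---|
| 1 | 1541 | 431 | 0.30054 | 0.30103 |
| 2 | 1501 | 254 | 0.18184 | 0.17609 |
| 3 | 1501 | 171 | 0.12242 | 0.12494 |
| 4 | 1468 | 150 | 0.10980 | 0.09691 |
| 5 | 1457 | 127 | 0.09366 | 0.07918 |
| 6 | 1500 | 102 | 0.07307 | 0.06695 |
| 7 | 1443 | 55 | 0.04096 | 0.05799 |
| 8 | 1427 | 67 | 0.05045 | 0.05115 |
| 9 | 1379 | 35 | 0.02727 | 0.04576 |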
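As a minimal sketch, the two probability columns can be rebuilt from the Runners and Winners counts. The Expected column is Benford's law, log10(1 + 1/d), for ranks 1 to 9. The construction of Actual (win rate per rank, rescaled to sum to one) is an assumption inferred from the figures, and the variable names are illustrative only.

```python
import math

# Runners and winners by SP rank (1 to 9), copied from the Kempton table.
runners = [1541, 1501, 1501, 1468, 1457, 1500, 1443, 1427, 1379]
winners = [431, 254, 171, 150, 127, 102, 55, 67, 35]

# Expected: Benford's law for d = 1..9, P(d) = log10(1 + 1/d).
expected = [math.log10(1 + 1 / d) for d in range(1, 10)]

# Actual (assumed construction): win rate per rank, rescaled to sum to one.
win_rates = [w / r for w, r in zip(winners, runners)]
total = sum(win_rates)
actual = [rate / total for rate in win_rates]

for rank, (a, e) in enumerate(zip(actual, expected), start=1):
    print(f"{rank}  {a:.5f}  {e:.5f}")
```

Rescaling the win rates, rather than using each rank's share of all winners, allows for the fact that not every rank has the same number of runners.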

Using the Benford distribution as our baseline, we would expect that approximately 60% of winners would come from the top three in the betting (SP Ranks = 1, 2, and 3) and that is almost exactly what we find in the above example - 60.48% (Actual) vs 60.21% (Expected).
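The Expected share for the top three ranks has a simple closed form, because the Benford terms telescope:

$$\sum_{d=1}^{3} \log_{10}\left(1 + \frac{1}{d}\right) = \log_{10}\left(\frac{2}{1} \cdot \frac{3}{2} \cdot \frac{4}{3}\right) = \log_{10} 4 \approx 0.60206$$

The same argument gives a cumulative Expected share of log10(k + 1) for the top k ranks, which reaches exactly one at k = 9.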

Another way to analyze the above example is through the use of the Mean Relative Difference (MRD). This metric allows bettors to compare observed frequencies (actual) against expected frequencies based on a reference distribution, such as the Benford Distribution.

To calculate the MRD, we must first calculate the relative differences for each rank:

$$\text{DiffRel} = \frac{\text{Actual} - \text{Expected}}{\text{Expected}}$$

and then take the average of the relative differences:

$$\text{MRD} = \operatorname{Mean}(\text{DiffRel})$$
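A minimal sketch of the MRD calculation on the Kempton figures follows. The function names are illustrative. Averaging over SP ranks 1 to 3 only is an assumption: it appears to reproduce the 0.36% reported for Kempton (AW) in the interpretation table, whereas averaging over all nine ranks gives a negative value of roughly -3%.

```python
# Actual and Expected columns for SP ranks 1 to 9, copied from the Kempton table.
actual = [0.30054, 0.18184, 0.12242, 0.10980, 0.09366,
          0.07307, 0.04096, 0.05045, 0.02727]
expected = [0.30103, 0.17609, 0.12494, 0.09691, 0.07918,
            0.06695, 0.05799, 0.05115, 0.04576]


def relative_differences(actual, expected):
    """DiffRel = (Actual - Expected) / Expected, one value per rank."""
    return [(a - e) / e for a, e in zip(actual, expected)]


def mean_relative_difference(actual, expected):
    """MRD = Mean(DiffRel)."""
    diffs = relative_differences(actual, expected)
    return sum(diffs) / len(diffs)


# Assumption: the published MRD is averaged over the three favorites only.
mrd_top3 = mean_relative_difference(actual[:3], expected[:3])
mrd_all = mean_relative_difference(actual, expected)

print(f"MRD, ranks 1-3: {mrd_top3:.2%}")
print(f"MRD, ranks 1-9: {mrd_all:.2%}")
```

A positive MRD means the ranks in scope won more often than Benford predicts; a negative MRD means they won less often.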

The following table provides a guideline for interpreting the combination of JSDist and MRD calculations (an MRD shown in parentheses is negative).

| ID | Sample | JSDist | MRD | JSDist-View | MRD-View |
|---|---|---|---|---|---|
| Alpha | 1401 | 0.02434 | 2.50% | Balanced(+) | Favorites(+) |
| Kempton (AW) | 1379 | 0.00682 | 0.36% | Balanced(+) | Favorites(-) |
| Bravo | 858 | 0.04825 | 5.01% | Balanced(-) | Favorites(+) |
| Charlie | 790 | 0.02971 | (1.08%) | Balanced(-) | Favorites(-) |
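The JSDist column can be sketched in plain Python as follows. Two assumptions are made here, both inferred from the numbers rather than stated in the post: the natural logarithm is used, and the calculation is restricted to SP ranks 1 to 3 with both distributions renormalized to sum to one. Under those assumptions the result for Kempton (AW) comes out at approximately 0.0068, in line with the table.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler Divergence KLD(P || Q) in nats; zero terms in p are skipped."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_distance(p, q):
    """Jensen-Shannon Distance: square root of the Jensen-Shannon Divergence."""
    # Renormalize so that each input sums to one.
    p_sum, q_sum = sum(p), sum(q)
    p = [x / p_sum for x in p]
    q = [x / q_sum for x in q]
    # Midpoint distribution M = (P + Q) / 2.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    # Guard against a tiny negative value caused by floating-point rounding.
    return math.sqrt(max(jsd, 0.0))


# Kempton (AW), SP ranks 1 to 3 (assumed scope of the published JSDist).
actual_top3 = [0.30054, 0.18184, 0.12242]
expected_top3 = [0.30103, 0.17609, 0.12494]

jsdist = js_distance(actual_top3, expected_top3)
print(f"JSDist, ranks 1-3: {jsdist:.5f}")
```

If SciPy is available, `scipy.spatial.distance.jensenshannon` performs the same calculation; it also normalizes its inputs and uses the natural logarithm by default.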

A 'Balanced(+)' JSDist-View implies that the MRD is spread approximately evenly across all three favorites. A 'Favorites(+)' MRD-View implies that the MRD advantage favors focusing on the three favorites against the field. To derive the 'Balanced / Favorites' inferences, we set an arbitrary threshold of 0.025 for both JSDist and MRD. We would probably need to experiment with these parameters over time to better reflect our preference for convex bets.
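The labelling rule can be sketched as a small function. The direction of each comparison is an assumption chosen to match the four rows of the table: JSDist below the threshold is read as Balanced(+), and MRD at or above the threshold is read as Favorites(+). The function and constant names are illustrative.

```python
# Arbitrary thresholds from the post; both are candidates for later tuning.
JSDIST_THRESHOLD = 0.025
MRD_THRESHOLD = 0.025


def classify(jsdist, mrd):
    """Return (JSDist-View, MRD-View) labels for one course.

    Assumed rule: a low JSDist is Balanced(+), a high MRD is Favorites(+).
    """
    jsdist_view = "Balanced(+)" if jsdist < JSDIST_THRESHOLD else "Balanced(-)"
    mrd_view = "Favorites(+)" if mrd >= MRD_THRESHOLD else "Favorites(-)"
    return jsdist_view, mrd_view


# (JSDist, MRD as a fraction) for each row of the interpretation table.
courses = {
    "Alpha": (0.02434, 0.0250),
    "Kempton (AW)": (0.00682, 0.0036),
    "Bravo": (0.04825, 0.0501),
    "Charlie": (0.02971, -0.0108),
}

for name, (jsdist, mrd) in courses.items():
    print(name, *classify(jsdist, mrd))
```

Alpha's MRD is shown as 2.50% after rounding, so its label depends on whether the comparison includes the threshold itself; the inclusive form is used here only because it matches the table.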