Sunday, May 23, 2021

Handicap Triage And [Fast-And-Frugal Trees (FFTs)](https://en.wikipedia.org/wiki/Fast-and-frugal_trees)

In Learning from small samples: An analysis of simple decision heuristics, the authors recount how, in 2009, two experienced pilots decided their course of action after '...a commercial passenger plane struck a flock of geese within two minutes of taking off from LaGuardia Airport. The plane immediately and completely lost thrust from both engines, leaving the crew facing a number of critical decisions, one of which was whether they could safely return to LaGuardia. The answer depended on many factors, including the weight, velocity, and altitude of the aircraft, as well as wind speed and direction. None of these factors, however, are directly involved in how pilots make such decisions. As copilot Jeffrey Skiles discussed in a later interview, pilots instead use a single piece of visual information: whether the desired destination is staying stationary in the windshield. If the destination is rising or descending, the plane will undershoot or overshoot the destination, respectively. Using this visual cue, the flight crew concluded that LaGuardia was out of reach, deciding instead to land on the Hudson River. Skiles reported that subsequent simulation experiments consistently showed that the plane would indeed have crashed before reaching the airport...'

Fast-And-Frugal Trees (FFTs) are the ultimate expression of heuristics and have been successfully applied to medical situations - most notably heart disease triage in emergency rooms.

Though obviously worlds apart from life-or-death medical triage, sports handicapping lends itself quite well to the use of heuristics (FFTs) as a shortcut alternative to the elaborate machine-learning approaches that are currently popular! Perhaps it is the engineer's defeasible reasoning that attracts us to these simpler, more elegant, and more easily explained techniques. In horse racing, strong emphasis is placed on a horse's past performances as the most likely indicator of future performance. Such an emphasis has a certain logical appeal on the surface. However, at a deeper level (see accompanying graph - percentages for illustrative purposes only), the key determinant of a horse's performance today is the current form of the stable (i.e. trainer). This scenario was clearly illustrated over the weekend when a graded stakes race was won by a horse (eighth of ten in the betting order) from a stable whose runners are currently beating 62% of their opponents against a market expectation of 45% (see WASP Trainers)! In other words, the simple factor (attribute, cue) of trainer form was probably the defining heuristic in deciding the outcome of that graded stakes race.
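To make the idea concrete, here is a minimal Python sketch of a fast-and-frugal tree: cues are checked one at a time, and each cue offers an immediate exit. Only the trainer-form cue and its 62%/45% figures come from the example above; the second cue, the field names, the decisions, and the runner's price are hypothetical placeholders, not our actual selection rules.

```python
from typing import Callable, NamedTuple


class Cue(NamedTuple):
    """One node of a fast-and-frugal tree: a yes/no test with one exit."""
    name: str
    test: Callable[[dict], bool]
    exit_on: bool   # the test outcome that triggers an exit at this node
    decision: str   # the decision returned when that exit is taken


def fft_decide(runner: dict, cues: list, default: str) -> str:
    """Check cues in order; stop at the first exit, else return the default."""
    for cue in cues:
        if cue.test(runner) == cue.exit_on:
            return cue.decision
    return default


# Hypothetical tree: only the first cue (stable form versus market
# expectation) is taken from the post; the rest is illustrative.
CUES = [
    Cue("stable beating market expectation",
        lambda r: r["stable_beaten_pct"] > r["market_expected_pct"],
        exit_on=False, decision="pass"),
    Cue("priced as a longshot (10/1+, i.e. 11.00 decimal)",
        lambda r: r["price"] >= 11.0,
        exit_on=False, decision="pass"),
]

# Stable figures from the graded-stakes example; the price is made up.
runner = {"stable_beaten_pct": 0.62, "market_expected_pct": 0.45, "price": 15.0}
print(fft_decide(runner, CUES, default="each-way"))
```

The order of the cues matters: the tree stops at the first exit it meets, so the most decisive cue (here, trainer form) goes first.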

Saturday, April 10, 2021

Aintree 2021: Grand-National Chase (G3) Handicapping

Throwing caution to the wind, we will attempt to find live longshots (20/1+) to finish in the money in the 40-runner Grand National Chase (G3) at Aintree 2021.

Taking our lead once again from Shannon-Fano Crowd Handicapping, we can see that the implied probabilities of the win-market prices allow us to divide the race entrants into five separate groups or sub-races (ABCDE). For example, we could decide to handicap each sub-race separately and leave ourselves five selections for punting.

Constraint Satisfaction Handicapping

Our approach is to use the recently posted Trader Probabilities Derivation And K-L Divergence (Part 3) method, which is basically solving a constraint satisfaction problem (CSP).

    Model 20210410_17-15_Aintree_Grand-National
      Variables
        ! --- Public market information ---
        ! Trader win probabilities.
        ! Initial values set to market price implied probabilities.
        ! Lower-bound set to 1.25% - half the natural-odds implied
        ! probability (1/48).
        ! Upper-bound set to 100%.
        acapella_bourgeois = 1/36.00, >=0.00, <=1.00
        alpha_des_obeaux = 1/95.00, >=0.00, <=1.00
        ami_desbois = 1/140.00, >=0.00, <=1.00
        anibale_fly = 1/32.00, >=0.00, <=1.00
        any_second_now = 1/14.50, >=0.00, <=1.00
        balko_des_flos = 1/130.00, >=0.00, <=1.00
        ballyoptic = 1/110.00, >=0.00, <=1.00
        blaklion = 1/100.00, >=0.00, <=1.00
        bristol_de_mai = 1/38.00, >=0.00, <=1.00
        burrows_saint = 1/13.00, >=0.00, <=1.00
        cabaret_queen = 1/120.00, >=0.00, <=1.00
        canelo = 1/60.00, >=0.00, <=1.00
        chriss_dream = 1/80.00, >=0.00, <=1.00
        class_conti = 1/65.00, >=0.00, <=1.00
        cloth_cap = 1/6.40, >=0.00, <=1.00
        definitly_red = 1/90.00, >=0.00, <=1.00
        discorama = 1/22.00, >=0.00, <=1.00
        double_shuffle = 1/150.00, >=0.00, <=1.00
        fagan = 1/130.00, >=0.00, <=1.00
        farclas = 1/23.00, >=0.00, <=1.00
        give_me_a_copper = 1/95.00, >=0.00, <=1.00
        hogans_height = 1/95.00, >=0.00, <=1.00
        jett = 1/110.00, >=0.00, <=1.00
        kauto_riko = 1/150.00, >=0.00, <=1.00
        kimberlite_candy = 1/18.50, >=0.00, <=1.00
        lake_view_lad = 1/75.00, >=0.00, <=1.00
        lord_du_mesnil = 1/40.00, >=0.00, <=1.00
        magic_of_light = 1/25.00, >=0.00, <=1.00
        milan_native = 1/36.00, >=0.00, <=1.00
        minellacelebration = 1/12.00, >=0.00, <=1.00
        minella_times = 1/120.00, >=0.00, <=1.00
        mister_malarky = 1/38.00, >=0.00, <=1.00
        ok_corral = 1/100.00, >=0.00, <=1.00
        potters_corner = 1/30.00, >=0.00, <=1.00
        secret_reprieve = 1/20.00, >=0.00, <=1.00
        shattered_love = 1/75.00, >=0.00, <=1.00
        some_neck = 1/48.00, >=0.00, <=1.00
        sub_lieutenant = 1/100.00, >=0.00, <=1.00
        takingrisks = 1/42.00, >=0.00, <=1.00
        talkischeap = 1/75.00, >=0.00, <=1.00
        the_long_mile = 1/100.00, >=0.00, <=1.00
        tout_est_permis = 1/140.00, >=0.00, <=1.00
        vieux_lion_rouge = 1/100.00, >=0.00, <=1.00
        yala_enki = 1/50.00, >=0.00, <=1.00
        kld
      End Variables
      Equations
        ! Total trader win probabilities must sum to one.
        acapella_bourgeois + &
        alpha_des_obeaux + &
        ami_desbois + &
        anibale_fly + &
        any_second_now + &
        balko_des_flos + &
        ballyoptic + &
        blaklion + &
        bristol_de_mai + &
        burrows_saint + &
        cabaret_queen + &
        canelo + &
        chriss_dream + &
        class_conti + &
        cloth_cap + &
        definitly_red + &
        discorama + &
        double_shuffle + &
        fagan + &
        farclas + &
        give_me_a_copper + &
        hogans_height + &
        jett + &
        kauto_riko + &
        kimberlite_candy + &
        lake_view_lad + &
        lord_du_mesnil + &
        magic_of_light + &
        milan_native + &
        minellacelebration + &
        minella_times + &
        mister_malarky + &
        ok_corral + &
        potters_corner + &
        secret_reprieve + &
        shattered_love + &
        some_neck + &
        sub_lieutenant + &
        takingrisks + &
        talkischeap + &
        the_long_mile + &
        tout_est_permis + &
        vieux_lion_rouge + &
        yala_enki = 1.00
        ! --- Private trader opinions ---
        ! Stable preference groups.
        (acapella_bourgeois + burrows_saint + cabaret_queen + class_conti + &
        cloth_cap + secret_reprieve + vieux_lion_rouge) > &
        (alpha_des_obeaux + ami_desbois + anibale_fly + any_second_now + &
        balko_des_flos + ballyoptic + blaklion + bristol_de_mai + canelo + &
        chriss_dream + definitly_red + discorama + double_shuffle + &
        fagan + farclas + give_me_a_copper + hogans_height + jett + &
        kauto_riko + kimberlite_candy + lake_view_lad + lord_du_mesnil + &
        magic_of_light + milan_native + minellacelebration + &
        minella_times + mister_malarky + ok_corral + potters_corner + &
        shattered_love + some_neck + sub_lieutenant + takingrisks + &
        talkischeap + the_long_mile + tout_est_permis + yala_enki)
        ! Class preference groups.
        (acapella_bourgeois + any_second_now + bristol_de_mai + &
        burrows_saint + chriss_dream + cloth_cap + definitly_red + &
        discorama + farclas + milan_native + minella_times + &
        mister_malarky + secret_reprieve + shattered_love) > &
        (alpha_des_obeaux + ami_desbois + anibale_fly + balko_des_flos + &
        ballyoptic + blaklion + cabaret_queen + canelo + class_conti + &
        double_shuffle + fagan + give_me_a_copper + hogans_height + &
        jett + kauto_riko + kimberlite_candy + lake_view_lad + &
        lord_du_mesnil + magic_of_light + minellacelebration + &
        ok_corral + potters_corner + some_neck + sub_lieutenant + &
        takingrisks + talkischeap + the_long_mile + tout_est_permis + &
        vieux_lion_rouge + yala_enki)
        ! Age preference groups.
        (any_second_now + cabaret_queen + chriss_dream + class_conti + &
        cloth_cap + kimberlite_candy + talkischeap) > &
        (acapella_bourgeois + alpha_des_obeaux + ami_desbois + &
        anibale_fly + balko_des_flos + ballyoptic + blaklion + &
        bristol_de_mai + burrows_saint + canelo + definitly_red + &
        discorama + double_shuffle + fagan + farclas + give_me_a_copper + &
        hogans_height + jett + kauto_riko + lake_view_lad + &
        lord_du_mesnil + magic_of_light + milan_native + &
        minellacelebration + minella_times + mister_malarky + ok_corral + &
        potters_corner + secret_reprieve + shattered_love + some_neck + &
        sub_lieutenant + takingrisks + the_long_mile + tout_est_permis + &
        vieux_lion_rouge + yala_enki)
        ! Saddle-weight preference groups.
        (acapella_bourgeois + alpha_des_obeaux + anibale_fly + &
        any_second_now + balko_des_flos + burrows_saint + &
        kimberlite_candy + lake_view_lad + magic_of_light + &
        mister_malarky + ok_corral + talkischeap + &
        tout_est_permis) > &
        (ami_desbois + ballyoptic + blaklion + bristol_de_mai + &
        cabaret_queen + canelo + chriss_dream + class_conti + &
        cloth_cap + definitly_red + discorama + double_shuffle + fagan + &
        farclas + give_me_a_copper + hogans_height + jett + kauto_riko + &
        lord_du_mesnil + milan_native + minellacelebration + &
        minella_times + potters_corner + secret_reprieve + shattered_love + &
        some_neck + sub_lieutenant + takingrisks + the_long_mile + &
        vieux_lion_rouge + yala_enki)
        ! --- Combination of public and private information ---
        ! Minimize K-LD.
        kld=(1/36.00)*log(((1/36.00)/acapella_bourgeois)) + &
        (1/95.00)*log(((1/95.00)/alpha_des_obeaux)) + &
        (1/140.00)*log(((1/140.00)/ami_desbois)) + &
        (1/32.00)*log(((1/32.00)/anibale_fly)) + &
        (1/14.50)*log(((1/14.50)/any_second_now)) + &
        (1/130.00)*log(((1/130.00)/balko_des_flos)) + &
        (1/110.00)*log(((1/110.00)/ballyoptic)) + &
        (1/100.00)*log(((1/100.00)/blaklion)) + &
        (1/38.00)*log(((1/38.00)/bristol_de_mai)) + &
        (1/13.00)*log(((1/13.00)/burrows_saint)) + &
        (1/120.00)*log(((1/120.00)/cabaret_queen)) + &
        (1/60.00)*log(((1/60.00)/canelo)) + &
        (1/80.00)*log(((1/80.00)/chriss_dream)) + &
        (1/65.00)*log(((1/65.00)/class_conti)) + &
        (1/6.40)*log(((1/6.40)/cloth_cap)) + &
        (1/90.00)*log(((1/90.00)/definitly_red)) + &
        (1/22.00)*log(((1/22.00)/discorama)) + &
        (1/150.00)*log(((1/150.00)/double_shuffle)) + &
        (1/130.00)*log(((1/130.00)/fagan)) + &
        (1/23.00)*log(((1/23.00)/farclas)) + &
        (1/95.00)*log(((1/95.00)/give_me_a_copper)) + &
        (1/95.00)*log(((1/95.00)/hogans_height)) + &
        (1/110.00)*log(((1/110.00)/jett)) + &
        (1/150.00)*log(((1/150.00)/kauto_riko)) + &
        (1/18.50)*log(((1/18.50)/kimberlite_candy)) + &
        (1/75.00)*log(((1/75.00)/lake_view_lad)) + &
        (1/40.00)*log(((1/40.00)/lord_du_mesnil)) + &
        (1/25.00)*log(((1/25.00)/magic_of_light)) + &
        (1/36.00)*log(((1/36.00)/milan_native)) + &
        (1/12.00)*log(((1/12.00)/minellacelebration)) + &
        (1/120.00)*log(((1/120.00)/minella_times)) + &
        (1/38.00)*log(((1/38.00)/mister_malarky)) + &
        (1/100.00)*log(((1/100.00)/ok_corral)) + &
        (1/30.00)*log(((1/30.00)/potters_corner)) + &
        (1/20.00)*log(((1/20.00)/secret_reprieve)) + &
        (1/75.00)*log(((1/75.00)/shattered_love)) + &
        (1/48.00)*log(((1/48.00)/some_neck)) + &
        (1/100.00)*log(((1/100.00)/sub_lieutenant)) + &
        (1/42.00)*log(((1/42.00)/takingrisks)) + &
        (1/75.00)*log(((1/75.00)/talkischeap)) + &
        (1/100.00)*log(((1/100.00)/the_long_mile)) + &
        (1/140.00)*log(((1/140.00)/tout_est_permis)) + &
        (1/100.00)*log(((1/100.00)/vieux_lion_rouge)) + &
        (1/50.00)*log(((1/50.00)/yala_enki))
      End Equations
    End Model

    {
      "acapella_bourgeois" : [1.0344483718E-01],
      "alpha_des_obeaux" : [1.1092490127E-02],
      "ami_desbois" : [6.4013255253E-03],
      "anibale_fly" : [1.1092489557E-02],
      "any_second_now" : [1.1905042800E-01],
      "balko_des_flos" : [1.1092490142E-02],
      "ballyoptic" : [6.4013255253E-03],
      "blaklion" : [6.4013255253E-03],
      "bristol_de_mai" : [6.9307674653E-03],
      "burrows_saint" : [1.0344530933E-01],
      "cabaret_queen" : [5.6574957561E-02],
      "canelo" : [6.4013255292E-03],
      "chriss_dream" : [1.3449984823E-02],
      "class_conti" : [5.6574614025E-02],
      "cloth_cap" : [1.6988521027E-01],
      "definitly_red" : [6.9307668564E-03],
      "discorama" : [6.9307681009E-03],
      "double_shuffle" : [6.4013255253E-03],
      "fagan" : [6.4013255253E-03],
      "farclas" : [6.9307680439E-03],
      "give_me_a_copper" : [6.4013255256E-03],
      "hogans_height" : [6.4013255256E-03],
      "jett" : [6.4013255253E-03],
      "kauto_riko" : [6.4013255253E-03],
      "kimberlite_candy" : [4.9381265450E-02],
      "lake_view_lad" : [1.1092490049E-02],
      "lord_du_mesnil" : [6.4013255341E-03],
      "magic_of_light" : [1.1092489303E-02],
      "milan_native" : [6.9307675202E-03],
      "minellacelebration" : [6.4013255548E-03],
      "minella_times" : [6.9307668115E-03],
      "mister_malarky" : [1.2784649318E-02],
      "ok_corral" : [1.1092490142E-02],
      "potters_corner" : [6.4013255385E-03],
      "secret_reprieve" : [1.3221176794E-02],
      "shattered_love" : [6.9307669456E-03],
      "some_neck" : [6.4013255317E-03],
      "sub_lieutenant" : [6.4013255253E-03],
      "takingrisks" : [6.4013255334E-03],
      "talkischeap" : [4.9381264855E-02],
      "the_long_mile" : [6.4013255253E-03],
      "tout_est_permis" : [1.1092490143E-02],
      "vieux_lion_rouge" : [1.1419641645E-02],
      "yala_enki" : [6.4013255312E-03]
    }

On this occasion, there are four standout horses (apart from the top three in the betting), which have an edge on the market and are worth each-way consideration:


      a. acapella_bourgeois:  36.00
      b. cabaret_queen:      120.00
      c. class_conti:         65.00
      d. talkischeap:         75.00
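
The edge on each of these selections can be recomputed from the model prices and the solver output above, using the same edge expression as the horse-racing template in Part 3 (price times trader probability, minus one). The following is a minimal sketch, assuming decimal prices and a 21.00 cut-off for "20/1+"; it covers only a handful of runners, with probabilities rounded to six decimal places, and the names PRICES, TRADER_PROB, and MIN_LONGSHOT_PRICE are illustrative.

```python
# Decimal prices (from the model) and trader win probabilities (from the
# solver output) for a subset of runners; probabilities are rounded.
PRICES = {
    "cloth_cap": 6.40, "any_second_now": 14.50, "kimberlite_candy": 18.50,
    "acapella_bourgeois": 36.00, "class_conti": 65.00, "talkischeap": 75.00,
    "cabaret_queen": 120.00, "tout_est_permis": 140.00,
}
TRADER_PROB = {
    "cloth_cap": 0.169885, "any_second_now": 0.119050,
    "kimberlite_candy": 0.049381, "acapella_bourgeois": 0.103445,
    "class_conti": 0.056575, "talkischeap": 0.049381,
    "cabaret_queen": 0.056575, "tout_est_permis": 0.011092,
}
MIN_LONGSHOT_PRICE = 21.00  # assumption: 20/1 expressed as a decimal price


def edge(price: float, prob: float) -> float:
    """Expected profit per unit staked at the market price."""
    return price * prob - 1.0


# Keep the longshots only and rank them by edge, largest first.
longshots = sorted(
    ((name, edge(PRICES[name], TRADER_PROB[name]))
     for name in PRICES if PRICES[name] >= MIN_LONGSHOT_PRICE),
    key=lambda item: item[1], reverse=True,
)
for name, value in longshots:
    print(f"{name:20s} {PRICES[name]:7.2f} {value:+.2f}")
```

Within this subset, the four selections head the ranking, each with a computed edge above 2.5, while tout_est_permis comes out at roughly 0.55.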
   

We have completed our task without having to generate 40 win probabilities but simply by identifying preference groups within categories that we ourselves have chosen!

One notable exception to our process is secret_reprieve (20.00), the least-exposed runner in the race but with both high class and FPR ratings: great potential, but it will need a lot of luck in running to be competitive!

Note: We are not saying that those horses we have eliminated are not going to win - simply that they did not meet our criteria for live longshots to run in the money. Also, whether we succeed or fail in a single race is not the measure of the two processes we present here but their long-term performance against the market!

Monday, March 15, 2021

Cheltenham 2021: Supreme-Novices Hurdle Handicapping

Suffice to say that this year's Supreme Novices Championship Hurdle has been gutted with 10 defections at the last stage of declaration. That said, the following analysis was completed before this happened... Remember that handicapping is all about getting the process right. After that, the results will naturally follow. Enjoy!

'Once more unto the breach, dear friends...'

It is time once again for our annual attempt to find live-longshots (10/1+) to finish in the money in the Supreme Novices Hurdle (G1) at Cheltenham 2021.

Taking our lead from Shannon-Fano Crowd Handicapping, we can see that the implied probabilities of the win-market prices allow us to divide the race entrants into five separate groups (ABCDE). For example, the first group (Appreciate It and Metier) accounts for approximately 50% of the market. The question we then ask ourselves is whether the winner will come from this group or the remainder of the field. Similarly with the second group - comparing it to the remainder of the field below it. As a result, we have effectively reduced the race from one comprising 18 runners to four sub-races (AvBCDE, BvCDE, CvDE, and DvE), which is a much simpler cognitive task to undertake. However, for this particular exercise, we will effectively be ignoring groups A and B.
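
The grouping can be sketched in a few lines of Python. This is only one plausible reading of the Shannon-Fano idea, not necessarily the exact rule used here: starting with the favourite, peel off the smallest group that holds at least half of the implied probability still unassigned, and repeat. The prices are the decimal win prices from the Cheltenham model later in this post; the function name and the splitting rule are assumptions.

```python
# Decimal win prices taken from the Cheltenham model later in this post.
PRICES = {
    "appreciate_it": 2.60, "ballyadam": 8.00, "blue_lord": 12.00,
    "bob_olinger": 17.00, "fifty_ball": 70.00, "for_pleasure": 65.00,
    "galopin_des_champs": 200.00, "ganapathi": 42.00, "gowel_road": 44.00,
    "grumpy_charley": 75.00, "guard_your_dreams": 130.00, "irascible": 40.00,
    "keskonrisk": 32.00, "m_c_muldoon": 90.00, "metier": 6.00,
    "shakem_uparry": 410.00, "soaring_glory": 10.00, "third_time_lucki": 85.00,
}


def split_groups(prices: dict, n_groups: int = 5) -> list:
    """Peel off, favourite first, the smallest group holding at least half
    of the implied probability still unassigned (an assumed splitting rule)."""
    ranked = sorted(prices, key=lambda name: prices[name])  # shortest price first
    remaining = sum(1.0 / prices[name] for name in ranked)
    groups = []
    while ranked and len(groups) < n_groups - 1:
        group, mass = [], 0.0
        while ranked and mass < remaining / 2.0:
            name = ranked.pop(0)
            group.append(name)
            mass += 1.0 / prices[name]
        groups.append(group)
        remaining -= mass
    if ranked:
        groups.append(ranked)  # whatever is left forms the last group
    return groups


for label, group in zip("ABCDE", split_groups(PRICES)):
    print(label, group)
```

Under this rule, group A is Appreciate It and Metier, and groups A and B together are the same five horses that appear under "Short Starting Price" in the pencil-and-paper list below.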

This year we will apply two distinct approaches to our analysis:

a. Pencil-and-Paper (Qualitative) and
b. Constraint Satisfaction (Quantitative).

Pencil-And-Paper Handicapping

This is our traditional approach for the last few years, where we use a process of elimination for identifying contenders.

a. Short Starting Price [Appreciate It, Ballyadam, Blue Lord, Metier, Soaring Glory].
b. Connections lack confidence [Galopin Des Champs, Ganapathi, Guard Your Dreams, M C Muldoon, Shakem Up'Arry].
c. Poor FPR [For Pleasure, Galopin Des Champs, Grumpy Charley, Shakem Up'Arry].
d. Pedigree mismatch to former winners [Blue Lord, Ganapathi, Grumpy Charley].
e. Poor "Late-Speed" [For Pleasure, M C Muldoon, Shakem Up'Arry].
f. Poor Cheltenham form [Shakem Up'Arry].
g. Never run on L-H track [Ganapathi].
h. Over-exposed form [For Pleasure].
i. Weak "Strength-Of-Schedule" [Guard Your Dreams].
j. Convincingly beaten by current favorite [Irascible].

We are left with the following five horses, which are worth each-way consideration:

a. Bob Olinger         17.00
b. Keskonrisk          32.00
c. Gowel Road          44.00
d. Fifty Ball          70.00
e. Third Time Lucki    85.00
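
The process of elimination above is simply a set difference, which is easy to double-check in Python. The runner names and elimination lists are copied from this post; the variable names are illustrative.

```python
RUNNERS = {
    "Appreciate It", "Ballyadam", "Blue Lord", "Bob Olinger", "Fifty Ball",
    "For Pleasure", "Galopin Des Champs", "Ganapathi", "Gowel Road",
    "Grumpy Charley", "Guard Your Dreams", "Irascible", "Keskonrisk",
    "M C Muldoon", "Metier", "Shakem Up'Arry", "Soaring Glory",
    "Third Time Lucki",
}

# One entry per elimination rule (a. to j.) in the list above.
ELIMINATED = {
    "a": {"Appreciate It", "Ballyadam", "Blue Lord", "Metier", "Soaring Glory"},
    "b": {"Galopin Des Champs", "Ganapathi", "Guard Your Dreams",
          "M C Muldoon", "Shakem Up'Arry"},
    "c": {"For Pleasure", "Galopin Des Champs", "Grumpy Charley",
          "Shakem Up'Arry"},
    "d": {"Blue Lord", "Ganapathi", "Grumpy Charley"},
    "e": {"For Pleasure", "M C Muldoon", "Shakem Up'Arry"},
    "f": {"Shakem Up'Arry"},
    "g": {"Ganapathi"},
    "h": {"For Pleasure"},
    "i": {"Guard Your Dreams"},
    "j": {"Irascible"},
}

# A runner survives only if no rule eliminates it.
contenders = RUNNERS - set().union(*ELIMINATED.values())
print(sorted(contenders))
```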

Constraint Satisfaction Handicapping

Our alternate approach is to use the recently posted Trader Probabilities Derivation And K-L Divergence (Part 3) method, which is basically solving a constraint satisfaction problem (CSP).

    Model Cheltenham_Supreme-Novices-Hurdle
      Variables
        ! --- Public market information ---
        ! Trader win probabilities.
        ! Initial values set to market price implied probabilities.
        ! Lower-bound set to 0%.
        ! Upper-bound set to 100%.
        appreciate_it = 1/2.60, >=0.00, <=1.00
        ballyadam = 1/8.00, >=0.00, <=1.00
        blue_lord = 1/12.00, >=0.00, <=1.00
        bob_olinger = 1/17.00, >=0.00, <=1.00
        fifty_ball = 1/70.00, >=0.00, <=1.00
        for_pleasure = 1/65.00, >=0.00, <=1.00
        galopin_des_champs = 1/200.00, >=0.00, <=1.00
        ganapathi = 1/42.00, >=0.00, <=1.00
        gowel_road = 1/44.00, >=0.00, <=1.00
        grumpy_charley = 1/75.00, >=0.00, <=1.00
        guard_your_dreams = 1/130.00, >=0.00, <=1.00
        irascible = 1/40.00, >=0.00, <=1.00
        keskonrisk = 1/32.00, >=0.00, <=1.00
        m_c_muldoon = 1/90.00, >=0.00, <=1.00
        metier = 1/6.00, >=0.00, <=1.00
        shakem_uparry = 1/410.00, >=0.00, <=1.00
        soaring_glory = 1/10.00, >=0.00, <=1.00
        third_time_lucki = 1/85.00, >=0.00, <=1.00
        kld
      End Variables
      Equations
        ! Total trader win probabilities must sum to one.
        appreciate_it + &
        ballyadam + &
        blue_lord + &
        bob_olinger + &
        fifty_ball + &
        for_pleasure + &
        galopin_des_champs + &
        ganapathi + &
        gowel_road + &
        grumpy_charley + &
        guard_your_dreams + &
        irascible + &
        keskonrisk + &
        m_c_muldoon + &
        metier + &
        shakem_uparry + &
        soaring_glory + &
        third_time_lucki = 1.00
        ! --- Private trader opinions ---
        ! Ratings preference groups.
        (appreciate_it + ballyadam + soaring_glory + irascible + blue_lord + &
        keskonrisk + fifty_ball) > &
        (metier + guard_your_dreams + galopin_des_champs + ganapathi + &
        gowel_road + shakem_uparry + bob_olinger + third_time_lucki + &
        grumpy_charley + m_c_muldoon + for_pleasure)
        ! Runs preference groups.
        (m_c_muldoon + galopin_des_champs + irascible + keskonrisk + metier + &
        blue_lord + ganapathi + bob_olinger + gowel_road + fifty_ball) > &
        (ballyadam + guard_your_dreams + appreciate_it + soaring_glory + &
        grumpy_charley + shakem_uparry + third_time_lucki + for_pleasure)
        ! Pedigrees preference groups.
        (bob_olinger + gowel_road + shakem_uparry + appreciate_it) > &
        (third_time_lucki + galopin_des_champs + irascible + fifty_ball + &
        ballyadam + guard_your_dreams + soaring_glory + keskonrisk + &
        metier + m_c_muldoon + for_pleasure + blue_lord + grumpy_charley + &
        ganapathi)
        ! --- Combination of public and private information ---
        ! Minimize K-LD.
        kld=(1/2.60)*log(((1/2.60)/appreciate_it)) + &
        (1/8.00)*log(((1/8.00)/ballyadam)) + &
        (1/12.00)*log(((1/12.00)/blue_lord)) + &
        (1/17.00)*log(((1/17.00)/bob_olinger)) + &
        (1/70.00)*log(((1/70.00)/fifty_ball)) + &
        (1/65.00)*log(((1/65.00)/for_pleasure)) + &
        (1/200.00)*log(((1/200.00)/galopin_des_champs)) + &
        (1/42.00)*log(((1/42.00)/ganapathi)) + &
        (1/44.00)*log(((1/44.00)/gowel_road)) + &
        (1/75.00)*log(((1/75.00)/grumpy_charley)) + &
        (1/130.00)*log(((1/130.00)/guard_your_dreams)) + &
        (1/40.00)*log(((1/40.00)/irascible)) + &
        (1/32.00)*log(((1/32.00)/keskonrisk)) + &
        (1/90.00)*log(((1/90.00)/m_c_muldoon)) + &
        (1/6.00)*log(((1/6.00)/metier)) + &
        (1/410.00)*log(((1/410.00)/shakem_uparry)) + &
        (1/10.00)*log(((1/10.00)/soaring_glory)) + &
        (1/85.00)*log(((1/85.00)/third_time_lucki))
      End Equations
    End Model

    {
      "appreciate_it" : [2.7185684754E-01],
      "ballyadam" : [3.0378824642E-02],
      "blue_lord" : [6.4188212946E-02],
      "bob_olinger" : [1.1189333156E-01],
      "fifty_ball" : [3.7051562905E-02],
      "for_pleasure" : [1.7925815422E-02],
      "galopin_des_champs" : [2.6029458023E-02],
      "ganapathi" : [2.6029458036E-02],
      "gowel_road" : [1.1192387537E-01],
      "grumpy_charley" : [1.7925811477E-02],
      "guard_your_dreams" : [1.7925805273E-02],
      "irascible" : [6.4188228012E-02],
      "keskonrisk" : [6.4188226406E-02],
      "m_c_muldoon" : [2.6029458048E-02],
      "metier" : [2.6029460057E-02],
      "shakem_uparry" : [3.8130987929E-02],
      "soaring_glory" : [3.0378827903E-02],
      "third_time_lucki" : [1.7925808455E-02]
    }

This time, there are two standout horses (apart from the favorite), which are worth each-way consideration:

a. Bob Olinger   17.00
c. Gowel Road    44.00

As a sanity check, we should expect some degree of overlap in terms of the selections generated by the two methods. Otherwise, we are being incoherent in our overall process.

One notable exception to our process is For Pleasure (we got lucky - paid 9.50 to show!), a course-and-distance winner, albeit at a lower grade, who at a projected starting price of 50/1+ is worth an each-way saver!

Note: Given the limited exposure of all the runners, we are not saying that those horses we have eliminated are not going to win - simply that they did not meet our criteria for live longshots to run in the money. Also, whether we succeed or fail in a single race is not the measure of the two processes we present here but their long-term performance against the market!

Monday, February 22, 2021

Trader Probabilities Derivation And K-L Divergence (Part 3)

Returning one last time to the trader win probabilities derivation (TWPD) question from Trader Probabilities Derivation And K-L Divergence (Part 1) and Trader Probabilities Derivation And K-L Divergence (Part 2):

How do we create a coherent set of trader win probabilities that does not stray too far from the implied win market probabilities while taking into account our own limited and possibly vague insights?

On this occasion, we will focus on soccer and add a late-breaking injury report from social media to our own 'gut instinct' about the likely outcome.

As before, we are using our 'no-frills' Python script to access APMonitor-GEKKO:

    import json
    import os
    import shutil

    from gekko import GEKKO

    # Read the APM model text from the file 'apm' next to this script.
    d_path = os.path.dirname(os.path.realpath(__file__))
    with open(d_path + '/apm') as file:
        model = file.read()

    # Solve locally and report the objective value.
    m = GEKKO(remote=False)
    m.Raw(model)
    m.solve(disp=False)
    print('Objective: ', m.options.OBJFCNVAL)

    # Copy the solver results next to this script, then load and print them.
    shutil.copy(m.path+'/Results.json', d_path + '/json')
    with open(m.path+'/Results.json') as f:
        results = json.load(f)
    print(results)

with the input - apm:

    Model 20210223_Follis_LeagueMatch
      Variables
        ! --- Public market information ---
        ! Trader win probabilities.
        ! Initial values set to market price implied probabilities.
        ! Lower-bound set to 17% - half the natural-odds implied
        ! probability (1/3).
        ! Upper-bound set to 100%.
        home = 1.00/4.20, >=0.17, <=1.00
        away = 1.00/3.60, >=0.17, <=1.00
        draw = 1.00/1.98, >=0.17, <=1.00
        kld
      End Variables
      Equations
        ! Total trader win probabilities must sum to one.
        home + &
        away + &
        draw = 1.00
        ! --- Private trader opinions ---
        ! Gut instinct.
        draw >= 0.50
        draw <= 0.60
        ! Late Twitter away-team injury report.
        home >= 0.25
        home <= 0.30
        ! --- Combination of public and private information ---
        ! Minimize K-LD.
        kld=(1.00/4.20)*log(((1.00/4.20)/home)) + &
        (1.00/3.60)*log(((1.00/3.60)/away)) + &
        (1.00/1.98)*log(((1.00/1.98)/draw))
      End Equations
    End Model

and the generated output - json:

    {
      "home" : [ 2.7046072411E-01],
      "away" : [ 2.0172750176E-01],
      "draw" : [ 5.2781177413E-01],
      "kld" : [ 3.6252143687E-02]
    }

Once again, as a sanity check, we can test the derived set of trader win probabilities against our constraints:

  • min(home, away, draw) >= 17% [0.20173];
  • 0.25 <= home <= 0.30 [0.27046];
  • 0.50 <= draw <= 0.60 [0.52781].
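
The same checks can be automated. Below is a minimal Python sketch that tests the solver output against the bounds listed above and recomputes the kld expression exactly as it is written in the model (raw implied probabilities 1/price, natural logarithm). The prices and results are copied from the listings above; the dictionary names and tolerances are illustrative choices.

```python
import math

# Market prices and solver output copied from the listings above.
PRICES = {"home": 4.20, "away": 3.60, "draw": 1.98}
RESULT = {"home": 2.7046072411e-01, "away": 2.0172750176e-01,
          "draw": 5.2781177413e-01, "kld": 3.6252143687e-02}

probs = {outcome: RESULT[outcome] for outcome in PRICES}

checks = {
    "sums to one": math.isclose(sum(probs.values()), 1.0, abs_tol=1e-6),
    "lower bound 17%": min(probs.values()) >= 0.17,
    "home in [0.25, 0.30]": 0.25 <= probs["home"] <= 0.30,
    "draw in [0.50, 0.60]": 0.50 <= probs["draw"] <= 0.60,
}

# Recompute the kld expression term by term, as in the model file.
kld = sum((1 / PRICES[o]) * math.log((1 / PRICES[o]) / probs[o]) for o in PRICES)
checks["kld matches output"] = math.isclose(kld, RESULT["kld"], abs_tol=1e-4)

for name, passed in checks.items():
    print(f"{name:22s} {'ok' if passed else 'FAILED'}")
```

Note that these checks establish feasibility, and that the reported kld is consistent with the reported probabilities; they do not by themselves test whether the kld is at its minimum.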

In sum, we have expertly combined the wisdom of the crowds with our own limited insights to derive a coherent and valid set of trader win probabilities (TWPD). Historically, professional traders (e.g. Bill Benter) have 'melded' their own odds-line with the win-market prices to gain a more informed opinion. We have revised that process by baselining on the win-market prices and adding a few constraints where we believe we have additional information (e.g. late-breaking injury or weather reports from social media) while relying on our automatic method to generate the updated trader win probabilities and, of course, indirectly our own odds-line:

Outcome  Private   Public
Home:    3.70      4.20
Away:    4.95      3.60
Draw:    1.89      1.98
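
The private odds-line is just the reciprocal of each trader win probability. A short Python sketch, using the probabilities from the json output above (the variable names are illustrative):

```python
# Trader win probabilities from the json output; market prices from the model.
TRADER_PROB = {"home": 0.27046072411, "away": 0.20172750176, "draw": 0.52781177413}
MARKET_PRICE = {"home": 4.20, "away": 3.60, "draw": 1.98}

# A fair decimal price is the reciprocal of the win probability.
private_line = {outcome: 1.0 / prob for outcome, prob in TRADER_PROB.items()}

for outcome, price in private_line.items():
    print(f"{outcome:5s} private={price:.3f} public={MARKET_PRICE[outcome]:.2f}")
```

Wherever the private price is shorter than the public one (home and draw here), the market is offering more than our own line says the outcome is worth.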

For completeness, here is an updated horse-racing model template file:

    Model 20210223_14-30_RacingPark
      Variables
        ! --- Public market information ---
        ! Trader win probabilities.
        ! Initial values set to market price implied
        ! probabilities.
        ! Lower-bound set to 6% - half the natural-odds
        ! implied probability (1/8).
        alpha = 1/25.00, >=0.06, <=1.00
        bravo = 1/34.00, >=0.06, <=1.00
        charlie = 1/13.50, >=0.06, <=1.00
        delta = 1/2.78, >=0.06, <=1.00
        echo = 1/3.60, >=0.06, <=1.00
        foxtrot = 1/17.00, >=0.06, <=1.00
        golf = 1/8.80, >=0.06, <=1.00
        hotel = 1/17.50, >=0.06, <=1.00
        kld
      End Variables
      Equations
        ! Total trader win probabilities must sum to one.
        alpha + &
        bravo + &
        charlie + &
        delta + &
        echo + &
        foxtrot + &
        golf + &
        hotel = 1.00
        ! --- Private trader opinions ---
        ! Half the field have a better chance.
        (alpha + echo + foxtrot + hotel) > (bravo + charlie + delta + golf)
        ! Two horses have between 30% and 40% combined win probability.
        echo + foxtrot >= 0.30
        echo + foxtrot <= 0.40
        ! Between 3% and 7% edge on one horse.
        (17.00 * foxtrot) - 1.00 >= 0.03
        (17.00 * foxtrot) - 1.00 <= 0.07
        ! --- Combination of public and private information ---
        ! Minimize K-LD.
        kld=(1/25.00)*log(((1/25.00)/alpha)) + &
        (1/34.00)*log(((1/34.00)/bravo)) + &
        (1/13.50)*log(((1/13.50)/charlie)) + &
        (1/2.78)*log(((1/2.78)/delta)) + &
        (1/3.60)*log(((1/3.60)/echo)) + &
        (1/17.00)*log(((1/17.00)/foxtrot)) + &
        (1/8.80)*log(((1/8.80)/golf)) + &
        (1/17.50)*log(((1/17.50)/hotel))
      End Equations
    End Model
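
Typing out such a file by hand for a large field is tedious and error-prone, so it may help to generate the text from a table of prices. The following Python sketch reproduces the layout of the template above; the function name build_apm_model and its parameters are hypothetical, and the private opinions are passed through as ready-made lines.

```python
def build_apm_model(name: str, prices: dict, lower_bound: float,
                    opinions: list) -> str:
    """Return APM model text laid out like the template above."""
    runners = list(prices)
    implied = {r: f"1/{prices[r]:.2f}" for r in runners}
    lines = [f"Model {name}", "Variables"]
    # One variable per runner: initial value, lower bound, upper bound.
    lines += [f"  {r} = {implied[r]}, >={lower_bound:.2f}, <=1.00"
              for r in runners]
    lines += ["  kld", "End Variables", "Equations",
              "  ! Total trader win probabilities must sum to one."]
    lines.append("  " + " + &\n  ".join(runners) + " = 1.00")
    lines.append("  ! --- Private trader opinions ---")
    lines += [f"  {opinion}" for opinion in opinions]
    # One K-LD term per runner, joined with '&' line continuations.
    terms = [f"({implied[r]})*log((({implied[r]})/{r}))" for r in runners]
    lines.append("  ! Minimize K-LD.")
    lines.append("  kld=" + " + &\n  ".join(terms))
    lines += ["End Equations", "End Model"]
    return "\n".join(lines)


prices = {"alpha": 25.00, "bravo": 34.00, "charlie": 13.50, "delta": 2.78,
          "echo": 3.60, "foxtrot": 17.00, "golf": 8.80, "hotel": 17.50}
opinions = [
    "(alpha + echo + foxtrot + hotel) > (bravo + charlie + delta + golf)",
    "echo + foxtrot >= 0.30",
    "echo + foxtrot <= 0.40",
    "(17.00 * foxtrot) - 1.00 >= 0.03",
    "(17.00 * foxtrot) - 1.00 <= 0.07",
]
model_text = build_apm_model("20210223_14-30_RacingPark", prices, 0.06, opinions)
print(model_text)
```

The returned text can be written to the input file that the Python script reads.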

Tuesday, February 16, 2021

Trader Probabilities Derivation And K-L Divergence (Part 2)

Returning to the trader win probabilities derivation (TWPD) question we asked last time in Trader Probabilities Derivation And K-L Divergence (Part 1):

How do we derive a coherent and valid distribution of trader win probabilities that deviates as little as possible from the implied win market probabilities distribution while taking into account our own limited and possibly vague insights?

To illustrate our approach, let us assume (once again) that we have identified a horse-race with five runners that meets our WCMI trading threshold.

As previously observed, we never know the true win probabilities of individual horses. However, we almost always have some opinions to work with. For example:

  • 70% chance that winner will come from one of three horses - Alpha, Bravo, or Charlie;
  • 5% edge on implied win market probability for Charlie, and
  • All horses have at least 7% chance of winning.

Given the implied win market probability distribution \(P\) and the trader win probability distribution \(Q\) on the countable set \(X = \{x_1, x_2, \ldots\}\) of horses in a specific race with \(P_i = P(x_i)\) and \(Q_i = Q(x_i)\), the Kullback-Leibler Divergence (K-LD) is defined as $$ D_{KL}(P \| Q) = \sum_{x \in X} P(x) \log\left(\frac{P(x)}{Q(x)}\right) $$ and is the metric we wish to minimize. In so doing, we guarantee that the derived distribution \(Q\) will be as close as possible to the original distribution \(P\) while still satisfying our constraints.
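
The formula translates directly into Python. As a sketch, the function below evaluates the K-LD for the five-runner example in this post: \(P\) holds the raw implied probabilities 1/price (not normalised, exactly as the model file uses them) and \(Q\) holds the trader probabilities reported in the solver output further down. The function name is illustrative.

```python
import math


def kl_divergence(p: list, q: list) -> float:
    """D_KL(P||Q) = sum of P(x) * log(P(x) / Q(x)), natural logarithm."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# P: implied win market probabilities (1/price) for the five runners.
p = [1 / 2.625, 1 / 3.250, 1 / 5.500, 1 / 6.000, 1 / 21.000]
# Q: trader win probabilities taken from the solver output in this post.
q = [0.28162476473, 0.22746615613, 0.19090908912, 0.22999992603, 0.070000063981]

print(round(kl_divergence(p, q), 5))
```

The result agrees with the obj value of about 0.1271 reported in the solver output, and the divergence is zero when \(Q\) equals \(P\).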

For those readers who would like an alternative to the Excel solution, we can strongly recommend the excellent APMonitor-GEKKO optimization suite. Using the following 'no-frills' Python script and APM model file, we can derive the same set of trader win probabilities by minimizing the K-LD from the implied win market probabilities distribution while meeting our additional constraints.

    import json
    import os
    import shutil

    from gekko import GEKKO

    # Read the APM model text.
    with open('APM-Gekko_TWPD_Input.apm') as file:
        model = file.read()

    # Solve locally and report the objective value.
    m = GEKKO(remote=False)
    m.Raw(model)
    m.solve(disp=False)
    print('Objective: ', m.options.OBJFCNVAL)

    # Copy the solver results next to this script, then load and print them.
    dir_path = os.path.dirname(os.path.realpath(__file__))
    shutil.copy(m.path+'/Results.json', dir_path + '/APM-Gekko_TWPD_Output.json')
    with open(m.path+'/Results.json') as f:
        results = json.load(f)
    print(results)

    Model APM-Gekko TWPD Input
      Variables
        ! Trader win probabilities.
        ! Initial values set to win market implied probabilities.
        ! Lower-bound set to 7%.
        ! Upper-bound set to 100%.
        x1 = 1/2.625, >=0.07, <=1.00
        x2 = 1/3.250, >=0.07, <=1.00
        x3 = 1/5.500, >=0.07, <=1.00
        x4 = 1/6.000, >=0.07, <=1.00
        x5 = 1/21.000, >=0.07, <=1.00
        obj
      End Variables
      Equations
        ! Total trader win probabilities must sum to one.
        x1 + x2 + x3 + x4 + x5 = 1.00
        ! First three horses have approximately 70% combined win probability.
        x1 + x2 + x3 <= 0.70
        ! Assume 5% edge on third horse.
        (5.500 * x3) - 1.00 >= 0.05
        ! Minimize K-LD.
        obj=(1/2.625)*log(((1/2.625)/x1)) + &
        (1/3.250)*log(((1/3.250)/x2)) + &
        (1/5.500)*log(((1/5.500)/x3)) + &
        (1/6.000)*log(((1/6.000)/x4)) + &
        (1/21.000)*log(((1/21.000)/x5))
      End Equations
    End Model

and the output - APM-Gekko_TWPD_Output.json - should be as follows:

    {
      "time" : [0.00],
      "apmonitorgekkoequus-kldhaighexample.x1" : [ 2.8162476473E-01],
      "apmonitorgekkoequus-kldhaighexample.x2" : [ 2.2746615613E-01],
      "apmonitorgekkoequus-kldhaighexample.x3" : [ 1.9090908912E-01],
      "apmonitorgekkoequus-kldhaighexample.x4" : [ 2.2999992603E-01],
      "apmonitorgekkoequus-kldhaighexample.x5" : [ 7.0000063981E-02],
      "apmonitorgekkoequus-kldhaighexample.obj" : [ 1.2714140699E-01],
      "apmonitorgekkoequus-kldhaighexample.slk_2" : [ 0.00 ],
      "apmonitorgekkoequus-kldhaighexample.slk_3" : [ 0.00 ]
    }

which gives us exactly the same results as with Excel Solver!

As a sanity check, we can (once again) test the derived set of trader win probabilities against our constraints:

  • (28% + 23% + 19%) <= 70%;
  • ((5.500 * 19%) - 1.000) >= 0.050 (rounding up);
  • MIN(28%, 23%, 19%, 23%, 7%) >= 7%.

Notes:

1. Round all probabilities to zero decimal places (whole percentages). Any additional precision is irrelevant.
2. The Python script and APM files are a minimum set. You will have to install additional modules (e.g. GEKKO), as required.
3. The initial run of the Python script may be quite slow (e.g. 60 seconds). Subsequent runs should take approximately five to seven seconds.

In sum, we have expertly combined the wisdom of the crowds with our own limited insights to derive a coherent and valid set of trader win probabilities (TWPD)!