Monday, February 22, 2021

Trader Probabilities Derivation And K-L Divergence (Part 3)

Returning one last time to the trader win probabilities derivation (TWPD) question posed in Trader Probabilities Derivation And K-L Divergence (Part 1) and revisited in Trader Probabilities Derivation And K-L Divergence (Part 2):

How do we create a coherent set of trader win probabilities that does not stray too far from the implied win market probabilities while taking into account our own limited and possibly vague insights?

On this occasion, we will focus on soccer and add a late-breaking injury report from social media to our own 'gut instinct' about the likely outcome.

As before, we use our 'no-frills' Python script to access APMonitor-GEKKO:

    import json
    import os
    import shutil
    from gekko import GEKKO

    # Read the APM model text from the 'apm' file next to this script.
    d_path = os.path.dirname(os.path.realpath(__file__))
    file = open(d_path + '/apm')
    model = file.read()
    file.close()

    # Solve locally and report the objective value.
    m = GEKKO(remote=False)
    m.Raw(model)
    m.solve(disp=False)
    print('Objective: ', m.options.OBJFCNVAL)

    # Keep a copy of the solver results next to this script, then print them.
    shutil.copy(m.path+'/Results.json', d_path + '/json')
    with open(m.path+'/Results.json') as f:
        results = json.load(f)
    print(results)

with the input - apm:

    Model 20210223_Follis_LeagueMatch
      Variables
        ! --- Public market information ---
        ! Trader win probabilities.
        ! Initial values set to market price implied probabilities.
        ! Lower-bound set to 17% - half the natural-odds implied
        ! probability (1/3).
        ! Upper-bound set to 100%.
        home = 1.00/4.20, >=0.17, <=1.00
        away = 1.00/3.60, >=0.17, <=1.00
        draw = 1.00/1.98, >=0.17, <=1.00
        kld
      End Variables

      Equations
        ! Total trader win probabilities must sum to one.
        home + &
        away + &
        draw = 1.00

        ! --- Private trader opinions ---
        ! Gut instinct.
        draw >= 0.50
        draw <= 0.60

        ! Late Twitter away-team injury report.
        home >= 0.25
        home <= 0.30

        ! --- Combination of public and private information ---
        ! Minimize K-LD.
        kld=(1.00/4.20)*log(((1.00/4.20)/home)) + &
            (1.00/3.60)*log(((1.00/3.60)/away)) + &
            (1.00/1.98)*log(((1.00/1.98)/draw))
      End Equations
    End Model

and the generated output - json:

    {
      "home" : [ 2.7046072411E-01],
      "away" : [ 2.0172750176E-01],
      "draw" : [ 5.2781177413E-01],
      "kld" : [ 3.6252143687E-02]
    }

Once again, as a sanity check, we can test the derived set of trader win probabilities against our constraints:

  • min(home, away, draw) >= 17% [0.20173];
  • 0.25 <= home <= 0.30 [0.27046];
  • 0.50 <= draw <= 0.60 [0.52781].
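
The checks above can also be scripted. The following is a minimal pure-Python sketch (no GEKKO required) that re-tests the constraints at full precision and recomputes the K-LD from the market odds. The helper names `kld` and `feasible`, the tolerance, and the `candidate` comparison point are our own illustrative assumptions and are not part of the model file.

```python
import math

# Decimal market odds and solver output, copied from the model and JSON above.
market_odds = {"home": 4.20, "away": 3.60, "draw": 1.98}
trader = {"home": 0.27046072411, "away": 0.20172750176, "draw": 0.52781177413}


def kld(p, q):
    """Sum of p * ln(p / q), with p left un-normalized as in the APM model."""
    return sum(p[k] * math.log(p[k] / q[k]) for k in p)


def feasible(q, tol=1e-6):
    """Check the bounds and private-opinion constraints from the model."""
    return (
        abs(sum(q.values()) - 1.0) <= tol
        and min(q.values()) >= 0.17 - tol
        and 0.25 - tol <= q["home"] <= 0.30 + tol
        and 0.50 - tol <= q["draw"] <= 0.60 + tol
    )


implied = {k: 1.0 / v for k, v in market_odds.items()}

# Hypothetical comparison point sitting on the lower bounds for home and draw.
candidate = {"home": 0.25, "away": 0.25, "draw": 0.50}

kld_reported = kld(implied, trader)
kld_candidate = kld(implied, candidate)

print(feasible(trader), round(kld_reported, 5))
print(feasible(candidate), round(kld_candidate, 5))
```

On our reading, the reported point reproduces the kld value in the JSON output, while the comparison point on the lower bounds also passes every check and appears to score lower. If you see the same, it may be worth confirming that the solver treated kld as its objective; the Part 2 model names that variable obj, and whether the name matters is an assumption we have not verified here.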

In sum, we have expertly combined the wisdom of the crowds with our own limited insights to derive a coherent and valid set of trader win probabilities (TWPD). Historically, professional traders (e.g. Bill Benter) have 'melded' their own odds-line with the win-market prices to gain a more informed opinion. We have revised that process: we baseline on the win-market prices, add a few constraints where we believe we have additional information (e.g. late-breaking injury or weather reports from social media), and rely on our automatic method to generate the updated trader win probabilities and, indirectly, our own odds-line:

Outcome  Private   Public
Home:    3.70      4.20
Away:    4.96      3.60
Draw:    1.89      1.98
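
The private odds-line is simply the reciprocal of each trader win probability. As a small sketch of that conversion (the `edge` figure, computed as public odds times our probability minus one, mirrors the edge constraints used in the model files; the variable names are our own):

```python
# Trader win probabilities from the JSON output and public decimal odds.
trader = {"home": 0.27046072411, "away": 0.20172750176, "draw": 0.52781177413}
public_odds = {"home": 4.20, "away": 3.60, "draw": 1.98}

# Fair decimal odds are the reciprocal of the win probability.
private_odds = {k: 1.0 / q for k, q in trader.items()}

# Edge of backing each outcome at the public price, using our probabilities.
edge = {k: public_odds[k] * trader[k] - 1.0 for k in trader}

for k in trader:
    print(f"{k:5s} private {private_odds[k]:.2f} "
          f"public {public_odds[k]:.2f} edge {edge[k]:+.3f}")
```

A positive edge marks an outcome where the public price is longer than our own odds-line, which here is the case for home and draw but not for away.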

For completeness, here is an updated horse-racing model template file:

    Model 20210223_14-30_RacingPark
      Variables
        ! --- Public market information ---
        ! Trader win probabilities.
        ! Initial values set to market price implied
        ! probabilities.
        ! Lower-bound set to 6% - half the natural-odds
        ! implied probability (1/8).
        alpha = 1/25.00, >=0.06, <=1.00
        bravo = 1/34.00, >=0.06, <=1.00
        charlie = 1/13.50, >=0.06, <=1.00
        delta = 1/2.78, >=0.06, <=1.00
        echo = 1/3.60, >=0.06, <=1.00
        foxtrot = 1/17.00, >=0.06, <=1.00
        golf = 1/8.80, >=0.06, <=1.00
        hotel = 1/17.50, >=0.06, <=1.00
        kld
      End Variables

      Equations
        ! Total trader win probabilities must sum to one.
        alpha + &
        bravo + &
        charlie + &
        delta + &
        echo + &
        foxtrot + &
        golf + &
        hotel = 1.00

        ! --- Private trader opinions ---
        ! Half the field have a better chance.
        (alpha + echo + foxtrot + hotel) > (bravo + charlie + delta + golf)

        ! Two horses have between 30% and 40% combined win probability.
        echo + foxtrot >= 0.30
        echo + foxtrot <= 0.40

        ! Between 3% and 7% edge on one horse.
        (17.00 * foxtrot) - 1.00 >= 0.03
        (17.00 * foxtrot) - 1.00 <= 0.07

        ! --- Combination of public and private information ---
        ! Minimize K-LD.
        kld=(1/25.00)*log(((1/25.00)/alpha)) + &
            (1/34.00)*log(((1/34.00)/bravo)) + &
            (1/13.50)*log(((1/13.50)/charlie)) + &
            (1/2.78)*log(((1/2.78)/delta)) + &
            (1/3.60)*log(((1/3.60)/echo)) + &
            (1/17.00)*log(((1/17.00)/foxtrot)) + &
            (1/8.80)*log(((1/8.80)/golf)) + &
            (1/17.50)*log(((1/17.50)/hotel))
      End Equations
    End Model
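
Since the template follows the same pattern for any field size, the model text could be generated rather than typed by hand. The sketch below is one possible way to do that in plain Python; `build_apm_model` and its arguments are hypothetical helpers of our own (not part of GEKKO or APMonitor), and the private opinions are passed in as raw constraint strings.

```python
def build_apm_model(name, odds, opinions=()):
    """Return APM model text for a K-LD fit to decimal market odds.

    `odds` maps runner names to decimal odds; `opinions` is a sequence of
    raw APM constraint strings. Hypothetical helper, for illustration only.
    """
    runners = list(odds)
    # Lower bound: half the natural-odds probability, i.e. 1 / (2 * runners).
    lower = round(1.0 / (2 * len(runners)), 2)

    lines = [f"Model {name}", "Variables"]
    for r in runners:
        lines.append(f"{r} = 1/{odds[r]:.2f}, >={lower:.2f}, <=1.00")
    lines += ["kld", "End Variables", "Equations"]

    # Total trader win probabilities must sum to one.
    lines.append(" + &\n".join(runners) + " = 1.00")

    # Private trader opinions, passed through unchanged.
    lines += list(opinions)

    # K-LD between implied market probabilities and trader probabilities.
    terms = [f"(1/{odds[r]:.2f})*log(((1/{odds[r]:.2f})/{r}))" for r in runners]
    lines.append("kld=" + " + &\n".join(terms))
    lines += ["End Equations", "End Model"]
    return "\n".join(lines)


model_text = build_apm_model(
    "Example_LeagueMatch",
    {"home": 4.20, "away": 3.60, "draw": 1.98},
    opinions=["draw >= 0.50", "draw <= 0.60"],
)
print(model_text)
```

The resulting string could then be written to the apm file read by the script, or passed to m.Raw() directly.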

Tuesday, February 16, 2021

Trader Probabilities Derivation And K-L Divergence (Part 2)

Returning to the trader win probabilities derivation (TWPD) question we asked last time in Trader Probabilities Derivation And K-L Divergence (Part 1):

How do we derive a coherent and valid distribution of trader win probabilities that deviates as little as possible from the implied win market probabilities distribution while taking into account our own limited and possibly vague insights?

To illustrate our approach, let us assume (once again) that we have identified a horse-race with five runners that meets our WCMI trading threshold.

As previously observed, we never know the true win probabilities of individual horses. However, we almost always have some opinions to work with. For example:

  • 70% chance that the winner will come from one of three horses - Alpha, Bravo, or Charlie;
  • 5% edge on the implied win market probability for Charlie; and
  • All horses have at least a 7% chance of winning.
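
An opinion such as a 5% edge translates into a lower bound on the trader win probability. Assuming, as the model file does, that edge is defined as decimal odds times win probability minus one, a small sketch of the conversion for Charlie (market odds 5.50) might look like this; the function name is our own:

```python
def min_probability_for_edge(decimal_odds, edge):
    """Smallest win probability q with decimal_odds * q - 1 >= edge."""
    return (1.0 + edge) / decimal_odds


charlie_odds = 5.5  # third runner's market odds in this example
q_min = min_probability_for_edge(charlie_odds, 0.05)

# Compare the required probability with the market-implied one.
print(round(q_min, 4), round(1.0 / charlie_odds, 4))
```

In other words, a 5% edge on Charlie requires a trader win probability of roughly 19.1%, against a market-implied 18.2%.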

Given the implied win market probability distribution \(P\) and the trader win probability distribution \(Q\) on the countable set \(X = \{x_1, x_2, \ldots\}\) of horses in a specific race, with \(P_i = P(x_i)\) and \(Q_i = Q(x_i)\), the Kullback-Leibler Divergence (K-LD) is defined as
$$ D_{KL}(P \| Q) = \sum_{x \in X} P(x) \log\left(\frac{P(x)}{Q(x)}\right) $$
and is the quantity we wish to minimize. In so doing, we ensure that the derived distribution \(Q\) is as close as possible, in the K-LD sense, to the original distribution \(P\) while still satisfying our constraints.
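
To make the definition concrete, here is a minimal pure-Python sketch of the K-LD using the natural logarithm. The function name `kl_divergence` and the uniform comparison distribution are illustrative assumptions; the odds are the five market prices used in this example.

```python
import math


def kl_divergence(p, q):
    """D_KL(P || Q): sum of P(x) * ln(P(x) / Q(x)) over outcomes with P(x) > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Decimal odds of the five runners used in this example.
odds = [2.625, 3.25, 5.5, 6.0, 21.0]
raw = [1.0 / o for o in odds]      # implied probabilities (sum exceeds 1: overround)
total = sum(raw)
p = [r / total for r in raw]       # normalized market distribution

uniform = [0.2] * 5                # hypothetical 'no opinion' distribution

print(kl_divergence(p, p))         # identical distributions give zero
print(kl_divergence(p, uniform))   # any other distribution gives a positive value
```

With no private constraints at all, the closest \(Q\) is therefore just the normalized market distribution itself; it is the added opinions that move \(Q\) away from \(P\).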

For those readers who would like an alternative to the Excel solution, we can strongly recommend the excellent APMonitor-GEKKO optimization suite. Using the following 'no-frills' Python script and APM model file, we can derive the same set of trader win probabilities by minimizing the K-LD from the implied win market probabilities distribution while meeting our additional constraints.

    import json
    import os
    import shutil
    from gekko import GEKKO

    # Read the APM model text (path is relative to the working directory).
    file = open('APM-Gekko_TWPD_Input.apm')
    model = file.read()
    file.close()

    # Solve locally and report the objective value.
    m = GEKKO(remote=False)
    m.Raw(model)
    m.solve(disp=False)
    print('Objective: ', m.options.OBJFCNVAL)

    # Keep a copy of the solver results next to this script, then print them.
    dir_path = os.path.dirname(os.path.realpath(__file__))
    shutil.copy(m.path+'/Results.json', dir_path + '/APM-Gekko_TWPD_Output.json')
    with open(m.path+'/Results.json') as f:
        results = json.load(f)
    print(results)

with the input - APM-Gekko_TWPD_Input.apm:

    Model APM-Gekko TWPD Input
      Variables
        ! Trader win probabilities.
        ! Initial values set to win market implied probabilities.
        ! Lower-bound set to 7%.
        ! Upper-bound set to 100%.
        x1 = 1/2.625, >=0.07, <=1.00
        x2 = 1/3.250, >=0.07, <=1.00
        x3 = 1/5.500, >=0.07, <=1.00
        x4 = 1/6.000, >=0.07, <=1.00
        x5 = 1/21.000, >=0.07, <=1.00
        obj
      End Variables

      Equations
        ! Total trader win probabilities must sum to one.
        x1 + x2 + x3 + x4 + x5 = 1.00

        ! First three horses have approximately 70% combined win probability.
        x1 + x2 + x3 <= 0.70

        ! Assume 5% edge on third horse.
        (5.500 * x3) - 1.00 >= 0.05

        ! Minimize K-LD.
        obj=(1/2.625)*log(((1/2.625)/x1)) + &
            (1/3.250)*log(((1/3.250)/x2)) + &
            (1/5.500)*log(((1/5.500)/x3)) + &
            (1/6.000)*log(((1/6.000)/x4)) + &
            (1/21.000)*log(((1/21.000)/x5))
      End Equations
    End Model

and the output - APM-Gekko_TWPD_Output.json - should be as follows:

    {
      "time" : [0.00],
      "apmonitorgekkoequus-kldhaighexample.x1" : [ 2.8162476473E-01],
      "apmonitorgekkoequus-kldhaighexample.x2" : [ 2.2746615613E-01],
      "apmonitorgekkoequus-kldhaighexample.x3" : [ 1.9090908912E-01],
      "apmonitorgekkoequus-kldhaighexample.x4" : [ 2.2999992603E-01],
      "apmonitorgekkoequus-kldhaighexample.x5" : [ 7.0000063981E-02],
      "apmonitorgekkoequus-kldhaighexample.obj" : [ 1.2714140699E-01],
      "apmonitorgekkoequus-kldhaighexample.slk_2" : [ 0.00 ],
      "apmonitorgekkoequus-kldhaighexample.slk_3" : [ 0.00 ]
    }

which gives us exactly the same results as with Excel Solver!

As a sanity check, we can (once again) test the derived set of trader win probabilities against our constraints:

  • (28% + 23% + 19%) <= 70%;
  • ((5.500 * 19%) - 1.000) >= 0.050 (rounding up);
  • MIN(28%, 23%, 19%, 23%, 7%) >= 7%.
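
The rounded percentages above sit right on the constraint boundaries, so it can be reassuring to repeat the check at full precision. The sketch below does so in plain Python, using the values from the JSON output and also recomputing the objective; the tolerance and the check names are our own assumptions.

```python
import math

# Market decimal odds and solver output (x1..x5) from the files above.
odds = [2.625, 3.25, 5.5, 6.0, 21.0]
x = [0.28162476473, 0.22746615613, 0.19090908912, 0.22999992603, 0.070000063981]

# Recompute the K-LD objective from the un-normalized implied probabilities.
implied = [1.0 / o for o in odds]
objective = sum(p * math.log(p / q) for p, q in zip(implied, x))

tol = 1e-6  # assumed numerical tolerance for solver output
checks = {
    "sum_to_one": abs(sum(x) - 1.0) <= tol,
    "top_three_at_most_70pct": x[0] + x[1] + x[2] <= 0.70 + tol,
    "edge_on_third_at_least_5pct": odds[2] * x[2] - 1.0 >= 0.05 - tol,
    "all_at_least_7pct": min(x) >= 0.07 - tol,
}

print(checks)
print(round(objective, 5))
```

All three opinion constraints appear to be active at the solution, which is why the rounded figures land exactly on 70%, 5% and 7%.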

Notes:

1. Round all probabilities to zero decimal places (whole percentages). Any additional precision is irrelevant.
2. The Python script and APM files are a minimal set. You will have to install additional modules (e.g. GEKKO) as required.
3. The initial run of the Python script may be quite slow (e.g. 60 seconds). Subsequent runs should take approximately five to seven seconds.

In sum, we have expertly combined the wisdom of the crowds with our own limited insights to derive a coherent and valid set of trader win probabilities (TWPD)!