Wednesday, March 18, 2020

Practical Dominance (PD)

In terms of our ongoing efforts to improve the handicapping process, we can strongly assert that it is easier to evaluate a four-horse race than a nine-horse race (all other things being equal). Keeping in mind our strong preference for eliminating alternatives over confirming selections, we can look to the Even Swaps Method (ESM) for a useful concept called practical dominance.
  • Select specific race using WCMI.
  • For each horse in race (using past performances):
    * Evaluate each contestant's form on at most five to seven attributes. See Tsai et al. (2008), Slovic (1973), and Do You Really Need More Information from the CIA on the positive impact of additional information on confidence (Figure 5).
    * Convert the absolute ratings on each attribute into rankings across contestants.
    * Eliminate those contestants that are either completely dominated (unlikely) or practically dominated (likely) by another entrant.
    • In the sample race below, Alpha practically dominates Charlie as his rankings are superior on all attributes except A6.
    • Foxtrot, Golf, Hotel, and Juliet are similarly dominated.
  • Consider remaining contestants as potential trades using Kelly Criterion.
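
The elimination step can be sketched in Python. This is a minimal illustration, not the Even Swaps Method itself: the rankings are invented, and treating "practically dominated" as "worse on at most one attribute" is an assumption drawn from the Alpha and Charlie example above.

```python
def practically_dominates(a, b, max_exceptions=1):
    """True if rankings `a` (1 = best) are worse than `b` on at most
    `max_exceptions` attributes (an assumed tolerance) and better on
    more attributes than they are worse."""
    better = sum(ra < rb for ra, rb in zip(a, b))
    worse = sum(ra > rb for ra, rb in zip(a, b))
    return worse <= max_exceptions and better > worse


def survivors(rankings, max_exceptions=1):
    """Keep the contestants that no other entrant practically dominates."""
    return [
        name
        for name, ranks in rankings.items()
        if not any(
            practically_dominates(other, ranks, max_exceptions)
            for rival, other in rankings.items()
            if rival != name
        )
    ]


# Hypothetical rankings on attributes A1..A6 (not the post's sample race).
rankings = {
    "Alpha":   [1, 1, 2, 1, 1, 3],
    "Bravo":   [2, 3, 1, 2, 3, 1],
    "Charlie": [3, 2, 3, 3, 2, 2],
}
print(survivors(rankings))
```

Setting max_exceptions=0 reduces the test to complete dominance. With larger fields it is worth checking that a chain of pairwise eliminations does not remove every contender.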

As ever, if we cannot find variables that account for sufficient variance in outcomes over and above that provided by market prices then we will not have an edge and we will lose our bankroll.

Sunday, February 10, 2019

Wins Above Starting Price (WASP)

In terms of our ongoing drive to identify value bets, we can now add WASP (Wins Above Starting Price) to our armory. The method for calculating each horse's WASP is outlined below but we do not prescribe how that value should be used in any particular market.
  • Select specific race using WCMI.
  • For each horse in race (using past performances):
    * Calculate the number of actual wins using our Juvenile Finish Position Ratings algorithm.
    * Determine the number of expected wins by multiplying the reciprocal of the starting price by the number of opponents in the race.
    * Subtract expected wins from actual wins.
  • Then divide the difference by the number of races run; the quotient is the horse's WASP.
WASP tells us how well the crowd has been estimating a particular horse's past performances (on average). If, relative to the other runners, a particular horse has a higher historical WASP then, assuming you rate it a live contender, it is more likely to be a value bet.
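
As a rough sketch of the arithmetic above, the calculation might look like this in Python. The post does not spell out the Juvenile Finish Position Ratings algorithm, so the actual-wins figure is taken as an input here, and the starting price is assumed to be quoted as decimal odds. The sample numbers are invented.

```python
def wasp(past_races):
    """Wins Above Starting Price for one horse.

    `past_races` is a list of (actual_wins, starting_price, opponents)
    tuples, one per race run. `actual_wins` is assumed to come from the
    finish-position ratings step, and `starting_price` is assumed to be
    decimal odds.
    """
    difference = sum(
        actual_wins - opponents / starting_price
        for actual_wins, starting_price, opponents in past_races
    )
    return difference / len(past_races)


# Hypothetical past performances: (actual wins, decimal SP, opponents).
history = [(3, 4.0, 8), (1, 5.0, 10), (4, 2.0, 6)]
print(round(wasp(history), 4))
```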

WASP is not to be confused with the excellent A/E statistic.

Saturday, December 22, 2018

Automatic Trading Using WCMI

Explore And Exploit

On which sports-trading events should we risk our capital? A good starting point is to ask the fundamental question of sports markets:

"Is the public market well-informed with respect to a specific event (Wisdom of Crowd)?"

Our proxy for identifying such events is to calculate the Wisdom of Crowd Market Index (WCMI) for all markets and to select only those events for which the market falls below a specific WCMI threshold (for example, 0.15). By focusing on these less well-informed markets, we are dramatically increasing the chances of identifying at least one overlay. In other words, the guiding principle is to explore all markets but only exploit those markets with low WCMIs. For UK Flat horse-racing markets, FlatStats is the logical starting point in this process.

Market Selection Using WCMI

The following Betfair simulation shows how an automated trading solution would first filter those markets with low WCMI and then, having identified at least one overlay, bet on one or more selections as calculated by the Single Event Multiple Selections variant of the Kelly Criterion:
  • Filter markets (e.g. identify 5f sprints);
  • Calculate WCMI for each filtered market;
  • Rank contenders in market on fundamental factors;
  • Create odds-line for contenders based on ratings;
  • Check market contains at least one overlay;
  • Make selections using Kelly Criterion; and
  • Submit bets.
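
The filtering steps in this list can be sketched as follows. The post does not give the WCMI formula, so each market's WCMI is assumed to be precomputed. The dictionary layout and the sample markets are hypothetical, and an overlay is taken to mean win probability times decimal odds greater than 1.

```python
def has_overlay(decimal_odds, win_probs):
    """True if at least one selection has positive expected value."""
    return any(p * o > 1.0 for o, p in zip(decimal_odds, win_probs))


def tradeable_markets(markets, wcmi_threshold=0.15):
    """Names of markets below the WCMI threshold with at least one overlay."""
    return [
        m["name"]
        for m in markets
        if m["wcmi"] < wcmi_threshold and has_overlay(m["odds"], m["probs"])
    ]


# Hypothetical markets; `probs` come from the trader's own odds-line.
markets = [
    {"name": "Sprint A", "wcmi": 0.10, "odds": [3.0, 4.0, 5.0], "probs": [0.40, 0.30, 0.30]},
    {"name": "Sprint B", "wcmi": 0.30, "odds": [2.5, 3.0], "probs": [0.50, 0.50]},
    {"name": "Sprint C", "wcmi": 0.12, "odds": [1.8, 2.0], "probs": [0.50, 0.50]},
]
print(tradeable_markets(markets))
```

Sprint B fails the WCMI filter and Sprint C has no overlay, so only Sprint A would go forward to the staking step.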

Monday, November 26, 2018

Kelly's Multiple Personality Disorder

For the professional sports-trader, Kelly has three separate mathematical forms:
  1. Single Event, Single Selection;
  2. Single Event, Multiple Selections; and
  3. Multiple Events, Multiple Selections.

Single Event, Single Selection

This is the basic case as outlined in cursory descriptions of the Kelly Criterion. We identify a selection that gives us an edge over the market and calculate the optimal stake to maximize that advantage. For this purpose, the Excel Add-In offers the KellySingleStake sports-trading function that accepts Decimal Odds, Win Probability, and Multiplier parameters. The KellySingleStake function is the correct formula for AvB events such as moneyline markets in MLB, NBA, and the NFL.

[Table: Decimal Odds | Win Probability | Stake]
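
For readers without the Add-In, the standard single-bet Kelly formula can be sketched in Python. The function below is a hypothetical analogue, not the Add-In's KellySingleStake itself, and the multiplier is assumed to be a fractional-Kelly scaling factor.

```python
def kelly_single_stake(decimal_odds, win_probability, multiplier=1.0):
    """Fraction of bankroll to stake on a single selection.

    Uses the textbook formula f = (p * o - 1) / (o - 1). `multiplier`
    is assumed to scale the stake (e.g. 0.5 for half-Kelly).
    """
    edge = win_probability * decimal_odds - 1.0
    if edge <= 0.0:
        return 0.0  # no overlay, no bet
    return multiplier * edge / (decimal_odds - 1.0)


print(kelly_single_stake(6.00, 0.20))
```

At decimal odds of 6.00 and a 20% win probability the edge is 20% and the full-Kelly stake works out at about 4% of bankroll.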

Single Event, Multiple Selections

In AvB events, the general advice to only bet the overlay is technically correct. However, in an AvK event, such as horse-racing and golf with a number of mutually-exclusive outcomes this advice is not strictly correct. Kelly betting is predicated on maximizing the logarithm of the handicapper's bankroll over the long-term. But, in the short-term, that goal is translated into not losing specific events when the price is right! The key role played by overlays in mutually-exclusive events is that there must be at least one such betting option available in any event on which we wish to bet. Beyond that, the specific choices will only be governed by maximizing the logarithm of our bankroll! The sports-trading function, KellyMutExStakes (array formula), with Decimal Odds Range and Win Probability Range inputs will identify the optimal selections and stakes.

[Table: Entry | Decimal Odds | Win Probability | Trader Edge | Stake]
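
A closed-form procedure for Kelly staking on mutually-exclusive outcomes can be sketched as below. It is offered as an illustration of the idea, not as the Add-In's KellyMutExStakes code, whose internals are not described here. Runners are ranked by probability times odds and added while that product exceeds the "reserve rate", so an underlay can enter the bet once a stronger overlay is already in. The racecard is hypothetical.

```python
def kelly_mutually_exclusive(decimal_odds, win_probs):
    """Return {runner index: stake fraction} for one event.

    Assumes exactly one runner wins and that the chosen runners'
    reciprocal odds sum to less than 1 (no arbitrage in the book).
    """
    order = sorted(
        range(len(decimal_odds)),
        key=lambda i: win_probs[i] * decimal_odds[i],
        reverse=True,
    )
    chosen, prob_sum, price_sum, reserve = [], 0.0, 0.0, 1.0
    for i in order:
        if win_probs[i] * decimal_odds[i] <= reserve:
            break  # this runner and all that follow add nothing
        chosen.append(i)
        prob_sum += win_probs[i]
        price_sum += 1.0 / decimal_odds[i]
        reserve = (1.0 - prob_sum) / (1.0 - price_sum)
    return {i: win_probs[i] - reserve / decimal_odds[i] for i in chosen}


# Hypothetical racecard: Alpha, Bravo, Charlie.
probs = [0.50, 0.29, 0.21]
print(kelly_mutually_exclusive([2.50, 2.80, 3.00], probs))  # Bravo excluded
print(kelly_mutually_exclusive([2.50, 3.35, 3.00], probs))  # Bravo included
```

In the second call Bravo is still an underlay (0.29 x 3.35 is below 1), yet it earns a stake because it lifts the expected log growth of the combined position.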

Multiple Events, Multiple Selections

For Nx(AvB) events, such as trading Ryder Cup golf singles matches or NFL games on Any Given Sunday, we need the sports-trading function, KellySimEvtStakes (array formula), with Decimal Odds Range and Win Probability Range parameters to identify the optimal stakes.

[Table: Entry | Decimal Odds | Win Probability | Trader Edge | Stake]
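
For simultaneous independent events there is no simple closed form, so a numerical search is one way to proceed. The sketch below enumerates every win/lose combination and climbs the expected log growth one stake at a time. It is an assumed approach for illustration, not the Add-In's KellySimEvtStakes code, and it is only practical for a handful of events.

```python
from itertools import product
from math import log


def expected_log_growth(stakes, decimal_odds, win_probs):
    """Expected log bankroll growth over all 2^N win/lose combinations."""
    growth = 0.0
    for outcome in product((True, False), repeat=len(stakes)):
        prob, wealth = 1.0, 1.0
        for won, f, o, p in zip(outcome, stakes, decimal_odds, win_probs):
            prob *= p if won else 1.0 - p
            wealth += f * (o - 1.0) if won else -f
        growth += prob * log(wealth)
    return growth


def kelly_simultaneous(decimal_odds, win_probs, sweeps=50, steps=60):
    """Coordinate ascent with a ternary search on each stake in turn."""
    stakes = [0.0] * len(decimal_odds)
    for _sweep in range(sweeps):
        for i in range(len(stakes)):
            # Keep total staked below the bankroll so wealth stays positive.
            lo, hi = 0.0, 0.999 - (sum(stakes) - stakes[i])
            trial = list(stakes)
            for _step in range(steps):
                third = (hi - lo) / 3.0
                trial[i] = lo + third
                g_low = expected_log_growth(trial, decimal_odds, win_probs)
                trial[i] = hi - third
                g_high = expected_log_growth(trial, decimal_odds, win_probs)
                if g_low < g_high:
                    lo += third
                else:
                    hi -= third
            trial[i] = (lo + hi) / 2.0
            if (expected_log_growth(trial, decimal_odds, win_probs)
                    > expected_log_growth(stakes, decimal_odds, win_probs)):
                stakes = trial
    return stakes


# Two hypothetical even-money games, each rated a 55% chance.
print(kelly_simultaneous([2.0, 2.0], [0.55, 0.55]))
```

Each stake comes out slightly below the 10% that the single-bet formula would give for either game on its own, which is why the single-selection formula should not simply be applied game by game.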

Note that Example #2 in the Pinnacle Guest article - The real Kelly Criterion - calculates the wrong stakes as can easily be confirmed by entering the decimal odds and win probabilities into the SBR Kelly Calculator for four independent events.

Saturday, July 07, 2018

Betting Strategy Calculator (Itty.Bitty.Site)

Itty.Bitty.Site is a new URL-based microsite generator, created by Nicholas Jitkoff, that is sure to revolutionise the web in ways that we cannot yet imagine (hopefully, in positive ways). To that end, we have created one of the first Itty.Bitty HTML5 apps that runs a simple assessment of your betting strategy - Betting Strategy Calculator. Or access via QR code. Enjoy!
Note: The app works in both Chrome (you may have to set Insecure content blocked = Load unsafe scripts at the end of the address bar) and Firefox. For Edge, use context menu | Inspect element | Emulation | Mode (User agent string) = Internet Explorer 11, set Blocked content = See all content (at the end of the address bar), and wait 10 seconds for the page to load. The app will disappear when emulation mode is closed.

Thursday, June 28, 2018

Speed-Stamina Course Profiles And Juvenile Races

In an earlier posting, Speed-Stamina Fingerprints, we outlined an approach to "hoof-printing" racecourses in terms of their speed-stamina profiles based on the best times for various distances using a power-law equation, approximately of the form: Time = Speed * Distance^Stamina. This approach also allows us to effectively project performances by inexperienced 2yo and 3yo horses, as follows.

For example, suppose a 2yo colt turns in an eye-catching performance in a novice race and is entered in a graded stakes race 10 days later. Historical data tells us that the median winning time of this future event is 73.57s. The question we want to ask: Is the maiden-winning colt likely to be competitive in the graded stakes event? One way of answering this question is to translate the maiden performance into an equivalent performance at the graded stakes course and distance. Using our hoof-printing technique produces a time of 73.40s, telling us that this promising colt is likely to be an above-average contender for the graded stakes event. Fast-forward to race day and our selection wins in a time of 73.51s (roughly 0.1s slower than projected). (Note that we were not restricted to choosing performances at the same distance as the future event.)
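
One way such a projection might be coded is sketched below. The power-law fit follows the equation quoted earlier. The translation step (carrying the horse's ratio to the course's best time from one course to the other) is an assumption, since the post does not describe the exact method, and every number is invented.

```python
from math import exp, log


def fit_power_law(best_times):
    """Fit Time = Speed * Distance^Stamina by least squares in log-log space.

    `best_times` is a list of (distance, time) pairs for one course.
    Returns (speed, stamina).
    """
    xs = [log(d) for d, _ in best_times]
    ys = [log(t) for _, t in best_times]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    stamina = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    speed = exp(mean_y - stamina * mean_x)
    return speed, stamina


def course_time(profile, distance):
    """Best time the course profile predicts for a distance."""
    speed, stamina = profile
    return speed * distance ** stamina


def project_time(time_run, distance_run, from_profile, to_profile, target_distance):
    """Assumed translation: keep the horse's ratio to the course best time."""
    ratio = time_run / course_time(from_profile, distance_run)
    return ratio * course_time(to_profile, target_distance)


# Hypothetical best times (furlongs, seconds) for two courses.
course_a = fit_power_law([(5, 58.0), (6, 70.5), (7, 83.4)])
course_b = fit_power_law([(5, 57.2), (6, 69.8), (7, 82.9)])
print(round(project_time(59.5, 5, course_a, course_b, 6), 2))
```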

Obviously, this technique has restrictions in terms of producing realistic projections. It works best with:
  • Maiden 2yo and 3yo races at sprint distances;
  • Races on “Good” or “Good-To-Firm” going; 
  • Recent “In-The-Money” races;
  • Projections from one graded stakes racecourse to another (e.g. Newmarket to York); and
  • Horses that race prominently and do not require "luck in running".

That said, it has potential for projecting future times based on performances at racecourses with vastly different configurations, a goal which cannot be achieved by speed figures!

Thursday, May 31, 2018

CsvPredictor: Turns Historical Record Into Mini Prediction System

CsvPredictor turns a historical record in CSV format into a mini prediction system. The program is completely agnostic with respect to the domain knowledge captured in the file (e.g. weather conditions, successful movies, past performances). 
Running CsvPredictor.exe with a valid CSV file results in a Q&A session based on the salience of the features (columns), effectively turning a standard flat file into a data-mining classification tree.
For example, whether or not to play ball given current weather conditions:
    C:\CsvPredictor>CsvPredictor.exe PlayBall.csv
    CsvPredictor v2.41
    Input File: "PlayBall.csv" (14 records and 4 features)
    Top Features (Salience)
    Outlook   0.46176
    Humidity  0.36618
    Wind      0.11693
    Q. Is Outlook  =  ["Overcast"; "Rainy"; "Sunny"]?  Sunny
    Q. Is Humidity =  ["High"; "Normal"]?  Normal
    A. Predict: PlayBall = True
or checking the likelihood of a new movie being a blockbuster!
    C:\CsvPredictor>CsvPredictor.exe Movies.csv
    CsvPredictor v2.41
    Input File: "Movies.csv" (2690 records and 5 features)
    Top Features (Salience)
    Budget              0.34871
    Genre               0.26719
    Production Country  0.24084
    Runtime             0.11430
    Q. Is Budget =  ["<=15000000.00"; "<=44263333.33"; "<=380000000.00"]? 
    Q. Is Genre =  ["Action"; "Adventure"; "Animation"; "Comedy"; "Crime"; 
                    "Documentary"; "Drama"; "Family"; "Fantasy"; "Foreign"; 
                    "History"; "Horror"; "Music"; "Mystery"; "Romance"; 
                    "Science Fiction"; "Thriller"; "War"; "Western"]?  
    Q. Is Production Country =  ["Australia"; "Canada"; "Hong Kong"; 
                                 "Ireland"; "United Kingdom";
                                 "United States of America"]?  
                                 United States of America
    Q. Is Runtime =  ["<=99.47"; "<=115.00"; "<=248.00"]?  <=115.00
    Q. Is Release Month =  ["<=5.00"; "<=9.00"; "<=12.00"]?  <=12.00
    A. Predict: Success = True
Note, it is very important to state that this program is only intended to provide an easy entry-point to data analytics for handicappers and is, in no way, intended to replace the advice and expertise of professional data analysts and statisticians!
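
The post does not say how CsvPredictor scores salience. Information gain is one common way to rank features when building a classification tree, and the sketch below shows it on a tiny invented data set. Its scores should not be expected to match the salience values printed above.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature, target):
    """Reduction in target entropy from splitting `rows` on `feature`."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[feature] for row in rows}:
        subset = [row[target] for row in rows if row[feature] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical records, not the contents of PlayBall.csv.
rows = [
    {"Outlook": "Sunny", "Wind": "Weak", "PlayBall": True},
    {"Outlook": "Sunny", "Wind": "Strong", "PlayBall": True},
    {"Outlook": "Rainy", "Wind": "Weak", "PlayBall": False},
    {"Outlook": "Rainy", "Wind": "Strong", "PlayBall": False},
]
for feature in ("Outlook", "Wind"):
    print(feature, information_gain(rows, feature, "PlayBall"))
```

In this toy file Outlook separates the outcomes perfectly and Wind tells us nothing, so a tree builder would ask about Outlook first.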

Sunday, May 06, 2018

ExMachina Handicapping Rules (Excel Add-In)

Many of us spend countless hours trawling through historical records in a vain attempt to gain new insights into the key fundamental factors that will enhance our sports handicapping. Unfortunately, our innate cognitive biases (e.g. anchoring, availability, confirmation) continually invade all attempts at a quasi-scientific approach to data mining. Ideally, we would like a quick-fix solution to this dilemma – no new learning required and automatically works with available tools!

To that end, enter the ExMachina Excel Add-In (32-bit and 64-bit), which takes as input a CSV file of historical data and outputs a set of decision rules. In brief, the goal is to identify the most salient attributes in the data file and to create a set of rules based on that specific subset. Note, it is very important to state that this Excel Add-In is only intended to provide an easy entry-point to data analytics for handicappers and is, in no way, intended to replace the advice and expertise of professional data analysts and statisticians.

If you are interested in reviewing how the Excel Add-In works, then download the following MP4 file – ExMachina Handicapping Rules.

Monday, March 12, 2018

Cheltenham 2018: Supreme Novices Hurdle Handicapping

It is time once again for our annual attempt to find live longshots to finish in the money in the Supreme Novices Hurdle (G1) at Cheltenham 2018.

As ever, our approach is based on the following premises:
1. Supreme Novices Hurdle is similar to Kentucky Derby - young horses, many attempting graded stakes, championship race for first-time with little form in book.
2. Eliminate non-contenders and whatever remains, no matter how improbable, are our selections. Horses are only eliminated under one heading even though they may qualify for elimination under multiple headings:
    a. Pedigree mismatch to former winners [Sharjah].
    b. Small fields [Slate House].
    c. Poor "Late-Speed" [First Flow, Shoal Bay].
    d. Poor Cheltenham Form [Golden Jeffrey].
    e. Not suited by Going [Saxo Jack, Trainwreck].
    f. Not suited by L-H track [Getabird].
    g. Poor FPR [Khudha, Lostintranslation, Mengli Khan].
    h. Over-exposed form [Claimantakinforgan, Dame Rose, Western Ryder].
    i. Weak "Strength-Of-Schedule" [Simply The Betts].
3. Minimum price 10/1 [Kalashnikov, Summerville Boy].
This leaves Paloma Blue 13/1, Us And Them 33/1, and Debuchet 40/1 as our selections with Kalashnikov 4/1 and Summerville Boy 8/1 only eliminated on price!

Note: Given the limited exposure of all the runners, we are not saying that those we have eliminated are not going to win - simply that they did not meet our criteria for live longshots to run in the money. The key takeaway, as always, is using a process of elimination not selection for identifying contenders.

Thursday, March 08, 2018

Kelly And Mutually-Exclusive Outcomes (AvK)

In AvB events, such as baseball, basketball, or football, the general advice to only bet the overlay is technically correct. However, in an AvK event, such as horse-racing, with a number of mutually-exclusive outcomes this advice is not strictly correct. For example, in the following racecard (sorted in decreasing expected-value (e.v.) order), even though the handicapper has rated Bravo's win probability (π) at 29%, it is an underlay and not included in the list of bet selections:


But, if Bravo's price were to drift to 3.35, then this underlay would be added to the list!


Kelly betting is predicated on maximising the logarithm of the handicapper's bankroll over the long-term. But, in the short-term, that goal is translated into not losing specific events when the price is right! The key role played by overlays in mutually-exclusive events is that there must be at least one such betting option available in any event on which we wish to bet. Beyond that, the specific choices will only be governed by maximising the logarithm of our bankroll!

Note: Blindly backing high probability combinations such as Alpha, Bravo, and Charlie (total win probability = 80%) will eventually lead to ruin.
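
The quantity being maximised can be made concrete with a short Python sketch. The three-runner card and the stakes below are hypothetical (they are not the racecard from this post), and the function assumes the listed runners cover every possible outcome.

```python
from math import log


def expected_log_growth(stakes, decimal_odds, win_probs):
    """Expected log bankroll growth when exactly one listed runner wins."""
    retained = 1.0 - sum(stakes)  # bankroll fraction left unstaked
    return sum(
        p * log(retained + f * o)
        for f, o, p in zip(stakes, decimal_odds, win_probs)
    )


# Hypothetical card: Alpha, Bravo (an underlay), Charlie.
odds = [2.50, 3.35, 3.00]
probs = [0.50, 0.29, 0.21]

alpha_only = [1.0 / 6.0, 0.0, 0.0]     # single-bet Kelly on the overlay
alpha_bravo = [0.2214, 0.0821, 0.0]    # overlay plus the drifting underlay
blind = [0.30, 0.30, 0.30]             # equal stakes on all three

for label, stakes in (("Alpha only", alpha_only),
                      ("Alpha + Bravo", alpha_bravo),
                      ("Blind", blind)):
    print(label, round(expected_log_growth(stakes, odds, probs), 4))
```

Adding the underlay raises the expected log growth above the overlay-only position, while spreading equal stakes across every runner turns it negative, which is the slow road to ruin.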

Tuesday, December 12, 2017

Vendire-Ludorum Excel Add-In

Treat yourself to an Excel Add-In (32-bit and 64-bit) [Windows] that includes some of the standard functions on which we have come to rely in our daily sports trading. Among the functions available are the following:
  • Edge,
  • Expected Value,
  • Time Value,
  • Kelly Single Stake,
  • Kelly Mutually-Exclusive Stakes,
  • Kelly Simultaneous Independent Events (5),
  • Mark-To-Market,
  • Risk-Of-Ruin, and
  • Wisdom-Of-Crowd Market Index (WCMI).
In addition, there is a sample spreadsheet highlighting the various functions through worked examples from Haigh, Paulos, and Yao.

Sunday, November 26, 2017

Mark-To-Market And Hedging

Following careful analysis of the next race, your assessment of the odds is 4/1 (5.00) - a 20% edge on the market price of 5/1 (6.00) - with respect to your selection. Flushed with confidence, you place a $250 (4%) win bet on InTheMoney at 5/1 (6.00). Time to sit back and wait for the profit to roll in? Maybe!
Consider for a moment - what is the current market value of your investment? When you place the initial trade, its market value is $250=[($250*6.00)*(1/6.00)]. Roll tape and the market turns in-play as InTheMoney sets the early pace with measured fractions. Turning into the stretch it looks an even-money chance at worst to win. Freeze frame and consider once again - what is the current market value of your investment? Assuming the market is now efficient with respect to InTheMoney’s win probability, the updated market value is $750=[($250*6.00)*(1/2.00)]. In other words, at this point in the race, you have already won $500=[$750-$250]. Fast forward to the finish-line and InTheMoney is beaten by a late closer, JustInTime. Now, the really interesting question is - how much have you lost? If you had no choice but to only back your selection before the race, then you have lost $250. But, if you also had the option in-play to hedge the win bet at even-money, then you lost $750! (see Weighing the Odds in Sports Betting (Ch.4)).
In summary, no trade is complete without both a back and a lay bet or, in other terminology, an opening and a closing position.
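
The mark-to-market arithmetic above is easy to sketch in Python. The hedge shown is the simple equal-profit ("green-up") lay with commission ignored, which is an assumption for illustration. It is not the median-maximising hedge stake.

```python
def mark_to_market(back_stake, back_odds, current_odds):
    """Current value of an open back bet, using decimal odds."""
    return back_stake * back_odds / current_odds


def equal_profit_hedge(back_stake, back_odds, lay_odds):
    """Lay stake that locks in the same profit whether or not the horse wins.

    Returns (lay_stake, locked_profit); exchange commission is ignored.
    """
    lay_stake = back_stake * back_odds / lay_odds
    return lay_stake, lay_stake - back_stake


print(mark_to_market(250, 6.00, 6.00))      # value when the bet is struck
print(mark_to_market(250, 6.00, 2.00))      # value turning into the stretch
print(equal_profit_hedge(250, 6.00, 2.00))  # lay stake and locked-in profit
```

Laying $750 at even-money would have locked in $500 whichever horse won: a win pays $1,250 on the back bet less $750 lay liability, and a loss costs the $250 stake but collects the $750 lay.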

Using our knowledge of time averages, we can select a lay price and calculate a hedging stake to maximize our median bankroll over time.

Thursday, August 31, 2017

Horse-Racing Overlays

Sports traders sometimes conflate AvB events with AvK (K = N-1) events (N = number of entrants). The prototypical AvB game is a football match and the equivalent AvK example is horse-racing. Some experts encourage traders to identify a single overlay in both events and bet accordingly. Whereas this advice is correct for AvB events, it is not correct for AvK events.
In horse-racing, you should only select those races with at least one overlay for further examination. But, as Ravi Phatarfod points out in his excellent 1996 paper Betting Strategies In Horse Races, a gambler is more likely to correctly assess that the winner will be one of three horses than being able to correctly assign win probabilities to each individual horse. And, as John Haigh illustrates in Taking Chances, Kelly betting on horse races also encourages us to spread our risk across multiple entrants including, on occasion, those from all three categories of bets, favorable (positive expected value), fair (zero expected value), and unfavorable (negative expected value) bets. This approach guarantees over time that you will minimize your risk of ruin (total loss of capital).

Note: You must enter horse details in descending e.v. order only.

Sunday, July 16, 2017

An Engineer, A Physicist, And A Statistician

Many years ago, as a young postgraduate student I and two colleagues (an engineer (E) and a physicist (P)) would meet every Saturday for an early lunch to discuss the week’s events in our respective disciplines and to give our differing perspectives on world events. It was an innocent and idealistic exercise driven by youthful enthusiasm and naivete. Naturally, our discussions ranged from the sublime to the ridiculous and everything in between.
Time passed and as our careers evolved we drifted apart and lost contact. Then a few years ago, I unexpectedly ran into E at Royal Ascot. After we engaged in some good-natured banter about the humbling nature of the aging process and introduced our respective wives, our attention turned to the Group 2, Queen Mary Stakes for 2yo Fillies. I asked E what he liked in the race and how he made his selections. He turned to me in disbelief and said, “For 2yo races, I use the method you recommended to me back in the day to identify a select band of unexposed horses to exploit throughout the season.” Completely bewildered I said, “Remind me”. He then proceeded to outline an adaptation of the Bayesian bandit (Thompson sampling) “explore-exploit” strategy as used in the multi-armed bandit problem. To which, I blurted out, “You mean, it works!” I quickly pointed out that I must have thought at the time it was a strategy worth exploring but that all the kudos should go to him for exploiting it so successfully. Engineers rule by defeasible reasoning!
Later that evening, my wife teased me by asking if I had given mathematical advice to everyone I had ever met and when I looked surprised by the question she added wickedly, “My hero, so brave, so strong!”

Thursday, June 29, 2017

True Talent Levels

It is easy to conflate "Which team is number one?" with "Which team wins today?" The former is decided over the course of a regular season and is primarily driven by skill (talent) but the latter changes on a daily basis and when there are two roughly equal opponents it is principally governed by luck!
With respect to being number one, the definitive review of the field is provided by Langville and Meyer (2012) in Who's #1?: The Science of Rating and Ranking. And, in terms of estimating true talent levels, Adam Dorhauer delivers two excellent articles worthy of publication, Elo vs. Regression to the Mean: A Theoretical Comparison and Regression with Changing Talent Levels: The Effects of Variance.

Monday, March 13, 2017

Cheltenham 2017: Supreme Novices Hurdle Handicapping

Once more unto the breach, dear friends, in our annual attempt to find live longshots to finish in the money in the Supreme Novices Hurdle (G1) at Cheltenham 2017. As ever, our approach is based on the following premises: 
  • Supreme Novices Hurdle is similar to Kentucky Derby - young horses, many attempting graded stakes, championship race for first-time with little form in book.
  • Eliminate non-contenders and whatever remains, no matter how improbable...
    • Avoid horses with pedigree mismatch to former winners [Melon, Crack Mome].
    • Avoid horses from small fields [Labaik, Elgin].
    • Late speed important [Magna Cartor, Glaring, Capitol Force].
    • Poor Cheltenham Form [River Wylde].
    • Poor FPR [Pingshou, High Bridge].
  • Minimum price 10/1 [Ballyandy, Bunk Off Early].
This leaves Beyond Conceit 20/1 and Cilaos Emery 16/1 as our selections!
Note, given the limited exposure of all the runners, we are not saying that those we have eliminated are not going to win - simply that they did not meet our criteria for live longshots to run in the money. The key takeaway, as always, is using a process of elimination not selection for identifying contenders.
Footnote: In their respective next outings, Cilaos Emery won G1 (Punchestown) and Beyond Conceit finished second in G1 (Aintree).

Friday, February 03, 2017

Adaptive Boosting

Machine learning studies the design of automatic methods for making predictions about the future based on past experiences. In the context of classification problems, machine-learning methods attempt to learn to predict the correct grouping of unseen examples. In the mid-1990s, Freund and Schapire introduced the meta-heuristic, Adaptive Boosting (AdaBoost), “…an approach to machine learning based on the idea of creating a highly accurate prediction rule by combining many relatively weak and inaccurate rules… Indeed, at its core, boosting solves hard machine-learning problems by forming a very smart committee of grossly incompetent but carefully selected members…” (Boosting: Foundations And Algorithms). As context for how boosting might work, the authors introduce the following toy problem in the opening paragraph to A Short Introduction To Boosting: “A horse-racing gambler, hoping to maximize his winnings, decides to create a computer program that will accurately predict the winner of a horse race based on the usual information...” As discovered by the early pioneers of expert systems in the 70s and 80s and as acknowledged by the authors, the biggest stumbling block to using experts is that many of them are unable to detail their decision process or even to rank order the importance of key variables. In light of this issue, Freund and Schapire point out that the beauty of boosting is that it builds on what experts can do not on what they cannot do, namely, given a specific scenario they are usually able to make a judgment in favor or against a particular outcome. For instance, we can ask an expert handicapper if a specific scenario - course and distance winner in last outing a week ago - would lead him to believe that it was more likely to win again or to finish out of the money. Note the phrase “more likely to” – this is a key strength of boosting - it asks for the balance of probabilities and not for the highly probable. 
The boosting phase combines many such simple, scenario-based rules into an overall weighted decision for an upcoming event. In its original specification, the defining quality of boosting was that it aggregated an incomplete set of “if-then” rules (decision stumps) that recursively address unexplored regions (areas for which previously chosen rules would give incorrect predictions) of existing data sets. The inherent strength of this approach is that it automatically leverages the key dimensions of Wisdom Of Crowds, namely, diversity, independence, decentralization, and aggregation. For a worked example applied to NFL prediction, see James McCaffrey’s Classification And Prediction Using Adaptive Boosting. For the underlying theory of why wisdom of crowds works, see Scott Page’s excellent The Difference. And finally, Malacaria and Smeraldi explore the relationship between the AdaBoost weight update procedure and Kelly’s theory of betting and also establish a connection between AdaBoost and Information theory in On AdaBoost And Optimal Betting Strategies.
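
To make the committee idea concrete, here is a minimal AdaBoost sketch in Python. Each "rule" stands in for one expert hunch that votes +1 or -1. The three rules and the eight scenarios are invented for illustration and have nothing to do with real racing data.

```python
from itertools import product
from math import exp, log


def weighted_error(rule, weights, examples, labels):
    """Total weight of the examples that a rule gets wrong."""
    return sum(w for w, x, y in zip(weights, examples, labels) if rule(x) != y)


def adaboost(examples, labels, rules, rounds):
    """Return a committee of (alpha, rule) pairs; labels and votes are +1/-1."""
    weights = [1.0 / len(examples)] * len(examples)
    committee = []
    for _ in range(rounds):
        best = min(rules, key=lambda r: weighted_error(r, weights, examples, labels))
        error = weighted_error(best, weights, examples, labels)
        if error >= 0.5:
            break  # no rule is better than a coin flip on the reweighted data
        error = max(error, 1e-10)  # avoid division by zero for a perfect rule
        alpha = 0.5 * log((1.0 - error) / error)
        committee.append((alpha, best))
        # Raise the weight of misclassified examples, lower the rest.
        weights = [w * exp(-alpha * y * best(x))
                   for w, x, y in zip(weights, examples, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return committee


def predict(committee, x):
    """Weighted vote of the committee."""
    return 1 if sum(alpha * rule(x) for alpha, rule in committee) >= 0 else -1


# Toy scenarios: three yes/no factors; the outcome follows the majority.
examples = list(product((-1, 1), repeat=3))
labels = [1 if sum(x) > 0 else -1 for x in examples]
rules = [lambda x, i=i: x[i] for i in range(3)]  # each rule trusts one factor

committee = adaboost(examples, labels, rules, rounds=3)
hits = sum(predict(committee, x) == y for x, y in zip(examples, labels))
print(hits, "of", len(examples), "scenarios classified correctly")
```

Each rule on its own is right in only six of the eight scenarios, but after three rounds the weighted committee classifies all eight correctly.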

Friday, December 23, 2016

Biased Coin (Haghani & Dewey, 2016)

A recent paper by Haghani & Dewey (2016) sheds an unflattering light on subjects formally trained in finance as to their lack of basic knowledge with respect to probability and uncertainty – “If a high fraction of quantitatively sophisticated, financially trained individuals have so much difficulty in playing a simple game with a biased coin, what should we expect when it comes to the more complex and long-term task of investing one’s savings?” Though an otherwise interesting study, there are a couple of key points which do not receive adequate attention in the paper:
  • Financial: Though the median final bankroll of $10,504 is derived in the footnotes, there is not sufficient attention drawn to it in the paper itself. Time-Value automatically generates this value whereas Expected-Value generates the wholly unrealistic $3,220,637.
  • Psychological: The fallacy of “Playing With House Money” – “…you are offered a stake of $25 to take out your laptop to bet on the flip of a coin for thirty minutes.” What would have happened if the subjects had to pay $25 to play instead of being given it for free? 
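
The two figures in the Financial bullet can be reproduced with a few lines of Python. The inputs below (a 60% coin, 300 flips, a constant 20% Kelly stake, no cap on winnings) are not stated in this post. They are assumed here because they recover both quoted values.

```python
# Assumed game parameters (not given in the post itself).
start = 25.0
flips = 300
p_heads = 0.60
fraction = 0.20  # Kelly stake for a 60/40 even-money coin: 2p - 1

# Expected-Value view: average the per-flip multiplier, then compound.
mean_multiplier = p_heads * (1 + fraction) + (1 - p_heads) * (1 - fraction)
expected_value = start * mean_multiplier ** flips

# Time-Value view: the typical path wins 60% of flips and loses 40%.
median_value = (start
                * (1 + fraction) ** (flips * p_heads)
                * (1 - fraction) ** (flips * (1 - p_heads)))

print(round(expected_value), round(median_value))
```

The gap between the two numbers arises because the expected value is dominated by a tiny number of extremely lucky paths, whereas the median describes what a typical player would end up with.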

No less a luminary in both the financial and gambling worlds than Ed Thorp says: “This is a great experiment for many reasons. It ought to become part of the basic education of anyone interested in finance or gambling.”

Monday, November 07, 2016

Evens-Equivalent Trades

Risk of Ruin (RoR) (Epstein 2009) provides an easily understood metric (probability of bankroll depletion before doubling it) with which to compare strategies. By way of illustration, let us assume that both you and your brother are recreational handicappers. You trade baseball, home-underdogs and he trades horse-racing, second-favorites, as follows:


In the classic treatment of ruin, there is a working assumption of even-money trades to make the calculations tractable. To that end, we must first transform our real-world trades into their even-money equivalents with the same edge and volatility, see Krigman (1999). Despite having a smaller edge and a larger stake, you have a lower probability of depletion than your brother principally because you are risking a lower percentage of your bankroll per trade. Ideally, your RoR should be below 5% and to achieve this you both would have to either increase your bankroll or decrease your stake, as follows: [(MLB: 5%, 218.20 or 5,730); (H-R: 5%, 54.97 or 4,546)].
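
For even-money trades, the textbook gambler's-ruin result gives the probability of depletion before doubling directly. The sketch below assumes the trades have already been converted to their even-money equivalents (that conversion is not shown here), and the inputs are hypothetical rather than the figures from the table above.

```python
def risk_of_ruin(win_prob, units):
    """Probability of losing the bankroll before doubling it.

    Classic gambler's-ruin result for one-unit even-money bets, starting
    with `units` units and stopping at 0 or 2 * `units`.
    """
    ratio = win_prob / (1.0 - win_prob)
    return 1.0 / (1.0 + ratio ** units)


def units_for_target(win_prob, target=0.05):
    """Smallest whole number of units that keeps the RoR at or below `target`."""
    units = 1
    while risk_of_ruin(win_prob, units) > target:
        units += 1
    return units


# Hypothetical even-money-equivalent trade with a 55% win probability.
print(round(risk_of_ruin(0.55, 10), 4))
print(units_for_target(0.55))
```

Because the bankroll is measured in units of the stake, increasing the bankroll and decreasing the stake are two routes to the same reduction in risk.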

Note that Edge's impact only equates with that of Volatility after 219 trades for you and 363 trades for your brother. And it takes a minimum of 806 trades for you and 1336 trades for your brother before you can be at least 95% confident that the combined effects of positive edge and mixed-bag volatility work in your favor to guarantee positive bankroll growth. In other words, despite having potentially successful trading strategies, you both will be well into your second season of handicapping before you can be sure of beginning to reap the benefits!