
March 27th, 2010
12:58 am - World Cup Bracketology
Someone at work suggested that word has gone around that I blog. This felt weird to hear at first, but on further reflection it's not much different from the fact that there are quite a few of us on Facebook offering up status updates; this is just a bit more verbose. If you're from where I work and reading this, hello! (Don't be a stalky stranger; please tell me that I'm wearing odd socks next time you see me at work.) The definitive version of my blog is at Dreamwidth, but if you're sufficiently bored then you might care to look through the silly things I've said over the years at LiveJournal. You've all known that I'm a raging geek for years so I'm more than happy to own and stand up for what I've written, though much of what I've written is now rather old and some things change over time. Not many, though.
A big feature on the annual American sporting calendar is March Madness, a tournament between the best college (i.e., university) basketball teams. (Parallel contests exist for men and women.) Simplifying at the cost of a little accuracy, the US is arbitrarily split into four regions, each of which is nearly spuriously associated with a geographic name. A committee determines and produces a ranked order of the 16 best college basketball teams in each region. A single-elimination (i.e., knockout) competition takes place in each region, with the draw predetermined by strict seeding order: the first round features the number 1 ranked team vs. the number 16 ranked team, the number 2 seed vs. the number 15 seed and so on, the second round may feature number 1 vs. number 8, 2 vs. 7 and so forth. The four winners of the regions then play each other in semifinals and a final to determine a national champion. This takes place over about three weeks or so. It's a huge deal, akin to the FA Cup over the course of three weeks; compare with Wimbledon, except that all the teams are supported rather than "just the British players".
Part of the paraphernalia of the event is the Bracket Contest, a form of competition in which participants attempt to predict the result of every game in the tournament. You know who the 64 initial teams are and you can work out who is going to be playing whom in later rounds, so you predict the results of 63 matches. Predicting all 63 results in advance, not least who is going to be playing at each point, is legendarily, astronomically difficult. We're talking, very roughly, "winning a big lottery jackpot from a single ticket twice running" difficult here.
Theoretically, picking each result at random requires you to get a fifty-fifty shot correct 63 times in a row, so the chance of getting it all correct is (½)^63, or 1 in 2^63: that is 9,223,372,036,854,775,807 to 1 against, which is nearly as likely as winning the Euromillions jackpot at a single attempt two weeks running, with four numbers from a single National Lottery ticket on the Wednesday between. (I make no apology for catering to a British audience here in my references: my US readers likely know all about March Madness already. I wave respectfully to non-US, non-UK readers.) So that's a "least probable" bound for estimating the probability of a perfect bracket.
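For anyone who wants to check the coin-flip figure, here is a minimal sketch using exact fractions. The function name `coin_flip_bracket` is my own invention for illustration; the only inputs are the 63 games and the fifty-fifty assumption from the paragraph above.

```python
from fractions import Fraction


def coin_flip_bracket(games: int = 63) -> Fraction:
    """Chance of calling every game right if each is a pure 50/50 guess."""
    return Fraction(1, 2) ** games


p = coin_flip_bracket()
# "N to 1 against" means N losing outcomes for every winning one.
odds_against = (1 - p) / p

print(f"1 in {p.denominator:,}")
print(f"{int(odds_against):,} to 1 against")
```

Exact fractions are used rather than floats because 2^63 is too large for a float to hold every digit.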
The tremendous Sports Economist blog has some stats about seeds' progression records over the past 24 years. We can use this to estimate a "most probable" bound for estimating the probability of a perfect bracket; let us assume that the most probable bracket has each of the four regions going exactly to seeding. I estimate that the probability of a region of 16 going exactly to seeding is a little worse than 1 in 1200. (I multiplied the probability of a number 1 seed having 4+ wins by the probability of a number 2 seed having exactly 3 wins given that they have at least 3 wins, by the probability of a number 3 seed having exactly 2 wins given that they have at least 2 wins, and so on, down to the probability of each of the 9 to 16 seeds having exactly 0 wins. Excel spreadsheet on request.) Then the probability of all four regions going exactly to seeding is (1 in 1200) to the power of four, and for your all-perfect bracket you have to get the last three matches correct as well. Accordingly, I reckon we're looking at a "most probable" bound of something like 16,605,026,108,360 to 1 against, which is similar to a National Lottery jackpot on one ticket followed by five balls plus the bonus on another. (Other estimates vary, but the tightest estimate is "1 in 150 million", which, working backwards, implies they think the chance of getting a region perfect is better than 1 in 68. If that's so then I'll lay you a small amount at a generous 80-1 against getting an entire region correct any day. Look at the overlay on that...)
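A rough sketch of that arithmetic follows. It makes one loud assumption that the paragraph above does not spell out: the last three matches are treated as plain coin flips. The function names are hypothetical, and 1200 is the rounded per-region figure, so the result lands a shade under the 16.6 trillion quoted above.

```python
def chalk_bracket_one_in(region_one_in: float = 1200.0,
                         regions: int = 4,
                         final_games: int = 3,
                         final_game_p: float = 0.5) -> float:
    """'1 in N' for every region going to seeding plus the last games.

    ASSUMPTION: the last `final_games` matches are independent, each
    predicted correctly with probability `final_game_p`.
    """
    return region_one_in ** regions / final_game_p ** final_games


def implied_region_one_in(bracket_one_in: float,
                          regions: int = 4,
                          final_games: int = 3,
                          final_game_p: float = 0.5) -> float:
    """Work backwards from a whole-bracket '1 in N' to a per-region one."""
    return (bracket_one_in * final_game_p ** final_games) ** (1.0 / regions)


print(f"1 in {chalk_bracket_one_in():,.0f}")
print(f"1 in 150 million implies 1 in {implied_region_one_in(150e6):.1f} per region")
```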
Bracket Contests are very common. Book of Odds quotes a source suggesting 40 million brackets are filled out every year, and that ESPN had over 4.6 million submissions in 2009. (The best score was 58/63.) Now that's not going to be 40 million people with one bracket each, but that's still millions or maybe tens of millions of players. This is a Farmville sort of number, let alone a World of Warcraft sort of number, and probably less than an order of magnitude from a poker or a Tetris sort of number. Heck, the President plays (video), and the video starts with the Baracket including a pick of #1 Kansas over #9 Northern Iowa, like just about everyone else. Which proved to be wrong.
So a recent interesting story is this claim that someone managed to pick the first two rounds entirely correctly, going 48/48. I don't much care for the hook that the story uses, but given that ESPN claims to have had 4.78 million entries this year of which only four were 47/48 for the first two rounds, a perfect entry is a rarity. If you believe the claims that perfection for the first two rounds is 13.46 million to 1 against, then even 4.78 million shots at perfection all miss about 70% of the time (treating each entry as an independent shot). It is said that the existence of Bracket Contests makes March Madness a rare sporting contest that becomes less followed the closer it gets to the final, simply because people take less interest when they know their bracket is out of contention.
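The "all miss" figure can be checked with a few lines. This is only a sketch under a strong assumption: every entry is an independent shot with the same quoted chance of perfection, which real brackets (full of popular picks) certainly are not. `prob_all_miss` is a name I have made up for illustration.

```python
import math


def prob_all_miss(entries: int, odds_against: float) -> float:
    """Chance that none of `entries` independent shots comes off.

    ASSUMPTION: each entry independently succeeds with probability
    1 / (odds_against + 1).
    """
    p_hit = 1.0 / (odds_against + 1.0)
    # log1p keeps precision when p_hit is tiny.
    return math.exp(entries * math.log1p(-p_hit))


p_none = prob_all_miss(4_780_000, 13_460_000)
print(f"No perfect entry: {p_none:.1%}")
```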
In conclusion: Bracket Contests are big news and fun. They're also pretty exclusive to March Madness. Why shouldn't the rest of the world enjoy Bracket Contests at a sporting event they'll be following... like this year's (association football) World Cup?
The World Cup has a pretty well-defined structure which makes it amenable to running a Bracket Contest, of sorts. The finals take place in two stages; the first stage sees eight groups of four teams compete in parallel round robins, to generate eight first-placed teams and eight second-placed teams. These teams then fit into a completely deterministic bracket, from which it is possible to identify all the potential matches in the remainder of the competition. This makes it ideal fertile territory for a Bracket Contest, and I don't think there are many of those taking place. (I thought this was a genuinely original idea, but hats off to TourneyTopia for getting there first, and, quite possibly, to lots of other people of whom I am not aware.) We then ask, for values of "we" equal to "I", what the probability of a perfect World Cup bracket is.
A World Cup bracket, as I define it, would consist of correctly identifying which team would come first and which team would come second, out of four, in each of eight groups, followed by correctly choosing the results of all 15 competitive games in the rest of the competition. (Nobody cares about the Bert Bell Benefit Third-Place Game.) There are 12 ways to determine a first place and a second place from a group of four, so there are 12^8 = 429,981,696 possible ways to fill the brackets even before picking a bracket match result. There are 2^15 = 32,768 ways to fill the match results based on the same initial 16 teams in the bracket, so there are 14,089,640,214,528 different brackets possible. This gives us a lower bound for the probability (or, I suppose, a higher bound for the odds) of a perfect bracket of 14,089,640,214,527 to 1 against, close to "National Lottery jackpot followed by five-and-the-bonus" territory.
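The counting above is easy to reproduce. A small sketch, with constant names of my own choosing; `math.perm` needs Python 3.8 or later.

```python
from math import perm

GROUPS = 8
TEAMS_PER_GROUP = 4
KNOCKOUT_GAMES = 15  # 8 + 4 + 2 + 1, ignoring the third-place game

# Ordered choice of (first, second) from four teams: 4 * 3 = 12.
group_orderings = perm(TEAMS_PER_GROUP, 2)
group_stage = group_orderings ** GROUPS
knockout = 2 ** KNOCKOUT_GAMES
total_brackets = group_stage * knockout

print(f"{group_stage:,} group-stage outcomes")
print(f"{knockout:,} knockout outcomes")
print(f"{total_brackets:,} brackets, i.e. {total_brackets - 1:,} to 1 against")
```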
Working out a higher bound for the probability of a perfect bracket is trickier, not least as we don't have the wonderfully convenient seeding structures of March Madness in the World Cup from which to draw inferences. Instead, I fear we must attempt to mine the wisdom of the bookmakers, with Oddschecker being as good a starting point as any. As we are trying to generate an upper bound, and thus want to estimate the probabilities of results being as high as possible, I will take the least generous (nontrivial) set of odds offered by any bookmaker on a particular outcome and convert that to a probability, noting that it is in the bookmaker's interest to overestimate that probability and thus provide the potential for an overround.
An upper bound for the probability of picking the eight winners is given as follows, quoting each favourite with what we can be reasonably confident is an overestimate of their probability: France (5/9), Argentina (5/7), England (7/9), Germany (13/21), Netherlands (15/23), Italy (7/9), Brazil (9/13) and Spain (4/5). The chance of all eight winning is thus at most 100/1863, or about 5.4%. By extension, I think that, in general, picking all eight World Cup group winners is never going to be more than 10% likely, and that would take eight (3/4 probability, or 1/3 fair odds) shots.
An upper bound for the probability of picking the eight winners and eight second places is given analogously: France/Mexico (3/13), Argentina/Nigeria (1/3), England/USA (1/2), Germany/Serbia (5/19), Netherlands/Cameroon (4/13), Italy/Paraguay (1/2), Brazil/Portugal (3/8), Spain/Chile (1/2). The chance of all eight pairs placing as listed is thus at most 15/51376, or about 0.03%. By extension, I think that, in general, picking all eight World Cup group winners and all eight World Cup group second places is never going to be more than 0.5% likely, and that would take eight (11/21 probability, or 10/11 fair odds) shots.
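Both products can be checked exactly with fractions. The lists below are simply the figures quoted in the two paragraphs above, in the same group order; the variable names are mine. Multiplying them together treats the eight groups as independent, which is how the bounds above were built.

```python
from fractions import Fraction as F
from math import prod

# Least generous bookmaker odds converted to probabilities, groups A to H.
winners = [F(5, 9), F(5, 7), F(7, 9), F(13, 21),
           F(15, 23), F(7, 9), F(9, 13), F(4, 5)]
winner_and_runner_up = [F(3, 13), F(1, 3), F(1, 2), F(5, 19),
                        F(4, 13), F(1, 2), F(3, 8), F(1, 2)]

p_winners = prod(winners)
p_pairs = prod(winner_and_runner_up)

print(f"All eight winners: {p_winners} = {float(p_winners):.2%}")
print(f"All eight winner/runner-up pairs: {p_pairs} = {float(p_pairs):.3%}")
```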
An upper bound for the probability of getting all fifteen knockout matches correct, even given an accurate bracket of 16, is hard to estimate, not least because we don't have any indication of seeding. For an upper bound, I will be stingy and wildly overestimate that there is, on average, a 75% chance of each match being correctly predictable, and thus the probability of getting 15 consecutive 75% shots correct is just greater than, well, 1 in 75. (If we go down to 72%, it's 1 in 138. 70%? 1 in 210. Two thirds? 1 in 437. 65%? 1 in 640. 60%? 1 in 2126. I don't know what the actual probability we should be looking at is.) Thus an upper bound for the probability of being able to fill out an entire World Cup bracket correctly is 0.5% * (1 / 74.8309), which we can fairly safely push out the tiniest smidgeon to a nice and memorable 15,000 to 1 against.
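A short sketch of the knockout arithmetic, so that other per-match guesses are easy to try. It assumes, as the paragraph does, that all 15 matches are independent with the same chance of being called correctly; `one_in` and `GROUP_STAGE_BOUND` are names I have made up.

```python
def one_in(p_per_game: float, games: int = 15) -> float:
    """'1 in N' for calling `games` independent matches, each with chance p."""
    return 1.0 / p_per_game ** games


for p in (0.75, 0.72, 0.70, 2 / 3, 0.65, 0.60):
    print(f"{p:.1%} per match: 1 in {one_in(p):,.1f}")

# The 0.5% group-stage upper bound from the previous paragraph.
GROUP_STAGE_BOUND = 0.005
overall_one_in = one_in(0.75) / GROUP_STAGE_BOUND
print(f"Whole bracket: about 1 in {overall_one_in:,.0f}")
```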
Incidentally, if you have spotted any errors in either my arithmetic or my logic, or if you can suggest any ways to tighten my bounds, they would be gratefully accepted.
I do think that bookmakers or newspapers could run quite engaging, and very simple to understand, "fill out your World Cup bracket" contests: one point for each team filled in correctly, first in terms of which teams make it from the group stages to the correct place in the final sixteen, then in terms of which teams make it through each round of the knockout stages. Bearing in mind the probability figures I quote are all overestimates, one should be able to offer 10,000/1 against a perfect bracket with consolations of 100/1 for placing all 8 winners and all 8 second places correctly and 10/1 for placing only all 8 winners correctly, and still make money on it. Alternatively, a newspaper might be able to run the competition and offer bonus prizes safe in the knowledge that they are moderately unlikely to be paid out  or, at least, could probably be reinsured against fairly easily.
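The one-point-per-correct-team scoring could be sketched as below. Everything about the representation is hypothetical: I am assuming a bracket is a mapping from a slot label (such as "R16-1" or "QF-1") to the team predicted to occupy it, and the team names are placeholders rather than real picks.

```python
def score_bracket(prediction: dict[str, str], actual: dict[str, str]) -> int:
    """One point for every slot where the predicted team is the real one."""
    return sum(1 for slot, team in actual.items()
               if prediction.get(slot) == team)


# Tiny made-up example with four slots.
actual = {"R16-1": "Team A", "R16-2": "Team B",
          "QF-1": "Team A", "SF-1": "Team A"}
prediction = {"R16-1": "Team A", "R16-2": "Team C",
              "QF-1": "Team A", "SF-1": "Team C"}

score = score_bracket(prediction, actual)
print(f"{score} points out of {len(actual)}")
```

A full World Cup bracket in this scheme would have 31 slots: 16 for the last sixteen, then 8, 4, 2 and 1 for the later rounds.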
Part of the reason why I'm in the mood to think about such things is that not so long ago I encountered The Wizard of Odds web site, written by the eponymous wizard (real name Michael Shackleford) himself. It has scads of information about gambling games, particularly their practical implementations found in Las Vegas. The author is the titular Wizard, who has a career path that I would have idolised twenty years ago: he qualified as an actuary, then put his mind towards analysing casino games and made a career out of it, both from consulting work and as a university professor passing on his knowledge. There's probably only room for one of him in the world; I'm glad he exists, for the commitment he has demonstrated to getting very large volumes of high-quality information out there at no charge to the reader.
I am terribly favourably predisposed towards him because of his writing style, which (quite correctly) focuses on expected value, near to exclusion of anything else, sometimes even to the fifth significant digit. The style is generally very unemotional, but when there is emotion, it's generally very direct, delightfully earnest and unselfconscious. (Case in point.) By chance in 2005 the site found itself top on Google for "Is my boyfriend cheating on me?" and he has wound up answering relationship questions amid the gambling questions ever since, approaching them in a very similar fashion, though his advice is sometimes a bit on the, well, xkcd side. He keeps out of his own writing to such an extent that it's charming when he allows himself a very occasional self-deprecatory anecdote. I even give him points for being a Settlers of Catan player, though these points may just be paying off debt from his October 2004 assertion that "Risk is the greatest board game ever made". There's no accounting for taste, of course, and it may well just be what he grew up with.
Lastly, if any of the game design people around here have ever thought of turning their hands to designing casino games (because, as far as I'm concerned, there isn't a terribly high bar to beat) then the good wizard has a list of articles about this. Seems to me that I would probably enjoy attending a games event devoted to homebrewed gambling games, were such a thing ever to exist, at least as much as I would enjoy playing existing games in a casino. Could there be a gap in the market there for people who want a little more variety with their gambling, and might be prepared to pay for the privilege?
Please redirect any comments here, using OpenID or (identified, ideally) anonymous posting; there are comments to the post already. Thank you!


