Part 5 was the number crunching post. I promised pedantry and I delivered 🙂
This post is about some nit-picks, caveats and other things worth pointing out, partly because it is important to get the maths right as best we can and partly because they don’t matter that much in terms of the broad sweep of Dave Freer’s argument. Put another way: all models are simplifications and imperfect representations. When critiquing an argument based on a model, figuring out what is a deep flaw and what is a minor departure from reality is important.
Dave uses for his analogy a person pulling colored balls from the bag. When he calculates the probability of several of the same color in a row he uses the same probability each time, e.g. “½ x ½ x ½ = 1/8”. This is not quite correct. His analogy is one in which balls are taken from the bag and then not returned to the bag (because in the real situation he is modelling a given novel can only be nominated the once in a set of nominations). Consequently when a red ball is taken from a bag with a finite number of balls, the chance that the next ball is red is reduced a little.
Think about a bag with 2 red and 2 black balls. The first ball you pick is a red (p=1/2). What is the chance your second ball is a red? There is now only one red out of three balls so p=1/3. This is an illustration of two events that are not independent.
Dave treats the analogy as if the probabilities are independent. He doesn’t say why but it is probably because the maths is simpler to explain and because it is better suited to the argument about proportionality he is making. In terms of his overall argument this is a nit-pick. He could have used an unbiased roulette wheel or he could have the balls returned to the bag etc. For a bag with a large number of balls the change in probability after each event would also be small.
How about the Hugos? Well this is trickier. The Hugo nominations are not random events at all. We’ve only introduced random events as a way of modelling what the Hugos would be like without political bias. Put another way if people aren’t considering the politics of the authors then we should expect the results in terms of politics to be effectively random.
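For anyone who wants to check the size of this nit-pick, here is a minimal sketch that compares the two ways of drawing. The function name `run_of_reds` and the bag sizes are my own illustrative choices, not anything from Dave’s post.

```python
from fractions import Fraction


def run_of_reds(reds, total, k, replace):
    """Exact probability that the first k balls drawn are all red.

    reds and total describe the bag; replace=True puts each ball back
    (independent draws), replace=False keeps it out of the bag.
    """
    p = Fraction(1)
    for i in range(k):
        if replace:
            p *= Fraction(reds, total)
        else:
            if i >= reds:
                # No red balls left, so the run is impossible.
                return Fraction(0)
            p *= Fraction(reds - i, total - i)
    return p


# The small bag from the example above: 2 red, 2 black.
print(run_of_reds(2, 4, 2, replace=True))    # two reds, balls returned
print(run_of_reds(2, 4, 2, replace=False))   # two reds, balls kept out

# A hypothetical large bag: 500 red, 500 black, three reds in a row.
print(float(run_of_reds(500, 1000, 3, replace=True)))
print(float(run_of_reds(500, 1000, 3, replace=False)))
```

With the small bag the two answers differ noticeably (1/4 against 1/6). With the large bag they agree to about three decimal places, which is why treating the draws as independent does little harm to the broad argument.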
Even so, in any given year, a book or work will only be nominated the once and so unless a given author has multiple eligible works we have a “without replacement” situation. If a ‘red’ author is picked that makes the pool of eligible ‘red’ authors smaller each year. That should work in Dave’s favor as it makes it less likely to get a run of reds. However this assumes that the ‘red’ authors have nothing else in common and the Puppies claim that this isn’t the case – they say that it isn’t just the leftyness of the nominated works but other aspects – subgenre types being ignored, ‘literary’ fiction being advantaged etc.
This gives us a different problem of independence. People may not be voting based on the politics of the author but might be voting on the basis of genuine literary qualities AND the Puppies claim that at least some of these qualities are partly associated with the politics of the author. Note this doesn’t need to be deterministic, there can be multiple exceptions (Eric Flint springs to mind) on multiple levels.
These kinds of associations (‘comorbidities’ if you want to sound like a doctor) really mess up independence. Additionally, the choices on a nominating ballot are not a set of separate events: each of your choices influences your other choices.
Autocorrelation isn’t people driving the same car (stats pun!). It is basically any circumstance where the result from one point is correlated with the result from a nearby point. A simple example is temperature – it varies from day to day but how warm it was the day before affects how warm it is today.
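To see how an association like this can tilt the results even when nobody votes on politics, here is a small sketch using Bayes’ rule. Every number in it is invented purely for illustration, and the ‘trait’ stands in for whatever literary quality voters happen to favor; none of it is measured from real Hugo data.

```python
from fractions import Fraction


def red_share_among_picked(p_red, p_trait_given_red, p_trait_given_blue):
    """Share of 'red' authors among works that have the favored trait.

    Assumes (hypothetically) that voters pick only on the trait and
    never look at the author's politics.
    """
    red_with_trait = p_red * p_trait_given_red
    blue_with_trait = (1 - p_red) * p_trait_given_blue
    return red_with_trait / (red_with_trait + blue_with_trait)


half = Fraction(1, 2)

# Made-up case: the trait is more common among 'red' authors.
print(red_share_among_picked(half, Fraction(7, 10), Fraction(3, 10)))

# No association between trait and politics: the split stays even.
print(red_share_among_picked(half, half, half))
```

In the first case 7 in 10 of the picked works come from ‘red’ authors even though the pool is split evenly and the voters ignored politics entirely. That is a skew without anybody intending one.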
In a Puppy fever dream this is the point where I’d bring in global warming 🙂 – no don’t worry no Hugo hockey sticks will appear!
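The temperature example can be sketched in a few lines. This is a toy model, not a weather model: each day’s value is a fraction of the previous day’s value plus random noise, and the `carry_over` parameter, the seed and the series length are all assumptions I have picked for illustration.

```python
import random


def simulate_temps(n, carry_over, rng):
    """Toy daily temperature anomalies.

    Each day is carry_over times yesterday plus fresh random noise.
    carry_over=0 gives fully independent days.
    """
    temps = []
    prev = 0.0
    for _ in range(n):
        prev = carry_over * prev + rng.gauss(0, 1)
        temps.append(prev)
    return temps


def lag1_autocorrelation(xs):
    """Correlation between each value and the one that follows it."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den


rng = random.Random(42)  # fixed seed so the run is repeatable
independent = simulate_temps(5000, 0.0, rng)
sticky = simulate_temps(5000, 0.8, rng)

print(round(lag1_autocorrelation(independent), 2))
print(round(lag1_autocorrelation(sticky), 2))
```

The independent series should come out with a lag-1 correlation close to zero, while the ‘sticky’ series should sit near the carry-over value. Runs of similar values are much more common in the second series, which is the point: a run is weaker evidence of bias when results are autocorrelated.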
With the Hugos we have another way in which independence breaks down. Winning a Hugo may influence an author’s chance of winning another Hugo.
Winning a Hugo necessarily increases the visibility of an author. I don’t know what effect it has on the general book reading public but it definitely makes the book (or other work) more visible to a key group: people who vote for Hugo awards. Additionally the Hugo packet (free works or excerpts of nominated work provided to voters) means that Hugo voters get to read most of everything.
How does this affect independence? In 2015 Ancillary Sword is a nominee and it is a sequel to Ancillary Justice, which won the Hugo for best novel last year. Additional Hugo voters would have read Ancillary Justice last year because of the nomination (and win) and, assuming they liked it, gone and read the sequel. Assuming it was a great novel (it was IMHO) then the sequel’s chance of getting a nomination might increase. On the other hand voters might feel that as the author (Ann Leckie) had already won a Hugo they should pick something else. Either way, the chance of Leckie’s novel being nominated this year is partly influenced by the nomination and win last year.
Why we can ignore all that and move on
Well, we sort of can’t. It is possible that some combination of these kinds of factors amounts to a source of bias in the Hugos: not the kind of conspiratorial bias that the Puppies complain of, but a kind of emergent property. However it is hard to see how this would create specific left-right bias all by itself.
If we can’t see how it can affect the result we can make an unsupported assumption – that all those issues just amount to statistical noise, i.e. they may pull things away from true randomness, but equally in all directions and so unpredictably that they amount to noise. Dave’s argument ignores them – I don’t think that is a major flaw in his argument. It does make it all much less of a slam-dunk than he and other Puppies thought though. To use a cliche – it’s complicated.
Back to the main stadium in Part 7.