On petunias and whales: part 5

In part 4 I started trying to get a better handle on Dave’s 15% estimate. I explained why the category he thinks of as “left wing” may be much larger than he imagines when considering authors as a population.

In this post I’ll try and look at how Dave Freer then models the actual results from various Hugos and what results we might have expected.

Dave starts with 2005 – unfortunately that breaks the model straight away. I assume he picked 2005 to avoid Dan Simmons’s nomination for Ilium in 2004. In fact 2005 is an excellent year to consider bias because it was a year in which the best novel nominations show an indisputable bias!

  • Susanna Clarke – Jonathan Strange & Mr Norrell [winner]
  • Ian McDonald – River of Gods
  • Iain M. Banks – The Algebraist
  • Charles Stross – Iron Sunrise
  • China Miéville – Iron Council

All of the nominees are British, Banks and McDonald are both Scottish, and yes, there are definitely some outspoken lefties in that list! Oh, and WorldCon was in Glasgow that year, the most lefty city in the most lefty country in the United Kingdom, which itself is more lefty than the US. We don’t need an analogy of pulling colored balls from a bag to realize that WorldCon 2005 was unlikely to reflect the right-left spectrum of the USA. Party politics in Scotland is currently a two-party conflict between the left-of-center Labour Party and the left-of-center Scottish National Party. At least it gives us a data point about Dave’s classification – he calls it 3 red and 2 white. As Stross, Banks and Miéville are undoubtedly red, I’ll assume he was counting Clarke and McDonald as white.

2014 and 2015 are both years in which slates and/or publisher campaigning (specifically Tor with Wheel of Time) overtly skew results.

That leaves the period of 2006 to 2013. Of those, 2007 was in Japan, 2009 in Quebec, Canada, and 2010 in Australia (2014, already set aside above, was in London). While many Americans will still have attended, all of those cons would have departed significantly from US norms. In short, Dave shouldn’t include them in his analysis because his markers (most obviously “right to concealed carry”) just don’t work the same way.

That leaves this set of results from Dave’s post:

Year  Red  White  Black
2006    3      2      0
2008    4      1      0
2011    2      3      0
2012    2      3      0
2013    3      2      0

Even this depleted set of data is still an unlikely occurrence if the unbiased chance of a ‘red’ was 15%. Five years of nominations is 25 nominations, and 15% of those comes to 3.75, i.e. at 15% we’d expect about 4 ‘red’ nominations; instead there are 14, or 56%. That puts it at the high end of (or beyond) the possible range I’d considered. So has bias been shown after all?
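As a rough check on that arithmetic, here is a minimal Python sketch. The counts are copied from the table above; the variable names are my own, and the 15% is Dave’s estimate rather than anything measured.

```python
# Red / White / Black counts per year, copied from the table above.
results = {
    2006: (3, 2, 0),
    2008: (4, 1, 0),
    2011: (2, 3, 0),
    2012: (2, 3, 0),
    2013: (3, 2, 0),
}

assumed_red_share = 0.15  # Dave's estimate of the unbiased chance of a 'red'

# Every nomination in every year, whatever its colour.
total_nominations = sum(sum(row) for row in results.values())
# The first entry in each row is the 'red' count.
observed_red = sum(row[0] for row in results.values())

expected_red = assumed_red_share * total_nominations
observed_share = observed_red / total_nominations

print(f"nominations:    {total_nominations}")
print(f"expected reds:  {expected_red:.2f}")
print(f"observed reds:  {observed_red}")
print(f"observed share: {observed_share:.0%}")
```

The gap between the expected and observed counts is the thing that needs explaining; on its own it doesn’t say how likely such a gap is by chance.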

Not so quick! After all, it could be just chance that it is a big result, and luckily we have a way of judging. This time, instead of trying to work out what the magic “red” percentage is, we can look at what percentages would make a result of 14 red out of 25 too unlikely to be just chance. To do this we can use the binomial distribution. This distribution allows somebody to model what occurs when a fixed number of independent random trials, each with the same probability of success, take place. In Excel you can use the “BINOMDIST” function to generate values for this based on various probabilities. For example, if you had a run of 8 heads out of ten tosses of a coin you can use the binomial distribution to work out the chance of that occurring. Better yet, you can use it to work out the chance of getting 8 or more heads by chance.
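For anyone without a spreadsheet to hand, the coin example can be sketched in a few lines of Python. This is only an illustration using the standard library; the function names binomial_pmf and binomial_tail are my own and not part of any package, and a fair coin (probability 0.5) is assumed.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Chance of exactly k successes in n independent trials of probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_tail(k, n, p):
    """Chance of k or more successes in n independent trials of probability p."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))


# Ten tosses of a fair coin.
exactly_8 = binomial_pmf(8, 10, 0.5)
at_least_8 = binomial_tail(8, 10, 0.5)

print(f"exactly 8 heads: {exactly_8:.4f}")   # 45/1024, about 0.0439
print(f"8 or more heads: {at_least_8:.4f}")  # 56/1024, about 0.0547
```

In Excel terms these should correspond to BINOMDIST(8, 10, 0.5, FALSE) and 1 - BINOMDIST(7, 10, 0.5, TRUE) respectively.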

I’ve used a spreadsheet to work out the chance of 14 or more ‘red’ nominees for different values of the ‘red’ probability – starting at 15% and working up to 60% in increments of 5 percentage points. Now note that I’m working out the chance of 14 or more. This is important because the remarkable thing is not that there were exactly 14 reds but that the number was so large (at least according to Dave). So 15 reds, 16 reds etc. would be even more remarkable. So 14 or more.

So what is so unlikely that it counts as TOO unlikely? Well that is a matter of opinion – 1% makes it easy to not reject the ‘chance’ hypothesis, 5% makes it harder, 10% harder still. There isn’t a fixed answer for this question but 10% is generous to Dave’s hypothesis (i.e. it makes it easier for us to reject ‘unbiased chance’ and to accept ‘bias’ as an explanation).

So in order:
If the proportion of ‘reds’ we use is 15%, the chance of 14 or more reds out of 25 is effectively zero. The same is true for 20%.
At 25% the chance comes to 0.1% – it at least shows on the spreadsheet to 1 decimal place but is still very unlikely.
At 30% the chance is 0.6%.
At 35% the chance is 2.5% – at a significance level of 1% we wouldn’t reject ‘chance’.
At 40% the chance is 7.8% – we’ve passed the 5% significance level.
At 45% the chance is 18.3% – safely over 10%.
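The figures above came from a spreadsheet, but the same calculation can be sketched in Python for anyone who wants to check them. This is my own reconstruction, not the original spreadsheet: the helper name chance_of_at_least and the constants are hypothetical, and it assumes each of the 25 nominations is an independent draw with the same ‘red’ probability.

```python
from math import comb


def chance_of_at_least(k, n, p):
    """Chance of k or more successes in n independent trials of probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_NOMINATIONS = 25  # five years of five nominees
N_RED = 14          # nominees Dave counts as 'red'

# 'Red' probabilities from 15% to 60% in steps of 5 percentage points.
tail_by_percent = {
    pct: chance_of_at_least(N_RED, N_NOMINATIONS, pct / 100)
    for pct in range(15, 65, 5)
}

for pct, tail in tail_by_percent.items():
    print(f"{pct}% red: chance of {N_RED}+ out of {N_NOMINATIONS} is {tail:.1%}")
```

Each printed value can then be compared against whichever threshold (1%, 5% or 10%) you have decided counts as too unlikely.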

Effectively if 45% is a reasonable estimate of the proportion of Hugo eligible authors that Dave would find to be ‘red’ if they were nominated for a Hugo *and* that reflected the US population (all of which is very plausible) – then the results over those years are a bit unlikely but not very unlikely.

Remember that the proportion of Americans that support same sex marriage is 54% (at least) and the proportion that support affirmative action is 63%.

John Scalzi is a Hugo award winner much disliked by some Puppies and often presented as being a leftist. He describes his views on politics in general here: http://whatever.scalzi.com/2015/05/13/reader-request-week-2015-6-me-and-republicans/

I think it is notable how well Scalzi matches the profile that the stats and data suggest:

  • Not actually that left wing
  • Perceived as being outspokenly left wing by Puppies
  • The “left wing” issues on which he has been outspoken have been ‘identity’ politics and issues such as same-sex marriage
  • These are issues that are actually not that left wing – they are more not-“Steadfast conservative”

So we have a ‘red’ author who actually is close to the middle – i.e. closer to the 50th percentile with a chunky 50% of the population being to the LEFT of him. Well one data point isn’t proof – he may just be weird.