Science journalist Angela Saini’s third book Superior: the Return of Race Science is a very timely survey of the history and contemporary impact of the attempts to use science to prop up racism and beliefs about race.
From Carl Linnaeus to the sinister Pioneer Fund, Saini maps the shifts both in actual understanding and the layers of post-hoc rationalisations for prejudices. She does this with minimal (but appropriate) editorialising and instead lets the views of a very wide range of interviewees inform the reader about how views have shifted or, in some cases, stubbornly refused to shift.
Much of it covered topics and personalities I was already familiar with, and if you have read books like Stephen Jay Gould's The Mismeasure of Man, then you'll know a lot of the background. However, Saini takes a broader survey and branches out into topics like the misguided but often well-intentioned use of race in prescription medicines. I found that the sections covering areas I already knew well were both interesting and provided good insights, although I obviously got more value out of the sections on topics I was less aware of.
Saini also charts recent events such as the rise of the alt-right, the renewed ideological racism in populist governments (in particular Trump’s America but also Modi’s Hindu nationalism) and demonstrates how the 18th century obsession with race is connected to modern concerns and pseudoscience.
The people-centred approach of the book gives it a very human quality. Saini has a knack for humanising many of the protagonists without excusing or apologising for their mistakes or (in many cases) their bigotry. Rather, by focusing on the individuals, her approach highlights their motives and, in the case of many of the scientists involved, how they managed to fool themselves into thinking they had transcended their own prejudices and found objective truths, when they had really discovered convoluted ways of having their own biased assumptions echoed back to them.
I listened to the audiobook version, which is narrated by Saini herself. I highly recommend this book, both for the insights she gives on the topic and as an example of excellent modern science writing.
Heading off on a tangent from my last tangent. In my previous post I talked about ‘windows’ for authors and the Hugo Awards. Having done some number crunching for that post I thought I’d explain a bit more about the numbers etc.
A follow-up to yesterday's post. One rabbit-hole I had to stop myself running down was Eric Flint's 2015 post THE DIVERGENCE BETWEEN POPULARITY AND AWARDS IN FANTASY AND SCIENCE FICTION. Eric Flint, often cast as the token left-winger of Baen's stable, trod a difficult line during the Debarkle, with many of his colleagues or professional collaborators (e.g. Dave Freer) very much advocating the Sad Puppy line. Flint's overall position could be described as conceding that there was some sort of issue with the Hugo Awards but disagreeing with the Sad Puppies' tactics and rhetoric and with their diagnosis of the underlying causes of the problem.
Flint’s diagnosis of the issue is explained in the post I linked to and can be summarised by this proposition:
“the Hugos (and other major F&SF awards) have drifted away over the past thirty years from the tastes and opinions of the mass audience”
This was not a post-hoc reaction to the Debarkle but a view he had held for several years:
Here’s the history: Back in 2007, I wound up (I can’t remember how it got started) engaging in a long email exchange with Greg Benford over the subject of SF awards. Both of us had gotten a little exasperated over the situation, which is closely tied to the issue of how often different authors get reviewed in major F&SF magazines.
[some punctuation characters have been cleaned up -CF]
Flint goes on to describe the issues he had trying to substantiate the feeling. He acknowledges that the basic problem with any simple analysis to corroborate his impression is that sales data is not readily available or tractable. He goes on to attempt to address that deficit of data in other ways. However, regardless of his method (how much space book stores dedicate to given writers), his approach only addresses one part of what is actually a two-part claim:
1. There is a current disparity between the popularity of authors and the recognition of authors in the Hugo Awards.
2. Thirty years ago this was not the case (or was substantially less so).
Now, I have even less access to sales data than Flint, and publishing has changed even further since 2015. Nor do I have any way of travelling back to 1985 (or 1977) to compare book stores then with the Hugo Awards. Flint's claim is far too subject to impressions and confirmation bias to really get a handle on. I could counter Flint's more anecdotal evidence of then-current big genre sellers unrecognised by the Hugo Awards with examples from 1985. An obvious one would be Jean M. Auel, whose Clan of the Cave Bear series was selling bucketloads in the early '80s and beyond (The Mammoth Hunters would have been cluttering up book stores in 1985). A more high-brow megaseller from 1985 would be Carl Sagan and Ann Druyan's Contact, which, again, did not make it onto the Hugo list of finalists. Yet these counter-examples lack bite: the Hugos missing a couple of books doesn't demonstrate that Flint's impression is wrong, even if it helps demonstrate that his evidence for the current disparity (as of 2015, or 2007*) is weak.
However, Flint does go on to make a different kind of argument by using the example of Orson Scott Card:
"With the last figure in the group, of course, Orson Scott Card, we find ourselves in the presence of a major award-winner. Card has been nominated for sixteen Hugo awards and won four times, and he was nominated for a Nebula on nine occasions and won twice. And he was nominated for a World Fantasy Award three times and won it once. But… He hasn't been nominated for a WFC in twenty years, he hasn't been nominated for a Nebula in eighteen years, and hasn't been nominated for a Hugo in sixteen years. And he hasn't won any major award (for a piece of fiction) in twenty years. This is not because his career ended twenty years ago. To the contrary, Card continues to be one of our field's active and popular authors. What's really happened is that the ground shifted out from under him – not as far as the public is concerned, but as far as the in-crowds are concerned. So, what you're really seeing with Orson Scott Card's very impressive looking track record is mostly part of the archaeology of our field, not its current situation. As we'll see in a moment, the situation is even more extreme with Anne McCaffrey and almost as bad with George R.R. Martin."
[some punctuation characters have been cleaned up -CF]
Well, this is more tractable. We can track authors over time through the Hugo Awards and look at what we might call 'windows' in which they receive nominations. So that's what I did. I grabbed a list of Hugo finalists for the story categories (novel, novella, novelette, short story), put them in a big spreadsheet, cleaned up all sorts of things as per usual, and went to have a look.
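The span calculation itself is simple once the finalist list is in a table. A minimal sketch with invented stand-in data (the real spreadsheet isn't reproduced here):

```python
from collections import defaultdict

# Invented miniature of the finalist list: (author, year as finalist).
appearances = [
    ("Author A", 1980), ("Author A", 1985), ("Author A", 1994),
    ("Author B", 2020),
    ("Author C", 1953), ("Author C", 1960),
]

# Collect each author's finalist years.
years = defaultdict(list)
for author, year in appearances:
    years[author].append(year)

# Span = last finalist year minus first finalist year, per author.
spans = {a: max(ys) - min(ys) for a, ys in years.items()}
print(spans)  # {'Author A': 14, 'Author B': 0, 'Author C': 7}
```

Note that a one-time finalist ("Author B") gets a span of zero by definition, which matters for the distribution below.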
I'll save a lot of the data for another post. There are two big issues with looking at the data over time. The first is that there are built-in patterns that show changes over time arising just from how the data is collected. Back in 1953, a Hugo finalist could only possibly have been a finalist that once. Likewise, a first-time Hugo finalist in 2020 has a hard limit on the span of years between their first and last Hugo nomination.
A different issue is exemplified by this grouping of the data, where the span of years is the difference between the first year an author was a Hugo finalist and the last year.
Span of years groupings: 1 to 5, 6 to 10, 11 to 15, 16 to 20, 21 to 25, 26 to 30, 31 to 35, 36 to 40.
fee-fi-fo-fum I smell the blood of a power-law distributi-um
More than half of the data set are one-hit wonders, partly because everybody's first go as a finalist is a one-hit wonder until they get their next one. That's quite a healthy sign IMHO, but I digress. 70% of the authors are in the 0 to 5 year span, but there are a small number of authors with large spans of nominations, the top two being George R.R. Martin and Isaac Asimov (38 years and 36 years). This kind of data is not summarised well by arithmetic means.
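To see why the arithmetic mean misleads here, compare it with the median on a made-up distribution shaped like this one: a pile of one-hit wonders plus a couple of Martin/Asimov-scale outliers (all numbers invented for illustration):

```python
import statistics

# Made-up spans: a majority of zero-span one-hit wonders,
# a middle of modest spans, and a couple of huge outliers.
spans = [0] * 50 + [2] * 20 + [8] * 15 + [14] * 10 + [24] * 3 + [36, 38]

# The mean gets dragged upward by the outliers; the median stays
# close to the "typical" author.
print(f"mean:   {statistics.mean(spans):.2f} years")   # 4.46
print(f"median: {statistics.median(spans):.1f} years")  # 1.0
```

The typical author here has a span of about a year, yet the mean suggests four and a half: exactly the distortion a power-law-ish tail produces.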
I’ll save some of the geekier aspects for another time. Is there a shift in some of these spans recently? Maybe but both the structural issues with the data and (ironically) the Debarkle itself make it hard to spot.
What we can do though is look at specific cases and Orson Scott Card is a great example. He’s great because he undeniably fell out of favour with people by being an enormous arse and we can corroborate that externally from this data set. However! EVEN GIVEN THAT the table of groupings I posted shows us something that severely undermines Flint’s point.
Card's Hugo span (last year as finalist minus first year as finalist) is 14 years. That puts him in the top 14% of writers by Hugo span. Card has been very far from being short-changed compared to other authors. These are his 14-year-span companions:
C. M. Kornbluth
Joan D. Vinge
Orson Scott Card
Robert J. Sawyer
Note that the group spans multiple decades. The broader 11-15 group includes writers like Frank Herbert, China Miéville, C. M. Kornbluth, Philip K. Dick, and John Scalzi. Now Miéville and Scalzi might still extend their spans (as might Card, but probably not).
Flint goes on to suggest that awards get more literary over time, and maybe they do, but looking at the data I think Flint is sort of seeing a phenomenon while misreading what it is.
I would suggest instead that Awards favour a sweet-spot of novelty. A work that is too out-there won’t garner enough support quickly enough to win awards. A work that is too like stuff people have seen before isn’t going to win awards either — almost by definition, if we are saying ‘this book is notable’ it has to stand out from other books. For the Sad Puppies or even the LMBPN Nebula slate, this was apparent in works that struggled to differentiate themselves from other stories in an anthology or another book in a series. Jim Butcher’s Skin Game (to pick a Debarkle example) was just another book in his long running series and not even a particularly good episode.
The same applies to some degree for authors. I am not saying John Scalzi will never win another Hugo Award but I don’t expect him to even though I think he’ll be writing good, entertaining sci-fi for many years. This is not because he’s not sufficiently left-wing for current Hugo voters but because we’ve read lots of John Scalzi now and sort of know what to expect.
A future equivalent of Eric Flint in 2036 may look back to 2006 and say “Back in the day the Hugos used to reward popular authors like John Scalzi. Look at the virtual-cyber shelf on Googlazon and you’ll see rows of Scalzi books up to his latest ‘Collapsing Old Red Shirt 23: Yogurt’s Revenge’ – why don’t the Hugo’s give him rockets any more!”**
The Hugos move on, it is true, but they have repeatedly picked out not exactly brand-new talent but authors at a sweet spot in their careers. Yes, some have much longer Hugo spans, but they are unusual: many are the sci-fi giants of yore and others are people with long gaps between nominations.
Card actually had a good run, but even without his more giant-arsehole-like antics, it is very unlikely that he would have got a Hugo nomination any time soon. Note, for example, that Card has not yet been a Dragon Award finalist, despite having eligible novels and despite the Dragons (championed by Flint) supposedly addressing the popularity issue.
*[Or 2020, as I don’t think Flint has said everything is fine now.]
**[I suspect future John Scalzi will be more inventive than just rehashing his former hits but also I think he’d actually be quite brilliant at writing a parody pastiche of his own work.]
In the comments to the previous post on this topic, Johan P raised some really interesting points. I'd said rather glibly that the categories with more subscribers will obviously have more free downloads and sales. As Johan points out, this is counter-intuitive, as the figures given are AVERAGES, i.e. (I assume) the number of downloads/sales per book rather than the total number of downloads or sales in those categories. However, it really is true that the bigger categories have bigger downloads/sales; I just hadn't explained it properly, and I did use misleading terms like 'crowded'.
The graph plots the totals of free downloads + discounted book sales (horizontal axis) against the number of subscribers. The relationship is quite strong. I plotted a line of best fit courtesy of Excel. Now, a linear relationship is probably not the best way of describing the data. I assume that underneath all of this is some sort of power-law type thing going on with sales (i.e. some books sell HUGE amounts and shape the averages accordingly). How that all plays out when comparing subscribers to sales would require more detailed data than we have. Even so, the line gives us something to compare the data we do have against, and an r-squared of 74% is enough to justify my claim that more subscribers = more downloads/sales as a broad statement.
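For what it's worth, neither the best-fit line nor the r-squared needs Excel. Here's a minimal sketch of the same calculation, with made-up numbers standing in for the BookBub genre figures (the real data isn't reproduced here):

```python
# Made-up stand-ins for (downloads + sales, subscribers) per genre.
xs = [1000, 2000, 3000, 4000, 5000, 8000]
ys = [0.9e6, 1.8e6, 3.3e6, 3.9e6, 5.5e6, 7.6e6]

# Ordinary least-squares fit: y = slope * x + intercept.
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

# r-squared = 1 - (residual sum of squares / total sum of squares).
ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - my) ** 2 for y in ys)
r2 = 1 - ss_res / ss_tot
print(f"slope={slope:.1f}, r^2={r2:.2f}")
```

With these invented, fairly linear numbers the r-squared comes out high; the real 74% figure is from the actual BookBub data.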
Flipping this round, we get a different way of looking at the data: which genres deviate most from that line, and in which direction? If I'm right and the sales figures are distorted by bestsellers, then a newbie author should steer clear of the genres ABOVE the line, because those genres have more subscribers than we would predict from the number of downloads/sales. Genres below the line have more sales/downloads than we would predict from the number of subscribers, and that sounds like a better bet, or at least those averages may be closer to a 'typical' value rather than a distorted average.
Here’s a similar graph but this time looking at sales only and unfortunately done using Apple’s Numbers spreadsheet rather than Excel:
There are many ways we can quantify how much a data point deviates from that line, but within the limits of the tools on this laptop, I'm just going to find the difference between the actual number of subscribers and the number predicted by the equation of the line. Negative is better here, I think, but I've sailed off into generating numbers whose meaning is unclear. I *think* that the genres near the top are less impacted by a few bestsellers and the genres near the bottom are more impacted, but… I wouldn't swear to that and I'm just guessing.
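That deviation measure is just the residual from the fitted line. A quick sketch with hypothetical fit coefficients and made-up genre figures (none of these numbers are the real ones):

```python
# Hypothetical fitted line from the subscribers-vs-sales chart.
slope, intercept = 950.0, 120_000.0

# (genre, free downloads + sales, actual subscribers) - invented figures.
genres = [
    ("Genre A", 2000, 2_500_000),
    ("Genre B", 4000, 3_400_000),
]

# Residual = actual subscribers minus subscribers predicted by the line.
residuals = {}
for name, sales, subscribers in genres:
    predicted = slope * sales + intercept
    residuals[name] = subscribers - predicted

for name, r in residuals.items():
    side = "above the line" if r > 0 else "below the line"
    print(f"{name}: residual {r:+,.0f} ({side})")
```

A positive residual puts a genre above the line (more subscribers than the sales predict); a negative one puts it below.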
Some major caveats before we go on. Firstly, these figures are for marketing purposes and, as they say, "averages are based on historical data, but are only meant as a reference and are not guaranteed". The figures also only apply to free downloads and discounted book sales. Lastly, these are BookBub's numbers, and other book retailers may show different patterns.
A broader caveat to add when considering any kind of average sales within books (or other media) is the dreaded power-law distribution. A small number of books account for a large number of sales and, conversely, a large number of books have small sales individually but account for a lot of sales together. The arithmetic mean has many flaws, but it is particularly flawed in such circumstances. One huge hit (e.g. The Da Vinci Code) will have an outsized impact on the average book sales even if other books are selling poorly.
This is an edited version of three Twitter rants from yesterday. It started as an off-the-cuff reaction, but I was too far into it before I realised that it should be a blog post rather than tweets.
Steven Pinker tweeted out a very weird bit of science theatre created by Michael Shermer.
Pinker has enough critical thinking skills that he should look at it with hefty scepticism… but obviously didn't. It's pretend science, play-acting at science to refute what is obvious while ignoring the core issues.
Each and every one of the people surveyed is a public figure who has made multiple public statements about politics and social issues. I don't need an anonymous survey to find out what Andy Ngo or Sam Harris thinks; I can go and read what they say. And it is what they SAY that matters and what defines the IDW term, not what they might privately think. If Sam Harris thinks he has warm & fuzzy liberal beliefs, that's nice, but the whole point of the "dark web" label was the contrarian issues he promotes. Maybe Ben Shapiro secretly believes global warming is real and climate change is caused by humans. I don't know, but what matters is that he propagandises the opposite. If an anonymous survey of the 34 "Intellectual Dark" Webbers reveals that their underlying views are more centrist and mainstream, then that is not evidence that the public perception of their public positions is wrong. Rather, it confirms a key point about the IDW.
The fundamental issue with the disparate group lumped together as the Intellectual Dark Web is that they are DISINGENUOUS about their politics. It's not news that Jordan Peterson thinks of himself as moderate and reasonable. We knew that already. It doesn't change that he (and Harris & Shapiro & Ngo & Quillette) frame and enable a perspective that bolsters the far right. The whole "we are the reasonable ones" act is part of the schtick of the IDW. That they'll boost it in an anonymous survey is, frankly, wank.
Let's be sceptical, as I'm sure Dr Pinker and Shermer would want us to be. Let's take one conclusion Pinker draws from the survey: the members of the IDW are "concerned w climate". Let's look at the survey. The survey agrees: "67% strongly agreed that global warming is caused by human actions (no one strongly disagreed)". So there you go! Hoorah! No, no, let us be sceptical first. If this was GENUINELY true, would it not be easily observed?
To the empiricism-mobile! Here's the output of the Quillette Climate tag https://quillette.com/tag/climate/ zoiks! A hefty TWO articles, one concern-trolling Greta Thunberg and the other saying people shouldn't be mean to capitalism. Yes, Quillette is just one source, but it is one that connects Steven Pinker on the one hand (who we can observe genuinely does advocate for action on global warming) with Andy Ngo on the other (who genuinely does have connections with the alt-right and violent far-right groups) via Claire Lehmann (Quillette's founder, fan of Pinker and one-time boss of Ngo).
Yes, Steven Pinker himself has a better record on the topic of global warming, but the issue he raised was to look collectively at the IDW and their media organs. Broadly, this is not a group trying to do very much to help with the issue. And wow, think of the actual good the IDW could achieve given their actual audience. Whatever they may think of themselves, collectively they do have the ear of many on the right – exactly where climate change denial and bad science on the topic are endemic. You'd think these outspoken people might be busy being outspoken about a potential planet-wide disaster.
It gets worse. The actual sample was only 18, not 34 people. Nearly half of the 34 didn't answer. So when the survey says "67%" (the percentage favouring gun control, and also the percentage believing global warming is real) it actually means "12 people". That's both more plausible and more wretched. Even if we accept that 12 of those IDWs think climate change is real, it says almost nothing about the group. Any one member of the original 34 people is a hefty 3% of the population being sampled, and hence missing any one of them can have a large impact on the results. This is particularly true given that we already know that the label of "Intellectual Dark Web" is being attached to a group with a very broad range of views on many topics.
Shermer is assuming non-response to the survey is random across the traits being surveyed (i.e. that the 18 are a random sample of the 34). There is no reason to believe that, and really anybody who wants to seriously call themselves a sceptic should dismiss any general conclusion from the survey without substantial additional supporting evidence.
Indeed, there's good reason to assume that the 18 who responded are not a good random sample of the 34, just from the nature of the numbers. It is very hard with small numbers in a survey for the sample to be representative, because one person makes a big difference. Shermer hides that by quoting percentages rather than raw totals, but with small numbers percentages hide how few people he's talking about. It's not invalid to look at proportions with small sample sizes – sometimes that is all you have – but there's a point where "12 out of 18" is more informative than "67%".
We can illustrate the issue with the women who were surveyed. Of the 34 named people associated with the "Intellectual Dark Web" in the survey, 8 (24%) are women. Of the 18 who responded, 3 (17%) are women. So are the IDW 17% women (generalising from the survey) or 24%? Obviously 24% is the correct figure, but 17% is the equivalent of the kind of survey conclusion Shermer presents. In fact, any one woman listed is 13% of the IDW women, so one more woman answering makes a huge difference to the sub-sample of women. Any one person is 6% of the whole sample of 18 people!
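The arithmetic on the women in the sample is worth spelling out, including how far one extra respondent moves the figure:

```python
# Figures from the post: 8 of the 34 named IDW members are women,
# but only 3 of the 18 survey respondents were.
women_named, total_named = 8, 34
women_responded, total_responded = 3, 18

print(f"women among the named 34:       {women_named / total_named:.0%}")
print(f"women among the 18 respondents: {women_responded / total_responded:.0%}")

# One extra woman answering moves the survey figure a long way:
print(f"with one more respondent:       {(women_responded + 1) / (total_responded + 1):.0%}")
```

That's 24% versus 17%, and a single additional respondent shifts the survey figure to 21% — swings that would be negligible in a decently sized sample.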
Circling back to the 67% claim. Again, assuming everybody who responded was being honest (which I doubt), the survey actually found that 12 of the 34 people who were asked believed in gun control, and the same number believed that global warming is real (which, I'll add, isn't saying much: some prominent sceptics will say global warming is real, just as many anti-vaccination campaigners will say they support vaccinations – it is the 'but' that follows where the issues lie). That might mean 67% or thereabouts of the 34 believe in gun control, but a safer conclusion is that no fewer than 35% do (12/34) and no more than 82% (28/34). Given how granular this data is, hoping the true figure is in the middle isn't supported.
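The no-fewer/no-more bounds are just the two extreme assumptions about the 16 people who didn't answer:

```python
agreed, respondents, named = 12, 18, 34

observed = agreed / respondents                   # what the survey reports
lower = agreed / named                            # every non-respondent disagrees
upper = (agreed + (named - respondents)) / named  # every non-respondent agrees

print(f"reported:                {observed:.0%}")
print(f"safe bounds for all 34:  {lower:.0%} to {upper:.0%}")
```

So the honest statement is "somewhere between 35% and 82% of the 34", which is a very different claim from "67% of the IDW".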
This is why I call it theatre. It is the wrong methodology applied badly, and it illustrates methodological snobbery. Synthesising the complex views of a small group of people is exactly where qualitative methods work better. It is a domain where you need to put on your humanities hat and apply those humanities skills. Shermer is using sciencey flim-flam by conducting a pointlessly anonymous survey and presenting the results as percentages as if they were proportions of the whole group.
Don't get me wrong, I absolutely LOVE applying basic quantitative methods to things and places where they don't always make sense. It's very much my hobby, but even on this less-than-100%-serious blog I'd throw more caveats at better numbers than Shermer is using.
Stephen Jay Gould is a voice that is missed in today’s world. Smart, compassionate and analytical but also with a deft capacity to write about complex ideas in an engaging way. In The Mismeasure of Man Gould stepped out of his main field of paleontology and looked at the history of attempts to measure intelligence and the racist assumptions that have run through those attempts. This is the 1981 edition which doesn’t have the chapters on The Bell Curve but still a worthy read.
Is it perfect? No, but then a popular account of a broad area of research necessarily simplifies and skips over some details. As a gateway into understanding the issues, there is no better book that I'm aware of.
There are a host of different patterns in those graphs – note these are my observations, not those of Ersatz Culture. Some awards are more volatile than others and, of course, some awards are very recent. Overall, there has been the shift already noted.
Young Adult awards have been more favourable to women. Fantasy awards have tended to be more favourable to women also. Any shift in a generic award towards YA or fantasy therefore might also lead to a shift towards women.
New writer awards (the former-Campbell Award, Locus Best First Novel) have often had a better split (not always a good split) than other awards in the same year. That is interesting as they might be a leading indicator of future award demographics in these awards.
This data is a snapshot and right now the list is naturally dominated by Margaret Atwood’s sequel to The Handmaid’s Tale, so the list contains a lot of different versions of both (print version, audio version, Kindle version etc). It’s also very Amazon with some popular-in-Kindle-unlimited works further down the ranks.
I took the top 100 listed and then did a few things to the data. Firstly, I deleted multiple versions of a work, which will add a bit of bias to the data by understating the impact of the biggest sellers. I then classified authors based on name, pronouns, and bios as male, female, non-binary or both (in the case of dual authors). I didn't identify any authors for the non-binary category. One author name was a joint authorship of a man & woman and was counted as "both". That took the initial 100 rows down to 84 rows.
I then duplicated that data set and in the second version I deleted multiple works by an author leaving only the highest ranked work from the Amazon list. This was done so a single author wasn’t double counted (or n-tuple counted in the case of J.K.Rowling) but the process reduces the success of authors like Rowling or Stephen King. That took the number of rows down to 55.
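The second-pass deduplication can be sketched like this (the rows are hypothetical stand-ins; the real clean-up, including removing duplicate editions, was done by hand as described above):

```python
# Stand-in for the cleaned list: (rank, title, author), sorted by rank,
# with duplicate editions already removed.
works = [
    (1, "The Testaments", "Margaret Atwood"),
    (3, "Book X", "Author B"),
    (4, "Book Y", "Author B"),
]

# Keep only each author's highest-ranked work, so nobody is
# double counted (or n-tuple counted).
seen, top_work = set(), []
for rank, title, author in works:
    if author not in seen:
        seen.add(author)
        top_work.append((rank, title, author))

print(top_work)
```

Because the rows are already in rank order, the first work seen for each author is automatically their highest-ranked one.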
The results are delightfully ambiguous with enough contrary results to please multiple readings.
All Works: counts by author gender of the 85 books in the SFF Amazon bestsellers.
Top Works: counts by author of the 55 books by unique authors in the SFF Amazon bestsellers.
All Works Top 50: counts by author ranked 50 or better out of the 85 (36 books).
Top Work Top 50: counts by authors ranked 50 or better out of the 55 (24 books).
Looking at just works ranked 25 or better results in a figure more consistent between the two sets of data.
I had some weird conversations yesterday about Dragon Award stats. One was a brilliant take down of my figure that 10 men out of 10 had won Dragon Awards from 2016 in the two headline categories. Aha! Four years and two categories is only EIGHT! Yeah but it really is ten men. James S A Corey is actually two people and, even harder to believe, apparently John Ringo and Larry Correia are different. Mind you…if I only count Larry Correia once (because he is the same person whichever year he’s in) then it is back to 8 again…You’ll note that however we count it the answer comes out the same: 100% have gone to men in the two headline categories.
The discussion does raise a relevant point about why statistics is hard. Even a basic stat like a count of how many out of how many requires engaging your brain and thinking carefully about what you are counting. It was suggested that I should have said 10 men out of 8 awards… which I guess makes it clearer what was being counted but is horrible arithmetically. It looks like "10 out of 8", i.e. 125%, which is nonsense because we are dividing two different things and creating a derived unit of men per award.
To round off that previous gender post here is an equivalent graph of winners by gender in the book category:
Like the graph in the previous post of finalists, I’m using counts by gender which reduces the gender disparity by only counting two joint authors of the same gender as 1 but two joint authors of different genders as 1 each per gender. Same caveats about gender as a binary classification apply as with the earlier post.
Worst year was 2017 which was also peak Rabid Puppy influence.
A couple of related conceptual questions have come up. I was asked elsewhere what the chance was of so many authors on Brad's list winning. A different question with the same kind of issue was asked by James Pyles: basically, what was the chance of N.K. Jemisin winning a Hugo three times in a row?
Neither question can easily be answered, and they sort of miss the point of the kind of comparisons against chance you might do with gender. With Brad's list, these were people who were plausible winners; the outcome wasn't surprising. There's no expectation that the result of an award is a random event when looking at individuals – the same is true with Jemisin. We could say, well, there are 7 billion people on Earth and one winner, so the chance is 1/7 billion, and the chance of winning three times is (1/7 billion)^3, and then conclude that everything is impossible, but the comparison is silly.
Comparing with chance is there to test a kind of hypothesis: specifically whether the result is plausibly the result of chance. If the probability is tiny then we can reject that it happened by chance. We already know that somebody winning a Dragon or a Hugo isn’t by chance because names aren’t picked out of a hat.
So why compare gender of winners to chance events if we know winning isn’t a chance event? Good question. Because, we are testing another level of hypothesis. With gender, the hypothesis could be stated as ‘gender is an irrelevant variable with regard to winning award X’.
Consider this: imagine if all Dragon (or Hugo) winners were born on a Tuesday. That would be remarkable. Day of the week surely isn't connected to whether you win an award or not! We might reasonably expect only one-seventh of winners to be born on a Tuesday. We might do extra research to see whether, across all people, day-of-the-week of birth is evenly distributed. We might fine-tune that further and consider only English speakers or only Americans etc. The point being that if day-of-the-week departed from chance, then we would reject the idea that day-of-the-week is irrelevant.
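To put a number on why an all-Tuesday result would be so remarkable under the 'day of the week is irrelevant' hypothesis:

```python
# If birth weekday were irrelevant (and uniformly distributed), the chance
# that N independent winners were ALL born on a Tuesday is (1/7) ** N.
for n in (5, 10, 20):
    p = (1 / 7) ** n
    print(f"{n} winners all Tuesday-born: {p:.2e}")
```

Even at five winners the probability is tiny, which is exactly the logic behind rejecting "this variable is irrelevant" when the observed pattern departs that far from chance.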
If we did find that, it wouldn’t tell us why or how day-of-the-week was relevant. One response I’ve seen to producing gender stats is people saying that they don’t pay attention to author’s gender when voting. Even if we ignore subconscious influences and take that at face value, all that does is remove one possible cause of a gender disparity, it doesn’t make the gender disparity go away.
Another response is that looking at gender stats is ‘politics’. Well, yes, it is but it is relevant even if we otherwise lived in a gender neutral utopia. Again, imagine if Tuesday-born people won far more sci-fi awards than other people — that would be fascinating even though we don’t live in a world of Tuesday-privilege.