ngramming across the universe

…only going forward because we cannot find reverse.

I’ve gone a bit n-gram crazy today – as can happen. I thought I could test the question of whether the Hugo Awards have lost relevance or importance by graphing the term “Hugo Award” over time with the n-gram viewer. Now there is a little extra trick you can do which is to graph a term in one corpus and in another at the same time by using special syntax. I’m going to use the English 2012 corpus (books written in English digitized up to 2012 – although it only shows up to 2008) but also the English Fiction 2012 corpus (fiction books in English).

The syntax is Hugo Award:eng_2012,Hugo Award:eng_fiction_2012


Which is a graph with some sort of story behind it.
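For anyone wanting to compare more than two corpora, the term:corpus syntax above is easy to build up programmatically. This is only a sketch: the helper name `corpus_query` is my own invention, and the only corpus identifiers I have used are the two quoted above.

```python
def corpus_query(term, corpora):
    """Join one search term across several corpora using term:corpus syntax."""
    return ",".join(f"{term}:{corpus}" for corpus in corpora)


# The two corpus identifiers are the ones used in the post.
query = corpus_query("Hugo Award", ["eng_2012", "eng_fiction_2012"])
print(query)
```

The resulting string is what you would paste into the n-gram viewer's search box.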

A couple of things. The data is normalized over time, so figures represent percentages of text from that year. As the English 2012 is a big corpus the term “Hugo Award” will be a smaller percentage in the bigger corpus than in just the fiction.

However what the graph does show is that the term “Hugo Award” has a general upward trend when looked at in books in general but quite a different pattern when looked at purely in fiction.

Why? Well I personally have no idea but I assume it relates to the extent to which published fiction (including anthologies and magazines) will have included the term “Hugo Award”. That seems to have peaked from 1978 to 1982.
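The point about normalization can be shown with a tiny bit of arithmetic. The counts below are made up purely for illustration (they are not real n-gram data), and I am assuming the percentage is simply occurrences of the term divided by all text counted for that year.

```python
def relative_frequency(term_count, total_count):
    """Share of a year's text taken up by the term, as a percentage."""
    return 100.0 * term_count / total_count


# Hypothetical counts for a single year (illustrative only, not real data).
all_books = {"term": 50, "total": 10_000_000}
fiction_only = {"term": 40, "total": 1_000_000}

pct_all = relative_frequency(all_books["term"], all_books["total"])
pct_fiction = relative_frequency(fiction_only["term"], fiction_only["total"])

print(f"all books: {pct_all:.4f}%  fiction only: {pct_fiction:.4f}%")
```

Even though the term appears more often in absolute terms in the bigger corpus, it makes up a smaller share of it, which is why the two lines on the graph sit at such different heights.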

The Seven Cardinal Sins – a puppy summary

I haven’t reviewed everything that was nominated but I have read everything and read multiple reviews. I thought this was a good time to look retrospectively at what was wrong with the nominated works (not including best dramatic categories, editorial categories, fan categories or artists).

The Catholic catechism traditionally identifies seven capital sins: pride, avarice (greed), envy, wrath, lust, gluttony, and sloth (or acedia i.e. neglect).

In terms of the Puppy campaigns these traditional sins aren’t a great match in their entirety. Lust in particular doesn’t make much of an appearance – if anything the overall attitude to sex has been almost puritanical. The one I would pick out is sloth. I think there is some obvious evidence of laziness in the compilation of the slates. They appear rushed and contain obvious omissions – Soft Casualty by Michael Z Williamson, Heinlein’s biography, The Three Body Problem (which became a top pick of Vox Day, leader of the Rabid Puppies). Traditionally this sin also includes acedia, a sin that covers many modern issues, including things we would not regard as a vice (such as depression) but also things we still do, such as neglect or mindless compliance.

Lazy curating of the slate, mindless compliance with lock step voting and email campaigns, neglectful edits, and a general unwillingness to explain, review or persuade. Sloth and avarice seem to be the cardinal sins of the Puppy campaigns (no, that doesn’t mean I’m saying everybody who has ever read a Puppy nominated author is greedy and lazy).

With that in mind what are the major ‘sins’ of the nominated works?

  • Poor editing
  • Lack of cohesion
  • …parts of incomplete works…
  • Appearance by virtue of knowing Brad Torgersen
  • Shown up by substantially superior works in the same category
  • Stories of over-blown self importance
  • Irrelevance

Poor editing: all of the John C Wright nominations but in particular One Bright Star. The Science is Never Settled needed substantial re-working for it to be a decent (i.e. 18 year old at school) essay.

Lack of cohesion: Again John C Wright’s Transhuman and Subhuman, Roberts’s The Science is Never Settled, Championship B’Tok and Big Boys Don’t Cry all tended to wander off topic and lacked clarity.

…parts of incomplete works…: With a current lack of a ‘saga’ category perhaps Skin Game can be forgiven but Flow and Journey Man in the Stone House and Championship B’Tok all had the fragmentary feel of an extracted chapter from a novel. None stood alone well.

Appearance by virtue of knowing Brad Torgersen: Numerous works but most obviously Wisdom from My Internet. I don’t know if Brad T is friends with the person who makes Zombie Nation but there is no obvious reason why it was nominated.

Shown up by substantially superior works in the same category: Zombie Nation was the only puppy nominee for Best Graphic Story among a set of commercially and critically successful other nominees that showed depth and talent.

Stories of over-blown self importance: Turncoat, Parliament of Beasts and Birds, One Bright Star. Pompous pomposity.

Irrelevance: Best related work included a collection of unfunny Facebook offensiveness and a half-baked essay on the nature of science – neither had any more than a tenuous relation to SF/F. Parliament of Beasts and Birds was a religious fable – but that arguably scrapes into fantasy.

Excel Pluribus Hugo

[Note: I’m very much not an expert on this proposal – this was the easiest way for me to make sense of it at a practical level. I may well have misunderstood aspects of the process]

Yay! I think I have finally removed all the kinks from my Excel version of the proposed new nomination system for the Hugo awards known as “E Pluribus Hugo” or EPH (and of course look here and interesting comments here).

The proposal basically boils down to people nominating what they like in a simple manner i.e. submit five (or fewer) things you want nominated in a category but then adding a more complex way of tallying the votes. Each work you nominated is weighted by the number of things you nominated – so if you nominate 5 things each one is getting 1/5 (0.2) of a vote from you, nominate four things and each thing gets 0.25 of a vote from you.
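To make the weighting concrete, here is a minimal sketch in Python rather than Excel. The function name and the three toy ballots are my own assumptions for illustration; the only rule taken from the proposal as I understand it is that each ballot splits one vote equally among the works on it.

```python
from collections import defaultdict
from fractions import Fraction


def weighted_scores(ballots):
    """Give each work 1/len(ballot) of a vote from every ballot that lists it."""
    scores = defaultdict(Fraction)
    for ballot in ballots:
        if not ballot:
            continue  # an empty ballot contributes nothing
        share = Fraction(1, len(ballot))
        for work in ballot:
            scores[work] += share
    return dict(scores)


# Hypothetical ballots: work names are placeholders, not real nominees.
ballots = [
    {"A", "B", "C", "D", "E"},  # five works, 1/5 each
    {"A", "B", "C", "D"},       # four works, 1/4 each
    {"A"},                      # one work, a whole vote
]
scores = weighted_scores(ballots)
print({work: float(score) for work, score in sorted(scores.items())})
```

I have used fractions rather than decimals so that the shares add up exactly, which avoids rounding oddities when scores are compared later.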

Nominated works then go through an elimination process. First they would find the two nominees with the lowest weighted score, then they would compare the total number of raw nominations (not weighted) each of those two works got. The one with the fewest raw nominations is eliminated. The clever bit is that with that work gone, anybody who nominated it has now got a slightly more strongly weighted vote for everything else they nominated.

In theory this means a slate or block vote would still have a good chance of getting one work on the final ballot, but its chance of getting multiple works on the ballot would be drastically reduced.
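The elimination rounds described above can be sketched in a few lines of Python. This follows my reading of the process only, and several things here are my own assumptions: the function names, the toy ballots, the number of finalists, and the tie-breaking (lowest weighted score, then alphabetical order), which the description above does not cover.

```python
from fractions import Fraction


def tally(ballots):
    """Return (weighted score, raw nomination count) for each remaining work."""
    weighted, raw = {}, {}
    for ballot in ballots:
        for work in ballot:
            share = Fraction(1, len(ballot))
            weighted[work] = weighted.get(work, Fraction(0)) + share
            raw[work] = raw.get(work, 0) + 1
    return weighted, raw


def eliminate_one(ballots):
    """Run a single round: pick the loser and strike it from every ballot."""
    weighted, raw = tally(ballots)
    # The two works with the lowest weighted score (ties by name: an assumption).
    lowest_two = sorted(weighted, key=lambda w: (weighted[w], w))[:2]
    # Of those two, the one with fewer raw nominations is eliminated.
    loser = min(lowest_two, key=lambda w: (raw[w], weighted[w], w))
    return loser, [ballot - {loser} for ballot in ballots]


def run(ballots, finalists):
    """Repeat rounds until only `finalists` works remain."""
    ballots = [set(b) for b in ballots if b]
    eliminated = []
    while True:
        remaining = set().union(*ballots)
        if len(remaining) <= finalists:
            return sorted(remaining), eliminated
        loser, ballots = eliminate_one(ballots)
        ballots = [b for b in ballots if b]  # drop ballots with nothing left
        eliminated.append(loser)


# Hypothetical ballots: a three-voter slate for S1 and S2, plus other voters.
ballots = [{"S1", "S2"}] * 3 + [{"S2"}] + [{"X"}] * 3 + [{"Y"}]
final, eliminated = run(ballots, finalists=2)
print(final, eliminated)
```

In this toy example S1 and S2 would both make a two-place ballot on raw counts (S2 has four nominations, S1 and X have three each, with S1 and X tied for second), but under the weighted elimination S1 is knocked out and the slate voters’ full weight then moves to S2.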

Now I wanted to model it in a transparent way – hence a spreadsheet rather than a program. With an Excel spreadsheet you see a static snapshot of all the stages in one go. I didn’t want to use any VBA or Pivot Tables either and I wanted to set up a complete round so that I could then copy and paste one round after another to make a complete process without editing formulas as I went.


So I’ve been down a few dead-ends of late in my number crunching past Hugo winners.

I’ve looked for obvious signs of bias and also for cliques and found not much to write home about. The last two issues are whether the Hugo Awards (or other awards such as the Nebulas) have gone to unworthy winners or alternatively, to works that are too literary. This is something of a heads-you-win-tails-I-lose proposition, as demonstrating worthiness would tend to involve showing independent recognition of a writer’s skill beyond SF/F awards – thus proving the too literary complaint.

Either way I have been looking for a way into this so that there is actual evidence to discuss. So far not much luck.

One promising lead was the Google N-Gram viewer. When Google digitized huge numbers of books they gained a massive corpus of texts that allows for systematic analysis. One kind of analysis is a count of n-grams, i.e. ordered sequences of n items such as words or characters. As the Google book metadata includes the year of publication, that allows for trends in topics to be graphed. For example this graph shows trends for William Gibson’s 1989 Hugo nominee Mona Lisa Overdrive.

[Graph: Google N-Gram Viewer trend for “Mona Lisa Overdrive”]
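As a rough idea of what sits behind a graph like that, here is a toy n-gram counter. It is only a sketch: I am treating an n-gram as a run of n consecutive words, the function names are my own, and the two-year “corpus” is invented for illustration rather than taken from any real data.

```python
from collections import Counter


def word_ngrams(text, n):
    """List every run of n consecutive words in the text, lower-cased."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def yearly_frequency(texts_by_year, phrase):
    """Fraction of each year's n-grams that match the phrase."""
    n = len(phrase.split())
    result = {}
    for year, texts in texts_by_year.items():
        counts = Counter()
        for text in texts:
            counts.update(word_ngrams(text, n))
        total = sum(counts.values())
        result[year] = counts[phrase.lower()] / total if total else 0.0
    return result


# Hypothetical toy corpus keyed by publication year (not real data).
toy = {
    1988: ["mona lisa overdrive is out", "a new novel"],
    1989: ["the hugo award went elsewhere"],
}
freq = yearly_frequency(toy, "Mona Lisa Overdrive")
print(freq)
```

Dividing by the year’s total is what lets years with very different amounts of published text be compared on the same graph.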


More Rabid nastiness

For other reasons I visited Vox Day’s blog today (the leader of the Rabid Puppies campaign). It is never a pleasant experience but it does help illustrate how particularly unpleasant this ‘side’ of the Hugo kerfuffle is.

Kary English was nominated by both the Rabid Puppy and the Sad Puppy campaigns for best short story. Surprisingly, despite the overall poor quality of the Puppy nominees, many non-Puppies who have read the story have quite liked it. It is probably the most broadly liked of all the Puppy stories and I agree it has many positive qualities.

Kary English has also distanced herself from the Puppy campaigns somewhat and in an extended comment at File770 outlined her views.

Furface Tension 6/26

I also wish people like Brad, Larry and other SP notables would come out and say “Hey, this* isn’t what we intended or what we hoped would happen. We’re sorry the whole thing has become such a mess.” (*where “this” means locking up the ballot and shutting out other works)

I don’t consider myself a spokesperson for the SP, or even an SP notable, but I’ll say it. I never got involved in this with any idea that I’d even make the ballot, much less that VD would run his own campaign or that there would be a ballot sweep. If I’d known that, I wouldn’t have participated. To the extent that I’ve been part of that, even unknowingly, I apologize.

It seems I can’t say anything remotely in that vein without someone saying that if I truly thought that, I would withdraw. I’ve already given my reasons for not withdrawing, but I’ll mention again that a large part of it is not giving Vox Day the satisfaction.

All that stuff about nominating liberals just to watch them self-flagellate and see how fast they withdraw? I’m not his marionette, and I won’t dance to his tune. He set us up to be targets, just like he set up Irene Gallo. I’m not giving in to Vox Day.

This has provoked what can best be called a very sulky reaction from Vox Day in which he basically says he doesn’t care. I shan’t link to it because the comments that follow are extraordinarily nasty and vindictive. I will share this quote from Day’s post:

I think it’s interesting that she thinks I have given her any thought whatsoever. Kary, my dear, I don’t give a quantum of a damn what you do. Withdraw, don’t withdraw, retire to a nunnery, it makes absolutely no difference to me.

So Day announces his utter lack of care…

Following on from that Day has posted his picks for best story (which I will link to – I don’t think my little blog lends his much web credibility or googlyness)

  1. “Turncoat”, Steve Rzasa (Riding the Red Horse, Castalia House)
  2. “The Parliament of Beasts and Birds”, John C. Wright (The Book of Feasts & Seasons, Castalia House)
  3. “On A Spiritual Plain”, Lou Antonelli (Sci Phi Journal #2, 11-2014)
  4. “A Single Samurai”, Steven Diamond (The Baen Big Book of Monsters, Baen Books)

Notably Kary English’s story (as nominated by Vox Day’s own campaign) is now missing. I guess Day just forgot in his total lack of caring rather than spitefully deciding that English was now an ‘unperson’.

I still won’t be voting for Totaled above No Award for the reasons I outlined here. However I think the chance of her actually winning this category has increased substantially.