A summary of Babel issues

Nothing new here, just summing up known issues. I ended up writing a much longer comment at File770 than I intended and as it is post-length I’m putting it here also. https://file770.com/barkley-so-glad-you-didnt-ask-80/comment-page-2/#comment-1603312

In that thread, Brian Z was being his usual self. I know he annoys people but he can sometimes be thought-provoking. He was pushing the idea that the Hugo admins decided that Babel was ineligible and then decided to stop counting its EPH points during the elimination process. I don’t think that makes much sense because they’d just be plugging the ballots into the software and it isn’t likely to have a “turn counting off for this finalist but don’t eliminate it” button. But who knows? Maybe they used software that did have that button! When nothing makes sense, it is hard to call a hypothesis absurd. Maybe “Babel” is the book title equivalent of Little Bobby Tables.

What that got me thinking about is, firstly, the issue that currently the only viable explanations are a series of coincidences/events or conspiracies or a mixture of both. There are neat general explanations, including censorship or incompetence, but by themselves they don’t explain the specifics. We are unlikely to ever know for sure.

The other thing that I was thinking about was that we do tend to assume a particular order of events here. This is what I liked about Brian’s weird theory. It looks backwards because obviously you decide eligibility LAST but that is assuming you are doing the process properly… Anyway, here is a long comment now made even longer by four extra paragraphs explaining the context. The comment was in reply to Bruce Baugh rather than Brian.

The most sensible way to rig the stats is to rig the ballot. The EPH software should be quick to run with clean data, so it is also possible to not just rig the ballot if you control the data but to do so iteratively i.e. put fake data in, see the result, tweak the fake data, put it back in etc. You’d need a bit of time, some familiarity with past results and, of course, some strong motive, poor supervision and a lack of moral integrity…but that’s how to do it.

Do the published results look like that? No! They are full of absurdities and errors. So the one thing we can rule out is a competent attempt to change the results of the ballot by somebody with full control of the ballot.

There are clearly layers of issues here and Brian’s question highlights something. Multiple different kinds of something must be going on but we don’t have a clear idea of the ORDER of those somethings.

Babel is of particular interest because most of the issues apply here:

  1. Babel’s raw vote number looks inflated as do the other top 7 nominees. This is the Cliff issue that Heather Rose Jones analysed.
  2. The EPH total points in the first listed round EXCEED the total number of voters given at the top of the page. Marshall has documented similar issues in other categories.
  3. The ratio of raw votes to EPH points in that first round for Babel and the other top 7 nominees is very high (4+). This shows a kind of average number for how many other nominees still in play the nominee shares a ballot with. By this point, this ratio should be somewhere between 1ish and 2ish. The top 7 look like very coordinated ballots, as if hundreds of voters each picked from the same 7 nominees.
  4. Babel’s EPH points never change. This is unique to Babel in this category but occurs for some nominees in other categories. This is impossible as we know some people voted for works shown in the longlist that were then eliminated. The clearest example is The Mountain in the Sea. I’ve gathered multiple examples of people who had both Mountain and Babel on their ballots which demonstrates the EPH points for Babel cannot be correct.
  5. Babel is disqualified with no explanation.
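
To make points 3 and 4 concrete, here is a minimal sketch of how EPH points behave, assuming the usual rule that each ballot splits a single point equally among whichever of its nominees are still in play. The ballots and the function names `eph_points` and `raw_votes` are made up for illustration; this is not the Hugo software or the real data.

```python
from fractions import Fraction


def eph_points(ballots, in_play):
    """Each ballot splits one point equally among its nominees still in play."""
    points = {name: Fraction(0) for name in in_play}
    for ballot in ballots:
        live = [name for name in ballot if name in in_play]
        for name in live:
            points[name] += Fraction(1, len(live))
    return points


def raw_votes(ballots, name):
    """Number of ballots that list the nominee at all."""
    return sum(1 for ballot in ballots if name in ballot)


# Hypothetical toy ballots: two voters nominated both Babel and Mountain.
ballots = [
    {"Babel", "Mountain"},
    {"Babel", "Mountain"},
    {"Babel"},
    {"Other"},
]

before = eph_points(ballots, {"Babel", "Mountain", "Other"})
after = eph_points(ballots, {"Babel", "Other"})  # Mountain eliminated

print(before["Babel"], after["Babel"])                # points rise after elimination
print(raw_votes(ballots, "Babel") / before["Babel"])  # raw-votes-to-points ratio
```

In this toy case Babel's points go up as soon as Mountain is eliminated, because the shared ballots no longer split their point. A nominee whose points never move would have to share no ballots with anything eliminated.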

As eligibility is supposed to be decided last, we all assume that was the last thing to happen but that assumption rests on the process following procedure. However, we know the process is flawed from point 4. So Brian’s speculation that Babel was already disqualified when they started counting isn’t totally nuts. We can’t trust the assumed order of events BUT if we don’t then we have to just give up at this point.

If we do trust the order of events then we could explain 1 and 3 as evidence of ballot stuffing. This would be external to the Hugo committee and not their fault. They aren’t technically empowered to say “hey, these ballots look dodgy” and the likely bad PR that would follow would be a disincentive to doing so. There is also nothing anybody could do in such circumstances without clear evidence beyond the improbable numbers.

Point 2 must be an issue from within the Hugo committee. Slates or ballot stuffing or just people voting in weird ways for legitimate reasons shouldn’t make the total not add up. That must be an internal error.
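
A quick sketch of why the total cannot legitimately exceed the number of voters, again assuming each ballot hands out exactly one point split among its surviving nominees. The ballots and the helper `total_points` are hypothetical, purely to show the arithmetic.

```python
from fractions import Fraction


def total_points(ballots, in_play):
    """Sum of all EPH points handed out in a round."""
    total = Fraction(0)
    for ballot in ballots:
        live = [name for name in ballot if name in in_play]
        # A ballot with k live nominees gives k shares of 1/k: exactly one point.
        # A ballot with no live nominees gives nothing.
        total += sum(Fraction(1, len(live)) for _ in live)
    return total


# Hypothetical ballots; "E" has already been eliminated.
ballots = [{"A", "B", "C"}, {"A"}, {"B", "D"}, {"E"}]
total = total_points(ballots, {"A", "B", "C", "D"})

print(total, "points from", len(ballots), "ballots")
```

However the voters behave, the total is just a count of ballots that still have at least one live nominee, so it can never be larger than the number of ballots cast.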

Point 4. This is interesting. EPH could have been run properly on a set of ballots where nobody who nominated Mountain also nominated Babel. That is what those numbers imply but that would mean that not only did we have ballot stuffing but also ballot deletion or just fabricated data altogether or in part. Alternatively, it could be just an error in making the table for the report. Maybe they just copied the number from the last round to the earlier round in the table for the report.
Alternatively, there are errors in the names listed in the longlist. Maybe Mountain in the Sea was eliminated much earlier and they just put the wrong name on the published table. If it is some other work then it is less mysterious that no points transferred to Babel.
Either way, this MUST be an issue that the Hugo Committee should have been aware of, regardless of its cause.

Point 5. Then, after all that, Babel is disqualified. I think it makes most sense that this happened last and at least somewhat independently of everything else.

[end of comment]

I’m inclined to lump 1,3 & 4 together i.e. EPH was run correctly but on ballots that don’t represent how people voted, mainly because Nona9’s numbers don’t look right either. That doesn’t explain 2 or 5 though. Point 2 might be sloppiness or it might indicate some other issue, i.e. an attempt to deal with or correct the dodgy ballots. None of this explains point 5.

I will, will, will change topics tomorrow even if it is just gibberish.


32 responses to “A summary of Babel issues”

  1. Interesting thoughts. Would it be possible that they just counted really sloppily because they thought it would be removed from the ballot anyway?

    At least by some of the people (possibly not all of them were in on the game at this point).

    I’m assuming there is a manual part of the vote counting process.

    • The only manual process should be data cleaning. After that it should be into the machine and press a button.

      What that means is that very local inconsistencies were probably introduced into the results afterwards. If the software was broken, you’d expect to see more consistent errors [not always, of course; software can make very quirky localised errors, but there’d be some reason, like the logic breaks if the title of a book is too long or the word “Babel” is a command to randomise everything 🙂 ]

  2. However we explain the statistics, incompetence is going to be a factor.

    Given the failure to actually clean up the results, laziness may well be an issue, too.

  3. Is it possible that they didn’t run EPH at all? Then after the fact fabricated these numbers to make it superficially look as though these works/people were the results of running EPH? I mean, these EPH scores seem to be completely disconnected from both what we know shared some of the same ballots and what is logically more likely to share a ballot (English/Chinese works, not many with 4 or 5 of the top 7, etc.). But what if you were working backward without regard for what actually was on the same ballots? This is a shot in the dark on my part, and I’m curious if this seems possible or not.

    • Other categories look more sensible, so it would be weird. We could imagine a situation where, because of the difficulty of internet transfers in and out of China, the team couldn’t access existing EPH software? Then had to write their own and ran out of time? Seems unlikely though. We aren’t talking about some massive program here.

    • You know, that’s not a terrible theory. We know Dave & Ben never liked EPH, and I suspect they really don’t understand how to do it, so what if, while otherwise unsupervised, they just counted the number of nominations to determine finalists, and then after the fact fudged the EPH numbers and goosed the total nominations so the finalists they’d already determined win by a comfortable enough margin that EPH couldn’t hurt them?

      (Though that doesn’t jibe with what happened on Short Story, where Destiny Delayed was dropped for EPH reasons despite having 100+ noms more than two above it, because of the even stranger thing of the lowest two EPH numbers in the penultimate round both having significantly more nominations than the next ones. All of Short Story is pretty wild, though.)

    • Thanks, Cam and Marshall, for your takes. I was thinking more of a “what if we just didn’t” situation rather than “we couldn’t figure out how to do it”.

      I know Ben Yalow’s biggest argument against EPH was that it was too complex to easily explain to people. And people would counter that it was no more complicated than the final voting system. But what I really think he felt was that it was too complicated to explain why we would want to do it. Because he didn’t believe it was fair himself.

      So maybe they run EPH and it knocks off some of the Chinese finalists. (We know that quite a few of the Chinese finalists/longlist were from SF World’s recommendation list. This may have created a bit of a slate effect.) Ben and Dave can’t explain how this is fair. Solution: keep the non-EPH results and give us these garbage stats.

      Now short story is where we have the strongest Chinese showing so maybe this category is the real EPH results, but still had to have fake numbers created because they no longer had the original EPH stats. Does that make any sense or am I just trying to make my theory work?

      • It makes some sense. I haven’t looked at Short Story in detail but superficially it looks reasonable aside from the inflated numbers AND if we assume there’s usually little transfer between Chinese and English language nominees.

        • Actually, the more I look at Short Story, the more questions it raises. There’s the inflated numbers, but also Destiny Delayed, despite having a significantly higher number of raw nominations than the two that end up in the 6 & 7 slots, gets knocked off in the penultimate round for EPH reasons and ends up in #8. But its EPH numbers don’t change across the whole row. Zhurong on Mars similarly doesn’t have a change in its EPH numbers until Destiny is eliminated.

          (There’s also the fact that every single story on the entire longlist has more nominations than any short story received in the previous three years, and 1500 nominations is about 2.5 times as many as it had been in the previous three years, AND the EPH totals for N-9 are 1569, and stay above 1500 until the antepenultimate round. It’s a wild mess.)

        • Oh, one more oddity for Short Story’s EPH: for the distribution when Destiny Delayed is eliminated to work, it needs to be on 9 ballots by itself, and then 420 ballots where it’s on with three of the top five. 

    • Thinking about it since yesterday, I’ve stepped back from the theory of a cover-up of selective use of EPH.

      New theory: however they produced the finalists/longlist (before the wrongful finalist removals, and perhaps we can be hopeful that EPH was correctly performed from the original ballots), the nomination stats tables we were given seem to be an attempted fabrication from this end-result list of titles/names, without reference to the original input of what was on individual ballots together. McCarty spends his summer and fall busy with Chinese vacations and his day job/non-fandom life. Too late he discovers the original info he needs to produce this report is gone for whatever reason. We wait 3 months and get this garbage because it’s impossible to really recreate this info from only the end point.

  4. Makes me want to run the 1984 noms through the current program, just to see what would happen.

    No, there’s no way to stop the final ballot counting midway. It’s a loop going through counting, eliminating the hindmost, recounting, and so on. The nominations – why would you stop halfway?

  5. Post censures and reprimands, are we any more likely to discover What Really Happened? Or is this destined to become the Area 51 of fandom?!

      • If he is clever…

        But whether we hear something again from Dave is unimportant. The question is whether we hear something substantial, and that I don’t believe.

        I am in the incompetent manipulation camp and I am afraid that this will be the Area 51 of fandom.

  6. Most of the features of the data would be explained if the following steps had taken place in this order:

    1. Nomination ballots tabulated, EPH computed.
    2. Additional votes added to some of the shortlisted nominees.
    3. Some nominees disqualified.
    4. EPH statistics from (1) adjusted to (try to) take account of (2).

    As Camestros says, steps (2) and (4) aren’t how you’d fake the data if you knew what you were doing: instead you’d synthesize extra nomination ballots and re-run EPH. But this would require time, expertise, and access to the original nomination ballots. So perhaps at step (4) the nomination ballots were inaccessible and only the summary statistics of the original EPH run had been preserved, making it too hard to produce statistics consistent with all constraints.
