Captain Marvel versus the Trolls

Multiple news sources are covering that the new (and as yet unseen) Captain Marvel movie is being review-bombed by right wing trolls. The amount of coverage of this has itself increased just in the past few hours but this link seems to be one of the first articles on it:

I’d actually thought about writing about how the alt-right campaign against the film had started to warm up the other day after seeing our old-pal Vox Day jump on the bandwagon (archive link)…but didn’t because I’m lazy and/or got distracted. What I can offer instead of an amazingly insightful prediction that obnoxious misogynists are about to be misogynistic obnoxiously is some graphs!

I grabbed the review data from Rotten Tomatoes so that I can show graphically the influx of reviews. I would have liked to show another film for comparison but unfortunately it’s hard to get a like for like. The nearest equivalent with a similar release date and no pre-screening reviews yet is Disney’s live action version of Dumbo. That has only one page of user reviews/comments so far, as opposed to Captain Marvel’s six pages but I don’t think it is a like-for-like in terms of organic interest.

Here’s the first graph for Captain Marvel. It’s a running total of comments over time. It’s a longgggg time axis because the first comment is from 2015! Rotten Tomatoes (and similar sites) create entries for movies that have been announced even before production begins.

Interest (mainly positive but some negative) starts picking up from last July and subsequent trailers lead to more comments (again some positive and some negative). Some of the coverage of this troll attack is focused on the absurdity of people rating films that haven’t been seen yet but at this point, it is technically Rotten Tomatoes allowing people to say whether they are “Not interested” or “Want to see it”. Some of the comments are literally spam and some of the earlier comments are anti-Disney etc.

The next graph zooms in to the last few months:

There’s a spike of comments in February. Obviously some of that is an inevitable increase as the release date gets closer but the more overt hate comments really ramp up. The worst include comments about the lead actress (Brie Larson) being hit by a bus. The length of the comments also increases, in the form of what are best called rants:

“Why Marvel decided to cast a very vocal racist and sexist aimed at white males, I’ll never know. If Robert Downey Jr. started saying that he didn’t care about the opinions of 40 year old white chicks and he doesn’t want to be interviewed by a white woman as its not inclusive enough, people would lose their minds. His career would be over, branded a racist and sexist, attacked in the media and his legacy tarnished. As a white male, I will not be supporting this or any other movie that stars Brie Larson. They say that Captain Marvel will be the new face of the MCU? As the villain because she certainly isn’t a her-o. “

How many is it though? Well, one comment anticipating somebody dying in a bus accident is one too many but for a sense of scale it’s about 14 comments over the past 10 days that are of the ‘arrghh SJWs! Feminazi!’ style crap. It’s not a huge number and the spike shown above is inflated by other people querying why there are so many anti comments for a film nobody has seen yet.

It’s a reasonable assumption that this is just the start though.


An actual case of [voter] election* fraud in the US?

I’ve made several posts now about how evidence for wide scale voter fraud of any serious impact is rare. However, there does seem to be a serious case in North Carolina:

‘Enough confusion has clouded a North Carolina congressional race that the state’s board of elections has announced a delay in certifying that Republican Mark Harris defeated Democrat Dan McCready in the state’s 9th District because of “claims of irregularities and fraudulent activities.”‘

“In October, during the final stretch of the congressional election in North Carolina’s Ninth District—one of the most tightly contested House races in the nation—Datesha Montgomery opened her door, in Bladen County, to find a young woman who explained that she was collecting absentee ballots. “I filled out two names on the ballot—Hakeem Brown for Sheriff and Vince Rozier for board of education,” Montgomery wrote in an affidavit. Under North Carolina law, only voters themselves are allowed to handle or turn in their ballots, but the woman at Montgomery’s door “stated the [other races] were not important.” Montgomery added, “I gave her the ballot and she said she would finish it herself. I signed the ballot and she left. It was not sealed up at any time.”

There are apparently numerous anecdotes like that surrounding the Republican (surprise, surprise) candidate. However, as well as this anecdotal evidence there are numerical inconsistencies:

“In Bladen and Robeson Counties, Bitzer found that Harris won an unusually high share of mail-in absentee-ballot votes. Bladen was the only county where the Republican prevailed in the mail-in absentee vote, winning sixty-one per cent of the votes from mail-in ballots—despite registered Republicans accounting for only nineteen per cent of the county’s returned absentee ballots. To achieve that margin, Harris would have needed to win not only all of the Republican ballots, but almost every single mail-in vote from Independents, as well as a significant number of votes from crossover Democrats.”

(as above)
It looks like the Republican Primary earlier in the year may have been tainted as well.
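
The arithmetic behind that quoted claim can be checked roughly. This is only a sketch using the two percentages in the quote, and it assumes (generously) that Harris won every ballot returned by a registered Republican; the variable names are mine.

```python
# Figures taken from the quoted passage about Bladen County mail-in ballots.
harris_share = 0.61        # share of mail-in votes won by Harris
republican_ballots = 0.19  # share of returned ballots from registered Republicans

# Generous assumption: every registered Republican ballot went to Harris.
needed_from_others = harris_share - republican_ballots
other_ballots = 1 - republican_ballots

# Fraction of all non-Republican ballots Harris would still have needed.
share_of_others = needed_from_others / other_ballots
print(f"{share_of_others:.1%} of non-Republican mail-in ballots")
```

Even on that assumption he would have needed a little over half of every ballot returned by Independents and Democrats combined.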

There’s an analysis of some of the numbers here that is well worth looking at:

Noticeably, the many right-leaning sources of panic about voter fraud are oddly quiet about this North Carolina case even though it would appear to be one of the few credible instances of large scale fraud having a significant impact on a result.

*title changed

50% chance of doing X

This is a bit abstract and it follows on from this previous post about voting demographics.

Let’s say you’ve got a statistical model that predicts a person Z with Y characteristics has a 50% chance of doing X. The actual percentage doesn’t matter but 50% is a nice amount of measurable uncertainty — maximally knowing that we don’t know what person Z will do about X given the context of Y.

Empirically, the data would be looking at lots of Y people and seeing they do X 50% of the time. However, note that there’s a big and important distinction here between two extremes.

  1. Half of Y people do X and half of Y people don’t but those two halves are distinct. This implies that Y isn’t really the relevant factor here and we should be looking for some other feature of these people that better explains X behaviour.
  2. Y people do X half of the time randomly. That is Y people are essentially a coin toss with regards to X. In that case Y isn’t great for predicting whether people will do X but it is really relevant to the question (particularly if W people behave more decisively).
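
The two extremes can be sketched as a small simulation. Everything here is invented for illustration (the population size, the number of occasions, the function names); the point is only that both extremes give the same 50% overall rate, and that they can only be told apart by watching the same people more than once.

```python
import random

random.seed(0)
N_PEOPLE = 10_000
N_OCCASIONS = 20

def fixed_halves():
    # Extreme 1: each person either always does X or never does.
    return [[person % 2 == 0] * N_OCCASIONS for person in range(N_PEOPLE)]

def coin_toss():
    # Extreme 2: every person flips a fair coin on every occasion.
    return [[random.random() < 0.5 for _ in range(N_OCCASIONS)]
            for _ in range(N_PEOPLE)]

def overall_rate(history):
    # Share of all person-occasions on which X was done.
    return sum(sum(person) for person in history) / (N_PEOPLE * N_OCCASIONS)

def share_consistent(history):
    # Fraction of people who made the same choice on every occasion.
    return sum(len(set(person)) == 1 for person in history) / N_PEOPLE

for name, history in [("fixed halves", fixed_halves()), ("coin toss", coin_toss())]:
    print(name, round(overall_rate(history), 3), round(share_consistent(history), 3))
```

A single snapshot only gives you the first number; the second needs repeated observations of the same individuals.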

In the demographic voting model and taking a figure of say 80%:20% for atheists splitting between left and right, I suspect this is a grouping where individuals have even less variability in their actual voting patterns. Some of that 20% will be Ayn Rand style atheists who are very committed to a right-wing viewpoint, rather than representing a 20% chance that a given atheist would vote Republican. However, that is not necessarily true of other groups where the percentage may more closely represent a degree of individual variability.


US Voting Demographic Model

The Economist has a fascinating demographic model on US voters here:

There are no details on how robust the model is but they claim to have built it up from a large number of surveys of sufficient detail to compare the relative chance of a given person voting Republican or Democrat within a sub-group and controlling for the other sub-groups that person would be in.

It is an interesting perspective on political groupings. It’s not causal exactly but could help disentangle what relates to what in other groups.

For example, imagine you had a group of people who weren’t ostensibly related by politics. It could be a profession or members of a hobby related club. Now imagine that the members of the club were 70%/30% atheists v Christian and 60%/40% Democrat v Republican. Does the club lean Democrat because it has so many atheists in it or does it lean atheist because it has so many Democrats in it? The Economist’s model helps answer that question. Most Democrats aren’t atheists (mainly because few Americans are atheists) but atheism strongly implies a person will vote Democrat. Based on those numbers it looks more like the Democrat leanings are more due to the large number of atheists than vice versa.
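
The asymmetry between "most Democrats aren’t atheists" and "atheism strongly implies voting Democrat" is just Bayes’ rule. Here is a minimal sketch with made-up figures; none of these numbers come from The Economist’s model, they are assumptions chosen only to show the shape of the calculation.

```python
# Hypothetical figures for illustration only; they are not taken from
# The Economist's model.
p_atheist = 0.05            # assumed share of voters who are atheists
p_dem_given_atheist = 0.80  # assumed chance an atheist votes Democrat
p_dem_given_other = 0.48    # assumed chance anyone else votes Democrat

# Total probability of voting Democrat.
p_dem = p_atheist * p_dem_given_atheist + (1 - p_atheist) * p_dem_given_other

# Bayes' rule: chance that a Democrat voter is an atheist.
p_atheist_given_dem = p_atheist * p_dem_given_atheist / p_dem

print(f"P(Democrat | atheist) = {p_dem_given_atheist:.2f}")
print(f"P(atheist | Democrat) = {p_atheist_given_dem:.2f}")
```

With these assumed inputs the first probability is high and the second is small, which is the pattern described above.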

You can plug in your own demographic details to see how close you fit. You can also plug in counterfactuals about yourself. I’m not American so I can’t factually describe what part of the US I live in but in a parallel universe in which I did but was otherwise much the same I’d have at least an 80% chance of voting Democratic REGARDLESS of where I was from in the US.

Another look at crowdfunding data

In relation to the post on Vox Day’s comic being pulled from IndieGoGo and whether there were financial shenanigans, I thought I’d grab some data. Unfortunately, because the dodgy comic has been shelved, the contribution data is no longer available at the IndieGoGo site. However, a different “Arkhaven” (aka Castalia House, aka Vox Day’s vanity press) comic still has a live campaign. This one is being run by Timothy’s number 1 client, Jon Del Arroz.

Contribution data is available by looking at the list of backers. Not all backers are named (which is fair enough) and not all backers reveal how much they contribute (also fair enough – not trying to invade anybody’s privacy). For those backers who don’t reveal how much they gave publicly, the overall total can be inferred by subtracting the amount raised by people who do show their contribution from the complete amount raised. For convenience, I’ve shared that between all the backers who didn’t list an amount (of course, in reality, some may have given a lot less or a lot more).
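
That inference is simple enough to write down. This is a sketch of the calculation only; the function name and the example figures are invented and are not the campaign’s real numbers.

```python
def infer_hidden_share(total_raised, shown_amounts, hidden_backers):
    """Split the unexplained part of the total equally among hidden backers."""
    if hidden_backers == 0:
        return 0.0
    # Use the total actually raised here, not the campaign's goal.
    hidden_total = total_raised - sum(shown_amounts)
    return hidden_total / hidden_backers

# Made-up example values, not the campaign's real figures.
example_total = 1000.0
example_shown = [500.0, 160.0, 160.0, 100.0]
print(infer_hidden_share(example_total, example_shown, hidden_backers=4))
```

The equal split is purely a convenience, so the per-backer figure is an average rather than a claim about any individual.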

NOTE: I’m using this just as an example of crowdfunding data. I’ll point out interesting or notable features but 1. I’m not saying they are evidence of anything dodgy and 2. to make any such claim would require looking at many other campaigns to get a sense of what was typical v unusual. Also, while many names are given publicly at the site, no names should be referred to in the comments etc aside from the organiser.  Lastly I may have made errors 🙂


There were four donations that were set as “Private” and they occurred “24” and “23” days ago (the data on the site is given in that format – I’ve inferred dates). Together they amount to $70 or $17.50 each i.e. unremarkable*.

The graph looks like what I might expect. I guess some campaigns might be more S shaped with a slow start and then a steeper climb before tapering off. Yet, campaigns with a big burst of contributions in the first few days and then a slow increase after make sense also. I would imagine campaigns that run a danger of just falling short of their target goal might show a big blip near the end as the campaign makes one last push. For comparison here is a graph I drew of a different style of campaign: (it looks smoother because I spread some of the day-by-day data out artificially).

The biggest contributions were from five people who gave $515 each. That’s a curious amount, particularly as the tier reward is at $500 and there are no other donation values around that size i.e. everybody who gave a lot of money gave exactly the same amount. How weird is that? I don’t know, hence my caveat above. That may be something that happens a lot with crowdfunding campaigns or maybe it’s really weird. Drawing conclusions about what’s weird requires data from more campaigns but also models to compare data against. For example, there’s no tier reward between $150 and $500, so it is not actually surprising that there are no contributions at around $200 or $300.

Even so, around the $150 tier there is a lot more variation. Like I said, I don’t have a theoretical distribution to compare this against but while there are more kinds of values they are still oddly clumped to my eye:

  • 2 at $172
  • 3 at $170
  • 2 at $167
  • 1 at $165
  • 18 at $162 (?!?)
  • 6 at $160
  • 0 at $150

I’ve no idea why exactly $162 is so popular. Perhaps it is a round number contribution in some other currency (don’t know what though – doesn’t match Euros or Canadian $) It doesn’t seem to match a combination of tier rewards either.

This graph shows the frequency of each of the 23 different sizes of amounts that were contributed and how much money was raised by that category. Bars are numbers of people and green dots are total amounts of money. ($18 is actually $17.50 and that’s actually the “Private” amount and hence should be taken with a pinch of salt).
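
The two series in that graph are just a count per amount and the count multiplied by the amount. A minimal sketch using only the figures quoted in this post (so it covers a handful of the 23 amounts, not the full data set):

```python
# Only the amounts quoted in this post; the full data had 23 distinct amounts.
backers_by_amount = {
    515.00: 5,
    172.00: 2,
    170.00: 3,
    167.00: 2,
    165.00: 1,
    162.00: 18,
    160.00: 6,
    17.50: 4,   # the inferred "Private" share
}

# Money raised by each amount: number of backers times the amount given.
money_by_amount = {amount: amount * n for amount, n in backers_by_amount.items()}

for amount in sorted(backers_by_amount, reverse=True):
    print(f"${amount:>7.2f}  backers={backers_by_amount[amount]:>2}  "
          f"raised=${money_by_amount[amount]:,.2f}")
```

Laying it out this way makes it easy to see where a popular mid-sized amount raises more in total than a handful of large contributions.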


I’d have expected something a bit more Pareto like I guess.

So, no big conclusion just that there’s stuff to look at and with enough background data of similar campaigns it would be plausible to spot campaigns that were distinctly unusual.

*[Speaking of errors, in the first graph I drew I’d calculated these ‘Private’ amounts incorrectly by using the Goal of the campaign instead of the total raised. Luckily I spotted my error before making a fool of myself.]

Claims and false claims

[A content warning: this post discusses sexual assault reports.]

All reports of a crime have potential consequences. We live in an age where false reports of crimes lead to death and where “SWATting” is a murderous prank. However, only one class of crime leads to constant concern from conservatives that false allegations are sufficiently common to require a kind of blanket scepticism. Amid the allegations against Supreme Court nominee Brett Kavanaugh, conservatives are pushing back against treating allegations of sexual assault at face value. This is part of a long history of people demanding that sexual assault crimes, in particular, require additional scepticism and scrutiny. That history pushed an idea that rape claims are made by women to ruin a man’s reputation even though historically the consequences of speaking out have always fallen more heavily on women than men*.

A piece by David French at the conservative magazine National Review attempts to push back against modern feminist advocacy for supporting victims of sexual violence:

“It happens every single time there’s a public debate about sex crimes. Advocates for women introduce, in addition to the actual evidence in the case, an additional bit of  “data” that bolsters each and every claim of sexual assault. You see, “studies” show that women rarely file false rape claims. According to many activists, when a woman makes a claim of sexual assault, there is an empirically high probability that she’s telling the truth. In other words, the very existence of the claim is evidence of the truth of the claim.”

The tactic here is one we’ve seen in multiple circumstances where research runs counter to conservative beliefs. FUD, fear-uncertainty-doubt — everything from cigarettes to DDT to climate change has had the FUD treatment as an intentional strategy to undermine research. Note the ‘how ridiculous’ tone of ‘In other words, the very existence of the claim is evidence of the truth of the claim.’ when, yes, the existence of somebody claiming a crime happened to them IS evidence that a crime happened to them. It is typically the first piece of evidence of a crime! It isn’t always conclusive evidence of a crime for multiple reasons but yes, manifestly it is evidence. The rhetorical trick here is to take something that is actually commonplace (i.e. a default assumption that when a person makes a serious claim of a crime there is probably a crime) and make it sound spurious or unusual.

The thrust of the article rests on an attempt to debunk research that has been done on the issue of false rape allegations. To maintain the fear of men suffering from false rape allegations, the article aims to emphasise the uncertainty in the statistics to provoke doubt (and uncertainty) amid its target audience.

After a broad preamble, the article focuses on one study in particular and to the article’s credit it does actually link to the paper. The 2010 study in question is this one: False Allegations of Sexual Assault: An Analysis of Ten Years of Reported Cases by David Lisak, Lori Gardinier, Sarah C. Nicksa and Ashley M. Cote. The specific study looks at reports of sexual assault to campus police at a major Northeastern US university. However, the study also contains (as you might expect) a literature review of other studies conducted. What is notable about the studies listed is that they found frequencies of false allegations were over-reported. For example a 2005 UK Home Office study found:

“There is an over-estimation of the scale of false allegations by both police officers and prosecutors which feeds into a culture of skepticism, leading to poor communication and loss of confidence between complainants and the police.”

The space where David French seeks to generate uncertainty around these studies is two-fold:

  1. That sexual assault and rape are inherently difficult topics to research because of the trauma of the crime and social stigma [both factors that actually point to false allegations being *less* likely than other crimes, of course…]
  2. That there are large numbers of initial reports of sexual assault where an investigation does not proceed.

That large numbers of rape and sexual assault reports to police go uninvestigated may sound more like a scandal than a counter-argument to believing victims but this is a fertile space for the right to generate doubt.

French’s article correctly reports that:

“researchers classified as false only 5.9 percent of cases — but noted that 44.9 percent of cases were classified as “Case did not proceed.””

And goes on to say:

“There is absolutely no way to know how many of the claims in that broad category were actually true or likely false. We simply know that the relevant decision-makers did not deem them to be provably true. Yet there are legions of people who glide right past the realities of our legal system and instead consider every claim outside those rare total exonerations to be true. According to this view, the justice system fails everyone else.”

The rhetorical trick is to confuse absolute certainty (i.e. we don’t know exactly what proportion of the uninvestigated claims might be false) with reasonable inferences that can be drawn from everything else we know (i.e. it is very, very, unlikely to be most of them). We can be confident that cases that did not proceed BECAUSE the allegation was false (i.e. it was investigated and found to be false) were NOT included in the 44.9% of cases precisely because those cases were counted as false allegations. More pertinently, linking back to the “fear” aspect of the FUD strategy, the 44.9% of cases also led to zero legal or formal consequences to alleged perpetrators.

I don’t know if this fallacy has a formal name but it is one I see over and over. I could call it “methodological false isolation of evidence” by which I mean the tendency to treat each piece of evidence for a hypothesis as separate, with no capacity for multiple sources of evidence to cross-corroborate. If I may depart into anthropogenic global warming for a moment, you can see the fallacy work like this:

  • The physics of carbon dioxide and the greenhouse effect imply that increased CO2 will lead to warming: countered by – ah yes, but we can’t know by how much and maybe it will be less than natural influences on climate and maybe the extra CO2 gets absorbed…
  • The temperature record shows warming consistent with the rises in anthropogenic greenhouse gases: countered by – ah yes, but maybe the warming is caused by something natural…

Rationally the two pieces of evidence function together: correlation might not be causation but if you have causation AND correlation then, well that’s stronger evidence than the sum of its parts.
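
One way to see why the pieces should not be weighed in isolation is to think in terms of odds. This is a toy sketch only: the likelihood ratios are made up, the function names are mine, and it assumes the two pieces of evidence are independent of each other given the hypothesis.

```python
def update_odds(prior_odds, likelihood_ratios):
    """Multiply prior odds by each likelihood ratio (assumes independent evidence)."""
    odds = prior_odds
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds

def odds_to_probability(odds):
    return odds / (1 + odds)

# Made-up likelihood ratios, purely for illustration.
mechanism = 3.0    # how much more likely the physics is if the hypothesis is true
correlation = 4.0  # how much more likely the temperature record is if it is true
prior = 1.0        # even odds before looking at either

for label, evidence in [("mechanism only", [mechanism]),
                        ("correlation only", [correlation]),
                        ("both together", [mechanism, correlation])]:
    print(label, round(odds_to_probability(update_odds(prior, evidence)), 3))
```

Under that assumption the ratios multiply rather than add, so knocking each one down separately and then declaring nothing left standing misses how much they support each other.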

With these statistics we are not operating in a vacuum. They need to be read and understood along with the other data that we know. Heck, that idea is built into the genre of research papers and is exactly why literature reviews are included. Police report statistics are limited and do contain uncertainty and aren’t a window into some Platonic world of ideal truth BUT that does not mean we know nothing and can infer nothing. Not even remotely. What it means is we have context to examine the limitations of that data and consider where the bias is likely to lie i.e. is the police report data more likely to OVERestimate the rate of false allegations or UNDERestimate compared to the actual number of sexual assaults/rapes?

It’s not even a contest. Firstly as the 2010 report notes:

“It is notable that in general the greater the scrutiny applied to police classifications, the lower the rate of false reporting detected. Cumulatively, these findings contradict the still widely promulgated stereotype that false rape allegations are a common occurrence.”

But the deeper issue is the basic bias in the data that depends on reports to the police.

“It is estimated that between 64% and 96% of victims do not report the crimes committed against them (Fisher et al., 2000; Perkins & Klaus, 1996), and a major reason for this is victims’ belief that his or her report will be met with suspicion or outright disbelief (Jordan, 2004).”

Most victims of sexual assault do not report the crime at all i.e. most victims aren’t even in the data sets we are looking at. Assume for a moment that the lower bound of that figure (64%) is itself exaggerated (although why that would be the case I don’t know) and assume, to give David French an advantage, that 50% of actual sexual assaults go unreported and that half of the 44.9% figure were somehow actual FALSE allegations (again, very unlikely). That would make the proportion of false allegations compared with (actual assaults+false allegations) about 14% based on the 2010 study’s campus figure. It STILL, even with those overt biases included, points to false allegations being very unlikely.
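
The arithmetic can be sketched as follows. The two percentages come from the 2010 study as quoted above and the two 50% figures are the deliberately generous assumptions just described. How the unreported assaults get added in is not spelled out, so the sketch shows two readings: doubling every report reproduces the roughly 14% figure, while scaling up only the genuine reports gives a somewhat higher number.

```python
# Figures from the 2010 campus study as quoted above.
false_rate = 0.059       # reports classified as false
did_not_proceed = 0.449  # reports classified as "case did not proceed"

# Deliberately generous assumptions from the paragraph above.
assumed_false_in_unproceeded = 0.5  # half of "did not proceed" treated as false
unreported_share = 0.5              # half of actual assaults never reported

reports = 100.0
false_reports = reports * (false_rate + did_not_proceed * assumed_false_in_unproceeded)
genuine_reports = reports - false_reports

# Reading A: one unreported assault for every report of any kind.
reading_a = false_reports / (reports * 2)

# Reading B: scale up only the genuine reports to get actual assaults.
actual_assaults = genuine_reports / (1 - unreported_share)
reading_b = false_reports / (actual_assaults + false_reports)

print(f"reading A: {reading_a:.1%}")
print(f"reading B: {reading_b:.1%}")
```

Either way, and with every assumption stacked in favour of the sceptical position, a false allegation remains the much less likely explanation.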

It makes sense to believe. The assumption that rape in particular is likely to draw malicious allegations is a misogynistic assumption. That does not mean nobody has ever made a false claim of rape, it just means that we do not place the same burden of doubt on people when they claim to be robbed or mugged etc. People make mistakes and some people do sometimes maliciously accuse others of crimes but such behaviour is unusual and, if anything, it is particularly unusual with sexual crimes where, in fact, the OPPOSITE is more likely to occur: the victim makes no allegation out of fear of the consequences and because of the trauma involved.

Somehow it is 2018 and we still have to say this.

*[I don’t want to ignore that men are also victims of sexual violence, perhaps at far greater rates than are currently quantified, but the specific issue here relates to a very gendered view of sex and sexual assault.]

This is evidence of something but of what, I’m not sure

The BBC’s short story contest has, after a selection process in which author gender wasn’t known, selected an all-female shortlist. It isn’t a contest that I’m familiar with but apparently, it has been running for 13 years and on four previous occasions the shortlist has been all women*.

It’s nice to see authors being celebrated this way and the ‘blind’ selection process undermines the likely claim from the intransigently anti-women section of society that the nominees were chosen based on ‘affirmative action’ or some anti-men sentiment. I say ‘undermines’ because, of course, it hasn’t stopped the usual misogynistic comments on social media.

As a positive story it is still interesting to look at in terms of how results from awards may depart from simple demographic splits. As I’ve discussed here before, quantifying the discrepancy between actual results and what those results might be if the demographic profile was effectively random is one thing; understanding where that discrepancy comes from and whether it is a problem is quite another. First things first though, is the shortlist of five women numerically remarkable?

The prize has been running since 2006 and by my count from Wikipedia five of the twelve winners have been women. According to the Guardian article, about 57% of submissions are from women.

  • In 2017 the shortlist of 5 included 2 women,
  • 2016 had 5 women,
  • 2015 had 2 women,
  • 2014 had 5 women,
  • 2013 had 5 women,
  • 2012 had a longer shortlist** of 10 authors of which 6 were women,
  • 2011 had 3 women
  • 2010 had 3 women
  • 2009 had 5 women
  • 2008 had 3 women
  • 2007 had 2 women***
  • 2006 had 2 women on a shortlist of four

Looking at the list there are a few obvious things: many of the same authors get nominated (which isn’t surprising) and just eyeballing the numbers suggests women are more likely to get nominated but it isn’t a trend as such. Of the 64 nominees, 43 were women, about 67% which is more than would be predicted if the distribution was 50-50 and is higher than expected using a 57%-43% split based on submission rates.
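
For a rough sense of how far 43 out of 64 is from either baseline, a binomial tail probability will do. This is a sketch rather than a proper test: it treats each nominee as an independent draw, which the recurrence of the same authors undermines, and the function name is mine.

```python
from math import comb

def binomial_tail(n, k, p):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

nominees, women = 64, 43

for label, p in [("50-50 split", 0.50), ("57% submissions", 0.57)]:
    print(f"{label}: P(at least {women} of {nominees}) = "
          f"{binomial_tail(nominees, women, p):.4f}")

# Chance that a single five-person shortlist is all women at the submission rate.
print(f"all-women shortlist of 5: {0.57 ** 5:.3f}")
```

The last line also speaks to the opening question: under the same independence assumption it gives the chance of any one five-person shortlist being all women.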

Interestingly these rates make the gender split on the winners also look oddly biased (in the statistical sense). Only five of the twelve winners are women: about 42% of the winners against 67% of the nominees. I can’t find exact details of the judging process but I assume it is the same panel of judges in a given year who both shortlist and decide the winner. There’s no simple model of personal gender bias that easily accounts for a jury that is gender biased in two directions 🙂

One difference is that while the shortlisting is done ‘blind’ the selection of the winner is not. However, looking at the recurrence of names among the nominees, the role of ‘blind’ shortlisting can be overstated — notable authors have notable styles and even without names some stories are easily recognised (e.g. one 2015 shortlisted story was Hilary Mantel’s comical tale “The Assassination of Margaret Thatcher”). It is also reasonable to speculate that judges are looking for qualities in the short stories that may be more common among women authors — not because gender determines how people write but because of the on-going mechanics of expectations on creative people in a gendered society. Perhaps that explains both sets of results?

Simple proximate causes are inadequate here.

*[The reporting was in terms of male versus female, so I don’t know what the figures would be if the authors were classified with a broader view of gender. Without full author bios, the counts here are based on gendered first names or pronouns used in associated stories.]

**[To coincide with the Olympics the competition was opened up to more nationalities.]

***[Four nominated works and five nominated authors as one story was by two authors “Slog’s Dad” by Margaret Drabble & Dave Almond]