Category: Statistics

50% chance of doing X

This is a bit abstract and it follows on from this previous post about voting demographics.

Let’s say you’ve got a statistical model that predicts a person Z with Y characteristics has a 50% chance of doing X. The actual percentage doesn’t matter but 50% is a nice amount of measurable uncertainty: it is the point at which, given the context of Y, we are maximally unsure what person Z will do about X.

Empirically, the data would come from looking at lots of Y people and seeing that they do X 50% of the time. However, note that there’s a big and important distinction here between two extremes.

  1. Half of Y people do X and half of Y people don’t but those two halves are distinct. This implies that Y isn’t really the relevant factor here and we should be looking for some other feature of these people that better explains X behaviour.
  2. Y people do X half of the time randomly. That is, Y people are essentially a coin toss with regards to X. In that case Y isn’t great for predicting whether people will do X but it is really relevant to the question (particularly if W people behave more decisively).
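The two extremes can be told apart only by watching the same people more than once. As a rough sketch (the population size, number of occasions and function names are all made up for illustration), the simulation below builds one population of each kind. Both show the same overall 50% rate, but they differ completely in how consistent each individual is.

```python
import random


def simulate(n_people=1000, n_occasions=20, seed=0):
    """Build two hypothetical populations of Y people facing choice X."""
    rng = random.Random(seed)
    # Extreme 1: two distinct halves. Each person either always or never does X.
    distinct_halves = [[i % 2 == 0] * n_occasions for i in range(n_people)]
    # Extreme 2: every person is a fair coin toss on every occasion.
    coin_tossers = [
        [rng.random() < 0.5 for _ in range(n_occasions)]
        for _ in range(n_people)
    ]
    return distinct_halves, coin_tossers


def overall_rate(population):
    """Fraction of all observed occasions on which X was done."""
    done = sum(sum(person) for person in population)
    total = sum(len(person) for person in population)
    return done / total


def share_consistent(population):
    """Fraction of people who behaved the same way on every occasion."""
    return sum(1 for person in population if len(set(person)) == 1) / len(population)


distinct_halves, coin_tossers = simulate()
for name, population in [("distinct halves", distinct_halves),
                         ("coin tossers", coin_tossers)]:
    print(name, round(overall_rate(population), 3), share_consistent(population))
```

A single survey only captures the overall rate, which is why it cannot distinguish the two cases on its own.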

In the demographic voting model and taking a figure of say 80%:20% for atheists splitting between left and right, I suspect this is a grouping where individuals have even less variability in their actual voting patterns. Some of that 20% will be Ayn Rand style atheists who are very committed to a right-wing viewpoint, rather than representing a 20% chance that a given atheist would vote Republican. However, that is not necessarily true of other groups where the percentage may more closely represent a degree of individual variability.

 

US Voting Demographic Model

The Economist has a fascinating demographic model on US voters here: https://www.economist.com/graphic-detail/2018/11/03/how-to-forecast-an-americans-vote

There are no details on how robust the model is but they claim to have built it up from a large number of surveys of sufficient detail to compare the relative chance of a given person voting Republican or Democrat within a sub-group and controlling for the other sub-groups that person would be in.

It is an interesting perspective on political groupings. It’s not causal exactly but could help disentangle what relates to what in other groups.

For example, imagine you had a group of people who weren’t ostensibly related by politics. It could be a profession or members of a hobby related club. Now imagine that the members of the club were 70%/30% atheists v Christian and 60%/40% Democrat v Republican. Does the club lean Democrat because it has so many atheists in it or does it lean atheist because it has so many Democrats in it? The Economist’s model helps answer that question. Most Democrats aren’t atheists (mainly because few Americans are atheists) but atheism strongly implies a person will vote Democrat. Based on those numbers it looks like the Democrat leanings are due more to the large number of atheists than vice versa.

You can plug in your own demographic details to see how close you fit. You can also plug in counterfactuals about yourself. I’m not American so I can’t factually describe what part of the US I live in but in a parallel universe in which I did but was otherwise much the same I’d have at least an 80% chance of voting Democratic REGARDLESS of where I was from in the US.

Another look at crowdfunding data

In relation to the post on Vox Day’s comic being pulled from IndieGoGo and whether there were financial shenanigans, I thought I’d grab some data. Unfortunately, because the dodgy comic has been shelved, the contribution data is no longer available at the IndieGoGo site. However, a different “Arkhaven” (aka Castalia House, aka Vox Day’s vanity press) comic still has a live campaign. This one is being run by Timothy’s number 1 client, Jon Del Arroz. https://www.indiegogo.com/projects/the-ember-war-graphic-novel#/

Contribution data is available by looking at the list of backers. Not all backers are named (which is fair enough) and not all backers reveal how much they contribute (also fair enough – not trying to invade anybody’s privacy). For those backers who don’t reveal how much they gave publicly, the overall total can be inferred by subtracting the amount raised by people who do show their contribution from the complete amount raised. For convenience, I’ve shared that equally between all the backers who didn’t list an amount (of course, in reality, some may have given a lot less or a lot more).
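The inference described above is simple arithmetic. Here is a minimal sketch of it. The function name and the example figures are hypothetical (they are not the campaign’s real numbers) and were chosen only so the remainder comes to $70 across four private backers.

```python
def estimate_private_share(total_raised, public_amounts, n_private):
    """Split the unexplained remainder equally among the private backers.

    total_raised must be the amount actually raised so far,
    not the campaign goal.
    """
    if n_private <= 0:
        raise ValueError("need at least one private backer")
    remainder = total_raised - sum(public_amounts)
    if remainder < 0:
        raise ValueError("public amounts exceed the total raised")
    return remainder / n_private


# Hypothetical illustration: $1,000 raised, $930 of it visible, 4 private backers.
print(estimate_private_share(1000, [500, 250, 180], 4))  # 17.5
```

The equal split is only a convenience. The real private amounts could be spread very unevenly.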

NOTE: I’m using this just as an example of crowdfunding data. I’ll point out interesting or notable features but 1. I’m not saying they are evidence of anything dodgy and 2. to make any such claim would require looking at many other campaigns to get a sense of what was typical v unusual. Also, while many names are given publicly at the site, no names should be referred to in the comments etc aside from the organiser. Lastly, I may have made errors 🙂

[Graph: total contributions to the Ember War campaign over time]

There were four donations that were set as “Private” and they occurred “24” and “23” days ago (the data on the site is given in that format – I’ve inferred dates). Together they amount to $70 or $17.50 each i.e. unremarkable*.

The graph looks like what I might expect. I guess some campaigns might be more S shaped with a slow start and then a steeper climb before tapering off. Yet, campaigns with a big burst of contributions in the first few days and then a slow increase after makes sense also. I would imagine campaigns that run a danger of just falling short of their target goal might show a big blip near the end as the campaign makes one last push. For comparison here is a graph I drew of a different style of campaign: https://camestrosfelapton.wordpress.com/2018/03/03/looking-at-some-crowdfunding-data/ (it looks smoother because I spread some of the day-by-day data out artificially).

The biggest contributions were from five people who gave $515 each. That’s a curious amount, particularly as the tier reward is at $500 and there are no other donation values around that size i.e. everybody who gave a lot of money gave exactly the same amount. How weird is that? I don’t know, hence my caveat above. That may be something that happens a lot with crowdfunding campaigns or maybe it’s really weird. Drawing conclusions about what’s weird requires data from more campaigns but also models to compare data against. For example, there’s no tier reward between $150 and $500, so it is not actually surprising that there are no contributions at around $200 or $300.

Even so, around the $150 tier there is a lot more variation. Like I said, I don’t have a theoretical distribution to compare this against but while there’s more kinds of values they are still oddly clumped to my eye:

  • 2 at $172
  • 3 at $170
  • 2 at $167
  • 1 at $165
  • 18 at $162 (?!?)
  • 6 at $160
  • 0 at $150

I’ve no idea why exactly $162 is so popular. Perhaps it is a round number contribution in some other currency (don’t know what though – doesn’t match Euros or Canadian $). It doesn’t seem to match a combination of tier rewards either.

This graph shows the frequency of each of the 23 different sizes of amounts that were contributed and how much money was raised by that category. Bars are numbers of people and green dots are total amounts of money. ($18 is actually $17.50 and that’s actually the “Private” amount and hence should be taken with a pinch of salt).

[Graph: number of backers (bars) and total money raised (green dots) for each contribution amount]
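The tally behind a graph like this is just counts per amount multiplied out. As a small sketch, the code below uses only the amounts around the $150 tier that are listed earlier in this post (the full set of 23 amounts isn’t reproduced here, and the variable and function names are my own).

```python
# Number of backers at each amount near the $150 tier, as listed in the post.
tier_150_counts = {172: 2, 170: 3, 167: 2, 165: 1, 162: 18, 160: 6, 150: 0}


def totals_by_amount(counts):
    """Money raised by each contribution size (amount x number of backers)."""
    return {amount: amount * n for amount, n in counts.items()}


def share_of_backers(counts, amount):
    """Fraction of the backers in this group who gave exactly `amount`."""
    return counts[amount] / sum(counts.values())


totals = totals_by_amount(tier_150_counts)
for amount in sorted(tier_150_counts, reverse=True):
    print(f"${amount}: {tier_150_counts[amount]} backers, ${totals[amount]} raised")
print(f"Share giving exactly $162: {share_of_backers(tier_150_counts, 162):.0%}")
```

Within this group, more than half of the backers gave exactly $162, which is why that value stands out.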

I’d have expected something a bit more Pareto like I guess.

So, no big conclusion just that there’s stuff to look at and with enough background data of similar campaigns it would be plausible to spot campaigns that were distinctly unusual.

*[Speaking of errors, in the first graph I drew I’d calculated these ‘Private’ amounts incorrectly by using the Goal of the campaign instead of the total raised. Luckily I spotted my error before making a fool of myself.]

Claims and false claims

[A content warning: this post discusses sexual assault reports.]

All reports of a crime have potential consequences. We live in an age where false reports of crimes lead to death and where “SWATting” is a murderous prank. However, only one class of crime leads to constant concern from conservatives that false allegations are sufficiently common to require a kind of blanket scepticism. Amid the allegations against Supreme Court nominee Brett Kavanaugh, conservatives are pushing back against treating allegations of sexual assault at face value. This is part of a long history of people demanding that sexual assault crimes, in particular, require additional scepticism and scrutiny. That history pushed an idea that rape claims are made by women to ruin a man’s reputation even though historically the consequences of speaking out have always fallen more heavily on women than men*.

A piece by David French at the conservative magazine National Review attempts to push back against modern feminist advocacy for supporting victims of sexual violence:

“It happens every single time there’s a public debate about sex crimes. Advocates for women introduce, in addition to the actual evidence in the case, an additional bit of  “data” that bolsters each and every claim of sexual assault. You see, “studies” show that women rarely file false rape claims. According to many activists, when a woman makes a claim of sexual assault, there is an empirically high probability that she’s telling the truth. In other words, the very existence of the claim is evidence of the truth of the claim.” https://www.nationalreview.com/2018/09/brett-kavanaugh-accusations-rape-claim-statistics/

The tactic here is one we’ve seen in multiple circumstances where research runs counter to conservative beliefs. FUD (fear, uncertainty, doubt): everything from cigarettes to DDT to climate change has had the FUD treatment as an intentional strategy to undermine research. Note the ‘how ridiculous’ tone of ‘In other words, the very existence of the claim is evidence of the truth of the claim.’ when, yes, the existence of somebody claiming a crime happened to them IS evidence that a crime happened to them. It is typically the first piece of evidence of a crime! It isn’t always conclusive evidence of a crime for multiple reasons but yes, manifestly it is evidence. The rhetorical trick here is to take something that is actually commonplace (i.e. a default assumption that when a person makes a serious claim of a crime there is probably a crime) and make it sound spurious or unusual.

The thrust of the article rests on an attempt to debunk research that has been done on the issue of false rape allegations. To maintain the fear of men suffering from false rape allegations, the article aims to emphasise the uncertainty in the statistics to provoke doubt (and uncertainty) amid its target audience.

After a broad preamble, the article focuses on one study in particular and to the article’s credit it does actually link to the paper. The 2010 study in question is this one: False Allegations of Sexual Assault: An Analysis of Ten Years of Reported Cases by David Lisak, Lori Gardinier, Sarah C. Nicksa and Ashley M. Cote. The specific study looks at reports of sexual assault to campus police at a major US Northeastern university. However, the study also contains (as you might expect) a literature review of other studies conducted. What is notable about the studies listed is that they found the frequency of false allegations was over-reported. For example a 2005 UK Home Office study found:

“There is an over-estimation of the scale of false allegations by both police officers and prosecutors which feeds into a culture of skepticism, leading to poor communication and loss of confidence between complainants and the police.”

The space where David French seeks to generate uncertainty around these studies is two-fold:

  1. That sexual assault and rape are inherently difficult topics to research because of the trauma of the crime and social stigma [both factors that actually point to false allegations being *less* likely than other crimes, of course…]
  2. That there are a large number of initial reports of sexual assault where an investigation does not proceed.

That large numbers of rape and sexual assault reports to police go uninvestigated may sound more like a scandal than a counter-argument to believing victims but this is a fertile space for the right to generate doubt.

French’s article correctly reports that:

“researchers classified as false only 5.9 percent of cases — but noted that 44.9 percent of cases were classified as “Case did not proceed.””

And goes on to say:

“There is absolutely no way to know how many of the claims in that broad category were actually true or likely false. We simply know that the relevant decision-makers did not deem them to be provably true. Yet there are legions of people who glide right past the realities of our legal system and instead consider every claim outside those rare total exonerations to be true. According to this view, the justice system fails everyone else.”

The rhetorical trick is to confuse absolute certainty (i.e. we don’t know exactly what proportion of the uninvestigated claims might be false) with reasonable inferences that can be drawn from everything else we know (i.e. it is very, very unlikely to be most of them). We can be confident that cases that did not proceed BECAUSE the allegation was false (i.e. it was investigated and found to be false) were NOT included in the 44.9% of cases, precisely because those cases were counted as false allegations. More pertinently, linking back to the “fear” aspect of the FUD strategy, the 44.9% of cases also led to zero legal or formal consequences for the alleged perpetrators.

I don’t know if this fallacy has a formal name but it is one I see over and over. I could call it “methodological false isolation of evidence” by which I mean the tendency to treat each piece of evidence for a hypothesis as separate and with no capacity for multiple sources of evidence to cross-corroborate. If I may depart into anthropogenic global warming for a moment, you can see the fallacy work like this:

  • The physics of carbon dioxide and the greenhouse effect imply that increased CO2 will lead to warming: countered by – ah yes, but we can’t know by how much and maybe it will be less than natural influences on climate and maybe the extra CO2 gets absorbed…
  • The temperature record shows warming consistent with the rises in anthropogenic greenhouse gases: countered by – ah yes, but maybe the warming is caused by something natural…

Rationally the two pieces of evidence function together: correlation might not be causation but if you have causation AND correlation then, well, that’s stronger evidence than the sum of its parts.

With these statistics we are not operating in a vacuum. They need to be read and understood along with the other data that we know. Heck, that idea is built into the genre of research papers and is exactly why literature reviews are included. Police report statistics are limited and do contain uncertainty and aren’t a window into some Platonic world of ideal truth BUT that does not mean we know nothing and can infer nothing. Not even remotely. What it means is we have context to examine the limitations of that data and consider where the bias is likely to lie i.e. is the police report data more likely to OVERestimate the rate of false allegations or UNDERestimate it compared to the actual number of sexual assaults/rapes?

It’s not even a contest. Firstly as the 2010 report notes:

“It is notable that in general the greater the scrutiny applied to police classifications, the lower the rate of false reporting detected. Cumulatively, these findings contradict the still widely promulgated stereotype that false rape allegations are a common occurrence.”

But the deeper issue is the basic bias in the data that depends on reports to the police.

“It is estimated that between 64% and 96% of victims do not report the crimes committed against them (Fisher et al., 2000; Perkins & Klaus, 1996), and a major reason for this is victims’ belief that his or her report will be met with suspicion or outright disbelief (Jordan, 2004).”

Most victims of sexual assault do not report the crime at all i.e. most victims aren’t even in the data sets we are looking at. Assume for a moment that the lower bound of that figure (64%) is itself exaggerated (although why that would be the case I don’t know) and assume, to give David French an advantage, that 50% of actual sexual assaults go unreported and that half of the 44.9% figure were somehow actual FALSE allegations (again, very unlikely). That would make the proportion of false allegations compared with (actual assaults+false allegations) about 14% based on the 2010 study’s campus figure. It STILL, even with those overt biases included, points to false allegations being very unlikely.
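The back-of-envelope sum can be written out as a small sketch. The function names are my own, and the exact answer depends on how the “50% unreported” adjustment is read: doubling the whole pool of reports gives about 14%, while doubling only the reports treated as genuine gives about 16.5%. Either way the figure stays low, even under these deliberately generous assumptions.

```python
FALSE_RATE = 0.059          # reports classified as false (2010 study)
DID_NOT_PROCEED = 0.449     # reports classified as "Case did not proceed"
ASSUMED_FALSE_SHARE = 0.5   # generous assumption: half of those were false
UNREPORTED = 0.5            # generous assumption: half of assaults unreported


def false_reports(false_rate, did_not_proceed, assumed_false_share):
    """False allegations as a fraction of all reports, under the assumption."""
    return false_rate + assumed_false_share * did_not_proceed


def share_doubling_all_reports(false_frac, unreported):
    """Reading A: scale the whole pool of reports up for unreported assaults."""
    return false_frac / (1.0 / (1.0 - unreported))


def share_doubling_genuine_only(false_frac, unreported):
    """Reading B: scale only the genuine reports up for unreported assaults."""
    actual_assaults = (1.0 - false_frac) / (1.0 - unreported)
    return false_frac / (actual_assaults + false_frac)


false_frac = false_reports(FALSE_RATE, DID_NOT_PROCEED, ASSUMED_FALSE_SHARE)
print(f"Reading A: {share_doubling_all_reports(false_frac, UNREPORTED):.1%}")
print(f"Reading B: {share_doubling_genuine_only(false_frac, UNREPORTED):.1%}")
```

Plugging in the study’s own lower bound of 64% unreported instead of 50% pushes both figures lower still.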

It makes sense to believe. The assumption that rape in particular is likely to draw malicious allegations is a misogynistic assumption. That does not mean nobody has ever made a false claim of rape, it just means that we do not place the same burden of doubt on people when they claim to be robbed or mugged etc. People make mistakes and some people do sometimes maliciously accuse others of crimes but such behaviour is unusual and, if anything, it is particularly unusual with sexual crimes where, in fact, the OPPOSITE is more likely to occur: the victim makes no allegation out of fear of the consequences and because of the trauma involved.

Somehow it is 2018 and we still have to say this.

*[I don’t want to ignore that men are also victims of sexual violence, perhaps at far greater rates than are currently quantified, but the specific issue here relates to a very gendered view of sex and sexual assault.]

This is evidence of something but of what, I’m not sure

The BBC’s short story contest has, after a selection process in which author gender wasn’t known, selected an all-female shortlist. It isn’t a contest that I’m familiar with but apparently, it has been running for 13 years and on four previous occasions the shortlist has been all women*.

https://www.theguardian.com/books/2018/sep/14/bbc-short-story-prize-selects-all-female-shortlist-for-fifth-time?CMP=twt_gu

It’s nice to see authors being celebrated this way and the ‘blind’ selection process undermines the likely claim from the intransigently anti-women section of society that the nominees were chosen based on ‘affirmative action’ or some anti-men sentiment. I say ‘undermines’ because, of course, it hasn’t stopped the usual misogynistic comments on social media.

As a positive story it is still interesting to look at in terms of how results from awards may depart from simple demographic splits. As I’ve discussed here before, quantifying the discrepancy between actual results and what those results might be if demographic profile was effectively random, versus understanding where that discrepancy comes from and whether it is a problem, are two very different things. First things first though, is the shortlist of five women numerically remarkable?

The prize has been running since 2006 and by my count from Wikipedia five of the twelve winners have been women. According to the Guardian article, about 57% of submissions are from women.

  • In 2017 the shortlist of 5 included 2 women,
  • 2016 had 5 women,
  • 2015 had 2 women,
  • 2014 had 5 women,
  • 2013 had 5 women,
  • 2012 had a longer shortlist** of 10 authors of which 6 were women,
  • 2011 had 3 women,
  • 2010 had 3 women,
  • 2009 had 5 women,
  • 2008 had 3 women,
  • 2007 had 2 women***,
  • 2006 had 2 women on a shortlist of four.

Looking at the list there are a few obvious things: many of the same authors get nominated (which isn’t surprising) and just eyeballing the numbers suggests women are more likely to get nominated but it isn’t a trend as such. Of the 64 nominees, 43 were women, about 67%, which is more than would be predicted if the distribution was 50-50 and is higher than expected using a 57%-43% split based on submission rates.
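To put a rough number on “higher than expected”, one option is a simple binomial tail: how likely is it to see at least 43 women among 64 nominees if each nomination were an independent draw at a fixed rate? This is only a sketch under that independence assumption, which the repeated nominations of the same authors clearly strain, and the helper name is my own. Under the same assumption, a single all-female shortlist of five at the 57% submission rate has a probability of about 6% (0.57 to the power of 5).

```python
from math import comb

# (year, women on shortlist, shortlist size), as listed in the post.
SHORTLISTS = [
    (2017, 2, 5), (2016, 5, 5), (2015, 2, 5), (2014, 5, 5),
    (2013, 5, 5), (2012, 6, 10), (2011, 3, 5), (2010, 3, 5),
    (2009, 5, 5), (2008, 3, 5), (2007, 2, 5), (2006, 2, 4),
]


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


women = sum(w for _, w, _ in SHORTLISTS)
nominees = sum(size for _, _, size in SHORTLISTS)
print(f"{women} of {nominees} nominees were women ({women / nominees:.0%})")

for rate in (0.50, 0.57):
    tail = binomial_upper_tail(women, nominees, rate)
    print(f"P(at least {women} women | rate {rate:.0%}) = {tail:.4f}")

print(f"P(all-female shortlist of 5 | rate 57%) = {0.57 ** 5:.3f}")
```

A small tail probability would say the pooled split is unlikely under that simple model. It would not say where the discrepancy comes from.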

Interestingly these rates make the gender split on the winners also look oddly biased (in the statistical sense). Only five of the twelve winners are women: about 42% of the winners against 67% of the nominees. I can’t find exact details of the judging process but I assume it is the same panel of judges in a given year who both shortlist and decide the winner. There’s no simple model of personal gender bias that easily accounts for a jury that is gender biased in two directions 🙂

One difference is that while the shortlisting is done ‘blind’ the selection of the winner is not. However, looking at the recurrence of names among the nominees, the role of ‘blind’ shortlisting can be overstated: notable authors have notable styles and even without names some stories are easily recognised (e.g. one 2015 shortlisted story was Hilary Mantel’s comical tale “The Assassination of Margaret Thatcher”). It is also reasonable to speculate that judges are looking for qualities in the short stories that may be more common among women authors, not because gender determines how people write but because of the on-going mechanics of expectations on creative people in a gendered society. Perhaps that explains both sets of results?

Simple proximate causes are inadequate here.

*[The reporting was in terms of male versus female, so I don’t know what the figures would be if the authors were classified with a broader view of gender. Without full author bios, the counts here are based on gendered first names or pronouns used in associated stories.]

**[To coincide with the Olympics the competition was opened up to more nationalities.]

***[Four nominated works and five nominated authors as one story was by two authors “Slog’s Dad” by Margaret Drabble & Dave Almond]

 

Looking at Subscription Data

The discussion in the comments about Amazon ranks sent me off on a tangent. I gathered some Amazon rankings for SFF magazines that offer subscriptions via Amazon and having got that data I thought I should do something with it.

As I also had the 2017 Fireside Report data I thought I’d compare the two. Now, this data is not great. Firstly, while the Fireside Report is methodical it is necessarily less strong on a per-magazine basis than it is in aggregate — one author incorrectly identified (or not identified) would have a big impact on the proportion listed. Secondly, the Amazon rankings I’ve got don’t necessarily represent the size of the readership consistently between the magazines — there is some major variation in business model between the magazines listed.

Still, I was curious. Story outlets that maintain an ongoing Kindle subscription model would be (I speculated) the more established and hence ‘traditional’ and hence reflect the least amount of social/cultural change.

Given all that, it is not surprising that the data is really just a big bunch of all-over-the-place when comparing rankings. I did tabulate sub-rankings in particular categories but those rankings appeared to make no sense on their own terms and/or relied on classifications within Amazon that were not quite commensurate.

No strong conclusions to draw other than:

  • there’s no obvious commercial downside for outlets that have better representation
  • overall (as noted in the Fireside report) the level of representation isn’t good
  • Uncanny’s model doesn’t suit the ranking very well.

The last two columns are from the Fireside Report 2017 Google spreadsheet https://firesidefiction.com/blackspecfic-2017

Magazine                    Amazon Kindle Subs Rank    Total stories by black authors    % stories by black authors
Fantasy & Science Fiction   214,172                    4                                 6.7%
Asimov’s                    220,953                    1                                 1.4%
Analog                      263,553                    0                                 0.0%
Clarkesworld                343,321                    0                                 0.0%
Lightspeed                  428,963                    3                                 6.5%
Apex                        564,541                    6                                 18.8%
Nightmare Magazine          989,486                    2                                 8.3%
Uncanny                     2,464,628                  4                                 12.5%