Category: Research

50% chance of doing X

This is a bit abstract and it follows on from this previous post about voting demographics.

Let’s say you’ve got a statistical model that predicts a person Z with Y characteristics has a 50% chance of doing X. The actual percentage doesn’t matter but 50% is a nice amount of measurable uncertainty: it is the point of maximum uncertainty, where we know that we don’t know what person Z will do about X given the context of Y.

Empirically, the data would come from looking at lots of Y people and seeing that they do X 50% of the time. However, note that there’s a big and important distinction here between two extremes.

  1. Half of Y people do X and half of Y people don’t but those two halves are distinct. This implies that Y isn’t really the relevant factor here and we should be looking for some other feature of these people that better explains X behaviour.
  2. Y people do X half of the time randomly. That is, Y people are essentially a coin toss with regards to X. In that case Y isn’t great for predicting whether people will do X but it is really relevant to the question (particularly if W people behave more decisively).
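To make the distinction concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the population size, the number of occasions, and the idea that each person has a fixed personal probability of doing X. Both populations produce the same aggregate 50%, but the spread of per-person rates is very different.

```python
import random
import statistics

def simulate(population, occasions, rng):
    """For each person, return the fraction of occasions on which they did X.

    `population` is a list of per-person probabilities of doing X
    (a hypothetical simplification, not something a survey measures directly).
    """
    return [
        sum(rng.random() < p for _ in range(occasions)) / occasions
        for p in population
    ]

rng = random.Random(0)          # fixed seed so the sketch is repeatable
n_people, occasions = 1000, 50  # arbitrary illustrative sizes

# Extreme 1: two distinct halves. One half always does X, the other never does.
split_population = [1.0] * (n_people // 2) + [0.0] * (n_people // 2)
# Extreme 2: every individual is a coin toss with regard to X.
coin_population = [0.5] * n_people

split_rates = simulate(split_population, occasions, rng)
coin_rates = simulate(coin_population, occasions, rng)

# The aggregate rate is about 50% in both cases...
split_mean = statistics.mean(split_rates)
coin_mean = statistics.mean(coin_rates)
# ...but the spread of individual rates tells the two extremes apart.
split_spread = statistics.pstdev(split_rates)
coin_spread = statistics.pstdev(coin_rates)

print(f"split halves: mean={split_mean:.2f}, spread={split_spread:.2f}")
print(f"coin tosses:  mean={coin_mean:.2f}, spread={coin_spread:.2f}")
```

The catch, of course, is that the spread is only visible if you observe the same people more than once. A single snapshot survey cannot tell the two extremes apart.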

In the demographic voting model and taking a figure of say 80%:20% for atheists splitting between left and right, I suspect this is a grouping where individuals have even less variability in their actual voting patterns. That 20% is more likely to be made up of, for example, Ayn Rand-style atheists who are very committed to a right-wing viewpoint than to represent a 20% chance that a given atheist would vote Republican. However, that is not necessarily true of other groups where the percentage may more closely represent a degree of individual variability.

 


US Voting Demographic Model

The Economist has a fascinating demographic model on US voters here: https://www.economist.com/graphic-detail/2018/11/03/how-to-forecast-an-americans-vote

There are no details on how robust the model is but they claim to have built it up from a large number of surveys of sufficient detail to compare the relative chance of a given person voting Republican or Democrat within a sub-group and controlling for the other sub-groups that person would be in.

It is an interesting perspective on political groupings. It’s not causal exactly but could help disentangle what relates to what in other groups.

For example, imagine you had a group of people who weren’t ostensibly related by politics. It could be a profession or members of a hobby-related club. Now imagine that the members of the club were 70%/30% atheist v Christian and 60%/40% Democrat v Republican. Does the club lean Democrat because it has so many atheists in it or does it lean atheist because it has so many Democrats in it? The Economist’s model helps answer that question. Most Democrats aren’t atheists (mainly because few Americans are atheists) but atheism strongly implies a person will vote Democrat. Based on those numbers it looks like the Democrat leanings are due to the large number of atheists rather than vice versa.
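The asymmetry in that argument is just Bayes’ theorem at work. Here is a rough sketch in Python; the population figures are my own illustrative assumptions (a 5% atheist share, a 50% Democratic share, and the 80:20 atheist split mentioned elsewhere on this blog), not numbers taken from The Economist’s model.

```python
# All three figures below are illustrative assumptions, not model output.
P_ATHEIST = 0.05            # assumed share of atheists among US voters
P_DEM = 0.50                # assumed share of Democratic voters overall
P_DEM_GIVEN_ATHEIST = 0.80  # assumed: atheists split roughly 80:20

# Bayes' theorem: P(atheist | Democrat) = P(Democrat | atheist) * P(atheist) / P(Democrat)
p_atheist_given_dem = P_DEM_GIVEN_ATHEIST * P_ATHEIST / P_DEM

# The hypothetical club from the paragraph above.
CLUB_ATHEIST_SHARE = 0.70
CLUB_DEM_SHARE = 0.60

# If the club is 70% atheist, how much Democratic lean would the atheists alone supply?
dem_share_from_atheists = CLUB_ATHEIST_SHARE * P_DEM_GIVEN_ATHEIST
# If the club is 60% Democrat, how much atheism would the Democrats alone supply?
atheist_share_from_dems = CLUB_DEM_SHARE * p_atheist_given_dem

print(f"P(atheist | Democrat)        = {p_atheist_given_dem:.0%}")
print(f"Democrat share via atheists  = {dem_share_from_atheists:.0%}")
print(f"Atheist share via Democrats  = {atheist_share_from_dems:.1%}")
```

Under those assumptions the atheists alone account for almost all of the club’s Democratic lean, while the Democrats alone would account for only a sliver of its atheism. Different assumed figures would shift the numbers but it would take very different ones to reverse the direction.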

You can plug in your own demographic details to see how closely you fit. You can also plug in counterfactuals about yourself. I’m not American so I can’t factually describe what part of the US I live in but in a parallel universe in which I did but was otherwise much the same I’d have at least an 80% chance of voting Democratic REGARDLESS of where I was from in the US.

Claims and false claims

[A content warning: this post discusses sexual assault reports.]

All reports of a crime have potential consequences. We live in an age where false reports of crimes lead to death and where “SWATting” is a murderous prank. However, only one class of crime leads to constant concern from conservatives that false allegations are sufficiently common to require a kind of blanket scepticism. Amid the allegations against Supreme Court nominee Brett Kavanaugh, conservatives are pushing back against treating allegations of sexual assault at face value. This is part of a long history of people demanding that sexual assault crimes, in particular, require additional scepticism and scrutiny. That history pushed an idea that rape claims are made by women to ruin a man’s reputation even though historically the consequences of speaking out have always fallen more heavily on women than men*.

A piece by David French at the conservative magazine National Review attempts to push back against modern feminist advocacy for supporting victims of sexual violence:

“It happens every single time there’s a public debate about sex crimes. Advocates for women introduce, in addition to the actual evidence in the case, an additional bit of  “data” that bolsters each and every claim of sexual assault. You see, “studies” show that women rarely file false rape claims. According to many activists, when a woman makes a claim of sexual assault, there is an empirically high probability that she’s telling the truth. In other words, the very existence of the claim is evidence of the truth of the claim.” https://www.nationalreview.com/2018/09/brett-kavanaugh-accusations-rape-claim-statistics/

The tactic here is one we’ve seen in multiple circumstances where research runs counter to conservative beliefs: FUD (fear, uncertainty, doubt). Everything from cigarettes to DDT to climate change has had the FUD treatment as an intentional strategy to undermine research. Note the ‘how ridiculous’ tone of ‘In other words, the very existence of the claim is evidence of the truth of the claim.’ when, yes, the existence of somebody claiming a crime happened to them IS evidence that a crime happened to them. It is typically the first piece of evidence of a crime! It isn’t always conclusive evidence of a crime for multiple reasons but yes, manifestly it is evidence. The rhetorical trick here is to take something that is actually commonplace (i.e. a default assumption that when a person makes a serious claim of a crime there is probably a crime) and make it sound spurious or unusual.

The thrust of the article rests on an attempt to debunk research that has been done on the issue of false rape allegations. To maintain the fear of men suffering from false rape allegations, the article aims to emphasise the uncertainty in the statistics to provoke doubt (and uncertainty) amid its target audience.

After a broad preamble, the article focuses on one study in particular and to the article’s credit it does actually link to the paper. The 2010 study in question is “False Allegations of Sexual Assault: An Analysis of Ten Years of Reported Cases” by David Lisak, Lori Gardinier, Sarah C. Nicksa and Ashley M. Cote. The study itself looks at reports of sexual assault to campus police at a major Northeastern US university. However, the study also contains (as you might expect) a literature review of other studies conducted. What is notable about the studies listed is that they found the frequency of false allegations tends to be overestimated. For example, a 2005 UK Home Office study found:

“There is an over-estimation of the scale of false allegations by both police officers and prosecutors which feeds into a culture of skepticism, leading to poor communication and loss of confidence between complainants and the police.”

The space where David French seeks to generate uncertainty around these studies is two-fold:

  1. That sexual assault and rape are inherently difficult topics to research because of the trauma of the crime and social stigma [both factors that actually point to false allegations being *less* likely than other crimes, of course…]
  2. That there are large numbers of initial reports of sexual assault where an investigation does not proceed.

That large numbers of rape and sexual assault reports to police go uninvestigated may sound more like a scandal than a counter-argument to believing victims but this is a fertile space for the right to generate doubt.

French’s article correctly reports that:

“researchers classified as false only 5.9 percent of cases — but noted that 44.9 percent of cases were classified as “Case did not proceed.””

And goes on to say:

“There is absolutely no way to know how many of the claims in that broad category were actually true or likely false. We simply know that the relevant decision-makers did not deem them to be provably true. Yet there are legions of people who glide right past the realities of our legal system and instead consider every claim outside those rare total exonerations to be true. According to this view, the justice system fails everyone else.”

The rhetorical trick is to confuse absolute certainty (i.e. we don’t know exactly what proportion of the uninvestigated claims might be false) with reasonable inferences that can be drawn from everything else we know (i.e. it is very, very unlikely to be most of them). We can be confident that cases that did not proceed BECAUSE the allegation was false (i.e. it was investigated and found to be false) were NOT included in the 44.9% of cases precisely because those cases were counted as false allegations. More pertinently, linking back to the “fear” aspect of the FUD strategy, the 44.9% of cases also led to zero legal or formal consequences for the alleged perpetrators.

I don’t know if this fallacy has a formal name but it is one I see over and over. I could call it “methodological false isolation of evidence” by which I mean the tendency to treat each piece of evidence for a hypothesis as separate, with no capacity for multiple sources of evidence to cross-corroborate. If I may depart into anthropogenic global warming for a moment, you can see the fallacy work like this:

  • The physics of carbon dioxide and the greenhouse effect imply that increased CO2 will lead to warming: countered by – ah yes, but we can’t know by how much and maybe it will be less than natural influences on climate and maybe the extra CO2 gets absorbed…
  • The temperature record shows warming consistent with the rises in anthropogenic greenhouse gases: countered by – ah yes, but maybe the warming is caused by something natural…

Rationally the two pieces of evidence function together: correlation might not be causation but if you have causation AND correlation then, well, that’s stronger evidence than the sum of its parts.

With these statistics we are not operating in a vacuum. They need to be read and understood along with the other data that we know. Heck, that idea is built into the genre of research papers and is exactly why literature reviews are included. Police report statistics are limited and do contain uncertainty and aren’t a window into some Platonic world of ideal truth BUT that does not mean we know nothing and can infer nothing. Not even remotely. What it means is we have context to examine the limitations of that data and consider where the bias is likely to lie i.e. is the police report data more likely to OVERestimate the rate of false allegations or UNDERestimate it compared to the actual number of sexual assaults/rapes?

It’s not even a contest. Firstly as the 2010 report notes:

“It is notable that in general the greater the scrutiny applied to police classifications, the lower the rate of false reporting detected. Cumulatively, these findings contradict the still widely promulgated stereotype that false rape allegations are a common occurrence.”

But the deeper issue is the basic bias in the data that depends on reports to the police.

“It is estimated that between 64% and 96% of victims do not report the crimes committed against them (Fisher et al., 2000; Perkins & Klaus, 1996), and a major reason for this is victims’ belief that his or her report will be met with suspicion or outright disbelief (Jordan, 2004).”

Most victims of sexual assault do not report the crime at all i.e. most victims aren’t even in the data sets we are looking at. Assume for a moment that the lower bound of that figure (64%) is itself exaggerated (although why that would be the case I don’t know) and assume, to give David French an advantage, that 50% of actual sexual assaults go unreported and that half of the 44.9% figure were somehow actual FALSE allegations (again, very unlikely). That would make the proportion of false allegations compared with (actual assaults + false allegations) about 14% based on the 2010 study’s campus figure. It STILL, even with those overt biases included, points to false allegations being very unlikely.
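For anybody who wants to check the arithmetic, here is a small sketch in Python. Only the 5.9% and 44.9% figures come from the study as quoted above; the two 50% figures are the deliberately generous assumptions from the previous paragraph. The “loose” and “strict” readings are my own labels for two ways of doing the sum, and the strict one treats every report not counted as false as a genuine assault, which is itself a simplification.

```python
# Figures quoted from the 2010 campus study, as percentages of all reports.
CLASSIFIED_FALSE = 5.9
DID_NOT_PROCEED = 44.9

# Deliberately generous assumptions (not findings from any study).
SHARE_OF_UNPROCEEDED_ASSUMED_FALSE = 0.5
SHARE_OF_ASSAULTS_REPORTED = 0.5

reports = 100.0
false_reports = CLASSIFIED_FALSE + DID_NOT_PROCEED * SHARE_OF_UNPROCEEDED_ASSUMED_FALSE

# Loose reading: assume one unreported assault for every report of any kind,
# so 100 reports stand in for roughly 200 incidents plus allegations.
loose_total = reports / SHARE_OF_ASSAULTS_REPORTED
loose_share = false_reports / loose_total

# Strict reading: only the reports not counted as false are scaled up
# to estimate actual assaults, then false allegations are added back in.
genuine_reports = reports - false_reports
actual_assaults = genuine_reports / SHARE_OF_ASSAULTS_REPORTED
strict_share = false_reports / (actual_assaults + false_reports)

print(f"loose reading:  {loose_share:.1%}")
print(f"strict reading: {strict_share:.1%}")
```

The loose reading gives roughly 14% and the strict reading roughly 16.5%. Either way, even with the thumb pressed firmly on the scale, false allegations come out as a small minority.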

It makes sense to believe. The assumption that rape in particular is likely to draw malicious allegations is a misogynistic assumption. That does not mean nobody has ever made a false claim of rape, it just means that we do not place the same burden of doubt on people when they claim to be robbed or mugged etc. People make mistakes and some people do sometimes maliciously accuse others of crimes but such behaviour is unusual and, if anything, it is particularly unusual with sexual crimes where, in fact, the OPPOSITE is more likely to occur: the victim makes no allegation out of fear of the consequences and because of the trauma involved.

Somehow it is 2018 and we still have to say this.

*[I don’t want to ignore that men are also victims of sexual violence, perhaps at far greater rates than are currently quantified, but the specific issue here relates to a very gendered view of sex and sexual assault.]

Reading Bad Science So You Don’t Have To

Yesterday I made a mistake. I was aware of a kerfuffle around the publication in the notable open journal PLOS One of an article on the dubious notion of “Rapid Onset Gender Dysphoria”. The mistake was not listening sufficiently when lots of people said the article was very, very bad and thinking “Yeah, but how bad could it be if it was published somewhere non-obscure?” A pernicious thought that I’m very glad I didn’t express out loud because then I went and read the article…

There’s a bad, bad mental habit of discounting objections to arguments that you haven’t paid attention to when those arguments come from people you perceive as being in some way partisan on an issue, even if you yourself are partisan on the issue. It is the insidious sibling of false balance and ‘both sides’ that assumes that criticism from ‘your’ side must be at least a bit exaggerated. So I’m starting with a mea-culpa: it wasn’t that I didn’t believe the critics of the article, just that I assumed they must be exaggerating its badness at least a little. It’s a bias of arrogance that assumes that because somebody feels passionate about something that their statements aren’t wholly reliable (arrogance because it’s not a rule you then apply to yourself).

Anyway, enough beating myself up. Some background.

The rights of transgender people have become an increasingly virulent political battlefield following a pattern that we’ve seen many times before: a group of people who have been systematically marginalised ask for what is little more than basic human dignity only to be met with a counter-reaction that is deeply confronting. The pushback from the Christian Right and the Alt-Right is one thing but the vehement reaction from some people in the centre and the left can also be terrible.

“Rapid Onset Gender Dysphoria” is a concept that has been floating around various anti-transgender rights groupings on the internet. It is essentially a pseudo-scientific term in the sense that it takes a basic (and false) claim and dresses it up in quasi-scientific terms. The claim is that many teenagers are claiming to have gender dysphoria because it is trendy or because of peer pressure. The concept has rested mainly on increased visibility of transgender people in society and structurally is no different than similar claims made about people being gay or lesbian. As social stigma is reduced and as people find more open social support, more people will be public about core aspects of themselves. (see also this link https://medium.com/@juliaserano/everything-you-need-to-know-about-rapid-onset-gender-dysphoria-1940b8afdeba )

However, there are people who are keen for “Rapid Onset Gender Dysphoria” to be a thing: that is, somehow a “real” medical condition that should be recognised as a false-positive when a young person claims to be transgender. Essentially it is a way of trying to medicalise the argument that “it is just a phase”.

Into this space has come a paper published in the journal PLOS One entitled “Rapid-onset gender dysphoria in adolescents and young adults: A study of parental reports” by Lisa Littman. It is currently available here https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0202330 and it really is a very bad study.

There is a whole series of posts starting here that pull the study to bits in various ways that are worth reading and in particular this post: https://genderanalysis.net/2018/08/meet-the-unbiased-reliable-not-at-all-transphobic-parents-from-the-rapid-onset-gender-dysphoria-study/ gets to the heart of what is so very wrong with the paper.

The paper claims to be a kind of exploratory study of “rapid onset gender dysphoria” as a hypothesis and uses a report from parents to suggest that such a phenomenon exists as an actual condition. The paper then speculates on causes and recommends some actions by medical practitioners. However, the paper is not just methodologically flawed, it is actually an unwitting study of something else altogether.

The study involved posting a questionnaire on several parent websites but those websites were websites where “rapid onset gender dysphoria” was already being discussed and which included activism around the concept. The survey itself included leading questions such as:

“How many of your children have experienced a sudden or rapid onset of gender dysphoria, which began after puberty?”

In the survey instrument that is the first question after the basic demographic questions. Put another way, the study assumed the existence of the phenomenon and then surveyed people likely to also believe in the existence of the phenomenon and then reported positive results as confirmation that the phenomenon existed.

Put another way, it would be like surveying people on an internet forum where people discussed and shared studies of UFO sightings, asking them “How many UFOs have you seen?” and then credulously using the number of positive reports as a lead-in to discussing whether the UFOs were from Mars or another galaxy. Too silly an example? Perhaps, but here is a different analogy. The study is very like surveying internet forums known for GamerGate activism and asking them “How many cases of bad ethics have you seen in games journalism?” and then concluding that the volume of responses showed something meaningful about how reviews of video games function.

Essentially it is a very complex form of question-begging.

There are some formal ways that the paper adopts valid methods. There is a survey and the responses were collected and (I assume) tabulated correctly. It’s not necessarily invalid to collect data from parents to possibly identify a medical condition in their children. However, the basic structure of what was done fundamentally changes the meaning of all the results from what was intended.

To salvage the work done and turn the data into something meaningful would require recognising that it answers no medical questions at all. Instead, what has been done is a survey into *beliefs* and specifically beliefs among a community of people i.e. the study is sociological and the topic is “really” how a pseudoscientific idea becomes entrenched in some internet communities. Ironically, the study focuses on concepts such as “social contagion” and yet somehow misses that the study itself reveals how ideas spread due to peer pressure and apparent topicality.

What lessons are there to be learned? Well, for me, I should not have wasted my time reading a paper that everybody had already told me was bad. It was bad: bad, not just interestingly flawed or mistaken.

Geometry News!

I appreciate a nice polyhedron but as an area of interest it isn’t one prone to many events. The regular polyhedra were fully classified a very long time ago, although they are just one set in an infinite space of 3D objects with polygonal faces. If you allow for curves or slightly curved almost-polygons then there is a lot to play with but not many objects stand out from the crowd.

Anyway, biology to the rescue! An article in Nature Communications (https://www.nature.com/articles/s41467-018-05376-1 ) looks at the issue of cell-packing. Our cells are squishy 3D objects that pack together to form tissue. Now getting objects to pack together to fill a space efficiently is a well-known and difficult problem if you are dealing with anything other than cubes. Hexagonal prisms are a solution that crops up in nature in places such as basalt rock formations and bee hives (and presumably bee hives made out of basalt on some planet with magma bees and honey volcanoes).

In 2D one way of filling a plane with irregular but simple polygons is a Voronoi pattern. Arrangements of cells in a layer looked at ‘top-down’ can (apparently) resemble that kind of pattern but that doesn’t help describe the 3D aspect of the cells. Prisms don’t work because the ‘top’ face may be smaller than the ‘bottom’ face. Frustums (chopped off pyramids) don’t work because the ‘top’ and ‘bottom’ faces may be polygons of different sizes and frustums don’t necessarily pack nicely. Enter the scutoid.

Scutoids are (apparently, I’m just reading the paper) messed up prisms. The example picture shows a shape with a pentagon-bottom and a hexagon-top and the vertices of each polygon joined by curved edges with the exception of an additional triangular face. Flip the same shape upside down and the two can nestle into each other. Which is sweet.

 


Fig1 from https://www.nature.com/articles/s41467-018-05376-1 ‘Scutoids are a geometrical solution to three-dimensional packing of epithelia’ Nature Communications volume 9, Article number: 2960 (2018)

 

So not quite polyhedra, crazy mixed up nearly prisms that know how to pack. The picture of the beetle is there because of the distinct pattern of five shapes, specifically that little triangle at the top where the line between the carapace covering the wing splits. The combination of faces on the scutoid reminded the researchers of the beetle and the ‘scutoid’ name is derived from that.

Also I don’t know if you say “scoo-toid” or “scuh-toid”.

Other coverage:

https://gizmodo.com/the-scutoid-is-geometrys-newest-shape-and-it-could-be-1827924643?IR=T

https://www.newscientist.com/article/2175297-a-new-shape-called-the-scutoid-has-been-discovered-in-our-cells/

 

No, that doesn’t settle it.

A Washington Post article about a research paper claims to have settled the question of whether to put one space or two spaces after a full-stop (or as Americans like to say a ‘period’*).

The research neatly encapsulates some of the elements of questions of objectivity and meaning that I keep returning to.

The research had two components. The first was about usage and is interesting but not consequential. The second is of more note. Using eye-tracking, the researchers measured how a number of people followed sentences they were reading. Using that data, they could compare the relative reading ease of texts that used a single space after a full-stop and texts that used two spaces.

The results showed a small advantage for two spaces. By ‘small’ I mean:

  • ‘comprehension was not affected by punctuation spacing’ i.e. there was no measurable difference in how well subjects understood the texts they were reading.
  • there was some evidence that ‘initial processing of the text was facilitated when periods were followed by two spaces’.

So practically, two-spaces was not obviously better but MAYBE it required a smaller effort to read, perhaps. Note this second conclusion requires its own chain of inference that’s not well established i.e. it assumes that the processing of the text was facilitated but that was not measured directly.

But the bigger issue (mentioned in the WP article but not in the abstract of the paper) was that the text used was…

 ...in a monospaced font.

That does not make any of the findings in the report invalid. It doesn’t undermine the quality of the methodology used. It doesn’t make the findings less objective BUT it does entirely miss the point of the underlying argument.

The two-space versus one-space debate pertains to the transition from typewriters to modern word-processing. Classic typewriters had to use common widths between letters due to the mechanics of a typewriter, including a degree of error as to exactly where a letter might be placed. Modern word-processing uses typefaces where letters and the spacing around them are customised for not just individual letters but also for punctuation. The two-space versus one-space argument is one about the transition from classic typing to modern word-processing.

There is a parallel with drug trials here. For example, a new drug or treatment might be compared with a placebo. That’s a scientifically legitimate approach to collecting data and looking at efficacy. However, it’s often not the relevant question. More pertinent is how the new drug compares with existing treatment rather than a placebo.

The point being: what is the underlying issue or what is the question being asked? These are vaguer, woollier aspects of scientific inquiry but also deeply important. More clarity on those aspects helps us judge whether empirical evidence is relevant to the question being asked.

However, my point above does not mean the research was wasted. It does demonstrate a couple of things:

  • The typing habit of using two spaces after a full stop had some merit.
  • The possible advantage of using two spaces is very small.

I don’t think either conclusion helps out the two-spacers much. The first implies social habits and vague aesthetics of people who type can be trusted – and that would tend towards favouring the one-spacer’s attitude to modern texts with modern fonts. The second implies that the cost-benefit of using two spaces is at best marginal and at worst a waste of time. Although, I’m clearly showing my one-spaced prejudices here.

*(As we are engaged in trivial quibbles of no actual consequence, let me just say that ‘period’ should be retired as a name for the full-stop. It should then be re-allocated to the n-dash whose role is often to indicate a period of time, such as when it joins two dates together. I also have opinions about hyphens and dashes that I will reserve for another post – I feel the controversy would just be WAY too much for you all.)

Objectivity and stuff

I wanted to write about some of the interesting things people have been saying about reviewing but part of my brain obviously wants to talk about reason and evidence and those sorts of things. I guess I haven’t done much of that this year in an attempt to look less like a philosophy professor.

Anyway – objectivity! The thing with objectivity as a word is that we (including myself) use it in a way that implies various things which maybe aren’t really part of what it means. Objectivity carries positive connotations and connotations of authority in contrast to subjectivity. Those connotations suggest impartial judgement and a lack of bias. That’s all well and good – words can mean whatever a community of users want them to mean but I think it creates confusion.

Here is a different sense of ‘objective’ – to say something is objective is to say that two people can follow the same steps/process and come up with the same answer reliably. Maybe we should use a different word for that but such processes are often described as ‘objective’ because they clearly contrast with subjective judgement.

The thing is that meaning does not in ANY WAY imply a lack of bias. Lots of systematic or automated processes can contain bias. Indeed we expect there to be biases in, for example, processes for collecting data. More extreme examples include machine learning algorithms which are inherently repeatable and ‘objective’ in that sense (and the sense that they operate post-human judgement) that nonetheless repeat human prejudices because those prejudices exist in the data they were trained on.

Other examples include the data on gender disparity in compensation for Uber drivers – the algorithm was not derived from human prejudices but there was still a pay disparity that arose from different working patterns that arose from deep-seated social disparities.

However, there is still an advantage here in terms of making information and data gathered more objective. Biases may not be eliminated but they are easier to see, identify and quantify.
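As a toy illustration of a process that is ‘objective’ in the repeatable sense and still biased, here is a short sketch in Python. The data, the group labels and the “model” (which just memorises historical approval rates per group) are all invented for the example; no real machine learning system is this simple.

```python
from collections import defaultdict

# Hypothetical historical decisions as (group, approved) pairs. The data are
# invented so that group "B" was approved less often than group "A".
HISTORY = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 4 + [("B", False)] * 6
)

def fit_approval_rates(records):
    """'Train' by memorising the historical approval rate for each group."""
    totals = defaultdict(int)
    approvals = defaultdict(int)
    for group, approved in records:
        totals[group] += 1
        approvals[group] += int(approved)
    return {group: approvals[group] / totals[group] for group in totals}

def predict(rates, group):
    """Approve whenever the group's historical approval rate is at least 50%."""
    return rates[group] >= 0.5

# Two people following the same steps get exactly the same answer...
first_run = fit_approval_rates(HISTORY)
second_run = fit_approval_rates(HISTORY)
print("repeatable:", first_run == second_run)

# ...and that answer faithfully reproduces the prejudice in the data.
print("A approved:", predict(first_run, "A"))
print("B approved:", predict(first_run, "B"))

# The upside: the bias is now easy to see and to put a number on.
gap = first_run["A"] - first_run["B"]
print(f"approval rate gap: {gap:.0%}")
```

The point of the last line is the one made above: the bias hasn’t gone anywhere, but once the process is explicit the bias is something you can measure rather than argue about.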

Flipping back to ‘subjective’, I have discussed before both the concept of intersubjectivity (shared consensus opinions and beliefs that are not easily changed) as well as the possibility of there being objective facts about subjective opinions (e.g. my opinion that Star Trek: Discovery was flawed is subjective but it is an objective fact about the universe that I held that opinion).

Lastly the objective aspect of data can be mistaken for the more subjective interpretation of the data. In particular the wider meaning or significance of a data set is not established simply by the fact that the data is collected reliably or repeatedly.

Consider another topic: IQ. I’ve discussed before aspects of IQ and IQ testing and much of the pseudoscientific nonsense talked about it. Look at these two claims about Roberta and Bob:

  • Roberta: My IQ is higher than Bob’s.
  • Roberta: I am more intelligent than Bob.

The first statement may be an objective fact – it is certainly a claim that can be tested and evaluated by prescribed methods. The second statement is more problematic: it relies on opinions about IQ and the nature of intelligence that are not well established. The objectivity of the first statement does not establish the objectivity of the second. Nor does the apparent objectivity of the first imply that it does not have biases that may also impact wider claims based upon it.