The US Presidential Election isn’t just a river in Egypt, it is also a series of bizarre claims. One of the many crimes against statistics being thrown about in what is likely to be a 5-year (minimum) tantrum about the election is a claim about Benford’s law. The first example I saw was last Friday on Larry Correia’s Facebook:
“For those of you who don’t know, basically Benford’s Law is about the frequency distribution of numbers. If numbers are random aggregates, then they’re going to be distributed one way. If numbers are fabricated by people, then they’re not. This is one way that auditors look at data to check to see if it has been manipulated. There’s odds for how often single digit, two digit, three digit combos occur, and so forth, with added complexity at each level. It appears the most common final TWO digits for Milwaukee’s wards is 00. Milwaukee… home of the Fidel Castro level voter turn out. The odds of double zero happening naturally that often are absurdly small. Like I don’t even remember the formula to calculate that, college was a long time ago, but holy shit, your odds are better that you’ll be eaten by a shark ON LAND. If this pans out, that is downright amazing. I told you it didn’t just feel like fraud, but audacious fraud. The problem is blue machine politics usually only screws over one state, but right now half the country is feeling like they got fucked over, so all eyes are on places like Milwaukee.I will be eagerly awaiting developments on this. I love fraud stuff. EDIT: and developments… Nothing particularly interesting. Updated data changes some of the calcs, so it goes from 14 at 0 to 13 at 70. So curious but not damning. Oh well.”
So after hyping up an idea he only vaguely understood (Benford’s law isn’t about TRAILING digits for f-ck’s sake, and SOME number has to be the most common), Larry walked the claim back when it became clear that there was not very much there. As Larry would say, beware of Dunning-Krugerands.
The same claim was popping up elsewhere on the internet and there was an excellent Twitter thread debunking the claims here:
But we can have hierarchies of bad-faith, poorly understood arguments. Larry Correia didn’t have the integrity to at least double-check the validity of what he was posting before he posted it, but at least he checked afterwards…sort of. Vox Day, however, has now also leaped upon the magic of Benford’s law.
Sean J Taylor’s Twitter thread does a good job of debunking this but as it has now come up from both Sad and Rabid Puppies, I thought I’d talk about it a bit as well with some examples.
First of all Benford’s law isn’t much of a law. Lots of data won’t follow it and the reason why some data follows it is not well understood. That doesn’t mean it has no utility in spotting fraud, it just means that to use it you first need to demonstrate that it applies to the kind of data you are looking at. If Benford’s Law doesn’t usually apply to the data you are looking at but your data does follow Benford’s law then THAT would/might be a sign of something going on.
That’s nothing unusual in statistics. Data follows distributions and comparing data against an applicable distribution that you expect to apply is how a lot of statistics is done. Benford’s law may or may not be applicable. As always, IT DEPENDS…
For example, if I grab the first digit of the number of Page Views on Wikipedia of Hugo Award finalists then I get a set of data that is Benford-like:
The most common digit is 1, as Benford’s law predicts. According to the law, the probability of a leading digit d is log10(1+1/d), which for d = 1 is about 30%. Of the 1241 entries, Benford’s law would predict 374 would have a leading digit of 1 and the actual data has 316. But you can also see that it’s not a perfect fit and we could (but won’t bother because we actually don’t care) run tests to see how good a fit it was.
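For anyone who wants to check the arithmetic, here is a minimal Python sketch of the Benford prediction for each leading digit. The function names are my own for illustration; the only figure taken from the text is the 1241 entries.

```python
import math

def benford_probability(d: int) -> float:
    """Benford's predicted probability that the leading digit is d (1 to 9)."""
    return math.log10(1 + 1 / d)

def expected_counts(n: int) -> dict[int, float]:
    """Expected number of values with each leading digit among n values."""
    return {d: n * benford_probability(d) for d in range(1, 10)}

N_ENTRIES = 1241  # number of entries quoted in the text

for digit, count in expected_counts(N_ENTRIES).items():
    # Print the predicted share and the expected count for each leading digit
    print(f"{digit}: {benford_probability(digit):.3f}  expected about {count:.0f}")
```

The nine probabilities sum to 1, and the expected count for a leading 1 rounds to the 374 quoted above.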
But what if I picked a different set of numbers from the same data set? Here is the leading digit for the “Age at Hugo” figure graphed for the finalists where I have that data.
It isn’t remotely Benford like and that’s normal (ha ha) because age isn’t going to work that way. Instead the leading digit will cluster around the average age of Hugo finalists. If the data did follow Benford’s law it would imply that teenagers were vastly more likely to win Hugo Awards (or people over 100 I suppose or both).
Generally you need a wide spread of numbers across magnitudes. For example, I joked about Hugo winners in their teens or their centuries but if we also had Hugo finalists who were 0.1… years old as well (and all ages in between) then maybe the data might get a bit more Benfordish.
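The point about magnitudes can be sketched with a small simulation. This is an illustration only: the “ages” are made-up numbers drawn from a normal distribution with a mean of 45 and a standard deviation of 12 (my assumption, not the real Hugo data), and the “wide” values are spread evenly on a log scale from 1 to 1,000,000.

```python
import random
from collections import Counter

def leading_digit(x: float) -> int:
    """First significant digit of a positive number, via scientific notation."""
    return int(f"{x:.15e}"[0])

def digit_shares(values):
    """Share of positive values starting with each digit 1 to 9."""
    counts = Counter(leading_digit(v) for v in values if v > 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

rng = random.Random(42)  # fixed seed so the run is repeatable
N = 20_000

# ASSUMPTION: invented "ages", clustered around 45 (not real finalist data)
ages = [rng.gauss(45, 12) for _ in range(N)]
# Values spanning six orders of magnitude, uniform on a log scale
wide = [10 ** rng.uniform(0, 6) for _ in range(N)]

age_shares = digit_shares(ages)
wide_shares = digit_shares(wide)
for d in range(1, 10):
    print(d, f"ages {age_shares[d]:.3f}", f"wide {wide_shares[d]:.3f}")
```

Under these assumptions the clustered “ages” pile up on a leading 4, while the wide-ranging values come out close to the Benford proportions.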
So what about election data? ¯\_(ツ)_/¯
The Twitter thread above cites a paper entitled Benford’s Law and the Detection of Election Fraud but I haven’t read it. The abstract says:
“Looking at simulations designed to model both fair and fraudulent contests as well as data drawn from elections we know, on the basis of other investigations, were either permeated by fraud or unlikely to have experienced any measurable malfeasance, we find that conformity with and deviations from Benford’s Law follow no pattern. It is not simply that the Law occasionally judges a fraudulent election fair or a fair election fraudulent. Its “success rate” either way is essentially equivalent to a toss of a coin, thereby rendering it problematical at best as a forensic tool and wholly misleading at worst.”
Put another way, some election data MIGHT follow Benford’s law sometimes. That makes sense because it will partly depend on the scale of data we are looking at. For example, imagine we had voting areas of approximately 800 likely voters and two viable candidates. Would we expect “1” to be a typical leading digit in vote counts? Not at all! “3” and “4” would be more typical. Add more candidates and more people and things might get more Benford-like.
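Here is a rough Python sketch of that thought experiment. Everything in it is hypothetical: precinct sizes between 700 and 900 voters, and a two-candidate split drawn from a normal distribution centred on 50% with a standard deviation of 8 points. It is not a model of any real election.

```python
import random
from collections import Counter

def simulate_precinct_counts(n_precincts: int = 5000, seed: int = 0) -> list[int]:
    """Vote counts for two candidates in made-up precincts of roughly 800 voters."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_precincts):
        voters = rng.randint(700, 900)  # ASSUMPTION: precinct size
        # ASSUMPTION: candidate A's share, clamped to stay between 5% and 95%
        share_a = min(0.95, max(0.05, rng.gauss(0.5, 0.08)))
        votes_a = round(voters * share_a)
        counts.extend([votes_a, voters - votes_a])  # candidate B gets the rest
    return counts

def leading_digit_shares(values: list[int]) -> dict[int, float]:
    """Share of positive counts starting with each digit 1 to 9."""
    digits = Counter(int(str(v)[0]) for v in values if v > 0)
    total = sum(digits.values())
    return {d: digits.get(d, 0) / total for d in range(1, 10)}

shares = leading_digit_shares(simulate_precinct_counts())
for d, s in shares.items():
    print(d, f"{s:.3f}")
```

With these invented settings, leading digits of 3 and 4 dominate and a leading 1 is rare, which is the opposite of what Benford’s law predicts, and nothing fraudulent went into the simulation.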
Harvard University has easily downloadable US presidential data by state from 1976 to 2016. At this scale and with all candidates (including numerous 3rd, 4th party candidates) you do get something quite Benford-like but with maybe more 1s than expected.
Now look specifically at Donald Trump in 2016 and compare that with the proportions predicted by Benford’s law:
Oh noes! Trump 2016 has too many 1s! Except…the same caveat applies. We have no idea if Benford’s law applies to this kind of data! For those curious, Hillary Clinton’s data looks like (by eyeball only) a better fit.
Now we could test these to see how good a fit they are but…why bother? We still don’t know whether we expect the data to be a close fit or not. If you are looking at those graphs and thinking “yeah but maybe it’s close enough…” then you also need to factor in scale. I don’t have data for individual polling booths or whatever but we can look at the impact of scale by looking at minor candidates. Here’s one Vox Day would like, Pat Buchanan.
My eyeballs are more than sufficient to say that those two distributions don’t match. By Day’s misapplied standards, that means Pat Buchanan is a fraud…which he is, but probably not in this way.
Nor is it just scale that matters. Selection bias and our old friend cherry picking are also invited to the party. Because the relationship between the data and Benford’s law is inconsistent and not understood, we can find examples that fit somewhat (Trump, Clinton) and examples that really don’t (Buchanan) but also examples that are moderately wonky.
Here’s another old fraudster, but one whose dubious nature is not demonstrated by this graph:
That’s too many twos, Ronnie!
Anyway, that is far too many words and too many graphs to say that for US Presidential election data Benford’s law applies only just enough to be horribly misleading.
Sean J Taylor’s R code https://gist.github.com/seanjtaylor/cd85175055e66cdc2bb7899a3bcdf313
Deckert, J., Myagkov, M., & Ordeshook, P. (2011). Benford’s Law and the Detection of Election Fraud. Political Analysis, 19(3), 245-268. doi:10.1093/pan/mpr014 https://www.cambridge.org/core/journals/political-analysis/article/benfords-law-and-the-detection-of-election-fraud/3B1D64E822371C461AF3C61CE91AAF6D