Category: Statistics

Cheese Update

Sorry but I uploaded that last graph with some layers missing. Tweaked and corrected.

The Consonantal USA*

*[Pun courtesy of Mr Mike Glyer, purveyor of fine blogs, fanzines and assorted goods]

There are many things we can say about the states that comprise the United States of America. “Why?” might be one of them but another might be “which state has the highest proportion of consonants in its name?” If that is your question then behold! The answer is below.

    • 25.0% consonants=1 length=4 | Iowa
    • 25.0% consonants=1 length=4 | Ohio
    • 33.3% consonants=2 length=6 | Hawaii
    • 33.3% consonants=3 length=9 | Louisiana
    • 40.0% consonants=2 length=5 | Maine
    • 40.0% consonants=2 length=5 | Idaho
    • 42.9% consonants=3 length=7 | Alabama
    • 42.9% consonants=3 length=7 | Arizona
    • 42.9% consonants=3 length=7 | Georgia
    • 42.9% consonants=3 length=7 | Indiana
    • 50.0% consonants=4 length=8 | Missouri
    • 50.0% consonants=4 length=8 | Oklahoma
    • 50.0% consonants=3 length=6 | Alaska
    • 50.0% consonants=5 length=10 | California
    • 50.0% consonants=4 length=8 | Colorado
    • 50.0% consonants=4 length=8 | Delaware
    • 50.0% consonants=4 length=8 | Illinois
    • 50.0% consonants=3 length=6 | Nevada
    • 50.0% consonants=3 length=6 | Oregon
    • 50.0% consonants=2 length=4 | Utah
    • 50.0% consonants=4 length=8 | Virginia
    • 53.8% consonants=7 length=13 | South Carolina
    • 54.5% consonants=6 length=11 | South Dakota
    • 55.6% consonants=5 length=9 | Minnesota
    • 55.6% consonants=5 length=9 | New Mexico
    • 55.6% consonants=5 length=9 | New Jersey
    • 55.6% consonants=5 length=9 | Tennessee
    • 57.1% consonants=4 length=7 | Montana
    • 57.1% consonants=4 length=7 | Wyoming
    • 57.1% consonants=4 length=7 | Florida
    • 57.1% consonants=4 length=7 | New York
    • 58.3% consonants=7 length=12 | Pennsylvania
    • 58.3% consonants=7 length=12 | West Virginia
    • 60.0% consonants=3 length=5 | Texas
    • 61.5% consonants=8 length=13 | North Carolina
    • 62.5% consonants=5 length=8 | Maryland
    • 62.5% consonants=5 length=8 | Michigan
    • 62.5% consonants=5 length=8 | Arkansas
    • 62.5% consonants=5 length=8 | Kentucky
    • 62.5% consonants=5 length=8 | Nebraska
    • 63.6% consonants=7 length=11 | Mississippi
    • 63.6% consonants=7 length=11 | Connecticut
    • 63.6% consonants=7 length=11 | North Dakota
    • 63.6% consonants=7 length=11 | Rhode Island
    • 66.7% consonants=8 length=12 | New Hampshire
    • 66.7% consonants=4 length=6 | Kansas
    • 66.7% consonants=6 length=9 | Wisconsin
    • 69.2% consonants=9 length=13 | Massachusetts
    • 70.0% consonants=7 length=10 | Washington
    • 71.4% consonants=5 length=7 | Vermont
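
For anyone who wants to check the list, here is a minimal sketch of how figures like these could be produced. It is not necessarily how the list was actually made. It assumes spaces are ignored and that "y" counts as a vowel, since that is the reading under which Wyoming comes out at 4 consonants out of 7. The name consonant_share is made up for illustration.

```python
# Assumption: "y" is treated as a vowel, which matches Wyoming = 4 of 7 above.
VOWELS = set("aeiouy")


def consonant_share(name):
    """Return (consonant count, letter count, percentage) for a state name."""
    # Keep letters only, so the space in "New York" does not count towards length.
    letters = [ch for ch in name.lower() if ch.isalpha()]
    consonants = [ch for ch in letters if ch not in VOWELS]
    return len(consonants), len(letters), 100 * len(consonants) / len(letters)


if __name__ == "__main__":
    for state in ["Iowa", "Wyoming", "New York", "Vermont"]:
        count, length, pct = consonant_share(state)
        print(f"{pct:.1f}% consonants={count} length={length} | {state}")
```

Sorting all fifty names by the percentage would give the ordering used in the list.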

 

Today’s Important Charts

Star Wars movies title lengths by year:

[Chart: Star Wars movie title length by year]

The long period of consensus on proper Star Wars movie title length has ended with a sharp decline.

Neither Caravan of Courage: An Ewok Adventure nor Ewoks: The Battle for Endor was included in the first graph as they were TV movies. However, including them implies the recent decline is a natural correction to mid-1980s excesses.

[Chart: Star Wars movie title length by year, with the Ewok TV movies included]

Most importantly (save the strongest results till last), including all the movies and comparing title length to running time produces this important result:

[Chart: running time versus title length]

That’s an R-squared that’s not to be sneezed at!* 40% of the variance is explained by title length. According to the Felapton Towers research scientists the mathematical model is:

running time = 151.42 – 1.2979×title length

This is excellent news for when they produce the film entitled “R2”, chronicling his life and career as a Sith Lord. The model predicts a running time of 148.8242 minutes, which is shorter than The Last Jedi (a bit of an outlier). Whereas The Life, Loves and Wacky Adventures of Galactic Senator Jar-Jar Binks and All His Fun Friends will mercifully be only half an hour long.

*[Moral: be careful looking at correlations]
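
The model above can be applied to any title with a few lines of Python. This is only a sketch: the coefficients come from the formula quoted in the post, while the function name is invented here. It assumes that "title length" means the raw character count, spaces and punctuation included, which is the reading that reproduces both the 148.8242 minutes for “R2” and the roughly half-hour Jar-Jar epic.

```python
INTERCEPT = 151.42  # minutes, from the model quoted in the post
SLOPE = -1.2979     # minutes per character of title, from the same model


def predicted_running_time(title):
    """Predicted running time in minutes for a movie title."""
    # Assumption: length counts every character, including spaces and punctuation.
    return INTERCEPT + SLOPE * len(title)


if __name__ == "__main__":
    titles = [
        "R2",
        "The Life, Loves and Wacky Adventures of Galactic Senator "
        "Jar-Jar Binks and All His Fun Friends",
    ]
    for title in titles:
        print(f"{predicted_running_time(title):.4f} minutes | {title}")
```

A long enough title pushes the prediction below zero, which is one more reason to take the footnote's moral seriously.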

Reviews and discourse

This is thinking out loud and a bit long and waffly in a field beyond my expertise. Approach with skepticism 🙂

Reviews and literary criticism are not one and the same thing. Criticism (in its literary sense) is a tool of the reviewer and at the same time, reviews can be seen as a subset of criticism. In its wider sense criticism is intended to examine text and shed light on what those texts do. Reviews are more overtly functional and are often characterised as being there to inform potential consumers.

Of course we should be first of all suspicious of this distinction and also the description of both reviews and criticism. I often write things entitled as ‘reviews’ not to inform potential readers/viewers but to share my feelings and experiences as a consumer of that media. In particular, reviews with spoilers or which really require the reader to have experienced the book/movie/TV show are not written with potential consumers in mind but rather people who have already consumed the media. Much of the reviewing that is available online is better described as ‘commentary’ – more akin to post-game discussion of sports matches or analysis of news stories. It may be less high-brow than what would be recognised as literary criticism (and less informed by models of literary criticism) but it is closer in kind to it than ‘reviews’ in the sense of information for potential consumers.

Yet another role for reviews and criticism is improvement, change or the establishment of norms. Editorial reviews (at a high level – I don’t mean proofreading) are one example but it is something that can be seen in literary criticism and in more general reviewing. Identifying problems in texts or discussing whether a text fits within a genre forms part of a wider discussion about what counts as being a problem in a text or what defines a genre.

To clarify for the purpose of discussion I’ll split things into various roles:

  • Reviews for the purpose of informing potential consumers (not unlike product reviews of goods).
  • Post-consumption sharing of experiences.
  • Criticism for the purpose of understanding a text.
  • Reviews/criticism for identifying problems or potential improvements in a text.

Each of these forms part of the wider discourse within a community that has a shared engagement with texts. As this is a broad, multi-faceted and diverse discourse that ranges across multiple venues, there are no hard borders between those roles. A review ostensibly for helping people decide what to read next may incorporate particular norms about the genre because having norms provides a way of judging and reporting on texts. Likewise, analysis of what is going on in a story for its own sake can encourage somebody to read a story (or ensure they stay well away from it forever!).

Identifying problems with how race, ethnicity, nationality, religious belief, gender, sexuality or disability are represented in a text would fall into the fourth point on my list but also fits with the first point on my list and will often be derived from the second point on my list. Lastly, a discussion of such issues may revolve around the third point on my list, i.e. what is actually going on in a text and what kind of representation is being used.

I’m well out of my philosophical comfort zone by this point having strayed out of the analytical and insular and into the phenomenological and continental but let’s persist.

Common to all is the sharing of personal experiences with others. Put another way, reviews and criticism bridge subjective experiences to intersubjective community understanding.

How we experience a text (story, film etc) is something we can examine and discuss. It is something that we can analyse, something we can find patterns in, and something where we can aggregate data. Aggregation and quantification of subjective data does provide a way of looking at subjective experiences using tools designed for “objective” data (and see the previous essay for what I mean about “objective”). It is another way of looking at shared experience.

Related to that is the role of anthologies and magazines. Both are traditionally an important part of this wider discourse about the nature of science fiction and fantasy. By collecting stories together and presenting a set of stories as examples of the genre and as examples that are of at least some minimum quality, anthologies are also a form of both review and criticism that wittingly or not describes possible boundaries of the genre. To call an anthology a work of ‘criticism’ may sound odd but it covers many aspects of the points above (e.g. a ‘best of the year’ style anthology is normative by presenting examples of some standard of ‘best’ and also a way of guiding consumers to stories they may like and also represents some of the subjective reaction of the editors/compilers).

Similar points can be made about awards and competitions and here the overt nature of a discourse becomes clearer. With fan awards, discussion and shared experience are an important part of the process. Juried awards can also engender debate and discussion and I’d argue the most interesting ones are the ones that evolved this aspect (for example the Clarkes by virtue of the Shadow Clarkes have become a more interesting award).

Where are you going with this, Camestros! I hear you shout (assuming you’ve read this far). OK, OK, I’ll stop waffling.

My point is, if we are to discuss what reviewing should be like and what kinds of reviews and reviewing activity people should be doing, we have to consider it against this wider discourse. It is the big broad discussion that is the important thing – alongside the health and welfare of individuals.

Objectivity and stuff

I wanted to write about some of the interesting things people have been saying about reviewing but part of my brain obviously wants to talk about reason and evidence and those sorts of things. I guess I haven’t done much of that this year in an attempt to look less like a philosophy professor.

Anyway – objectivity! The thing with objectivity as a word is that we (including myself) use it in a way that implies various things which maybe aren’t really part of what it means. Objectivity carries positive connotations and connotations of authority in contrast to subjectivity. Those connotations suggest impartial judgement and a lack of bias. That’s all well and good – words can mean whatever a community of users want them to mean but I think it creates confusion.

Here is a different sense of ‘objective’ – to say something is objective is to say that two people can follow the same steps/process and come up with the same answer reliably. Maybe we should use a different word for that but such processes are often described as ‘objective’ because they clearly contrast with subjective judgement.

The thing is that meaning does not in ANY WAY imply a lack of bias. Lots of systematic or automated processes can contain bias. Indeed we expect there to be biases in, for example, processes for collecting data. More extreme examples include machine learning algorithms which are inherently repeatable and ‘objective’ in that sense (and the sense that they operate post-human judgement) but nonetheless repeat human prejudices because those prejudices exist in the data they were trained on.

Other examples include the data on gender disparity in compensation for Uber drivers – the algorithm was not derived from human prejudices but there was still a pay disparity that arose from different working patterns, which in turn arose from deep-seated social disparities.

However, there is still an advantage here in terms of making information and data gathered more objective. Biases may not be eliminated but they are easier to see, identify and quantify.

Flipping back to ‘subjective’, I have discussed before both the concept of intersubjectivity (shared consensus opinions and beliefs that are not easily changed) as well as the possibility of there being objective facts about subjective opinions (e.g. my opinion that Star Trek: Discovery was flawed is subjective but it is an objective fact about the universe that I held that opinion).

Lastly the objective aspect of data can be mistaken for the more subjective interpretation of the data. In particular the wider meaning or significance of a data set is not established simply by the fact that the data is collected reliably or repeatedly.

Consider another topic: IQ. I’ve discussed before aspects of IQ and IQ testing and much of the pseudoscientific nonsense talked about it. Look at these two claims about Roberta and Bob:

  • Roberta: My IQ is higher than Bob’s.
  • Roberta: I am more intelligent than Bob.

The first statement may be an objective fact – it is certainly a claim that can be tested and evaluated by prescribed methods. The second statement is more problematic: it relies on opinions about IQ and the nature of intelligence that are not well established. The objectivity of the first statement does not establish the objectivity of the second. Nor does the apparent objectivity of the first imply that it does not have biases that may also impact wider claims based upon it.

Reading Peterson 11 – Notes & Facts & Hypothesis

Part 1, Part 2, Part 3, Part 4, Part 5, Part 6, Part 7, Part 8, Part 9, Part 10, Part 11, Part 12,…

There’s no shortage of notes in Jordan B Peterson’s book 12 Rules for Life but that doesn’t mean every assertion related to facts is referenced. Also, when references are used they aren’t always tightly associated with the argument. Take this for example from chapter 2:

“This is perhaps because the primary hierarchical structure of human society is masculine, as it is among most animals, including the chimpanzees who are our closest genetic and, arguably, behavioural match. It is because men are and throughout history have been the builders of towns and cities, the engineers, stonemasons, bricklayers, and lumberjacks, the operators of heavy machinery.” – Peterson, Jordan B.. 12 Rules for Life: An Antidote to Chaos (p. 40). Penguin Books Ltd. Kindle Edition.

Now there is a lot wrong with that statement factually but the right reference here, if this was an academic essay, would be to a source discussing historical patterns of employment. Peterson instead links to some modern labour statistics (https://www.dol.gov/wb/stats/occ_gender_share_em_1020_txt.htm). The tables do use the terms ‘traditional occupations’ and ‘non-traditional’ based on the proportion of women involved but this is ‘traditional’ in a very loose sense and includes “Meeting, convention, and event planners”. My point here isn’t that the table is wrong or even to question gendered roles in employment – just that a lot of references are weak in this fashion. It is vaguely related but not neatly tied to Peterson’s argument.

(This is quite long – so more after the fold)


Looking at some crowdfunding data

I’m mainly just curious how such things work but I picked on data from a Go Fund Me campaign that I know people might be morbidly curious about.

[Chart: money raised over the campaign]

The site gives a list of donations made with the amount and how many days ago the donation was made. Doing some minor spreadsheet wrangling, it is fairly easy to turn this into graphable data. The only departure from literal truth is I used the order in which the donations are listed to spread out the data points more evenly across each day of the campaign – so the smooth growth within each day is just to make the graph easier on the eye (the raw data would just give a big vertical chunk of points).
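
The spreadsheet wrangling described above could be sketched in Python along these lines. Everything here is illustrative rather than what was actually done: the function name and the sample figures are made up, and it assumes the site lists donations newest first.

```python
from collections import defaultdict
from itertools import accumulate


def cumulative_series(donations, campaign_days):
    """Turn (amount, days_ago) pairs into (day, running_total) points.

    donations is assumed to be in the site's listing order, newest first.
    campaign_days is how many days ago the campaign started.
    """
    # Reverse so the oldest donation comes first and the total grows left to right.
    ordered = list(reversed(donations))

    # Count the donations that fall on each day.
    per_day = defaultdict(int)
    for _, days_ago in ordered:
        per_day[days_ago] += 1

    # Place the i-th of k donations on a day at fraction i/k through that day.
    seen = defaultdict(int)
    xs = []
    for _, days_ago in ordered:
        day = campaign_days - days_ago  # day 0 is the start of the campaign
        xs.append(day + seen[days_ago] / per_day[days_ago])
        seen[days_ago] += 1

    totals = accumulate(amount for amount, _ in ordered)
    return list(zip(xs, totals))


if __name__ == "__main__":
    # Hypothetical sample data, not the real campaign's donations.
    sample = [(10, 0), (25, 1), (5, 1), (50, 2)]
    for day, total in cumulative_series(sample, campaign_days=2):
        print(day, total)
```

The fractional positions within a day are purely cosmetic, as noted above; only the order of the donations and the running total come from the data.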

Compared with the fundraising goal the graph looks like this:

[Chart: money raised versus the fundraising goal]

If we assume a growth rate of $20 every three days then this campaign should reach its target in about 1317 days or about three and a half years. Of course, events may change that.
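
The projection is simple straight-line arithmetic, sketched below. The post gives neither the goal nor the amount raised so far, so the $8,780 shortfall used here is back-calculated from the 1317-day figure and is an assumption, not a number taken from the campaign page. The function name is also made up.

```python
def days_to_goal(remaining, dollars_per_period, period_days):
    """Days needed to raise `remaining` dollars at a constant rate."""
    return remaining * period_days / dollars_per_period


if __name__ == "__main__":
    # Assumed shortfall: 1317 days x $20 per 3 days = $8,780 (back-calculated).
    days = days_to_goal(remaining=8780, dollars_per_period=20, period_days=3)
    print(f"{days:.0f} days, or about {days / 365.25:.1f} years")
```

A constant rate is the weakest part of this, as donations usually tail off, so the real answer could be a good deal longer.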