Some Wordle Stats

I’m not putting any serious spoilers or cheats below but there’s a fold if you would rather remain ignorant of some details or if you are just heartily sick of people talking about the daily word game that is clearly an alien plot or a CIA scheme to make us think about abstract objects.

A friend discussing Wordle (the game where you use clues to guess a five-letter word https://powerlanguage.co.uk/wordle/ ) stated that I had probably made a big spreadsheet already. I hadn’t…but I’d thought about it, and that was sufficient motivation to look at some letter frequencies.

I thought I’d have to approximate the underlying word list but it is actually quite easy to get hold of. You can probably Google it now but it’s not a great help to have it. There are 2315 words in the list from aback to zonal (or, as my spreadsheet insisted from aback to FALSE).

Not every letter in the alphabet is equally likely to appear. This table shows the top ten letters by frequency.

Letter  Total  Percent
e       1233   11%
a        979    8%
r        899    8%
o        754    7%
t        729    6%
l        719    6%
i        671    6%
s        669    6%
n        575    5%
c        477    4%

That’s fairly typical of English. E is the most common, but the distribution is heavily skewed, with a long tail of low-frequency letters. That means that even though E is the most common letter, at about 11% of all letters it is much less common than NOT-E.
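For anyone who would rather not use a spreadsheet, here is a minimal sketch of the same kind of tally in Python. It assumes every occurrence of a letter counts (so a word with two Es adds two to the E total), and SAMPLE_WORDS is a made-up stand-in for the real 2315-word list, so the numbers it prints will not match the table above.

```python
from collections import Counter

# Hypothetical stand-in for the real answer list (not the actual data).
SAMPLE_WORDS = ["aback", "slate", "crane", "nymph", "zonal"]


def letter_totals(words):
    """Count every occurrence of every letter across all the words."""
    return Counter("".join(words))


def top_letters(words, n=10):
    """Return (letter, total, percent) for the n most frequent letters."""
    totals = letter_totals(words)
    all_letters = sum(totals.values())
    return [
        (letter, count, 100 * count / all_letters)
        for letter, count in totals.most_common(n)
    ]


for letter, count, percent in top_letters(SAMPLE_WORDS):
    print(f"{letter}  {count:>4}  {percent:.0f}%")
```

Swapping SAMPLE_WORDS for the full list should give a table in the same shape as the one above, provided the counting assumption matches what the spreadsheet did.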

E’s dominance isn’t consistent in every position. If we just look at the first letter of words, E is only the 14th most common letter. In position 1, the letter S is a lot more common.

Letter  Position 1
s       366
c       198
b       173
t       149
p       142
a       141
f       136
g       115
d       111
m       107

In positions 2 and 3, the letter A is more common and E’s strength really relies on positions 4 and 5. E is first when it is last!
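The per-position counts can be sketched the same way. This is only an illustration: SAMPLE_WORDS is a hypothetical five-word list, and position_counts is a name made up for this example.

```python
from collections import Counter

# Hypothetical stand-in for the real answer list (not the actual data).
SAMPLE_WORDS = ["slate", "sauce", "crane", "shine", "abyss"]


def position_counts(words, length=5):
    """Build one Counter per position.

    counts[i][letter] is the number of words with that letter at index i
    (index 0 is "position 1" in the tables).
    """
    counts = [Counter() for _ in range(length)]
    for word in words:
        for i, letter in enumerate(word):
            counts[i][letter] += 1
    return counts


counts = position_counts(SAMPLE_WORDS)
for i, counter in enumerate(counts, start=1):
    letter, total = counter.most_common(1)[0]
    print(f"Position {i}: most common letter is {letter} ({total})")
```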

So the perfect first guess would be SAAEE…except astute users of English will have spotted that SAAEE isn’t a word. A nice feature of the puzzle is that you can only guess words the game accepts as valid. Without that rule, the game would be more of a logic puzzle.

So what are the most Wordly words then, if SAAEE isn’t a valid English word? To make a guess at that, I counted how often each letter appears in each position across the whole list, then scored each word by adding together the counts for its five letters in the positions where they sit. Here are the top 15:

Word   Total Score
slate  1437
sauce  1411
slice  1409
shale  1403
saute  1398
share  1393
sooty  1392
shine  1382
suite  1381
crane  1378
saint  1371
soapy  1366
shone  1360
shire  1352
saucy  1351
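As a rough sketch of how a score like this can be computed, assuming it is simply the sum of the per-position counts for a word's five letters: the word list below is a tiny hypothetical sample and the function names are made up for illustration, so the scores are not comparable to the table.

```python
from collections import Counter

# Hypothetical stand-in for the real answer list (not the actual data).
SAMPLE_WORDS = ["slate", "sauce", "crane", "nymph"]


def position_counts(words, length=5):
    """counts[i][letter] = number of words with that letter at index i."""
    counts = [Counter() for _ in range(length)]
    for word in words:
        for i, letter in enumerate(word):
            counts[i][letter] += 1
    return counts


def score(word, counts):
    """Add up, for each letter, how many words share it in the same position."""
    return sum(counts[i][letter] for i, letter in enumerate(word))


def rank(words):
    """Return (word, score) pairs, highest score first."""
    counts = position_counts(words)
    scored = [(word, score(word, counts)) for word in words]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


for word, total in rank(SAMPLE_WORDS):
    print(f"{word}  {total}")
```

Note that each word contributes to its own score here, since it is part of the list being counted. That shifts every score by the same amount and does not change the ranking.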

The bottom 16 are a lot more fun.

Word   Total Score
omega  396
abyss  393
offal  393
ennui  391
endow  383
hydro  377
lymph  361
jumbo  349
igloo  348
ethic  347
unzip  345
umbra  344
affix  340
ethos  326
inbox  318
nymph  310

And a stats post wouldn’t be complete without a pointless graph. So here is a graph of all those scores with a few example words on the horizontal/category axis.


35 responses to “Some Wordle Stats”

    • I’ll rot13 in case people don’t want to know
      (V qvqa’g qvfpbire guvf, vg’f orra qvfphffrq bayvar) Vs lbh ybbx ng gur jro cntr fbhepr pbqr gurer’f n yvax gb n wninfpevcg svyr gung unf gur pbqr sbe gur tnzr. Gur jbeq yvfg vf va gur tnzr nf n inevnoyr pnyyrq “Gn”.

      • Wordle is such a simple yet elegant game and kudos to Josh Wardle for putting everything in the clear, which has made it trivial for other people to replace the dictionaries with their own & spawn their own versions.

        If you want a direct link to the script containing the list of words (which you can view with any text reader) it’s here (ROT13’d):
        uggcf://cbjreynathntr.pb.hx/jbeqyr/znva.r65pr0n5.wf

        The solution list has ~2300 words, but Wordle accepts a longer list of ~12,000 as guesses. Interestingly, although it is hosted on a uk domain, Wordle uses American English (Wardle originally created the game for his partner who AFAIK is American). That has caused some grief amongst non-American English spellers. Another factoid: the list of words allowable as guesses contains words in both American & UK English. So I can see why it caused confusion.

        Interestingly, the coloured grids used to share results in a non-spoilery way weren’t in the earlier public version. The format was created by a Twitter user from New Zealand, Elizabeth S., who typed the emoji manually, but it was such a success that Wardle incorporated it, and that helped the game go viral.

        The New Zealand connection seems to have helped in another way too. Turns out New Zealand Twitter is more interconnected than ‘rest of the world’ Twitter, so something that takes off gets propagated more quickly (Twitter was where I first saw mentions of Wordle). https://www.rnz.co.nz/national/programmes/the-weekend/audio/2018826546/josh-wardle-the-power-of-wordle

  1. Oooooh, Wordle geekery!

    So, I think this is good as an analysis of letter-frequency, but my personal experience with the game has led me to some kind of counterintuitive conclusions. And a big one is: nailing down a few super-common letters actually _isn’t_ the most helpful thing early on in the game.
    For example, it doesn’t _hurt_ to know that the fifth letter is ‘E’, but that doesn’t actually give me a ton to go on for my next guesses. Finding an “unusual”, slightly less-common letter, or a common letter but in a surprising position, can be a much stronger hint. (This is extra true in Hard Mode, where getting too much right early on kind of locks you in, and limits how many new letters you can guess/eliminate per guess.)

    As a very handy example, today my first guess was JUICY, and it was *super* helpful! I got the word in three moves.
    (That said, my second guess was STARE, which will rank very highly in your score, and that was ALSO super-helpful…)

    • (started with JUICY because I wanted to either confirm or eliminate U, I, and Y, which are specifically the *trickier* vowels, and also harder ones to work into guesses once I’ve got more information and am trying to figure out more specific things)

    • I’m very naughty. I’ve come up with 4 words which encompass 19 letters, so I always start with those. Most of the time I can get the right word after the first 3 guesses, a couple of times I’ve figured it out after the first 2 guesses, and a few times it’s taken 5 tries, and a couple of times 6 tries.

      I’m very non-competitive and don’t feel as though I have anything to prove, so it’s just a couple minutes of amusement for me each day.

      • I absolutely love it as “a few minutes of amusement each day,”
        except there are *two* Hebrew clones, so now it’s up to about 15 minutes of amusement each day,
        but also the Hebrew ones aren’t nearly as good because Hebrew doesn’t have nearly as many patterns as English, nor a tenth as many idiosyncrasies and odd exceptions.

        • Yes, the proprietor has done a good job of selecting words with odd, less-intuitive structures.

          The ones with a letter that repeats (e.g. “SQUEE”) are the hardest, and usually take me 5 or 6 tries, because if you get the letter right at some point, there’s no indication that there’s a second version of it lurking in there.

          • Oh nice!
            There’s also the adversarial one, Absurdle,
            where the “correct” word isn’t pre-ordained,
            and instead, every time you make a guess, the game gives you responses in such a way as to leave as many options as possible still valid. It’s kind of brilliant, because it shows you just how you can be “funneled” into word patterns that are so common you still need a bunch of guesses to figure the word out!

            • If I had a modicum of spare time this week, I have everything I need now to make a Typordle game – exactly like wordle but all the words are slightly misspelled. This would be universally loved by everybody obviously and not remotely frustrating.

    • Yes indeed. This doesn’t tell you what the best first guess word should be. I think that’s more tactical. Having said that, I did a first guess “exile” once to be weird and got the “x” and nothing else and that was a nightmare. [The answer was “proxy”.]

  2. My goals in Wordle are to get the word. If I get it quickly, all the better. But I am not obsessed with getting it in as few moves as possible, just to get it if I can.

  3. My question is: Is it possible to get 5 yellow boxes? I.e., are there words in the list that are anagrams of another word where every letter is shifted to a different position (so no palindromes, for example, because the middle letter stays the same)?

    • That’s true. And, in a sense, Wordle is distilling the communal crossword experience down to a single entry – it’s the same puzzle that everyone plays just once in the same 24hr period.
      But the key here isn’t anything to do with the puzzle – it’s the brilliance of the social media sharing; you aren’t sharing your specific guesses or even the answer, meaning that you aren’t spoiling anything and yet you still get to show-off (which is what social media is for, after all.)
      I don’t find it particularly interesting either (I think Jotto is far more interesting, where you are simply told how many correct letters you have and nothing more), but I can understand why the social sharing took off.

  4. There’s another version at wordlegame.org.

    This one is harder, because firstly it uses a larger dictionary of allowed words, and secondly allows any entry in the dictionary to be the answer. This includes having non-words such as sokol as the answer. Another time I lost because I had 4 letters correct, but I ran out of guesses because there were too many valid alternatives for the 4th letter.

    For the powerlanguage version, my results are 0/0/4/2/10/0. I’ve read that a group of physicists have taken the answer and dictionary lists and generated a strategy tree that guarantees completion in 5 guesses.

  5. Y’all should try Evil Wordle.

    https://swag.github.io/evil-wordle/

    The person that developed it was looking for an algorithm that would maximize the number of guesses by a player.

    As I understand it, you start with all of the 5-letter words available. You cannot guess the word in the first row because the algorithm doesn’t know it yet. Based on the letters you select, the algorithm identifies the group of words with the most possible permutations/letters. And you guess again using the narrowed range of letters.

    If you get down to a bunch of words that use the same letters

    cakes
    takes
    makes
    wakes
    lakes

    then you will have to eliminate all of the options before getting the “correct” word.

    I feel pretty good if I can get in 7 guesses. I did it in 5 guesses once.

    It’s a bit like Mastermind, but only if the other player can change the answer based on your guesses until you have eliminated all the possibilities.

    Regards,
    Dann
    So many books, so little time. – Frank Zappa

  6. Initially, I was trying with the supposedly optimal first guesses, but have now switched to using a different first guess every day. I enjoy it more that way because it’s about the journey. I’ve also switched to playing in hard mode (where you have to use all revealed clues in subsequent guesses).

  7. One more word about Wordle and the puppy gets it! (Oh, wait, that won’t sound like a threat to the author of the Debarkle. Better pick another animal. Manatee? Eh, I’ll get back to you.)

  8. Some people (including well known writers) stick to one word for their first guess.
