Boggle

Boggle is a word game in which players attempt to find words in sequences of adjacent letters. Unlike the numerous other online implementations of the Boggle game (which are often released under various different names to avoid trademark infringement), this application exists to provide a lightweight, minimalistic user interface and power-user features which aim to help a player improve.

Boggle can be played on "any modern browser or device". The implementation relies on the Service Worker API to ensure it will work completely offline, and can be installed to one's desktop or phone home screen if desired.

The Boggle application opens to a menu, providing the options to play a game, train your vocabulary, define a word, view stats from past games, or change the settings.

Play

While playing, the standard 3 minute timer ticks down. The timer automatically pauses if the player navigates away from the board, except when viewing the score breakdown or using the 'define' functionality. The timer can also be toggled manually at any point by clicking on it. A player can navigate back to the main menu using the icon in the top left corner. Long pressing this icon triggers a 'refresh', which ends the current game and begins a new one with a randomly generated board (alternatively, going back to the menu and clicking 'New Game' as opposed to 'Resume' will also trigger a refresh). The player's current score is displayed in the top right corner. On a touch device, words may be entered by touching a letter on the board and dragging across the tiles to complete the word - once the touch ends the word is considered finished. On devices with a keyboard, words may be entered by typing, and the 'Enter' or 'Space' key can be used to complete the word.

By default, after a word has been entered:

The application allows words to be entered even after the timer has expired, though it keeps track of which words were entered before the game's duration elapsed and scores them separately (see example above).

The application also provides some 'power user' features during gameplay in the form of shortcuts and heatmap hints. To simplify and expedite word entry, given that the minimum length of word supported by the application is 3 letters, any time a 1 or 2 letter word is played the game interprets it as a suffix to be appended to the word entered immediately before it (the 'ING' suffix is also handled as a special case for this shortcut). The most common use case for this is quickly entering plurals of words (~50% of all valid words are plurals), though any suffixes can take advantage of the shortcut:
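A minimal sketch of the suffix shortcut, assuming the previously entered word is tracked by the caller. The function name `expand_entry` and its signature are hypothetical and are not taken from the application's source.

```python
from typing import Optional


def expand_entry(entry: str, previous: Optional[str]) -> str:
    """Interpret a 1 or 2 letter entry (or 'ING') as a suffix of the previous word."""
    entry = entry.upper()
    # 1 and 2 letter entries can never be valid words on their own, so they are
    # treated as suffixes; 'ING' is the one longer suffix handled the same way.
    is_suffix = len(entry) <= 2 or entry == "ING"
    if is_suffix and previous:
        return previous.upper() + entry
    return entry
```

Under these assumptions, entering 'S' straight after 'CAT' would be treated as 'CATS', while a regular entry such as 'DOG' is left untouched.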

The heatmap hint feature is more situational - during gameplay, a player may long press the score displayed in the top right corner to cause the application to colour the tiles based on how frequently they occur (more precisely, the points the tile is involved in) in the remaining possible words that can be played. Each tile's opacity is weighted such that the tiles present in the words worth the most points in aggregate show up in the most opaque red for the duration of the press. While there are some rare edge cases where a word may be found more than one way on a given board, in general the heatmap can be used to help a player determine where to focus their search to maximize their score.
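The weighting can be sketched as follows, assuming each remaining word is supplied as a path of tile indices together with its point value. The name `heatmap` and the linear scaling against the busiest tile are assumptions; the application may scale opacity differently.

```python
def heatmap(remaining, num_tiles):
    """Return an opacity in [0, 1] for every tile on the board.

    `remaining` is an iterable of (path, points) pairs, where `path` lists the
    tile indices used by a word that has not been played yet.
    """
    totals = [0] * num_tiles
    for path, points in remaining:
        for tile in path:
            totals[tile] += points
    peak = max(totals, default=0)
    # Scale against the tile involved in the most points; an exhausted board
    # (no remaining words) yields fully transparent tiles.
    return [total / peak if peak else 0.0 for total in totals]
```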

Score

Clicking on the score (as opposed to long pressing it) will instead display a full breakdown of which words were played, for the current game and for each game that has ever been played by the user in reverse chronological order.

Each game's ID can be seen in the list, though only the current game's 'Played' section will be expanded by default. Other games or sections can be expanded or collapsed by clicking on them. A game's ID contains information about the dice configuration, minimum word length, dictionary and the random seed used to generate the board - this ID can be shared with another player for them to be able to play the same board and compare results. Alongside the ID is the player's total score (irrespective of the timer) for the game compared to their configured 'goal' score (also rendered as a percentage).
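Sharing an ID works because a board is fully determined by the dice and the seed. A rough sketch of that idea, using Python's seeded `random.Random` as a stand-in generator; the function `roll_board` is hypothetical, and the application's actual generator and ID encoding are not shown here.

```python
import random


def roll_board(dice, seed):
    """Deterministically roll a board from a list of dice and a seed.

    Each die is a string of its faces. The same (dice, seed) pair always
    produces the same board, which is what makes a game reproducible.
    """
    rng = random.Random(seed)
    order = list(dice)
    rng.shuffle(order)                          # which die lands on which tile
    return [rng.choice(die) for die in order]   # which face each die shows
```

Two players who call this with the same dice and seed get identical boards, so only those values (plus the other settings) need to be exchanged.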

The 'Played' section summary contains several metrics:

These summary metrics can be completely ignored, but they may also help players looking to improve their play and detect weaknesses. Finally, the 'Possible' section summary lists the total number of points attainable using only words of a specific grade level or below, i.e. 'D / C / B / A (S)'. Due to the nature of the grading system, the final score in parentheses also indicates the highest possible score for the board in question.
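The 'Possible' summary is a running total over the grades. A small sketch, assuming every possible word has already been assigned a grade and a point value (the function name and input shape are hypothetical):

```python
from itertools import accumulate

GRADES = ["D", "C", "B", "A", "S"]  # lowest grade level first


def possible_summary(words):
    """Return the cumulative points attainable at each grade level or below.

    `words` is an iterable of (grade, points) pairs for every possible word.
    """
    per_grade = {grade: 0 for grade in GRADES}
    for grade, points in words:
        per_grade[grade] += points
    running = accumulate(per_grade[grade] for grade in GRADES)
    return dict(zip(GRADES, running))
```

Because each total includes every lower grade, the 'S' entry equals the sum of all possible points on the board.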

Upon expanding the 'Played' section, words played by the user are listed in the order that they were entered, along with any definitions. Invalid words show up in red and words which are of a higher grade level than the goal grade configured by the user show up in grey.

The 'Possible' section lists all of the words possible on the board which were not already played by the user, though these are sorted by a more complicated procedure. Words which were determined as 'missed' (suffixes/subwords/anagrams as described above in the section on 'Played' metrics) are shown at the top (and underlined), then the remaining words are sorted by grade and length, grouped together with their anagrams and finally sorted alphabetically. This sort order attempts to make it easy for a player to prioritize learning the words they overlooked.

Define

The 'define' panel can be pulled up by navigating through the main menu, by swiping left or right anywhere in the application outside of the game board, or by using the '?' keyboard shortcut:

Words will be defined immediately as they are typed. Words that are recognized in certain configurations but not the current configuration will be greyed out. In addition to definitions, all valid anagrams of the search term will be displayed (and may be clicked to link to their definition), as will various statistics for the defined word:

A user can exit define mode either by swiping left or right or by hitting the 'Escape' key. Hitting 'Space' or 'Enter' once will clear the current query, and hitting it a second time will leave define mode.
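The anagram lookup in the define panel can be sketched by indexing the dictionary under each word's sorted letters. The helper names below are hypothetical, and this is only one common way to implement such a lookup.

```python
from collections import defaultdict


def anagram_key(word):
    """Words are anagrams of each other exactly when their sorted letters match."""
    return "".join(sorted(word.upper()))


def build_anagram_index(words):
    """Group every dictionary word under its sorted-letter key."""
    index = defaultdict(list)
    for word in words:
        index[anagram_key(word)].append(word.upper())
    return index


def anagrams(index, query):
    """Return the valid anagrams of `query`, excluding the query itself."""
    query = query.upper()
    return sorted(w for w in index.get(anagram_key(query), []) if w != query)
```

Building the index is a single pass over the dictionary, after which each lookup is one dictionary access regardless of how many words share the letters.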

Stats

To further aid with post game analysis, a 'stats' breakdown of commonly missed words, anadromes and anagram pairs from recent games can be accessed via the main application menu.

The sort order for these tables attempts to surface words worth the most points per game that have been missed most recently. The 'Word' tab shows the commonly missed words and how often they have been found relative to how often they were possible. The 'Pair' tab shows how often one direction of an anadrome pair has been found relative to the other, compared to the total number of times either has been found. The 'Anagram' tab shows how often a specific word has been found relative to how often it has been present, compared to the stats of its anagrams.
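The found-versus-possible counts behind the 'Word' tab can be sketched as below, assuming the history is a list of games ordered oldest to newest, each recording the words played and the words possible. The function name and input shape are hypothetical; the default `limit` of 500 mirrors the bound on recent history mentioned in the Implementation section.

```python
from collections import Counter


def word_stats(games, limit=500):
    """Return {word: (times_found, times_possible)} over the most recent games.

    `games` is a list of (played, possible) pairs of sets, oldest first.
    """
    found, possible = Counter(), Counter()
    for played, available in games[-limit:]:
        possible.update(available)
        # Only count entries that were actually valid on that board.
        found.update(played & available)
    return {word: (found[word], possible[word]) for word in possible}
```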

Settings

At the top of the settings pane, a game ID (which, as described above, contains information about the dice configuration, minimum word length, dictionary and the random seed used to generate the board) is displayed and can be edited manually, though as the various setting options are toggled the ID updates itself to reflect the changes. By default, the ID shown is the ID of the current game, but if there is no current game or the ID has been modified while there is still an active game, the ID reflects what the next game's ID will be (i.e. you must click 'New Game' to have the changed settings applied).

The Boggle board game has been released with various dice configurations over the years. This implementation supports the three most common English dice distributions, including the 5x5 distribution for 'Big Boggle'. The 6x6 'Super Big Boggle' distribution is not supported.

4x4 configurations of Boggle typically mandate that a legal word must be at least 3 characters long, with the 5x5 Big Boggle further restricting the legal word size to at least 4 characters. This implementation follows these defaults, but also allows the player to change the minimum word length restriction independent of board size, allowing 3, 4, or 5 letter words to be the minimum allowed.

Boggle has no official dictionary, so instead this implementation relies on the three canonical Scrabble word lists - NWL (formerly known as 'TWL', more restrictive, used in American and Canadian Scrabble tournaments), ENABLE (similar to NWL, used by a large number of online games), and CSW (formerly known as 'SOWPODS', the most permissive word list, used by Scrabble players outside of North America). By default, NWL is used (though it is important to note that, due to availability, all definitions come from CSW, which may sometimes be a source of confusion).

The grade toggle simply allows a player to choose which grade (as described in detail above) best reflects the breadth of their vocabulary and is only used for display purposes when determining goal points or visually styling words.

The display toggle changes how much information the player sees during a game. The 'Hide' mode more accurately reflects how games are played in real life - all words entered will disappear immediately instead of lingering with a definition if they were valid or displaying in red if they were invalid, and the current score is not displayed at all times. To find out these details the user must navigate to the scoring breakdown. On the opposite end of the spectrum is the 'Full' display mode, which pulls out the metrics from the scoring breakdown and places them front and center for easy access while playing. From left to right, the potential score, number of missed suffixes, number of missed subwords, number of missed anagrams, points scored, goal score and percentage of goal score are displayed (a hybrid of what is displayed alongside the game ID in the scoring breakdown and the 'Played' section summary).

Finally, light and dark themes are supported for the application, though the application will respect the system's default preference until it is overridden in the settings.

Training

The final feature the application provides is a 'training' mode, accessible via the main menu or settings. Training mode determines which anagrams to present to the user based on the settings configured in the settings pane and on past training sessions, and effectively amounts to 'anagram flashcards' - users are tested with a jumble of letters to form words out of, and by touching anywhere on the screen they can see all of the possible valid words and their definitions. At the bottom of the answer card users are asked to assign a rating to how well they remembered the card - the application will use this rating and the user's past ratings of this anagram group to determine when it will next be tested. A recommended rubric for assigning the numerical ratings, based on the original paper for the algorithm, is as follows:

5 - perfect response (obvious, knew words and exact number in group immediately)
4 - correct response after a hesitation (default correct rating, able to name entire group)
3 - correct response recalled with serious difficulty (took a long time, possibly required size hint or named words not in the group)
2 - incorrect response; where the correct one seemed easy to recall (~80% or more of the group remembered fairly easily, missing words familiar)
1 - incorrect response; the correct one remembered (default incorrect rating, no words seem completely foreign)
0 - complete blackout (less than 20-30% of words remembered and some words totally unrecognized)

New anagram groups are introduced in order of priority based on their expected points per game (based on the settings configuration), and the spaced repetition algorithm used is a modification of those used by existing memorization software applications like Anki, Duolingo, Quizlet or Memrise. Similar standalone game specific word study software exists (eg. 'NASPA Zyzzyva' or 'Scrabble Expert'), but none leverage the expected points per game data available in Boggle to focus on the most important words. A detailed description of how the training algorithm works can be found in the Appendix.

The user's progress is displayed in two parts in the top right corner of training mode - the number of overdue anagram groups vs. the total number of groups the user has trained on. Similar to the 'heatmap' hint during gameplay, the progress indicator can be long pressed to reveal a hint indicating the number of valid words that can be formed by unscrambling the letters being tested. Clicking on the progress indicator instead of long pressing it displays a 'review' page listing the anagram groups which are currently troublesome to the user.

Appendix

Strategy

The core skill in most word games (eg. Boggle, Scrabble, Bananagrams) is being able to find possible words from a jumble of letters - that is, to determine valid anagrams. Having a substantive vocabulary is important, but the type of vocabulary required is vastly different than the type of vocabulary that may be useful for passing the SAT or GRE - long words occur with significantly less regularity. Shorter words, particularly those involving specific letters, tend to be the most valuable to focus one's studying on.

In Bananagrams and Scrabble, knowledge of 2-letter words is incredibly important as it allows much more flexibility in playing longer words alongside those which already exist. Memorizing all valid 2-letter words is actually fairly straightforward: Tom Rees outlines a way of remembering them through the use of mnemonics, an example of which (updated for NWL 2019) can be found below:

dumpy biz thankful    | a | birthdays mangle wax
few behind him party  | e | homosexual snowdrifts
quake halts bag mixup | i | infested
midnight jeep bylaws  | o | murky pushdown fix
x-men                 | u | phantoms
mm hm by my sh        |   | aa ae ai oe oi

As described in the article, a player simply needs to remember the phrases for each of the vowels (and the leftover words at the bottom), and they will be able to determine if a combination of two letters is a valid word or not. Actually knowing the definition of said 2-letter word is irrelevant (case in point, a New Zealander who speaks no French won the French-language Scrabble world championship), though learning the words' definitions can be helpful for further ingraining them in one's memory or making them easier to recognize as words.

2-letter words can also be relatively useful in 4x4 Boggle play, as many of these words can take an '-S' suffix, giving one a jump start on learning the 3-letter words. Playing a letter before or after an existing word is known as "hooking", and it is already demonstrated in the table above, as the consonants which can be 'hooked' onto a vowel are encoded in the phrases. In Scrabble and Bananagrams, hooking is central to the game as it's the only way a player is able to play their word - they must build off an existing word. In Boggle, the concept of hooks is still useful for building up one's vocabulary in more manageable pieces and also in game - after discovering a pocket of tiles, one can then look around the neighbouring tiles for known hooks to potentially form additional words.

In Scrabble and Bananagrams, learning the 2- and 3-letter words and their hooks (and especially which can be pluralized) is usually the best place to start, as we can assume these words will appear most frequently and be used to join new words to the board. With Boggle, these words are still important (and there are only ~1000 3-letter words), but Boggle actually has a unique property compared to the other games in that the state of the board is completely determined from the beginning of the game, independent of input from the players. As such, one can actually randomly generate hundreds of millions of boards and track which words occur most often in aggregate, and then use the expected points each word will garner in an average game to prioritize which to study first (thanks to Luke Gustafson and Craig S. Kaplan for this idea).
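The simulation idea can be sketched at toy scale as follows. This is an illustration under stated assumptions rather than the method used to produce the application's data: the function names are hypothetical, the solver is a plain depth-first search over a prefix set, the 'Qu' face is not handled specially, and the standard 4x4 scoring table (1 point for 3-4 letters, then 2, 3, 5 and 11 points for 5, 6, 7 and 8+ letters) is assumed.

```python
import math
import random
from collections import Counter


def score(word):
    """Standard 4x4 Boggle scoring by word length."""
    n = len(word)
    if n < 3:
        return 0
    return {3: 1, 4: 1, 5: 2, 6: 3, 7: 5}.get(n, 11)


def solve(board, size, words):
    """Find every dictionary word that can be traced through adjacent tiles."""
    words = set(words)
    prefixes = {w[:i] for w in words for i in range(1, len(w) + 1)}
    found = set()

    def visit(pos, prefix, used):
        prefix += board[pos]
        if prefix not in prefixes:
            return  # no dictionary word starts this way, so prune the search
        if prefix in words:
            found.add(prefix)
        row, col = divmod(pos, size)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                r, c = row + dr, col + dc
                nxt = r * size + c
                if (dr or dc) and 0 <= r < size and 0 <= c < size and nxt not in used:
                    visit(nxt, prefix, used | {nxt})

    for start in range(size * size):
        visit(start, "", {start})
    return found


def expected_points(dice, words, trials, seed=0):
    """Estimate each word's expected points per game over random boards."""
    rng = random.Random(seed)
    size = math.isqrt(len(dice))
    words = set(words)
    totals = Counter()
    for _ in range(trials):
        order = rng.sample(dice, len(dice))
        board = [rng.choice(die) for die in order]
        for word in solve(board, size, words):
            totals[word] += score(word)
    return {word: totals[word] / trials for word in words}
```

Sorting the resulting estimates in descending order gives a study priority list; a realistic run would need a far faster solver and many more trials than this sketch can handle.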

The word frequency stats from Boggle games in aggregate show that most words on a board are going to be 3-7 characters long, with the majority between 3-5 (or 4-6 in Big Boggle). As such, Boggle favours the ability to recognize shorter words to an even greater extent than Bananagrams or Scrabble, where longer words have more advantages. There are several other important tactical considerations when playing or learning Boggle:

Playing games like Bananagrams and Scrabble involves slightly different strategies, including:

Training Algorithm

Words are learned alongside their anagrams, with new groups being introduced in order of the groups' average total expected points per game. The trainee's ratings are then plugged into the standard SM2 algorithm, with a small number of enhancements and the now standard 3 day review period. As with Anki, failures during the initial 'learning' phase of a group (the first 5 times the group has been seen) are treated less harshly, with the modifier applied to the difficulty rating only progressively approaching its full value. Furthermore, if a group is correctly remembered after its due date, a bonus is applied to reduce its difficulty.
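For reference, a sketch of the unmodified SM-2 update as it is commonly described, using the published 1 and 6 day initial intervals and a minimum easiness of 1.3. It deliberately omits the application's own enhancements (the changed review period, the gentler learning phase and the overdue bonus), and the `Card` class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Card:
    repetitions: int = 0    # consecutive successful reviews
    interval: int = 0       # days until the next review
    easiness: float = 2.5   # SM-2 'E-Factor'


def sm2(card: Card, quality: int) -> Card:
    """Apply one review with a 0-5 quality rating and return the updated card."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Failed recall: restart the repetitions, leave the easiness unchanged.
        return Card(repetitions=0, interval=1, easiness=card.easiness)
    if card.repetitions == 0:
        interval = 1
    elif card.repetitions == 1:
        interval = 6
    else:
        interval = round(card.interval * card.easiness)
    miss = 5 - quality
    easiness = card.easiness + (0.1 - miss * (0.08 + miss * 0.02))
    return Card(card.repetitions + 1, interval, max(1.3, easiness))
```

With these rules a rating of 4 leaves the easiness unchanged, a 5 raises it by 0.1, and a 3 lowers it by 0.14, so groups that are recalled with difficulty come back sooner.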

Because the training mode is built into the same application as the game, it seems like it would be possible to use in-game experience to objectively adjust training weights of anagram groups identified in game. Unfortunately, there are a number of reasons why this is difficult:

Instead, the stats section was designed in the hope of surfacing the same sort of actionable information from the game history.

Implementation

The main challenge with implementing Boggle is the size of the dictionary, which ranges from ~180,000 to ~280,000 words depending on which dictionary is used. Even just the words themselves, stored in a deterministic acyclic finite state automaton or an efficiently compressed trie, still take up around 2MB. Furthermore, since the application shows definitions by default on successful entry of a word, it ends up requiring the full dictionary anyway, which balloons the size by 10-15x. Unfortunately, this has a large impact on initial startup, as downloading a 30MB dictionary anywhere is going to be slow.
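To illustrate why a trie helps, here is a minimal uncompressed trie built from nested dictionaries. It only demonstrates prefix sharing and lookups; it is not the application's encoding, and a deterministic acyclic finite state automaton would go further by also merging shared suffixes. All names are hypothetical.

```python
END = "$"  # marker key for "a word ends at this node"


def build_trie(words):
    """Store words letter by letter so that common prefixes share nodes."""
    root = {}
    for word in words:
        node = root
        for letter in word:
            node = node.setdefault(letter, {})
        node[END] = True
    return root


def lookup(trie, text):
    """Return (is_prefix, is_word) for `text`."""
    node = trie
    for letter in text:
        node = node.get(letter)
        if node is None:
            return (False, False)
    return (True, END in node)


def count_nodes(trie):
    """Count the nodes, including the root, to show how much is shared."""
    return 1 + sum(count_nodes(child) for key, child in trie.items() if key != END)
```

For 'CAT', 'CATS' and 'CAR' the trie needs 6 nodes (including the root) to hold 10 letters, and the `is_prefix` result is what lets a board solver abandon dead-end paths early.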

If supporting offline play was not a requirement, the application could compute board solutions server side and send down only the required definitions and words for the game (or do point lookups on demand for something like define mode). Even with the need to support offline mode, the initial download could be slightly optimized by first shipping down just the encoded word data structure without definitions (and only for the specific dictionary configuration, given CSW is ~50% larger than the other two) so that the initial board could be solved, and then shipping down the definitions in a secondary download so that the user could begin playing immediately (racing the definitions download). However, this approach was not chosen, as both of its consequences are undesirable: either definitions are not shown until the second download has completed, or input on the board is blocked until the downloads finish.

As such, the best that can be done is attempting to hide the latency (and the dreaded loading spinner UI) of the dictionary download by inserting the menu splash instead of dropping the user directly into a game, and using the Service Worker API to cache the dictionary to make future loads as fast as possible. Given the number of features and modes supported by the application, using the menu as the loading screen isn't actually a terrible UX decision, and if the user hesitates long enough when choosing where to navigate, the entirety of the dictionary download may have time to finish in the background.

Computing the training set or processing game history can also cause problems as they grow linearly. The training set thankfully grows fairly slowly because it is bounded by human memory, though it is still relatively expensive to initially compute as it involves iterating through the entirety of the dictionary. Saving the history of each game played is also problematic, as the longer a user uses the application the slower their experience will get. However, it's possible to minimize this impact by bounding the amount of recent history examined (eg. only looking at the last 500 games for stats) or by lazy loading (as is done in the past scoring breakdown).