Welcome to the Wordsnakes game web page! Invented by NYU professor Dennis Shasha, this game has inspired his students to compete on who will be the one to come up with the best Wordsnake ever… The funny part of the story is that you are not really in a position to claim with 100% certainty that your Wordsnake is the best possible! But what is a Wordsnake after all? For the original web page describing the game look here. On this web site, following a short description of the game, you can find an algorithm on how to play the game, its implementation in language K, and instructions on how to run it.

Game Description

The rules of the game are rather simple; each player is given the same set of words, which they are asked to arrange them in such away, so as to achieve the maximum score, defined in the following rules.



The technique proposed here for finding the maximum-score word sequence presented here, splits the problem of arranging the words into sub-problems. The input of sub-problem i is a list of words ei = [wji], where 1 £ j £ ni. The output is a modified word list which serves as an input to the next sub-problem, and the current score, initially set to 0.

In each sub-problem, all possible two-word sequences, described as a vector vjk = (wj, wk), j¹ k, are scored according to their overlap. Those that yield the maximum score are stored in set Si. We construct a graph G(V,E), such that for each vjkÎ S, vertices j and k are included in the set of vertices V, and a directed edge (j,k) is added in the set of edges E. The goal is to select as many edges of this graph as possible. Equivalently, we are trying to include in our final sequence as many two-word sequences of maximal score as possible.

We have to note here that adopting such an approach won’t guarantee that the final sequence, i.e. the result of solving all sub-problems, will have the maximum score. Maximizing the score of each sub-problem does not necessarily yield a maximal solution, because including a particular two-word sequence in the solution of sub-problem i reduces the options left for sub-problem i-1. Consider the following two-word sequence: "emerge", "merger". Their score is 25 (overlap of size 5). Including this sequence will yield "emerge" unusable as the first word in a two-word sequence again (now it is bound to "merger"), and similarly, "merger" cannot be used as the second word again. Therefore, when we move to the next sub-problem, where we will be trying to include pairs of overlap 4, we will have lost one candidate pair whose first word is "emerge" and whose second one starts with "erge". Also, we might miss a pair whose second word is "merger" and whose first one ends with "merg". Each of these pairs would yield a score of 16, total 32, which is better than 25. However, it is not true that in such a case we should necessarily opt for the two pairs of score 16 each, instead of the one pair of score 25. After all, we get a score of 25 by just using two words and producing one new (so one word less), whereas the 32 is the result of using four words to produce two new ones (so two words less). Clearly, having more words, in principle leaves us with more options. Also, for overlaps less than or equal to 3, it is always advantageous to include as many of the maximum score pairs as possible. To see that, consider the dilemma of including or not an edge of overlap i (and of score i2). If we choose not to include it, i.e., choose to keep those two words forming the edge apart, then the maximum gain we can earn by attaching these words to other words (in the subsequent step of the algorithm) is 2(i-1)2. It can be easily verified that if i £ 3, then 2(i-1)2 < i2. From the short discussion above, the proposed greedy-like heuristic seems to be a good choice.

However, there are certain restrictions on the way edges may be selected, due to the nature of the original problem. Recall that each word may be used only once (since you may just rearrange them). This implies that when an edge (j,k) representing the two-word sequence (wj,wk) is selected, it prevents the selection of all other edges originating at vertex j and all those that terminate at vertex k.

Each sub-problem is NP-hard. If graph G has a Hamiltonian Circuit (a path that visits each vertex exactly one time), then this path would be a solution to our sub-problem that contains the maximum possible number of edges (|V|-1). Conversely, in a graph where a Hamiltonian Circuit exists, all solutions that select |V|-1 edges must form a Hamiltonian Circuit. However, the problem of defining if a graph has a Hamiltonian Circuit is NP-complete.

The way we tackle our problem is the following. First, we include in our solution all those edges that cause no conflicts, that is, their inclusion in the solution does not imply the exclusion of any other edge. These are the edges e = (j,k), where at vertex j there are no outgoing edges (other that e itself), and at vertex j there are no incoming edges (other than e, course). Then, we choose an edge at random, remove all conflicting edges, and repeat the process, until there are no edges left. Of course, we can apply heuristics, like choosing each time the edge that cause the minimum number of conflicting edges. This is left for future implementation. The idea behind the current implementation is to avoid heuristics other than the basic one that led to the creation of the sub-problems in the first place. Instead, the size of the problem is reduced as much as possible, including for example all non-conflicting edges as explained before, and then randomizing the algorithm, expecting that after few iterations it will hit the maximum.

After a solution for graph sub-problem i is found, each pair of words represented by the selected edges of graph G is considered one word from this point on, having the first word of the pair as its prefix and the second one as its suffix. The words represented by the pairs are removed form the word list, the new words are appended to it, and the new word list is passed on to the next sub-problem i-1. Finally, a value of the maximum score times the size of the solution set S is added to the current score.

All permutations of the word list produced by the last sub-problem yield the same score. These sequences are obtained from the word list after all partial sub-sequences of words created by each sub-problem (potentially nested) are broken up to form the original words.


The algorithm was implemented in the prototyping language K, which can be downloaded from the Kx systems web site. My code is stored here as snake.k. If you install K in Windows, open a command window and type:

k snake word-file-name

K is an interpreted language, so typing the above line at the Windows command prompt will invoke the K interpreter and load the program. To run it, simply type:


at the K prompt, where x is the number of iteration you want the program to perform (remember the algorithm is randomized). Each time a better score is found, it is printed on the screen, and when the program terminates, it prints out the final wordsnake.

The input of the program (word-file-name) is simply a file containing a set of words (the words have to be in small letters), one per line. Here are some sample word files:

If you have any problem email me at tsirigos@cs.nyu.edu.