Sudoku Dragon
Sudoku Dragon
Home 
Sudoku 
Solver 
Daily 
Online 
Forum 
Books 
Resources 
Site Map 

AddThis Social Bookmark Button
Sudoku Solver
Download our Sudoku Dragon for a free 22 day trial on your PC. It includes all the features you need to help solve puzzles whether you are new to Sudoku or an expert.
Today's Dragon Tip
Full backtrack
If you change your mind about a square allocation you can simply undo it, even if the allocation was done a few moves ago. Just use CTRL and the Z key to undo.
Read More

Sudoku Puzzle Theory

Sudoku is all about permutations, but permutations with an extra twist of logic. To look into the theory behind Sudoku you need to first look into permutations.

Permutations

A permutation is just a particular ordering of symbols. In Sudoku it is insisted that there is only one occurrence of each symbol (or number) in each group (row, column or region).

So for two symbols there are only two possible orders {1;2} and {2;1} with three there are six {1;2;3}, {1;3;2}, {2;1;3}, {2;3;1}, {3;1;2} and {3;2;1} possible orders and for four there are now 24 permutations of the four symbols. The number of permutations is the factorial of the number of symbols in the set, as each time an extra element is added to a set of size 'n' that element multiplies up the number of arrangements of the previous set size. So for the standard Sudoku set size of 9 we have 9 factorial (represented in maths as 9!) or 9x8x7x6x5x4x3x2x1 possible permutations which works out as 362,880 possible ways of ordering the nine symbols in a row, column or region.

Properties of permutations

On their own permutations do not look at all exciting. But you have a permutation you can create other valid permutations from it by doing one of the following:

Swap the symbols consistently
In a permutation you can always swap all occurrences of one symbol for another as long as the swap is done systematically and in reverse too. If you swap 4 with 1 then the 1 must be swapped to a 4 (e.g. 4;2;3;1 would become 1;2;3;4). Several or all symbols can be swapped in this way. The result will always be another valid permutation.

Shifting the order
Permutations are by their nature just an ordering so you can swap the order of elements however much you like and the result is also a permutation. For example you can shift the symbol '8' from the start to the end so (8;4;5;6;1;2;7;9;3) becomes (4;5;6;1;2;7;9;3;8) or swap each element with its neighbor in pairs so (8;4;5;6;1;2;7;9;3) becomes (4;8;6;5;2;1;9;7;3).

Arithmetic Analysis

The original related puzzle of Magic squares has the property that all the numbers in rows and columns add up to the same number. This is also a property of permutations, if you add up the individual numbers in the set that make up the permutation then this will always give the same result. This is because addition is associative, it does not matter in which order you add up the numbers, you always end up with the same result ((5+1)+2) = (5+(1+2)). The same is true of multiplication but is not true of all the simple arithmetic operations. Both subtraction and division give different results depending on the order that the operation is carried out (4/3)/2 is not the same as (3/4)/2.

So if we add up all the numbers in a completed 9x9 Sudoku row, column or region we will always get the same total: 45, and if they are multiplied together the answer always is 362,880 (again this is our 9 'factorial' or 9!)

This is starting to look useful because if we have a number missing we can deduce which number it actually is. For example, if there is one number missing and the sum of the other numbers is 40 the missing number must be a '5' to make up the total for the group of nine of 45. This does not work in general as for example if two numbers are missing and the total is 38, it can't be directly deduced what the numbers are, it could be any two numbers that add up to 7 (either 2 and 5; or 1 and 6; or 3 and 4). Similarly with multiplication, the value of missing number this can be deduced by multiplying together all the numbers that are there and dividing this product into 362,880. For example if the group is 8;4;6;1;2;5;9;3 the product is 51,840 so dividing this into 362,880 we get the answer '7' as the missing number. Unfortunately, just like addition we can't use this scheme to determine which of two or more numbers are missing. There is more than one choice of numbers that give the same answer.

Before getting any further into any more theoretical analysis, let's simplify by using the 4x4 Sudoku grid rather than 9x9 just to reduce the number of options. In the 4x4 grid, the rules are just the same but there are only four numbers 1;2;3;4 in each row; column and region. So in the case a permutation of all the numbers must add up to 10 and the product of the numbers is 24. [Sudoku Dragon supports ten different puzzle sizes including the 4x4 and 16x16 sized puzzles.]

Gödel numbers

The problem with deducing missing numbers does not arise if we use a form of Gödel numbers. Here we don't just multiply the Sudoku numbers together we use the corresponding prime number. So for a 1 we use the first prime number 2, for 2 use the second prime 3; for 3 use 5; for 4 use 7 etc..

All Sudoku groups in the 4x4 grid must now multiply up to give a product of 2x3x5x7 = 210 rather than 24. If a Sudoku 4x4 group is 2;;3;; with two missing numbers we convert these to the corresponding prime and multiply them together 3 x 5 giving 15. Now 210 / 15 = 14. So the two missing numbers multiplied together must give 14 and there are only two numbers that can do that 2 and 7. Using the correspondence 'encoding' table shows that 1 (for the 2) and 4 (for the 7) are the missing numbers. We can use this trick for finding any number of missing numbers in a Sudoku group (of any size) just by a little multiplication and division. Why does this work? Because prime numbers are well, prime, we can't make a prime number by multiplying two other numbers together.

Note : the product of all primes up to a specific prime number such as 210 is given a special name : a 'primorial'.

In the case of 9x9 grid we need the next five extra prime numbers for 5 use 11; for 6 use 13; for 7 use 17; for 8 use 19 and for 9 use 23. So for example an incomplete row of 8;4;6;1;2; ;5;9;3 we convert these to the corresponding prime and multiply them together 19x7x13x2x3x11x23x5 = 13,123,110 but the product of all numbers should be 223,092,870 (a primordial) which means the missing number is 17 or back in the normal Sudoku world '7'. Now try a Sudoku group with three numbers missing 9;3;;6;;1;;2;8 the Gödel number using our encoding gives 170,430 dividing this into the full product 223,092,870 gives an answer of 1,309. Now because of the properties of prime numbers there is only one way of generating 1,309 using only prime numbers and that is by multiplying 7 x 11 x 17, and so the missing numbers must be the corresponding numbers 4; 5 and 7.

More logical Approach

Having looked at traditional arithmetic to investigate the properties of Sudoku it is worth looking at the puzzle in a different way to find a thankfully simpler analysis. Sudoku can be done with any old symbols not just numbers, they can be colors, shapes, pictures: anything as long as they are distinct and there is only one per group. We shouldn't get hung up about using 'numbers', the problem is more general than that. What is needed then is just a simple 'yes' or 'no' approach, to the question 'is the symbol in this position?'. This can then be easily encoded into binary ('1' or '0'). For the 4x4 grid we can then encode the permutation of 3;4;1;2 as four binary numbers: 0100 (for position 1 there is a 3);1000 (for position 2 there is a 4);0001 (for position 3 there is a 1) and 0010 (for position 4 there is a 2). We put a binary '1' to indicate the presence of the symbol at the corresponding position. To be a well formed Sudoku group the binary addition (or logical or) of all these numbers must give a result of 1111 to confirm that each of the four symbols has occurred once in all the four positions in the group. This can then be used to find missing numbers, a group with 2;;3;; would be translated as 0010;;0100;; logically ored together gives 0110, so we can now tell at a glance that it is a '4' and a '1' that are the missing numbers as they correspond with the missing symbols represented by '0's. So rather than using Gödel numbers the same reasoning can be applied to Sudoku much more straightforwardly using boolean 'yes/no' operations.

Actions and Operations

There is another way of looking at permutations and symbols. You can think of the contents not as symbols but as operations to perform. A permutation is just saying that you need to perform a set of operations only once but in any order. To make this more everyday consider four operations : getting dressed; cleaning teeth; collecting the post and eating breakfast. We might do each of these every morning, only once and they can be done in any order. If you think of these as a permutation with numbers 3;2;4;1 might represent collect post; clean teeth; eat breakfast; get dressed. That's established the idea of thinking of a permutation as a sequence of operations done in time order rather than symbols. Looking at this in a simpler, more mathematical sense we could treat each symbol as a move along a vector. An 'operation' is in terms of moving a certain amount in a certain direction. So we could use '1' as move 2 units N ; '2' as move 2 units S; '3' as move W 2 units and '4' move E 2 units. These are chosen so that the end result of completing all these steps takes you back to where you started. This is important as it makes the sequence of operations into a mathematical concept known as a 'ring'. Other examples of these sorts of patterns are square dances where after a number of moves, twists and turns you end up where you started. Other vectors could have been chosen. Just like addition the moves can be done in any order to achieve the same end position. Interestingly this approach allows missing moves to be deduced by just combining the known moves and working out how to get back to the original point.

Combining permutations

If Sudoku were just a matter of single groups of symbols then it wouldn't be much of a challenge. The complexity comes from combining the restriction of rows and columns into a grid. There are three constraints on all the squares it must be unique in the row; column and region.

Sudoku Possibility Analysis

Apart from the simplest cases (where only one choice is available) the solving of a Sudoku puzzle involves analysing possibilities. Each unsolved square can have one or more possibilities. Each unsolved symbol must be possible in one or more square in a group.

If we look at a sudoku group on its own then all the unused symbols can occur in any of the unsolved squares. However taking the other groups that share squares with this group reduces the number of possibilities. So for a row having 7; ; ;6;3; ; ;2;5 the four missing numbers are 1;4;8;9 these could on their own occur in any permutation within the squares. Other groups (columns, regions) may constrain the choice down to {1;4;8}, {1;8;9}, {4;8} and {1;4;8;9} for the respective four empty places. Each of these is a subset of the missing numbers (1;4;8;9) and it is the patterns of these subsets that are used to deduce additional constraints on the contents of the squares. For example, the possibility subsets could have been {1;8} {1;8} {4;8;9} and {4;9}. The {1;8} is an example of a 'naked twin' there are two squares with just two possibilities and this means that the 1 must go in one of the two places and 8 in the other place. Therefore the square with {4;8;9} as possibilities can be safely restricted down to {4;9} as the 8 can't occur there. This is about the simplest case of how analysis of possibilities can be helpful in reducing the number of possibilities. By using the knowledge that a symbol may occur only in a subset of the squares we can deduce not where exactly it can go but deduce where it can not go.

General possibility rule

The 'naked twin' rule is just the simplest example of a general rule for Sudoku possibilities. The general rule is that if there are 'n' symbols and all possibilities for these symbols are located in a subset of the 'n' squares within a group then we have a sub-group of possibilities. Apart from the twin example there is the 'chain'. If the possibilities were {1;4} {4;8} {8;1} and {1;4;9} then the first three form a chain of the symbols {1;4;8} and preclude the occurrence of these symbols in the last square, in this case the only choice left is '9'. A chain is a closed loop of symbols that imply that the symbols must occur only in this group but it can not be determined where precisely the numbers will go. In the case of {1;4} and {4;1} in a group this is just saying the allocation is either {1} and {4} or {4} and {1} so the 1 and 4 can not occur anywhere else. The same logic applies to 3 or more symbols it is not limited to just two. If we had {1;4;9} {1;4;9} {1;4;9} and {4;8} then the three squares form a triplet of {1;4;9} and the other square must be the 8.

Beyond the linear dimension

Much of this analysis so far has looked at one Sudoku group in isolation. Each square is however a member of three groups (row; column and region) and so the constraints for the square in one group apply equally to the other groups. So if a column requires that {1;4;9} are the possibilities for a square in the column and the row it is in gives {2;4;9} as possibilities for the same square then the combined constraint is that the square can only be 4 or 9 {4;9}. If in addition the region gives possibilities {3;4;5} then that would only leaves {4} as the only possibility that meets the requirements of the row; column and region. The knowledge that a {4} must go in the square can be fed back into the constraints for the three groups as it precludes {4} being a possibility anywhere else in the groups.

The complication does not end there. Some of the constraints are 'indirect' meaning that the implication for one square will limit what can go in another square, not directly but because of the shared groups it is in. The simplest example of this is the X-Wing. Here four groups are logically inter-linked to form a 'box'. If the possibilities form a particular pattern then the corners of the box must be in one of only two configurations. The Sudoku rules are therefore applying a two-dimensional constraint involving four groups (two rows and two columns). This can get even more complicated as more groups are involved... These cyclic dependencies result from the ways that squares are connected via groups into grids. Both humans and computers struggle to find these interrelationships between square possibilities. Luckily most Sudoku puzzles can be solved without resorting to using them.

Computing a solution

Finding patterns within these permutation subsets is definitely a 'hard' problem. This is not just a human description, it is just as tough for computers too. Pattern matching problems like Sudoku belong to the class of the difficult problems to solve : The NP complete class. This is because any analysis has to look through all combinations, it can not just do it in a single linear scan of the permutations. To spot a 'twin' a computer needs to look through all possible combinations of two squares in a group. The time taken to solve does not grow linearly with problem size it grows exponentially. If it takes 2 seconds to solve a problem of size '3' it will take much more than 4 seconds to solve a problem of size '6'. There is no simple 'trick' that a human or computer can use to solve sudoku puzzles in general.

The simplest algorithm to try is check all the possibilities in turn without looking for excluded possibilities or only choices. This can take a very long time to do as there just so many to check - the crudest algorithm would work through something like 10 to the power 47 combinations (that's 10 with 47 noughts after it).

Sudoku Puzzle Sizes

The 4x4 Puzzle

There is nothing that special about 9x9 Sudoku puzzles, it just turns out to make a fairly difficult Sudoku puzzle that can be solved in a reasonable time 9x9 is a good choice. However it is quite possible to go for a smaller or larger size of grid.

With 4x4 there are only 16 squares in total and it is impossible to create a difficult puzzle. The minimum number of squares that can be revealed and still produce a solvable puzzle is four. I have never seen a 4x4 with only 3 initial squares revealed - this may be provable to be an impossible starting arrangement.

The 16x16 Puzzle

Moving beyond the regular 9x9 puzzle size is possible but introduces nothing new other than more possibilities to work through. All the strategies you might use for solving 9x9 can be adapted to the larger 16x16 grid. For example the 'two out of three' strategy becomes 'three out of four' strategy. You can still use all the basic strategies : only choice; forced choice; excluded twin and excluded subgroup. It becomes much harder for mere humans to solve when there are 16 rather than 9 numbers/symbols to reason about. But with practice these grids can become quite straightforward to solve too.

Larger Sudoku Puzzles

There is no theoretical limit on the size of a Sudoku puzzle. The rules are generic and by mathematical induction it can be shown that they can just grow and grow. To keep to a square arrangement the number just goes up in squares : 16x16; 25x25; 36x36 (over a thousand squares to complete); 49x49; 81x81; 100x100 (10,000 squares in the puzzle).

Rectangular Puzzles

If you have a set of same sized rectangles you can always arrange them into a square grid. One rectangular Sudoku option is 3x5 blocks arranged as three blocks wide and five blocks deep giving 15 regions in all. The same can be done with many other sizes (e.g. six 2x3 blocks), in fact the only grid sizes that can't be generated are those that are prime numbers and so can't be divided up into a rectangular block (e.g. a 5x5 puzzle can not be created). Sudoku Dragon supports 2x3, 2x4, 3x5, 3x5, 4x5, 2x5 and 2x7 rectangular block puzzles giving puzzle sizes of 6, 8, 15, 20, 10 and 14 respectively.

The smallest Sudoku Puzzle

For the pure mathematicians amongst you, it may be interesting to note that from a pedantic point of view 4x4 is not the simplest size of Sudoku puzzle. If instead of all groups having 4 squares in it we have just 'one' square then, even though it is rather academic, the 1x1 sudoku puzzle has 1 row and 1 column with 1 symbol occurring just once in each row and column. There is only one region too with only 1 square in it. So there is only one Sudoku puzzle of size 1 and the solution is 1. Mathematicians may appreciate the symmetry as it shows that the Sudoku rules are general for any group size including 1 upwards. It just happens that 1 is a seriously simple puzzle to solve!

Today's Quotation
"The best laid schemes o' mice an' men Gang aft agley, An' lea'e us nought but grief an' pain For promis'd joy!"

Robert Burns (To a mouse) 1785

sudokudragon Give our SudokuDragon puzzle solver a free 22 day trial by visiting our download page. It can selectively help you quickly identify possibilities and exclude the ones that actually aren't possible.

contribute Any comments on this page ? Click here to contribute

Copyright © 2005 - 2008 Sudoku Dragon