Solving small puzzles (with just a few lines of code)

For my first blog post I decided to choose subject that is just really me: Solving small puzzles. It's one of my weird hobbies that requires a bit of understanding of the puzzle and of course some programming skills. I have used this same idea that I'm going to present here to many puzzles.

Background of the puzzle: Survo

Survo puzzle is invention of Seppo Mustonen. He invented Survo puzzles in 2006, so it's quite new. Idea of Survo puzzle is similar to Sudoku. You need to add numbers to the puzzle so that row and column sums are same as given values. Biggest difference to Sudoku is the number of solutions. In Survo puzzle there will be only one possible solution. Puzzles are created so that solution is unique. As opposite to the Sudoku that might not be the case. Also numbers that are used to solve the case isn't limited to 1 to 9. Board size isn't fixed to some predefined size like in Sudoku and it's typically something like 3x3, 4x3 10x2 or 6x6. All this will make sense when we look at the example. More background can be find from Survo puzzle homepages or from paper written by Seppo Mustonen. It also worth mentioning that they give degree of the difficulty for the puzzle. Really easy puzzles are something around 1 to 10 and the most challenging Survo puzzle is 17000 (which is also known as the beast).

Important note: All Survo puzzles used in this page are taken from the Survo puzzle homepages or from published places such as Helsingin yliopiston tiedelehti (=Science magazine of Helsinki University).

First example of Survo puzzle

Here is our first Survo puzzle (already inside my application :). For now let's concentrate only to the grid where is many empty slots and few given numbers. We can see that this puzzle is 4x3 and puzzle has 3 extra numbers given already. Idea is to fill those white slots with numbers 1 to 12 (which comes from 4x3). And numbers 3, 6 and 8 are already given to the grid. So there is still available 1, 2, 4, 5, 7, 9, 10, 11 and 12 (=9 numbers). Those numbers should be put to white slots so that each row and column sum matches the values given in the gray slots. Ex. On column 1 you could put 10-8-9 (8 was already given) which satisfies the rule 27. But in order to those be good selections you need to still be able to satisfy the row restrictions too: 30, 18 and 30.

Normally people use paper and pen to solve these kind of problems, but I don't like that :) I prefer creating program that does the dirty work for me. And this kind of problems are typically solved with Brute force / Game theory or whatever you want to call it. The idea is anyway to go through every possible configuration at the board, but use some kind of knowledge of the game to rule out bad configurations or selections away. And why do we want to do that? Well just because it would take a lot of time to go through all the possible combinations. In this example case it would be something like 9! = 362880 combinations. Since our example puzzle is really simple (degree of difficulty isn't know for this example) we don't need to create super Algorithm to solve that. But when board size grows and is something like 5x5 it's totally a different case: 25! = 1.55 * 10^25. That kind of puzzle takes a long time to solve if you go through every possible combination without any optimizations that would rule out unfit solutions. Next I'm going to tell you how I started solving this problem.

Creating the Survo Puzzle Solver

I started working on my Survo Puzzle Solver with Visual Studio 2005 and C#. As you can see from the UI of the application -> I'm not UI Designer :) I just created minimal set of buttons and grid that displays the current situation on the board. Then I created first version of my Algorithm. Here it is in Pseudocode:

 1: Bool Solve 2: { 3:   If ( board is full) 4:   { 5:     // Evaluate returns true if solution is found 6:     // or false if solution isn't correct one. 7:     return Evaluate(); 8:   } 9:   Foreach(location in board)10:   {11:     If (location is free)12:     {13:       Foreach(choice from available choices)14:       {15:         addChoiceToBoard(choice);16:         If (Solve())17:         {18:           // We have solution for this puzzle!19:         }20:         removeChoiceFromBoard(choice);21:       }22:     }23:   }24: }</solver_code1>

So you can easily see that our code uses Recursion. It means that it calls itself (line 16). Solver code 1  inserts available numbers to the board slots that are free and when it notices that board is full (line 3) it validates the solutions (line 7). If Evaluate method returns true our solution has been found:

This solver code 1 clearly solves the problem... right? It does, but not completely. This approach has a lot to improve. I call this phase 1: Code the functionality without any optimization. We have now solution that works, but it's simply isn't usable for a bit harder puzzles. It just takes too long time to get the solution. So now it's time to analyze our solution little bit and try to optimize it to be more robust. If you run that kind of code you would see that it would fill entire board and then validate the correctness of the solution (see image below which is taken during the solving process):

And you can see right away that it isn't correct because the first row doesn't pass the check: 1 + 6 + 2 + 4 = 30. If that part doesn't match then why have we continued with search under it? It's even more noticeable from treelike presentation (click images to see them original size)(Note: Now you know that I can't even draw :).

Full tree search

Optimized search tree

In the first tree every single node of the tree needs to be tested before we know the answer. Optimized search tree shows black nodes that indicates that this subtree can be ignored entirely. So red nodes are the ones that won't even be tested and we still know that the solution can be found from the rest of the nodes! So from this we can go to our next phase. Let's call it phase 2: Simple optimization.

We don't need to modify our code much in order to make simple optimization. Let's add check each time when the row is filled. This one is pretty simple to implement and improves performance quite a lot:

 1: Bool Solve 2: { 3:   If ( board is full) 4:   { 5:     // Evaluate returns true if solution is found 6:     // or false if solution isn't correct one. 7:     return Evaluate(); 8:   } 9:   Foreach(location in board)10:   {11:     If (location is free)12:     {13:       Foreach(choice from available choices)14:       {15:         addChoiceToBoard(choice);16:         If (location is end of the row AND``17:             EvaluateRow())18:         {``19:           If (Solve())20:           {21:             // We have solution for this puzzle!22:           }23:         }24:         removeChoiceFromBoard(choice);25:       }26:     }27:   }28: }</solver_code2>

So before we go to deeper recursion, we'll check the conditions for it (lines 16 and 17). If EvaluateRow method returns true then current row is correctly filled. This improved performance but we haven't reached our goal yet. Since that optimization was easy to make we can take bit harder step next. Let's call it phase 3: return of the optimization! Idea is to check the conditions every time we insert item to the board:

 1: Bool Solve 2: { 3:   If ( board is full) 4:   { 5:     // Solution is found! 6:     return true; 7:   } 8:   Foreach(location in board) 9:   {10:     If (location is free)11:     {12:       Foreach(choice from available choices)13:       {14:         addChoiceToBoard(choice);15:         If (Evaluate``())16:         {``17:           If (Solve())18:           {19:             // We have solution for this puzzle!20:           }21:         }22:         removeChoiceFromBoard(choice);23:       }24:     }25:   }26: }</solver_code3>

Now Evaluate method checks that does the current board configuration satisfy the conditions set. So this kind of example could happen: We have row limit set to 20 and there is 12 + 1+ 3 already put in place in that row (=17). We know that there is still one place to fill and we know what numbers we have left. So if the smallest available number would be 5, then we couldn't satisfy the conditions since 12 + 1 + 3 + 5 isn't 20. So we can take back and ignore that subtree too. It optimizes the search a lot. With that Algorithm we can go solve puzzle of the degree of difficulty 8700 in two seconds!

Of course we need to go through the full search of the tree so that we can verify the uniqueness of the solution. But that's easy to implement. Just don't stop even if solution is found (line 6 on Solver code 3). Just modify it to store the solution somewhere (text file perhaps) and continue the search.

Why would I like to do this? Isn't this kind of solving a bit stupid?

Well yes and no :) Yes in sense, that you could have much more fun solving these by paper and pen. I'm not saying that you can't do that anymore. No in sense that you can use this kind of codebase to test your optimization skills. And Visual Studio is great tool in that! You can run performance/profiling tests and then see where the application spends most of it's time. This is actually really good exercise. You should try it!

Another good thing is that this Algorithm is pretty useful in even more challenging games like chess, tic-tac-toe, checkers, 4 in a row, etc. Creating the base for the solver is easy... extending it with really good Evaluate method is hard. For example in chess you need to take so many variable into count that it makes my head spin. But if you learn from easier games like this you could easily develop such skills. It's always nice to play games that have cool Artificial intelligence.

Wow.. I thought that I start with small first post, but it somehow ended like this. Maybe next time I'll write something shorter.

Anyways... happy hacking!


Updated: I finally had time to test my solver on the "beast" (degree of difficulty 17000 => most difficult ever published Survo puzzle).

And it even suprised me when it managed to solve it in almost 10 seconds! It also verified uniqueness of the solution in 20 seconds. I'll post the solution into here, but don't peek if you do want to solve it by yourself :)
Updated(2): I almost forgot... Jezz Santos found this really nice Sudoku article that you should definitely checkout: Microsoft Sudoku: Optimizing UMPC Applications for Touch and Ink on MSDN by Stephen Toub.