Sudoku Solver using TensorFlow
Last Updated: 14 Dec, 2023
The goal of the project is to build a Sudoku solver that can complete Sudoku puzzles autonomously using TensorFlow, Google's open-source machine learning library. By learning patterns and relationships in incomplete grids, the model predicts the missing numbers and ultimately produces a solution.
This article walks through the architecture of the Sudoku solver model and its training process using TensorFlow. The project sits at the junction of logic and technology, whether you are a Sudoku enthusiast, a machine learning enthusiast, or both. Let's take on the challenge of creating a Sudoku solver with TensorFlow.
What is Sudoku?
Sudoku is a classic puzzle that has captured the minds of millions worldwide. It’s not only an excellent way to challenge your logical thinking and problem-solving skills but also serves as an intriguing subject for machine learning and artificial intelligence.
Sudoku is a 9×9 grid puzzle with numbers filled in some cells, leaving others empty. The objective is to fill in the empty cells so that each row, column, and 3×3 sub-grid contains all the digits from 1 to 9 without repetition. Sudoku is traditionally solved with backtracking algorithms; here we instead train a convolutional neural network with TensorFlow (Keras) to predict the digits of the empty cells.
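The three constraints can be checked mechanically. The following is a minimal sketch and not part of the original project: the function name is_valid_solution and the row-by-row 81-character string layout are assumptions, chosen to match the puzzle strings used later in the article.

```python
def is_valid_solution(grid: str) -> bool:
    """Return True if an 81-character digit string is a valid completed Sudoku."""
    if len(grid) != 81 or not grid.isdigit():
        return False
    # Split the string into 9 rows, then derive the columns and 3x3 boxes.
    rows = [grid[r * 9:(r + 1) * 9] for r in range(9)]
    cols = ["".join(row[c] for row in rows) for c in range(9)]
    boxes = [
        "".join(rows[br + dr][bc + dc] for dr in range(3) for dc in range(3))
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    digits = set("123456789")
    # Every row, column and box must contain each digit exactly once.
    return all(set(unit) == digits for unit in rows + cols + boxes)


# A valid grid built from a simple shifting pattern, used only as a demonstration.
demo = "".join(
    str((3 * (r % 3) + r // 3 + c) % 9 + 1) for r in range(9) for c in range(9)
)
print(is_valid_solution(demo))  # True
```

A check like this is useful later on, because a neural network's output is only a prediction and is not guaranteed to satisfy the rules.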
Let’s build a Sudoku solver.
Importing required libraries
The implementation requires the following libraries:
Python3
import numpy as np
import pandas as pd
import keras
from keras.optimizers import Adam
from keras.models import Sequential
from keras.utils import Sequence
# Import only the layers the model uses instead of a wildcard import.
from keras.layers import Activation, BatchNormalization, Conv2D, Dense, Flatten, Reshape
Loading data
The following code loads the dataset and creates a DataFrame with columns “quizzes” and “solutions” from the “puzzle” and “solution” columns of the dataset. The “puzzle” column is assumed to contain the initial Sudoku configuration and the “solution” column the correct solution.
Dataset: 9 million Sudoku Puzzles and Solutions
Python3
data = pd.read_csv("/content/sudoku.csv")
# Rename the columns only if the dataset uses the "puzzle"/"solution" names.
if {"puzzle", "solution"}.issubset(data.columns):
    data = pd.DataFrame({"quizzes": data["puzzle"], "solutions": data["solution"]})
Define a Data generator
The following code defines a custom data generator class (DataGenerator) that inherits from Keras’s Sequence class. This class is used to generate batches of data for training the neural network.
Python3
class DataGenerator(Sequence):
    def __init__(self, df, batch_size=16, subset="train", shuffle=False, info=None):
        super().__init__()
        self.df = df
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.subset = subset
        # Avoid a shared mutable default argument for the info dictionary.
        self.info = info if info is not None else {}
        self.on_epoch_end()

    def __len__(self):
        # Number of full batches per epoch; a trailing partial batch is dropped.
        return int(np.floor(len(self.df) / self.batch_size))

    def on_epoch_end(self):
        self.indexes = np.arange(len(self.df))
        if self.shuffle:
            np.random.shuffle(self.indexes)

    def __getitem__(self, index):
        X = np.empty((self.batch_size, 9, 9, 1))
        y = np.empty((self.batch_size, 81, 1))
        indexes = self.indexes[index * self.batch_size:(index + 1) * self.batch_size]
        for i, f in enumerate(self.df['quizzes'].iloc[indexes]):
            self.info[index * self.batch_size + i] = f
            # Scale the digits 0-9 into the range [-0.5, 0.5].
            X[i,] = (np.array(list(map(int, list(f)))).reshape((9, 9, 1)) / 9) - 0.5
        if self.subset == 'train':
            for i, f in enumerate(self.df['solutions'].iloc[indexes]):
                self.info[index * self.batch_size + i] = f
                # Shift the digits 1-9 to class labels 0-8.
                y[i,] = np.array(list(map(int, list(f)))).reshape((81, 1)) - 1
            return X, y
        return X
In the code snippet,
- Initialization (__init__):
- df: DataFrame containing “quizzes” and “solutions” columns.
- batch_size: Number of samples in each batch.
- subset: With “train”, the generator returns (X, y) batches; with any other value it returns only the inputs X.
- shuffle: Whether to shuffle the data.
- info: Dictionary to store information (optional).
- __len__ Method:
- Returns the number of batches in the dataset.
- on_epoch_end Method:
- Shuffles the indexes at the end of each epoch if shuffle is set to True.
- __getitem__ Method:
- Generates one batch of data.
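- Normalizes the input Sudoku puzzles by scaling the digits 0 to 9 into the range [-0.5, 0.5].
- For the “train” subset, also prepares the target solutions, shifting the digits 1 to 9 to class labels 0 to 8.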
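The encoding applied in __getitem__ can be tried on a single string. This is a small illustrative sketch in plain Python; the generator performs the same arithmetic on NumPy arrays. The helper names and the sample row are made up for this example and do not appear in the original code.

```python
def encode_quiz(quiz):
    """Scale each character '0'..'9' to a float in [-0.5, 0.5]; '0' (empty) maps to -0.5."""
    return [int(ch) / 9 - 0.5 for ch in quiz]


def encode_solution(solution):
    """Shift digits 1..9 to class labels 0..8, as sparse categorical crossentropy expects."""
    return [int(ch) - 1 for ch in solution]


def decode_cell(value):
    """Invert the input scaling; rounding absorbs any floating-point error."""
    return round((value + 0.5) * 9)


sample_row = "004300209"  # made-up row for illustration; 0 marks an empty cell
encoded = encode_quiz(sample_row)
print(encoded[0], encoded[8])  # -0.5 0.5
print([decode_cell(v) for v in encoded])
```

The label shift matters because the model's softmax has nine outputs per cell, indexed 0 to 8, while Sudoku digits run from 1 to 9.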
Building the Neural Network
The following code snippet,
- Creates a Sequential model in Keras.
- Adds Convolutional Neural Network (CNN) layers to the model.
- Compiles the model using the Adam optimizer with a specified learning rate and sparse categorical crossentropy loss.
Python3
model = Sequential()
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same', input_shape=(9, 9, 1)))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(Conv2D(128, kernel_size=(1, 1), activation='relu', padding='same'))
model.add(Flatten())
model.add(Dense(81 * 9))
model.add(Reshape((-1, 9)))
model.add(Activation('softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer=keras.optimizers.Adam(learning_rate=0.001), metrics=['accuracy'])
model.summary()
Output:
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 conv2d (Conv2D)             (None, 9, 9, 64)          640
 batch_normalization (Batch  (None, 9, 9, 64)          256
 Normalization)
 conv2d_1 (Conv2D)           (None, 9, 9, 64)          36928
 batch_normalization_1 (Bat  (None, 9, 9, 64)          256
 chNormalization)
 conv2d_2 (Conv2D)           (None, 9, 9, 128)         8320
 flatten (Flatten)           (None, 10368)             0
 dense (Dense)               (None, 729)               7559001
 reshape (Reshape)           (None, 81, 9)             0
 activation (Activation)     (None, 81, 9)             0
=================================================================
Total params: 7605401 (29.01 MB)
Trainable params: 7605145 (29.01 MB)
Non-trainable params: 256 (1.00 KB)
_________________________________________________________________
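The parameter counts in the summary can be reproduced by hand. The sketch below is an illustration rather than part of the original code, and the helper names are made up for this example. It assumes the usual counting rules: kernel weights plus one bias per filter for a convolution, weights plus biases for a dense layer, and four values per channel for batch normalization (two trainable, plus the non-trainable moving mean and variance).

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # Kernel weights plus one bias per filter.
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(inputs, units):
    # Weight matrix plus one bias per unit.
    return inputs * units + units


def batchnorm_params(channels):
    # gamma, beta, moving mean and moving variance for each channel.
    return 4 * channels


layer_params = {
    "conv2d": conv2d_params(3, 3, 1, 64),
    "batch_normalization": batchnorm_params(64),
    "conv2d_1": conv2d_params(3, 3, 64, 64),
    "batch_normalization_1": batchnorm_params(64),
    "conv2d_2": conv2d_params(1, 1, 64, 128),
    "dense": dense_params(9 * 9 * 128, 81 * 9),
}
total = sum(layer_params.values())
# Moving mean and variance of both BatchNormalization layers are not trained.
non_trainable = 2 * 2 * 64

print(layer_params)
print(total, total - non_trainable, non_trainable)
```

Over 99% of the parameters sit in the Dense layer that maps the 10368 flattened features to the 729 outputs.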
In the model definition above,
- The model architecture consists of
- Three convolutional layers with ReLU activation; the first two are each followed by batch normalization.
- A Flatten layer followed by a Dense layer with 81 × 9 = 729 outputs.
- A Reshape to (81, 9): one row per cell and one column per candidate digit.
- A softmax activation that turns each row into a probability distribution over the nine digits.
- The model is compiled with
- Loss: Sparse categorical crossentropy.
- Optimizer: Adam with a learning rate of 0.001.
- Metrics: Accuracy.
The model is then trained for 5 epochs with the specified callbacks. training_generator and validation_generator are both instances of the DataGenerator class. In TensorFlow 2.x, model.fit accepts Sequence generators directly, and the older fit_generator method is deprecated.
Python3
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau

# Shuffle the rows and use the last 5% for validation.
train_idx = int(len(data) * 0.95)
data = data.sample(frac=1).reset_index(drop=True)
training_generator = DataGenerator(data.iloc[:train_idx], subset="train", batch_size=640)
# subset="train" is used here as well so that validation batches include the targets.
validation_generator = DataGenerator(data.iloc[train_idx:], subset="train", batch_size=640)

filepath1 = "weights-improvement-{epoch:02d}-{val_accuracy:.2f}.hdf5"
filepath2 = "best_weights.hdf5"
checkpoint1 = ModelCheckpoint(filepath1, monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
checkpoint2 = ModelCheckpoint(filepath2, monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
reduce_lr = ReduceLROnPlateau(
    monitor='val_loss',
    patience=3,
    verbose=1,
    min_lr=1e-6
)
callbacks_list = [checkpoint1, checkpoint2, reduce_lr]
# model.fit accepts Sequence generators directly; fit_generator is deprecated in TensorFlow 2.x.
history = model.fit(training_generator, validation_data=validation_generator, epochs=5, verbose=1, callbacks=callbacks_list)
Output:
Epoch 1/5
1485/1485 [==============================] - 140s 95ms/step - loss: 0.7082 - accuracy: 0.7124 - val_loss: 0.4045 - val_accuracy: 0.8139
Epoch 00001: val_accuracy improved from -inf to 0.81389, saving model to weights-improvement-01-0.81.hdf5
Epoch 00001: val_accuracy improved from -inf to 0.81389, saving model to best_weights.hdf5
Epoch 2/5
1485/1485 [==============================] - 133s 90ms/step - loss: 0.3917 - accuracy: 0.8190 - val_loss: 0.3809 - val_accuracy: 0.8201
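Model Training Callbacks:
The line callbacks_list = [checkpoint1, checkpoint2, reduce_lr] registers three callbacks:
- checkpoint1 (filepath1): Saves a checkpoint each time the validation accuracy improves, under a file name that includes the epoch number and the validation accuracy.
- checkpoint2 (filepath2): Saves the checkpoint with the best validation accuracy so far under a fixed file name, overwriting the previous one.
- ReduceLROnPlateau callback: Reduces the learning rate if the validation loss has not improved for 3 epochs (patience = 3), down to a minimum of 1e-6.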
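To make the callback behaviour concrete, the sketch below shows how the checkpoint file name template is filled in and gives a simplified simulation of the plateau rule. It is an illustration under stated assumptions and not the Keras implementation: simulate_reduce_lr is a made-up helper that ignores the min_delta and cooldown options, and factor=0.1 is the Keras default (the article does not set it).

```python
# The template is filled with the epoch number and the logged metrics.
template = "weights-improvement-{epoch:02d}-{val_accuracy:.2f}.hdf5"
first_checkpoint = template.format(epoch=1, val_accuracy=0.81389)
print(first_checkpoint)  # weights-improvement-01-0.81.hdf5


def simulate_reduce_lr(val_losses, lr=1e-3, factor=0.1, patience=3, min_lr=1e-6):
    """Return the learning rate in effect after each epoch (simplified plateau rule)."""
    best = float("inf")
    wait = 0
    history = []
    for loss in val_losses:
        if loss < best:
            # Improvement: remember it and reset the patience counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # Plateau reached: shrink the rate, but never below min_lr.
                lr = max(lr * factor, min_lr)
                wait = 0
        history.append(lr)
    return history


# Made-up validation losses: improvement for two epochs, then a plateau.
print(simulate_reduce_lr([0.40, 0.38, 0.39, 0.39, 0.39]))
```

With patience set to 3 and only 5 epochs of training, the learning rate can be reduced at most once during the run.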
Loading the best model weights
Python3
model.load_weights('/content/best_weights.hdf5')
Sudoku solver function:
Python3
def solve_sudoku_with_nn(model, puzzle):
    # Strip newlines and spaces so that only the 81 digits remain.
    puzzle = puzzle.replace('\n', '').replace(' ', '')
    initial_board = np.array([int(j) for j in puzzle]).reshape((9, 9, 1))
    initial_board = (initial_board / 9) - 0.5
    while True:
        predictions = model.predict(initial_board.reshape((1, 9, 9, 1))).squeeze()
        # Classes 0-8 correspond to digits 1-9.
        pred = np.argmax(predictions, axis=1).reshape((9, 9)) + 1
        prob = np.around(np.max(predictions, axis=1).reshape((9, 9)), 2)
        initial_board = ((initial_board + 0.5) * 9).reshape((9, 9))
        mask = (initial_board == 0)
        if mask.sum() == 0:
            break
        # Fill only the empty cell with the most confident prediction.
        prob_new = prob * mask
        ind = np.argmax(prob_new)
        x, y = (ind // 9), (ind % 9)
        val = pred[x][y]
        initial_board[x][y] = val
        initial_board = (initial_board / 9) - 0.5
    solved_puzzle = ''.join(map(str, initial_board.flatten().astype(int)))
    return solved_puzzle
In the above code snippet,
- def solve_sudoku_with_nn(model, puzzle): Defines a function named solve_sudoku_with_nn that takes a neural network model (model) and a string representation of a Sudoku puzzle (puzzle) as input.
- puzzle = puzzle.replace('\n', '').replace(' ', ''): Removes newline characters and spaces from the input puzzle string.
- initial_board = np.array([int(j) for j in puzzle]).reshape((9, 9, 1)): Converts the string to a NumPy array of integers and reshapes it to a 3D array representing the Sudoku grid.
- initial_board = (initial_board / 9) - 0.5: Scales the values in the array to be between -0.5 and 0.5.
Solving the Puzzle with the Neural Network:
- while True: Starts a loop that runs until no empty cells remain.
- predictions = model.predict(initial_board.reshape((1, 9, 9, 1))).squeeze(): Uses the neural network to predict a probability distribution over the nine digits for each of the 81 cells.
- pred = np.argmax(predictions, axis=1).reshape((9, 9)) + 1: Takes the most probable class for each cell, adds 1 to map the classes 0 to 8 back to the digits 1 to 9, and reshapes the result into a 9×9 grid.
- prob = np.around(np.max(predictions, axis=1).reshape((9, 9)), 2): Extracts the probability of the most likely digit for each cell, rounded to two decimals, and reshapes it into a 9×9 grid.
Updating the Sudoku Grid:
- initial_board = ((initial_board + 0.5) * 9).reshape((9, 9)): Rescales the Sudoku grid to the original range (0 to 9).
- mask = (initial_board == 0): Creates a mask for identifying empty cells in the Sudoku grid.
Checking for Completion:
- if mask.sum() == 0: Checks if there are no more empty cells in the Sudoku grid, which means every cell has been filled.
- break: Breaks out of the loop once the grid is full.
Selecting the Next Cell:
- prob_new = prob * mask: Applies the mask to the probabilities to consider only empty cells.
- ind = np.argmax(prob_new): Finds the index of the maximum probability among the empty cells.
- x, y = (ind // 9), (ind % 9): Converts the 1D index to 2D coordinates.
Updating the Grid with Predicted Value:
- val = pred[x][y]: Gets the predicted digit for the selected empty cell.
- initial_board[x][y] = val: Updates the Sudoku grid with the predicted digit.
- initial_board = (initial_board / 9) - 0.5: Rescales the Sudoku grid for the next iteration.
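The selection step is the core of the strategy: only one cell is filled per pass, namely the empty cell whose prediction is most confident, and the network is then queried again on the updated grid. The following plain-Python sketch mirrors that step without NumPy or a trained model. The function name pick_next_cell and the toy probabilities are assumptions made for illustration.

```python
def pick_next_cell(board, probs):
    """
    board: list of 81 ints, where 0 marks an empty cell.
    probs: list of 81 lists with 9 class probabilities each (class k means digit k + 1).
    Returns (row, col, digit) for the most confident empty cell, or None if the board is full.
    """
    best = None
    for idx, (cell, dist) in enumerate(zip(board, probs)):
        if cell != 0:
            continue  # already filled cells are masked out
        confidence = max(dist)
        if best is None or confidence > best[0]:
            best = (confidence, idx, dist.index(confidence) + 1)
    if best is None:
        return None
    _, idx, digit = best
    # Convert the flat index into row and column coordinates.
    return idx // 9, idx % 9, digit


# Toy data: a board with two empty cells and hand-written probabilities.
board = [5] * 81
board[10] = 0
board[40] = 0
probs = [[1 / 9] * 9 for _ in range(81)]
probs[0] = [0.99] + [0.00125] * 8  # very confident, but cell 0 is not empty
probs[10] = [0.05] * 3 + [0.6] + [0.05] * 5
probs[40] = [0.0125] * 6 + [0.9] + [0.0125] * 2

print(pick_next_cell(board, probs))  # (4, 4, 7)
```

Filling one cell at a time lets every later prediction take the earlier, more certain choices into account, at the cost of one model call per empty cell.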
Conversion to String Representation:
- solved_puzzle = ''.join(map(str, initial_board.flatten().astype(int))): Converts the solved Sudoku grid back to a string representation.
Returning the Solved Puzzle:
- return solved_puzzle: Returns the solved Sudoku puzzle as a string.
Example Sudoku puzzles
Python3
def print_sudoku_grid(puzzle):
    puzzle = puzzle.replace('\n', '').replace(' ', '')
    for i in range(9):
        # Draw a horizontal separator between the 3×3 bands.
        if i % 3 == 0 and i != 0:
            print("-" * 21)
        for j in range(9):
            # Draw a vertical separator between the 3×3 stacks.
            if j % 3 == 0 and j != 0:
                print("|", end=" ")
            print(puzzle[i * 9 + j], end=" ")
        print()

# Assign 81-digit puzzle strings here (0 marks an empty cell; spaces and newlines are ignored).
new_game = ...
game = ...
solved_puzzle_nn = solve_sudoku_with_nn(model, game)
print("Sudoku Solution (NN):")
print_sudoku_grid(solved_puzzle_nn)
Output:
Here is the link for the Kaggle notebook: Kaggle notebook on Sudoku solver