Tic Tac Toe CNN

by Daniel Pollithy

Experiment: Writing a CNN for tic tac toe

Let’s write a CNN which detects the current state of a tic tac toe game.

Although it would probably be easy to write an OpenCV script that does this, I want to see whether I have learned something from the books I have read recently.

Planning

First of all I am going to write a simple script which generates pixel maps for given tic tac toe states. There are three mutually exclusive states for every cell: Cross, Circle or Empty.

Then I am going to test-train nine CNNs, one per cell, each with a softmax over the three classes.

When this works I am going to actually train the models in the Google Cloud.

Pixel map generation

Every cell should be 8x8 px, with two vertical and two horizontal borders in total: 3 * 8 px + 2 * 1 px = 26 px, so the image is 26 px x 26 px. The pixel maps are fixed templates for every cell.
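
The layout arithmetic can also be written down explicitly. The following is only a small sketch: the names `CELL_SIZE`, `BORDER`, `cell_offset` and `border_position` are illustrative and do not appear in the rest of the code. It assumes that every border is exactly 1 px wide.

```python
CELL_SIZE = 8   # each cell is 8x8 px
BORDER = 1      # assumption: each grid line is 1 px wide
CELLS = 3       # 3 cells per row and per column

# 3 cells plus the 2 borders between them
IMAGE_SIZE = CELLS * CELL_SIZE + (CELLS - 1) * BORDER


def cell_offset(index):
    """Pixel offset of the cell with the given row or column index."""
    return index * (CELL_SIZE + BORDER)


def border_position(index):
    """Pixel row/column of the border that follows the given cell index."""
    return cell_offset(index) + CELL_SIZE


print(IMAGE_SIZE)                                       # 26
print([cell_offset(i) for i in range(CELLS)])           # [0, 9, 18]
print([border_position(i) for i in range(CELLS - 1)])   # [8, 17]
```

These are the numbers used in the drawing code: grid lines at pixel 8 and 17, and cell offsets in steps of 9.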

Draw the grid

import numpy as np
from skimage.draw import line, circle_perimeter
from skimage import io
from matplotlib import pyplot as plt

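def get_grid():
    # Empty 26x26 image with the four grid lines set to 1
    img = np.zeros((26, 26), dtype=np.uint8)
    # Horizontal lines in rows 8 and 17
    rr, cc = line(8, 0, 8, 25)
    img[rr, cc] = 1
    rr, cc = line(17, 0, 17, 25)
    img[rr, cc] = 1
    # Vertical lines in columns 8 and 17
    rr, cc = line(0, 8, 25, 8)
    img[rr, cc] = 1
    rr, cc = line(0, 17, 25, 17)
    img[rr, cc] = 1
    return img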

io.imshow(get_grid())
plt.show()

Draw circles and Xs

def draw_x(img, offset_x=0, offset_y=0):
    # Two diagonals inside the 8x8 cell whose top-left corner is at the offset
    rr, cc = line(1+offset_y, 1+offset_x, 6+offset_y, 6+offset_x)
    img[rr, cc] = 1
    rr, cc = line(6+offset_y, 1+offset_x, 1+offset_y, 6+offset_x)
    img[rr, cc] = 1
    return img

def draw_o(img, offset_x=0, offset_y=0):
    # Circle outline with radius 2 around the pixel (3, 3) of the cell
    rr, cc = circle_perimeter(3+offset_y, 3+offset_x, 2)
    img[rr, cc] = 1
    return img

img = get_grid()
img = draw_x(img, offset_x=9, offset_y=0)
img = draw_o(img)
img = draw_o(img, offset_x=18, offset_y=0)

io.imshow(img)
plt.show()

Draw a full game state

def draw_num_state(num_state):
    # num_state holds 9 values in row-major order: 0 = Empty, 1 = X, 2 = O
    img = get_grid()
    for y in range(0, 3):
        y_offset = y * 9
        for x in range(0, 3):
            x_offset = x * 9
            i = y * 3 + x
            cell_type = num_state[i]
            if cell_type == 1:
                draw_x(img, offset_x=x_offset, offset_y=y_offset)
            elif cell_type == 2:
                draw_o(img, offset_x=x_offset, offset_y=y_offset)
    return img

test_num_state = [1, 2, 1, 2, 1, 2, 1, 0, 0]
img = draw_num_state(test_num_state)

io.imshow(img)
plt.show()

Generate samples

The following code generates multiple game states and images of them. Every cell is sampled independently at random, so most of the states are not valid tic tac toe positions.

def generate_game_states(n=10):
    imgs = []
    states = []
    for i in range(n):
        # Sample each of the 9 cells independently from {0, 1, 2}
        states.append(np.random.choice(3, 9))
        imgs.append(draw_num_state(states[-1]))
    return imgs, states

def draw_game_states(images, num_states):
    # Show every image together with the state it was generated from
    for image, num_state in zip(images, num_states):
        print("GAME STATE: {}".format(num_state))
        io.imshow(image)
        plt.show()

images, num_states = generate_game_states()
draw_game_states(images, num_states)

CNN

Now that we can generate data we want to train a model. A single CNN that classifies the whole board would need 19,683 output classes (all possible combinations, which is 3^9). That is why I want to train a separate classifier for each cell.

%%time
train_images, train_num_states = generate_game_states(n=5000)
train_images = np.array(train_images)
train_num_states = np.array(train_num_states)

test_images, test_num_states = generate_game_states(n=3000)
test_images = np.array(test_images)
test_num_states = np.array(test_num_states)
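
To make the class-count argument concrete: one classifier for the whole board needs a class per board configuration, while a per-cell classifier only ever sees three classes. The following sketch uses plain Python; the helper name `cell_labels` is illustrative and not part of the original code.

```python
from itertools import product

# Every cell is Empty (0), X (1) or O (2), so a board has 3^9 configurations.
all_boards = list(product(range(3), repeat=9))
print(len(all_boards))  # 19683


def cell_labels(states, cell_index):
    """Labels for one per-cell classifier: entry `cell_index` of every state."""
    return [state[cell_index] for state in states]


states = [
    [1, 2, 1, 2, 1, 2, 1, 0, 0],
    [0, 0, 2, 1, 1, 0, 2, 2, 1],
]
print(cell_labels(states, 0))  # [1, 0]
print(cell_labels(states, 8))  # [0, 1]
```

With NumPy arrays the same selection is the column slice `train_num_states[:, cell_index]`.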

Classifying one cell

The following CNN shall only be trained on the three labels “Empty”, “X” or “O”.

import tensorflow as tf

# tf.logging.set_verbosity(tf.logging.INFO)

def cnn_model_fn(features, labels, mode):
    # Input Layer
    # Reshape X to 4-D tensor: [batch_size, height, width, channels]
    # The images are 26x26 pixels, and have one color channel.
    # The generated images are uint8, but conv2d needs floats, hence the cast.
    input_layer = tf.reshape(tf.cast(features["x"], tf.float32), [-1, 26, 26, 1])

    # Convolutional Layer #1
    # Computes 32 features using a 5x5 filter with ReLU activation.
    # padding="same" keeps the spatial size, as the shapes below require.
    # Input Tensor Shape: [batch_size, 26, 26, 1]
    # Output Tensor Shape: [batch_size, 26, 26, 32]
    conv1 = tf.layers.conv2d(
        inputs=input_layer,
        filters=32,
        kernel_size=[5, 5],
        padding="same",
        activation=tf.nn.relu)

    # Pooling Layer #1
    # First max pooling layer with a 2x2 filter and stride of 2
    # Input Tensor Shape: [batch_size, 26, 26, 32]
    # Output Tensor Shape: [batch_size, 13, 13, 32]
    pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

    # Convolutional Layer #2
    # Computes 64 features using a 5x5 filter.
    # Input Tensor Shape: [batch_size, 13, 13, 32]
    # Output Tensor Shape: [batch_size, 13, 13, 64]
    conv2 = tf.layers.conv2d(
        inputs=pool1,
        filters=64,
        kernel_size=[5, 5],
        padding="same",
        activation=tf.nn.relu)

    # Pooling Layer #2
    # Second max pooling layer with a 2x2 filter and stride of 2
    # Input Tensor Shape: [batch_size, 13, 13, 64]
    # padding 'SAME' pads the odd 13x13 input, so the output is ceil(13 / 2) = 7
    # Output Tensor Shape: [batch_size, 7, 7, 64]
    pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2, padding='SAME')

    # Flatten tensor into a batch of vectors
    # Input Tensor Shape: [batch_size, 7, 7, 64]
    # Output Tensor Shape: [batch_size, 7 * 7 * 64]
    pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])

    # Dense Layer
    # Densely connected layer with 1024 neurons
    # Input Tensor Shape: [batch_size, 7 * 7 * 64]
    # Output Tensor Shape: [batch_size, 1024]
    dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)

    # Add dropout operation; 0.6 probability that element will be kept
    dropout = tf.layers.dropout(
        inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN)

    # Logits layer
    # Input Tensor Shape: [batch_size, 1024]
    # Output Tensor Shape: [batch_size, 3]
    logits = tf.layers.dense(inputs=dropout, units=3)

    predictions = {
        # Generate predictions (for PREDICT and EVAL mode)
        "classes": tf.argmax(input=logits, axis=1),
        # Add `softmax_tensor` to the graph. It is used for PREDICT and by the
        # `logging_hook`.
        "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
    }
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

    # Calculate Loss (for both TRAIN and EVAL modes)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

    # Configure the Training Op (for TRAIN mode)
    if mode == tf.estimator.ModeKeys.TRAIN:
        # Assumption: plain gradient descent with a learning rate of 0.001;
        # the optimizer choice is not specified in this post.
        optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
        train_op = optimizer.minimize(
            loss=loss,
            global_step=tf.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

    # Add evaluation metrics (for EVAL mode)
    eval_metric_ops = {
        "accuracy": tf.metrics.accuracy(
            labels=labels, predictions=predictions["classes"])}
    return tf.estimator.EstimatorSpec(
        mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)

def train(train_data, train_labels, test_data, test_labels, cell_index=0, training_steps=1000):
    # Create the Estimator; every cell gets its own model directory
    classifier = tf.estimator.Estimator(
        model_fn=cnn_model_fn, model_dir="./ttt_convnet_model_{}".format(cell_index))

    # Set up logging for predictions
    # Log the values in the "Softmax" tensor with label "probabilities"
    tensors_to_log = {"probabilities": "softmax_tensor"}
    logging_hook = tf.train.LoggingTensorHook(
        tensors=tensors_to_log, every_n_iter=500)

    # Train the model
    train_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"x": train_data},
        y=train_labels,
        batch_size=100,
        num_epochs=None,
        shuffle=True)

    classifier.train(
        input_fn=train_input_fn,
        steps=training_steps, #20000
        hooks=[logging_hook])

    # Evaluate the model and print results
    eval_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"x": test_data},
        y=test_labels,
        num_epochs=1,
        shuffle=False)

    eval_results = classifier.evaluate(input_fn=eval_input_fn)
    print(eval_results)
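
The tensor shapes stated in the comments of `cnn_model_fn` can be checked by hand. The sketch below is plain Python and does not touch TensorFlow; the helper names `out_size` and `trace_shapes` are made up for this illustration. It assumes the usual TensorFlow conventions for output sizes ('same': ceil(size / stride), 'valid': ceil((size - kernel + 1) / stride)) and that the first pooling layer uses the default 'valid' padding.

```python
import math


def out_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    if padding == "same":
        return math.ceil(size / stride)
    if padding == "valid":
        return math.ceil((size - kernel + 1) / stride)
    raise ValueError("unknown padding: {}".format(padding))


def trace_shapes(conv_padding):
    """Spatial sizes after input, conv1, pool1, conv2 and pool2."""
    sizes = [26]
    sizes.append(out_size(sizes[-1], 5, 1, conv_padding))  # conv1, 5x5 kernel
    sizes.append(out_size(sizes[-1], 2, 2, "valid"))       # pool1, 2x2, stride 2
    sizes.append(out_size(sizes[-1], 5, 1, conv_padding))  # conv2, 5x5 kernel
    sizes.append(out_size(sizes[-1], 2, 2, "same"))        # pool2, 2x2, stride 2
    return sizes


print(trace_shapes("same"))   # [26, 26, 13, 13, 7]
print(trace_shapes("valid"))  # [26, 22, 11, 7, 4]
print(7 * 7 * 64)             # 3136 inputs to the dense layer
```

Only 'same' padding in the convolutional layers yields the 7x7 feature maps that the flatten step expects. With 'valid' padding the last feature map would be 4x4.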

This CNN is overkill for the static data set we have generated. But maybe in the future I am going to add some noise to the generator…

Run: train(train_images, train_num_states[:, 0], test_images, test_num_states[:, 0], cell_index=0)

Training ran for 15 minutes. In the end, the evaluation on the test data returned an accuracy of 1.0.

INFO:tensorflow:Saving dict for global step 1202: accuracy = 1.0, global_step = 1202, loss = 0.0017934386
{'loss': 0.0017934386, 'global_step': 1202, 'accuracy': 1.0}

Classifying all cells

Now I can run this for every cell like this:

%%time
for i in range(9):
    train(train_images, train_num_states[:, i], test_images, test_num_states[:, i], cell_index=i)

“CPU times: user 1h 41min 26s, sys: 9min 48s, total: 1h 51min 15s Wall time: 36min 40s”

And to make a full prediction I will have to load all models into different graphs and collect their predictions:

tf.logging.set_verbosity(tf.logging.ERROR)

def predict(images):
    # Returns the predicted class of each of the nine cells for the first image
    prediction = []
    for i in range(9):
        with tf.Session() as sess:
            graph_path = './ttt_convnet_model_{}/model.ckpt-1000.meta'.format(i)
            new_saver = tf.train.import_meta_graph(graph_path)
            new_saver.restore(sess, tf.train.latest_checkpoint('./ttt_convnet_model_{}'.format(i)))
            classifier = tf.estimator.Estimator(model_fn=cnn_model_fn, model_dir="./ttt_convnet_model_{}".format(i))
            predict_input_fn = tf.estimator.inputs.numpy_input_fn(
                x={"x": np.array(images)},
                num_epochs=1,
                shuffle=False
            )
            classification = next(classifier.predict(predict_input_fn))
            prediction.append(classification['classes'])
    return prediction

img = get_grid()
prediction = predict([img])
prediction

Argh! This takes 11.2 seconds. I guess it’s the worst way to do it but works for now…
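
The nine collected classes are easier to read when they are laid out as a board again. A small sketch in plain Python, assuming the same encoding as in `draw_num_state` (0 = Empty, 1 = X, 2 = O, row-major order); `SYMBOLS` and `format_board` are hypothetical helpers, not part of the original notebook.

```python
SYMBOLS = {0: " ", 1: "X", 2: "O"}  # same encoding as draw_num_state


def format_board(prediction):
    """Render nine per-cell classes (row-major order) as a 3x3 text board."""
    rows = []
    for y in range(3):
        cells = [SYMBOLS[int(prediction[y * 3 + x])] for x in range(3)]
        rows.append("|".join(cells))
    return "\n-+-+-\n".join(rows)


print(format_board([1, 2, 1, 2, 1, 2, 1, 0, 0]))
```

The `int(...)` conversion is there so that NumPy integers, as returned by the estimators, work as dictionary keys as well.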

Okay now we make a test run:

def test_predict():
    # Generate one random state, show it and print the predicted cells
    train_images, train_num_states = generate_game_states(n=1)
    train_images = np.array(train_images)
    train_num_states = np.array(train_num_states)

    draw_game_states(train_images, train_num_states)

    print("predict...")
    print(predict(train_images))

And voilà - it works! :)

How to continue

The following ideas would be fun starting points for new experiments:

• Create more realistic pixel maps with variation (or even a data set)
• Detect whether a state is legal or not
• Check if there is already a winner
• Build an AI for tic tac toe without telling it the rules
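
As a possible starting point for the legality and winner ideas, here is a rule-based sketch that works directly on the numeric states (0 = Empty, 1 = X, 2 = O). It is not a learned model. It assumes that X always moves first, and it only checks necessary conditions (piece counts and winners), so treat it as a rough filter. The names `WIN_LINES`, `winners` and `is_legal` are hypothetical.

```python
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winners(num_state):
    """Set of players (1 = X, 2 = O) that own a complete line."""
    found = set()
    for a, b, c in WIN_LINES:
        if num_state[a] != 0 and num_state[a] == num_state[b] == num_state[c]:
            found.add(int(num_state[a]))
    return found


def is_legal(num_state):
    """Rough legality check, assuming X always moves first."""
    x_count = sum(1 for cell in num_state if cell == 1)
    o_count = sum(1 for cell in num_state if cell == 2)
    # X has either as many pieces as O or exactly one more
    if x_count - o_count not in (0, 1):
        return False
    won = winners(num_state)
    # Both players cannot have a complete line at the same time
    if len(won) > 1:
        return False
    # The game stops with the winning move, so the winner moved last
    if won == {1} and x_count != o_count + 1:
        return False
    if won == {2} and x_count != o_count:
        return False
    return True


state = [1, 2, 1, 2, 1, 2, 1, 0, 0]
print(winners(state), is_legal(state))
```

For the example state, X owns the diagonal through cells 2, 4 and 6 and has one piece more than O, so the state passes the check.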