2017年10月26日 星期四


MNIST for ML Beginners
Digital number recognition
井民全, Jing, mqjing@gmail.com
ETA: 30 mins




Quick Start

Step 1: Install Environment
[install, Anaconda] How to install TensorFlow using Anaconda (view)

Step 2: Download the training data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Step 3: Source

Verification
source activate tensorflow
python mnist_softmax.py

Purpose

With this document you can write your first TensorFlow program and build some confidence. Everyone learning TensorFlow should work through the official MNIST digit recognition tutorial [1], so I have turned it into step-by-step instructions. Following these steps, you can accomplish the goals below within 30 minutes:
  1. Set up a TensorFlow development environment.
  2. Complete the TensorFlow code from the official tutorial and start playing with number recognition.

Abstract

Given an image, multiply it by a different template for each class and take the class with the highest score. The goal of training is to produce the templates that discriminate best.
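The idea above can be sketched in a few lines of NumPy (a toy illustration with made-up random templates, not the actual trained weights):

```python
import numpy as np

# Hypothetical toy example: score an image against one template per class.
# All shapes and values here are made up for illustration.
rng = np.random.default_rng(0)
image = rng.random(784)             # a flattened 28 x 28 image
templates = rng.random((10, 784))   # one template per digit 0-9

scores = templates @ image          # one score per class
predicted_digit = int(np.argmax(scores))
print(predicted_digit)              # the class whose template scores highest
```

Training, covered below, is what turns these random templates into useful weights.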


Point

The part that is easy to get confused about is how TensorFlow describes arrays (tensor shapes).
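To keep the shapes straight, here is a plain-NumPy sketch of the arrays this tutorial works with (all zeros, just to show the shapes):

```python
import numpy as np

# The shapes used throughout this tutorial:
#   x: [None, 784] - a batch of flattened 28 x 28 images
#   W: [784, 10]   - one weight column per digit class
#   b: [10]        - one bias per class
x = np.zeros((100, 784))   # a batch of 100 images
W = np.zeros((784, 10))
b = np.zeros(10)

y = x @ W + b              # broadcasting adds b to every row
print(y.shape)             # (100, 10): one score per class, per image
```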


Setup Environment

  1. [install, Anaconda] How to install TensorFlow using Anaconda (view)

Download the MNIST Data

Code

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)


Fig. Python code for downloading the database.
Fig. The downloaded MNIST database files.

MNIST Database Info

Training Set & Testing Set & Validation Set

Items              Number
Training data      55,000 data points
Testing data       10,000 data points
Validation data     5,000 data points


MNIST Data Point Info

Handwritten Digit Image, x          Label, y
mnist.train.images                  mnist.train.labels
28 x 28 = 784 pixels                the digit shown in the image (e.g. 1)
(a 784-dimensional vector space)

Data Structures for Representing the Data

Item: Training data, image part (mnist.train.images)
Number: 55,000
Shape: [55000, 784]
  1. the first index runs over the list of images
  2. the second index runs over the pixels in each image
  3. each pixel value is between 0 and 1

Item: Training data, label part (mnist.train.labels)
Number: 55,000
Shape: [55000, 10]
Format: one-hot vectors
3 => [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]

Testing data: 10,000 data points
Validation data: 5,000 data points
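The one-hot label format above can be reproduced with a tiny helper (an illustrative sketch, not part of the tutorial's code):

```python
import numpy as np

def one_hot(digit, num_classes=10):
    """Convert a digit label into a one-hot vector, as in mnist.train.labels."""
    v = np.zeros(num_classes)
    v[digit] = 1.0
    return v

print(one_hot(3))   # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```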


Building the Model

Softmax Regressions

Purpose: We want to be able to look at an image and give the probabilities of it being each digit. (For example, given a picture of a nine, the model might be 80% sure it's a nine, but give a 5% chance to its being an eight, ...)
Model: Assign probabilities to an object being one of several different things
When to use Softmax: Gives us a list of values between 0 and 1 that add up to 1. Even later on, when we train more sophisticated models, the final step will be a layer of softmax.


The Process of Softmax

  1. add up the evidence of our input being in certain classes
  2. convert that evidence into probabilities
The weights learned for each of these classes (red: negative weight, blue: positive weight).
Bias: represents things that are independent of the input.


Calculate the evidence for a class i

evidence_i = Σ_j W_{i,j} · x_j + b_i

  • W_{i,j} is the weight for class i (and pixel j)
  • b_i is the bias for class i
  • j is the index for summing over the pixels in our input image
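A hypothetical numeric check of this formula, on a made-up 3-pixel "image" (real MNIST images have 784 pixels; all values here are invented):

```python
import numpy as np

# Hypothetical tiny example: the evidence for one class i.
x = np.array([0.0, 0.5, 1.0])       # pixel values
W_i = np.array([0.2, -0.4, 0.6])    # the row of weights for class i
b_i = 0.1                           # the bias for class i

evidence_i = np.sum(W_i * x) + b_i  # sum_j W[i, j] * x[j] + b[i]
print(evidence_i)                   # ≈ 0.5
```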


Convert the evidence to probabilities y

Shaping the output of our linear function into the probability distribution over 10 cases.


Define the softmax

y = softmax(evidence)

softmax(x) = normalize(exp(x))

softmax(x)_i = exp(x_i) / Σ_j exp(x_j)

Exponentiation means
  • one more unit of evidence increases the weight given to a hypothesis multiplicatively
  • one less unit of evidence means a hypothesis gets a fraction of its earlier weight
Normalize
  • so that the weights add up to one, forming a valid probability distribution
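These two steps can be sketched as a plain-NumPy softmax (an illustration; in the tutorial's code, tf.nn.softmax does this job):

```python
import numpy as np

def softmax(x):
    # Subtracting the max is a standard numerical-stability trick;
    # it does not change the result.
    e = np.exp(x - np.max(x))
    return e / e.sum()       # normalize so the outputs sum to 1

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs)          # the largest evidence gets the largest probability
print(probs.sum())    # 1.0 - a valid probability distribution
```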


Calculating the evidence for all classes and converting it

We (1) compute a weighted sum of the x's, (2) add a bias, and (3) then apply softmax.


Data Flow form


  • j is the index for summing over the pixels in our input image
  • W_{i,j} is the weight for class i
  • b_i is the bias for class i


Equation form



Matrix form
Explanation:
Y: the probabilities of the classes for a given image X
X: the pixels of the image
B: the bias
W: the trained weights for each class i


Compact form

y = softmax(Wx + b)

Implement the Regression

Define the Model



Model
TensorFlow Python Code
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

y = tf.nn.softmax(tf.matmul(x, W) + b)
Detail

28 x 28 =  784

# X1 =  [x1, x2, x3, ..., x784],
# X2 = [x1, x2, x3, ..., x784], ...,
# X55000 = [x1, x2, x3, ..., x784]


x = tf.placeholder(tf.float32, [None, 784])

# None => a dimension of any length

W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

y = tf.nn.softmax(tf.matmul(x, W) + b)




Training

Define the Loss Function

Loss Function
TensorFlow Python Code

Where
  • y is our predicted probability distribution
  • y′ is the true distribution (the one-hot vector with the digit labels)
y_ = tf.placeholder(tf.float32, [None, 10])

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))


Note:
reduction_indices=[1]: tells tf.reduce_sum to sum over the second dimension of y
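The same cross-entropy, computed by hand for one made-up example (the probabilities below are invented for illustration):

```python
import numpy as np

# Cross-entropy for a single example: H(y', y) = -sum_i y'_i * log(y_i).
y_true = np.zeros(10)
y_true[3] = 1.0                 # one-hot label for the digit 3
y_pred = np.full(10, 0.02)
y_pred[3] = 0.82                # a confident, correct prediction

loss = -np.sum(y_true * np.log(y_pred))
print(loss)                     # small loss: only the true class's log-prob counts
```

A confident wrong prediction would make y_pred[3] small, and the loss correspondingly large.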


Minimize the Loss

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)


Launch the Model

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()   # initialize all variables


Run the training step 1000 times

for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)  # 100 random samples from the training set
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})


Evaluating Our Model

Logic: [True, False, True, True] => [1, 0, 1, 1] => 0.75

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

Expected accuracy: about 92%

print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
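The logic column above, replayed in plain NumPy (a toy illustration):

```python
import numpy as np

# The accuracy computation from the table, in plain NumPy:
predictions = np.array([True, False, True, True])   # correct_prediction
accuracy = predictions.astype(np.float32).mean()    # cast to floats, then average
print(accuracy)                                     # 0.75
```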


Source

mnist_softmax.py
# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

"""A very simple MNIST classifier.

See extensive documentation at
https://www.tensorflow.org/get_started/mnist/beginners
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import sys

from tensorflow.examples.tutorials.mnist import input_data

import tensorflow as tf

FLAGS = None


def main(_):
 # Import data
 mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)

 # Create the model
 x = tf.placeholder(tf.float32, [None, 784])
 W = tf.Variable(tf.zeros([784, 10]))
 b = tf.Variable(tf.zeros([10]))
 y = tf.matmul(x, W) + b

 # Define loss and optimizer
 y_ = tf.placeholder(tf.float32, [None, 10])

 # The raw formulation of cross-entropy,
 #
 #   tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.nn.softmax(y)),
 #                                 reduction_indices=[1]))
 #
 # can be numerically unstable.
 #
 # So here we use tf.nn.softmax_cross_entropy_with_logits on the raw
 # outputs of 'y', and then average across the batch.
 cross_entropy = tf.reduce_mean(
     tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
 train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

 sess = tf.InteractiveSession()
 tf.global_variables_initializer().run()
 # Train
 for _ in range(1000):
   batch_xs, batch_ys = mnist.train.next_batch(100)
   sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

 # Test trained model
 correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
 accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
 print('===================')
 print(sess.run(accuracy, feed_dict={x: mnist.test.images,
                                     y_: mnist.test.labels}))

if __name__ == '__main__':
 parser = argparse.ArgumentParser()
 # Default from the official tutorial:
 # parser.add_argument('--data_dir', type=str, default='/tmp/tensorflow/mnist/input_data',
 #                     help='Directory for storing input data')
 parser.add_argument('--data_dir', type=str, default='/home/jing/MNIST_data',
                     help='Directory for storing input data')
 FLAGS, unparsed = parser.parse_known_args()
 tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)


Verification

source activate tensorflow
python mnist_softmax.py




More Tests

Varying the batch size x in:

batch_xs, batch_ys = mnist.train.next_batch(x)

x       Accuracy
3000    92.13%
1000    92.14%
500     92.14%
100     91.69%
50      91.13%
10      85.9%

References

  1. MNIST For ML Beginners, official TensorFlow tutorial, https://www.tensorflow.org/get_started/mnist/beginners

Further Reading


  1. This is not just another tutorial on MNIST digit image recognition with TensorFlow (ref)