What do “compile”, “fit” and “predict” do in Keras sequential models?

$begingroup$


I am a little confused about these parts of the Keras Sequential model API. Could someone explain exactly what each one does? Does compile perform the forward pass and calculate the cost function, and then hand it to fit to perform the backward pass, calculate the derivatives, and update the weights? Or is it something else?

I have seen code that only uses compile for some LSTMs and fit for others, so I need to know which part of the work of training a neural network each of these functions does.

I would also like to know exactly what the predict function does.

Thank you very much in advance!










$endgroup$







  • Why don't you consider reading the docs? – Aditya, Feb 24 at 8:27















keras prediction backpropagation cost-function methods






asked Feb 24 at 7:27









user145959

1 Answer
$begingroup$

Let's first see what we need to do when we want to train a model.



  1. First, we decide on a model architecture (the number of hidden layers, the activation functions, etc.) and configure how it will be trained. (compile)

  2. Second, we train the model so that its parameters take values that map our inputs to our outputs. (fit)

  3. Lastly, we use the trained model to run feed-forward passes and predict outputs for novel inputs. (predict)
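
To make the division of labour between the three steps concrete, here is a framework-free sketch. It is not Keras itself: the `TinyModel` class, its method bodies and the toy data are all assumptions made purely for illustration. In this sketch `compile` only records how training should be done, `fit` runs the forward pass, computes the gradient and updates the weight, and `predict` runs the forward pass only.

```python
class TinyModel:
    """Toy one-weight model y = w * x (illustrative, not the Keras API)."""

    def __init__(self):
        self.w = 0.0      # parameter with an initial value
        self.lr = None    # set by compile()

    def compile(self, learning_rate):
        # No computation on data happens here: we only record
        # how training should be done.
        self.lr = learning_rate

    def predict(self, xs):
        # Forward pass only; the weight is not changed.
        return [self.w * x for x in xs]

    def fit(self, xs, ys, epochs):
        # Each epoch: forward pass, loss gradient, weight update.
        for _ in range(epochs):
            preds = self.predict(xs)
            # Gradient of the mean squared error with respect to w.
            grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
            self.w -= self.lr * grad


model = TinyModel()
model.compile(learning_rate=0.05)
before = model.predict([3.0])   # untrained: w is still 0.0
model.fit([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], epochs=200)
after = model.predict([3.0])    # close to 6.0 once w is near 2.0
print(before, after)
```

The point of the sketch is only the separation of responsibilities; a real Keras model has many weights and uses the optimizer chosen at compile time instead of this hand-written update.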


Let's go through an example using the MNIST database.



from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from keras import backend as K


Let's load our data. Then I normalize the values of the pixels to be between 0 and 1.



(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.





Now we need to reshape our data to be compatible with Keras. We add an extra dimension that acts as the channel axis when the data passes through the model. I then one-hot encode the output classes.



# The known number of output classes.
num_classes = 10

# Input image dimensions
img_rows, img_cols = 28, 28

# Channels go last for TensorFlow backend
x_train_reshaped = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test_reshaped = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# Convert class vectors to binary class matrices. This uses 1 hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)
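
As an illustration of what the one-hot encoding step produces, here is a plain-Python sketch. The helper `one_hot` and the three example labels are hypothetical and only mimic the idea; they are not the Keras implementation.

```python
def one_hot(labels, num_classes):
    """Return one row per label with a 1.0 at the label's index."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


# Three example digit labels encoded over 10 classes.
encoded = one_hot([5, 0, 4], num_classes=10)
for row in encoded:
    print(row)
```

Each row has exactly one non-zero entry, which is what lets the softmax output of the network be compared to the label position by position.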


Now let's define our model. We will use a vanilla CNN for this example.



model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
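
As a rough sanity check on this architecture, the layer output sizes and the parameter counts can be worked out by hand. The sketch below assumes the layer defaults (no padding, stride 1, a bias term per filter or unit); the helper `conv_out` is a hypothetical name introduced for this illustration.

```python
def conv_out(size, kernel):
    # A convolution without padding and with stride 1
    # shrinks each side by kernel - 1.
    return size - kernel + 1


side = 28
side = conv_out(side, 3)   # after Conv2D(32, 3x3)
side = conv_out(side, 3)   # after Conv2D(64, 3x3)
side = side // 2           # after MaxPooling2D(2x2)
flat = side * side * 64    # after Flatten()

# weights + biases per layer
params = {
    "conv1": 3 * 3 * 1 * 32 + 32,
    "conv2": 3 * 3 * 32 * 64 + 64,
    "dense1": flat * 128 + 128,
    "dense2": 128 * 10 + 10,
}
total = sum(params.values())
print(side, flat, total)
```

Under these assumptions almost all parameters sit in the first dense layer, which is why the flattened feature map size matters so much for model size.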


Now we are ready to compile our model. This configures the model for training: Keras builds the computation graph in the format required by the backend you are using (I usually use TensorFlow over Theano). The compilation step also asks you to define the loss function and the kind of optimizer you want to use. These options depend on the problem you are trying to solve; you can usually find the best techniques by reading the literature in the field. For a classification task, categorical cross-entropy works very well.



model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
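
To give a feel for the loss chosen here, the following sketch computes categorical cross-entropy for a single sample by hand. It is a simplified illustration, not the Keras implementation (which, among other things, guards against taking the log of zero); the function name and the example vectors are assumptions.

```python
import math


def categorical_crossentropy(y_true, y_pred):
    """Loss for one sample: -sum(true_k * log(pred_k))."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)


target = [0.0, 0.0, 1.0]          # one-hot label for class 2
confident = [0.05, 0.05, 0.90]    # prediction that favours the right class
uniform = [1 / 3, 1 / 3, 1 / 3]   # prediction that knows nothing

loss_confident = categorical_crossentropy(target, confident)
loss_uniform = categorical_crossentropy(target, uniform)
print(loss_confident, loss_uniform)
```

The confident, correct prediction gets a much smaller loss than the uniform one, and that difference is the signal the optimizer uses during training.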


Now we have a Python object that holds the model and all its parameters at their initial values. If you call predict with this model now, your accuracy will be about 10%, which is no better than random guessing among the 10 classes.



You can save the model architecture to disk to use later.



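import os

# Save the model architecture as JSON (the weights are checkpointed during training).
os.makedirs("weights", exist_ok=True)  # make sure the output directory exists
model_json = model.to_json()
with open("weights/model.json", "w") as json_file:
    json_file.write(model_json)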


So, now we need to train our model so that the parameters get tuned to provide the correct outputs for a given input. We do this by feeding inputs to the input layer and computing an output. We then calculate the loss from that output and use backpropagation to tune the model parameters. This fits the model parameters to the data.



First, let's define a callback so that we can checkpoint our model and save its parameters to a file each time we get better results.



# Save the weights using a checkpoint.
# Newer Keras versions report this metric as 'val_accuracy' instead of 'val_acc'.
filepath = "weights/weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]

epochs = 4
batch_size = 128
# Fit the model weights.
model.fit(x_train_reshaped, y_train_binary,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          callbacks=callbacks_list,
          validation_data=(x_test_reshaped, y_test_binary))
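
The checkpoint callback fills a filename template with Python format fields such as `{epoch:02d}` and, with save_best_only and mode='max', writes a file only when the monitored value improves. The sketch below imitates that behaviour in plain Python; the validation accuracies are made-up numbers and nothing is written to disk.

```python
template = "weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5"

# Hypothetical validation accuracies for four epochs.
history = [0.91, 0.95, 0.94, 0.97]

best = float("-inf")
saved = []
for epoch, val_acc in enumerate(history, start=1):
    if val_acc > best:   # mode='max': a higher value is better
        best = val_acc
        saved.append(template.format(epoch=epoch, val_acc=val_acc))

print(saved)
```

In this made-up run the third epoch is skipped because its accuracy did not beat the best value seen so far.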


Now we have a model architecture and a file containing the best model parameters found for mapping the inputs to an output. We are done with the computationally expensive part of deep learning. We can now take our model and use feed-forward passes to predict outputs for new inputs. I prefer predict_classes to predict because it immediately gives me the class rather than the output vector. (In newer Keras versions, where predict_classes has been removed, taking the argmax of the predict output gives the same result.)



print('Predict the classes: ')
prediction = model.predict_classes(x_test_reshaped[10:20])
show_imgs(x_test[10:20])
print('Predicted classes: ', prediction)






Predicted classes: [0 6 9 0 1 5 9 7 3 4]
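
The relationship between the output vector and the predicted class can be sketched without Keras: the class is simply the index of the largest probability in each row. The probability rows below are invented for illustration, and `argmax` is a small hypothetical helper.

```python
def argmax(values):
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical softmax outputs for two images over 10 classes.
probabilities = [
    [0.01, 0.02, 0.01, 0.01, 0.01, 0.01, 0.90, 0.01, 0.01, 0.01],
    [0.80, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.04, 0.02, 0.02],
]
classes = [argmax(row) for row in probabilities]
print(classes)
```

Keeping the full probability vector is still useful when you want to know how confident the model is, not just which class it picked.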





The code to display the MNIST images nicely:



import matplotlib.pyplot as plt
# Jupyter notebook magic; remove this line when running as a plain script.
%matplotlib inline

# Utility function for showing images: originals on the top row and,
# if given, decoded images on the bottom row.
def show_imgs(x_test, decoded_imgs=None, n=10):
    plt.figure(figsize=(20, 4))
    for i in range(n):
        ax = plt.subplot(2, n, i + 1)
        plt.imshow(x_test[i].reshape(28, 28))
        plt.gray()
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)

        if decoded_imgs is not None:
            ax = plt.subplot(2, n, i + 1 + n)
            plt.imshow(decoded_imgs[i].reshape(28, 28))
            plt.gray()
            ax.get_xaxis().set_visible(False)
            ax.get_yaxis().set_visible(False)
    plt.show()





$endgroup$








  • Nice answer +1! – Aditya, Feb 24 at 8:28










Your Answer





StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
);
);
, "mathjax-editing");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "557"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);













draft saved

draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f46124%2fwhat-do-compile-fit-and-predict-do-in-keras-sequential-models%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









5












$begingroup$

Let's first see what we need to do when we want to train a model.



  1. First, we want to decide a model architecture, this is the number of hidden layers and activation functions, etc. (compile)

  2. Secondly, we will want to train our model to get all the paramters to the correct value to map our inputs to our outputs. (fit)

  3. Lastly, we will want to use this model to do some feed-forward passes to predict novel inputs. (predict)


Let's go through an example using the mnist database.



from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from keras import backend as K


Let's load our data. Then I normalize the values of the pixels to be between 0 and 1.



(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.


enter image description here



Now we need to reshape our data to compatible with Keras. We need to add an additional dimension to our data which will act as our channel when passing the data through the deep learning model. I then vectorize the output classes.



# The known number of output classes.
num_classes = 10

# Input image dimensions
img_rows, img_cols = 28, 28

# Channels go last for TensorFlow backend
x_train_reshaped = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test_reshaped = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# Convert class vectors to binary class matrices. This uses 1 hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)


Now let's define our model. We will use a vanilla CNN for this example.



model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))


Now we are ready to compile our model. This will create a Python object which will build the CNN. This is done by building the computation graph in the correct format based on the Keras backend you are using. I usually use tensorflow over theano. The compilation steps also asks you to define the loss function and kind of optimizer you want to use. These options depend on the problem you are trying to solve, you can find the best techniques usually reading the literature in the field. For a classification task categorical cross-entropy works very well.



model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])


Now we have a Python object that has a model and all its parameters with its initial values. If you try to use predict now with this model your accuracy will be 10%, pure random output.



You can save this model to disk to use later.



# Save the model
model_json = model.to_json()
with open("weights/model.json", "w") as json_file:
json_file.write(model_json)


So, now we need to train our model so that the parameters get tuned to provide the correct outputs for a given input. We do this by feeding inputs at the input layer and then getting an output, we then calculate the loss function using the output and use backpropagation to tune the model parameters. This will fit the model parameters to the data.



First let's define some callback functions so that we can checkpoint our model and save it model parameters to file each time we get better results.



# Save the weights using a checkpoint.
filepath="weights/weights-improvement-epoch:02d-val_acc:.2f.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]

epochs = 4
batch_size = 128
# Fit the model weights.
model.fit(x_train_reshaped, y_train_binary,
batch_size=batch_size,
epochs=epochs,
verbose=1,
callbacks=callbacks_list,
validation_data=(x_test_reshaped, y_test_binary))


Now we have a model architecture and we have a file containing all the model parameters with the best values found to map the inputs to an output. We are now done with the computationally expensive part of deep learning. We can now take our model and use feed-forward passes and predict inputs. I prefer to use predict_class, rather than predict because it immediately gives me the class, rather than the output vector.



print('Predict the classes: ')
prediction = model.predict_classes(x_test_reshaped[10:20])
show_imgs(x_test[10:20])
print('Predicted classes: ', prediction)


enter image description here




Predicted classes: [0 6 9 0 1 5 9 7 3 4]





The code to print the MNIST database nicely



import matplotlib.pyplot as plt
%matplotlib inline

# utility function for showing images
def show_imgs(x_test, decoded_imgs=None, n=10):
plt.figure(figsize=(20, 4))
for i in range(n):
ax = plt.subplot(2, n, i+1)
plt.imshow(x_test[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

if decoded_imgs is not None:
ax = plt.subplot(2, n, i+ 1 +n)
plt.imshow(decoded_imgs[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.show()





share|improve this answer











$endgroup$








  • 1




    $begingroup$
    Nice answer +1!
    $endgroup$
    – Aditya
    Feb 24 at 8:28















5












$begingroup$

Let's first see what we need to do when we want to train a model.



  1. First, we want to decide a model architecture, this is the number of hidden layers and activation functions, etc. (compile)

  2. Secondly, we will want to train our model to get all the paramters to the correct value to map our inputs to our outputs. (fit)

  3. Lastly, we will want to use this model to do some feed-forward passes to predict novel inputs. (predict)


Let's go through an example using the mnist database.



from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from keras import backend as K


Let's load our data. Then I normalize the values of the pixels to be between 0 and 1.



(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.


enter image description here



Now we need to reshape our data to compatible with Keras. We need to add an additional dimension to our data which will act as our channel when passing the data through the deep learning model. I then vectorize the output classes.



# The known number of output classes.
num_classes = 10

# Input image dimensions
img_rows, img_cols = 28, 28

# Channels go last for TensorFlow backend
x_train_reshaped = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test_reshaped = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# Convert class vectors to binary class matrices. This uses 1 hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)


Now let's define our model. We will use a vanilla CNN for this example.



model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))


Now we are ready to compile our model. This will create a Python object which will build the CNN. This is done by building the computation graph in the correct format based on the Keras backend you are using. I usually use tensorflow over theano. The compilation steps also asks you to define the loss function and kind of optimizer you want to use. These options depend on the problem you are trying to solve, you can find the best techniques usually reading the literature in the field. For a classification task categorical cross-entropy works very well.



model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])


Now we have a Python object that has a model and all its parameters with its initial values. If you try to use predict now with this model your accuracy will be 10%, pure random output.



You can save this model to disk to use later.



# Save the model
model_json = model.to_json()
with open("weights/model.json", "w") as json_file:
json_file.write(model_json)


So, now we need to train our model so that the parameters get tuned to provide the correct outputs for a given input. We do this by feeding inputs at the input layer and then getting an output, we then calculate the loss function using the output and use backpropagation to tune the model parameters. This will fit the model parameters to the data.



First let's define some callback functions so that we can checkpoint our model and save it model parameters to file each time we get better results.



# Save the weights using a checkpoint.
filepath="weights/weights-improvement-epoch:02d-val_acc:.2f.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]

epochs = 4
batch_size = 128
# Fit the model weights.
model.fit(x_train_reshaped, y_train_binary,
batch_size=batch_size,
epochs=epochs,
verbose=1,
callbacks=callbacks_list,
validation_data=(x_test_reshaped, y_test_binary))


Now we have a model architecture and we have a file containing all the model parameters with the best values found to map the inputs to an output. We are now done with the computationally expensive part of deep learning. We can now take our model and use feed-forward passes and predict inputs. I prefer to use predict_class, rather than predict because it immediately gives me the class, rather than the output vector.



print('Predict the classes: ')
prediction = model.predict_classes(x_test_reshaped[10:20])
show_imgs(x_test[10:20])
print('Predicted classes: ', prediction)


enter image description here




Predicted classes: [0 6 9 0 1 5 9 7 3 4]





The code to print the MNIST database nicely



import matplotlib.pyplot as plt
%matplotlib inline

# utility function for showing images
def show_imgs(x_test, decoded_imgs=None, n=10):
plt.figure(figsize=(20, 4))
for i in range(n):
ax = plt.subplot(2, n, i+1)
plt.imshow(x_test[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

if decoded_imgs is not None:
ax = plt.subplot(2, n, i+ 1 +n)
plt.imshow(decoded_imgs[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.show()





share|improve this answer











$endgroup$








  • 1




    $begingroup$
    Nice answer +1!
    $endgroup$
    – Aditya
    Feb 24 at 8:28













5












5








5





$begingroup$

Let's first see what we need to do when we want to train a model.



  1. First, we want to decide a model architecture, this is the number of hidden layers and activation functions, etc. (compile)

  2. Secondly, we will want to train our model to get all the paramters to the correct value to map our inputs to our outputs. (fit)

  3. Lastly, we will want to use this model to do some feed-forward passes to predict novel inputs. (predict)


Let's go through an example using the mnist database.



from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from keras import backend as K


Let's load our data. Then I normalize the values of the pixels to be between 0 and 1.



(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.


enter image description here



Now we need to reshape our data to compatible with Keras. We need to add an additional dimension to our data which will act as our channel when passing the data through the deep learning model. I then vectorize the output classes.



# The known number of output classes.
num_classes = 10

# Input image dimensions
img_rows, img_cols = 28, 28

# Channels go last for TensorFlow backend
x_train_reshaped = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test_reshaped = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# Convert class vectors to binary class matrices. This uses 1 hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)


Now let's define our model. We will use a vanilla CNN for this example.



model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))


Now we are ready to compile our model. This will create a Python object which will build the CNN. This is done by building the computation graph in the correct format based on the Keras backend you are using. I usually use tensorflow over theano. The compilation steps also asks you to define the loss function and kind of optimizer you want to use. These options depend on the problem you are trying to solve, you can find the best techniques usually reading the literature in the field. For a classification task categorical cross-entropy works very well.



model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])


Now we have a Python object that has a model and all its parameters with its initial values. If you try to use predict now with this model your accuracy will be 10%, pure random output.



You can save this model to disk to use later.



# Save the model
model_json = model.to_json()
with open("weights/model.json", "w") as json_file:
json_file.write(model_json)


So, now we need to train our model so that the parameters get tuned to provide the correct outputs for a given input. We do this by feeding inputs at the input layer and then getting an output, we then calculate the loss function using the output and use backpropagation to tune the model parameters. This will fit the model parameters to the data.



First let's define some callback functions so that we can checkpoint our model and save it model parameters to file each time we get better results.



# Save the weights using a checkpoint.
filepath="weights/weights-improvement-epoch:02d-val_acc:.2f.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]

epochs = 4
batch_size = 128
# Fit the model weights.
model.fit(x_train_reshaped, y_train_binary,
batch_size=batch_size,
epochs=epochs,
verbose=1,
callbacks=callbacks_list,
validation_data=(x_test_reshaped, y_test_binary))


Now we have a model architecture and we have a file containing all the model parameters with the best values found to map the inputs to an output. We are now done with the computationally expensive part of deep learning. We can now take our model and use feed-forward passes and predict inputs. I prefer to use predict_class, rather than predict because it immediately gives me the class, rather than the output vector.



print('Predict the classes: ')
prediction = model.predict_classes(x_test_reshaped[10:20])
show_imgs(x_test[10:20])
print('Predicted classes: ', prediction)


enter image description here




Predicted classes: [0 6 9 0 1 5 9 7 3 4]





The code to print the MNIST database nicely



import matplotlib.pyplot as plt
%matplotlib inline

# utility function for showing images
def show_imgs(x_test, decoded_imgs=None, n=10):
plt.figure(figsize=(20, 4))
for i in range(n):
ax = plt.subplot(2, n, i+1)
plt.imshow(x_test[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

if decoded_imgs is not None:
ax = plt.subplot(2, n, i+ 1 +n)
plt.imshow(decoded_imgs[i].reshape(28,28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.show()





share|improve this answer











$endgroup$



Let's first see what we need to do when we want to train a model.



  1. First, we want to decide a model architecture, this is the number of hidden layers and activation functions, etc. (compile)

  2. Secondly, we will want to train our model to get all the paramters to the correct value to map our inputs to our outputs. (fit)

  3. Lastly, we will want to use this model to do some feed-forward passes to predict novel inputs. (predict)


Let's go through an example using the mnist database.



from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from keras import backend as K


Let's load our data. Then I normalize the values of the pixels to be between 0 and 1.



(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.


enter image description here



Now we need to reshape our data to compatible with Keras. We need to add an additional dimension to our data which will act as our channel when passing the data through the deep learning model. I then vectorize the output classes.



# The known number of output classes.
num_classes = 10

# Input image dimensions
img_rows, img_cols = 28, 28

# Channels go last for TensorFlow backend
x_train_reshaped = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test_reshaped = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# Convert class vectors to binary class matrices. This uses 1 hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)


Now let's define our model. We will use a vanilla CNN for this example.



model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))


Now we are ready to compile our model. This will create a Python object which will build the CNN. This is done by building the computation graph in the correct format based on the Keras backend you are using. I usually use tensorflow over theano. The compilation steps also asks you to define the loss function and kind of optimizer you want to use. These options depend on the problem you are trying to solve, you can find the best techniques usually reading the literature in the field. For a classification task categorical cross-entropy works very well.



model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
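To give a feeling for what the loss chosen at compile time measures, here is a small plain-Python sketch of categorical cross-entropy for a single sample. It is an illustration under simplifying assumptions, not the Keras implementation: the function name, the example probabilities, and the `eps` guard against log(0) are all made up for this sketch.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and a predicted distribution."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


target = [0.0, 0.0, 1.0]            # the true class is index 2
confident = [0.05, 0.05, 0.90]      # most probability on the right class
uncertain = [1 / 3, 1 / 3, 1 / 3]   # roughly what an untrained model might output

loss_confident = categorical_crossentropy(target, confident)
loss_uncertain = categorical_crossentropy(target, uncertain)

# The confident, correct prediction gets the lower loss.
print(loss_confident, loss_uncertain)
```

Training then amounts to adjusting the weights so that this number, averaged over the training samples, goes down.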


Now we have a Python object that holds the model and all its parameters at their initial values. If you call predict on this model now, the accuracy will be about 10%, which is no better than random guessing among 10 classes.



You can save the model architecture to disk to use later. Note that to_json stores only the architecture, not the weights.



# Save the model architecture as JSON
model_json = model.to_json()
with open("weights/model.json", "w") as json_file:
    json_file.write(model_json)

So, now we need to train our model so that the parameters get tuned to produce the correct outputs for a given input. We do this by feeding inputs at the input layer and computing an output (the forward pass). We then calculate the loss from that output and use backpropagation to adjust the model parameters. This fits the model parameters to the data.
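The loop described above (forward pass, loss, gradients, parameter update) can be sketched in plain Python on a toy problem. This is only an illustration of the idea: it fits a two-parameter linear model with plain gradient descent and mean squared error, whereas the model in this answer has many layers, uses backpropagation to get the gradients, and is trained with Adadelta. The data, learning rate, and epoch count are arbitrary choices for the sketch.

```python
# Toy data generated from y = 2x + 1; the "model" has two parameters.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

w, b = 0.0, 0.0          # initial parameter values
learning_rate = 0.05
losses = []

for epoch in range(500):
    # Forward pass: compute predictions with the current parameters.
    preds = [w * x + b for x in xs]

    # Loss: mean squared error between predictions and targets.
    errors = [p - y for p, y in zip(preds, ys)]
    losses.append(sum(e * e for e in errors) / len(xs))

    # Backward pass: gradients of the loss with respect to w and b.
    grad_w = sum(2 * e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = sum(2 * e for e in errors) / len(xs)

    # Update: move each parameter a small step against its gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b)                    # ends up close to 2 and 1
print(losses[0], losses[-1])   # the loss shrinks as training proceeds
```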



First, let's define a callback so that we can checkpoint our model and save its parameters to a file each time we get better results.



# Save the weights using a checkpoint.
filepath = "weights/weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]

epochs = 4
batch_size = 128
# Fit the model weights.
model.fit(x_train_reshaped, y_train_binary,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          callbacks=callbacks_list,
          validation_data=(x_test_reshaped, y_test_binary))
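As a rough illustration of what the checkpoint callback does with save_best_only=True and mode='max', the sketch below mimics the decision in plain Python: a file name is generated only when the monitored value improves, and the placeholders in braces are filled in with the epoch number and the validation accuracy. The accuracy values are invented for the example, the 1-based epoch numbering is an assumption, and nothing is written to disk.

```python
template = "weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5"

# Hypothetical validation accuracies, one per epoch (made up for illustration).
val_acc_per_epoch = [0.91, 0.95, 0.94, 0.97]

best = float("-inf")
saved_files = []
for epoch, val_acc in enumerate(val_acc_per_epoch, start=1):
    if val_acc > best:  # mode='max': a higher value counts as an improvement
        best = val_acc
        saved_files.append(template.format(epoch=epoch, val_acc=val_acc))

# Epoch 3 is skipped because 0.94 did not beat the best value so far (0.95).
print(saved_files)
```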


Now we have a model architecture and a file containing the best parameter values found for mapping inputs to outputs. We are done with the computationally expensive part of deep learning. We can now take our model and run feed-forward passes to predict outputs for new inputs. I prefer predict_classes to predict because it immediately gives me the class rather than the output vector.



print('Predict the classes: ')
prediction = model.predict_classes(x_test_reshaped[10:20])
show_imgs(x_test[10:20])
print('Predicted classes: ', prediction)






Predicted classes: [0 6 9 0 1 5 9 7 3 4]
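The difference between the two calls is small: predict returns one probability per class for each input, and the predicted class is simply the index of the largest probability. The plain-Python sketch below shows that step; the function name and the probability values are made up for illustration. This also matters in practice because predict_classes is no longer available in recent TensorFlow releases, where the usual replacement is np.argmax(model.predict(x), axis=-1).

```python
def classes_from_probabilities(prob_rows):
    """Return the index of the largest probability in each row (argmax)."""
    return [max(range(len(row)), key=row.__getitem__) for row in prob_rows]


# Two made-up softmax outputs over three classes.
probs = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
]
classes = classes_from_probabilities(probs)
print(classes)  # [1, 0]
```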





The code to display the MNIST digits nicely:



import matplotlib.pyplot as plt
%matplotlib inline

# utility function for showing images
def show_imgs(x_test, decoded_imgs=None, n=10):
    plt.figure(figsize=(20, 4))
    for i in range(n):
        ax = plt.subplot(2, n, i + 1)
        plt.imshow(x_test[i].reshape(28, 28))
        plt.gray()
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)

        if decoded_imgs is not None:
            ax = plt.subplot(2, n, i + 1 + n)
            plt.imshow(decoded_imgs[i].reshape(28, 28))
            plt.gray()
            ax.get_xaxis().set_visible(False)
            ax.get_yaxis().set_visible(False)
    plt.show()






edited Feb 24 at 8:22

























answered Feb 24 at 8:05









JahKnows








Nice answer +1!
– Aditya, Feb 24 at 8:28




























