Warm tip: This article is reproduced from serverfault.com, please click

My deep learning model is not training. How do I make it train?

发布于 2020-11-21 15:41:04

I'm fairly new to Keras, please excuse me if I made a fundamental error. So, my model has 3 Convolutional (2D) layers and 4 Dense Layers, interspersed with Dropout Layers. I am trying to train a Regression Model using images.

X_train.shape = (5164, 160, 320, 3)

y_train.shape = (5164)

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, Activation, MaxPooling2D, Dropout
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import Huber
from tensorflow.keras.optimizers.schedules import ExponentialDecay

model = Sequential()
model.add(Conv2D(input_shape=(160, 320, 3), filters=32, kernel_size=3, padding="valid"))
model.add(MaxPooling2D(pool_size=(3,3)))
model.add(Activation('relu'))

model.add(Conv2D(filters=256, kernel_size=3, padding="valid"))
model.add(MaxPooling2D(pool_size=(3,3)))
model.add(Activation('relu'))

model.add(Conv2D(filters=512, kernel_size=3, padding="valid"))
model.add(MaxPooling2D(pool_size=(3,3)))
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.25))
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.25))
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.25))
model.add(Dense(1))

checkpoint = ModelCheckpoint(filepath="./ckpts/model.ckpt", monitor='val_loss', save_best_only=True)
stopper = EarlyStopping(monitor='val_acc', min_delta=0.0003, patience = 10)

lr_schedule = ExponentialDecay(initial_learning_rate=0.1, decay_steps=10000, decay_rate=0.9)
optimizer = Adam(learning_rate=lr_schedule)
loss = Huber(delta=0.5, reduction="auto", name="huber_loss")

model.compile(loss = loss, optimizer = optimizer, metrics=['accuracy'])
model.fit(X_train, y_train, validation_split = 0.2, shuffle = True, epochs = 100, 
          callbacks=[checkpoint, stopper])

model.save('model.h5')

When I try to run this model, the training loss decreases as expected, the validation loss hovers around the same region, and the validation accuracy stays exactly the same. I'm not asking for inputs to improve my model (I'll do that on my own), but I need help to get the model unstuck. I want to see the validation accuracy change, even in the third decimal place, decrease or increase doesn't matter. How can I get my model unstuck?

Here's an image of what happens when I try to train the model: enter image description here

Any solution would be much appreciated.

Questioner
Suprateem Banerjee
Viewed
0
Suprateem Banerjee 2020-11-28 10:06:16

The main issue in this code was the metric. Accuracy, being a classification metric doesn't work on Regression models. Decreasing initial learning rate to 0.0001 helped as well.

A slightly different implementation got a more conclusive answer here