keras-How to solve CNN model fitting problem in tensorflow 2.2.0?

Akshay Sehgal 2020-11-30 23:08:23

It seems your kernel is dying (being killed) because the process is consuming too many resources. You are building an unnecessarily complex model with too many connections and trainable parameters. In fact, the single dense layer is responsible for 99.991% of all your trainable parameters (125,960,448 / 125,971,458).

The issue is that you are running out of computational resources (primarily RAM). For context, the following are some of the most influential CNN-based architectures, most of which were trained for days on powerful GPUs.

LeNet-5 - 60,000 parameters
AlexNet - 60M parameters
VGG-16 - 138M parameters
Inception-v1 - 5M parameters
Inception-v3 - 24M parameters
ResNet-50 - 26M parameters
Xception - 23M parameters
Inception-v4 - 43M parameters
Inception-ResNet-V2 - 56M parameters
ResNeXt-50 - 25M parameters

Your basic 2 CNN stack model - 125M parameters!
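To get a rough feel for what that parameter count means for RAM, here is a back-of-the-envelope sketch. It assumes float32 weights and an Adam-style optimizer that keeps two extra values per weight; neither is stated in the question, and the helper name training_memory_mib is made up for illustration. It ignores activations and batch data, so treat the result as a lower bound only.

```python
# Rough lower bound on the memory needed for a model's weights during training.
# Assumptions (not from the question): float32 weights (4 bytes each) and an
# Adam-style optimizer that stores two extra values per weight.
BYTES_PER_FLOAT32 = 4


def training_memory_mib(num_params, optimizer_slots=2):
    """Weights + gradients + optimizer slots in MiB (activations excluded)."""
    copies = 1 + 1 + optimizer_slots  # weights, gradients, optimizer state
    return num_params * BYTES_PER_FLOAT32 * copies / 2**20


big = training_memory_mib(125_971_458)  # the model from the question
lenet = training_memory_mib(60_000)     # LeNet-5, from the list above

print(f"125M-parameter model: ~{big:,.0f} MiB")
print(f"LeNet-5:              ~{lenet:,.2f} MiB")
```

On top of this come the activations for every image in a batch, which for 500x500 inputs can easily be the larger share.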

Here is what you can do -

flatten (Flatten)            (None, 492032)            0         
_________________________________________________________________
dropout (Dropout)            (None, 492032)            0         
_________________________________________________________________
dense (Dense)                (None, 256)               125960448 <---!!!!
_________________________________________________________________

You are flattening a 62x62x128 tensor into a vector of length 492,032! Instead, try adding more CNN layers to bring the first 2 dims down to a more manageable size AND/OR increase the kernel size in the previous CNN layers.
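The numbers in the summary excerpt can be checked by hand. A dense layer has one weight per (input, unit) pair plus one bias per unit; the helper name dense_params below is just for illustration.

```python
def dense_params(inputs, units):
    """Trainable parameters of a fully connected layer: weights + biases."""
    return inputs * units + units


flat = 62 * 62 * 128              # length of the flattened tensor
params = dense_params(flat, 256)  # the Dense(256) layer from the summary
total = 125_971_458               # total trainable parameters of the model

print("flattened length:", flat)
print("dense parameters:", params)
print(f"share of all parameters: {100 * params / total:.3f}%")
```

Because the count scales with the product of the flattened length and the number of units, shrinking either one shrinks the layer proportionally.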

The goal here is to have a manageably sized tensor before you hit the Dense layer. Also, try drastically reducing the number of nodes in the dense layer.

Try something like this for starters, something that your device can actually handle without killing the kernel, say with about 683k parameters (you should go simpler, though, and increase complexity later).

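import tensorflow as tf
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dropout, Dense

model = tf.keras.models.Sequential([
    # Each Conv2D + MaxPooling2D pair shrinks the spatial dims by roughly 3x
    Conv2D(32, 3, activation='relu', input_shape=(500, 500, 1)),
    MaxPooling2D(3, 3),  # pool_size=3, strides=3
    Conv2D(64, 3, activation='relu'),
    MaxPooling2D(3, 3),
    Conv2D(128, 3, padding='same', activation='relu'),
    MaxPooling2D(3, 3),
    Conv2D(256, 3, padding='same', activation='relu'),
    MaxPooling2D(3, 3),
    Flatten(),  # 6 x 6 x 256 = 9216 values instead of 492,032
    Dropout(0.5),
    Dense(32, activation='relu'),
    Dense(2, activation='softmax')  # 2 output units, as we have only 2 classes
])

model.summary()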

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_19 (Conv2D)           (None, 498, 498, 32)      320       
_________________________________________________________________
max_pooling2d_18 (MaxPooling (None, 166, 166, 32)      0         
_________________________________________________________________
conv2d_20 (Conv2D)           (None, 164, 164, 64)      18496     
_________________________________________________________________
max_pooling2d_19 (MaxPooling (None, 54, 54, 64)        0         
_________________________________________________________________
conv2d_21 (Conv2D)           (None, 54, 54, 128)       73856     
_________________________________________________________________
max_pooling2d_20 (MaxPooling (None, 18, 18, 128)       0         
_________________________________________________________________
conv2d_22 (Conv2D)           (None, 18, 18, 256)       295168    
_________________________________________________________________
max_pooling2d_21 (MaxPooling (None, 6, 6, 256)         0         
_________________________________________________________________
flatten_5 (Flatten)          (None, 9216)              0         
_________________________________________________________________
dropout_5 (Dropout)          (None, 9216)              0         
_________________________________________________________________
dense_10 (Dense)             (None, 32)                294944    
_________________________________________________________________
dense_11 (Dense)             (None, 2)                 66        
=================================================================
Total params: 682,850
Trainable params: 682,850
Non-trainable params: 0
_________________________________________________________________
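The shapes and parameter counts in this summary can be reproduced without TensorFlow, which is handy for sizing a model before you build it. This is a minimal sketch that assumes stride-1 convolutions and 'valid' pooling (the Keras defaults); the helper names are made up for illustration.

```python
def conv_size(size, kernel=3, padding="valid"):
    """Spatial size after a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1


def pooled_size(size, pool=3, stride=3):
    """Spatial size after max pooling with 'valid' padding."""
    return (size - pool) // stride + 1


def conv_params(kernel, in_channels, filters):
    """Trainable parameters of a Conv2D layer: kernel weights + biases."""
    return kernel * kernel * in_channels * filters + filters


size, channels, total = 500, 1, 0
# (filters, padding) for each Conv2D + MaxPooling2D pair in the model above
for filters, padding in [(32, "valid"), (64, "valid"), (128, "same"), (256, "same")]:
    total += conv_params(3, channels, filters)
    size = pooled_size(conv_size(size, 3, padding))
    channels = filters
    print(f"after Conv2D({filters}) + pooling: {size}x{size}x{channels}")

flat = size * size * channels
total += flat * 32 + 32  # Dense(32)
total += 32 * 2 + 2      # Dense(2)

print("flattened length:", flat)
print("total parameters:", total)
```

Changing the pooling size or the number of stages in the loop shows quickly how the flattened length, and with it the first dense layer, grows or shrinks.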

Jade 2020-12-02 08:29:09

Thank you. Your sharing has been very useful.
