
HW3 Report

Chest X-ray (Pneumonia)

 

Classification problem

◦ input variable: images

◦ 1 binary output variable (pneumonia or normal)

 5,863 X-ray images

◦ Already split into train, validation and test.

 

Q1

Data preprocessing: The image sizes all vary. Thus, resizing is essential.

◦ When loading images, resize the image into [128, 128]

◦ flow_from_directory(train_dir, target_size=(128,128), batch_size=20, class_mode='binary')

 Base model: you will use a pre-trained model, VGG16 (weights='imagenet').

 Classifier: the top MLP structure should be:

◦ GlobalAveragePooling2D ->

◦ Dense(512) -> BatchNormalization -> Activation(ReLU) ->

◦ Dense(128) -> Dense(1)

 You should do 2-step fine-tuning

◦ 100 epochs for the frozen base + 50 fine-tuning epochs (fine-tune only block 5)

◦ Learning parameters: RMSprop with a learning rate of 1e-5

◦ When you load a model, you should set the optimizer again.

 You can run multiple times and average the results. However, I do not recommend it, since training takes quite a long time (more than 2 hours on a GPU).

 Fill the table of the HW3 template.

◦ Show your code, and the accuracy and loss on the training and test sets.

◦ Also show the accuracy and loss graphs on the training and validation sets.

 Note. Do not forget to save the learned model before and after fine-tuning. You will use the saved models in QE2.

 

<pneumonia_finetuning.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import load_model
from tensorflow.keras import optimizers

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(128,128),batch_size=20,class_mode='binary',
    shuffle=False  # keep file order so predictions line up with test_generator.classes
)

# loading the model
model = load_model('pneumonia_q1.h5')

conv_base = model.layers[0]
conv_base.trainable = True  # per-layer flags are ignored while the whole base stays frozen
for layer in conv_base.layers:
    # fine-tune only block 5; keep earlier blocks frozen
    layer.trainable = layer.name.startswith('block5')

model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-5),
              loss='binary_crossentropy', metrics=['accuracy'])

# main loop without cross_validation
import time
starttime=time.time()
num_epochs=50
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)

model.save('pneumonia_q1_fine.h5')
# evaluation

train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization
def plot_acc(h, title="accuracy"):
    plt.plot(h.history['accuracy'])
    plt.plot(h.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q1.after-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q1.after-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)

 

<pneumonia_pretrained.py>

import sklearn
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(128,128),batch_size=20,class_mode='binary',
    shuffle=False  # keep file order so predictions line up with test_generator.classes
)

from tensorflow.keras.applications import VGG16

# model definition
input_shape = [128,128,3] # as a shape of image
def build_model():
    model=models.Sequential()
    conv_base = VGG16(weights='imagenet',
                      include_top=False,
                      input_shape=input_shape)
    conv_base.trainable=False
    model.add(conv_base)
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dense(512))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(layers.Dense(128,activation='relu'))
    model.add(layers.Dense(1, activation='sigmoid'))
    # compile
    model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-4),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# learning
import time
starttime=time.time();
num_epochs = 100
model = build_model()
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)
# saving the model
model.save('pneumonia_q1.h5')

# evaluation
train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['accuracy'])
    plt.plot(h.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q1.before-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q1.before-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)
                    Before FT   After FT
Training Loss          0.0009     0.0003
Training Accuracy      0.9998     0.9998
Test Loss              3.8092     5.0818
Test Accuracy          0.7949     0.7933

Loss Graph

 

<before>

<after>

Accuracy Graph

 

<before>

<after>

Looking at the loss graphs, the model shows slight overfitting before fine-tuning, while after fine-tuning the loss decreases. Contrary to theory, however, the test accuracy and test loss after fine-tuning are slightly worse. A more suitable optimizer and learning rate need to be found.
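Why the learning-rate choice matters can be illustrated on a toy objective (an illustrative sketch only, unrelated to the homework model): with plain gradient descent on f(w) = w², a small step size converges while an oversized one diverges.

```python
def final_loss(lr, steps=50):
    """Run plain gradient descent on f(w) = w**2 from w = 1.0 and return the final loss."""
    w = 1.0
    for _ in range(steps):
        w -= lr * 2 * w          # gradient of w**2 is 2*w
    return w * w

print(final_loss(0.1))   # small step: loss shrinks toward 0
print(final_loss(1.1))   # oversized step: loss blows up
```

The same trade-off applies to the 1e-5 fine-tuning rate above: too large a step on pre-trained weights can undo useful features.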

 

Q2

The previous model showed serious overfitting. Thus, let’s add dropout.

 The modified classifier: the top MLP structure should be:

◦ GlobalAveragePooling2D -> Dropout(0.25) ->

◦ Dense(512) -> BatchNormalization -> Activation(ReLU) -> Dropout(0.25) ->

◦ Dense(128) -> Dropout(0.25) -> Dense(1)

 You should do 2-step fine-tuning

◦ 100 epochs for the frozen base + 100 fine-tuning epochs

◦ All other parameters should be the same as in Q1.

 Fill the table of the HW3 template.

◦ Show the accuracy and loss on the training and test sets.

◦ Also show the accuracy and loss graphs on the training and validation sets.

 Do you think that overfitting is reduced?

 Is it improved compared to the results of Q1?

 Note. Do not forget to save the learned model before and after fine-tuning. You will use the saved models in QE2.

 

<pneumonia_finetuning_dropout.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import load_model
from tensorflow.keras import optimizers

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(128,128),batch_size=20,class_mode='binary',
    shuffle=False  # keep file order so predictions line up with test_generator.classes
)

# loading the model
model = load_model('pneumonia_q2.h5')

conv_base = model.layers[0]
conv_base.trainable = True  # per-layer flags are ignored while the whole base stays frozen
for layer in conv_base.layers:
    # fine-tune only block 5; keep earlier blocks frozen
    layer.trainable = layer.name.startswith('block5')

model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-5),
              loss='binary_crossentropy', metrics=['accuracy'])

# main loop without cross_validation
import time
starttime=time.time()
num_epochs=50
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)

model.save('pneumonia_q2_fine.h5')
# evaluation

train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization
def plot_acc(h, title="accuracy"):
    plt.plot(h.history['accuracy'])
    plt.plot(h.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q2.after-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q2.after-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)

<pneumonia_pretrained_dropout.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(128,128),batch_size=20,class_mode='binary',
    shuffle=False  # keep file order so predictions line up with test_generator.classes
)

from tensorflow.keras.applications import VGG16

# model definition
input_shape = [128,128,3] # as a shape of image
def build_model():
    model=models.Sequential()
    conv_base = VGG16(weights='imagenet',
                      include_top=False,
                      input_shape=input_shape)
    conv_base.trainable=False
    model.add(conv_base)
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(512))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(128,activation='relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(1, activation='sigmoid'))
    # compile
    model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-4),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# learning
import time
starttime=time.time();
num_epochs = 100
model = build_model()
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)
# saving the model
model.save('pneumonia_q2.h5')

# evaluation
train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['accuracy'])
    plt.plot(h.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q2.before-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q2.before-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)
                    Before FT   After FT
Training Loss          0.0673     0.0035
Training Accuracy      0.9769     0.9991
Test Loss              0.5499     2.2543
Test Accuracy          0.8364     0.8205

Loss Graph

 

<before>

<after>

Accuracy Graph

 

<before>

<after>

Comparing the loss graphs before and after fine-tuning, severe overfitting is visible before fine-tuning, and it is reduced to some extent afterwards. However, both the loss and accuracy curves oscillate heavily; this is likely because the validation set contains fewer than 20 images even though the training set has several thousand. Compared with the model without dropout layers, performance improved: dropout randomly cuts off units during training, which helps the network reach a better optimum.
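The cut-off idea can be sketched with plain NumPy (a hypothetical helper, not part of the homework code): each unit is kept with probability 1 - rate, and the survivors are rescaled so the expected activation is unchanged — the "inverted dropout" scheme used by Keras at training time.

```python
import numpy as np

def inverted_dropout(x, rate, rng):
    """Randomly zero units with probability `rate`, rescaling the survivors."""
    mask = rng.random(x.shape) >= rate       # keep each unit with prob 1 - rate
    return x * mask / (1.0 - rate)           # rescale so the expected output equals x

rng = np.random.default_rng(0)
x = np.ones((10000,))
y = inverted_dropout(x, 0.25, rng)
print("mean after dropout:", y.mean())       # stays close to 1.0
print("fraction dropped:", (y == 0).mean())  # close to the 0.25 rate
```

At inference time Keras disables dropout entirely, which the rescaling makes consistent.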

 

Q3

Repeat Q2 with images resized to [256, 256] and [512, 512].

◦ For [512, 512], due to memory limitations, you should change the batch size to 10.

◦ For [256, 256], the batch size of 20 is okay. (No change is required)

 You should do 2-step fine-tuning

◦ All parameters should be the same as in Q2.

 Fill the table of the HW3 template.

◦ Show the accuracy and loss on the training and test sets.

◦ Also show the accuracy and loss graphs on the training and validation sets.

 Which result is best among Q1, Q2, and Q3? Why?

 Note. Do not forget to save the learned models before and after fine-tuning. You will use the saved models in QE2.
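The batch-size reduction for [512, 512] above is driven by memory: a float32 input batch grows with the square of the side length, and VGG16's early activations grow with it. A quick back-of-the-envelope check (illustrative numbers, input tensors only):

```python
def batch_megabytes(side, batch, channels=3, bytes_per_value=4):
    """Memory for one float32 batch of square RGB images, in MiB."""
    return side * side * channels * batch * bytes_per_value / 2**20

print(batch_megabytes(128, 20))   # 3.75 MiB
print(batch_megabytes(512, 20))   # 60.0 MiB -> 16x larger at 4x the side length
print(batch_megabytes(512, 10))   # 30.0 MiB -> halved by the smaller batch
```

The intermediate feature maps inside VGG16 are far larger than the inputs, so the real GPU footprint scales up even more sharply.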

 

<pneumonia_finetuning_256.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import load_model
from tensorflow.keras import optimizers

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(256,256),batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(256,256),batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(256,256),batch_size=20,class_mode='binary',
    shuffle=False  # keep file order so predictions line up with test_generator.classes
)

# loading the model
model = load_model('pneumonia_q3_256.h5')

conv_base = model.layers[0]
conv_base.trainable = True  # per-layer flags are ignored while the whole base stays frozen
for layer in conv_base.layers:
    # fine-tune only block 5; keep earlier blocks frozen
    layer.trainable = layer.name.startswith('block5')

model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-5),
              loss='binary_crossentropy', metrics=['accuracy'])

# main loop without cross_validation
import time
starttime=time.time()
num_epochs=100
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)

model.save('pneumonia_q3_256_fine.h5')

# evaluation

train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['accuracy'])
    plt.plot(h.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q3.256.after-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q3.256.after-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)

<pneumonia_pretrained_256.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(256,256),batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(256,256),batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(256,256),batch_size=20,class_mode='binary',
    shuffle=False  # keep file order so predictions line up with test_generator.classes
)

from tensorflow.keras.applications import VGG16

# model definition
input_shape = [256,256,3] # as a shape of image
def build_model():
    model=models.Sequential()
    conv_base = VGG16(weights='imagenet',
                      include_top=False,
                      input_shape=input_shape)
    conv_base.trainable=False
    model.add(conv_base)
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(512))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(128,activation='relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(1, activation='sigmoid'))
    # compile
    model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-4),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# learning
import time
starttime=time.time();
num_epochs = 100
model = build_model()
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)
# saving the model
model.save('pneumonia_q3_256.h5')

# evaluation
train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['accuracy'])
    plt.plot(h.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q3.256.before-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q3.256.before-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)

<pneumonia_finetuning_512.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import load_model
from tensorflow.keras import optimizers

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(512,512),batch_size=10,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(512,512),batch_size=10,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(512,512),batch_size=10,class_mode='binary',
    shuffle=False  # keep file order so predictions line up with test_generator.classes
)

# loading the model
model = load_model('pneumonia_q3_512.h5')

conv_base = model.layers[0]
conv_base.trainable = True  # per-layer flags are ignored while the whole base stays frozen
for layer in conv_base.layers:
    # fine-tune only block 5; keep earlier blocks frozen
    layer.trainable = layer.name.startswith('block5')

model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-5),
              loss='binary_crossentropy', metrics=['accuracy'])

# main loop without cross_validation
import time
starttime=time.time()
num_epochs=100
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)

model.save('pneumonia_q3_512_fine.h5')
# evaluation

train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['accuracy'])
    plt.plot(h.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q3.512.after-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q3.512.after-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)

<pneumonia_pretrained_512.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(512,512),batch_size=10,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(512,512),batch_size=10,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(512,512),batch_size=10,class_mode='binary',
    shuffle=False  # keep file order so predictions line up with test_generator.classes
)

from tensorflow.keras.applications import VGG16

# model definition
input_shape = [512,512,3] # as a shape of image
def build_model():
    model=models.Sequential()
    conv_base = VGG16(weights='imagenet',
                      include_top=False,
                      input_shape=input_shape)
    conv_base.trainable=False
    model.add(conv_base)
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(512))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(128,activation='relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(1, activation='sigmoid'))
    # compile
    model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-4),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# learning
import time
starttime=time.time();
num_epochs = 100
model = build_model()
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)
# saving the model
model.save('pneumonia_q3_512.h5')

# evaluation
train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['accuracy'])
    plt.plot(h.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q3.512.before-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q3.512.before-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)
                    [256, 256]   [256, 256]   [512, 512]   [512, 512]
                     Before FT     After FT    Before FT     After FT
Training Loss           0.0923       0.0002       0.1279       0.0113
Training Accuracy       0.9746       1.0000       0.9725       0.9971
Test Loss               0.4351       5.1157       0.3874       6.0599
Test Accuracy           0.8316       0.7356       0.8341       0.7292

Loss Graph

<256 before>

<256 after>

<512 before>

<512 after>

Accuracy Graph

<256 before>

<256 after>

<512 before>

<512 after>

Comparing the results of Q1, Q2, and Q3 by test accuracy and test loss alone, Q2, which used 128x128 images and added dropout, performed best. Although the image size increased, the number of training epochs did not, so extracting detailed features from the larger images probably became harder, which would explain this result. However, the graphs for the 512x512 case show the loss still decreasing; increasing the number of epochs to address this underfitting would likely yield better accuracy.

 

Q4

Using Chapter 5.3 of the textbook, draw the area that was important for the classification.

 You can use matplotlib’s pyplot.imshow.

 The results should be similar to below.
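The heart of the Grad-CAM script below is a channel-weighted average of the last conv layer's activations; the weighting step can be written as a small NumPy function (hypothetical helper names, extracted here only to clarify the math):

```python
import numpy as np

def cam_heatmap(conv_out, pooled_grads):
    """conv_out: (H, W, C) conv activations; pooled_grads: (C,) channel importances."""
    weighted = conv_out * pooled_grads          # scale each channel by its gradient weight
    heatmap = weighted.mean(axis=-1)            # collapse channels into one (H, W) map
    heatmap = np.maximum(heatmap, 0)            # ReLU: keep only positive evidence
    return heatmap / (heatmap.max() + 1e-8)     # normalize to [0, 1] for display
```

The `gradCAM` function in `<get_cam.py>` performs the same computation, obtaining `pooled_grads` from the gradient of the predicted class score.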

 

<get_cam.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import load_model
from tensorflow.keras import optimizers
import cv2
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
tf.compat.v1.disable_eager_execution()
K.set_learning_phase(0)

# image preprocessing
img_path = '/home//ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/PNEUMONIA/person15_virus_46.jpeg'
img=image.load_img(img_path, target_size=(224,224))
img_tensor=image.img_to_array(img)
img_tensor=np.expand_dims(img_tensor,axis=0)
img_tensor=preprocess_input(img_tensor)

# load model
model = VGG16(weights="imagenet")

# gradCAM
def gradCAM(model, x):
    preds=model.predict(x)

    max_output=model.output[:,np.argmax(preds[0])]
    last_conv_layer = model.get_layer('block5_conv3')
    grads = K.gradients(max_output, last_conv_layer.output)[0]
    pooled_grads = K.mean(grads, axis=(0,1,2))

    iterate=K.function([model.input],[pooled_grads,last_conv_layer.output[0]])
    pooled_grads_value, conv_layer_output_value=iterate([x])
    for i in range(512):
        conv_layer_output_value[:,:,i] *= pooled_grads_value[i]
    heatmap=np.mean(conv_layer_output_value,axis=-1)
    heatmap=np.maximum(heatmap,0)
    heatmap/=np.max(heatmap)
    return heatmap


heatmap = gradCAM(model, img_tensor)
plt.matshow(heatmap)

#visualization
img=cv2.imread(img_path)
heatmap=cv2.resize(heatmap,(img.shape[1],img.shape[0]))
heatmap=np.uint8(255*heatmap)
heatmap=cv2.applyColorMap(heatmap, cv2.COLORMAP_JET)
superimposed_img=heatmap*0.4+img
cv2.imwrite('/home/ericsungho/PycharmProjects/pythonProject/q4/q4.png',superimposed_img)
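The script above relies on disabling eager execution so that K.gradients works. Under TF 2.x the same Grad-CAM computation can be written with tf.GradientTape instead; a sketch under that assumption (the layer name 'block5_conv3' matches the VGG16 usage above):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16

def grad_cam(model, x, conv_layer_name='block5_conv3'):
    # model mapping the input to (last conv feature map, predictions)
    grad_model = tf.keras.models.Model(
        model.inputs,
        [model.get_layer(conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(x)
        top_class = tf.argmax(preds[0])
        top_score = tf.gather(preds[0], top_class)
    grads = tape.gradient(top_score, conv_out)       # d(score)/d(feature map)
    pooled = tf.reduce_mean(grads, axis=(0, 1, 2))   # per-channel importance
    heatmap = tf.reduce_sum(conv_out[0] * pooled, axis=-1)
    heatmap = tf.maximum(heatmap, 0) / (tf.reduce_max(heatmap) + 1e-8)
    return heatmap.numpy()                           # (14, 14) for a 224x224 input
```

The returned heatmap can then be resized and superimposed with cv2 exactly as in the visualization step above.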

<Results>

QE1

We will try CNN for varying image sizes.

 For flow_from_directory, do not specify a resize. In other words, simply

◦ flow_from_directory(train_dir, batch_size=20, class_mode='binary')

 Instead, the input_shape of CNN should be specified as [None, None, 3]

◦ input_shape = [None, None, 3]

 You should do 2-step fine-tuning

◦ All parameters should be the same as those of problem 4

 Fill the table of the HW3 template.

◦ Show accuracy, and loss in the training and test set,

◦ Also show the accuracy graph and loss graph in the training and validation set.

 Run the code. Does it work?

 Replace GlobalAveragePooling2D with Flatten. Run the code, does it work?

 Does it work better than the best model of Q3? If so, why? If not, why?

 Note. Do not forget saving the learned model before and after fine-tuning. You will use the saved model in QE2

 

<0_finetuning.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import load_model
from tensorflow.keras import optimizers

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,batch_size=20,class_mode='binary',
    shuffle=False  # keep order aligned with test_generator.classes for the scores below
)

# loading the model
model = load_model('pneumonia_qe1_0.h5')

conv_base=model.layers[0]
conv_base.trainable = True  # unfreeze the container so the per-layer flags take effect
for layer in conv_base.layers:
    layer.trainable = layer.name.startswith('block5')  # fine-tune block5 only

model.compile(optimizer=optimizers.RMSprop(lr=1e-5),
              loss='binary_crossentropy', metrics=['accuracy'])

# main loop without cross_validation
import time
starttime=time.time()
num_epochs=100
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)

model.save('pneumonia_qe1_0_fine.h5')
# evaluation

train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['acc'])
    plt.plot(h.history['val_acc'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Qe1.after-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Qe1.after-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)

<0_pretrained.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,batch_size=20,class_mode='binary',
    shuffle=False  # keep order aligned with test_generator.classes for the scores below
)

from tensorflow.keras.applications import VGG16

# model definition
input_shape = [None,None,3] # as a shape of image
def build_model():
    model=models.Sequential()
    conv_base = VGG16(weights='imagenet',
                      include_top=False,
                      input_shape=input_shape)
    conv_base.trainable=False
    model.add(conv_base)
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(512))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(128,activation='relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(1, activation='sigmoid'))
    # compile
    model.compile(optimizer=optimizers.RMSprop(lr=1e-4),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# learning
import time
starttime=time.time();
num_epochs = 100
model = build_model()
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)
# saving the model
model.save('pneumonia_qe1_0.h5')

# evaluation
train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['acc'])
    plt.plot(h.history['val_acc'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Qe1.before-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Qe1.before-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)

<flatten_finetuning.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import load_model
from tensorflow.keras import optimizers

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,batch_size=20,class_mode='binary',
    shuffle=False  # keep order aligned with test_generator.classes for the scores below
)

# loading the model
model = load_model('pneumonia_qe1_flatten.h5')

conv_base=model.layers[0]
conv_base.trainable = True  # unfreeze the container so the per-layer flags take effect
for layer in conv_base.layers:
    layer.trainable = layer.name.startswith('block5')  # fine-tune block5 only

model.compile(optimizer=optimizers.RMSprop(lr=1e-5),
              loss='binary_crossentropy', metrics=['accuracy'])

# main loop without cross_validation
import time
starttime=time.time()
num_epochs=100
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)

model.save('pneumonia_qe1_flatten_fine.h5')
# evaluation

train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['acc'])
    plt.plot(h.history['val_acc'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Qe1.flatten.after-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Qe1.flatten.after-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)

<flatten_pretrained.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,batch_size=20,class_mode='binary',
    shuffle=False  # keep order aligned with test_generator.classes for the scores below
)

from tensorflow.keras.applications import VGG16

# model definition
input_shape = [None,None,3] # as a shape of image
def build_model():
    model=models.Sequential()
    conv_base = VGG16(weights='imagenet',
                      include_top=False,
                      input_shape=input_shape)
    conv_base.trainable=False
    model.add(conv_base)
    model.add(layers.Flatten())
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(512))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(128,activation='relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(1, activation='sigmoid'))
    # compile
    model.compile(optimizer=optimizers.RMSprop(lr=1e-4),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# learning
import time
starttime=time.time();
num_epochs = 100
model = build_model()
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)
# saving the model
model.save('pneumonia_qe1_flatten.h5')

# evaluation
train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['acc'])
    plt.plot(h.history['val_acc'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Qe1.flatten.before-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Qe1.flatten.before-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)
                   [None,None]   [None,None]
                   Before FT     After FT
Training Loss      0.0914        0.0020
Training Accuracy  0.9726        0.9994
Test Loss          0.3754        3.1453
Test Accuracy      0.8446        0.7982

Loss Graph

<before>

<after>

Accuracy Graph

<before>

<after>

With GlobalAveragePooling2D the code ran correctly, but with Flatten it failed with the error "The last dimension of the inputs to `Dense` should be defined." When the input shape is [None, None, 3], the spatial size of the last feature map is unknown, so Flatten cannot produce a fixed-length vector and the following Dense layer cannot define the number of its input connections; the number of hidden neurons would have to be re-specified for every image size. GlobalAveragePooling2D, by contrast, always outputs one value per channel, so it works for any input size. With the image size left as None, the test accuracy was higher than in the 256 and 512 cases but lower than in the 128 case.
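The shape arithmetic behind this can be checked directly; a small sketch (the layer sizes here are made up for illustration):

```python
import tensorflow as tf
from tensorflow.keras import layers

inp  = layers.Input(shape=(None, None, 3))   # spatial size unknown at build time
feat = layers.Conv2D(8, 3, padding='same')(inp)

# GlobalAveragePooling2D collapses each channel to a single number,
# so its output shape is (batch, 8) regardless of the image size:
gap = layers.GlobalAveragePooling2D()(feat)
dense_ok = layers.Dense(4)(gap)              # well-defined 8 x 4 weight matrix

# Flatten's output length depends on the unknown spatial size, so the
# next Dense layer cannot size its weight matrix and raises a ValueError:
flat = layers.Flatten()(feat)                # shape (batch, None)
try:
    layers.Dense(4)(flat)
except ValueError as err:
    print('Dense failed:', err)
```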

 

QE2

There are other methods to evaluate the model.

 Compute the following scores in Q1~Q3 (and QE1).

◦ Precision, Recall (sensitivity), Specificity, F1 score, AUC

◦ These scores should be computed in the test data set only.

 You need to use sklearn (adapt it into your code!)

◦ y_pred=model.predict_generator(test_generator)

◦ matrix = sklearn.metrics.confusion_matrix(y_test, y_pred>0.5)

◦ auc=sklearn.metrics.roc_auc_score(y_test, y_pred)

 Which model was the best considering all of the computed scores?

 

Model                            Test Accuracy  Precision  Recall  Specificity  F1 Score  AUC
[128,128] before FT, no dropout  0.7949         0.6299     0.8205  0.1966       0.7127    0.5012
[128,128] after FT, no dropout   0.7933         0.6291     0.8308  0.1838       0.7160    0.5121
[128,128] before FT              0.8364         0.6307     0.7795  0.2393       0.6972    0.5181
[128,128] after FT               0.8205         0.6426     0.8205  0.2393       0.7207    0.5253
[256,256] before FT              0.8316         0.6181     0.7718  0.2051       0.6864    0.5048
[256,256] after FT               0.7356         0.6288     0.8949  0.1197       0.7386    0.5314
[512,512] before FT              0.8341         0.6115     0.7667  0.1881       0.6803    0.4531
[512,512] after FT               0.7292         0.6315     0.9051  0.1197       0.7439    0.5267
[None,None] before FT            0.8446         0.6406     0.7769  0.2735       0.7022    0.5143
[None,None] after FT             0.7981         0.6299     0.8205  0.1966       0.7127    0.4958

A model's performance should not be judged by accuracy alone. Analyzing the scores above, every model shows very low specificity. Specificity is the fraction of normal subjects that the model correctly classifies as normal; the higher it is, the fewer healthy people are misdiagnosed and the less money is wasted on unnecessary treatment. From this standpoint none of the models performs well, so a hospital with a tightly limited treatment budget should not rely on them. Judging by the remaining scores, the 512x512 model after fine-tuning performed best. Compared with the 128x128 models it shows lower precision but higher recall. Precision is the fraction of people the model diagnoses as pneumonia who actually have pneumonia, while recall is the fraction of actual pneumonia patients that the model detects. In practice, a healthy person flagged as pneumonia can be cleared by a physician on a second examination, but failing to catch a real pneumonia patient is a matter of life and death, so recall is the critical score. The fine-tuned 512x512 model, which has the best recall, is therefore the optimal model. It also achieved the highest F1 score (the harmonic mean of precision and recall) and the second-highest AUC (where a larger area is better).
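Since recall is the score that matters most here, note that the 0.5 decision threshold used above is itself a free choice: lowering it trades specificity for recall. A small sketch with made-up scores:

```python
import numpy as np

def recall_specificity(y_true, y_score, threshold):
    # threshold the scores, then count the four confusion-matrix cells
    y_true = np.asarray(y_true)
    y_hat  = np.asarray(y_score) > threshold
    tp = np.sum(y_hat & (y_true == 1));  fn = np.sum(~y_hat & (y_true == 1))
    tn = np.sum(~y_hat & (y_true == 0)); fp = np.sum(y_hat & (y_true == 0))
    return tp / (tp + fn), tn / (tn + fp)

y_true  = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.6, 0.4, 0.45, 0.3, 0.1]
print(recall_specificity(y_true, y_score, 0.5))   # recall 2/3, specificity 1.0
print(recall_specificity(y_true, y_score, 0.35))  # recall 1.0, specificity 2/3
```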

 

QE3

Let’s use different CNN base model, inceptionV3(weights=‘imagenet’).

◦ Use the same decision maker part with the models in the main questions

 Following Q1-Q3, and QE1, QE2, find the best model. The model should be tested through

◦ 2-step fine-tuning (Q1)

◦ Avoiding overfitting (Q2)

◦ Investigating whether image resizing affects the performance (Q3 and QE1)

◦ Various evaluation methods (QE2)

 

<inceptionv3.py>

import sklearn
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(128,128),batch_size=20,class_mode='binary',
    shuffle=False  # keep order aligned with test_generator.classes for the scores below
)

from tensorflow.keras.applications import InceptionV3

# model definition
input_shape = [128,128,3] # as a shape of image
def build_model():
    model=models.Sequential()
    conv_base = InceptionV3(weights='imagenet',
                      include_top=False,
                      input_shape=input_shape)
    conv_base.trainable=False
    model.add(conv_base)
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dense(512))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(layers.Dense(128,activation='relu'))
    model.add(layers.Dense(1, activation='sigmoid'))
    # compile
    model.compile(optimizer=optimizers.RMSprop(lr=1e-4),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# learning
import time
starttime=time.time();
num_epochs = 100
model = build_model()
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)
# saving the model
model.save('pneumonia_q1.h5')

# evaluation
train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['acc'])
    plt.plot(h.history['val_acc'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q1.before-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q1.before-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)

<inceptionv3_fine.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import load_model
from tensorflow.keras import optimizers
# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(128,128),batch_size=20,class_mode='binary',
    shuffle=False  # keep order aligned with test_generator.classes for the scores below
)

# loading the model
model = load_model('pneumonia_q1.h5')

conv_base = model.layers[0]
conv_base.trainable = True  # unfreeze the container so the per-layer flags take effect
# InceptionV3 has no 'block5' layers; unfreeze the top two inception blocks instead
# (index 249 is the conventional cut from the Keras applications fine-tuning example)
for layer in conv_base.layers[:249]:
    layer.trainable = False
for layer in conv_base.layers[249:]:
    layer.trainable = True

model.compile(optimizer=optimizers.RMSprop(lr=1e-5),
              loss='binary_crossentropy', metrics=['accuracy'])

# main loop without cross_validation
import time
starttime=time.time()
num_epochs=50
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)

model.save('pneumonia_q1_fine.h5')
# evaluation

train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['acc'])
    plt.plot(h.history['val_acc'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q1.after-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q1.after-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)

<inceptionv3_dropout.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(128,128),batch_size=20,class_mode='binary',
    shuffle=False  # keep order aligned with test_generator.classes for the scores below
)

from tensorflow.keras.applications import InceptionV3

# model definition
input_shape = [128,128,3] # as a shape of image
def build_model():
    model=models.Sequential()
    conv_base = InceptionV3(weights='imagenet',
                      include_top=False,
                      input_shape=input_shape)
    conv_base.trainable=False
    model.add(conv_base)
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(512))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(128,activation='relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(1, activation='sigmoid'))
    # compile
    model.compile(optimizer=optimizers.RMSprop(lr=1e-4),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# learning
import time
starttime=time.time();
num_epochs = 100
model = build_model()
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)
# saving the model
model.save('pneumonia_q2.h5')

# evaluation
train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['acc'])
    plt.plot(h.history['val_acc'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q2.before-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q2.before-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)

<inceptionv3_dropout_fine.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import load_model
from tensorflow.keras import optimizers

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(128,128),batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(128,128),batch_size=20,class_mode='binary',
    shuffle=False  # keep order aligned with test_generator.classes for the scores below
)

# loading the model
model = load_model('pneumonia_q2.h5')

conv_base=model.layers[0]
conv_base.trainable = True  # unfreeze the container so the per-layer flags take effect
# InceptionV3 has no 'block5' layers; unfreeze the top two inception blocks instead
# (index 249 is the conventional cut from the Keras applications fine-tuning example)
for layer in conv_base.layers[:249]:
    layer.trainable = False
for layer in conv_base.layers[249:]:
    layer.trainable = True

model.compile(optimizer=optimizers.RMSprop(lr=1e-5),
              loss='binary_crossentropy', metrics=['accuracy'])

conv_base.summary()
model.summary()

# main loop without cross_validation
import time
starttime=time.time()
num_epochs=100
history = model.fit_generator(train_generator,
                              epochs=num_epochs,
                              validation_data=validation_generator)

model.save('pneumonia_q2_fine.h5')

# evaluation

train_loss, train_acc = model.evaluate_generator(train_generator)
test_loss, test_acc = model.evaluate_generator(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['accuracy'])
    plt.plot(h.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q2.after-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q2.after-FT.acc.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)

<inceptionv3_256.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(256,256),batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(256,256),batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(256,256),batch_size=20,class_mode='binary',
    shuffle=False  # keep directory order so .classes matches predict()
)

from tensorflow.keras.applications import InceptionV3

# model definition
input_shape = [256,256,3] # input image shape
def build_model():
    model=models.Sequential()
    conv_base = InceptionV3(weights='imagenet',
                      include_top=False,
                      input_shape=input_shape)
    conv_base.trainable=False
    model.add(conv_base)
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(512))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(128,activation='relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(1, activation='sigmoid'))
    # compile
    model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-4),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# learning
import time
starttime = time.time()
num_epochs = 100
model = build_model()
history = model.fit(train_generator,
                    epochs=num_epochs,
                    validation_data=validation_generator)
# saving the model
model.save('pneumonia_q3_256.h5')

# evaluation
train_loss, train_acc = model.evaluate(train_generator)
test_loss, test_acc = model.evaluate(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['accuracy'])
    plt.plot(h.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q3.256.before-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q3.256.before-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict_generator(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)
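The table at the end of this report also lists Precision, Recall, Specificity, and F1, which the scripts never print directly; all of them follow from the confusion matrix. A sketch with made-up counts (tn, fp, fn, tp below are hypothetical, not numbers from the report), using sklearn's `[[tn, fp], [fn, tp]]` layout:

```python
# Hypothetical confusion-matrix counts in sklearn's [[tn, fp], [fn, tp]] layout.
tn, fp, fn, tp = 64, 170, 49, 341

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)               # predicted pneumonia that is pneumonia
recall = tp / (tp + fn)                  # actual pneumonia that was caught
specificity = tn / (tn + fp)             # actual normals identified as normal
f1 = 2 * precision * recall / (precision + recall)

print(round(precision, 4), round(recall, 4), round(specificity, 4), round(f1, 4))
```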

<inceptionv3_256_fine.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import load_model

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(256,256),batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(256,256),batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(256,256),batch_size=20,class_mode='binary',
    shuffle=False  # keep directory order so .classes matches predict()
)

# loading the model
model = load_model('pneumonia_q3_256.h5')

conv_base = model.layers[0]
# InceptionV3 has no 'block5*' layers (that naming is VGG16's), so a
# startswith('block5') test unfreezes nothing; unfreeze from 'mixed8' instead.
conv_base.trainable = True
set_trainable = False
for layer in conv_base.layers:
    set_trainable = set_trainable or layer.name == 'mixed8'
    layer.trainable = set_trainable

model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-5),
              loss='binary_crossentropy', metrics=['accuracy'])

# main loop without cross_validation
import time
starttime=time.time()
num_epochs=100
history = model.fit(train_generator,
                    epochs=num_epochs,
                    validation_data=validation_generator)

model.save('pneumonia_q3_256_fine.h5')

# evaluation

train_loss, train_acc = model.evaluate(train_generator)
test_loss, test_acc = model.evaluate(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['accuracy'])
    plt.plot(h.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q3.256.after-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q3.256.after-FT.acc.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)
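A note on the unfreezing step in these fine-tuning scripts: 'block5...' is VGG16's layer naming scheme, while InceptionV3 names its layers 'conv2d_...', 'batch_normalization_...', and 'mixed0' through 'mixed10', so a `startswith('block5')` filter selects nothing there. The sketch below uses short hand-picked name lists to illustrate; the real names come from iterating `conv_base.layers` on the loaded model:

```python
# Hand-picked name lists for illustration only; in practice you would print
# [l.name for l in conv_base.layers] from the loaded model.
vgg16_names = ['block4_pool', 'block5_conv1', 'block5_conv2', 'block5_pool']
inception_names = ['conv2d_93', 'mixed8', 'conv2d_94', 'mixed9', 'mixed10']

block5_hits = [n for n in inception_names if n.startswith('block5')]
print(block5_hits)  # [] -- the filter unfreezes nothing on InceptionV3

# Unfreezing everything from 'mixed8' onward is one workable substitute:
trainable, flags = False, {}
for name in inception_names:
    trainable = trainable or name == 'mixed8'
    flags[name] = trainable
print(flags)
```

Remember also that the base model itself must have `trainable = True` before per-layer flags take effect in Keras.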

<inceptionv3_512.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(512,512),batch_size=10,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(512,512),batch_size=10,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(512,512),batch_size=10,class_mode='binary',
    shuffle=False  # keep directory order so .classes matches predict()
)

from tensorflow.keras.applications import InceptionV3

# model definition
input_shape = [512,512,3] # input image shape
def build_model():
    model=models.Sequential()
    conv_base = InceptionV3(weights='imagenet',
                      include_top=False,
                      input_shape=input_shape)
    conv_base.trainable=False
    model.add(conv_base)
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(512))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(128,activation='relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(1, activation='sigmoid'))
    # compile
    model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-4),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# learning
import time
starttime = time.time()
num_epochs = 100
model = build_model()
history = model.fit(train_generator,
                    epochs=num_epochs,
                    validation_data=validation_generator)
# saving the model
model.save('pneumonia_q3_512.h5')

# evaluation
train_loss, train_acc = model.evaluate(train_generator)
test_loss, test_acc = model.evaluate(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['accuracy'])
    plt.plot(h.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q3.512.before-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q3.512.before-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)

<inceptionv3_512_fine.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import load_model

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,target_size=(512,512),batch_size=10,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,target_size=(512,512),batch_size=10,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,target_size=(512,512),batch_size=10,class_mode='binary',
    shuffle=False  # keep directory order so .classes matches predict()
)

# loading the model
model = load_model('pneumonia_q3_512.h5')

conv_base = model.layers[0]
# InceptionV3 has no 'block5*' layers (that naming is VGG16's), so a
# startswith('block5') test unfreezes nothing; unfreeze from 'mixed8' instead.
conv_base.trainable = True
set_trainable = False
for layer in conv_base.layers:
    set_trainable = set_trainable or layer.name == 'mixed8'
    layer.trainable = set_trainable

model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-5),
              loss='binary_crossentropy', metrics=['accuracy'])

# main loop without cross_validation
import time
starttime=time.time()
num_epochs=100
history = model.fit(train_generator,
                    epochs=num_epochs,
                    validation_data=validation_generator)

model.save('pneumonia_q3_512_fine.h5')
# evaluation

train_loss, train_acc = model.evaluate(train_generator)
test_loss, test_acc = model.evaluate(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['accuracy'])
    plt.plot(h.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Q3.512.after-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Q3.512.after-FT.acc.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)

<inceptionv3_0.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
# note: with no target_size, flow_from_directory resizes to its default of
# (256,256), so even this "any size" model actually trains on fixed-size batches
train_generator = train_datagen.flow_from_directory(
    train_dir,batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,batch_size=20,class_mode='binary',
    shuffle=False  # keep directory order so .classes matches predict()
)

from tensorflow.keras.applications import InceptionV3

# model definition
input_shape = [None,None,3] # no fixed image size
def build_model():
    model=models.Sequential()
    conv_base = InceptionV3(weights='imagenet',
                      include_top=False,
                      input_shape=input_shape)
    conv_base.trainable=False
    model.add(conv_base)
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(512))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(128,activation='relu'))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(1, activation='sigmoid'))
    # compile
    model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-4),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# learning
import time
starttime = time.time()
num_epochs = 100
model = build_model()
history = model.fit(train_generator,
                    epochs=num_epochs,
                    validation_data=validation_generator)
# saving the model
model.save('pneumonia_qe1_0.h5')

# evaluation
train_loss, train_acc = model.evaluate(train_generator)
test_loss, test_acc = model.evaluate(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['accuracy'])
    plt.plot(h.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Qe1.before-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Qe1.before-FT.accuracy.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)
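Why can inceptionv3_0.py declare input_shape = [None, None, 3] at all? Because GlobalAveragePooling2D collapses the spatial axes into one value per channel, the classifier head receives the same fixed-length vector no matter what image size the convolutional base was fed. A minimal numpy sketch of that reduction:

```python
import numpy as np

# Global average pooling: mean over height and width, one value per channel.
def gap(feature_map):                      # (H, W, C) -> (C,)
    return feature_map.mean(axis=(0, 1))

small = gap(np.random.rand(6, 6, 2048))    # e.g. feature map from a smaller input
large = gap(np.random.rand(14, 14, 2048))  # e.g. feature map from a larger input
print(small.shape, large.shape)            # both (2048,)
```

In this particular script, though, the flexibility goes unused: since no target_size is passed, flow_from_directory resizes everything to its (256,256) default anyway.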

<inceptionv3_0_fine.py>

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization, Dropout, Activation
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import load_model

# set image generators
train_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/train/'
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,batch_size=20,class_mode='binary'
)
validation_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/val/'
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,batch_size=20,class_mode='binary'
)
test_dir='/home/ericsungho/PycharmProjects/pythonProject/datasets/chest_xray/chest_xray/test/'
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    test_dir,batch_size=20,class_mode='binary',
    shuffle=False  # keep directory order so .classes matches predict()
)

# loading the model
model = load_model('pneumonia_qe1_0.h5')

conv_base = model.layers[0]
# InceptionV3 has no 'block5*' layers (that naming is VGG16's), so a
# startswith('block5') test unfreezes nothing; unfreeze from 'mixed8' instead.
conv_base.trainable = True
set_trainable = False
for layer in conv_base.layers:
    set_trainable = set_trainable or layer.name == 'mixed8'
    layer.trainable = set_trainable

model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-5),
              loss='binary_crossentropy', metrics=['accuracy'])

# main loop without cross_validation
import time
starttime=time.time()
num_epochs=100
history = model.fit(train_generator,
                    epochs=num_epochs,
                    validation_data=validation_generator)

model.save('pneumonia_qe1_0_fine.h5')
# evaluation

train_loss, train_acc = model.evaluate(train_generator)
test_loss, test_acc = model.evaluate(test_generator)
print('train_acc:',train_acc)
print('train_loss:',train_loss)
print('test_acc:',test_acc)
print('test_loss:',test_loss)
print("elapsed time (in sec):",time.time()-starttime)

# visualization

def plot_acc(h, title="accuracy"):
    plt.plot(h.history['accuracy'])
    plt.plot(h.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

def plot_loss(h, title="loss"):
    plt.plot(h.history['loss'])
    plt.plot(h.history['val_loss'])
    plt.title(title)
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Training','Validation'],loc=0)

plot_loss(history)
plt.savefig('Qe1.after-FT.loss.png')
plt.clf()
plot_acc(history)
plt.savefig('Qe1.after-FT.acc.png')
plt.clf()

# Scores
from sklearn.metrics import confusion_matrix,roc_auc_score
y_pred = model.predict(test_generator)
matrix = confusion_matrix(test_generator.classes, y_pred>0.5)
auc = roc_auc_score(test_generator.classes, y_pred)
print('matrix:',matrix)
print('auc:', auc)
Model                             Test Acc  Precision  Recall  Specificity  F1 Score  AUC
[128,128]  before FT, no dropout   0.8462    0.6320    0.7487     0.2735     0.6854   0.5231
[128,128]  after FT,  no dropout   0.8333    0.6318    0.7744     0.2479     0.6959   0.5119
[128,128]  before FT               0.6634    0.6289    0.9256     0.0897     0.7489   0.4949
[128,128]  after FT                0.6955    0.6209    0.8821     0.1026     0.7288   0.4953
[256,256]  before FT               0.6426    0.6269    0.9436     0.0641     0.7533   0.5249
[256,256]  after FT                0.6763    0.6268    0.9128     0.0940     0.7432   0.4969
[512,512]  before FT               0.7259    0.6243    0.8821     0.1154     0.7311   0.4862
[512,512]  after FT                0.7276    0.6154    0.8615     0.1026     0.7179   0.4856
[None,None] before FT              0.6250    0.6247    0.9949     0.0042     0.7676   0.5095
[None,None] after FT               0.6314    0.6242    0.9795     0.0171     0.7625   0.5013

The Precision scores are nearly identical in every case, so Precision carried little weight in choosing the best model. Comparing Recall, which shows how few actual patients are missed, every model except the 128x128 models without dropout scores high, so Recall alone was not decisive either. Instead, comparing the F1 score, the harmonic mean of the two, the 256x256 models and the models trained without a fixed image size score highest. However, the None models have a Specificity, the metric showing how well actual normals are identified as normal, that is far too low, and the 256x256 model also scores poorly before fine-tuning. The fine-tuned 256x256 model was therefore selected as the best model.
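The selection logic above can be stated compactly: the no-fixed-size model tops the F1 column but its near-zero Specificity rules it out, leaving the fine-tuned 256x256 model. A small sketch over the after-fine-tuning rows of the table (the 0.05 specificity cutoff is an arbitrary illustration, not a threshold from the report):

```python
# After-fine-tuning scores copied from the table above.
models = {
    '[128,128]':   {'f1': 0.7288, 'specificity': 0.1026},
    '[256,256]':   {'f1': 0.7432, 'specificity': 0.0940},
    '[512,512]':   {'f1': 0.7179, 'specificity': 0.1026},
    '[None,None]': {'f1': 0.7625, 'specificity': 0.0171},
}

best_by_f1 = max(models, key=lambda m: models[m]['f1'])
print(best_by_f1)  # [None,None] wins on F1 alone...

# ...but its specificity is far below the rest, so exclude such models first
# (0.05 is an arbitrary illustrative cutoff):
candidates = {m: s for m, s in models.items() if s['specificity'] >= 0.05}
best = max(candidates, key=lambda m: candidates[m]['f1'])
print(best)        # [256,256]
```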
