Training a convolutional neural network

The Python script training_template.py is used to train a new convolutional neural network (conv-net). This notebook walks through that script step by step.

The first step is to load the modules we need.

from __future__ import print_function
from keras.optimizers import SGD, RMSprop

from cnn_functions import rate_scheduler, train_model_sample
from model_zoo import bn_feature_net_61x61 as the_model

import os
import datetime
import numpy as np

The deep learning model we are using is loaded from the model_zoo.py file; here it is the bn_feature_net_61x61 function. As a rule of thumb, the receptive field of the neural network (61 x 61 pixels) should roughly match the size of the cell.
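To make the receptive field rule of thumb concrete, here is a minimal sketch of how the receptive field of a stack of convolution and pooling layers can be computed. The layer list below is a hypothetical example chosen so that the result is 61 pixels; it is not read from model_zoo.py and may not match the actual bn_feature_net_61x61 architecture.

```python
def receptive_field(layers):
    """Return the receptive field (in input pixels) of a stack of layers.

    Each layer is a (kernel_size, stride) pair, listed from input to output.
    """
    field = 1  # receptive field of a single output unit, in input pixels
    jump = 1   # spacing, in input pixels, between neighbouring units
    for kernel_size, stride in layers:
        field += (kernel_size - 1) * jump
        jump *= stride
    return field


# Hypothetical stack (an assumption for illustration only):
# convolutions have stride 1, 2x2 max pooling has stride 2.
hypothetical_layers = [
    (3, 1), (4, 1), (2, 2),
    (3, 1), (3, 1), (2, 2),
    (3, 1), (3, 1), (2, 2),
    (4, 1),
]

print(receptive_field(hypothetical_layers))  # prints 61
```

If the cells in a dataset are much larger or smaller than this window, the same calculation can be used to check whether a different network from model_zoo.py would be a better fit.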

We next define the batch size (the number of images the conv-net processes at once) and the number of epochs (the number of times we cycle through the entire training data set during training).

batch_size = 256
n_epoch = 25
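As a rough illustration of how these two numbers interact, the sketch below counts the gradient updates performed during training. The training set size n_train is a made-up value for the example, not the size of the actual dataset.

```python
import math

batch_size = 256
n_epoch = 25
n_train = 100000  # hypothetical number of training images (assumption)

# One update per batch; the last batch of an epoch may be smaller.
updates_per_epoch = int(math.ceil(n_train / float(batch_size)))
total_updates = updates_per_epoch * n_epoch

print(updates_per_epoch, total_updates)
```

A larger batch size means fewer, less noisy updates per epoch, so batch size and number of epochs are usually tuned together.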

Next, we choose the training dataset, record the name of the network, and define the directories where the training data resides and where the trained network parameters will be saved.

dataset = "3T3_all_61x61"
expt = "bn_feature_net_61x61"

direc_save = "/home/vanvalen/DeepCell2/trained_networks/"
direc_data = "/home/vanvalen/DeepCell2/training_data_npz/"
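The sketch below shows one way these four strings could be combined into file paths. The naming convention (a dataset .npz file, and a dated .h5 file per trained model) and the helper build_paths are assumptions for illustration; the convention actually used by train_model_sample is defined in cnn_functions.py and may differ.

```python
import datetime
import os

dataset = "3T3_all_61x61"
expt = "bn_feature_net_61x61"
direc_save = "/home/vanvalen/DeepCell2/trained_networks/"
direc_data = "/home/vanvalen/DeepCell2/training_data_npz/"


def build_paths(direc_data, direc_save, dataset, expt, it):
    """Hypothetical helper: build the data path and a save path for model `it`."""
    # Assumed convention: training data stored as <dataset>.npz
    data_file = os.path.join(direc_data, dataset + ".npz")
    # Assumed convention: <date>_<dataset>_<expt>_<it>.h5
    today = datetime.date.today().isoformat()
    save_name = "_".join([today, dataset, expt, str(it)]) + ".h5"
    return data_file, os.path.join(direc_save, save_name)


data_file, save_file = build_paths(direc_data, direc_save, dataset, expt, 0)
print(data_file)
print(save_file)
```

Including the dataset, network name, and model index in the file name keeps the five trained models of one experiment easy to tell apart.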

Next, we define which optimizer we will use for training. In our experience, stochastic gradient descent works well for batch normalized networks while RMSprop appears to work better for non-batch normalized networks.

optimizer = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
lr_sched = rate_scheduler(lr = 0.01, decay = 0.95)
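The rate_scheduler function is defined in cnn_functions.py and is not shown here. As a sketch of what such a schedule might look like, the code below assumes an exponential decay in which the learning rate is multiplied by decay once per epoch; the name exponential_rate_scheduler is a hypothetical stand-in, not the library function.

```python
def exponential_rate_scheduler(lr=0.01, decay=0.95):
    """Hypothetical stand-in for rate_scheduler (assumed exponential decay)."""
    def schedule(epoch):
        # Learning rate shrinks by a factor of `decay` every epoch.
        return lr * (decay ** int(epoch))
    return schedule


schedule = exponential_rate_scheduler(lr=0.01, decay=0.95)
for epoch in (0, 1, 10, 24):
    print(epoch, schedule(epoch))
```

A function of this shape, mapping an epoch index to a learning rate, is what Keras's LearningRateScheduler callback expects, which is presumably how lr_sched is consumed during training.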

Finally, we train the conv-net model on our training data. We typically train 5 models at a time so we can average their predictions; we've found that using an ensemble of models in this fashion leads to more robust segmentation. Because our training data has 2 channels (phase image and nuclear marker) and we are looking for 3 features (edge, interior, and background), we need to pass those values as n_channels and n_features. The last piece of code resets the layer-name counters so that the layers of each conv-net are named consistently.

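from keras.backend.common import _UID_PREFIXES

# Train 5 independent models; range works in both Python 2 and Python 3
for iterate in range(5):

	# Build a fresh network: 2 input channels, 3 output features
	model = the_model(n_channels = 2, n_features = 3, reg = 1e-5)

	# Train with rotation and flip augmentation; shearing is disabled
	train_model_sample(model = model, dataset = dataset, optimizer = optimizer,
		expt = expt, it = iterate, batch_size = batch_size, n_epoch = n_epoch,
		direc_save = direc_save,
		direc_data = direc_data,
		lr_sched = lr_sched,
		rotate = True, flip = True, shear = False)

	# Discard the model and reset the layer-name counters so the next
	# model's layers receive the same names
	del model
	for key in list(_UID_PREFIXES):
		_UID_PREFIXES[key] = 0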
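The averaging of predictions from the 5 trained models can be sketched as follows. This is a minimal illustration using random arrays in place of real network outputs; the array layout (n_features, height, width) and the helper average_predictions are assumptions, not part of cnn_functions.py.

```python
import numpy as np


def average_predictions(predictions):
    """Average per-pixel class probabilities from several models.

    predictions: list of arrays, each of shape (n_features, height, width).
    """
    stacked = np.stack(predictions, axis=0)
    return stacked.mean(axis=0)


# Fake outputs from 5 models: 3 features (edge, interior, background)
# on a 4 x 4 image, normalized so each pixel's probabilities sum to 1.
rng = np.random.RandomState(0)
fake_predictions = []
for _ in range(5):
    scores = rng.rand(3, 4, 4)
    fake_predictions.append(scores / scores.sum(axis=0, keepdims=True))

averaged = average_predictions(fake_predictions)
labels = averaged.argmax(axis=0)  # most likely feature at each pixel

print(averaged.shape, labels.shape)
```

Because each model starts from a different random initialization, averaging tends to smooth out errors that only one of the models makes.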