How a team of deep learning newbies took 3rd place in a Kaggle contest

Train model for 5 cycles with learning rate = 1e-2

We’ll train our model for 5 epochs (5 cycles through all our data) using the fit_one_cycle function.
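To give a feel for what a one-cycle schedule does with the learning rate, here is a minimal pure-Python sketch. It is a simplification and not the library’s actual implementation: the function name `one_cycle_lr`, the parameters `warmup_frac`, `start_div` and `end_div`, their default values, and the batch count are all assumptions chosen for illustration. The rate warms up to the maximum (1e-2 here) and then anneals back down, both along a cosine curve.

```python
import math


def cosine_interp(start, end, pct):
    """Move from start to end along half a cosine wave as pct goes 0 -> 1."""
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)


def one_cycle_lr(step, total_steps, max_lr=1e-2,
                 warmup_frac=0.3, start_div=25.0, end_div=1e4):
    """Learning rate at a given step of a simplified one-cycle schedule.

    All parameter names and defaults are illustrative assumptions.
    """
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        # Phase 1: warm up from max_lr / start_div to max_lr.
        return cosine_interp(max_lr / start_div, max_lr, step / warmup_steps)
    # Phase 2: anneal from max_lr down to max_lr / end_div.
    pct = (step - warmup_steps) / max(1, total_steps - warmup_steps - 1)
    return cosine_interp(max_lr, max_lr / end_div, pct)


# 5 epochs of, say, 100 batches each (the batch count is made up).
schedule = [one_cycle_lr(s, 500) for s in range(500)]
print(f"start={schedule[0]:.6f} peak={max(schedule):.6f} end={schedule[-1]:.8f}")
```

The point of the shape is that training starts gently, spends the middle of the run at a high learning rate, and finishes with very small steps.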

Training and validation losses

Notice the metrics being displayed, i.e. training_loss and valid_loss? We use them to monitor how the model improves over time.

Our best model was obtained at the 4th epoch.

The output of training our model; the progression of training and validation losses

The library internally picks and saves only your best models when you run through your training and validation datasets.
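Picking the best epoch boils down to tracking the lowest validation loss seen so far. The sketch below shows the idea in plain Python. The loss values are made-up placeholders (not our actual training output), chosen only so that the 4th epoch comes out best, and `best_epoch` is a hypothetical helper rather than a library function.

```python
def best_epoch(valid_losses):
    """Return (1-based epoch number, loss) of the lowest validation loss."""
    best_idx, best_loss = None, float("inf")
    for idx, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            # A real training loop would save a checkpoint at this point.
            best_idx, best_loss = idx, loss
    return best_idx, best_loss


# Placeholder values for illustration only, NOT the real training log.
example_valid_losses = [0.060, 0.041, 0.033, 0.021, 0.024]
epoch, loss = best_epoch(example_valid_losses)
print(f"best epoch: {epoch}, valid_loss: {loss}")
```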

Evaluating our Model

The competition submissions were evaluated on the Area Under the Receiver Operating Characteristic Curve (AUC) between the predicted probability and the observed target, has_oilpalm. Learn more about AUC in this developer crash course, this video, or this Kaggle Learn Forum post. The library we train with doesn’t come with this metric by default, so we’ll use scikit-learn.
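To make the metric concrete: AUC is the probability that a randomly chosen positive example receives a higher predicted score than a randomly chosen negative one, with ties counting as half. The pure-Python sketch below computes it directly from that definition on a tiny made-up example. `pairwise_auc` is a hypothetical helper name. In practice, scikit-learn’s `roc_auc_score(y_true, y_score)` from `sklearn.metrics` computes the same quantity far more efficiently.

```python
def pairwise_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Toy labels (1 = has_oilpalm) and predicted probabilities, made up here.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(pairwise_auc(labels, scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

The double loop is quadratic in the number of images, so it is only meant for understanding the metric, not for scoring a full validation set.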

Print out validation metrics

The beauty of using pre-trained models is that you get really good prediction accuracy, 99.44% in our case, without much extra work.

The metrics for the first stage of training

Let’s save our model as 'resnet50-stg1' and plot a confusion matrix of the predictions made.

View Results using confusion matrices

Plot confusion matrix

A confusion matrix is a graphical way of seeing the number of images your model accurately or inaccurately predicts.

Confusion matrix for the first stage of training

From this plot, we can see that the model correctly predicted 2,863 images as not having oil palm plantations, and correctly classified 168 images that did contain them. 10 images that contained oil palm plantations were classified as not having any, and 7 images without oil palm plantations were classified as having them.
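As a quick sanity check, the four counts from this confusion matrix are enough to recompute the headline metrics by hand. The short sketch below does that arithmetic. Recall and precision are not reported elsewhere in this post; they are included only as extra illustrations of what the matrix can tell you.

```python
# Counts read off the first-stage confusion matrix described above.
tn = 2863  # no oil palm, predicted no oil palm
tp = 168   # oil palm, predicted oil palm
fn = 10    # oil palm, predicted no oil palm
fp = 7     # no oil palm, predicted oil palm

total = tn + tp + fn + fp
accuracy = (tn + tp) / total
error_rate = 1 - accuracy
recall = tp / (tp + fn)      # share of oil palm images that were found
precision = tp / (tp + fp)   # share of oil palm predictions that were right

print(f"images: {total}")
print(f"accuracy: {accuracy:.4f}, error rate: {error_rate:.4f}")
print(f"recall: {recall:.4f}, precision: {precision:.4f}")
```

The accuracy works out to 3,031 correct out of 3,048 images, which matches the 99.44% reported for this stage.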

Not bad for a simple model 🥳.

Next, we find an ideal learning rate for this training iteration.

Find an ideal learning rate
We choose a learning rate that’s between 1e-6 and 1e-4

Fit the model for 7 epochs using a maximum learning rate between 1e-6 and 1e-4.
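One common way to use a learning-rate range like this (an assumption here, since the exact call isn’t reproduced in this post) is to give the earlier layers of the pre-trained network smaller rates and the later layers larger ones, spacing the values evenly on a log scale. The sketch below shows that spacing in plain Python. `spread_learning_rates` is a hypothetical helper, and splitting the network into three layer groups is an illustrative choice.

```python
def spread_learning_rates(low, high, n_groups):
    """Spread n_groups learning rates from low to high, evenly on a log scale."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    if n_groups == 1:
        return [high]
    ratio = (high / low) ** (1 / (n_groups - 1))
    return [low * ratio ** i for i in range(n_groups)]


# Three layer groups is an illustrative assumption.
rates = spread_learning_rates(1e-6, 1e-4, 3)
for group, lr in enumerate(rates):
    print(f"layer group {group}: lr = {lr:.1e}")
```

The intuition is that early layers already hold general-purpose features from pre-training and need only small adjustments, while later layers are more task-specific and can move faster.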

Train the model for 7 cycles; the learning rate should stay within the range of 1e-6 to 1e-4
Training and validation losses

Observe the training metrics graphically to monitor the model’s performance after each training cycle.

The output of training our model; the progression of training and validation losses

Save the second stage of model training as 'resnet50-stg2'.
Accuracy, error rate and AUC score

Print out the model’s accuracy, error rate and Area Under the Curve metrics.

The metrics for the second stage of training

As you’ll notice, the model’s accuracy improved from 99.44% to 99.48%. The error rate dropped from 0.0056 to 0.0052. The AUC also improved, from 99.82% to 99.87%.

Plot confusion matrix

Upon comparison with the last confusion matrix we plotted, you’ll notice that the model makes better predictions.

Confusion matrix for the second stage of training

Formerly, 7 images that didn’t have oil palm plantations were misclassified; now we are down to 3. That’s progress.

You’ll notice we’ve been following a pattern during training and tweaking a few parameters in the course of this. What we’ve been doing is called tuning. Most deep learning experiments follow a similar iterative pattern.