Tensorflow2 Crash Course - Part V

Mong Kok, Hongkong


This set of notebooks provides the complete code needed to train and use your own custom object detection model with the Tensorflow Object Detection API.

This article is based on a tutorial by @nicknochnack.

Github Repository

Performance Tuning

Adding more Images for low performing Classes

Add and label new images, copy them into the training folder, then re-run the training:

source tfod/bin/activate 
python 02_training_the_model.py

Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.03s).
Accumulating evaluation results...
DONE (t=0.02s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.706
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 1.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.881
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.706
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.713
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.725
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.744
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.744

Interestingly, the evaluation results were identical to what I had before.
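To decide which classes need more images, it can help to count how many labelled boxes each class has in the training folder. The following is a minimal sketch, assuming the annotations are Pascal VOC XML files (as written by tools such as labelImg) stored in one folder. The folder path and the helper name `count_labels` are hypothetical and not taken from the original scripts.

```python
from collections import Counter
from pathlib import Path
import xml.etree.ElementTree as ET


def count_labels(annotation_dir):
    """Count labelled objects per class across all Pascal VOC XML files."""
    counts = Counter()
    for xml_file in sorted(Path(annotation_dir).glob("*.xml")):
        root = ET.parse(xml_file).getroot()
        # Each <object> element describes one bounding box; <name> is its class.
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical path; adjust it to wherever your training images live.
    train_dir = Path("Tensorflow/workspace/images/train")
    if train_dir.is_dir():
        for label, n in count_labels(train_dir).most_common():
            print(f"{label}: {n}")
```

Classes that show up with far fewer boxes than the others are the first candidates for additional images.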

Rerun Training with more Steps

The training script contains the training command:

training_command = "python {} --model_dir={} --pipeline_config_path={} --num_train_steps=2000".format(TRAINING_SCRIPT, paths['CHECKPOINT_PATH'],files['PIPELINE_CONFIG'])

Increase the number of training steps to the desired value:

--num_train_steps=10000
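Instead of editing the string by hand, the step count could be passed in as a parameter. This is only a sketch: the function name `build_training_command` and the placeholder arguments are hypothetical, and the three flags are the ones that appear in the training command above.

```python
def build_training_command(training_script, model_dir, pipeline_config,
                           num_train_steps=2000):
    """Return the training command string with a configurable step count."""
    if num_train_steps <= 0:
        raise ValueError("num_train_steps must be positive")
    return (
        "python {} --model_dir={} --pipeline_config_path={} --num_train_steps={}"
    ).format(training_script, model_dir, pipeline_config, num_train_steps)


# Placeholder values standing in for TRAINING_SCRIPT, paths['CHECKPOINT_PATH']
# and files['PIPELINE_CONFIG'] from the training script.
print(build_training_command("path/to/training_script.py",
                             "path/to/model_dir",
                             "path/to/pipeline.config",
                             num_train_steps=10000))
```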

The run takes approximately 20 minutes:

INFO:tensorflow:Step 7000 per-step time 0.124s
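The 20-minute figure follows from the logged per-step time. As a rough sketch (the helper name is made up, and it assumes the per-step time stays constant and ignores startup and evaluation overhead):

```python
def estimated_training_minutes(num_steps, seconds_per_step):
    """Estimate wall-clock training time in minutes from the per-step time."""
    return num_steps * seconds_per_step / 60.0


# 10000 steps at 0.124 s per step, as reported in the log line above.
print(round(estimated_training_minutes(10000, 0.124), 1))  # 20.7
```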

After 10,000 steps I got the following metrics (precision/recall: 0.759/0.781):

Accumulating evaluation results...
DONE (t=0.02s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.759
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 1.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.851
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.759
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.775
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.781
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.781
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.781

Precision: True Positive / (True Positive + False Positive)
Recall: True Positive / (True Positive + False Negative)
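The two definitions translate directly into code. The counts below are made up purely for illustration and do not come from the evaluation runs in this article:

```python
def precision(true_positives, false_positives):
    """Fraction of all detections that were correct."""
    total = true_positives + false_positives
    return true_positives / total if total else 0.0


def recall(true_positives, false_negatives):
    """Fraction of all ground-truth objects that were found."""
    total = true_positives + false_negatives
    return true_positives / total if total else 0.0


# Made-up example: 8 correct detections, 2 false alarms, 4 missed objects.
print(precision(8, 2))         # 0.8
print(round(recall(8, 4), 3))  # 0.667
```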

Metric (IoU=0.50:0.95, area=all, maxDets=100)	2000 steps	10000 steps
Average Precision	0.706	0.759
Average Recall	0.744	0.781
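The IoU values in the evaluation log refer to intersection over union: the overlap between a predicted box and a ground-truth box divided by the area the two cover together. A detection only counts as a true positive if its IoU reaches the given threshold, and IoU=0.50:0.95 means the metric is averaged over thresholds from 0.50 to 0.95. A minimal sketch, assuming boxes are given as (xmin, ymin, xmax, ymax) tuples, which is an assumption of this example rather than the format used by the API:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (xmin, ymin, xmax, ymax)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


# Two 10x10 boxes that overlap by half: 50 / 150.
print(round(iou((0, 0, 10, 10), (5, 0, 15, 10)), 3))  # 0.333
```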

Changing model architecture using a different pre-trained model as a starting point

Detection Models: Tensorflow provides a collection of detection models pre-trained on the COCO 2017 dataset. So far I have been using the SSD MobileNet V2 FPNLite 320x320. I will now replace it with the slightly slower but more accurate SSD MobileNet V2 FPNLite 640x640:

Model name	Speed (ms)	COCO mAP	Outputs
SSD MobileNet V2 FPNLite 320x320	22	22.2	Boxes
SSD MobileNet V2 FPNLite 640x640	39	28.2	Boxes

How do I proceed from here? I added the model to my training script and re-ran it:

PRETRAINED_MODEL_NAME = 'ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8'
PRETRAINED_MODEL_URL = 'http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8.tar.gz'
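Since the archive name repeats the model name, the URL could be derived from it so the two constants cannot drift apart. This is a sketch under the assumption that other models follow the same URL pattern as the one above, which is not guaranteed; `pretrained_model_url` is a hypothetical helper.

```python
# Base URL taken from the PRETRAINED_MODEL_URL above; other models may live elsewhere.
BASE_URL = "http://download.tensorflow.org/models/object_detection/tf2/20200711/"


def pretrained_model_url(model_name, base_url=BASE_URL):
    """Build the download URL for a model archive named <model_name>.tar.gz."""
    return "{}{}.tar.gz".format(base_url, model_name)


PRETRAINED_MODEL_NAME = 'ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8'
PRETRAINED_MODEL_URL = pretrained_model_url(PRETRAINED_MODEL_NAME)
print(PRETRAINED_MODEL_URL)
```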

I can see that the model was downloaded and the pipeline file was updated. However, it seems that the training itself was not executed, and the evaluation (precision/recall) dropped to 0.478/0.550:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.478
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.664
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.530
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.478
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.512
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.550
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.550
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.550

I deleted the entire content of Tensorflow/workspace/models/my_ssd_mobnet to get rid of all the checkpoint data from the old model and re-ran the training script. This seems to work: I can see the training steps again. As expected, they are a lot slower than with the old model:

INFO:tensorflow:Step 4000 per-step time 0.446s

This results in a training time of about 75 minutes. The result is (0.752/0.769):

Accumulating evaluation results...
DONE (t=0.02s).
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.752
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 1.000
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.876
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.752
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.762
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.762
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.769
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.769
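The cleanup step described above (emptying Tensorflow/workspace/models/my_ssd_mobnet before switching models) can be sketched as follows. A likely reason it was needed, assuming the training script resumes from the latest checkpoint it finds in the model directory, is that the old checkpoints had already reached the requested number of steps. The helper name `clear_directory` is hypothetical.

```python
import shutil
from pathlib import Path


def clear_directory(directory):
    """Delete everything inside `directory` but keep the directory itself.

    Returns the sorted names of the removed entries.
    """
    directory = Path(directory)
    removed = []
    if not directory.is_dir():
        return removed
    for entry in directory.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove sub-folders recursively
        else:
            entry.unlink()        # remove files and symlinks
        removed.append(entry.name)
    return sorted(removed)


# Destructive, so left commented out here:
# clear_directory("Tensorflow/workspace/models/my_ssd_mobnet")
```

Anything else stored in that folder is deleted as well, so files such as the pipeline configuration have to be recreated by the training script afterwards.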

Results

The results I am getting are a bit confusing. At first the 320 model performed slightly worse than the 640, as expected. I then deleted the training data for it and re-ran the 320. This time I am getting the opposite result, as presented above. Is Tensorflow storing training data outside of the designated training folder? Or is the training in general this inconsistent, so that it needs to run for a much longer time? Right now I am getting slightly better results after the 20-minute training of the 320 model (0.759/0.781) than after the 75 minutes for the 640 (0.752/0.769)...

'SSD MobileNet V2 FPNLite 320x320' vs 'SSD MobileNet V2 FPNLite 640x640'

Model name	Speed (ms)	COCO mAP	Outputs	Average Precision	Average Recall
SSD MobileNet V2 FPNLite 320x320	22	22.2	Boxes	0.759	0.781
SSD MobileNet V2 FPNLite 640x640	39	28.2	Boxes	0.752	0.769
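To put the two rows side by side, the differences can be computed directly from the numbers reported in this article (metrics from the table, per-step times from the training logs). The dictionary layout and the `compare` helper are only an illustration:

```python
results = {
    "320x320": {"ap": 0.759, "ar": 0.781, "sec_per_step": 0.124},
    "640x640": {"ap": 0.752, "ar": 0.769, "sec_per_step": 0.446},
}


def compare(a, b):
    """Metric differences (a minus b) and how much slower b trains per step."""
    return {
        "ap_diff": round(a["ap"] - b["ap"], 3),
        "ar_diff": round(a["ar"] - b["ar"], 3),
        "step_time_ratio": round(b["sec_per_step"] / a["sec_per_step"], 2),
    }


print(compare(results["320x320"], results["640x640"]))
```

The gap in precision and recall is on the order of 0.01, while the 640 model needs several times longer per step. With a small evaluation set, differences of this size may well fall within run-to-run variation.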

Spock

The test run makes it even more confusing: the 640 seems to perform better here, but there is actually a miss at the end.

SSD MobileNet V2 FPNLite 320x320

SSD MobileNet V2 FPNLite 640x640
