
Retraining SSD-MobileNet and Faster RCNN models

The pre-trained TensorFlow Object Detection models certainly work well for some problems. But sometimes, you may need to use your own annotated dataset (with bounding boxes around objects or parts of objects that are of particular interest to you) and retrain an existing model so it can more accurately detect a different set of object classes.

We’ll use the same Oxford-IIIT Pet dataset, as documented on the TensorFlow Object Detection API site, to retrain two existing models on a local machine instead of on Google Cloud, which is what the documentation covers. We’ll also add an explanation for each step where needed. The following is a step-by-step guide to retraining a TensorFlow object detection model with the Oxford Pets dataset:

  1. In a Terminal window, preferably on a GPU-powered Ubuntu machine to make the retraining faster, first cd to models/research, then run the following commands to download the dataset (images.tar.gz is about 800MB and annotations.tar.gz is about 38MB):
wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz 
wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz
tar -xvf images.tar.gz
tar -xvf annotations.tar.gz
  2. Run the following command to convert the dataset to the TFRecord format:
python object_detection/dataset_tools/create_pet_tf_record.py \
--label_map_path=object_detection/data/pet_label_map.pbtxt \
--data_dir=`pwd` \
--output_dir=`pwd`

This command will generate two TFRecord files, named pet_train_with_masks.record (268MB) and pet_val_with_masks.record (110MB), in the models/research directory. TFRecord is a simple binary format that packages all the data a TensorFlow app needs for training or validation, and it is the file format required when retraining a model on your own dataset with the TensorFlow Object Detection API.
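If you want to check what the conversion produced, here is a minimal sketch (assuming TensorFlow 1.x, which the Object Detection API scripts in this chapter use) that counts the records in the generated training file and prints the feature keys of the first one:

import tensorflow as tf

record_path = 'pet_train_with_masks.record'  # generated by create_pet_tf_record.py
count = 0
for record in tf.python_io.tf_record_iterator(record_path):
    if count == 0:
        example = tf.train.Example()
        example.ParseFromString(record)
        # each example holds the encoded image, its bounding boxes, and class labels
        print(sorted(example.features.feature.keys()))
    count += 1
print('total training examples:', count)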

  3. Download and unzip the ssd_mobilenet_v1_coco model and the faster_rcnn_resnet101_coco model into the models/research directory, if you haven’t already done so in the previous section when testing the object detection notebook (a quick way to verify the extracted checkpoints follows these commands):
wget http://storage.googleapis.com/download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2017_11_17.tar.gz
tar -xvf ssd_mobilenet_v1_coco_2017_11_17.tar.gz
wget http://storage.googleapis.com/download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_11_06_2017.tar.gz
tar -xvf faster_rcnn_resnet101_coco_11_06_2017.tar.gz
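To confirm that the extracted checkpoints sit where the config files will expect them in the next two steps, you can inspect them with a short sketch like the following (TensorFlow 1.x assumed; the paths are simply the extraction locations from the commands above):

import tensorflow as tf

for ckpt in ['faster_rcnn_resnet101_coco_11_06_2017/model.ckpt',
             'ssd_mobilenet_v1_coco_2017_11_17/model.ckpt']:
    # list the variables stored in each downloaded checkpoint
    reader = tf.train.NewCheckpointReader(ckpt)
    print(ckpt, '->', len(reader.get_variable_to_shape_map()), 'variables')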
  4. Replace the five occurrences of PATH_TO_BE_CONFIGURED in the object_detection/samples/configs/faster_rcnn_resnet101_pets.config file, so they become:
fine_tune_checkpoint: "faster_rcnn_resnet101_coco_11_06_2017/model.ckpt"
...
train_input_reader: {
  tf_record_input_reader {
    input_path: "pet_train_with_masks.record"
  }
  label_map_path: "object_detection/data/pet_label_map.pbtxt"
}
eval_input_reader: {
  tf_record_input_reader {
    input_path: "pet_val_with_masks.record"
  }
  label_map_path: "object_detection/data/pet_label_map.pbtxt"
  ...
}

The faster_rcnn_resnet101_pets.config file specifies three things: the location of the model’s checkpoint file, which contains the model’s trained weights; the TFRecord files for training and validation, generated in step 2; and the label map for the 37 classes of pets to be detected. The first and last items of object_detection/data/pet_label_map.pbtxt are as follows:

item {
  id: 1
  name: 'Abyssinian'
}
...
item {
  id: 37
  name: 'yorkshire_terrier'
}
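If you want to see how these entries map to a Python dictionary at runtime, the Object Detection API’s label_map_util can load the file into a category index. Here is a minimal sketch, run from the models/research directory:

from object_detection.utils import label_map_util

label_map = label_map_util.load_labelmap('object_detection/data/pet_label_map.pbtxt')
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=37, use_display_name=True)
category_index = label_map_util.create_category_index(categories)
print(category_index[1])   # {'id': 1, 'name': 'Abyssinian'}
print(category_index[37])  # {'id': 37, 'name': 'yorkshire_terrier'}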
  5. Similarly, change the five occurrences of PATH_TO_BE_CONFIGURED in the object_detection/samples/configs/ssd_mobilenet_v1_pets.config file, so they become (a sanity check for both edited config files follows this step):
fine_tune_checkpoint: "object_detection/ssd_mobilenet_v1_coco_2017_11_17/model.ckpt"
train_input_reader: {
  tf_record_input_reader {
    input_path: "pet_train_with_masks.record"
  }
  label_map_path: "object_detection/data/pet_label_map.pbtxt"
}
eval_input_reader: {
  tf_record_input_reader {
    input_path: "pet_val_with_masks.record"
  }
  label_map_path: "object_detection/data/pet_label_map.pbtxt"
  ...
}
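After editing both files, the following sketch parses each pipeline config and prints the checkpoint it will fine-tune from, so a typo surfaces here rather than at training time. It assumes you run it from models/research with the object_detection protos on your PYTHONPATH, just as train.py requires:

import tensorflow as tf
from google.protobuf import text_format
from object_detection.protos import pipeline_pb2

for config_path in ['object_detection/samples/configs/faster_rcnn_resnet101_pets.config',
                    'object_detection/samples/configs/ssd_mobilenet_v1_pets.config']:
    # parse the edited pipeline config and show the fine-tune checkpoint it points to
    pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
    with tf.gfile.GFile(config_path, 'r') as f:
        text_format.Merge(f.read(), pipeline_config)
    print(config_path, '->', pipeline_config.train_config.fine_tune_checkpoint)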
  6. Create a new train_dir_faster_rcnn directory, then run the retraining command:
python object_detection/train.py \
--logtostderr \
--pipeline_config_path=object_detection/samples/configs/faster_rcnn_resnet101_pets.config \
--train_dir=train_dir_faster_rcnn

On a GPU-powered system, it takes fewer than 25,000 training steps for the loss to drop from an initial value of about 5.0 to around 0.2:

tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.7845
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 7.44GiB
INFO:tensorflow:global step 1: loss = 5.1661 (15.482 sec/step)
INFO:tensorflow:global step 2: loss = 4.6045 (0.927 sec/step)
INFO:tensorflow:global step 3: loss = 5.2665 (0.958 sec/step)
...
INFO:tensorflow:global step 25448: loss = 0.2042 (0.372 sec/step)
INFO:tensorflow:global step 25449: loss = 0.4230 (0.378 sec/step)
INFO:tensorflow:global step 25450: loss = 0.1240 (0.386 sec/step)
  7. Press Ctrl + C to stop the retraining script after about 20,000 steps (about 2 hours). Then create a new train_dir_ssd_mobilenet directory and run:
python object_detection/train.py \
--logtostderr \
--pipeline_config_path=object_detection/samples/configs/ssd_mobilenet_v1_pets.config \
--train_dir=train_dir_ssd_mobilenet

The training results should look like the following:

INFO:tensorflow:global step 1: loss = 136.2856 (23.130 sec/step)
INFO:tensorflow:global step 2: loss = 126.9009 (0.633 sec/step)
INFO:tensorflow:global step 3: loss = 119.0644 (0.741 sec/step)
...
INFO:tensorflow:global step 22310: loss = 1.5473 (0.460 sec/step)
INFO:tensorflow:global step 22311: loss = 2.0510 (0.456 sec/step)
INFO:tensorflow:global step 22312: loss = 1.6745 (0.461 sec/step)

You can see that the retraining of the SSD-MobileNet model both starts and ends with a larger loss than that of the Faster RCNN model.

  8. Terminate the preceding retraining script after about 20,000 training steps. Then create a new eval_dir directory and run the evaluation script:
python object_detection/eval.py \
--logtostderr \
--pipeline_config_path=object_detection/samples/configs/faster_rcnn_resnet101_pets.config \
--checkpoint_dir=train_dir_faster_rcnn \
--eval_dir=eval_dir
  9. Open another Terminal window, cd from the TensorFlow root to models/research, and run tensorboard --logdir=. (note the dot, meaning the current directory). In a browser, open http://localhost:6006, and you’ll see the total loss graph, as shown in Figure 3.2:

Figure 3.2 Total loss trend when retraining an object detection model

You'll also see some evaluation results, as in Figure 3.3:

Figure 3.3 Evaluation image detection results when retraining an object detection model

  10. Similarly, you can run the evaluation script for the SSD-MobileNet model and then use TensorBoard to view its loss trend and evaluation image results:
python object_detection/eval.py \
--logtostderr \
--pipeline_config_path=object_detection/samples/configs/ssd_mobilenet_v1_pets.config \
--checkpoint_dir=train_dir_ssd_mobilenet \
--eval_dir=eval_dir_mobilenet
  11. Finally, generate the inference graphs for the two retrained models using the following commands:
python object_detection/export_inference_graph.py \
--input_type image_tensor \
--pipeline_config_path object_detection/samples/configs/ssd_mobilenet_v1_pets.config \
--trained_checkpoint_prefix train_dir_ssd_mobilenet/model.ckpt-21817 \
--output_directory output_inference_graph_ssd_mobilenet.pb

python object_detection/export_inference_graph.py \
--input_type image_tensor \
--pipeline_config_path object_detection/samples/configs/faster_rcnn_resnet101_pets.config \
--trained_checkpoint_prefix train_dir_faster_rcnn/model.ckpt-24009 \
--output_directory output_inference_graph_faster_rcnn.pb

You need to replace the --trained_checkpoint_prefix value (21817 and 24009 in the commands above) with your own checkpoint numbers.
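A convenient way to find that value is tf.train.latest_checkpoint. Also note that export_inference_graph.py typically writes the frozen graph as frozen_inference_graph.pb inside the directory you pass to --output_directory. The following minimal sketch (TensorFlow 1.x assumed; adjust the paths to your own output) prints the latest SSD checkpoint prefix and then loads the exported graph to confirm the export succeeded:

import tensorflow as tf

# print the prefix to pass to --trained_checkpoint_prefix,
# for example train_dir_ssd_mobilenet/model.ckpt-21817
print(tf.train.latest_checkpoint('train_dir_ssd_mobilenet'))

# load the exported frozen graph to make sure it is usable
graph_path = 'output_inference_graph_ssd_mobilenet.pb/frozen_inference_graph.pb'
detection_graph = tf.Graph()
with detection_graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(graph_path, 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')
print('loaded', len(detection_graph.get_operations()), 'operations')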

There you go: you now have two retrained object detection models, output_inference_graph_ssd_mobilenet.pb and output_inference_graph_faster_rcnn.pb, ready to be used in your Python code (such as the Jupyter Notebook in the previous section) or in your mobile apps. Without any further delay, let’s jump to the mobile world and see how to use the pre-trained and retrained models we have.