Paige Liu's Posts

Custom object detection using Tensorflow Object Detection API

Problem to solve

Given a collection of images containing a target object in many different shapes, lighting conditions, poses, and quantities, train a model so that, given a new image, a bounding box is drawn around each target object present in the image.

Steps to take

Step 1 - Label the images

You can use tools such as VoTT or LabelImg to label images. Here we use VoTT to output data in Pascal VOC format.
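For reference, Pascal VOC format means one XML annotation file per image. A minimal sketch of what such a file looks like (the file name, label, and coordinates below are made-up placeholders for illustration):

```xml
<annotation>
  <folder>images</folder>
  <filename>cat_001.jpg</filename>
  <size>
    <width>800</width>
    <height>600</height>
    <depth>3</depth>
  </size>
  <object>
    <name>cat</name>          <!-- the label you assigned in VoTT -->
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>                  <!-- pixel coordinates of the bounding box -->
      <xmin>120</xmin>
      <ymin>80</ymin>
      <xmax>430</xmax>
      <ymax>390</ymax>
    </bndbox>
  </object>
</annotation>
```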

Step 2 - Install Tensorflow Object Detection API

Instead of starting from scratch, pick an Azure Data Science VM (DSVM) or Deep Learning VM with a GPU attached. This saves a lot of setup steps because these VMs come with a plethora of machine learning frameworks and tools installed, including Tensorflow. We will use an Ubuntu 16.04 based DSVM here. As for the VM size, you can start with a small size such as DS2_v3, but when it's time to train, you'll need to scale it to a larger size; otherwise training on hundreds of images will probably take many days.

Step 3 - Prepare the labeled images as Tensorflow input

Tensorflow Object Detection API takes TFRecords as input, so we need to convert the Pascal VOC data to TFRecords. The conversion scripts are located in the object_detection/dataset_tools folder. Modify one of them, such as create_pascal_tf_record.py or create_pet_tf_record.py, to convert your data; pick the script whose input format is closest to yours. Here we picked create_pascal_tf_record.py as our template and modified it to convert the VoTT output above. Don't worry about making a mistake here; if something is wrong, you will quickly see an error when you run the following commands. Run the script to convert the input data to TFRecords:

python object_detection/dataset_tools/{my_create_tf_record}.py --set=train --data_dir=path/to/VoTToutputFolder --output_dir=path/to/TFRecordsOutput
python object_detection/dataset_tools/{my_create_tf_record}.py --set=val --data_dir=path/to/VoTToutputFolder --output_dir=path/to/TFRecordsOutput
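The bulk of the modification is reading each VOC XML file and mapping its fields into a tf.train.Example. A minimal sketch of the parsing half, using only the Python standard library (anno_to_dict is a hypothetical helper for illustration, not part of the API; the element names follow the standard Pascal VOC layout):

```python
import xml.etree.ElementTree as ET

def anno_to_dict(xml_path):
    """Parse one Pascal VOC annotation file into a plain dict.

    The returned fields mirror what create_pascal_tf_record.py feeds
    into tf.train.Example: filename, image size, and one entry per box.
    """
    root = ET.parse(xml_path).getroot()
    size = root.find('size')
    record = {
        'filename': root.findtext('filename'),
        'width': int(size.findtext('width')),
        'height': int(size.findtext('height')),
        'boxes': [],
    }
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        record['boxes'].append({
            'label': obj.findtext('name'),
            # The TFRecord fields store coordinates normalized to [0, 1]
            'xmin': int(box.findtext('xmin')) / record['width'],
            'ymin': int(box.findtext('ymin')) / record['height'],
            'xmax': int(box.findtext('xmax')) / record['width'],
            'ymax': int(box.findtext('ymax')) / record['height'],
        })
    return record
```

From a dict like this, the template's code encodes the image bytes, labels, and normalized coordinates into the tf.train.Example features it writes to the TFRecord file.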

Step 4 - Configure an object detection pipeline for training

Instead of creating a model from scratch, a common practice is to fine-tune a pre-trained model listed in the Tensorflow Detection Model Zoo on your own dataset. These models are trained on well-known datasets that may not include the type of object you are trying to detect, but we can leverage transfer learning to train them to detect new types of objects. If you don't have a GPU, pick a faster model over a more accurate one. Here, we choose ssd_mobilenet_v1_coco.
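After downloading the model, copy its pipeline.config and edit a handful of fields to point at your data. A sketch of the relevant fragments (all paths, file names, and the class name below are placeholders for your own; the label map ids start at 1, as 0 is reserved for background):

```
# pipeline.config (only the fields to edit are shown)
model {
  ssd {
    num_classes: 1   # number of object types you labeled
  }
}
train_config {
  fine_tune_checkpoint: "path/to/ssd_mobilenet_v1_coco/model.ckpt"
}
train_input_reader {
  tf_record_input_reader {
    input_path: "path/to/TFRecordsOutput/train.record"
  }
  label_map_path: "path/to/label_map.pbtxt"
}
eval_input_reader {
  tf_record_input_reader {
    input_path: "path/to/TFRecordsOutput/val.record"
  }
  label_map_path: "path/to/label_map.pbtxt"
}

# label_map.pbtxt (referenced above): one item per label
item {
  id: 1
  name: 'cat'
}
```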

Step 5 - Train and evaluate the pipeline

From the tensorflow/models/research/ directory, run the following command to train the model:

python object_detection/model_main.py --pipeline_config_path=path/to/modified_pipeline.config --model_dir=path/to/training_output --alsologtostderr

On a GPU, it may take a couple of hours for precision to go above, say, 80%, or for loss to drop below, say, 1. On a CPU, it could take much longer. Run tensorboard to observe how precision and loss change as the model learns:

tensorboard --logdir=path/to/training_output

If your images are of low quality, the target object is very hard to detect, or you have few images (fewer than 50), the mean average precision and total loss may appear erratic and fail to converge even after training for a long time. Start with an easy-to-detect object and good-quality images.

Step 6 - Export the trained model for inferencing
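Once the loss looks acceptable, the API ships an export script that freezes a trained checkpoint into a graph usable for inference. A sketch of the invocation from tensorflow/models/research/ (the checkpoint suffix XXXX is a placeholder for whatever step number your training reached):

```
python object_detection/export_inference_graph.py \
    --input_type=image_tensor \
    --pipeline_config_path=path/to/modified_pipeline.config \
    --trained_checkpoint_prefix=path/to/training_output/model.ckpt-XXXX \
    --output_directory=path/to/exported_model
```

The output directory contains a frozen_inference_graph.pb that you can load to run detection on new images.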

Common errors and solutions

I encountered the following main issues in this process of custom object detection. With some research, I found that the community has resolutions or workarounds for each.

  1. Many errors result from forgetting to run the following from the tensorflow/models/research folder. Make sure this is set in every shell session:
    export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
    
  2. Error message “Value Error: First Step Cannot Be Zero”
    Resolution: https://github.com/tensorflow/models/issues/3794
  3. Error message “_tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file /usr/local/lib/python2.7/dist-packages/tensorflow/models/model/model.ckpt.data-00000-of-00001”
    Resolution: https://github.com/tensorflow/models/issues/2231. fine_tune_checkpoint=file_path/model.ckpt
  4. Error message “TypeError: can’t pickle dict_values objects”
    Resolution: https://github.com/tensorflow/models/issues/4780