TensorFlow Object Detection API

Intro to Object Detection

Object detection is an artificial intelligence technique related to computer vision and image processing that allows us to identify and locate objects of a certain class (such as humans, animals, or cars) in an image or video. It is based on the idea that every object class has its own characteristic visual features that help identify it and distinguish it from other classes. For example, we define as “square” any 2D object having four right-angled corners and four sides of equal length.

Object detection methods can be divided into machine learning-based and deep learning-based approaches. In the former, computer vision techniques extract a predefined set of visual features, such as corners and edges, to identify groups of pixels that may belong to an object; these features are then used to train a regression model that predicts the location of the object along with its label. In the latter, convolutional neural networks (CNNs) perform end-to-end object detection, automatically extracting the relevant visual features and using them in the detection task.
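
As an illustration of the hand-crafted feature extraction used in the first family of approaches, the following minimal Python sketch computes an edge response on a tiny grayscale image. The Sobel filter is only one possible edge detector and is chosen here as an assumption; the function name sobel_magnitude and the sample image are hypothetical.

```python
def sobel_magnitude(image):
    """Return the Sobel gradient magnitude of a 2-D list of pixel values.

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    gx_kernel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal change
    gy_kernel = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical change
    height, width = len(image), len(image[0])
    result = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    pixel = image[y + dy][x + dx]
                    gx += gx_kernel[dy + 1][dx + 1] * pixel
                    gy += gy_kernel[dy + 1][dx + 1] * pixel
            result[y][x] = (gx * gx + gy * gy) ** 0.5
    return result


# Hypothetical 5x6 image: dark on the left, bright on the right.
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
edges = sobel_magnitude(image)
print(edges[2])  # the response is non-zero only around the vertical edge
```

In a classical pipeline, responses like these would be turned into feature vectors and passed to a separately trained model; a CNN learns comparable filters on its own during training.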

TensorFlow Object Detection, a Deep Learning-Based Approach

TensorFlow Installation

The TensorFlow Object Detection API is an open source framework built on top of TensorFlow that makes it easy to construct, train and deploy object detection models.
To use this API in your environment, first install the TensorFlow library by running the following command in your terminal:

pip install --ignore-installed --upgrade tensorflow==2.5.0

By default, TensorFlow will attempt to register any compatible GPU available on the device. If this fails, TensorFlow falls back to the available CPU. For TensorFlow to run on your GPU, the following requirements must be met:

  • Nvidia GPU (GTX 650 or newer) with its drivers installed;
  • CUDA Toolkit v11.2, downloaded and installed for your Linux distribution following the Nvidia installation instructions;
  • cuDNN 8.1.0 library for Linux (x86_64), installed following the Nvidia installation instructions (you may need to create a user profile on the Nvidia Developer website before downloading the library).
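
Whether these requirements are actually satisfied can be checked from Python. The snippet below is a minimal sketch: the helper name visible_gpus is hypothetical, and it assumes a TensorFlow 2.x installation in which tf.config.list_physical_devices is available.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see.

    Returns None when TensorFlow is not installed in this environment.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment.")
elif gpus:
    print("TensorFlow registered these GPUs:", gpus)
else:
    print("No GPU registered; TensorFlow will run on the CPU.")
```

An empty list usually points to a missing driver or to a CUDA or cuDNN version that does not match the installed TensorFlow release.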

TensorFlow Object Detection API Installation

The next step is to install the TensorFlow Object Detection API (whose models can be downloaded from the TensorFlow Models repository) and to manually install all of its dependencies, which are:

  • Protobufs, used to configure model and training parameters (the latest release can be downloaded from the protoc release page);
  • COCO API, which provides Matlab, Python, and Lua APIs that assist in loading, parsing, and visualizing the annotations in COCO, a large image dataset designed for object detection.
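
After the installation, a quick check along the following lines can confirm that the pieces are reachable. This is a sketch under assumptions: the helper name check_dependencies is hypothetical, the protobuf compiler is assumed to be on the PATH as protoc, and the COCO API and the Object Detection API are assumed to be importable as pycocotools and object_detection.

```python
import importlib.util
import shutil


def check_dependencies():
    """Map each dependency to True if it can be found, False otherwise."""
    return {
        # The protobuf compiler is an executable looked up on the PATH.
        "protoc": shutil.which("protoc") is not None,
        # The two Python packages are only located, not imported.
        "pycocotools": importlib.util.find_spec("pycocotools") is not None,
        "object_detection": importlib.util.find_spec("object_detection") is not None,
    }


for name, found in check_dependencies().items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```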

This step is not trivial, so we recommend carefully following the official documentation.

Training Custom Object Detector

Once you have installed both TensorFlow and the TensorFlow Object Detection API, you can train your own object detector to recognize, for example, particular tissues or cells. To do this, follow these steps:

  1. Prepare the workspace containing the TensorFlow Object Detection API and all the training material, including images, annotations, etc.;
  2. Prepare the datasets:
    1. Label the images; we suggest the Python tool labelImg, which can be installed by running the command pip install labelImg in your terminal;
    2. Carefully divide the images into train and test sets;
    3. Generate the label map, which maps each of the labels used to an integer value;
    4. Generate the TensorFlow records from the image annotations;
  3. Select and configure a pre-trained model provided by TensorFlow; an exhaustive list of available models can be found in the TensorFlow 2 Detection Model Zoo. Once the model is chosen, some parameters in its pipeline.config file need to be defined, such as: the number of label classes; the batch size, whose value depends on the available memory (higher values require more memory and vice versa); and the paths to the training and test record files and to the label map.
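
The dataset preparation in step 2 can be sketched in a few lines of Python. The example below is only an illustration: the function names, the 10% test fraction, the fixed random seed and the file names are assumptions, and the label map is rendered in the item/id/name text layout that the API's label map files are assumed to follow, with ids starting at 1.

```python
import random


def split_dataset(image_names, test_fraction=0.1, seed=42):
    """Shuffle the image names reproducibly and return (train, test) lists."""
    names = sorted(image_names)
    random.Random(seed).shuffle(names)
    # Keep at least one image in the test set.
    n_test = max(1, round(len(names) * test_fraction))
    return names[n_test:], names[:n_test]


def build_label_map(labels):
    """Render the labels as label map text, assigning ids 1, 2, 3, ..."""
    entries = []
    for index, label in enumerate(labels, start=1):
        entries.append("item {\n    id: %d\n    name: '%s'\n}\n" % (index, label))
    return "\n".join(entries)


# Hypothetical image names and labels.
images = [f"img_{i:02d}.jpg" for i in range(10)]
train_images, test_images = split_dataset(images, test_fraction=0.2)
label_map_text = build_label_map(["tissue", "cell"])
print(len(train_images), "training images,", len(test_images), "test images")
print(label_map_text)
```

Each image's annotation file has to be moved together with the image, so that both sets remain complete when the TensorFlow records are generated.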

Training the Model

Before training the model, copy the training script (located in the models/research/object_detection folder) into your workspace folder. To start a training job, open a new terminal inside the workspace folder and run the following command:

python --model_dir=models/my_ssd_resnet50_v1_fpn --pipeline_config_path=models/my_ssd_resnet50_v1_fpn/pipeline.config
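
The pipeline.config file passed to this command must already carry the number of classes, the batch size and the paths chosen for your dataset. As a minimal sketch, these values could be patched into the file's text with regular expressions. The function name, the sample excerpt and the label map path are hypothetical, and the field names num_classes, batch_size and label_map_path are assumed to appear in your configuration file.

```python
import re


def patch_pipeline_config(text, num_classes, batch_size, label_map_path):
    """Return pipeline.config text with a few training values replaced."""
    # Number of label classes the detector must distinguish.
    text = re.sub(r"num_classes:\s*\d+", f"num_classes: {num_classes}", text)
    # Only the first batch_size is replaced, assuming the training section
    # comes before the evaluation section in the file.
    text = re.sub(r"batch_size:\s*\d+", f"batch_size: {batch_size}", text, count=1)
    # A function is used as replacement so that backslashes in paths stay literal.
    text = re.sub(
        r'label_map_path:\s*"[^"]*"',
        lambda _match: f'label_map_path: "{label_map_path}"',
        text,
    )
    return text


# Simplified, hypothetical excerpt of a pipeline.config file.
sample = """
model { ssd { num_classes: 90 } }
train_config { batch_size: 64 }
train_input_reader { label_map_path: "PATH_TO_BE_CONFIGURED" }
eval_config { batch_size: 1 }
"""
patched = patch_pipeline_config(sample, 2, 8, "annotations/label_map.pbtxt")
print(patched)
```

The paths to the training and test record files differ between the two input readers, so they are better edited by hand or with a parser that understands the file structure.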

Once the script finishes without errors, the model is ready to be used to identify the object classes it has been trained for. To do this, first copy the export script (located in the models/research/object_detection folder) into your workspace folder, then open a new terminal inside the workspace folder and run the following command:

python .\ --input_type image_tensor --pipeline_config_path .\models\my_ssd_resnet50_v1_fpn\pipeline.config --trained_checkpoint_dir .\models\my_ssd_resnet50_v1_fpn\ --output_directory .\exported-models\my_model

After the above process is completed, you should find a new folder named my_model under exported-models in the workspace folder. This folder contains the files of the trained model you generated, which can be exported and used to process new images and perform object detection tasks.
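
When the exported model is applied to a new image, its raw output usually has to be filtered before use. The sketch below assumes, beyond what is stated above, that the model is loaded as a TensorFlow SavedModel and returns a dictionary with the keys detection_boxes, detection_classes and detection_scores, and that the values for a single image have already been converted to plain Python lists. The function name, the 0.5 threshold and the sample values are hypothetical.

```python
def filter_detections(detections, min_score=0.5):
    """Keep the (box, class_id, score) triples whose score reaches min_score.

    Boxes are assumed to be [ymin, xmin, ymax, xmax] in normalized coordinates.
    """
    kept = []
    for box, class_id, score in zip(
        detections["detection_boxes"],
        detections["detection_classes"],
        detections["detection_scores"],
    ):
        if score >= min_score:
            kept.append((box, int(class_id), float(score)))
    return kept


# Hypothetical output for one image with three candidate detections.
raw = {
    "detection_boxes": [[0.1, 0.1, 0.4, 0.4], [0.5, 0.5, 0.9, 0.9], [0.2, 0.6, 0.3, 0.8]],
    "detection_classes": [1.0, 2.0, 1.0],
    "detection_scores": [0.9, 0.4, 0.6],
}
confident = filter_detections(raw)
for box, class_id, score in confident:
    print(f"class {class_id} with score {score:.2f} at {box}")
```

The integer class ids can then be translated back to label names through the label map used during training.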