YOLO/PASCAL-VOC detection tutorial

This tutorial demonstrates that Akida can perform object detection using a state-of-the-art model architecture. This is illustrated using a subset of the PASCAL-VOC 2007 dataset with “car” and “person” classes only. The YOLOv2 architecture from Redmon et al. (2016) has been chosen to tackle this object detection problem.

1. Introduction

1.1 Object detection

Object detection is a computer vision task that combines two elementary tasks:

  • object classification, which consists of assigning a class label to an image, as shown in the AkidaNet/ImageNet inference example

  • object localization, which consists of drawing a bounding box around one or several objects in an image

One can learn more about the subject by reading this introduction to object detection blog article.

1.2 YOLO key concepts

You Only Look Once (YOLO) is a deep neural network architecture dedicated to object detection.

As opposed to classic networks that handle object detection, YOLO predicts bounding boxes (localization task) and class probabilities (classification task) with a single neural network in a single evaluation. The object detection task is thus reduced to a regression problem of predicting spatially separated boxes and their associated class probabilities.

The base concept of YOLO is to divide an input image into regions, forming a grid, and to predict bounding boxes and probabilities for each region. The bounding boxes are weighted by the predicted probabilities.

YOLO also uses the concept of “anchor boxes” or “prior boxes”. The network does not predict bounding boxes directly but rather offsets from anchor boxes, which are templates (width/height ratios) computed by clustering the dimensions of the ground truth boxes from the training dataset. The anchors thus represent the average shape and size of the objects to detect. More details on the anchor box concept are given in this blog article.
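
To make the anchor concept concrete, here is a minimal sketch in the spirit of the YOLOv2 formulation (not the exact akida_models implementation, and with purely illustrative values) showing how the raw offsets predicted for one grid cell are combined with an anchor to recover a box:

import numpy as np

def decode_single_prediction(t_x, t_y, t_w, t_h, anchor_w, anchor_h,
                             col, row, grid_size=7):
    # The box center is the grid cell position plus a sigmoid-bounded offset,
    # expressed relative to the image (values in [0, 1])
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    center_x = (col + sigmoid(t_x)) / grid_size
    center_y = (row + sigmoid(t_y)) / grid_size
    # Width and height are the anchor dimensions scaled by an exponential factor
    width = anchor_w * np.exp(t_w) / grid_size
    height = anchor_h * np.exp(t_h) / grid_size
    return center_x, center_y, width, height

# Decode an arbitrary raw prediction for cell (row=2, col=3) with a 0.6 x 1.1 anchor
print(decode_single_prediction(0.1, -0.2, 0.3, 0.0, 0.6, 1.1, col=3, row=2))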

Additional information about YOLO can be found on the Darknet website. The source code for the preprocessing and postprocessing functions included in the akida_models package (see the processing section in the model zoo) is largely inspired by the experiencor GitHub repository.

2. Preprocessing tools

As this example focuses on car and person detection only, a subset of VOC has been prepared with test images from VOC2007 that contain at least one occurrence of the two classes. Just like the VOC dataset, the subset contains an image folder, an annotation folder and a text file listing the file names of interest.

The YOLO toolkit offers several methods to prepare data for processing; see load_image, preprocess_image or parse_voc_annotations.

import os

from tensorflow.keras.utils import get_file
from akida_models.detection.processing import parse_voc_annotations

# Download validation set from Brainchip data server
data_path = get_file(
    "voc_test_car_person.tar.gz",
    "http://data.brainchip.com/dataset-mirror/voc/voc_test_car_person.tar.gz",
    cache_subdir='datasets/voc',
    extract=True)

data_dir = os.path.dirname(data_path)
gt_folder = os.path.join(data_dir, 'voc_test_car_person', 'Annotations')
image_folder = os.path.join(data_dir, 'voc_test_car_person', 'JPEGImages')
file_path = os.path.join(
    data_dir, 'voc_test_car_person', 'test_car_person.txt')
labels = ['car', 'person']

val_data = parse_voc_annotations(gt_folder, image_folder, file_path, labels)
print("Loaded VOC2007 test data for car and person classes: "
      f"{len(val_data)} images.")
Downloading data from http://data.brainchip.com/dataset-mirror/voc/voc_test_car_person.tar.gz

221551911/221551911 [==============================] - 24s 0us/step
Loaded VOC2007 test data for car and person classes: 2500 images.

Anchors can also be computed easily using the YOLO toolkit.

Note

The following code is given as an example. In a real use case scenario, anchors are computed on the training dataset.

from akida_models.detection.generate_anchors import generate_anchors

num_anchors = 5
grid_size = (7, 7)
anchors_example = generate_anchors(val_data, num_anchors, grid_size)
Average IOU for 5 anchors: 0.61
Anchors:  [[0.63263, 1.13864], [1.29467, 2.90717], [2.26527, 2.97757], [3.80627, 5.03516], [5.21984, 5.79988]]

3. Model architecture

The model zoo contains a YOLO model that is built upon the AkidaNet architecture with 3 separable convolutional layers at the top for bounding box and class estimation, followed by a final separable convolutional layer which is the detection layer. Note that for efficiency, the alpha parameter of AkidaNet (network width, i.e. the number of filters in each layer) is set to 0.5.

from akida_models import yolo_base

# Create a yolo model for 2 classes with 5 anchors and grid size of 7
classes = 2

model = yolo_base(input_shape=(224, 224, 3),
                  classes=classes,
                  nb_box=num_anchors,
                  alpha=0.5)
model.summary()
Model: "yolo_base"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 input (InputLayer)          [(None, 224, 224, 3)]     0

 rescaling (Rescaling)       (None, 224, 224, 3)       0

 conv_0 (Conv2D)             (None, 112, 112, 16)      432

 conv_0/BN (BatchNormalizati  (None, 112, 112, 16)     64
 on)

 conv_0/relu (ReLU)          (None, 112, 112, 16)      0

 conv_1 (Conv2D)             (None, 112, 112, 32)      4608

 conv_1/BN (BatchNormalizati  (None, 112, 112, 32)     128
 on)

 conv_1/relu (ReLU)          (None, 112, 112, 32)      0

 conv_2 (Conv2D)             (None, 56, 56, 64)        18432

 conv_2/BN (BatchNormalizati  (None, 56, 56, 64)       256
 on)

 conv_2/relu (ReLU)          (None, 56, 56, 64)        0

 conv_3 (Conv2D)             (None, 56, 56, 64)        36864

 conv_3/BN (BatchNormalizati  (None, 56, 56, 64)       256
 on)

 conv_3/relu (ReLU)          (None, 56, 56, 64)        0

 separable_4 (SeparableConv2  (None, 28, 28, 128)      8768
 D)

 separable_4/BN (BatchNormal  (None, 28, 28, 128)      512
 ization)

 separable_4/relu (ReLU)     (None, 28, 28, 128)       0

 separable_5 (SeparableConv2  (None, 28, 28, 128)      17536
 D)

 separable_5/BN (BatchNormal  (None, 28, 28, 128)      512
 ization)

 separable_5/relu (ReLU)     (None, 28, 28, 128)       0

 separable_6 (SeparableConv2  (None, 14, 14, 256)      33920
 D)

 separable_6/BN (BatchNormal  (None, 14, 14, 256)      1024
 ization)

 separable_6/relu (ReLU)     (None, 14, 14, 256)       0

 separable_7 (SeparableConv2  (None, 14, 14, 256)      67840
 D)

 separable_7/BN (BatchNormal  (None, 14, 14, 256)      1024
 ization)

 separable_7/relu (ReLU)     (None, 14, 14, 256)       0

 separable_8 (SeparableConv2  (None, 14, 14, 256)      67840
 D)

 separable_8/BN (BatchNormal  (None, 14, 14, 256)      1024
 ization)

 separable_8/relu (ReLU)     (None, 14, 14, 256)       0

 separable_9 (SeparableConv2  (None, 14, 14, 256)      67840
 D)

 separable_9/BN (BatchNormal  (None, 14, 14, 256)      1024
 ization)

 separable_9/relu (ReLU)     (None, 14, 14, 256)       0

 separable_10 (SeparableConv  (None, 14, 14, 256)      67840
 2D)

 separable_10/BN (BatchNorma  (None, 14, 14, 256)      1024
 lization)

 separable_10/relu (ReLU)    (None, 14, 14, 256)       0

 separable_11 (SeparableConv  (None, 14, 14, 256)      67840
 2D)

 separable_11/BN (BatchNorma  (None, 14, 14, 256)      1024
 lization)

 separable_11/relu (ReLU)    (None, 14, 14, 256)       0

 separable_12 (SeparableConv  (None, 7, 7, 512)        133376
 2D)

 separable_12/BN (BatchNorma  (None, 7, 7, 512)        2048
 lization)

 separable_12/relu (ReLU)    (None, 7, 7, 512)         0

 separable_13 (SeparableConv  (None, 7, 7, 512)        266752
 2D)

 separable_13/BN (BatchNorma  (None, 7, 7, 512)        2048
 lization)

 separable_13/relu (ReLU)    (None, 7, 7, 512)         0

 1conv (SeparableConv2D)     (None, 7, 7, 1024)        528896

 1conv/BN (BatchNormalizatio  (None, 7, 7, 1024)       4096
 n)

 1conv/relu (ReLU)           (None, 7, 7, 1024)        0

 2conv (SeparableConv2D)     (None, 7, 7, 1024)        1057792

 2conv/BN (BatchNormalizatio  (None, 7, 7, 1024)       4096
 n)

 2conv/relu (ReLU)           (None, 7, 7, 1024)        0

 3conv (SeparableConv2D)     (None, 7, 7, 1024)        1057792

 3conv/BN (BatchNormalizatio  (None, 7, 7, 1024)       4096
 n)

 3conv/relu (ReLU)           (None, 7, 7, 1024)        0

 detection_layer (SeparableC  (None, 7, 7, 35)         45091
 onv2D)

=================================================================
Total params: 3,573,715
Trainable params: 3,561,587
Non-trainable params: 12,128
_________________________________________________________________

The model output can be reshaped to a more natural shape of:

(grid_height, grid_width, anchors_box, 4 + 1 + num_classes)

where the “4 + 1” term represents the coordinates of the estimated bounding boxes (top left x, top left y, width and height) and a confidence score. In other words, the output channels are grouped by anchor box, and within each group one channel provides either a coordinate, a global confidence score or a class confidence score. This grouping is handled automatically in the decode_output function.

from tensorflow.keras import Model
from tensorflow.keras.layers import Reshape

# Define a reshape output to be added to the YOLO model
output = Reshape((grid_size[1], grid_size[0], num_anchors, 4 + 1 + classes),
                 name="YOLO_output")(model.output)

# Build the complete model
full_model = Model(model.input, output)
full_model.output
<KerasTensor: shape=(None, 7, 7, 5, 7) dtype=float32 (created by layer 'YOLO_output')>
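
For illustration, the last dimension of the reshaped output can be sliced as follows. This is only a sketch on a dummy array; in practice decode_output performs this split, together with the anchor decoding, for you:

import numpy as np

# Dummy tensor with the reshaped YOLO layout: (grid_h, grid_w, anchors, 4 + 1 + classes)
dummy_output = np.random.rand(7, 7, num_anchors, 4 + 1 + classes).astype(np.float32)

boxes = dummy_output[..., 0:4]        # bounding box values for each cell and anchor
objectness = dummy_output[..., 4]     # global confidence that a box contains an object
class_scores = dummy_output[..., 5:]  # per-class confidence scores ("car", "person")

print(boxes.shape, objectness.shape, class_scores.shape)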

4. Training

As the YOLO model relies on the Brainchip AkidaNet/ImageNet network, it is possible to perform transfer learning from ImageNet pretrained weights when training a YOLO model. See the PlantVillage transfer learning example for a detailed explanation of transfer learning principles.

When using transfer learning for YOLO training, we advise proceeding in several steps that include model calibration:

  • instantiate the yolo_base model and load AkidaNet/ImageNet pretrained float weights,

akida_models create -s yolo_akidanet_voc.h5 yolo_base --classes 2 \
         --base_weights akidanet_imagenet_224_alpha_50.h5
  • freeze the AkidaNet layers and perform training,

yolo_train train -d voc_preprocessed.pkl -m yolo_akidanet_voc.h5 \
    -ap voc_anchors.pkl -e 25 -fb 1conv -s yolo_akidanet_voc.h5
  • quantize the network, create data for calibration and calibrate,

cnn2snn quantize -m yolo_akidanet_voc.h5 -iq 8 -wq 4 -aq 4
yolo_train extract -d voc_preprocessed.pkl -ap voc_anchors.pkl -b 1024 -o voc_samples.npz \
    -m yolo_akidanet_voc_iq8_wq4_aq4.h5
cnn2snn calibrate adaround -sa voc_samples.npz -b 128 -e 500 -lr 1e-3 \
    -m yolo_akidanet_voc_iq8_wq4_aq4.h5
  • tune the model to recover accuracy.

yolo_train tune -d voc_preprocessed.pkl \
    -m yolo_akidanet_voc_iq8_wq4_aq4_adaround_calibrated.h5 -ap voc_anchors.pkl \
    -e 10 -s yolo_akidanet_voc_iq8_wq4_aq4.h5

Note

  • voc_anchors.pkl is obtained by saving the output of the generate_anchors call to a pickle file,

  • voc_preprocessed.pkl is obtained by saving the training data, the validation data (obtained using parse_voc_annotations) and the labels list (i.e. [“car”, “person”]) into a pickle file; a minimal sketch is given below.
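
As an illustration only, the snippet below sketches how these two files could be written. The file names match the commands above, but the exact structure expected by yolo_train is an assumption here, and train_data would come from the VOC training set rather than the test subset used in this tutorial:

import pickle

# Hypothetical: parse the VOC training annotations the same way as the validation data
train_data = parse_voc_annotations(gt_folder, image_folder, file_path, labels)

# voc_anchors.pkl: anchors computed on the training data
anchors = generate_anchors(train_data, num_anchors, grid_size)
with open("voc_anchors.pkl", "wb") as f:
    pickle.dump(anchors, f)

# voc_preprocessed.pkl: training data, validation data and labels list
with open("voc_preprocessed.pkl", "wb") as f:
    pickle.dump((train_data, val_data, labels), f)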

While transfer learning should be the preferred way to train a YOLO model, it has been observed that, for some datasets, training all layers from scratch gives better results. That is the case for our YOLO WiderFace model for face detection. In such a case, the training pipeline to follow is described in the typical training scenario.

5. Performance

The model zoo also contains a helper method that creates a YOLO model for VOC and loads both the pretrained weights for the car and person detection task and the corresponding anchors. The anchors are used to interpret the model outputs.

The metric used to evaluate YOLO is the mean average precision (mAP), which is the percentage of correct predictions for a given intersection over union (IoU) threshold. Scores in this example are given for the standard IoU threshold of 0.5, meaning that a detection is considered valid if its IoU with the ground truth box is above 0.5.
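
For reference, here is a minimal sketch of the IoU computation between two boxes given as (x1, y1, x2, y2) corners; it is an illustration only, not the MapEvaluation implementation:

def iou(box_a, box_b):
    # Intersection rectangle between the two boxes
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    intersection = max(0, x2 - x1) * max(0, y2 - y1)
    # Union is the sum of both areas minus the intersection
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return intersection / (area_a + area_b - intersection)

# IoU of about 0.47: this detection would not be counted as valid at the 0.5 threshold
print(iou((10, 10, 60, 60), (20, 20, 70, 70)))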

Note

A call to evaluate_map will preprocess the images, call Model.predict and use decode_output before computing the precision for all classes.

Reported performance for all training steps is as follows:

              Float     8/4/4 Calibrated    8/4/4 Tuned
Global mAP    38.38 %   32.88 %             38.83 %

from timeit import default_timer as timer
from akida_models import yolo_voc_pretrained
from akida_models.detection.map_evaluation import MapEvaluation

# Load the pretrained model along with anchors
model_keras, anchors = yolo_voc_pretrained()

# Define the final reshape and build the model
output = Reshape((grid_size[1], grid_size[0], num_anchors, 4 + 1 + classes),
                 name="YOLO_output")(model_keras.output)
model_keras = Model(model_keras.input, output)

# Create the mAP evaluator object
num_images = 100

map_evaluator = MapEvaluation(model_keras, val_data[:num_images], labels,
                              anchors)

# Compute the scores for all validation images
start = timer()
mAP, average_precisions = map_evaluator.evaluate_map()
end = timer()

for label, average_precision in average_precisions.items():
    print(labels[label], '{:.4f}'.format(average_precision))
print('mAP: {:.4f}'.format(mAP))
print(f'Keras inference on {num_images} images took {end-start:.2f} s.\n')
Downloading data from http://data.brainchip.com/dataset-mirror/voc/voc_anchors.pkl.

126/126 [==============================] - 0s 1us/step
Downloading data from http://data.brainchip.com/models/yolo/yolo_akidanet_voc_iq8_wq4_aq4.h5.

14326960/14326960 [==============================] - 2s 0us/step

car 0.3844
person 0.3477
mAP: 0.3661
Keras inference on 100 images took 5.26 s.

6. Conversion to Akida

6.1 Convert to Akida model

Check model compatibility before Akida conversion.

from cnn2snn import check_model_compatibility

compat = check_model_compatibility(model_keras, False)
The Keras quantized model is not compatible for a conversion to an Akida model:
 The Reshape layer YOLO_output can only be used to transform a tensor of shape (N,) to a tensor of shape (1, 1, N), and vice-versa. Receives input_shape (7, 7, 35) and output_shape (7, 7, 5, 7).

The last YOLO_output layer that was added for splitting channels into values for each box must be removed before Akida conversion.

from tensorflow.keras import Model

# Rebuild a model without the final YOLO_output reshape layer
compatible_model = Model(model_keras.input, model_keras.layers[-2].output)
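
As a quick sanity check (this extra step is not in the original script), the compatibility test can be run again on the trimmed model, which should now pass:

# Re-run the compatibility check on the model stripped of its Reshape layer
compat = check_model_compatibility(compatible_model, False)
print("Compatible for Akida conversion:", compat)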

To convert to an Akida model, we simply pass the compatible Keras model to cnn2snn.convert. Note that in the YOLO preprocess_image function, images are zero-centered and normalized to the [-1, 1] range, which matches the input scaling used during training.
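
For reference, zero-centering 8-bit pixels and scaling them to [-1, 1] usually amounts to the simple mapping sketched below; this helper is purely illustrative and is not the actual akida_models implementation:

import numpy as np

def to_minus_one_one(image_uint8):
    # Map 8-bit pixel values to [-1, 1]: 0 -> -1.0, 255 -> +1.0
    return image_uint8.astype(np.float32) / 127.5 - 1.0

# The extreme pixel values land on the range bounds
print(to_minus_one_one(np.array([0, 128, 255], dtype=np.uint8)))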

from cnn2snn import convert

model_akida = convert(compatible_model)
model_akida.summary()
                 Model Summary
________________________________________________
Input shape    Output shape  Sequences  Layers
================================================
[224, 224, 3]  [7, 7, 35]    1          18
________________________________________________

_________________________________________________________________
Layer (type)                 Output shape    Kernel shape

============== SW/conv_0-detection_layer (Software) =============

conv_0 (InputConv.)          [112, 112, 16]  (3, 3, 3, 16)
_________________________________________________________________
conv_1 (Conv.)               [112, 112, 32]  (3, 3, 16, 32)
_________________________________________________________________
conv_2 (Conv.)               [56, 56, 64]    (3, 3, 32, 64)
_________________________________________________________________
conv_3 (Conv.)               [56, 56, 64]    (3, 3, 64, 64)
_________________________________________________________________
separable_4 (Sep.Conv.)      [28, 28, 128]   (3, 3, 64, 1)
_________________________________________________________________
                                             (1, 1, 64, 128)
_________________________________________________________________
separable_5 (Sep.Conv.)      [28, 28, 128]   (3, 3, 128, 1)
_________________________________________________________________
                                             (1, 1, 128, 128)
_________________________________________________________________
separable_6 (Sep.Conv.)      [14, 14, 256]   (3, 3, 128, 1)
_________________________________________________________________
                                             (1, 1, 128, 256)
_________________________________________________________________
separable_7 (Sep.Conv.)      [14, 14, 256]   (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 256)
_________________________________________________________________
separable_8 (Sep.Conv.)      [14, 14, 256]   (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 256)
_________________________________________________________________
separable_9 (Sep.Conv.)      [14, 14, 256]   (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 256)
_________________________________________________________________
separable_10 (Sep.Conv.)     [14, 14, 256]   (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 256)
_________________________________________________________________
separable_11 (Sep.Conv.)     [14, 14, 256]   (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 256)
_________________________________________________________________
separable_12 (Sep.Conv.)     [7, 7, 512]     (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 512)
_________________________________________________________________
separable_13 (Sep.Conv.)     [7, 7, 512]     (3, 3, 512, 1)
_________________________________________________________________
                                             (1, 1, 512, 512)
_________________________________________________________________
1conv (Sep.Conv.)            [7, 7, 1024]    (3, 3, 512, 1)
_________________________________________________________________
                                             (1, 1, 512, 1024)
_________________________________________________________________
2conv (Sep.Conv.)            [7, 7, 1024]    (3, 3, 1024, 1)
_________________________________________________________________
                                             (1, 1, 1024, 1024)
_________________________________________________________________
3conv (Sep.Conv.)            [7, 7, 1024]    (3, 3, 1024, 1)
_________________________________________________________________
                                             (1, 1, 1024, 1024)
_________________________________________________________________
detection_layer (Sep.Conv.)  [7, 7, 35]      (3, 3, 1024, 1)
_________________________________________________________________
                                             (1, 1, 1024, 35)
_________________________________________________________________
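
The summary shows the whole network running as a single software sequence. If an Akida device were available on the host, the converted model could also be mapped onto it before inference. The snippet below is only a sketch: it assumes akida.devices() lists the devices detected on the machine and falls back to the software backend when the list is empty.

import akida

devices = akida.devices()
if devices:
    # Map the converted model onto the first available Akida device
    model_akida.map(devices[0])
    print("Model mapped on:", devices[0])
else:
    print("No Akida device detected; inference runs on the software backend.")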

6.2 Check performance

Akida model accuracy is tested on the first n images of the validation set.

The table below summarizes the expected results:

#Images    Keras mAP    Akida mAP
100        38.80 %      34.26 %
1000       40.11 %      39.35 %
2500       38.83 %      38.85 %

# Create the mAP evaluator object
map_evaluator_ak = MapEvaluation(model_akida,
                                 val_data[:num_images],
                                 labels,
                                 anchors,
                                 is_keras_model=False)

# Compute the scores for all validation images
start = timer()
mAP_ak, average_precisions_ak = map_evaluator_ak.evaluate_map()
end = timer()

for label, average_precision in average_precisions_ak.items():
    print(labels[label], '{:.4f}'.format(average_precision))
print('mAP: {:.4f}'.format(mAP_ak))
print(f'Akida inference on {num_images} images took {end-start:.2f} s.\n')
car 0.4258
person 0.3430
mAP: 0.3844
Akida inference on 100 images took 14.83 s.
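
Beyond mAP, the Akida runtime also gathers inference statistics such as activation sparsity (and framerate/power figures when mapped on hardware). Printing them is a one-liner, assuming they have been populated by the predictions above:

# Display the statistics collected during the Akida evaluation
print(model_akida.statistics)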

6.3 Show predictions for a random image

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches

from akida_models.detection.processing import load_image, preprocess_image, decode_output

# Take a random test image
i = np.random.randint(len(val_data))

input_shape = model_akida.layers[0].input_dims

# Load the image
raw_image = load_image(val_data[i]['image_path'])

# Keep the original image size for later bounding boxes rescaling
raw_height, raw_width, _ = raw_image.shape

# Pre-process the image
image = preprocess_image(raw_image, input_shape)
input_image = image[np.newaxis, :].astype(np.uint8)

# Run the Akida model on the pre-processed image
pots = model_akida.predict(input_image)[0]

# Reshape the potentials to prepare for decoding
h, w, c = pots.shape
pots = pots.reshape((h, w, len(anchors), 4 + 1 + len(labels)))

# Decode potentials into bounding boxes
raw_boxes = decode_output(pots, anchors, len(labels))

# Rescale boxes to the original image size
pred_boxes = np.array([[
    box.x1 * raw_width, box.y1 * raw_height, box.x2 * raw_width,
    box.y2 * raw_height,
    box.get_label(),
    box.get_score()
] for box in raw_boxes])

fig = plt.figure(num='VOC2007 car and person detection by Akida runtime')
ax = fig.subplots(1)
img_plot = ax.imshow(np.zeros(raw_image.shape, dtype=np.uint8))
img_plot.set_data(raw_image)

for box in pred_boxes:
    rect = patches.Rectangle((box[0], box[1]),
                             box[2] - box[0],
                             box[3] - box[1],
                             linewidth=1,
                             edgecolor='r',
                             facecolor='none')
    ax.add_patch(rect)
    ax.text(box[0],
            box[1] - 5,
            f"{labels[int(box[4])]} - {box[5]:.2f}",
            color='red')

plt.axis('off')
plt.show()
[Figure: predicted car/person bounding boxes and scores drawn on a random VOC test image]
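
Depending on the drawn image, the decoded boxes may include low-confidence detections. One could filter them on their score before plotting; the 0.5 threshold below is an arbitrary illustrative value, not part of the tutorial:

# Keep only detections whose score (last column of pred_boxes) is high enough
score_threshold = 0.5
if len(pred_boxes):
    confident_boxes = pred_boxes[pred_boxes[:, 5] >= score_threshold]
else:
    confident_boxes = pred_boxes
print(f"{len(confident_boxes)}/{len(pred_boxes)} boxes above {score_threshold}")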

Total running time of the script: (1 minute 2.367 seconds)
