AkidaNet/ImageNet inference

This tutorial shows how to convert an AkidaNet model, map it onto AKD1000 hardware, and measure its performance.

AkidaNet is a MobileNet v1-inspired architecture optimized for implementation on Akida 1.0: it exploits the richer expressive power of standard convolutions in the early layers, and uses separable convolutions in the later layers where filter memory is the limiting factor.

As the ImageNet images are not publicly available, performance is assessed using a set of 10 copyright-free images found on Google using ImageNet class names.

Note

This tutorial uses an Akida 1.0 architecture to show AKD1000 mapping and performance. See the dedicated tutorial for 1.0 and 2.0 differences.

1. Dataset preparation

The test images all have at least 256 pixels in the smallest dimension and must be preprocessed to fit the model's input size. The imagenet.preprocessing.preprocess_image function decodes, crops and extracts a square 224x224x3 patch from an input image.

Note

The input size is set to 224x224x3 here, as this is what the model presented in the next section expects.
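
For intuition, the preprocessing is similar in spirit to an aspect-preserving resize followed by a central square crop. The snippet below is only an illustrative sketch of that kind of transformation, written with plain TensorFlow ops; it is an assumption about what preprocess_image does internally, not its actual implementation.

import tensorflow as tf

def center_crop_224(image, target_size=224, resize_min=256):
    # Illustrative only: aspect-preserving resize so that the smallest side
    # is resize_min pixels, then a central square crop of target_size pixels
    shape = tf.cast(tf.shape(image)[:2], tf.float32)
    scale = resize_min / tf.reduce_min(shape)
    new_size = tf.cast(tf.round(shape * scale), tf.int32)
    image = tf.image.resize(image, new_size)
    offset_h = (new_size[0] - target_size) // 2
    offset_w = (new_size[1] - target_size) // 2
    image = tf.image.crop_to_bounding_box(image, offset_h, offset_w,
                                          target_size, target_size)
    return tf.cast(image, tf.uint8)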

import akida
import os
import numpy as np

from tensorflow.io import read_file
from tensorflow.image import decode_jpeg
from tensorflow.keras.utils import get_file

from akida_models.imagenet import preprocessing

# Model specification and hyperparameters
NUM_CHANNELS = 3
IMAGE_SIZE = 224

num_images = 10

# Retrieve dataset file from Brainchip data server
file_path = get_file(
    "imagenet_like.zip",
    "https://data.brainchip.com/dataset-mirror/imagenet_like/imagenet_like.zip",
    cache_subdir='datasets/imagenet_like',
    extract=True)
data_folder = os.path.dirname(file_path)

# Load images for test set
x_test_files = []
x_test = np.zeros((num_images, IMAGE_SIZE, IMAGE_SIZE, NUM_CHANNELS)).astype('uint8')
for i in range(num_images):
    test_file = 'image_' + str(i + 1).zfill(2) + '.jpg'
    x_test_files.append(test_file)
    img_path = os.path.join(data_folder, test_file)
    base_image = read_file(img_path)
    image = decode_jpeg(base_image, channels=NUM_CHANNELS)
    image = preprocessing.preprocess_image(image, IMAGE_SIZE)
    x_test[i, :, :, :] = np.expand_dims(image, axis=0)

print(f'{num_images} images loaded and preprocessed.')
Downloading data from https://data.brainchip.com/dataset-mirror/imagenet_like/imagenet_like.zip
20418307/20418307 [==============================] - 2s 0us/step
10 images loaded and preprocessed.

Labels for the test images are stored in the akida_models package. The mapping between class names (strings) and label indices (integers) is given by the imagenet.preprocessing.index_to_label method.

import csv

# Parse labels file
fname = os.path.join(data_folder, 'labels_validation.txt')
validation_labels = dict()
with open(fname, newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=' ')
    for row in reader:
        validation_labels[row[0]] = row[1]

# Get labels for the test set by index
labels_test = np.zeros(num_images)
for i in range(num_images):
    labels_test[i] = int(validation_labels[x_test_files[i]])
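
The imagenet.preprocessing.index_to_label method can then turn these integer labels back into human-readable class names. A minimal sketch, reusing the arrays built above:

# Map the integer labels back to human-readable class names
for image_file, label in zip(x_test_files, labels_test):
    class_name = preprocessing.index_to_label(int(label)).split(',')[0]
    print(f'{image_file}: {class_name}')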

2. Pretrained quantized model

The Akida model zoo provides a pretrained quantized AkidaNet model, loaded through the akidanet_imagenet_pretrained helper.

The quantization scheme for this model is the following:

  • the first layer has 8-bit weights,

  • all other layers have 4-bit weights,

  • all activations are 4-bit.
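For reference, such a scheme would typically be obtained from a float AkidaNet with the legacy cnn2snn quantization flow. The sketch below illustrates this, assuming the akida_models float model builder and the legacy cnn2snn quantize API; it is not needed in this tutorial since the pretrained helper used next already returns a quantized model.

# Hypothetical sketch (not needed below): quantizing a float AkidaNet to
# 8-bit input weights, 4-bit weights and 4-bit activations.
# akidanet_imagenet and quantize are assumed to follow the legacy
# Akida 1.0 akida_models / cnn2snn APIs.
from akida_models import akidanet_imagenet
from cnn2snn import quantize

model_keras_float = akidanet_imagenet(input_shape=(224, 224, 3), alpha=0.5)
model_keras_quantized = quantize(model_keras_float,
                                 input_weight_quantization=8,
                                 weight_quantization=4,
                                 activ_quantization=4)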

from cnn2snn import set_akida_version, AkidaVersion
from akida_models import akidanet_imagenet_pretrained

# Use a quantized model with pretrained quantized weights
with set_akida_version(AkidaVersion.v1):
    model_keras_quantized_pretrained = akidanet_imagenet_pretrained(0.5)
model_keras_quantized_pretrained.summary()
UserWarning: Model akidanet_imagenet_224_alpha_50_iq8_wq4_aq4.h5 has been trained with akida_models 1.1.10 which is the last version supporting 1.0 models training
Downloading data from https://data.brainchip.com/models/AkidaV1/akidanet/akidanet_imagenet_224_alpha_50_iq8_wq4_aq4.h5
5589312/5589312 [==============================] - 1s 0us/step
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 rescaling (Rescaling)       (None, 224, 224, 3)       0

 conv_0 (QuantizedConv2D)    (None, 112, 112, 16)      448

 conv_0/relu (QuantizedReLU)  (None, 112, 112, 16)     0

 conv_1 (QuantizedConv2D)    (None, 112, 112, 32)      4640

 conv_1/relu (QuantizedReLU)  (None, 112, 112, 32)     0

 conv_2 (QuantizedConv2D)    (None, 56, 56, 64)        18496

 conv_2/relu (QuantizedReLU)  (None, 56, 56, 64)       0

 conv_3 (QuantizedConv2D)    (None, 56, 56, 64)        36928

 conv_3/relu (QuantizedReLU)  (None, 56, 56, 64)       0

 separable_4 (QuantizedSepar  (None, 28, 28, 128)      8896
 ableConv2D)

 separable_4/relu (Quantized  (None, 28, 28, 128)      0
 ReLU)

 separable_5 (QuantizedSepar  (None, 28, 28, 128)      17664
 ableConv2D)

 separable_5/relu (Quantized  (None, 28, 28, 128)      0
 ReLU)

 separable_6 (QuantizedSepar  (None, 14, 14, 256)      34176
 ableConv2D)

 separable_6/relu (Quantized  (None, 14, 14, 256)      0
 ReLU)

 separable_7 (QuantizedSepar  (None, 14, 14, 256)      68096
 ableConv2D)

 separable_7/relu (Quantized  (None, 14, 14, 256)      0
 ReLU)

 separable_8 (QuantizedSepar  (None, 14, 14, 256)      68096
 ableConv2D)

 separable_8/relu (Quantized  (None, 14, 14, 256)      0
 ReLU)

 separable_9 (QuantizedSepar  (None, 14, 14, 256)      68096
 ableConv2D)

 separable_9/relu (Quantized  (None, 14, 14, 256)      0
 ReLU)

 separable_10 (QuantizedSepa  (None, 14, 14, 256)      68096
 rableConv2D)

 separable_10/relu (Quantize  (None, 14, 14, 256)      0
 dReLU)

 separable_11 (QuantizedSepa  (None, 14, 14, 256)      68096
 rableConv2D)

 separable_11/relu (Quantize  (None, 14, 14, 256)      0
 dReLU)

 separable_12 (QuantizedSepa  (None, 7, 7, 512)        133888
 rableConv2D)

 separable_12/relu (Quantize  (None, 7, 7, 512)        0
 dReLU)

 separable_13 (QuantizedSepa  (None, 7, 7, 512)        267264
 rableConv2D)

 separable_13/global_avg (Gl  (None, 512)              0
 obalAveragePooling2D)

 separable_13/relu (Quantize  (None, 512)              0
 dReLU)

 dropout (Dropout)           (None, 512)               0

 classifier (QuantizedDense)  (None, 1000)             513000

=================================================================
Total params: 1,375,880
Trainable params: 1,375,880
Non-trainable params: 0
_________________________________________________________________

Check the model performance on the 10-image set.

from timeit import default_timer as timer

num_images = len(x_test)

start = timer()
potentials_keras = model_keras_quantized_pretrained.predict(x_test, batch_size=100)
end = timer()
print(f'Keras inference on {num_images} images took {end-start:.2f} s.\n')

preds_keras = np.squeeze(np.argmax(potentials_keras, 1))
accuracy_keras = np.sum(np.equal(preds_keras, labels_test)) / num_images

print(f"Keras accuracy: {accuracy_keras*100:.2f} %")
1/1 [==============================] - 1s 778ms/step
Keras inference on 10 images took 0.80 s.

Keras accuracy: 90.00 %
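
ImageNet performance is often also reported as top-5 accuracy. It can be computed from the same potentials with a few lines; a small sketch reusing the arrays defined above:

# Top-5 accuracy: count a prediction as correct when the true label is
# among the five highest potentials
top5_preds = np.argsort(potentials_keras, axis=1)[:, -5:]
top5_hits = [labels_test[i] in top5_preds[i] for i in range(num_images)]
print(f"Keras top-5 accuracy: {100 * np.mean(top5_hits):.2f} %")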

3. Conversion to Akida

3.1 Convert to Akida model

Here, the quantized Keras model is converted into a model suitable for the Akida accelerator. The cnn2snn.convert function only needs the Keras model as an argument.

from cnn2snn import convert

model_akida = convert(model_keras_quantized_pretrained)

The Model.summary method provides a detailed description of the model's layers.

model_akida.summary()
                 Model Summary
________________________________________________
Input shape    Output shape  Sequences  Layers
================================================
[224, 224, 3]  [1, 1, 1000]  1          15
________________________________________________

_____________________________________________________________
Layer (type)              Output shape    Kernel shape

============== SW/conv_0-classifier (Software) ==============

conv_0 (InputConv.)       [112, 112, 16]  (3, 3, 3, 16)
_____________________________________________________________
conv_1 (Conv.)            [112, 112, 32]  (3, 3, 16, 32)
_____________________________________________________________
conv_2 (Conv.)            [56, 56, 64]    (3, 3, 32, 64)
_____________________________________________________________
conv_3 (Conv.)            [56, 56, 64]    (3, 3, 64, 64)
_____________________________________________________________
separable_4 (Sep.Conv.)   [28, 28, 128]   (3, 3, 64, 1)
_____________________________________________________________
                                          (1, 1, 64, 128)
_____________________________________________________________
separable_5 (Sep.Conv.)   [28, 28, 128]   (3, 3, 128, 1)
_____________________________________________________________
                                          (1, 1, 128, 128)
_____________________________________________________________
separable_6 (Sep.Conv.)   [14, 14, 256]   (3, 3, 128, 1)
_____________________________________________________________
                                          (1, 1, 128, 256)
_____________________________________________________________
separable_7 (Sep.Conv.)   [14, 14, 256]   (3, 3, 256, 1)
_____________________________________________________________
                                          (1, 1, 256, 256)
_____________________________________________________________
separable_8 (Sep.Conv.)   [14, 14, 256]   (3, 3, 256, 1)
_____________________________________________________________
                                          (1, 1, 256, 256)
_____________________________________________________________
separable_9 (Sep.Conv.)   [14, 14, 256]   (3, 3, 256, 1)
_____________________________________________________________
                                          (1, 1, 256, 256)
_____________________________________________________________
separable_10 (Sep.Conv.)  [14, 14, 256]   (3, 3, 256, 1)
_____________________________________________________________
                                          (1, 1, 256, 256)
_____________________________________________________________
separable_11 (Sep.Conv.)  [14, 14, 256]   (3, 3, 256, 1)
_____________________________________________________________
                                          (1, 1, 256, 256)
_____________________________________________________________
separable_12 (Sep.Conv.)  [7, 7, 512]     (3, 3, 256, 1)
_____________________________________________________________
                                          (1, 1, 256, 512)
_____________________________________________________________
separable_13 (Sep.Conv.)  [1, 1, 512]     (3, 3, 512, 1)
_____________________________________________________________
                                          (1, 1, 512, 512)
_____________________________________________________________
classifier (Fully.)       [1, 1, 1000]    (1, 1, 512, 1000)
_____________________________________________________________

3.2 Check performance

The following only computes accuracy on the 10-image set.

# Check Model performance
start = timer()
accuracy_akida = model_akida.evaluate(x_test, labels_test)
end = timer()
print(f'Inference on {num_images} images took {end-start:.2f} s.\n')
print(f"Accuracy: {accuracy_akida*100:.2f} %")

# For non-regression purposes
assert accuracy_akida >= 0.8
Inference on 10 images took 0.22 s.

Accuracy: 80.00 %
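
The small accuracy drop can be traced back to individual images by comparing the Akida predictions with the Keras ones computed earlier. A short sketch using the predict method (the same method used in the next section):

# Compare per-image top-1 predictions of the quantized Keras and Akida models
potentials_akida = model_akida.predict(x_test).reshape(num_images, -1)
preds_akida = np.argmax(potentials_akida, axis=-1)
for i in range(num_images):
    if preds_akida[i] != preds_keras[i]:
        print(f'{x_test_files[i]}: Keras={preds_keras[i]}, '
              f'Akida={preds_akida[i]}, true={int(labels_test[i])}')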

3.3 Show predictions for a random image

import matplotlib.pyplot as plt
import matplotlib.lines as lines


# Functions used to display the top5 results
def get_top5(potentials, true_label):
    """
    Returns the top 5 classes from the output potentials
    """
    tmp_pots = potentials.copy()
    top5 = []
    min_val = np.min(tmp_pots)
    for ii in range(5):
        best = np.argmax(tmp_pots)
        top5.append(best)
        tmp_pots[best] = min_val

    vals = np.zeros((6,))
    vals[:5] = potentials[top5]
    if true_label not in top5:
        vals[5] = potentials[true_label]
    else:
        vals[5] = 0
    vals /= np.max(vals)

    class_name = []
    for ii in range(5):
        class_name.append(preprocessing.index_to_label(top5[ii]).split(',')[0])
    if true_label in top5:
        class_name.append('')
    else:
        class_name.append(
            preprocessing.index_to_label(true_label).split(',')[0])

    return top5, vals, class_name


def adjust_spines(ax, spines):
    for loc, spine in ax.spines.items():
        if loc in spines:
            spine.set_position(('outward', 10))  # outward by 10 points
        else:
            spine.set_color('none')  # don't draw spine
    # turn off ticks where there is no spine
    if 'left' in spines:
        ax.yaxis.set_ticks_position('left')
    else:
        # no yaxis ticks
        ax.yaxis.set_ticks([])
    if 'bottom' in spines:
        ax.xaxis.set_ticks_position('bottom')
    else:
        # no xaxis ticks
        ax.xaxis.set_ticks([])


def prepare_plots():
    fig = plt.figure(figsize=(8, 4))
    # Image subplot
    ax0 = plt.subplot(1, 3, 1)
    imgobj = ax0.imshow(np.zeros((IMAGE_SIZE, IMAGE_SIZE, NUM_CHANNELS), dtype=np.uint8))
    ax0.set_axis_off()
    # Top 5 results subplot
    ax1 = plt.subplot(1, 2, 2)
    bar_positions = (0, 1, 2, 3, 4, 6)
    rects = ax1.barh(bar_positions, np.zeros((6,)), align='center', height=0.5)
    plt.xlim(-0.2, 1.01)
    ax1.set(xlim=(-0.2, 1.15), ylim=(-1.5, 12))
    ax1.set_yticks(bar_positions)
    ax1.invert_yaxis()
    ax1.yaxis.set_ticks_position('left')
    ax1.xaxis.set_ticks([])
    adjust_spines(ax1, 'left')
    ax1.add_line(lines.Line2D((0, 0), (-0.5, 6.5), color=(0.0, 0.0, 0.0)))
    # Adjust Plot Positions
    ax0.set_position([0.05, 0.055, 0.3, 0.9])
    l1, b1, w1, h1 = ax1.get_position().bounds
    ax1.set_position([l1 * 1.05, b1 + 0.09 * h1, w1, 0.8 * h1])
    # Add title box
    plt.figtext(0.5,
                0.9,
                "Imagenet Classification by Akida",
                size=20,
                ha="center",
                va="center",
                bbox=dict(boxstyle="round",
                          ec=(0.5, 0.5, 0.5),
                          fc=(0.9, 0.9, 1.0)))

    return fig, imgobj, ax1, rects


def update_bars_chart(rects, vals, true_label, top5):
    """
    Updates the bar widths and colors of the top 5 display
    """
    counter = 0
    for rect, h in zip(rects, vals):
        rect.set_width(h)
        if counter < 5:
            if top5[counter] == true_label:
                if counter == 0:
                    rect.set_facecolor((0.0, 1.0, 0.0))
                else:
                    rect.set_facecolor((0.0, 0.5, 0.0))
            else:
                rect.set_facecolor('gray')
        elif counter == 5:
            rect.set_facecolor('red')
        counter += 1


# Prepare plots
fig, imgobj, ax1, rects = prepare_plots()

# Get a random image
img = np.random.randint(num_images)

# Predict image class
outputs_akida = model_akida.predict(np.expand_dims(x_test[img], axis=0)).squeeze()

# Get top 5 prediction labels and associated names
true_label = int(validation_labels[x_test_files[img]])
top5, yvals, class_name = get_top5(outputs_akida, true_label)

# Draw Plots
imgobj.set_data(x_test[img])
ax1.set_yticklabels(class_name, rotation='horizontal', size=9)
update_bars_chart(rects, yvals, true_label, top5)
fig.canvas.draw()
plt.show()
[Figure: Imagenet Classification by Akida, showing the test image alongside its top-5 prediction bars]

4. Hardware mapping and performance

4.1. Map on hardware

List the available Akida devices and check that an NSoC v2 (the Akida 1.0 production chip) is available.

If a device is installed but not detected, reinstalling the driver might help; see the driver setup helper.

devices = akida.devices()
print(f'Available devices: {[dev.desc for dev in devices]}')
assert len(devices), "No device found, this example needs an Akida NSoC_v2 device."
device = devices[0]
assert device.version == akida.NSoC_v2, "Wrong device found, this example needs an Akida NSoC_v2."
Available devices: ['PCIe/NSoC_v2/0']

Map the model onto the device.

model_akida.map(device)

# Check model mapping: NP allocation and binary size
model_akida.summary()
                    Model Summary
_____________________________________________________
Input shape    Output shape  Sequences  Layers  NPs
=====================================================
[224, 224, 3]  [1, 1, 1000]  1          15      32
_____________________________________________________

__________________________________________________________________
Layer (type)              Output shape    Kernel shape       NPs

====== HW/conv_0-classifier (Hardware) - size: 1230620 bytes =====

conv_0 (InputConv.)       [112, 112, 16]  (3, 3, 3, 16)      N/A
__________________________________________________________________
conv_1 (Conv.)            [112, 112, 32]  (3, 3, 16, 32)     4
__________________________________________________________________
conv_2 (Conv.)            [56, 56, 64]    (3, 3, 32, 64)     6
__________________________________________________________________
conv_3 (Conv.)            [56, 56, 64]    (3, 3, 64, 64)     3
__________________________________________________________________
separable_4 (Sep.Conv.)   [28, 28, 128]   (3, 3, 64, 1)      3
__________________________________________________________________
                                          (1, 1, 64, 128)
__________________________________________________________________
separable_5 (Sep.Conv.)   [28, 28, 128]   (3, 3, 128, 1)     2
__________________________________________________________________
                                          (1, 1, 128, 128)
__________________________________________________________________
separable_6 (Sep.Conv.)   [14, 14, 256]   (3, 3, 128, 1)     2
__________________________________________________________________
                                          (1, 1, 128, 256)
__________________________________________________________________
separable_7 (Sep.Conv.)   [14, 14, 256]   (3, 3, 256, 1)     1
__________________________________________________________________
                                          (1, 1, 256, 256)
__________________________________________________________________
separable_8 (Sep.Conv.)   [14, 14, 256]   (3, 3, 256, 1)     1
__________________________________________________________________
                                          (1, 1, 256, 256)
__________________________________________________________________
separable_9 (Sep.Conv.)   [14, 14, 256]   (3, 3, 256, 1)     1
__________________________________________________________________
                                          (1, 1, 256, 256)
__________________________________________________________________
separable_10 (Sep.Conv.)  [14, 14, 256]   (3, 3, 256, 1)     1
__________________________________________________________________
                                          (1, 1, 256, 256)
__________________________________________________________________
separable_11 (Sep.Conv.)  [14, 14, 256]   (3, 3, 256, 1)     1
__________________________________________________________________
                                          (1, 1, 256, 256)
__________________________________________________________________
separable_12 (Sep.Conv.)  [7, 7, 512]     (3, 3, 256, 1)     2
__________________________________________________________________
                                          (1, 1, 256, 512)
__________________________________________________________________
separable_13 (Sep.Conv.)  [1, 1, 512]     (3, 3, 512, 1)     4
__________________________________________________________________
                                          (1, 1, 512, 512)
__________________________________________________________________
classifier (Fully.)       [1, 1, 1000]    (1, 1, 512, 1000)  1
__________________________________________________________________

4.2. Performance measurement

Power measurement must be enabled on the device's SoC (it is disabled by default). After sending data for inference, performance measurements are available in the model statistics.

# Enable power measurement
device.soc.power_measurement_enabled = True

# Send data for inference
_ = model_akida.forward(x_test)

# Display floor current
floor_power = device.soc.power_meter.floor
print(f'Floor power: {floor_power:.2f} mW')

# Retrieve statistics
print(model_akida.statistics)
Floor power: 884.31 mW
Average framerate = 23.58 fps
Last inference power range (mW):  Avg 1038.40 / Min 884.00 / Max 1126.00 / Std 99.25
Last inference energy consumed (mJ/frame): 44.03
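
Note that the reported energy per frame is consistent with the average inference power divided by the framerate. A quick back-of-the-envelope check, with values hard-coded from the run above (actual numbers vary per run and device):

# Sanity check: energy per frame (mJ) ~ average power (mW) / framerate (fps)
avg_power_mw = 1038.40   # average power reported in the statistics above
framerate_fps = 23.58    # framerate reported in the statistics above
print(f'Estimated energy per frame: {avg_power_mw / framerate_fps:.2f} mJ')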

Total running time of the script: (0 minutes 9.266 seconds)
