AkidaNet/ImageNet inference
This tutorial shows how to convert an AkidaNet model, map it onto AKD1000 hardware, and measure its performance.
AkidaNet architecture is a MobileNet v1-inspired architecture optimized for implementation on Akida 1.0: it exploits the richer expressive power of standard convolutions in early layers, but uses separable convolutions in later layers where filter memory is limiting.
As ImageNet images are not publicly available, performance is assessed using a set of 10 copyright-free images that were found on Google using ImageNet class names.
This tutorial uses an Akida 1.0 architecture to show AKD1000 mapping and performance. See the dedicated tutorial for 1.0 and 2.0 differences.
1. Dataset preparation
Test images all have at least 256 pixels in the smallest dimension. They must
be preprocessed to fit the model input. The imagenet.preprocessing.get_preprocessed_samples
function loads and preprocesses a set of 10 ImageNet-like images: each image is
decoded and cropped, and a square 224x224x3 patch is extracted from it.
The input size is set to 224x224x3 here because this is what the model presented in the next section expects.
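To make the crop step concrete, here is a minimal sketch of a centered square crop computed in plain Python. It is an illustration only: the helper name central_crop_box is hypothetical, and the actual get_preprocessed_samples implementation may resize or crop differently.

```python
def central_crop_box(height, width, size=224):
    """Return (top, left, bottom, right) of a centered size x size patch.

    Hypothetical helper for illustration; not part of akida_models.
    """
    if height < size or width < size:
        raise ValueError("image is smaller than the requested patch")
    top = (height - size) // 2
    left = (width - size) // 2
    return top, left, top + size, left + size


# A 256x300 image yields a 224x224 patch centered in the frame
box = central_crop_box(256, 300)
print(box)
```

Because every test image has at least 256 pixels in its smallest dimension, a 224-pixel patch always fits and the error branch is never reached for this set.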
import akida
import numpy as np
from akida_models.imagenet import get_preprocessed_samples
# Model specification and hyperparameters
NUM_CHANNELS = 3
IMAGE_SIZE = 224

# Load the preprocessed images and their corresponding labels for the test set
x_test, labels_test = get_preprocessed_samples(IMAGE_SIZE, NUM_CHANNELS)
print(f'{x_test.shape[0]} images and their labels are loaded and preprocessed.')
10 images and their labels are loaded and preprocessed.
2. Pretrained quantized model
The Akida model zoo provides a helper, akidanet_imagenet_pretrained, that loads a pretrained quantized AkidaNet model.
The quantization scheme for this model is the following:
the first layer has 8-bit weights,
all other layers have 4-bit weights,
all activations are 4-bit.
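To give a feel for what 8-bit versus 4-bit weights mean, the sketch below applies a generic symmetric uniform quantizer to a few values. This is an assumption made for illustration: the function quantize_symmetric is hypothetical and is not the scheme used by cnn2snn to train this model.

```python
def quantize_symmetric(values, bits):
    """Map floats to signed integers on a symmetric `bits`-bit grid.

    Hypothetical helper: returns the integer codes and the scale that
    converts them back to approximate float values (code * scale).
    """
    q_max = 2 ** (bits - 1) - 1  # 7 for 4-bit, 127 for 8-bit
    peak = max(abs(v) for v in values)
    if peak == 0:
        return [0] * len(values), 1.0
    scale = peak / q_max
    return [round(v / scale) for v in values], scale


weights = [-1.0, -0.4, 0.0, 0.3, 1.0]
codes_4bit, scale_4bit = quantize_symmetric(weights, 4)
codes_8bit, scale_8bit = quantize_symmetric(weights, 8)
print(codes_4bit)
print(codes_8bit)
```

With 4 bits only 15 distinct levels are available, so nearby weights collapse onto the same code more often than with 8 bits.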
from cnn2snn import set_akida_version, AkidaVersion
from akida_models import akidanet_imagenet_pretrained
# Use a quantized model with pretrained quantized weights
with set_akida_version(AkidaVersion.v1):
    model_keras_quantized_pretrained = akidanet_imagenet_pretrained(0.5)
model_keras_quantized_pretrained.summary()
/usr/local/lib/python3.8/dist-packages/akida_models/ UserWarning: Model akidanet_imagenet_224_alpha_50_iq8_wq4_aq4.h5 has been trained with akida_models 1.1.10 which is the last version supporting 1.0 models training
warnings.warn(f'Model {model_name_v1} has been trained with akida_models 1.1.10 which is '
Model: "sequential_2"
Layer (type) Output Shape Param #
rescaling (Rescaling) (None, 224, 224, 3) 0
conv_0 (QuantizedConv2D) (None, 112, 112, 16) 448
conv_0/relu (QuantizedReLU) (None, 112, 112, 16) 0
conv_1 (QuantizedConv2D) (None, 112, 112, 32) 4640
conv_1/relu (QuantizedReLU) (None, 112, 112, 32) 0
conv_2 (QuantizedConv2D) (None, 56, 56, 64) 18496
conv_2/relu (QuantizedReLU) (None, 56, 56, 64) 0
conv_3 (QuantizedConv2D) (None, 56, 56, 64) 36928
conv_3/relu (QuantizedReLU) (None, 56, 56, 64) 0
separable_4 (QuantizedSepar (None, 28, 28, 128) 8896
separable_4/relu (Quantized (None, 28, 28, 128) 0
separable_5 (QuantizedSepar (None, 28, 28, 128) 17664
separable_5/relu (Quantized (None, 28, 28, 128) 0
separable_6 (QuantizedSepar (None, 14, 14, 256) 34176
separable_6/relu (Quantized (None, 14, 14, 256) 0
separable_7 (QuantizedSepar (None, 14, 14, 256) 68096
separable_7/relu (Quantized (None, 14, 14, 256) 0
separable_8 (QuantizedSepar (None, 14, 14, 256) 68096
separable_8/relu (Quantized (None, 14, 14, 256) 0
separable_9 (QuantizedSepar (None, 14, 14, 256) 68096
separable_9/relu (Quantized (None, 14, 14, 256) 0
separable_10 (QuantizedSepa (None, 14, 14, 256) 68096
separable_10/relu (Quantize (None, 14, 14, 256) 0
separable_11 (QuantizedSepa (None, 14, 14, 256) 68096
separable_11/relu (Quantize (None, 14, 14, 256) 0
separable_12 (QuantizedSepa (None, 7, 7, 512) 133888
separable_12/relu (Quantize (None, 7, 7, 512) 0
separable_13 (QuantizedSepa (None, 7, 7, 512) 267264
separable_13/global_avg (Gl (None, 512) 0
separable_13/relu (Quantize (None, 512) 0
dropout (Dropout) (None, 512) 0
classifier (QuantizedDense) (None, 1000) 513000
Total params: 1,375,880
Trainable params: 1,375,880
Non-trainable params: 0
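The parameter counts in the summary above illustrate why separable convolutions are used where filter memory is limiting. The sketch below reproduces two of them, assuming 3x3 kernels and one bias per output filter (an assumption that matches the listed numbers); the helper names are hypothetical.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights plus one bias per filter for a standard convolution."""
    return kernel * kernel * in_ch * out_ch + out_ch


def separable_params(kernel, in_ch, out_ch):
    """Depthwise kernel, pointwise 1x1 weights, and one bias per output filter."""
    depthwise = kernel * kernel * in_ch
    pointwise = in_ch * out_ch
    return depthwise + pointwise + out_ch


conv_3 = conv_params(3, 64, 64)                 # 36928, as listed for conv_3
separable_7 = separable_params(3, 256, 256)     # 68096, as listed for separable_7
standard_equivalent = conv_params(3, 256, 256)  # 590080 if separable_7 were a standard conv
print(conv_3, separable_7, standard_equivalent)
```

For a 256-to-256 channel layer, the separable form needs roughly 8.7 times fewer parameters than a standard 3x3 convolution of the same shape.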
Check model performance on the set of 10 images.
from timeit import default_timer as timer
num_images = len(x_test)
start = timer()
potentials_keras = model_keras_quantized_pretrained.predict(x_test, batch_size=100)
end = timer()
print(f'Keras inference on {num_images} images took {end-start:.2f} s.\n')
preds_keras = np.squeeze(np.argmax(potentials_keras, 1))
accuracy_keras = np.sum(np.equal(preds_keras, labels_test)) / num_images
print(f"Keras accuracy: {accuracy_keras*num_images:.0f}/{num_images}.")
Keras inference on 10 images took 0.81 s.
Keras accuracy: 9/10.
3. Conversion to Akida
3.1 Convert to Akida model
Here, the quantized Keras model is converted into a version suitable for the Akida accelerator. The cnn2snn.convert function only needs the Keras model as an argument.
from cnn2snn import convert
model_akida = convert(model_keras_quantized_pretrained)
The Model.summary method provides a detailed description of the Model layers.
model_akida.summary()
Model Summary
Input shape Output shape Sequences Layers
[224, 224, 3] [1, 1, 1000] 1 15
Layer (type) Output shape Kernel shape
============== SW/conv_0-classifier (Software) ==============
conv_0 (InputConv.) [112, 112, 16] (3, 3, 3, 16)
conv_1 (Conv.) [112, 112, 32] (3, 3, 16, 32)
conv_2 (Conv.) [56, 56, 64] (3, 3, 32, 64)
conv_3 (Conv.) [56, 56, 64] (3, 3, 64, 64)
separable_4 (Sep.Conv.) [28, 28, 128] (3, 3, 64, 1)
(1, 1, 64, 128)
separable_5 (Sep.Conv.) [28, 28, 128] (3, 3, 128, 1)
(1, 1, 128, 128)
separable_6 (Sep.Conv.) [14, 14, 256] (3, 3, 128, 1)
(1, 1, 128, 256)
separable_7 (Sep.Conv.) [14, 14, 256] (3, 3, 256, 1)
(1, 1, 256, 256)
separable_8 (Sep.Conv.) [14, 14, 256] (3, 3, 256, 1)
(1, 1, 256, 256)
separable_9 (Sep.Conv.) [14, 14, 256] (3, 3, 256, 1)
(1, 1, 256, 256)
separable_10 (Sep.Conv.) [14, 14, 256] (3, 3, 256, 1)
(1, 1, 256, 256)
separable_11 (Sep.Conv.) [14, 14, 256] (3, 3, 256, 1)
(1, 1, 256, 256)
separable_12 (Sep.Conv.) [7, 7, 512] (3, 3, 256, 1)
(1, 1, 256, 512)
separable_13 (Sep.Conv.) [1, 1, 512] (3, 3, 512, 1)
(1, 1, 512, 512)
classifier (Fully.) [1, 1, 1000] (1, 1, 512, 1000)
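The spatial sizes in the summary can be traced by hand. The sketch below assumes 'same' padding and uses strides inferred from the listed output shapes (they are not stated in the summary itself), so treat it as a consistency check rather than a description of the model definition.

```python
import math


def same_padding_output(size, stride):
    """Output size of a 'same'-padded convolution along one dimension."""
    return math.ceil(size / stride)


# Strides inferred from the shapes of conv_0 through separable_13
strides = [2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 1]

size = 224
sizes = []
for stride in strides:
    size = same_padding_output(size, stride)
    sizes.append(size)
print(sizes)
```

The last value is 7: the Keras summary shows separable_13 with a 7x7 output followed by global average pooling, while the Akida summary reports the pooled [1, 1, 512] shape directly on separable_13.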
3.2 Check performance
The following computes accuracy on the set of 10 images only.
# Check Model performance
start = timer()
accuracy_akida = model_akida.evaluate(x_test, labels_test)
end = timer()
print(f'Inference on {num_images} images took {end-start:.2f} s.\n')
print(f"Accuracy: {accuracy_akida*num_images:.0f}/{num_images}.")
# For non-regression purposes
assert accuracy_akida >= 0.8
Inference on 10 images took 0.25 s.
Accuracy: 8/10.
3.3 Show predictions for a random image
Labels for test images are stored in the akida_models package. The mapping
between names (strings) and labels (integers) is provided by the
imagenet.preprocessing module.
import matplotlib.pyplot as plt
import matplotlib.lines as lines
from akida_models.imagenet import preprocessing
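# Functions used to display the top5 results
def get_top5(potentials, true_label):
    """
    Returns the top 5 classes from the output potentials
    """
    tmp_pots = potentials.copy()
    top5 = []
    min_val = np.min(tmp_pots)
    for ii in range(5):
        best = np.argmax(tmp_pots)
        top5.append(best)
        tmp_pots[best] = min_val

    # Bars 0-4 hold the top 5 potentials, bar 5 holds the true class if it was missed
    vals = np.zeros((6,))
    vals[:5] = potentials[top5]
    if true_label not in top5:
        vals[5] = potentials[true_label]
    else:
        vals[5] = 0
    vals /= np.max(vals)

    # Class names: index_to_label is assumed to be the lookup helper of the
    # preprocessing module imported above
    class_name = []
    for ii in range(5):
        class_name.append(preprocessing.index_to_label(top5[ii]).split(',')[0])
    if true_label in top5:
        class_name.append('')
    else:
        class_name.append(preprocessing.index_to_label(true_label).split(',')[0])

    return top5, vals, class_name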
def adjust_spines(ax, spines):
    for loc, spine in ax.spines.items():
        if loc in spines:
            spine.set_position(('outward', 10))  # outward by 10 points
        else:
            spine.set_color('none')  # don't draw spine
    # turn off ticks where there is no spine
    if 'left' in spines:
        ax.yaxis.set_ticks_position('left')
    else:
        # no yaxis ticks
        ax.yaxis.set_ticks([])
    if 'bottom' in spines:
        ax.xaxis.set_ticks_position('bottom')
    else:
        # no xaxis ticks
        ax.xaxis.set_ticks([])
def prepare_plots():
    fig = plt.figure(figsize=(8, 4))
    # Image subplot
    ax0 = plt.subplot(1, 3, 1)
    imgobj = ax0.imshow(np.zeros((IMAGE_SIZE, IMAGE_SIZE, NUM_CHANNELS), dtype=np.uint8))
    ax0.set_axis_off()
    # Top 5 results subplot
    ax1 = plt.subplot(1, 2, 2)
    bar_positions = (0, 1, 2, 3, 4, 6)
    rects = ax1.barh(bar_positions, np.zeros((6,)), align='center', height=0.5)
    plt.xlim(-0.2, 1.01)
    ax1.set(xlim=(-0.2, 1.15), ylim=(-1.5, 12))
    ax1.set_yticks(bar_positions)
    ax1.invert_yaxis()
    adjust_spines(ax1, 'left')
    ax1.add_line(lines.Line2D((0, 0), (-0.5, 6.5), color=(0.0, 0.0, 0.0)))
    # Adjust Plot Positions
    ax0.set_position([0.05, 0.055, 0.3, 0.9])
    l1, b1, w1, h1 = ax1.get_position().bounds
    ax1.set_position([l1 * 1.05, b1 + 0.09 * h1, w1, 0.8 * h1])
    # Add title box (position, text size and alignment are assumed values)
    plt.figtext(0.5,
                0.9,
                "Imagenet Classification by Akida",
                size=20,
                ha="center",
                va="center",
                bbox=dict(boxstyle="round",
                          ec=(0.5, 0.5, 0.5),
                          fc=(0.9, 0.9, 1.0)))
    return fig, imgobj, ax1, rects
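def update_bars_chart(rects, vals, true_label):
    # Relies on the module-level top5 list computed by get_top5
    counter = 0
    for rect, h in zip(rects, vals):
        rect.set_width(h)
        if counter < 5:
            if top5[counter] == true_label:
                if counter == 0:
                    rect.set_facecolor((0.0, 1.0, 0.0))
                else:
                    rect.set_facecolor((0.0, 0.5, 0.0))
            else:
                rect.set_facecolor('gray')  # assumed color for wrong predictions
        elif counter == 5:
            rect.set_facecolor('red')  # assumed color for a missed true label
        counter += 1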
# Prepare plots
fig, imgobj, ax1, rects = prepare_plots()
# Get a random image
img = np.random.randint(num_images)
# Predict image class
outputs_akida = model_akida.predict(np.expand_dims(x_test[img], axis=0)).squeeze()
# Get top 5 prediction labels and associated names
true_label = labels_test[img]
top5, yvals, class_name = get_top5(outputs_akida, true_label)
# Draw Plots
imgobj.set_data(x_test[img])
ax1.set_yticklabels(class_name, rotation='horizontal', size=9)
update_bars_chart(rects, yvals, true_label)
fig.canvas.draw()
plt.show()
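The top 5 selection performed by get_top5 can also be expressed as a sort over indices. The sketch below is a plain Python alternative with a hypothetical helper name, shown on made-up scores rather than real model potentials.

```python
def top_k_indices(scores, k=5):
    """Indices of the k largest scores, best first.

    Hypothetical helper; ties keep the lower index first because
    sorted() is stable even with reverse=True.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Made-up scores for illustration only
scores = [0.1, 2.5, -1.0, 0.7, 3.2, 0.0, 1.9]
best = top_k_indices(scores)
print(best)
```

Unlike the loop in get_top5, this version does not modify a copy of the scores, at the cost of sorting all classes instead of scanning five times.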

4. Hardware mapping and performance
4.1. Map on hardware
List the available Akida devices and check that an NSoC V2 (Akida 1.0 production chip) is available.
If a device is installed but not detected, reinstalling the driver might help; see the driver setup helper.
devices = akida.devices()
print(f'Available devices: {[dev.desc for dev in devices]}')
assert len(devices), "No device found, this example needs an Akida NSoC_v2 device."
device = devices[0]
assert device.version == akida.NSoC_v2, "Wrong device found, this example needs an Akida NSoC_v2."
Available devices: ['PCIe/NSoC_v2/0']
Map the model on the device.
model_akida.map(device)
# Check model mapping: NP allocation and binary size
model_akida.summary()
Model Summary
Input shape Output shape Sequences Layers NPs
[224, 224, 3] [1, 1, 1000] 1 15 68
Layer (type) Output shape Kernel shape NPs
====== HW/conv_0-classifier (Hardware) - size: 1361240 bytes =====
conv_0 (InputConv.) [112, 112, 16] (3, 3, 3, 16) N/A
conv_1 (Conv.) [112, 112, 32] (3, 3, 16, 32) 4
conv_2 (Conv.) [56, 56, 64] (3, 3, 32, 64) 6
conv_3 (Conv.) [56, 56, 64] (3, 3, 64, 64) 3
separable_4 (Sep.Conv.) [28, 28, 128] (3, 3, 64, 1) 6
(1, 1, 64, 128)
separable_5 (Sep.Conv.) [28, 28, 128] (3, 3, 128, 1) 4
(1, 1, 128, 128)
separable_6 (Sep.Conv.) [14, 14, 256] (3, 3, 128, 1) 8
(1, 1, 128, 256)
separable_7 (Sep.Conv.) [14, 14, 256] (3, 3, 256, 1) 4
(1, 1, 256, 256)
separable_8 (Sep.Conv.) [14, 14, 256] (3, 3, 256, 1) 4
(1, 1, 256, 256)
separable_9 (Sep.Conv.) [14, 14, 256] (3, 3, 256, 1) 4
(1, 1, 256, 256)
separable_10 (Sep.Conv.) [14, 14, 256] (3, 3, 256, 1) 4
(1, 1, 256, 256)
separable_11 (Sep.Conv.) [14, 14, 256] (3, 3, 256, 1) 4
(1, 1, 256, 256)
separable_12 (Sep.Conv.) [7, 7, 512] (3, 3, 256, 1) 8
(1, 1, 256, 512)
separable_13 (Sep.Conv.) [1, 1, 512] (3, 3, 512, 1) 8
(1, 1, 512, 512)
classifier (Fully.) [1, 1, 1000] (1, 1, 512, 1000) 1
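As a quick check of the mapping summary, the per-layer NP counts can be added up and compared with the total reported in the header. The values below are copied from the table above; conv_0 is listed as N/A and is left out.

```python
# NPs per layer, copied from the mapping summary (conv_0 is N/A)
nps_per_layer = {
    "conv_1": 4, "conv_2": 6, "conv_3": 3,
    "separable_4": 6, "separable_5": 4, "separable_6": 8,
    "separable_7": 4, "separable_8": 4, "separable_9": 4,
    "separable_10": 4, "separable_11": 4, "separable_12": 8,
    "separable_13": 8, "classifier": 1,
}

total_nps = sum(nps_per_layer.values())
print(f"Total NPs: {total_nps}")
```

The sum matches the 68 NPs shown in the summary header.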
4.2. Performance measurement
Power measurement must be enabled on the device's SoC (it is disabled by default). After sending data for inference, performance measurements are available in the model statistics.
# Enable power measurement
device.soc.power_measurement_enabled = True
# Send data for inference
_ = model_akida.forward(x_test)
# Display floor power
floor_power = device.soc.power_meter.floor
print(f'Floor power: {floor_power:.2f} mW')
# Retrieve statistics
print(model_akida.statistics)
Floor power: 881.10 mW
Average framerate = 53.76 fps
Last inference power range (mW): Avg 1034.50 / Min 881.00 / Max 1188.00 / Std 217.08
Last inference energy consumed (mJ/frame): 19.24
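The reported figures can be related to each other with simple arithmetic. The sketch below uses the numbers printed above and observes that they are consistent with energy per frame being the average power divided by the framerate. This is an observation about these values, not a statement of how the statistics are computed internally.

```python
# Values copied from the output above
avg_power_mw = 1034.50
floor_power_mw = 881.10
framerate_fps = 53.76

# mW divided by frames per second gives mJ per frame
energy_mj_per_frame = avg_power_mw / framerate_fps

# Power drawn above the idle floor during inference
dynamic_power_mw = avg_power_mw - floor_power_mw

print(f"{energy_mj_per_frame:.2f} mJ/frame")
print(f"{dynamic_power_mw:.2f} mW above floor")
```

With only 10 images, these measurements cover a very short window, so they should be read as indicative rather than as a benchmark.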
Total running time of the script: (0 minutes 8.298 seconds)