Build Vision Transformers for Akida

The Vision Transformer, or ViT, is a model for image classification that employs a Transformer-like architecture over patches of the image. An image is split into fixed-size patches, each of which is then linearly embedded; position embeddings are added, and the resulting sequence of vectors is fed to a standard Transformer encoder. Please refer to https://arxiv.org/abs/2010.11929 for further details.
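To make the patch and position embedding step concrete, here is a minimal Keras sketch (an illustration only, not the Akida API; the 224x224 input, 16x16 patches and hidden size 192 match the vit_ti16 configuration shown later in this tutorial):

from tensorflow import keras

image_size, patch_size, hidden_size = 224, 16, 192
num_patches = (image_size // patch_size) ** 2  # 14 * 14 = 196 patches

inputs = keras.Input((image_size, image_size, 3))
# A Conv2D whose kernel and stride both equal the patch size is equivalent to
# splitting the image into non-overlapping patches and linearly embedding each.
x = keras.layers.Conv2D(hidden_size, patch_size, strides=patch_size)(inputs)
x = keras.layers.Reshape((num_patches, hidden_size))(x)
print(keras.Model(inputs, x).output_shape)  # (None, 196, 192)
# Prepending a learnable class token and adding position embeddings then yields
# the (None, 197, 192) sequence that is fed to the Transformer encoder.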

Akida 2.0 now supports patch and position embeddings and the encoder block in hardware. This tutorial explains how to build an optimized ViT for Akida 2.0 hardware using the Akida models python API.

1. Model selection

There are many variants of ViT. The choice of the model is typically influenced by the tradeoff among architecture size, accuracy, inference speed, and training capabilities.

The following table shows a few commonly used ViT variants:

Architecture     Original accuracy   #Params   Configuration
ViT Base         79.90%              86M       12 heads, 12 blocks, hidden size 768
ViT Tiny         75.48%              5.8M      3 heads, 12 blocks, hidden size 192
DeiT-dist Tiny   74.17%              5.8M      3 heads, 12 blocks, hidden size 192

Note

Vision Transformer support was introduced in Akida 2.0.

The Akida model zoo provides tiny ViT architectures that are optimized to run on Akida hardware:

- ViT tiny (vit_ti16)
- DeiT-dist tiny (deit_ti16)

Both architectures have been modified so that their layers can be quantized to integer-only operations.

2. Model optimization for Akida hardware

ViT has many encoder blocks that perform self-attention to process visual data. Each encoder block consists of several different layers. Running ViT optimally at the edge on Akida requires transforming the encoder block as follows:

- replace LayerNormalization with a LayerMadNormalization (LMN) layer,
- replace the last LayerNormalization with BatchNormalization (BN),
- replace the GeLU activation with ReLU8,
- replace softmax in the attention block with a shiftmax (softmax2) operation.
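These transformations can also be expressed directly when building the model in Python. The sketch below assumes that the akida_models package exposes vit_ti16 as a function whose keyword arguments mirror the CLI options shown in Section 3 (--norm, --last_norm, --act, --softmax); check the CLI help below if the signature differs:

# Assumed Python equivalent of the CLI options demonstrated in Section 3.
from akida_models import vit_ti16

model = vit_ti16(classes=1000,
                 norm="LMN",          # LayerMadNormalization in encoder blocks
                 last_norm="BN",      # BatchNormalization as the final norm
                 act="ReLU8",         # ReLU8 instead of GeLU
                 softmax="softmax2")  # shiftmax instead of softmax in attention
model.summary()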

Note

The sections below show different ways to train a ViT for Akida that uses the above transformations.

3. Model Training

Akida accelerates ViT models that include the transformations described in Section 2. A ViT that runs optimally on Akida can be trained in either of the following two ways:

3.1 Option 1: Training a ViT (original) model first and then transforming each layer incrementally

First, train a ViT (original) model on a custom dataset until it reaches satisfactory accuracy. It is then possible to transform this model into an Akida-optimized one as per Section 2. The layers described in Section 2 are functionally equivalent to their counterparts in the original model.

Note

To limit the accuracy drop from the original when transforming the model as per Section 2, it is recommended to replace the original layers one at a time and to fine-tune after every step.

The example below shows the transformation of ViT (tiny) into an optimized model that can run on Akida hardware.

The akida_models python package provides a Command Line Interface (CLI) to transform vit_ti16 and deit_ti16 model architectures and fine-tune them respectively.

$ akida_models create vit_ti16 -h
usage: akida_models create vit_ti16 [-h] [-c CLASSES] [-bw BASE_WEIGHTS] [--norm {LN,GN1,BN,LMN}]
                                    [--last_norm {LN,BN}] [--softmax {softmax,softmax2}]
                                    [--act {GeLU,ReLU8,swish}] [-i {224,384}]

optional arguments:
  -h, --help            show this help message and exit
  -c CLASSES, --classes CLASSES
                        The number of classes, by default 1000.
  -bw BASE_WEIGHTS, --base_weights BASE_WEIGHTS
                        Optional keras weights to load in the model, by default None.
  --norm {LN,GN1,BN,LMN}
                        Replace normalization in model with a custom function, by default LN
  --last_norm {LN,BN}   Replace last normalization in model with a custom function, by default LN
  --softmax {softmax,softmax2}
                        Replace softmax operation in model with custom function, by default softmax
  --act {GeLU,ReLU8,swish}
                        Replace activation function in model with custom function, by default GeLU
  -i {224,384}, --image_size {224,384}
                        The square input image size

The following shows the transformation of a vit_ti16 model architecture that was trained on ImageNet. The same method can be applied to other datasets.

# download the pre-trained weights
wget https://data.brainchip.com/models/AkidaV2/vit/vit_ti16_224.h5

# transformation 1: replace layer normalization with mad norm layer and last layer normalization with batch normalization
akida_models create -s vit_ti16_lmnbn.h5 vit_ti16 -bw vit_ti16_224.h5 --norm LMN --last_norm BN
# fine-tuning
imagenet_train tune -m vit_ti16_lmnbn.h5 -e 15 --optim Adam --lr_policy cosine_decay \
                    -lr 6e-5 -s vit_ti16_lmnbn_tuned.h5

# transformation 2: replace GeLU layer with ReLU
akida_models create -s vit_ti16_relu.h5 vit_ti16 -bw vit_ti16_lmnbn_tuned.h5 --norm LMN --last_norm BN --act ReLU8
# fine-tuning
imagenet_train tune -m vit_ti16_relu.h5 -e 15 --optim Adam --lr_policy cosine_decay \
                    -lr 6e-5 -s vit_ti16_relu_tuned.h5

# transformation 3: replace softmax with shiftmax layer
akida_models create -s vit_ti16_shiftmax.h5 vit_ti16 -bw vit_ti16_relu_tuned.h5 --norm LMN --last_norm BN --act ReLU8 --softmax softmax2
# fine-tuning
imagenet_train tune -m vit_ti16_shiftmax.h5 -e 15 --optim Adam --lr_policy cosine_decay \
                    -lr 6e-5 -s vit_ti16_transformed.h5

The above transformations generate a ViT model that is optimized to run efficiently on Akida hardware. Similar steps can also be applied to deit_ti16. The table below compares the accuracy of the original and transformed models.

Architecture   Original accuracy   Transformed accuracy
ViT            75.48%              74.25%
DeiT-dist      74.17%              75.03%

Note

The models obtained above have floating point weights and are ready to be quantized. See Section 4.
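As a preview of that step, here is a minimal quantization sketch; it assumes quantizeml's quantize helper and QuantizationParams with 8-bit inputs and 4-bit weights and activations, while the exact workflow (including calibration) is covered in Section 4:

from akida_models.model_io import load_model
from quantizeml.models import quantize, QuantizationParams

# Load the transformed float model produced by the steps above.
model = load_model("vit_ti16_transformed.h5")
qparams = QuantizationParams(input_weight_bits=8, weight_bits=4, activation_bits=4)
model_quantized = quantize(model, qparams=qparams)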

3.2 Option 2: Transfer Learning using Pre-trained transformed model

The Akida models python package provides APIs for ViTs, including pre-trained models for vit_ti16 and deit_ti16. These models can be used for Transfer Learning on a custom dataset. Since they are already transformed, no further transformation is required.

Visit our Transfer Learning Example to learn more about Transfer Learning using the Akida models python package. The following code snippet downloads a pre-trained model that can be used for Transfer Learning.

# The following API downloads the vit_ti16 model trained on the ImageNet dataset
from akida_models.model_io import load_model
from tensorflow.keras.utils import get_file

# Retrieve the float model with pretrained weights and load it
model_file = get_file(
    "bc_vit_ti16_224.h5",
    "https://data.brainchip.com/models/AkidaV2/vit/bc_vit_ti16_224.h5",
    cache_subdir='models/akidanet_imagenet')
model_keras = load_model(model_file)
model_keras.summary()
Downloading data from https://data.brainchip.com/models/AkidaV2/vit/bc_vit_ti16_224.h5

23695632/23695632 [==============================] - 26s 1us/step
/usr/local/lib/python3.8/dist-packages/keras/initializers/initializers.py:120: UserWarning: The initializer TruncatedNormal is unseeded and being called multiple times, which will return identical values each time (even if the initializer is unseeded). Please update your code to provide a seed to the initializer, or avoid using the same initalizer instance more than once.
  warnings.warn(
Model: "vit-tiny"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to
==================================================================================================
 input (InputLayer)             [(None, 224, 224, 3  0           []
                                )]

 Rescale (Rescaling)            (None, 224, 224, 3)  0           ['input[0][0]']

 Embedding (Conv2D)             (None, 14, 14, 192)  147648      ['Rescale[0][0]']

 reshape (Reshape)              (None, 196, 192)     0           ['Embedding[0][0]']

 ClassToken (ClassToken)        (None, 197, 192)     192         ['reshape[0][0]']

 Transformer/PosEmbed (AddPosit  (None, 197, 192)    37824       ['ClassToken[0][0]']
 ionEmbs)

 Transformer/EncoderBlock_0/Lay  (None, 197, 192)    384         ['Transformer/PosEmbed[0][0]']
 erNorm_0 (LayerMadNormalizatio
 n)

 Transformer/EncoderBlock_0/Mul  (None, 197, 192)    37056       ['Transformer/EncoderBlock_0/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (Dense)

 Transformer/EncoderBlock_0/Mul  (None, 197, 192)    37056       ['Transformer/EncoderBlock_0/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (Dense)

 Transformer/EncoderBlock_0/Mul  (None, 197, 192)    37056       ['Transformer/EncoderBlock_0/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (Dense)

 Transformer/EncoderBlock_0/Mul  ((None, 197, 192),  0           ['Transformer/EncoderBlock_0/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (Attention)            ))                               0][0]',
                                                                  'Transformer/EncoderBlock_0/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_0/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_0/Mul  (None, 197, 192)    37056       ['Transformer/EncoderBlock_0/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (Dense)                                                       ion[0][0]']

 dropout (Dropout)              (None, 197, 192)     0           ['Transformer/EncoderBlock_0/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_0/add  (None, 197, 192)    0           ['dropout[0][0]',
 _1 (Add)                                                         'Transformer/PosEmbed[0][0]']

 Transformer/EncoderBlock_0/Lay  (None, 197, 192)    384         ['Transformer/EncoderBlock_0/add_
 erNorm_2 (LayerMadNormalizatio                                  1[0][0]']
 n)

 Transformer/EncoderBlock_0/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_0/Laye
 Block/Dense_0 (Dense)                                           rNorm_2[0][0]']

 Transformer/EncoderBlock_0/Mlp  (None, 197, 768)    0           ['Transformer/EncoderBlock_0/MlpB
 Block/activation (ReLU)                                         lock/Dense_0[0][0]']

 dropout_1 (Dropout)            (None, 197, 768)     0           ['Transformer/EncoderBlock_0/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_0/Mlp  (None, 197, 192)    147648      ['dropout_1[0][0]']
 Block/Dense_1 (Dense)

 dropout_2 (Dropout)            (None, 197, 192)     0           ['Transformer/EncoderBlock_0/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_0/add  (None, 197, 192)    0           ['Transformer/EncoderBlock_0/add_
 _2 (Add)                                                        1[0][0]',
                                                                  'dropout_2[0][0]']

 ...

 (EncoderBlock_1 through EncoderBlock_11 repeat the layer pattern shown above for EncoderBlock_0; the remainder of the summary is omitted here for brevity)
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (Attention)            ))                               0][0]',
                                                                  'Transformer/EncoderBlock_7/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_7/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_7/Mul  (None, 197, 192)    37056       ['Transformer/EncoderBlock_7/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (Dense)                                                       ion[0][0]']

 dropout_21 (Dropout)           (None, 197, 192)     0           ['Transformer/EncoderBlock_7/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_7/add  (None, 197, 192)    0           ['dropout_21[0][0]',
 _1 (Add)                                                         'Transformer/EncoderBlock_6/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_7/Lay  (None, 197, 192)    384         ['Transformer/EncoderBlock_7/add_
 erNorm_2 (LayerMadNormalizatio                                  1[0][0]']
 n)

 Transformer/EncoderBlock_7/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_7/Laye
 Block/Dense_0 (Dense)                                           rNorm_2[0][0]']

 Transformer/EncoderBlock_7/Mlp  (None, 197, 768)    0           ['Transformer/EncoderBlock_7/MlpB
 Block/activation (ReLU)                                         lock/Dense_0[0][0]']

 dropout_22 (Dropout)           (None, 197, 768)     0           ['Transformer/EncoderBlock_7/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_7/Mlp  (None, 197, 192)    147648      ['dropout_22[0][0]']
 Block/Dense_1 (Dense)

 dropout_23 (Dropout)           (None, 197, 192)     0           ['Transformer/EncoderBlock_7/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_7/add  (None, 197, 192)    0           ['Transformer/EncoderBlock_7/add_
 _2 (Add)                                                        1[0][0]',
                                                                  'dropout_23[0][0]']

 Transformer/EncoderBlock_8/Lay  (None, 197, 192)    384         ['Transformer/EncoderBlock_7/add_
 erNorm_0 (LayerMadNormalizatio                                  2[0][0]']
 n)

 Transformer/EncoderBlock_8/Mul  (None, 197, 192)    37056       ['Transformer/EncoderBlock_8/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (Dense)

 Transformer/EncoderBlock_8/Mul  (None, 197, 192)    37056       ['Transformer/EncoderBlock_8/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (Dense)

 Transformer/EncoderBlock_8/Mul  (None, 197, 192)    37056       ['Transformer/EncoderBlock_8/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (Dense)

 Transformer/EncoderBlock_8/Mul  ((None, 197, 192),  0           ['Transformer/EncoderBlock_8/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (Attention)            ))                               0][0]',
                                                                  'Transformer/EncoderBlock_8/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_8/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_8/Mul  (None, 197, 192)    37056       ['Transformer/EncoderBlock_8/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (Dense)                                                       ion[0][0]']

 dropout_24 (Dropout)           (None, 197, 192)     0           ['Transformer/EncoderBlock_8/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_8/add  (None, 197, 192)    0           ['dropout_24[0][0]',
 _1 (Add)                                                         'Transformer/EncoderBlock_7/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_8/Lay  (None, 197, 192)    384         ['Transformer/EncoderBlock_8/add_
 erNorm_2 (LayerMadNormalizatio                                  1[0][0]']
 n)

 Transformer/EncoderBlock_8/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_8/Laye
 Block/Dense_0 (Dense)                                           rNorm_2[0][0]']

 Transformer/EncoderBlock_8/Mlp  (None, 197, 768)    0           ['Transformer/EncoderBlock_8/MlpB
 Block/activation (ReLU)                                         lock/Dense_0[0][0]']

 dropout_25 (Dropout)           (None, 197, 768)     0           ['Transformer/EncoderBlock_8/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_8/Mlp  (None, 197, 192)    147648      ['dropout_25[0][0]']
 Block/Dense_1 (Dense)

 dropout_26 (Dropout)           (None, 197, 192)     0           ['Transformer/EncoderBlock_8/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_8/add  (None, 197, 192)    0           ['Transformer/EncoderBlock_8/add_
 _2 (Add)                                                        1[0][0]',
                                                                  'dropout_26[0][0]']

 Transformer/EncoderBlock_9/Lay  (None, 197, 192)    384         ['Transformer/EncoderBlock_8/add_
 erNorm_0 (LayerMadNormalizatio                                  2[0][0]']
 n)

 Transformer/EncoderBlock_9/Mul  (None, 197, 192)    37056       ['Transformer/EncoderBlock_9/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (Dense)

 Transformer/EncoderBlock_9/Mul  (None, 197, 192)    37056       ['Transformer/EncoderBlock_9/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (Dense)

 Transformer/EncoderBlock_9/Mul  (None, 197, 192)    37056       ['Transformer/EncoderBlock_9/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (Dense)

 Transformer/EncoderBlock_9/Mul  ((None, 197, 192),  0           ['Transformer/EncoderBlock_9/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (Attention)            ))                               0][0]',
                                                                  'Transformer/EncoderBlock_9/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_9/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_9/Mul  (None, 197, 192)    37056       ['Transformer/EncoderBlock_9/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (Dense)                                                       ion[0][0]']

 dropout_27 (Dropout)           (None, 197, 192)     0           ['Transformer/EncoderBlock_9/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_9/add  (None, 197, 192)    0           ['dropout_27[0][0]',
 _1 (Add)                                                         'Transformer/EncoderBlock_8/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_9/Lay  (None, 197, 192)    384         ['Transformer/EncoderBlock_9/add_
 erNorm_2 (LayerMadNormalizatio                                  1[0][0]']
 n)

 Transformer/EncoderBlock_9/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_9/Laye
 Block/Dense_0 (Dense)                                           rNorm_2[0][0]']

 Transformer/EncoderBlock_9/Mlp  (None, 197, 768)    0           ['Transformer/EncoderBlock_9/MlpB
 Block/activation (ReLU)                                         lock/Dense_0[0][0]']

 dropout_28 (Dropout)           (None, 197, 768)     0           ['Transformer/EncoderBlock_9/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_9/Mlp  (None, 197, 192)    147648      ['dropout_28[0][0]']
 Block/Dense_1 (Dense)

 dropout_29 (Dropout)           (None, 197, 192)     0           ['Transformer/EncoderBlock_9/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_9/add  (None, 197, 192)    0           ['Transformer/EncoderBlock_9/add_
 _2 (Add)                                                        1[0][0]',
                                                                  'dropout_29[0][0]']

 Transformer/EncoderBlock_10/La  (None, 197, 192)    384         ['Transformer/EncoderBlock_9/add_
 yerNorm_0 (LayerMadNormalizati                                  2[0][0]']
 on)

 Transformer/EncoderBlock_10/Mu  (None, 197, 192)    37056       ['Transformer/EncoderBlock_10/Lay
 ltiHeadDotProductAttention_1/q                                  erNorm_0[0][0]']
 uery (Dense)

 Transformer/EncoderBlock_10/Mu  (None, 197, 192)    37056       ['Transformer/EncoderBlock_10/Lay
 ltiHeadDotProductAttention_1/k                                  erNorm_0[0][0]']
 ey (Dense)

 Transformer/EncoderBlock_10/Mu  (None, 197, 192)    37056       ['Transformer/EncoderBlock_10/Lay
 ltiHeadDotProductAttention_1/v                                  erNorm_0[0][0]']
 alue (Dense)

 Transformer/EncoderBlock_10/Mu  ((None, 197, 192),  0           ['Transformer/EncoderBlock_10/Mul
 ltiHeadDotProductAttention_1/a   (None, 3, 197, 197             tiHeadDotProductAttention_1/query
 ttention (Attention)           ))                               [0][0]',
                                                                  'Transformer/EncoderBlock_10/Mul
                                                                 tiHeadDotProductAttention_1/key[0
                                                                 ][0]',
                                                                  'Transformer/EncoderBlock_10/Mul
                                                                 tiHeadDotProductAttention_1/value
                                                                 [0][0]']

 Transformer/EncoderBlock_10/Mu  (None, 197, 192)    37056       ['Transformer/EncoderBlock_10/Mul
 ltiHeadDotProductAttention_1/o                                  tiHeadDotProductAttention_1/atten
 ut (Dense)                                                      tion[0][0]']

 dropout_30 (Dropout)           (None, 197, 192)     0           ['Transformer/EncoderBlock_10/Mul
                                                                 tiHeadDotProductAttention_1/out[0
                                                                 ][0]']

 Transformer/EncoderBlock_10/ad  (None, 197, 192)    0           ['dropout_30[0][0]',
 d_1 (Add)                                                        'Transformer/EncoderBlock_9/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_10/La  (None, 197, 192)    384         ['Transformer/EncoderBlock_10/add
 yerNorm_2 (LayerMadNormalizati                                  _1[0][0]']
 on)

 Transformer/EncoderBlock_10/Ml  (None, 197, 768)    148224      ['Transformer/EncoderBlock_10/Lay
 pBlock/Dense_0 (Dense)                                          erNorm_2[0][0]']

 Transformer/EncoderBlock_10/Ml  (None, 197, 768)    0           ['Transformer/EncoderBlock_10/Mlp
 pBlock/activation (ReLU)                                        Block/Dense_0[0][0]']

 dropout_31 (Dropout)           (None, 197, 768)     0           ['Transformer/EncoderBlock_10/Mlp
                                                                 Block/activation[0][0]']

 Transformer/EncoderBlock_10/Ml  (None, 197, 192)    147648      ['dropout_31[0][0]']
 pBlock/Dense_1 (Dense)

 dropout_32 (Dropout)           (None, 197, 192)     0           ['Transformer/EncoderBlock_10/Mlp
                                                                 Block/Dense_1[0][0]']

 Transformer/EncoderBlock_10/ad  (None, 197, 192)    0           ['Transformer/EncoderBlock_10/add
 d_2 (Add)                                                       _1[0][0]',
                                                                  'dropout_32[0][0]']

 Transformer/EncoderBlock_11/La  (None, 197, 192)    384         ['Transformer/EncoderBlock_10/add
 yerNorm_0 (LayerMadNormalizati                                  _2[0][0]']
 on)

 Transformer/EncoderBlock_11/Mu  (None, 197, 192)    37056       ['Transformer/EncoderBlock_11/Lay
 ltiHeadDotProductAttention_1/q                                  erNorm_0[0][0]']
 uery (Dense)

 Transformer/EncoderBlock_11/Mu  (None, 197, 192)    37056       ['Transformer/EncoderBlock_11/Lay
 ltiHeadDotProductAttention_1/k                                  erNorm_0[0][0]']
 ey (Dense)

 Transformer/EncoderBlock_11/Mu  (None, 197, 192)    37056       ['Transformer/EncoderBlock_11/Lay
 ltiHeadDotProductAttention_1/v                                  erNorm_0[0][0]']
 alue (Dense)

 Transformer/EncoderBlock_11/Mu  ((None, 197, 192),  0           ['Transformer/EncoderBlock_11/Mul
 ltiHeadDotProductAttention_1/a   (None, 3, 197, 197             tiHeadDotProductAttention_1/query
 ttention (Attention)           ))                               [0][0]',
                                                                  'Transformer/EncoderBlock_11/Mul
                                                                 tiHeadDotProductAttention_1/key[0
                                                                 ][0]',
                                                                  'Transformer/EncoderBlock_11/Mul
                                                                 tiHeadDotProductAttention_1/value
                                                                 [0][0]']

 Transformer/EncoderBlock_11/Mu  (None, 197, 192)    37056       ['Transformer/EncoderBlock_11/Mul
 ltiHeadDotProductAttention_1/o                                  tiHeadDotProductAttention_1/atten
 ut (Dense)                                                      tion[0][0]']

 dropout_33 (Dropout)           (None, 197, 192)     0           ['Transformer/EncoderBlock_11/Mul
                                                                 tiHeadDotProductAttention_1/out[0
                                                                 ][0]']

 Transformer/EncoderBlock_11/ad  (None, 197, 192)    0           ['dropout_33[0][0]',
 d_1 (Add)                                                        'Transformer/EncoderBlock_10/add
                                                                 _2[0][0]']

 Transformer/EncoderBlock_11/La  (None, 197, 192)    384         ['Transformer/EncoderBlock_11/add
 yerNorm_2 (LayerMadNormalizati                                  _1[0][0]']
 on)

 Transformer/EncoderBlock_11/Ml  (None, 197, 768)    148224      ['Transformer/EncoderBlock_11/Lay
 pBlock/Dense_0 (Dense)                                          erNorm_2[0][0]']

 Transformer/EncoderBlock_11/Ml  (None, 197, 768)    0           ['Transformer/EncoderBlock_11/Mlp
 pBlock/activation (ReLU)                                        Block/Dense_0[0][0]']

 dropout_34 (Dropout)           (None, 197, 768)     0           ['Transformer/EncoderBlock_11/Mlp
                                                                 Block/activation[0][0]']

 Transformer/EncoderBlock_11/Ml  (None, 197, 192)    147648      ['dropout_34[0][0]']
 pBlock/Dense_1 (Dense)

 dropout_35 (Dropout)           (None, 197, 192)     0           ['Transformer/EncoderBlock_11/Mlp
                                                                 Block/Dense_1[0][0]']

 Transformer/EncoderBlock_11/ad  (None, 197, 192)    0           ['Transformer/EncoderBlock_11/add
 d_2 (Add)                                                       _1[0][0]',
                                                                  'dropout_35[0][0]']

 Transformer/EncoderNorm (Batch  (None, 197, 192)    768         ['Transformer/EncoderBlock_11/add
 Normalization)                                                  _2[0][0]']

 ExtractToken (ExtractToken)    (None, 192)          0           ['Transformer/EncoderNorm[0][0]']

 Head (Dense)                   (None, 1000)         193000      ['ExtractToken[0][0]']

==================================================================================================
Total params: 5,717,800
Trainable params: 5,717,416
Non-trainable params: 384
__________________________________________________________________________________________________

Note

The models in Section 3 have floating point weights. Once the desired accuracy is obtained, these models should go through quantization before being converted to Akida.
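
For orientation, the end-to-end flow is quantize first, then convert. Below is a minimal sketch of that flow, assuming a trained float Keras model named model_keras and the cnn2snn converter from the Akida toolchain; the quantization step itself is detailed in Section 4.

# Sketch of the quantize-then-convert flow (assumed names, see Section 4).
from quantizeml.models import quantize
from quantizeml.layers import QuantizationParams
from cnn2snn import convert

# Quantize weights and activations to 8-bit integers, then convert the
# quantized Keras model into an Akida model ready for the hardware.
qparams = QuantizationParams(weight_bits=8, activation_bits=8)
model_quantized = quantize(model_keras, qparams=qparams)
model_akida = convert(model_quantized)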

4. Model quantization

Akida 2.0 hardware adds efficient processing of 8-bit weights and activations for Vision Transformer models. The models from Section 3 must therefore be quantized so that both weights and activation outputs become 8-bit integers. Quantization results in a smaller model with minimal to no drop in accuracy, and improves latency and power when running on Akida hardware.

Quantization of ViT models can be done with the QuantizeML python package, using either Post-Training Quantization (PTQ) or Quantization-Aware Training (QAT). The following section shows an example: quantization of vit_ti16 trained on the ImageNet dataset.
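
PTQ is the fastest path, while QAT typically recovers more accuracy when PTQ alone falls short. Since a quantized QuantizeML model remains a regular Keras model, QAT amounts to fine-tuning it with the standard Keras training loop. A minimal sketch, assuming a quantized model model_quantized (as produced in Section 4.1 below) and a labeled training dataset train_ds:

# Hedged QAT sketch: fine-tune the quantized model with the standard Keras
# API. model_quantized and train_ds are assumed to exist (see Section 4.1).
model_quantized.compile(optimizer="adam",
                        loss="sparse_categorical_crossentropy",
                        metrics=["accuracy"])
model_quantized.fit(train_ds, epochs=5)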

4.1 Post-Training Quantization

Using the QuantizeML python package, the ViT model can be quantized to 8-bit integers (both weights and activation outputs). PTQ requires calibration (ideally using reference data), which helps determine optimal quantization ranges. To learn more about PTQ, refer to the Advanced QuantizeML tutorial.
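
When representative data is available, it can be passed to quantize for calibration. A minimal sketch, assuming quantize accepts the samples, num_samples and batch_size calibration arguments described in the QuantizeML documentation; random data stands in here for real ImageNet images, while the tutorial's own call below relies on the default calibration instead.

import numpy as np
from quantizeml.models import quantize
from quantizeml.layers import QuantizationParams

# Random uint8 images stand in for real calibration data; in practice, use a
# representative subset of the training set for better quantization ranges.
calibration_samples = np.random.randint(0, 256, size=(1024, 224, 224, 3),
                                        dtype=np.uint8)
qparams = QuantizationParams(weight_bits=8, activation_bits=8)
model_quantized = quantize(model_keras, qparams=qparams,
                           samples=calibration_samples,
                           num_samples=1024, batch_size=100)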

# Using QuantizeML to perform quantization
from quantizeml.models import quantize
from quantizeml.layers import QuantizationParams

# Define the quantization parameters.
qparams = QuantizationParams(weight_bits=8, activation_bits=8)

# Quantize the model defined in Section 3.2
model_quantized = quantize(model_keras, qparams=qparams)
model_quantized.summary()
1024/1024 [==============================] - 16s 12ms/step
Model: "vit-tiny"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to
==================================================================================================
 input (InputLayer)             [(None, 224, 224, 3  0           []
                                )]

 Rescale (QuantizedRescaling)   (None, 224, 224, 3)  0           ['input[0][0]']

 Embedding (QuantizedConv2D)    (None, 14, 14, 192)  147648      ['Rescale[0][0]']

 reshape (QuantizedReshape)     (None, 196, 192)     0           ['Embedding[0][0]']

 ClassToken (QuantizedClassToke  (None, 197, 192)    192         ['reshape[0][0]']
 n)

 Transformer/PosEmbed (Quantize  (None, 197, 192)    38208       ['ClassToken[0][0]']
 dAddPositionEmbs)

 Transformer/EncoderBlock_0/Lay  (None, 197, 192)    768         ['Transformer/PosEmbed[0][0]']
 erNorm_0 (QuantizedLayerNormal
 ization)

 Transformer/EncoderBlock_0/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_0/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_0/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_0/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_0/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_0/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_0/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_0/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_0/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_0/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_0/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_0/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout (QuantizedDropout)     (None, 197, 192)     0           ['Transformer/EncoderBlock_0/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_0/add  (None, 197, 192)    384         ['dropout[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/PosEmbed[0][0]']

 Transformer/EncoderBlock_0/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_0/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_0/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_0/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_0/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_0/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_1 (QuantizedDropout)   (None, 197, 768)     0           ['Transformer/EncoderBlock_0/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_0/Mlp  (None, 197, 192)    148032      ['dropout_1[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_2 (QuantizedDropout)   (None, 197, 192)     0           ['Transformer/EncoderBlock_0/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_0/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_0/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_2[0][0]']

 Transformer/EncoderBlock_1/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_0/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_1/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_1/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_1/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_1/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_1/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_1/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_1/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_1/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_1/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_1/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_1/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_1/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_3 (QuantizedDropout)   (None, 197, 192)     0           ['Transformer/EncoderBlock_1/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_1/add  (None, 197, 192)    384         ['dropout_3[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_0/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_1/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_1/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_1/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_1/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_1/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_1/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_4 (QuantizedDropout)   (None, 197, 768)     0           ['Transformer/EncoderBlock_1/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_1/Mlp  (None, 197, 192)    148032      ['dropout_4[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_5 (QuantizedDropout)   (None, 197, 192)     0           ['Transformer/EncoderBlock_1/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_1/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_1/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_5[0][0]']

 Transformer/EncoderBlock_2/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_1/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_2/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_2/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_2/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_2/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_2/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_2/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_2/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_2/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_2/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_2/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_2/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_2/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_6 (QuantizedDropout)   (None, 197, 192)     0           ['Transformer/EncoderBlock_2/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_2/add  (None, 197, 192)    384         ['dropout_6[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_1/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_2/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_2/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_2/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_2/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_2/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_2/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_7 (QuantizedDropout)   (None, 197, 768)     0           ['Transformer/EncoderBlock_2/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_2/Mlp  (None, 197, 192)    148032      ['dropout_7[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_8 (QuantizedDropout)   (None, 197, 192)     0           ['Transformer/EncoderBlock_2/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_2/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_2/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_8[0][0]']

 Transformer/EncoderBlock_3/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_2/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_3/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_3/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_3/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_3/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_3/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_3/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_3/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_3/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_3/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_3/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_3/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_3/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_9 (QuantizedDropout)   (None, 197, 192)     0           ['Transformer/EncoderBlock_3/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_3/add  (None, 197, 192)    384         ['dropout_9[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_2/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_3/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_3/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_3/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_3/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_3/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_3/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_10 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_3/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_3/Mlp  (None, 197, 192)    148032      ['dropout_10[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_11 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_3/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_3/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_3/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_11[0][0]']

 Transformer/EncoderBlock_4/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_3/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_4/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_4/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_4/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_4/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_4/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_4/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_4/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_4/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_4/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_4/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_4/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_4/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_12 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_4/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_4/add  (None, 197, 192)    384         ['dropout_12[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_3/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_4/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_4/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_4/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_4/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_4/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_4/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_13 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_4/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_4/Mlp  (None, 197, 192)    148032      ['dropout_13[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_14 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_4/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_4/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_4/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_14[0][0]']

 Transformer/EncoderBlock_5/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_4/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_5/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_5/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_5/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_5/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_5/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_5/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_5/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_5/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_5/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_5/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_5/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_5/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_15 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_5/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_5/add  (None, 197, 192)    384         ['dropout_15[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_4/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_5/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_5/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_5/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_5/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_5/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_5/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_16 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_5/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_5/Mlp  (None, 197, 192)    148032      ['dropout_16[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_17 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_5/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_5/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_5/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_17[0][0]']

 Transformer/EncoderBlock_6/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_5/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_6/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_6/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_6/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_6/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_6/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_6/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_6/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_6/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_6/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_6/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_6/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_6/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_18 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_6/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_6/add  (None, 197, 192)    384         ['dropout_18[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_5/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_6/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_6/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_6/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_6/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_6/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_6/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_19 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_6/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_6/Mlp  (None, 197, 192)    148032      ['dropout_19[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_20 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_6/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_6/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_6/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_20[0][0]']

 Transformer/EncoderBlock_7/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_6/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_7/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_7/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_7/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_7/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_7/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_7/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_7/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_7/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_7/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_7/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_7/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_7/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_21 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_7/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_7/add  (None, 197, 192)    384         ['dropout_21[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_6/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_7/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_7/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_7/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_7/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_7/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_7/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_22 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_7/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_7/Mlp  (None, 197, 192)    148032      ['dropout_22[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_23 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_7/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_7/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_7/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_23[0][0]']

 Transformer/EncoderBlock_8/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_7/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_8/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_8/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_8/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_8/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_8/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_8/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_8/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_8/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_8/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_8/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_8/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_8/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_24 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_8/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_8/add  (None, 197, 192)    384         ['dropout_24[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_7/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_8/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_8/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_8/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_8/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_8/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_8/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_25 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_8/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_8/Mlp  (None, 197, 192)    148032      ['dropout_25[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_26 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_8/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_8/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_8/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_26[0][0]']

 Transformer/EncoderBlock_9/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_8/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_9/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_9/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_9/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_9/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_9/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_9/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_9/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_9/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_9/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_9/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_9/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_9/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_27 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_9/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_9/add  (None, 197, 192)    384         ['dropout_27[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_8/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_9/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_9/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_9/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_9/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_9/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_9/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_28 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_9/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_9/Mlp  (None, 197, 192)    148032      ['dropout_28[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_29 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_9/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_9/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_9/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_29[0][0]']

 Transformer/EncoderBlock_10/La  (None, 197, 192)    768         ['Transformer/EncoderBlock_9/add_
 yerNorm_0 (QuantizedLayerNorma                                  2[0][0]']
 lization)

 Transformer/EncoderBlock_10/Mu  (None, 197, 192)    37058       ['Transformer/EncoderBlock_10/Lay
 ltiHeadDotProductAttention_1/q                                  erNorm_0[0][0]']
 uery (QuantizedDense)

 Transformer/EncoderBlock_10/Mu  (None, 197, 192)    37058       ['Transformer/EncoderBlock_10/Lay
 ltiHeadDotProductAttention_1/k                                  erNorm_0[0][0]']
 ey (QuantizedDense)

 Transformer/EncoderBlock_10/Mu  (None, 197, 192)    37440       ['Transformer/EncoderBlock_10/Lay
 ltiHeadDotProductAttention_1/v                                  erNorm_0[0][0]']
 alue (QuantizedDense)

 Transformer/EncoderBlock_10/Mu  ((None, 197, 192),  384         ['Transformer/EncoderBlock_10/Mul
 ltiHeadDotProductAttention_1/a   (None, 3, 197, 197             tiHeadDotProductAttention_1/query
 ttention (QuantizedAttention)  ))                               [0][0]',
                                                                  'Transformer/EncoderBlock_10/Mul
                                                                 tiHeadDotProductAttention_1/key[0
                                                                 ][0]',
                                                                  'Transformer/EncoderBlock_10/Mul
                                                                 tiHeadDotProductAttention_1/value
                                                                 [0][0]']

 Transformer/EncoderBlock_10/Mu  (None, 197, 192)    37440       ['Transformer/EncoderBlock_10/Mul
 ltiHeadDotProductAttention_1/o                                  tiHeadDotProductAttention_1/atten
 ut (QuantizedDense)                                             tion[0][0]']

 dropout_30 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_10/Mul
                                                                 tiHeadDotProductAttention_1/out[0
                                                                 ][0]']

 Transformer/EncoderBlock_10/ad  (None, 197, 192)    384         ['dropout_30[0][0]',
 d_1 (QuantizedAdd)                                               'Transformer/EncoderBlock_9/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_10/La  (None, 197, 192)    768         ['Transformer/EncoderBlock_10/add
 yerNorm_2 (QuantizedLayerNorma                                  _1[0][0]']
 lization)

 Transformer/EncoderBlock_10/Ml  (None, 197, 768)    148224      ['Transformer/EncoderBlock_10/Lay
 pBlock/Dense_0 (QuantizedDense                                  erNorm_2[0][0]']
 )

 Transformer/EncoderBlock_10/Ml  (None, 197, 768)    1536        ['Transformer/EncoderBlock_10/Mlp
 pBlock/activation (QuantizedRe                                  Block/Dense_0[0][0]']
 LU)

 dropout_31 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_10/Mlp
                                                                 Block/activation[0][0]']

 Transformer/EncoderBlock_10/Ml  (None, 197, 192)    148032      ['dropout_31[0][0]']
 pBlock/Dense_1 (QuantizedDense
 )

 dropout_32 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_10/Mlp
                                                                 Block/Dense_1[0][0]']

 Transformer/EncoderBlock_10/ad  (None, 197, 192)    384         ['Transformer/EncoderBlock_10/add
 d_2 (QuantizedAdd)                                              _1[0][0]',
                                                                  'dropout_32[0][0]']

 Transformer/EncoderBlock_11/La  (None, 197, 192)    768         ['Transformer/EncoderBlock_10/add
 yerNorm_0 (QuantizedLayerNorma                                  _2[0][0]']
 lization)

 Transformer/EncoderBlock_11/Mu  (None, 197, 192)    37058       ['Transformer/EncoderBlock_11/Lay
 ltiHeadDotProductAttention_1/q                                  erNorm_0[0][0]']
 uery (QuantizedDense)

 Transformer/EncoderBlock_11/Mu  (None, 197, 192)    37058       ['Transformer/EncoderBlock_11/Lay
 ltiHeadDotProductAttention_1/k                                  erNorm_0[0][0]']
 ey (QuantizedDense)

 Transformer/EncoderBlock_11/Mu  (None, 197, 192)    37440       ['Transformer/EncoderBlock_11/Lay
 ltiHeadDotProductAttention_1/v                                  erNorm_0[0][0]']
 alue (QuantizedDense)

 Transformer/EncoderBlock_11/Mu  ((None, 197, 192),  384         ['Transformer/EncoderBlock_11/Mul
 ltiHeadDotProductAttention_1/a   (None, 3, 197, 197             tiHeadDotProductAttention_1/query
 ttention (QuantizedAttention)  ))                               [0][0]',
                                                                  'Transformer/EncoderBlock_11/Mul
                                                                 tiHeadDotProductAttention_1/key[0
                                                                 ][0]',
                                                                  'Transformer/EncoderBlock_11/Mul
                                                                 tiHeadDotProductAttention_1/value
                                                                 [0][0]']

 Transformer/EncoderBlock_11/Mu  (None, 197, 192)    37440       ['Transformer/EncoderBlock_11/Mul
 ltiHeadDotProductAttention_1/o                                  tiHeadDotProductAttention_1/atten
 ut (QuantizedDense)                                             tion[0][0]']

 dropout_33 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_11/Mul
                                                                 tiHeadDotProductAttention_1/out[0
                                                                 ][0]']

 Transformer/EncoderBlock_11/ad  (None, 197, 192)    384         ['dropout_33[0][0]',
 d_1 (QuantizedAdd)                                               'Transformer/EncoderBlock_10/add
                                                                 _2[0][0]']

 Transformer/EncoderBlock_11/La  (None, 197, 192)    768         ['Transformer/EncoderBlock_11/add
 yerNorm_2 (QuantizedLayerNorma                                  _1[0][0]']
 lization)

 Transformer/EncoderBlock_11/Ml  (None, 197, 768)    148224      ['Transformer/EncoderBlock_11/Lay
 pBlock/Dense_0 (QuantizedDense                                  erNorm_2[0][0]']
 )

 Transformer/EncoderBlock_11/Ml  (None, 197, 768)    1536        ['Transformer/EncoderBlock_11/Mlp
 pBlock/activation (QuantizedRe                                  Block/Dense_0[0][0]']
 LU)

 dropout_34 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_11/Mlp
                                                                 Block/activation[0][0]']

 Transformer/EncoderBlock_11/Ml  (None, 197, 192)    148032      ['dropout_34[0][0]']
 pBlock/Dense_1 (QuantizedDense
 )

 dropout_35 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_11/Mlp
                                                                 Block/Dense_1[0][0]']

 Transformer/EncoderBlock_11/ad  (None, 197, 192)    0           ['Transformer/EncoderBlock_11/add
 d_2 (QuantizedAdd)                                              _1[0][0]',
                                                                  'dropout_35[0][0]']

 Transformer/EncoderNorm (Quant  (None, 197, 192)    1152        ['Transformer/EncoderBlock_11/add
 izedBatchNormalization)                                         _2[0][0]']

 ExtractToken (QuantizedExtract  (None, 192)         0           ['Transformer/EncoderNorm[0][0]']
 Token)

 Head (QuantizedDense)          (None, 1000)         193000      ['ExtractToken[0][0]']

 dequantizer_4 (Dequantizer)    [(None, 1000)]       0           ['Head[0][0]']

==================================================================================================
Total params: 5,773,528
Trainable params: 5,717,800
Non-trainable params: 55,728
__________________________________________________________________________________________________

The model returned by the bc_vit_ti16_imagenet_pretrained helper was obtained with the same 8-bit quantization scheme, but with an additional QAT step to further improve accuracy.

4.2 Quantization Aware Training (Optional)

In Section 4.1, we performed PTQ and converted the weights and activation outputs to 8-bit integers. In most cases there is no accuracy drop after quantization; however, when a drop is observed, the model can be further fine-tuned using QAT.
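For reference, a PTQ pass like the one in Section 4.1 can be expressed with the QuantizeML quantize API. The snippet below is a minimal sketch, not the exact code from Section 4.1: model_keras (a float Keras model) and calib_samples (an array of preprocessed calibration images) are hypothetical names introduced here for illustration.

from quantizeml.models import quantize, QuantizationParams

# 8-bit weights and activations, matching the scheme described above
qparams = QuantizationParams(weight_bits=8, activation_bits=8)

# Quantize the float model, calibrating on representative samples (PTQ)
model_quantized = quantize(model_keras, qparams=qparams,
                           samples=calib_samples, num_samples=1024)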

The model obtained through the QuantizeML python package is a Keras model instance, so it can be fine-tuned on the original dataset to regain accuracy.
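Because the quantized model is a regular Keras model, QAT reduces to the standard compile/fit loop. The sketch below assumes the model_quantized from the snippet above and a hypothetical tf.data pipeline train_ds yielding preprocessed (image, label) batches; the loss configuration, learning rate and epoch count are placeholder assumptions to adapt to your setup.

import tensorflow as tf

# A small learning rate is typical when fine-tuning an already-trained,
# quantized model
model_quantized.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"])

# Fine-tune for a few epochs to recover the post-quantization accuracy drop
model_quantized.fit(train_ds, epochs=5)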

The akida_models python package provides pre-trained vit_ti16 and deit_ti16 models that have been trained using QAT. They can be loaded as follows:

from akida_models import bc_vit_ti16_imagenet_pretrained

# Load the pre-trained quantized model
model_quantized = bc_vit_ti16_imagenet_pretrained()
model_quantized.summary()
Downloading data from https://data.brainchip.com/models/AkidaV2/vit/bc_vit_ti16_224_i8_w8_a8.h5.

24405400/24405400 [==============================] - 3s 0us/step
Model: "vit-tiny"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to
==================================================================================================
 input (InputLayer)             [(None, 224, 224, 3  0           []
                                )]

 Rescale (QuantizedRescaling)   (None, 224, 224, 3)  0           ['input[0][0]']

 Embedding (QuantizedConv2D)    (None, 14, 14, 192)  147648      ['Rescale[0][0]']

 reshape (QuantizedReshape)     (None, 196, 192)     0           ['Embedding[0][0]']

 ClassToken (QuantizedClassToke  (None, 197, 192)    192         ['reshape[0][0]']
 n)

 Transformer/PosEmbed (Quantize  (None, 197, 192)    38208       ['ClassToken[0][0]']
 dAddPositionEmbs)

 Transformer/EncoderBlock_0/Lay  (None, 197, 192)    768         ['Transformer/PosEmbed[0][0]']
 erNorm_0 (QuantizedLayerNormal
 ization)

 Transformer/EncoderBlock_0/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_0/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_0/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_0/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_0/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_0/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_0/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_0/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_0/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_0/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_0/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_0/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout (QuantizedDropout)     (None, 197, 192)     0           ['Transformer/EncoderBlock_0/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_0/add  (None, 197, 192)    384         ['dropout[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/PosEmbed[0][0]']

 Transformer/EncoderBlock_0/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_0/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_0/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_0/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_0/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_0/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_1 (QuantizedDropout)   (None, 197, 768)     0           ['Transformer/EncoderBlock_0/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_0/Mlp  (None, 197, 192)    148032      ['dropout_1[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_2 (QuantizedDropout)   (None, 197, 192)     0           ['Transformer/EncoderBlock_0/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_0/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_0/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_2[0][0]']

 Transformer/EncoderBlock_1/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_0/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_1/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_1/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_1/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_1/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_1/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_1/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_1/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_1/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_1/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_1/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_1/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_1/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_3 (QuantizedDropout)   (None, 197, 192)     0           ['Transformer/EncoderBlock_1/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_1/add  (None, 197, 192)    384         ['dropout_3[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_0/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_1/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_1/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_1/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_1/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_1/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_1/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_4 (QuantizedDropout)   (None, 197, 768)     0           ['Transformer/EncoderBlock_1/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_1/Mlp  (None, 197, 192)    148032      ['dropout_4[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_5 (QuantizedDropout)   (None, 197, 192)     0           ['Transformer/EncoderBlock_1/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_1/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_1/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_5[0][0]']

 Transformer/EncoderBlock_2/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_1/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_2/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_2/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_2/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_2/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_2/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_2/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_2/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_2/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_2/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_2/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_2/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_2/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_6 (QuantizedDropout)   (None, 197, 192)     0           ['Transformer/EncoderBlock_2/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_2/add  (None, 197, 192)    384         ['dropout_6[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_1/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_2/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_2/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_2/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_2/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_2/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_2/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_7 (QuantizedDropout)   (None, 197, 768)     0           ['Transformer/EncoderBlock_2/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_2/Mlp  (None, 197, 192)    148032      ['dropout_7[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_8 (QuantizedDropout)   (None, 197, 192)     0           ['Transformer/EncoderBlock_2/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_2/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_2/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_8[0][0]']

 Transformer/EncoderBlock_3/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_2/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_3/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_3/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_3/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_3/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_3/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_3/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_3/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_3/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_3/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_3/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_3/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_3/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_9 (QuantizedDropout)   (None, 197, 192)     0           ['Transformer/EncoderBlock_3/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_3/add  (None, 197, 192)    384         ['dropout_9[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_2/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_3/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_3/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_3/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_3/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_3/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_3/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_10 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_3/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_3/Mlp  (None, 197, 192)    148032      ['dropout_10[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_11 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_3/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_3/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_3/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_11[0][0]']

 Transformer/EncoderBlock_4/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_3/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_4/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_4/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_4/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_4/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_4/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_4/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_4/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_4/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_4/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_4/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_4/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_4/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_12 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_4/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_4/add  (None, 197, 192)    384         ['dropout_12[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_3/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_4/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_4/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_4/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_4/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_4/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_4/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_13 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_4/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_4/Mlp  (None, 197, 192)    148032      ['dropout_13[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_14 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_4/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_4/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_4/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_14[0][0]']

 Transformer/EncoderBlock_5/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_4/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_5/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_5/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_5/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_5/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_5/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_5/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_5/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_5/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_5/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_5/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_5/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_5/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_15 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_5/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_5/add  (None, 197, 192)    384         ['dropout_15[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_4/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_5/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_5/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_5/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_5/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_5/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_5/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_16 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_5/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_5/Mlp  (None, 197, 192)    148032      ['dropout_16[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_17 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_5/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_5/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_5/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_17[0][0]']

 Transformer/EncoderBlock_6/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_5/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_6/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_6/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_6/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_6/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_6/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_6/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_6/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_6/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_6/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_6/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_6/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_6/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_18 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_6/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_6/add  (None, 197, 192)    384         ['dropout_18[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_5/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_6/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_6/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_6/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_6/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_6/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_6/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_19 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_6/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_6/Mlp  (None, 197, 192)    148032      ['dropout_19[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_20 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_6/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_6/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_6/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_20[0][0]']

 Transformer/EncoderBlock_7/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_6/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_7/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_7/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_7/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_7/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_7/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_7/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_7/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_7/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_7/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_7/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_7/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_7/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_21 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_7/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_7/add  (None, 197, 192)    384         ['dropout_21[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_6/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_7/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_7/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_7/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_7/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_7/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_7/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_22 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_7/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_7/Mlp  (None, 197, 192)    148032      ['dropout_22[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_23 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_7/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_7/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_7/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_23[0][0]']

 Transformer/EncoderBlock_8/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_7/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_8/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_8/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_8/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_8/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_8/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_8/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_8/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_8/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_8/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_8/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_8/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_8/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_24 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_8/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_8/add  (None, 197, 192)    384         ['dropout_24[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_7/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_8/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_8/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_8/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_8/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_8/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_8/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_25 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_8/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_8/Mlp  (None, 197, 192)    148032      ['dropout_25[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_26 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_8/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_8/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_8/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_26[0][0]']

 Transformer/EncoderBlock_9/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_8/add_
 erNorm_0 (QuantizedLayerNormal                                  2[0][0]']
 ization)

 Transformer/EncoderBlock_9/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_9/Laye
 tiHeadDotProductAttention_1/qu                                  rNorm_0[0][0]']
 ery (QuantizedDense)

 Transformer/EncoderBlock_9/Mul  (None, 197, 192)    37058       ['Transformer/EncoderBlock_9/Laye
 tiHeadDotProductAttention_1/ke                                  rNorm_0[0][0]']
 y (QuantizedDense)

 Transformer/EncoderBlock_9/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_9/Laye
 tiHeadDotProductAttention_1/va                                  rNorm_0[0][0]']
 lue (QuantizedDense)

 Transformer/EncoderBlock_9/Mul  ((None, 197, 192),  384         ['Transformer/EncoderBlock_9/Mult
 tiHeadDotProductAttention_1/at   (None, 3, 197, 197             iHeadDotProductAttention_1/query[
 tention (QuantizedAttention)   ))                               0][0]',
                                                                  'Transformer/EncoderBlock_9/Mult
                                                                 iHeadDotProductAttention_1/key[0]
                                                                 [0]',
                                                                  'Transformer/EncoderBlock_9/Mult
                                                                 iHeadDotProductAttention_1/value[
                                                                 0][0]']

 Transformer/EncoderBlock_9/Mul  (None, 197, 192)    37440       ['Transformer/EncoderBlock_9/Mult
 tiHeadDotProductAttention_1/ou                                  iHeadDotProductAttention_1/attent
 t (QuantizedDense)                                              ion[0][0]']

 dropout_27 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_9/Mult
                                                                 iHeadDotProductAttention_1/out[0]
                                                                 [0]']

 Transformer/EncoderBlock_9/add  (None, 197, 192)    384         ['dropout_27[0][0]',
 _1 (QuantizedAdd)                                                'Transformer/EncoderBlock_8/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_9/Lay  (None, 197, 192)    768         ['Transformer/EncoderBlock_9/add_
 erNorm_2 (QuantizedLayerNormal                                  1[0][0]']
 ization)

 Transformer/EncoderBlock_9/Mlp  (None, 197, 768)    148224      ['Transformer/EncoderBlock_9/Laye
 Block/Dense_0 (QuantizedDense)                                  rNorm_2[0][0]']

 Transformer/EncoderBlock_9/Mlp  (None, 197, 768)    1536        ['Transformer/EncoderBlock_9/MlpB
 Block/activation (QuantizedReL                                  lock/Dense_0[0][0]']
 U)

 dropout_28 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_9/MlpB
                                                                 lock/activation[0][0]']

 Transformer/EncoderBlock_9/Mlp  (None, 197, 192)    148032      ['dropout_28[0][0]']
 Block/Dense_1 (QuantizedDense)

 dropout_29 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_9/MlpB
                                                                 lock/Dense_1[0][0]']

 Transformer/EncoderBlock_9/add  (None, 197, 192)    384         ['Transformer/EncoderBlock_9/add_
 _2 (QuantizedAdd)                                               1[0][0]',
                                                                  'dropout_29[0][0]']

 Transformer/EncoderBlock_10/La  (None, 197, 192)    768         ['Transformer/EncoderBlock_9/add_
 yerNorm_0 (QuantizedLayerNorma                                  2[0][0]']
 lization)

 Transformer/EncoderBlock_10/Mu  (None, 197, 192)    37058       ['Transformer/EncoderBlock_10/Lay
 ltiHeadDotProductAttention_1/q                                  erNorm_0[0][0]']
 uery (QuantizedDense)

 Transformer/EncoderBlock_10/Mu  (None, 197, 192)    37058       ['Transformer/EncoderBlock_10/Lay
 ltiHeadDotProductAttention_1/k                                  erNorm_0[0][0]']
 ey (QuantizedDense)

 Transformer/EncoderBlock_10/Mu  (None, 197, 192)    37440       ['Transformer/EncoderBlock_10/Lay
 ltiHeadDotProductAttention_1/v                                  erNorm_0[0][0]']
 alue (QuantizedDense)

 Transformer/EncoderBlock_10/Mu  ((None, 197, 192),  384         ['Transformer/EncoderBlock_10/Mul
 ltiHeadDotProductAttention_1/a   (None, 3, 197, 197             tiHeadDotProductAttention_1/query
 ttention (QuantizedAttention)  ))                               [0][0]',
                                                                  'Transformer/EncoderBlock_10/Mul
                                                                 tiHeadDotProductAttention_1/key[0
                                                                 ][0]',
                                                                  'Transformer/EncoderBlock_10/Mul
                                                                 tiHeadDotProductAttention_1/value
                                                                 [0][0]']

 Transformer/EncoderBlock_10/Mu  (None, 197, 192)    37440       ['Transformer/EncoderBlock_10/Mul
 ltiHeadDotProductAttention_1/o                                  tiHeadDotProductAttention_1/atten
 ut (QuantizedDense)                                             tion[0][0]']

 dropout_30 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_10/Mul
                                                                 tiHeadDotProductAttention_1/out[0
                                                                 ][0]']

 Transformer/EncoderBlock_10/ad  (None, 197, 192)    384         ['dropout_30[0][0]',
 d_1 (QuantizedAdd)                                               'Transformer/EncoderBlock_9/add_
                                                                 2[0][0]']

 Transformer/EncoderBlock_10/La  (None, 197, 192)    768         ['Transformer/EncoderBlock_10/add
 yerNorm_2 (QuantizedLayerNorma                                  _1[0][0]']
 lization)

 Transformer/EncoderBlock_10/Ml  (None, 197, 768)    148224      ['Transformer/EncoderBlock_10/Lay
 pBlock/Dense_0 (QuantizedDense                                  erNorm_2[0][0]']
 )

 Transformer/EncoderBlock_10/Ml  (None, 197, 768)    1536        ['Transformer/EncoderBlock_10/Mlp
 pBlock/activation (QuantizedRe                                  Block/Dense_0[0][0]']
 LU)

 dropout_31 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_10/Mlp
                                                                 Block/activation[0][0]']

 Transformer/EncoderBlock_10/Ml  (None, 197, 192)    148032      ['dropout_31[0][0]']
 pBlock/Dense_1 (QuantizedDense
 )

 dropout_32 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_10/Mlp
                                                                 Block/Dense_1[0][0]']

 Transformer/EncoderBlock_10/ad  (None, 197, 192)    384         ['Transformer/EncoderBlock_10/add
 d_2 (QuantizedAdd)                                              _1[0][0]',
                                                                  'dropout_32[0][0]']

 Transformer/EncoderBlock_11/La  (None, 197, 192)    768         ['Transformer/EncoderBlock_10/add
 yerNorm_0 (QuantizedLayerNorma                                  _2[0][0]']
 lization)

 Transformer/EncoderBlock_11/Mu  (None, 197, 192)    37058       ['Transformer/EncoderBlock_11/Lay
 ltiHeadDotProductAttention_1/q                                  erNorm_0[0][0]']
 uery (QuantizedDense)

 Transformer/EncoderBlock_11/Mu  (None, 197, 192)    37058       ['Transformer/EncoderBlock_11/Lay
 ltiHeadDotProductAttention_1/k                                  erNorm_0[0][0]']
 ey (QuantizedDense)

 Transformer/EncoderBlock_11/Mu  (None, 197, 192)    37440       ['Transformer/EncoderBlock_11/Lay
 ltiHeadDotProductAttention_1/v                                  erNorm_0[0][0]']
 alue (QuantizedDense)

 Transformer/EncoderBlock_11/Mu  ((None, 197, 192),  384         ['Transformer/EncoderBlock_11/Mul
 ltiHeadDotProductAttention_1/a   (None, 3, 197, 197             tiHeadDotProductAttention_1/query
 ttention (QuantizedAttention)  ))                               [0][0]',
                                                                  'Transformer/EncoderBlock_11/Mul
                                                                 tiHeadDotProductAttention_1/key[0
                                                                 ][0]',
                                                                  'Transformer/EncoderBlock_11/Mul
                                                                 tiHeadDotProductAttention_1/value
                                                                 [0][0]']

 Transformer/EncoderBlock_11/Mu  (None, 197, 192)    37440       ['Transformer/EncoderBlock_11/Mul
 ltiHeadDotProductAttention_1/o                                  tiHeadDotProductAttention_1/atten
 ut (QuantizedDense)                                             tion[0][0]']

 dropout_33 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_11/Mul
                                                                 tiHeadDotProductAttention_1/out[0
                                                                 ][0]']

 Transformer/EncoderBlock_11/ad  (None, 197, 192)    384         ['dropout_33[0][0]',
 d_1 (QuantizedAdd)                                               'Transformer/EncoderBlock_10/add
                                                                 _2[0][0]']

 Transformer/EncoderBlock_11/La  (None, 197, 192)    768         ['Transformer/EncoderBlock_11/add
 yerNorm_2 (QuantizedLayerNorma                                  _1[0][0]']
 lization)

 Transformer/EncoderBlock_11/Ml  (None, 197, 768)    148224      ['Transformer/EncoderBlock_11/Lay
 pBlock/Dense_0 (QuantizedDense                                  erNorm_2[0][0]']
 )

 Transformer/EncoderBlock_11/Ml  (None, 197, 768)    1536        ['Transformer/EncoderBlock_11/Mlp
 pBlock/activation (QuantizedRe                                  Block/Dense_0[0][0]']
 LU)

 dropout_34 (QuantizedDropout)  (None, 197, 768)     0           ['Transformer/EncoderBlock_11/Mlp
                                                                 Block/activation[0][0]']

 Transformer/EncoderBlock_11/Ml  (None, 197, 192)    148032      ['dropout_34[0][0]']
 pBlock/Dense_1 (QuantizedDense
 )

 dropout_35 (QuantizedDropout)  (None, 197, 192)     0           ['Transformer/EncoderBlock_11/Mlp
                                                                 Block/Dense_1[0][0]']

 Transformer/EncoderBlock_11/ad  (None, 197, 192)    0           ['Transformer/EncoderBlock_11/add
 d_2 (QuantizedAdd)                                              _1[0][0]',
                                                                  'dropout_35[0][0]']

 Transformer/EncoderNorm (Quant  (None, 197, 192)    1152        ['Transformer/EncoderBlock_11/add
 izedBatchNormalization)                                         _2[0][0]']

 ExtractToken (QuantizedExtract  (None, 192)         0           ['Transformer/EncoderNorm[0][0]']
 Token)

 Head (QuantizedDense)          (None, 1000)         193000      ['ExtractToken[0][0]']

 dequantizer (Dequantizer)      (None, 1000)         0           ['Head[0][0]']

==================================================================================================
Total params: 5,773,528
Trainable params: 5,717,800
Non-trainable params: 55,728
__________________________________________________________________________________________________

5. Conversion to Akida

A model quantized through the QuantizeML python package is ready to be converted to Akida. Once the quantized model reaches the desired accuracy, the CNN2SNN toolkit is used to convert it to Akida. No further optimization is required, and equivalent accuracy is observed once the model is converted to Akida.

from cnn2snn import convert

# Convert the model
model_akida = convert(model_quantized)
model_akida.summary()
                 Model Summary
________________________________________________
Input shape    Output shape  Sequences  Layers
================================================
[224, 224, 3]  [1, 1, 1000]  1          137
________________________________________________

___________________________________________________________________________________________________________________
Layer (type)                                                                      Output shape   Kernel shape

======================================= SW/Embedding-dequantizer (Software) =======================================

Embedding (Stem)                                                                  [1, 197, 192]  (16, 16, 3, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_0/LayerNorm_0 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_0/MultiHeadDotProductAttention_1/query (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_0/MultiHeadDotProductAttention_1/key (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_0/MultiHeadDotProductAttention_1/value (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_0/MultiHeadDotProductAttention_1/attention (Attention)   [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_0/MultiHeadDotProductAttention_1/out (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_0/add_1 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_0/LayerNorm_2 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_0/MlpBlock/Dense_0 (Dense2D)                             [1, 197, 768]  (192, 768)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_0/MlpBlock/Dense_1 (Dense2D)                             [1, 197, 192]  (768, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_0/add_2 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_1/LayerNorm_0 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_1/MultiHeadDotProductAttention_1/query (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_1/MultiHeadDotProductAttention_1/key (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_1/MultiHeadDotProductAttention_1/value (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_1/MultiHeadDotProductAttention_1/attention (Attention)   [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_1/MultiHeadDotProductAttention_1/out (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_1/add_1 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_1/LayerNorm_2 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_1/MlpBlock/Dense_0 (Dense2D)                             [1, 197, 768]  (192, 768)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_1/MlpBlock/Dense_1 (Dense2D)                             [1, 197, 192]  (768, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_1/add_2 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_2/LayerNorm_0 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_2/MultiHeadDotProductAttention_1/query (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_2/MultiHeadDotProductAttention_1/key (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_2/MultiHeadDotProductAttention_1/value (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_2/MultiHeadDotProductAttention_1/attention (Attention)   [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_2/MultiHeadDotProductAttention_1/out (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_2/add_1 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_2/LayerNorm_2 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_2/MlpBlock/Dense_0 (Dense2D)                             [1, 197, 768]  (192, 768)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_2/MlpBlock/Dense_1 (Dense2D)                             [1, 197, 192]  (768, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_2/add_2 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_3/LayerNorm_0 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_3/MultiHeadDotProductAttention_1/query (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_3/MultiHeadDotProductAttention_1/key (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_3/MultiHeadDotProductAttention_1/value (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_3/MultiHeadDotProductAttention_1/attention (Attention)   [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_3/MultiHeadDotProductAttention_1/out (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_3/add_1 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_3/LayerNorm_2 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_3/MlpBlock/Dense_0 (Dense2D)                             [1, 197, 768]  (192, 768)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_3/MlpBlock/Dense_1 (Dense2D)                             [1, 197, 192]  (768, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_3/add_2 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_4/LayerNorm_0 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_4/MultiHeadDotProductAttention_1/query (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_4/MultiHeadDotProductAttention_1/key (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_4/MultiHeadDotProductAttention_1/value (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_4/MultiHeadDotProductAttention_1/attention (Attention)   [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_4/MultiHeadDotProductAttention_1/out (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_4/add_1 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_4/LayerNorm_2 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_4/MlpBlock/Dense_0 (Dense2D)                             [1, 197, 768]  (192, 768)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_4/MlpBlock/Dense_1 (Dense2D)                             [1, 197, 192]  (768, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_4/add_2 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_5/LayerNorm_0 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_5/MultiHeadDotProductAttention_1/query (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_5/MultiHeadDotProductAttention_1/key (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_5/MultiHeadDotProductAttention_1/value (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_5/MultiHeadDotProductAttention_1/attention (Attention)   [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_5/MultiHeadDotProductAttention_1/out (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_5/add_1 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_5/LayerNorm_2 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_5/MlpBlock/Dense_0 (Dense2D)                             [1, 197, 768]  (192, 768)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_5/MlpBlock/Dense_1 (Dense2D)                             [1, 197, 192]  (768, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_5/add_2 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_6/LayerNorm_0 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_6/MultiHeadDotProductAttention_1/query (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_6/MultiHeadDotProductAttention_1/key (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_6/MultiHeadDotProductAttention_1/value (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_6/MultiHeadDotProductAttention_1/attention (Attention)   [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_6/MultiHeadDotProductAttention_1/out (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_6/add_1 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_6/LayerNorm_2 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_6/MlpBlock/Dense_0 (Dense2D)                             [1, 197, 768]  (192, 768)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_6/MlpBlock/Dense_1 (Dense2D)                             [1, 197, 192]  (768, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_6/add_2 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_7/LayerNorm_0 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_7/MultiHeadDotProductAttention_1/query (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_7/MultiHeadDotProductAttention_1/key (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_7/MultiHeadDotProductAttention_1/value (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_7/MultiHeadDotProductAttention_1/attention (Attention)   [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_7/MultiHeadDotProductAttention_1/out (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_7/add_1 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_7/LayerNorm_2 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_7/MlpBlock/Dense_0 (Dense2D)                             [1, 197, 768]  (192, 768)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_7/MlpBlock/Dense_1 (Dense2D)                             [1, 197, 192]  (768, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_7/add_2 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_8/LayerNorm_0 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_8/MultiHeadDotProductAttention_1/query (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_8/MultiHeadDotProductAttention_1/key (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_8/MultiHeadDotProductAttention_1/value (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_8/MultiHeadDotProductAttention_1/attention (Attention)   [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_8/MultiHeadDotProductAttention_1/out (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_8/add_1 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_8/LayerNorm_2 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_8/MlpBlock/Dense_0 (Dense2D)                             [1, 197, 768]  (192, 768)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_8/MlpBlock/Dense_1 (Dense2D)                             [1, 197, 192]  (768, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_8/add_2 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_9/LayerNorm_0 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_9/MultiHeadDotProductAttention_1/query (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_9/MultiHeadDotProductAttention_1/key (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_9/MultiHeadDotProductAttention_1/value (Dense2D)         [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_9/MultiHeadDotProductAttention_1/attention (Attention)   [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_9/MultiHeadDotProductAttention_1/out (Dense2D)           [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_9/add_1 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_9/LayerNorm_2 (MadNorm)                                  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_9/MlpBlock/Dense_0 (Dense2D)                             [1, 197, 768]  (192, 768)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_9/MlpBlock/Dense_1 (Dense2D)                             [1, 197, 192]  (768, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_9/add_2 (Add)                                            [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_10/LayerNorm_0 (MadNorm)                                 [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_10/MultiHeadDotProductAttention_1/query (Dense2D)        [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_10/MultiHeadDotProductAttention_1/key (Dense2D)          [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_10/MultiHeadDotProductAttention_1/value (Dense2D)        [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_10/MultiHeadDotProductAttention_1/attention (Attention)  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_10/MultiHeadDotProductAttention_1/out (Dense2D)          [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_10/add_1 (Add)                                           [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_10/LayerNorm_2 (MadNorm)                                 [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_10/MlpBlock/Dense_0 (Dense2D)                            [1, 197, 768]  (192, 768)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_10/MlpBlock/Dense_1 (Dense2D)                            [1, 197, 192]  (768, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_10/add_2 (Add)                                           [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_11/LayerNorm_0 (MadNorm)                                 [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_11/MultiHeadDotProductAttention_1/query (Dense2D)        [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_11/MultiHeadDotProductAttention_1/key (Dense2D)          [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_11/MultiHeadDotProductAttention_1/value (Dense2D)        [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_11/MultiHeadDotProductAttention_1/attention (Attention)  [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_11/MultiHeadDotProductAttention_1/out (Dense2D)          [1, 197, 192]  (192, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_11/add_1 (Add)                                           [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_11/LayerNorm_2 (MadNorm)                                 [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_11/MlpBlock/Dense_0 (Dense2D)                            [1, 197, 768]  (192, 768)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_11/MlpBlock/Dense_1 (Dense2D)                            [1, 197, 192]  (768, 192)
___________________________________________________________________________________________________________________
Transformer/EncoderBlock_11/add_2 (Add)                                           [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
Transformer/EncoderNorm (BatchNormalization)                                      [1, 197, 192]  N/A
___________________________________________________________________________________________________________________
ExtractToken (ExtractToken)                                                       [1, 1, 192]    N/A
___________________________________________________________________________________________________________________
Head (Dense2D)                                                                    [1, 1, 1000]   (192, 1000)
___________________________________________________________________________________________________________________
dequantizer (Dequantizer)                                                         [1, 1, 1000]   N/A
___________________________________________________________________________________________________________________
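
As noted above, conversion should preserve the quantized model's accuracy, and this can be checked directly. The snippet below is a minimal sketch of such a check, not part of the original tutorial: ``x_test`` and ``y_test`` are hypothetical placeholders for a preprocessed (uint8) and labeled test set, since only ten unlabeled sample images are downloaded in the next section.

import numpy as np

# Hypothetical labeled test set: x_test is uint8 with shape (N, 224, 224, 3),
# y_test holds the matching integer labels (assumptions for illustration).
outputs = model_akida.predict(x_test)
# Akida outputs one potentials vector per sample; flatten to (N, 1000)
akida_preds = np.argmax(outputs.reshape(len(x_test), -1), axis=-1)

# Predictions of the quantized Keras model, for comparison
keras_preds = np.argmax(model_quantized.predict(x_test), axis=-1)

print(f"Akida accuracy: {np.mean(akida_preds == y_test):.4f}")
print(f"Agreement with the quantized model: {np.mean(akida_preds == keras_preds):.4f}")

Once the check passes, the Akida model can be serialized for deployment with model_akida.save().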

6. Displaying results: attention maps

Instead of showing predictions, here we show attention maps overlaid on an image. This is derived from Abnar et al.'s attention rollout, as demonstrated in the following Keras tutorial. It aims to highlight the model's ability to focus on the relevant parts of the input image.
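
In a nutshell, the rollout computation implemented below works as follows (sketched here under the same conventions as the code, where A^(l) denotes the head-averaged attention matrix of encoder block l out of L): the identity matrix is added to each A^(l) to account for the residual connection around attention, the result is re-normalized, and the per-block matrices are multiplied from the last block down to the first:

    \hat{A}^{(l)} = (A^{(l)} + I) / \sum_{i,j} (A^{(l)} + I)_{ij}
    R = \hat{A}^{(L)} \hat{A}^{(L-1)} \cdots \hat{A}^{(1)}

The first row of R, restricted to the patch tokens, is then reshaped to the patch grid and upscaled to produce the mask that is overlaid on the image.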

Just like in the AkidaNet example, since ImageNet images are not publicly available, this example uses a set of ten copyright-free images found on Google using ImageNet class names.

Get sample images and preprocess them:

import os
import numpy as np

from tensorflow.io import read_file
from tensorflow.image import decode_jpeg
from tensorflow.keras.utils import get_file

from akida_models.imagenet import preprocessing

# Model specification and hyperparameters
NUM_CHANNELS = 3
IMAGE_SIZE = 224

NUM_IMAGES = 10

# Retrieve dataset file from Brainchip data server
file_path = get_file(
    "imagenet_like.zip",
    "https://data.brainchip.com/dataset-mirror/imagenet_like/imagenet_like.zip",
    cache_subdir='datasets/imagenet_like',
    extract=True)
data_folder = os.path.dirname(file_path)

# Load images for test set
x_test_files = []
x_test = np.zeros((NUM_IMAGES, IMAGE_SIZE, IMAGE_SIZE, NUM_CHANNELS)).astype('uint8')
for i in range(NUM_IMAGES):
    test_file = 'image_' + str(i + 1).zfill(2) + '.jpg'
    x_test_files.append(test_file)
    img_path = os.path.join(data_folder, test_file)
    base_image = read_file(img_path)
    image = decode_jpeg(base_image, channels=NUM_CHANNELS)
    image = preprocessing.preprocess_image(image, IMAGE_SIZE)
    x_test[i, :, :, :] = np.expand_dims(image, axis=0)

print(f'{NUM_IMAGES} images loaded and preprocessed.')
10 images loaded and preprocessed.

Build and display the attention map for one selected sample:

import cv2
import matplotlib.pyplot as plt

from keras import Model
from quantizeml.layers import ClassToken, Attention
from quantizeml.tensors import FixedPoint
from quantizeml.models.transforms.transforms_utils import get_layers_by_type


def build_attention_map(model, image):
    # Get the Attention layers list
    attentions = get_layers_by_type(model, Attention)

    # Calculate the number of tokens and deduce the grid size
    num_tokens = sum(isinstance(ly, ClassToken) for ly in model.layers)
    grid_size = int(np.sqrt(attentions[0].output_shape[0][-2] - num_tokens))

    # Get the attention weights from each transformer
    outputs = [la.output[1] for la in attentions]
    weights = Model(inputs=model.inputs, outputs=outputs).predict(np.expand_dims(image, 0))

    # Convert FixedPoint outputs to float if needed
    weights = [w.to_float() if isinstance(w, FixedPoint) else w for w in weights]
    weights = np.array(weights)

    # Number of heads and of transformer layers
    num_heads = weights.shape[2]
    num_layers = weights.shape[0]
    reshaped = weights.reshape((num_layers, num_heads, grid_size**2 + 1, grid_size**2 + 1))

    # Average the attention weights across all heads
    reshaped = reshaped.mean(axis=1)

    # To account for residual connections, we add an identity matrix to the attention matrix and
    # re-normalize the weights.
    reshaped = reshaped + np.eye(reshaped.shape[1])
    reshaped = reshaped / reshaped.sum(axis=(1, 2))[:, np.newaxis, np.newaxis]

    # Recursively multiply the weight matrices
    v = reshaped[-1]
    for n in range(1, len(reshaped)):
        v = np.matmul(v, reshaped[-1 - n])

    # Attention from the output token to the input space
    mask = v[0, 1:].reshape(grid_size, grid_size)
    mask = cv2.resize(mask / mask.max(), (image.shape[1], image.shape[0]))[..., np.newaxis]
    return (mask * image).astype("uint8")


# Use a specific image for which the attention map is easier to observe
image = x_test[8]

# Compute the attention map
attention_float = build_attention_map(model_keras, image)
attention_quantized = build_attention_map(model_quantized, image)

# Display the attention map
fig, (ax1, ax2, ax3) = plt.subplots(ncols=3)
ax1.axis('off')
ax1.set_title('Original')
ax1.imshow(image)

ax2.axis('off')
ax2.set_title('Float')
ax2.imshow(attention_float)

ax3.axis('off')
ax3.set_title('Quantized')
ax3.imshow(attention_quantized)
fig.suptitle('Attention masks', fontsize=10)
plt.show()
[Figure: attention masks, showing the original image next to the float and quantized model overlays]
1/1 [==============================] - ETA: 0s
1/1 [==============================] - 2s 2s/step

1/1 [==============================] - ETA: 0s
1/1 [==============================] - 35s 35s/step

Total running time of the script: (2 minutes 21.267 seconds)
