Akida models API

Imports models.

Layer blocks

conv_block

akida_models.layer_blocks.conv_block(inputs, filters, kernel_size, pooling=None, pool_size=(2, 2), add_batchnorm=False, add_activation=True, **kwargs)[source]

Adds a convolutional layer with optional layers in the following order: max pooling, batch normalization, activation.

Parameters

inputs (tf.Tensor) – input tensor of shape (rows, cols, channels)
filters (int) – the dimensionality of the output space (i.e. the number of output filters in the convolution).
kernel_size (int or tuple of 2 integers) – specifying the height and width of the 2D convolution kernel. Can be a single integer to specify the same value for all spatial dimensions.
pooling (str) – add a pooling layer of type ‘pooling’ among the values ‘max’, ‘avg’, ‘global_max’ or ‘global_avg’, with pooling size set to pool_size. If ‘None’, no pooling will be added.
pool_size (int or tuple of 2 integers) – factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimension. If only one integer is specified, the same window length will be used for both dimensions.
add_batchnorm (bool) – add a BatchNormalization layer
add_activation (bool) – add a ReLU layer
**kwargs – arguments passed to the keras.Conv2D layer, such as strides, padding, use_bias, weight_regularizer, etc.

Returns

output tensor of conv2D block.

Return type

tf.Tensor

separable_conv_block

akida_models.layer_blocks.separable_conv_block(inputs, filters, kernel_size, pooling=None, pool_size=(2, 2), add_batchnorm=False, add_activation=True, **kwargs)[source]

Adds a separable convolutional layer with optional layers in the following order: global average pooling, max pooling, batch normalization, activation.

Parameters

inputs (tf.Tensor) – input tensor of shape (height, width, channels)
filters (int) – the dimensionality of the output space (i.e. the number of output filters in the pointwise convolution).
kernel_size (int or tuple of 2 integers) – specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions.
pooling (str) – add a pooling layer of type ‘pooling’ among the values ‘max’, ‘avg’, ‘global_max’ or ‘global_avg’, with pooling size set to pool_size. If ‘None’, no pooling will be added.
pool_size (int or tuple of 2 integers) – factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimension. If only one integer is specified, the same window length will be used for both dimensions.
add_batchnorm (bool) – add a BatchNormalization layer
add_activation (bool) – add a ReLU layer
**kwargs – arguments passed to the keras.SeparableConv2D layer, such as strides, padding, use_bias, etc.

Returns

output tensor of separable conv block.

Return type

tf.Tensor

dense_block

akida_models.layer_blocks.dense_block(inputs, units, add_batchnorm=False, add_activation=True, **kwargs)[source]

Adds a dense layer with optional layers in the following order: batch normalization, activation.

Parameters

inputs (tf.Tensor) – Input tensor of shape (rows, cols, channels)
units (int) – dimensionality of the output space
add_batchnorm (bool) – add a BatchNormalization layer
add_activation (bool) – add a ReLU layer
**kwargs – arguments passed to the Dense layer, such as use_bias, kernel_initializer, weight_regularizer, etc.

Returns

output tensor of the dense block.

Return type

tf.Tensor

Helpers

BatchNormalization gamma constraint

akida_models.add_gamma_constraint(model)[source]

Method helper to add a MinValueConstraint to an existing model so that gamma values of its BatchNormalization layers are above a defined minimum.

This is typically used to help having a model that will be Akida compatible after conversion. In some cases, the mapping on hardware will fail because of huge values for threshold or act_step with a message indicating that a value cannot fit in a 20 bit signed or unsigned integer. In such a case, this helper can be called to apply a constraint that can fix the issue.

Note that in order for the constraint to be applied to the actual weights, some training must be done: for an already trained model, it can be on a few batches, one epoch or more depending on the impact the constraint has on accuracy. This helper can also be called to a new model that has not been trained yet.

Parameters: model (keras.Model) – the model for which gamma constraints will be added.
Returns: the same model with BatchNormalisation layers updated.
Return type: keras.Model

Knowledge distillation

class akida_models.distiller.Distiller(*args, **kwargs)[source]

The class that will be used to train the student model using the distillation knowledge method.

Reference Hinton et al. (2015).

Parameters

student (keras.Model) – the student model
teacher (keras.Model) – the well trained teacher model
alpha (float, optional) – weight to student_loss_fn and 1-alpha to distillation_loss_fn. Defaults to 0.1

akida_models.distiller.KLDistillationLoss(temperature=3)[source]

The KLDistillationLoss is a simple wrapper around the KLDivergence loss that accepts raw predictions instead of probability distributions.

Before invoking the KLDivergence loss, it converts the inputs predictions to probabilities by dividing them by a constant ‘temperature’ and applies a softmax.

Parameters: temperature (float) – temperature for softening probability distributions. Larger temperature gives softer distributions.

Pruning

akida_models.prune_model(model, acceptance_function, pruning_rates=None, prunable_layers_policy=<function neural_layers>, prunable_filters_policy=<function smallest_filters>)[source]

Prune model automatically based on an acceptance function.

The algorithm for filter pruning is as follows:

Select the first prunable layer (according to the prunable_layers_policy function).

As long as the acceptance_function returns True, prune successively the layer with different pruning rates (according to pruning_rates and prunable_filters_policy).

When the current pruned model is not acceptable, the last valid pruning rate is selected for the final pruned model.

Repeat steps 1, 2 and 3 for the next prunable layers.

Examples

acceptable_drop = 0.05

def evaluate(model):
    _, accuracy = model.evaluate(data, labels)
    return accuracy

ref_accuracy = evaluate(base_model)

def acceptance_function(pruned_model):
    # This function returns True if the pruned_model is acceptable.
    # Here, the pruned model is acceptable if the accuracy drops
    # less than 5% from the base model.

    return ref_accuracy - evaluate(pruned_model) <= acceptable_drop

# Prune model
pruned_model, pruning_rates = prune_model(model, acceptance_function)

Parameters

model (keras.Model) – a keras model to prune
acceptance_function (function) – a criterion function that returns True if the pruned model is acceptable. The signature must be function(model).
pruning_rates (list, optional) – a list of pruning rates to test. Default is [0.1, 0.2, …, 0.9].
prunable_layers_policy (function, optional) – a function returning a list of layers to prune in the model. The signature must be function(model), and must return a list of prunable layer names. By default, all neural layers (Conv2D/SeparableConv2D/Dense/ QuantizedConv2D/QuantizedSeparableConv2D/QuantizedDense) are candidates for pruning.
prunable_filters_policy (function, optional) – a function that returns the filters to prune in a given layer for a specific pruning rate. The signature must be function(layer, pruning_rate) and returns a list of indices to prune. By default, filters with the lowest magnitude are pruned.

Returns

the pruned model and the pruning rates.

Return type

tuple

akida_models.delete_filters(model, layer_to_prune, filters_to_prune)[source]

Deletes filters in the given layer and updates weights in it and its subsequent layers.

A pruned model is returned. Only linear models are supported.

Parameters

model (keras.Model) – the model to prune.
layer_to_prune (str) – the name of the neural layer where filters will be deleted.
filters_to_prune (list) – indices of filters to delete in the given layer.

Returns

the pruned model

Return type

keras.Sequential

Training

akida_models.training.freeze_model_before(model, freeze_before)[source]

Freezes the model before the given layer name.

Parameters

model (keras.Model) – the model to freeze
freeze_before (str) – name of the layer from which the model will not be frozen

Raises

ValueError – if the provided layer name was not found in the model

akida_models.training.evaluate_model(model, x, y=None, batch_size=None, steps=None, print_history=False)[source]

Evaluates model performances.

Parameters

model (keras.Model) – the model to evaluate
x (tf.Dataset, np.array or generator) – evaluation input data
y (tf.Dataset, np.array or generator, optional) – evaluation target data. Defaults to None.
batch_size (int, optional) – the batch size. Defaults to None.
steps (int, optional) – total number of steps before declaring the evaluation round finished. Defaults to None.
print_history (bool, optional) – either to print all history or just accuracy. Defaults to False.

akida_models.training.evaluate_akida_model(model, x, activation)[source]

Evaluates Akida model and return predictions and labels to compute accuracy.

Parameters

model (akida.Model) – the model to evaluate
x (tf.Dataset, np.array or generator) – evaluation input data
activation (str) – activation function to apply to potentials

Returns

predictions and labels

Return type

np.array, np.array

akida_models.training.compile_model(model, learning_rate=0.001, loss='categorical_crossentropy', metrics=None)[source]

Compiles the model using Adam optimizer.

Parameters

model (keras.Model) – the model to compile
learning_rate (float, optional) – the learning rate. Defaults to 1e-3.
loss (str or function, optional) – the loss function. Defaults to ‘categorical_crossentropy’.
metrics (list, optional) – list of metrics to be evaluated during training and testing. Defaults to None.

akida_models.training.calibrate_model(model, x, num_samples, batch_size=None, epochs=1)[source]

Calibrates model

Parameters

model (keras.Model) – the model to calibrate
x (tf.Dataset, np.array or generator) – train input data
num_samples (int) – total number of samples before declaring the calibration round finished.
batch_size (int, optional) – the batch size. Defaults to None.
epochs (int, optional) – the number of epochs. Defaults to 1.

MACS

akida_models.macs.get_flops(model)[source]

Calculate FLOPS for a tf.keras.Model or tf.keras.Sequential model in inference mode.

It uses tf.compat.v1.profiler under the hood.

Parameters: model (keras.Model) – the model to evaluate
Returns: object containing the FLOPS
Return type: tf.compat.v1.profiler.GraphNodeProto

akida_models.macs.display_macs(model_path, verbose=False)[source]

Displays the MACs for a keras model

By default it displays only the total MACS.

Parameters

model (keras.Model) – the model to evaluate
verbose (bool) – display MACS for each operation

Utils

akida_models.utils.fetch_file(origin, fname=None, file_hash=None, cache_subdir='datasets', extract=False, cache_dir=None)[source]

Downloads a file from a URL if it is not already in the cache.

Reimplements keras.utils.get_file without raising an error when detecting a file_hash mismatch (it will just re-download the model).

Parameters

origin (str) – original URL of the file.
fname (str, optional) – name of the file. If an absolute path /path/to/file.txt is specified the file will be saved at that location. If None, the name of the file at origin will be used. Defaults to None.
file_hash (str, optional) – the expected hash string of the file after download. Defaults to None.
cache_subdir (str, optional) – subdirectory under the Keras cache dir where the file is saved. If an absolute path /path/to/folder is specified the file will be saved at that location. Defaults to ‘datasets’.
extract (bool, optional) – True tries extracting the file as an Archive, like tar or zip. Defaults to False.
cache_dir (str, optional) – location to store cached files, when None it defaults to the default directory ~/.keras/. Defaults to None.

Returns

path to the downloaded file

Return type

str

Model zoo

AkidaNet

ImageNet

akida_models.akidanet_imagenet(input_shape=None, alpha=1.0, include_top=True, pooling=None, classes=1000, weight_quantization=0, activ_quantization=0, input_weight_quantization=None, input_scaling=(128, - 1))[source]

Instantiates the AkidaNet architecture.

Parameters

input_shape (tuple, optional) – shape tuple. Defaults to None.
alpha (float, optional) –
controls the width of the model. Defaults to 1.0.
- If alpha < 1.0, proportionally decreases the number of filters in each layer.
- If alpha > 1.0, proportionally increases the number of filters in each layer.
- If alpha = 1, default number of filters from the paper are used at each layer.
include_top (bool, optional) – whether to include the fully-connected layer at the top of the model. Defaults to True.
pooling (str, optional) –
optional pooling mode for feature extraction when include_top is False. Defaults to None.
- None means that the output of the model will be the 4D tensor output of the last convolutional block.
- avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
classes (int, optional) – optional number of classes to classify images into, only to be specified if include_top is True. Defaults to 1000.
weight_quantization (int, optional) –
sets all weights in the model to have a particular quantization bitwidth except for the weights in the first layer. Defaults to 0.
- ’0’ implements floating point 32-bit weights.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
activ_quantization (int, optional) –
sets all activations in the model to have a particular activation quantization bitwidth. Defaults to 0.
- ’0’ implements floating point 32-bit activations.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
input_weight_quantization (int, optional) –
sets weight quantization in the first layer. Defaults to weight_quantization value.
- ’0’ implements floating point 32-bit weights.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
input_scaling (tuple, optional) – scale factor and offset to apply to inputs. Defaults to (128, -1). Note that following Akida convention, the scale factor is an integer used as a divider.

Returns

a Keras model for AkidaNet/ImageNet.

Return type

keras.Model

Raises

ValueError – in case of invalid input shape.

akida_models.akidanet_imagenet_pretrained(alpha=1.0)[source]

Helper method to retrieve an akidanet_imagenet model that was trained on ImageNet dataset.

Parameters: alpha (float, optional) – width of the model, allowed values in [0.25, 0.5, 1]. Defaults to 1.0.
Returns: a Keras Model instance.
Return type: keras.Model

akida_models.akidanet_edge_imagenet(base_model, classes)[source]

Instantiates an AkidaNet-edge architecture.

Parameters

base_model (str/keras.Model) – an akidanet_imagenet quantized model.
classes (int) – the number of classes for the edge classifier.

Returns

a Keras Model instance.

Return type

keras.Model

akida_models.akidanet_edge_imagenet_pretrained()[source]

Helper method to retrieve a akidanet_edge_imagenet model that was trained on ImageNet dataset.

Returns: a Keras Model instance.
Return type: keras.Model

akida_models.akidanet_imagenette_pretrained(alpha=1.0)[source]

Helper method to retrieve a akidanet_imagenet model that was trained on Imagenette dataset.

Parameters: alpha (float, optional) – width of the model, allowed values in [0.25, 0.5, 1]. Defaults to 1.0.
Returns: a Keras Model instance.
Return type: keras.Model

akida_models.akidanet_cats_vs_dogs_pretrained()[source]

Helper method to retrieve an akidanet_imagenet model that was trained on Cats vs.Dogs dataset.

Returns: a Keras Model instance.
Return type: keras.Model

akida_models.akidanet_faceidentification_pretrained()[source]

Helper method to retrieve an akidanet_imagenet model that was trained on CASIA Webface dataset and that performs face identification.

Returns: a Keras Model instance.
Return type: keras.Model

akida_models.akidanet_faceidentification_edge_pretrained()[source]

Helper method to retrieve an akidanet_edge_imagenet model that was trained on CASIA Webface dataset and that performs face identification.

Returns: a Keras Model instance.
Return type: keras.Model

akida_models.akidanet_faceverification_pretrained()[source]

Helper method to retrieve an akidanet_imagenet model that was trained on CASIA Webface dataset and optimized with ArcFace that can perform face verification on LFW.

Returns: a Keras Model instance.
Return type: keras.Model

akida_models.akidanet_melanoma_pretrained()[source]

Helper method to retrieve an akidanet_imagenet model that was trained on SIIM-ISIC Melanoma Classification dataset.

Returns: a Keras Model instance.
Return type: keras.Model

akida_models.akidanet_odir5k_pretrained()[source]

Helper method to retrieve an akidanet_imagenet model that was trained on ODIR-5K dataset.

The model focuses on the following classes that are a part of the original dataset: normal, cataract, AMD (age related macular degeneration) and pathological myopia.

Returns: a Keras Model instance.
Return type: keras.Model

akida_models.akidanet_retinal_oct_pretrained()[source]

Helper method to retrieve an akidanet_imagenet model that was trained on retinal OCT dataset.

Returns: a Keras Model instance.
Return type: keras.Model

akida_models.akidanet_ecg_pretrained()[source]

Helper method to retrieve an akidanet_imagenet model that was trained on ECG classification Physionet2017 dataset.

Returns: a Keras Model instance.
Return type: keras.Model

akida_models.akidanet_plantvillage_pretrained()[source]

Helper method to retrieve an akidanet_imagenet model that was trained on PlantVillage dataset.

Returns: a Keras Model instance.
Return type: keras.Model

akida_models.akidanet_cifar10_pretrained()[source]

Helper method to retrieve an akidanet_imagenet model that was trained on CIFAR-10 dataset. Since CIFAR-10 images have a 32x32 size, they need to be resized to match akidanet input layer. This can be done by calling the ‘resize_image’ function available under akida_models.cifar10.preprocessing.

Returns: a Keras Model instance.
Return type: keras.Model

akida_models.akidanet_vww_pretrained()[source]

Helper method to retrieve an akidanet_imagenet model that was trained on VWW dataset.

Returns: a Keras Model instance.
Return type: keras.Model

Preprocessing

akida_models.imagenet.preprocessing.preprocess_image(image, image_size, training=False, data_aug=None)[source]

ImageNet data preprocessing.

Preprocessing includes cropping, and resizing for both training and validation images. Training preprocessing introduces some random distortion of the image to improve accuracy.

Parameters

image (tf.Tensor) – input image as a 3-D tensor
image_size (int) – desired image size
training (bool, optional) – True for training preprocessing, False for validation and inference. Defaults to False.
data_aug (keras.Sequential, optional) – data augmentation. Defaults to None.

Returns

preprocessed image

Return type

tensorflow.Tensor

akida_models.imagenet.preprocessing.index_to_label(index)[source]

Function to get an ImageNet label from an index.

Parameters: index – between 0 and 999
Returns: a string of comma separated labels
Return type: str

Mobilenet

ImageNet

akida_models.mobilenet_imagenet(input_shape=None, alpha=1.0, dropout=0.001, include_top=True, pooling=None, classes=1000, use_stride2=False, weight_quantization=0, activ_quantization=0, input_weight_quantization=None, input_scaling=(128, - 1))[source]

Instantiates the MobileNet architecture.

Parameters

input_shape (tuple, optional) – shape tuple. Defaults to None.
alpha (float, optional) –
controls the width of the model. Defaults to 1.0.
- If alpha < 1.0, proportionally decreases the number of filters in each layer.
- If alpha > 1.0, proportionally increases the number of filters in each layer.
- If alpha = 1, default number of filters from the paper are used at each layer.
dropout (float, optional) – dropout rate. Defaults to 1e-3.
include_top (bool, optional) – whether to include the fully-connected layer at the top of the model. Defaults to True.
pooling (str, optional) –
optional pooling mode for feature extraction when include_top is False. Defaults to None.
- None means that the output of the model will be the 4D tensor output of the last convolutional block.
- avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
classes (int, optional) – optional number of classes to classify images into, only to be specified if include_top is True. Defaults to 1000.
use_stride2 (bool, optional) – replace max pooling operations by stride 2 convolutions in layers separable 2, 4, 6 and 12. Defaults to False.
weight_quantization (int, optional) –
sets all weights in the model to have a particular quantization bitwidth except for the weights in the first layer. Defaults to 0.
- ’0’ implements floating point 32-bit weights.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
activ_quantization (int, optional) –
sets all activations in the model to have a particular activation quantization bitwidth. Defaults to 0.
- ’0’ implements floating point 32-bit activations.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
input_weight_quantization (int, optional) –
sets weight quantization in the first layer. Defaults to weight_quantization value.
- ’0’ implements floating point 32-bit weights.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
input_scaling (tuple, optional) – scale factor and offset to apply to inputs. Defaults to (128, -1). Note that following Akida convention, the scale factor is an integer used as a divider.

Returns

a Keras model for MobileNet/ImageNet.

Return type

keras.Model

Raises

ValueError – in case of invalid input shape.

akida_models.mobilenet_imagenet_pretrained(alpha=1.0)[source]

Helper method to retrieve a mobilenet_imagenet model that was trained on ImageNet dataset.

Parameters: alpha (float) – width of the model.
Returns: a Keras Model instance.
Return type: keras.Model

akida_models.mobilenet_edge_imagenet(base_model, classes)[source]

Instantiates a MobileNet-edge architecture.

Parameters

base_model (str/keras.Model) – a mobilenet_imagenet quantized model.
classes (int) – the number of classes for the edge classifier.

Returns

a Keras Model instance.

Return type

keras.Model

akida_models.mobilenet_edge_imagenet_pretrained()[source]

Helper method to retrieve a mobilenet_edge_imagenet model that was trained on ImageNet dataset.

Returns: a Keras Model instance.
Return type: keras.Model

DS-CNN

KWS

akida_models.ds_cnn_kws(input_shape=(49, 10, 1), classes=33, include_top=True, weight_quantization=0, activ_quantization=0, input_weight_quantization=None, input_scaling=(255, 0))[source]

Instantiates a MobileNet-like model for the “Keyword Spotting” example.

This model is based on the MobileNet architecture, mainly with fewer layers. The weights and activations are quantized such that it can be converted into an Akida model.

This architecture is originated from https://arxiv.org/pdf/1711.07128.pdf and was created for the “Keyword Spotting” (KWS) or “Speech Commands” dataset.

Parameters

input_shape (tuple) – input shape tuple of the model
classes (int) – optional number of classes to classify words into, only be specified if include_top is True.
include_top (bool) – whether to include the fully-connected layer at the top of the model.
weight_quantization (int) –
sets all weights in the model to have a particular quantization bitwidth except for the weights in the first layer.
- ’0’ implements floating point 32-bit weights.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
activ_quantization (int) –
sets all activations in the model to have a particular activation quantization bitwidth.
- ’0’ implements floating point 32-bit activations.
- ’1’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
input_weight_quantization (int) –
sets weight quantization in the first layer. Defaults to weight_quantization value.
- ’None’ implements the same bitwidth as the other weights.
- ’0’ implements floating point 32-bit weights.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
input_scaling (tuple, optional) – scale factor and offset to apply to inputs. Defaults to (255, 0). Note that following Akida convention, the scale factor is an integer used as a divider.

Returns

a Keras model for MobileNet/KWS

Return type

keras.Model

akida_models.ds_cnn_kws_pretrained()[source]

Helper method to retrieve a ds_cnn_kws model that was trained on KWS dataset.

Returns: a Keras Model instance.
Return type: keras.Model

Preprocessing

akida_models.kws.preprocessing.prepare_model_settings(sample_rate, clip_duration_ms, window_size_ms, window_stride_ms, feature_bin_count)[source]

Calculates common settings needed for all models.

Parameters

sample_rate – Number of audio samples per second.
clip_duration_ms – Length of each audio clip to be analyzed.
window_size_ms – Duration of frequency analysis window.
window_stride_ms – How far to move in time between frequency windows.
feature_bin_count – Number of frequency bins to use for analysis.

Returns

Dictionary containing common settings.

Raises

ValueError – If the preprocessing mode isn’t recognized.

akida_models.kws.preprocessing.prepare_words_list(wanted_words)[source]

Prepends common tokens to the custom word list.

Parameters: wanted_words – List of strings containing the custom words.
Returns: List with the standard silence and unknown tokens added.

akida_models.kws.preprocessing.which_set(filename, validation_percentage, testing_percentage)[source]

Determines which data partition the file should belong to.

We want to keep files in the same training, validation, or testing sets even if new ones are added over time. This makes it less likely that testing samples will accidentally be reused in training when long runs are restarted for example. To keep this stability, a hash of the filename is taken and used to determine which set it should belong to. This determination only depends on the name and the set proportions, so it won’t change as other files are added.

It’s also useful to associate particular files as related (for example words spoken by the same person), so anything after ‘_nohash_’ in a filename is ignored for set determination. This ensures that ‘bobby_nohash_0.wav’ and ‘bobby_nohash_1.wav’ are always in the same set, for example.

Parameters

filename – File path of the data sample.
validation_percentage – How much of the data set to use for validation.
testing_percentage – How much of the data set to use for testing.

Returns

String, one of ‘training’, ‘validation’, or ‘testing’.

class akida_models.kws.preprocessing.AudioProcessor(sample_rate, clip_duration_ms, window_size_ms, window_stride_ms, feature_bin_count, data_url=None, data_dir=None, silence_percentage=0, unknown_percentage=0, wanted_words=None, validation_percentage=0, testing_percentage=0)[source]

Handles loading, partitioning, and preparing audio training data.

Methods:

`get_augmented_data_for_wav`(wav_filename, ...)	Applies the feature transformation process to a wav audio file,
`get_data`(how_many, offset, ...)	Gather samples from the data set, applying transformations as needed.
`get_features_for_wav`(wav_filename)	Applies the feature transformation process to the input_wav.
`maybe_download_and_extract_dataset`(data_url, ...)	Download and extract data set tar file.
`prepare_background_data`()	Searches a folder for background noise audio, and loads it into
`prepare_data_index`(silence_percentage, ...)	Prepares a list of the samples organized by set and label.
`prepare_processing_graph`()	Builds a TensorFlow graph to apply the input distortions.

get_augmented_data_for_wav(wav_filename, background_frequency, background_volume_range, time_shift, num_augmented_samples=1)[source]

Applies the feature transformation process to a wav audio file,: adding data augmentation (background noise and time shifting).

Parameters

wav_filename (str) – The path to the input audio file.
background_frequency – How many clips will have background noise, 0.0 to 1.0.
background_volume_range – How loud the background noise will be.
time_shift – How much to randomly shift the clips by in time.
num_augmented_samples – How many samples will be generated using data augmentation.

Returns

Numpy data array containing the generated features for every augmented: sample.

get_data(how_many, offset, background_frequency, background_volume_range, time_shift, mode)[source]

Gather samples from the data set, applying transformations as needed.

When the mode is ‘training’, a random selection of samples will be returned, otherwise the first N clips in the partition will be used. This ensures that validation always uses the same samples, reducing noise in the metrics.

Parameters

how_many – Desired number of samples to return. -1 means the entire contents of this partition.
offset – Where to start when fetching deterministically.
background_frequency – How many clips will have background noise, 0.0 to 1.0.
background_volume_range – How loud the background noise will be.
time_shift – How much to randomly shift the clips by in time.
mode – Which partition to use, must be ‘training’, ‘validation’, or ‘testing’.

Returns

List of sample data for the transformed samples, and list of label indexes

Raises

ValueError – If background samples are too short.

get_features_for_wav(wav_filename)[source]

Applies the feature transformation process to the input_wav.

Runs the feature generation process (generally producing a spectrogram from the input samples) on the WAV file. This can be useful for testing and verifying implementations being run on other platforms.

Parameters: wav_filename – The path to the input audio file.
Returns: Numpy data array containing the generated features.

static maybe_download_and_extract_dataset(data_url, dest_directory)[source]

Download and extract data set tar file.

If the data set we’re using doesn’t already exist, this function downloads it from the TensorFlow.org website and unpacks it into a directory. If the data_url is none, don’t download anything and expect the data directory to contain the correct files already.

Parameters

data_url – Web location of the tar file containing the data set.
dest_directory – File path to extract data to.

prepare_background_data()[source]

Searches a folder for background noise audio, and loads it into: memory.

It’s expected that the background audio samples will be in a subdirectory named ‘_background_noise_’ inside the ‘data_dir’ folder, as .wavs that match the sample rate of the training data, but can be much longer in duration.

If the ‘_background_noise_’ folder doesn’t exist at all, this isn’t an error, it’s just taken to mean that no background noise augmentation should be used. If the folder does exist, but it’s empty, that’s treated as an error.

Returns: List of raw PCM-encoded audio samples of background noise.
Raises: Exception – If files aren’t found in the folder.

prepare_data_index(silence_percentage, unknown_percentage, wanted_words, validation_percentage, testing_percentage)[source]

Prepares a list of the samples organized by set and label.

The training loop needs a list of all the available data, organized by which partition it should belong to, and with ground truth labels attached. This function analyzes the folders below the data_dir, figures out the right labels for each file based on the name of the subdirectory it belongs to, and uses a stable hash to assign it to a data set partition.

Parameters

silence_percentage – How much of the resulting data should be background.
unknown_percentage – How much should be audio outside the wanted classes.
wanted_words – Labels of the classes we want to be able to recognize.
validation_percentage – How much of the data set to use for validation.
testing_percentage – How much of the data set to use for testing.

Returns

Dictionary containing a list of file information for each set partition, and a lookup map for each class to determine its numeric index.

Raises

Exception – If expected files are not found.

prepare_processing_graph()[source]

Builds a TensorFlow graph to apply the input distortions.

Creates a graph that loads a WAVE file, decodes it, scales the volume, shifts it in time, adds in background noise, calculates a spectrogram, and then builds an MFCC fingerprint from that.

VGG

ImageNet

akida_models.vgg_imagenet(input_shape=(224, 224, 3), classes=1000, include_top=True, pooling=None, weight_quantization=0, activ_quantization=0, input_weight_quantization=None, input_scaling=(128, - 1))[source]

Instantiates a VGG11 architecture with reduced number of filters in convolutional layers (i.e. a quarter of the filters of the original implementation of https://arxiv.org/pdf/1409.1556.pdf).

Parameters

input_shape (tuple, optional) – input shape tuple. Defaults to (224, 224, 3).
classes (int, optional) – optional number of classes to classify images into. Defaults to 1000.
include_top (bool, optional) – whether to include the classification layers at the top of the model. Defaults to True.
pooling (str, optional) –
Optional pooling mode for feature extraction when include_top is False. Defaults to None.
- None means that the output of the model will be the 4D tensor output of the last convolutional block.
- avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.
weight_quantization (int, optional) –
sets all weights in the model to have a particular quantization bitwidth except for the weights in the first layer. Defaults to 0.
- ’0’ implements floating point 32-bit weights.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
activ_quantization (int, optional) –
sets all activations in the model to have a particular activation quantization bitwidth. Defaults to 0.
- ’0’ implements floating point 32-bit activations.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
input_weight_quantization (int, optional) –
sets weight quantization in the first layer. Defaults to weight_quantization value.
- ’0’ implements floating point 32-bit weights.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
input_scaling (tuple, optional) – scale factor and offset to apply to inputs. Defaults to (128, -1). Note that following Akida convention, the scale factor is an integer used as a divider.

Returns

a Keras model for VGG/ImageNet

Return type

keras.Model

akida_models.vgg_imagenet_pretrained()[source]

Helper method to retrieve a vgg_imagenet model that was trained on ImageNet dataset.

Returns: a Keras Model instance.
Return type: keras.Model

UTK Face

akida_models.vgg_utk_face(input_shape=(32, 32, 3), weight_quantization=0, activ_quantization=0, input_weight_quantization=None, input_scaling=(127, - 1))[source]

Instantiates a VGG-like model for the regression example on age estimation using UTKFace dataset.

Parameters

input_shape (tuple) – input shape tuple of the model
weight_quantization (int) –
sets all weights in the model to have a particular quantization bitwidth except for the weights in the first layer.
- ’0’ implements floating point 32-bit weights.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
activ_quantization (int) –
sets all activations in the model to have a particular activation quantization bitwidth.
- ’0’ implements floating point 32-bit activations.
- ’1’ through ‘8’ implements n-bit weights where n is from 1-8 bits.
input_weight_quantization (int) –
sets weight quantization in the first layer. Defaults to weight_quantization value.
- ’None’ implements the same bitwidth as the other weights.
- ’0’ implements floating point 32-bit weights.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
input_scaling (tuple, optional) – scale factor and offset to apply to inputs. Defaults to (127, -1). Note that following Akida convention, the scale factor is an integer used as a divider.

Returns

a Keras model for VGG/UTKFace

Return type

keras.Model

akida_models.vgg_utk_face_pretrained()[source]

Helper method to retrieve a vgg_utk_face model that was trained on UTK Face dataset.

Returns: a Keras Model instance.
Return type: keras.Model

Preprocessing

akida_models.utk_face.preprocessing.load_data()[source]

Loads the dataset from Brainchip data server.

Returns: train set, train labels, test set and test labels as numpy arrays
Return type: np.array, np.array, np.array, np.array

YOLO

akida_models.yolo_base(input_shape=(224, 224, 3), classes=1, nb_box=5, alpha=1.0, weight_quantization=0, activ_quantization=0, input_weight_quantization=None, input_scaling=(127.5, - 1))[source]

Instantiates the YOLOv2 architecture.

Parameters

input_shape (tuple) – input shape tuple
classes (int) – number of classes to classify images into
nb_box (int) – number of anchors boxes to use
alpha (float) – controls the width of the model
weight_quantization (int) –
sets all weights in the model to have a particular quantization bitwidth except for the weights in the first layer.
- ’0’ implements floating point 32-bit weights.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
activ_quantization –
sets all activations in the model to have a particular activation quantization bitwidth.
- ’0’ implements floating point 32-bit activations.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
input_weight_quantization –
sets weight quantization in the first layer. Defaults to weight_quantization value.
- ’0’ implements floating point 32-bit weights.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
input_scaling (tuple, optional) – scale factor and offset to apply to inputs. Defaults to (127.5, -1). Note that following Akida convention, the scale factor is a number used as a divider.

Returns

a Keras Model instance.

Return type

keras.Model

akida_models.yolo_widerface_pretrained()[source]

Helper method to retrieve a yolo_base model that was trained on WiderFace dataset and the anchors that are needed to interpet the model output.

Returns: a Keras Model instance and a list of anchors.
Return type: keras.Model, list

akida_models.yolo_voc_pretrained()[source]

Helper method to retrieve a yolo_base model that was trained on PASCAL VOC2012 dataset for ‘person’ and ‘car’ classes only, and the anchors that are needed to interpet the model output.

Returns: a Keras Model instance and a list of anchors.
Return type: keras.Model, list

YOLO Toolkit

Processing

akida_models.detection.processing.load_image(image_path)[source]

Loads an image from a path.

Parameters: image_path (string) – full path of the image to load
Returns: a Tensorflow image Tensor

akida_models.detection.processing.preprocess_image(image_buffer, output_size)[source]

Preprocess an image for YOLO inference.

Parameters

image_buffer (tf.Tensor) – image to preprocess
output_size (tuple) – shape of the image after preprocessing

Returns

A resized and normalized image as a Numpy array.

akida_models.detection.processing.decode_output(output, anchors, nb_classes, obj_threshold=0.5, nms_threshold=0.5)[source]

Decodes a YOLO model output.

Parameters

output (tf.Tensor) – model output to decode
anchors (list) – list of anchors boxes
nb_classes (int) – number of classes
obj_threshold (float) – confidence threshold for a box
nms_threshold (float) – non-maximal supression threshold

Returns

List of BoundingBox objects

akida_models.detection.processing.parse_voc_annotations(gt_folder, image_folder, file_path, labels)[source]

Loads PASCAL-VOC data.

Data is loaded using the groundtruth informations and stored in a dictionary.

Parameters

gt_folder (str) – path to the folder containing ground truth files
image_folder (str) – path to the folder containing the images
file_path (str) – file containing the list of files to parse
labels (list) – list of labels of interest

Returns

a dictionnary containing all data present in the ground truth file

Return type

dict

akida_models.detection.processing.parse_widerface_annotations(gt_file, image_folder)[source]

Loads WiderFace data.

Data is loaded using the groundtruth informations and stored in a dictionary.

Parameters

gt_file (str) – path to the ground truth file
image_folder (str) – path to the directory containing the images

Returns

a dictionnary containing all data present in the ground truth file

Return type

dict

class akida_models.detection.processing.BoundingBox(x1, y1, x2, y2, score=- 1, classes=None)[source]

Utility class to represent a bounding box.

The box is defined by its top left corner (x1, y1), bottom right corner (x2, y2), label, score and classes.

Methods:

`get_label`()	Returns the label for this bounding box.
`get_score`()	Returns the score for this bounding box.
`iou`(other)	Computes intersection over union ratio between this bounding box and another one.

get_label()[source]

Returns the label for this bounding box.

Returns: Index of the label as an integer.

get_score()[source]

Returns the score for this bounding box.

Returns: Confidence as a float.

iou(other)[source]

Computes intersection over union ratio between this bounding box and another one.

Parameters: other (BoundingBox) – the other bounding box for IOU computation
Returns: IOU value as a float

Performances

class akida_models.detection.map_evaluation.MapEvaluation(model, val_data, labels, anchors, period=1, obj_threshold=0.5, nms_threshold=0.5, max_box_per_image=10, is_keras_model=True, decode_output_fn=<function decode_output>)[source]

Evaluate a given dataset using a given model. Code originally from https://github.com/fizyr/keras-retinanet.

Parameters

model (keras.Model) – model to evaluate.
val_data (dict) – dictionary containing validation data as obtained using preprocess_widerface.py module
labels (list) – list of labels as strings
anchors (list) – list of anchors boxes
period (int, optional) – periodicity the precision is printed, defaults to once per epoch.
obj_threshold (float, optional) – confidence threshold for a box
nms_threshold (float, optional) – non-maximal supression threshold
max_box_per_image (int, optional) – maximum number of detections per image
is_keras_model (bool, optional) – indicated if the model is a Keras model (True) or an Akida model (False)
decode_output_fn (Callable, optional) – function to decode model’s outputs. Defaults to decode_output() (yolo decode output function).

Returns

A dict mapping class names to mAP scores.

Methods:

`evaluate_map`()	Evaluates current mAP score on the model.
`on_epoch_end`(epoch[, logs])	Keras callback called at the end of an epoch.

evaluate_map()[source]

Evaluates current mAP score on the model.

Returns: global mAP score and dictionnary of label and mAP for each class.
Return type: tuple

on_epoch_end(epoch, logs=None)[source]

Keras callback called at the end of an epoch.

Parameters

epoch (int) – index of epoch.
logs (dict, optional) – metric results for this training epoch, and for the validation epoch if validation is performed. Validation result keys are prefixed with val. For training epoch, the values of the Model’s metrics are returned. Example: {‘loss’: 0.2, ‘acc’: 0.7}. Defaults to None.

Anchors

akida_models.detection.generate_anchors.generate_anchors(annotations_data, num_anchors=5, grid_size=(7, 7))[source]

Creates anchors by clustering dimensions of the ground truth boxes from the training dataset.

Parameters

annotations_data (dict) – dictionnary of preprocessed VOC data
num_anchors (int, optional) – number of anchors
grid_size (tuple, optional) – size of the YOLO grid

Returns

the computed anchors

Return type

list

ConvTiny

CWRU

akida_models.convtiny_cwru()[source]

Instantiates a CNN for CWRU classification with input shape (32, 32, 1) and 10 classes.

Returns: a Keras model for Convtiny/CWRU
Return type: keras.Model

akida_models.convtiny_cwru_pretrained()[source]

Helper method to retrieve a convtiny_cwru model that was trained on CWRU dataset.

Returns: a Keras Model instance.
Return type: keras.Model

PointNet++

ModelNet40

akida_models.pointnet_plus_modelnet40(selected_points=128, features=3, knn_points=64, classes=40, alpha=1.0, weight_quantization=0, activ_quantization=0)[source]

Instantiates a PointNet++ model for the ModelNet40 classification.

This example implements the point cloud deep learning paper PointNet (Qi et al., 2017). For a detailed introduction on PointNet see this blog post.

PointNet++ is conceived as a repeated series of operations: sampling and grouping of points, followed by the trainable convnet itself. Those operations are then repeated at increased scale. Each of the selected points is taken as the centroid of the K-nearest neighbours. This defines a localized group.

Parameters

selected_points (int, optional) – the number of points to process per sample. Default is 128.
features (int, optional) – the number of features. Expected values are 1 or 3. Default is 3.
knn_points (int, optional) – the number of points to include in each localised group. Must be a power of 2, and ideally an integer square (so 64, or 16 for a deliberately small network, or 256 for large). Default is 64.
classes (int, optional) – the number of classes for the classifier. Default is 40.
alpha (float, optional) – network filters multiplier. Default is 1.0.
weight_quantization (int, optional) –
sets all weights in the model to have a particular quantization bitwidth except for the weights in the first layer. Defaults to 0.
- ’0’ implements floating point 32-bit weights.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
activ_quantization (int, optional) –
sets all activations in the model to have a particular activation quantization bitwidth. Defaults to 0.
- ’0’ implements floating point 32-bit activations.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.

Returns

a quantized Keras model for PointNet++/ModelNet40.

Return type

keras.Model

akida_models.pointnet_plus_modelnet40_pretrained()[source]

Helper method to retrieve a pointnet_plus model that was trained on ModelNet40 dataset.

Returns: a Keras Model instance.
Return type: keras.Model

Processing

akida_models.modelnet40.preprocessing.get_modelnet_from_file(num_points, filename='ModelNet40.zip')[source]

Load the ModelNet data from file.

First parse through the ModelNet data folders. Each mesh is loaded and sampled into a point cloud before being added to a standard python list and converted to a numpy array. We also store the current enumerate index value as the object label and use a dictionary to recall this later.

Parameters

num_points (int) – number of points with which mesh is sample.
filename (str) – the dataset file to load if the npz file was not generated yet. Defaults to “ModelNet40.zip”.

Returns

train set, train labels, test set, test labels as numpy arrays and dict containing class folder name.

Return type

np.array, np.array, np.array, np.array, dict

akida_models.modelnet40.preprocessing.get_modelnet(train_points, train_labels, test_points, test_labels, batch_size, selected_points=128, knn_points=64)[source]

Obtains the ModelNet dataset.

Parameters

train_points (numpy.array) – train set.
train_labels (numpy.array) – train labels.
test_points (numpy.array) – test set.
test_labels (numpy.array) – test labels.
batch_size (int) – size of the batch.
selected_points (int) – num points to process per sample. Default is 512.
knn_points (int) – number of points to include in each localised group. Must be a power of 2, and ideally an integer square (so 64, or 16 for a deliberately small network, or 256 for large).

Returns

train and test point with data augmentation.

Return type

tf.data.Dataset, tf.data.Dataset

GXNOR

MNIST

akida_models.gxnor_mnist(weight_quantization=0, activ_quantization=0, input_weight_quantization=None)[source]

Instantiates a Keras GXNOR model with an additional dense layer to make better classification.

The paper describing the original model can be found here.

Parameters

weight_quantization (int, optional) –
sets all weights in the model to have a particular quantization bitwidth except for the weights in the first layer. Defaults to 0.
- ’0’ implements floating point 32-bit weights.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
activ_quantization (int, optional) –
sets all activations in the model to have a particular activation quantization bitwidth. Defaults to 0.
- ’0’ implements floating point 32-bit activations.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.
input_weight_quantization (int, optional) –
sets weight quantization in the first layer. Defaults to weight_quantization value.
- ’0’ implements floating point 32-bit weights.
- ’2’ through ‘8’ implements n-bit weights where n is from 2-8 bits.

Returns

a Keras model for GXNOR/MNIST

Return type

keras.Model

akida_models.gxnor_mnist_pretrained()[source]

Helper method to retrieve a gxnor_mnist model that was trained on MNIST dataset.

This model was trained with the distillation knowledge method, using the EfficientNet model from this repository and the Distiller class from the knowledge distillation toolkit (akida_models.distiller).

The float training was done for 30 epochs with a learning rate of 1e-4 After that we gradually quantize the model from: 8-4-4 –> 4-4-4 –> 4-4-2 –> 2-2-2 –> 2-2-1 tuning the model at each step with the same distillation training method for 5 epochs and a learning rate of 5e-5.

Returns: a Keras Model instance.
Return type: keras.Model