Demo 3: HKR classifier on MNIST dataset

This notebook will demonstrate learning a binary task on the MNIST0-8 dataset.

# pip install deel-lip -qqq

import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Flatten
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import binary_accuracy
from tensorflow.keras.models import Sequential

from deel.lip.layers import (
    SpectralConv2D,
    SpectralDense,
    FrobeniusDense,
    ScaledL2NormPooling2D,
)
from deel.lip.activations import MaxMin, GroupSort, GroupSort2, FullSort
from deel.lip.losses import HKR, KR, HingeMargin

Data preparation

For this task we select two classes: 0 and 8. Labels are changed to {-1,1}, which is compatible with the hinge term used in the loss.

from tensorflow.keras.datasets import mnist

# first we select the two classes
selected_classes = [0, 8]  # must be two classes as we perform binary classification


def prepare_data(x, y, class_a=0, class_b=8):
    """
    This function converts the MNIST data to make it suitable for our binary
    classification setup.
    """
    # select items from the two selected classes
    # boolean mask keeping only items from class_a or class_b
    mask = (y == class_a) | (y == class_b)
    x = x[mask]
    y = y[mask]
    x = x.astype("float32")
    y = y.astype("float32")
    # convert from range int[0,255] to float32[0,1]
    x /= 255
    x = x.reshape((-1, 28, 28, 1))
    # change labels to {-1,1}: class_a -> 1, class_b -> -1
    is_class_a = y == class_a  # computed once so relabeled values are never re-matched
    y[is_class_a] = 1.0
    y[~is_class_a] = -1.0
    return x, y


# now we load the dataset
(x_train, y_train_ord), (x_test, y_test_ord) = mnist.load_data()

# prepare the data
x_train, y_train = prepare_data(
    x_train, y_train_ord, selected_classes[0], selected_classes[1]
)
x_test, y_test = prepare_data(
    x_test, y_test_ord, selected_classes[0], selected_classes[1]
)

# display infos about dataset
print(
    "train set size: %i samples, classes proportions: %.3f percent"
    % (y_train.shape[0], 100 * y_train[y_train == 1].sum() / y_train.shape[0])
)
print(
    "test set size: %i samples, classes proportions: %.3f percent"
    % (y_test.shape[0], 100 * y_test[y_test == 1].sum() / y_test.shape[0])
)

train set size: 11774 samples, classes proportions: 50.306 percent
test set size: 1954 samples, classes proportions: 50.154 percent

Build Lipschitz model

Let's first make the parameters of this experiment explicit.

# training parameters
epochs = 10
batch_size = 128

# network parameters
activation = GroupSort2  # alternatives: MaxMin, GroupSort, FullSort

# loss parameters
min_margin = 1.0
alpha = 10.0

Now we can build the network. Here the experiment is done with an MLP, but deel-lip also provides state-of-the-art 1-Lipschitz convolutions.

K.clear_session()
# build the 1-Lipschitz MLP
wass = Sequential(
    layers=[
        Input((28, 28, 1)),
        Flatten(),
        SpectralDense(32, GroupSort2(), use_bias=True),
        SpectralDense(16, GroupSort2(), use_bias=True),
        FrobeniusDense(1, activation=None, use_bias=False),
    ],
    name="lipModel",
)
wass.summary()

Model: "lipModel"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
flatten (Flatten)            (None, 784)               0
_________________________________________________________________
spectral_dense (SpectralDens (None, 32)                50241
_________________________________________________________________
spectral_dense_1 (SpectralDe (None, 16)                1057
_________________________________________________________________
frobenius_dense (FrobeniusDe (None, 1)                 32
=================================================================
Total params: 51,330
Trainable params: 25,664
Non-trainable params: 25,666
_________________________________________________________________
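
To give an intuition for why the layers above keep the network 1-Lipschitz, here is a minimal NumPy sketch. It is not deel-lip's implementation: `spectral_normalize` and `group_sort2` are hypothetical helpers written for illustration, the normalization uses an exact SVD rather than whatever iterative scheme the library uses, and the pairing of consecutive features in `group_sort2` is an assumption about how the activation groups its inputs.

```python
import numpy as np


def spectral_normalize(w):
    """Rescale w so that its largest singular value equals 1 (illustrative only)."""
    sigma_max = np.linalg.svd(w, compute_uv=False)[0]
    return w / sigma_max


def group_sort2(x):
    """Sort each consecutive pair of features; the feature count must be even."""
    pairs = x.reshape(x.shape[0], -1, 2)
    return np.sort(pairs, axis=-1).reshape(x.shape)


rng = np.random.default_rng(0)
w = spectral_normalize(rng.normal(size=(784, 32)))


def layer(x):
    # linear map with spectral norm 1, followed by a sorting activation
    return group_sort2(x @ w)


# empirical check: output distances never exceed input distances (L2 norm)
x1 = rng.normal(size=(256, 784))
x2 = rng.normal(size=(256, 784))
ratios = np.linalg.norm(layer(x1) - layer(x2), axis=1) / np.linalg.norm(x1 - x2, axis=1)
max_ratio = ratios.max()
print("largest observed output/input distance ratio:", max_ratio)
```

Sorting only reorders values, so it cannot increase distances, and a matrix with spectral norm 1 cannot stretch any vector. Composing the two therefore yields a 1-Lipschitz layer, which is what the random check illustrates.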

optimizer = Adam(learning_rate=0.001)

# the classifier outputs a real-valued score and labels are in {-1, 1},
# so binary accuracy must be redefined on the sign of both
def HKR_binary_accuracy(y_true, y_pred):
    S_true = tf.dtypes.cast(tf.greater_equal(y_true[:, 0], 0), dtype=tf.float32)
    S_pred = tf.dtypes.cast(tf.greater_equal(y_pred[:, 0], 0), dtype=tf.float32)
    return binary_accuracy(S_true, S_pred)

wass.compile(
    loss=HKR(
        alpha=alpha, min_margin=min_margin
    ),  # HKR stands for the hinge regularized KR loss
    metrics=[
        KR,  # shows the KR term of the loss
        HingeMargin(min_margin=min_margin),  # shows the hinge term of the loss
        HKR_binary_accuracy,  # shows the classification accuracy
    ],
    optimizer=optimizer,
)
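
To make the three reported quantities concrete, here is a minimal NumPy sketch of the KR term, the hinge term, and their combination. It assumes the loss has the form `alpha * hinge - KR`, which is consistent with the training log of this notebook (for example, 10 × 0.0331 − 6.0836 ≈ −5.75 at epoch 10); the function names are hypothetical and this is not the library's code.

```python
import numpy as np


def kr_term(y_true, y_pred):
    """Mean score on the positive class minus mean score on the negative class."""
    return y_pred[y_true > 0].mean() - y_pred[y_true <= 0].mean()


def hinge_term(y_true, y_pred, min_margin=1.0):
    """Average hinge penalty for samples scored inside the margin or on the wrong side."""
    sign = np.where(y_true > 0, 1.0, -1.0)
    return np.maximum(0.0, min_margin - sign * y_pred).mean()


def hkr(y_true, y_pred, alpha=10.0, min_margin=1.0):
    """Assumed combination: hinge weighted by alpha, minus the KR term."""
    return alpha * hinge_term(y_true, y_pred, min_margin) - kr_term(y_true, y_pred)


# toy labels in {-1, 1} and raw classifier scores
y_true = np.array([1.0, 1.0, -1.0, -1.0])
y_pred = np.array([3.0, 0.5, -2.0, 0.5])

kr_value = kr_term(y_true, y_pred)
hinge_value = hinge_term(y_true, y_pred)
hkr_value = hkr(y_true, y_pred)
print(kr_value, hinge_value, hkr_value)
```

Minimizing this quantity pushes the two class means apart (larger KR) while penalizing samples that fall inside the margin, which is why the logged loss becomes more negative as training progresses.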

Learn classification on MNIST

Now that the model is built, we can learn the task.

wass.fit(
    x=x_train,
    y=y_train,
    validation_data=(x_test, y_test),
    batch_size=batch_size,
    shuffle=True,
    epochs=epochs,
    verbose=1,
)

Epoch 1/10
92/92 [==============================] - 3s 10ms/step - loss: -0.5542 - KR: 3.2748 - HingeMargin: 0.2721 - HKR_binary_accuracy: 0.8725 - val_loss: -5.0345 - val_KR: 5.5790 - val_HingeMargin: 0.0553 - val_HKR_binary_accuracy: 0.9777
Epoch 2/10
92/92 [==============================] - 1s 6ms/step - loss: -4.8969 - KR: 5.4644 - HingeMargin: 0.0567 - HKR_binary_accuracy: 0.9785 - val_loss: -5.3840 - val_KR: 5.7409 - val_HingeMargin: 0.0383 - val_HKR_binary_accuracy: 0.9845
Epoch 3/10
92/92 [==============================] - 1s 6ms/step - loss: -5.3341 - KR: 5.7611 - HingeMargin: 0.0427 - HKR_binary_accuracy: 0.9840 - val_loss: -5.5146 - val_KR: 5.8514 - val_HingeMargin: 0.0360 - val_HKR_binary_accuracy: 0.9845
Epoch 4/10
92/92 [==============================] - 1s 6ms/step - loss: -5.4725 - KR: 5.8629 - HingeMargin: 0.0390 - HKR_binary_accuracy: 0.9858 - val_loss: -5.5682 - val_KR: 5.9083 - val_HingeMargin: 0.0362 - val_HKR_binary_accuracy: 0.9855
Epoch 5/10
92/92 [==============================] - 1s 6ms/step - loss: -5.4682 - KR: 5.8617 - HingeMargin: 0.0393 - HKR_binary_accuracy: 0.9862 - val_loss: -5.5683 - val_KR: 5.9196 - val_HingeMargin: 0.0366 - val_HKR_binary_accuracy: 0.9845
Epoch 6/10
92/92 [==============================] - 1s 6ms/step - loss: -5.5441 - KR: 5.9086 - HingeMargin: 0.0364 - HKR_binary_accuracy: 0.9878 - val_loss: -5.6268 - val_KR: 5.9399 - val_HingeMargin: 0.0336 - val_HKR_binary_accuracy: 0.9874
Epoch 7/10
92/92 [==============================] - 1s 6ms/step - loss: -5.6141 - KR: 5.9665 - HingeMargin: 0.0352 - HKR_binary_accuracy: 0.9877 - val_loss: -5.7121 - val_KR: 5.9817 - val_HingeMargin: 0.0300 - val_HKR_binary_accuracy: 0.9894
Epoch 8/10
92/92 [==============================] - 1s 6ms/step - loss: -5.6687 - KR: 6.0017 - HingeMargin: 0.0333 - HKR_binary_accuracy: 0.9875 - val_loss: -5.7358 - val_KR: 6.0305 - val_HingeMargin: 0.0322 - val_HKR_binary_accuracy: 0.9869
Epoch 9/10
92/92 [==============================] - 1s 6ms/step - loss: -5.6956 - KR: 6.0167 - HingeMargin: 0.0321 - HKR_binary_accuracy: 0.9883 - val_loss: -5.7684 - val_KR: 6.0966 - val_HingeMargin: 0.0350 - val_HKR_binary_accuracy: 0.9840
Epoch 10/10
92/92 [==============================] - 1s 6ms/step - loss: -5.7525 - KR: 6.0836 - HingeMargin: 0.0331 - HKR_binary_accuracy: 0.9881 - val_loss: -5.8637 - val_KR: 6.0924 - val_HingeMargin: 0.0260 - val_HKR_binary_accuracy: 0.9899

<tensorflow.python.keras.callbacks.History at 0x7f9fb4099690>

As we can see, the model reaches a validation accuracy of about 99% (0.9899 after 10 epochs) on this task.
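
The accuracy reported during training is obtained by thresholding the raw score at zero. As an illustration, the same decision rule can be sketched in NumPy on toy values; `sign_accuracy` is a hypothetical helper and the scores below are made up, not model outputs.

```python
import numpy as np


def sign_accuracy(y_true, scores):
    """Fraction of samples whose score has the same sign as the {-1, 1} label."""
    predicted_positive = scores >= 0
    actual_positive = y_true >= 0
    return float(np.mean(predicted_positive == actual_positive))


# toy labels and scores (illustrative values only)
y_true = np.array([1.0, 1.0, -1.0, -1.0, -1.0])
scores = np.array([2.3, -0.1, -4.0, -0.7, 0.2])

acc = sign_accuracy(y_true, scores)
print("accuracy:", acc)
```

With a trained model, the scores would come from something like `wass.predict(x_test)[:, 0]`, assuming the model object from this notebook is still in memory.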