CovarianceNet: Convolutional Neural Network Feature
Extraction Using Covariance Tensor Decomposition
IEEE Access 2021 (Oral Presentation, Best Poster Award)
- Ricardo Fonseca (Federico Santa María University, Digevo)
- Oscar Guarnizo (Yachay Tech University, Digevo)
- Diego Suntaxi (Yachay Tech University, Digevo)
- Alfonso Cadiz (Digevo)
- Werner Creixell (Federico Santa María University)
Abstract
This work describes a new method for extracting image features using tensor decomposition to model the data. Given an image dataset, we extract patches from the images, compute the covariance tensor over all patches, decompose it with the Tucker Decomposition, and obtain the most critical features from the tensor core. These features can be leveraged in diverse forms and toward different objectives. Here, we explored rearranging them into block-wise kernels, which are plugged into the convolutional layers of CNNs for classification tasks. Preliminary experiments tested the discriminative and initializer capabilities of these kernels. The discriminative experiments showed classification accuracies of around 67% on CIFAR 10, 64% on CIFAR 100, and 98% on MNIST, using 10, 100, and 1000 samples with a single feed-forward training pass. The initialization experiments, in turn, compared the feature extraction capability against available initializers (He random, He uniform, Glorot, random), returning results comparable with the state of the art: around 91% on CIFAR 10, 72% on CIFAR 100, and 99% on MNIST. These promising results allowed us to identify the proposed method's advantages over traditional convolutional neural networks.
Covariance Tensor
The covariance tensor, also called the n-mode cross-covariance, is computed across the patches extracted from the images in the training dataset; it is a generalization of the covariance matrix.
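As a concrete illustration, here is a minimal NumPy sketch of this step, assuming patches of shape (N, patch height, patch width, channels); the function name and example sizes are illustrative, not the paper's code:

```python
import numpy as np

def covariance_tensor(patches):
    """patches: array of shape (N, h, w, c) extracted from the dataset.
    Returns the 6th-order covariance tensor of shape (h, w, c, h, w, c),
    i.e. the average outer product of the mean-centered patches."""
    centered = patches - patches.mean(axis=0, keepdims=True)
    n = patches.shape[0]
    return np.einsum('nijk,npqr->ijkpqr', centered, centered) / n

# Example: 1000 random 5x5 RGB patches -> tensor of shape (5, 5, 3, 5, 5, 3)
C = covariance_tensor(np.random.rand(1000, 5, 5, 3))
```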
The next step is to obtain the factor matrices (the U's) from the mean covariance tensor. Here, unlike the standard Tucker Decomposition, we only compute the factor matrices U1, U2, and U3: because the covariance tensor is supersymmetric, these three matrices are enough to obtain a good approximation of the tensor core.
We obtained the factor matrices by unfolding the mean covariance tensor along each of its first three modes and applying the Singular Value Decomposition (SVD) to each unfolding.
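A hedged sketch of this step, assuming factor matrices of shape (mode dimension, rank) taken from the leading left singular vectors; the helper names unfold and factor_matrices are illustrative:

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def factor_matrices(C, ranks):
    """C: covariance tensor of shape (h, w, c, h, w, c); ranks: (r1, r2, r3).
    Only the first three modes are needed because C is supersymmetric."""
    Us = []
    for mode, r in enumerate(ranks):
        # Left singular vectors of the mode-n unfolding give the factor matrix.
        U, _, _ = np.linalg.svd(unfold(C, mode), full_matrices=False)
        Us.append(U[:, :r])  # keep the r leading components
    return Us  # [U1: (h, r1), U2: (w, r2), U3: (c, r3)]
```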
Tucker Decomposition
Once we have the factor matrices, we can obtain an approximation of the core tensor via the Tucker Decomposition, i.e., by contracting the mean covariance tensor with the factor matrices.
The dimensions of the resulting core tensor are given by the rank retained in each factor matrix U together with the patch dimensions.
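Under the conventions of the previous sketch (U's of shape (mode dimension, rank)), the core can be approximated with three n-mode products, contracting each of the first three modes with the corresponding transposed factor matrix; this is an illustrative sketch, not the authors' code:

```python
import numpy as np

def core_tensor(C, U1, U2, U3):
    """G = C x_1 U1^T x_2 U2^T x_3 U3^T.
    C: (h, w, c, h, w, c); U1: (h, r1); U2: (w, r2); U3: (c, r3).
    Returns G with shape (r1, r2, r3, h, w, c)."""
    G = np.einsum('ijkpqr,ia->ajkpqr', C, U1)  # mode-1 product with U1^T
    G = np.einsum('ajkpqr,jb->abkpqr', G, U2)  # mode-2 product with U2^T
    G = np.einsum('abkpqr,kc->abcpqr', G, U3)  # mode-3 product with U3^T
    return G
```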
The core tensor encapsulates the most relevant features of our dataset. We propose rearranging these features into kernels of size (patch height, patch width, channels), which can subsequently be plugged into convolutional layers.
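To make the rearrangement concrete, here is a minimal sketch, assuming the core shape from the sketches above and a channels-last Conv2D weight layout; the function name and the Keras wiring in the comment are assumptions:

```python
import numpy as np

def core_to_kernels(G):
    """G: core tensor of shape (r1, r2, r3, h, w, c).
    Returns a kernel bank of shape (h, w, c, K), the channels-last
    layout expected by a Conv2D layer, with K = r1 * r2 * r3 kernels."""
    r1, r2, r3, h, w, c = G.shape
    kernels = G.reshape(r1 * r2 * r3, h, w, c)
    return np.transpose(kernels, (1, 2, 3, 0))

# Example wiring (hypothetical, with tf.keras):
#   conv = tf.keras.layers.Conv2D(K, (h, w), use_bias=False)
#   conv.build(input_shape)
#   conv.set_weights([core_to_kernels(G)])
```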
Results
We evaluated the discriminative and initializer capabilities of the generated kernels. Among the results, we observed distinct patterns through a visual sanity check, a similarity check of the correlation between kernels generated from different small subsets, and a comparison against the state of the art.
We studied the correlation between the generated kernels (per layer) using training sub-datasets of different sizes.
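A hedged sketch of how such a correlation check could be computed, assuming kernel banks in the (h, w, c, K) layout from the sketches above and an index-wise pairing of kernels, which is an assumption for illustration:

```python
import numpy as np

def kernel_correlation(bank_a, bank_b):
    """bank_a, bank_b: kernel banks of shape (h, w, c, K).
    Returns the Pearson correlation of each index-matched kernel pair."""
    a = bank_a.reshape(-1, bank_a.shape[-1]).T  # (K, h*w*c)
    b = bank_b.reshape(-1, bank_b.shape[-1]).T
    # Standardize each flattened kernel, then average elementwise products.
    a = (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)
    b = (b - b.mean(axis=1, keepdims=True)) / b.std(axis=1, keepdims=True)
    return (a * b).mean(axis=1)  # one Pearson r per kernel pair
```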
We compared our model with state-of-the-art architectures on CIFAR 10, CIFAR 100, and MNIST.
Related links
- The book Multilinear Subspace Learning: Dimensionality Reduction of Multidimensional Data provides an excellent introduction to tensor operations: unfolding, the n-mode product, the n-mode cross-covariance (covariance tensor), and the Tucker Decomposition.
- To understand the Tucker Decomposition with R, explore Understanding the Tucker decomposition, and compressing tensor-valued data (with R code).
- To learn more about the cascade style used in our work, check PCA Based Kernel Initialization for Convolutional Neural Networks.
Citation
Powered by Jon Barron and Michaël Gharbi.