2D Self-Organizing Map on Mini CIFAR-10

Description

Abstract

In this project, we implemented a 2D Self-Organizing Map (SOM) entirely from scratch and trained it on a mini CIFAR-10 dataset of 1,000 images, exactly 100 per class. We compared three input conditions: raw color images, grayscale versions, and grayscale features after PCA reduction. In each condition, a 10×10 feature map was trained for 200 epochs. To visualize the results, we labeled every neuron two ways: first, with the index of the training sample that triggered its strongest activation; second, with the dominant class that the neuron came to represent. PCA cut the grayscale feature space from 1,024 dimensions to 130 while retaining 95% of the original variance, which sped up training noticeably. The final maps showed pockets of class clustering: some neurons clearly preferred ships or automobiles, but many remained mixed, responding to multiple classes without a clear winner. That outcome, while perhaps unsatisfying, is typical for unsupervised learning on natural images; it suggests the SOM captured low-level visual regularities rather than clean category boundaries.
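The two labeling schemes can be sketched in a few lines. The project itself was written in MATLAB and its code is not reproduced here, so the Python below is only an illustration under two assumptions: "strongest activation" is taken to mean the smallest Euclidean distance between a neuron's weight vector and a sample, and a neuron's "dominant class" is taken to be the majority class among the samples for which that neuron is the best-matching unit. The function names `squared_distance` and `label_neurons` are hypothetical.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def label_neurons(weights, samples, labels):
    """Label each neuron in two ways.

    weights: list of neuron weight vectors (the grid flattened to a list)
    samples: list of input vectors
    labels:  class label of each sample
    Returns (closest_index, dominant_class), one entry per neuron.
    """
    # Labeling 1 (assumed meaning of "strongest activation"):
    # index of the sample closest to the neuron's weight vector.
    closest_index = [
        min(range(len(samples)), key=lambda i: squared_distance(w, samples[i]))
        for w in weights
    ]

    # Labeling 2 (assumed meaning of "dominant class"):
    # majority vote over the samples that select this neuron as their
    # best-matching unit. Neurons that win no sample get None.
    votes = [Counter() for _ in weights]
    for sample, label in zip(samples, labels):
        bmu = min(
            range(len(weights)),
            key=lambda j: squared_distance(weights[j], sample),
        )
        votes[bmu][label] += 1
    dominant_class = [v.most_common(1)[0][0] if v else None for v in votes]

    return closest_index, dominant_class
```

Under these assumptions a neuron can carry a sample index even when it wins no sample at all, which is one reason the two visualizations may look different for the same trained map.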

 

Introduction

CIFAR-10 is an established benchmark in computer vision. Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton assembled it as a labeled subset of the larger 80 Million Tiny Images dataset. The full collection contains 60,000 color images, each 32 by 32 pixels, spread evenly across ten object classes (6,000 images per class). We built a smaller, balanced version for this project: exactly 1,000 images, 100 from each class, drawn from the standard test set. That scale gave us enough variety to test our algorithm without straining our computational limits.

The Self-Organizing Map, or SOM, comes from Teuvo Kohonen's work. It is an unsupervised neural network with an unusual goal. Rather than classifying inputs or predicting outputs, a SOM projects high-dimensional data onto a low-dimensional grid, typically two-dimensional, while trying to preserve the topological structure of the original space. Neighbors in the input space should end up as neighbors on the map. That property, when it holds, makes SOMs attractive for both dimensionality reduction and data visualization.

We wrote the entire SOM algorithm from scratch in MATLAB, with no toolbox functions. Three experiments then compared how well the map organized different representations of the same images. First, we fed raw color images directly into the SOM. Second, we converted those images to grayscale and repeated the training. Third, we reduced the grayscale features using Principal Component Analysis (PCA) and trained a SOM on the compressed vectors. What follows describes each experiment's setup, the results we observed, and what those results suggest about unsupervised learning on natural images.
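To make the training procedure concrete, here is a minimal sketch of a generic online SOM in Python. It is not the MATLAB implementation described above, which is not shown in this summary. Only the 10×10 grid and the 200 epochs come from the text; the Gaussian neighborhood, the exponential decay of the learning rate and radius, the initial values `lr0` and `sigma0`, and the function names are all assumptions chosen for illustration.

```python
import math
import random


def find_bmu(weights, x):
    """Return (row, col) of the neuron whose weights are closest to x."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows=10, cols=10, epochs=200, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM on a list of equal-length vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0  # assumed initial neighborhood radius
    # Random initial weights in [0, 1); inputs are assumed to be scaled alike.
    weights = [
        [[rng.random() for _ in range(dim)] for _ in range(cols)]
        for _ in range(rows)
    ]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * math.exp(-3.0 * frac)        # assumed decay schedule
        sigma = sigma0 * math.exp(-3.0 * frac)  # assumed decay schedule
        order = list(range(len(data)))
        rng.shuffle(order)                      # new sample order each epoch
        for i in order:
            x = data[i]
            br, bc = find_bmu(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighborhood measured on the grid, not in
                    # input space: neurons near the BMU move the most.
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for k in range(dim):
                        w[k] += lr * h * (x[k] - w[k])
    return weights


if __name__ == "__main__":
    toy = [[0.1, 0.1], [0.15, 0.05], [0.9, 0.9], [0.85, 0.95]]
    som = train_som(toy, rows=3, cols=3, epochs=30)
    print(find_bmu(som, [0.1, 0.1]), find_bmu(som, [0.9, 0.9]))
```

Because every neuron is updated for every sample, the cost per epoch in this sketch grows with the product of the grid size, the number of samples, and the input dimension. That is consistent with the observation that lower-dimensional inputs train faster.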

 

Conclusion

We implemented a 2D Self-Organizing Map from scratch and applied it to a mini CIFAR-10 dataset. Three conditions were tested: raw color images, grayscale versions, and grayscale images reduced with PCA. Each condition produced a trained map. The raw color maps took the longest to train and showed the least organized class regions. Grayscale sped things up and gave slightly cleaner clusters. PCA compressed the feature space from 1,024 dimensions to 130 while retaining 95% of the variance, and that made training dramatically faster. More importantly, the PCA-based map displayed the most distinct class groupings we observed, particularly for automobiles, ships, and airplanes.

That outcome suggests something worth noting. Removing irrelevant variation (color, high-frequency noise) through linear preprocessing helped the SOM find structure. But the map never achieved clean separation. Cats and dogs remained entangled throughout the grid. That is not a failure; it is what we should expect from an unsupervised algorithm on natural images, where class boundaries are fuzzy and within-class variation is large.

We met the project's stated goals. A 10×10 SOM was trained on color images, on grayscale images, and on PCA-reduced grayscale features. For each experiment, we produced two visualizations: labeling by input index and labeling by dominant class. The code runs, the figures are generated, and the patterns are documented. If we were to extend this work, we would look at larger map sizes, longer training schedules, and alternative neighborhood functions. Even without those extensions, the exercise confirmed something useful: a from-scratch SOM implementation can reveal structure in small image datasets. PCA preprocessing offers a real trade-off between speed and fidelity that, in our case, leaned in favor of speed without obvious cost to organization quality.
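The reduction to 130 dimensions follows from the 95% variance criterion. As a sketch of how such a cutoff is commonly chosen, the hypothetical helper below takes the eigenvalues of the data covariance matrix (one per principal component) and returns the smallest number of components whose cumulative share of the total variance reaches the target. It assumes the eigenvalues have already been computed elsewhere, and it does not reproduce the project's MATLAB code or its specific figure of 130.

```python
def components_for_variance(eigenvalues, target=0.95):
    """Smallest k such that the k largest eigenvalues hold `target` of the variance."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running >= target * total:
            return k
    return len(ordered)  # reached only if rounding keeps the sum below target


# Toy spectrum, not the project's data: four of the five components
# are needed to pass the 95% threshold.
toy_spectrum = [6.0, 2.0, 1.0, 0.6, 0.4]
print(components_for_variance(toy_spectrum))  # prints 4
```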

Authors

DOI: 10.5281/zenodo.20682959

Publication Date: 2019-05-06
