Data Preparation: Other

The input to Metascatter is a CSV file containing:

  • Image paths
  • 2D co-ordinate representations of the images (obtained using dimensionality reduction on raw images or model features)
  • Associated metadata

For instance, below is an example CSV for the dog breed classification task:

[Example CSV for dog breed classification]

In this case, the Paths column contains the relative paths to the images (you will be able to specify the path prefix when loading the CSV). You could also use full paths to accessible online storage. X-Coord and Y-Coord provide the co-ordinates for the scatterplot. These three fields are the minimum needed for Metascatter to run.

Other columns of the CSV can contain any other type of metadata. For instance, this could be data output from the model, e.g. predictions and confidence of predictions, or image information such as ground truth labels, acquisition method, associated demographic data and so forth. The more metadata, the more analyses you can do in Metascatter.
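For illustration, such a CSV could be assembled with pandas as sketched below; the output file name, the Breed and Confidence metadata columns, and all values are hypothetical:

import pandas as pd

# Minimal sketch of assembling a Metascatter CSV.
# Paths, X-Coord and Y-Coord are the required columns; Breed and
# Confidence are hypothetical metadata columns added for illustration.
df = pd.DataFrame({
    'Paths': ['images/beagle_001.jpg', 'images/collie_014.jpg'],
    'X-Coord': [12.3, -4.7],
    'Y-Coord': [0.8, 9.1],
    'Breed': ['beagle', 'collie'],
    'Confidence': [0.92, 0.77],
})
df.to_csv('metascatter_input.csv', index=False)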

For image classification tasks, you can produce an example CSV file using the scripts provided in Image Classification.

Creating the Image Co-ordinates

In this section we will discuss how to create the 2D co-ordinate representations for each image. This involves:

  1. Feature extraction

  2. Dimensionality reduction

The co-ordinates can be obtained either from raw image intensities or from features extracted by passing the images through a model, as described in Feature Extraction. In either case, the features must then be reduced to two-dimensional co-ordinates, as described in Dimensionality Reduction.

Feature extraction

Machine learning methods typically work by transforming the input data into a reduced set of features on which the task (e.g. classification, object detection, segmentation) can be performed more successfully. For instance, for a classifier that distinguishes between images of zebras and horses, features representing a stripy pattern would work better than features representing legs and tails.

In deep learning, these features are optimised during the training process (rather than being pre-defined as in conventional machine learning techniques). Metascatter allows you to evaluate the performance of a deep learning model by visualising how the model 'sees' the images in the dataset in terms of these features. For a well-performing model, you would expect a clear boundary between the features of horses and the features of zebras.

Different layers of a trained deep learning model will produce different features. You can obtain these features by running each image through the model up to the relevant layer.

For example, for classification models such as VGG16, we can pass the images through a feature model up to the layer before the final classification:

(classifier): Sequential(
  (0): Linear(in_features=25088, out_features=4096, bias=True)
  (1): ReLU(inplace=True)
  (2): Dropout(p=0.5, inplace=False)
  (3): Linear(in_features=4096, out_features=4096, bias=True)
  # ---- truncate here: the layers below are discarded ----
  # (4): ReLU(inplace=True)
  # (5): Dropout(p=0.5, inplace=False)
  # (6): Linear(in_features=4096, out_features=1000, bias=True)
)
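As a minimal sketch of this truncation, assuming a stock pretrained VGG16 from torchvision (0.13 or later) and a hypothetical image path rather than your own trained model:

import torch
from torchvision import models, transforms
from PIL import Image

# Keep classifier layers (0)-(3) only, so the model outputs the 4096-dim
# features from layer (3) instead of class scores
model = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
model.classifier = torch.nn.Sequential(*list(model.classifier.children())[:4])
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

img = Image.open('images/beagle_001.jpg').convert('RGB')  # hypothetical path
with torch.no_grad():
    features = model(preprocess(img).unsqueeze(0)).squeeze(0).numpy()  # shape (4096,)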

Some scripts for doing this with common example models are given below:

Classification

Multi-class classification (120 classes) using PyTorch Mobilenet. We recommend using the last layer before the classification head. [Download script here]
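The downloadable script is the reference; as a rough sketch of the idea, you could take the output of MobileNet's convolutional feature extractor (the last layer before the classification head) and average-pool it, shown here with torchvision's stock mobilenet_v2 and a placeholder input batch:

import torch
from torchvision import models

# Sketch only: use the convolutional feature extractor output as the image features
model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
model.eval()

batch = torch.randn(1, 3, 224, 224)  # placeholder for a preprocessed image batch
with torch.no_grad():
    fmap = model.features(batch)               # (1, 1280, 7, 7)
    features = fmap.mean(dim=[2, 3]).numpy()   # global average pool -> (1, 1280)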

Segmentation (Keras)

U-Net for pixel-level segmentation of images of small animals. Download: [Script] [Model] [Data]

Tip

For a deep learning segmentation model, we typically choose the output features after the final encoding layer (before any upsampling).

In Keras, you can output the features by creating a sub-model up to the required named layer:

import numpy as np
from keras.models import Model

model = unet()  # model constructor from the downloadable script
model.load_weights('my_trained_model.h5')
# Sub-model that stops at the named layer and returns its output
submodel = Model(model.inputs, model.get_layer('conv2d_transpose_1').output)
features = submodel.predict(np.expand_dims(img, 0))  # img: a single input image array

Object Detection (PyTorch)

Object detection using Ultralytics YOLOv5 small [Download script here] [Example CSV here]

No model

If you do not have a trained model, you can use the raw (or pre-processed) image intensities as features instead.

features = img.flatten()  # flatten the image array into a single 1D feature vector
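To prepare a whole dataset this way, the flattened intensities of each image can be stacked into a single matrix for the dimensionality reduction step below. A minimal sketch, assuming a hypothetical list of image paths and a common 64x64 resize:

import numpy as np
from PIL import Image

image_paths = ['images/beagle_001.jpg', 'images/collie_014.jpg']  # hypothetical list

# Resize every image to a common size so all feature vectors have equal length,
# then flatten and stack into an (n_images, n_features) matrix
features = np.stack([
    np.asarray(Image.open(p).convert('RGB').resize((64, 64)), dtype=np.float32).flatten()
    for p in image_paths
])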

Dimensionality reduction

The resulting data (features or image intensities) may still contain several thousand values per image. For instance, in the VGG16 classification model above, 4096 features are extracted from layer (3). To visualise these on a 2D plot, we need to reduce the dimensionality while preserving the structure of relationships between data points.

We therefore apply dimensionality reduction to these features. We use t-SNE to obtain a 2D representation of the data that compactly describes the underlying structure. In Python, you can use the scikit-learn library.

from sklearn.manifold import TSNE

# features: array of shape (n_images, n_features), one row per image
tsne = TSNE(n_components=2, verbose=1, perplexity=50, n_iter=500)
X_coords = tsne.fit_transform(features)  # shape (n_images, 2)
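The resulting X_coords array has one row per image and two columns, which provide the X-Coord and Y-Coord fields of the Metascatter CSV described at the top of this page.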