
Data Preparation: Object Detection

In this section we describe how to create a CSV file from trained object detection models, which can be uploaded into Metascatter. We provide scripts for standard object detection model architectures (with user-provided weights) for:

You will only need to edit some configuration files to point to your data and model weights.

Metalynx Object Detection

PyTorch

Downloads

Usage: python create_csv_OD_torch.py 'path_to_config_file.ini'

Quick Start

To create a CSV from a PyTorch object detection model, simply edit the variables in red in the template configuration file: PyTorch Quickstart

Prepare CSV file

We provide scripts to create a CSV file that works with Metascatter, given image folders and models.

An example script for PyTorch object detection models can be downloaded here: PyTorch script download. You should not ordinarily need to edit this file.

The following Python3 libraries are required:

torch
torchvision
pillow>=9.1.0
pandas>=1.4.2
scikit-learn
umap-learn

The following models can be used with default or with your own trained weights:

    "frcnn-resnet": torchvision.models.detection.fasterrcnn_resnet50_fpn
"frcnn-mobilenet-320": torchvision.models.detection.fasterrcnn_mobilenet_v3_large_320_fpn
"frcnn-mobilenet": torchvision.models.detection.fasterrcnn_mobilenet_v3_large_fpn
"retinanet": torchvision.models.detection.retinanet_resnet50_fpn
"fcos": torchvision.models.detection.fcos_resnet50_fpn
"ssd-vgg": torchvision.models.detection.ssd300_vgg16
"ssdlite": torchvision.models.detection.ssdlite320_mobilenet_v3_large

You will need to supply a configuration file and a file describing the transforms for inference, for which templates can be found below:

Usage: python create_csv_OD_torch.py 'path_to_config_file.ini'

The create_csv_OD_torch.py script should not be changed. Edit the configuration file as below.

[MODEL VARIABLES]
model_name = frcnn-mobilenet-320
# Choose from: frcnn-resnet, frcnn-mobilenet-320, frcnn-mobilenet, retinanet, fcos, ssd-vgg, ssdlite
model_weights = /path/to/model/weight/file.pth
transform_name = inference
# Should correspond to transforms_config.py
confidence_threshold = 0.5
image_size = 224

Please use one of the standard model architectures listed above and provide the path to your trained weights. The transform_name field should correspond to the name given in transforms_config.py.

[TRAIN IMAGE FOLDERS] 
# paths to labelled images and annotations
image_folder_paths = [/path/to/folder/of/labelled/images/1/ /path/to/folder/of/labelled/images/2/ /path/to/folder/of/labelled/images/3/]
image_folder_names = [Name_of_source_of_folder_1 Name_of_source_of_folder_2 Name_of_source_of_folder_3]
# e.g. train and val folders. For multiple locations, please separate folders and sources by a space.

Include a list of folders containing the labelled images and annotations you want to use.

You can provide several folders, e.g. if you have different folders for TRAINING and VALIDATION images. You can reference these by entering a corresponding name in the field image_folder_names. Please ensure there are the same number of source names as folders provided. The folders and names should be separated by a space.
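The bracketed, space-separated lists above can be parsed with Python's standard configparser. The section and key names follow the template; the helper names and exact parsing logic inside create_csv_OD_torch.py are assumptions for illustration.

```python
# Sketch: parsing the bracketed, space-separated folder lists from the
# configuration file. Section and key names match the template above;
# the actual parsing in create_csv_OD_torch.py may differ.
import configparser

def parse_list(value):
    """Turn '[a b c]' into ['a', 'b', 'c']."""
    return value.strip().strip("[]").split()

def read_image_folders(config_path):
    config = configparser.ConfigParser()
    config.read(config_path)
    section = config["TRAIN IMAGE FOLDERS"]
    paths = parse_list(section["image_folder_paths"])
    names = parse_list(section["image_folder_names"])
    # Each folder must have a corresponding source name.
    if len(paths) != len(names):
        raise ValueError("Provide the same number of source names as folders")
    return list(zip(names, paths))
```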

[UNLABELLED FOLDERS]
unlabelled_folder_paths = [/path/to/folder/of/unlabelled/images/1/ /path/to/folder/of/unlabelled/images/2/]
unlabelled_folder_names = [Name_of_source_of_folder_1 Name_of_source_of_folder_2]
# Unordered images: Folder->Image. For multiple locations, please separate folders and sources by a space.

Similarly, you can include one or many folders for unlabelled test data. Leave blank if there are no such folders.

In order to output class names (instead of numbers) to the CSV, you will need to provide a class list file.

[CLASS NAME FILE]
class_file = /path/to/file/with/class/names.txt

The list of classes should be in order corresponding to the output of the model:

Class0
Class1
Class2
...
ClassN
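Reading this file amounts to building a list where line i names the class with model output index i. A minimal sketch (the function name is illustrative):

```python
# Sketch: load the class list file into a list where index i gives the
# name of the class with model output index i. Blank lines are skipped.
def load_class_names(class_file):
    with open(class_file) as f:
        return [line.strip() for line in f if line.strip()]
```

A predicted label index from the model can then be mapped to its name with class_names[index].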

Finally, enter the filename and path of the output CSV file, and the folder in which to store the images with bounding-box annotations drawn on them.

[OUTPUT FILENAME]
output_annotation_folder = /path/to/folder/of/images/with/bounding_boxes
savefile = /path/to/save/file.csv
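The output step can be pictured as follows: detections above confidence_threshold are drawn onto a copy of each image, which is saved into output_annotation_folder, and one CSV row is collected per detection. The column names and exact fields written by the real script are assumptions here.

```python
# Sketch of the output step: draw confident detections onto a copy of
# the image and collect one CSV row per box. Column names are assumed,
# not taken from the actual create_csv_OD_torch.py output.
import os
from PIL import Image, ImageDraw
import pandas as pd

def save_detections(image_path, boxes, labels, scores,
                    annotation_folder, threshold=0.5):
    """boxes: [(x0, y0, x1, y1)]; labels: class names; scores: confidences."""
    image = Image.open(image_path).convert("RGB")
    draw = ImageDraw.Draw(image)
    rows = []
    for box, label, score in zip(boxes, labels, scores):
        if score < threshold:
            continue  # keep only detections above confidence_threshold
        draw.rectangle(box, outline="red", width=2)
        rows.append({"image": image_path, "label": label,
                     "score": score, "box": box})
    os.makedirs(annotation_folder, exist_ok=True)
    image.save(os.path.join(annotation_folder, os.path.basename(image_path)))
    return pd.DataFrame(rows)
```

Per-image DataFrames would then be concatenated across all images and written with pandas, e.g. df.to_csv(savefile, index=False).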

This works with standard architectures of the models named above with either default or trained weights. For bespoke architectures, please see Data Preparation.