Building Image Recognition Models with Small Datasets

Document Type


Lead Author Type

CIS Masters Student


Dr. Jonathan Leidig; jonthan.leidig@gvsu.edu

Embargo Period



Traditionally, there has been little research applying deep learning to underwater datasets. A key limitation is that large datasets of underwater environments are not publicly available. We developed an application that generates a large collection of synthetic images from a small input dataset by applying various transformation and distortion algorithms. We generated 12,545 synthetic images from seven initial Lethrinus (emperor fish) images and used the larger collection to build a model with a convolutional neural network (CNN) deep learning approach. For comparison, we also developed and tested a model on a Dascyllus reticulatus (two-stripe damselfish) dataset containing 12,112 images. Model accuracy was 89.4% for Dascyllus reticulatus and 94.6% for Lethrinus.
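
The abstract does not list the specific transformation and distortion algorithms used, so the following is only a minimal sketch of the general idea: expanding a handful of seed images into many variants. It assumes grayscale images stored as nested lists of 0-255 pixel values, and the function names, the choice of flips, a 90-degree rotation, brightness shifts, and random noise, and the parameter ranges are all illustrative assumptions rather than the authors' method. A real pipeline would more likely use an image library and color images.

```python
import random

# Assumption: a grayscale image is a list of rows, each a list of 0-255 ints.
Image = list[list[int]]


def clamp(value: int) -> int:
    """Keep a pixel value inside the valid 0-255 range."""
    return min(255, max(0, value))


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img: Image) -> Image:
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def rotate_90(img: Image) -> Image:
    """Rotate clockwise by 90 degrees: reverse the rows, then transpose."""
    return [list(row) for row in zip(*img[::-1])]


def adjust_brightness(img: Image, delta: int) -> Image:
    """Add a constant offset to every pixel (a simple photometric change)."""
    return [[clamp(p + delta) for p in row] for row in img]


def add_noise(img: Image, rng: random.Random, amplitude: int) -> Image:
    """Perturb each pixel by a random integer in [-amplitude, amplitude]."""
    return [[clamp(p + rng.randint(-amplitude, amplitude)) for p in row]
            for row in img]


def augment(images: list[Image], n_per_image: int, seed: int = 0) -> list[Image]:
    """Produce n_per_image randomized variants of every seed image."""
    rng = random.Random(seed)  # seeded so the generated set is reproducible
    geometric = [lambda im: im, flip_horizontal, flip_vertical, rotate_90]
    variants = []
    for img in images:
        for _ in range(n_per_image):
            variant = rng.choice(geometric)(img)
            variant = adjust_brightness(variant, rng.randint(-40, 40))
            variant = add_noise(variant, rng, amplitude=10)
            variants.append(variant)
    return variants


if __name__ == "__main__":
    seed_image = [[0, 50], [100, 255]]  # hypothetical 2x2 stand-in for a photo
    generated = augment([seed_image] * 7, n_per_image=3)
    print(len(generated), "synthetic images generated")
```

Combining one geometric change with randomized photometric changes lets a few seed images yield many distinct variants, which is the property a small-dataset approach relies on. The variants remain derived from the same few originals, so accuracy measured on them may not carry over to unseen photographs.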

Thapa.Puri.pdf (1469 kB)