Our Deep Vision Data division recently developed a deep learning model that classifies product types, available stock, and missing stock for a retail store merchandising audit system. The model, based on the ResNet101 architecture, was trained on 20,000 synthetic product images, divided 50-50 between structured and unstructured domain-randomized subsets and split 80-20 into training and validation sets. Validation also used synthetic data only, yet the resulting system classified product brands and out-of-stock items in actual store photos with nearly 100% accuracy.
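The dataset arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file names, the `build_split` helper, and the choice to apply the 80-20 split within each subset (so both subsets stay balanced in training and validation) are hypothetical, since the actual pipeline is not described here.

```python
import random

def build_split(n_images=20_000, structured_frac=0.5, train_frac=0.8, seed=0):
    """Split a synthetic image list 80-20 while keeping subsets balanced.

    Assumption: the split is applied per subset (stratified). File names
    are placeholders, not the real dataset layout.
    """
    rng = random.Random(seed)
    n_structured = round(n_images * structured_frac)

    # Tag each hypothetical image with its domain-randomization subset.
    samples = [
        (f"img_{i:05d}.png", "structured" if i < n_structured else "unstructured")
        for i in range(n_images)
    ]

    train, val = [], []
    for subset in ("structured", "unstructured"):
        group = [s for s in samples if s[1] == subset]
        rng.shuffle(group)
        cut = round(len(group) * train_frac)
        train.extend(group[:cut])
        val.extend(group[cut:])
    return train, val

train, val = build_split()
print(len(train), len(val))
```

With the figures quoted above, this yields 16,000 training images and 4,000 validation images, each drawn equally from the structured and unstructured subsets.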
Synthetic training data can be used for almost any machine learning application, either to augment a physical dataset or to replace it completely. With effective domain randomization, synthetic data can be made indistinguishable from physical training data. Equally important, because the user controls the dataset distribution, features or events that rarely occur in physical datasets can be made more prevalent, which lets the model train on those rare occurrences. Compared with physical data, synthetic training data is inherently less costly, faster to create, and perfectly annotated, and it isn’t constrained by availability, time, or even the physics of the natural world.
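As a rough sketch of what controlling the dataset distribution can look like, the Python snippet below samples scene descriptions for a renderer. Everything in it is an assumption made for illustration: the class names, the target mix, the randomized parameters and their ranges, and the `sample_scene` helper are hypothetical and are not taken from the system described here.

```python
import random
from collections import Counter

# Hypothetical target mix: out-of-stock shelves may be rare in real photos,
# but the generator can be told to produce them as often as needed.
TARGET_MIX = {"fully_stocked": 0.4, "partially_stocked": 0.3, "out_of_stock": 0.3}

def sample_scene(rng):
    """Draw one scene description with randomized nuisance parameters."""
    label = rng.choices(list(TARGET_MIX), weights=list(TARGET_MIX.values()), k=1)[0]
    return {
        "label": label,  # known by construction, so the annotation is exact
        "light_intensity": rng.uniform(0.3, 1.5),   # illustrative range
        "camera_yaw_deg": rng.uniform(-30.0, 30.0), # illustrative range
        "background_id": rng.randrange(100),        # illustrative pool size
    }

rng = random.Random(42)
scenes = [sample_scene(rng) for _ in range(10_000)]
counts = Counter(scene["label"] for scene in scenes)
print(counts)
```

Because each label is chosen before the image is rendered, no manual annotation step is needed, and changing `TARGET_MIX` is enough to make a rare event common in the training set.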
Kinetic Vision President and CEO Rick Schweet explained, "This example demonstrates that synthetic data can be as effective as physical information, or even more so, in training deep learning models. We created the synthetic dataset in days; product photography and manual data annotation would have taken weeks and the resulting dataset wouldn't have the feature variability needed to create a robust, scalable AI system. That's the power of synthetic training data - domain randomization and perfect annotation comes at zero cost." Learn more at synthetictrainingdata.com.