Configurable Utility for Synthetic Dataset Creation


Tomáš Bubeníček
Supervisor: Jiří Bittner
Master thesis, 2020
Evaluating existing computer vision algorithms or training new machine learning algorithms requires large datasets of varied images with ground truth, that is, the ideal known solution to the problem being solved. We review existing real-life datasets with ground truth that are used in computer vision and explore how they were acquired. We then describe different synthetic datasets and survey the ways such data can be computed. We propose a tool to simplify the generation of such data and implement it as an extension of the Unity editor. Our implementation uses textured 3D models to generate image sequences with additional labeling, such as surface normals, depth maps, object segmentation, optical flow, and motion segmentation, among others. We use the tool to create a set of three example datasets.