Motion-region annotation for complex videos via label propagation across occluders

Motion cues are pivotal in moving-object analysis and form the basis of motion segmentation and detection. These preprocessing tasks are building blocks for applications such as recognition, matching, and estimation. Devising a robust motion-analysis algorithm requires a comprehensive dataset on which to evaluate its performance. The main obstacle to building such datasets is creating ground-truth motion annotation, since a moving object may span many frames while changing in size, illumination, and viewing angle. Beyond these optical changes, the object can be occluded by static or moving occluders, and the challenge grows further when the video is captured by a moving camera. In this paper, we tackle the task of providing ground-truth annotation of motion regions in videos captured by a moving camera. With minimal manual annotation of an object mask, we propagate the label mask through all frames. Object labels are further corrected for static and moving occluders by tracking occluder masks under a given depth ordering. We also propose a motion-annotation dataset for evaluating algorithm performance. The results show that our cascaded naive approach produces successful results. All resources of the annotation tool are publicly available at
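To make the two abstract steps concrete, the following is a minimal sketch of mask propagation followed by occluder-based label correction. It is an illustrative assumption, not the paper's actual method: the function names are invented, and frame-to-frame motion is reduced to a known integer translation, whereas real propagation would rely on estimated motion.

```python
import numpy as np

def propagate_mask(mask, shift):
    """Translate a boolean object mask by an integer (dy, dx) offset.
    Stand-in for the propagation step; a real system would estimate
    the per-frame motion instead of taking a known shift."""
    propagated = np.zeros_like(mask)
    dy, dx = shift
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    ys, xs = ys + dy, xs + dx
    # keep only pixels that remain inside the frame after the shift
    valid = (ys >= 0) & (ys < h) & (xs >= 0) & (xs < w)
    propagated[ys[valid], xs[valid]] = True
    return propagated

def apply_occluder(object_mask, occluder_mask, occluder_in_front):
    """Remove object pixels covered by the occluder when the given
    depth ordering places the occluder in front of the object."""
    if occluder_in_front:
        return object_mask & ~occluder_mask
    return object_mask

# Toy example on a 5x5 frame: a 2x2 object moves 2 px right and
# runs into a static occluder occupying the rightmost column.
mask0 = np.zeros((5, 5), dtype=bool)
mask0[1:3, 1:3] = True
mask1 = propagate_mask(mask0, (0, 2))
occluder = np.zeros((5, 5), dtype=bool)
occluder[:, 4] = True
mask1 = apply_occluder(mask1, occluder, occluder_in_front=True)
```

After the correction step, `mask1` retains only the object pixels not hidden by the occluder, which is the label-correction behavior the abstract describes.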
This document is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) license.