NeTra-V: Toward an Object-Based
Video Representation
Yining Deng, Student Member, IEEE, and B. S. Manjunath, Member, IEEE
Abstract
We present here a prototype video analysis and retrieval system,
called NeTra-V, that is being developed to
build an object-based video representation for functionalities
such as search and retrieval of video objects. A region-based
content description scheme using low-level visual descriptors is
proposed. In order to obtain regions for local feature extraction,
a new spatio-temporal segmentation and region-tracking scheme
is employed. The segmentation algorithm uses all three visual
features: color, texture, and motion in the video data. A group
processing scheme similar to the one in the MPEG-2 standard is
used to ensure the robustness of the segmentation. The proposed
approach can handle complex scenes with large motion. After
segmentation, regions are tracked through the video sequence
using extracted local features. The results of tracking are sequences
of coherent regions, called "subobjects". Subobjects are
the fundamental elements in our low-level content description
scheme, which can be used to obtain meaningful physical objects
in a high-level content description scheme. Experimental results
illustrating segmentation and retrieval are provided.
