ViPER: Video Performance Evaluation Resource
Overview
At the Language and Media Processing Lab, much of our research focuses on analyzing video for semantic content, such as tracking people and detecting text. ViPER, the Video Performance Evaluation Resource, is our system for evaluating this work: a toolkit of scripts and Java programs for marking up ground truth in visual data, together with systems for evaluating how closely sets of result data approximate that truth.

The Performance Evaluation Problem

To evaluate a video analysis algorithm, or a set of algorithms, it is necessary to define a methodology. Since many books and papers describe methods for evaluating specific types of algorithms, we decided to develop a general framework for evaluation. The idea common to most of the evaluations we do is a comparison between the computer-generated output and some ideal version of 'Truth'.

In some subfields of vision, like document processing, it is possible to automatically generate test data. However, for video processing, it is more common for a human to define the ground truth for each video clip. To ensure that researchers may repeat and verify evaluations, it is important to make the ground truth metadata available to other researchers in a documented format. It is also very useful to have methods of qualitatively verifying the ground truth. ViPER-GT provides tools for solving this metadata problem.

There are many ways to define how correct a result data set is with respect to a ground truth data set. A metric that looks at differences in bounding box size for text detection may give different results than a more goal-oriented metric that counts the number of characters or words correctly recognized. ViPER-PE provides tools for solving this evaluation problem.
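To illustrate how the choice of metric can change a score, here is a small sketch in Python. These are not ViPER's actual metrics; the function names and the box-overlap and word-recall definitions are our own simplified stand-ins for a spatial metric and a goal-oriented one.

```python
def box_overlap(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def word_recall(truth_words, result_words):
    """Fraction of ground-truth words that appear in the result."""
    truth = set(truth_words)
    return len(truth & set(result_words)) / len(truth) if truth else 1.0

# A result box covering only half of the ground-truth text region
# scores poorly on the spatial metric...
gt_box, result_box = (0, 0, 100, 20), (0, 0, 50, 20)
print(box_overlap(gt_box, result_box))  # 0.5

# ...yet the same detection may still yield every ground-truth word,
# scoring perfectly on the goal-oriented metric.
print(word_recall(["video", "performance"], ["video", "performance"]))  # 1.0
```

The point is that neither score is wrong; they answer different questions, which is why ViPER-PE lets the user choose among metrics.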

The ViPER Ground Truth Authoring Tool

ViPER-GT is a Java graphical user interface for authoring ground truth. It allows frame-by-frame markup of video metadata stored in the ViPER format, and it is also useful for visualization. For more information, see the appropriate manual.

The ViPER Performance Evaluation Tool

ViPER-PE is a command-line performance evaluation tool offering a variety of metrics for comparing video metadata files. With it, a user can select from multiple metrics to compare a result data set with ground truth data. It can report precision and recall, perform frame-by-frame and object-based evaluations, and includes a filtering mechanism for evaluating relevant subsets of the data. The tool is further described in its manual.
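As a rough illustration of what frame-by-frame precision and recall measure (a simplification of what ViPER-PE computes; the function is our own), suppose each frame is simply marked as containing a detection or not:

```python
def frame_precision_recall(truth_frames, result_frames):
    """Precision and recall over sets of frame numbers flagged as positive."""
    truth, result = set(truth_frames), set(result_frames)
    hits = len(truth & result)  # frames where truth and result agree
    precision = hits / len(result) if result else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall

# Ground truth says an object is visible in frames 10-19;
# the algorithm reports frames 15-24. Half the reported frames are
# correct, and half the true frames are found.
p, r = frame_precision_recall(range(10, 20), range(15, 25))
print(p, r)  # 0.5 0.5
```

Object-based evaluation works at a coarser grain, matching whole objects rather than individual frames, which is why ViPER-PE supports both styles.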

Other Tools

The ViPER API is a set of Java interfaces and classes that provide programmatic access to data stored in the ViPER format. It offers a generic, object-oriented view of video metadata that is aimed at evaluation. Since ViPER data is stored in XML, it is not difficult to read the data in languages that cannot interface with Java.
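Because the data is plain XML, any XML library will do. The sketch below uses Python's standard library; note that the element and attribute names here are invented for illustration and will not match the real ViPER schema.

```python
import xml.etree.ElementTree as ET

# A made-up snippet in the spirit of a ViPER file; the real schema differs.
SAMPLE = """
<viperdata>
  <object name="TextBlock" framespan="1:30">
    <bbox x="10" y="20" width="80" height="15"/>
  </object>
</viperdata>
"""

root = ET.fromstring(SAMPLE)
for obj in root.iter("object"):
    start, end = obj.get("framespan").split(":")
    box = obj.find("bbox")
    print(obj.get("name"), "frames", start, "to", end,
          "at", box.get("x"), box.get("y"))
```

This is the practical benefit of a documented, text-based format: tools outside the Java toolkit can consume the same ground truth files.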

ViPER-Viz is (currently) a set of UNIX scripts that let a user compactly visualize ground truth, analysis results, performance evaluation results, or entire video clips using several flexible representations.

Important Notes

The system is currently under development, so there are still bugs in the program. As such, there is no warranty, express or implied. Save early, save often. See our bug list for more details.

Note also that the web site hosted at SourceForge is likely to be kept more up to date, and its URL is easier to remember, so please see the new Video Performance Evaluation Resource (ViPER) Toolkit home page there.
Publications

D. Doermann and D. Mihalcik. "Tools and Techniques for Video Performance Evaluation." ICPR, pp. 167-170, 2000. (BibTeX)

Last Updated: Tuesday 20 December, 2011



© Copyright 2001, Language and Media Processing Laboratory, University of Maryland, All rights reserved.