Video Mosaic Component Programs

The video mosaic process has been broken up into four programs. The primary reason is to separate operations that require movie compression, decompression, or manipulation from the mosaic search itself. That way, the mosaic search can be (and has been) written for multiplatform compilation and can be run on the fastest machine available.

VideoDBGen

This program takes the following command-line parameters:
In addition, the following pertinent compiler predefinitions have been placed in an interface file to the utilities file that all programs use:
VideoDBGen works with video via the QuickTime SDK, which can handle movies in nearly any compression format. Each movie in the video repository is read in and decompressed, then divided into tiles of the width and height specified by the user. The number of frames in a tile is set by a compiler predefinition in a shared library. A database consists of a folder containing one entry file per clip. For each tile of each movie, metrics are run and a feature vector is generated. These vectors are written to the database entry file, and tile metadata (e.g. frame location and tile number) is associated with each vector. Movie metadata is written at the top of each database entry file. At present, the database entries are written out as ASCII text for debugging, although a library is already in place to do binary I/O with these files.
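The tiling step described above can be sketched as follows. This is a minimal illustration, not the original VideoDBGen code: the `TileInfo` struct, the `enumerateTiles` function, and the `kFramesPerTile` constant (standing in for the compiler predefinition) are all hypothetical names, and dropping partial tiles at the movie edges is an assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical tile descriptor; field names are illustrative only.
struct TileInfo {
    int tileNumber;  // running index of the tile within the movie
    int x, y;        // top-left pixel of the tile
    int firstFrame;  // first frame covered by the tile
    int frameCount;  // number of frames in the tile
};

// Assumed stand-in for the compiler predefinition that fixes tile length.
constexpr int kFramesPerTile = 8;

// Enumerate every full tile of size tileW x tileH x kFramesPerTile.
// Assumption: partial tiles at the right, bottom, and end of the movie are dropped.
std::vector<TileInfo> enumerateTiles(int movieW, int movieH, int totalFrames,
                                     int tileW, int tileH) {
    assert(tileW > 0 && tileH > 0);
    std::vector<TileInfo> tiles;
    int n = 0;
    for (int f = 0; f + kFramesPerTile <= totalFrames; f += kFramesPerTile)
        for (int y = 0; y + tileH <= movieH; y += tileH)
            for (int x = 0; x + tileW <= movieW; x += tileW)
                tiles.push_back({n++, x, y, f, kFramesPerTile});
    return tiles;
}
```

Each `TileInfo` carries the kind of metadata (frame location, tile number) that the text says is stored alongside every feature vector.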
VideoMosaicPrep also uses the QuickTime SDK. A movie is read in and decompressed, then divided into tiles of the width and height specified by the user. The number of frames in a tile is randomized slightly so that the tiles do not all start and end at the same time. Once the tiles have been created, the program calculates the optical flow between successive frames across the entire movie and writes this information to the output file. Next, it iterates through the tiles and generates feature vectors, which are then written to a file. Tile metadata (e.g. frame location and tile number) is associated with each vector.
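One way to randomize tile lengths as described above is to jitter each length around a base value. The sketch below is an assumption about how this could be done, not the method VideoMosaicPrep actually uses; `jitteredTileLengths`, `baseLen`, and `jitter` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Split totalFrames into consecutive tile lengths of baseLen +/- jitter frames.
// The final tile is shortened if needed so the lengths sum to totalFrames.
std::vector<int> jitteredTileLengths(int totalFrames, int baseLen, int jitter,
                                     std::mt19937& rng) {
    assert(jitter >= 0 && baseLen > jitter);  // guarantees every length is >= 1
    std::uniform_int_distribution<int> offset(-jitter, jitter);
    std::vector<int> lengths;
    int remaining = totalFrames;
    while (remaining > 0) {
        int len = std::min(remaining, baseLen + offset(rng));
        lengths.push_back(len);
        remaining -= len;
    }
    return lengths;
}
```

Calling this once per tile position with a shared generator gives each spatial location its own sequence of tile boundaries, so neighbouring tiles change at different times.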
The VideoMosaic program does the meat of the mosaic creation process: the actual search for tile replacements. The program is very CPU- and memory-intensive, and it does not use any OS-dependent SDKs, so it can be ported easily to faster *nix machines. It takes in a video mosaic prep file and then searches the database for the best-fit tile for each tile in the original video. The search is linear and has been optimized to three tight loops in C++. In addition, a number of statistics files can be read in to modify how vectors are compared. Because the values of the features are distributed very differently, one tactic is to normalize them based on statistics gathered from samples of comparison values, so that some features can then be weighted proportionally more than others.
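The search and normalization described above might look roughly like the following. This is a sketch under stated assumptions: the distance is a weighted squared difference, each feature is divided by a per-feature scale taken from the statistics files, and the names `FeatureStats`, `weightedDistance`, and `bestMatch` are hypothetical. The original program's actual metric and loop structure are not given in the text.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using FeatureVector = std::vector<double>;

// Assumed per-feature statistics: scale[i] normalizes feature i (for example,
// a typical difference observed in sampled comparisons); weight[i] sets its
// relative importance after normalization.
struct FeatureStats {
    std::vector<double> scale;
    std::vector<double> weight;
};

// Weighted squared distance between two vectors after per-feature scaling.
double weightedDistance(const FeatureVector& a, const FeatureVector& b,
                        const FeatureStats& s) {
    assert(a.size() == b.size());
    assert(a.size() == s.scale.size() && a.size() == s.weight.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        double d = (a[i] - b[i]) / s.scale[i];
        sum += s.weight[i] * d * d;
    }
    return sum;
}

// Linear scan of the database; returns the index of the closest entry,
// or -1 if the database is empty.
int bestMatch(const FeatureVector& query, const std::vector<FeatureVector>& db,
              const FeatureStats& s) {
    int best = -1;
    double bestDist = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < db.size(); ++i) {
        double d = weightedDistance(query, db[i], s);
        if (d < bestDist) {
            bestDist = d;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

Without the scaling step, a feature with a large numeric range would dominate the comparison regardless of its weight, which is why normalization comes before weighting.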
This program uses the QuickTime SDK to build the actual mosaic movies from the results of the VideoMosaic search. The input movie is decompressed in memory, and tiles of it are then replaced with tiles from other movies in the repository. When a tile is replaced, it is faded into its replacement to give a more seamless effect. This program is I/O bound and has been optimized so that creating a mosaic video is reasonably efficient.
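The fade between an original tile and its replacement could be implemented as a linear cross-fade over the first few frames of the tile. The text does not specify the fade curve or its length, so the linear ramp and the names `fadeWeight`, `blendChannel`, `blendTile`, and `fadeFrames` below are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Blend weight for a frame inside a tile: ramps linearly toward 1.0 over
// the first fadeFrames frames, then stays at 1.0 (fully replaced).
double fadeWeight(int frameInTile, int fadeFrames) {
    if (fadeFrames <= 0 || frameInTile >= fadeFrames) return 1.0;
    return static_cast<double>(frameInTile + 1) / (fadeFrames + 1);
}

// Blend one 8-bit channel value: t = 0 keeps the original,
// t = 1 gives the replacement. The result is rounded to nearest.
std::uint8_t blendChannel(std::uint8_t original, std::uint8_t replacement,
                          double t) {
    assert(t >= 0.0 && t <= 1.0);
    double v = (1.0 - t) * original + t * replacement;
    return static_cast<std::uint8_t>(v + 0.5);
}

// Blend a replacement tile's pixel buffer into the original tile in place.
void blendTile(std::vector<std::uint8_t>& original,
               const std::vector<std::uint8_t>& replacement, double t) {
    assert(original.size() == replacement.size());
    for (std::size_t i = 0; i < original.size(); ++i)
        original[i] = blendChannel(original[i], replacement[i], t);
}
```

For each frame of a replaced tile, the caller would compute `fadeWeight` from the frame's position within the tile and pass it to `blendTile` as `t`.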