Video Mosaic Features

These are the features used when calculating a vector description of a tile of video.  Note that some of the steps for these metrics have been combined during calculation for optimization. All steps for each feature are described separately for clarity.

The following constants are universal for all metrics:

  • Tile size
  • Number of frames in a tile
  • Number of histogram bins
  • The factor by which frames are reduced using bicubic resampling, so that the metrics adapt better to large frames.


Temporal Integral
The temporal integral of two frames is the sum of the differences between their corresponding pixels.  This metric averages that value over all the frame pairs in a tile.
  • For each pair of frames in the tile:
    • Convert frames to grayscale by averaging the R, G, and B components of each pixel.
    • Resize the frames by the global reduction factor using a bicubic resize algorithm.
    • For each resized frame pair, sum the differences between corresponding pixels in the two frames.
  • Average this value over all pairs of frames.
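
The steps above can be sketched in a few lines of Python. This is only an illustration, not the project's code: the function names are hypothetical, "pair of frames" is assumed to mean consecutive frames, the difference is assumed to be an absolute difference, and the bicubic resize step is left out (frames are taken as already reduced). A frame is a list of rows of (r, g, b) tuples.

```python
def to_gray(frame):
    """Average the R, G, and B components of each pixel."""
    return [[(r + g + b) / 3.0 for (r, g, b) in row] for row in frame]


def frame_difference(gray_a, gray_b):
    """Sum of per-pixel differences (absolute value is an assumption)."""
    return sum(
        abs(pa - pb)
        for row_a, row_b in zip(gray_a, gray_b)
        for pa, pb in zip(row_a, row_b)
    )


def temporal_integral(frames):
    """Average the frame difference over consecutive frame pairs in a tile."""
    grays = [to_gray(f) for f in frames]
    pairs = list(zip(grays, grays[1:]))
    if not pairs:
        return 0.0  # a single frame has no pairs to compare
    return sum(frame_difference(a, b) for a, b in pairs) / len(pairs)
```

A tile whose frames never change gets a value of 0; the value grows with the amount of change between frames.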


Optical Flow
Optical flow per tile is calculated. In addition to being used as a metric, the flow in x and y directions can be used to shift the tile coordinate frames, allowing the tile to move around in the parent movie.

  • For each pair of frames in the tile:
    • Convert frames to grayscale by averaging the R, G, and B components of each pixel.
    • Resize the frames using a bicubic resize algorithm by the global reduction factor.
    • For each pixel in both frames:
      • Calculate optical flow between the same pixel in both frames via a reference algorithm from a robotic vision handbook, chapter 12.[REF] This algorithm was first written in Matlab, then ported to C and checked against Matlab results.
    • Average the per-pixel optical flow over the frame pair.
  • Record these average optical flow values for each pair of frames in the tile.
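
The handbook algorithm referenced above is not reproduced here. As a stand-in, the sketch below uses a single-window Lucas-Kanade-style least-squares estimate, which yields one (x, y) flow vector per frame pair directly instead of computing per-pixel flow and then averaging it. All names are hypothetical, frames are assumed to be already grayscale and resized (lists of rows of numbers), and y is assumed to increase downward.

```python
def global_flow(gray_a, gray_b):
    """Least-squares (u, v) with Ix*u + Iy*v + It ~ 0 over interior pixels."""
    sxx = sxy = syy = sxt = syt = 0.0
    height, width = len(gray_a), len(gray_a[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences, averaged over both frames.
            ix = (gray_a[y][x + 1] - gray_a[y][x - 1]
                  + gray_b[y][x + 1] - gray_b[y][x - 1]) / 4.0
            iy = (gray_a[y + 1][x] - gray_a[y - 1][x]
                  + gray_b[y + 1][x] - gray_b[y - 1][x]) / 4.0
            it = gray_b[y][x] - gray_a[y][x]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return (0.0, 0.0)  # no usable texture: flow is undetermined
    # Solve the 2x2 normal equations [sxx sxy; sxy syy][u v]^T = -[sxt syt]^T.
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return (u, v)


def flow_per_pair(gray_frames):
    """One (u, v) estimate for each consecutive pair of frames in the tile."""
    return [global_flow(a, b) for a, b in zip(gray_frames, gray_frames[1:])]
```

The resulting x and y values could then be used to shift the tile coordinates as described above. Note that the values are in pixels of the reduced frame, so they would need to be scaled back by the reduction factor.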


Average Color
The obvious metric, extended to take time into account by dividing the tile into buckets and averaging color over each bucket.

  • Divide the frames in the tile into several groups of consecutive frames, where the number of groups is as close as possible to the number of histogram bins and the groups are as close as possible to equal in size.
  • For each frame of each group:
    • Calculate the average R, G, and B values in the frame.
  • For each group of frames:
    • Average the per-frame R, G, and B values over the group.
  • The average color for each group is then saved out.
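
A rough sketch of the grouping and averaging, with hypothetical function names. It assumes that when frames do not divide evenly, the earlier groups get the extra frames, and that the number of groups is capped at the number of frames. A frame is a list of rows of (r, g, b) tuples.

```python
def split_into_groups(frames, num_groups):
    """Split frames into consecutive groups of near-equal size."""
    if not frames:
        return []
    n = min(num_groups, len(frames))  # cannot have more groups than frames
    base, extra = divmod(len(frames), n)
    groups, start = [], 0
    for i in range(n):
        size = base + (1 if i < extra else 0)  # early groups absorb the remainder
        groups.append(frames[start:start + size])
        start += size
    return groups


def frame_average_color(frame):
    """Average R, G, and B over every pixel in one frame."""
    pixels = [p for row in frame for p in row]
    return tuple(sum(p[c] for p in pixels) / len(pixels) for c in range(3))


def group_average_colors(frames, num_groups):
    """One average (R, G, B) per group of consecutive frames."""
    result = []
    for group in split_into_groups(frames, num_groups):
        per_frame = [frame_average_color(f) for f in group]
        result.append(
            tuple(sum(avg[c] for avg in per_frame) / len(per_frame) for c in range(3))
        )
    return result
```

With 5 frames and 2 groups, for example, the groups hold 3 and 2 frames, and the feature is two RGB triples.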


Hue Color Histogram
Take the hue values of a tile into account.

  • For each frame in the tile:
    • Convert each pixel from RGB to HSV.
    • Bin the H value into some number of histogram bins specified by the user (typically 10). The binning is done on a linear (non-log) scale.
  • Once every pixel of every frame has been put into the histogram, the histogram is normalized.
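
A minimal sketch of the hue binning using Python's standard colorsys module. The function name is hypothetical, channels are assumed to be 8-bit (0 to 255), and "normalized" is assumed to mean that the bins sum to 1.

```python
import colorsys


def hue_histogram(frames, num_bins=10):
    """Linear-scale histogram of hue over every pixel of every frame."""
    counts = [0] * num_bins
    for frame in frames:
        for row in frame:
            for r, g, b in row:
                # colorsys expects floats in [0, 1] and returns hue in [0, 1).
                h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
                counts[min(int(h * num_bins), num_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * num_bins
```

Note that colorsys reports a hue of 0 for gray pixels, so in this sketch they all land in the first bin alongside red.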


Intensity Color Histogram
In addition to average color, intensity histograms handle grayscale content, which hue histograms cannot capture because gray pixels have no meaningful hue.

  • For each frame in the tile:
    • Get the pseudo-intensity of each pixel by averaging the R, G, and B components.
    • Bin this value into some number of histogram bins specified by the user (typically 10). The binning is done on a linear (non-log) scale.
  • Once every pixel of every frame has been put into the histogram, the histogram is normalized.
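
The same idea for intensity, as a sketch with a hypothetical function name. It assumes 8-bit channels (the max_value parameter is an illustrative assumption) and normalization to a sum of 1.

```python
def intensity_histogram(frames, num_bins=10, max_value=255.0):
    """Linear-scale histogram of (R + G + B) / 3 over every pixel of every frame."""
    counts = [0] * num_bins
    for frame in frames:
        for row in frame:
            for r, g, b in row:
                intensity = (r + g + b) / 3.0
                # Clamp so that the maximum intensity falls in the last bin.
                index = min(int(intensity / max_value * num_bins), num_bins - 1)
                counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * num_bins
```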


Edge Histograms
Edge detection on a per-tile basis.

  • For each frame in the tile:
    • Convert frame to grayscale by averaging the R, G, and B components of each pixel.
    • Resize the frame by the global reduction factor using a bicubic resize algorithm.
    • Divide the resized frame into 4 corner quadrants of equal (or as close to equal as possible) size.
    • For each quadrant:
      • For each pixel except those in the top row and rightmost column:
        • Calculate the gradient of the pixel by subtracting nearby pixels.
        • If the energy of this pixel is greater than an energy threshold constant:
          • Take the arctan2 of the gradient in the x and y dimensions.
          • Bin this value into some number of histogram bins specified by the user (typically 10). The binning is done on a linear (non-log) scale.
  • The histogram for each quadrant is then normalized and recorded.
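
The sketch below shows one way the per-quadrant orientation histograms might be computed. Several details are assumptions, since the text does not pin them down: the gradient uses the pixel to the right and the pixel above (which is why the top row and rightmost column are skipped), the energy is the squared gradient magnitude, the skipped row and column are applied per quadrant, and angles are binned linearly over -pi to pi. Function names and the default threshold are hypothetical. Frames are assumed to be already grayscale and resized.

```python
import math


def quadrants(gray):
    """Split a grayscale frame into its four corner quadrants."""
    mid_y, mid_x = len(gray) // 2, len(gray[0]) // 2
    return [
        [row[:mid_x] for row in gray[:mid_y]],  # top-left
        [row[mid_x:] for row in gray[:mid_y]],  # top-right
        [row[:mid_x] for row in gray[mid_y:]],  # bottom-left
        [row[mid_x:] for row in gray[mid_y:]],  # bottom-right
    ]


def edge_histograms(gray_frames, num_bins=10, energy_threshold=1.0):
    """Four normalized edge-orientation histograms, one per quadrant."""
    counts = [[0] * num_bins for _ in range(4)]
    for gray in gray_frames:
        for q, region in enumerate(quadrants(gray)):
            for y in range(1, len(region)):              # skip top row
                for x in range(len(region[0]) - 1):      # skip rightmost column
                    gx = region[y][x + 1] - region[y][x]
                    gy = region[y - 1][x] - region[y][x]
                    if gx * gx + gy * gy > energy_threshold:
                        angle = math.atan2(gy, gx)       # in [-pi, pi]
                        frac = (angle + math.pi) / (2 * math.pi)
                        counts[q][min(int(frac * num_bins), num_bins - 1)] += 1
    result = []
    for c in counts:
        total = sum(c)
        result.append([v / total for v in c] if total else [0.0] * num_bins)
    return result
```

The threshold keeps flat regions from filling the histogram with meaningless angles. A quadrant with no pixels above the threshold yields an all-zero histogram in this sketch.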


Energy Histogram
Measure the 'energy' or noise in a tile.

  • For each frame in the tile:
    • Convert frame to grayscale by averaging the R, G, and B components of each pixel.
    • Resize the frame using a bicubic resize algorithm to a new user-chosen width and height (typically 10 by 10 pixels, down from 30 by 30 pixels).
  • For each frame:
    • For each pixel except those in the top row and rightmost column:
      • Calculate the gradient of the pixel by subtracting nearby pixels.
      • Calculate the energy of the pixel using the gradient in the x and y dimensions.
      • Bin this value into some number of histogram bins specified by the user (typically 10). The binning is done on a log scale.
  • Once every pixel of every frame has been put into the histogram, the histogram is normalized.
  • Once every pixel of every frame has been put into the histogram, the histogram is normalized.
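
A sketch of the log-scale energy binning, under stated assumptions: the gradient uses the pixel to the right and the pixel above, the energy is the squared gradient magnitude, and the log scale is log(1 + energy) divided by its largest possible value for 8-bit grayscale input. The text does not specify the exact log mapping, so this is only one plausible choice. Names are hypothetical, and frames are assumed to be already grayscale and resized.

```python
import math


def energy_histogram(gray_frames, num_bins=10, max_energy=2 * 255.0 ** 2):
    """Log-scale histogram of per-pixel gradient energy over all frames."""
    counts = [0] * num_bins
    log_max = math.log1p(max_energy)
    for gray in gray_frames:
        for y in range(1, len(gray)):              # skip top row
            for x in range(len(gray[0]) - 1):      # skip rightmost column
                gx = gray[y][x + 1] - gray[y][x]
                gy = gray[y - 1][x] - gray[y][x]
                energy = gx * gx + gy * gy
                # log1p keeps zero energy at 0 instead of minus infinity.
                frac = math.log1p(energy) / log_max
                counts[min(int(frac * num_bins), num_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * num_bins
```

A log scale spreads the many low-energy pixels across more bins than a linear scale would, where nearly everything would fall into the first bin.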

 

 
