Real-time Video Scaling

Scaling real-time video streams is more constrained than scaling still images that are processed off-line. The first constraint is that each frame must be processed within a fixed amount of time. The second is the handling of interlaced frames.

Because of the speed constraint, and because video scaling involves no rotation, the more computationally intensive spline and sinc interpolation methods are not needed. At the other end of the processing spectrum is a straight merging of the even and odd lines of two consecutive video fields. This is not usually the preferred method, because any motion during the 1/30th of a second over which the two fields are captured creates a "feathering effect" on the de-interlaced video output. See Figure 1.
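The straight merge of two fields can be sketched in C as follows. This is a minimal illustration, assuming 8-bit single-channel fields stored as contiguous rows; the function name weave_fields and its parameters are hypothetical and are not taken from any Sensoray product.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Weave two fields into one frame. Frame rows 0, 2, 4, ... come from the
 * field that holds lines 1, 3, 5, ... in Figure 1's numbering (the "odd"
 * field); frame rows 1, 3, 5, ... come from the "even" field.
 * Each field has field_lines rows of width bytes; frame must have room
 * for 2 * field_lines rows. */
void weave_fields(const unsigned char *odd_field,
                  const unsigned char *even_field,
                  unsigned char *frame,
                  size_t width, size_t field_lines)
{
    assert(odd_field && even_field && frame);
    for (size_t i = 0; i < field_lines; i++) {
        memcpy(frame + (2 * i) * width,     odd_field  + i * width, width);
        memcpy(frame + (2 * i + 1) * width, even_field + i * width, width);
    }
}
```

Nothing here compensates for motion: if the scene changes between the two fields, alternate rows of the output come from different moments in time, which is the feathering shown in Figure 1.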

A well-established middle ground for video scaling is Bilinear Interpolation.

Bilinear Interpolation

Bilinear Interpolation uses each field of an interlaced video frame on its own to generate a full-size target image. Only the odd lines or only the even lines, whichever the field contains, are used. Interpolation is then performed between those lines and between adjoining pixels to generate a complete non-interlaced frame for the progressive scan output.
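For the simplest case, a 2x vertical stretch of a single field, the missing lines can be filled with the average of their neighbours. The sketch below assumes an 8-bit single-channel field whose first row is the top row of the frame; the function name deinterlace_field and the choice to repeat the last row at the bottom edge are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Build a progressive frame of 2 * field_lines rows from one field.
 * Field rows are copied to frame rows 0, 2, 4, ...; each missing row is the
 * rounded average of the field rows above and below it. The last missing
 * row has no row below it, so it repeats the final field row (assumption). */
void deinterlace_field(const unsigned char *field, unsigned char *frame,
                       size_t width, size_t field_lines)
{
    assert(field && frame && field_lines > 0);
    for (size_t i = 0; i < field_lines; i++) {
        const unsigned char *above = field + i * width;
        const unsigned char *below =
            (i + 1 < field_lines) ? field + (i + 1) * width : above;
        unsigned char *out_copy   = frame + (2 * i) * width;
        unsigned char *out_interp = frame + (2 * i + 1) * width;
        for (size_t x = 0; x < width; x++) {
            out_copy[x]   = above[x];
            out_interp[x] = (unsigned char)((above[x] + below[x] + 1) / 2);
        }
    }
}
```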

With video scaling, the desired amount of stretch in the X and Y directions is rarely a simple power of two. See Figure 2.

Thus, simply averaging neighboring source pixels is not enough. Instead, a weighted coefficient method is typically used: each target pixel is the linear interpolation of the adjacent source points, each weighted by how close it lies spatially to the target pixel. See Figure 3.
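As a floating-point reference for the weighted coefficient method, the sketch below resamples one line of samples. It assumes the endpoint-aligned step size used in this article's worked example, (source - 1) / (target - 1); the function name resample_line is hypothetical, and a real-time design would replace the floating-point arithmetic with integer arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Resample one line of src_len samples to tgt_len samples. Target sample k
 * sits at source position k * (src_len - 1) / (tgt_len - 1); its value is
 * the two neighbouring source samples weighted by proximity. */
void resample_line(const double *src, size_t src_len,
                   double *tgt, size_t tgt_len)
{
    assert(src && tgt && src_len >= 2 && tgt_len >= 2);
    double step = (double)(src_len - 1) / (double)(tgt_len - 1);
    for (size_t k = 0; k < tgt_len; k++) {
        double pos = (double)k * step;
        size_t left = (size_t)pos;        /* source sample at or before pos */
        if (left >= src_len - 1)
            left = src_len - 2;           /* keep left + 1 inside the line */
        double w = pos - (double)left;    /* weight of the right neighbour */
        tgt[k] = (1.0 - w) * src[left] + w * src[left + 1];
    }
}
```

The closer the target position is to the right-hand source sample, the larger w becomes, so that sample contributes more to the result.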

For a moderate cost in hardware, the weighted bilinear interpolation has the following benefits:

  • The aspect ratio can be adjusted.
  • Arbitrary borders can be added to the video output.
  • Frame rate conversion (e.g. from 60 Hz to 72 Hz) is smoother, since every field yields a full frame and the input frame rate is therefore higher.
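The first two benefits come down to choosing the size and position of the scaled picture inside the output raster. The sketch below is one possible way to compute that window so the source aspect ratio is preserved and the remainder becomes border. It assumes square pixels on both sides for simplicity, which does not hold for 720 x 480 NTSC; the ActiveWindow type and fit_with_borders function are hypothetical names.

```c
#include <assert.h>

typedef struct {
    int x0, y0;        /* top-left corner of the scaled picture */
    int width, height; /* size of the scaled picture */
} ActiveWindow;

/* Fit a src_w x src_h picture inside a tgt_w x tgt_h raster without
 * changing its aspect ratio, centring it and leaving borders around it. */
ActiveWindow fit_with_borders(int src_w, int src_h, int tgt_w, int tgt_h)
{
    assert(src_w > 0 && src_h > 0 && tgt_w > 0 && tgt_h > 0);
    ActiveWindow win;
    /* Compare src_w/src_h with tgt_w/tgt_h by cross-multiplying. */
    if ((long long)src_w * tgt_h >= (long long)tgt_w * src_h) {
        /* Source is relatively wider: fill the width, borders top and bottom. */
        win.width  = tgt_w;
        win.height = (int)((long long)tgt_w * src_h / src_w);
    } else {
        /* Source is relatively taller: fill the height, borders left and right. */
        win.height = tgt_h;
        win.width  = (int)((long long)tgt_h * src_w / src_h);
    }
    win.x0 = (tgt_w - win.width) / 2;
    win.y0 = (tgt_h - win.height) / 2;
    return win;
}
```

Under these assumptions, fitting 1920 x 1080 into 1024 x 768 gives a 1024 x 576 picture with 96-line borders above and below.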

Sensoray's Model 2246 HDTV Frame Grabber and Display processes multiple video inputs of many sizes. By using several identical instances of a generic bilinear interpolator, the 2246 can scale any input to any of the output targets, plus an MPEG compression target for streaming overlaid video back to the host computer for PVR functions. To give an idea of size, the scaling engine in the 2246 uses approximately 15% of Altera's second-smallest Stratix II FPGA (the EP2S30). Some of the possible input-to-output scale settings are as follows:

  • 1920 x 1080i to 720 x 480 (HD to NTSC)
  • 1920 x 1080i to 720 x 576 (HD to PAL)
  • 1920 x 1080i to 1024 x 768 (HD to DVI)
  • 1920 x 1080i to 1600 x 1200 (HD to DVI)
  • 720 x 480 to 1920 x 1080i (NTSC to HD)
  • 720 x 576 to 1920 x 1080i (PAL to HD)
  • 720 x 480 to 1024 x 768 (NTSC to DVI)
  • 720 x 576 to 1920 x 1080p (PAL to DVI)

Figure 1

Odd Field
1  -------X----
 
3  -----XXXXX--
 
5  ---XXXXXXXXX
 
7  -----XXXXX--
 
9  -------X----
 
Even Field
 
2  --XXX-------
 
4  XXXXXXX-----
 
6  XXXXXXX-----
 
8  --XXX-------
 
10 ------------
Simple Merge
1  -------X----
2  --XXX-------
3  -----XXXXX--
4  XXXXXXX-----
5  ---XXXXXXXXX
6  XXXXXXX-----
7  -----XXXXX--
8  --XXX-------
9  -------X----
10 ------------

Figure 2

Odd Field
1  -------X----
2  ------III---
3  -----XXXXX--
4  ----IIIIIII-
5  ---XXXXXXXXX
6  ----IIIIIII-
7  -----XXXXX--
8  ------III---
9  -------X----
10 ------------
Even Field
1  ------------
2  --XXX-------
3  -IIIII------
4  XXXXXXX-----
5  IIIIIII-----
6  XXXXXXX-----
7  -IIIII------
8  --XXX-------
9  ------------
10 ------------
Bilinear Interpolation Example

As an example of weighted scaling by a non-integer factor, let's scale 5 lines to 8 lines.

Set Step size = (SourceLines - 1) / (TargetLines - 1) = 4/7

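Source (5 lines)

 1  -------A----


 2  -----BBBBB--


 3  ---CCCCCCCCC


 4  -----DDDDD--


 5  -------E----

Target (8 lines)

 1  -------F----

 2  -----GGGGG--  G = 4/7 distance from A to B

 3  ----HHHHHHH-  H = 1/7 distance from B to C
 4  ---IIIIIIIII  I = 5/7 distance from B to C

 5  ---JJJJJJJJJ  J = 2/7 distance from C to D
 6  ----KKKKKKK-  K = 6/7 distance from C to D

 7  -----LLLLL--  L = 3/7 distance from D to E

 8  -------M----  M = 7/7 distance from D to E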

Equations

F =  1       *A
G = (1-(4/7))*A + (4/7)*B
H = (1-(1/7))*B + (1/7)*C
I = (1-(5/7))*B + (5/7)*C
J = (1-(2/7))*C + (2/7)*D
K = (1-(6/7))*C + (6/7)*D
L = (1-(3/7))*D + (3/7)*E
M =  0       *D + (7/7)*E
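The line pairs and weights in these equations can be generated with integer arithmetic alone. The sketch below is one way to do so, assuming 0-based target line numbers; the function name line_weight and its parameters are hypothetical.

```c
#include <assert.h>

/* For target line k (0-based), find the source line pair and the weight
 * numerator: the result lies *num / (tgt_lines - 1) of the way from source
 * line *upper to source line *upper + 1 (both 0-based). */
void line_weight(int k, int src_lines, int tgt_lines, int *upper, int *num)
{
    assert(src_lines >= 2 && tgt_lines >= 2 && k >= 0 && k < tgt_lines);
    int den = tgt_lines - 1;
    int pos = k * (src_lines - 1);   /* source position, scaled by den */
    *upper = pos / den;
    *num   = pos % den;
    if (*upper == src_lines - 1) {
        /* Exactly on the last source line: express it as den/den of the
         * way from the previous line, as the equation for M does. */
        *upper -= 1;
        *num    = den;
    }
}
```

For 5 source lines and 8 target lines, k = 1 gives the pair (A, B) with numerator 4, k = 2 gives (B, C) with numerator 1, and k = 7 gives (D, E) with numerator 7, matching G, H and M.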
 
The same procedure is then repeated for the X direction.

Code

The following pseudocode performs arbitrary Bilinear Interpolation. It also shows how the fractional weights, which would otherwise call for floating-point arithmetic, can be handled with integer arithmetic only.

X-Direction

  
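 // Horizontal pass: resample each source line from SrcWidth to TgtWidth pixels.
 // The source position advances by SrcWidth/TgtWidth per target pixel and is
 // tracked as an integer part plus a remainder over TgtWidth, so no floating
 // point is needed. Assumes HActiveStart <= HActiveStop <= TgtWidth.
 HSrcStepInt = SrcWidth / TgtWidth;  // whole source pixels per target pixel
 HSrcStepRem = SrcWidth % TgtWidth;  // fractional part, in units of 1/TgtWidth
  
 for ( CurrentReadLine = 0; CurrentReadLine < SrcHeight; CurrentReadLine++ ) {
   CurrentWriteLine = CurrentReadLine;
   HSrcPosInt       = 0; // next source pixel to fetch
   HSrcPosRem       = 0; // fractional source position, in units of 1/TgtWidth
   NumPixels        = 0; // target pixels stepped through on this line
   PixelCount       = 0; // target pixels written on this line
   HTotStepInt      = 2; // pre-load first two pixels
  
   while ( NumPixels < HActiveStop ) {
     if ( HTotStepInt != 0 ) {
       // Shift in the next source pixel, holding the last one at the line end
       SourcePix1 = SourcePix2;
       SourcePix2 = SourceImage[CurrentReadLine]
                               [HSrcPosInt < SrcWidth ? HSrcPosInt : SrcWidth - 1];
       HSrcPosInt++;
       HTotStepInt--;
     }
     else if ( NumPixels < TgtWidth ) {
       if ( NumPixels >= HActiveStart && NumPixels < HActiveStop ) {
         // Weights scaled by 65536; the >> 16 below removes the scaling
         coeff1 = ( TgtWidth - HSrcPosRem ) * 65536 / TgtWidth;
         coeff2 = HSrcPosRem * 65536 / TgtWidth;
         TargetLine[CurrentWriteLine][PixelCount] =
            ( coeff1 * SourcePix1 + coeff2 * SourcePix2 ) >> 16;
         PixelCount++;
       }
       // Advance the source position by one step, carrying the remainder
       HSrcPosRem += HSrcStepRem;
       HTotStepInt = HSrcStepInt;
       if ( HSrcPosRem >= TgtWidth ) {
         HSrcPosRem -= TgtWidth;
         HTotStepInt++;
       }
       NumPixels++;
       if ( HTotStepInt != 0 ) {
         // Start the next shift immediately rather than waiting a pass
         SourcePix1 = SourcePix2;
         SourcePix2 = SourceImage[CurrentReadLine]
                                 [HSrcPosInt < SrcWidth ? HSrcPosInt : SrcWidth - 1];
         HSrcPosInt++;
         HTotStepInt--;
       }
     }
   } // While NumPixels
 } // For CurrentReadLine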

Y-Direction

  
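 // Vertical pass: blend adjacent lines to resample SrcHeight lines to TgtHeight.
 // SourceImage is the input to this pass, TgtWidth pixels wide (for example,
 // the output of the X-direction pass). TargetImage, a name assumed here,
 // receives the result.
 int VSrcStepInt = SrcHeight / TgtHeight;  // whole source lines per target line
 int VSrcStepRem = SrcHeight % TgtHeight;  // fractional part, in units of 1/TgtHeight
 int VSrcPosRem  = 0;                      // fractional source position
 int VTotStepInt = 0;                      // source lines still to advance
 int NumLines    = 0;                      // target lines written so far
 int CurrentReadLine1 = 0;                       // upper line of the pair
 int CurrentReadLine2 = ( SrcHeight > 1 ) ? 1 : 0; // lower line of the pair
  
 while ( NumLines < TgtHeight ) {
   if ( VTotStepInt != 0 ) {
     // Move the line pair down one line, holding at the last source line
     if ( CurrentReadLine1 < SrcHeight - 1 ) {
       CurrentReadLine1++;
     }
     CurrentReadLine2 = ( CurrentReadLine1 < SrcHeight - 1 )
                        ? CurrentReadLine1 + 1 : CurrentReadLine1;
     VTotStepInt--;
   }
   else {
     // Weights scaled by 65536; the >> 16 below removes the scaling
     coeff1 = ( TgtHeight - VSrcPosRem ) * 65536 / TgtHeight;
     coeff2 = VSrcPosRem * 65536 / TgtHeight;
  
     for ( NumPixels = 0; NumPixels < TgtWidth; NumPixels++ ) {
       TopPixel    = SourceImage[CurrentReadLine1][NumPixels];
       BottomPixel = SourceImage[CurrentReadLine2][NumPixels];
       TargetImage[NumLines][NumPixels] =
          ( coeff1 * TopPixel + coeff2 * BottomPixel ) >> 16;
     }
  
     // Advance the source position by one step, carrying the remainder
     VSrcPosRem += VSrcStepRem;
     VTotStepInt = VSrcStepInt;
     if ( VSrcPosRem >= TgtHeight ) {
       VSrcPosRem -= TgtHeight;
       VTotStepInt++;
     }
     NumLines++;
   }
 } // While NumLines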
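For checking results in software, the two passes can be collapsed into a single function. The sketch below is not the streaming structure of the pseudocode above: it recomputes each source position with a division instead of stepping an integer part and a remainder, and it uses the endpoint-aligned step size of the worked example, (source - 1) / (target - 1). It assumes an 8-bit single-channel image in row-major order; the function name bilinear_scale is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Scale an 8-bit single-channel image using bilinear interpolation with
 * integer arithmetic only. Weights are fractions scaled by 65536. */
void bilinear_scale(const uint8_t *src, int src_w, int src_h,
                    uint8_t *tgt, int tgt_w, int tgt_h)
{
    assert(src && tgt);
    assert(src_w >= 2 && src_h >= 2 && tgt_w >= 2 && tgt_h >= 2);
    for (int ty = 0; ty < tgt_h; ty++) {
        /* Vertical source position as a numerator over (tgt_h - 1). */
        int64_t ynum = (int64_t)ty * (src_h - 1);
        int y0 = (int)(ynum / (tgt_h - 1));
        int y1 = (y0 + 1 < src_h) ? y0 + 1 : y0;
        uint32_t wy = (uint32_t)((ynum % (tgt_h - 1)) * 65536 / (tgt_h - 1));
        for (int tx = 0; tx < tgt_w; tx++) {
            int64_t xnum = (int64_t)tx * (src_w - 1);
            int x0 = (int)(xnum / (tgt_w - 1));
            int x1 = (x0 + 1 < src_w) ? x0 + 1 : x0;
            uint32_t wx = (uint32_t)((xnum % (tgt_w - 1)) * 65536 / (tgt_w - 1));
            /* Horizontal blends on the two rows; each is the pixel value
             * scaled by 65536 and stays below 2^24. */
            uint32_t top = (65536u - wx) * src[y0 * src_w + x0]
                         + wx * src[y0 * src_w + x1];
            uint32_t bot = (65536u - wx) * src[y1 * src_w + x0]
                         + wx * src[y1 * src_w + x1];
            /* Vertical blend in 64 bits, then round and drop both scalings. */
            uint64_t v = (uint64_t)(65536u - wy) * top + (uint64_t)wy * bot;
            tgt[ty * tgt_w + tx] = (uint8_t)((v + ((uint64_t)1 << 31)) >> 32);
        }
    }
}
```

Scaling a 2 x 2 image with values 0, 100 on the first row and 100, 200 on the second up to 3 x 3 places 100 at the centre, the average of all four source pixels.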

Conclusion

Real-time video scaling engines based on the Bilinear Interpolation approach can be highly modular, fast enough to cross-convert between the many video standards, and cost-effective.

SENSORAY | 7313 SW Tech Center Dr. | Tigard, OR 97223 | 503.684.8005

Copyright © 1982-2024 Sensoray ~ All Rights Reserved