从编码图像和视频中提取DCT系数

本文介绍了从编码图像和视频中提取DCT系数的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

有没有办法从编码图像和视频中轻松提取DCT系数（和量化参数）？任何解码器软件都必须使用它们来解码块DCT编码的图像和视频。所以我很确定解码器知道它们是什么。有没有办法将它们暴露给使用解码器的人？

Is there a way to easily extract the DCT coefficients (and quantization parameters) from encoded images and video? Any decoder software must be using them to decode block-DCT encoded images and video. So I'm pretty sure the decoder knows what they are. Is there a way to expose them to whomever is using the decoder?

我正在实施一些直接在DCT域中工作的视频质量评估算法。目前，我的大多数代码都使用OpenCV，因此如果有人知道使用该框架的解决方案，那将会很棒。我不介意使用其他库（也许是libjpeg，但这似乎只适用于静止图像），但我主要关心的是尽可能少地执行特定格式的工作（我不想重新发明轮子并写入我自己的解码器）。我希望能够打开OpenCV可以打开的任何视频/图像（H.264，MPEG，JPEG等），如果它是块DCT编码的，则可以获得DCT系数。

I'm implementing some video quality assessment algorithms that work directly in the DCT domain. Currently, the majority of my code uses OpenCV, so it would be great if anyone knows of a solution using that framework. I don't mind using other libraries (perhaps libjpeg, but that seems to be for still images only), but my primary concern is to do as little format-specific work as possible (I don't want to reinvent the wheel and write my own decoders). I want to be able to open any video/image (H.264, MPEG, JPEG, etc) that OpenCV can open, and if it's block DCT-encoded, to get the DCT coefficients.

在最坏的情况下，我知道我可以编写自己的块DCT代码，通过它运行解压缩的帧/图像然后我会回到DCT域。这不是一个优雅的解决方案，我希望我能做得更好。

In the worst case, I know that I can write up my own block DCT code, run the decompressed frames/images through it and then I'd be back in the DCT domain. That's hardly an elegant solution, and I hope I can do better.

目前，我使用相当常见的OpenCV样板来打开图像：

Presently, I use the fairly common OpenCV boilerplate to open images:

IplImage *image = cvLoadImage(filename);
// Run quality assessment metric

我用于视频的代码同样微不足道：

The code I'm using for video is equally trivial:

CvCapture *capture = cvCaptureFromAVI(filename);
while (cvGrabFrame(capture))
{
    IplImage *frame = cvRetrieveFrame(capture);
    // Run quality assessment metric on frame
}
cvReleaseCapture(&capture);

在这两种情况下，我得到一个3通道 IplImage 采用BGR格式。有没有什么方法可以得到DCT系数？

In both cases, I get a 3-channel IplImage in BGR format. Is there any way I can get the DCT coefficients as well?

推荐答案

好吧，我做了一些阅读和我原来的问题似乎是一厢情愿的例子。

Well, I did a bit of reading and my original question seems to be an instance of wishful thinking.

基本上，由于H.264 不使用DCT 。它使用不同的变换（整数变换）。接下来，该变换的系数不一定在逐帧的基础上改变 - H.264更智能，因为它将帧分成片。应该可以通过一个特殊的解码器获得这些系数，但我怀疑OpenCV是否为用户公开了它。

Basically, it's not possible to get the DCT coefficients from H.264 video frames for the simple reason that H.264 doesn't use DCT. It uses a different transform (integer transform). Next, the coefficients for that transform don't necessarily change on a frame-by-frame basis -- H.264 is smarter cause it splits up frames into slices. It should be possible to get those coefficients through a special decoder, but I doubt OpenCV exposes it for the user.

对于JPEG，事情要好一些。我怀疑，会为您公开DCT系数。我写了一个小应用程序，以表明它的工作原理（源头）。它使用每个块的DC项创建一个新图像。由于DC项等于块平均值（在适当缩放后），DC图像是输入JPEG图像的下采样版本。

For JPEG, things are a bit more positive. As I suspected, libjpeg exposes the DCT coefficients for you. I wrote a small app to show that it works (source at the end). It makes a new image using the DC term from each block. Because the DC term is equal to the block average (after proper scaling), the DC images are downsampled versions of the input JPEG image.

编辑：源中的固定缩放

原始图像（512 x 512）：

Original image (512 x 512):

DC图像（64x64）：luma Cr Cb RGB

DC images (64x64): luma Cr Cb RGB

来源（C ++）：

#include <stdio.h>
#include <assert.h>

#include <cv.h>
#include <highgui.h>

extern "C"
{
#include "jpeglib.h"
#include <setjmp.h>
}

#define DEBUG 0
#define OUTPUT_IMAGES 1

/*
 * Extract the DC terms from the specified component.
 */
IplImage *
extract_dc(j_decompress_ptr cinfo, jvirt_barray_ptr *coeffs, int ci)
{
    jpeg_component_info *ci_ptr = &cinfo->comp_info[ci];
    CvSize size = cvSize(ci_ptr->width_in_blocks, ci_ptr->height_in_blocks);
    IplImage *dc = cvCreateImage(size, IPL_DEPTH_8U, 1);
    assert(dc != NULL);

    JQUANT_TBL *tbl = ci_ptr->quant_table;
    UINT16 dc_quant = tbl->quantval[0];

#if DEBUG
    printf("DCT method: %x\n", cinfo->dct_method);
    printf
    (
        "component: %d (%d x %d blocks) sampling: (%d x %d)\n",
        ci,
        ci_ptr->width_in_blocks,
        ci_ptr->height_in_blocks,
        ci_ptr->h_samp_factor,
        ci_ptr->v_samp_factor
    );

    printf("quantization table: %d\n", ci);
    for (int i = 0; i < DCTSIZE2; ++i)
    {
        printf("% 4d ", (int)(tbl->quantval[i]));
        if ((i + 1) % 8 == 0)
            printf("\n");
    }

    printf("raw DC coefficients:\n");
#endif

    JBLOCKARRAY buf =
    (cinfo->mem->access_virt_barray)
    (
        (j_common_ptr)cinfo,
        coeffs[ci],
        0,
        ci_ptr->v_samp_factor,
        FALSE
    );
    for (int sf = 0; (JDIMENSION)sf < ci_ptr->height_in_blocks; ++sf)
    {
        for (JDIMENSION b = 0; b < ci_ptr->width_in_blocks; ++b)
        {
            int intensity = 0;

            intensity = buf[sf][b][0]*dc_quant/DCTSIZE + 128;
            intensity = MAX(0,   intensity);
            intensity = MIN(255, intensity);

            cvSet2D(dc, sf, (int)b, cvScalar(intensity));

#if DEBUG
            printf("% 2d ", buf[sf][b][0]);
#endif
        }
#if DEBUG
        printf("\n");
#endif
    }

    return dc;

}

IplImage *upscale_chroma(IplImage *quarter, CvSize full_size)
{
    IplImage *full = cvCreateImage(full_size, IPL_DEPTH_8U, 1);
    cvResize(quarter, full, CV_INTER_NN);
    return full;
}

GLOBAL(int)
read_JPEG_file (char * filename, IplImage **dc)
{
  /* This struct contains the JPEG decompression parameters and pointers to
   * working space (which is allocated as needed by the JPEG library).
   */
  struct jpeg_decompress_struct cinfo;

  struct jpeg_error_mgr jerr;
  /* More stuff */
  FILE * infile;        /* source file */

  /* In this example we want to open the input file before doing anything else,
   * so that the setjmp() error recovery below can assume the file is open.
   * VERY IMPORTANT: use "b" option to fopen() if you are on a machine that
   * requires it in order to read binary files.
   */

  if ((infile = fopen(filename, "rb")) == NULL) {
    fprintf(stderr, "can't open %s\n", filename);
    return 0;
  }

  /* Step 1: allocate and initialize JPEG decompression object */

  cinfo.err = jpeg_std_error(&jerr);

  /* Now we can initialize the JPEG decompression object. */
  jpeg_create_decompress(&cinfo);

  /* Step 2: specify data source (eg, a file) */

  jpeg_stdio_src(&cinfo, infile);

  /* Step 3: read file parameters with jpeg_read_header() */

  (void) jpeg_read_header(&cinfo, TRUE);
  /* We can ignore the return value from jpeg_read_header since
   *   (a) suspension is not possible with the stdio data source, and
   *   (b) we passed TRUE to reject a tables-only JPEG file as an error.
   * See libjpeg.txt for more info.
   */

  /* Step 4: set parameters for decompression */

  /* In this example, we don't need to change any of the defaults set by
   * jpeg_read_header(), so we do nothing here.
   */

  jvirt_barray_ptr *coeffs = jpeg_read_coefficients(&cinfo);

  IplImage *y    = extract_dc(&cinfo, coeffs, 0);
  IplImage *cb_q = extract_dc(&cinfo, coeffs, 1);
  IplImage *cr_q = extract_dc(&cinfo, coeffs, 2);

  IplImage *cb = upscale_chroma(cb_q, cvGetSize(y));
  IplImage *cr = upscale_chroma(cr_q, cvGetSize(y));

  cvReleaseImage(&cb_q);
  cvReleaseImage(&cr_q);

#if OUTPUT_IMAGES
  cvSaveImage("y.png",   y);
  cvSaveImage("cb.png", cb);
  cvSaveImage("cr.png", cr);
#endif

  *dc = cvCreateImage(cvGetSize(y), IPL_DEPTH_8U, 3);
  assert(dc != NULL);

  cvMerge(y, cr, cb, NULL, *dc);

  cvReleaseImage(&y);
  cvReleaseImage(&cb);
  cvReleaseImage(&cr);

  /* Step 7: Finish decompression */

  (void) jpeg_finish_decompress(&cinfo);
  /* We can ignore the return value since suspension is not possible
   * with the stdio data source.
   */

  /* Step 8: Release JPEG decompression object */

  /* This is an important step since it will release a good deal of memory. */
  jpeg_destroy_decompress(&cinfo);

  fclose(infile);

  return 1;
}

int
main(int argc, char **argv)
{
    int ret = 0;
    if (argc != 2)
    {
        fprintf(stderr, "usage: %s filename.jpg\n", argv[0]);
        return 1;
    }
    IplImage *dc = NULL;
    ret = read_JPEG_file(argv[1], &dc);
    assert(dc != NULL);

    IplImage *rgb = cvCreateImage(cvGetSize(dc), IPL_DEPTH_8U, 3);
    cvCvtColor(dc, rgb, CV_YCrCb2RGB);

#if OUTPUT_IMAGES
    cvSaveImage("rgb.png", rgb);
#else
    cvNamedWindow("DC", CV_WINDOW_AUTOSIZE);
    cvShowImage("DC", rgb);
    cvWaitKey(0);
#endif

    cvReleaseImage(&dc);
    cvReleaseImage(&rgb);

    return 0;
}

这篇关于从编码图像和视频中提取DCT系数的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！