KR20010006040A

KR20010006040A - Method and apparatus for assessing the visibility of differences between two signal sequences

Info

Publication number: KR20010006040A
Application number: KR1019997009118A
Authority: KR
Inventors: Jeffrey Lubin; Michael H. Brill; Albert P. Pica; Roger L. Crane; Walter Paul; Gary A. Gendel
Original assignee: William J. Burke; Sarnoff Corporation
Priority date: 1998-02-02
Filing date: 1998-04-06
Publication date: 2001-01-15
Also published as: KR100574595B1

Abstract

A method and apparatus 110 for assessing the visibility of differences between two input signal sequences, e.g., image sequences, is disclosed. The apparatus includes a perceptual metric generator 112 having an input signal processor 210, a luminance processor 220, a color processor 230, and perceptual metric generating sections 240, 250, 260.

Description

METHOD AND APPARATUS FOR ASSESSING THE VISIBILITY OF DIFFERENCES BETWEEN TWO SIGNAL SEQUENCES

Designers of signal processing systems, e.g., imaging systems, often assess the performance of their designs in terms of physical parameters such as contrast, resolution and/or bit-rate efficiency in compression/decompression (codec) processes. While these parameters can be easily measured, they may not be accurate gauges of performance. The reason is that end users of imaging systems are generally more concerned with subjective visual performance, such as the visibility of artifacts or distortions and, in some cases, the enhancement of image features that may reveal information such as the existence of a tumor in an image, e.g., an MRI (Magnetic Resonance Imaging) image or a CAT (Computer-Assisted Tomography) scan image.

For example, an input image can be processed using two different codec algorithms to produce two different codec images. If the measure of codec image fidelity is based purely on parameters such as a mean squared error (MSE) calculation performed on both codec images, without considering the psychophysical properties of human vision, the codec image with the lower MSE value may actually contain more noticeable distortions than the codec image with the higher MSE value.
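By way of illustration only, the following Python sketch shows why MSE alone cannot distinguish distortions of different spatial structure. The image data and the helper name `mse` are hypothetical and are not taken from the disclosure; the sketch makes no claim about which distortion an observer would find more visible.

```python
def mse(reference, test):
    """Mean squared error between two equally sized images (lists of rows)."""
    total = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    return total / count


# Hypothetical 4x4 uniform gray reference image.
reference = [[100] * 4 for _ in range(4)]

# Distortion A: every pixel is off by 2 (a faint uniform shift).
uniform_shift = [[102] * 4 for _ in range(4)]

# Distortion B: a single pixel is off by 8 (an isolated spike).
single_spike = [row[:] for row in reference]
single_spike[1][2] = 108

print(mse(reference, uniform_shift))  # 4.0
print(mse(reference, single_spike))   # 4.0
```

Both distortions yield the same MSE although their spatial structure differs completely, which is the kind of information a purely physical measure discards.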

Therefore, a need exists in the art for a method and apparatus for assessing the effects of physical parameters on the subjective performance of a signal processing system, e.g., an imaging system.

This application claims the benefit of U.S. Provisional Applications No. 60/043,050, filed April 4, 1997, and No. 60/073,435, filed February 2, 1998, which are herein incorporated by reference.

The present invention relates to an apparatus and concomitant method for evaluating and improving the performance of signal processing systems. More particularly, the invention relates to a method and apparatus for assessing the visibility of differences between two signal sequences.

FIG. 1 illustrates a signal processing system of the present invention;

FIG. 2 is a block diagram of the perceptual metric generator;

FIG. 3 is a block diagram of the input signal processor;

FIG. 4 is a block diagram of the luminance processor;

FIG. 5 is a block diagram of the color processor;

FIG. 6 is a detailed block diagram of the luminance processor;

FIG. 7 is a block diagram of the luminance metric generating section;

FIG. 8 is a block diagram of the color processor;

FIG. 9 is a block diagram of the color metric generating section;

FIG. 10 is a graph of luminance spatial sensitivity data;

FIG. 11 is a graph of luminance temporal sensitivity data;

FIG. 12 is a graph of luminance contrast discrimination data;

FIG. 13 is a graph of disk detection data;

FIG. 14 is a graph of checkerboard detection data;

FIG. 15 is a graph of edge sharpness discrimination data;

FIG. 16 is a graph of color spatial sensitivity data;

FIG. 17 is a graph of color contrast discrimination data;

FIG. 18 is a graph of rating prediction data;

FIG. 19 is a block diagram of an alternate embodiment of the luminance processor;

FIG. 20 is a detailed block diagram of the alternate embodiment of the luminance processor of FIG. 19;

FIG. 21 is a detailed block diagram of an alternate embodiment of the luminance metric generating section of FIG. 19;

FIG. 22 is a block diagram of a luminance processor for processing half-height images;

FIG. 23 is a block diagram of a luminance metric generating section for processing half-height images;

FIG. 24 is a detailed block diagram of an alternate embodiment of the color processor;

FIG. 25 is a detailed block diagram of an alternate embodiment of the color metric generating section;

FIG. 26 is a block diagram of a color processor for processing half-height images; and

FIG. 27 is a block diagram of a color metric generating section for processing half-height images.

The present invention is a method and apparatus for assessing the visibility of differences between two input signal sequences, e.g., image sequences. The apparatus includes a perceptual metric generator having an input signal processor, a luminance processor, a color processor, and a perceptual metric generating section.

The input signal processor transforms the input signals into psychophysically defined quantities, e.g., luminance components and color components. The luminance processor processes the luminance components into a luminance perceptual metric, while the color processor processes the color components into a color perceptual metric. Finally, the perceptual metric generating section correlates the luminance perceptual metric with the color perceptual metric into a unified perceptual metric, e.g., a just-noticeable-difference (JND) map.

FIG. 1 depicts a signal processing system 100 that utilizes the present invention. The signal processing system consists of a signal receiver 130, a signal processor 110, input/output devices 120, and a system under test 140.

The signal receiver 130 serves to receive input data signals, such as sequences of images from imaging devices, or other time-varying signals such as audio signals from microphones or recorded media. Thus, although the present invention is described below with regard to images, it should be understood that the invention can be applied to other input signals as mentioned above.

The signal receiver 130 includes a data receiver 132 and a data storage unit 134. The data receiver 132 may include a number of devices such as a modem and an analog-to-digital converter. As is well known, a modem comprises a modulator and a demodulator for sending and receiving binary data over a telephone line or other communication channel, while an analog-to-digital converter converts analog signals into digital form. Hence, the signal receiver 130 may receive input signals "on-line" or in real time and, if necessary, convert them to digital form. As such, the signal receiver 130 may receive signals from one or more devices such as a computer, a camera, a video recorder, or various medical imaging devices.

The data storage unit 134 serves to store input signals received by the data receiver 132. The data storage unit 134 contains one or more devices such as a disk drive, semiconductor memory, or other storage media. These storage devices provide a means for applying a delay to the input signals, or simply for storing the input signals for subsequent processing.

In the preferred embodiment, the signal processor 110 comprises a general purpose computer having a perceptual metric generator 112 (also known as a visual discrimination measure, VDM), a central processing unit (CPU) 114, and a memory 116 to facilitate image processing. The perceptual metric generator 112 can be a physical apparatus constructed from various filters, or a processor coupled to the CPU through a communication channel. Alternatively, the perceptual metric generator 112 can be implemented as a software application that is recalled from an input/output device 120 and executed by the CPU of the signal processor. As such, the perceptual metric generator 112 of the present invention can be stored on a computer-readable medium.

The signal processor 110 is also coupled to a plurality of input/output devices 120 such as a keyboard, a mouse, a video monitor, or storage devices, including but not limited to magnetic and optical drives, diskettes or tapes, e.g., a hard disk drive or a compact disk drive. The input devices serve to provide inputs (control signals and data) to the signal processor for processing the input images, while the output devices serve to display or record the results, e.g., displaying a perceptual metric on a display.

The signal processing system 100, using the perceptual metric generator 112, is able to predict the perceptual ratings that human subjects will assign to two signal sequences, e.g., a degraded color-image sequence relative to its non-degraded counterpart. The perceptual metric generator 112 assesses the visibility of differences between two sequences or streams of input images and produces several difference estimates, including a single metric of the perceptual differences between the sequences. These differences are quantified in units of the modeled human just-noticeable-difference (JND) metric. This metric can be expressed as a JND value, a JND map, or a probability prediction. In turn, the CPU may utilize the JND image metric to optimize various processes including, but not limited to, digital image compression, image quality measurement, and target detection.

To illustrate, an input image sequence passes through two different paths or channels to reach the signal processing system 100. On one path, the input image sequence passes directly to the signal processor without processing (the reference channel or reference image sequence), while on the other path the same input image sequence passes through the system under test 140, where the image sequence is processed in some form (the channel under test or test image sequence). The signal processing system 100 generates a perceptual metric that measures the differences between the two image sequences. The system under test 140 can be any number of devices or systems, e.g., a decoder, a transmission channel itself, an audio or video recorder, a scanner, a display, or a transmitter. Thus, the signal processing system 100 can be employed to evaluate the subjective quality of a test image sequence. Through the use of the perceptual metric generator 112, the subjective quality of a test image relative to a reference sequence can be evaluated without a human observer.

Finally, the perceptual metric can be used to modify or control the parameters of the system under test via path 150. For example, the parameters of an encoder can be modified to produce an encoded image with an improved perceptual rating, e.g., less noticeable distortion when the encoded image is decoded. Furthermore, although the system under test 140 is illustrated as a separate device, those skilled in the art will realize that the system under test can be implemented as software residing in the memory 116 of the signal processor, e.g., a video encoding method.

FIG. 2 depicts a simplified block diagram of the perceptual metric generator 112. In the preferred embodiment, the perceptual metric generator comprises an input signal processor 210, a luminance processor 220, a color processor 230, a luminance metric generating section 240, a color metric generating section 250, and a perceptual metric generating section 260.

The input signal processor transforms the input signals 205 into psychophysically defined quantities, e.g., luminance components and color components for image signals. The input signals are two image sequences of arbitrary length. Although only one input signal is illustrated in FIG. 2, it should be understood that the input signal processor can process more than one input signal simultaneously. The purpose of the input signal processor 210 is to transform the input image signals into light outputs, and then to transform these light outputs into psychophysically defined quantities that separately characterize luminance and color.

More specifically, for each field of each input sequence there are three data sets, labeled Y', C'_b, and C'_r at the top of FIG. 2, derived from a D1 tape. In turn, the Y', C'_b, and C'_r data are transformed into the R', G', B' electron-gun voltages that give rise to the displayed pixel values. In the input signal processor, the R', G', B' voltages undergo further processing that transforms them into a luminance image and two color images, which are passed to the subsequent processing stages or sections.

The luminance processor 220 accepts two images (test and reference) of luminance Y, expressed as fractions of the maximum luminance of the display. The outputs from the luminance processor 220 can be broadly defined as spatial and/or temporal responses resulting from various temporal and spatial processing, as described below. These outputs are passed to the luminance metric generating section 240, which generates a luminance JND map. The JND map is an image whose gray level is proportional to the number of JNDs between the test and reference images at the corresponding pixel location.

Similarly, the color processor 230 processes the color components of the input signals into a color perceptual metric. Namely, the color processor 230 accepts two images (test and reference) of color, based on the CIE L*u*v* uniform-color space (this occurs for each of the color images u* and v*) and expressed as fractions of the maximum luminance of the display. Again, the outputs from the color processor 230 can be broadly defined as spatial and/or temporal responses resulting from various temporal and spatial processing, as described below. In turn, the outputs of the u* and v* processing are received and combined by the color metric generating section 250 to produce a color JND map.

Furthermore, both the color and the luminance processing are influenced by inputs from the luminance channel, called "masking", via path 225, which render perceived differences more or less visible depending on the structure of the luminance images. Masking (self or cross) generally refers to a reduction of sensitivity in the presence of information in a channel or in a neighboring channel.

The color, luminance, and combined luminance-color JND maps are each available as output to the perceptual metric generating section 260, together with a small number of summary measures derived from these maps. Whereas the single JND value (JND summary) output is useful for modeling an observer's overall rating of the distortions in a test sequence, the JND maps give a more detailed view of the location and severity of the artifacts. In turn, the perceptual metric generating section 260 (image metric generator) correlates the luminance perceptual metric with the color perceptual metric into a unified perceptual image metric 270, e.g., an overall just-noticeable-difference (JND) map.

It should be noted that two basic assumptions underlie the present invention. First, each pixel is assumed to be "square" and to subtend 0.03 degrees of viewing angle. This number is derived from a screen height of 480 pixels and a viewing distance of four screen heights (the closest viewing distance prescribed by the "Rec 500" standard). When the perceptual metric generator is compared with human perception at viewing distances longer than four screen heights, the perceptual metric generator may overestimate human sensitivity to spatial detail. In the absence of hard constraints on viewing distance, the perceptual metric generator is adjusted to be as sensitive as possible within the recommendations of "Rec 500". However, the sensitivity of the perceptual metric generator can be adjusted for a particular application.
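The 0.03 degree figure can be checked with elementary trigonometry. The following Python sketch is an illustration only; the function name `pixel_visual_angle_deg` is hypothetical, and the computation assumes a flat screen viewed on-axis with square pixels.

```python
import math


def pixel_visual_angle_deg(screen_height_px, viewing_distance_in_heights):
    """Angle subtended by one pixel, in degrees.

    Lengths are measured in units of one screen height, so a pixel is
    1 / screen_height_px tall and the viewer sits
    viewing_distance_in_heights away from the screen.
    """
    pixel_size = 1.0 / screen_height_px
    half_angle = math.atan(pixel_size / (2.0 * viewing_distance_in_heights))
    return math.degrees(2.0 * half_angle)


angle = pixel_visual_angle_deg(480, 4)
print(round(angle, 4))  # 0.0298, i.e. about 0.03 degrees
```

Doubling the viewing distance roughly halves this angle, which is why a generator tuned for four screen heights may overestimate sensitivity to spatial detail at longer distances.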

Second, the perceptual metric generator applies to screen luminances of 0.01 to 100 ft-L (for which overall sensitivity was calibrated), with greatest accuracy at about 20 ft-L (for which all spatiotemporal frequencies were calibrated). It is also assumed that a change in luminance produces a proportional change in sensitivity at all spatiotemporal frequencies; this assumption is less important near 20 ft-L, where additional calibration was performed. Calibration and experimental data are presented below.

The processing sections illustrated in FIG. 2 are now described in more detail with reference to FIGS. 3 to 7.

FIG. 3 illustrates a block diagram of the input signal processor 210. In the preferred embodiment, each input signal is processed in sets of four fields 305. Thus, the stack of four fields labeled Y', C'_b, C'_r at the top of FIG. 3 represents a set of four consecutive fields from either the test or the reference image sequence. However, the present invention is not limited to such an implementation, and other field grouping methods can be used.

Multiple transformations are included in the input signal processor 210. In brief, the input signal processor 210 transforms the Y', C'_b, C'_r video input signals first into electron-gun voltages, then into luminance values of the three phosphors, and finally into psychophysical variables that separate into luminance and color components. The tristimulus value Y, which is computed below, replaces the "model intensity value" used before color processing. In addition, the color components u* and v* are generated pixel by pixel according to the CIE uniform-color specifications.

It should be noted that the input signal processor 210 is optional if the input signal is already in an acceptable uniform-color space. For example, the input signal may have been previously processed into the proper format and saved on a storage device, e.g., a magnetic or optical drive and disk. Furthermore, although the present invention is implemented with pixels mapped into CIELUV, an international-standard uniform-color space, the invention can be adapted and implemented to process input signals mapped into other spaces.

The first processing stage 310 transforms the Y', C'_b, C'_r data into R', G', B' gun voltages. More specifically, the steps outlined below describe the transformation from Y', C'_b, C'_r image frames to the R', G', B' voltage signals that drive a CRT display. Here, the apostrophe (prime) indicates that the input signals have been gamma-precorrected at the encoder. Namely, these signals, after transformation, can drive a CRT display device whose voltage-current transfer function can be closely approximated by a gamma nonlinearity.

The input digital images are assumed to be in 4:2:2 format: full resolution in the luminance correlate Y', and half resolution horizontally in the color correlates C'_b and C'_r, where the Y', C'_b, C'_r data are assumed to be stored in the order specified in ANSI/SMPTE Standard 125M-1992, i.e.,

C'_b0, Y'_0, C'_r0, Y'_1, C'_b1, Y'_2, C'_r1, Y'_3, ..., C'_b(n/2-1), Y'_(n-2), C'_r(n/2-1), Y'_(n-1), ....
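As an illustration of this storage order, the following Python sketch separates one row of interleaved samples into its Y', C'_b, and C'_r components. It assumes the C'_b, Y', C'_r, Y' multiplexing order used for 4:2:2 component video; the function name `deinterleave_422` is hypothetical.

```python
def deinterleave_422(stream):
    """Split one row of Cb, Y, Cr, Y, ... samples into (Y', C'b, C'r) lists.

    Every group of four samples carries one C'b, one C'r and two Y' values,
    so chroma ends up at half the horizontal resolution of luma.
    """
    if len(stream) % 4 != 0:
        raise ValueError("expected a whole number of Cb-Y-Cr-Y groups")
    y = stream[1::2]    # luma sits at every odd stream position
    cb = stream[0::4]   # C'b opens each group of four
    cr = stream[2::4]   # C'r is the third sample of each group
    return y, cb, cr


row = ["Cb0", "Y0", "Cr0", "Y1", "Cb1", "Y2", "Cr1", "Y3"]
y, cb, cr = deinterleave_422(row)
print(y)   # ['Y0', 'Y1', 'Y2', 'Y3']
print(cb)  # ['Cb0', 'Cb1']
print(cr)  # ['Cr0', 'Cr1']
```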

In the steps enumerated below, there are two embodiments or alternatives for color upsampling and three embodiments or alternatives for the matrix conversion from Y'C'_bC'_r to R'G'B'. These alternatives cover various common requirements, e.g., decoding requirements that may be encountered in various applications.

More specifically, in the first color upsampling embodiment, the Y', C'_b, C'_r arrays from a single frame are expanded to the full resolution of the Y' image. The C'_b and C'_r arrays are initially at half resolution horizontally, and are then upsampled to create full-resolution fields. Namely, the alternating C'_b, C'_r pixels in a row are assigned to the even-numbered Y'_i in the data stream. The C'_b, C'_r pair associated with each odd-numbered Y'_i is then computed either (i) by replication or (ii) by averaging with its neighbors.
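The two upsampling options can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function name `upsample_chroma_row`, the choice of the left neighbor for replication, and the replication used at the right edge in averaging mode are all assumptions not stated in the text.

```python
def upsample_chroma_row(chroma, mode="replicate"):
    """Expand a half-resolution chroma row to full horizontal resolution.

    Input sample k is assigned to even luma position 2k. The odd position
    2k + 1 is filled either by copying sample k ("replicate") or by
    averaging samples k and k + 1 ("average").
    """
    if mode not in ("replicate", "average"):
        raise ValueError("mode must be 'replicate' or 'average'")
    full = []
    for k, value in enumerate(chroma):
        full.append(value)
        if mode == "average" and k + 1 < len(chroma):
            full.append((value + chroma[k + 1]) / 2)
        else:
            # Replication; also the assumed fallback at the right edge.
            full.append(value)
    return full


print(upsample_chroma_row([10, 20, 40]))             # [10, 10, 20, 20, 40, 40]
print(upsample_chroma_row([10, 20, 40], "average"))  # [10, 15.0, 20, 30.0, 40, 40]
```

Replication is cheaper, while averaging gives smoother color transitions; either way the result has one chroma pair per luma sample.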

In the second color upsampling embodiment, the full-resolution Y', C'_b, C'_r arrays are parsed into two fields. In the case of Y', the first field contains the odd lines of the Y' array and the second field contains the even lines of the Y' array. Identical processing is performed on the C'_b and C'_r arrays to produce the first and second C'_b and C'_r fields.
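A minimal sketch of this field parsing is given below. It assumes that lines are numbered from 1, so that the "odd lines" are the rows at indices 0, 2, 4, ...; the function name `split_into_fields` is hypothetical.

```python
def split_into_fields(frame):
    """Parse a frame (a list of rows) into two interlaced fields.

    With lines numbered from 1, the first field holds lines 1, 3, 5, ...
    (row indices 0, 2, 4, ...) and the second field holds lines 2, 4, 6, ...
    """
    first_field = frame[0::2]
    second_field = frame[1::2]
    return first_field, second_field


# The same call is applied to the Y', C'b and C'r arrays in turn.
y_frame = [["line1"], ["line2"], ["line3"], ["line4"]]
first, second = split_into_fields(y_frame)
print(first)   # [['line1'], ['line3']]
print(second)  # [['line2'], ['line4']]
```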

In the matrix conversion from Y'C'_bC'_r to R'G'B', the corresponding Y', C'_b, C'_r values are converted to the gun input values R', G', B' for each pixel in each of the two fields. The Y', C'_b, C'_r values are taken to be related to the R', G', B' values by one of the following three alternative equations. The first two equations can be found in Video Demystified, by Keith Jack, HighText, San Diego, 1996 (Ch. 3, pp. 40-42). Equation (3) corresponds to Equation 9.9 in A Technical Introduction to Digital Video, by C. A. Poynton, Wiley, 1996 (with C_b substituted for U and C_r substituted for V). In the preferred embodiment, equation (2) is selected as the default and should be used unless measurement of the display of interest indicates otherwise.
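For illustration only, the following Python sketch shows the general shape of such a matrix conversion. The coefficients are the widely used studio-range ITU-R BT.601 values (Y' in 16-235, C'_b and C'_r centered on 128); they are an assumption for the purpose of the example and are not asserted to be identical to any of equations (1) to (3). The function name `ycbcr_to_rgb` and the clamping to 0-255 are likewise hypothetical.

```python
def ycbcr_to_rgb(y, cb, cr):
    """Convert one studio-range Y'CbCr pixel to R'G'B' on a 0-255 scale.

    Coefficients are the common BT.601 studio-range values, used here
    only as an assumed example of a Y'CbCr-to-R'G'B' matrix.
    """
    def clamp(v):
        return max(0.0, min(255.0, v))

    luma = 1.164 * (y - 16)
    r = luma + 1.596 * (cr - 128)
    g = luma - 0.813 * (cr - 128) - 0.392 * (cb - 128)
    b = luma + 2.017 * (cb - 128)
    return clamp(r), clamp(g), clamp(b)


black = ycbcr_to_rgb(16, 128, 128)   # neutral chroma, lowest luma code
white = ycbcr_to_rgb(235, 128, 128)  # neutral chroma, highest luma code
print(black)
print(white)
```

With neutral chroma (C'_b = C'_r = 128) the three outputs are equal, which is a quick sanity check for any matrix of this kind.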

The R', G', and B' arrays are then received by the second processing stage 320 of the input signal processor 210. The second processing stage 320 applies a point nonlinearity to each R', G', B' image. This second processing stage models the transfer of the R', G', B' gun voltages into display intensities (R, G, B), expressed as fractions of maximum luminance. The nonlinearity also performs the clipping at low luminances that the display applies in each plane.

Specifically, the conversion between (R', G', B') and (R, G, B) has two parts: one transforms each pixel value independently, and the other performs spatial filtering on the transformed pixel values. The two parts are described below.

Pixel-value transformation

First, the fraction of maximum luminance R corresponding to the input R' is computed for each pixel. Similarly, the fractional luminances G and B are computed from the inputs G' and B'. The maximum luminance from each gun is assumed to correspond to the input value 255. The following equations describe the transformation from (R', G', B') to (R, G, B).

The default threshold value t_d is chosen to be 16, corresponding to the black level of the display, and γ defaults to 2.5.

The value 16 is chosen for t_d so that the display has a dynamic range of (255/16)^2.5, which is approximately 1000:1. This dynamic range is relatively large and may be unnecessary where the ambient illumination is about 1% of the maximum display white. The physical fidelity can therefore be maintained even if the perceptual metric generator adopts the value 40 as the black level instead of 16, which still provides a dynamic range of 100:1. In fact, a smaller dynamic range saves computation, e.g., one or two bits in the processing.
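The dynamic-range figures quoted above can be checked with a short calculation. The sketch below assumes the pixel-value transformation has the form R = (max(R', t_d)/255)^γ, which is consistent with the threshold, the clipping at low luminances, and the (255/16)^2.5 ratio described in the text; the equation itself is not reproduced here, and the function names are illustrative only.

```python
def gun_to_fraction(v, t_d=16.0, gamma=2.5):
    """Map an 8-bit gun input value v (0..255) to a fraction of maximum luminance.

    Assumed form: clip v from below at the black-level threshold t_d,
    normalize by 255, then raise to the power gamma.
    """
    return (max(v, t_d) / 255.0) ** gamma


def dynamic_range(t_d, gamma=2.5):
    """Ratio of the brightest to the darkest displayable luminance for threshold t_d."""
    return gun_to_fraction(255, t_d, gamma) / gun_to_fraction(0, t_d, gamma)


if __name__ == "__main__":
    print(dynamic_range(16))  # roughly 1000:1
    print(dynamic_range(40))  # roughly 100:1
```

Under this assumed form, every input at or below t_d maps to the same black-level luminance, which is the clipping behaviour the text attributes to the display.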

Two observations about the display are discussed below. The first concerns the dependence on absolute screen luminance. The predictions of the perceptual metric generator implicitly apply only to the luminance levels at which the perceptual metric generator was calibrated.

For typical calibration data (J. J. Koenderink and A. J. van Doorn, "Spatiotemporal contrast detection threshold surface is bimodal," Optics Letters 4, 32-34 (1979)), the retinal illuminance was 200 trolands with a default pupil diameter of 2 mm. This implies a screen luminance of 63.66 cd/m², or 18.58 ft-L. The calibration luminance is comparable to the luminances of the displays used in the subjective rating tests. For example, although the maximum-white luminances in two experiments were 71 and 97 ft-L, the luminances at pixel value 128 were 15 and 21 ft-L, respectively. Taking these values into account, together with the fact that the overall sensitivity of the perceptual metric generator was calibrated from 0.01 to 100 ft-L, it can be concluded that the perceptual metric generator applies to screen luminances from about 20 to 100 ft-L.
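The conversion from retinal illuminance to screen luminance quoted above follows from the definition of the troland (luminance in cd/m² multiplied by pupil area in mm²). A minimal sketch, with illustrative function names:

```python
import math

CD_PER_M2_PER_FT_L = 3.4262591  # one foot-lambert expressed in cd/m^2


def screen_luminance_cd_m2(trolands, pupil_diameter_mm):
    """Luminance (cd/m^2) that produces the given retinal illuminance.

    Retinal illuminance in trolands = luminance (cd/m^2) * pupil area (mm^2).
    """
    pupil_area_mm2 = math.pi * (pupil_diameter_mm / 2.0) ** 2
    return trolands / pupil_area_mm2


def cd_m2_to_ft_l(cd_m2):
    """Convert cd/m^2 to foot-lamberts."""
    return cd_m2 / CD_PER_M2_PER_FT_L


if __name__ == "__main__":
    lum = screen_luminance_cd_m2(200.0, 2.0)
    print(round(lum, 2), round(cd_m2_to_ft_l(lum), 2))
```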

The second observation concerns the relationship of equation (4) to other models. An offset voltage t_d (e.g., from a grid setting between the cathode and the TV screen) can be used to transform equation (4) into the model proposed by Poynton, R = k[R' + b]^γ, and similarly for G and B (C. A. Poynton, "'Gamma' and its disguises: The nonlinear mappings of intensity in perception, CRTs, film, and video," SMPTE Journal, December 1993, pp. 1099-1108). Poynton's model is obtained by defining a new voltage R'' = R' - t_d, so that R = k[R'' + t_d]^γ, and similarly for G and B. Writing equation (4) rather than Poynton's equation amounts to assuming that the offset voltage is -t_d. It is also assumed that there is no ambient illumination.

In the presence of ambient illumination, the voltage offset becomes negligible, and equation (4) is approximately equivalent to the model proposed by Meyer, namely R = kR'^γ + c ("The importance of gun balancing in monitor calibration," in Perceiving, Measuring, and Using Color (M. Brill, ed.), Proc. SPIE, Vol. 1250, pp. 69-79 (1990)). Similar expressions hold for G and B. If ambient illumination is present, equation (4) is replaced by Meyer's model with k = (1/255)^γ + c and c = 0.01.

The present perceptual metric generator provides three options for specifying the vertical representation of the (R, G, B) images, for each frame (in progressively scanned images) and for the odd and even fields.

Option 1. Frame

Images are full-height and contain one progressively scanned image.

Option 2. Full-height interlace

Half-height images are interspersed with blank lines to become full-height, as they are on an interlaced screen. The blank lines are subsequently filled by interpolation, as described below.

Option 3. Half-height interlace

Half-height images are processed directly.

The first two options are more faithful to the video image structure, whereas the third option has the advantage of reducing processing time and memory requirements by 50%. Because options 1 and 2 both operate on full-height images, luminance and chrominance processing are identical for the two. The three options are described in detail below.

Spatial pre-filtering

Options 1 and 3 above require no spatial pre-processing. The full-height interlace option 2, however, involves spatial pre-filtering.

To accommodate the spread of light from a line to the inter-line pixels in a field, the R, G, and B field images are also subjected to a line interpolation process. Four interpolation methods are described below, but the present invention is not limited to these methods. In each method, an entire frame is read, and each pixel on the lines belonging to the inactive field is then replaced with a value computed from the pixels immediately above and below. For methods (3) and (4), the computation also uses pixel values from the inactive field.

Let P_inactive denote an inactive-line pixel to be interpolated, and let P_above and P_below denote the active-line pixels above and below P_inactive, respectively. The four methods are:

Method (1), the average, is the default.
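As an illustration of the default method only, the sketch below replaces each inactive line with the average of the active lines directly above and below it. The definitions of the four methods are not reproduced in this text, so the function name, the parity argument, and the handling of the top and bottom edges (copying the single available neighbour) are assumptions.

```python
def fill_inactive_lines(frame, active_parity):
    """Return a copy of frame with inactive lines replaced by line averages.

    frame: list of rows (lists of floats), full height, at least two rows.
    active_parity: 0 if even-indexed rows are active, 1 if odd-indexed rows are.
    An inactive row at the top or bottom edge has only one active neighbour
    and copies it (an assumption; the text does not specify edge handling).
    """
    height = len(frame)
    out = [row[:] for row in frame]
    for y in range(height):
        if y % 2 == active_parity:
            continue  # active line: keep unchanged
        above = frame[y - 1] if y - 1 >= 0 else None
        below = frame[y + 1] if y + 1 < height else None
        if above is None:
            above = below
        if below is None:
            below = above
        # P_inactive = (P_above + P_below) / 2
        out[y] = [(a + b) / 2.0 for a, b in zip(above, below)]
    return out
```

For example, with even rows active, a frame whose rows are `[0, 0]`, `[9, 9]`, `[4, 4]`, `[9, 9]` has its second row replaced by `[2.0, 2.0]` and its last row by a copy of the third.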

Returning to FIG. 3, following the nonlinearity, the third processing stage 330 models the vertical spread of the electron-beam spot into inter-line locations by replacing the inter-line values in the R, G, B fields with values interpolated from above and below. The vector (R, G, B) at each pixel in the field is then subjected to a linear transformation (which depends on the display phosphors) to CIE 1931 tristimulus coordinates (X, Y, Z). The luminance component Y of this vector is passed to the luminance processing section 220, as mentioned above.

Specifically, the CIE 1931 tristimulus values X, Y, and Z are computed for each pixel from the fractional luminance values R, G, B. This process requires the following display-dependent inputs: the chromaticity coordinates (x_r, y_r), (x_g, y_g), (x_b, y_b) of the three phosphors, and the chromaticity (x_w, y_w) of the monitor white point.

The white point is chosen to correspond to Illuminant D65, so that (x_w, y_w) = (0.3128, 0.3292) (see G. Wyszecki and W. S. Stiles, Color Science, 2nd ed., Wiley, 1982, p. 761). The values (x_r, y_r) = (0.6245, 0.3581), (x_g, y_g) = (0.2032, 0.716), and (x_b, y_b) = (0.1465, 0.0549) for the red, green, and blue phosphors correspond to currently available phosphors that closely approximate NTSC phosphors. Table 1 below lists other options for the display phosphor coordinates (phosphor primary chromaticities). ITU-R BT.709 is the default.

Table 1. Display phosphor coordinate options

Using the above parameter values, the X, Y, Z values of a pixel are given by the following equations:

where z_r = (1 - x_r - y_r), z_g = (1 - x_g - y_g), z_b = (1 - x_b - y_b), and Y_0r, Y_0g, Y_0b are given by the following equation:

where z_w = (1 - x_w - y_w) (see D. Post, "Colorimetric measurement, calibration, and characterization of self-luminous displays," in Color in Electronic Displays, H. Widdel and D. L. Post (eds.), Plenum Press, 1992, p. 306).

The tristimulus values of the device's white point are also needed. These values correspond to the chromaticity (x_w, y_w) and are such that Y = 1 at full phosphor activation (R' = G' = B' = 255). The tristimulus values of the white point are therefore (X_n, Y_n, Z_n) = (x_w/y_w, 1, z_w/y_w).
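One way to see how the phosphor chromaticities and the white point together fix the RGB-to-XYZ transformation is sketched below. It uses the standard colorimetric construction, assumed here rather than taken from the equations of this document (which are not reproduced in the text): each phosphor contributes (x/y, 1, z/y) scaled by an unknown luminance Y_0, and the three Y_0 values are solved so that R = G = B = 1 yields the white point with Y = 1. All function and constant names are illustrative.

```python
RED, GREEN, BLUE = (0.6245, 0.3581), (0.2032, 0.716), (0.1465, 0.0549)
WHITE_D65 = (0.3128, 0.3292)


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def phosphor_luminances(red, green, blue, white):
    """Solve for (Y0r, Y0g, Y0b) so that R=G=B=1 gives the white point with Y=1."""
    cols = [(x / y, 1.0, (1.0 - x - y) / y) for x, y in (red, green, blue)]
    xw, yw = white
    rhs = (xw / yw, 1.0, (1.0 - xw - yw) / yw)
    a = [[cols[c][r] for c in range(3)] for r in range(3)]
    d = det3(a)
    y0 = []
    for c in range(3):  # Cramer's rule, one unknown per phosphor
        m = [row[:] for row in a]
        for r in range(3):
            m[r][c] = rhs[r]
        y0.append(det3(m) / d)
    return tuple(y0)


def rgb_to_xyz(rgb, red=RED, green=GREEN, blue=BLUE, white=WHITE_D65):
    """Linear map from fractional luminances (R, G, B) to (X, Y, Z)."""
    y0 = phosphor_luminances(red, green, blue, white)
    X = Y = Z = 0.0
    for value, (x, y), y0p in zip(rgb, (red, green, blue), y0):
        X += (x / y) * y0p * value
        Y += y0p * value
        Z += ((1.0 - x - y) / y) * y0p * value
    return X, Y, Z
```

By construction, `rgb_to_xyz((1, 1, 1))` returns (x_w/y_w, 1, z_w/y_w), which is the white-point triple (X_n, Y_n, Z_n) stated above.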

As an optional final step in deriving the X, Y, Z values, an adjustment can be made to accommodate an assumed ambient light due to veiling reflection from the display screen. This adjustment takes the following form:

Here, two user-specifiable parameters, L_max and L_n, are introduced and assigned default values. The maximum display luminance, L_max, is set to 100 cd/m² to correspond to commercial displays. The veiling luminance, L_n, is set to 5 cd/m², consistent with screen values measured under Rec. 500 conditions.

The chromaticity of the ambient light is assumed to be the same as that of the display white point. Note that in the luminance-only implementation option, which does not compute the neutral point (X_n, Y_n, Z_n), the following adjustment is made instead of equation (6a):

Because Y_n is always 1, this adjustment is identical to the Y component of equation (6a). Note also that the quantity L_max*Y is the luminance of the display in cd/m².

Returning again to FIG. 3, to ensure (at each pixel) approximate perceptual uniformity of the color space with respect to isoluminant color differences, the individual pixels are mapped in the fourth processing stage 340 to CIELUV, an international-standard uniform color space. The chrominance components u* and v* of this space are passed to the chrominance processing section 230.

Specifically, the X, Y, Z values are transformed, pixel by pixel, to the 1976 CIELUV uniform color system (Wyszecki and Stiles, 1982, p. 165):

where

Note that the coordinate L* is not passed to the luminance processing section 220. L* is used only in computing the chrominance coordinates u* and v*. Consequently, only the u* and v* images are saved for further processing.
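For reference, the standard CIE 1976 L*u*v* definitions can be sketched as follows. The equations of this document are not reproduced in the text, so the sketch uses the textbook form (including the usual linear branch for very dark pixels), which may differ in detail from the original; the function name is illustrative.

```python
def xyz_to_luv(X, Y, Z, white):
    """CIE 1976 (L*, u*, v*) of (X, Y, Z) relative to the white point (Xn, Yn, Zn)."""
    Xn, Yn, Zn = white

    def uv_prime(x, y, z):
        # u' = 4X / (X + 15Y + 3Z), v' = 9Y / (X + 15Y + 3Z)
        denom = x + 15.0 * y + 3.0 * z
        if denom == 0.0:
            return 0.0, 0.0  # black pixel: chromaticity undefined, use 0
        return 4.0 * x / denom, 9.0 * y / denom

    ratio = Y / Yn
    if ratio > 0.008856:
        L = 116.0 * ratio ** (1.0 / 3.0) - 16.0
    else:
        L = 903.3 * ratio  # linear segment near black
    u, v = uv_prime(X, Y, Z)
    un, vn = uv_prime(Xn, Yn, Zn)
    # L* only scales the chrominance coordinates; u* and v* are what is kept.
    return L, 13.0 * L * (u - un), 13.0 * L * (v - vn)
```

With this definition the white point itself maps to L* = 100 and u* = v* = 0, so u* and v* measure chromatic departure from the display white.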

FIG. 4 is a block diagram of the luminance processing section 220. FIG. 4 can be read either as a flowchart of the luminance processing steps or as a block diagram of a plurality of hardware components that perform those steps, e.g., filters, various circuit components, and/or application-specific integrated circuits (ASICs).

Referring to FIG. 4, each luminance field is filtered and down-sampled in a four-level Gaussian pyramid 410 (pyramid generator) in order to model the psychophysically and physiologically observed decomposition of incoming visual signals into different spatial-frequency bands 412-418. After the decomposition, subsequent optional processing, e.g., adaptive filtering, can be applied at each pyramid level (resolution level).

Next, a nonlinear operation 430 is performed immediately after the pyramid decomposition. This stage is a gain-setting operation (normalization) based on a time-dependent windowed average (across fields) of the maximum luminance within the coarsest pyramid level.

Note that an intermediate normalization process 420 is performed to derive an intermediate value I_norm. As described below, the I_norm value is used to scale each of the four pyramid levels.

After normalization, the lowest-resolution pyramid image 418 is subjected to temporal filtering (temporal filter) and contrast computation 450, and the other three levels 412-416 are subjected to spatial filtering and contrast computation 440. In each case, the contrast is a local difference of pixel values divided by a local sum, appropriately scaled. In the formulation of the perceptual metric generator, this established the definition of "1 JND", which is passed on to the subsequent stages of the perceptual metric generator. (As described below, calibration iteratively revises the 1-JND interpretation at intermediate stages of the perceptual metric generator.) In each case, the contrast is squared to produce what is known as the contrast energy. The algebraic sign of the contrast is preserved so that it can be reattached just before image comparison (JND map computation).

The next stages 460 and 470 (contrast-energy masking) constitute a gain-setting operation in which each oriented response (contrast energy) is divided by a function of all the contrast energies. This combined attenuation of each response by other local responses is included to model visual "masking" effects, such as the decrease in sensitivity to distortions in "busy" image areas. At this stage of the perceptual metric generator, temporal structure (flicker) is also made to mask temporal differences. As described below, luminance masking is also applied on the chrominance side.

The masked contrast energies (together with the contrast signs) are used to produce the luminance JND map 408. In brief, the luminance JND map is produced by: 1) separating each image into positive and negative components (half-wave rectification); 2) performing local pooling (averaging and down-sampling, to model the local spatial summation observed in psychophysical experiments); 3) evaluating the absolute image differences channel by channel; 4) thresholding (coring); 5) raising the cored image differences to a power; and 6) up-sampling to the same resolution (which, because of the pooling stage, is half the resolution of the original image).

FIG. 19 is a block diagram of an alternate embodiment of the luminance processing section 220. Specifically, the normalization stages 420 and 430 of FIG. 4 are replaced with a luminance compression stage 1900. In brief, each luminance value in the input signal is first subjected to a compressive nonlinearity, which is described in detail below. The other stages of FIG. 19 are similar to those of FIG. 4 and have been described above. For the dissimilar stages, a detailed description of the luminance processing section of FIG. 19 is given below with reference to FIG. 20.

In general, the luminance processing section of FIG. 19 is the preferred embodiment. However, because the two embodiments have different characteristics, their performance may differ between applications. For example, the luminance processing section of FIG. 4 performs well at higher dynamic ranges, e.g., 10-bit input images, compared with lower dynamic ranges.

FIG. 5 is a block diagram of the chrominance processing section 230. FIG. 5 can be read either as a flowchart of the chrominance processing steps or as a block diagram of a plurality of hardware components that perform those steps, e.g., filters, various circuit components, and/or application-specific integrated circuits (ASICs). Chrominance processing parallels luminance processing in several respects. Intra-image differences of chrominance (u* 502 and v* 504) in the CIELUV space are used to define detection thresholds, in analogy to the luminance processing section. Also in analogy with the luminance operation, the chromatic "contrasts" defined by the u* and v* differences are subjected to a masking step. A transducer nonlinearity makes the discrimination of a contrast increment between one image and another depend on the contrast energy common to both images.

Specifically, FIG. 5 shows each chrominance component (u* 502 and v* 504) being subjected to pyramid decomposition 510, as in the luminance processing section. However, whereas luminance processing uses a four-level pyramid decomposition in the preferred embodiment, chrominance processing is implemented with seven levels. This implementation addresses the empirical fact that chromatic channels are sensitive to far lower spatial frequencies than luminance channels (K. T. Mullen, "The contrast sensitivity of human colour vision to red-green and blue-yellow chromatic gratings," J. Physiol. 359, 381-400, 1985). Such a decomposition also takes into account the intuitive fact that color differences can be observed in large, uniform regions.

Next, to reflect the inherent insensitivity of the chrominance channels to flicker, temporal processing 520 is accomplished by averaging over four image fields.

Then, spatial filtering by a Laplacian kernel 530 is performed on u* and v*. This operation produces a color difference in u*, v* that is (by definition of the uniform color space) metrically connected to just-noticeable differences (JNDs). A value of 1 at this stage is taken to mean that a single JND has been reached, in analogy to the role of Weber's-law-based contrast in the luminance channel. (As in luminance processing, the 1-JND chrominance unit must undergo reinterpretation during calibration.)

This color difference value is weighted, squared, and passed (with the algebraic sign of the contrast) to the contrast-energy masking stage 540. The masking stage performs the same function as in the luminance processing section. The operation is somewhat simpler, because it receives input only from the luminance channels and from the chrominance channel whose difference is being evaluated. Finally, the masked contrast energies are processed exactly as in the luminance processing section to produce a chrominance JND map in stage 550.

For each field in the video-sequence comparison, the luminance and chrominance JND maps are first reduced to single-number summaries, namely luminance and chrominance JND values. In each case, the reduction from map to number is done by summing all pixel values through a Minkowski addition. The luminance and chrominance JND numbers are then combined, again through a Minkowski addition, to produce the JND estimate for the field being processed by the perceptual metric generating section 260. A single performance measure 270 for many fields of a video sequence is determined by adding, in the Minkowski sense, the JND estimates of the individual fields.
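The three-stage reduction described above (map to number, luminance with chrominance, field to sequence) can be sketched with a single Minkowski helper. The exponents are not given in this text, so the default value 4.0 below is a purely hypothetical placeholder, and the plain (unnormalized) form of the sum is likewise an assumption.

```python
def minkowski_sum(values, q):
    """Minkowski addition of non-negative values: (sum(v**q)) ** (1/q)."""
    return sum(abs(v) ** q for v in values) ** (1.0 / q)


def field_jnd(luma_map, chroma_map, q_map=4.0, q_combine=4.0):
    """Reduce each JND map to one number, then combine the two numbers.

    q_map and q_combine are hypothetical exponents, not values from the text.
    """
    luma = minkowski_sum((p for row in luma_map for p in row), q_map)
    chroma = minkowski_sum((p for row in chroma_map for p in row), q_map)
    return minkowski_sum([luma, chroma], q_combine)


def sequence_jnd(field_jnds, q_fields=4.0):
    """Combine per-field JND estimates into a single performance measure."""
    return minkowski_sum(field_jnds, q_fields)
```

With q = 1 the helper reduces to an ordinary sum, and as q grows the result approaches the maximum value, so the exponent controls how strongly the largest local differences dominate the summary.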

FIG. 6 is a detailed block diagram of the luminance processing section 220 of FIG. 4. The input test and reference field images are denoted by I_k and I_k^ref, respectively (k = 0, 1, 2, 3). Pixel values in I_k and I_k^ref are denoted by I_k(i,j) and I_k^ref(i,j), respectively. These values are the Y tristimulus values 605 computed in the input signal processing section 210. Only the I_k fields are discussed below, since the processing of I_k^ref is identical. Note that k = 3 denotes the most recent field in a four-field sequence.

Spatial decomposition at four resolution levels is accomplished through a computationally efficient method called pyramid processing or pyramid decomposition, which smears and down-samples the image by a factor of 2 at each successively coarser level of resolution. The original, full-resolution image is called the zeroth level (level 0) of the pyramid, G_0 = I_3(i,j). Subsequent levels, at lower resolutions, are obtained by an operation called REDUCE. Namely, a three-tap low-pass filter 610 with weights (1,2,1)/4 is applied to G_0 sequentially in each direction of the image to generate a blurred image. The resulting image is then subsampled by a factor of 2 (every other pixel is removed) to create the next level, G_1.

Denoting by fds1() the operation of filtering and down-sampling by one pyramid level, the REDUCE process can be expressed as equation (14).

The REDUCE process is applied recursively to each new level (P. J. Burt and E. H. Adelson, "The Laplacian pyramid as a compact image code," IEEE Transactions on Communications, COM-31, 532-540 (1983)).

Conversely, an operation EXPAND is defined that up-samples and filters with the same 3x3 kernel. This operation is denoted by usf1() and is shown below.

The fds1() and usf1() filter kernels in each direction (horizontal and vertical) are k_d[1,2,1] and k_u[1,2,1], respectively, where the constants k_d and k_u are chosen so that uniform-field values are conserved. For fds1 the constant is k_d = 0.25, and for usf1 it is k_u = 0.5 (because of the zeros in the up-sampled image). To implement usf1 as an in-place operation, the kernel is replaced by the equivalent linear interpolation that fills in the zero values. For conceptual simplicity, however, it can still be referred to as an "upsample-filter."
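The REDUCE and EXPAND operations described above can be sketched in plain Python on images stored as lists of rows. The border treatment (mirror reflection about the edge pixel) and the choice to keep even-indexed pixels when subsampling are assumptions; the text does not specify either. The helper names other than fds1 and usf1 are illustrative.

```python
def _reflect(i, n):
    """Mirror an out-of-range index back into 0..n-1 (requires n >= 2)."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * n - 2 - i
    return i


def _filter_rows(img, k):
    """Apply the 3-tap kernel k*[1, 2, 1] along each row."""
    out = []
    for row in img:
        n = len(row)
        out.append([k * (row[_reflect(x - 1, n)] + 2.0 * row[x] + row[_reflect(x + 1, n)])
                    for x in range(n)])
    return out


def _transpose(img):
    return [list(col) for col in zip(*img)]


def _filter_2d(img, k):
    """Separable filtering: rows first, then columns."""
    return _transpose(_filter_rows(_transpose(_filter_rows(img, k)), k))


def fds1(img):
    """REDUCE: blur with (1, 2, 1)/4 in each direction, keep every other pixel."""
    blurred = _filter_2d(img, 0.25)
    return [row[::2] for row in blurred[::2]]


def usf1(img):
    """EXPAND: insert zeros between samples, then filter with 0.5*(1, 2, 1)."""
    h, w = len(img), len(img[0])
    up = [[0.0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            up[2 * y][2 * x] = img[y][x]
    return _filter_2d(up, 0.5)


def gaussian_pyramid(img, levels=4):
    """Levels 0..levels-1, each obtained from the previous one by REDUCE."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(fds1(pyramid[-1]))
    return pyramid
```

With these constants a uniform field passes through both fds1 and usf1 unchanged, which is the conservation property that motivates k_d = 0.25 and k_u = 0.5.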

Next, normalization is applied. An intermediate value (denoted by I_lvl3) is computed by averaging four values, namely the maximum pixel values at level 3 for each field (k = 0, 1, 2, 3). Because of the smoothing inherent in the pyramid decomposition, this step mitigates the effect of outliers in the full-resolution (level 0) image. I_lvl3 is then compared with a decremented value of the normalization factor I_norm used in the previous epoch (k = 2). I_norm for the current epoch (k = 3) is set equal to the larger of these two values. The images at all four pyramid levels of the latest field are then rescaled using this new value of I_norm and subjected to a saturating nonlinearity.

The following equations describe this process. If the pyramid levels from above are I_{3,l}(i,j), where 3 and l denote the latest field and the pyramid level, respectively, then

where I_norm = max[αI'_norm, I_lvl3] (615), I'_norm is the value of I_norm used in the previous epoch to normalize the field-3 pyramid levels, m defaults to 2, and α is a decay factor that depends on Δt, the reciprocal of the field frequency, and on t_half = 1/2, which is related to the adaptation rate of human vision following the removal of a bright stimulus. The values of α for 50 and 60 Hz are 0.9727 and 0.9772, respectively. The constant L_D represents a residual visual response (noise) that exists in the absence of light, and defaults to 0.01. The saturating nonlinearity in equation (14) is derived from physiologically based models (see Shapley and Enroth-Cugell, 1984).
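The update of the normalization factor can be sketched as follows. The decay formula is not reproduced in this text; the half-life form α = 2^(-Δt/t_half) used below is an assumption, chosen because with t_half = 1/2 second it reproduces the values quoted for 50 and 60 Hz. Function names are illustrative, and the saturating nonlinearity itself is not sketched because its equation is not available here.

```python
def decay_factor(field_rate_hz, t_half=0.5):
    """Per-field decay: the stored normalization halves every t_half seconds.

    Assumed form alpha = 2 ** (-dt / t_half), with dt = 1 / field rate.
    """
    dt = 1.0 / field_rate_hz
    return 2.0 ** (-dt / t_half)


def update_i_norm(prev_i_norm, level3_maxima, field_rate_hz):
    """New I_norm = max(alpha * previous I_norm, I_lvl3).

    level3_maxima: the maximum level-3 pixel value of each of the four fields;
    their average is the intermediate value I_lvl3.
    """
    i_lvl3 = sum(level3_maxima) / len(level3_maxima)
    return max(decay_factor(field_rate_hz) * prev_i_norm, i_lvl3)


if __name__ == "__main__":
    print(round(decay_factor(50), 4), round(decay_factor(60), 4))
```

After a bright scene is replaced by a dark one, the decayed previous value dominates the maximum, so I_norm falls gradually rather than abruptly.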

Oriented spatial filters (center and surround) are applied to the level 0, 1, and 2 images for field 3. In contrast, a temporal filter is applied to the lowest-resolution level (level 3): the first and last pairs of fields are linearly combined into Early and Late images, respectively.

The center and surround filters 625 and 627 are separable 3x3 filters that yield all combinations of orientation: center vertical (CV), center horizontal (CH), surround vertical (SV), and surround horizontal (SH).

The filter kernels are as follows:

The level-3 Early (630) and Late (632) images are, respectively: [equations not reproduced in the source text]

The constants t_e and t_l are 0.5161 and 0.4848, respectively, for 60 Hz, and 0.70 and 0.30, respectively, for 50 Hz. The inputs to the contrast computation are the center and surround images (CV_i, CH_i, SV_i, and SH_i; i = 0, 1, 2 for pyramid levels 0, 1, and 2) and the Early and Late images (E₃ and L₃; for pyramid level 3). The equation used to compute the contrast ratio is analogous to the Michelson contrast. For the horizontal and vertical orientations, the respective pixel-by-pixel contrasts are: [equations not reproduced in the source text]

Similarly, the contrast ratio for the temporal component is: [equation not reproduced in the source text]

The values of w_i for i = 0, 1, 2, 3, as determined by calibration against psychophysical test data, are 0.015, 0.0022, 0.0015, and 0.003, respectively.

The horizontal and vertical contrast-energy images 640 and 642 are computed by squaring the pixel values defined by the two preceding equations, yielding equation (21) below.

Similarly, the temporal contrast-energy image 650 is computed by squaring the pixel values:

The algebraic sign of each contrast-ratio pixel value prior to squaring is retained for later use.
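As a minimal sketch of the squaring step, assuming a contrast image stored as a list of rows, the energy and the retained sign can be kept as two parallel arrays. The function name and the choice to treat a zero contrast as positive are illustrative assumptions, not taken from the text.

```python
def contrast_energy(contrast):
    """Square each contrast value and keep its original sign separately.

    Returns (energy, sign); sign is +1 or -1 per pixel, with zero
    contrast treated as +1 (an assumption made for this sketch).
    """
    energy = [[c * c for c in row] for row in contrast]
    sign = [[-1 if c < 0 else 1 for c in row] for row in contrast]
    return energy, sign


energy, sign = contrast_energy([[0.5, -0.2]])
```

Keeping the sign apart from the energy is what later allows positive and negative contrasts to be split into separate half-wave-rectified images.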

Contrast-energy masking is a nonlinear function applied to each of the contrast energies, or images, computed with equations (21) and (22). The masking operation models the effect of spatio-temporal structure in the reference image sequence on the discrimination of distortion in the test image sequence.

For example, suppose a test image and a reference image differ by a low-amplitude spatial sine wave. This difference is known to be more visible when both images share a mid-contrast sine wave of the same spatial frequency than when both images contain a uniform field (Nachmias and Sansbury, 1974). However, if the contrast of the common sine wave is too great, the image difference becomes less visible. Sine waves of other spatial frequencies can also affect the visibility of the contrast difference. This behavior can be modeled by a nonlinearity that is sigmoid at low contrast energies and an increasing power function at high contrast energies. Furthermore, the following properties are approximately observed in human vision: each channel masks itself, high spatial frequencies mask low ones (but not the reverse), and temporal flicker masks spatial contrast sensitivity (and also the reverse).

In response to these properties of human vision, the following form of nonlinearity 660 (applied pixel by pixel) is used:

Here, y is the contrast energy to be masked: spatial, H_i or V_i (equation (21)), or temporal, T₃ (equation (22)). The quantity D_i refers (pixel by pixel) to an image that depends on the pyramid level i to which y belongs. The quantities β, σ, a, and c were found by calibration of the perceptual metric generator to be 1.17, 1.00, 0.0757, and 0.4753, respectively, and d_y is the algebraic sign that contrast y had before it was squared.

Computing D_i requires both pyramid construction (filtering and downsampling) and pyramid reconstruction (filtering and upsampling). This computation is illustrated in FIG. 6. First, E₀ is computed as the sum of H₀ and V₀. This sum is filtered and downsampled by step 652 and added to H₁ + V₁ to give E₁. Next, E₁ is further filtered, downsampled, and added to H₂ + V₂ to give E₂. In turn, E₂ is further filtered and downsampled to give E₃. Meanwhile, the image of temporal contrasts T₃ is multiplied by m_t and added to m_ft E₃ to produce a sum denoted D₃.

In turn, D₃ is repeatedly upsampled and filtered by step 654 to produce T₂, T₁, and T₀. Finally, the images D_i are defined as D_i = m_f E_i + T_i, for i = 0, 1, 2. Here m_f is determined by calibration to be 0.05, m_ft is set to 0.0005, and m_t is set to 0.05. The filtering, downsampling, and upsampling steps are identical to those described previously.
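The D_i computation described in the two paragraphs above can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: images are lists of rows, the filter is the separable kernel 0.25[1,2,1] before downsampling and 0.5[1,2,1] after zero-insertion upsampling (kernels quoted elsewhere in this description), and borders are mirrored, which is a choice made here. All function names are hypothetical.

```python
def _reflect(i, n):
    """Mirror an out-of-range index back into [0, n) (border handling is assumed)."""
    if n == 1:
        return 0
    if i < 0:
        return -i
    if i >= n:
        return 2 * n - 2 - i
    return i


def filter3(img, s):
    """Separable 3x3 filter built from the 1-D kernel s * [1, 2, 1]."""
    h, w = len(img), len(img[0])
    rows = [[s * (r[_reflect(x - 1, w)] + 2 * r[x] + r[_reflect(x + 1, w)])
             for x in range(w)] for r in img]
    return [[s * (rows[_reflect(y - 1, h)][x] + 2 * rows[y][x]
                  + rows[_reflect(y + 1, h)][x])
             for x in range(w)] for y in range(h)]


def fds(img):
    """Filter, then keep every other row and column."""
    return [row[::2] for row in filter3(img, 0.25)[::2]]


def usf(img):
    """Insert zeros between samples in both directions, then filter."""
    h, w = len(img), len(img[0])
    up = [[0.0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            up[2 * y][2 * x] = img[y][x]
    return filter3(up, 0.5)


def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a, s):
    return [[s * p for p in row] for row in a]


def masking_images(H, V, T3, m_f=0.05, m_ft=0.0005, m_t=0.05):
    """Return [D0, D1, D2, D3] from contrast-energy pyramids H, V (levels 0..2) and T3."""
    E = [add(H[0], V[0])]                        # E0 = H0 + V0
    for i in (1, 2):
        E.append(add(fds(E[i - 1]), add(H[i], V[i])))
    E3 = fds(E[2])
    D = [None, None, None, add(scale(T3, m_t), scale(E3, m_ft))]
    T = D[3]
    for i in (2, 1, 0):
        T = usf(T)                               # T2, T1, T0 in turn
        D[i] = add(scale(E[i], m_f), T)          # D_i = m_f * E_i + T_i
    return D
```

Because each E_i accumulates the finer levels below it, D_i depends only on levels less than or equal to i plus the temporal term, which is how the text's statement that high spatial frequencies mask low ones arises.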

The above processing reflects that higher spatial frequencies mask lower ones (since D_i is influenced by pyramid levels less than or equal to i), and that the temporal channel is masked by all the spatial channels. This masking operation is generally consistent with biological observation. The quantities D_i, i = 0, 1, 2, also mask chrominance contrasts, as discussed below (but not the reverse).

FIG. 20 is a detailed block diagram of an alternate embodiment of the luminance processing section 220 of FIG. 19. Because the luminance processing section of FIG. 19 contains many steps similar to those of the luminance processing section of FIG. 6, only the differing steps are described below.

One significant difference is that the normalization steps of FIG. 6 are replaced in FIG. 20 by a luminance compression (compressive nonlinearity) step 2000. Specifically, the nonlinearity comprises a decelerating power function offset by a constant. Let the relative-luminance array from the latest field be Y₃(i,j), where 3 denotes the latest field.

The maximum luminance of the display, L_max, is set to 100 cd/m². The function is calibrated with contrast-sensitivity data at 8 c/deg, which gives 0.65 and 7.5 for the adjustable parameters m and L_D, respectively. That is, the values of L_D and m are chosen to match contrast-detection data at luminance levels from 0.001 to 100 ft-L (van Nes and Bouman, 1967). In other words, equation (23a) permits calibration against absolute luminance; for example, changing the maximum luminance of the display affects the total luminance output. Another way to view equation (23a) is that it allows the perceptual metric generator to incorporate a luminance-dependent contrast-sensitivity function.

Optionally, additional luminance compression steps 2000 (shown as dashed boxes in FIG. 20) can be inserted at each pyramid level so that the perceptual metric generator can model contrast sensitivity as a function of both luminance and spatial frequency. Otherwise, a single luminance compression step 2000 with only two parameters would be insufficient to model the other spatial frequencies.

Specifically, after pyramid decomposition of each luminance image, a nonlinearity is applied to each pyramid level k. The compressive nonlinearity for pyramid level k is then given by

where m(k) and L_D(k) are again chosen to match contrast detection at luminance levels from 0.01 to 100 ft-L (van Nes et al., 1967). The value L_a is an offset for ambient screen illumination (set to 5 cd/m² based on screen measurements), and L_max is the maximum luminance of the display (generally about 100 cd/m²).

The data for calibrating equation (23b) are tabulated below.

Each contrast modulation C_m in the table above is an empirical value that produced a just-discriminable contrast for a sine wave of spatial frequency f_s and retinal illuminance I₀. Because a 2 mm artificial pupil was used in the calibration, the retinal illuminance values (I₀, in trolands) are multiplied by a conversion factor to obtain luminance values (L, in cd/m²). A good starting point for calibration is to use, for all m(k) and L_D(k), the default values for 8 c/deg sine-wave detection, for which the appropriate exponent m is 0.65 and the appropriate value of L_D is 7.5 cd/m².

The spatial and temporal filtering of luminance is identical for the perceptual metric generators of FIG. 6 and FIG. 20. However, the luminance contrast computation of the perceptual metric generator of FIG. 20 is performed without the squaring operation. Steps 640, 642, and 650 are replaced in FIG. 20 by steps 2040, 2042, and 2050.

Specifically, the contrast-response images are computed as clipped versions of the absolute values of the quantities defined by equations (19) and (20) above. These quantities are computed as equations (23c) and (23d).

The algebraic sign of each contrast-ratio pixel value prior to the absolute-value operation must also be retained for later use.

Another significant difference between the perceptual metric generators of FIG. 6 and FIG. 20 lies in the contrast energy masking. Unlike FIG. 6, the perceptual metric generator of FIG. 20 performs contrast energy masking 2060 in two separate steps: a cross-masking step and a self-masking step for each of the horizontal and vertical channels (see FIG. 20). Self masking reduces sensitivity in the presence of information within the current channel, whereas cross masking reduces sensitivity in the presence of information in a neighboring channel. The order of these two masking steps can in fact be reversed. The contrast energy masking steps take the following form:

Here, y is the contrast to be masked: spatial, H_i or V_i (equation (23c)), or temporal, T₃ (equation (23d)). The quantity D_i refers (pixel by pixel) to an image that depends on the pyramid level i to which y belongs. The quantities b, a, c, m_f, and m_t were found by model calibration to be 1.4, 3/32, 5/32, 10/1024, and 50, respectively. d_y is the algebraic sign of contrast y, saved before the absolute value was taken.

The computation of D_i is similar to that of FIG. 6, as described above. Here fds() denotes 3x3 filtering followed by downsampling by one pyramid level, and usf() denotes upsampling by one pyramid level followed by 3x3 filtering. First, the array E₀ is computed as follows:

Then, for i = 1, 2, the arrays E_i are computed recursively:

The arrays E_i are then combined with the temporal contrast image T₃ and the images T_i to give the contrast analyzer arrays D_i, as follows:

Here, the parameter m_ft = 3/64 modulates the strength with which the temporal (flicker) luminance channel is masked by all the spatial-luminance channels together, and the parameter m_t = 50 modulates the strength with which each spatial-luminance channel is masked by the temporal (flicker) luminance channel.

FIG. 7 is a detailed block diagram of the luminance metric generating section 240. Again, FIG. 7 can be viewed either as a flowchart of luminance metric generating steps or as a block diagram of a luminance metric generating section having a plurality of hardware components for performing those steps, e.g., filters, various circuit components, and/or application-specific integrated circuits (ASICs). The construction described below applies to all of the masked-contrast images generated in FIG. 6 above:

the images in pyramids H and V (i.e., images H₀, V₀, H₁, V₁, H₂, and V₂), the image T₃ (having level-3 resolution), and the corresponding images derived from the reference sequence (denoted by the superscript ref in FIG. 6 and FIG. 7).

The first four steps of the following process are applied to each of the above images separately. In the following description, X denotes any one of these images derived from the test sequence, and X^ref denotes the corresponding image derived from the reference sequence. With this notation, the steps are as follows:

In step 710, the image X is separated into two half-wave-rectified images, one for positive contrasts 712 and the other for negative contrasts 714. In the positive-contrast image (called X₊), the signs from the X contrast are used to assign zero to all pixels of X₊ that have negative contrast. The opposite operation is performed for the negative-contrast image X₋.

In step 720, for each of the images X₊ and X₋, a local pooling operation is performed by applying a 3x3 filter that convolves the image with the filter kernel 0.25(1,2,1) horizontally and vertically.

Also in step 720, the resulting images are downsampled by a factor of 2 in each direction to remove the redundancy resulting from the pooling operation. The same processing applied to X is performed for the corresponding reference image X^ref.

In step 730, the absolute-difference images |X₊ - X₊^ref| and |X₋ - X₋^ref| are computed pixel by pixel. The resulting images are JND maps.

In step 740, a coring operation is performed on the JND maps: all values below a threshold t_c are set to zero. In the preferred embodiment, t_c defaults to 0.5.

In step 750, the Q-th power of these images is computed. In the preferred embodiment, Q is a positive integer that defaults to 2.
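Steps 710 through 750 can be sketched for a single channel as below. The sketch assumes that X and X^ref are given as signed masked-contrast images (so rectification reduces to clamping), that borders are mirrored during pooling, and that images are lists of rows; these choices and all function names are illustrative assumptions.

```python
def _reflect(i, n):
    """Mirror an out-of-range index back into [0, n) (border handling is assumed)."""
    if n == 1:
        return 0
    if i < 0:
        return -i
    if i >= n:
        return 2 * n - 2 - i
    return i


def pool_and_downsample(img):
    """Step 720: separable 0.25*[1,2,1] pooling, then drop every other row/column."""
    h, w = len(img), len(img[0])
    rows = [[0.25 * (r[_reflect(x - 1, w)] + 2 * r[x] + r[_reflect(x + 1, w)])
             for x in range(w)] for r in img]
    full = [[0.25 * (rows[_reflect(y - 1, h)][x] + 2 * rows[y][x]
                     + rows[_reflect(y + 1, h)][x])
             for x in range(w)] for y in range(h)]
    return [row[::2] for row in full[::2]]


def channel_jnd(x, x_ref, t_c=0.5, q=2):
    """Return [positive-contrast map, negative-contrast map] for one channel."""
    maps = []
    for sign in (1.0, -1.0):
        # Step 710: half-wave rectification of test and reference images.
        a = pool_and_downsample([[max(sign * v, 0.0) for v in row] for row in x])
        b = pool_and_downsample([[max(sign * v, 0.0) for v in row] for row in x_ref])
        # Step 730: pixel-by-pixel absolute difference.
        diff = [[abs(p - r) for p, r in zip(ra, rb)] for ra, rb in zip(a, b)]
        # Steps 740 and 750: coring below t_c, then the Q-th power.
        maps.append([[(d if d >= t_c else 0.0) ** q for d in row] for row in diff])
    return maps
```

With the default t_c = 0.5, differences smaller than half a JND contribute nothing to the later summation, so small sub-threshold differences spread over many channels do not accumulate into a visible-looking total.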

After this process has been completed for all pairs X, X^ref, summary measures are determined by repeatedly upsampling, filtering, and adding all the images up to the required level. This is accomplished as follows:

In step 760, upsampling and filtering are applied to the level-3 images derived from T₃ and T₃^ref to produce a level-2 image.

In step 761, upsampling and filtering are applied to the sum of the level-2 image from step 760 and the level-2 images derived from H₂, H₂^ref, V₂, and V₂^ref.

In step 762, upsampling and filtering are applied to the sum of the level-1 image from step 761 and the level-1 images derived from H₁, H₁^ref, V₁, and V₁^ref.

In step 763, upsampling and filtering are applied to the sum of the level-0 image from step 762 and the level-0 images derived from H₀, H₀^ref, V₀, and V₀^ref. The output of step 763 on path 765 is the luminance JND map.

Note that before the final processing step 763, the resulting image is at half the resolution of the original image. Similarly, each pyramid-level index in this processing section refers to the pyramid level from which the image was originally derived, which has twice the resolution associated with that level after filtering and downsampling.

All images generated by the repeated upsampling, filtering, and adding process above are Q-th-power JND images. The level-0 image is used in two ways: it is either sent directly to summary processing on path 764, or upsampled and filtered in step 763 to the original image resolution for display purposes.

FIG. 21 is a detailed block diagram of an alternate embodiment of the luminance metric generating section 240. Because the luminance metric generation of FIG. 21 contains many steps similar to those of FIG. 7, only the differing steps are described below.

Specifically, the "coring" step 740 and the "raise to the Q-th power" step 750 are replaced by a plurality of max and sum steps that maintain a running sum and a running maximum of the channel outputs. Because the process of FIG. 21 is identical to that of FIG. 7 up to step 730, the process of FIG. 21 is described starting from the point at which the absolute-difference images |X₊ - X₊^ref| and |X₋ - X₋^ref| have been determined.

Then, after the process has been completed for all pairs X, X^ref, a running-sum image is initialized in step 2140 to contain the sum of the level-3 images derived from T₃ and T₃^ref. Similarly, a running-maximum image is initialized in step 2142 as the point-by-point maximum of |T₃₊ - T₃₊^ref| and |T₃₋ - T₃₋^ref|.

The running-sum and running-maximum images are then upsampled and filtered by steps 2140a and 2142a, respectively, to yield two level-2 images. The running-sum image is then updated by step 2144 by adding to it the level-2 images derived from H₂, H₂^ref, V₂, and V₂^ref. Similarly, the running-maximum image is updated by step 2146 by comparing it with the level-2 images derived from H₂, H₂^ref, V₂, and V₂^ref.

Next, the running-sum and running-maximum images are upsampled and filtered by steps 2144a and 2146a, respectively, to yield two level-1 images. The running-sum image is updated by step 2148 by adding to it the level-1 images derived from H₁, H₁^ref, V₁, and V₁^ref. Similarly, the running-maximum image is updated by step 2150 by comparing it with the level-1 images derived from H₁, H₁^ref, V₁, and V₁^ref.

Next, the running-sum and running-maximum images are upsampled and filtered by steps 2148a and 2150a, respectively, to yield two level-0 images. The running-sum image is updated by step 2152 by adding to it the level-0 images derived from H₀, H₀^ref, V₀, and V₀^ref. Similarly, the running-maximum image is updated by step 2154 by comparing it with the level-0 images derived from H₀, H₀^ref, V₀, and V₀^ref.
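The running-sum and running-maximum bookkeeping of steps 2140 through 2154 can be sketched as below. The sketch assumes each level's absolute-difference images are already available as lists of rows (two for T₃, four for each of levels 2, 1, and 0, giving 14 in total), uses zero-insertion upsampling followed by a separable 0.5[1,2,1] filter, and mirrors borders. The names and border handling are assumptions made for illustration.

```python
def _reflect(i, n):
    """Mirror an out-of-range index back into [0, n) (border handling is assumed)."""
    if n == 1:
        return 0
    if i < 0:
        return -i
    if i >= n:
        return 2 * n - 2 - i
    return i


def usf(img):
    """Upsample by zero insertion in both directions, then filter with 0.5*[1,2,1]."""
    h, w = 2 * len(img), 2 * len(img[0])
    up = [[0.0] * w for _ in range(h)]
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            up[2 * y][2 * x] = v
    rows = [[0.5 * (r[_reflect(x - 1, w)] + 2 * r[x] + r[_reflect(x + 1, w)])
             for x in range(w)] for r in up]
    return [[0.5 * (rows[_reflect(y - 1, h)][x] + 2 * rows[y][x]
                    + rows[_reflect(y + 1, h)][x])
             for x in range(w)] for y in range(h)]


def combine(a, b, op):
    """Apply op point by point to two images of equal size."""
    return [[op(p, q) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def accumulate(levels):
    """levels[i] lists the absolute-difference images attributed to level i.

    levels[3] holds the two T3 images; levels[2], levels[1], levels[0]
    hold the four H/V images of that level. Returns (running_sum, running_max).
    """
    run_sum = run_max = levels[3][0]
    for img in levels[3][1:]:
        run_sum = combine(run_sum, img, lambda p, q: p + q)
        run_max = combine(run_max, img, max)
    for i in (2, 1, 0):
        run_sum, run_max = usf(run_sum), usf(run_max)
        for img in levels[i]:
            run_sum = combine(run_sum, img, lambda p, q: p + q)
            run_max = combine(run_max, img, max)
    return run_sum, run_max
```

Only two accumulator images are kept at any time, rather than one Q-th-power image per channel, which is consistent with the later remark that this variant is computationally cheaper.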

Finally, a point-by-point linear combination of the running-sum and running-maximum images is performed in step 2160 to produce the luminance JND map according to equation (23k) below:

Here k_L = 0.783. The value of k is determined by approximating a Minkowski Q-norm. Given a value of Q and the number N of images to be combined, the value k_L = [N - N^(1/Q)]/[N - 1] ensures that the approximate measure matches the Q-norm exactly when all the compared entries (at a pixel) are equal, and also when only one entry is nonzero. In this case, N = 14 (the number of channels) and Q = 2.
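The two exactness conditions stated above can be checked numerically. The sketch below assumes the combination has the form k·max + (1 - k)·sum, which is the only assignment of the weights k_L and 1 - k_L that satisfies both conditions; equation (23k) itself is missing from this text, so this form is an inference. The function names are illustrative.

```python
def k_weight(n, q):
    """k = (N - N**(1/Q)) / (N - 1), as given in the text."""
    return (n - n ** (1.0 / q)) / (n - 1)


def approx_norm(values, k):
    """Assumed form of the combination: k * max + (1 - k) * sum."""
    return k * max(values) + (1 - k) * sum(values)


def q_norm(values, q):
    """Minkowski Q-norm of non-negative entries."""
    return sum(v ** q for v in values) ** (1.0 / q)


k14 = k_weight(14, 2)   # full-height case, 14 channels
k12 = k_weight(12, 2)   # half-height case described later, 12 channels
print(round(k14, 3), round(k12, 3), round((k14 + k12) / 2, 3))  # prints: 0.789 0.776 0.783
```

For N = 14 and Q = 2 the formula alone gives about 0.789; the quoted 0.783 is the average of the N = 14 and N = 12 constants, as the half-height discussion later in this description states.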

Note that after this process, the resulting image is at half the resolution of the original image. Similarly, each pyramid-level index in this process refers to the pyramid level from which the image was originally derived, which has twice the resolution associated with that level after filtering and downsampling.

Finally, note that all images generated by the repeated upsampling/filtering and adding/maxing process can be combined with the weights k_L and 1 - k_L to produce JND images. The level-0 image can be processed in two ways: it is either sent directly to JND summary processing via path 2161, or upsampled and filtered by step 2170 to the original image resolution for display purposes.

In general, the luminance metric generating section of FIG. 21 is the preferred embodiment, whereas the luminance metric generating section of FIG. 7 is an alternate embodiment. One reason is that the max-sum method is computationally less expensive. Thus, if the dynamic range of an integer implementation is a concern, the luminance metric generating section of FIG. 21 is preferred. Alternatively, if a floating-point processor is employed, the luminance metric generating section of FIG. 7 can also be used.

Half-height luminance processing

Because storage requirements and computational cycles are important processing issues, the present invention provides an alternate embodiment of the perceptual metric generator that can process half-height images, e.g., the top and bottom fields of an interlaced image. This embodiment reduces the amount of storage needed compared with storing full-height images and, at the same time, reduces the number of computational cycles.

If the half-height images are to be passed through directly, without zero-filling to the true image height, the luminance processing section 220 described above must be modified to reflect that the inherent vertical resolution is only half the inherent horizontal resolution. FIG. 22 and FIG. 23 are block diagrams of a luminance processing section and a luminance metric generating section for processing half-height images.

A comparison of these diagrams (FIG. 22 and FIG. 23) with the corresponding diagrams for full-height interlaced images (FIG. 20 and FIG. 21) shows that many steps are identical. The description of FIG. 22 and FIG. 23 below is therefore limited to the differences between the two implementations.

First, the highest-resolution horizontal channel, H0, is eliminated. Second, the highest-resolution image is lowpass-filtered vertically (i.e., along columns) with a 3x1 "Kell" filter (a vertical filter). This filter is an anti-aliasing filter in the vertical dimension that removes the effect of the lines being sampled at half the spatial frequency; that is, it is a lowpass filter that blurs vertically. The resulting vertically filtered image, L0, is then filtered horizontally by a 1x3 filter 2220 (kernel 0.25[1,2,1]). The resulting image, LP0, is a horizontally lowpassed version of L0.

Next, L0 and LP0 are combined to produce a bandpass (LP0 - L0) divided by lowpass (LP0) oriented response, analogous to the (S-C)/(S+C) responses of the other oriented channels.

In turn, the image LP0 (720x240 pixels) is downsampled horizontally to a full-height, half-resolution image (360x240). At this point, the aspect ratio is such that processing of this image and of the remaining three pyramid levels can proceed as in the full-height option.

Next, downsampling and upsampling between the half-height images of level 0 and the full-height images of level 1 are performed by 1x3 filtering and horizontal downsampling in step 2232 (labeled "1x3 filter & d.s.") and by horizontal upsampling (h.u.s.) and 1x3 filtering in step 2234, respectively. Horizontal downsampling applies decimation by a factor of two in the horizontal dimension, i.e., it discards every other column of the image. Horizontal upsampling inserts a column of zeros between each two existing columns. The filter kernel applied after upsampling is 0.5[1,2,1], for the reason noted above.
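The horizontal-only resampling between level 0 and level 1 can be sketched as follows. The text gives 0.5[1,2,1] for the post-upsampling kernel; using 0.25[1,2,1] before downsampling is an assumption borrowed from the 1x3 filter 2220 quoted above, and the mirrored borders and function names are also illustrative choices.

```python
def _reflect(i, n):
    """Mirror an out-of-range index back into [0, n) (border handling is assumed)."""
    if n == 1:
        return 0
    if i < 0:
        return -i
    if i >= n:
        return 2 * n - 2 - i
    return i


def filter_rows(img, s):
    """Apply the 1x3 kernel s * [1, 2, 1] along each row only."""
    out = []
    for r in img:
        w = len(r)
        out.append([s * (r[_reflect(x - 1, w)] + 2 * r[x] + r[_reflect(x + 1, w)])
                    for x in range(w)])
    return out


def h_down(img):
    """1x3 filtering followed by discarding every other column."""
    return [row[::2] for row in filter_rows(img, 0.25)]


def h_up(img):
    """Insert a zero column between existing columns, then 1x3 filter with 0.5*[1,2,1]."""
    up = []
    for row in img:
        z = [0.0] * (2 * len(row))
        z[::2] = row          # original samples land on even columns
        up.append(z)
    return filter_rows(up, 0.5)
```

The factor 0.5 rather than 0.25 after upsampling compensates for the inserted zeros, so a constant image keeps its value through a down/up round trip.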

FIG. 23 illustrates a luminance metric generating section for processing half-height images. First, the highest-resolution horizontal channel H0 is not present. For the V0 channel, a 1x3 filtering and horizontal downsampling step 2300 replaces the 3x3 filtering and downsampling step used in the other channels.

Because the H0 channel is missing, various parameters and the pathways of the running maximum and running sum are modified. For example, the value of N that determines k changes from 14 to 12. The same value k=0.783 is used for both full-height and half-height processing. It is the average of the full-height and half-height constants computed from the equation given above.
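The averaged constant can be checked numerically. The sketch below assumes that the equation referred to has the same form as the one quoted later for the color constant, k = [N - N^(1/Q)]/[N - 1], with Q = 2. The function name is hypothetical.

```python
def max_sum_weight(n_channels, q=2.0):
    """Weight k = (N - N**(1/Q)) / (N - 1) used to blend running max and running sum."""
    return (n_channels - n_channels ** (1.0 / q)) / (n_channels - 1)


k_full = max_sum_weight(14)          # full-height processing, N = 14
k_half = max_sum_weight(12)          # half-height processing, N = 12 (H0 missing)
k_shared = 0.5 * (k_full + k_half)   # average of the two constants
print(round(k_full, 3), round(k_half, 3), round(k_shared, 3))
```

Under these assumptions the two constants are about 0.789 and 0.776, and their average rounds to the stated 0.783.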

Finally, as in the full-height embodiment, the luminance map for summary measures must be brought to full image resolution before it is displayed. Just prior to display, the final JND map is brought to full resolution in the horizontal direction by upsampling followed by 1x3 filtering (kernel 0.5[1,2,1]) in step 2310. In the vertical direction, line doubling is performed in step 2320.

Because each spatial filter has both horizontal and vertical spatial dependence, the half-height embodiment differs somewhat from its full-height counterpart. However, the half-height embodiment exhibits only slight perturbations in its correlation with subjective ratings. Thus, the non-interlaced option can be employed as a viable and time-saving alternative to the interlaced option.

FIG. 8 is a detailed block diagram of the color processing section 230. Again, FIG. 8 can be understood either as a flowchart of color processing steps or as a block diagram of hardware components for performing those steps, e.g., filters, various circuit components and/or application-specific integrated circuits (ASICs). Apart from the pyramid having levels 0, 1, 2, the color processing section 230 computes pyramids with levels 0 through 6 for both u* 802 and v* 804.

The spatial resolution of the color channels (i.e., the resolution of the finest pyramid level) is chosen to be equal to that of luminance, because the resolution is driven by the inter-pixel spacing and not by the inter-receptor spacing. The inter-receptor spacing is 0.0007 degrees of visual angle, and the inter-pixel spacing is 0.03 degrees, derived from a screen 480 pixels high viewed at four times its height. On the other hand, Morgan and Aiba (1985) found that red-green vernier acuity is reduced by a factor of three at isoluminance, a factor that is to be equated with three inter-receptor spacings for other kinds of acuity. In addition, the resolution of the blue-yellow color channel is limited by the fact that the visual system is tritanopic (blue-blind) for lights subtending less than about 2' (or 0.033 degrees) of visual angle (see Wyszecki and Stiles, 1982, p. 571). The pixel resolution of 0.03 degrees of visual angle is very close to the largest of these values, so it is appropriate to equate the pixel resolutions of the luminance and color channels.

The color pyramid extends to level 6. This is supported by evidence that observers notice differences between large, spatially uniform fields of color. The effect can be addressed by using a spatially extended JND map. Quantitative evidence for the contribution of such low spatial frequencies to the JND has been presented by Mullen (1985).

Returning to FIG. 8, and similarly to luminance processing, spatial decomposition at seven resolution levels is accomplished by pyramid decomposition, which blurs and downsamples the image by a factor of 2 at each successively coarser level of resolution. The original, full-resolution image is called the zeroth level (level 0) of the pyramid.

Subsequent levels, at lower resolutions, are obtained by an operation called REDUCE. Namely, a three-tap lowpass filter 805 with weights (1,2,1)/4 is applied to level 0 sequentially in each direction of the image to generate a blurred image. The resulting image is then subsampled by a factor of 2 (every other pixel is removed) to create the next level, level 1.
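As an illustration only, the REDUCE operation and the resulting seven-level pyramid can be sketched as follows. The function names are hypothetical, images are lists of rows, and edge replication at the border is an assumption made here to keep the example self-contained.

```python
def blur_121(values):
    """3-tap (1, 2, 1)/4 lowpass along a 1-D list; edges replicated (assumption)."""
    padded = [values[0]] + list(values) + [values[-1]]
    return [(padded[i] + 2 * padded[i + 1] + padded[i + 2]) / 4.0
            for i in range(len(values))]


def reduce_level(image):
    """REDUCE: blur in each direction in turn, then drop every other pixel."""
    rows = [blur_121(row) for row in image]              # horizontal pass
    cols = [blur_121(list(col)) for col in zip(*rows)]   # vertical pass, column by column
    blurred = [list(r) for r in zip(*cols)]              # back to row-major order
    return [row[::2] for row in blurred[::2]]


def build_pyramid(image, levels=7):
    """Level 0 is the input image; each further level is REDUCE of the previous one."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid


flat = [[3.0] * 64 for _ in range(64)]
pyramid = build_pyramid(flat)
```

Because the filter weights sum to 1, a uniform field keeps its value at every level, which is a convenient sanity check for an implementation.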

In step 810, a four-field average with tap weights (0.25, 0.25, 0.25, 0.25) is performed on the u* images and the v* images at each resolution level. That is:

(231)

Here, j is the field index. This averaging operation reflects the inherent lowpass temporal filtering of the color channels and replaces the "early-late" processing of the temporal luminance channel.
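A minimal sketch of the four-field average follows. The equation itself is not reproduced in this text, so the choice of which four fields enter the average (here, the current field and the three preceding it) is an assumption, as is the function name.

```python
def four_field_average(fields):
    """Average four same-sized fields with equal tap weights of 0.25.

    `fields` is assumed to hold field j and the three fields before it.
    """
    if len(fields) != 4:
        raise ValueError("expected exactly four fields")
    height, width = len(fields[0]), len(fields[0][0])
    return [[0.25 * sum(field[y][x] for field in fields) for x in range(width)]
            for y in range(height)]


u_fields = [[[0.0, 2.0]], [[4.0, 2.0]], [[8.0, 2.0]], [[4.0, 2.0]]]
u_avg = four_field_average(u_fields)
```

The same function would be applied separately to the u* and v* images at each pyramid level.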

In step 820, a non-directional Laplacian spatial filter 820 is applied to each of the u* and v* images. The filter has the following 3x3 kernel:

(24)

The kernel of equation (24) is chosen to have zero total weight and to respond with a maximum magnitude of 1 to an edge between two uniform areas that differ by one unit. (The maximum response is attained for horizontal and vertical edges.) This converts the u* and v* images into maps of color difference, evaluated in JND units.

In step 830, contrast computation is performed directly on the u* and v* images resulting from step 820, which serve as the color contrast pyramids and are interpreted analogously to the Michelson contrasts computed in the luminance processing section. Like luminance contrasts, color contrasts are computed via intra-image comparisons effected by Laplacian pyramids. Just as the Laplacian difference divided by a spatial average represents the Michelson contrast, which via Weber's law assumes a constant value at the 1-JND level (detection threshold), the Laplacian pyramid operating on the u* and v* images has a 1-JND interpretation. Similarly, this interpretation is modified in the course of calibration. The modification reflects the interaction of all parts of the present invention, and the fact that the stimuli eliciting a 1-JND response are not simple in terms of the perceptual metric generator.

Further, in step 830, the contrast pyramid images are divided, level by level, by seven constants q_i (i=0,...,6), whose values are determined by calibration to be 1000, 125, 40, 12.5, 10, 10, 10, respectively. These constants are analogous to the quantities w_i (i=0,...,3) in the luminance processing section.

In step 840, the squares of all the u* and v* contrasts are determined, and the algebraic sign is again preserved for later use. Preserving the sign prevents the possibility of recording 0 JNDs between two different images because of the ambiguity caused by losing the sign in the squaring operation. The result is two color square-contrast pyramids C_u, C_v.
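The per-level normalization and the sign-preserving square can be sketched for a single pixel as follows. The function name and the convention of returning the sign as a separate integer are assumptions of this example. Only the q_i values come from the text.

```python
# Calibration constants q_0..q_6 quoted for the FIG. 8 embodiment.
Q_FIG8 = [1000.0, 125.0, 40.0, 12.5, 10.0, 10.0, 10.0]


def signed_square_contrast(contrast, level):
    """Divide by q_level, square, and keep the algebraic sign separately."""
    normalized = contrast / Q_FIG8[level]
    sign = (normalized > 0) - (normalized < 0)   # +1, 0 or -1
    return normalized * normalized, sign


energy_pos, sign_pos = signed_square_contrast(20.0, 4)
energy_neg, sign_neg = signed_square_contrast(-20.0, 4)
```

Contrasts of +20 and -20 at level 4 give the same squared value, and only the stored sign tells them apart, which is why the sign has to be carried along.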

In step 850, contrast energy masking is performed. First, the denominator pyramid levels D_m (m=0, 1, 2) are adopted directly from the luminance processing section 220 without alteration. For levels 3,...,6, however, sequential filtering and downsampling of D_2 is performed using the same method as in luminance processing, but without adding new terms. These D_m values are used in step 840 in the spirit of perturbation theory: since luminance is the more important determinant of JNDs, the effect of luminance on color is presumed to be more important than the effect of color on luminance. In other words, because luminance effects are expected to predominate over color effects in most cases, the color processing section can be viewed as a first-order perturbation on the luminance processing section. Therefore, the effects of luminance (the D_m) are modeled as masking color, but not the reverse.

The masked color contrast pyramid is generated by using the luminance-channel denominator pyramid D_m and the same functional form that is used in the luminance transducer to mask the color square-contrast pyramids, for all pyramid levels m=0, 1, 2:

Note that the algebraic sign removed in step 830 is reattached through the factors s_um and s_vm. This operation produces masked contrast pyramids for u* and v*. Calibration has determined the values a_c=0.15, c_c=0.3, k=0.7, σ_c=1.0, and β_c=1.17. In addition, setting m_c to a value of 1 has given sufficient performance in all calibrations and predictions.

FIG. 24 is a detailed block diagram of an alternate embodiment of the color processing section 230. Because the color processing section of FIG. 24 contains many steps similar to those of FIG. 8, only the dissimilar steps are described below.

In particular, in step 830, the contrast pyramid images are divided, level by level, by seven constants q_i (i=0,...,6), whose values are determined by calibration to be 384, 60, 24, 6, 4, 3, 3, respectively. Note that these constants differ from those of FIG. 8. These constants are analogous to the quantities w_i (i=0,...,3) in the luminance processing section.

Next, the clipped absolute values of all the u* and v* contrasts are computed, where clip(x)=max(0, x-e) and e=0.75. Again, the algebraic sign is preserved and reattached for later use. This prevents the possibility of recording 0 JNDs between two different images because of the ambiguity caused by losing the sign in the squaring operation. The result is two color square-contrast pyramids C_u, C_v.
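For comparison with the FIG. 8 variant, the clipping step can be sketched for one pixel as below. The function name and the separate integer sign are assumptions of this example. The constants q_i and e are those quoted for FIG. 24, and the division by q_i is applied before clipping, following the order of the description.

```python
# Calibration constants q_0..q_6 and clip offset e quoted for the FIG. 24 embodiment.
Q_FIG24 = [384.0, 60.0, 24.0, 6.0, 4.0, 3.0, 3.0]
E_CLIP = 0.75


def clipped_contrast(contrast, level):
    """Divide by q_level, then apply clip(|x|) = max(0, |x| - e); sign kept separately."""
    normalized = contrast / Q_FIG24[level]
    sign = (normalized > 0) - (normalized < 0)   # +1, 0 or -1
    return max(0.0, abs(normalized) - E_CLIP), sign


strong = clipped_contrast(6.0, 4)    # |6/4| = 1.5 is above e
weak = clipped_contrast(-2.0, 4)     # |-2/4| = 0.5 is below e
```

Normalized contrasts whose magnitude is below e are suppressed to zero, while larger ones are reduced by e.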

Another important difference between the perceptual metric generators of FIG. 8 and FIG. 24 lies in the implementation of contrast energy masking. Unlike FIG. 8, the perceptual metric generator of FIG. 24 implements contrast energy masking 2410 in two separate stages: a cross-masking stage and a self-masking stage for each of the horizontal and vertical channels (see FIG. 24). Self masking reduces sensitivity in the presence of information within the current channel, whereas cross masking reduces sensitivity in the presence of information in a neighboring channel. In fact, the order of these two separate masking stages can be inverted.

The luminance-channel denominator pyramid D_m and the same functional form that is used in the luminance transducer are used to mask the color square-contrast pyramids, for all pyramid levels m=0, 1, 2:

Here, D_i for i>2 is a filtered and downsampled version of D_2. Similarly,

Note that the algebraic sign removed above is reattached through the factors s_um and s_vm. This operation produces masked contrast pyramids for u_i and v_i. Calibration has determined the values a_c=1/2, c_c=1/2, k=0.7, β_c=1.4, and m_c=m_f=10/1024. In general, the color processing section of FIG. 24 is the preferred embodiment, whereas the color processing section of FIG. 8 is an alternate embodiment.

FIG. 9 is a block diagram of the color metric generating section 250. Again, FIG. 9 can be understood either as a flowchart of color metric generating steps or as a block diagram of hardware components for performing those steps, e.g., filters, various circuit components and/or application-specific integrated circuits (ASICs). The construction of the color JND map is analogous to the construction of the luminance JND map described above with regard to FIG. 7. In the color case, the process applies to all the masked-contrast color images generated by step 840 above, i.e., images C_u0, C_v0,..., C_u6, C_v6, and the corresponding images derived from the reference sequence (denoted by the superscript ref in FIG. 8 and FIG. 9).

The first four steps of the following process are applied to each of the above images separately. In what follows, X denotes any of the images derived from the test sequence, and X^ref denotes the corresponding image derived from the reference sequence. Given this notation, the steps are as follows:

In step 910, the image X is separated into two half-wave-rectified images, one for positive contrasts 912 and the other for negative contrasts 914. In the positive-contrast image (called X_+), the signs from the X contrast (stored separately as noted above) are used to assign zeros to all pixels in X_+ that have negative contrasts. The opposite operation is applied to the negative-contrast image X_-.

In step 920, for each image X_+ and X_-, a local pooling operation is performed by applying a 3x3 filter that convolves the image with a filter kernel of 0.5(1,2,1) horizontally and vertically.

Also in step 920, the resulting image is downsampled by a factor of 2 in each direction to remove the redundancy resulting from the pooling operation. The same processing that is applied to X is applied to the corresponding reference image X^ref.

In step 930, the absolute-difference images |X_+ - X_+^ref| and |X_- - X_-^ref| are computed pixel by pixel. The resulting images are JND maps.

In step 940, a coring operation is performed on the JND maps. That is, all values below a threshold t_c are set to zero. In the preferred embodiment, t_c defaults to a value of 0.5.

In step 950, the Q-th power of these images is determined. In the preferred embodiment, Q is a positive integer that defaults to a value of 2.
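Steps 910 through 950 can be sketched end to end for one image pair. This is an illustrative simplification rather than the implementation of FIG. 9: the input images carry signed contrast values instead of magnitudes with separately stored signs, the border is padded with zeros, and all function names are hypothetical.

```python
def half_wave(image, positive):
    """Step 910: keep only positive (or only negative) contrasts, as magnitudes."""
    return [[max(v, 0.0) if positive else max(-v, 0.0) for v in row] for row in image]


def pool_121(values):
    """0.5*(1, 2, 1) pooling along a 1-D list; zeros assumed outside the image."""
    padded = [0.0] + list(values) + [0.0]
    return [0.5 * (padded[i] + 2 * padded[i + 1] + padded[i + 2])
            for i in range(len(values))]


def pool_and_downsample(image):
    """Step 920: pool horizontally and vertically, then keep every other pixel."""
    rows = [pool_121(row) for row in image]
    cols = [pool_121(list(col)) for col in zip(*rows)]
    pooled = [list(r) for r in zip(*cols)]
    return [row[::2] for row in pooled[::2]]


def jnd_power_map(test, ref, positive, t_c=0.5, q=2):
    """Steps 910-950 for one polarity of one test/reference image pair."""
    a = pool_and_downsample(half_wave(test, positive))
    b = pool_and_downsample(half_wave(ref, positive))
    result = []
    for row_a, row_b in zip(a, b):
        out_row = []
        for va, vb in zip(row_a, row_b):
            diff = abs(va - vb)                     # step 930: absolute difference
            diff = diff if diff >= t_c else 0.0     # step 940: coring
            out_row.append(diff ** q)               # step 950: Q-th power
        result.append(out_row)
    return result


test_img = [[1.0] * 4 for _ in range(4)]
ref_img = [[0.0] * 4 for _ in range(4)]
pos_map = jnd_power_map(test_img, ref_img, positive=True)
neg_map = jnd_power_map(test_img, ref_img, positive=False)
```

A 4x4 input yields a 2x2 map per polarity. Because the test image here has only positive contrasts, the negative-polarity map is all zeros.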

After this process has been completed for all X, X^ref pairs, the summary measures are determined by repeatedly upsampling, filtering, and adding all the images up to the required level. This is accomplished as follows:

In step 960, upsampling and filtering are applied to the level-6 images derived from C_u6, C_u6^ref, C_v6, C_v6^ref to derive a level-5 image.

In the next step, upsampling and filtering are applied to the sum of the level-5 image derived in step 960 and the level-5 images derived from C_u5, C_u5^ref, C_v5, C_v5^ref. This process continues down to level 0.

As in luminance processing, note that before the final processing step 963 the resulting image is at half the resolution of the original image. Similarly, note that each pyramid-level index in this processing section refers to the pyramid level from which the image was originally derived, which has twice the resolution associated with that level after filtering/downsampling.

Note also that all images generated by the above repeated upsampling, filtering, and adding process are Q-th-power JND images. The level-0 image is used in two ways: it is either sent directly to summary processing on path 964, or upsampled and filtered to the original image resolution in step 963 for display purposes.

As discussed above, the luminance and chrominance JND maps passed to the output summary step are Q-th-power JND images, represented at half the resolution of the original image. This exploits the redundancy inherent in having performed pooling at each masked-contrast stage. Each of these half-resolution images can be reduced to a single JND performance measure by averaging all the pixels through a Minkowski addition.

Here, N_P is the number of pixels in each JND map, JND_luminance and JND_chrominance are the summary measures, and L_JND^Q and C_JND^Q are the half-resolution maps from the luminance and chrominance map construction, respectively. In each case, the sum is over all the pixels in the image. As stated above, the Minkowski exponent Q defaults to 2.

From the luminance and chrominance summary measures, a single performance measure for the field is computed by Minkowski addition, i.e.,

where Q again defaults to 2.

A single performance measure JND_field for the N fields of a video sequence is computed by adding the JND values for each field, again in the Minkowski sense. Here too, Q defaults to 2.
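The summary equations are not reproduced in this text, so the following is only a sketch under stated assumptions: the per-map measure is taken to be the Q-th root of the mean of the Q-th-power pixel values, and the per-field measure is taken to be the Q-th root of the sum of the Q-th powers of the luminance and chrominance measures. Function names are hypothetical, and the sequence-level combination is left out because its normalization cannot be inferred from the text.

```python
def map_summary(power_map, q=2.0):
    """Assumed form: ((1 / N_P) * sum of Q-th-power pixel values) ** (1 / Q)."""
    pixels = [value for row in power_map for value in row]
    return (sum(pixels) / len(pixels)) ** (1.0 / q)


def minkowski_add(values, q=2.0):
    """Assumed form: (sum of v ** Q) ** (1 / Q)."""
    return sum(value ** q for value in values) ** (1.0 / q)


jnd_luminance = map_summary([[9.0, 16.0], [0.0, 0.0]])    # pixels are already Q-th powers
jnd_chrominance = map_summary([[36.0, 0.0], [0.0, 0.0]])
jnd_for_field = minkowski_add([jnd_luminance, jnd_chrominance])
```

With Q = 2 the per-map measure is a root-mean-square of the JND values, so a few large local differences weigh more heavily than many small ones.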

FIG. 25 is a detailed block diagram of an alternate embodiment of the chrominance metric generating section 250. Because the chrominance metric generation of FIG. 25 contains many stages similar to those of FIG. 9, only the dissimilar stages are described in detail.

More specifically, the "coring" stage 940 and the "raise to a Q-th power" stage 950 are replaced by a plurality of max and sum stages, which maintain a running sum and a running maximum of the channel outputs. Because the process of FIG. 25 is the same as that of FIG. 9 up to stage 930, the process of FIG. 25 is described starting from the point at which the absolute-difference images |X_+ - X_+^ref| and |X_- - X_-^ref| have been determined.

Next, after the process has been completed for all X, X^ref pairs, a running-sum image is initialized in stage 2540 to contain the sum of the level-6 images derived from C_u6, C_u6^ref, C_v6, and C_v6^ref. Similarly, a running-maximum image is initialized in stage 2542 as the point-by-point maximum of these same images.

Next, the running-sum and running-maximum images are upsampled and filtered by stages 2540a and 2542a, respectively, to form two level-5 images. The running-sum image is then updated by stage 2544 by adding to it the level-5 images derived from C_u5, C_u5^ref, C_v5, and C_v5^ref. Similarly, the running-maximum image is updated by stage 2546 by comparing it with the level-5 images derived from C_u5, C_u5^ref, C_v5, and C_v5^ref. This process is repeated down to pyramid level 0.

Finally, after the above steps have been performed, a point-by-point combination of the running-sum and running-maximum images is performed to produce the chrominance JND map:

where k_c is 0.836. The value of k_c is determined by approximating a Minkowski Q-norm. Given a value of Q and a number N of images to be combined, the value k_c = [N - N^(1/Q)]/[N - 1] ensures that the approximate measure matches the Q-norm exactly when all the compared entries (at a pixel) are equal and when there is only one nonzero entry. In this case, N=28 (the number of channels) and Q=2.
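The two matching conditions suffice to fix the form of the combination, so it can be checked numerically. The sketch below assumes the combination is k * max + (1 - k) * sum applied to nonnegative per-channel values at one pixel. This form follows from the two stated conditions and from the weights k_c and 1 - k_c mentioned in the description, but the equation itself is not reproduced in this text. Function names are hypothetical.

```python
def q_norm(values, q=2.0):
    """Exact Minkowski Q-norm of nonnegative values."""
    return sum(v ** q for v in values) ** (1.0 / q)


def max_sum_weight(n, q=2.0):
    """k = (N - N**(1/Q)) / (N - 1)."""
    return (n - n ** (1.0 / q)) / (n - 1)


def max_sum_estimate(values, q=2.0):
    """Approximate the Q-norm from the maximum and the sum only."""
    k = max_sum_weight(len(values), q)
    return k * max(values) + (1.0 - k) * sum(values)


all_equal = [0.3] * 28
single_nonzero = [0.0] * 27 + [2.0]
err_equal = abs(max_sum_estimate(all_equal) - q_norm(all_equal))
err_single = abs(max_sum_estimate(single_nonzero) - q_norm(single_nonzero))
```

Both errors vanish to within rounding, as the text requires. Evaluating the formula at N = 28 and Q = 2 gives about 0.841. The quoted 0.836 equals the average of the N = 28 and N = 24 values, which would parallel the averaging described for the luminance constant, although the text does not say so.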

As in luminance processing, the resulting image after these operations is at half the resolution of the original. Note that each pyramid-level index in this process refers to the pyramid level from which the image was originally derived, which has twice the resolution associated with that level after filtering/downsampling.

Note also that all images generated by the above repeated upsampling/filtering and adding/maximizing process can be added with weights k_c and 1-k_c to produce JND images. The level-0 image is used in two ways: it is either sent directly to summary processing, or upsampled to the original image resolution and filtered for display purposes.

In general, the chrominance metric generating section of FIG. 25 is the preferred embodiment, whereas the metric generating section of FIG. 9 is an alternate embodiment. One reason is that the max-sum method is computationally less expensive. Thus, where the dynamic range of the generating section of FIG. 25 is adequate, the chrominance metric generating section of FIG. 25 is preferred. Otherwise, if a floating-point processor is employed, the metric generating section of FIG. 9 may also be used.

Half-Height Chrominance Processing

If half-height images are passed through directly, without zero-filling to the true image height, the chrominance processing section 230 must be modified to reflect the fact that the inherent vertical resolution is half the inherent horizontal resolution.

In the present perceptual metric generator, it was observed that border reflections at each stage can propagate artifacts into the luminance and chrominance JND maps, making it necessary to crop the JND maps to keep them from being contaminated by these artifacts. To address this problem, a method was developed that replaces the screen border with a gray bezel of infinite extent, without increasing the real image size by more than six pixels on a side. Use of this "virtual bezel" eliminates the need to crop the JND map to avoid border artifacts. The infinite gray bezel models the viewing conditions and can therefore be considered non-artifactual. With this interpretation, the entire JND map is uncontaminated by artifacts and can be displayed by the picture quality analyzer.

In the following description, an image that has been padded with six pixels on all sides is referred to as a "padded image", and an unpadded image, or its locus within a padded image, is referred to as the "image proper".

Because image operations are local, the virtually infinite bezel can be implemented efficiently. Sufficiently far from the image proper, an infinite bezel results in a set of identical, constant values at any given stage. The effect of image operations performed in this constant region, e.g., filtering, can be computed a priori. Thus, a narrow border (six pixels in the present implementation) can provide the proper transition from the image proper to the infinite bezel.

At the input, the bezel is given the values Y'=90, U'=V'=0. (The value Y'=90 corresponds to half the Rec 500 background value of 15% of the maximum screen luminance.) However, the bezel is not needed until after front-end processing, because spatial interactions that extend beyond the image border do not occur at that stage. In the luminance channel, no borders (and hence no bezel values) are appended until after luminance compression. In the chrominance channel, borders are appended after front-end processing.

In the luminance channel, the first bezel value after luminance compression is as follows.

(30c)

In the u* and v* channels, the first bezel values are both 0.

These values are propagated through the subsequent stages of processing in the following three ways.

1) Pixel-by-pixel functions operate on the old bezel value to produce the new bezel value. For example, the bezel value resulting from the 1.4 power function is as follows.

(30d)

2) A 3×3 spatial filter whose rows and columns sum to P sets the output bezel value to the input bezel value times P.

3) The contrast-function computation and the four-field temporal filter (which has a tap sum of zero) set the output bezel value to 0.

At the contrast stage and all subsequent stages, the bezel is given the value 0 in the luminance and chrominance channels. This is the logical result of applying a zero-sum linear kernel to a spatially constant array.
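
The three propagation rules above can be sketched as follows. This is only an illustration: the function names are hypothetical, the exponent form `v ** 1.4` is an assumed reading of "the 1.4 power function" (Equation (30d) is not reproduced here), and the gain P is passed in directly rather than derived from a kernel.

```python
def bezel_after_pointwise(bezel, func):
    """Case 1: a pixel-by-pixel function maps the old bezel value to the new one."""
    return func(bezel)


def bezel_after_spatial_filter(bezel, p):
    """Case 2: a 3x3 filter with gain P scales the constant bezel value by P."""
    return bezel * p


def bezel_after_zero_sum_stage():
    """Case 3: a zero-sum operation on a constant region always yields 0."""
    return 0.0


def apply_taps(constant, taps):
    """Apply a 1-D kernel to a constant signal, to show why case 3 gives 0."""
    return sum(t * constant for t in taps)


# Hypothetical chain: pointwise power law, unity-gain filter, zero-sum stage.
b0 = 2.0                                              # illustrative value only
b1 = bezel_after_pointwise(b0, lambda v: v ** 1.4)    # assumed form of the power function
b2 = bezel_after_spatial_filter(b1, 1.0)
b3 = bezel_after_zero_sum_stage()
```

Because the bezel region is constant, each stage needs only one scalar per channel, which is why the bezel values can be computed in advance rather than filtered pixel by pixel.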

The present method for generating a virtual bezel is disclosed in US Patent Application 08/997,267, "Method for Generating Image Pyramid Border," filed December 23, 1997, which is incorporated herein by reference.

Integrating the Image and the Bezel

Starting with the pyramid stage of the model, borders need to be supplied. The first border operation on an N×M input image is to pad the image with 6 pixels (on all sides) of the appropriate bezel value (first_luminance_bezel for the compressed luminance image, and 0 for the u* and v* images). The padded image has dimensions (N + 12)×(M + 12). For the k-th pyramid level (where k can range from 0 to 7), the padded image has dimensions ([N/2^k] + 12)×([M/2^k] + 12), where "[x]" denotes the greatest integer in x.

The images at all pyramid levels are registered to one another at the upper left-hand corner of the image proper. The indices of the image proper run over 0 ≤ y < height and 0 ≤ x < width. The upper left-hand corner of the image proper always has indices (0, 0). Bezel pixels above and to the left of the image proper take negative index values. For example, the upper left-hand bezel pixel is (-6, -6). Looking along the x dimension, starting at the left-hand edge, of an image of width w (image plus bezel width w + 12), the left-hand bezel pixels are indexed by x = (-6, -5, ..., -1), the actual image is indexed by (0, 1, ..., w-1), and the right-hand bezel pixels are indexed by (w, w+1, ..., w+5).
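
The padding and index convention described above can be sketched as follows. The helper names `pad_with_bezel` and `pixel_at` are hypothetical, and images are represented as plain lists of rows purely for illustration.

```python
def pad_with_bezel(image, bezel, border=6):
    """Surround a 2-D image (list of rows) with `border` pixels of `bezel` on all sides."""
    width = len(image[0])
    full_width = width + 2 * border
    padded = [[bezel] * full_width for _ in range(border)]          # top bezel rows
    for row in image:
        padded.append([bezel] * border + list(row) + [bezel] * border)
    padded.extend([bezel] * full_width for _ in range(border))      # bottom bezel rows
    return padded


def pixel_at(padded, x, y, border=6):
    """Read a pixel using image-proper indices, where (0, 0) is the upper-left image pixel."""
    return padded[y + border][x + border]


image = [[1, 2, 3],
         [4, 5, 6]]                      # height 2, width 3
padded = pad_with_bezel(image, bezel=0)
corner = pixel_at(padded, -6, -6)        # upper-left bezel pixel
first = pixel_at(padded, 0, 0)           # upper-left pixel of the image proper
```

Storing the offset in one accessor keeps the image-proper indices identical at every pyramid level, which matches the registration at the upper left-hand corner.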

Given a padded image, four things can happen, depending on the subsequent stage of processing. In describing these operations below, a single image line is used to summarize the spatial processing (it being understood that the analogous events take place in the vertical direction).

(a) For pixel-by-pixel operations. When the next operation is a pixel-by-pixel operation (for example, a nonlinearity), the padded image is simply passed through the operation, and the output image dimensions are the same as the input image dimensions. The same applies when the operation is between corresponding pixels of different fields or different color bands.

(b) For 3×3 spatial filters. Suppose (in one dimension) that the unpadded input image has dimension N_k. Then the padded input image has dimension N_k + 12, and the padded output image also has dimension N_k + 12. The output bezel value is computed first and written into at least those bezel pixels that are not otherwise filled by the subsequent image operation. Then, starting 1 pixel away from the left edge of the padded input image, the 3×3 kernel operates on the input image and overwrites the bezel values of the output image, stopping 1 pixel away from the right (or bottom) edge of the image (where the original bezel value survives). The pre-written bezel value makes it unnecessary for the kernel operation ever to go outside the original (padded) image to compute these values.

(c) For filtering and down-sampling in REDUCE. Given a padded input image of dimension N_k + 12, the output array is allocated with dimension [N_k/2] + 12. The bezel value is written into at least those bezel pixels that are not otherwise filled by the subsequent filter and down-sample operation. Then the input image is filtered as in (b) above, but the filter is applied at pixels -4, -2, 0, 2, 4, ... until the input image is exhausted, and the output values are written into consecutive pixels -2, -1, 0, 1, 2, ... until there is no further place for them in the output image. Note that pixel 0 of the new image is 7 pixels from the left end of the new image. The last application of the filter takes input pixel N_k + 3 to output pixel [N_k/2] + 2 if N_k is odd, and input pixel N_k + 4 to output pixel [N_k/2] + 2 if N_k is even. (Here, the input pixel of the filter refers to the pixel at the center of the 3-pixel kernel.)
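
A one-dimensional sketch of case (c) is given below. It follows the index bookkeeping in the text (kernel centers at -4, -2, 0, ..., outputs at -2, -1, 0, ...), but the function name `reduce_line` and the taps (0.25, 0.5, 0.25) are assumptions made for illustration; the text does not state the actual filter coefficients.

```python
def reduce_line(padded, out_bezel, taps=(0.25, 0.5, 0.25), pad=6):
    """Filter and down-sample one padded image line by 2.

    `padded` holds N_k + 2*pad samples; list position p corresponds to
    image index p - pad. Returns the output line and the list of
    (input center, output index) pairs, both in image-proper indices.
    """
    n_k = len(padded) - 2 * pad
    out = [out_bezel] * (n_k // 2 + 2 * pad)    # pre-written bezel values
    applied = []
    center = -4                                  # first kernel center
    while center + 1 <= n_k + pad - 1:           # kernel must stay inside the padded line
        p = center + pad
        value = (taps[0] * padded[p - 1]
                 + taps[1] * padded[p]
                 + taps[2] * padded[p + 1])
        out[center // 2 + pad] = value           # input pixel 2j goes to output pixel j
        applied.append((center, center // 2))
        center += 2
    return out, applied


out_odd, applied_odd = reduce_line([1.0] * (5 + 12), out_bezel=1.0)     # N_k = 5
out_even, applied_even = reduce_line([1.0] * (6 + 12), out_bezel=1.0)   # N_k = 6
```

For N_k = 5 the last application maps input pixel 8 = N_k + 3 to output pixel 4 = [N_k/2] + 2, and for N_k = 6 it maps input pixel 10 = N_k + 4 to output pixel 5 = [N_k/2] + 2, in agreement with the text. Output pixels outside the written range keep the pre-written bezel value.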

Luminance Calibration and Prediction

Psychophysical data were used for two purposes: 1) to calibrate the luminance processing section (that is, to determine values for certain processing parameters), and 2) to confirm the predictive value of the luminance processing section once it was calibrated. In all cases, the stimuli were injected into the perceptual metric generator as Y-value images immediately prior to luminance processing.

Calibration

The luminance processing section 220 can be calibrated iteratively using two sets of data. One data set is used to adjust the pre-masking constants (w_i, t_e and t_l) in steps 640, 642 and 650 of the luminance processing section. The other data set is used to adjust the masking-stage constants (σ, β, a and c) in step 660 of the luminance processing section. Because JND values are always evaluated after step 660, adjusting the step 660 constants with the second data set necessitates readjusting the constants of steps 640, 642 and 650 with the first data set. This readjustment continues until no further change is observed from one iteration to the next. Although the iterative process proceeds by interpreting a unit value of unmasked contrast (steps 640, 642 and 650) as one JND of output, the masking process perturbs this interpretation. The details of the adjustments are described in the subsections below.

Adjustment of Contrast-Normalization Constants (Steps 640, 642 and 650)

The perceptual metric generator predictions for spatial and temporal contrast sensitivity prior to masking were matched to the contrast-sensitivity data for sine waves presented by Koenderink and van Doorn (1979). To generate points on the curve based on the perceptual metric generator, a low-amplitude sine wave (in space or in time) was presented to the perceptual metric generator as a test image, and the contrast threshold for 1 JND of output was assessed. In each case, the reference image was implicitly a uniform field with the same average luminance as the test field.

The fit of spatial contrast sensitivity to the data (see FIG. 10 for the final fit) was used to adjust the contrast-pyramid sensitivity parameters w0, w1 and w2 in steps 640, 642 and 650 of the perceptual metric generator. The dashed lines in FIG. 10 represent the sensitivities of the individual pyramid channels that make up the total sensitivity (solid line). The spatial model fit in FIG. 10 does not extend beyond 15 cycles/deg, consistent with the viewing-distance constraint discussed above, namely a viewing distance of four screen heights. Similar adjustments of w0, w1 and w2 can be performed to accommodate slightly different viewing distances. Much greater viewing distances would probably require lower-resolution pyramid levels, which could easily be incorporated at low computational expense.

The fit of temporal contrast sensitivity to the data (see FIG. 11 for the final fit) was used to adjust the temporal filter-tap parameters (t_e and t_l) as well as the contrast-pyramid sensitivity parameter w3. The method used to fit these parameters is analogous to the spatial-contrast calibration. The lowest-spatial-frequency data of van Doorn and Koenderink at various temporal frequencies were matched against the sensitivities computed for spatially uniform temporal sine waves. In each case, the vision-model field rate sampled the temporal sine wave at 50 or 60 Hz, which gives rise to the distinct parameter values noted above.

Adjustment of Masking Constants (Step 660)

The masking-parameter values σ, β, a and c (in step 660 of the perceptual metric generator) were fitted by comparing predictions for masked contrast discrimination with data acquired by Carlson and Cohen (1978). The results of the final-fit comparison are shown in FIG. 12. From the Carlson-Cohen study, a single observer's data were chosen, subject to the criteria of being representative and of having sufficient data points. In this case, the perceptual metric generator stimulus consisted of a spatial sine wave of given pedestal contrast in both the test and reference fields, plus a contrast increment in the test-field sine wave. The contrast increment necessary to achieve 1 JND was determined from the perceptual metric generator for each contrast-pedestal value and plotted in FIG. 12.

Predictions

After calibration, the perceptual metric generator predictions were compared with detection and discrimination data from stimuli that were not sine waves. This was done to check the transferability of the sine-wave results to more general stimuli. As FIGS. 13, 14 and 15 show, the predictions were not applied to patterns with nominal spatial frequencies above 10 cycles/deg. Such patterns would have appreciable energy at spatial frequencies above 15 cycles/deg and would alias at the pixel sampling rate (30 samples per degree; see the discussion above).

In the first study (see FIG. 13), a low-contrast disk in the test field was detected against a uniform reference field. The experimental data are from Blackwell and Blackwell (1971). In running the perceptual metric generator for this particular study, it was necessary to replace the Q-norm summary measure with a maximum. Otherwise, the JND result was sensitive to the size of the background of the disk (that is, to the image size).

In the second study (FIG. 14), on the detection of a small-amplitude checkerboard, the data were acquired in an unpublished study at Sarnoff.

The third study (data from Carlson and Cohen, 1980) was somewhat different from the first two. A blurred edge given by erf(ax) was presented in the reference image, and discrimination was attempted against an edge given by erf(a'x) in the test image. Here, x is retinal distance in degrees of visual angle, a = πf/[ln(2)]^0.5, a' = π(f + Δf)/[ln(2)]^0.5, and f is in cycles/deg. Δf is the change in f required for one JND. The plot in FIG. 15 is Δf/f versus f.
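
The blurred-edge stimuli of the third study can be sketched from the formulas above. The function names are hypothetical and the numeric values of f and Δf below are arbitrary illustrations, not values from the study.

```python
import math


def blur_parameter(f):
    """a = pi * f / sqrt(ln 2), with f in cycles/deg."""
    return math.pi * f / math.sqrt(math.log(2.0))


def edge_profile(x, f):
    """Blurred edge erf(a * x), with x in degrees of visual angle."""
    return math.erf(blur_parameter(f) * x)


f = 2.0          # illustrative nominal frequency, cycles/deg
delta_f = 0.2    # illustrative increment only
xs = [i * 0.03 for i in range(-10, 11)]                   # 0.03 deg steps, illustrative
reference = [edge_profile(x, f) for x in xs]              # erf(a x)
test = [edge_profile(x, f + delta_f) for x in xs]         # erf(a' x)
```

A larger f gives a larger a and therefore a sharper edge, so the test edge with f + Δf is slightly steeper than the reference edge.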

It can be seen that the perceptual metric generator predictions fit the data well over the range of spatial frequencies characteristic of the display at the viewing distance of four screen heights.

Chrominance Calibration

As with the luminance-parameter calibration, psychophysical data were used to calibrate the chrominance parameters (that is, to adjust their values for the best model fit). In all cases, the stimuli were four equal fields, injected into the perceptual metric generator as images in CIE X, Y and Z just prior to conversion to CIELUV.

Adjustment of Contrast-Normalization Constants (Step 830)

The perceptual metric generator predictions for chromatic contrast sensitivity prior to masking were matched to the contrast-sensitivity data presented by Mullen (1985). The test sequences used were four equal fields, each with a horizontally varying spatial sine-wave grating injected as (X, Y, Z) values. The data used for calibration were from Mullen's FIG. 6, in which each test image was a red-green isoluminous sine wave. At pixel i, the test-image sine wave had tristimulus values given by:

$$X(i) = \frac{Y_0}{2}\left\{\left(\frac{x_r}{y_r} + \frac{x_g}{y_g}\right) + \cos(2\pi f a i)\,\Delta m\left(\frac{x_r}{y_r} - \frac{x_g}{y_g}\right)\right\}$$

$$Y(i) = Y_0 \qquad (31)$$

$$Z(i) = \frac{Y_0}{2}\left\{\left(\frac{z_r}{y_r} + \frac{z_g}{y_g}\right) + \cos(2\pi f a i)\,\Delta m\left(\frac{z_r}{y_r} - \frac{z_g}{y_g}\right)\right\}$$

Here, Δm is the threshold incremental discrimination contrast, (x_r, y_r) = (0.636, 0.364) is the chromaticity of the red interference filter (at 602 nm), (x_g, y_g) = (0.122, 0.823) is the chromaticity of the green interference filter (at 526 nm), z_r = 1 - x_r - y_r, z_g = 1 - x_g - y_g, and a = 0.03 deg/pixel. The reference image is the uniform field represented by Equation (31) with Δm = 0. For purposes of the perceptual metric generator, it is sufficient to set Y₀ = 1.
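
As a rough illustration, the tristimulus values of Equation (31) can be computed per pixel as shown below. The function name and argument layout are hypothetical; the chromaticities and a = 0.03 deg/pixel are the values quoted in the text, and the values of f and Δm in the usage lines are arbitrary.

```python
import math


def red_green_grating(i, f, delta_m, y0=1.0, a=0.03,
                      red=(0.636, 0.364), green=(0.122, 0.823)):
    """Return (X, Y, Z) at pixel i for the isoluminous grating of Equation (31).

    f is the spatial frequency in cycles/deg, a the pixel pitch in deg/pixel,
    delta_m the chromatic contrast, and red/green the (x, y) chromaticities.
    """
    xr, yr = red
    xg, yg = green
    zr = 1.0 - xr - yr
    zg = 1.0 - xg - yg
    c = math.cos(2.0 * math.pi * f * a * i)
    x = 0.5 * y0 * ((xr / yr + xg / yg) + c * delta_m * (xr / yr - xg / yg))
    z = 0.5 * y0 * ((zr / yr + zg / yg) + c * delta_m * (zr / yr - zg / yg))
    return x, y0, z            # Y is constant, so the grating is isoluminous


test_line = [red_green_grating(i, f=2.0, delta_m=0.05) for i in range(64)]
reference_line = [red_green_grating(i, f=2.0, delta_m=0.0) for i in range(64)]
```

With Δm = 0 the cosine term vanishes and every pixel has the same (X, Y, Z), which is the uniform reference field; Y stays at Y₀ for any Δm.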

To generate points on the model-based curve, the above stimulus was presented at various values of f, and the contrast threshold Δm for 1 JND of output was assessed. The fit of the modeled chromatic contrast sensitivity to the data (see FIG. 16 for the final fit) was used to adjust the parameters q_i (i = 0, ..., 6) in the perceptual metric generator.

Adjustment of Masking Constants (Step 840)

The perceptual metric generator predictions for chrominance masking were matched to the data presented by Switkes et al. (1988). The test sequences used were four equal fields, each with a horizontally varying spatial sine-wave grating injected as (X, Y, Z) values. To correspond with FIG. 4 of that work (chrominance masking of chrominance), at pixel i the test-image sine wave had tristimulus values given by:

$$X(i) = \frac{Y_0}{2}\left\{\left(\frac{x_r}{y_r} + \frac{x_g}{y_g}\right) + \cos(2\pi f a i)\left[(m + \Delta m)\left(\frac{x_r}{y_r} - \frac{x_g}{y_g}\right)\right]\right\}$$

$$Y(i) = Y_0 \qquad (32)$$

$$Z(i) = \frac{Y_0}{2}\left\{\left(\frac{z_r}{y_r} + \frac{z_g}{y_g}\right) + \cos(2\pi f a i)\left[(m + \Delta m)\left(\frac{z_r}{y_r} - \frac{z_g}{y_g}\right)\right]\right\}$$

Here, Δm is the threshold incremental discrimination contrast, (x_r, y_r) = (0.580, 0.362) is the chromaticity of the red phosphor, (x_g, y_g) = (0.301, 0.589) is the chromaticity of the green phosphor, z_r = 1 - x_r - y_r, z_g = 1 - x_g - y_g, and fa = 2 cycles/deg × 0.03 deg/pixel = 0.06. The reference-image sine wave is the same as the test-image sine wave but with Δm = 0. For purposes of the perceptual metric generator, it is sufficient to set Y₀ = 1.

Comparison with Rating Data

Four image sequences, each with various degrees of distortion, were used to compare the present perceptual metric generator with DSCQS rating data. The results are plotted in FIG. 18 and show a correlation of 0.9474 between the perceptual metric generator and the data. For each sequence, the perceptual metric generator processed 30 fields (as opposed to the four fields used to test previous releases).

Several data points that were present in the plots of previous releases were removed. These points were deleted for two reasons:

(1) Five points were deleted because they corresponded to "warm-up" tests on all the subjects. Rec 500 suggests that the first five tests in a sequence should be deleted because they represent a stabilization of the subject's judgment.

(2) For one of the "Gwen" sequences, there is a small shift of the test sequence with respect to the reference sequence that occurs between the images of the trees in the background, even when the foregrounds are exactly aligned between test and reference. The blue-screen video was introduced separately for test and reference, with a temporal alignment error in this particular case.

JND Map Correlation

The JND maps are in a form suitable for subsequent processing to determine JNDs within any spatial or temporal window. As noted above, the values in the maps are in units of JNDs raised to the Q-th power, rather than in simple JND units. To obtain a single JND value for any spatio-temporal region of the video stream, it is only necessary to sum the values from the JND maps within that region and then take the Q-th root.

A couple of examples clarify this processing. To retrieve one JND value for each pixel (probably the most typical output), take the Q-th root of each pixel of the JND map.

However, for typical MPEG-2 encoder analysis applications, it is useful to have a single JND value for each 16×16 pixel macroblock rather than for each pixel. To obtain one JND value per macroblock, first sum all the JND map outputs within each macroblock, and then take the Q-th root. The result is a macroblock-resolution map of JND values.
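
The pooling rule described above (sum the Q-th-power map values over a region, then take the Q-th root) can be sketched as follows. The function name is hypothetical, the value of Q is left as a parameter because it is not stated in this passage, and the handling of partial blocks at the image edge is an assumption of this sketch.

```python
def pool_jnd_map(jnd_map_q, q, block=16):
    """Pool a JND map (values in JND**q units) into one JND value per block.

    jnd_map_q is a list of rows. Blocks at the right and bottom edges are
    allowed to be smaller than `block` (an assumption of this sketch).
    With block=1 this reduces to the per-pixel Q-th root.
    """
    rows = len(jnd_map_q)
    cols = len(jnd_map_q[0])
    pooled = []
    for top in range(0, rows, block):
        pooled_row = []
        for left in range(0, cols, block):
            total = 0.0
            for r in range(top, min(top + block, rows)):
                for c in range(left, min(left + block, cols)):
                    total += jnd_map_q[r][c]
            pooled_row.append(total ** (1.0 / q))   # Q-th root of the block sum
        pooled.append(pooled_row)
    return pooled


# A 16x32 map of ones holds two macroblocks; q = 2 is used only as an example.
example_map = [[1.0] * 32 for _ in range(16)]
macroblock_jnds = pool_jnd_map(example_map, q=2.0)
```

The same function covers any rectangular spatial window by choosing `block` accordingly; pooling over time would add the corresponding fields to the sum before taking the root.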

Pyramid Construction: Image Size and Border Requirements

The current implementation of the pyramid method does not encounter image-dimension problems if the greater image dimension N and the lesser image dimension M satisfy the following conditions.

1) M must be at least 128.

2) M must be divisible by 2 as many times, P, as it takes to reach a quotient of 64 or less.

3) N must also be divisible by 2 P times.

The perceptual metric generator identifies as illegal any image that does not satisfy these conditions. As an example of how these rules work, consider the image dimensions N = 720, M = 480. Condition 1) is satisfied because M > 128. Condition 2) is satisfied because M can be divided by 2 three times, and the third division yields a quotient of 64 or less (hence P = 3). Finally, condition 3) is satisfied because N can be divided by 2 three times to yield an integer.
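
The three conditions can be checked mechanically, as in the sketch below. The function name is hypothetical, and the stopping rule "quotient of 64 or less" follows the wording used in this passage.

```python
def check_image_dimensions(n, m, min_small=128, limit=64):
    """Check the pyramid size rules for greater dimension n and lesser dimension m.

    Returns (ok, p), where p is the number of halvings of m performed.
    """
    if m < min_small:                  # condition 1
        return False, 0
    p = 0
    quotient = m
    while quotient > limit:            # condition 2: halve until 64 or less
        if quotient % 2 != 0:
            return False, p
        quotient //= 2
        p += 1
    if n % (2 ** p) != 0:              # condition 3: n divisible by 2, p times
        return False, p
    return True, p


result = check_image_dimensions(720, 480)   # (True, 3): 480 -> 240 -> 120 -> 60
```

For the example in the text, 480 is halved three times to reach 60, so P = 3, and 720 / 8 = 90 is an integer.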

Interlace Considerations

The purpose of the following discussion is to clarify the handling of field interlace (specifically, of the inter-line spaces) in the present perceptual metric generator. Inter-line spaces are not visible to humans viewing a display, but they produce pronounced effects in the perceptual metric generator if they are modeled by black values. As a result of the visibility of the lines to the perceptual metric generator, most image distortion at any given spatial frequency is masked by the high-frequency line structure. Furthermore, the visibility of the line structure is a primary cause of JND artifacts when an interlaced sequence is compared with a non-interlaced sequence.

A solution to this problem is to change the display model so that it incorporates the known averaging in space and time that takes place in the display itself. Such averaging makes the inter-line spaces less visible. The first step is to establish the magnitude of these effects in order to determine the appropriate model.

Temporal averaging occurs in the display because phosphors have a finite decay time. Thus, for example, a decaying remnant of the odd lines of field N-1 can be present at the time of primary emission from the even lines of field N. However, compared with the inter-field interval (16,500 microseconds), the phosphor decay times are typically quite short, for example 70 microseconds for the blue phosphor, 100 microseconds for the green phosphor, and 700 microseconds for the red phosphor. Hence, temporal averaging in the display model does not contribute appreciably to inter-line smoothing.

Spatial averaging occurs in the display because the emission from a pixel spreads beyond the nominal pixel boundary. In interlaced displays, the electron-beam spot structure is designed in conjunction with the interlace architecture. As a result, the pixel spread is made more pronounced in the vertical direction, so as to fill in the inter-line spaces and make them less visible. The spread is particularly pronounced at high beam currents, which correspond to high luminance values and hence to the most noticeable parts of an image. Hence, from a display perspective, spatial averaging is a good physical model for inter-line smoothing.

Alternatively, some temporal averaging can be used to effect inter-line smoothing. The visual system itself performs enough temporal averaging to render the inter-line spaces invisible. However, as will be clear from the following discussion, the absence of eye movements in the present perceptual metric generator causes the perceptual metric generator to depart from this temporal-averaging behavior.

Human vision is subserved by mechanisms with two distinct classes of spatio-temporal response: "sustained," with high spatial but low temporal resolution, and "transient," with high temporal but low spatial resolution.

One implementation of this perceptual metric generator uses separable space/time filters to shape the responses of the two channels. An immediate consequence of this modeling choice is a temporal filter on the sustained channel that is very lowpass in time compared with the 60 Hz temporal sampling rate typical of displays. Even the transient response is insensitive to the 60 Hz sampling rate. However, one element that does not enter the sustained/transient model is the effect of eye movements, in particular the ability of the eye to track moving objects in an image. This tracking enhances visual sensitivity to details of the attended object in a way that is not captured by a perceptual metric generator that is faithful to psychophysical experiments with constrained stimuli.

The effect of motion on distortion measures in an image sequence can be considerable. If the eye did not track a moving object in an image, the blur that results from the sustained temporal response would be accurately reflected in the perceptual metric generator by heavy temporal averaging in one channel. However, the eye does track the moving object, so the image is not motion-blurred. Without the ability to track moving objects, a perceptual metric generator that attempts to model the temporal visual response would have to exhibit motion blur. Such blur, however, hinders the generation of an accurate JND map.

To resolve this difficulty without a tracking model, a compromise was made: the spatial channel (which takes on the role of the "sustained" channel in that it is sensitive to spatial detail) operates on the last field rather than on a time average of fields. As a result of this approach, the spatial channel yields a well-focused JND map, as would be the case for an eye that tracked the motion of the attended object in the image sequence.

In keeping with the spirit of the above compromise, the "specious-present" nature of the spatial channel can be relaxed so that it averages over two fields, and hence over one frame. This measure decreases the visibility of the blank lines in an interlaced field, and is more physically and physiologically plausible than the "specious-present" solution. However, one artifact survives the temporal averaging of two fields: the appearance of a "comb" where there should be a smooth moving edge.

To understand why the comb appears in a model with two-field averaging, it is sufficient to visualize an object moving in the time interval between an even field (call it field N) and an odd field (call it field N+1). Assume that the object has a vertical edge that moves 5 pixels horizontally between fields. Also assume that the object edge is located at pixel n of the even lines of field N. This edge will then appear at pixel n+5 of the odd lines in field N+1. If there is no "filling in" between the raster lines of a particular field, averaging field N and field N+1 produces an edge that is no longer vertical, but alternates between pixels n and n+5. This is the "comb" effect.
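The comb can be reproduced with a small numerical sketch. The code below is illustrative only; the grid size, the helper names, and the choice to represent blank raster lines as `None` are assumptions. Because there is no fill-in, each frame line has a sample from exactly one field, so averaging over the available samples reduces to taking the line from whichever field holds it.

```python
def make_field(height, width, edge_pos, parity):
    """Frame-sized grid that contains only the lines of the given parity.

    Lines of the other parity are left blank (None), i.e. no fill-in.
    """
    frame = []
    for y in range(height):
        if y % 2 == parity:
            frame.append([1.0 if x >= edge_pos else 0.0 for x in range(width)])
        else:
            frame.append(None)
    return frame


def merge_fields(even_field, odd_field):
    """Average over available samples; without fill-in, one sample per line."""
    return [e if e is not None else o for e, o in zip(even_field, odd_field)]


def edge_positions(frame):
    """Column of the first lit pixel on every line."""
    return [line.index(1.0) for line in frame]


n = 3                                            # edge column in field N
field_n = make_field(6, 12, n, parity=0)         # even lines, edge at n
field_n1 = make_field(6, 12, n + 5, parity=1)    # odd lines, edge at n + 5
frame = merge_fields(field_n, field_n1)

print(edge_positions(frame))  # [3, 8, 3, 8, 3, 8]
```

The edge column alternates between n and n+5 from line to line, which is the comb described above.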

To understand why the actual visual system does not see this comb effect, assume that the eye tracks the object. The object is then stationary on the retina, because the eye anticipates the motion of the object into the next field. If the edge of the object is located at pixel n of the even lines of field N, it will also be located at pixel n of the odd lines of field N+1, simply because the eye's tracking of the object is nearly perfect.

To avoid the comb and other interlace artifacts, the perceptual metric generator performs a spatial fill-in between the lines of each field of the display. This vertical averaging avoids the comb effect because it provides a rendition of the instantaneous spatial edge, in which no temporal averaging occurs. In addition, the vertical averaging solves the original problem of the visibility of the interlace line structure, and is compatible with the known spatial spread of the electron-beam spot structure.
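A minimal sketch of such a vertical fill-in is given below. It is an assumption-laden illustration, not the generator's actual filter: the function name `fill_in`, the use of `None` for blank lines, the two-neighbour mean, and the border handling are all hypothetical choices.

```python
def fill_in(field):
    """Fill blank lines (None) with the mean of the lines directly above and below.

    At the top or bottom border, the single available neighbour is copied.
    """
    height = len(field)
    filled = []
    for y, line in enumerate(field):
        if line is not None:
            filled.append(list(line))
            continue
        neighbours = [field[k] for k in (y - 1, y + 1)
                      if 0 <= k < height and field[k] is not None]
        filled.append([sum(px) / len(neighbours) for px in zip(*neighbours)])
    return filled


# Hypothetical even field: a vertical edge at column 3, odd lines blank.
width, edge_pos = 12, 3
scan_line = [1.0 if x >= edge_pos else 0.0 for x in range(width)]
even_field = [list(scan_line) if y % 2 == 0 else None for y in range(6)]

filled = fill_in(even_field)
print([line.index(1.0) for line in filled])  # [3, 3, 3, 3, 3, 3]
```

Because every line of the filled frame comes from a single field, the edge stays vertical, and no sample from the neighbouring field (where the edge has moved) is mixed in.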

A novel apparatus and method for assessing the visibility of differences between two input image sequences, for improving image fidelity and visualization applications, have thus been shown and described. Those skilled in the art will nevertheless appreciate that many changes, modifications, and variations are possible in view of the described embodiments and the accompanying drawings.

Claims

A method for assessing the visibility of differences between two input image sequences, each having a plurality of input images comprising a color component, the method comprising:

(a) analyzing a color component of each of the input images to generate a pyramid having a plurality of resolution levels of the color component;

(b) applying temporal filtering to at least one of said levels of said pyramid of said color component to produce a color time response; and

(c) generating an image metric from the color time response.

The method of claim 1, wherein each of the input images is a half-height image.

The method of claim 1, wherein each of the input images further comprises a luminance component, the method further comprising:

(a') analyzing the luminance component of each of the input images to generate a pyramid having a plurality of resolution levels of the luminance component; and

(b') applying temporal filtering to at least one of the levels of the pyramid of the luminance component to produce a luminance time response,

wherein said generating step (c) generates the image metric from said luminance time response and said color time response.

The method of claim 1, further comprising:

(a") transforming the color component of the input image into the CIELUV uniform-color space prior to the analyzing step (a).

The method of claim 3, further comprising:

(a") transforming said luminance component of said input image to a CIE 1931 tristimulus value Y prior to said analyzing step (a).

The method of claim 3, wherein the generating step (c) comprises:

(c') separately generating a luminance metric from said luminance time response;

(c") separately generating a color metric from said color time response; and

(c"') generating the image metric from said luminance metric and said color metric.

The method of claim 3, wherein the applying step (b') applies temporal filtering to the lowest level of the pyramid of the luminance component to produce the luminance time response, and

the method further comprises (b") applying spatial filtering to the remaining levels of the pyramid of the luminance component to produce a luminance spatial response.

The method of claim 6, wherein said generating step (c") separately generates a color metric from said color time response, and said color time response is masked by said luminance time response.

An apparatus (112) for assessing the visibility of differences between two input image sequences, each having a plurality of input images comprising a color component, the apparatus comprising:

a pyramid generator (410, 510) for analyzing the color component of each of the input images to generate a pyramid having a plurality of resolution levels of the color component;

a temporal filter (520), coupled to the pyramid generator, for temporally filtering at least one of the levels of the pyramid of the color component to produce a color time response; and

an image metric generator (250, 550), coupled to the temporal filter, for generating an image metric from the color time response.

A computer-readable medium storing a plurality of instructions which, when executed by a processor, cause the processor to perform steps comprising:

(c) generating an image metric from the color time response.