KR20210076660A

KR20210076660A - Method and Apparatus for Stereoscopic Image Quality Assessment Based on Convolutional Neural Network

Info

Publication number: KR20210076660A
Application number: KR1020190168080A
Authority: KR
Inventors: 이상훈; 이성민
Original assignee: 연세대학교 산학협력단
Priority date: 2019-12-16
Filing date: 2019-12-16
Publication date: 2021-06-24

Abstract

The present embodiment provides an apparatus and method that extract local features from a stereoscopic image using a patch-based neural network, group the local features into a plurality of feature groups, and predict a global score from those groups, thereby evaluating the quality of the stereoscopic image.

Description

Method and Apparatus for Stereoscopic Image Quality Assessment Based on Convolutional Neural Network

The present invention relates to a method and apparatus for evaluating the image quality of a stereoscopic image.

The content described in this section merely provides background information for the present embodiment and does not constitute prior art.

A stereoscopic image is a three-dimensional image that uses binocular disparity so that the viewer perceives depth through the difference in viewing angle between the two eyes.

There are three major problems in applying an existing two-dimensional image quality assessment model to stereoscopic image quality assessment.

First, training data for stereoscopic images is insufficient. Second, no local quality scores exist for the images. Third, a general image classification model cannot be used as is when evaluating the quality of stereoscopic images.

Korean Registered Patent Publication No. 10-1393621 (2014.05.02)

The main object of the embodiments of the present invention is to evaluate the quality of a stereoscopic image by extracting local features from the stereoscopic image using a patch-based neural network, grouping the local features into a plurality of feature groups, and predicting a global score.

Other objects of the present invention that are not explicitly stated may be additionally considered within the scope that can be easily inferred from the following detailed description and its effects.

According to one aspect of the present embodiment, there is provided a method for evaluating the quality of a stereoscopic image by a computing device, the method comprising: calculating a local ground-truth score for each patch of an original stereoscopic image and a distorted stereoscopic image through a reference quality assessment model; extracting local features from the distorted stereoscopic image through a patch-based no-reference quality assessment model and predicting a local score with the local ground-truth score as the target; and grouping the local features into a plurality of feature groups and predicting a global score from the plurality of feature groups.

The calculating of the local ground-truth score may include converting each stereoscopic image into a cyclopean image, dividing into patches the reference cyclopean image converted from the original stereoscopic image and the distorted cyclopean image converted from the distorted stereoscopic image, and applying the patches to the reference quality assessment model.

The calculating of the local ground-truth score may include calculating the local ground-truth score for each patch by averaging the scores obtained by applying the reference quality assessment model to each pixel of the image.

The predicting of the global score may include predicting the global score from an integrated feature group that combines all of (i) a first feature group obtained by computing the mean of the local features, (ii) a second feature group obtained by computing the variance of the local features, (iii) a third feature group obtained by computing the mean of the upper percentile in the histogram of the local features, and (iv) a fourth feature group obtained by computing the mean of the lower percentile in the histogram of the local features.

According to another aspect of the present embodiment, there is provided an apparatus for evaluating the quality of a stereoscopic image, the apparatus comprising one or more processors and a memory storing one or more programs executed by the one or more processors, wherein the processor calculates a local ground-truth score for each patch of an original stereoscopic image and a distorted stereoscopic image through a reference quality assessment model, extracts local features from the distorted stereoscopic image through a patch-based no-reference quality assessment model and predicts a local score with the local ground-truth score as the target, and groups the local features into a plurality of feature groups and predicts a global score from the plurality of feature groups.

As described above, according to the embodiments of the present invention, the quality of a stereoscopic image can be evaluated by extracting local features from the stereoscopic image using a patch-based neural network, grouping the local features into a plurality of feature groups, and predicting a global score.

Even effects not explicitly mentioned here are treated as if described in this specification, provided that they are effects, including provisional effects, that are described in the following specification and expected from the technical features of the present invention.

FIG. 1 is a block diagram illustrating an apparatus for evaluating the quality of a stereoscopic image according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a method for evaluating the quality of a stereoscopic image according to another embodiment of the present invention.
FIG. 3 is a diagram illustrating the no-reference quality assessment model applied to the method for evaluating the quality of a stereoscopic image according to another embodiment of the present invention.
FIG. 4 is a diagram illustrating a quality map calculated by the method for evaluating the quality of a stereoscopic image according to another embodiment of the present invention.
FIG. 5 is a diagram illustrating how the method for evaluating the quality of a stereoscopic image according to another embodiment of the present invention extracts local features and predicts local scores.
FIG. 6 is a diagram illustrating how the method for evaluating the quality of a stereoscopic image according to another embodiment of the present invention predicts a global score from the feature groups.
FIG. 7 shows the results of simulations performed according to embodiments of the present invention.

Hereinafter, in describing the present invention, detailed descriptions of related known functions are omitted where they are obvious to those skilled in the art and would unnecessarily obscure the subject matter of the present invention, and some embodiments of the present invention are described in detail with reference to the exemplary drawings.

Unlike conventional computer vision tasks, applying deep learning to stereoscopic image quality assessment raises three problems.

First, training data for stereoscopic images is scarce.

The ImageNet image database used for image classification contains more than 15 million images matched with class labels. Because the labels stored in such an image database are semantically clear, ground truth is easy to obtain.

In contrast, databases for stereoscopic image quality assessment do not exceed 1,000 images. The LIVE database has only 365 distorted images covering 5 distortion types, and the SA database has only 360 images covering 9 distortion types.

Second, no local quality scores exist for the images.

Obtaining ground truth is not easy, because psychological tests must be conducted to collect subjective scores. Even if a patch-based approach that divides the image is applied, a subjective score is needed for each patch, and no verified database of such scores exists. Only subjective scores for the whole image are available, and there is no local ground truth. The mean subjective score of the whole image differs from the score of each patch, because the scores of the patches within an image are not uniform. For example, if the road region in the middle of a photograph of a park is severely distorted, the local quality of that region may be lower than the quality score of the image as a whole. Patch-based supervised learning cannot be performed without reliable local ground truth.

Third, a general image classification model cannot be used as is when evaluating the quality of stereoscopic images.

An image recognition model must detect the target object even when the image is distorted, so it must learn features that are robust to distortion. In contrast, a quality assessment model must capture distortion information sensitively, so it must extract features that are sensitive to distortion.

The apparatus for evaluating the quality of a stereoscopic image according to the present embodiment applies deep learning to evaluate the quality of the stereoscopic image in a no-reference manner.

The apparatus uses a patch-based approach to overcome the shortage of training data. To address the absence of reliable local subjective scores, it obtains pseudo ground truth for the local patches through patch-based learning and a reference quality assessment model.

FIG. 1 is a block diagram illustrating an apparatus for evaluating the quality of a stereoscopic image according to an embodiment of the present invention.

The stereoscopic image quality assessment apparatus 110 includes at least one processor 120, a computer-readable storage medium 130, and a communication bus 170.

The processor 120 may control the apparatus to operate as the stereoscopic image quality assessment apparatus 110. For example, the processor 120 may execute one or more programs stored in the computer-readable storage medium 130. The one or more programs may include one or more computer-executable instructions which, when executed by the processor 120, cause the stereoscopic image quality assessment apparatus 110 to perform operations according to the exemplary embodiment.

The computer-readable storage medium 130 is configured to store computer-executable instructions, program code, program data, and/or other suitable forms of information. The program 140 stored in the computer-readable storage medium 130 includes a set of instructions executable by the processor 120. In one embodiment, the computer-readable storage medium 130 may be a memory (volatile memory such as random access memory, non-volatile memory, or a suitable combination thereof), one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, any other form of storage medium that can be accessed by the stereoscopic image quality assessment apparatus 110 and store the desired information, or a suitable combination thereof.

The communication bus 170 interconnects the various components of the stereoscopic image quality assessment apparatus 110, including the processor 120 and the computer-readable storage medium 130.

The stereoscopic image quality assessment apparatus 110 may also include one or more input/output interfaces 150, which provide interfaces for one or more input/output devices, and one or more communication interfaces 160. The input/output interface 150 and the communication interface 160 are connected to the communication bus 170. An input/output device may be connected to the other components of the stereoscopic image quality assessment apparatus 110 through the input/output interface 150.

The stereoscopic image quality assessment apparatus 110 calculates a local ground-truth score for each patch of the original stereoscopic image and the distorted stereoscopic image through the reference quality assessment model.

The stereoscopic image quality assessment apparatus 110 extracts local features from the distorted stereoscopic image through the patch-based no-reference quality assessment model and predicts a local score with the local ground-truth score as the target.

The stereoscopic image quality assessment apparatus 110 groups the local features into a plurality of feature groups and predicts a global score from the plurality of feature groups.

FIG. 2 is a flowchart illustrating a method for evaluating the quality of a stereoscopic image according to another embodiment of the present invention. The method may be performed by the stereoscopic image quality assessment apparatus, a computing device, or the like.

In step S210, the processor calculates a local ground-truth score for each patch of the original stereoscopic image and the distorted stereoscopic image through the reference quality assessment model.

In step S220, the processor extracts local features from the distorted stereoscopic image through the patch-based no-reference quality assessment model and predicts a local score with the local ground-truth score as the target.

In step S230, the processor groups the local features into a plurality of feature groups and predicts a global score from the plurality of feature groups.

FIG. 3 is a diagram illustrating the no-reference quality assessment model applied to the method for evaluating the quality of a stereoscopic image according to another embodiment of the present invention.

The quality assessment model applied to the method according to the present embodiment is trained in two stages. The first stage predicts local quality scores using the pseudo ground truth obtained through the reference quality assessment model. The second stage predicts the global subjective score using the model trained in the first stage.

First, the stereoscopic image quality assessment method generates the pseudo ground truth through the reference quality assessment model.

FIG. 4 is a diagram illustrating a quality map calculated by the method for evaluating the quality of a stereoscopic image according to another embodiment of the present invention.

A reference image and a distorted image are input to the reference quality assessment model, which generates a quality map. The quality map may be divided into 64x36 patches, and the values within each patch are averaged to obtain a local score. Pseudo ground truth is thus obtained for each patch, and a local quality score map is generated. The Structural Similarity Index (SSIM) may be used to generate the quality map.

In the step of calculating the local ground-truth score, the processor converts each stereoscopic image into a cyclopean image, divides into patches the reference cyclopean image converted from the original stereoscopic image and the distorted cyclopean image converted from the distorted stereoscopic image, and applies the patches to the reference quality assessment model.

The cyclopean image is obtained as expressed in Equation 1.

Here, d is the pixel disparity between the left and right images I_l and I_r, and W_l and W_r are the normalized weights derived from the Gabor filter responses.
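As an illustration of the kind of expression that Equation 1 denotes, a cyclopean-image formulation that is common in the stereoscopic quality assessment literature and consistent with the symbols d, I_l, I_r, W_l, and W_r can be written as follows. The pixel coordinates (x, y), the symbol I_c on the left-hand side, and the sign of the disparity shift are assumptions and are not taken from the source.

```latex
I_c(x, y) = W_l(x, y)\, I_l(x, y) + W_r(x + d, y)\, I_r(x + d, y)
```

In this assumed form, the two weights applied at each pixel are normalized so that they sum to one, which makes the cyclopean image a per-pixel weighted blend of the left view and the disparity-compensated right view.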

The cyclopean image I'_c for the distorted stereoscopic image is synthesized in the same mapping manner. Because disparity cannot be measured reliably from the distorted stereoscopic image pair, the disparity map of the reference image pair is used to synthesize the distorted cyclopean image as well.
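The synthesis step can be sketched in Python with NumPy as follows. This is a minimal sketch rather than the patented implementation: the function name `synthesize_cyclopean`, the integer disparity map, the `x + d` shift direction, and the clipping at the image border are all assumptions, and the Gabor-based weights are taken as precomputed inputs.

```python
import numpy as np

def synthesize_cyclopean(left, right, disparity, w_left, w_right):
    """Blend a left/right pair into one cyclopean image (hypothetical helper).

    left, right, w_left, w_right: float arrays of shape (H, W).
    disparity: integer array of shape (H, W), horizontal shift in pixels.
    """
    h, w = left.shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    # Column in the right view matched to each left pixel (assumed x + d),
    # clipped so that shifted indices stay inside the image.
    shifted = np.clip(cols + disparity, 0, w - 1)
    right_matched = right[rows, shifted]
    w_right_matched = w_right[rows, shifted]
    # Normalize the two weights so that they sum to one at every pixel.
    total = w_left + w_right_matched + 1e-12
    return (w_left * left + w_right_matched * right_matched) / total
```

To follow the text, the same `disparity` array, estimated from the reference pair, would be passed both when synthesizing the reference cyclopean image and when synthesizing the distorted one.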

The Structural Similarity Index (SSIM) may be applied to measure the quality map between the reference cyclopean image I_c and the distorted cyclopean image I'_c. SSIM is a representative assessment technique consistent with the human visual system: it evaluates an image using its structural information, exploiting the fact that the human visual system is sensitive to structural information in images.
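A per-pixel SSIM map between the two cyclopean images can be sketched as below. This is a simplified sketch under stated assumptions: it uses a uniform (box) window instead of the Gaussian window of the original SSIM formulation, the window size `win=8` is arbitrary, and the function name `ssim_map` is hypothetical. The constants 0.01 and 0.03 are the usual SSIM stabilizing constants.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def ssim_map(ref, dist, win=8, data_range=1.0):
    """Per-pixel SSIM between two grayscale images using a uniform window.

    Returns an array of shape (H - win + 1, W - win + 1).
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    r = sliding_window_view(ref.astype(np.float64), (win, win))
    d = sliding_window_view(dist.astype(np.float64), (win, win))
    # Local means, variances, and covariance over each window.
    mu_r = r.mean(axis=(-2, -1))
    mu_d = d.mean(axis=(-2, -1))
    var_r = r.var(axis=(-2, -1))
    var_d = d.var(axis=(-2, -1))
    cov = (r * d).mean(axis=(-2, -1)) - mu_r * mu_d
    num = (2 * mu_r * mu_d + c1) * (2 * cov + c2)
    den = (mu_r ** 2 + mu_d ** 2 + c1) * (var_r + var_d + c2)
    return num / den
```

The resulting map is close to 1 where the distorted cyclopean image matches the reference and lower where structure has been damaged, which is what makes it usable as a source of local pseudo ground truth.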

The set of patches obtained by dividing the cyclopean image of the distorted stereoscopic image is denoted P_C.

N is the total number of patches.

In the step of calculating the local ground-truth score, the scores obtained by applying the reference quality assessment model to each pixel of the image are averaged within each patch to yield the local ground-truth score of that patch. The local ground-truth score is expressed as in Equation 2.

Here, h_p and w_p are the vertical and horizontal patch sizes. For example, h_p may be set to 16 and w_p to 18.
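The patch-wise averaging described above can be sketched as follows. The function name `local_scores` is hypothetical, and dropping border pixels that do not fill a whole patch is an assumption; the default patch size follows the example values h_p = 16 and w_p = 18.

```python
import numpy as np

def local_scores(quality_map, h_p=16, w_p=18):
    """Average a per-pixel quality map over non-overlapping h_p x w_p patches."""
    h, w = quality_map.shape
    n_rows, n_cols = h // h_p, w // w_p
    # Drop any border pixels that do not fill a whole patch (assumption).
    cropped = quality_map[: n_rows * h_p, : n_cols * w_p]
    blocks = cropped.reshape(n_rows, h_p, n_cols, w_p)
    # One mean per patch: the local pseudo ground-truth score map.
    return blocks.mean(axis=(1, 3))
```

Each entry of the returned array is the mean of the h_p x w_p quality values inside one patch, so the array as a whole plays the role of the local quality score map.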

The processor may preprocess the data input to the no-reference quality assessment model. The distorted left and right images I_l and I_r are normalized, and the processor generates the corresponding normalized image sets for the left and right images. M is the total number of distorted stereoscopic image pairs.

Each image is divided into non-overlapping patches. The resulting patch set is input to the no-reference quality assessment model.
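The preprocessing can be sketched as below. The source does not specify the normalization scheme, so the zero-mean, unit-variance normalization here is an assumption, as are the function names and the choice to stack co-located left and right patches into a two-channel patch pair.

```python
import numpy as np

def normalize(image, eps=1e-8):
    """Zero-mean, unit-variance normalization (an assumed scheme)."""
    image = image.astype(np.float64)
    return (image - image.mean()) / (image.std() + eps)

def split_patches(image, h_p=16, w_p=18):
    """Return non-overlapping patches as an array of shape (N, h_p, w_p)."""
    h, w = image.shape
    n_rows, n_cols = h // h_p, w // w_p
    cropped = image[: n_rows * h_p, : n_cols * w_p]
    # Reorder to (patch row, patch column, h_p, w_p), then flatten the grid.
    blocks = cropped.reshape(n_rows, h_p, n_cols, w_p).swapaxes(1, 2)
    return blocks.reshape(n_rows * n_cols, h_p, w_p)

def patch_pairs(left, right, h_p=16, w_p=18):
    """Stack co-located left/right patches into shape (N, 2, h_p, w_p)."""
    l = split_patches(normalize(left), h_p, w_p)
    r = split_patches(normalize(right), h_p, w_p)
    return np.stack([l, r], axis=1)
```

Patches are returned in row-major order, so patch n of the left view and patch n of the right view cover the same image region.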

FIG. 5 is a diagram illustrating how the method for evaluating the quality of a stereoscopic image according to another embodiment of the present invention extracts local features and predicts local scores.

The first stage performed by the patch-based no-reference quality assessment model is the regression of local subjective scores. The no-reference quality assessment model is a network of connected layers that learns weights and biases. It may be implemented as a neural network such as a convolutional neural network (CNN).

A CNN model is trained to predict the local quality score of each patch. Two convolutional layers and four fully connected layers may be used. The local quality score is regressed from the extracted 100-dimensional feature vector.
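A forward pass through a network of this shape can be sketched in plain NumPy as follows. Only the layer counts (two convolutional, four fully connected) and the 100-dimensional feature come from the text; the channel counts, kernel size, hidden widths, ReLU activations, two-channel input, random initialization, and the split of the fully connected layers into three feature layers plus one regression layer are all assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_relu(x, w, b):
    """Valid 2-D convolution (cross-correlation) followed by ReLU.

    x: (C_in, H, W), w: (C_out, C_in, k, k), b: (C_out,).
    """
    k = w.shape[-1]
    windows = sliding_window_view(x, (k, k), axis=(1, 2))
    out = np.einsum("chwij,ocij->ohw", windows, w) + b[:, None, None]
    return np.maximum(out, 0.0)

def init_params(rng, in_ch=2, h_p=16, w_p=18, k=3, feat_dim=100):
    """Random weights; every layer width except feat_dim is an assumption."""
    c1, c2 = 8, 16
    flat = c2 * (h_p - 2 * (k - 1)) * (w_p - 2 * (k - 1))
    sizes = [flat, 256, 128, feat_dim, 1]  # four fully connected layers
    scale = 0.05
    return {
        "conv": [
            (rng.normal(0, scale, (c1, in_ch, k, k)), np.zeros(c1)),
            (rng.normal(0, scale, (c2, c1, k, k)), np.zeros(c2)),
        ],
        "fc": [
            (rng.normal(0, scale, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])
        ],
    }

def forward(patch_pair, params):
    """Return (100-dimensional local feature, predicted local score)."""
    x = patch_pair
    for w, b in params["conv"]:
        x = conv_relu(x, w, b)
    x = x.reshape(-1)
    # Fully connected layers 1 to 3 form the assumed feature extractor.
    for w, b in params["fc"][:3]:
        x = np.maximum(x @ w + b, 0.0)
    feature = x
    # Fully connected layer 4 is the assumed scalar regressor.
    w, b = params["fc"][3]
    score = float((feature @ w + b)[0])
    return feature, score
```

The function returns both outputs because the first training stage uses the scalar score, while the second stage reuses the 100-dimensional feature of every patch.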

The feature vector extractor consists of the convolutional layers and fully connected layers and has its own set of parameters. The regression function has a separate set of parameters. The 100-dimensional feature vector is extracted from the n-th patch pair by the feature vector extractor.

Each patch pair is input to the model independently and is processed without regard to the other patches.

The objective function for optimizing the network parameters may use the mean squared error between the pseudo ground truth and the predicted score. The loss function is defined as this mean squared error.
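A mean squared error objective of this kind can be written, for illustration, as follows. The notation is assumed rather than taken from the source: f_θ denotes the feature vector extractor with parameters θ, g_φ the regression function with parameters φ, p_n the n-th patch pair, s_n its pseudo ground-truth score, and N the number of patches.

```latex
\mathcal{L}_{\text{local}}(\theta, \phi)
  = \frac{1}{N} \sum_{n=1}^{N} \left( g_{\phi}\!\left( f_{\theta}(p_n) \right) - s_n \right)^2
```

Minimizing this expression over θ and φ trains the extractor and the regressor jointly, so that the 100-dimensional features become sensitive to the local distortions captured by the pseudo ground truth.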

FIG. 6 is a diagram illustrating how the method for evaluating the quality of a stereoscopic image according to another embodiment of the present invention predicts a global score from the feature groups.

The second stage performed by the patch-based no-reference quality assessment model is the regression of the global subjective score.

The set of feature vectors is obtained from the stereoscopic image pair by the feature vector extractor.

In the step of predicting the global score, the global score is predicted from an integrated feature group that combines all of (i) a first feature group obtained by computing the mean of the local features, (ii) a second feature group obtained by computing the variance of the local features, (iii) a third feature group obtained by computing the mean of the upper percentile in the histogram of the local features, and (iv) a fourth feature group obtained by computing the mean of the lower percentile in the histogram of the local features.

The feature pooling layer combines four pooling functions. Each statistical pooling function converts the patch features into an image feature.

Using the trained model, a 100-dimensional feature is extracted for each patch. If a given image consists of N patches, N feature vectors of 100 dimensions each (N x 100 feature values) are produced from one image. An image feature vector is obtained by each pooling function, as expressed in Equations 4 to 7.

Equation 4 expresses mean pooling, Equation 5 variance pooling, Equation 6 high-percentile pooling, and Equation 7 low-percentile pooling. The four types of pooling are used together to obtain the integrated feature vector.

Here, l is the feature index.

n^{p+} and n^{p-} are the upper and lower bounds of the p-th percentile in the histogram of the patch features. The value used for the p-th percentile may be set in advance.

The image feature vectors are integrated by z().
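The four pooling functions and their concatenation can be sketched as follows. The function name `pool_features`, the default percentage `p=10`, and the reading of the percentile pools as the mean of the top or bottom p percent of patch values in each feature dimension are assumptions; concatenation stands in for the integration performed by z().

```python
import numpy as np

def pool_features(features, p=10.0):
    """Pool an (N, D) array of patch features into one (4 * D,) image feature.

    p is the percentage of patches used by the two percentile pools
    (the default of 10 is an assumption).
    """
    n = features.shape[0]
    k = max(1, int(round(n * p / 100.0)))
    ordered = np.sort(features, axis=0)  # sort each feature dimension separately
    mean_pool = features.mean(axis=0)
    var_pool = features.var(axis=0)
    high_pool = ordered[-k:].mean(axis=0)  # mean of the top p percent
    low_pool = ordered[:k].mean(axis=0)    # mean of the bottom p percent
    return np.concatenate([mean_pool, var_pool, high_pool, low_pool])
```

With 100-dimensional patch features the pooled vector has 400 entries regardless of N, which lets images with different numbers of patches feed the same global regressor.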

The pooled feature vector is regressed to predict the global subjective score. The model is trained by backpropagation to fine-tune the weights applied to the pooled feature vector.

The objective function for optimizing the network parameters may use the mean squared error between the subjective score and the predicted score. The loss function is defined as this mean squared error.
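The second-stage regression can be illustrated with a deliberately simplified sketch: a single linear layer fitted by gradient descent on the mean squared error. The linear model, the learning rate, the epoch count, and the function name `train_global_regressor` are assumptions; the embodiment describes a neural network fine-tuned by backpropagation, of which this is only the simplest instance.

```python
import numpy as np

def train_global_regressor(pooled, scores, lr=0.01, epochs=500):
    """Fit score ~ pooled @ w + b by gradient descent on the MSE.

    pooled: (M, F) pooled image features; scores: (M,) subjective scores.
    Returns (w, b, loss_history).
    """
    m, f = pooled.shape
    w = np.zeros(f)
    b = 0.0
    history = []
    for _ in range(epochs):
        err = pooled @ w + b - scores
        history.append(float(np.mean(err ** 2)))
        # Gradients of the mean squared error with respect to w and b.
        w -= lr * (2.0 / m) * (pooled.T @ err)
        b -= lr * (2.0 / m) * err.sum()
    return w, b, history
```

Here M corresponds to the number of distorted stereoscopic image pairs, and each row of `pooled` would be the integrated feature vector of one pair.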

FIG. 7 shows the results of simulations performed according to embodiments of the present invention.

In FIG. 7, the first row shows the distorted images, the second row shows the quality maps produced by the reference model, and the third row shows the predicted quality maps. The predicted local quality maps closely resemble the actual SSIM maps of the distorted images.

스테레오스코픽 이미지의 화질 평가 장치는 하드웨어, 펌웨어, 소프트웨어 또는 이들의 조합에 의해 로직회로 내에서 구현될 수 있고, 범용 또는 특정 목적 컴퓨터를 이용하여 구현될 수도 있다. 장치는 고정배선형(Hardwired) 기기, 필드 프로그램 가능한 게이트 어레이(Field Programmable Gate Array, FPGA), 주문형 반도체(Application Specific Integrated Circuit, ASIC) 등을 이용하여 구현될 수 있다. 또한, 장치는 하나 이상의 프로세서 및 컨트롤러를 포함한 시스템온칩(System on Chip, SoC)으로 구현될 수 있다.The stereoscopic image quality evaluation apparatus may be implemented in a logic circuit by hardware, firmware, software, or a combination thereof, or may be implemented using a general-purpose or special-purpose computer. The device may be implemented using a hardwired device, a field programmable gate array (FPGA), an application specific integrated circuit (ASIC), or the like. In addition, the device may be implemented as a system on chip (SoC) including one or more processors and controllers.

스테레오스코픽 이미지의 화질 평가 장치는 하드웨어적 요소가 마련된 컴퓨팅 디바이스 또는 서버에 소프트웨어, 하드웨어, 또는 이들의 조합하는 형태로 탑재될 수 있다. 컴퓨팅 디바이스 또는 서버는 각종 기기 또는 유무선 통신망과 통신을 수행하기 위한 통신 모뎀 등의 통신장치, 프로그램을 실행하기 위한 데이터를 저장하는 메모리, 프로그램을 실행하여 연산 및 명령하기 위한 마이크로프로세서 등을 전부 또는 일부 포함한 다양한 장치를 의미할 수 있다.The apparatus for evaluating the image quality of a stereoscopic image may be mounted in a form of software, hardware, or a combination thereof on a computing device or server provided with hardware elements. A computing device or server is all or part of a communication device such as a communication modem for performing communication with various devices or a wired/wireless communication network, a memory for storing data for executing a program, and a microprocessor for executing operations and commands by executing the program It can mean a variety of devices, including

Although FIG. 2 describes the processes as being executed sequentially, this is merely illustrative. Those skilled in the art may apply various modifications and variations, such as changing the order described in FIG. 2, executing one or more of the processes in parallel, or adding other processes, without departing from the essential characteristics of the embodiments of the present invention.

The operations according to the present embodiments may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. A computer-readable medium refers to any medium that participates in providing instructions to a processor for execution. The computer-readable medium may include program instructions, data files, data structures, or a combination thereof. Examples include magnetic media, optical recording media, and memory. The computer program may also be distributed over networked computer systems so that computer-readable code is stored and executed in a distributed manner. Functional programs, code, and code segments for implementing the present embodiments can be readily inferred by programmers skilled in the art to which the present embodiments pertain.

The present embodiments are intended to explain the technical idea of the present embodiments, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present embodiments should be interpreted according to the following claims, and all technical ideas within a scope equivalent thereto should be construed as falling within the scope of rights of the present embodiments.

Claims

A method for evaluating the image quality of a stereoscopic image by a computing device, the method comprising:
calculating a regional correct score for each patch of an original stereoscopic image and a distorted stereoscopic image through a reference quality evaluation model;
extracting regional features from the distorted stereoscopic image through a patch-based non-reference quality evaluation model and predicting a regional score based on the regional correct score; and
grouping the regional features into a plurality of feature groups and predicting a global score for the plurality of feature groups.

The method of claim 1,
wherein calculating the regional correct score comprises:
converting each stereoscopic image into a left-right composite image (cyclopean image), dividing into patches a reference left-right composite image converted from the original stereoscopic image and a distorted left-right composite image converted from the distorted stereoscopic image, and applying the patches to the reference quality evaluation model.

The method of claim 1,
wherein calculating the regional correct score comprises:
calculating the regional correct score for each patch by averaging the results obtained by applying the reference quality evaluation model to each pixel of the image.
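
As an illustration of the per-patch averaging recited above, the following is a minimal Python sketch rather than the claimed implementation. It assumes that a per-pixel quality map (for example, an SSIM map such as the one shown in FIG. 7) has already been computed, that patches are square and non-overlapping, and that trailing pixels that do not fill a whole patch are dropped. The function name `patch_scores` and the demonstration values are hypothetical.

```python
from statistics import fmean


def patch_scores(quality_map, patch_size):
    """Average a per-pixel quality map over non-overlapping square patches.

    quality_map: list of rows, each a list of per-pixel scores (assumed input).
    patch_size:  side length of each square patch (assumed parameter).
    Returns a 2-D list holding one averaged score per patch.
    """
    rows, cols = len(quality_map), len(quality_map[0])
    scores = []
    for top in range(0, rows - patch_size + 1, patch_size):
        row_scores = []
        for left in range(0, cols - patch_size + 1, patch_size):
            # Collect every pixel score that falls inside the current patch.
            pixels = [quality_map[r][c]
                      for r in range(top, top + patch_size)
                      for c in range(left, left + patch_size)]
            row_scores.append(fmean(pixels))
        scores.append(row_scores)
    return scores


# Hypothetical 4x4 map of per-pixel scores, split into four 2x2 patches.
demo_map = [
    [1.0, 1.0, 0.5, 0.5],
    [1.0, 1.0, 0.5, 0.5],
    [0.0, 0.0, 0.25, 0.75],
    [0.0, 0.0, 0.75, 0.25],
]
print(patch_scores(demo_map, 2))  # [[1.0, 0.5], [0.0, 0.5]]
```

Each averaged value would then serve as the per-patch training target for the non-reference model; the patch size and any overlap between patches are design choices that this sketch does not take from the source.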

The method of claim 1,
wherein predicting the global score comprises:
predicting the global score for an integrated feature group that combines all of (i) a first feature group obtained by calculating the mean of the regional features, (ii) a second feature group obtained by calculating the variance of the regional features, (iii) a third feature group obtained by calculating the mean of an upper proportion in the histogram of the regional features, and (iv) a fourth feature group obtained by calculating the mean of a lower proportion in the histogram of the regional features.
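
The four feature groups recited above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the upper and lower proportions are taken by sorting each feature dimension instead of building an explicit histogram, the tail share `fraction` is an assumed parameter, the population variance is used, and the function name `pool_regional_features` is hypothetical. The regression step that maps the integrated feature group to a global score is not shown.

```python
from statistics import fmean, pvariance


def pool_regional_features(features, fraction=0.1):
    """Pool per-patch feature vectors into four groups plus an integrated group.

    features: list of per-patch feature vectors of equal length (assumed input).
    fraction: share of patches treated as the upper/lower tail (assumed value).
    """
    # Number of patches in each tail; at least one patch is always used.
    k = max(1, round(len(features) * fraction))
    pooled = {"mean": [], "variance": [], "upper_mean": [], "lower_mean": []}
    for channel in zip(*features):  # iterate over one feature dimension at a time
        ordered = sorted(channel)
        pooled["mean"].append(fmean(channel))             # first group
        pooled["variance"].append(pvariance(channel))     # second group
        pooled["upper_mean"].append(fmean(ordered[-k:]))  # third group
        pooled["lower_mean"].append(fmean(ordered[:k]))   # fourth group
    # Integrated feature group: concatenation of the four groups.
    integrated = (pooled["mean"] + pooled["variance"]
                  + pooled["upper_mean"] + pooled["lower_mean"])
    return pooled, integrated


# Four hypothetical patches, each with a two-dimensional feature vector.
demo_features = [[0.0, 2.0], [1.0, 2.0], [2.0, 2.0], [3.0, 2.0]]
groups, integrated = pool_regional_features(demo_features, fraction=0.25)
print(integrated)  # [1.5, 2.0, 1.25, 0.0, 3.0, 2.0, 0.0, 2.0]
```

Pooling with both central statistics and tail statistics keeps information about the worst and best regions, which a plain average over patches would smooth away.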

An apparatus for evaluating the image quality of a stereoscopic image, the apparatus comprising one or more processors and a memory storing one or more programs executed by the one or more processors,
wherein the processor calculates a regional correct score for each patch of an original stereoscopic image and a distorted stereoscopic image through a reference quality evaluation model,
wherein the processor extracts regional features from the distorted stereoscopic image through a patch-based non-reference quality evaluation model and predicts a regional score based on the regional correct score, and
wherein the processor groups the regional features into a plurality of feature groups and predicts a global score for the plurality of feature groups.