KR102295652B1

KR102295652B1 - Method and apparatus for measurement of image quality based on deep-learning

Info

Publication number: KR102295652B1
Application number: KR1020200095086A
Authority: KR
Inventors: 김동현; 최증원; 김대식; 배성호
Original assignee: 국방과학연구소
Priority date: 2020-07-30
Filing date: 2020-07-30
Publication date: 2021-08-30

Abstract

According to an embodiment of the present invention, a method for measuring image quality performed by an image quality measurement device comprises the following steps: passing image data requiring quality measurement through a trained first convolutional neural network; passing visual saliency information for the image data through a trained second convolutional neural network; merging an output value of the first convolutional neural network and an output value of the second convolutional neural network; and obtaining a quality measurement result for the image data through a regression operation on the merged output value.

Description

Apparatus and method for measuring video quality based on deep learning

The present invention relates to an apparatus that measures video quality based on a deep learning technique and to a method for measuring video quality.

Because the original footage is so large, video is generally transmitted after lossy compression. After such lossy compression, however, the user's quality of experience typically drops. This creates a demand for an objective indicator of the degree of degradation, and a variety of quality measurement techniques have been developed, including measures originally used for still images (photographs).

Image quality measurement methods fall broadly into objective and subjective measurement. Objective quality measurement is a data-driven technique that measures the degree of quality degradation either with or without comparison against the original image. One representative measure is PSNR (peak signal-to-noise ratio). PSNR derives its value by comparison with the original image, but because it is computed from a simple pixel-by-pixel comparison between the original and the reconstructed image, it does not reflect well the degradation that a user subjectively perceives. To overcome this limitation, various other evaluation methods, such as SSIM, have been proposed.
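As an illustration of the pixel-by-pixel comparison described above, the following is a minimal sketch of PSNR for two equally sized grayscale images, using the standard definition 10·log10(peak² / MSE). The function name `psnr`, the `peak` parameter, and the sample pixel values are assumptions chosen for this example and do not come from the patent.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized grayscale images.

    Images are lists of rows of pixel values. `peak` is the maximum possible
    pixel value (255 for 8-bit images); both names are illustrative.
    """
    if len(reference) != len(distorted) or any(
        len(r) != len(d) for r, d in zip(reference, distorted)
    ):
        raise ValueError("images must have identical dimensions")

    squared_error = 0.0
    count = 0
    for ref_row, dis_row in zip(reference, distorted):
        for r, d in zip(ref_row, dis_row):
            squared_error += (r - d) ** 2
            count += 1

    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 3x3 original and its reconstruction after lossy compression.
original = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
decoded = [[54, 55, 60], [59, 77, 61], [75, 62, 58]]
print(f"PSNR: {psnr(original, decoded):.2f} dB")
```

Because every pixel error is weighted equally regardless of where a viewer would actually look, two reconstructions with the same PSNR can be perceived very differently, which is the shortcoming noted above.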

Objective quality measurement methods can be divided into three categories: full-reference methods, which compare the original image with the processed image; reduced-reference methods, which extract and compare only some information about the reference image rather than the reference image itself; and no-reference methods, which evaluate using only the processed image.

Because the final consumer of a video is a human, subjective quality measurement can be regarded as the ultimate goal of quality measurement. Measuring it, however, requires considerable effort and time, and because it reflects the subjectivity of the tester, it is difficult to use as a routine indicator.

Korean Patent Laid-Open Publication No. 10-2019-0076288 (published July 2, 2019)

According to an embodiment of the present invention, although subjective quality measurement is the ultimate goal of quality measurement, it is difficult to use as a routine indicator. The underlying problem is therefore to build, under no-reference conditions, an objective quality measurement technique whose results are similar to, that is, mathematically highly correlated with, subjective quality measurement results. To that end, a video quality measurement method and apparatus are provided in which convolutional neural networks are trained on image data and on visual saliency information, respectively, and used for video quality measurement. The quality measurement needed to determine the degree of degradation after a video has been compressed and restored is thus performed without comparison against the original video while still agreeing closely with subjective quality measurement results.
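The patent does not say which correlation measure is intended. As one common choice, the following sketch computes the Pearson linear correlation coefficient between objective scores and subjective scores. The function name `pearson_correlation` and the sample score lists are hypothetical and are not data from the patent.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical scores for five test videos: model output vs. viewer ratings.
predicted_quality = [2.1, 3.4, 4.0, 1.5, 4.6]
subjective_scores = [2.0, 3.5, 4.2, 1.8, 4.5]
print(f"correlation: {pearson_correlation(predicted_quality, subjective_scores):.3f}")
```

A value close to 1 would indicate that the objective measure ranks and scales videos much as human viewers do; rank-based measures such as Spearman correlation are another option that this sketch does not cover.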

However, the problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned here will be clearly understood from the following description by those of ordinary skill in the art to which the present invention pertains.

A video quality measurement method performed by a video quality measurement apparatus according to a first aspect includes: passing image data requiring quality measurement through a trained first convolutional neural network; passing visual saliency information for the image data through a trained second convolutional neural network; merging the output value of the first convolutional neural network and the output value of the second convolutional neural network; and obtaining a quality measurement result for the image data through a regression operation on the merged output value.

A video quality measurement apparatus according to a second aspect includes: an input unit that receives image data requiring quality measurement and visual saliency information for the image data; a processor unit that measures the quality of the image data; and an output unit that outputs the quality measurement result produced by the processor unit for the image data. The processor unit passes the image data through a trained first convolutional neural network, passes the visual saliency information through a trained second convolutional neural network, merges the output value of the first convolutional neural network and the output value of the second convolutional neural network, and obtains the quality measurement result for the image data through a regression operation on the merged output value.

According to a third aspect, a computer-readable recording medium storing a computer program includes instructions that, when the computer program is executed by a processor, cause the processor to perform the video quality measurement method.

According to a fourth aspect, a computer program stored in a computer-readable recording medium includes instructions that, when the computer program is executed by a processor, cause the processor to perform the video quality measurement method.

According to an embodiment, convolutional neural networks are trained on image data and on visual saliency information, respectively, and used for video quality measurement. As a result, the quality measurement needed to determine the degree of degradation after a video has been compressed and restored is performed without comparison against the original video while still agreeing closely with subjective quality measurement results.

FIG. 1 is a block diagram of a video quality measurement apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a video quality measurement method performed by a video quality measurement apparatus according to an embodiment of the present invention.

The advantages and features of the present invention, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. The present invention is not, however, limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the present invention pertains are fully informed of the scope of the invention. The present invention is defined only by the scope of the claims.

In describing the embodiments of the present invention, a detailed description of a well-known function or configuration is omitted where it is judged that it would unnecessarily obscure the gist of the present invention. The terms used below are defined in consideration of their functions in the embodiments of the present invention and may vary according to the intentions or customs of users and operators. Their definitions should therefore be based on the content of this specification as a whole.

In this specification, a singular expression includes the plural unless the context clearly indicates otherwise. In this application, terms such as "include" or "comprise" specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood as not excluding in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In addition, in the embodiments of the present invention, when a part is said to be connected to another part, this includes not only a direct connection but also an indirect connection through another medium. Likewise, when a part is said to include a certain component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

FIG. 1 is a block diagram of a video quality measurement apparatus according to an embodiment of the present invention.

Referring to FIG. 1, a video quality measurement apparatus 100 according to an embodiment includes an input unit 110, a processor unit 120, and an output unit 130, and may further include a storage unit 140.

The input unit 110 receives image data requiring quality measurement and visual saliency information for the image data, and provides them to the processor unit 120. For example, the input unit 110 may include a communication module capable of receiving the image data and the visual saliency information over a communication network, or an interface through which the image data and the visual saliency information can be input directly.

The processor unit 120 uses trained convolutional neural networks (CNNs) to obtain a quality measurement result for the image data requiring quality measurement. The processor unit 120 passes the image data through a trained first convolutional neural network, passes the visual saliency information through a trained second convolutional neural network, merges the output value of the first convolutional neural network and the output value of the second convolutional neural network, and obtains the quality measurement result for the image data through a regression operation on the merged output value. The processor unit 120 may perform a linear convolution before the regression operation on the merged output value. The image data passed through the first convolutional neural network may include the entire pixel area of the image data requiring quality measurement.

The output unit 130 outputs the quality measurement result produced by the processor unit 120 for the image data. For example, the output unit 130 may include a communication module capable of transmitting the quality measurement result data produced by the processor unit 120, or an interface capable of passing the quality measurement result data to another electronic device. The output unit 130 may also include a display device or a printing device capable of presenting the quality measurement result data in visually identifiable form.

The storage unit 140 may store the quality measurement result data produced by the processor unit 120, under the control of the processor unit 120. For example, the storage unit 140 may be a computer-readable recording medium such as magnetic media (hard disks, floppy disks, and magnetic tape), optical media (CD-ROMs and DVDs), magneto-optical media (floptical disks), or a hardware device specially configured to store and execute program instructions, such as flash memory.

FIG. 2 is a flowchart illustrating a video quality measurement method performed by a video quality measurement apparatus according to an embodiment of the present invention.

Referring to FIG. 2, the video quality measurement method according to an embodiment includes passing image data requiring quality measurement through a trained first convolutional neural network (S210).

The video quality measurement method according to an embodiment further includes passing visual saliency information for the image data through a trained second convolutional neural network (S220).

The video quality measurement method according to an embodiment further includes merging the output value of the first convolutional neural network and the output value of the second convolutional neural network (S230).

The video quality measurement method according to an embodiment further includes performing a linear convolution on the merged output value (S240).

The video quality measurement method according to an embodiment further includes obtaining a quality measurement result for the image data through a regression operation on the result of the linear convolution (S250).

Hereinafter, the video quality measurement method performed by the video quality measurement apparatus 100 according to an embodiment of the present invention is described in detail with reference to FIGS. 1 and 2.

The processor unit 120 of the video quality measurement apparatus 100 according to an embodiment of the present invention uses a first convolutional neural network and a second convolutional neural network to measure the quality of image data requiring quality measurement, for example data that has undergone image compression and restoration. To do so, a preliminary procedure that trains the first and second convolutional neural networks must be performed. For example, in the training data set for the first convolutional neural network, the input may be training image data and the label may be the quality value of the training image data. In the training data set for the second convolutional neural network, the input may be the visual saliency information of the training image data and the label may be a feature value of that visual saliency information. The first and second convolutional neural networks may each be trained in advance on their respective training data sets.
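The two training data sets described above can be pictured as two lists of (input, label) pairs. The sketch below is only an illustration of that pairing. The class names `QualitySample` and `SaliencySample`, the field names, and all numeric values are hypothetical. The patent does not specify how the quality value or the saliency feature value is obtained, so they appear here as plain placeholder numbers.

```python
from dataclasses import dataclass
from typing import List

Image = List[List[float]]  # a grayscale image or saliency map as rows of values


@dataclass(frozen=True)
class QualitySample:
    """Training pair for the first CNN: training image -> quality value."""
    image: Image
    quality_value: float


@dataclass(frozen=True)
class SaliencySample:
    """Training pair for the second CNN: saliency map -> feature value."""
    saliency_map: Image
    feature_value: float


# Placeholder data for one training image (values are illustrative only).
training_image = [[0.2, 0.4], [0.6, 0.8]]
training_saliency = [[0.0, 0.1], [0.9, 1.0]]

quality_dataset = [QualitySample(image=training_image, quality_value=3.7)]
saliency_dataset = [SaliencySample(saliency_map=training_saliency, feature_value=0.5)]

for sample in quality_dataset:
    print("first CNN label:", sample.quality_value)
for sample in saliency_dataset:
    print("second CNN label:", sample.feature_value)
```

Keeping the two data sets separate mirrors the description that each network is trained in advance on its own data set.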

With the first and second convolutional neural networks trained in advance, the input unit 110 of the video quality measurement apparatus 100 receives the image data requiring quality measurement and the visual saliency information for that image data, and provides them to the processor unit 120.

The processor unit 120 of the video quality measurement apparatus 100 then passes the image data requiring quality measurement through the pre-trained first convolutional neural network (S210), and passes the visual saliency information of that image data through the pre-trained second convolutional neural network (S220). The image data passed through the first convolutional neural network may include the entire pixel area of the image data requiring quality measurement.

Next, the processor unit 120 merges the quality value output by the first convolutional neural network with the feature value output by the second convolutional neural network (S230).

The processor unit 120 may then perform a linear convolution on the output value merged in step S230 (S240). The processor unit 120 may also omit this linear convolution.

Next, the processor unit 120 obtains the quality measurement result for the image data through a regression operation on the result of the linear convolution performed in step S240. If step S240 is omitted, the processor unit 120 may obtain the quality measurement result for the image data through a regression operation on the output value merged in step S230 (S250).
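Steps S210 to S250 can be sketched end to end as follows. This is a toy illustration under stated assumptions, not the patented implementation. Each "network" is reduced to one convolution, a ReLU, and global average pooling. Merging is assumed to be concatenation. The linear convolution is modeled as a linear map without activation, which is what a 1x1 convolution amounts to on a pooled feature vector. All weights in `PARAMS` are arbitrary untrained placeholders, and every function and parameter name is hypothetical.

```python
def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, the basic operation of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def branch(image, kernels):
    """Stand-in for one CNN: convolution, ReLU, then global average pooling."""
    features = []
    for kernel in kernels:
        activated = [max(0.0, v) for row in conv2d(image, kernel) for v in row]
        features.append(sum(activated) / len(activated))
    return features


def linear(vector, weights, bias):
    """Linear map: one output per weight row, with no activation function."""
    return [sum(w * v for w, v in zip(row, vector)) + b for row, b in zip(weights, bias)]


def measure_quality(image, saliency, params, use_linear_step=True):
    image_features = branch(image, params["image_kernels"])            # S210
    saliency_features = branch(saliency, params["saliency_kernels"])   # S220
    merged = image_features + saliency_features                        # S230 (concatenation assumed)
    if use_linear_step:                                                # S240 is optional
        merged = linear(merged, params["mix_weights"], params["mix_bias"])
    # S250: regression to a single scalar quality score
    return linear(merged, [params["regress_weights"]], [params["regress_bias"]])[0]


# Arbitrary placeholder weights; a real system would learn these in advance.
PARAMS = {
    "image_kernels": [
        [[0, -1, 0], [-1, 4, -1], [0, -1, 0]],
        [[1 / 9] * 3 for _ in range(3)],
    ],
    "saliency_kernels": [
        [[1 / 9] * 3 for _ in range(3)],
        [[1, 0, -1], [1, 0, -1], [1, 0, -1]],
    ],
    "mix_weights": [
        [1.0, 0.1, 0.0, 0.0],
        [0.0, 1.0, 0.1, 0.0],
        [0.0, 0.0, 1.0, 0.1],
        [0.1, 0.0, 0.0, 1.0],
    ],
    "mix_bias": [0.0, 0.0, 0.0, 0.0],
    "regress_weights": [-0.01, 0.02, 1.5, 0.1],
    "regress_bias": 2.0,
}

# Hypothetical 4x4 decoded frame and its saliency map (values in [0, 1]).
frame = [[10, 20, 30, 40], [20, 80, 90, 50], [30, 90, 100, 60], [40, 50, 60, 70]]
saliency_map = [[0.0, 0.1, 0.1, 0.0], [0.1, 0.8, 0.9, 0.1], [0.1, 0.9, 1.0, 0.1], [0.0, 0.1, 0.1, 0.0]]

print("score with S240:", measure_quality(frame, saliency_map, PARAMS))
print("score without S240:", measure_quality(frame, saliency_map, PARAMS, use_linear_step=False))
```

The `mix_weights` matrix is kept square so that the regression weights fit the merged vector whether or not the optional step S240 runs. Note that no reference frame appears anywhere in `measure_quality`, which is what makes the scheme a no-reference method.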

Thereafter, the storage unit 140 may store the quality measurement result produced by the processor unit 120 for the image data, under the control of the processor unit 120. The output unit 130 may likewise output the quality measurement result under the control of the processor unit 120.

As described above, according to an embodiment of the present invention, convolutional neural networks are trained on image data and on visual saliency information, respectively, and used for video quality measurement. The quality measurement needed to determine the degree of degradation after a video has been compressed and restored is thus performed without comparison against the original video while still agreeing closely with subjective quality measurement results.

Combinations of the steps in each flowchart attached to the present invention may be performed by computer program instructions. These computer program instructions may be loaded onto a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing equipment, so that the instructions, when executed by that processor, create means for performing the functions described in each step of the flowchart. These computer program instructions may also be stored in a computer-usable or computer-readable recording medium that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, so that the instructions stored in the recording medium can produce an article of manufacture containing instruction means that perform the functions described in each step of the flowchart. The computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps is performed on that equipment to create a computer-executed process, and the instructions that run on the equipment provide steps for performing the functions described in each step of the flowchart.

In addition, each step may represent a module, segment, or portion of code that includes one or more executable instructions for performing the specified logical function(s). It should also be noted that in some alternative embodiments the functions mentioned in the steps may occur out of order. For example, two steps shown in succession may in fact be performed substantially simultaneously, or may sometimes be performed in reverse order, depending on the functions involved.

The above description merely illustrates the technical idea of the present invention, and those of ordinary skill in the art to which the present invention pertains will be able to make various modifications and variations without departing from its essential characteristics. The embodiments disclosed herein are therefore intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present invention should be construed according to the following claims, and all technical ideas within an equivalent scope should be construed as falling within the scope of the present invention.

100: video quality measurement apparatus
110: input unit
120: processor unit
130: output unit

Claims

A video quality measurement method performed by a video quality measurement apparatus, the method comprising:
passing image data that is the target of quality measurement through a trained first convolutional neural network;
passing visual saliency information for the image data through a trained second convolutional neural network;
merging an output value of the first convolutional neural network and an output value of the second convolutional neural network; and
obtaining a quality measurement result for the image data through a regression operation on the merged output value,
wherein, in the training data set used to train the first convolutional neural network, the input is training image data and the label is a quality value of the training image data, and
wherein, in the training data set used to train the second convolutional neural network, the input is visual saliency information of the training image data and the label is a feature value of the visual saliency information of the training image data.

The method of claim 1,
wherein the image data passed through the first convolutional neural network includes the entire pixel area of the image data that is the target of quality measurement.

The method of claim 1,
further comprising performing a linear convolution before the regression operation on the merged output value.

A video quality measurement apparatus comprising:
an input unit that receives image data that is the target of quality measurement and visual saliency information for the image data;
a processor unit that measures the quality of the image data; and
an output unit that outputs the quality measurement result produced by the processor unit for the image data,
wherein the processor unit
passes the image data through a trained first convolutional neural network, passes the visual saliency information through a trained second convolutional neural network, merges an output value of the first convolutional neural network and an output value of the second convolutional neural network, and obtains the quality measurement result for the image data through a regression operation on the merged output value,
wherein, in the training data set used to train the first convolutional neural network, the input is training image data and the label is a quality value of the training image data, and
wherein, in the training data set used to train the second convolutional neural network, the input is visual saliency information of the training image data and the label is a feature value of the visual saliency information of the training image data.

The apparatus of claim 4,
wherein the image data passed through the first convolutional neural network includes the entire pixel area of the image data that is the target of quality measurement.

The apparatus of claim 4,
wherein the processor unit
performs a linear convolution before the regression operation on the merged output value.

A computer-readable recording medium storing a computer program,
wherein the computer program, when executed by a processor,
includes instructions for causing the processor to perform a method comprising: passing image data that is the target of quality measurement through a trained first convolutional neural network; passing visual saliency information for the image data through a trained second convolutional neural network; merging an output value of the first convolutional neural network and an output value of the second convolutional neural network; and obtaining a quality measurement result for the image data through a regression operation on the merged output value, wherein, in the training data set used to train the first convolutional neural network, the input is training image data and the label is a quality value of the training image data, and, in the training data set used to train the second convolutional neural network, the input is visual saliency information of the training image data and the label is a feature value of the visual saliency information of the training image data.

A computer program stored in a computer-readable recording medium,
wherein the computer program, when executed by a processor,
includes instructions for causing the processor to perform a method comprising: passing image data that is the target of quality measurement through a trained first convolutional neural network; passing visual saliency information for the image data through a trained second convolutional neural network; merging an output value of the first convolutional neural network and an output value of the second convolutional neural network; and obtaining a quality measurement result for the image data through a regression operation on the merged output value, wherein, in the training data set used to train the first convolutional neural network, the input is training image data and the label is a quality value of the training image data, and, in the training data set used to train the second convolutional neural network, the input is visual saliency information of the training image data and the label is a feature value of the visual saliency information of the training image data.