KR102492430B1

KR102492430B1 - Image processing apparatus and method for generating information beyond image area

Info

Publication number: KR102492430B1
Application number: KR1020210034698A
Authority: KR
Inventors: 심재완; 하헌필
Original assignee: 한국과학기술연구원
Priority date: 2021-03-17
Filing date: 2021-03-17
Publication date: 2023-01-30
Also published as: KR20220129852A

Abstract

The present invention relates to an image processing technology for generating information outside an image area. An image processing device receives a video having a first screen standard; sets, as an unsecured area, the image area corresponding to the difference between the first screen standard and a second screen standard defined differently from it; detects similar frames for each frame included in the input video and sets them as a similar frame group; generates an image for the unsecured area of each frame included in the input video by referring to the similar frame group; and outputs a video having the second screen standard from the original image included in the video of the first screen standard and the image generated for the unsecured area.

Description

Image processing apparatus and method for generating information beyond image area

The present invention relates to image processing technology, and more particularly to an image processing apparatus and method that generate information outside an image area by referring to various frames within a video or by relying on artificial intelligence, and to a recording medium on which the method is recorded.

With advances in communication technology, image processing technology, and hardware, it has become possible to process higher-quality video in real time, and video standards have changed considerably as a result. For example, videos used to have a 4:3 aspect ratio as a rule, whereas wide aspect ratios such as 16:9 are now preferred. Consequently, when a video produced in the past is played on a recent device with a new aspect ratio, the left and right areas of the display have no information to show and must be left empty. Unless the video is stretched, which changes its proportions, or cropped, which cuts away part of the picture, the playback device typically fills the areas that lack image information with black. This practice of filling the empty space with black is known as letterboxing or pillarboxing.

In addition, videos produced in the past suffer not only from an outdated aspect ratio but also from poor image quality. For example, because their resolution or bit rate is low, their coarse picture quality becomes conspicuous when they are played on a recent display device designed around high-resolution pixels.

In other words, videos produced in the past are unlikely to meet consumers' heightened discernment and expectations when used on recent devices. Various ideas have therefore been proposed for using old low-quality videos on recent playback devices that support high-quality video. The prior art document listed below introduces one such conversion technique for improving video quality.

"Method for estimating loss information of a low-resolution image and method for converting to a high-resolution image", Korean Patent Laid-open Publication No. 2011-0026942

The technical problem addressed by the present invention is as follows. When a video made to an older screen standard is played on a playback device built to the latest screen standard, conventional technology merely scales the video to fit one dimension and, having no information to display in the area outside the other dimension, shows that area in black using a letterbox or pillarbox. The invention aims to solve the resulting waste of the playback device's resources and to overcome the drawback that the quality perceived by the user actually drops even though a recent device is being used.

To solve the above technical problem, an image processing method according to an embodiment of the present invention includes: (a) receiving, by an image processing device, a video having a first screen standard; (b) setting, by the image processing device, as an unsecured area the image area corresponding to the difference between the first screen standard and a second screen standard defined differently from the first screen standard; (c) detecting, by the image processing device, similar frames for each frame included in the input video of the first screen standard and setting them as a similar frame group; (d) generating, by the image processing device, an image for the unsecured area of each frame included in the input video of the first screen standard by referring to the similar frame group; and (e) outputting, by the image processing device, a video having the second screen standard from the original image included in the video of the first screen standard and the image generated for the unsecured area.

In the image processing method according to an embodiment, step (b) of setting the unsecured area may take the second screen standard as the target and set, as the unsecured area, the image area that the first screen standard lacks because of a difference in screen standard involving at least one of aspect ratio, resolution, and angle of view.

In the image processing method according to an embodiment, step (c) of setting the similar frame group may calculate a similarity value by comparing two adjacent frames using at least one of feature matching, template matching, and histogram comparison.
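Of the three comparison options named above, histogram comparison is the simplest to illustrate. The following is a minimal sketch, not the patent's implementation: it assumes grayscale frames given as lists of rows of 0 to 255 integers, and the function names, the bin count, and the choice of histogram intersection as the metric are all illustrative assumptions.

```python
def gray_histogram(frame, bins=16):
    """Normalized intensity histogram of a grayscale frame (rows of 0-255 ints)."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for value in row:
            # Map the 0-255 intensity onto one of `bins` equal-width buckets.
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def histogram_similarity(frame_a, frame_b, bins=16):
    """Histogram intersection in [0, 1]; 1.0 means identical distributions.

    Assumption: intersection is only one possible metric; the source does
    not say which histogram comparison is used.
    """
    hist_a = gray_histogram(frame_a, bins)
    hist_b = gray_histogram(frame_b, bins)
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))
```

A histogram ignores where pixels are located, so two frames with different content but similar brightness distributions can score high. This may be one reason the text lists feature matching and template matching as alternatives.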

In the image processing method according to an embodiment, step (c) of setting the similar frame group may include: (c1) for one reference frame included in the video of the first screen standard, calculating the similarity between the reference frame and each of the adjacent frames that temporally precede or follow it; (c2) when the similarity calculated for an adjacent frame is equal to or greater than a threshold, setting that adjacent frame as a new reference frame, calculating its similarity with its own adjacent frames, and checking whether the newly calculated similarity is equal to or greater than the threshold, repeating this process in a chain so that only frames whose similarity is equal to or greater than the threshold are set as the similar frame group for the initial reference frame; and (c3) performing steps (c1) and (c2) for every frame included in the video of the first screen standard to derive a similar frame group for each frame.

In addition, step (c) of setting the similar frame group may store the similarity value between two adjacent frames in a look-up table, keyed by the pair of identifiers of those frames. When the similarity between two adjacent frames is needed, the look-up table is queried first: if a stored similarity value exists, it is read and used, and only if none exists is the similarity calculated and stored in the look-up table.
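The look-up table described above amounts to memoizing the similarity function on a pair of frame identifiers. A minimal sketch follows, assuming that frame indices serve as identifiers and that the similarity measure is symmetric, so that the pair (i, j) and the pair (j, i) share one entry. The class name `SimilarityCache` and its `computed` counter are hypothetical and exist only for illustration.

```python
class SimilarityCache:
    """Look-up table keyed by an unordered pair of frame indices."""

    def __init__(self, similarity_fn):
        self._similarity_fn = similarity_fn  # any function (frame, frame) -> float
        self._table = {}
        self.computed = 0  # number of real computations, for illustration only

    def get(self, frames, i, j):
        # Assumption: similarity is symmetric, so the key is order-independent.
        key = (min(i, j), max(i, j))
        if key not in self._table:
            # Compute only when no stored value exists, then store it.
            self._table[key] = self._similarity_fn(frames[i], frames[j])
            self.computed += 1
        return self._table[key]
```

The cache pays off in the chained search of steps (c1) to (c3), where the same adjacent pair is revisited every time a nearby frame takes its turn as the reference frame.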

In the image processing method according to an embodiment, step (c) of setting the similar frame group may include: (c4) sequentially calculating the similarity between adjacent frames for all frames included in the video of the first screen standard; and (c5) for each frame section in which similarity values equal to or greater than a threshold continue consecutively in time, setting a similar frame group that includes only the frames of that section.
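Steps (c4) and (c5) can be read as a single pass that cuts the frame sequence wherever the adjacent-frame similarity drops below the threshold. The sketch below works under that reading; the function name and the convention that `similarities[i]` compares frame `i` with frame `i + 1` are assumptions.

```python
def group_by_consecutive_similarity(similarities, threshold):
    """Split frames 0..N into runs of consecutive similar frames.

    similarities[i] is the similarity between frame i and frame i + 1,
    so N + 1 frames yield N values. Each frame lands in exactly one group.
    """
    groups = [[0]]
    for i, value in enumerate(similarities):
        if value >= threshold:
            groups[-1].append(i + 1)   # still inside the current run
        else:
            groups.append([i + 1])     # similarity dropped: start a new run
    return groups
```

Unlike the per-frame chained search, this variant computes each adjacent similarity once, and all frames in a run share the same group.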

In the image processing method according to an embodiment, step (d) of generating the image for the unsecured area may include: (d1) for a current frame included in the input video of the first screen standard, extending the current frame by using the geometric relationship between the current frame and a similar frame included in its similar frame group; and (d2) performing the frame extension of step (d1) for all frames included in the input video of the first screen standard to generate the image for the unsecured area.

In addition, step (d1) of extending the current frame may extract keypoints from each of the similar frame and the current frame and match the two images on the basis of the extracted keypoints, thereby extending the current frame.
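Keypoint extraction and matching normally call for a computer-vision library. As a self-contained stand-in, the sketch below swaps that technique for a brute-force search over horizontal shifts, which covers only the special case of a purely horizontal camera pan. It is not the keypoint-based method the text describes. The names `estimate_shift` and `extend_frame` and the parameters `pad` and `max_shift` are hypothetical, and frames are assumed to be equally sized grayscale lists of rows.

```python
def estimate_shift(current, similar, max_shift):
    """Find dx such that current[y][x] best matches similar[y][x - dx].

    Assumption: the two frames differ only by a horizontal translation.
    """
    height, width = len(current), len(current[0])
    best_dx, best_cost = 0, float("inf")
    for dx in range(-max_shift, max_shift + 1):
        total, count = 0, 0
        for y in range(height):
            for x in range(width):
                xs = x - dx
                if 0 <= xs < width:  # only compare the overlapping columns
                    total += abs(current[y][x] - similar[y][xs])
                    count += 1
        if count and total / count < best_cost:
            best_dx, best_cost = dx, total / count
    return best_dx


def extend_frame(current, similar, pad, max_shift):
    """Widen `current` by `pad` columns per side using pixels from `similar`.

    Pixels that neither frame covers stay None (still unsecured).
    """
    width = len(current[0])
    dx = estimate_shift(current, similar, max_shift)
    extended = []
    for row_cur, row_sim in zip(current, similar):
        out = []
        for k in range(width + 2 * pad):
            x = k - pad  # column in the current frame's coordinates
            if 0 <= x < width:
                out.append(row_cur[x])        # original pixel
            elif 0 <= x - dx < width:
                out.append(row_sim[x - dx])   # borrowed from the similar frame
            else:
                out.append(None)
        extended.append(out)
    return extended
```

In this sketch, the pixels left as `None` are the ones that frame extension could not supply.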

Furthermore, step (d) of generating the image for the unsecured area may further include: (d3) supplementing the image for the unsecured area using a Generative Adversarial Network (GAN) or an autoencoder when the frame extension of step (d1) does not produce an image for the unsecured area.

In the image processing method according to an embodiment, step (e) of outputting the video having the second screen standard may include: (e1) for every frame included in the video of the first screen standard, combining the original image with the image generated for the unsecured area to convert the frame into an image having the second screen standard.

In addition, step (e) of outputting the video having the second screen standard may further include: (e2) when the quality difference between the image generated for the unsecured area and the original image is equal to or greater than an acceptance limit, supplementing the resolution or bit rate of the inferior image from the similar frame.

Furthermore, step (e) of outputting the video having the second screen standard may further include: (e3) smoothing the boundary between the image generated for the unsecured area and the original image included in the video of the first screen standard.
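The source does not say how the boundary is smoothed. One simple possibility, shown here purely as an assumption, is a 3-tap box average applied to the pixels near the seam in each row. The function name `smooth_seam` and the `seam` and `radius` parameters are hypothetical.

```python
def smooth_seam(row, seam, radius):
    """Return a copy of `row` with a 3-tap box average near index `seam`.

    `seam` is the first column of the generated region; pixels from
    seam - radius up to (but not including) seam + radius are averaged
    with their immediate neighbours. Pixels elsewhere are left untouched.
    """
    out = list(row)
    lo = max(1, seam - radius)
    hi = min(len(row) - 1, seam + radius)
    for x in range(lo, hi):
        # Read from the original row so that earlier outputs do not leak in.
        out[x] = (row[x - 1] + row[x] + row[x + 1]) / 3
    return out
```

For example, `smooth_seam([100, 100, 100, 100, 40, 40, 40, 40], seam=4, radius=1)` turns the hard 100-to-40 step into 100, 80, 60, 40. A real implementation would more likely blend in two dimensions or use a gradient-domain method.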

Meanwhile, a computer-readable recording medium is provided on which a program for executing the above-described image processing methods on a computer is recorded.

To solve the above technical problem, an image processing device according to an embodiment of the present invention includes: an input unit that receives a video having a first screen standard; a memory that stores a program for converting the input video having the first screen standard into a video of a second screen standard defined differently from the first screen standard; and a processor that executes the program stored in the memory. The program stored in the memory includes instructions to set, as an unsecured area, the image area corresponding to the difference between the first screen standard and the second screen standard; to detect similar frames for each frame included in the input video of the first screen standard and set them as a similar frame group; to generate an image for the unsecured area of each frame included in the input video of the first screen standard by referring to the similar frame group; and to output a video having the second screen standard from the original image included in the video of the first screen standard and the image generated for the unsecured area.

In the image processing device according to an embodiment, the program stored in the memory may take the second screen standard as the target and set, as the unsecured area, the image area that the first screen standard lacks because of a difference in screen standard involving at least one of aspect ratio, resolution, and angle of view.

In the image processing device according to an embodiment, the program stored in the memory may, for one reference frame included in the video of the first screen standard, calculate the similarity between the reference frame and each of the adjacent frames that temporally precede or follow it. When the similarity calculated for an adjacent frame is equal to or greater than a threshold, the program sets that adjacent frame as a new reference frame, calculates its similarity with its own adjacent frames, and checks whether the newly calculated similarity is equal to or greater than the threshold, repeating this process in a chain so that only frames whose similarity is equal to or greater than the threshold are set as the similar frame group for the initial reference frame. By performing the similarity calculation and the group setting for every frame included in the video of the first screen standard, the program may derive a similar frame group for each frame.

In the image processing device according to an embodiment, the program stored in the memory may sequentially calculate the similarity between adjacent frames for all frames included in the video of the first screen standard and, for each frame section in which similarity values equal to or greater than a threshold continue consecutively in time, set a similar frame group that includes only the frames of that section.

In the image processing device according to an embodiment, the program stored in the memory may, for a current frame included in the input video of the first screen standard, use the geometric relationship between the current frame and a similar frame included in its similar frame group: it extracts keypoints from each of the similar frame and the current frame and matches the two images on the basis of the extracted keypoints, thereby extending the current frame. By performing this extension for all frames included in the input video of the first screen standard, the program may generate the image for the unsecured area.

In addition, when extending the current frame does not produce an image for the unsecured area, the program stored in the memory may supplement the image for the unsecured area using a Generative Adversarial Network (GAN) or an autoencoder.

In the image processing device according to an embodiment, the program stored in the memory may, for every frame included in the video of the first screen standard, combine the original image with the image generated for the unsecured area to convert the frame into an image having the second screen standard.

Embodiments of the present invention extend an image on the basis of information in adjacent frames of the video and, in addition, use artificial intelligence to supplement the areas that could not be extended in that way. They can thereby respond actively and effectively to changes in screen standards, avoid the letterbox or pillarbox that is shown in black when a change in screen standard leaves part of the picture with no information to display, and, by making full use of the resources of recent playback devices, improve the quality perceived by the user.

FIG. 1 is an exemplary diagram for explaining the basic idea of the present invention, which generates an image with a changed screen standard by acquiring information outside the image area.
FIG. 2 is a flowchart illustrating an image processing method for generating information outside an image area according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining the process of detecting similar frames in the image processing method according to an embodiment of the present invention.
FIG. 4 is a diagram for explaining the process of setting a similar frame group in the image processing method according to an embodiment of the present invention.
FIGS. 5 and 6 are diagrams illustrating the process of generating an image for the unsecured area in the image processing method according to an embodiment of the present invention, together with program code for that process.
FIGS. 7 and 8 are, respectively, an exemplary diagram and a block diagram for explaining the process of filling in empty areas of an image using a deep neural network.
FIG. 9 is a block diagram illustrating an image processing device for generating information outside an image area according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the drawings. Detailed descriptions of well-known functions or configurations that could obscure the gist of the present invention are omitted from the following description and the accompanying drawings. In addition, throughout the specification, stating that a part "includes" a component means that it may further include other components, not that other components are excluded, unless specifically stated otherwise.

Terms such as "first" and "second" may be used to describe various components, but the components are not limited by these terms. The terms serve only to distinguish one component from another. For example, a first component may be called a second component, and similarly a second component may be called a first component, without departing from the scope of the present invention.

The terms used in the present invention are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" are intended to indicate that the described features, numbers, steps, operations, components, parts, or combinations thereof exist, and should be understood as not precluding the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art and, unless explicitly defined in this application, are not to be interpreted in an idealized or excessively formal sense.

FIG. 1 is an exemplary diagram for explaining the basic idea of the present invention, which generates an image with a changed screen standard by acquiring information outside the image area.

FIG. 1(A) shows several frames from a video of the Barcelona Olympics marathon final, from which a video with a wide aspect ratio is to be produced by changing the current aspect ratio. The key observation is that, in many cases, information outside the field of view (angle of view) of the image shown at a given playback time is contained in the images just before or after that time. In other words, the information contained in the frames before and after a specific playback time can be used to extend the field of view (angle of view) of the image shown at that time.

FIG. 1(B) shows that a larger image has been generated by combining the original frame at the specific time illustrated in FIG. 1(A) with information obtained from the frames before and after it. In this way, information outside the image area at a specific time can be obtained, and an even larger image can be expected if the search over adjacent preceding and following frames is extended repeatedly.

Furthermore, artificial intelligence can be used to obtain information for areas that cannot be obtained from adjacent preceding or following frames. FIG. 1(C) shows a result that is larger still than the image of FIG. 1(B), which was generated from information in the frames before and after the given frame. The image of FIG. 1(C) was obtained by using artificial intelligence to estimate and newly generate the area outside the angle of view.

In other words, by combining information from adjacent frames in the video with image supplementation by artificial intelligence, it is possible to respond actively and effectively to changes in screen standards. The embodiments of the present invention presented below describe, with reference to the drawings, specific technical means for implementing this idea.

FIG. 2 is a flowchart illustrating an image processing method for generating information outside an image area according to an embodiment of the present invention.

In step S210, the image processing device receives a video having a first screen standard. Here, a video may be understood as a data set formed by a plurality of still image frames connected in succession. The frames therefore stand in a time-series relationship, and for convenience of description, relative to one frame selected at a specific time, a frame that comes earlier is called a preceding frame and a frame that comes later is called a following frame.

In step S220, the image processing device sets, as an unsecured area, the image area corresponding to the difference between the first screen standard and a second screen standard defined differently from the first screen standard. In this step, with the second screen standard as the target, the image area that the first screen standard lacks because of a difference in screen standard involving at least one of aspect ratio, resolution, and angle of view may be set as the unsecured area. This step identifies which area corresponds to the information that the original image does not have.
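For the aspect-ratio case, the unsecured area reduces to the padding needed on each side of the original frame. The following sketch illustrates that arithmetic under two assumptions: the original pixels are kept at their native size, and the missing area is split evenly between the two sides. The function name `unsecured_region` and its return layout are hypothetical. Differences in resolution or angle of view are not handled.

```python
from fractions import Fraction


def unsecured_region(width, height, target_ratio):
    """Padding (left, right, top, bottom) in pixels needed to reach the
    target aspect ratio without scaling the original frame.

    target_ratio is a (horizontal, vertical) pair such as (16, 9).
    """
    target = Fraction(*target_ratio)
    source = Fraction(width, height)
    if target > source:
        # Target is wider: the missing area lies to the left and right.
        extra = round(height * target) - width
        return (extra // 2, extra - extra // 2, 0, 0)
    if target < source:
        # Target is taller: the missing area lies above and below.
        extra = round(width / target) - height
        return (0, 0, extra // 2, extra - extra // 2)
    return (0, 0, 0, 0)
```

For a 1440x1080 (4:3) frame converted to 16:9, this gives 240 columns on each side, which is exactly the region a pillarbox would otherwise paint black.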

In step S230, the image processing device detects similar frames for each frame included in the input video of the first screen standard and sets them as a similar frame group. That is, the preceding or following frames adjacent to one reference frame are searched to check whether frames similar to the reference frame exist, and if the similarity value satisfies a preset criterion, the frame in question is grouped and managed as a similar frame. By performing this process for every frame in the video, a group of similar frames with high similarity to the frame at a given time is derived. In this step, a similarity value may be calculated by comparing two adjacent frames using at least one of feature matching, template matching, and histogram comparison.

Referring to FIG. 3, which illustrates the process of detecting similar frames in an image processing method according to an embodiment of the present invention, five frames are shown along the time axis. For example, taking the frame f_t at time t as the reference, the frames f_{t-1} and f_{t+1} at the adjacent times t-1 and t+1 are selected and compared with the reference frame f_t to calculate their similarity. If a calculated similarity is equal to or greater than a preset threshold, the similarity evaluation is performed again on the frame f_{t-2} or f_{t+2} at time t-2 or t+2, adjacent to that neighboring frame. In this way, neighboring frames similar to the reference frame f_t are searched, and only temporally consecutive similar frames are included in the similar frame group. Here, the similar frame group is the set of candidates that can later be used to obtain information about the area outside the screen standard of the reference frame f_t.

Meanwhile, various embodiments may be used, from an implementation standpoint, to detect similar frames for each frame included in the video of the first screen standard and set them as a similar frame group.

In a first embodiment, for one reference frame included in the video of the first screen standard, the similarity to each of the adjacent frames that temporally precede or follow the reference frame is calculated first. Then, when the similarity calculated for an adjacent frame is equal to or greater than the threshold, that adjacent frame is set as a new reference frame, the similarity to its own adjacent frames is calculated, and whether the newly calculated similarity is equal to or greater than the threshold is checked. Repeating this process in a chain sets only the frames whose similarity is equal to or greater than the threshold as the similar frame group for the initial reference frame. Performing the similarity calculation and the group setting for every frame included in the video of the first screen standard derives a similar frame group for each frame.
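The chained search of the first embodiment can be sketched as follows. The function name `similar_group` and the convention of passing the similarity measure as a callable are assumptions for this example; any of the similarity measures named for step S230 could be supplied.

```python
def similar_group(frames, ref, similarity, threshold):
    """Indices of the temporally consecutive frames around frames[ref] whose
    neighbor-to-neighbor similarity stays at or above the threshold."""
    group = [ref]
    # Walk backward: each accepted frame becomes the new reference frame.
    i = ref
    while i - 1 >= 0 and similarity(frames[i], frames[i - 1]) >= threshold:
        i -= 1
        group.insert(0, i)
    # Walk forward in the same chained manner.
    i = ref
    while i + 1 < len(frames) and similarity(frames[i], frames[i + 1]) >= threshold:
        i += 1
        group.append(i)
    return group


# Toy data: scalar "frames" and a toy similarity, purely for illustration.
toy_frames = [0.0, 0.1, 0.2, 0.9, 1.0]
toy_similarity = lambda a, b: 1.0 - abs(a - b)
print(similar_group(toy_frames, 1, toy_similarity, 0.7))
```

The search stops in each direction at the first pair that falls below the threshold, so the group contains only temporally consecutive frames.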

To reduce repeated computation and improve data reuse, it is preferable to store the similarity value between two adjacent frames in a look-up table, keyed by the pair of identifiers of those frames. When the similarity between two adjacent frames is to be calculated, the look-up table is queried first; if a stored similarity value exists, it is read and used, and only if none exists is the similarity calculated and stored in the look-up table.
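A minimal sketch of such a look-up table follows. The class name `SimilarityTable`, the `computed` counter, and the use of an order-independent key (which assumes the similarity measure is symmetric) are assumptions for this example.

```python
class SimilarityTable:
    """Caches similarity values keyed by a pair of frame identifiers."""

    def __init__(self, similarity):
        self._similarity = similarity  # callable(frame_a, frame_b) -> float
        self._table = {}
        self.computed = 0  # number of actual similarity computations

    def get(self, id_a, frame_a, id_b, frame_b):
        # Assumption: similarity is symmetric, so (a, b) and (b, a) share a key.
        key = (min(id_a, id_b), max(id_a, id_b))
        if key not in self._table:
            # Compute and store only when no value has been stored yet.
            self._table[key] = self._similarity(frame_a, frame_b)
            self.computed += 1
        return self._table[key]
```

When the chained search is run for every frame of the video, most adjacent pairs are requested more than once, and the table ensures that each pair is compared only once.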

FIG. 4 illustrates the process of setting a similar frame group in an image processing method according to an embodiment of the present invention, using a look-up table as an example. Referring to FIG. 4, consecutive frames are paired two at a time, each pair of frame identifiers is set as an index, and the corresponding similarity value is stored. As the similarity calculation is repeated, similarity values are recorded and the check for forming similar frame groups is performed. For example, assuming the threshold for the similarity decision is set to '0.7', the consecutive pairs (f₁,f₂), (f₂,f₃), (f₃,f₄) form 'similar frame group #1'. The similarity for (f₄,f₅), however, does not satisfy the threshold, and the consecutive pairs (f₅,f₆), (f₆,f₇) then form 'similar frame group #2'. By referring to the look-up table generated in this way, it is possible to quickly determine which adjacent frames are similar to the frame at a given time, and when the screen standard of the video is later changed, those frames can be used as candidate data for obtaining information outside the area of the original image.

As a second embodiment for detecting similar frames for each frame included in the video of the first screen standard and setting them as a similar frame group, the similarity between frames may be calculated in advance in time order. First, the similarity between adjacent frames is calculated sequentially for all frames included in the video of the first screen standard. Then, for each frame interval in which similarity values equal to or greater than the threshold occur consecutively in time, a similar frame group containing only the frames of that interval may be set. That is, whereas the first embodiment searches the preceding and following frames outward from one reference frame, the second embodiment calculates the similarity between adjacent frames in one direction. The look-up table introduced above can of course also be used in the second embodiment to manage the calculated similarities.
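The second embodiment amounts to splitting the sequence of adjacent-pair similarities into runs at or above the threshold. The sketch below assumes 0-based frame indices and a hypothetical function name `group_by_runs`; the similarity values in the usage line are invented solely to reproduce the pattern described for FIG. 4 and are not taken from the figure.

```python
def group_by_runs(pair_scores, threshold):
    """pair_scores[i] is the similarity between frame i and frame i + 1.
    Returns one list of frame indices per run of scores >= threshold."""
    groups = []
    current = []
    for i, score in enumerate(pair_scores):
        if score >= threshold:
            if not current:
                current = [i]      # a new run starts at frame i
            current.append(i + 1)  # the pair (i, i + 1) extends the run
        elif current:
            groups.append(current)  # the run is broken by a low score
            current = []
    if current:
        groups.append(current)
    return groups


# Hypothetical scores for the pairs (f1,f2) ... (f6,f7), threshold 0.7.
print(group_by_runs([0.9, 0.8, 0.75, 0.4, 0.85, 0.9], 0.7))
```

With these values the first four frames form one group and the last three form another, matching the two groups described for the look-up table example.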

Returning to FIG. 2, in step S240 the image processing device generates an image for the unsecured area by referring to the similar frame group for each frame included in the input video of the first screen standard. In this process, the current frame included in the input video of the first screen standard is first extended using the geometric relationship between the current frame and a similar frame included in the similar frame group. This extension of the current frame is then performed for every frame included in the input video of the first screen standard, thereby generating the image for the unsecured area.

Suppose that, when the current frame (the base frame serving as the reference) is to be extended, a similar frame group containing a plurality of frames highly similar to the current frame has been set. When one frame is selected from this group to extend the current frame, the current frame and the similar frame may show the same content (for example, both contain a scene of a person running), yet their characteristics may differ depending on the angle at which the camera was pointed, the perspective, the illumination, and so on at the moment each frame was captured. These characteristics therefore need to be matched when the current frame is extended. For example, depending on the angle of the camera, a cube may appear as a square, as a trapezoid, or as an arbitrary quadrilateral.

Merging the current frame and a similar frame into one frame requires various corrections for characteristic elements such as the camera angle and perspective between the two frames. From an implementation standpoint, the correction requires mathematical transformations such as stretching or shrinking the image, and it can be performed effectively using the series of processing steps illustrated below.

First, an artificial neural network is constructed. Then, images of basic solid models (cuboid, cube, cylinder, sphere, and so on) and of models composed by combining them, viewed from various angles and distances, are generated artificially by computer and used to train the network. The network thereby learns the concepts of angle and perspective, so that when it receives an image at an arbitrary angle and perspective it can output the image at a specified angle and perspective. Next, an artificial neural network that takes two images as input and outputs the difference in angle and perspective between them is trained, and it is used to transform the similar frame so that its angle and perspective match those of the base frame, allowing the two frames to be merged into one.

FIGS. 5 and 6 illustrate the process of generating an image for the unsecured area in an image processing method according to an embodiment of the present invention, together with program code for it.

It was explained above that, in step S240 of FIG. 2, an image for the unsecured area can be generated by referring to the similar frame group for each frame included in the input video of the first screen standard. To this end, keypoints may be extracted from each of the similar frame and the current frame, and the two images may be matched on the basis of the extracted keypoints, thereby extending the current frame. FIG. 5 illustrates the process of generating an image larger than the original from information obtained from the preceding and following frames adjacent to a given reference frame, and FIG. 6 shows that an even larger image is generated from the image of FIG. 5 by extending once more to the next adjacent preceding and following frames and using the information obtained from them.
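The extension by keypoint matching can be sketched under strong simplifying assumptions: the keypoints are taken as already extracted and matched (the detector and matcher are outside this example), and the geometric relationship between the two frames is reduced to a pure translation, whereas a practical implementation would likely estimate a more general transform. The function name `extend_with_neighbor` and its return convention are hypothetical.

```python
import numpy as np


def extend_with_neighbor(current, neighbor, pts_current, pts_neighbor):
    """Extend `current` with pixels of `neighbor`, given matched keypoints
    as (x, y) pairs. Returns the enlarged canvas and a mask of filled pixels."""
    # Median displacement that maps neighbor coordinates to current coordinates.
    shift = np.median(np.asarray(pts_current) - np.asarray(pts_neighbor), axis=0)
    dx, dy = int(round(shift[0])), int(round(shift[1]))
    h, w = current.shape[:2]
    nh, nw = neighbor.shape[:2]
    # Bounding box of both frames, expressed in current-frame coordinates.
    x0, y0 = min(0, dx), min(0, dy)
    x1, y1 = max(w, dx + nw), max(h, dy + nh)
    canvas = np.zeros((y1 - y0, x1 - x0) + current.shape[2:], dtype=current.dtype)
    filled = np.zeros((y1 - y0, x1 - x0), dtype=bool)
    # Paste the neighbor first so that original pixels take priority.
    canvas[dy - y0:dy - y0 + nh, dx - x0:dx - x0 + nw] = neighbor
    filled[dy - y0:dy - y0 + nh, dx - x0:dx - x0 + nw] = True
    canvas[-y0:-y0 + h, -x0:-x0 + w] = current
    filled[-y0:-y0 + h, -x0:-x0 + w] = True
    return canvas, filled
```

Pixels of the canvas covered by neither frame remain unfilled in the returned mask, which is how blank regions can remain after extension.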

Meanwhile, FIG. 6 shows blank areas, at the lower left and upper right, for which no image information has been secured. Technical means for compensating for this are introduced below.

FIGS. 7 and 8 are, respectively, an example diagram and a block diagram for explaining the process of filling blank areas in an image using a deep neural network.

The embodiments of the present invention generate the image for the unsecured area by referring to the similar frame group. Nevertheless, for the required screen standard, it may not be possible to obtain information about the unsecured area from the similar frame group. That is, when the current frame included in the input video of the first screen standard has been extended using the geometric relationship between the current frame and a similar frame included in the similar frame group, but an image for the unsecured area has still not been generated, the image for the unsecured area may be supplemented using a GAN (Generative Adversarial Network) or an autoencoder.
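Only the control flow of this fallback is sketched below: the generative model is invoked solely when some pixels remain unfilled, and its output is used only for those pixels. The names `fill_remaining` and `mean_fill` are hypothetical, and `mean_fill` is a trivial stand-in that merely marks where a trained GAN or autoencoder would be called; it is not a generative model.

```python
import numpy as np


def fill_remaining(canvas, filled, generator):
    """Fill the pixels where `filled` is False using `generator`, a callable
    (canvas, filled) -> image with the same shape as `canvas`."""
    if filled.all():
        return canvas  # frame extension already covered everything
    generated = generator(canvas, filled)
    out = canvas.copy()
    # Keep every pixel obtained from real frames; replace only the holes.
    out[~filled] = generated[~filled]
    return out


def mean_fill(canvas, filled):
    """Placeholder generator (assumption): the mean of the known pixels."""
    return np.full_like(canvas, canvas[filled].mean())
```

In a real system the `generator` argument would wrap the trained network, while the surrounding logic could stay the same.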

Referring to FIG. 7 and comparing it with FIG. 6, it can be seen that the upper-left area 710 and the lower-right area 720 have been filled in. Referring to FIG. 8, a plurality of adjacent images, including the image with missing information, are first input as the input image 810. Then, the deep neural network 820 is provided with clue information 830 about the vacant image in which information is missing, newly generates an image that can fill the area lacking information, and outputs it as the output image 840. The deep neural network 820 may be implemented using a GAN or an autoencoder.

Finally, in step S250 of FIG. 2, the image processing device outputs a video having the second screen standard from the original image included in the video of the first screen standard and the image generated for the unsecured area. Here, for every frame included in the video of the first screen standard, the original image and the image generated for the unsecured area may be combined and converted into an image having the second screen standard.

The cases in which frames are judged to be highly similar can be divided broadly into three types. The first type is footage shot with the capturing means (camera) fixed, the second type is footage shot while moving the capturing means up, down, left, or right, and the third type is footage shot while zooming in or out with the capturing means.

An example of the first case is a scene in which an anchor speaks during a news program produced in a studio. When similarity is compared, pixels that differ greatly between frames are discarded, and only the pixels that differ little are used to improve the resolution. For example, if four similar frames are determined to exist in an interval temporally continuous with the reference frame, the similar frame group composed of these similar frames can be used to enhance 1 by 1 image information into 4 by 4 image information.
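The pixel-selection rule for the fixed-camera case can be sketched as follows. The description does not state how the retained pixels are combined into a higher-resolution result, so this example shows only the discard step followed by a plain temporal average; the averaging, the `max_diff` parameter, and the name `fuse_static_frames` are assumptions.

```python
import numpy as np


def fuse_static_frames(reference, similar_frames, max_diff=10.0):
    """Average the reference with those pixels of the similar frames that
    differ from it by at most max_diff; larger differences are discarded."""
    ref = reference.astype(np.float64)
    stack = np.stack([f.astype(np.float64) for f in similar_frames])
    keep = np.abs(stack - ref) <= max_diff  # per-frame, per-pixel decision
    total = ref + np.where(keep, stack, 0.0).sum(axis=0)
    count = 1 + keep.sum(axis=0)  # the reference pixel always counts
    return total / count
```

Pixels belonging to moving content, such as the speaker's face, are thus taken from the reference frame alone, while the static background benefits from all retained frames.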

In the second or third case, the reference frame can be extended on the basis of the preceding and following frames adjacent to it. For example, images having a 4:3 aspect ratio can be joined to generate an image having a 16:9 aspect ratio.

When the quality of a video is improved using the image generation method proposed above, the resolution may be inconsistent, or an unintended, incongruous color (for example, black) may appear in some areas of the supplemented image. Image post-processing or filtering is therefore required to remove this incongruity. From an implementation standpoint, when the quality difference between the image generated for the unsecured area and the original image is equal to or greater than an acceptance limit, the resolution or bit rate of the inferior image may be supplemented from the similar frames. In addition, the boundary between the image generated for the unsecured area and the original image included in the video of the first screen standard may be smoothed.
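One common way to smooth such a boundary is to feather it, that is, to blend the two images with a weight that ramps across a narrow band at the seam. The sketch below handles grayscale frames and vertical seams only, and assumes that the generated image covers the whole extended frame, including the region of the original; the linear ramp, the band width, and the name `feather_blend` are assumptions, not the method fixed by the description.

```python
import numpy as np


def feather_blend(generated, original, left, band=4):
    """Blend a grayscale `original` into `generated` at column offset `left`,
    ramping the weight of the original over `band` columns at each seam."""
    h, w = original.shape[:2]
    alpha = np.ones(w)
    ramp = np.arange(1, band + 1) / (band + 1)  # rises toward the interior
    alpha[:band] = ramp
    alpha[w - band:] = ramp[::-1]
    out = generated.astype(np.float64)  # astype returns a new array
    region = out[:, left:left + w]
    # Weighted mix: the original dominates the interior, the generated
    # image dominates at the seams.
    out[:, left:left + w] = alpha * original + (1.0 - alpha) * region
    return out
```

The interior of the original frame is left untouched, and only the band next to each seam is mixed with the generated content.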

FIG. 9 is a block diagram of an image processing device 920 that generates information beyond the image area according to an embodiment of the present invention; corresponding to the flowchart of FIG. 2, it shows the operation and function of each component from a hardware implementation standpoint. To avoid duplicating the description, only the overall configuration is outlined here.

The input unit 921 is a means for receiving a video 910 having the first screen standard. The memory 923 stores a program that converts the video 910 having the first screen standard, input through the input unit 921, into a video 930 of a second screen standard defined differently from the first screen standard. The processor 925 executes the program stored in the memory 923. Here, the program stored in the memory 923 includes instructions for: setting, as an unsecured area, the image area corresponding to the difference between the first screen standard and the second screen standard defined differently from it, with the first screen standard as the reference; detecting similar frames for each frame included in the input video of the first screen standard and setting them as a similar frame group; generating an image for the unsecured area by referring to the similar frame group for each frame included in the input video of the first screen standard; and outputting a video having the second screen standard from the original image included in the video of the first screen standard and the image generated for the unsecured area.

With the second screen standard as the target, the program stored in the memory 923 may set, as the unsecured area, the image area that the first screen standard does not hold because of a difference in screen standard, including at least one of aspect ratio, resolution, and angle of view.

The program stored in the memory 923 may calculate, for one reference frame included in the video of the first screen standard, the similarity to each of the adjacent frames that temporally precede or follow the reference frame. When the similarity calculated for an adjacent frame is equal to or greater than the threshold, the program sets that adjacent frame as a new reference frame, calculates the similarity to its adjacent frames, and checks whether the newly calculated similarity is equal to or greater than the threshold. By repeating this process in a chain, it sets only the frames whose similarity is equal to or greater than the threshold as the similar frame group for the initial reference frame, and by performing the similarity calculation and the group setting for every frame included in the video of the first screen standard, it derives a similar frame group for each frame.

The program stored in the memory 923 may sequentially calculate the similarity between adjacent frames for all frames included in the video of the first screen standard and, for each frame interval in which similarity values equal to or greater than the threshold occur consecutively in time, set a similar frame group containing only the frames whose similarity is equal to or greater than the threshold.

The program stored in the memory 923 may extend the current frame included in the input video of the first screen standard by using the geometric relationship between the current frame and a similar frame included in the similar frame group, extracting keypoints from each of the similar frame and the current frame and matching the two images on the basis of the extracted keypoints, and may generate the image for the unsecured area by performing this extension for every frame included in the input video of the first screen standard. In addition, when no image for the unsecured area is generated through the extension of the current frame, the program stored in the memory 923 may supplement the image for the unsecured area using a GAN (Generative Adversarial Network) or an autoencoder.

For every frame included in the video of the first screen standard, the program stored in the memory 923 may combine the original image and the image generated for the unsecured area and convert them into an image having the second screen standard.

According to the embodiments of the present invention, an image is extended on the basis of information in adjacent frames of the video, and the areas that could not be extended are supplemented through artificial intelligence. This makes it possible to respond actively and effectively to changes in screen standard, to avoid the letterbox or pillarbox in which parts of the image are shown in black because there is no information to display there after a change in screen standard, and, by making full use of the resources of the latest video playback devices, to improve the quality perceived by the user.

Meanwhile, the embodiments of the present invention can be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes all types of recording devices in which data readable by a computer system is stored.

Examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices. The computer-readable recording medium may also be distributed over computer systems connected through a network, so that the computer-readable code is stored and executed in a distributed manner. Functional programs, code, and code segments for implementing the present invention can be readily inferred by programmers in the technical field to which the present invention belongs.

The present invention has been described above with a focus on its various embodiments. Those of ordinary skill in the art to which the present invention pertains will understand that it can be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered in an illustrative rather than a limiting sense. The scope of the present invention is set forth in the claims rather than in the foregoing description, and all differences within the equivalent scope should be construed as being included in the present invention.

810: input image
820: deep neural network 830: vacancy image clue information
840: output image
910: first-standard video
920: image processing device 921: input unit
923: memory 925: processor
930: second-standard video

Claims

An image processing method comprising:
(a) receiving, by an image processing device, a video having a first screen standard;
(b) setting, by the image processing device, as an unsecured area, an image area corresponding to a difference between the first screen standard and a second screen standard defined differently from the first screen standard, with the first screen standard as the reference;
(c) detecting, by the image processing device, similar frames for each frame included in the input video of the first screen standard and setting them as a similar frame group;
(d) generating, by the image processing device, an image for the unsecured area by referring to the similar frame group for each frame included in the input video of the first screen standard; and
(e) outputting, by the image processing device, a video having the second screen standard from an original image included in the video of the first screen standard and the image generated for the unsecured area.

According to claim 1,
In step (b),
An image processing method in which, with the second screen standard as the target, an image area that the first screen standard does not hold because of a difference in screen standard, including at least one of aspect ratio, resolution, and angle of view, is set as the unsecured area.

According to claim 1,
In step (c),
An image processing method of calculating a similarity value by comparing two adjacent frames using at least one of feature matching, template matching, and histogram comparison.

According to claim 1,
In step (c),
(c1) calculating, for one reference frame included in the video of the first screen standard, a similarity to each of the adjacent frames temporally preceding or following the reference frame;
(c2) when the similarity calculated for an adjacent frame is equal to or greater than a threshold, setting that adjacent frame as a new reference frame, calculating again the similarity to each of its adjacent frames, and checking whether the newly calculated similarity is equal to or greater than the threshold, and repeating this process in a chain so that only frames whose similarity is equal to or greater than the threshold are set as the similar frame group for the initial reference frame; and
(c3) deriving a similar frame group for each frame by performing steps (c1) and (c2) for all frames included in the video of the first screen standard.

According to claim 4,
In step (c),
An image processing method in which the similarity value between two adjacent frames is stored in a look-up table, keyed by the identifier pair of the corresponding frames, and
when the similarity between two adjacent frames is to be calculated, the look-up table is queried first; if a previously stored similarity value exists, the stored value is read and used, and only if no previously stored value exists is the similarity calculated and stored in the look-up table.

According to claim 1,
In step (c),
(c4) sequentially calculating similarities between adjacent frames for all frames included in the video of the first screen standard; and
(c5) setting, for each frame interval in which the calculated similarity remains equal to or greater than a threshold over consecutive frames in time order, a similar frame group including only the frames whose similarity is equal to or greater than the threshold.
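
Steps (c4) and (c5) amount to a single pass over the video, which can be sketched as below. This is an illustrative sketch: `similarity` is a hypothetical caller-supplied function, frames are addressed by index, and a frame with no similar neighbour is kept as a single-frame group, which the claim does not specify.

```python
def segment_similar_groups(frames, similarity, threshold):
    """Split the video into consecutive runs of frame indices in which each
    adjacent pair has similarity >= threshold."""
    if not frames:
        return []
    groups = [[0]]
    for i in range(1, len(frames)):
        if similarity(frames[i - 1], frames[i]) >= threshold:
            groups[-1].append(i)   # still inside the same interval
        else:
            groups.append([i])     # similarity dropped: start a new interval
    return groups
```

Each adjacent pair is compared exactly once, so a video of n frames needs n - 1 similarity calculations.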

According to claim 1,
In step (d),
(d1) extending a current frame included in the input video of the first screen standard by using a geometric relationship between the current frame and a similar frame included in the similar frame group; and
(d2) generating an image for the unsecured area by performing the frame extension of step (d1) on all frames included in the input video of the first screen standard.

According to claim 7,
In the step (d1),
An image processing method of extending the current frame by extracting keypoints from each of the similar frame and the current frame and matching the two images based on the extracted keypoints.
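
The sketch below illustrates the idea of keypoint-based frame extension in a deliberately simplified form. It assumes that matched keypoint pairs are already available (keypoint extraction is not shown), that the geometric relationship between the two frames is a pure translation, and that frames are nested lists of pixel values; a practical implementation would more likely estimate a homography. The names `estimate_translation` and `extend_frame` are hypothetical.

```python
from statistics import median


def estimate_translation(matches):
    """matches: list of ((x_cur, y_cur), (x_sim, y_sim)) keypoint pairs.
    Returns the (dx, dy) shift mapping similar-frame coordinates onto
    current-frame coordinates, using the median as a simple robust estimate."""
    dx = median(xc - xs for (xc, _), (xs, _) in matches)
    dy = median(yc - ys for (_, yc), (_, ys) in matches)
    return round(dx), round(dy)


def extend_frame(current, similar, matches, pad):
    """Widen `current` by `pad` columns on each side and fill the new columns
    from `similar` where the two frames overlap after alignment.
    Pixels that remain unsecured are left as None."""
    h, w = len(current), len(current[0])
    dx, dy = estimate_translation(matches)
    canvas = [[None] * (w + 2 * pad) for _ in range(h)]
    for y, row in enumerate(similar):
        for x, px in enumerate(row):
            cx, cy = x + dx + pad, y + dy
            if 0 <= cy < h and 0 <= cx < w + 2 * pad:
                canvas[cy][cx] = px
    # The original image always takes priority over borrowed pixels.
    for y in range(h):
        canvas[y][pad:pad + w] = current[y]
    return canvas
```

Pixels still marked `None` after this step correspond to the portion of the unsecured area that frame extension alone could not fill.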

According to claim 7,
The image processing method further comprising: (d3) supplementing the image of the unsecured area using a Generative Adversarial Network (GAN) or an autoencoder when the image of the unsecured area is not generated through the frame extension of step (d1).

According to claim 1,
In step (e),
An image processing method comprising: (e1) producing an image having the second screen standard by combining, for every frame included in the video of the first screen standard, the original image and the image generated for the unsecured area.

According to claim 10,
The image processing method further comprising: (e2) supplementing the resolution or bit rate of the lower-quality image from the similar frame when the quality difference between the image generated for the unsecured area and the original image exceeds an acceptance limit.

According to claim 10,
The image processing method further comprising: (e3) smoothing a boundary between the image generated for the unsecured area and the original image included in the video of the first screen standard.
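
One simple way to smooth such a boundary is a local box blur restricted to a narrow band around the seam, sketched below for a single row of pixel intensities. The claim does not name a smoothing technique; the box blur, the band width, and the function name `smooth_seam` are assumptions made for illustration.

```python
def smooth_seam(row, seam, radius=2):
    """Box-blur only the pixels within `radius` of the seam index, where
    row[:seam] is one image (e.g. original) and row[seam:] the other
    (e.g. generated). Pixels outside the band are returned unchanged."""
    out = list(row)
    lo = max(0, seam - radius)
    hi = min(len(row), seam + radius)
    for x in range(lo, hi):
        a = max(0, x - radius)
        b = min(len(row), x + radius + 1)
        # Average over the unmodified input so results do not cascade.
        out[x] = sum(row[a:b]) / (b - a)
    return out
```

Applying the function to every row that crosses a vertical seam (or to every column for a horizontal seam) softens the transition while leaving the rest of both images untouched.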

A computer-readable recording medium recording a program for executing the method of any one of claims 1 to 12 in a computer.

an input unit for receiving a video having a first screen standard;
a memory for storing a program for converting the input video having the first screen standard into a video having a second screen standard defined differently from the first screen standard; and
a processor that executes the program stored in the memory,
wherein the program stored in the memory includes instructions for:
setting, as an unsecured area, an image area corresponding to the difference between the first screen standard and the second screen standard defined differently from it, with the first screen standard as the reference;
detecting similar frames for each frame included in the input video of the first screen standard and setting them as a similar frame group;
generating an image for the unsecured area by referring to the similar frame group for each frame included in the input video of the first screen standard; and
outputting a video having the second screen standard from an original image included in the video of the first screen standard and the image generated for the unsecured area.

According to claim 14,
The image processing device, wherein the program stored in the memory sets, as the unsecured area, an image area that the first screen standard does not contain, owing to a difference in screen standards including at least one of aspect ratio, resolution, and angle of view relative to the target second screen standard.

According to claim 14,
The image processing device, wherein the program stored in the memory:
calculates, for one reference frame included in the video of the first screen standard, a similarity with adjacent frames temporally preceding or following the reference frame;
when the similarity calculated for an adjacent frame is equal to or greater than a threshold, sets that adjacent frame as a new reference frame, calculates the similarity with its adjacent frames again, and checks whether the newly calculated similarity is equal to or greater than the threshold, repeating this process in a chain so that only frames whose similarity is equal to or greater than the threshold are set as the similar frame group for the initial reference frame; and
derives a similar frame group for each frame by performing the similarity calculation and the similar frame group setting for all frames included in the video of the first screen standard.

According to claim 14,
The image processing device, wherein the program stored in the memory:
sequentially calculates similarities between adjacent frames for all frames included in the video of the first screen standard; and
sets, for each frame interval in which the calculated similarity remains equal to or greater than a threshold, a similar frame group including only the frames whose similarity is equal to or greater than the threshold.

According to claim 14,
The image processing device, wherein the program stored in the memory:
extends a current frame included in the input video of the first screen standard by using a geometric relationship between a similar frame included in the similar frame group and the current frame, extracting keypoints from each of the similar frame and the current frame and matching the two images based on the extracted keypoints; and
generates an image for the unsecured area by performing the extension of the current frame for all frames included in the input video of the first screen standard.

According to claim 18,
The image processing device, wherein the program stored in the memory supplements the image of the unsecured area using a Generative Adversarial Network (GAN) or an autoencoder when the image of the unsecured area is not generated through the process of extending the current frame.

According to claim 14,
The image processing device, wherein the program stored in the memory produces an image having the second screen standard by combining, for every frame included in the video of the first screen standard, the original image and the image generated for the unsecured area.

According to claim 1,
In step (d),
An image processing method of generating an image of the unsecured area by referring to the similar frame group for each frame included in the input video of the first screen standard, wherein an artificial neural network that has learned the angle and perspective of an object is used to correct the angle and perspective characteristics of the similar frame to fit the current frame so that the frame characteristics match, and the current frame and the similar frame are merged into one frame.

According to claim 21,
In step (d),
(d1) extending a current frame included in the input video of the first screen standard by using a geometric relationship between a similar frame included in the similar frame group and the current frame, and correcting the camera angle and perspective between the current frame and the similar frame by using the artificial neural network, which has been pre-trained on images of an object viewed from various angles and perspectives and outputs the difference in angle and perspective between input images; and
(d2) generating an image for the unsecured area by performing the frame extension and correction of step (d1) on all frames included in the input video of the first screen standard.

According to claim 14,
The image processing device, wherein the program stored in the memory generates an image of the unsecured area by referring to the similar frame group for each frame included in the input video of the first screen standard, using an artificial neural network that has learned the angle and perspective of an object to correct the angle and perspective characteristics of the similar frame to fit the current frame so that the frame characteristics match, and merging the current frame and the similar frame into one frame.

According to claim 23,
The image processing device, wherein the program stored in the memory:
extends a current frame included in the input video of the first screen standard by using a geometric relationship between a similar frame included in the similar frame group and the current frame, extracting keypoints from each of the similar frame and the current frame and matching the two images based on the extracted keypoints;
corrects the camera angle and perspective between the current frame and the similar frame by using the artificial neural network, which has been pre-trained on images viewed from various angles and perspectives and outputs the difference in angle and perspective between input images; and
generates an image for the unsecured area by performing the extension and correction of the current frame for all frames included in the input video of the first screen standard.