KR20180028588A

KR20180028588A - Method and apparatus for adaptive frame synchronization

Info

Publication number: KR20180028588A
Application number: KR1020160115685A
Authority: KR
Inventors: 안기옥; 조영탁; 이태원; 김민기; 최주영; 이홍채; 김묘숙; 채옥삼
Original assignee: 주식회사 이타기술
Priority date: 2016-09-08
Filing date: 2016-09-08
Publication date: 2018-03-19

Abstract

The present invention relates to an adaptive frame synchronization apparatus using scene-based fingerprints of video data with different input times. The apparatus comprises: a video data input unit that receives a plurality of data through various input methods, such as video and audio signals being transmitted in real time or digital video files recorded in the file system of an operating system; a scene extraction unit that segments scenes in real time from each input video data and extracts a representative frame of each scene as an individual frame image; a scene fingerprint data extraction unit that extracts, for each input video, fingerprint data from the video frame image of each scene segmented and extracted by the scene extraction unit; a frame gap calculation unit that selects, through comparison and fitness evaluation of the extracted scene fingerprint data, the fingerprint index of each input showing the highest similarity and uses it to calculate the frame difference between inputs; and a time synchronization unit that applies the frame gap to each input video data to synchronize the time of occurrence of the video and audio of each input.

Description

METHOD AND APPARATUS FOR ADAPTIVE FRAME SYNCHRONIZATION USING SCENE-BASED FINGERPRINTS OF VIDEO DATA WITH DIFFERENT INPUT TIMES

The present invention relates to a method and apparatus that, as an addition to broadcasting and archiving systems that use video data, receive a plurality of transmission signals of the same video data occurring at different times and synchronize the video and audio timing of each signal.

Video generally refers to data in which digitized video and audio signals are combined. The broadcasting industry is a representative field that has traditionally used video data. It formerly relied on analog tape storage media, but as demand grew for more convenient management and distribution of broadcast content, and with advances in computer and communication technology, broadcast transmission has also come to use digital video files.

In particular, advances in broadcasting, communication, and video technology have caused the volume of video content produced to grow explosively, beyond comparison with the past, and automated, efficient management of this content has become very important.

In recent years, the digitization and management of records produced not only by the broadcasting, film, and advertising industries but also by government and administrative agencies at every level, local governments, and private companies has become increasingly important. National institutions such as the National Archives of Korea and the Korean Film Archive, terrestrial broadcasters such as KBS, and IPTV operators such as KT Skylife and SK Broadband have built and operate digital archiving systems. Such digital archiving systems are referred to as CMS (Contents Management System), MAM (Media Asset Management), DAM (Digital Asset Management), and the like.

Whereas archiving systems were previously built for the purpose of digitizing and managing records, interest in the quality of the archived records has recently been growing. Low-quality records are a major cause of reduced usability, and the volume of video data is also growing rapidly with the spread of ultra-high-definition video technologies such as FHD (Full HD) and 4K UHD (Ultra HD).

Compared with documents or photographs, video data is so large that it has become practically impossible for a small staff to measure quality by visually inspecting the various errors contained in the video or audio. Measuring the quality of real-time video streaming data on broadcast transmitting or receiving devices, such as set-top boxes for IPTV and other digital broadcasting, faces the same problem, so demand has arisen for automated video quality inspection systems.

The problem to be solved by the present invention is to provide an adaptive frame synchronization method and apparatus using scene-based fingerprints of video data with different input times.

According to an embodiment of the present invention, a video input synchronization apparatus that synchronizes a plurality of inputs of different input methods carrying the same video and audio data generated at different times includes: a video data input unit that receives a plurality of data through various input methods, such as video and audio signals being transmitted in real time or digital video files recorded in the file system of an operating system; a scene extraction unit that segments scenes in real time from each input video data and extracts a representative frame of each scene as an individual frame image; a scene fingerprint data extraction unit that extracts, for each input video, fingerprint data from the video frame image of each scene segmented and extracted by the scene extraction unit; a frame gap calculation unit that selects, through comparison and fitness evaluation of the extracted scene fingerprint data, the fingerprint index of each input showing the highest identity and uses it to calculate the frame difference between inputs; and a time synchronization unit that applies the frame gap to each input video data to synchronize the time of occurrence of the video and audio of each input.

Through a video data analysis job management unit, which manages the processing state and job schedule of each of c video data inputs arising at different times from data sources with the same or different input methods, the scene extraction unit may assign a job number to each of the c video data inputs and assign a scene extraction job to each input. From the video data of each input it may extract a representative video frame as an image for each scene, generally referred to as a scene or shot, and store it in the file system or record it temporarily in main memory.
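The job assignment and per-input scene extraction described above can be sketched roughly as follows. The specification does not say how scene boundaries are detected, so the mean-absolute-difference threshold used here is purely an illustrative assumption, as are the names `Job`, `SceneFrame`, `extract_scenes`, and `run_jobs` and the choice of the first frame of each scene as its representative.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence

Frame = List[List[int]]  # grayscale image: N rows, each holding M pixel values


@dataclass
class SceneFrame:
    frame_number: int  # index of the representative frame within its input
    image: Frame


@dataclass
class Job:
    job_number: int
    source: str  # hypothetical label for the input (signal name or file path)
    state: str = "pending"
    scenes: List[SceneFrame] = field(default_factory=list)


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute pixel difference between two equally sized frames."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def extract_scenes(frames: Sequence[Frame], threshold: float = 30.0) -> List[SceneFrame]:
    """Assumed cut detector: a new scene starts where consecutive frames differ strongly."""
    scenes = []
    for k, frame in enumerate(frames):
        if k == 0 or mean_abs_diff(frames[k - 1], frame) > threshold:
            scenes.append(SceneFrame(k, frame))  # first frame of the scene as representative
    return scenes


def run_jobs(inputs: Dict[str, Sequence[Frame]]) -> List[Job]:
    """Assign one numbered job per input and run scene extraction for it."""
    jobs = []
    for number, (source, frames) in enumerate(inputs.items(), start=1):
        job = Job(number, source, state="running")
        job.scenes = extract_scenes(frames)
        job.state = "done"
        jobs.append(job)
    return jobs
```

Keeping the original frame number with each representative image matters for the later steps, because the frame gap is computed from those frame numbers rather than from scene indices.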

To reduce computation and allow fast synchronization for videos with very long playback times, the scene fingerprint extraction unit may, instead of using all the scene information that makes up a video, select n consecutive images starting from the first extracted scene among the series of scene frame images extracted for each of the c jobs (that is, the c video data inputs) stored in the file system or main memory, and extract fingerprints from them.

For scene image data of size M x N, the scene fingerprint extraction unit may, for each of the n consecutive images, extract a one-pixel-thick per-scene fingerprint by cumulatively summing, for each row in the range 0 to (N-1), the pixel values of the columns at positions 0 to (M-1). It may then concatenate the n extracted per-scene fingerprints and store them in a two-dimensional array of size n x N to generate a fingerprint image for the sequence of n scenes, and may do this for each of the c inputs.
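As a minimal sketch of this step, assuming each scene image is a grayscale array stored as N rows of M pixel values, the per-scene fingerprint is the list of N row sums, and the fingerprint image stacks the fingerprints of the first n scene images. The function names are illustrative and not taken from the specification.

```python
from typing import List

Image = List[List[int]]  # N rows, each holding M pixel values


def scene_fingerprint(image: Image) -> List[int]:
    """Sum the pixel values of columns 0..M-1 in each row 0..N-1, giving N values."""
    return [sum(row) for row in image]


def fingerprint_image(scene_images: List[Image], n: int) -> List[List[int]]:
    """Stack the fingerprints of the first n scene images into an n x N array."""
    selected = scene_images[:n]  # n consecutive images from the first extracted scene
    return [scene_fingerprint(image) for image in selected]


# Three tiny 2-row, 3-column scene images; only the first two are used.
images = [
    [[1, 2, 3], [4, 5, 6]],
    [[0, 0, 0], [1, 1, 1]],
    [[9, 9, 9], [9, 9, 9]],
]
print(fingerprint_image(images, 2))  # [[6, 15], [0, 3]]
```

Reducing each M x N image to N numbers means comparing two scenes costs N operations instead of M x N, which is consistent with the stated goal of keeping computation low for long videos.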

For the c fingerprint images of size n x N, the frame gap calculation unit may compute, for each of the n rows of image 1, the fitness against each of the n rows of image 2, and select row i of image 1 and row j of image 2 with the highest fitness. From the original images from which the scene fingerprints of those rows were extracted, that is, from the corresponding video data inputs, it identifies the video frame numbers of those scenes and takes their difference as the frame difference, or frame gap, for the same scene between input video 1 and input video 2. Applying the same method to the remaining input videos yields c - 1 per-image frame gaps relative to image 1.
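The row-by-row search can be illustrated with the sketch below. The specification does not define the fitness measure, so the negative sum of absolute differences used here is an assumption, as is the sign convention (frame number in the other input minus frame number in input 1) and all function names.

```python
from typing import List, Sequence, Tuple

Fingerprint = Sequence[int]


def fitness(a: Fingerprint, b: Fingerprint) -> float:
    """Assumed fitness: negative sum of absolute differences (0 is a perfect match)."""
    return -float(sum(abs(x - y) for x, y in zip(a, b)))


def best_match(fp1: Sequence[Fingerprint], fp2: Sequence[Fingerprint]) -> Tuple[int, int]:
    """Return (i, j): the row of image 1 and row of image 2 with the highest fitness."""
    best = None
    for i, row1 in enumerate(fp1):
        for j, row2 in enumerate(fp2):
            score = fitness(row1, row2)
            if best is None or score > best[0]:
                best = (score, i, j)
    return best[1], best[2]


def frame_gap(fp1, frames1: Sequence[int], fp2, frames2: Sequence[int]) -> int:
    """Frame gap of input 2 relative to input 1 for their best-matching scene."""
    i, j = best_match(fp1, fp2)
    return frames2[j] - frames1[i]


def frame_gaps(fps: Sequence, frame_numbers: Sequence[Sequence[int]]) -> List[int]:
    """c - 1 frame gaps, each relative to the first input."""
    return [
        frame_gap(fps[0], frame_numbers[0], fps[k], frame_numbers[k])
        for k in range(1, len(fps))
    ]


# Input 2 starts 75 frames later in the content than input 1.
fp1 = [[10, 20], [30, 40], [50, 60]]
frames1 = [0, 120, 300]          # frame numbers of the scenes in input 1
fp2 = [[31, 39], [50, 61], [90, 90]]
frames2 = [45, 225, 400]         # frame numbers of the scenes in input 2
print(frame_gaps([fp1, fp2], [frames1, frames2]))  # [-75]
```

Because every row of image 1 is compared against every row of image 2, a match can still be found when one input is missing its first few scenes, at a cost of n x n fingerprint comparisons per pair of inputs.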

The frame gap calculation unit provides the c - 1 per-input frame gaps computed relative to input video 1 to the time synchronization unit. The time synchronization unit applies the frame gap of each input video, correcting the frame numbers of each input by the frame gap computed for that input, so that the time of occurrence of the same scene and sound is synchronized across inputs. It may further include a time information conversion unit that uses the corrected frame numbers and the frame rate of the input video to convert playback time information or timecode so that it is expressed relative to the same start time.
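A minimal sketch of the correction and time conversion follows. It assumes the frame gap is defined as the frame number in the other input minus the frame number in input 1, so subtracting the gap maps a frame onto input 1's frame axis. The frame rate 30000/1001 is the exact NTSC rate commonly rounded to 29.97 fps; the function names and the `HH:MM:SS.mmm` output format are illustrative choices, not part of the specification.

```python
from fractions import Fraction


def corrected_frame(frame_number: int, gap: int) -> int:
    """Map a frame number of an input onto the reference input's frame axis."""
    return frame_number - gap


def frame_to_seconds(frame_number: int, fps: Fraction) -> Fraction:
    """Playback time of a frame, kept exact by using rational arithmetic."""
    return Fraction(frame_number) / fps


def format_time(seconds: Fraction) -> str:
    """Render a non-negative time as HH:MM:SS.mmm."""
    if seconds < 0:
        raise ValueError("frame precedes the start of the reference input")
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


NTSC = Fraction(30000, 1001)  # exact rate behind the rounded "29.97 fps"

# Frame 225 of an input with gap -75 corresponds to frame 300 of the reference.
frame = corrected_frame(225, -75)
print(frame, format_time(frame_to_seconds(frame, NTSC)))  # 300 00:00:10.010
```

Using `Fraction` rather than the float 29.97 avoids the drift that a rounded frame rate accumulates over long recordings.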

To present the synchronized video and audio data, the time synchronization unit may provide a function for interworking with application systems that use video frames and audio data, such as video streaming, video players, video editors, and video cataloging.

According to an embodiment of the present invention, the time of occurrence of the video and audio in each input composed of the same video/audio data can be synchronized even in environments where input timing cannot be synchronized through input flow control from a plurality of input devices using an external interface device such as RS-422, and regardless of whether each input carries timecode information and of the input type, such as signal-based or file-based data input.

According to an embodiment of the present invention, synchronization is also possible between a long video and a short partial video that was edited from it or captured in part from the same source signal.

According to an embodiment of the present invention, even if the received timecode information is missing or erroneous, the inputs can be synchronized independently of the timecode.

According to an embodiment of the present invention, even if some data is lost during data input because of the operating environment and conditions, synchronization is possible from the run of scenes common to the inputs.

According to an embodiment of the present invention, this series of synchronization steps is automated, so it can be used as is, without dedicated operators, in application systems of various purposes that require a plurality of video data inputs.

FIG. 1 is a flowchart of a synchronization method according to an embodiment of the present invention.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings so that those of ordinary skill in the art to which the invention pertains can readily practice it. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described here. To describe the invention clearly, parts unrelated to the description are omitted from the drawings, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. In addition, terms such as "unit" and "module" used in the specification denote a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software.

Looking at the archiving process for video specifically, it goes through an ingest process in which a video signal is generated using a device such as a VCR and encoded into a specified format to obtain digital data. Video quality inspection is added at the time of ingest and after ingest is complete, performing quality inspection at the video signal level and quality inspection of the encoded, digitized data. That is, quality errors contained in the original video signal and quality errors arising during encoding and transfer to storage are detected, so that differences between the original and the digital copy and problems in the process can be reviewed.

While the archiving environment is aimed at producing and managing high-quality video content, video quality inspection is also used in distribution, such as the transmission and reception of video content.

IPTV, digital cable broadcasting, and the like require a set-top box to receive broadcasts, and video quality inspection can be applied to the set-top box to measure the video and audio quality of the received broadcast content. A plurality of set-top boxes are monitored to carry out full or sample inspection of manufactured set-top box products, and the inspection results are provided as a report, which effectively supports automated quality inspection and decision-making when quality errors occur. The following problems can arise in such a video quality inspection system.

First, when limitations of the devices or facilities that make up the system make it impossible to synchronize the input signals between the ingest equipment and the quality inspection equipment while controlling a video signal source such as a VCR, the time at which encoding starts and the time at which quality inspection starts differ for the video signal that is input identically to both.

In addition, in terms of playback time, the above problem means that a long video must be compared against a shorter video cut from it, so the same video and audio positions must be synchronized between the inputs.

In addition, a real-time signal input and an encoded video file use different notions of time, encoded video has an approximated frame rate such as 29.97 fps, and timecode information is sometimes unavailable at the time of video signal input, so aligning the two inputs is difficult.

Set-top boxes for IPTV and digital cable broadcasting receive broadcast data over the Internet. When a plurality of set-top boxes are inspected simultaneously, data reception speed and stability differ according to the connection state of each set-top box, so even when they receive the same channel, the playback timing of the video and audio received by each set-top box differs, and reception failures can cause some video or audio to be dropped. It is therefore difficult to synchronize the inspection timing of the video and audio of each set-top box.

Most conventional techniques aim to detect illegal copying or alteration and distortion of video content. To this end they build a database of fingerprints and video hash information of original video content and compare the input video against the database. However, building and maintaining such a database requires personnel or effort, and for live or emergency broadcasts such as news, transmission occurs simultaneously with production of the original content, so the database cannot be prepared in advance. These techniques are therefore difficult to apply in the various environments that require real-time inspection.

To solve the problems of the prior art described above, an object of the present invention is to provide a method (apparatus) that can synchronize the time of occurrence of the video and audio in each input composed of the same video/audio data, even in environments where input timing cannot be synchronized through input flow control from a plurality of input devices using an external interface device such as RS-422, and regardless of whether each input carries timecode information and of the input type, such as signal-based or file-based data input.

Another object of the present invention is to provide a method (apparatus) that can also synchronize a long video with a short partial video that was edited from it or captured in part from the same source signal.

Another object of the present invention is to provide a method (apparatus) that can synchronize inputs independently of the timecode even if the received timecode information is missing or erroneous.

Another object of the present invention is to provide a method (apparatus) that can synchronize from the run of scenes common to the inputs even if some data is lost during data input because of the operating environment and conditions.

Another object of the present invention is to provide a method (apparatus) that automates this series of synchronization steps so that it can be used as is, without dedicated operators, in application systems of various purposes that require a plurality of video data inputs.

Referring to FIG. 1, the video input synchronization apparatus synchronizes a plurality of inputs of different input methods carrying the same video and audio data generated at different times.

The video input synchronization apparatus includes: a video data input unit that receives a plurality of data through various input methods, such as video and audio signals being transmitted in real time or digital video files recorded in the file system of an operating system; a scene extraction unit that segments scenes in real time from each input video data and extracts a representative frame of each scene as an individual frame image; a scene fingerprint data extraction unit that extracts, for each input video, fingerprint data from the video frame image of each scene segmented and extracted by the scene extraction unit; a frame gap calculation unit that selects, through comparison and fitness evaluation of the extracted scene fingerprint data, the fingerprint index of each input showing the highest identity and uses it to calculate the frame difference between inputs; and a time synchronization unit that applies the frame gap to each input video data to synchronize the time of occurrence of the video and audio of each input.

As described above, the present invention is an adaptive frame synchronization method using scene-based fingerprints of video data with different input times. It is a scene-based adaptive matching method for environments in which, while performing various processing such as comparative evaluation of video signals or file-based data, the flow of data input cannot be controlled by a separate device or method: when a plurality of inputs carrying the same video and audio data occur at different times, it synchronizes the time of occurrence of the video and audio contained in each input.

The present invention has the following effects. It can synchronize the time of occurrence of the video and audio in each input composed of the same video/audio data, even in environments where input timing cannot be synchronized through input flow control from a plurality of input devices using an external interface device such as RS-422, and regardless of whether each input carries timecode information and of the input type, such as signal-based or file-based data input. It can synchronize a long video with a short partial video that was edited from it or captured in part from the same source signal. It can synchronize inputs independently of the timecode even if the received timecode information is missing or erroneous. It can synchronize from the run of scenes common to the inputs even if some data is lost during data input because of the operating environment and conditions. Finally, because this series of synchronization steps is automated, it can be used as is, without dedicated operators, in application systems of various purposes that require a plurality of video data inputs.

The embodiments of the present invention described above are not implemented only through an apparatus and a method; they may also be implemented through a program that realizes functions corresponding to the configuration of the embodiments, or through a recording medium on which that program is recorded.

Although embodiments of the present invention have been described in detail above, the scope of the invention is not limited to them; various modifications and improvements made by those skilled in the art using the basic concept of the invention as defined in the following claims also fall within its scope.

Claims

1. A video input synchronization device that synchronizes a plurality of inputs of different input methods carrying the same video and audio data generated at different times, the device comprising:
a video data input unit that receives a plurality of data through various input methods, such as video and audio signals being transmitted in real time or digital video files recorded in the file system of an operating system;
a scene extraction unit that segments scenes in real time from each input video data and extracts a representative frame of each scene as an individual frame image;
a scene fingerprint data extraction unit that extracts, for each input video, fingerprint data from the video frame image of each scene segmented and extracted by the scene extraction unit;
a frame gap calculation unit that selects, through comparison and fitness evaluation of the extracted scene fingerprint data, the fingerprint index of each input showing the highest identity and uses it to calculate the frame difference between inputs; and
a time synchronization unit that applies the frame gap to each input video data to synchronize the time of occurrence of the video and audio of each input.

2. The synchronization device of claim 1,
wherein the scene extraction unit
further comprises a video data analysis job management unit that manages the processing state and job schedule of each of c video data inputs arising at different times from data sources with the same or different input methods, and
wherein the video data analysis job management unit
assigns a job number to each of the c video data inputs, assigns a scene extraction job to each input, and extracts from the video data of each input a representative video frame as an image for each scene, generally referred to as a scene or shot, to store it in the file system or record it temporarily in main memory.

3. The synchronization device of claim 1,
wherein the scene fingerprint extraction unit,
to reduce computation and allow fast synchronization for videos with very long playback times, instead of using all the scene information that makes up a video, selects n consecutive images starting from the first extracted scene among the series of scene frame images extracted for each of the c jobs stored in the file system or main memory, and extracts fingerprints from them.

4. The synchronization device of claim 3,
wherein the scene fingerprint extraction unit,
for scene image data of size M x N, extracts for each of the n consecutive images a one-pixel-thick per-scene fingerprint by cumulatively summing, for each row in the range 0 to (N-1), the pixel values of the columns at positions 0 to (M-1), and
generates, for each of the c inputs, a fingerprint image for the sequence of n scenes by concatenating the n extracted per-scene fingerprints and storing them in a two-dimensional array of size n x N.

5. The synchronization device of claim 1,
wherein the frame gap calculation unit,
for c fingerprint images of size n x N, computes for each of the n rows of image 1 the fitness against each of the n rows of image 2 and selects row i of image 1 and row j of image 2 with the highest fitness,
identifies, from the original images from which the scene fingerprints of those rows were extracted, that is, from the corresponding video data inputs, the video frame numbers of those scenes and takes their difference as the frame difference, or frame gap, for the same scene between input video 1 and input video 2, and calculates c - 1 per-image frame gaps relative to image 1 by applying the same method to the remaining input videos.

6. The synchronization device of claim 5,
wherein the frame gap calculation unit provides the c - 1 per-input frame gaps calculated relative to input video 1 to the time synchronization unit,
wherein the time synchronization unit applies the frame gap of each input video, correcting the frame numbers of each input by the frame gap calculated for that input, to synchronize the time of occurrence of the same scene and sound across inputs, and
further comprising a time information conversion unit that uses the corrected frame numbers and the frame rate of the input video to convert playback time information or timecode so that it is expressed relative to the same start time.

7. The synchronization device of claim 1,
wherein the time synchronization unit
provides, for presenting the synchronized video and audio data, a function for interworking with application systems that use video frames and audio data, such as video streaming, video players, video editors, and video cataloging.