KR102118093B1

KR102118093B1 - Method for processing workflow of character information providing system based on face recognition

Info

Publication number: KR102118093B1
Application number: KR1020180157351A
Authority: KR
Inventors: 박정선; 허세흥
Original assignee: 주식회사 코난테크놀로지
Priority date: 2018-12-07
Filing date: 2018-12-07
Publication date: 2020-06-02

Abstract

The present invention relates to a method for processing the workflow of a character information providing system based on facial recognition which provides an optimal deep metadata management technique. According to an embodiment of the present invention, the method for processing the workflow of a character information providing system based on facial recognition comprises: a step in which a content registration unit copies an original content file to store the original content file in a storage, and registers and stores content metadata in an MAM server; a step in which the MAM server assigns a preprocessing job required for facial recognition and verification to a preprocessing server; a step in which the preprocessing server performs preprocessing including transcoding, cataloging, and frame extraction; a step in which a recognition server recognizes a sound source, a situation, and a character in an image when the MAM server uses preprocessed results to assign a recognition job to the recognition server; a step in which a manager terminal uses an authoring tool to check recognition results and correct the recognition results through verification; and a step in which the MAM server transfers results to a user terminal after verification is completed.

Description

Workflow processing method of face recognition based character information providing system {METHOD FOR PROCESSING WORKFLOW OF CHARACTER INFORMATION PROVIDING SYSTEM BASED ON FACE RECOGNITION}

The present invention relates to media asset management technology.

A media asset management (MAM) system, in the narrow sense, manages media files such as video, audio, and images; in the broad sense, it can be regarded as a content management system (CMS).

A CMS is a system that manages content together with general data and metadata. Content here means a data file in electronic document form, that is, an electronic file that exists independently, such as a word-processing file, a photo image, a video file, or a sound source file. General data refers to data that is recorded in a DB but does not exist as a standalone file, such as electronic approval records, system logs, emails, and messenger texts. Metadata refers to additional data that lets the system recognize the content itself. For example, a content file is kept in separate storage, while the path information, format information, creation date, owner, title, usage rights, and so on needed to use it are managed in the DB; this information is called metadata.
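
The split between a content file in storage and its metadata in the DB can be pictured with a small record type. This is only a sketch: the class name, field names, and sample values are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class ContentMetadata:
    """Hypothetical DB record describing a content file kept in separate storage."""
    path: str            # where the content file lives in storage
    file_format: str     # container or codec label
    created: date
    owner: str
    title: str
    permissions: List[str] = field(default_factory=list)  # usage rights


# Sample values are made up for illustration only.
record = ContentMetadata(
    path="/storage/original/episode01.mxf",
    file_format="mxf",
    created=date(2018, 12, 7),
    owner="archive-team",
    title="Episode 1",
    permissions=["read", "transcode"],
)
```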

According to one embodiment, a workflow processing method is proposed for a face recognition-based character information providing system that recognizes characters, situations, places, sound sources, and the like in a video to provide deep metadata to the user, and that provides an optimal deep metadata management technique integrating the entire process from content acquisition to deep metadata extraction and transmission.

A workflow processing method of a face recognition-based character information providing system according to one embodiment includes: a step in which a content registration unit copies an original content file, stores it in storage, and registers and stores content metadata in a MAM server; a step in which the MAM server assigns to a preprocessing server the preprocessing jobs required for recognition and verification; a step in which the preprocessing server performs preprocessing including transcoding, cataloging, and frame extraction; a step in which, when the MAM server assigns a recognition job to a recognition server using the preprocessed results, the recognition server recognizes characters, situations, and sound sources in the video; a step in which a manager terminal checks the recognition results using an authoring tool and corrects them through verification; and a step in which, after verification is complete, the MAM server delivers the results to a user terminal.
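
The ordering of these steps can be sketched as a small state tracker that refuses to run a step out of order, for example delivering results before verification. The stage names and the `ContentWorkflow` class are hypothetical; the specification does not say how the MAM server tracks progress.

```python
# Stage names are illustrative labels for the six steps (register/store is one stage).
STAGES = ["register", "preprocess", "recognize", "verify", "deliver"]


class ContentWorkflow:
    """Tracks which stages a content item has completed, in order."""

    def __init__(self, content_id):
        self.content_id = content_id
        self.completed = []

    def advance(self, stage):
        if len(self.completed) == len(STAGES):
            raise ValueError("workflow already finished")
        expected = STAGES[len(self.completed)]
        if stage != expected:
            raise ValueError(f"expected {expected!r}, got {stage!r}")
        self.completed.append(stage)
        return self


wf = ContentWorkflow("content-001")
for stage in STAGES:
    wf.advance(stage)
```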

The step of performing preprocessing includes: a step in which the MAM server assigns a transcoding job to a transcoder and the transcoder converts the original video into a search video playable on the web; a step in which the MAM server assigns a cataloging job to a cataloger and the cataloger analyzes the original video, divides it into scenes at scene change points, and outputs shot images; and a step in which the MAM server assigns a frame extraction job to a frame extractor and the frame extractor extracts frame images. The steps constituting the preprocessing job proceed simultaneously in parallel, and the MAM server may communicate with the transcoder, cataloger, and frame extractor of the preprocessing server using the TCP protocol.
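
The parallel dispatch of the three preprocessing jobs might look like the following sketch. It runs stub functions in a thread pool inside one process as a stand-in; in the described system the jobs go to separate components over TCP. All function names, file names, and the stub outputs are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def transcode(src):
    # Stub: a real transcoder would produce a web-playable search video.
    return f"{src}.proxy.mp4"


def catalog(src):
    # Stub: a real cataloger would detect scene changes and emit shot images.
    return [f"{src}.shot{i:03d}.jpg" for i in range(3)]


def extract_frames(src, step=10, total_frames=30):
    # Stub: emit one frame image name per sampled frame index.
    return [f"{src}.frame{i:06d}.jpg" for i in range(0, total_frames, step)]


def preprocess(src):
    """Submit all three jobs at once and collect their results by name."""
    jobs = {"transcode": transcode, "catalog": catalog, "frames": extract_frames}
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = {name: pool.submit(fn, src) for name, fn in jobs.items()}
        return {name: future.result() for name, future in futures.items()}


result = preprocess("ep01")
```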

The step of recognizing characters, situations, and sound sources in the video may include: a step in which the MAM server assigns a character feature extraction job to a person recognizer; a step in which the person recognizer calls a person recognition server to extract features for recognizing the characters; a step in which the called person recognition server extracts the recognition features of the characters using pre-stored gallery images of the characters; a step in which the person recognizer checks the feature extraction status with the person recognition server and registers the result of the check in the MAM server; a step in which the MAM server assigns a person recognition job to the person recognizer; a step in which the person recognizer calls the person recognition server for person recognition; a step in which the called person recognition server recognizes persons using the extracted frame images and the character feature file, and provides a recommended representative profile picture, person clustering images, a per-frame feature file, and a recognition result file; and a step in which the MAM server stores the recognition results.
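
The specification does not state how frame features are compared with the character features extracted from gallery images. As one common possibility, and purely as an assumption, the sketch below matches a frame feature vector to the gallery entry with the highest cosine similarity above a threshold. The vectors, names, and the 0.8 threshold are made up.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recognize(frame_feature, gallery, threshold=0.8):
    """Return the best-matching character name, or None if nothing is similar enough."""
    best_name, best_score = None, threshold
    for name, feature in gallery.items():
        score = cosine(frame_feature, feature)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical two-dimensional features; real face features are much longer.
gallery = {"character_a": [1.0, 0.0], "character_b": [0.0, 1.0]}
match = recognize([0.9, 0.1], gallery)
no_match = recognize([0.7, 0.7], gallery)
```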

The step of recognizing characters, situations, and sound sources in the video may include: a step in which the MAM server assigns a sound source recognition job to a sound source recognizer; a step in which the sound source recognizer extracts a wav file from the storage and calls sound source recognition on a sound source recognition server using the extracted wav file; a step in which the sound source recognition server recognizes the sound source using a sound source recognition library and registers the sound source recognition result in the MAM server; a step in which the MAM server corrects misrecognized results among the results recognized in preset units and merges the valid results; and a step in which the MAM server stores the final result.

The step of merging the valid results may include: a step of providing the sound source recognition results in units of scenes through a sound source recognition page of a storyboard; and a step of displaying a sound source warning for items with duplicate timecodes and providing an editing screen for deleting the duplicate sound sources or merging them with each other.
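
A minimal sketch of the duplicate check and merge follows. It assumes that "duplicate timecodes" means overlapping time ranges and that only overlapping hits with the same title are merged automatically; both are interpretations, and the `Hit` type and sample values are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    start: float  # seconds from the start of the video
    end: float
    title: str


def find_overlaps(hits: List[Hit]) -> List[Tuple[int, int]]:
    """Return index pairs whose time ranges overlap (candidates for a warning)."""
    order = sorted(range(len(hits)), key=lambda i: hits[i].start)
    pairs = []
    for pos, a in enumerate(order):
        for b in order[pos + 1:]:
            if hits[b].start >= hits[a].end:
                break  # later hits start even later, so none can overlap a
            pairs.append((a, b))
    return pairs


def merge_same_title(hits: List[Hit]) -> List[Hit]:
    """Merge overlapping hits that carry the same title into one range."""
    merged: List[Hit] = []
    for hit in sorted(hits):
        if merged and merged[-1].title == hit.title and hit.start <= merged[-1].end:
            last = merged[-1]
            merged[-1] = Hit(last.start, max(last.end, hit.end), hit.title)
        else:
            merged.append(hit)
    return merged


hits = [Hit(0, 30, "Song A"), Hit(25, 60, "Song A"), Hit(50, 70, "Song B")]
warnings = find_overlaps(hits)
cleaned = merge_same_title(hits)
```

Overlaps between different titles are left for a person to resolve on the editing screen, which matches the manual delete-or-merge choice described above.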

The step of recognizing characters, situations, and sound sources in the video may include: a step in which the MAM server assigns a situation recognition job to a situation recognizer; a step in which the situation recognizer calls a situation recognition server while passing the extracted frame images to it; and a step in which the situation recognition server receives the extracted frame images, recognizes objects, events, places, landmarks, and videos through situation recognition, and provides the recognition results.

The step of correcting through verification provides a web-based authoring tool with which registered content and deep metadata recognition results can be checked and managed, and the authoring tool may provide a content management page, a storyboard page, a verification page, a character management page, and an administrator page.

The content management page may include a content and metadata registration screen, a content confirmation screen, a metadata confirmation screen, a preprocessing command input screen, and a recognition command input screen.

The character management page may include a screen of profile images recommended by the recognition server, a clustering image screen in which images of the same person are grouped together, and a search screen that searches for and presents persons related to an input portrait photo.

The storyboard page may include: a storyboard-shot screen for checking the shots extracted through cataloging and for creating and managing scenes based on shots; a storyboard-sound source recognition screen for checking sound source recognition results in units of scenes, correcting misrecognized results, and deleting or merging items with duplicate timecodes; and a storyboard-object recognition screen for checking object recognition results and editing misrecognized results.

The verification page may include a screen displaying the characters corresponding to the playback position, a sound source display screen, a screen for checking additional information about the characters, a screen indicating the points at which characters appear, and a screen presenting related content at the ending point.

The face recognition-based character information providing system according to one embodiment can recognize characters, situations, places, sound sources, and the like in a video and provide deep metadata to the user. In doing so, it can provide an optimal deep metadata management technique that integrates the entire process from content acquisition to deep metadata extraction and transmission.

The face recognition-based character information providing system provides an optimal content management solution using MAM technology. It also integrates a person recognition engine, a situation recognition engine, and a sound source recognition engine for automated processing, and improves ease of use by linking engines that run on different platforms through an open application programming interface (Open API). Furthermore, it provides an authoring tool for managing content and recognition jobs, and a player for verifying the final deep metadata.

FIG. 1 is a diagram showing the configuration of a face recognition-based character information providing system according to an embodiment of the present invention;
FIG. 2 is a diagram showing a table of the functions, input data, and output data of the components of the face recognition-based character information providing system;
FIG. 3 is a diagram showing the configuration of a MAM server and a client according to an embodiment of the present invention;
FIG. 4 is a diagram showing an example of workflow settings of the MAM server according to an embodiment of the present invention;
FIG. 5 is a diagram showing the workflow processing flow of the MAM server according to an embodiment of the present invention;
FIG. 6 is a diagram showing an example of metadata settings of the MAM server according to an embodiment of the present invention;
FIG. 7 is a diagram showing the metadata processing flow of the MAM server according to an embodiment of the present invention;
FIG. 8 is a diagram showing the overall workflow processing process of the face recognition-based character information providing system according to an embodiment of the present invention;
FIG. 9 is a diagram showing the preprocessing workflow processing process of the face recognition-based character information providing system according to an embodiment of the present invention;
FIG. 10 is a diagram showing the person recognition workflow processing process of the face recognition-based character information providing system according to an embodiment of the present invention;
FIG. 11 is a diagram showing the sound source recognition workflow processing process of the face recognition-based character information providing system according to an embodiment of the present invention;
FIG. 12 is a diagram showing the situation recognition workflow processing process of the face recognition-based character information providing system according to an embodiment of the present invention;
FIG. 13 is a diagram showing a content registration screen through a Web UI according to an embodiment of the present invention;
FIG. 14 is a diagram showing a content registration screen through an ingest program for the WatchFolder method according to an embodiment of the present invention;
FIG. 15 is a diagram showing a content registration and preprocessing job process according to an embodiment of the present invention;
FIG. 16 is a diagram showing a device configuration for person recognition according to an embodiment of the present invention;
FIG. 17 is a diagram showing a person recognition process according to an embodiment of the present invention;
FIG. 18 is a diagram showing a device configuration for sound source recognition according to an embodiment of the present invention;
FIG. 19 is a diagram showing a sound source recognition process according to an embodiment of the present invention;
FIG. 20 is a diagram showing a device configuration for situation recognition according to an embodiment of the present invention;
FIG. 21 is a diagram showing a situation recognition process according to an embodiment of the present invention;
FIG. 22 is a diagram showing an authoring tool screen according to an embodiment of the present invention;
FIG. 23 is a diagram showing the content management screen of the authoring tool according to an embodiment of the present invention;
FIG. 24 is a diagram showing the character management screen of the authoring tool according to an embodiment of the present invention;
FIG. 25 is a diagram showing the storyboard (shot) management screen of the authoring tool according to an embodiment of the present invention;
FIG. 26 is a diagram showing the storyboard (sound source recognition) management screen of the authoring tool according to an embodiment of the present invention;
FIG. 27 is a diagram showing the storyboard (object recognition) management screen of the authoring tool according to an embodiment of the present invention;
FIG. 28 is a diagram showing the verification screen of the authoring tool according to an embodiment of the present invention.

The advantages and features of the present invention, and the methods of achieving them, will become clear with reference to the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only so that the disclosure of the present invention is complete and so that a person of ordinary skill in the technical field to which the invention pertains is fully informed of the scope of the invention; the present invention is defined only by the scope of the claims. The same reference numerals refer to the same components throughout the specification.

In describing the embodiments of the present invention, a detailed description of known functions or configurations is omitted where it is judged that it could unnecessarily obscure the gist of the invention. The terms used below are defined in consideration of their functions in the embodiments of the present invention and may vary according to the intention or custom of users and operators. Their definitions should therefore be based on the content of this specification as a whole.

Combinations of the blocks in the accompanying block diagrams and of the steps in the flowcharts may be performed by computer program instructions (an execution engine). Because these computer program instructions may be loaded onto a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing equipment, the instructions executed through the processor of the computer or other programmable data processing equipment create means for performing the functions described in each block of the block diagram or each step of the flowchart.

These computer program instructions may also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, so the instructions stored in the computer-usable or computer-readable memory can produce an article of manufacture containing instruction means that perform the functions described in each block of the block diagram or each step of the flowchart.

The computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps is performed on the computer or other programmable data processing equipment to create a computer-executed process, and the instructions that run on the computer or other programmable data processing equipment can provide steps for executing the functions described in each block of the block diagram and each step of the flowchart.

In addition, each block or step may represent a module, segment, or portion of code that includes one or more executable instructions for executing the specified logical functions. It should be noted that in some alternative embodiments the functions mentioned in the blocks or steps may occur out of order. For example, two blocks or steps shown in succession may in fact be performed substantially simultaneously, or may be performed in the reverse order of the corresponding functions as necessary.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. However, the embodiments illustrated below may be modified into various other forms, and the scope of the present invention is not limited to the embodiments described below. The embodiments are provided to explain the present invention more completely to a person of ordinary skill in the technical field to which the invention pertains.

FIG. 1 is a diagram showing the configuration of a face recognition-based character information providing system according to an embodiment of the present invention, and FIG. 2 is a diagram showing a table of the functions, input data, and output data of the components of the face recognition-based character information providing system.

The face recognition-based character information providing system 1 according to one embodiment recognizes characters, situations, places, sound sources, and the like in a video and provides deep metadata to the user. In doing so, it provides an optimal deep metadata management technique that integrates the entire process from content acquisition to deep metadata extraction and transmission. The deep metadata solution is an AI-based video analysis technology that provides detailed additional information about the characters, situations, sound sources, and so on in a video, and offers a service that automatically classifies the various scenes of the video and automatically adjusts the audio sound field, the picture color, and the like according to the characteristics of each scene.

More specifically, the face recognition-based character information providing system provides an optimal content management solution using media asset management (MAM) technology. It also integrates a person recognition engine, a situation recognition engine, and a sound source recognition engine for automated processing, and improves ease of use by linking engines that run on different platforms through an open application programming interface (Open API). Furthermore, it provides an authoring tool for managing content and recognition jobs, and a player for verifying the final deep metadata.

Referring to FIGS. 1 and 2, the face recognition-based character information providing system 1 includes a content providing unit 10, a MAM storage 11-1, a MAM DB 11-2, a meta management server 12, a preprocessing server 13, a face recognition server 14, a situation recognition server 15, a sound source recognition server 16, a manager terminal 17, a result transmission unit 18, and a user terminal 19.

① The meta management server 12 obtains the original video and CMS metadata from the content providing unit 10 and stores and manages them in the MAM storage 11-1 and the MAM DB 11-2 (automatic registration). The meta management server 12 may include a MAM server and a MAM extension server (MamEx).

② The preprocessing server 13 includes a transcoder, a cataloger, a frame extractor, a person recognizer, a situation recognizer, and a sound source recognizer. The transcoder receives the original video, converts it into a streamable format, and outputs a search video. The cataloger receives the original video, divides it into scenes (shots), and outputs the shot images at the scene change points. The frame extractor receives the original video and outputs frame images. The person recognizer links the MAM server with the face recognition server 14, the situation recognizer links the MAM server with the situation recognition server 15, and the sound source recognizer links the MAM server with the sound source recognition server 16.

③ The face recognition server 14 recognizes the characters in the video and generates recommended profile images. The input data of the face recognition server 14 is extracted frame images (for example, images extracted every 10 frames), and the output data of face recognition includes person clustering images, recommended representative profile images, frame feature storage files, and the like.

④ The situation recognition server 15 recognizes objects, events, places, landmarks, and the like in the video. The input data of the situation recognition server 15 is extracted frame images (for example, images extracted every 5 frames), and the output data of situation recognition includes object, event, place, landmark, and video recognition result files, and the like.
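
The two example sampling intervals (every 10 frames for face recognition, every 5 frames for situation recognition) can be illustrated with a small helper. Starting at frame 0 and the function name are assumptions; the specification only gives the intervals as examples.

```python
def sampled_frame_indices(total_frames, step):
    """Indices of the frames that would be extracted at a fixed interval."""
    if step <= 0:
        raise ValueError("step must be a positive number of frames")
    return list(range(0, total_frames, step))


face_frames = sampled_frame_indices(30, 10)      # interval from the face recognition example
situation_frames = sampled_frame_indices(30, 5)  # interval from the situation recognition example
```

With these intervals the situation recognition server receives twice as many images per second of video as the face recognition server.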

⑤ The manager terminal 17 creates, manages, and verifies scenes on a shot basis for content whose preprocessing and recognition jobs are complete. The input data of the manager terminal 17 is shot images, and the output data is timecodes. A timecode has, for example, the form "hour:minute:second:frame". For verification, a verification page is provided on which the deep metadata results can be checked. The verification page displays the characters and sound sources corresponding to the playback position, lets the user check additional information about the characters, and supports data verification by displaying the deep metadata in real time during video playback.
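
The "hour:minute:second:frame" form can be converted to and from a frame index as sketched below. The frame rate of 30 and non-drop-frame counting are assumptions; the specification gives only the timecode layout.

```python
def frames_to_timecode(frame_index, fps=30):
    """Format a frame index as HH:MM:SS:FF (non-drop-frame, integer fps assumed)."""
    seconds, frames = divmod(frame_index, fps)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


def timecode_to_frames(timecode, fps=30):
    """Inverse of frames_to_timecode for the same fps."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


example = frames_to_timecode(30 * 3661 + 12)
```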

⑥ The result transmission unit 18 transmits the recognition and verification results to the user terminal 19. Examples of the user terminal 19 include a TV, a set-top box, and a mobile terminal.
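The "hour:minute:second:frame" time code mentioned for the manager terminal can be derived from a frame index as in the following minimal sketch. The function name `frame_to_timecode`, the integer frame rate, and the default of 30 fps are assumptions made for illustration; the description specifies neither a frame rate nor any drop-frame handling.

```python
def frame_to_timecode(frame_index: int, fps: int = 30) -> str:
    """Format a zero-based frame index as "HH:MM:SS:FF".

    Assumes a constant integer frame rate (non-drop-frame counting).
    """
    if frame_index < 0 or fps <= 0:
        raise ValueError("frame_index must be >= 0 and fps must be > 0")
    total_seconds, frames = divmod(frame_index, fps)
    total_minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"
```

For example, at the assumed 30 fps, frame 108015 falls 15 frames past the one-hour mark.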

FIG. 3 is a diagram showing the configuration of a MAM server and a client according to an embodiment of the present invention.

The MAM server 2 includes a communication unit 26, a MAM engine, and a DB 20.

The communication unit 26 handles communication with the client 3. It supports a TCP-based protocol and the SOAP protocol for communication; which of the two is used can be set as a process execution option. An SDK for clients that supports the TCP-based protocol may be provided.

The MAM engine includes a workflow management unit (Workflow Manager) 21, an asset management unit (Asset Manager) 22, an admin management unit (Admin Manager) 23, a system management unit (System Manager) 24, and an extended MAM management unit (MamEx Manager) 25.

The workflow management unit 21 manages system workflows. In doing so, it controls job allocation and assigns each job to the component server (pre-processing server / recognition engine linker) 30 suited to the job type. An intermediary application that receives a job from the MAM server 2 and processes it on the server's behalf is called a component server, and the work such a server processes is defined as a component job. A single requirement, such as "transcode, then transfer", is made up of two component jobs; such a bundle of component jobs occurring in sequence is defined as a system workflow.

The asset management unit 22 manages metadata according to the established asset schema and provides CRUD (Create, Read, Update, Delete) functions for the metadata. The asset schema is a redefinition of the DB fields that are subject to create, read, update, and delete operations on data. Metadata means additional data that allows the system to recognize the content itself. For example, a content file is kept in separate storage, and the path information, format information, creation date, owner, title, usage rights, and so on needed to use it are managed in the DB; this information is called metadata.

The admin management unit 23 provides the management functions needed to operate the system, such as departments, users, and permissions. The system management unit 24 supports system configuration and handles asset and workflow settings. The extended MAM management unit 25 is responsible for interworking with the MAM extension server (MamEx server) 4, which is an extension module of the MAM server 2.

The DB 20 is responsible for interworking with the database and the search engine.

The client 3 includes a component server 30, an authoring tool 31, an admin/monitoring tool 32, and a metadata design tool (MAM Designer) 33. The component server 30 receives jobs from the workflow management unit 21 of the MAM server 2 and executes them; the authoring tool 31 is connected to the asset management unit 22 of the MAM server 2; the admin/monitoring tool 32 is connected to the admin management unit 23 of the MAM server 2; and the metadata design tool 33 is connected to the system management unit 24 of the MAM server 2.

FIG. 4 is a diagram showing an example of workflow settings of a MAM server according to an embodiment of the present invention.

Referring to FIGS. 3 and 4, the MAM server 2 stores and manages all workflow settings not at the code level but in a job setting table 40, which is a schema table. Pre-processing and post-processing that require qualification, however, can be handled through the MAM extension server (MamEx). The job setting table 40 includes the following fields: job identifier 41, job XML 42, caption (CAPTION) 43, start call (STARTCMSEXNAME) 44, and end call (ENDCMSEXNAME) 45.

① When assigning a job, the MAM server 2 delivers to the component server the job XML 42, in which the configured items are standardized per job. ② The job XML 42 includes the MamEx information 44 and 45 to be called by the component server before and after the job. ③ The job status information is updated in the fields of the job setting table 40. The job status information includes the start time, end time, progress rate, job status, cause of failure, and the like.
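As an illustration of how a job XML carrying the start and end MamEx names might be assembled, the following sketch uses Python's standard `xml.etree.ElementTree`. Only the field names CAPTION, STARTCMSEXNAME, and ENDCMSEXNAME come from the job setting table; the `Job`, `Params`, and `Param` element names, the `id` attribute, and the function `build_job_xml` are hypothetical, since the description does not disclose the actual XML layout.

```python
import xml.etree.ElementTree as ET


def build_job_xml(job_id, caption, start_mamex, end_mamex, params):
    """Serialize one job setting row into a job XML string (hypothetical layout)."""
    root = ET.Element("Job", attrib={"id": str(job_id)})
    ET.SubElement(root, "CAPTION").text = caption
    # MamEx modules the component server calls before and after the job.
    ET.SubElement(root, "STARTCMSEXNAME").text = start_mamex
    ET.SubElement(root, "ENDCMSEXNAME").text = end_mamex
    # Job-specific items; names and values here are illustrative only.
    params_el = ET.SubElement(root, "Params")
    for name, value in params.items():
        ET.SubElement(params_el, "Param", attrib={"name": name}).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

A component server receiving this string could parse it with `ET.fromstring` and read the two MamEx names before starting work.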

FIG. 5 is a diagram showing the workflow processing flow of a MAM server according to an embodiment of the present invention.

1. Workflow call

The client 3 calls a workflow on the MAM server 2 (step 1). The MAM server 2 adds the called job to the job setting table (step 1-1).

2. Job assignment

The MAM server 2 assigns the job by delivering the job information to the component server 5 (step 2). Here, a job management thread periodically checks the job setting table and assigns each waiting job to a component server 5 able to process it, delivering the job XML based on the job setting table. If several component servers are waiting, priorities are applied when assigning the job. After assigning the job, the MAM server 2 updates the information related to the start of the job (step 2-1).
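One pass of such a job management thread could look roughly like the sketch below. The `Job` and `ComponentServer` classes, the status strings, and the rule that a lower `priority` number wins are all assumptions for illustration; the description only states that waiting jobs go to a capable component server and that priorities are applied when several are idle.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Job:
    job_id: int
    job_type: str
    status: str = "WAITING"
    assigned_to: Optional[str] = None


@dataclass
class ComponentServer:
    name: str
    job_type: str          # the job type this server can process
    priority: int          # assumption: lower number means higher priority
    busy: bool = False


def assign_waiting_jobs(jobs: List[Job],
                        servers: List[ComponentServer]) -> List[Tuple[int, str]]:
    """One polling pass: give each waiting job to the best idle matching server."""
    assignments = []
    for job in jobs:
        if job.status != "WAITING":
            continue
        candidates = [s for s in servers
                      if not s.busy and s.job_type == job.job_type]
        if not candidates:
            continue  # stays WAITING until a later pass
        chosen = min(candidates, key=lambda s: s.priority)
        chosen.busy = True
        job.status = "RUNNING"
        job.assigned_to = chosen.name
        assignments.append((job.job_id, chosen.name))
    return assignments
```

In a running system this pass would be repeated on a timer, and the start-related fields of the job setting table would be updated after each assignment.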

3. Job pre-processing

After being assigned the job, the component server 5 calls the MAM extension server (Start Job MamEx) 4 for job pre-processing (step 3). The MAM extension server 4 executes the pre-processing required for the job (step 3-1), then completes the final job information (Job XML) and delivers it to the component server 5 (step 3-2).

4. Job processing

While processing the job (step 4), the component server 5 periodically reports its progress to the MAM server 2 (step 4-1). The MAM server 2 updates the information related to job progress (step 4-2).

5. Job post-processing

The component server 5 calls the MAM extension server (End Job MamEx) 4 for the job, passing along the job result and output information (step 5). After post-processing the job (for example, storing the job output information) (step 5-1), the MAM extension server 4 delivers the final job result to the component server 5 (step 5-2).

6. Job result notification

The component server 5 notifies the MAM server 2 of the final job result received from the MAM extension server 4 (step 6). The MAM server 2 updates the information related to job completion (step 6-2).
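Seen from the component server, steps 3 through 6 amount to wrapping the actual work between a start hook and an end hook while reporting progress. The sketch below shows that ordering only; `run_component_job`, the callable parameters, and the event list standing in for reports to the MAM server are hypothetical names, and no network protocol is modeled.

```python
from typing import Any, Callable, List, Tuple


def run_component_job(job: dict,
                      start_mamex: Callable[[dict], dict],
                      process: Callable[[dict, Callable[[int], None]], Any],
                      end_mamex: Callable[[dict, Any], dict],
                      mam_events: List[Tuple[str, Any]]) -> dict:
    """Run one component job in the order described in steps 3 to 6."""
    # Step 3 to 3-2: the start MamEx pre-processes and returns the final job info.
    final_job = start_mamex(job)
    # Step 4 and 4-1: do the work, reporting progress to the MAM server.
    outputs = process(final_job, lambda pct: mam_events.append(("progress", pct)))
    # Step 5 to 5-2: the end MamEx post-processes and returns the final result.
    final_result = end_mamex(final_job, outputs)
    # Step 6: notify the MAM server of the final result.
    mam_events.append(("finished", final_result))
    return final_result
```

Passing the hooks in as callables mirrors the idea that the pre- and post-processing logic lives in the extension server rather than in the component server itself.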

FIG. 6 is a diagram showing an example of metadata settings of a MAM server according to an embodiment of the present invention.

Referring to FIGS. 3 and 6, the MAM server 2 likewise stores and manages asset (metadata) settings not at the code level but in a metadata setting table. Metadata can be designed with the metadata design tool (MAM Designer) 33. Content that requires qualification, however, can be handled through the MAM extension server 4.

① The metadata design tool (MAM Designer) 33 performs the basic configuration of the metadata setting table. It sets the name of the metadata table, the sequence used to issue the primary key (hereinafter 'PK'), and the PK column.

② The metadata design tool (MAM Designer) 33 sets the MAM extension server 4 to be linked when an asset is created or deleted. On creation, the extension server mainly updates default values that require qualification (meta ID, file path, etc.); on deletion, it handles deletion of the physical file.

③ The metadata design tool (MAM Designer) 33 sets the metadata items, for example the field type, caption, and default value of each item. Depending on these settings, the metadata items of the authoring tool can be configured dynamically.

FIG. 7 is a diagram showing the metadata processing flow of a MAM server according to an embodiment of the present invention.

1. Asset schema setting

The asset schema is set using the metadata design tool (MAM Designer) 33 (step 1). The client 3 retrieves the asset schema information through the MAM server 2 (step 1-1) and then dynamically configures the metadata items (step 1-2).

2. Asset creation

The client 3 requests the MAM server 2 to create an asset based on the asset schema settings (step 2). The MAM server 2 creates the asset and updates its default values (step 2-1). The MAM server 2 calls the MAM extension server 4 for qualification of the asset creation (step 2-2), and the MAM extension server 4 runs the qualification logic related to asset creation and updates additional default values on the MAM server 2 (step 2-3). The MAM server 2 returns the new asset information to the client (step 2-4).

3. Asset deletion

When the client 3 requests the MAM server 2 to delete an asset (step 3), the MAM server 2 first deletes the referencing assets in the tables that use the asset as a PK (step 3-1). For example, in the case of the video table, the shot, face recognition, and sound source recognition tables and the like are processed for deletion. Any workflow running on the asset requested for deletion is then canceled (step 3-2). Next, the MAM server 2 calls the MAM extension server 4 for qualification of the asset deletion (step 3-3), and the MAM extension server 4 runs the qualification logic related to asset deletion (for example, file deletion) (step 3-4) and returns the result of the asset deletion to the MAM server 2 (step 3-5). The MAM server 2 returns the result of the asset deletion to the client 3 (step 3-6).
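The deletion order described above (dependent rows first, then running workflows, then the extension logic) can be sketched with plain in-memory tables. The table names, the `video_id` and `asset_id` keys, the status strings, and the `delete_files` callback standing in for the MAM extension server are all hypothetical; removing the asset row itself as the last action is also an assumption, since the description does not state when that happens.

```python
from typing import Callable, Dict, List


def delete_asset(db: Dict[str, List[dict]], asset_id: int,
                 delete_files: Callable[[int], bool]) -> dict:
    """Delete a video asset following the order of steps 3-1 to 3-6."""
    # Step 3-1: remove rows in tables that reference the asset by its PK.
    for table in ("shot", "face_recognition", "sound_recognition"):
        db[table] = [row for row in db[table] if row["video_id"] != asset_id]
    # Step 3-2: cancel workflows still running on this asset.
    cancelled = 0
    for workflow in db["workflow"]:
        if workflow["asset_id"] == asset_id and workflow["status"] == "RUNNING":
            workflow["status"] = "CANCELLED"
            cancelled += 1
    # Steps 3-3 to 3-5: extension logic, e.g. physical file deletion.
    files_deleted = delete_files(asset_id)
    # Assumption: the asset row itself is removed last.
    db["video"] = [row for row in db["video"] if row["id"] != asset_id]
    # Step 3-6: result returned to the client.
    return {"asset_id": asset_id,
            "cancelled_workflows": cancelled,
            "files_deleted": files_deleted}
```

Deleting dependents before the parent keeps the tables free of rows that point at a missing asset, which is the usual reason for this ordering.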

FIG. 8 is a diagram showing the overall workflow processing of a face recognition based character information providing system according to an embodiment of the present invention.

1. Content registration

The content registration unit (Ingest Manager) 6 copies the original content file and stores it in the storage 11-1 (step 1-1), and registers and stores the content metadata in the MAM server 2 (step 1-2).

2. Pre-processing job

The MAM server 2 assigns to the pre-processing server 13 the pre-processing jobs required for recognition and verification (step 2-1). The pre-processing server 13 performs pre-processing such as transcoding, cataloging, and frame extraction. Transcoding converts the original video into a streamable format, cataloging extracts scene change points, and frame extraction saves frames of the video as image files. The extracted frame images are input to the recognition server 7. The pre-processing server 13 then stores the pre-processing results in the storage 11-1 (step 2-1) and notifies the MAM server 2 that the job is complete (step 2-2).

3. Recognition job

The MAM server 2 assigns recognition jobs to the recognition server 7 using the pre-processed results. The recognition server 7 performs person recognition, context recognition, sound source recognition, and the like. Person recognition identifies the characters in the video, generates recommended profile images, and performs person clustering. Context recognition identifies objects, events, places, landmarks, and the like in the video. Sound source recognition identifies background music. The recognition server 7 then stores the recognition results in the storage 11-1 (step 3-1) and notifies the MAM server 2 that the job is complete (step 3-2).

4. Data verification

The manager terminal 8 uses the authoring tool to check the recognition results and corrects them through verification (step 4).

5. Result transmission

After verification is complete, the MAM server 2 delivers the results to the user terminal 19 (step 5).

FIG. 9 is a diagram showing the pre-processing workflow of a face recognition based character information providing system according to an embodiment of the present invention.

1. Transcoding

The MAM server 2 assigns a transcoding job to the transcoder 130 (step 1), and the transcoder 130 converts the original video (ts) into a search video (mp4) that can be played on the web. The input data of the transcoder 130 is the original video, and the output data is the search video (mp4, 640×360). The transcoder 130 generates the search video (mp4) and stores it in the storage 11-1 (step 1-1), registers the search video information with the MAM server 2 (step 1-2), and reports completion (step 1-3). When the original is supplied by streaming, the transcoding job may be omitted.

2. Cataloging

The MAM server 2 assigns a cataloging job to the cataloger 131 (step 2), and the cataloger 131 analyzes the original video and splits it into scenes at the scene change points. The input data is the original video, and the output data are shot images (jpg, 320×180). The cataloger 131 stores the shot images (jpg) in the storage 11-1 (step 2-1), registers the shot information with the MAM server 2 (step 2-2), and reports completion (step 2-3).

3. Frame extraction

The MAM server 2 assigns a frame extraction job to the frame extractor 132 (step 3), and the frame extractor 132 extracts the frame images to be used for person recognition and context recognition. For example, images for person recognition are extracted every 10 frames, and images for context recognition are extracted every 5 frames. The input data is the original video, and the output data are frame images (jpg, original video size). Transcoding, cataloging, and frame extraction can proceed simultaneously in parallel. The frame extractor 132 stores the frame images (jpg) in the storage 11-1 (step 3-1), registers the frame extraction information with the MAM server 2 (step 3-2), and reports completion (step 3-3).
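The two sampling intervals given as examples (every 10 frames for person recognition, every 5 for context recognition) select the frame indices computed in the sketch below. The function name `sampled_frame_indices` and the choice to start sampling at frame 0 are assumptions; decoding and saving the frames as jpg files is outside the scope of the sketch.

```python
from typing import List


def sampled_frame_indices(total_frames: int, step: int) -> List[int]:
    """Return the zero-based indices of frames kept when sampling every `step` frames."""
    if total_frames < 0 or step <= 0:
        raise ValueError("total_frames must be >= 0 and step must be > 0")
    return list(range(0, total_frames, step))


# Example intervals from the description: 10 for person, 5 for context recognition.
person_frames = sampled_frame_indices(30, 10)
context_frames = sampled_frame_indices(30, 5)
```

Under these assumptions every person-recognition frame is also a context-recognition frame, so a single decoding pass at the smaller interval could serve both purposes.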

The MAM server 2 may communicate with the transcoder 130, the cataloger 131, and the frame extractor 132 of the pre-processing server using the TCP protocol.

FIG. 10 is a diagram showing the person recognition workflow of a face recognition based character information providing system according to an embodiment of the present invention.

1. Character feature extraction job

When the MAM server 2 assigns a character feature extraction job to the person recognizer 133 (step 1), the person recognizer 133 calls the person recognition server 14 to extract recognition features of the characters (step 1-1). The person recognition server 14 extracts the recognition features of the characters from the gallery images built for them and stores the extracted feature files in the storage 11-1 (step 1-2). The person recognizer 133 checks the feature extraction status with the person recognition server 14 (step 1-3), registers the result with the MAM server 2, and reports completion (step 1-4). The input data of the person recognition server 14 are the gallery images of the characters, and the output data are the feature storage files of the characters. If the features of all characters have already been extracted, the character feature extraction job may be omitted.

2. Person recognition job

The person recognizer 133 is used to link the person recognition server 14 with the MAM server 2. When the MAM server 2 assigns a person recognition job to the person recognizer 133 (step 2), the person recognizer 133 calls the person recognition server 14 for person recognition (step 2-1). The person recognition server 14 recognizes persons using the extracted frame images and the character feature files, and outputs recommended representative profile pictures, person clustering images, per-frame feature files, and a recognition result file. The person recognition server 14 stores the frame feature files in the storage 11-1 (step 2-2), the person clustering files in the storage 11-1 (step 2-3), the recommended representative profile files in the storage 11-1 (step 2-4), and the person recognition result file in the storage 11-1 (step 2-5). The person recognizer 133 checks the person recognition status with the person recognition server 14 (step 2-6). The person recognizer 133 then checks the recognition result file in the storage 11-1 (step 2-7), registers the result with the MAM server 2, and reports completion (step 2-8).

FIG. 11 is a diagram showing the sound source recognition workflow of a face recognition based character information providing system according to an embodiment of the present invention.

1. Sound source job

The entire video is cut into a preset unit, for example 7 seconds, and stored in the storage 11-1 as wav files. The MAM server 2 assigns a sound source recognition job to the sound source recognizer 134 (step 1). The sound source recognizer 134 then extracts the wav files from the storage 11-1 (step 1-1). The sound source recognizer 134 links the sound source recognition server 16 with the MAM server 2 and calls sound source recognition on the sound source recognition server 16 using the wav files (step 1-2). The sound source recognition server 16 recognizes the sound source, and the sound source recognizer 134 registers the recognition result with the MAM server 2 (step 1-3). The recognition call (step 1-2) and the result registration (step 1-3) are performed repeatedly. After the job is complete, the sound source recognizer 134 deletes the wav files and reports completion to the MAM server 2 (step 1-4).

2. Merging of sound source recognition results

From the results recognized in the preset unit, for example 7 seconds, the MAM server 2 corrects misrecognized results and merges the valid ones (step 2). For example, items with duplicate time codes are deleted or merged with each other.
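One plausible way to merge per-segment results is shown below: segments without a match are dropped, exact duplicates collapse, and touching or overlapping segments that carry the same title are joined into one interval. This is a sketch under stated assumptions rather than the patented procedure; the `(start_sec, end_sec, title)` tuple layout and the function name `merge_recognitions` are hypothetical, and conflicting titles at the same time code are left unresolved.

```python
from typing import List, Optional, Tuple

Segment = Tuple[int, int, Optional[str]]  # (start_sec, end_sec, recognized title or None)


def merge_recognitions(results: List[Segment]) -> List[Tuple[int, int, str]]:
    """Merge per-segment recognition results into continuous intervals per title."""
    merged: List[Tuple[int, int, str]] = []
    # Drop segments with no recognized title, then process in time order.
    valid = sorted(r for r in results if r[2] is not None)
    for start, end, title in valid:
        if merged and merged[-1][2] == title and start <= merged[-1][1]:
            # Same title, touching or overlapping the previous interval: extend it.
            prev_start, prev_end, _ = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), title)
        else:
            merged.append((start, end, title))
    return merged
```

With 7-second segments, a song recognized in 0 to 7 and 7 to 14 seconds becomes a single 0 to 14 second entry, and a repeated 7 to 14 entry adds nothing.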

FIG. 12 is a diagram showing the context recognition workflow of a face recognition based character information providing system according to an embodiment of the present invention.

1. Context recognition job

The MAM server 2 assigns a context recognition job to the context recognizer 135 (step 1). The context recognizer 135, which links the context recognition server 15 with the MAM server 2, calls the context recognition server 15 and passes it the extracted frame images (step 1-1). The context recognition server 15 receives the extracted frame images, recognizes objects, events, places, landmarks, and video through context recognition, and stores the context recognition result file in the storage 11-1 (step 1-2). The context recognizer 135 checks the context recognition result file in the storage 11-1 (step 1-3), registers the context recognition result with the MAM server 2, and then reports completion (step 1-4).

FIG. 13 is a diagram showing a content registration screen through a Web UI according to an embodiment of the present invention, and FIG. 14 is a diagram showing a content registration screen through an ingest program for the watch folder (WatchFolder) method according to an embodiment of the present invention.

Referring to FIGS. 13 and 14, Web UI, watch folder, and streaming encoding methods are provided for content registration, so that the registration method can be chosen according to the circumstances of the system in operation.

FIG. 15 is a diagram showing the content registration and pre-processing process according to an embodiment of the present invention.

Referring to FIG. 15, the face recognition based character information providing system registers content and, once registration is complete, automatically runs the pre-processing jobs. The pre-processing jobs include frame image extraction, cataloging, and streaming video generation, and they can proceed simultaneously in parallel. When content is registered through the streaming encoding method, the streaming video generation process is omitted. After the frame image extraction job is complete, the recognition job is run automatically.
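The scheduling described above (pre-processing jobs in parallel, streaming video generation skipped for streaming ingest, recognition triggered once frame extraction finishes) can be sketched with Python's `concurrent.futures`. The function `run_preprocessing`, the task names used as dictionary keys, and the `on_frames_ready` callback are hypothetical; real jobs would run on separate component servers rather than in local threads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, Tuple


def run_preprocessing(content_id: str,
                      tasks: Dict[str, Callable[[str], Any]],
                      on_frames_ready: Callable[[str, Any], Any],
                      streaming_ingest: bool = False) -> Tuple[Dict[str, Any], Any]:
    """Run pre-processing tasks in parallel; start recognition after frame extraction.

    `tasks` must contain a "frame_extraction" entry.
    """
    # Streaming-encoded content already has a streamable video, so skip that task.
    selected = {name: fn for name, fn in tasks.items()
                if not (streaming_ingest and name == "streaming_video")}
    results: Dict[str, Any] = {}
    with ThreadPoolExecutor(max_workers=len(selected)) as pool:
        futures = {name: pool.submit(fn, content_id) for name, fn in selected.items()}
        # Recognition only depends on the extracted frames, not on the other tasks.
        frames = futures["frame_extraction"].result()
        recognition = on_frames_ready(content_id, frames)
        for name, future in futures.items():
            results[name] = future.result()
    return results, recognition
```

Waiting only on the frame extraction future before calling the recognition callback lets cataloging and video generation continue in the background.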

FIG. 16 is a diagram showing the device configuration for person recognition according to an embodiment of the present invention.

Referring to FIG. 16, person recognition is divided into a pre-processing job performed by the person recognizer 133 and a person recognition job performed by the person recognition server 14. The person recognizer 133 includes a GUI 1330 and, in order to link the person recognition server 14 with the MAM server 2, a MAM linker 1332 and a recognition server linker 1334. The MAM linker 1332 may be connected to the MAM server 2 over TCP/IP, and the recognition server linker 1334 may be connected to the person recognition server 14 through a RESTful interface. The person recognition server 14 includes a RESTful API 140 and a face recognition library 142. The pre-processing and recognition jobs proceed automatically through the MAM server 2.

FIG. 17 is a diagram showing the person recognition process according to an embodiment of the present invention.

Referring to FIGS. 16 and 17, after content registration is complete, the face recognizer 133 extracts frame images from the original video (for example, one image per 10 frames) and extracts recognition features of the characters using the stored person gallery. The face recognition job starts when the face recognizer 133 passes the extracted frame images and person feature values to the face recognition server 14. The face recognition server 14 recognizes the characters and generates recommended representative profile images, character clustering images, and frame feature storage files. The outputs of the recognition job are stored in the storage and the DB through the MAM server 2.

FIG. 18 is a diagram showing the device configuration for sound source recognition according to an embodiment of the present invention.

Referring to FIG. 18, the sound source recognizer 134 includes a GUI 1340, a MAM linker 1342, and a sound source recognition library 1344. The MAM linker 1342 links the sound source recognizer 134 with the MAM server 2 over TCP/IP. The sound source recognizer 134 recognizes music in the video using the sound source recognition library 1344.

FIG. 19 is a diagram illustrating a sound source recognition process according to an embodiment of the present invention.

Referring to FIGS. 18 and 19, the sound source recognizer 134 extracts WAV files from the original video for sound source recognition. Because the sections of the content in which a sound source plays are not known in advance, WAV files are generated for every section of the content in a preset unit (for example, 7-second units). Sound source recognition is run on the WAV files, and the primary results are stored in the DB. Sound source recognition is performed using the sound source recognition library 1344. Based on the stored primary results, the MAM server 2 selects the valid recognition results (result reprocessing) and stores the final result. The sound source recognition task proceeds automatically through the MAM server 2.
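
As a minimal sketch of the segmentation step above, the boundaries of the preset-unit sections (7 seconds in the example) can be computed as shown below. The function name `segment_bounds` is hypothetical, and shortening the last section when the duration is not a multiple of the unit is an assumption; the description does not state how the final partial section is handled.

```python
def segment_bounds(duration_sec: float, unit_sec: float = 7.0) -> list[tuple[float, float]]:
    """Split [0, duration_sec) into consecutive sections of `unit_sec` seconds.

    Assumption: the last section is shortened if the duration is not a
    multiple of the unit.
    """
    if unit_sec <= 0:
        raise ValueError("unit_sec must be positive")
    bounds = []
    start = 0.0
    while start < duration_sec:
        end = min(start + unit_sec, duration_sec)
        bounds.append((start, end))
        start = end
    return bounds


# Example: a 20-second clip yields three sections, the last one 6 seconds long.
for start, end in segment_bounds(20.0):
    print(f"{start:.1f}s to {end:.1f}s")
```

Each section would then be written out as its own WAV file and passed to the recognition library.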

FIG. 20 is a diagram illustrating a device configuration for situation recognition according to an embodiment of the present invention.

Referring to FIG. 20, the situation recognition task is divided into a pre-processing task performed by the situation recognizer 135 and a situation recognition task performed by the situation recognition server 15. The situation recognizer 135 includes a GUI 1350 and, to link the situation recognition server 15 with the MAM server 2, a MAM linker 1352 and a recognition server linker 1354. The MAM linker 1352 may be connected to the MAM server 2 over TCP/IP, and the recognition server linker 1354 may be connected to the situation recognition server 15 through a RESTful interface. The situation recognition server 15 includes a RESTful API 150 and a situation recognition library 152. The pre-processing and recognition tasks proceed automatically through the MAM server 2.

FIG. 21 is a diagram illustrating a situation recognition process according to an embodiment of the present invention.

Referring to FIGS. 20 and 21, after content registration is completed, the situation recognizer 135 extracts frame images from the video through a pre-processing task (for example, one image per 5 frames). The extracted frame images are then passed to the situation recognition server 15, and the recognition task starts. All recognition tasks are processed sequentially as a single job. The outputs of the recognition task of the situation recognition server 15 (object recognition results, event recognition results, place recognition results, landmark recognition results, and video recognition results) are stored in the storage and the DB through the MAM server 2.
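
The sequential handling of all recognition tasks within one job can be illustrated roughly as follows. Everything named here is hypothetical: `run_situation_job`, the `Recognizer` type, and the stub recognizers stand in for whatever the situation recognition library actually exposes, which the description does not specify.

```python
from typing import Callable

# A recognizer takes the extracted frame image paths and returns labels.
Recognizer = Callable[[list[str]], list[str]]


def run_situation_job(frames: list[str],
                      recognizers: dict[str, Recognizer]) -> dict[str, list[str]]:
    """Run every recognizer one after another inside a single job.

    The dict preserves insertion order, so the recognizers run in the
    order in which they were registered.
    """
    results: dict[str, list[str]] = {}
    for kind, recognize in recognizers.items():
        results[kind] = recognize(frames)
    return results


# Stub recognizers for illustration only.
job_results = run_situation_job(
    ["frame_0000.jpg", "frame_0005.jpg"],
    {
        "object": lambda frames: ["car"],
        "event": lambda frames: ["wedding"],
        "place": lambda frames: ["street"],
    },
)
print(job_results)
```

The returned dictionary groups the results by recognition type, which mirrors the separate result kinds listed above.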

Hereinafter, an administrator's authoring tool for managing content and recognition tasks is described with reference to FIGS. 22 to 28.

FIG. 22 is a diagram illustrating an authoring tool screen according to an embodiment of the present invention.

Referring to FIG. 22, the face recognition-based character information providing system provides a web-based authoring tool for checking and managing registered content and deep meta recognition results. The authoring tool screen consists of a content management page 2200, a storyboard page 2210, a verification page 2220, a person management page 2230, a job history page, an administrator page, and the like, and may be displayed on the administrator screen.

The content management page according to an embodiment includes a content and metadata registration screen, a content confirmation screen, a metadata confirmation screen, a pre-processing command input screen, and a recognition command input screen.

The character management page according to an embodiment includes a screen of profile images recommended by the recognition server, a screen of clustering images in which images of the same person are grouped, and a search screen that searches for and presents persons related to an input portrait photo.

The storyboard page according to an embodiment includes a storyboard-shot screen for checking the shot extraction results generated through cataloging and for creating and managing scenes based on shots; a storyboard-sound source recognition screen for checking sound source recognition results scene by scene, correcting misrecognized results, and deleting or merging items with duplicate timecodes; and a storyboard-object recognition screen for checking object recognition results and editing misrecognized results.

The verification page according to an embodiment includes a screen displaying the characters corresponding to the playback time, a sound source display screen, a screen for checking additional information on a character, a screen guiding character appearance points, and a screen presenting related content at the ending point.

FIG. 23 is a diagram illustrating a content management screen of the authoring tool according to an embodiment of the present invention.

Referring to FIG. 23, the page for managing registered content can be used to register and look up content (2300), check the management status of each content item (2310), check metadata (2320), and execute pre-processing and post-processing commands (2330, 2340). Examples of the pre-processing command 2330 include a transcoding request, a cataloging request, a frame extraction request, and a person feature extraction request. Examples of the post-processing command 2340 include a face recognition request, a sound source recognition request, and an object recognition request.

FIG. 24 is a diagram illustrating a character management screen of the authoring tool according to an embodiment of the present invention.

Referring to FIG. 24, for content whose face recognition has been completed, profile photos can be changed and photos can be added to the gallery through the character menu of the authoring tool. Character management offers three ways to change a profile photo and add to the gallery. In the first step, a list 2410 of profile images recommended by the face recognition server is provided. In the second step, a list 2420 of clustering images, in which images of the same person are grouped, is provided. In the third step, a list 2430 of persons similar to an input portrait photo is searched for and provided. Changing the profile photo and adding to the gallery are possible in every step.

FIG. 25 is a diagram illustrating a storyboard (shot) management screen of the authoring tool according to an embodiment of the present invention.

Referring to FIG. 25, for content whose pre-processing and recognition tasks have been completed, the shot extraction results and recognition results can be checked through the storyboard. The shot page provides functions for checking the shot extraction results generated through cataloging (2500, 2510) and for adding, deleting, and moving scenes and editing shots on a shot basis (2520). When editing a shot, the shot start point can be changed frame by frame (2530).

FIG. 26 is a diagram illustrating a storyboard (sound source recognition) management screen of the authoring tool according to an embodiment of the present invention.

Referring to FIG. 26, the sound source recognition page of the storyboard provides functions for checking sound source recognition results scene by scene (2600, 2610) and for easily correcting misrecognized results. For example, it displays a sound source warning for items with duplicate timecodes (2620), allows the duplicated sound sources to be deleted or merged with each other (2630), and allows the edits to be reset.
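
One way to picture the duplicate-timecode warning and the merge operation is sketched below. It assumes that "duplicate timecodes" means overlapping time ranges, which the description does not define precisely; the `SoundHit` record and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SoundHit:
    title: str
    start: float  # seconds from the beginning of the content
    end: float


def find_overlaps(hits: list[SoundHit]) -> list[tuple[SoundHit, SoundHit]]:
    """Return the pairs of hits whose time ranges overlap (candidates for a warning)."""
    ordered = sorted(hits, key=lambda h: h.start)
    pairs = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start >= a.end:
                break  # later hits start even later, so none can overlap `a`
            pairs.append((a, b))
    return pairs


def merge_hits(a: SoundHit, b: SoundHit, title: str) -> SoundHit:
    """Merge two overlapping hits into one entry that covers both time ranges."""
    return SoundHit(title, min(a.start, b.start), max(a.end, b.end))


hits = [SoundHit("Song A", 0, 14), SoundHit("Song B", 7, 21), SoundHit("Song C", 30, 37)]
for a, b in find_overlaps(hits):
    print(f"warning: '{a.title}' and '{b.title}' overlap")
    print(merge_hits(a, b, title=a.title))
```

In this sketch the administrator's choice of which title to keep is passed in explicitly, reflecting that the correction is a manual editing step.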

FIG. 27 is a diagram illustrating a storyboard (object recognition) management screen of the authoring tool according to an embodiment of the present invention.

Referring to FIG. 27, the object recognition page of the storyboard provides a summary of the object recognition results (2700) and a function for easily editing misrecognized results (2710). For each recognized object class, a summary of the recognition results and the corresponding frame images can be checked.
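
A per-class summary of this kind could be built by grouping detections by class label, as in the following sketch. The input format (pairs of frame index and class label) and the function name `summarize_by_class` are assumptions made for illustration.

```python
from collections import defaultdict


def summarize_by_class(detections: list[tuple[int, str]]) -> dict[str, list[int]]:
    """Group detections by object class.

    Each detection is a (frame_index, class_label) pair; the result maps
    each class label to the frame indices in which it was recognized.
    """
    summary: dict[str, list[int]] = defaultdict(list)
    for frame_index, label in detections:
        summary[label].append(frame_index)
    return dict(summary)


detections = [(0, "car"), (5, "person"), (10, "car"), (15, "dog")]
for label, frames in summarize_by_class(detections).items():
    print(f"{label}: {len(frames)} frame(s) {frames}")
```

Removing a misrecognized entry would then amount to deleting one frame index from the list of the affected class.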

FIG. 28 is a diagram illustrating a verification screen of the authoring tool according to an embodiment of the present invention.

Referring to FIG. 28, the verification page is a page on which deep meta results can be verified. The verification page provides a display of the characters corresponding to the playback time, a sound source display, a screen for checking additional information on a character, a guide to character appearance points, and a guide to related content at the ending point. For example, it displays the characters corresponding to the playback time (2800), displays the sound source (2810), and displays detailed information on a character (2820) so that these can be checked. It also guides the user to the points where a character appears (2830) and presents related content at the ending point (2840). By displaying deep metadata in real time during video playback in this way, it provides an effective data verification function.
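
Displaying the characters that correspond to the current playback time amounts to a lookup over appearance intervals. The sketch below assumes that each recognition result is stored as a half-open time interval per character; the `Appearance` record and the function `characters_at` are hypothetical names, not part of the described system.

```python
from dataclasses import dataclass


@dataclass
class Appearance:
    name: str
    start: float  # seconds, inclusive
    end: float    # seconds, exclusive


def characters_at(appearances: list[Appearance], playback_time: float) -> list[str]:
    """Return the names of the characters on screen at `playback_time`."""
    return [a.name for a in appearances if a.start <= playback_time < a.end]


appearances = [
    Appearance("Character A", 0.0, 12.0),
    Appearance("Character B", 8.0, 20.0),
    Appearance("Character A", 30.0, 45.0),
]
print(characters_at(appearances, 10.0))  # both intervals cover t = 10
print(characters_at(appearances, 25.0))  # no interval covers t = 25
```

The start times of a given character's intervals would likewise serve as the appearance points that the page guides the user to.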

The present invention has been described above with a focus on its embodiments. Those of ordinary skill in the art to which the present invention pertains will understand that the present invention may be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered in an illustrative rather than a limiting sense. The scope of the present invention is defined by the claims rather than by the foregoing description, and all differences within a scope equivalent thereto should be construed as being included in the present invention.

Claims

A workflow processing method of a face recognition-based character information providing system, the method comprising:
a step in which a content registration unit copies an original content file and stores it in storage, and registers and stores content metadata in a media asset management (MAM, hereinafter referred to as 'MAM') server;
a step in which the MAM server assigns, to a pre-processing server, pre-processing tasks required for recognition and verification;
a step in which the pre-processing server performs pre-processing including transcoding, cataloging, and frame extraction;
a step in which, when the MAM server assigns a recognition task to a recognition server using the pre-processed results, the recognition server recognizes a person, a situation, and a sound source in the video;
a step in which an administrator terminal checks the recognition results using an authoring tool and corrects them through verification; and
a step in which, after verification is completed, the MAM server delivers the results to a user terminal,
wherein the step of verifying and correcting
provides a web-based authoring tool for checking and managing registered content and deep meta recognition results, and
the authoring tool provides a content management page, a storyboard page, a verification page, a character management page, and an administrator page.

The method of claim 1, wherein the step of performing the pre-processing comprises:
the MAM server assigning a transcoding task to a transcoder, and the transcoder converting the original video into a video for searching that is playable on the web;
the MAM server assigning a cataloging task to a cataloger, and the cataloger analyzing the original video and dividing it at scene change points to output shot images; and
the MAM server assigning a frame extraction task to a frame extractor, and the frame extractor extracting frame images,
wherein the steps of the pre-processing are performed simultaneously in parallel, and
communication between the MAM server and the transcoder, the cataloger, and the frame extractor of the pre-processing server is performed using TCP.

The method of claim 1, wherein the step of recognizing the person, the situation, and the sound source in the video comprises:
the MAM server assigning a character feature extraction task to a person recognizer;
the person recognizer calling a person recognition server to extract features for recognizing characters;
the called person recognition server extracting the features for recognizing the characters using stored gallery images of the characters;
the person recognizer checking the feature extraction status in the person recognition server and registering the check result in the MAM server;
the MAM server assigning a person recognition task to the person recognizer;
the person recognizer calling the person recognition server for person recognition;
the called person recognition server recognizing persons using the extracted frame images and the character feature file, and providing a representative profile recommendation picture, a person clustering image, a feature file for each frame, and a recognition result file; and
the MAM server storing the recognition results.

The method of claim 1, wherein the step of recognizing the person, the situation, and the sound source in the video comprises:
the MAM server assigning a sound source recognition task to a sound source recognizer;
the sound source recognizer extracting a WAV file from the storage and calling a sound source recognition server using the extracted WAV file;
the sound source recognition server recognizing a sound source using a sound source recognition library and registering the sound source recognition result in the MAM server;
the MAM server correcting misrecognized results among the results recognized in a preset unit and merging the valid results; and
the MAM server storing the final result.

The method of claim 4, wherein the step of merging the valid results comprises:
providing sound source recognition results scene by scene through a sound source recognition page of the storyboard; and
providing an editing screen that displays a sound source warning for items with duplicate timecodes and allows the duplicated sound sources to be deleted or merged with each other.

The method of claim 1, wherein the step of recognizing the person, the situation, and the sound source in the video comprises:
the MAM server assigning a situation recognition task to a situation recognizer;
the situation recognizer transmitting the extracted frame images to a situation recognition server and calling the situation recognition server; and
the situation recognition server receiving the extracted frame images, recognizing objects, events, places, landmarks, and videos through situation recognition, and providing the recognition results.

(Deleted)

The method of claim 1,
wherein the content management page includes a content and metadata registration screen, a content confirmation screen, a metadata confirmation screen, a pre-processing command input screen, and a recognition command input screen.

The method of claim 1,
wherein the character management page includes a screen of profile images recommended by the recognition server, a screen of clustering images in which images of the same person are grouped, and a search screen that searches for and presents persons related to an input portrait photo.

The method of claim 1,
wherein the storyboard page includes a storyboard-shot screen for checking the shot extraction results generated through cataloging and for creating and managing scenes based on shots; a storyboard-sound source recognition screen for checking sound source recognition results scene by scene, correcting misrecognized results, and deleting or merging items with duplicate timecodes; and a storyboard-object recognition screen for checking object recognition results and editing misrecognized results.

The method of claim 1,
wherein the verification page includes a screen displaying the characters corresponding to the playback time, a sound source display screen, a screen for checking additional information on a character, a screen guiding character appearance points, and a screen presenting related content at the ending point.