KR20220160316A

KR20220160316A - Method for generating customized video based on objects and service server using the same

Info

Publication number: KR20220160316A
Application number: KR1020210068382A
Authority: KR
Inventors: 김승진; 이종혁; 안재철; 변우식; 조성택; 김민경; 김성호; 장준기
Original assignee: 네이버 주식회사
Priority date: 2021-05-27
Filing date: 2021-05-27
Publication date: 2022-12-06
Also published as: KR102507873B1

Abstract

The present application relates to a method for generating a customized video for each object and a service server using the same. A method for generating a customized video for each object according to an embodiment of the present invention may include the steps of: analyzing an original video and identifying at least one object appearing in each frame image included in the original video; cropping the object from the frame image and generating a cropped image corresponding to each of the objects; and collecting the cropped images for each object and generating a customized video for each object.

Description

Method for generating customized video based on objects and service server using the same

The present application relates to a method for generating a customized video for each object, which can automatically generate a customized video for each object appearing in a video, and to a service server using the same.

Recently, with the spread of networks such as the Internet and the increasing capability of the devices used by general users (personal computers, smartphones, cameras, etc.), video services that provide the various content users want have come into wide use.

In addition, with the popularity of K-POP and the growth of idol fandom culture, idols' stage videos are being played more and more often on video services, and demand has recently been growing for content that focuses on the performance of a specific member, such as member-specific fancams. In response, major broadcasters and others also provide video services in which fancams are separately produced by filming each member individually.

However, member-specific fancams have a limitation in that a person must either film a separate fancam for each member or separately edit each member's footage out of an already filmed original video and upload it.

An object of the present application is to provide a method for generating a customized video for each object, which can automatically generate a customized video for each object appearing in a video, and a service server using the same.

Another object of the present application is to provide a method for generating a customized video for each object, and a service server using the same, which, when footage is shot with a plurality of cameras, can generate an optimal customized video by automatically switching to the shot in which the object appears best.

Another object of the present application is to provide a method for generating a customized video for each object, and a service server using the same, which can display the customized video for each object with various user experiences (UX).

A method for generating a customized video for each object according to an embodiment of the present invention is performed by a service server and may include the steps of: analyzing an original video and identifying at least one object appearing in each frame image included in the original video; cropping the object from the frame image and generating a cropped image corresponding to each of the objects; and collecting the cropped images for each object and generating a customized video for each object.

A service server according to an embodiment of the present invention may include: an object identification unit that analyzes an original video and identifies at least one object appearing in each frame image included in the original video; an image cropping unit that crops the object from the frame image and generates a cropped image corresponding to each of the objects; and an editing unit that collects the cropped images for each object and generates a customized video for each object.

The means for solving the problem described above do not enumerate all the features of the present invention. The various features of the present invention, and the advantages and effects that follow from them, will be understood in more detail with reference to the specific embodiments below.

According to the method for generating a customized video for each object and the service server using the same according to an embodiment of the present invention, a customized video can be generated automatically for each object appearing in a video. Real-time processing of an original video, or processing of a large volume of original videos, is therefore possible.

According to the method for generating a customized video for each object and the service server using the same according to an embodiment of the present invention, it is possible to switch automatically, among a plurality of target videos, to the shot in which the object appears best, so that an optimal customized video can be generated for that object.

However, the effects achievable by the method for generating a customized video for each object and the service server using the same according to embodiments of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention belongs.

FIG. 1 is a schematic diagram showing a system for generating a customized video for each object according to an embodiment of the present invention.
FIG. 2 is a block diagram showing a service server according to an embodiment of the present invention.
FIG. 3 is an exemplary view showing an original video according to an embodiment of the present invention.
FIG. 4 is an exemplary view showing customized videos for each object according to an embodiment of the present invention.
FIG. 5 is a schematic diagram showing object recognition in a frame image according to an embodiment of the present invention.
FIG. 6 is a schematic diagram showing cropped images according to an embodiment of the present invention.
FIGS. 7 and 8 are schematic diagrams showing the display of an original video and a customized video on a user terminal according to an embodiment of the present invention.
FIG. 9 is a schematic diagram showing the shooting of an original video according to an embodiment of the present invention.
FIG. 10 is a block diagram showing a service server according to another embodiment of the present invention.
FIG. 11 is a flowchart showing a method for generating a customized video for each object according to an embodiment of the present invention.

Hereinafter, the embodiments disclosed in this specification are described in detail with reference to the accompanying drawings. The same or similar components are given the same reference numerals regardless of the figure in which they appear, and redundant descriptions of them are omitted. The suffixes "module" and "unit" used for components in the following description are assigned or used interchangeably only for ease of drafting the specification, and do not in themselves have distinct meanings or roles. That is, the term 'unit' as used in the present invention means a software component or a hardware component such as an FPGA or ASIC, and a 'unit' performs certain roles. However, a 'unit' is not limited to software or hardware. A 'unit' may be configured to reside in an addressable storage medium and may be configured to execute on one or more processors. Thus, as an example, a 'unit' includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, properties, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables. The functionality provided within the components and 'units' may be combined into a smaller number of components and 'units' or further separated into additional components and 'units'.

In addition, in describing the embodiments disclosed in this specification, detailed descriptions of related known technologies are omitted where they are judged likely to obscure the gist of the embodiments. The accompanying drawings are provided only to make the embodiments disclosed in this specification easier to understand; the technical idea disclosed in this specification is not limited by the accompanying drawings and should be understood to include all changes, equivalents, and substitutes falling within the spirit and technical scope of the present invention.

FIG. 1 is a schematic diagram showing a system for generating a customized video for each object according to an embodiment of the present invention.

Referring to FIG. 1, a system for generating a customized video for each object according to an embodiment of the present invention may include a user terminal 1 and a service server 100.

The user terminal 1 may execute various types of applications and may present a running application to the user visually, audibly, or otherwise. The user terminal 1 may include a display unit for visually displaying an application, and may also include an input unit that receives the user's input, a communication unit, a memory storing at least one program, and a processor.

The user terminal 1 may be a mobile terminal such as a smartphone or a tablet PC and, depending on the embodiment, may also be a stationary device such as a desktop. Specifically, the user terminal 1 may include a mobile phone, a smartphone, a laptop computer, a digital broadcasting terminal, a personal digital assistant (PDA), a portable multimedia player (PMP), a slate PC, a tablet PC, an ultrabook, a wearable device (for example, a smartwatch, smart glasses, or a head mounted display (HMD)), and the like.

The user terminal 1 may access an app store, a play store, or the like to download and install various applications. Depending on the embodiment, applications may also be downloaded through wired or wireless communication with the service server 100 or another device (not shown).

Meanwhile, the user terminal 1 may be connected to the service server 100 through a communication network. Here, the communication network may include wired and wireless networks, and specifically may include various networks such as a local area network (LAN), a metropolitan area network (MAN), and a wide area network (WAN). The communication network may also include the well-known World Wide Web (WWW). However, the communication network according to the present invention is not limited to the networks listed above and may include a known wireless data network, a known telephone network, a known wired or wireless television network, and the like.

The applications executed on the user terminal 1 may include a video application, and the user terminal 1 may use the video application to play a video provided by the service server 100. Here, the service server 100 may provide videos in various ways, such as real-time streaming and video on demand (VOD).

The service server 100 may be connected to the user terminal 1 through an application and may provide various online services in response to requests from the user terminal 1. Here, the service server 100 may provide a video service that supplies various videos to the user terminal 1; in this case, the service server 100 may provide an original video (OV) stored in a database (not shown) or the like to the user terminal 1 at the request of the user terminal 1. In addition, depending on the embodiment, the service server 100 may identify each object appearing in the original video (OV), generate a customized video for each object, and provide the customized video of the corresponding object to the user terminal 1 at the request of the user terminal 1.

For example, as shown in FIG. 3, the service server 100 may collect performance videos of various musicians as original videos (OV). For a musical act composed of a plurality of members (M1, M2, M3, M4, M5), the service server 100 may edit the original video (OV) to generate a customized video (PV1, PV2, PV3, PV4, PV5) for each member, as shown in FIG. 4.

That is, for performance videos of idols and the like, demand for watching a specific member's performance intensively is steadily growing, and major broadcasters and others accordingly provide services such as separately producing individual member cams by filming each member on their own. Conventionally, however, customized videos for each member have been produced by human labor, either by filming each member separately or by separately editing each member's footage out of the already filmed original video (OV). Because manpower is limited, it is difficult in this case to extract customized videos from the original video (OV) in real time or to process a large volume of original videos (OV) quickly.

In contrast, the service server 100 according to an embodiment of the present invention can automatically generate customized videos corresponding to the respective objects from the collected original video (OV), so real-time processing or bulk processing of customized videos can be performed easily. Although a musician's performance video is described here as an example, the invention is not limited to this, and the service server 100 according to an embodiment of the present invention may generate customized videos from original videos in various fields. For example, it is possible to generate a customized video of a specific player or a specific position from broadcast footage of a sports game, or to generate a customized video of the movements of a specific person or a specific vehicle from security footage captured by a surveillance camera or the like. Hereinafter, the operation of the service server 100 according to an embodiment of the present invention is described with reference to FIG. 2.

Referring to FIG. 2, the service server 100 according to an embodiment of the present invention may include an object identification unit 110, an image cropping unit 120, an editing unit 130, and a transmission unit 140.

The object identification unit 110 may analyze the original video (OV) to identify the objects appearing in each frame image included in the original video (OV). The original video (OV) may include a plurality of frame images and may represent motion by displaying those frame images in succession at a set frame rate. Accordingly, the object identification unit 110 may extract frame images from the received original video (OV) and may identify the objects included in the frame images by performing image processing on the extracted frame images. Depending on the embodiment, the object identification unit 110 may also generate the frame images by capturing the original video (OV) at an arbitrary frame rate.

In addition, the object identification unit 110 may detect the objects included in each frame image and may distinguish a detected object from other objects. Specifically, the object identification unit 110 may learn in advance, using a machine learning algorithm, features such as the shape or luminance of the objects to be extracted, and may use what it has learned to detect the objects included in each frame image. For example, the object identification unit 110 may repeatedly learn the shapes of various people using a machine learning algorithm and may thereby extract the shapes of people included in a frame image as objects. Here, the object identification unit 110 may use machine learning algorithms such as CNN, RNN, principal component analysis (PCA), logistic regression, and decision trees.

Thereafter, as shown in FIG. 5, the object identification unit 110 may mark the extracted objects in the frame image (fi1) with bounding boxes (b1, b2, b3) or the like, and may identify each object by comparing the feature points appearing within the bounding boxes (b1, b2, b3). Here, the object identification unit 110 may generate identification information for each identified object, and the identification information may include an identification ID for distinguishing the objects, coordinate information of the bounding box, a confidence level for the identification result of the object, and the like.

Depending on the embodiment, the object identification unit 110 may first distinguish the objects in the first frame image (fi1) of the original video (OV) using their respective bounding boxes (b1, b2, b3) and initially assign each a different identification ID. For the remaining frame images, it may then compare the feature points of the objects appearing in the current frame image with the feature points of the objects in the previous frame image to check whether they are the same, and assign the same identification ID to objects found to be the same, thereby identifying each object.
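The ID assignment described above can be sketched as follows. This is only an illustration under stated assumptions, not the method of the embodiment: each detection is assumed to already carry a feature vector, identity is checked with cosine similarity against a hypothetical threshold of 0.8, and matching is greedy. The names `Detection`, `cosine`, and `assign_ids` are introduced here for illustration.

```python
import math
from dataclasses import dataclass
from itertools import count


@dataclass
class Detection:
    box: tuple           # (x, y, width, height) of the bounding box
    feature: tuple       # feature vector taken from inside the box (assumed given)
    confidence: float    # confidence of the identification result
    object_id: int = -1  # identification ID, filled in by assign_ids


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_ids(frames, threshold=0.8):
    """frames: list of per-frame Detection lists, in time order."""
    next_id = count(1)
    previous = []  # detections of the previous frame
    for detections in frames:
        used = set()
        for det in detections:
            best, best_sim = None, threshold
            for prev in previous:
                if prev.object_id in used:
                    continue  # each previous object is matched at most once
                sim = cosine(det.feature, prev.feature)
                if sim >= best_sim:
                    best, best_sim = prev, sim
            # Reuse the ID of the matching previous object, else issue a new one.
            det.object_id = best.object_id if best is not None else next(next_id)
            used.add(det.object_id)
        previous = detections
    return frames
```

Because the sketch compares only against the immediately preceding frame, an object that leaves the picture and returns would receive a new ID; a practical tracker would likely keep a longer history.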

The image cropping unit 120 may crop the objects from the frame image to generate a cropped image corresponding to each object. Referring to FIGS. 5 and 6, the object identification unit 110 may generate bounding boxes (b1, b2, b3) corresponding to the respective objects in the frame image (fi1), and the image cropping unit 120 may cut the frame image (fi1) along the bounding boxes (b1, b2, b3) to generate the cropped images (ci1, ci2, ci3).

Although FIG. 6 shows the image cropping unit 120 generating the cropped images (ci1, ci2, ci3) by cutting the frame image (fi1) to the same width and height as the bounding boxes (b1, b2, b3), depending on the embodiment it is also possible to cut with a margin of a certain size around the bounding boxes (b1, b2, b3). Furthermore, after cutting the frame image (fi1), the image cropping unit 120 may enlarge or reduce the result to a set size to generate the cropped images (ci1, ci2, ci3).
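A minimal sketch of cropping with a margin and rescaling to a set size is given below. It assumes, purely for illustration, that a frame is a list of pixel rows and that a bounding box is an `(x, y, width, height)` tuple; the nearest-neighbour scaling is a stand-in chosen for brevity, since the text does not specify a scaling method. The function names are hypothetical.

```python
def crop_with_margin(frame, box, margin=0):
    """Cut `box` out of `frame`, widened by `margin` pixels and clamped to the frame.

    frame: list of pixel rows; box: (x, y, width, height).
    """
    height, width = len(frame), len(frame[0])
    x, y, w, h = box
    left = max(0, x - margin)
    top = max(0, y - margin)
    right = min(width, x + w + margin)
    bottom = min(height, y + h + margin)
    return [row[left:right] for row in frame[top:bottom]]


def resize_nearest(image, out_w, out_h):
    """Enlarge or reduce `image` to out_w x out_h by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]
```

Clamping to the frame edges means a box near the border yields a smaller crop than one in the middle; rescaling afterwards brings every crop back to the same set size.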

The editing unit 130 may collect the cropped images for each object to generate a customized video for each object. Since the cropped images are generated per object, collecting each object's cropped images in chronological order yields a customized video corresponding to that object. That is, as shown in FIG. 4, the editing unit 130 may generate the customized videos (PV1, PV2, PV3, PV4, PV5) of the individual members by collecting the cropped images of the respective members (M1, M2, M3, M4, M5). Since each customized video (PV1, PV2, PV3, PV4, PV5) contains only the dancing of one specific member, the user can use it to focus on the movements of the member they want to watch.
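The per-object collection step can be illustrated with a short sketch. It assumes each cropped image arrives as a hypothetical `(frame_index, object_id, image)` record; the record layout and the name `collect_by_object` are not from the source.

```python
from collections import defaultdict


def collect_by_object(crops):
    """Group cropped images by object and order each group by frame index.

    crops: iterable of (frame_index, object_id, image) records, in any order.
    Returns {object_id: [image, ...]} with images in chronological order.
    """
    tracks = defaultdict(list)
    for frame_index, object_id, image in crops:
        tracks[object_id].append((frame_index, image))
    return {
        object_id: [image for _, image in sorted(items, key=lambda item: item[0])]
        for object_id, items in tracks.items()
    }
```

Each resulting list is the frame sequence of one customized video and could then be handed to a video encoder.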

Meanwhile, for an object that appears continuously throughout the playing time of the original video (OV), the number of cropped images generated by the image cropping unit 120 equals the number of frame images included in the original video (OV). Accordingly, the editing unit 130 may collect the cropped images to generate a customized video of the same length as the original video (OV).

In some cases, however, an object may briefly disappear from the original video (OV) and then reappear. The cropped images corresponding to some frame images are then missing, which can cause problems such as the customized video falling out of sync with the original video (OV). When the cropped image corresponding to a frame image is missing in this way, the editing unit 130 may synchronize the customized video with the original video (OV) by inserting a duplicate of the immediately preceding cropped image, or by inserting an arbitrary image containing a message indicating that the object does not appear.
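The gap-filling behaviour can be sketched as follows, assuming the crops of one object are held in a dictionary keyed by frame index. The function name `fill_missing` and its parameters are hypothetical; passing a `placeholder` selects the "object not shown" image variant, and omitting it selects the repeat-previous-crop variant.

```python
def fill_missing(crops_by_frame, frame_count, placeholder=None):
    """Return one crop per source frame so the result stays in sync with the source.

    crops_by_frame: {frame_index: crop} for a single object.
    frame_count:    number of frame images in the original video.
    placeholder:    optional image to insert where the object is absent.
    """
    filled = []
    last = None  # most recent real crop seen so far
    for index in range(frame_count):
        crop = crops_by_frame.get(index)
        if crop is not None:
            last = crop
        elif placeholder is not None:
            crop = placeholder
        else:
            crop = last  # stays None if the object has not appeared yet
        filled.append(crop)
    return filled
```

Either way the output has exactly `frame_count` entries, which is what keeps the customized video the same length as the original.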

The transmission unit 140 may transmit the generated customized videos to each user terminal 1. The user terminal 1 may request the original video (OV) or a customized video from the service server 100, and the transmission unit 140 may provide the original video (OV) or the customized video according to the request of the user terminal 1.

As shown in FIG. 4, when a plurality of customized videos (PV1, PV2, PV3, PV4, PV5) have been generated, the transmission unit 140 may arrange them and transmit them configured so that the customized videos (PV1, PV2, PV3, PV4, PV5) are displayed simultaneously on the screen of the user terminal 1.

In addition, as shown in FIG. 7, when the user's selection input for any one of the plurality of objects is received, the transmission unit 140 may cause only the customized video (PV) of the object corresponding to the selection input to be displayed. In this case, it is also possible to configure and transmit the videos so that the original video (OV) is displayed simultaneously with the customized video (PV) of the selected object.

Depending on the embodiment, when the user selects one of the customized videos while the plurality of customized videos (PV1, PV2, PV3, PV4, PV5) are displayed as in FIG. 4, the display may be configured to show the selected customized video (PV) and the original video (OV) simultaneously, as in FIG. 7.

In addition, the transmission unit 140 may transmit the customized video (PV) so that it is displayed as shown in FIG. 8. That is, within the original video (OV), the region corresponding to the customized video (PV) and the region not corresponding to it may be displayed so as to be visually distinguished. For example, the region corresponding to the customized video (PV) may be highlighted, and the remaining region, which does not correspond to the customized video, may be shaded.

Additionally, an embodiment is possible in which the transmission unit 140 varies how the customized video (PV) is displayed according to the resolution of the original video (OV). For example, when the resolution of the original video (OV) is equal to or greater than a set value, the customized video (PV) may be displayed in a separate area distinct from the original video (OV), as shown in FIG. 7. On the other hand, when the resolution of the original video (OV) is less than the set value, the customized video (PV) may be displayed so as to be visually distinguished within the original video (OV), as shown in FIG. 8.
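The resolution-dependent choice between the two layouts amounts to a single threshold test, sketched below. The source does not say how resolution is measured or what the set value is; this sketch assumes the vertical pixel count and a hypothetical default of 1080, and the layout labels are illustrative.

```python
def choose_layout(source_height, threshold=1080):
    """Pick how the customized video is shown relative to the original video.

    source_height: vertical resolution of the original video in pixels (assumed measure).
    threshold:     the set value (hypothetical default).
    """
    if source_height >= threshold:
        return "separate_area"       # FIG. 7 style: shown beside the original
    return "highlight_in_source"     # FIG. 8 style: region marked inside the original
```

A plausible reading of this design is that a crop taken from a low-resolution source would look poor when enlarged into its own area, so marking the region in place is preferred there.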

Meanwhile, depending on the embodiment, the same objects may be photographed simultaneously in different shooting environments using a plurality of cameras. That is, as shown in FIG. 9, a plurality of cameras (C1, C2, C3, C4) may be installed to film a stage performance of idols or the like, and the cameras may generate a plurality of target videos, each shot at a different illuminance and angle.

In this case, the service server 100 according to an embodiment of the present invention may collect the target videos generated by the plurality of cameras (C1, C2, C3, C4) and generate customized videos for the objects (M1, M2, M3). Here, the service server 100 may automatically switch to whichever of the target videos shows the corresponding object best, so that an optimal customized video is generated.

Specifically, the object identification unit 110 may be provided with an original video, and the original video may include all of the target videos captured by the respective cameras (C1, C2, C3, C4). Accordingly, the object identification unit 110 may first synchronize the target videos and then extract, from each synchronized target video, the frame image at the same point in time. Here, the object identification unit 110 may synchronize the target videos based on the sound source included in them. For example, since the cameras (C1, C2, C3, C4) may start shooting at different times, simply aligning the start points of the target videos may not synchronize them accurately. Accordingly, when the same sound source is recorded in each target video, as in an idol's stage performance, synchronization may be performed based on the sound source. Thereafter, the object identification unit 110 may identify each object from the extracted frame images.
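
The specification does not say how the sound-source-based synchronization is computed. One common approach, shown here purely as an assumption, is to cross-correlate the audio tracks of two target videos and take the lag with the highest correlation. The function names and the brute-force search are illustrative only; a practical implementation would use an FFT-based correlation on downsampled audio.

```python
from typing import Sequence


def estimate_offset(reference: Sequence[float], other: Sequence[float], max_lag: int) -> int:
    """Return the lag (in samples) at which `other` best matches `reference`.

    A result of d means other[i + d] lines up with reference[i], i.e. the
    shared sound appears d samples later in `other` than in `reference`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, ref_value in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += ref_value * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def offset_to_frames(lag_samples: int, sample_rate: int, fps: float) -> int:
    """Convert an audio lag in samples to the nearest whole number of video frames."""
    return round(lag_samples * fps / sample_rate)
```

Once the offset of each camera relative to a reference camera is known, the frame at index `n` in the reference corresponds to the frame at index `n` plus that camera's frame offset, which is what allows frame images of the same point in time to be extracted.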

The image cropping unit 120 may crop each of the objects included in the frame images of the same point in time and generate cropped images for each target video.

The editing unit 130 may select, from among the cropped images generated for each target video, a cropped image that satisfies a setting condition, and generate a customized video for the object. That is, since a cropped image showing the same object is generated for each target video, one of the plurality of cropped images needs to be selected. To select the optimal cropped image, setting conditions may be prepared in advance, and the editing unit 130 may select a cropped image according to the setting conditions.

Specifically, the setting conditions may include the size ratio that the cropped image occupies within the frame image, the brightness of the cropped image, whether image distortion occurs in the cropped image, and the like. That is, if the cropped image is too small relative to the frame image, problems such as reduced resolution may arise when the cropped image is later enlarged to generate the customized video. Accordingly, the editing unit 130 may select a cropped image that occupies a large proportion of the frame image.

In addition, if the cropped image is too dark, the object in the cropped image is hard to see, so the setting condition may be set such that the editing unit 130 selects a relatively bright cropped image.

Furthermore, an object in a cropped image is hard to recognize when it is shaken or out of focus, so the setting condition may be set such that the editing unit 130 preferentially selects a cropped image in which no image distortion has occurred.

In this way, by using the setting conditions, the editing unit 130 can select the optimal cropped image and generate a customized video for the object.
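
The three setting conditions named above (size ratio, brightness, distortion) can be combined in many ways, and the specification does not fix one. The sketch below assumes a simple weighted score with a penalty for distorted crops; the field names, weights, and penalty are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CropCandidate:
    camera_id: str      # which target video the crop came from
    size_ratio: float   # crop area divided by frame area, in [0, 1]
    brightness: float   # mean luminance normalized to [0, 1]
    distorted: bool     # True if shake or defocus was detected


def score(candidate: CropCandidate, w_size: float = 0.5, w_brightness: float = 0.5,
          distortion_penalty: float = 1.0) -> float:
    """Higher is better: larger and brighter crops win, distorted crops are penalized."""
    value = w_size * candidate.size_ratio + w_brightness * candidate.brightness
    if candidate.distorted:
        value -= distortion_penalty
    return value


def select_best(candidates: List[CropCandidate]) -> CropCandidate:
    """Pick the candidate with the highest score for one object at one point in time."""
    return max(candidates, key=score)
```

With the default weights, a large, bright but blurred crop still loses to a smaller sharp one, which matches the stated preference for crops without distortion.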

Additionally, depending on the embodiment, once the editing unit 130 selects a cropped image extracted from one target video, it may continue to select cropped images extracted from the same target video for at least a set number of frame images when composing the customized video. That is, the target video selected within one customized video can be prevented from switching too frequently.
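
The minimum-hold rule above can be sketched as a small pass over the per-frame best choices. The input is assumed to be the identifier of the best-scoring target video for each frame; the function name and the exact counting convention are hypothetical.

```python
from typing import List


def select_with_min_hold(best_per_frame: List[str], min_hold: int) -> List[str]:
    """Follow the per-frame best camera, but only switch after the current
    camera has been used for at least `min_hold` consecutive frames."""
    if not best_per_frame:
        return []
    chosen: List[str] = []
    current = best_per_frame[0]
    held = 0  # frames already emitted from the current camera
    for best in best_per_frame:
        if best != current and held >= min_hold:
            current = best
            held = 0
        chosen.append(current)
        held += 1
    return chosen
```

For example, with `min_hold=3` the sequence `C1, C2, C1, C2, C2, C2` becomes `C1, C1, C1, C2, C2, C2`: the brief flicker to `C2` in the second frame is suppressed.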

In addition, although the case where the setting conditions are determined in advance has been described here, depending on the embodiment, which cropped image to select may instead be decided using a machine learning technique such as deep learning.

Meanwhile, as shown in FIG. 10, the service server 100 according to an embodiment of the present invention may include physical components such as a processor 10 and a memory 40, and the memory 40 may contain one or more modules configured to be executed by the processor 10. Specifically, the one or more modules may include an object identification module, an image cropping module, an editing module, a transmission module, and the like.

The processor 10 may execute various software programs and the instruction sets stored in the memory 40 to perform various functions and process data. The peripheral interface unit 30 may connect the input/output peripheral devices of the service server 100 to the processor 10 and the memory 40, and the memory controller 20 may control memory access when the processor 10 or another component of the service server 100 accesses the memory 40. Depending on the embodiment, the processor 10, the memory controller 20, and the peripheral interface unit 30 may be implemented on a single chip or as separate chips.

The memory 40 may include high-speed random access memory and non-volatile memory such as one or more magnetic disk storage devices or flash memory devices. In addition, the memory 40 may further include a storage device located away from the processor 10, or a network-attached storage device accessed through a communication network such as the Internet.

Here, as shown in FIG. 10, the service server 100 according to an embodiment of the present invention may include, in the memory 40, an operating system as well as the object identification module, image cropping module, editing module, and transmission module, which correspond to application programs. Each module is a set of instructions for performing the functions described above and may be stored in the memory 40.

Accordingly, in the service server 100 according to an embodiment of the present invention, the processor 10 may access the memory 40 and execute the instructions corresponding to each module. However, since the object identification module, image cropping module, editing module, and transmission module perform operations corresponding to the object identification unit 110, image cropping unit 120, editing unit 130, and transmission unit 140 described above, respectively, a detailed description is omitted here.

FIG. 11 is a flowchart illustrating a method for generating a customized video for each object according to an embodiment of the present invention. Each step of FIG. 11 may be performed by a service server according to an embodiment of the present invention. Hereinafter, the method is described with reference to FIG. 11.

First, the service server may analyze the original video and identify the objects appearing in each frame image included in the original video (S110). The service server may learn in advance, using a machine learning algorithm, features such as the shape or luminance of the objects to be extracted, and may use this to detect the objects included in each frame image. In this case, the service server may mark the detected objects within the frame image with bounding boxes or the like, and may then compare the feature points appearing within each bounding box to identify the corresponding object. At this time, the service server may generate identification information for the identified object. The identification information may include an identification ID for distinguishing the objects, coordinate information of the bounding box, a confidence of the identification result for the object, and the like. The service server may set the identification ID and the like for each object by comparison with the previous frame image or the like.
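
As an illustration of the identification information described above, the sketch below models it as a record with an ID, bounding-box coordinates, and a confidence value, and carries IDs over from the previous frame. The specification identifies objects by comparing feature points; the overlap-based (IoU) matching used here is a simpler stand-in chosen for brevity, and all names and the `min_iou` threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


@dataclass
class Identification:
    object_id: int      # identification ID distinguishing the objects
    box: Box            # bounding-box coordinates in the frame image
    confidence: float   # confidence of the identification result


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def assign_ids(previous: List[Identification], detections: List[Tuple[Box, float]],
               min_iou: float = 0.3) -> List[Identification]:
    """Give each detection the ID of the best-overlapping object from the
    previous frame, or a fresh ID when nothing overlaps enough."""
    next_id = max((p.object_id for p in previous), default=0) + 1
    used = set()
    result = []
    for box, confidence in detections:
        best, best_iou = None, min_iou
        for p in previous:
            overlap = iou(p.box, box)
            if p.object_id not in used and overlap >= best_iou:
                best, best_iou = p, overlap
        if best is None:
            object_id, next_id = next_id, next_id + 1
        else:
            object_id = best.object_id
            used.add(object_id)
        result.append(Identification(object_id, box, confidence))
    return result
```

Overlap alone cannot tell two performers apart when they cross paths, which is one reason a feature-based comparison, as the specification describes, would be preferred in practice.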

Thereafter, the service server may crop the objects from the frame image and generate a cropped image corresponding to each object (S120). Here, the service server may cut the frame image along the bounding box or, depending on the embodiment, may cut it with a margin of a predetermined width around the bounding box to generate the cropped image. In addition, so that the object appears clearly in the customized video, the cropped image may be generated by cutting the frame image and then enlarging or reducing it to a set size.
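
The cropping step above (cut along the bounding box with a margin, then scale to a set size) can be sketched on a frame stored as a list of pixel rows. Clamping the margin to the frame edges and using nearest-neighbour scaling are assumptions made for this illustration; the function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def crop_with_margin(frame: List[List[int]], box: Box, margin: int) -> List[List[int]]:
    """Cut the box plus `margin` pixels on every side, clamped to the frame."""
    height = len(frame)
    width = len(frame[0]) if frame else 0
    left = max(0, box[0] - margin)
    top = max(0, box[1] - margin)
    right = min(width, box[2] + margin)
    bottom = min(height, box[3] + margin)
    return [row[left:right] for row in frame[top:bottom]]


def resize_nearest(image: List[List[int]], out_w: int, out_h: int) -> List[List[int]]:
    """Enlarge or reduce a non-empty image to out_w x out_h by nearest neighbour."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]
```

Enlarging a very small crop this way visibly lowers quality, which is the reason the size ratio appears among the setting conditions for choosing between cameras.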

Once the cropped images are generated, the service server may collect the cropped images for each object and generate a customized video for each object (S130). Since cropped images are generated for each object, collecting the cropped images of an object in chronological order yields a customized video corresponding to that object. Accordingly, unlike the original video, in which the movements of all objects can be viewed as a whole, a customized video that lets the viewer focus on the movements of a specific object alone can be generated easily.
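
Collecting cropped images per object in chronological order amounts to a group-and-sort. The sketch below assumes each crop carries an object ID and a frame index; the `Crop` record and its field names are hypothetical, and encoding the ordered images into an actual video file is left out.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Crop:
    object_id: int     # identification ID of the object shown in the crop
    frame_index: int   # position of the source frame in the original video
    image: Any         # pixel data; the representation does not matter here


def group_crops_by_object(crops: List[Crop]) -> Dict[int, List[Crop]]:
    """Return, for every object ID, its crops sorted by frame index."""
    grouped: Dict[int, List[Crop]] = defaultdict(list)
    for crop in crops:
        grouped[crop.object_id].append(crop)
    return {
        object_id: sorted(items, key=lambda c: c.frame_index)
        for object_id, items in grouped.items()
    }
```

Each resulting list is the frame sequence of one customized video.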

After the customized videos are generated, the service server may transmit a customized video to a user terminal in response to a request from the user terminal or the like (S140). Depending on the embodiment, the generated customized videos may be arranged and transmitted so that they are displayed simultaneously on the screen of the user terminal. In addition, when a user's selection input for any one of the objects is received, only the customized video of the object corresponding to the selection input may be displayed. In this case, the original video may be set to be displayed together with the customized video of the object corresponding to the selection input.

In addition, the service server may set the region corresponding to the customized video and the region not corresponding to the customized video to be visually distinguished within the original video. For example, the region corresponding to the customized video may be highlighted, and the remaining region not corresponding to the customized video may be shaded.

Additionally, the service server may vary how the customized video is displayed according to the resolution of the original video. For example, when the resolution of the original video is equal to or greater than a set value, the customized video may be displayed in a separate area distinct from the original video, and when the resolution of the original video is less than the set value, the customized video may be set to be visually distinguished and displayed within the original video.

Meanwhile, according to an embodiment of the present invention, the original video may include a plurality of target videos in which the same objects are photographed simultaneously in different shooting environments. In this case, the service server may automatically switch to whichever of the target videos shows the corresponding object best, so that an optimal customized video is generated.

Specifically, the service server may first synchronize the target videos and then extract, from each synchronized target video, the frame image at the same point in time. Depending on the embodiment, the service server may synchronize the target videos based on the sound source included in them. Thereafter, the service server may identify each object from the frame images of the same point in time.

The service server may crop each of the objects included in the frame images of the same point in time and may generate cropped images for each target video.

In addition, the service server may select, from among the cropped images generated for each target video, a cropped image that satisfies a setting condition, and generate a customized video for the object. That is, since a cropped image showing the same object is generated for each target video, one of the plurality of cropped images needs to be selected. To select the optimal cropped image, setting conditions may be prepared in advance, and the service server may select a cropped image according to the setting conditions.

Specifically, the setting conditions may include the size ratio that the cropped image occupies within the frame image, the brightness of the cropped image, whether image distortion occurs in the cropped image, and the like. That is, the setting conditions can be used to select the optimal cropped image for each object, and a customized video for the object can be generated from the selected images.

The present invention described above can be implemented as computer-readable code on a medium on which a program is recorded. The computer-readable medium may store a computer-executable program permanently, or may store it temporarily for execution or download. In addition, the medium may be any of various recording or storage means in the form of a single piece of hardware or several pieces of hardware combined; it is not limited to a medium directly connected to a particular computer system and may be distributed over a network. Examples of the medium include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and ROM, RAM, flash memory, and the like, configured to store program instructions. Other examples of the medium include recording media or storage media managed by app stores that distribute applications, or by sites and servers that supply or distribute various other software. Accordingly, the above detailed description should not be construed as limiting in any respect and should be considered illustrative. The scope of the present invention should be determined by a reasonable interpretation of the appended claims, and all changes within the equivalent scope of the present invention are included in the scope of the present invention.

The present invention is not limited by the foregoing embodiments and the accompanying drawings. It will be apparent to those of ordinary skill in the art to which the present invention pertains that the components according to the present invention may be substituted, modified, and changed without departing from the technical spirit of the present invention.

1: user terminal 100: service server
110: object identification unit 120: image cropping unit
130: editing unit 140: transmission unit

Claims

A method by which a service server generates a customized video for each object, the method comprising:
analyzing an original video and identifying at least one object appearing in each frame image included in the original video;
cropping the object from the frame image to generate a cropped image corresponding to each of the objects; and
collecting the cropped images for each object and generating a customized video for each object.

The method of claim 1, wherein identifying the object comprises:
detecting objects included in the frame image to create a bounding box, comparing feature points in the bounding box to identify each object, and generating identification information for the object.

The method of claim 1, wherein the identification information
includes at least one of an identification ID of the objects, coordinate information of the bounding box, and a confidence of the identification result for the object.

The method of claim 1,
further comprising transmitting the customized video to a user terminal.

The method of claim 4, wherein the transmitting step comprises:
when a plurality of customized videos are generated, arranging the customized videos and transmitting them set to be displayed simultaneously on one screen.

The method of claim 4, wherein the transmitting step comprises:
when a selection input for any one of a plurality of objects is received, transmitting the original video and the customized video of the object corresponding to the selection input set to be displayed simultaneously.

The method of claim 1, wherein
the original video includes a plurality of target videos in which the same objects are photographed simultaneously in different shooting environments, and
identifying the object comprises
synchronizing the target videos, extracting frame images of the same point in time, and identifying each object from the frame images.

8. The method of claim 7, wherein generating the cropped image comprises
generating, for each target video, cropped images corresponding to the objects included in the frame images.

9. The method of claim 7, wherein identifying the object comprises:
synchronizing the target videos based on the sound source included in the target videos.

The method of claim 8, wherein generating the customized video comprises
selecting, from among the cropped images generated for each target video, a cropped image that satisfies a setting condition, and generating a customized video for the object.

11. The method of claim 10, wherein the setting condition
includes at least one of a size ratio occupied by the cropped image in the frame image, brightness of the cropped image, and whether image distortion occurs in the cropped image.

The method of claim 4, wherein the transmitting step comprises
transmitting the original video set so that, within the original video, a region corresponding to the customized video and a region not corresponding to the customized video are visually distinguished when displayed.

13. The method of claim 12, wherein the transmitting step comprises
highlighting, within the original video, the region corresponding to the customized video and shading the region not corresponding to the customized video.

The method of claim 4, wherein the transmitting step comprises:
when the resolution of the original video is equal to or greater than a set value, setting the customized video to be displayed in a separate area distinct from the original video; and
when the resolution of the original video is less than the set value, setting the customized video to be visually distinguished and displayed within the original video.

A computer program stored in a medium to execute, in combination with hardware, the method of generating a customized video for each object of any one of claims 1 to 14.

A service server comprising:
an object identification unit that analyzes an original video and identifies at least one object appearing in each frame image included in the original video;
an image cropping unit that crops the objects from the frame image and generates cropped images corresponding to the respective objects; and
an editing unit that collects the cropped images for each object and generates a customized video for each object.