KR20210151232A

KR20210151232A - Method of operating a server that provides a sports video-based platform service

Info

Publication number: KR20210151232A
Application number: KR1020217038998A
Authority: KR
Inventors: 김지훈
Original assignee: 김지훈
Priority date: 2020-02-15
Filing date: 2021-01-30
Publication date: 2021-12-13
Also published as: KR20220112305A; KR102427358B1; CN115104137A; US11810352B2; US20230072888A1; US20230377336A1; WO2021162305A1

Abstract

A method of operating a server that provides a sports video-based platform service includes: tracking a ball in a sports video of a ball game using preprocessing that leaves only dynamic pixels; determining, using the ball tracking result, an unidentified player associated with a score-related scene of the sports video; identifying the unidentified player by tracking the unidentified player to an adjacent frame in which the player is identifiable; and generating, for the score-related scene, a time section of the sports video and identification information of the unidentified player.

Description

Method of operating a server that provides a sports video-based platform service

The following embodiments relate to a method of operating a server that provides a sports video-based platform service.

Analyzing sports videos requires a considerable amount of work by trained personnel. Several solutions for analyzing sports videos have been proposed, but they still require dedicated filming equipment or a dedicated team. There is therefore a need for a technique that analyzes sports videos in a resource-efficient manner.

Embodiments provide a technique that uses artificial intelligence to significantly reduce the human resources required to analyze sports videos. Embodiments also provide a technique for searching sports videos in detail by linking the analysis results of a sports video to the corresponding scenes and storing them in a database.

According to one aspect, a method of operating a video analysis server includes: receiving an analysis request signal including a link to a sports video of a ball game; performing preprocessing that filters out static pixels from a plurality of frames included in the sports video and leaves dynamic pixels; tracking the ball of the sports video based on the preprocessed video; detecting a score-related scene of the sports video from the preprocessed video; in response to detection of the score-related scene, determining an unidentified player associated with the score-related scene using the ball tracking result; identifying the unidentified player by tracking the unidentified player to an adjacent frame in which the unidentified player is identifiable; and outputting, for the score-related scene, a time section of the sports video and identification information of the unidentified player.

Tracking the ball may include, for each of the frames, detecting the ball based on the dynamic pixels of that frame.

Detecting the score-related scene may include: for each of the frames, detecting a rim based on the dynamic pixels of that frame; and determining the frames adjacent to the frame in which the rim is detected to be the score-related scene.

Determining the unidentified player associated with the score-related scene may include: detecting, using the ball tracking result, dynamic pixels related to the player who attempted to score in the frames included in the score-related scene; and determining the unidentified player associated with the score-related scene by performing instance segmentation on the frame in which those dynamic pixels are detected.

Identifying the unidentified player may include: extracting a feature from the determined unidentified player; comparing the extracted feature with features of previously registered players; determining, from the comparison result, whether the unidentified player can be identified; and, upon determining that the unidentified player cannot be identified, tracking the unidentified player by performing instance segmentation on an adjacent frame.

Performing the preprocessing may include at least one of: when the sports video is shot from a fixed viewpoint, filtering out static pixels based on the change in pixel values between adjacent frames within a predetermined range; and, when the sports video is shot from a moving viewpoint, filtering out static pixels based on a statistical value of the optical flow of the pixels within a frame.

According to one aspect, a method of operating a server that provides a sports video-based platform service includes: transmitting, based on a link to a sports video, a signal requesting analysis of the sports video to a video analysis module; storing player-specific clusters received from the video analysis module in a database; providing a user terminal with information for extracting player-specific video clips from the sports video, based on the database; receiving, from the user terminal that has been provided with the video clips for each player by a streaming server, an input identifying the unidentified player of at least one cluster; and updating the identification information of the at least one corresponding cluster in the database based on the input.

The method of operating the server that provides the sports video-based platform service may further include: providing the user terminal with statistics indicating the contributions of players; receiving, from the user terminal, an input selecting a detailed record included in the statistics; obtaining, based on the database, at least one sub-cluster associated with the selected detailed record; and providing the user terminal with information for extracting a video clip from the sports video, based on the at least one sub-cluster.

The method of operating the server that provides the sports video-based platform service may further include: receiving, from the user terminal, a search query including a player to be searched for and a scene to be searched for; retrieving, from the database, a sub-cluster corresponding to the search query; and providing the user terminal with information for extracting a video clip from the sports video, based on the retrieved sub-cluster.
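As an illustration only, the search steps above may be sketched as follows. The `SubCluster` record, its field names, the scene-type labels, and the in-memory list standing in for the database are assumptions made for this example; the embodiments do not specify a data layout.

```python
from dataclasses import dataclass
from typing import List, Tuple

Section = Tuple[float, float]  # (start_seconds, end_seconds) within the sports video


@dataclass(frozen=True)
class SubCluster:
    """Hypothetical database record: the time sections of one scene type for one player."""
    player: str
    scene_type: str
    sections: Tuple[Section, ...]


def search(database: List[SubCluster], player: str, scene_type: str) -> List[Section]:
    """Return the time sections of every sub-cluster matching the query."""
    return [
        section
        for sub_cluster in database
        if sub_cluster.player == player and sub_cluster.scene_type == scene_type
        for section in sub_cluster.sections
    ]


database = [
    SubCluster("player_7", "three_point", ((12.0, 18.5), (95.0, 101.0))),
    SubCluster("player_7", "layup", ((40.0, 44.0),)),
    SubCluster("player_3", "three_point", ((60.0, 66.0),)),
]
clip_sections = search(database, "player_7", "three_point")
```

The returned time sections are the kind of information a user terminal could use to request the matching clips from a streaming server.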

The method of operating the server that provides the sports video-based platform service may further include at least one of: determining a billing level based on the reliability of the clusters; and determining a reward level based on a feedback input that corrects the clusters.

The method of operating the server that provides the sports video-based platform service may further include: receiving, from the user, a feedback input indicating that the player in at least one section included in the at least one cluster does not belong to that cluster; and excluding that section from that cluster based on the feedback input.

The method of operating the server that provides the sports video-based platform service may further include: receiving, from the user, a feedback input indicating that the player in at least one section included in the at least one cluster belongs to another cluster; and, based on the feedback input, excluding that section from its current cluster and including it in the other cluster.
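As an illustration only, the two kinds of feedback described above (excluding a section from a cluster, and moving a section to another cluster) may be sketched as follows. The dictionary layout, the cluster identifiers, and the function name `apply_feedback` are assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

Section = Tuple[float, float]  # (start_seconds, end_seconds)


def apply_feedback(
    clusters: Dict[str, List[Section]],
    cluster_id: str,
    section: Section,
    target_id: Optional[str] = None,
) -> None:
    """Exclude `section` from `cluster_id`; if `target_id` is given, include it there."""
    clusters[cluster_id].remove(section)  # raises ValueError if the section is absent
    if target_id is not None:
        clusters.setdefault(target_id, []).append(section)


clusters = {
    "A": [(10.0, 14.0), (30.0, 33.5)],
    "B": [(50.0, 52.0)],
}
# Feedback: the player in section (30.0, 33.5) belongs to cluster B.
apply_feedback(clusters, "A", (30.0, 33.5), target_id="B")
# Feedback: the player in section (10.0, 14.0) does not belong to cluster A.
apply_feedback(clusters, "A", (10.0, 14.0))
```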

The method of operating the server that provides the sports video-based platform service may further include: generating training data that depends on the updated database; and training, based on the training data, a specialized model that estimates at least one of detection information, identification information, and motion type information of players.

FIG. 1 is a diagram illustrating a system that provides a sports video-based platform service according to an embodiment.
FIG. 2 is a flowchart illustrating a method of operating a service server that provides a sports video-based platform service according to an embodiment.
FIG. 3 is a diagram illustrating a clustering operation according to an embodiment.
FIG. 4 is a diagram illustrating a sports video-based platform service according to an embodiment.
FIG. 5 is a diagram illustrating a sports video-based platform service according to an embodiment.
FIG. 6 is a diagram illustrating the distribution of feature vectors by motion type according to an embodiment.
FIG. 7 is a diagram illustrating a clustering operation by motion type according to an embodiment.
FIG. 8 is a diagram illustrating a general-purpose model and a specialized model according to an embodiment.
FIG. 9 is a diagram illustrating video clips provided in association with statistics indicating each player's contribution in a basketball game according to an embodiment.
FIG. 10 is a diagram illustrating a function of reflecting a user's feedback on a video clip according to an embodiment.
FIG. 11 is a diagram illustrating a search function according to an embodiment.
FIG. 12 is a diagram illustrating an operation of generating tracking clusters according to an embodiment.
FIG. 13 is a diagram illustrating an operation of matching tracking clusters according to an embodiment.
FIG. 14 is a diagram illustrating an operation of detecting a scoring event according to an embodiment.

The specific structural or functional descriptions disclosed in this specification are given only to describe embodiments according to the technical concept. The embodiments may be implemented in various other forms and are not limited to the embodiments described herein.

Terms such as "first" and "second" may be used to describe various components, but these terms should be understood only as distinguishing one component from another. For example, a first component may be called a second component, and similarly a second component may be called a first component.

When a component is described as being "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but it should be understood that intervening components may be present. In contrast, when a component is described as being "directly connected" or "directly coupled" to another component, it should be understood that no intervening component is present. Expressions describing relationships between components, such as "between" and "immediately between", or "neighboring" and "directly neighboring", should be interpreted in the same way.

A singular expression includes the plural unless the context clearly indicates otherwise. In this specification, terms such as "comprise" or "have" are intended to specify the presence of the stated features, numbers, steps, operations, components, parts, or combinations thereof, and should be understood as not precluding the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art. Terms defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in this specification.

Embodiments may be provided through various types of products, such as personal computers, laptop computers, tablet computers, smartphones, televisions, smart home appliances, intelligent vehicles, kiosks, and wearable devices.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings. Like reference numerals in the drawings denote like elements.

FIG. 1 is a diagram illustrating a system that provides a sports video-based platform service according to an embodiment. Referring to FIG. 1, a system according to an embodiment includes a service server and, depending on the system design, may further include at least one of a streaming server, a storage server, a social network service (SNS) server, and an instant messaging service (IMS) server. The service server may include a front-end server that communicates with an application installed on a user terminal and a video analysis server that analyzes sports videos. Depending on the system design, the front-end server and the video analysis server may be implemented as modules within the same server, or as separate, independent servers that communicate with each other over a network.

The service server receives a sports video. A sports video is a recording of a sports game and includes a plurality of frames. The sports game may be any one of: real-time ball games such as basketball, soccer, volleyball, handball, hockey, ice hockey, and tennis; turn-based ball games such as American football, rugby, baseball, croquet, and golf; and non-ball sports such as diving, swimming, skiing, and snowboarding. The description below uses a basketball game as an example, but the embodiments may be applied in substantially the same manner to sports other than basketball.

The sports video may already have been uploaded to a streaming server or a storage server. In this case, the service server may receive information (for example, a URL) for accessing the sports video through the streaming server or the storage server. Alternatively, the service server may receive a request to upload a sports video, and may upload the requested sports video to the streaming server or the storage server. The streaming server or the storage server may be operated by the same entity as the service server or, depending on the embodiment, by a different entity.

The sports video includes a plurality of unidentified players. An unidentified player is a player whose identity has not been established, for example a player for whom no identification information has been set. The unidentified players in the sports video perform various movements during the game, and by analyzing the sports video the individual and team-level abilities of the players who took part in the game can be evaluated. However, analyzing a sports video requires one or more experts who are familiar with the rules of the sport to record statistics for each player over the entire duration of the game.

The embodiments described below provide a technique that uses artificial intelligence to significantly reduce the human resources required to analyze sports videos.

According to an embodiment, the video analysis server may receive, from the front-end server, an analysis request signal including a link to a sports video of a ball game. The video analysis server may perform preprocessing that filters out static pixels from the plurality of frames included in the sports video and leaves dynamic pixels. For example, when the sports video is shot from a fixed viewpoint (that is, with a fixed camera angle), static pixels may be filtered out based on the change in pixel values between adjacent frames within a predetermined range. A pixel whose value at the same position changes between adjacent frames by more than a predetermined threshold may be classified as a dynamic pixel.
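As an illustration only, the fixed-viewpoint preprocessing described above may be sketched as follows. The grayscale list-of-rows frame representation, the function names, and the threshold value of 20 are assumptions made for this example; the embodiments state only that a predetermined threshold is used.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of pixel intensities (0-255)


def dynamic_pixel_mask(prev: Frame, curr: Frame, threshold: int = 20) -> List[List[bool]]:
    """Mark a pixel as dynamic when its value changes by more than `threshold`
    between two adjacent frames; all other pixels are static."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def keep_dynamic_pixels(prev: Frame, curr: Frame, threshold: int = 20) -> Frame:
    """Filter out static pixels (set them to 0) and leave the dynamic pixels of `curr`."""
    mask = dynamic_pixel_mask(prev, curr, threshold)
    return [
        [value if is_dynamic else 0 for value, is_dynamic in zip(row, mask_row)]
        for row, mask_row in zip(curr, mask)
    ]


previous_frame = [[10, 10, 10], [10, 10, 10]]
current_frame = [[10, 200, 10], [12, 10, 90]]
preprocessed = keep_dynamic_pixels(previous_frame, current_frame)
```

In this toy input, only the two pixels whose intensity changed by more than the threshold survive; the small change from 10 to 12 is treated as static.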

When the sports video is shot from a moving viewpoint (that is, with a camera angle that follows the ball), static pixels may be filtered out based on a statistical value of the optical flow of the pixels within a frame. Because the camera angle moves, the pixels in a frame may share a common optical flow component. Removing this common component from the optical flow of each pixel yields the optical flow of the pixels that belong to objects that are actually moving (a moving player, a moving ball, a vibrating rim, and so on). After the common component has been removed, a pixel whose remaining optical flow has a magnitude exceeding a predetermined threshold may be classified as a dynamic pixel.
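As an illustration only, the moving-viewpoint preprocessing described above may be sketched as follows. The embodiments refer only to "a statistical value" of the optical flow; using the per-axis median as the common component is an assumption made for this example, as are the function name, the flow representation, and the threshold of 1.0. The per-pixel flow field is taken as given (it could come from any dense optical flow estimator).

```python
import math
from statistics import median
from typing import List, Tuple

Flow = List[List[Tuple[float, float]]]  # per-pixel (dx, dy) optical flow vectors


def dynamic_mask_from_flow(flow: Flow, min_magnitude: float = 1.0) -> List[List[bool]]:
    """Remove the flow component shared by the whole frame (assumed here to be
    the per-axis median) and mark pixels whose residual flow is large as dynamic."""
    common_dx = median(dx for row in flow for dx, _ in row)
    common_dy = median(dy for row in flow for _, dy in row)
    return [
        [math.hypot(dx - common_dx, dy - common_dy) > min_magnitude for dx, dy in row]
        for row in flow
    ]


# The camera pans right, so most pixels share the flow (2, 0);
# one pixel belongs to an object that is actually moving.
flow_field = [
    [(2.0, 0.0), (2.0, 0.0), (2.0, 0.0)],
    [(2.0, 0.0), (5.0, 4.0), (2.0, 0.0)],
]
mask = dynamic_mask_from_flow(flow_field)
```

The median is chosen in this sketch because it is not pulled toward the few pixels that belong to moving objects; a mean or a fitted global motion model would be alternative choices.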

The video analysis server may track the ball of the sports video based on the preprocessed video. For example, the video analysis server may track the ball by detecting, for each frame of the sports video, the ball based on the dynamic pixels of that frame. Because the ball appears relatively small in the video compared with other objects (such as players or the goal), it is difficult to track with existing object tracking models. Embodiments may significantly improve ball tracking performance by using an artificial neural network trained to track the ball from the dynamic pixels obtained through the preprocessing.

The video analysis server may detect a score-related scene of the sports video from the preprocessed video. For example, the video analysis server may detect, for each frame, a rim based on the dynamic pixels of that frame, and may determine the frames adjacent to the frame in which the rim is detected to be a score-related scene.

The rim is a structure of predetermined shape through which the ball passes and which is used to determine whether a score has been made in a ball game; hereinafter, a net attached to the rim is understood to be part of the rim. The rim normally does not move and is therefore classified as static pixels, but in a scene related to scoring, such as when the ball hits the rim and the rim actually moves, or when the moving ball passes through the rim, it may be classified as dynamic pixels. Accordingly, when the rim is detected in the preprocessed video, the scene may be determined to be related to scoring.
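As an illustration only, turning rim detections into score-related scenes may be sketched as follows. The input list of frame indices in which the rim was detected, the `margin` that defines how many neighboring frames count as "adjacent", and the merging of overlapping ranges are assumptions made for this example.

```python
from typing import List, Tuple


def scoring_scenes(rim_frames: List[int], margin: int, num_frames: int) -> List[Tuple[int, int]]:
    """Return merged, inclusive (start, end) frame ranges around the frames
    in which the rim appeared among the dynamic pixels."""
    scenes: List[Tuple[int, int]] = []
    for frame in sorted(rim_frames):
        start = max(0, frame - margin)
        end = min(num_frames - 1, frame + margin)
        if scenes and start <= scenes[-1][1] + 1:
            # Overlaps or touches the previous scene: extend it.
            scenes[-1] = (scenes[-1][0], max(scenes[-1][1], end))
        else:
            scenes.append((start, end))
    return scenes


scenes = scoring_scenes(rim_frames=[100, 102, 500], margin=30, num_frames=520)
```

Dividing the frame indices of each range by the frame rate would give the corresponding time section of the sports video.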

In response to detection of a score-related scene, the video analysis server may determine the unidentified player associated with the score-related scene using the ball tracking result. For example, the video analysis server may use the ball tracking result to detect, in the frames included in the score-related scene, dynamic pixels related to the player who attempted to score, and may determine the unidentified player associated with the score-related scene by performing instance segmentation on the frame in which those dynamic pixels are detected.

Because the video analysis server tracks the ball in the preprocessed video, once the rim is detected in the preprocessed video it can obtain the trajectory of the ball through the preceding adjacent frames, starting from the frame in which the rim was detected. While following the ball's trajectory, the video analysis server may select, from among the dynamic pixels of players in those adjacent frames, the dynamic pixels that come within a predetermined reference distance of the ball. The selected dynamic pixels may include the unidentified player who attempted to score.
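As an illustration only, the backward search along the ball trajectory described above may be sketched as follows. The dictionaries keyed by frame index, the function name, and the use of Euclidean distance in pixel coordinates are assumptions made for this example.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


def find_attempt_frame(
    ball_track: Dict[int, Point],
    player_pixels: Dict[int, List[Point]],
    rim_frame: int,
    max_distance: float,
) -> Optional[Tuple[int, List[Point]]]:
    """Walk backward from the rim frame along the ball trajectory and return
    the first frame in which player dynamic pixels come within `max_distance`
    of the ball, together with those pixels."""
    for frame in sorted((f for f in ball_track if f <= rim_frame), reverse=True):
        ball = ball_track[frame]
        near = [p for p in player_pixels.get(frame, []) if math.dist(p, ball) <= max_distance]
        if near:
            return frame, near
    return None


ball_track = {8: (50, 80), 9: (60, 50), 10: (70, 20)}  # the ball rises toward the rim
player_pixels = {8: [(48, 82), (10, 90)], 9: [(20, 90)], 10: [(20, 90)]}
result = find_attempt_frame(ball_track, player_pixels, rim_frame=10, max_distance=5.0)
```

The frame returned here is the one that would then be passed to instance segmentation to obtain the shooter's mask.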

Once the dynamic pixels that come within the predetermined reference distance of the ball have been selected, the video analysis server may perform instance segmentation on the corresponding frame to obtain a mask of the unidentified player who attempted to score. The input to the neural network model that performs the instance segmentation may be the original frame image, without preprocessing.

The video analysis server may identify the unidentified player by tracking the unidentified player to an adjacent frame in which the player is identifiable. For example, the video analysis server may extract a feature from the determined unidentified player. The video analysis server may use the unidentified player's mask to obtain the pixel values corresponding to the unidentified player and may extract the feature from those pixel values.

The video analysis server may compare the extracted feature with the features of previously registered players to determine whether the unidentified player can be identified. The video analysis server may receive, from the service server, registration information for the players who took part in the sports game for which analysis was requested. The registration information may include a roster of the players and photographs of the players. The video analysis server may extract features from the photographs of the registered players and compare them with the feature of the unidentified player. Based on the comparison result, the video analysis server may match the unidentified player to one of the registered players.
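As an illustration only, the comparison with registered players may be sketched as follows. The embodiments do not specify a similarity measure or decision rule; cosine similarity, the threshold of 0.8, the feature vectors, and the player names are all assumptions made for this example.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(
    feature: Sequence[float],
    registered: Dict[str, Sequence[float]],
    min_similarity: float = 0.8,
) -> Optional[str]:
    """Return the best-matching registered player, or None when no match is
    similar enough (the player is then treated as not yet identifiable)."""
    best_name, best_score = None, min_similarity
    for name, reference in registered.items():
        score = cosine_similarity(feature, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


registered_players = {"player_a": [1.0, 0.0], "player_b": [0.0, 1.0]}
clear_match = identify([0.9, 0.1], registered_players)
no_match = identify([1.0, 1.0], registered_players)
```

Returning `None` corresponds to the "cannot be identified" determination, which triggers tracking into adjacent frames.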

When the comparison indicates that the unidentified player cannot be identified, the video analysis server may track the unidentified player by performing instance segmentation on an adjacent frame (a past or future adjacent frame). For various reasons, such as the unidentified player being occluded by another player, the frame in which the shot is attempted may not contain enough image information to identify the player. The video analysis server may track the unidentified player to an adjacent frame that does contain enough image information. After tracking the unidentified player in an adjacent frame, the video analysis server may determine whether the player can be identified based on the image information of the tracked player in that frame. The video analysis server may repeat the tracking and identification operations over adjacent frames until the unidentified player can be identified.
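As an illustration only, the repeated track-and-identify loop described above may be sketched as follows. The callable `try_identify` is a hypothetical stand-in for segmenting a frame, following the same player into it, and comparing features with the registered players; the alternating past/future search order and the `max_offset` limit are also assumptions made for this example.

```python
from typing import Callable, Optional, Tuple


def identify_by_tracking(
    start_frame: int,
    num_frames: int,
    try_identify: Callable[[int], Optional[str]],
    max_offset: int = 150,
) -> Optional[Tuple[str, int]]:
    """Try the attempt frame first, then past and future frames at growing
    distance, until the tracked player can be identified or the limit is hit."""
    for offset in range(max_offset + 1):
        candidates = [start_frame] if offset == 0 else [start_frame - offset, start_frame + offset]
        for frame in candidates:
            if 0 <= frame < num_frames:
                player_id = try_identify(frame)
                if player_id is not None:
                    return player_id, frame
    return None


# Toy stand-in: the player is occluded everywhere except frame 43.
identifiable_frames = {43: "player_7"}
found = identify_by_tracking(start_frame=40, num_frames=100, try_identify=identifiable_frames.get)
```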

The video analysis server may determine whether the scoring attempt in a score-related scene succeeded or failed. For example, the video analysis server may use a neural network model trained to determine whether a score was made, taking as input the trajectory of the ball relative to the position of the rim over the plurality of frames included in the score-related scene.

For each score-related scene, the video analysis server may respond to the analysis request signal of the front-end server by outputting the time section of the sports video, the identification information of the unidentified player, and/or whether the score was successful. The front-end server may build a database based on the response signal.

According to an embodiment, the service server may use an unsupervised learning technique to distinguish the unidentified players in the sports video from one another. Even when the service server cannot determine who each unidentified player is, it can determine that a first unidentified player and a second unidentified player are different people. For example, the appearance features of the first unidentified player and those of the second unidentified player may be distinguishable within the sports video. Appearance features may include not only features of the player's own body, such as physique, height, skin color, hairstyle, and face, but also features of sportswear, jersey number, shoes, protective gear, and accessories. The motion features of the first unidentified player and those of the second unidentified player may also be distinguishable. Motion features are characteristics of a player's movement and may include the posture or movement peculiar to that player in various motions, for example, in a basketball game, a jump shot, set shot, layup, dunk, dribble, pass, pick, rebound, block, or defensive motion. The service server may distinguish the unidentified players from one another using appearance features and/or motion features.

The service server may generate a video clip for each of the distinguished unidentified players. For example, from the full game video the service server may generate a first video clip that selectively includes only footage of the first unidentified player and a second video clip that selectively includes only footage of the second unidentified player. By providing these per-player video clips to a user, the service server may receive from the user information identifying the unidentified player in each clip. The user may be one of the unidentified players in the sports video, or may be a person who does not appear in the sports video but is able to identify the unidentified players in it.

The user may input information identifying the unidentified player corresponding to at least one of the plurality of video clips. For example, the user may confirm that one of the video clips shows the user himself or herself, or may input identifying information for a video clip of another unidentified player whom the user recognizes.

The service server may analyze the sports video based on the identification information input for each video clip. Because most sports are played according to predetermined rules, the motion types permitted under those rules can be determined in advance, and one motion type can therefore be distinguished from another. For example, in a basketball game a shooting motion and a dribbling motion can be distinguished from each other. Furthermore, among shooting motions, a jump shot and a dunk shot can be distinguished from each other. Accordingly, the service server may extract characteristic scenes or frames from the sports video and classify the motion types of the extracted scenes. As described in detail below, the service server may also extract partial regions rather than entire scenes or frames and classify the motion types of the extracted regions.

As a result, the embodiments may provide a technique for analyzing various moments in a sports game merely by receiving identification information for each video clip from the user. Furthermore, as image data identified as belonging to the same player accumulates, a specialized model that automatically identifies that player may be trained. The specialized model may be trained not only to identify the player automatically but also to classify the player's movements more accurately.

In addition, the service server may build a database by tagging the moments detected in the sports video with identification information and motion type information. A user may search the database for desired footage using various queries. For example, the user may input a query that i) designates a player and ii) specifies a desired type of scene. The service server may search the database to determine which time points (frames) of which sports videos contain the scenes matching the query, and may generate a video clip based on the search result and provide it to the user. According to an embodiment, a scene may also be searched for by designating a plurality of players, for example a scene in which player B scores on an assist from player A, or a scene in which player C is blocked by player D.
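The tagging and query behavior can be sketched in Python as follows. The `Moment` record, its field names, and the in-memory list standing in for the database are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Moment:
    video_id: str              # which sports video the moment belongs to
    start_frame: int
    end_frame: int
    motion_type: str           # e.g. "block", "assist_score" (labels are assumed)
    players: Tuple[str, ...]   # identification information tagged to the moment


def search(moments: Iterable[Moment], players: Iterable[str], motion_type: str) -> List[Moment]:
    """Return moments of the given type that involve all designated players."""
    wanted = set(players)
    return [m for m in moments if m.motion_type == motion_type and wanted <= set(m.players)]
```

For example, `search(db, ["C", "D"], "block")` would return only the moments tagged with both players C and D, from which the video and frame range for a clip can be read.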

As described in detail below, the service server may generate statistics indicating the contribution of each player in a sports game. For example, the service server may generate a box score that expresses the result of a basketball game numerically. The service server may provide a service that links each detailed record in the box score to the scenes behind that record. For example, when the 'steal' record of a particular player is selected in the box score, a video clip containing the scenes in which that player made a steal may be provided. According to an embodiment, in response to a selection of 'steal' in the box score of a particular game, the front-end server may search for the steal-related clusters of that game. Based on the clusters found, the front-end server may provide the user terminal with information for streaming the relevant video clip (for example, a video URL and at least one time section).

The service server may share the video clips provided to the user with a social network server or an instant messaging server. The service server may provide the service to the user through a web interface or through an app interface.

FIG. 2 is a flowchart illustrating a method of operating a service server that provides a sports video-based platform service according to an embodiment. As described above with reference to FIG. 1, the service server may be implemented as a single server comprising a front-end module and a video analysis module, or as separate servers, namely a front-end server and a video analysis server. For convenience, the following description assumes the single-server implementation.

Referring to FIG. 2, the front-end module of the service server transmits a signal requesting analysis of a sports video to the video analysis module, based on a link to the sports video (210). The front-end module of the service server stores the player-specific clusters received from the video analysis module in a database (220). The player-specific clusters may include section information about scenes that can be counted toward each player's in-game contribution (for example, score-related scenes, defense-related scenes, and scenes related to errors or fouls).

Based on the database, the front-end module of the service server provides the user terminal with information for extracting player-specific video clips from the sports video (230). The information provided to the user terminal may include a link to the video and the time sections of the main scenes for each player. Based on the received information, the user terminal may selectively request from the streaming server only the time sections corresponding to those main scenes.
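A minimal Python sketch of the information sent to the user terminal is shown below. The JSON layout, the field names `video_link`, `start_sec`, and `end_sec`, and the function name are assumptions; the embodiment only states that a video link and time sections are provided.

```python
import json
from typing import List, Tuple


def build_clip_info(video_link: str, sections: List[Tuple[float, float]]) -> str:
    """Serialize a video link and its main-scene time sections (in seconds)."""
    payload = {
        "video_link": video_link,
        "sections": [{"start_sec": s, "end_sec": e} for s, e in sections],
    }
    return json.dumps(payload)
```

Because only a link and time sections are sent, the clip itself need not be re-encoded or stored by the service server; the terminal can request just those sections from the streaming server.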

The front-end module of the service server receives, from the user terminal, an input identifying the unidentified player of at least one cluster (240). Based on the user input, the front-end module of the service server updates the identification information of the at least one corresponding cluster in the database (250).

The service server receives, from the user, an input identifying the unidentified player of at least one cluster. Because the user is provided with a video clip for each unidentified player, the user can identify the unidentified player clip by clip. Since video clips are extracted or generated per cluster, the received input constitutes information identifying the unidentified player of the corresponding cluster.

Based on the user's input, the service server sets the identification information of the at least one corresponding cluster. According to an embodiment, a cluster may include the fields {access information of the sports video, identification information, indices of regions}. In this case, the service server may set the identification information of the cluster based on the user's input. The identification information may correspond to account information of the sports video-based platform service. Alternatively, it may correspond to account information of a social network service or an instant messaging service linked with the sports video-based platform service. The identification information may include information in a predetermined template for identifying the unidentified player (for example, family name, given name, nickname, team, jersey number, gender, age, height, weight, and position). According to an embodiment, the identification information may instead be stored in the regions. In this case, the cluster may include the field {indices of regions}, and each region may include the field {identification information}.
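The cluster fields listed above can be sketched as a Python data structure. The class name, attribute names, and the dictionary used for the identification template are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Cluster:
    video_access_info: str                                    # e.g. a link to the sports video
    region_indices: List[int] = field(default_factory=list)   # indices of member regions
    identification: Optional[Dict[str, str]] = None           # None while the player is unidentified


def set_identification(cluster: Cluster, user_input: Dict[str, str]) -> None:
    """Set the cluster's identification information from the user's input."""
    cluster.identification = dict(user_input)  # copy so later edits to the input do not leak in
```

In the variant where identification information is stored per region, the `identification` attribute would move from the cluster to each region record.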

Alternatively, the service server may receive, from the user terminal, a feedback input indicating that the player in at least one section included in a cluster does not belong to that cluster, and may exclude that section from the cluster.

Alternatively, the service server may receive, from the user terminal, a feedback input indicating that the player in at least one section included in a cluster belongs to a different cluster, and may exclude that section from the original cluster and include it in the other cluster.
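The two kinds of feedback (excluding a section from a cluster, or moving it to another cluster) can be sketched in Python as follows. Representing clusters as a dictionary from a cluster key to a list of time sections is an assumption made for brevity.

```python
from typing import Dict, List, Optional, Tuple

Section = Tuple[float, float]  # (start, end) of a section, in seconds


def apply_feedback(
    clusters: Dict[str, List[Section]],
    source: str,
    section: Section,
    target: Optional[str] = None,
) -> None:
    """Exclude `section` from `source`; if `target` is given, include it there instead."""
    clusters[source].remove(section)  # raises ValueError if the section is not in `source`
    if target is not None:
        clusters.setdefault(target, []).append(section)
```

Calling the function without `target` corresponds to the exclusion feedback, and calling it with `target` corresponds to the reassignment feedback.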

In addition, the front-end module of the service server may provide the sports video-based platform service (260). For example, the service server may provide the sports video-based platform service based on the clusters (or regions) for which identification information has been set. The service server may provide various services, such as automatically analyzing a sports game, providing statistical information that quantifies the content of the game, providing video clips linked to the statistical information, providing a detailed search function for sports videos, and providing video clips corresponding to search results. The service server may also provide a function for sharing video clips to a social network service or an instant messaging service.

The service server may provide the user terminal with statistics indicating the contributions of the players, and may receive from the user terminal an input selecting a detailed record included in the statistics. The service server may obtain, based on the database, at least one sub-cluster related to the selected detailed record and, based on the at least one sub-cluster, provide the user terminal with information for extracting a video clip from the sports video.

Alternatively, the service server may receive from the user terminal a search query that specifies a target player and a target scene, and may search the database for a sub-cluster corresponding to the search query. Based on the sub-cluster found, the service server may provide the user terminal with information for extracting a video clip from the sports video.

Alternatively, the service server may determine a billing level based on the reliability of the clusters, or may determine a reward level based on feedback inputs that correct the clusters.

Furthermore, the service server may generate training data from the updated database and train a specialized model that estimates at least one of detection information, identification information, and motion type information of the players.

The service server may train a specialized model for an identified player based on the clusters (or regions) for which identification information has been set. Alternatively, the service server may train a specialized model for a team to which a plurality of identified players belong. The specialized model may be trained as a detection module that detects the regions of players in a sports video, a classification module that classifies the regions, an identification module that identifies the regions, or a composite module that combines these functions in various ways.

Using the specialized model, the service server may provide a higher-quality service for players whose identification information has been input. For example, as data with identification information accumulates and the specialized model is trained, even the identification information of that player may be set automatically in a new sports video. In addition, the specialized model allows the player's performance in a game to be analyzed more accurately.

Although not shown in the drawings, according to an embodiment, the service server detects regions corresponding to unidentified players in a sports video in which the unidentified players were filmed. 'Detection' may be an operation of determining the partial region of an image that corresponds to a detection target. A sports video includes a plurality of frames, and the service server may detect the region occupied by each unidentified player in each frame. The service server may detect the regions corresponding to the unidentified players using a detector that detects players playing the relevant sport in the sports video. The regions corresponding to the unidentified players may have a predetermined shape, for example a rectangular window. The service server may detect, frame by frame, the windows corresponding to the unidentified players in each frame of the sports video. For example, the service server may obtain a frame index indicating the frame, (x, y) coordinates indicating the position of the detected window, and (width, height) indicating the size of the detected window. In this case, an individual region may be defined as {frame_index, x-coordinate, y-coordinate, width, height}. The service server may run the detection module itself, or may request detection from another server that runs the detection module in conjunction with the service server.

The service server clusters the detected regions so as to distinguish the unidentified players from one another, thereby generating unidentified player-specific clusters (220). The service server may cluster the regions based on an unsupervised learning technique. For example, the service server may cluster the regions using a cluster analysis technique, of which K-means clustering is a representative example. A generated cluster may include information indicating the regions that belong to it (for example, the indices of the regions).

The K parameter for clustering may be set by receiving from the user the number of players who took part in the game. Alternatively, the number of players who can be on the court at the same time for the sport in question (for example, in basketball, five per team for a total of ten) may be set as the initial value of K, and clustering may be performed iteratively while adjusting the value of K, thereby estimating the number of players who took part in the game shown in the sports video. When substitutes are taken into account, the number of players who took part in the game may be greater than the number who can play at the same time. As described in detail below, depending on the embodiment, the service server may also use a hierarchical clustering technique as the cluster analysis technique.
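The iterative adjustment of K can be illustrated with a small, dependency-free Python sketch. The stopping rule used here (increase K while each additional cluster reduces the within-cluster sum of squares by at least a fraction `min_gain`) and the deterministic farthest-point initialization are assumptions chosen for illustration; the embodiment does not specify how K is adjusted.

```python
from typing import List, Sequence, Tuple

Vec = Sequence[float]


def _dist2(a: Vec, b: Vec) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _assign(points: Sequence[Vec], centers: Sequence[Vec]) -> List[int]:
    return [min(range(len(centers)), key=lambda j: _dist2(p, centers[j])) for p in points]


def kmeans(points: Sequence[Vec], k: int, iters: int = 50) -> Tuple[List[int], float]:
    """Plain K-means; returns (labels, within-cluster sum of squared distances)."""
    # Farthest-point initialization (assumed; random or k-means++ seeding is more usual).
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(_dist2(p, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        labels = _assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster happens to be empty.
            new_centers.append([sum(c) / len(members) for c in zip(*members)] if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    labels = _assign(points, centers)
    return labels, sum(_dist2(p, centers[lab]) for p, lab in zip(points, labels))


def estimate_player_count(points: Sequence[Vec], k_init: int, k_max: int, min_gain: float = 0.5) -> int:
    """Start from k_init and grow K while an extra cluster still pays off."""
    k = k_init
    _, inertia = kmeans(points, k)
    while k < k_max and k < len(points):
        _, next_inertia = kmeans(points, k + 1)
        if inertia == 0 or (inertia - next_inertia) / inertia < min_gain:
            break
        k, inertia = k + 1, next_inertia
    return k
```

Here `points` stands for the feature vectors of the detected regions, and `k_init` would be the number of simultaneous players (ten for basketball). A production system would more likely rely on an established clustering library and a criterion such as a silhouette score.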

The service server may extract features of the regions for clustering. According to an embodiment, the service server may extract an appearance feature from an individual region. The appearance feature may be defined in various ways. For example, the appearance feature may be a multidimensional vector whose dimensions carry information related to physique, height, skin color, hairstyle, face, sportswear, jersey number, shoes, protective gear, and/or accessories. Alternatively, the service server may extract a motion feature from a sequence of regions. The motion feature may also be defined in various ways. For example, the motion feature may be a multidimensional vector containing information extracted from the posture of the unidentified player or from the movement of the unidentified player.

The service server may cluster the regions based on appearance features or motion features. For example, the service server may assign regions with similar appearance features to the same cluster and regions with different appearance features to different clusters. Alternatively, the service server may assign regions with similar motion features to the same cluster and regions with different motion features to different clusters. Depending on the embodiment, the service server may cluster the regions based on a combination of appearance and motion features. For example, referring to FIG. 3, the service server may assign regions whose combined appearance and motion features are similar to the same cluster, and regions whose combined features differ to different clusters. For convenience of explanation only three clusters are shown in FIG. 3, but, as described above, as many clusters may be generated as there are unidentified players in the sports video. Likewise, although FIG. 3 depicts the appearance feature and the motion feature as a single dimension each, each may contain multidimensional information, as described above.

According to an embodiment, the appearance feature may be extracted from a region in a single frame, while the motion feature may be extracted from a sequence of regions spanning a plurality of frames. In this case, the region and the region sequence may need to be synchronized. For example, the appearance feature of a region detected in a particular frame may be statistically combined (for example, averaged) with the appearance features of the regions detected in the preceding and following frames. The range of preceding and following frames may correspond to the frame range of the region sequence used to extract the motion feature.
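The statistical combination of appearance features over neighboring frames can be sketched in Python as a simple windowed average. The function name, the dictionary keyed by frame index, and the `radius` parameter (half the frame range of the motion-feature sequence) are hypothetical.

```python
from typing import Dict, List, Sequence


def smooth_appearance(
    features_by_frame: Dict[int, Sequence[float]],
    center: int,
    radius: int,
) -> List[float]:
    """Average one player's appearance vectors over frames center-radius .. center+radius."""
    window = [
        features_by_frame[f]
        for f in range(center - radius, center + radius + 1)
        if f in features_by_frame  # skip frames where the player was not detected
    ]
    return [sum(column) / len(window) for column in zip(*window)]
```

With the window aligned to the motion-feature sequence, each averaged appearance vector and each motion vector describe the same span of frames and can be concatenated for clustering.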

Based on the generated clusters, the service server extracts unidentified player-specific video clips from the sports video (230). For each cluster, the service server may obtain the frame indices of the regions included in the cluster, as well as the position and size of the region within each frame. The service server may generate a video clip for a cluster by extracting from the sports video the frames whose indices correspond to that cluster.
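One way to turn a cluster's frame indices into clip boundaries is sketched below in Python: consecutive frame indices are merged into runs, and each run is converted to a time section. The function name, the `max_gap` tolerance, and the conversion by a constant frame rate are assumptions.

```python
from typing import Iterable, List, Tuple


def frames_to_sections(
    frame_indices: Iterable[int],
    fps: float,
    max_gap: int = 1,  # frames at most this far apart are treated as one run
) -> List[Tuple[float, float]]:
    """Merge a cluster's frame indices into (start_sec, end_sec) sections."""
    runs: List[List[int]] = []
    for idx in sorted(set(frame_indices)):
        if runs and idx - runs[-1][1] <= max_gap:
            runs[-1][1] = idx        # extend the current run
        else:
            runs.append([idx, idx])  # start a new run
    # A run [first, last] lasts until the end of its last frame.
    return [(first / fps, (last + 1) / fps) for first, last in runs]
```

Raising `max_gap` would bridge short detection dropouts so that a brief occlusion does not split one play into several clips.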

According to an embodiment, the service server may apply visual effects to a video clip. The service server may crop the frames extracted from the sports video based on the position and size of the region within each frame. Alternatively, the service server may apply a visual effect that highlights the region within the frame, based on the position and size of the region. Alternatively, the service server may add information about the cluster or the region in the form of captions or the like.

The service server provides the extracted video clips to the user, organized by unidentified player. The service server may provide the video clips to the user through a web interface and/or an app interface. Depending on the embodiment, the service server may also provide the video clips to the user through a social network service and/or an instant messaging service.

FIG. 4 is a diagram illustrating a sports video-based platform service according to an embodiment. Referring to FIG. 4, a sports video may include a plurality of frames (..., k, k+1, ..., l, l+1, ..., m, m+1, ...). The service server may detect the regions (401 to 416) of a plurality of unidentified players in the plurality of frames.

The service server may cluster the regions (401 to 416) using an unsupervised learning technique. For example, the service server may assign regions 401, 403, 413, and 416 to a first cluster (C1); regions 405, 408, 411, and 414 to a second cluster (C2); regions 402, 404, 406, and 409 to a third cluster (C3); and regions 407, 410, 412, and 415 to a fourth cluster (C4).

Based on the first to fourth clusters, the service server may extract unidentified player-specific video clips from the sports video. For example, the service server may extract the frames corresponding to the first cluster (k, k+1, m, m+1) to generate a video clip for the first unidentified player. The service server may extract the frames corresponding to the second cluster (l, l+1, m, m+1) to generate a video clip for the second unidentified player. The service server may extract the frames corresponding to the third cluster (k, k+1, l, l+1) to generate a video clip for the third unidentified player. The service server may extract the frames corresponding to the fourth cluster (l, l+1, m, m+1) to generate a video clip for the fourth unidentified player.

By providing the user with the video clips generated for each unidentified player, the service server may receive identification information for each video clip. For example, the service server may receive identification information indicating the first player of team A for the first cluster, the second player of team A for the second cluster, the first player of team B for the third cluster, and the second player of team B for the fourth cluster.

The service server may collect identification information received from a plurality of users. For example, the service server may receive an input identifying the video clip of the first cluster from a first user and an input identifying the video clip of the second cluster from a second user. Alternatively, the service server may receive information identifying the video clip of the same cluster from a plurality of users and adopt the identification information with the highest reliability. For example, the service server may receive different identification information for the video clip of the same cluster from a plurality of users. In this case, the service server may adopt the identification information entered by the largest number of users. Alternatively, the service server may adopt the identification information entered by the user with the highest reliability. Alternatively, the service server may assign a score to each piece of identification information based on the reliability of the user who entered it and adopt the identification information with the highest score. A user's reliability may be determined based on, for example, the user's history of using the sports video-based platform service or the user's identity verification level.
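
The adoption rules above can be sketched as a single weighted vote. This is a minimal illustration, not the method of the embodiment: the function name `adopt_identification`, the per-user weight table, the default weight of 1.0, and the alphabetical tie-break are all assumptions made for the example. Without a weight table the function reduces to a plain majority vote.

```python
from collections import defaultdict

def adopt_identification(votes, reliability=None):
    """Pick one label for a cluster from several users' inputs.

    votes: list of (user_id, label) pairs for the same cluster.
    reliability: optional dict user_id -> weight (hypothetical);
                 users missing from the table weigh 1.0.
    """
    if not votes:
        return None
    scores = defaultdict(float)
    for user_id, label in votes:
        weight = 1.0 if reliability is None else reliability.get(user_id, 1.0)
        scores[label] += weight
    # Highest score wins; ties go to the label that sorts first,
    # purely to keep the result deterministic.
    return min(scores, key=lambda label: (-scores[label], label))
```

Giving one user a very large weight approximates the "most reliable user wins" variant, while intermediate weights correspond to the score-based variant.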

FIG. 5 is a diagram illustrating a sports video-based platform service according to an embodiment. The service server may perform clustering for each motion type. To this end, the service server may classify the regions of unidentified players detected in the sports video into predetermined motion types and cluster the motion type-specific regions. Alternatively, the service server may cluster the regions hierarchically using a hierarchical clustering technique.

The motion features of each motion type that appears in a sports game may be shared across unidentified players. For example, the dunk shot motion of a first unidentified player and the dunk shot motion of a second unidentified player may both include the motion features of a dunk shot. Likewise, the jump shot motion of the first unidentified player and the jump shot motion of the second unidentified player may both include the motion features of a jump shot. Referring to FIG. 6, the dunk shot motion of the first unidentified player may be represented by motion vector 610, the jump shot motion of the first unidentified player by motion vector 620, the dunk shot motion of the second unidentified player by motion vector 630, and the jump shot motion of the second unidentified player by motion vector 640.

In the embodiment of FIG. 6, the distance between motion vector 610 and motion vector 620 may be greater than the distance between motion vector 610 and motion vector 630, and the distance between motion vector 640 and motion vector 630 may be greater than the distance between motion vector 640 and motion vector 620. That is, two different motions of the same player may lie farther apart than the same motion performed by different players. In this case, the regions may be clustered by motion type rather than by unidentified player.

According to an embodiment, the service server may classify the motion types of the regions and then perform clustering among the regions that belong to the same motion type. In this case, the regions may be classified into a first cluster 650 corresponding to the dunk shot motion and a second cluster 660 corresponding to the jump shot motion. Depending on the embodiment, region detection and motion type classification may be performed simultaneously. For example, a detection module trained to classify the motion type of each region while detecting unidentified players in the sports video may be used. Of course, depending on the embodiment, region detection and motion type classification may instead be performed by separate modules (or neural networks). After the regions are classified by motion type, the service server may cluster the motion type-specific regions so that unidentified players are distinguished from one another within each motion type.

According to an embodiment, the service server may use a hierarchical clustering technique to perform primary clustering first and then perform secondary clustering within each resulting cluster. In this case, the primary clustering generates the first cluster 650 and the second cluster 660, and the secondary clustering distinguishes the first unidentified player from the second unidentified player within the first cluster 650 and within the second cluster 660. K1 for the primary clustering may correspond to the number of motion types allowed in the sports game, and K2 for the secondary clustering may correspond to the number of players participating in the sports game.

FIG. 7 illustrates an embodiment in which regions are clustered according to motion type, motion features, and appearance features to generate sub-clusters. Referring again to FIG. 6, and as described in detail below, the sub-cluster corresponding to the first unidentified player in the first cluster 650 and the sub-cluster corresponding to the first unidentified player in the second cluster 660 may be matched with each other based on tracking information, appearance information, or a combination thereof. Likewise, the sub-cluster corresponding to the second unidentified player in the first cluster 650 and the sub-cluster corresponding to the second unidentified player in the second cluster 660 may be matched with each other based on tracking information, appearance information, or a combination thereof.

Referring to FIG. 5, the service server may classify regions 501 to 516 by motion type. For example, regions 501 and 511 may be classified as a dribble motion; regions 503, 510, and 514 as a shooting motion; region 504 as a block motion; region 506 as a pass motion; and regions 513 and 516 as a screen motion.

The service server may cluster the motion type-specific regions so that unidentified players are distinguished from one another within each motion type. For example, the service server may cluster regions 501 and 511, both classified as a dribble motion, into different sub-clusters DR1 and DR2. The service server may cluster regions 503, 510, and 514, all classified as a shooting motion, into different sub-clusters SH1, SH2, and SH3. The service server may cluster regions 513 and 516, both classified as a screen motion, into the same sub-cluster SC1.

The service server may match sub-clusters of different motion types using the tracking information of the regions. For example, based on tracking information indicating that regions 501 and 503 are consecutive regions, the service server may match DR1 with SH1 and classify them into the same cluster C1. Based on tracking information indicating that regions 511 and 514 are consecutive regions, the service server may match DR2 with SH3 and classify them into the same cluster C2.
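
Matching sub-clusters through tracking links can be sketched with a union-find structure: every tracked pair of regions joins the two sub-clusters that contain them. The sketch below is only an illustration under assumptions; the function name `merge_subclusters` and the input shapes (a region-to-sub-cluster mapping and a list of tracked region pairs) are hypothetical and not taken from the embodiment.

```python
def merge_subclusters(region_to_sub, tracked_pairs):
    """Merge sub-clusters joined by at least one tracked pair of regions.

    region_to_sub: dict region id -> sub-cluster name (e.g. 501 -> "DR1").
    tracked_pairs: iterable of (region_a, region_b) pairs that tracking
                   reports as consecutive observations of one player.
    Returns a list of sets of sub-cluster names, one set per merged cluster.
    """
    parent = {sub: sub for sub in region_to_sub.values()}

    def find(sub):
        # Follow parent links to the representative, halving the path.
        while parent[sub] != sub:
            parent[sub] = parent[parent[sub]]
            sub = parent[sub]
        return sub

    for a, b in tracked_pairs:
        root_a, root_b = find(region_to_sub[a]), find(region_to_sub[b])
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for sub in parent:
        groups.setdefault(find(sub), set()).add(sub)
    return list(groups.values())
```

With the regions of FIG. 5 and the tracked pairs (501, 503) and (511, 514), DR1 and SH1 end up in one group and DR2 and SH3 in another, while SH2 stays on its own until other evidence, such as appearance, links it to a cluster.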

The service server may match sub-clusters of different motion types using the appearance information of the regions. For example, upon determining that the appearance features of region 504 and the appearance features of region 506 are similar, the service server may classify BR1 and PA1 into the same cluster C3. Upon determining that the appearance features of region 513 are similar to those of region 501 or region 503, the service server may classify SC1 into cluster C1.

The embodiment of FIG. 5 has been described using an example in which regions are classified by motion type and then clustered, but substantially the same operation applies when the hierarchical clustering technique described above is used. In addition, although the example applies tracking information first and appearance information second to match sub-clusters, the order and manner in which tracking information and appearance information are applied may be varied.

According to an embodiment, the service server may generate a video clip using the sub-clusters of predetermined motion types. For example, the service server may generate a video clip using only the sub-clusters of motion types that correspond to offensive motions. Furthermore, the service server may generate a video clip using only the region sequences that correspond to scenes in which an offensive motion resulted in a score.

FIG. 8 is a diagram illustrating a general-purpose model and a specialized model according to an embodiment. Referring to FIG. 8, the general-purpose model may include a detector that detects unidentified players in a sports video in a general-purpose manner and a classifier that classifies the motion type of each detected region in a general-purpose manner. The general-purpose model may include a detector or classifier trained independently of the data of the unidentified players.

The specialized model uses a database in which identification information has been set through the sports video-based platform service and is specialized for a specific player, a specific group, or a specific team. It may include a detector that detects players, a classifier that classifies motion types, an identifier that identifies players, or a hybrid module that combines the detection, classification, and/or identification functions in various ways. The general-purpose model and the specialized model may be artificial neural network-based models. In this case, the database may include training data that depends on the data of the players for whom identification information has been set.

The specialized model may be a model newly trained for a specific group or, depending on the embodiment, a model obtained by further training the general-purpose model to suit a specific group. For example, a specialized model may be generated by further training the general-purpose model so that it achieves higher performance for a specific group defined by nationality, age group, gender, and the like.

Using the specialized model, the service server may automatically set the identification information of the per-player clusters in a newly received sports video. The sports video may be a live stream, in which case the service server may use the specialized model to generate, in real time, statistics indicating the contributions of the players automatically identified in the sports video. The statistics indicating player contributions may include not only the scores of both teams in a basketball game but also detailed records for each player, such as field goals made, field goals attempted, the team's plus/minus while the player is on the court, total points, offensive rebounds, defensive rebounds, assists, steals, blocks, shots blocked by opponents, fouls, and turnovers.

FIG. 9 is a diagram illustrating video clips provided in conjunction with statistics indicating each player's contribution in a basketball game according to an embodiment. Referring to FIG. 9, the service server may generate a box score that statistically represents each player's contribution to the game, based on the identification information and the motion type of each region detected in the basketball video.

The service server may decide whether to analyze the sports video further based on the motion type of a region. For example, upon determining that the motion type is a shooting motion, the service server may additionally analyze whether the shot was successful. The service server may check whether the ball passed through the rim after the frame of the shooting motion. The service server may detect the regions corresponding to the rim in the frames following the shooting motion and determine whether the ball passed through the detected regions.
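
One simple way to test whether the ball passed through a detected rim region is to look for a downward crossing of the rim height that falls inside the rim's horizontal extent. The sketch below is a hedged illustration, not the analysis used by the embodiment. It assumes image coordinates in which y grows downward, a ball centre per frame, and an axis-aligned rim box; the function name `ball_passed_rim` is hypothetical.

```python
def ball_passed_rim(ball_centers, rim_box):
    """Return True if the ball drops through the rim opening.

    ball_centers: list of (x, y) ball centres in consecutive frames after
                  the shooting motion (image coordinates, y grows downward).
    rim_box: (x_min, y_min, x_max, y_max) of the detected rim region.
    """
    x_min, y_min, x_max, y_max = rim_box
    rim_y = (y_min + y_max) / 2.0
    for (x0, y0), (x1, y1) in zip(ball_centers, ball_centers[1:]):
        crosses_downward = y0 < rim_y <= y1
        if not crosses_downward:
            continue
        # Interpolate the x position at the moment the ball is at rim height.
        t = (rim_y - y0) / (y1 - y0)
        x_cross = x0 + t * (x1 - x0)
        if x_min <= x_cross <= x_max:
            return True
    return False
```

A ball that crosses rim height outside the box, or that only ever moves upward past the rim, is not counted as a made shot under this rule.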

Along with the box score, the service server may provide, in response to the selection of a detailed record in the box score, a video clip containing the scenes of the selected record. For example, the FG entry in the box score may be displayed in the form "field goals made"-"field goals attempted", such as 08-14. When the user selects 08, the number of field goals made, the service server may generate a video clip containing the scenes in which that player scored in that game and provide it to the user.

More specifically, each detailed record of the box score may store the corresponding regions (or sub-clusters) in a data structure such as a linked list. The service server may follow the linked list to obtain the related regions and extract a video clip from the basketball video based on the information stored in each region (frame index, window position, window size, and the like). As described above, the service server may also apply visual effects to the video clip.
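
Turning the frame indices stored with a record's regions into clip intervals can be sketched as follows. This is an assumption-laden illustration: the function name `frames_to_intervals`, the `fps` and `max_gap` parameters, and the use of a plain Python list in place of the linked list are all hypothetical choices made for the example.

```python
def frames_to_intervals(frame_indices, fps, max_gap=1):
    """Group frame indices into (start_sec, end_sec) clip intervals.

    Frames whose indices differ by at most max_gap stay in one interval;
    a larger jump starts a new interval. The end time is the end of the
    last frame in the interval.
    """
    intervals = []
    start = prev = None
    for idx in sorted(set(frame_indices)):
        if start is None:
            start = prev = idx
        elif idx - prev <= max_gap:
            prev = idx
        else:
            intervals.append((start / fps, (prev + 1) / fps))
            start = prev = idx
    if start is not None:
        intervals.append((start / fps, (prev + 1) / fps))
    return intervals
```

The resulting intervals could then be handed to whatever component cuts or streams the clip. The window position and size stored with each region would be used separately, for example to crop or highlight the player.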

FIG. 10 is a diagram illustrating a function for reflecting user feedback on a video clip according to an embodiment. Referring to FIG. 10, the service server may receive user feedback on a video clip. For example, the service server may receive from the user a feedback input 1010 indicating that the unidentified player of at least one region included in the cluster of the video clip does not belong to that cluster. The service server may exclude or remove that region from the cluster, or may go further and exclude or remove the sub-cluster 1015 that contains the region from the cluster. In this case, depending on the embodiment, the service server may temporarily store the excluded region or sub-cluster in an unclassified pool. The service server may provide the user with video clips of the sub-clusters temporarily stored in the unclassified pool and ask which cluster each of those sub-clusters belongs to.

Alternatively, the service server may receive from the user a feedback input 1020 indicating that the unidentified player of at least one region included in the cluster of the video clip belongs to another cluster. The service server may exclude or remove that region, or the sub-cluster 1025 that contains it, from the cluster. The service server may then add the excluded region or sub-cluster to the other cluster designated by the user.
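
Both kinds of feedback amount to moving a sub-cluster out of its current cluster, either into a user-designated cluster or into the unclassified pool. A minimal sketch follows. The function name `apply_feedback`, the dictionary-of-sets representation, and the pool key `"unclassified"` are assumptions for illustration only.

```python
UNCLASSIFIED = "unclassified"  # hypothetical key for the unclassified pool

def apply_feedback(clusters, sub_cluster, source, target=None):
    """Move a sub-cluster out of `source`.

    clusters: dict cluster name -> set of sub-cluster names (modified in place).
    target: cluster designated by the user; if None, the sub-cluster is
            parked in the unclassified pool instead.
    """
    if sub_cluster not in clusters.get(source, set()):
        raise KeyError(f"{sub_cluster!r} is not in cluster {source!r}")
    clusters[source].remove(sub_cluster)
    destination = target if target is not None else UNCLASSIFIED
    clusters.setdefault(destination, set()).add(sub_cluster)
    return clusters
```

Calling the function without `target` models feedback input 1010, and calling it with `target` models feedback input 1020.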

Although not shown in the drawings, the service server may set the billing level differentially according to the accuracy (or reliability) of the service. For example, starting from a base service fee per game (for example, 1 dollar), the service server may discount the fee according to detection accuracy, classification accuracy, or a combination thereof. Detection accuracy and/or classification accuracy may be referred to as the accuracy of a cluster.

In addition, the service server may determine a reward level according to the user's feedback input. For example, the service server may credit the user with points usable for the next service, in proportion to how much the user's feedback improves the accuracy (or reliability) of a cluster or how much it improves the performance of the specialized model.

FIG. 11 is a diagram illustrating a search function according to an embodiment. The service server may build a database for searching sports videos. The service server may receive a search query from a user terminal. The search query may specify a target player and a target scene. The service server may retrieve from the database the URL of the sports video that matches the search query and the time interval(s) within that sports video.

The service server may provide the streaming server with information for extracting a video clip so that the video clip is transmitted directly from the streaming server to the user. The service server may provide the user terminal with a search result that includes the URL and time interval(s) of the sports video. Based on the search result, the user terminal may request the video of the corresponding time interval from the streaming server or a storage server.

Although not shown in the drawings, depending on the embodiment, the service server may extract a video clip from the streaming server or the storage server based on the access information of the sports video that is associated with the cluster containing the retrieved sub-cluster (or interval sequence). The service server may provide the video clip to the user. The service server may cache video clips that have already been generated and may record in the database whether a video clip is cached. When query processing shows that a cached video clip can satisfy the request, the service server may skip extracting (or generating) the video clip and provide the cached video clip to the user directly.
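
The caching behaviour can be sketched as a lookup that extracts a clip only on a miss. This is an illustrative sketch under assumptions: the class name `ClipCache`, the in-memory dictionary (the embodiment records the cached state in a database), and the key made of URL plus start and end times are all hypothetical.

```python
class ClipCache:
    """Return cached clips when available; extract and store them otherwise."""

    def __init__(self, extract):
        # extract: callable(url, start, end) -> clip; stands in for the
        # request to a streaming or storage server.
        self._extract = extract
        self._clips = {}

    def get(self, url, start, end):
        key = (url, start, end)
        if key not in self._clips:
            # Cache miss: extract once and remember the result.
            self._clips[key] = self._extract(url, start, end)
        return self._clips[key]
```

A second request for the same video and time interval is then served without calling the extraction function again.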

According to an embodiment, appearance features or motion features may be extracted differently depending on the viewpoint of the camera that captured the sports video. Here, the viewpoint of the camera consists of the camera's three-dimensional position and three-dimensional orientation and may therefore have six degrees of freedom (DOF).

The service server may use appearance features or motion features that are robust to changes in the camera's viewpoint. For example, the service server may encode appearance features so that the information in the multidimensional appearance vector is independent of the camera's viewpoint. Viewpoint-independent appearance features may include style features of the player, such as hairstyle, skin style (or skin type), and/or tattoo style; style features of objects worn by the player, such as jersey style, basketball shoe style, and/or other accessory styles; and/or physical features of the player, such as height or build, normalized with respect to a reference object.

The style features of the player and of the objects worn by the player may be defined in a form that is independent of the camera's viewpoint. A reference object is an object with a standardized appearance in the sports game concerned, for example a goal or the lines drawn on the court. Physical features such as a player's height or build may be normalized using the size of the reference object as it appears in the sports video. Because the size of the reference object is standardized for the sports game, physical features normalized against it can be independent of the camera's viewpoint.
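
Normalization against a reference object can be sketched as a simple ratio. The sketch is a deliberate simplification and not the encoding used by the embodiment: it assumes the player and the reference object are at a similar distance from the camera and ignores perspective and lens distortion. The function name `normalized_height` and its parameters are hypothetical.

```python
def normalized_height(player_px, reference_px, reference_m):
    """Estimate a player's height in metres from pixel measurements.

    player_px: player's height in pixels in the frame.
    reference_px: size in pixels of a standardized reference object
                  (for example, the floor-to-rim distance) in the same frame.
    reference_m: the regulation size of that reference object in metres.
    """
    if reference_px <= 0:
        raise ValueError("reference_px must be positive")
    return player_px / reference_px * reference_m
```

Because both pixel measurements scale together when the camera zooms or moves closer, the ratio stays roughly constant across viewpoints, which is what makes the normalized feature usable for matching.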

In addition, the service server may encode motion features so that the information in the multidimensional motion vector is independent of the camera's viewpoint. Viewpoint-independent motion features may include motion (for example, direction, magnitude, and speed) or pose normalized with respect to a reference object. When the player's joints can be recognized, the motion features may include the normalized motion or pose of each major joint. Here, the major joints used for the motion features may be defined for each motion type.

According to an embodiment, a sports video may be captured with part of the court cut off. The service server may use region sequences to estimate a player's actions in the part of the court that was not captured. For example, when the ball passes through the rim or hits the rim without a shooting motion having been detected, the service server may infer that a shooting motion occurred in the uncaptured part of the court. Furthermore, the service server may search the frames immediately preceding that frame for a player who moved into the uncaptured part of the court and infer that the shooting motion was made by that player.

According to an embodiment, the service server may additionally use the referee's motions, the referee's whistle, or the buzzer of the scoring system to analyze the sports video. For example, the service server may use the referee's motion to distinguish a two-point shot from a three-point shot. Alternatively, the service server may use the whistle or the buzzer to determine whether play was stopped at a given point in time.

According to an embodiment, the service server may recognize linked situations based on motion types. For example, the service server may recognize an assist by linking a shooting motion with the pass motion that preceded it. Alternatively, the service server may recognize a block by linking a shooting motion with the block motion that followed it. A block may also be recognized by additionally considering the direction in which the ball travels after the shooting motion.
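
Linking a shooting motion to the pass that preceded it can be sketched over a time-ordered list of motion events. The rule below is only one plausible reading, stated as an assumption: an assist is credited when a made shot follows a teammate's pass within a fixed number of frames. The function name `find_assists`, the tuple layout, and the 90-frame window are hypothetical.

```python
def find_assists(events, max_gap_frames=90):
    """Find (passer, scorer, shot_frame) triples.

    events: list of (frame, player, team, motion, scored) tuples sorted by
            frame, where motion is a label such as "pass" or "shoot".
    """
    assists = []
    last_pass = {}  # team -> (frame, player) of that team's most recent pass
    for frame, player, team, motion, scored in events:
        if motion == "pass":
            last_pass[team] = (frame, player)
        elif motion == "shoot" and scored and team in last_pass:
            pass_frame, passer = last_pass[team]
            # The passer must be a different player and the pass recent enough.
            if passer != player and frame - pass_frame <= max_gap_frames:
                assists.append((passer, player, frame))
    return assists
```

A block could be detected in a similar way by pairing a shooting motion with an opponent's block motion that follows it, optionally checking the ball's direction afterwards.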

FIG. 12 is a diagram illustrating an operation of generating tracking clusters according to an embodiment. Referring to FIG. 12, a user may upload a sports video to a streaming server 1210. The user may provide the service server with a link to the video on the streaming server. As described above, the sports video may be uploaded to a storage server instead of the streaming server 1210, or it may be uploaded directly to the service server.

The service server may receive the sports video by accessing the streaming server 1210 using the link. Depending on the embodiment, the service server may process the sports video while streaming it, or may process it after downloading it.

A pre-processing module 1220 of the service server may pre-process the sports video. For example, the service server may extract the game-time footage from the sports video, excluding break time. If the sports game consists of four quarters, the service server may extract the game footage of the first to fourth quarters. In addition, the service server may extract the in-play footage from the game footage, excluding the out-of-play footage. For example, even within a quarter, play may be stopped because of a foul, a timeout, or the like. The service server may exclude the out-of-play footage in which play is stopped and extract the in-play footage in which play is in progress. The in-play footage may be further subdivided according to the rules of the sport. For example, a fouled player may be awarded a free throw in a basketball game, or a free kick in a soccer game. The service server may extract free throw footage or free kick footage according to the rules of the corresponding sport.

The service server may extract footage by extracting a frame section or a time section within the sports video. Pre-processing of the sports video may operate in an automatic mode in which processing is fully automatic, a semi-automatic mode in which user feedback is received after automatic processing, or a manual mode in which input is received from the user. For example, the service server may receive the start time and end time of each of the first to fourth quarters from the user. Alternatively, the service server may automatically extract start scene candidates and end scene candidates for the first to fourth quarters through video analysis, and have the user select the actual start scene and actual end scene of each quarter. In addition, the service server may recognize a stoppage of play caused by a foul by analyzing the referee's whistle sound or the referee's motion. The service server may also recognize a stoppage of play through video analysis. For example, the amount of player motion while play is in progress may differ from the amount of player motion while play is stopped. The service server may recognize a stoppage of play by obtaining the amount of player motion through video analysis.
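As a non-limiting illustration, recognizing a stoppage of play from the amount of motion can be sketched as follows. The sketch assumes grayscale frames given as nested lists, and it measures motion over the whole frame rather than over player regions only. The function names `motion_amount` and `find_stoppages` and all threshold values are hypothetical and do not come from the embodiments.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[Sequence[int]]  # grayscale frame as rows of pixel intensities


def motion_amount(prev: Frame, curr: Frame, pixel_threshold: int = 20) -> float:
    """Fraction of pixels whose intensity changed by more than pixel_threshold."""
    changed = 0
    total = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += 1
            if abs(a - b) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def find_stoppages(frames: List[Frame], motion_threshold: float = 0.05,
                   min_length: int = 3) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) frame ranges in which motion stays low.

    Thresholds are illustrative assumptions, not values from the embodiments.
    """
    stoppages = []
    run_start = None  # first frame index of the current low-motion run
    for i in range(1, len(frames)):
        low = motion_amount(frames[i - 1], frames[i]) < motion_threshold
        if low and run_start is None:
            run_start = i
        elif not low and run_start is not None:
            if i - run_start >= min_length:
                stoppages.append((run_start, i - 1))
            run_start = None
    # Close a low-motion run that lasts until the final frame.
    if run_start is not None and len(frames) - run_start >= min_length:
        stoppages.append((run_start, len(frames) - 1))
    return stoppages
```

In a semi-automatic mode, the ranges returned by such a routine could be presented to the user as stoppage candidates for confirmation.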

The service server may detect and track (1230) unidentified players in the pre-processed video. The service server may generate tracking clusters (i0, i1, j0, k0) by tracking the regions of the sports video that correspond to unidentified players. The sports video includes a plurality of frames, and a tracking cluster may include the regions of the same unidentified player in at least some consecutive frames among the plurality of frames.

The service server may assign a unique identifier to each of the tracking clusters (i0, i1, j0, k0). For example, the service server may assign an identifier to a tracking cluster as a combination of its start frame and a unique number. If a tracking cluster starts at the i-th frame, it may be assigned the identifier i0. If another tracking cluster also starts at the i-th frame, it may be assigned the identifier i1. Likewise, a tracking cluster starting at the j-th frame may be assigned the identifier j0, and a tracking cluster starting at the k-th frame may be assigned the identifier k0.

In addition, the service server may add information indicating the end frame of the tracking cluster, or the number of frames in the tracking cluster, to the identifier. As described below, when matching is performed between tracking clusters, matching may be skipped for clusters that overlap in even some frames. The service server can readily determine whether tracking clusters overlap by using the end frame or the frame count added to the identifier.
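As a non-limiting illustration, an identifier that carries the start frame, a per-frame unique number, and the end frame makes the overlap test a simple interval comparison. The class name `TrackingClusterId`, its field names, and the label format are hypothetical and do not come from the embodiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackingClusterId:
    """Hypothetical identifier layout: start frame, unique number, end frame."""
    start_frame: int
    index: int       # distinguishes clusters that start at the same frame
    end_frame: int   # inclusive

    @property
    def frame_count(self) -> int:
        return self.end_frame - self.start_frame + 1

    def label(self) -> str:
        # For example, "100_0" plays the role of "i0" when i = 100.
        return f"{self.start_frame}_{self.index}"


def overlaps(a: TrackingClusterId, b: TrackingClusterId) -> bool:
    """True if the two clusters share at least one frame."""
    return a.start_frame <= b.end_frame and b.start_frame <= a.end_frame
```

For example, two clusters that both start at frame 100 overlap and would be skipped during matching, while a cluster that starts after the other has ended remains a matching candidate.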

When the sports video is processed in a streaming manner, the service server may also add to the identifier information indicating whether the tracking cluster is still valid in the current frame.

In the course of tracking unidentified players in a sports video, tracking of an unidentified player may be interrupted for various reasons. For example, depending on the camera viewpoint, unidentified players may overlap one another. An unidentified player may move out of the camera's field of view. Part or all of the screen may be blocked by an obstacle while the sports video is being shot. The tracking module may also lose track of an unidentified player for other technical reasons. In such cases, a first tracking cluster is generated for the period before tracking is lost, and a second tracking cluster is generated for the newly started tracking after tracking is lost. FIG. 13 below describes an embodiment in which tracking clusters that were generated separately for the same unidentified player are merged through matching.

FIG. 13 is a diagram illustrating an operation of matching tracking clusters according to an embodiment. Referring to FIG. 13, the service server may classify motion types (1310) and extract features (1320) for each tracking cluster. The service server may extract an appearance feature of the unidentified player included in the tracking cluster. As described above, the appearance feature may include a multidimensional feature vector that is independent of the camera viewpoint.

The service server may classify the motion type of at least some sections included in a tracking cluster based on predetermined motion types. For example, the service server may detect, among the plurality of regions included in the tracking cluster, at least some consecutive regions that correspond to any one of the predetermined motion types. As an example, referring to tracking cluster i0, the service server may detect a dribble section and a pass section. The service server may use a neural network model that sequentially receives the consecutive scenes included in the tracking cluster and outputs at least one of the predetermined motion types. Based on the motion type, the service server may extract a motion feature of the unidentified player included in the corresponding section of the tracking cluster. As described above, the motion feature may include a multidimensional feature vector that is independent of the camera viewpoint.
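As a non-limiting illustration, once a motion type has been estimated for each frame of a tracking cluster, the sections can be recovered by grouping consecutive frames that share a label. The sketch assumes the per-frame labels are already available (for example, as the output of the neural network model) and covers only this grouping step. The function name `motion_sections` and the `min_length` parameter are hypothetical.

```python
from itertools import groupby
from typing import List, Optional, Tuple


def motion_sections(labels: List[Optional[str]], start_frame: int = 0,
                    min_length: int = 2) -> List[Tuple[str, int, int]]:
    """Group per-frame motion labels into (motion_type, first, last) sections.

    labels[n] is the assumed motion type at frame start_frame + n, or None
    when no predetermined motion type applies. Runs shorter than min_length
    are dropped as likely noise (an illustrative choice).
    """
    sections = []
    frame = start_frame
    for label, run in groupby(labels):
        length = len(list(run))
        if label is not None and length >= min_length:
            sections.append((label, frame, frame + length - 1))
        frame += length
    return sections
```

Applied to a cluster such as i0, a routine of this kind would yield one dribble section and one pass section, from each of which a motion feature can then be extracted.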

When a shot section is detected, the service server may separately tag the location on the court from which the unidentified player attempted the shot. As described below, the service server may detect a scoring event and additionally tag whether the unidentified player's shot attempt was successful.

Depending on the embodiment, the service server may extract, from the plurality of regions included in a tracking cluster, the regions corresponding to a state of ball possession, and may detect, among the extracted regions, at least some consecutive regions that correspond to any one of the predetermined motion types.

In the embodiment of FIG. 13, the service server may extract the appearance feature af_i0 from tracking cluster i0. The service server may extract the motion feature mf_i0_dribble from the dribble section of tracking cluster i0 and the motion feature mf_i0_pass from its pass section. In substantially the same manner, the service server may extract the appearance feature af_i1 and the motion feature mf_i1_shoot from tracking cluster i1. The service server may also extract the appearance feature af_j0 and the motion feature mf_j0_shoot from tracking cluster j0, and the appearance feature af_k0 and the motion feature mf_k0_block from tracking cluster k0.

The service server may match (1330) tracking clusters based on at least one of the appearance features and the motion features. The service server may skip matching between tracking clusters whose frame sections overlap even partially, because different tracking clusters with overlapping frame sections can be regarded as belonging to different unidentified players. Accordingly, the service server may skip matching between tracking cluster i0 and tracking cluster i1. The service server may match the appearance feature of tracking cluster i0 against that of tracking cluster j0 and determine that the match failed (Failure). The service server may match the appearance feature of tracking cluster i0 against that of tracking cluster k0 and determine that the match succeeded (Success). When a match succeeds, the service server may merge tracking cluster i0 and tracking cluster k0. Here, merging may be understood as including tracking cluster i0 and tracking cluster k0 in the per-player cluster for the same unidentified player.

In addition, the service server may match motion features of the same motion type between tracking clusters whose frame sections do not overlap. For example, the service server may match mf_i1_shoot, extracted from the shot section of tracking cluster i1, against mf_j0_shoot, extracted from the shot section of tracking cluster j0, and determine that the match failed (Failure).
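As a non-limiting illustration, the matching and merging described above can be sketched with a similarity test on feature vectors and a union-find structure. The embodiments do not specify a similarity measure; cosine similarity and the threshold of 0.9 are assumptions made for this sketch, as are the function names `cosine_similarity` and `merge_tracking_clusters`.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def merge_tracking_clusters(spans: Dict[str, Tuple[int, int]],
                            features: Dict[str, Sequence[float]],
                            threshold: float = 0.9) -> List[List[str]]:
    """Group tracking clusters into per-player clusters.

    spans maps a cluster id to its inclusive (start_frame, end_frame);
    features maps a cluster id to an appearance feature vector.
    Cosine similarity and the threshold are illustrative assumptions.
    """
    parent = {name: name for name in spans}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    names = sorted(spans)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (a0, a1), (b0, b1) = spans[a], spans[b]
            if a0 <= b1 and b0 <= a1:
                continue  # overlapping frame sections: skip matching
            if cosine_similarity(features[a], features[b]) >= threshold:
                parent[find(b)] = find(a)  # match succeeded: merge

    groups: Dict[str, List[str]] = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())
```

Note that this simple sketch merges transitively, so two overlapping tracking clusters could end up in the same group through a third cluster; a fuller implementation would need to guard against that case.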

Through the matching operation described above, the service server may generate per-player clusters for the unidentified players. In the embodiments above, a 'cluster' refers to a per-player cluster for an unidentified player and may be understood as a concept distinct from a 'tracking cluster'. The service server may generate (1340) video clips for each unidentified player based on the clusters. The matters described above with reference to FIGS. 1 to 11 apply as they are to the operations that follow generation of the per-player clusters, so a more detailed description is omitted.

FIG. 14 is a diagram illustrating an operation of detecting a scoring event according to an embodiment. Referring to FIG. 14, the service server may detect a scoring event in which the ball passes through the goal in the sports video. In the case of a basketball game, the service server may detect whether the ball passes through the rim of the goal. The service server may detect the goal region or the rim region in the sports video and, as the frames progress, determine whether a scoring event in which the ball passes through the detected region is detected.
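As a non-limiting illustration, a scoring event for basketball can be approximated by checking whether the tracked ball center moves from above the detected rim region to below it while staying within the rim's horizontal extent. The sketch assumes image coordinates in which y increases downward, a rim bounding box that is fixed across frames, and ball centers that are already available per frame. The function name `detect_scoring_frames` and the `max_gap` parameter are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Optional[Tuple[float, float]]    # ball center (x, y), or None if not detected
Box = Tuple[float, float, float, float]  # rim region: x_min, y_min, x_max, y_max


def detect_scoring_frames(ball_centers: List[Point], rim: Box,
                          max_gap: int = 10) -> List[int]:
    """Return the frame indices at which the ball has just passed down through the rim."""
    x_min, y_min, x_max, y_max = rim
    events = []
    entered_at = None  # last frame in which the ball was directly above the rim
    for frame, center in enumerate(ball_centers):
        if center is None:
            continue  # tolerate frames in which the ball was not detected
        x, y = center
        inside_x = x_min <= x <= x_max
        if inside_x and y < y_min:
            entered_at = frame
        elif inside_x and y > y_max and entered_at is not None:
            if frame - entered_at <= max_gap:
                events.append(frame)
            entered_at = None
        elif not inside_x:
            entered_at = None  # ball left the rim's horizontal extent
    return events
```

The `max_gap` limit is an illustrative way to reject cases in which the ball lingers above the rim and only later appears below it.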

When a scoring event is detected, the service server may identify, based on the corresponding frame, the tracking cluster associated with the event from among the tracking clusters. For example, the service server may identify the tracking cluster that includes, in the corresponding frame and the frames preceding it, a section of the motion type associated with the event. When a scoring event is detected, the service server may identify a tracking cluster that includes a shot section. The service server may identify the tracking cluster that includes the shot section closest to the scoring event among the preceding frames. The service server may tag the corresponding shot section with information indicating that the attempted shot was successful.
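As a non-limiting illustration, associating a scoring event with the closest preceding shot section can be sketched as follows. The tuple layout `ShotSection`, the function name `attribute_score`, and the `max_lookback` limit are hypothetical and do not come from the embodiments.

```python
from typing import List, Optional, Tuple

# (tracking cluster id, first frame of shot section, last frame of shot section)
ShotSection = Tuple[str, int, int]


def attribute_score(score_frame: int, shot_sections: List[ShotSection],
                    max_lookback: int = 90) -> Optional[ShotSection]:
    """Return the shot section that ended most recently before the scoring frame.

    Sections that ended more than max_lookback frames earlier are ignored;
    this limit is an illustrative assumption.
    """
    candidates = [
        s for s in shot_sections
        if s[2] <= score_frame and score_frame - s[2] <= max_lookback
    ]
    return max(candidates, key=lambda s: s[2], default=None)
```

The returned section identifies both the tracking cluster and the frames to be tagged as a successful shot.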

According to an embodiment, when generating video clips for each unidentified player, the service server may selectively obtain the sub-clusters associated with scoring events and then extract video clips from the sports video based on those sub-clusters.

The embodiments described above may be implemented by hardware components, software components, and/or a combination of hardware and software components. For example, the devices, methods, and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the processing device is sometimes described as a single device, but a person of ordinary skill in the art will appreciate that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied permanently or temporarily in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in the art of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the embodiments, and vice versa.

Although the embodiments have been described above with reference to a limited set of drawings, a person of ordinary skill in the art may apply various technical modifications and variations based on the above. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the described components such as systems, structures, devices, and circuits are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Claims

A method of operating a video analysis server, the method comprising:
receiving an analysis request signal including a link to a sports video of a ball game;
performing pre-processing that filters out static pixels from a plurality of frames included in the sports video and leaves dynamic pixels;
tracking a ball in the sports video based on the pre-processed video;
detecting a score-related scene of the sports video from the pre-processed video;
in response to detection of the score-related scene, determining an unidentified player associated with the score-related scene using the ball tracking result;
identifying the unidentified player by tracking the unidentified player up to an adjacent frame in which the unidentified player is identifiable; and
outputting a time section of the sports video and identification information of the unidentified player corresponding to the score-related scene.

The method of claim 1, wherein tracking the ball comprises:
detecting, for each of the frames, the ball based on the dynamic pixels of the corresponding frame.

The method of claim 1, wherein detecting the score-related scene comprises:
detecting, for each of the frames, a rim based on the dynamic pixels of the corresponding frame; and
determining frames adjacent to the frame in which the rim is detected as the score-related scene.

The method of claim 1, wherein determining the unidentified player associated with the score-related scene comprises:
detecting, using the ball tracking result, dynamic pixels related to a player who attempted to score in frames included in the score-related scene; and
determining the unidentified player associated with the score-related scene by performing instance segmentation on a frame in which the dynamic pixels related to the player who attempted to score are detected.

The method of claim 1, wherein identifying the unidentified player comprises:
extracting a feature from the determined unidentified player;
comparing the extracted feature with features of previously registered players;
determining, as a result of the comparison, whether the unidentified player can be identified; and
tracking the unidentified player by performing instance segmentation on adjacent frames in response to determining that the unidentified player cannot be identified.

The method of claim 1, wherein performing the pre-processing comprises at least one of:
filtering out static pixels based on a change in pixel values between adjacent frames within a predetermined range when the sports video is a video captured from a fixed viewpoint; and
filtering out static pixels based on statistical values of the optical flow of pixels within a frame when the sports video is a video captured from a moving viewpoint.

A computer-readable recording medium on which a program for executing the method of claim 1 is recorded.

A method of operating a server that provides a sports video-based platform service, the method comprising:
transmitting a signal requesting analysis of a sports video to a video analysis module based on a link to the sports video;
storing player-specific clusters received from the video analysis module in a database;
providing, to a user terminal based on the database, information for extracting player-specific video clips from the sports video;
receiving, from the user terminal to which the player-specific video clips are provided from a streaming server, an input identifying an unidentified player of at least one cluster; and
updating identification information of the corresponding at least one cluster in the database based on the input.

The method of claim 8, further comprising:
providing statistics indicating the contribution of players to the user terminal;
receiving, from the user terminal, an input selecting a detailed record included in the statistics;
obtaining, based on the database, at least one sub-cluster associated with the selected detailed record; and
providing, to the user terminal based on the at least one sub-cluster, information for extracting a video clip from the sports video.

The method of claim 8, further comprising:
receiving, from the user terminal, a search query including a search target player and a search target scene;
retrieving, from the database, a sub-cluster corresponding to the search query; and
providing, to the user terminal based on the retrieved sub-cluster, information for extracting a video clip from the sports video.

The method of claim 8, further comprising at least one of:
determining a charging level based on the reliability of the clusters; and
determining a reward level based on a feedback input that modifies the clusters.

The method of claim 8, further comprising:
receiving, from the user terminal, a feedback input indicating that a player of at least one section included in the at least one cluster does not belong to the corresponding cluster; and
excluding the corresponding section from the corresponding cluster based on the feedback input.

The method of claim 8, further comprising:
receiving, from the user terminal, a feedback input indicating that a player of at least one section included in the at least one cluster belongs to another cluster; and
based on the feedback input, excluding the corresponding section from the corresponding cluster and including it in the other cluster.

The method of claim 8, further comprising:
generating training data based on the updated database; and
training, based on the training data, a specialized model for estimating at least one of detection information, identification information, and motion type information of players.

A computer-readable recording medium on which a program for executing the method of claim 8 is recorded.