KR20200102409A

KR20200102409A - Key frame scheduling method and apparatus, electronic devices, programs and media

Info

Publication number: KR20200102409A
Application number: KR1020207005376A
Authority: KR
Inventors: 시지안핑; 리율러; 린다화
Original assignee: 베이징 센스타임 테크놀로지 디벨롭먼트 컴퍼니 리미티드
Priority date: 2017-12-27
Filing date: 2018-12-25
Publication date: 2020-08-31
Also published as: JP6932254B2; CN108229363A; EP3644221A1; MY182985A; US20200394414A1; WO2019128979A1; SG11202000578UA; JP2020536332A; EP3644221A4; KR102305023B1; US11164004B2

Abstract

Embodiments of the present application disclose a key frame scheduling method and apparatus, an electronic device, a program, and a medium. The method includes: performing feature extraction on a current frame through a first network layer of a neural network to obtain a lower-layer feature of the current frame; obtaining a scheduling probability value of the current frame according to the lower-layer feature of the previous key frame adjacent to the current frame and the lower-layer feature of the current frame; determining, according to the scheduling probability value of the current frame, whether the current frame is scheduled as a key frame; and, if it is determined that the current frame is scheduled as a key frame, performing feature extraction on the lower-layer feature of the current key frame through a second network layer of the neural network to obtain an upper-layer feature of the current key frame, wherein, in the neural network, the network depth of the first network layer is shallower than the network depth of the second network layer. By exploiting the change in lower-layer features between different frames of a video, the embodiments can perform key frame scheduling quickly, accurately, and adaptively, improving the efficiency of key frame scheduling.

Description

Key frame scheduling method and apparatus, electronic devices, programs and media

Cross-reference to related applications

This application claims priority to Chinese patent application No. CN201711455838.X, filed with the Chinese Patent Office on December 27, 2017 and entitled "Key frame scheduling method and apparatus, electronic device, program and medium", the entire contents of which are incorporated herein by reference.

The present application relates to computer vision technology, and in particular to a key frame scheduling method and apparatus, an electronic device, a program, and a medium.

Video semantic segmentation is an important problem in computer vision and video semantic understanding. Video semantic segmentation models have important applications in many fields, such as autonomous driving, video surveillance, and video object analysis. Speed is an important aspect of the video semantic segmentation task.

Embodiments of the present application provide a technical solution for key frame scheduling.

According to one aspect of the embodiments of the present application, a key frame scheduling method is provided, including:

performing feature extraction on a current frame through a first network layer of a neural network to obtain a lower-layer feature of the current frame;

obtaining a scheduling probability value of the current frame according to the lower-layer feature of the previous key frame adjacent to the current frame and the lower-layer feature of the current frame, wherein the lower-layer feature of the previous key frame is obtained by the first network layer performing feature extraction on the previous key frame, and the scheduling probability value is the probability that the current frame is scheduled as a key frame;

determining, according to the scheduling probability value of the current frame, whether the current frame is scheduled as a key frame; and

if it is determined that the current frame is scheduled as a key frame, determining the current frame to be the current key frame, and performing feature extraction on the lower-layer feature of the current key frame through a second network layer of the neural network to obtain an upper-layer feature of the current key frame, wherein, in the neural network, the network depth of the first network layer is shallower than the network depth of the second network layer.

Optionally, in any of the above method embodiments of the present application, the method further includes:

determining an initial key frame;

performing feature extraction on the initial key frame through the first network layer to obtain and buffer the lower-layer feature of the initial key frame; and

performing feature extraction on the lower-layer feature of the initial key frame through the second network layer to obtain the upper-layer feature of the initial key frame.

The method further includes performing semantic segmentation on the initial key frame and outputting the semantic label of the initial key frame.

Optionally, in any of the above method embodiments of the present application, after it is determined that the current frame is scheduled as a key frame, the method further includes:

buffering the lower-layer feature of the current key frame.
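The buffering described above can be pictured as a small cache that always holds the features of the most recent key frame. The following is a minimal sketch under stated assumptions: the class `KeyFrameBuffer`, its fields, and the use of flat lists as stand-ins for feature maps are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from typing import List, Optional

Feature = List[float]  # hypothetical flat stand-in for a feature map


@dataclass
class KeyFrameBuffer:
    """Holds the features of the most recent key frame (hypothetical helper)."""
    low: Optional[Feature] = None
    high: Optional[Feature] = None

    def update(self, low: Feature, high: Feature) -> None:
        # Overwrite the cache whenever a frame is scheduled as a key frame.
        self.low = list(low)
        self.high = list(high)


buffer = KeyFrameBuffer()
buffer.update(low=[0.1, 0.2], high=[0.9])  # initial key frame
buffer.update(low=[0.3, 0.4], high=[0.7])  # a later key frame replaces it
```

Keeping only the latest key frame matches the wording "previous key frame adjacent to the current frame"; a longer history would be a different design.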

Optionally, in any of the above method embodiments of the present application, obtaining the scheduling probability value of the current frame according to the lower-layer feature of the previous key frame adjacent to the current frame and the lower-layer feature of the current frame includes:

splicing the lower-layer feature of the previous key frame and the lower-layer feature of the current frame to obtain a spliced feature; and

obtaining, through a key frame scheduling network, the scheduling probability value of the current frame based on the spliced feature.
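The two sub-steps above can be sketched as follows. This is only an illustration under assumptions: splicing is interpreted here as concatenation along the channel axis, and the scheduling network is replaced by a hypothetical stand-in (global average pooling, one linear layer, and a sigmoid). The application does not specify the architecture, and the names `splice`, `scheduling_probability`, `weights`, and `bias` are invented for this sketch.

```python
import math
from typing import List

FeatureMap = List[List[float]]  # channels x positions, a simplified feature map


def splice(prev_key_low: FeatureMap, cur_low: FeatureMap) -> FeatureMap:
    """Concatenate the two feature maps along the channel axis (assumed meaning of splicing)."""
    return prev_key_low + cur_low


def scheduling_probability(spliced: FeatureMap, weights: List[float], bias: float) -> float:
    """Hypothetical scheduler: global average pooling, one linear layer, sigmoid."""
    pooled = [sum(channel) / len(channel) for channel in spliced]
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


prev_low = [[0.0, 0.2], [1.0, 1.0]]   # lower-layer feature of the previous key frame
cur_low = [[0.4, 0.6], [0.0, 0.0]]    # lower-layer feature of the current frame
spliced = splice(prev_low, cur_low)   # 4 channels in total
p = scheduling_probability(spliced, weights=[-1.0, -1.0, 1.0, 1.0], bias=0.0)
```

The sigmoid keeps the output in (0, 1), so it can be read as the probability that the current frame is scheduled as a key frame.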

Optionally, in any of the above method embodiments of the present application, the method further includes:

performing semantic segmentation on the current key frame and outputting the semantic label of the key frame.

According to another aspect of the embodiments of the present application, a key frame scheduling apparatus is provided, including:

a first feature extraction unit configured to perform feature extraction on a current frame to obtain a lower-layer feature of the current frame, the first feature extraction unit including a first network layer of a neural network;

a scheduling unit configured to obtain a scheduling probability value of the current frame according to the lower-layer feature of the previous key frame adjacent to the current frame and the lower-layer feature of the current frame, wherein the lower-layer feature of the previous key frame is obtained by the first network layer performing feature extraction on the previous key frame, and the scheduling probability value is the probability that the current frame is scheduled as a key frame;

a determining unit configured to determine, according to the scheduling probability value of the current frame, whether the current frame is scheduled as a key frame; and

a second feature extraction unit configured to, if the determination result of the determining unit is that the current frame is scheduled as a key frame, determine the current frame to be the current key frame and perform feature extraction on the lower-layer feature of the current key frame to obtain an upper-layer feature of the current key frame, wherein the second feature extraction unit includes a second network layer of the neural network and, in the neural network, the network depth of the first network layer is shallower than the network depth of the second network layer.

Optionally, in any of the above apparatus embodiments of the present application, the previous key frame includes a predetermined initial key frame;

the apparatus further includes:

a buffering unit configured to buffer the lower-layer features and upper-layer features of key frames, the key frames including the initial key frame.

Optionally, in any of the above apparatus embodiments of the present application, the first feature extraction unit is further configured to buffer the lower-layer feature of the current key frame in the buffering unit according to the determination result of the determining unit.

Optionally, in any of the above apparatus embodiments of the present application, the scheduling unit includes:

a splicing subunit configured to splice the lower-layer feature of the previous key frame and the lower-layer feature of the current frame to obtain a spliced feature; and

a key frame scheduling network configured to obtain the scheduling probability value of the current frame based on the spliced feature.

Optionally, in any of the above apparatus embodiments of the present application, the apparatus further includes:

a semantic segmentation unit configured to perform semantic segmentation on a key frame and output the semantic label of the key frame, the key frame including the initial key frame, the previous key frame, or the current key frame.

According to yet another aspect of the embodiments of the present application, an electronic device is provided, including the key frame scheduling apparatus according to any embodiment of the present application.

According to yet another aspect of the embodiments of the present application, an electronic device is provided, including:

a processor and the key frame scheduling apparatus according to any embodiment of the present application;

when the processor runs the key frame scheduling apparatus, the units of the key frame scheduling apparatus according to any embodiment of the present application are run.

According to yet another aspect of the embodiments of the present application, an electronic device is provided, including a processor and a memory;

the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform the operations of each step of the key frame scheduling method according to any embodiment of the present application.

According to yet another aspect of the embodiments of the present application, a computer program is provided, including computer-readable code; when the computer-readable code runs on a device, a processor in the device executes instructions for implementing each step of the key frame scheduling method according to any embodiment of the present application.

According to yet another aspect of the embodiments of the present application, a computer-readable medium is provided for storing computer-readable instructions which, when executed, implement the operations of each step of the key frame scheduling method according to any embodiment of the present application.

With the key frame scheduling method and apparatus, electronic device, program, and medium provided in the above embodiments of the present application, feature extraction is performed on the current frame to obtain its lower-layer feature; the scheduling probability value of the current frame is obtained according to the lower-layer feature of the adjacent previous key frame and the lower-layer feature of the current frame; whether the current frame is scheduled as a key frame is determined according to that scheduling probability value; and, if the current frame is determined to be scheduled as a key frame, feature extraction is performed on the lower-layer feature of the current key frame to obtain its upper-layer feature. From the lower-layer feature of the previous key frame and the lower-layer feature of the current frame, the embodiments can obtain how the current frame has changed relative to the previous key frame at the lower-layer feature level. By exploiting this change in lower-layer features between different frames of a video, key frame scheduling can be performed quickly, accurately, and adaptively, improving the efficiency of key frame scheduling.

The technical solutions of the present application are described in detail below with reference to the accompanying drawings and embodiments.

The embodiments of the present application have the effect of providing a technical solution for key frame scheduling.

The drawings, which form a part of this specification, illustrate embodiments of the present application and, together with the description, serve to explain its principles.
The present application can be understood more clearly from the following detailed description with reference to the drawings.
FIG. 1 is a schematic flowchart of a key frame scheduling method provided by an embodiment of the present application.
FIG. 2 is another schematic flowchart of a key frame scheduling method provided by an embodiment of the present application.
FIG. 3 is a schematic structural diagram of a key frame scheduling apparatus provided by an embodiment of the present application.
FIG. 4 is another schematic structural diagram of a key frame scheduling apparatus provided by an embodiment of the present application.
FIG. 5 is a schematic structural diagram of an application embodiment of an electronic device provided by an embodiment of the present application.

Various embodiments of the present application are described in detail below with reference to the accompanying drawings. It should be noted that, unless specifically stated otherwise, the components, the relative arrangement of steps, the numerical expressions, and the values set forth in the embodiments are not intended to limit the scope of the present application.

It should also be understood that, for convenience of description, the parts shown in the accompanying drawings are not drawn to scale.

The following description of at least one exemplary embodiment is merely illustrative and is not intended to limit the present application or its application or use.

Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but, where appropriate, such techniques, methods, and devices should be regarded as part of the specification.

It should be noted that similar reference numerals and letters denote similar items in the following drawings, so once an item is defined in one drawing, it need not be discussed further in subsequent drawings.

The embodiments of the present application may be applied to a computer system/server, which can operate with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known computing systems, environments, and/or configurations suitable for use with the computer system/server include, but are not limited to, personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above systems.

The computer system/server may be described in the general context of computer-system-executable instructions (for example, program modules) executed by a computer system. In general, program modules may include routines, programs, object programs, components, logic, data structures, and the like, which perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment, in which tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located on local or remote computing system storage media that include storage devices.

FIG. 1 is a schematic flowchart of a key frame scheduling method provided by an embodiment of the present application. As shown in FIG. 1, the method of this embodiment includes the following steps.

In step 102, feature extraction is performed on the current frame through the first network layer of the neural network to obtain the lower-layer feature of the current frame.

Optionally, the current frame may be any frame image in a video.

In an optional example, step 102 may be performed by a processor invoking corresponding instructions stored in a memory, or by a first feature extraction unit run by the processor.

In step 104, the scheduling probability value of the current frame is obtained according to the lower-layer feature of the previous key frame adjacent to the current frame and the lower-layer feature of the current frame.

Here, the lower-layer feature of the previous key frame is obtained by the first network layer performing feature extraction on the previous key frame. Optionally, the scheduling probability value provided in the embodiments of the present application is the probability that the current frame is scheduled as a key frame.

In an optional example, step 104 may be performed by a processor invoking corresponding instructions stored in a memory, or by a scheduling unit run by the processor.

In step 106, whether the current frame is scheduled as a key frame is determined according to the scheduling probability value of the current frame.

In an optional example of the embodiments of the present application, whether the current frame is scheduled as a key frame may be determined according to whether its scheduling probability value reaches a preset threshold. For example, with a preset threshold of 80%, if the scheduling probability value of the current frame is greater than or equal to the threshold, the current frame is determined to be scheduled as a key frame, that is, the current frame is regarded as a key frame. If the scheduling probability value of the current frame is less than the threshold, the current frame is determined not to be scheduled as a key frame.
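The threshold rule in this example reduces to a single comparison. A minimal sketch, assuming the probability is expressed in the range 0 to 1 so that the 80% threshold becomes 0.8 (the function name `is_key_frame` is hypothetical):

```python
def is_key_frame(probability: float, threshold: float = 0.8) -> bool:
    """Return True when the frame should be scheduled as a key frame.

    Follows the example in the text: a probability greater than or equal to
    the preset threshold schedules the frame; a smaller one does not.
    """
    return probability >= threshold


decisions = [is_key_frame(p) for p in (0.79, 0.8, 0.93)]
```

The threshold is a tunable parameter: lowering it schedules key frames more often, which costs more computation per video.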

In an optional example, step 106 may be performed by a processor invoking corresponding instructions stored in a memory, or by a determining unit run by the processor.

In step 108, if it is determined that the current frame is scheduled as a key frame, the current frame is determined to be the current key frame, and feature extraction is performed on the lower-layer feature of the current key frame through the second network layer of the neural network to obtain the upper-layer feature of the current key frame.

Here, in the neural network, the network depth of the first network layer is shallower than the network depth of the second network layer.

In an optional example, step 108 may be performed by a processor invoking corresponding instructions stored in a memory, or by a second feature extraction unit run by the processor.
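Steps 102 to 108 can be put together as a loop over the frames of a video. The sketch below is an illustration only: the function `schedule_key_frames`, its callable parameters, and the toy stand-ins at the bottom are hypothetical, and it assumes a non-empty video whose first frame is taken as the initial key frame.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")
Feature = TypeVar("Feature")


def schedule_key_frames(
    frames: Sequence[Frame],
    first_layers: Callable[[Frame], Feature],
    second_layers: Callable[[Feature], Feature],
    scheduler: Callable[[Feature, Feature], float],
    threshold: float = 0.8,
) -> List[int]:
    """Return the indices of the frames scheduled as key frames."""
    key_indices = [0]                       # assumption: frame 0 is the initial key frame
    key_low = first_layers(frames[0])       # buffered lower-layer feature
    key_high = second_layers(key_low)       # would feed semantic segmentation
    for i in range(1, len(frames)):
        cur_low = first_layers(frames[i])   # step 102
        p = scheduler(key_low, cur_low)     # step 104
        if p >= threshold:                  # step 106
            key_low = cur_low               # buffer the new key frame's feature
            key_high = second_layers(key_low)  # step 108
            key_indices.append(i)
    return key_indices


# Toy stand-ins: a "frame" is a number and the scheduler reacts to its change.
frames = [0.0, 0.1, 0.2, 1.5, 1.6, 3.0]
keys = schedule_key_frames(
    frames,
    first_layers=lambda f: f,
    second_layers=lambda low: low * 2,
    scheduler=lambda prev, cur: min(1.0, abs(cur - prev)),
)
```

Note that the expensive `second_layers` call runs only for scheduled frames, while the cheap `first_layers` call runs for every frame.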

In the embodiments of the present application, the neural network includes two or more network layers with different network depths. Among the network layers of the neural network, those used for feature extraction are referred to as feature layers. After the neural network receives a frame, the first feature layer performs feature extraction on the input frame and passes the result to the second feature layer; from the second feature layer onward, each feature layer in turn performs feature extraction on the features it receives and passes the extracted features to the next network layer, until features that can be used for semantic segmentation are obtained. The network depth of the feature layers increases in the order of feature extraction. According to network depth, the feature layers used for feature extraction are divided into two parts, lower feature layers and upper feature layers, that is, the first network layer and the second network layer described above. The feature finally output after the lower feature layers perform feature extraction in sequence is referred to as the lower-layer feature, and the feature finally output after the upper feature layers perform feature extraction in sequence is referred to as the upper-layer feature. Compared with a shallower feature layer in the same neural network, a deeper feature layer has a larger field of view and attends to more spatial structure information, so semantic segmentation based on its features is more accurate; however, the deeper the network, the more difficult and complex the computation. In practice, the feature layers may be divided into lower and upper feature layers according to a preset criterion, for example the amount of computation, and this criterion can be adjusted to actual needs. For example, for a neural network with 100 sequentially connected feature layers, the first 30 feature layers (the number may differ) may be preset as the lower feature layers, and the last 70 feature layers, the 31st to the 100th, as the upper feature layers.

As another example, in a Pyramid Scene Parsing Network (PSPN), the neural network may include four convolutional parts (conv1 to conv4) and one classification layer, each convolutional part itself containing multiple convolutional layers. According to the amount of computation, the convolutional layers from conv1 to conv4_3 are used as the lower feature layers, which account for about 1/8 of the computation of the PSPN, and the convolutional layers from conv4_4 up to, but not including, the classification layer are used as the upper feature layers, which account for about 7/8 of the computation of the PSPN. The classification layer performs semantic segmentation on the upper-layer feature output by the upper feature layers to obtain the semantic label of the frame, that is, the classification of at least one pixel in the frame.
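One way to picture a split "according to the amount of computation" is to walk through the feature layers in order and cut where the accumulated cost would exceed a chosen share. This is a sketch under assumptions: the function `split_by_compute`, the per-layer cost list, and the budget parameter are hypothetical, and the application does not prescribe this procedure.

```python
from fractions import Fraction
from typing import List, Tuple


def split_by_compute(costs: List[int], low_share: Fraction) -> Tuple[List[int], List[int]]:
    """Split layer indices so the first group uses at most `low_share` of the total cost."""
    budget = low_share * sum(costs)
    running, cut = 0, 0
    for i, cost in enumerate(costs):
        if running + cost > budget:
            break
        running += cost
        cut = i + 1
    return list(range(cut)), list(range(cut, len(costs)))


# 100 equal-cost layers with a 30% budget reproduce the 30/70 split in the text.
low, high = split_by_compute([1] * 100, low_share=Fraction(3, 10))
```

With real networks the per-layer costs are unequal, so the cut point generally does not coincide with a fixed fraction of the layer count.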

Because extracting the upper layer feature requires the second network layer, whose network depth is deep, the difficulty and complexity of that computation are high. At the same time, accurately obtaining the semantic tag of a frame requires semantic segmentation based on the upper layer feature of the frame. Therefore, in the embodiments of the present application, upper layer feature extraction for semantic segmentation is performed only on key frames. Compared with performing upper layer feature extraction frame by frame on the video, this reduces the difficulty and complexity of the computation while still obtaining a semantic segmentation result for the video.
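The saving can be illustrated with a small arithmetic sketch. It assumes the approximate 1/8 and 7/8 shares quoted above for the PSPN example, and it deliberately ignores the cost of the key frame scheduling network and of whatever processing non-key frames receive; the function name `relative_cost` and the key frame ratios are hypothetical and are not taken from the application.

```python
from fractions import Fraction

# Approximate shares of the total PSPN computation quoted in the text.
LOWER_SHARE = Fraction(1, 8)  # conv1 to conv4_3 (first network layer)
UPPER_SHARE = Fraction(7, 8)  # conv4_4 up to the classification layer (second network layer)


def relative_cost(key_frame_ratio: Fraction) -> Fraction:
    """Average per-frame cost relative to running the full network on every frame.

    Every frame pays for the lower layers; only key frames pay for the upper layers.
    """
    if not 0 <= key_frame_ratio <= 1:
        raise ValueError("key_frame_ratio must lie in [0, 1]")
    return LOWER_SHARE + key_frame_ratio * UPPER_SHARE


if __name__ == "__main__":
    # Illustrative ratios only: all frames, one in four, one in ten.
    for ratio in (Fraction(1), Fraction(1, 4), Fraction(1, 10)):
        print(f"key frame ratio {ratio}: relative cost {relative_cost(ratio)}")
```

Under these assumptions the cost falls linearly with the share of frames scheduled as key frames and is bounded below by the 1/8 spent on lower layer feature extraction for every frame.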

According to the key frame scheduling method provided by the above embodiment of the present application, feature extraction is performed on the current frame to obtain the lower layer feature of the current frame; the scheduling probability value of the current frame is obtained according to the lower layer feature of the adjacent previous key frame and the lower layer feature of the current frame; whether the current frame is scheduled as a key frame is determined according to the scheduling probability value of the current frame; and if the current frame is determined to be scheduled as a key frame, feature extraction is performed on the lower layer feature of the current key frame to obtain the upper layer feature of the current key frame. From the lower layer feature of the previous key frame and the lower layer feature of the current frame, the embodiments of the present application can obtain the change of the current frame relative to the lower layer feature of the previous key frame. By using the changes in lower layer features between different frames of a video, key frame scheduling can be performed quickly, accurately, and adaptively, which improves key frame scheduling efficiency.

In addition, in another embodiment of the key frame scheduling method of the present application, before the embodiment shown in Fig. 1, the method may further include:

determining an initial key frame, for example, designating the first frame or any other frame of the video as the initial key frame;

performing feature extraction on the initial key frame through the first network layer to obtain and buffer the lower layer feature of the initial key frame, where whether another frame is scheduled as a key frame may subsequently be determined based on the lower layer feature of this key frame (see step 102 above); and

performing feature extraction on the lower layer feature of the initial key frame through the second network layer to obtain the upper layer feature of the initial key frame for use in semantic segmentation.
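The initialization steps above can be sketched as follows. This is a minimal illustration and not the applicant's implementation: `KeyFrameState`, `init_key_frame`, and the two callables standing in for the first and second network layers are hypothetical names, and features are represented as plain lists of floats for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Feature = List[float]  # simplified stand-in for a feature map


@dataclass
class KeyFrameState:
    index: int      # position of the key frame in the video
    lower: Feature  # buffered lower layer feature, reused for later scheduling decisions
    upper: Feature  # upper layer feature, used for semantic segmentation


def init_key_frame(
    frames: Sequence[Feature],
    first_layers: Callable[[Feature], Feature],
    second_layers: Callable[[Feature], Feature],
    index: int = 0,
) -> KeyFrameState:
    """Designate frames[index] as the initial key frame and extract both feature levels."""
    lower = first_layers(frames[index])  # shallow, cheap part of the network
    upper = second_layers(lower)         # deep, expensive part of the network
    return KeyFrameState(index=index, lower=lower, upper=upper)
```

Choosing `index=0` corresponds to designating the first frame; any other index matches the "any other frame" option in the text.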

Optionally, in yet another embodiment of the key frame scheduling method of the present application, the method may further include performing semantic segmentation on the initial key frame to output the semantic tag of the key frame.

In addition, in yet another embodiment of the key frame scheduling method provided by the embodiments of the present application, after it is determined that the current frame is scheduled as a key frame, the method may further include: referring to the current frame as the current key frame, and buffering the lower layer feature of the current key frame so that it can be used to determine whether other frames after the current key frame in the video are scheduled as key frames.

In addition, in yet another embodiment of the key frame scheduling method provided by the embodiments of the present application, after it is determined that the current frame is scheduled as a key frame, the method may further include: referring to the current frame as the current key frame, and performing semantic segmentation on the current key frame to output the semantic tag of the current key frame. In the embodiments of the present application, for a key frame, a computationally expensive single-frame model, for example, a PSPN, may be invoked to perform semantic segmentation, so that a high-precision semantic segmentation result is obtained. In the embodiments of the present application, the key frame and the current frame may share the lower layer feature layers (i.e., the first network layer) of the neural network for lower layer feature extraction. The neural network may be a Pyramid Scene Parsing Network (PSPN), which may include four convolutional network parts (conv1 to conv4) and one classification layer, and each convolutional network part may in turn be divided into a plurality of convolutional layers. The lower layer feature layers of the neural network may include the convolutional layers from conv1 to conv4_3 of the PSPN, which account for about 1/8 of the computation of the PSPN. The upper layer feature layers of the neural network (i.e., the second network layer) may include at least one convolutional layer from conv4_4 up to, but not including, the classification layer, which account for about 7/8 of the computation of the PSPN and are used to extract the upper layer feature of key frames. The classification layer identifies the category of at least one pixel in the key frame based on the upper layer feature of the key frame, thereby implementing semantic segmentation of the key frame.

Fig. 2 is another schematic flowchart of the key frame scheduling method provided by an embodiment of the present application. As shown in Fig. 2, the key frame scheduling method of this embodiment includes the following steps.

In step 202, feature extraction is performed on the current frame through the first network layer of the neural network to obtain the lower layer feature of the current frame.

In an example of the embodiments of the present application, feature extraction may be performed on the current frame through the lower layer feature layers of the neural network to obtain the lower layer feature of the current frame.

In an optional example, step 202 may be executed by a processor invoking corresponding instructions stored in a memory, or may be executed by a first feature extraction unit run by the processor.

In step 204, the scheduling probability value of the current frame is obtained according to the lower layer feature of the previous key frame adjacent to the current frame and the lower layer feature of the current frame.

In an optional example, step 204 may be executed by a processor invoking corresponding instructions stored in a memory, or may be executed by a scheduling unit run by the processor.

In step 206, whether the current frame is scheduled as a key frame is determined according to the scheduling probability value of the current frame.

If it is determined that the current frame is scheduled as a key frame, the current frame is determined to be the current key frame and step 208 is executed. Otherwise, if it is determined that the current frame is not scheduled as a key frame, the subsequent process of this embodiment is not executed.

In the course of implementing the present application, the applicant found through research that the larger the difference between the lower layer features of two frames (defined as the difference value between the lower layer features of the two frames), the larger the difference value of the corresponding semantic tags (defined as the proportion of non-matching portions in the semantic tags of the two frames). The embodiments of the present application therefore determine whether the current frame is scheduled as a key frame from the difference between the lower layer feature of the previous key frame adjacent to the current frame and the lower layer feature of the current frame. When the difference between the lower layer features of the two frames is greater than the preset threshold, the current frame may be set as a key frame (i.e., scheduled as a key frame) in order to obtain an accurate semantic result.
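The threshold rule in the preceding paragraph can be sketched as below. The text does not specify how the difference value between lower layer features is computed, so the mean absolute difference used here is purely an assumption, as are the function names and the flat-list representation of features.

```python
from typing import Sequence


def feature_difference(prev_key_lower: Sequence[float], cur_lower: Sequence[float]) -> float:
    """Mean absolute difference between two equally sized feature vectors.

    Assumed metric for illustration only; the source does not name one.
    """
    if len(prev_key_lower) != len(cur_lower) or not cur_lower:
        raise ValueError("features must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(prev_key_lower, cur_lower)) / len(cur_lower)


def should_schedule(
    prev_key_lower: Sequence[float], cur_lower: Sequence[float], threshold: float
) -> bool:
    """Schedule the current frame as a key frame when the difference exceeds the threshold."""
    return feature_difference(prev_key_lower, cur_lower) > threshold
```

A larger threshold schedules fewer key frames and saves more computation, at the risk of letting the semantic tags drift further from those of the last key frame.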

In an optional example, step 206 may be executed by a processor invoking corresponding instructions stored in a memory, or may be executed by a determination unit run by the processor.

In step 208, feature extraction is performed on the lower layer feature of the current key frame through the second network layer of the neural network to obtain the upper layer feature of the current key frame, and the lower layer feature of the current key frame is buffered.

In an optional example, step 208 may be executed by a processor invoking corresponding instructions stored in a memory, or may be executed by a second feature extraction unit and a buffering unit run by the processor.

In step 210, semantic segmentation is performed on the current key frame to output the semantic tag of the current key frame.

In an optional example, step 210 may be executed by a processor invoking corresponding instructions stored in a memory, or may be executed by a semantic segmentation unit run by the processor.
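Steps 202 to 210 can be put together in a short loop, sketched here under several assumptions: the first frame is treated as the initial key frame, a frame is scheduled when its scheduling probability value exceeds a threshold, and the four callables (`first_layers`, `second_layers`, `scheduler`, `segment`) are hypothetical stand-ins for the network parts described in the text.

```python
from typing import Any, Callable, Dict, Optional, Sequence


def process_video(
    frames: Sequence[Any],
    first_layers: Callable[[Any], Any],
    second_layers: Callable[[Any], Any],
    scheduler: Callable[[Any, Any], float],
    segment: Callable[[Any], Any],
    threshold: float = 0.5,
) -> Dict[int, Any]:
    """Return {frame index: semantic tag} for the frames scheduled as key frames."""
    tags: Dict[int, Any] = {}
    key_lower: Optional[Any] = None  # buffered lower layer feature of the previous key frame
    for i, frame in enumerate(frames):
        lower = first_layers(frame)                # step 202
        if key_lower is None:
            is_key = True                          # initial key frame (assumed: first frame)
        else:
            prob = scheduler(key_lower, lower)     # step 204
            is_key = prob > threshold              # step 206 (threshold rule is an assumption)
        if is_key:
            upper = second_layers(lower)           # step 208: upper layer feature
            key_lower = lower                      # step 208: buffer lower layer feature
            tags[i] = segment(upper)               # step 210
    return tags
```

Non-key frames only pass through `first_layers` and `scheduler` in this sketch; how their semantic tags are produced is outside the steps shown in Fig. 2 and is not modelled here.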

In the course of implementing the present application, the applicant found through research that when the lower layer features change greatly between frames of a video, the jitter between the semantic tags obtained by performing semantic segmentation on those frames is large, and otherwise the jitter is small. In the embodiments of the present application, a deep learning method may be used to obtain feature information of at least one frame of the video. The change in lower layer features is determined from the difference between the lower layer feature of the previous key frame adjacent to the current frame and the lower layer feature of the current frame, and the jitter between frames of the video is analyzed by computing the degree of agreement between the lower layer features of the current frame and the adjacent previous key frame. A large change in lower layer features implies large jitter in the tags, and a small change implies small jitter. Regressing the degree of jitter of the semantic tags from the lower layer features therefore allows key frames to be scheduled adaptively.
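The tag jitter discussed above can be made concrete using the definition given earlier in this section, namely the proportion of non-matching portions in the semantic tags of two frames. The sketch below computes that proportion for two label maps stored as nested lists; the function name `tag_deviation` is hypothetical, and whether this exact quantity is the regression target of the scheduling network is not stated in this section.

```python
from typing import Sequence


def tag_deviation(tags_a: Sequence[Sequence[int]], tags_b: Sequence[Sequence[int]]) -> float:
    """Proportion of pixels whose semantic tags differ between two same-shaped label maps."""
    if len(tags_a) != len(tags_b):
        raise ValueError("label maps must have the same number of rows")
    total = 0
    mismatched = 0
    for row_a, row_b in zip(tags_a, tags_b):
        if len(row_a) != len(row_b):
            raise ValueError("label maps must have the same number of columns")
        for a, b in zip(row_a, row_b):
            total += 1
            if a != b:
                mismatched += 1
    if total == 0:
        raise ValueError("label maps must not be empty")
    return mismatched / total
```

A value of 0.0 means the two frames are labelled identically, and a value of 1.0 means every pixel changed category.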

In an optional example of any of the above embodiments of the present application, step 104 or step 204 may include:

splicing (concatenating) the lower layer feature of the previous key frame and the lower layer feature of the current frame to obtain a spliced feature; and

obtaining and outputting, through a key frame scheduling network, the scheduling probability value of the current frame based on the spliced feature.
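The two sub-steps above can be sketched as follows. The splicing is shown as channel-wise concatenation, which is an assumption about the axis used. The structure of the key frame scheduling network is not described in this section, so `scheduling_probability` is only a hypothetical placeholder (global average pooling followed by a linear layer and a sigmoid) that shows the input and output shapes, not the network of the application.

```python
import math
from typing import List, Sequence

Channel = List[float]  # one feature channel, flattened over spatial positions


def splice(prev_key_lower: Sequence[Channel], cur_lower: Sequence[Channel]) -> List[Channel]:
    """Concatenate two feature maps along the channel axis (assumed axis)."""
    sizes = {len(ch) for ch in list(prev_key_lower) + list(cur_lower)}
    if len(sizes) != 1:
        raise ValueError("all channels must have the same spatial size")
    return list(prev_key_lower) + list(cur_lower)


def scheduling_probability(
    spliced: Sequence[Channel], weights: Sequence[float], bias: float = 0.0
) -> float:
    """Placeholder scheduling network: average-pool each channel, then linear + sigmoid."""
    if len(weights) != len(spliced):
        raise ValueError("one weight per spliced channel is required")
    pooled = [sum(ch) / len(ch) for ch in spliced]
    logit = sum(w * p for w, p in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))
```

The sigmoid keeps the output in the open interval (0, 1), so it can be read as the probability that the current frame is scheduled as a key frame.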

The embodiments of the present application may be used in autonomous driving scenes, video surveillance scenes, Internet entertainment products such as portrait segmentation, and the like, for example:

1. In an autonomous driving scene, the embodiments of the present application can be used to quickly segment targets in the video, such as people and vehicles.

2. In a video surveillance scene, people can be segmented quickly.

3. In Internet entertainment products such as portrait segmentation, people can be segmented quickly from video frames.

Those of ordinary skill in the art will understand that all or some of the steps for implementing the above method embodiments may be completed by hardware related to program instructions. The aforementioned program may be stored in a computer-readable storage medium, and when the program is executed, the steps of the above method embodiments are performed. The aforementioned storage medium includes various media capable of storing program code, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Fig. 3 is a schematic structural diagram of the key frame scheduling apparatus provided by an embodiment of the present application. The key frame scheduling apparatus provided by the embodiments of the present application may be used to implement the key frame scheduling method provided by each of the above embodiments. As shown in Fig. 3, an embodiment of the key frame scheduling apparatus may include a first feature extraction unit, a scheduling unit, a determination unit, and a second feature extraction unit, where:

The first feature extraction unit is configured to perform feature extraction on the current frame to obtain the lower layer feature of the current frame, and includes the first network layer of the neural network.

The scheduling unit is configured to obtain the scheduling probability value of the current frame according to the lower layer feature of the previous key frame adjacent to the current frame and the lower layer feature of the current frame. The lower layer feature of the previous key frame is obtained by the first network layer performing feature extraction on the previous key frame. Optionally, the scheduling probability value in the embodiments of the present application is the probability that the current frame is scheduled as a key frame.

The determination unit is configured to determine whether the current frame is scheduled as a key frame according to the scheduling probability value of the current frame.

The second feature extraction unit includes the second network layer of the neural network and is configured to, if the determination result of the determination unit is that the current frame is scheduled as a key frame, determine the current frame to be the current key frame and perform feature extraction on the lower layer feature of the current key frame to obtain the upper layer feature of the current key frame. In the neural network, the network depth of the first network layer is shallower than the network depth of the second network layer.
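The way the four units of Fig. 3 feed one another can be sketched as a small class. This is only an illustration of the data flow described above: the class name `KeyFrameSchedulingApparatus`, its field names, and the `run` method are hypothetical, and each unit is represented by an arbitrary callable rather than an actual network.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple


@dataclass
class KeyFrameSchedulingApparatus:
    first_feature_extraction: Callable[[Any], Any]   # wraps the first network layer
    scheduling: Callable[[Any, Any], float]          # returns the scheduling probability value
    determination: Callable[[float], bool]           # probability -> key frame or not
    second_feature_extraction: Callable[[Any], Any]  # wraps the second network layer

    def run(self, prev_key_lower: Any, frame: Any) -> Tuple[bool, Any, Optional[Any]]:
        """Return (is_key, lower, upper); upper is None when the frame is not a key frame."""
        lower = self.first_feature_extraction(frame)
        probability = self.scheduling(prev_key_lower, lower)
        is_key = self.determination(probability)
        upper = self.second_feature_extraction(lower) if is_key else None
        return is_key, lower, upper
```

Keeping the determination unit separate from the scheduling unit, as the text does, lets the decision rule (for example, a threshold) change without touching the network that produces the probability.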

With the key frame scheduling apparatus provided by the above embodiment of the present application, feature extraction is performed on the current frame to obtain the lower layer feature of the current frame; the scheduling probability value of the current frame is obtained according to the lower layer feature of the adjacent previous key frame and the lower layer feature of the current frame; whether the current frame is scheduled as a key frame is determined according to the scheduling probability value of the current frame; and if the current frame is determined to be scheduled as a key frame, feature extraction is performed on the lower layer feature of the current key frame to obtain the upper layer feature of the current key frame. From the lower layer feature of the previous key frame and the lower layer feature of the current frame, the embodiments of the present application can obtain the change of the current frame relative to the lower layer feature of the previous key frame. By using the changes in lower layer features between different frames of a video, key frame scheduling can be performed quickly, accurately, and adaptively, which improves key frame scheduling efficiency.

In an optional implementation of the key frame scheduling apparatus provided by the embodiments of the present application, the previous key frame includes a predetermined initial key frame.

Fig. 4 is another schematic structural diagram of the key frame scheduling apparatus provided by an embodiment of the present application. As shown in Fig. 4, compared with the embodiment shown in Fig. 3, the key frame scheduling apparatus of this embodiment further includes a buffering unit configured to buffer the lower layer feature of a key frame, where the key frame in the embodiments of the present application includes the initial key frame.

In addition, in yet another embodiment of the key frame scheduling apparatus provided by the embodiments of the present application, the first feature extraction unit may further be configured to buffer the lower layer feature of the current key frame in the buffering unit according to the determination result of the determination unit.

In an implementation of the key frame scheduling apparatus provided by the embodiments of the present application, the scheduling unit may include: a splicing subunit configured to splice the lower layer feature of the previous key frame and the lower layer feature of the current frame to obtain a spliced feature; and a key frame scheduling network configured to obtain the scheduling probability value of the current frame based on the spliced feature.

In addition, referring again to Fig. 4, the key frame scheduling apparatus provided by the embodiments of the present application may further include a semantic segmentation unit configured to perform semantic segmentation on a key frame to output the semantic tag of the key frame, where the key frame in the embodiments of the present application may include the initial key frame, the previous key frame, or the current key frame.

In addition, an embodiment of the present application further provides an electronic device that includes the key frame scheduling apparatus of any of the above embodiments of the present application.

In addition, an embodiment of the present application further provides another electronic device, which includes:

a processor and the key frame scheduling apparatus of any of the above embodiments of the present application;

where, when the processor runs the key frame scheduling apparatus, the units in the key frame scheduling apparatus of any of the above embodiments of the present application are run.

In addition, an embodiment of the present application further provides yet another electronic device, which includes a processor and a memory;

the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform the operations of each step in the key frame scheduling method of any of the above embodiments of the present application.

An embodiment of the present application further provides an electronic device, which may be, for example, a mobile terminal, a personal computer (PC), a tablet computer, or a server. Fig. 5 is a schematic structural diagram of an application embodiment of the electronic device provided by an embodiment of the present application, showing an electronic device 500 suitable for implementing a terminal device or a server of the embodiments of the present application. As shown in Fig. 5, the electronic device 500 includes one or more processors, a communication unit, and the like. The one or more processors are, for example, at least one of one or more central processing units (CPUs) 501 and one or more graphics processing units (GPUs) 513. The processor may perform various appropriate actions and processing according to executable instructions stored in a read-only memory (ROM) 502 or executable instructions loaded from a storage portion 508 into a random access memory (RAM) 503. The communication unit 512 may include, but is not limited to, a network card, and the network card may include, but is not limited to, an InfiniBand (IB) network card.

The processor may communicate with at least one of the read-only memory 502 and the random access memory 503 to execute the executable instructions, is connected to the communication unit 512 through a bus 504, and communicates with other target devices through the communication unit 512, thereby completing the operations corresponding to any method provided by the embodiments of the present application, for example: performing feature extraction on the current frame through the first network layer of the neural network to obtain the lower layer feature of the current frame; obtaining the scheduling probability value of the current frame according to the lower layer feature of the previous key frame adjacent to the current frame and the lower layer feature of the current frame, where the lower layer feature of the previous key frame is obtained by the first network layer performing feature extraction on the previous key frame; determining whether the current frame is scheduled as a key frame according to the scheduling probability value of the current frame; and, if it is determined that the current frame is scheduled as a key frame, determining the current frame to be the current key frame and performing feature extraction on the lower layer feature of the current key frame through the second network layer of the neural network to obtain the upper layer feature of the current key frame, where, in the neural network, the network depth of the first network layer is shallower than the network depth of the second network layer.

In addition, the RAM 503 may further store various programs and data required for the operation of the apparatus. The CPU 501, the ROM 502, and the RAM 503 are connected to one another through the communication bus 504. Where the RAM 503 is present, the ROM 502 is an optional module. The RAM 503 stores executable instructions, or executable instructions are written into the ROM 502 at runtime, and the executable instructions cause the CPU 501 to perform the operations corresponding to the foregoing communication method. An input/output (I/O) interface 505 is also connected to the bus 504. The communication unit 512 may be integrated, or may be configured to have a plurality of submodules (for example, a plurality of IB network cards) connected to the bus.

The following components are connected to the I/O interface 505: an input portion 506 including a keyboard, a mouse, and the like; an output portion 507 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 508 including a hard disk and the like; and a communication portion 509 including a network interface card such as a LAN card or a modem. The communication portion 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 510 as needed, so that a computer program read from it can be installed into the storage portion 508 as needed.

It should be noted that the architecture shown in FIG. 5 is only one optional implementation. In practice, the number and types of the components in FIG. 5 may be selected, removed, reduced, increased, or replaced according to actual needs. Different functional components may be installed separately or in an integrated manner; for example, the GPU 513 and the CPU 501 may be installed separately, or the GPU 513 may be integrated into the CPU 501, and the communication unit may be installed separately from, or integrated into, the CPU 501 or the GPU 513. All such alternative embodiments fall within the scope of protection of the present application.

In particular, the process described with reference to the flowchart according to the embodiments of the present application may be implemented as a computer software program. For example, an embodiment of the present application includes a computer program product tangibly embodied in a computer-readable medium. The computer program includes program code for executing the method shown in the flowchart, and the program code may include instructions corresponding to the method steps provided by the embodiments of the present application, for example: performing feature extraction on a current frame through a first network layer of a neural network to obtain a lower layer feature of the current frame; obtaining a scheduling probability value of the current frame according to a lower layer feature of a previous key frame adjacent to the current frame and the lower layer feature of the current frame, where the lower layer feature of the previous key frame is obtained by performing feature extraction on the previous key frame through the first network layer; determining whether the current frame is scheduled as a key frame according to the scheduling probability value of the current frame; and, when it is determined that the current frame is scheduled as a key frame, determining the current frame as a current key frame and performing feature extraction on the lower layer feature of the current key frame through a second network layer of the neural network to obtain a higher layer feature of the current key frame, where, in the neural network, the network depth of the first network layer is shallower than the network depth of the second network layer. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 509, installed from the removable medium 511, or both. When the computer program is executed by the central processing unit (CPU) 501, it performs the functions defined in the method of the present application.
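The program steps listed above can be sketched as a per-frame loop. The following is a minimal illustration under stated assumptions, not the implementation of the present application: the functions `low_level`, `high_level`, and `scheduling_probability` are hypothetical stand-ins for the first network layer, the second network layer, and the scheduling computation, the first frame is assumed to be the initial key frame, and the fixed `threshold` used for the decision is likewise an assumption.

```python
import math
from typing import List, Sequence, Tuple

Feature = List[float]


def low_level(frame: Sequence[float]) -> Feature:
    """Hypothetical stand-in for the shallow first network layer."""
    return [0.5 * x for x in frame]


def high_level(low: Feature) -> Feature:
    """Hypothetical stand-in for the deeper second network layer."""
    return [math.tanh(x) for x in low]


def scheduling_probability(prev_key_low: Feature, cur_low: Feature) -> float:
    """Toy score (assumption): a larger change in lower layer features
    relative to the previous key frame gives a higher probability."""
    diff = sum(abs(a - b) for a, b in zip(prev_key_low, cur_low)) / len(cur_low)
    return 1.0 - math.exp(-diff)


def schedule(frames: Sequence[Sequence[float]],
             threshold: float = 0.5) -> Tuple[List[bool], int]:
    """Return per-frame key frame flags and the number of deep passes."""
    key_low = low_level(frames[0])   # assumed initial key frame
    high_level(key_low)              # deep pass runs only on key frames
    flags, deep_calls = [True], 1
    for frame in frames[1:]:
        cur_low = low_level(frame)   # shallow pass runs on every frame
        p = scheduling_probability(key_low, cur_low)
        if p > threshold:            # scheduled as a key frame
            key_low = cur_low        # becomes the new previous key frame
            high_level(key_low)
            deep_calls += 1
            flags.append(True)
        else:
            flags.append(False)
    return flags, deep_calls


frames = [[0.0, 0.0], [0.1, 0.0], [4.0, 4.0], [4.1, 4.0]]
flags, deep_calls = schedule(frames)
print(flags, deep_calls)
```

In this toy run only the frames whose lower layer features differ strongly from the last key frame trigger the deeper pass, which is the source of the efficiency gain the description attributes to scheduling on lower layer features.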

In addition, the embodiments of the present application further provide a computer storage medium for storing computer-readable instructions which, when executed, implement the operations of the key frame scheduling method according to any of the above embodiments of the present application.

In addition, the embodiments of the present application further provide a computer program including computer-readable instructions. When the computer-readable instructions are run on a device, a processor in the device executes executable instructions for implementing the steps of the key frame scheduling method according to any of the above embodiments of the present application.

In one optional embodiment, the computer program is specifically a software product such as a software development kit (SDK).

In one or more optional embodiments, the embodiments of the present application further provide a program product for storing computer-readable instructions which, when executed, cause a computer to execute the key frame scheduling method according to any of the above possible implementations.

The computer program product may be implemented by hardware, software, or a combination thereof. In one optional example, the computer program product is embodied as a computer storage medium; in another optional example, the computer program product is embodied as a software product such as an SDK.

In this specification, the embodiments are described progressively, each focusing on its differences from the others, and identical or similar parts of the embodiments may be cross-referenced. Since the system embodiments substantially correspond to the method embodiments, their description is relatively brief; for related parts, refer to the description of the method embodiments.

The method and device of the present application may be implemented in many ways. For example, they may be implemented in software, hardware, firmware, or any combination of software, hardware, and firmware. Unless specifically stated otherwise, the order of the steps of the method is given for purposes of description only and is not intended to limit the steps of the method of the present application. Further, in some embodiments, the present application may be implemented as programs recorded on a recording medium, the programs including machine-readable instructions for implementing the method of the present application. Accordingly, the present application also covers a recording medium storing a program for executing the method according to the present application.

The description of the present application is provided for purposes of illustration and description and is not intended to be exhaustive or to limit the present application to the disclosed form. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were selected and described to better explain the principles and practical applications of the present application, and to enable those of ordinary skill in the art to understand the present application and design various embodiments with various modifications suited to a particular use.

Claims

A key frame scheduling method, comprising:
performing feature extraction on a current frame through a first network layer of a neural network to obtain a lower layer feature of the current frame;
obtaining a scheduling probability value of the current frame according to a lower layer feature of a previous key frame adjacent to the current frame and the lower layer feature of the current frame, wherein the lower layer feature of the previous key frame is obtained by performing feature extraction on the previous key frame through the first network layer, and the scheduling probability value is the probability that the current frame is scheduled as a key frame;
determining whether the current frame is scheduled as a key frame according to the scheduling probability value of the current frame; and
when it is determined that the current frame is scheduled as a key frame, determining the current frame as a current key frame, and performing feature extraction on the lower layer feature of the current key frame through a second network layer of the neural network to obtain a higher layer feature of the current key frame, wherein, in the neural network, a network depth of the first network layer is shallower than a network depth of the second network layer.

The method of claim 1, further comprising:
determining an initial key frame;
performing feature extraction on the initial key frame through the first network layer to obtain and buffer a lower layer feature of the initial key frame; and
performing feature extraction on the lower layer feature of the initial key frame through the second network layer to obtain a higher layer feature of the initial key frame.
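The initialization recited in this claim can be illustrated with a small sketch. It is only an assumption-laden example: `KeyFrameBuffer`, `first_layer`, `second_layer`, and `init_key_frame` are hypothetical names, the two layer functions are toy stand-ins for the network layers, and taking the first frame as the initial key frame is an assumption rather than something the claim requires.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class KeyFrameBuffer:
    """Holds the lower layer feature of the most recent key frame."""
    index: Optional[int] = None
    low_feature: Optional[List[float]] = None

    def store(self, index: int, low_feature: Sequence[float]) -> None:
        self.index = index
        self.low_feature = list(low_feature)


def first_layer(frame: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for the shallow first network layer."""
    return [0.5 * x for x in frame]


def second_layer(low: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for the deeper second network layer."""
    return [max(0.0, x - 1.0) for x in low]


def init_key_frame(frames: Sequence[Sequence[float]],
                   buffer: KeyFrameBuffer,
                   initial_index: int = 0) -> List[float]:
    """Extract and buffer the lower layer feature of the initial key
    frame, then return its higher layer feature."""
    low = first_layer(frames[initial_index])
    buffer.store(initial_index, low)   # reused when scoring later frames
    return second_layer(low)


buffer = KeyFrameBuffer()
high = init_key_frame([[4.0, 1.0], [4.2, 1.1]], buffer)
print(buffer, high)
```

Buffering the lower layer feature means later frames can be compared against the key frame without running the first network layer on the key frame again.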

The method of claim 2, further comprising:
performing semantic segmentation on the initial key frame and outputting a semantic label of the initial key frame.

The method according to any one of claims 1 to 3, further comprising, after determining that the current frame is scheduled as a key frame:
buffering the lower layer feature of the current key frame.

The method according to any one of claims 1 to 4, wherein obtaining the scheduling probability value of the current frame according to the lower layer feature of the previous key frame adjacent to the current frame and the lower layer feature of the current frame comprises:
splicing the lower layer feature of the previous key frame and the lower layer feature of the current frame to obtain a spliced feature; and
obtaining, through a key frame scheduling network, the scheduling probability value of the current frame based on the spliced feature.
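The splicing step and the key frame scheduling network recited in this claim can be illustrated as follows. This is a hedged sketch only: splicing is assumed here to mean concatenation of the two feature vectors, and the scheduling network is reduced to a single hypothetical linear layer with a sigmoid, with hand-picked `weights` and `bias`. The claim does not specify the architecture of the key frame scheduling network.

```python
import math
from typing import List, Sequence


def splice(prev_key_low: Sequence[float], cur_low: Sequence[float]) -> List[float]:
    """Concatenate the two lower layer features (assumed meaning of splicing)."""
    return list(prev_key_low) + list(cur_low)


def scheduling_network(spliced: Sequence[float],
                       weights: Sequence[float],
                       bias: float) -> float:
    """Toy scheduling network: one linear layer followed by a sigmoid,
    so the output lies in (0, 1) and can be read as a probability."""
    if len(spliced) != len(weights):
        raise ValueError("weights must match the spliced feature length")
    logit = sum(w * x for w, x in zip(weights, spliced)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


prev_key_low = [1.0, 2.0]
cur_low = [1.0, 2.0]
weights = [1.0, 1.0, -1.0, -1.0]   # hypothetical, not trained values
spliced = splice(prev_key_low, cur_low)
p = scheduling_network(spliced, weights, bias=0.0)
print(spliced, p)
```

In a real system the weights would be learned so that the output tracks how much the frame content has changed since the previous key frame; the values above are placeholders chosen only to make the arithmetic easy to follow.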

The method according to any one of claims 1 to 5, further comprising:
performing semantic segmentation on the current key frame and outputting a semantic label of the current key frame.

A key frame scheduling device, comprising:
a first feature extraction unit, including a first network layer of a neural network, for performing feature extraction on a current frame to obtain a lower layer feature of the current frame;
a scheduling unit for obtaining a scheduling probability value of the current frame according to a lower layer feature of a previous key frame adjacent to the current frame and the lower layer feature of the current frame, wherein the lower layer feature of the previous key frame is obtained by performing feature extraction on the previous key frame through the first network layer, and the scheduling probability value is the probability that the current frame is scheduled as a key frame;
a determining unit for determining whether the current frame is scheduled as a key frame according to the scheduling probability value of the current frame; and
a second feature extraction unit, including a second network layer of the neural network, for determining the current frame as a current key frame when it is determined, according to the determination result of the determining unit, that the current frame is scheduled as a key frame, and for performing feature extraction on the lower layer feature of the current key frame to obtain a higher layer feature of the current key frame, wherein, in the neural network, a network depth of the first network layer is shallower than a network depth of the second network layer.

The device of claim 7, wherein the previous key frame includes a predetermined initial key frame, and the device further comprises:
a buffering unit for buffering lower layer features of key frames, wherein the key frames include the initial key frame.

The device of claim 8, wherein the first feature extraction unit is further configured to buffer the lower layer feature of the current key frame in the buffering unit according to the determination result of the determining unit.

The device according to any one of claims 7 to 9, wherein the scheduling unit comprises:
a splicing subunit for splicing the lower layer feature of the previous key frame and the lower layer feature of the current frame to obtain a spliced feature; and
a key frame scheduling network for obtaining the scheduling probability value of the current frame based on the spliced feature.

The device according to any one of claims 7 to 10, further comprising:
a semantic segmentation unit for performing semantic segmentation on a key frame and outputting a semantic label of the key frame, wherein the key frame includes an initial key frame, the previous key frame, or the current key frame.

An electronic device, comprising the key frame scheduling device according to any one of claims 7 to 11.

An electronic device, comprising:
a processor and the key frame scheduling device according to any one of claims 7 to 11;
wherein, when the processor runs the key frame scheduling device, the units of the key frame scheduling device according to any one of claims 7 to 11 are run.

An electronic device, comprising a processor and a memory,
wherein the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to execute the operations of each step of the key frame scheduling method according to any one of claims 1 to 6.

A computer program comprising computer-readable code,
wherein, when the computer-readable code is run on a device, a processor in the device executes instructions for implementing each step of the key frame scheduling method according to any one of claims 1 to 6.

A computer-readable medium for storing computer-readable instructions, wherein, when the instructions are executed, the operations of each step of the key frame scheduling method according to any one of claims 1 to 6 are implemented.