KR20210087494A

KR20210087494A - Human body orientation detection method, apparatus, electronic device and computer storage medium

Info

Publication number: KR20210087494A
Application number: KR1020217016720A
Authority: KR
Inventors: Xiao Li; Jingwei Xu; Guangliang Cheng
Original assignee: Shanghai SenseTime Intelligent Technology Co., Ltd.
Priority date: 2019-11-20
Filing date: 2020-09-08
Publication date: 2021-07-12
Also published as: CN112825145A; JP2022510963A; CN112825145B; WO2021098346A1

Abstract

This embodiment discloses a human body orientation detection method, an apparatus, an electronic device, and a computer storage medium. The method comprises: performing feature extraction on an image to be processed to obtain features of the image to be processed; determining human body key points and an initial body orientation based on the features of the image to be processed; and determining a final body orientation according to the determined human body key points and the initial body orientation. In the embodiments of the present invention, the final body orientation is thus obtained by jointly considering the human body key points and the initial body orientation, so the key points can be used to improve the accuracy and usability of the final body orientation.

Description

Human body orientation detection method, apparatus, electronic device and computer storage medium

Cross-reference to related applications

The present application is filed on the basis of, and claims priority to, Chinese patent application No. 201911143057.6, filed on November 20, 2019, the entire contents of which are incorporated herein by reference.

The present invention relates to computer vision processing technology, and in particular to a human body orientation detection method, apparatus, electronic device, and computer storage medium.

With the development of computer vision processing technology, pedestrian orientation detection has become an increasingly important research topic in the field of computer vision. In the related art, a solution for pedestrian orientation detection may process an image obtained from a camera to predict the orientation of the body and/or face of each person in the image; however, the accuracy and usability of the pedestrian orientation detected in this way cannot be guaranteed.

Embodiments of the present invention aim to provide a technical solution for human body orientation detection.

An embodiment of the present invention provides a human body orientation detection method, comprising:

performing feature extraction on an image to be processed to obtain features of the image to be processed;

determining human body key points and an initial body orientation based on the features of the image to be processed; and

determining a final body orientation according to the determined human body key points and the initial body orientation.

In some embodiments, determining the final body orientation according to the determined human body key points and the initial body orientation comprises:

in response to the body orientation characterized by the determined human body key points being consistent with the initial body orientation, determining the initial body orientation as the final body orientation. In this way, the final body orientation can be obtained accurately.

In some embodiments, determining the final body orientation according to the determined human body key points and the initial body orientation comprises:

in response to the body orientation characterized by the determined human body key points being inconsistent with the initial body orientation, determining the body orientation characterized by the determined human body key points as the final body orientation. When the two are inconsistent, the accuracy of the initial body orientation can be regarded as relatively low, so taking the body orientation characterized by the human body key points as the final body orientation improves the accuracy of the final body orientation.
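The two cases above amount to a small decision rule, which can be sketched as follows. This is only an illustration under stated assumptions: orientations are represented as angles in degrees, and "consistent" is taken to mean that the angular difference is within a tolerance. The function names and the `tolerance_deg` parameter are hypothetical; the text does not specify how consistency is tested.

```python
def angular_difference(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def fuse_orientation(initial_deg: float,
                     keypoint_deg: float,
                     tolerance_deg: float = 45.0) -> float:
    """Pick the final orientation from the two estimates.

    initial_deg:   orientation predicted directly from the image features.
    keypoint_deg:  orientation characterized by the detected key points.
    tolerance_deg: hypothetical consistency threshold (an assumption).
    """
    if angular_difference(initial_deg, keypoint_deg) <= tolerance_deg:
        # Consistent: keep the initial orientation.
        return initial_deg
    # Inconsistent: trust the orientation characterized by the key points.
    return keypoint_deg
```

With class labels instead of angles, the same rule would compare labels for equality rather than using a tolerance.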

In some embodiments, the steps of performing feature extraction on the image to be processed to obtain its features, and of determining the human body key points and the initial body orientation based on those features, are performed by a neural network. The neural network is obtained by training with first sample images and second sample images, where a first sample image includes a first human body image and annotated human body key points, and a second sample image includes a second human body image and an annotated body orientation.

In some embodiments, training the neural network with the first sample image and the second sample image comprises:

performing feature extraction on the first sample image and the second sample image to obtain features of the first sample image and features of the second sample image; detecting pedestrian key points according to the features of the first sample image to obtain the human body key points of the first sample image; and detecting the orientation based on the features of the second sample image to obtain the body orientation of the second sample image; and

adjusting the network parameter values of the neural network according to the detected human body key points, the annotated human body key points, the detected body orientation, and the annotated body orientation. Adjusting the network parameter values in this way improves the performance of the trained neural network.

In some embodiments, performing feature extraction on the first sample image and the second sample image to obtain features of the first sample image and features of the second sample image comprises:

stitching the first sample image and the second sample image, and performing feature extraction on the stitched image data to obtain features of the stitched image data; and

dividing the features of the stitched image data into the features of the first sample image and the features of the second sample image according to the way in which the first sample image and the second sample image were stitched. Dividing the features of the stitched image data makes it possible to perform human body key point detection and body orientation detection on the features of the first sample image and of the second sample image respectively, with relatively low implementation complexity.

In some embodiments, stitching the first sample image and the second sample image comprises stitching the first sample image and the second sample image along the batch dimension;

before stitching the first sample image and the second sample image, the method further comprises:

adjusting the first sample image and the second sample image to be identical in the three dimensions of channel, height, and width, so that the image data can be stitched along the batch dimension.
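The stitch-then-split procedure can be sketched in plain Python as follows. This is a minimal illustration, not the patented implementation: images are nested lists indexed as `[channel][row][column]`, `toy_extract` is a hypothetical stand-in for the feature extraction network, and all function names are assumptions. The sketch only checks that the images already share channel, height, and width; it does not resize them.

```python
from typing import List, Sequence, Tuple

Image = List[List[List[float]]]  # indexed as [channel][row][column]


def image_shape(image: Image) -> Tuple[int, int, int]:
    """Return (channels, height, width) of one image."""
    return len(image), len(image[0]), len(image[0][0])


def stitch_along_batch(first: List[Image],
                       second: List[Image]) -> Tuple[List[Image], int]:
    """Concatenate two batches along the batch dimension.

    Returns the stitched batch and the index at which the second batch starts,
    which is what later allows the features to be split again.
    """
    shapes = {image_shape(img) for img in first + second}
    if len(shapes) > 1:
        raise ValueError(f"images must share channel/height/width, got {sorted(shapes)}")
    return first + second, len(first)


def split_features(features: Sequence, split_index: int):
    """Undo the stitching on the feature side, using the same split index."""
    return list(features[:split_index]), list(features[split_index:])


def toy_extract(batch: List[Image]) -> List[float]:
    """Hypothetical feature extractor: one number (the pixel sum) per image."""
    return [sum(v for channel in img for row in channel for v in row) for img in batch]


def make_image(channels: int, height: int, width: int, value: float) -> Image:
    """Build a constant image for the demonstration."""
    return [[[value] * width for _ in range(height)] for _ in range(channels)]


keypoint_batch = [make_image(3, 2, 2, 1.0)]                               # first sample images
orientation_batch = [make_image(3, 2, 2, 2.0), make_image(3, 2, 2, 3.0)]  # second sample images
stitched, split_index = stitch_along_batch(keypoint_batch, orientation_batch)
features = toy_extract(stitched)                                          # single forward pass
keypoint_features, orientation_features = split_features(features, split_index)
```

Because the batch order is preserved by the extractor, the split index recorded at stitching time is enough to route each feature to the right detection branch.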

In some embodiments, adjusting the network parameter values of the neural network according to the detected human body key points, the annotated human body key points, the detected body orientation, and the annotated body orientation comprises:

obtaining a first loss value of the neural network according to the detected human body key points and the annotated human body key points, where the first loss value represents the difference between the detected human body key points and the annotated human body key points;

obtaining a second loss value of the neural network according to the detected body orientation and the annotated body orientation, where the second loss value represents the difference between the detected body orientation and the annotated body orientation; and

adjusting the network parameter values of the neural network according to the first loss value and the second loss value. Using the loss values to adjust the network parameter values in this way can improve the robustness of the neural network.
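A minimal sketch of the two loss values and one way to combine them follows. The text only states that each loss represents a difference between detection and annotation; the specific choices here (mean squared distance for key points, cross-entropy for a discretized orientation, and a weighted sum to combine them) are assumptions made for illustration, as are all function and parameter names.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def keypoint_loss(detected: Sequence[Point], annotated: Sequence[Point]) -> float:
    """First loss (assumed form): mean squared distance between key points."""
    if len(detected) != len(annotated) or not detected:
        raise ValueError("need the same non-zero number of detected and annotated points")
    total = sum((dx - ax) ** 2 + (dy - ay) ** 2
                for (dx, dy), (ax, ay) in zip(detected, annotated))
    return total / len(detected)


def orientation_loss(probabilities: Sequence[float], annotated_class: int) -> float:
    """Second loss (assumed form): cross-entropy for a discretized orientation."""
    # Clamp to avoid log(0) when the annotated class gets zero probability.
    return -math.log(max(probabilities[annotated_class], 1e-12))


def combined_loss(first_loss: float, second_loss: float,
                  first_weight: float = 1.0, second_weight: float = 1.0) -> float:
    """Weighted sum of both losses; the weights are hypothetical hyperparameters."""
    return first_weight * first_loss + second_weight * second_loss
```

In a real training loop the combined value would be backpropagated through both detection branches and the shared feature extractor; that step depends on the framework and is omitted here.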

An embodiment of the present invention further provides a human body orientation detection apparatus comprising an extraction module and a processing module, wherein:

the extraction module is configured to perform feature extraction on an image to be processed to obtain features of the image to be processed; and

the processing module is configured to determine human body key points and an initial body orientation based on the features of the image to be processed, and to determine a final body orientation according to the determined human body key points and the initial body orientation.

In some embodiments, the processing module is configured to determine the final body orientation by: in response to the body orientation characterized by the determined human body key points being consistent with the initial body orientation, determining the initial body orientation as the final body orientation.

In some embodiments, the processing module is configured to determine the final body orientation by: in response to the body orientation characterized by the determined human body key points being inconsistent with the initial body orientation, determining the body orientation characterized by the determined human body key points as the final body orientation.

In some embodiments, the apparatus further comprises a training module configured to train the neural network using the first sample image and the second sample image by:

performing feature extraction on the first sample image and the second sample image to obtain features of the first sample image and features of the second sample image; detecting pedestrian key points according to the features of the first sample image to obtain the human body key points of the first sample image; and detecting the orientation based on the features of the second sample image to obtain the body orientation of the second sample image; and

adjusting the network parameter values of the neural network according to the detected human body key points, the annotated human body key points, the detected body orientation, and the annotated body orientation.

In some embodiments, the training module is configured to perform feature extraction on the first sample image and the second sample image by:

stitching the first sample image and the second sample image, and performing feature extraction on the stitched image data to obtain features of the stitched image data; and

dividing the features of the stitched image data into the features of the first sample image and the features of the second sample image according to the way in which the first sample image and the second sample image were stitched.

In some embodiments, the training module is configured to stitch the first sample image and the second sample image along the batch dimension;

the training module is further configured to, before stitching the first sample image and the second sample image, adjust the first sample image and the second sample image to be identical in the three dimensions of channel, height, and width.

In some embodiments, the training module is configured to adjust the network parameter values of the neural network according to the detected human body key points, the annotated human body key points, the detected body orientation, and the annotated body orientation by:

obtaining a first loss value of the neural network according to the detected human body key points and the annotated human body key points, where the first loss value represents the difference between the detected human body key points and the annotated human body key points;

obtaining a second loss value of the neural network according to the detected body orientation and the annotated body orientation, where the second loss value represents the difference between the detected body orientation and the annotated body orientation; and

adjusting the network parameter values of the neural network according to the first loss value and the second loss value.

An embodiment of the present invention further provides an electronic device comprising a processor and a memory for storing a computer program executable on the processor, wherein

the processor is configured to execute the computer program to perform any one of the human body orientation detection methods described above.

An embodiment of the present invention further provides a computer storage medium storing a computer program which, when executed by a processor, implements any one of the human body orientation detection methods described above.

In the human body orientation detection method, apparatus, electronic device, and computer storage medium provided in the embodiments of the present invention, feature extraction is performed on an image to be processed to obtain features of the image to be processed; human body key points and an initial body orientation are determined based on the features of the image to be processed; and a final body orientation is determined according to the determined human body key points and the initial body orientation. The final body orientation is thus obtained by jointly considering the human body key points and the initial body orientation, so the key points can be used to improve the accuracy and usability of the final body orientation.

It should be understood that the foregoing general description and the following detailed description are merely illustrative and explanatory, and are not intended to limit the present invention.

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with the specification, serve to explain the technical solutions of the present invention.
FIG. 1 is a flowchart of a human body orientation detection method according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of the architecture of a trained neural network according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of human body key points according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of body orientations according to an embodiment of the present invention.
FIG. 5 is a flowchart of a neural network training method according to an embodiment of the present invention.
FIG. 6 is a schematic diagram of the architecture used for neural network training according to an embodiment of the present invention.
FIG. 7 is a schematic diagram of image data stitching in an embodiment of the present invention.
FIG. 8 is a schematic diagram of image feature division in an embodiment of the present invention.
FIG. 9 is a schematic diagram of the structure of a human body orientation detection apparatus according to an embodiment of the present invention.
FIG. 10 is a schematic diagram of the structure of an electronic device according to an embodiment of the present invention.

The present invention is described in more detail below with reference to the drawings and embodiments. It should be understood that the embodiments provided herein are intended only to explain the present invention, not to limit it. In addition, the embodiments provided below are some, not all, of the embodiments of the present invention, and the technical solutions described in the embodiments may be combined in any manner provided they do not conflict.

It should be noted that, in the embodiments of the present invention, the terms “comprise”, “include”, and any variants thereof are intended to cover non-exclusive inclusion, so that a method or apparatus comprising a series of elements includes not only the elements explicitly listed but also other elements not explicitly listed, or elements inherent to implementing the method or apparatus. Without further limitation, an element defined by the phrase “comprising a ...” does not exclude the existence of other related elements in the method or apparatus that comprises the element (for example, a step in a method or a unit in an apparatus, where a unit may be part of a circuit, part of a processor, part of a program or software, and so on).

For example, the human body orientation detection method provided in the embodiments of the present invention includes a series of steps but is not limited to the described steps. Likewise, the human body orientation detection apparatus provided in the embodiments of the present invention includes a series of modules but is not limited to the explicitly described modules, and may further include modules needed to acquire related information or to process data based on that information.

The term “and/or” herein merely describes an association between associated objects and indicates that three relationships may exist. For example, “A and/or B” covers three cases: A alone, both A and B, and B alone. In addition, the term “at least one” herein means any one of a plurality or any combination of at least two of a plurality. For example, “including at least one of A, B, and C” may mean including any one or more elements selected from the set consisting of A, B, and C.

The embodiments of the present invention may be applied to a computer system composed of terminals and/or servers, and may operate together with many other general-purpose or special-purpose computing system environments or configurations. Here, a terminal may be a thin client, a thick client, a handheld or laptop device, a microprocessor-based system, a set-top box, a programmable consumer electronics product, a network personal computer, a minicomputer system, and so on; a server may be a server computer system, a minicomputer system, a mainframe computer system, a distributed cloud computing environment that includes any of the above systems, and so on.

Electronic devices such as terminals and servers may be described in the general context of computer-system-executable instructions (for example, program modules) executed by a computer system. Generally, program modules may include routines, programs, object programs, components, logic, data structures, and the like that perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment, in which tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located in local or remote computing system storage media that include storage devices.

Based on the above, some embodiments of the present invention provide a technical solution for human body orientation detection. Scenarios to which the embodiments of the present invention apply include, but are not limited to, autonomous driving and robot navigation.

FIG. 1 is a flowchart of a human body orientation detection method according to an embodiment of the present invention. As shown in FIG. 1, the flow may include the following steps.

In step 101, feature extraction is performed on an image to be processed to obtain features of the image to be processed.

In practical applications, the image to be processed may be obtained from a local storage area or from a network, and its format may be Joint Photographic Experts Group (JPEG), Bitmap (BMP), Portable Network Graphics (PNG), or another format. It should be noted that the format and source of the image to be processed are described here only by way of example; the embodiments of the present invention do not limit either.

In practical applications, the image to be processed may be input to a feature extraction network, which performs feature extraction on the image to obtain its features. In the embodiments of the present invention, the feature extraction network is a neural network for extracting image features and may include structures such as convolutional layers. The type of feature extraction network is not limited here; for example, it may be a deep residual network (ResNet) or another neural network for image feature extraction.

The embodiments of the present invention do not limit the form in which the features of the image to be processed are represented; for example, the features may be represented as a feature map or in another form.

단계 102에서, 처리될 이미지의 특징에 기반하여 인체 핵심 포인트 및 초기 인체 방향을 결정한다.In step 102, a key point of the human body and an initial body orientation are determined based on the characteristics of the image to be processed.

본 단계의 구현 방식의 경우, 예시적으로, 처리될 이미지의 특징을 기반으로 인체 핵심 포인트를 검출하여, 인체 핵심 포인트를 획득하고; 처리될 이미지의 특징을 기반으로 인체 방향을 검출하여, 초기 인체 방향을 획득한다.In the case of the implementation method of this step, exemplarily, the human body key point is obtained by detecting the human body key point based on the characteristics of the image to be processed; An initial human body orientation is obtained by detecting the human body orientation based on the characteristics of the image to be processed.

처리될 이미지의 특징에 대해 인체 핵심 포인트를 검출하는 구현 방식의 경우, 예시적으로, 처리될 이미지의 특징에 대해 콘볼루션 및 업 샘플링 처리를 수행하여, 인체 핵심 포인트를 획득할 수 있다.In the case of an implementation method of detecting a human body key point with respect to a feature of the image to be processed, for example, convolution and upsampling processing may be performed on the feature of the image to be processed to obtain the human body key point.

In a specific example, after the features of the image to be processed are obtained, they may be input to a feature pyramid network (FPN), which processes them to obtain the human body key points. FPN-based feature processing extracts features from feature maps of different sizes and then fuses those feature maps to obtain multi-scale features; fusing the multi-scale features allows the human body key points to be obtained accurately.

As an example implementation of human body orientation detection on the features of the image to be processed, convolution may be performed on those features to obtain the initial human body orientation. In practical applications, after the features of the image to be processed are obtained, they may be input to a neural network composed of at least one convolutional layer, and the convolution operations in that network convert the features into the initial human body orientation detection result.

In practical applications, steps 101 to 102 may be implemented by a trained neural network. FIG. 2 is a schematic diagram of the architecture of the trained neural network according to an embodiment of the present invention. As shown in FIG. 2, the trained neural network includes two parts: a base-layer network and upper-layer networks. The base-layer network is the feature extraction network described above; in an actual implementation, its input is the image to be processed, and feature extraction yields mid- and high-level features with stronger representational power than the image itself. The upper-layer networks include an upper-layer network for human body key point detection and an upper-layer network for human body orientation detection. The former processes the features of the image to be processed to obtain the human body key points, and the latter processes the features of the image to be processed to obtain the initial human body orientation.
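The two-part architecture described above can be sketched as follows. This is only an illustrative sketch: the class name `TwoHeadNetwork` and the three callables are hypothetical stand-ins for the base-layer network and the two upper-layer networks, not names taken from the embodiments. The point it shows is that the features are extracted once and then shared by both detection branches.

```python
class TwoHeadNetwork:
    """Hypothetical wrapper: one shared base-layer network, two upper-layer networks."""

    def __init__(self, base_network, keypoint_head, orientation_head):
        # All three arguments are assumed to be plain callables.
        self.base_network = base_network          # feature extraction (step 101)
        self.keypoint_head = keypoint_head        # key point detection (step 102)
        self.orientation_head = orientation_head  # orientation detection (step 102)

    def __call__(self, image):
        # The base-layer network runs once; both heads reuse its output.
        features = self.base_network(image)
        key_points = self.keypoint_head(features)
        initial_orientation = self.orientation_head(features)
        return key_points, initial_orientation
```

Because both heads consume the same `features` object, adding the second task costs only the extra head, which is the resource saving the description attributes to the shared feature extraction network.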

In step 103, a final human body orientation is determined according to the determined human body key points and the initial human body orientation.

In practical applications, steps 101 to 103 may be implemented by a processor in an electronic device, and the processor may be at least one of an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a central processing unit (CPU), a controller, a microcontroller, or a microprocessor.

In the related art, the human body orientation is judged solely on the basis of human body orientation detection, and the precision of the resulting orientation is relatively low. In the embodiments of the present invention, the final human body orientation is obtained by jointly considering the human body key points and the initial human body orientation. Because the key points provide a basis for judging the human body orientation, the initial orientation can be optimized on the basis of the key points, which improves the accuracy and usability of the final human body orientation.

In addition, in the embodiments of the present invention, image feature extraction for both the human body key point detection task and the human body orientation detection task is performed by the same feature extraction network. The two tasks can therefore be carried out simultaneously with relatively few computing resources, which helps meet their real-time requirements. Furthermore, jointly determining the human body orientation from the key point detection result and the orientation detection result improves the accuracy of human body orientation detection.

As one example implementation of step 103, in response to the human body orientation characterized by the determined human body key points being consistent with the initial human body orientation, the initial human body orientation is determined as the final human body orientation.

In practical applications, it may be judged whether the human body orientation characterized by the determined key points is consistent with the initial human body orientation. If it is, the initial orientation can be regarded as relatively accurate, so it is determined as the final human body orientation, allowing the final orientation to be obtained accurately.

The effects of the embodiments of the present invention are illustrated below with reference to the drawings.

FIG. 3 is a schematic diagram of human body key points according to an embodiment of the present invention. As shown in FIG. 3, the numbers 0 to 17 denote human body key points obtained by key point detection. When all key points are detected, the human body faces forward or backward; when only the left-side key points are detected, the human body faces left; when only the right-side key points are detected, the human body faces right. FIG. 4 is a schematic diagram of human body orientations according to an embodiment of the present invention, in which the numbers 1 to 8 denote different orientations. Human body orientation detection divides the orientation into eight directions, which is more precise than the orientation determined from the key points; correcting the orientation detection result with the key point detection result can therefore improve the accuracy of the orientation detection result.

Taking FIG. 3 and FIG. 4 together, it can be seen that when the human body orientation differs, the number and positions of the detectable key points also differ. For example, if all key points on the left side of the human body are detected while only some or none of the right-side key points are detected, and the initial human body orientation likewise faces left, the initial orientation can be judged to be correct. It is therefore determined as the final human body orientation, which helps keep the accuracy of the final orientation at a relatively high level.

As another example implementation of step 103, in response to the human body orientation characterized by the determined human body key points being inconsistent with the initial human body orientation, the orientation characterized by the key points is determined as the final human body orientation.

As can be seen, when the orientation characterized by the determined key points is inconsistent with the initial human body orientation, the accuracy of the initial orientation can be regarded as relatively low; determining the orientation characterized by the key points as the final human body orientation therefore improves the accuracy of the final orientation.

For example, taking FIG. 3 and FIG. 4 together, if only several key points on one side of the human body are valid while the initial human body orientation is front or back, the initial orientation can be judged to be incorrect. In other words, the key points make it possible to judge the validity and accuracy of the human body orientation effectively, so optimizing the initial orientation in combination with the key points improves the accuracy and usability of the final human body orientation.
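The two branches of step 103 can be sketched as a small decision function. This is a hedged sketch under several assumptions: the description only states that FIG. 4 numbers the orientations 1 to 8, so the mapping `COARSE_CLASS_OF` from those numbers to front/back, left, and right is invented for illustration, as are the function names and the handling of the case where neither side is detected.

```python
# ASSUMPTION: which of the eight numbered directions count as front/back,
# left, or right is not given in the text; this mapping is hypothetical.
COARSE_CLASS_OF = {
    1: "front_or_back", 5: "front_or_back",
    2: "left", 3: "left", 4: "left",
    6: "right", 7: "right", 8: "right",
}


def coarse_orientation_from_key_points(left_detected, right_detected):
    """Orientation characterized by the key points (coarse, three classes)."""
    if left_detected and right_detected:
        return "front_or_back"
    if left_detected:
        return "left"
    if right_detected:
        return "right"
    return None  # ASSUMPTION: no usable key points, nothing to compare against


def final_orientation(initial_orientation, key_point_orientation):
    """Return the initial orientation (1-8) if the key points agree with it,
    otherwise the coarse orientation characterized by the key points."""
    if key_point_orientation is None:
        return initial_orientation  # ASSUMPTION: fall back to the initial result
    if COARSE_CLASS_OF[initial_orientation] == key_point_orientation:
        return initial_orientation
    return key_point_orientation
```

For instance, an initial orientation that this hypothetical mapping treats as left-facing is kept when only the left-side key points are detected, whereas an initial front/back result is replaced by "left" in the same situation.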

In some embodiments, steps 101 to 102 may be performed by a neural network obtained by training with first sample images and second sample images, where a first sample image includes a first human body image and annotated human body key points, and a second sample image includes a second human body image and an annotated human body orientation.

In practical applications, the first sample image or the second sample image may be obtained from local storage or from a network, and its format may be JPEG, BMP, PNG, or another format. It should be noted that the formats and sources given here are only examples; the embodiments of the present invention do not limit the format or source of the first sample image or the second sample image.

In a specific example, the first sample images and the second sample images may be obtained from different datasets, and the datasets corresponding to the first and second sample images may have no overlap.

As can be seen, in the embodiments of the present invention, the human body key points and the initial human body orientation can be obtained by means of a neural network, which is easy to implement.

The training process of the neural network is described below by way of example with reference to the drawings.

FIG. 5 is a flowchart of a neural network training method according to an embodiment of the present invention. As shown in FIG. 5, the flow may include the following steps.

In step 501, a first sample image and a second sample image are obtained.

The implementation of this step has been described above and is not repeated here.

In step 502, the first sample image and the second sample image are input to the neural network, which performs the following steps: performing feature extraction on the first sample image and the second sample image to obtain the features of the first sample image and the second sample image; performing human body key point detection on the features of the first sample image to obtain the human body key points of the first sample image; and performing orientation detection on the features of the second sample image to obtain the human body orientation of the second sample image.

In practical applications, the first sample image and the second sample image may be input to the feature extraction network, which performs feature extraction on them to obtain the features of the first sample image and the second sample image.

The embodiments of the present invention do not limit the representation form of the features of the first sample image and the second sample image; for example, these features may be represented as feature maps or in another form.

As an example implementation of performing feature extraction on the first sample image and the second sample image to obtain their features, image data stitching may be performed on the first sample image and the second sample image, and feature extraction may be performed on the stitched image data to obtain the features of the stitched image data; the features of the stitched image data are then split into the features of the first sample image and the features of the second sample image, in accordance with the way the image data of the two sample images was stitched.

As can be seen, stitching the image data of the first sample image and the second sample image allows feature extraction to be performed on the stitched image data in a unified manner, which is easy to implement; splitting the features of the stitched image data allows human body key point detection and human body orientation detection to be performed separately on the features of the first sample image and the second sample image, which is likewise easy to implement.

As an example implementation of image data stitching, the first sample image and the second sample image may be stitched along the batch dimension. Before stitching, the first sample image and the second sample image may each be adjusted so that they are identical in the three dimensions of channel, height, and width; the adjusted first sample image and second sample image may then be stitched along the batch dimension.

Here, the number of image channels refers to the number of channels over which image feature extraction is performed, and the batch dimension refers to the number of images. In the embodiments of the present invention, once the channel count, height, and width of the first sample images and the second sample images have been adjusted to the same sizes, the adjusted first and second sample images, which may differ in number, can be stitched along the batch dimension.

FIG. 6 is a schematic diagram of the neural network training architecture according to an embodiment of the present invention, and FIG. 7 is a schematic diagram of image data stitching according to an embodiment of the present invention. In FIG. 7, the solid rectangular frame represents the first sample image 601 and the dashed rectangular frame represents the second sample image 602. In the embodiments of the present invention, the data format of the first sample image 601 and the second sample image 602 may be expressed as [B C H W], where B is the size of the batch dimension, C is the size of the channel dimension, H is the height, and W is the width. Because operations such as convolution in image feature extraction are all computed over the channel, height, and width dimensions, the first sample image 602 and the second sample image 603 can be stitched along the batch dimension, as shown in FIG. 6 and FIG. 7.

Referring to FIG. 6, the base-layer network 601 may perform feature extraction on the stitched image data to obtain the corresponding image features; the image features output by the base-layer network then need to be split.

FIG. 8 is a schematic diagram of image feature splitting according to an embodiment of the present invention. In FIG. 8, the solid rectangular frame (corresponding to C1) represents the image features of the first sample image, and the dashed rectangular frame (corresponding to C2) represents the image features of the second sample image. In the embodiments of the present invention, the features of the stitched image data may be split along the batch dimension, in accordance with the way the first and second sample images were stitched, to obtain the image features 801 of the first sample image and the image features 802 of the second sample image; both are represented as feature maps.
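Stitching along the batch dimension and the matching split can be sketched in plain Python, with nested lists standing in for [B C H W] tensors. This is a minimal sketch, not the embodiments' implementation: the helper names are hypothetical, and the shape check inspects only the first image of each batch.

```python
def make_batch(batch, channels, height, width, fill=0.0):
    """Build a [B, C, H, W] batch as nested lists (illustrative helper)."""
    return [[[[fill] * width for _ in range(height)]
             for _ in range(channels)]
            for _ in range(batch)]


def chw_shape(batch):
    """Return (C, H, W) of the first image in the batch."""
    first = batch[0]
    return (len(first), len(first[0]), len(first[0][0]))


def stitch_along_batch(first_batch, second_batch):
    """Concatenate two batches along B; C, H and W must already match.

    Returns the stitched batch and the number of first-sample images,
    which is what the later split needs to know.
    """
    if chw_shape(first_batch) != chw_shape(second_batch):
        raise ValueError("C, H and W must match before stitching")
    return first_batch + second_batch, len(first_batch)


def split_along_batch(stitched_features, first_count):
    """Undo the stitching on the feature side: the first `first_count`
    entries belong to the first sample images, the rest to the second."""
    return stitched_features[:first_count], stitched_features[first_count:]
```

The split relies only on the order of entries along the batch dimension, so it still works after feature extraction has changed the channel and spatial sizes, provided the network keeps the batch order.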

Referring to FIG. 6, the image features of the first sample image are input to the upper-layer network 604 for human body key point detection, which processes them and outputs the human body key points 641 of the first sample image; the image features of the second sample image are input to the upper-layer network 605 for human body orientation detection, which processes them and outputs the human body orientation 651 of the second sample image.

Also referring to FIG. 6, after the human body key points of the first sample image are obtained, a first loss 642 of the neural network may be calculated, which represents the difference between the detected key points of the first sample image and the annotated key points; after the human body orientation of the second sample image is obtained, a second loss 652 of the neural network may be calculated, which represents the difference between the detected orientation of the second sample image and the annotated orientation.

In the embodiments of the present invention, human body key point detection based on the features of the first sample image is implemented in the same way as human body key point detection based on the features of the image to be processed in step 102, and is not repeated here. Likewise, human body orientation detection based on the features of the second sample image is implemented in the same way as human body orientation detection based on the features of the image to be processed in step 102, and is not repeated here.

As can be seen, compared with the training process, the application and testing process of the neural network (steps 101 to 103) requires neither image data stitching nor image feature splitting; the human body key points and the initial human body orientation of the image to be processed are obtained simply by passing the image through the base-layer network and the two upper-layer networks.

In step 503, the network parameter values of the neural network are adjusted according to the detected human body key points, the annotated human body key points, the detected human body orientation, and the annotated human body orientation.

As an example implementation of this step, the first loss of the neural network may be obtained from the detected human body key points (that is, the human body key points of the first sample image) and the annotated human body key points; the second loss of the neural network may be obtained from the detected human body orientation (that is, the human body orientation of the second sample image) and the annotated human body orientation; and the network parameter values of the neural network are adjusted according to the first loss and the second loss.

In a specific implementation, the sum of the first loss and the second loss may be used as the total loss of the neural network, or a weighted sum of the first loss and the second loss may be used as the total loss; the weights of the first loss and the second loss may be set in advance according to actual application requirements.
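The two options for the total loss can be written as one function, since the plain sum is the weighted sum with both weights equal to 1. The sketch below is illustrative; the function and parameter names are hypothetical, and the default weights are an assumption rather than values given in the embodiments.

```python
def total_loss(first_loss, second_loss, first_weight=1.0, second_weight=1.0):
    """Weighted sum of the key point loss and the orientation loss.

    With the default weights (an assumption) this reduces to the plain sum.
    """
    return first_weight * first_loss + second_weight * second_loss
```

Raising one weight relative to the other makes the parameter update favor that task, which is one reason the weights are left to the application.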

After the total loss of the neural network is obtained, the network parameter values of the neural network may be adjusted according to the total loss.

In step 504, it is judged whether the neural network with the adjusted network parameter values processes images with the set precision; if not, steps 501 to 504 are performed again; if so, step 505 is performed.

In the embodiments of the present invention, the precision requirement may be set in advance. As an example, it relates to the first loss and the second loss: in a first example, the requirement may be that the total loss of the neural network is less than a first set threshold; in a second example, the requirement may be that the first loss is less than a second set threshold and the second loss is less than a third set threshold.

In practical applications, the first, second, and third set thresholds may all be set in advance according to actual application requirements.
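The two example precision requirements translate directly into two predicates. This is a small sketch with hypothetical function and parameter names; the threshold values themselves are application-specific and are not given in the embodiments.

```python
def meets_total_loss_requirement(total_loss_value, first_threshold):
    """First example: the total loss is below the first set threshold."""
    return total_loss_value < first_threshold


def meets_per_task_requirement(first_loss, second_loss,
                               second_threshold, third_threshold):
    """Second example: each loss is below its own set threshold."""
    return first_loss < second_threshold and second_loss < third_threshold
```

The second form is stricter in one respect: a small orientation loss cannot compensate for a large key point loss, or the reverse.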

In step 505, the neural network with the adjusted network parameter values is used as the trained neural network.
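The control flow of steps 501 to 505 can be sketched as a loop over four caller-supplied callables. Everything named here is hypothetical. Two further assumptions are made for illustration: step 504 re-evaluates the adjusted network on the same samples, and the loop is capped by `max_rounds`, which the flow in FIG. 5 does not mention.

```python
def train(sample_batches, compute_losses, adjust_parameters,
          meets_requirement, max_rounds=1000):
    """Repeat steps 501-504 until the precision requirement is met.

    Returns the number of rounds used; the network state is assumed to be
    held by the callables themselves.
    """
    for round_index in range(1, max_rounds + 1):
        # Step 501: obtain first and second sample images.
        first_batch, second_batch = sample_batches()
        # Step 502: forward pass, yielding the first and second loss.
        first_loss, second_loss = compute_losses(first_batch, second_batch)
        # Step 503: adjust the network parameter values.
        adjust_parameters(first_loss, second_loss)
        # Step 504 (ASSUMPTION: re-evaluated on the same samples).
        if meets_requirement(*compute_losses(first_batch, second_batch)):
            # Step 505: the adjusted network is the trained network.
            return round_index
    raise RuntimeError("precision requirement not met within max_rounds")
```

The `meets_requirement` callable can implement either of the two example requirements, for instance by comparing the sum of its two arguments with a threshold.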

In practical applications, steps 501 to 505 may be implemented by a processor in an electronic device, and the processor may be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, or a microprocessor.

As can be seen, in the embodiments of the present invention, training the neural network does not require performing both human body key point detection and human body orientation detection on each first sample image or second sample image, and both detection tasks are built on the same image feature extraction process. The trained neural network can therefore carry out both tasks simultaneously with relatively few computing resources, which helps meet their real-time requirements.

During training, in one example, the data similarity between the first and second sample images (both contain human body images) is fully exploited: stitching their image data allows feature extraction to be performed on the stitched data in a unified manner, which simplifies implementation. The similarity between the neural networks for key point detection and orientation detection (both must extract features from human body images) is also exploited: a common base-layer network is factored out of the two detection networks and used for unified image feature extraction, so that a single trained neural network can perform human body key point detection and human body orientation detection simultaneously.

Those skilled in the art will understand that, in the above methods of the specific embodiments, the order in which the steps are written does not imply a strict order of execution or impose any limitation on the implementation; the specific order of execution of the steps should be determined by their functions and possible internal logic.

Based on the human body orientation detection method provided in the foregoing embodiments, an embodiment of the present invention provides a human body orientation detection apparatus.

FIG. 9 is a schematic structural diagram of a human body orientation detection apparatus according to an embodiment of the present invention. As shown in FIG. 9, the apparatus may include an extraction module 901 and a processing module 902, wherein:

the extraction module 901 is configured to perform feature extraction on an image to be processed to obtain features of the image to be processed; and

the processing module 902 is configured to determine human body key points and an initial human body orientation based on the features of the image to be processed, and to determine a final human body orientation according to the determined human body key points and the initial human body orientation.

In some embodiments, when determining the final human body orientation according to the determined human body key points and the initial human body orientation, the processing module 902 is configured to determine the initial human body orientation as the final human body orientation in response to the human body orientation characterized by the determined human body key points being consistent with the initial human body orientation.

In some embodiments, when determining the final human body orientation according to the determined human body key points and the initial human body orientation, the processing module 902 is configured to determine the human body orientation characterized by the determined human body key points as the final human body orientation in response to that orientation being inconsistent with the initial human body orientation.
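The two cases above amount to a simple decision rule, sketched below. The function `final_orientation` follows the rule as stated; the helper `orientation_from_shoulders` is purely a hypothetical illustration of how key points might characterize an orientation (the passage above does not specify this step), and the labels "front" and "back" are assumed.

```python
from typing import Tuple

def orientation_from_shoulders(left_shoulder: Tuple[float, float],
                               right_shoulder: Tuple[float, float]) -> str:
    """Hypothetical heuristic, assumed for illustration only: a person facing
    the camera has their left shoulder on the right side of the image
    (larger x); otherwise they are taken to face away."""
    return "front" if left_shoulder[0] > right_shoulder[0] else "back"

def final_orientation(initial: str, keypoint_orientation: str) -> str:
    """Decision rule from the two cases above."""
    if keypoint_orientation == initial:
        # Consistent: keep the initial orientation.
        return initial
    # Inconsistent: the orientation characterized by the key points wins.
    return keypoint_orientation

# Usage with assumed (x, y) pixel coordinates for the two shoulders.
kp_orientation = orientation_from_shoulders((60.0, 40.0), (40.0, 40.0))
result = final_orientation("back", kp_orientation)
```

In this usage the key points indicate "front" while the initial prediction is "back", so the key point result is taken as the final orientation.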

performing feature extraction on the first sample image and the second sample image to obtain features of the first sample image and the second sample image; performing pedestrian key point detection according to the features of the first sample image to obtain human body key points of the first sample image; performing orientation detection based on the features of the second sample image to obtain a human body orientation of the second sample image; and

In some embodiments, the training module performs feature extraction on the first sample image and the second sample image to obtain features of the first sample image and the second sample image, which includes:

In practical applications, both the extraction module 901 and the processing module 902 may be implemented by a processor in an electronic device, and the processor may be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor.

In addition, the functional modules of this embodiment may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.

If the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this embodiment, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the method described in this embodiment. The storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Specifically, the computer program instructions corresponding to the human body orientation detection method of this embodiment may be stored on a storage medium such as an optical disc, a hard disk, or a USB flash drive. When the computer program instructions on the storage medium that correspond to the human body orientation detection method are read or executed by an electronic device, any one of the human body orientation detection methods of the foregoing embodiments is implemented.

Based on the same technical concept as the foregoing embodiments, FIG. 10 shows an electronic device 10 provided in an embodiment of the present invention, which may include a memory 1001 and a processor 1002, wherein:

the memory 1001 is configured to store a computer program and data; and

the processor 1002 is configured to execute the computer program stored in the memory to implement any one of the human body orientation detection methods of the foregoing embodiments.

In practical applications, the memory 1001 may be a volatile memory such as a RAM; a non-volatile memory such as a ROM, a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or a combination of the above types of memory. The memory 1001 provides instructions and data to the processor 1002.

The processor 1002 may be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor. It can be understood that, for different devices, other electronic components may be used to implement the above processor functions, which is not specifically limited in the embodiments of the present invention.

In some embodiments, the functions or modules of the apparatus provided in the embodiments of the present invention may be used to perform the methods described in the method embodiments above. For their specific implementation, refer to the description of the method embodiments; for brevity, details are not repeated here.

The descriptions of the embodiments above emphasize the differences between the embodiments; for the same or similar parts, the embodiments may be referred to one another, and for brevity these parts are not repeated here.

The methods disclosed in the method embodiments provided in the present invention may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments.

The features disclosed in the product embodiments provided in the present invention may be combined arbitrarily, provided there is no conflict, to obtain new product embodiments.

The features disclosed in the method or device embodiments provided in the present invention may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments or device embodiments.

From the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments may be implemented by software together with a necessary general-purpose hardware platform, or of course by hardware, although in most cases the former is the preferred implementation. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (for example, a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods of the embodiments of the present invention.

The embodiments of the present invention have been described above with reference to the drawings, but the present invention is not limited to the specific implementations described, which are merely illustrative and not restrictive. In light of the present invention, those of ordinary skill in the art may devise many other forms without departing from the spirit of the present invention and the scope protected by the claims, and all of these fall within the protection scope of the present invention.

The present invention provides a human body orientation detection method, apparatus, electronic device, and computer storage medium, in which feature extraction is performed on an image to be processed to obtain features of the image to be processed; human body key points and an initial human body orientation are determined based on those features; and a final human body orientation is determined according to the determined human body key points and the initial human body orientation.

Claims

1. A method for detecting human body orientation, comprising:
performing feature extraction on an image to be processed to obtain features of the image to be processed;
determining human body key points and an initial human body orientation based on the features of the image to be processed; and
determining a final human body orientation according to the determined human body key points and the initial human body orientation.

2. The method of claim 1, wherein determining the final human body orientation according to the determined human body key points and the initial human body orientation comprises:
determining the initial human body orientation as the final human body orientation in response to the human body orientation characterized by the determined human body key points being consistent with the initial human body orientation.

3. The method of claim 1, wherein determining the final human body orientation according to the determined human body key points and the initial human body orientation comprises:
determining the human body orientation characterized by the determined human body key points as the final human body orientation in response to that orientation being inconsistent with the initial human body orientation.

4. The method according to any one of claims 1 to 3,
wherein performing feature extraction on the image to be processed to obtain the features of the image to be processed, and determining the human body key points and the initial human body orientation based on the features of the image to be processed, are performed by a neural network; the neural network is obtained by training with a first sample image and a second sample image, wherein the first sample image includes a first human body image and labeled human body key points, and the second sample image includes a second human body image and a labeled human body orientation.

5. The method of claim 4, wherein training the neural network with the first sample image and the second sample image comprises:
performing feature extraction on the first sample image and the second sample image to obtain features of the first sample image and the second sample image; performing pedestrian key point detection according to the features of the first sample image to obtain human body key points of the first sample image; performing orientation detection based on the features of the second sample image to obtain a human body orientation of the second sample image; and
adjusting network parameter values of the neural network according to the detected human body key points, the labeled human body key points, the detected human body orientation, and the labeled human body orientation.

6. The method of claim 5, wherein performing feature extraction on the first sample image and the second sample image to obtain features of the first sample image and the second sample image comprises:
stitching the first sample image and the second sample image, and performing feature extraction on the stitched image data to obtain features of the stitched image data; and
dividing the features of the stitched image data into the features of the first sample image and the features of the second sample image according to the manner in which the first sample image and the second sample image were stitched.

7. The method of claim 6, wherein stitching the first sample image and the second sample image comprises:
stitching the first sample image and the second sample image along the batch dimension;
and wherein, before stitching the first sample image and the second sample image, the method further comprises:
adjusting the first sample image and the second sample image to be identical in the three dimensions of channel, height, and width.
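As an illustration of the preprocessing recited above, the following NumPy sketch brings two batches to the same channel, height, and width before stitching them along the batch dimension. The nearest-neighbour resize, the channel-repetition rule for single-channel images, and the target shape are assumptions chosen for the example; the claims do not prescribe a particular adjustment method.

```python
import numpy as np

def match_channels(images, channels):
    """Assumed rule: repeat a single-channel image to the target channel count."""
    if images.shape[1] == channels:
        return images
    if images.shape[1] == 1:
        return np.repeat(images, channels, axis=1)
    raise ValueError("unsupported channel conversion")

def resize_nearest(images, height, width):
    """Nearest-neighbour resize of (N, C, H, W) to (N, C, height, width)."""
    _, _, h, w = images.shape
    rows = np.arange(height) * h // height   # source row for each target row
    cols = np.arange(width) * w // width     # source column for each target column
    return images[:, :, rows[:, None], cols[None, :]]

def to_common_shape(images, channels, height, width):
    return resize_nearest(match_channels(images, channels), height, width)

# Two batches that differ in channel count and spatial size.
first = np.zeros((4, 3, 64, 32))
second = np.ones((3, 1, 48, 24))

first_adj = to_common_shape(first, 3, 32, 16)
second_adj = to_common_shape(second, 3, 32, 16)

# Stitch along the batch dimension (axis 0), then recover each part by index.
stitched = np.concatenate([first_adj, second_adj], axis=0)
first_part, second_part = np.split(stitched, [first_adj.shape[0]], axis=0)
```

Once all three non-batch dimensions agree, concatenation along axis 0 is well defined, and the split index is simply the size of the first batch.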

8. The method of claim 5, wherein adjusting the network parameter values of the neural network according to the detected human body key points, the labeled human body key points, the detected human body orientation, and the labeled human body orientation comprises:
obtaining a first loss value of the neural network according to the detected human body key points and the labeled human body key points, wherein the first loss value represents a difference between the detected human body key points and the labeled human body key points;
obtaining a second loss value of the neural network according to the detected human body orientation and the labeled human body orientation, wherein the second loss value represents a difference between the detected human body orientation and the labeled human body orientation; and
adjusting the network parameter values of the neural network according to the first loss value and the second loss value.
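A minimal sketch of the two loss values is given below. The specific choices are assumptions for illustration: mean squared error for the key point loss, cross-entropy over orientation classes for the orientation loss, and a weighted sum to combine them. The claims only require that each loss represent the difference between detected and labeled values and that parameters be adjusted according to both.

```python
import numpy as np

def keypoint_loss(predicted, labeled):
    """First loss (assumed form): mean squared error over key point coordinates."""
    return float(np.mean((predicted - labeled) ** 2))

def orientation_loss(logits, labels):
    """Second loss (assumed form): cross-entropy over orientation classes.
    logits: (N, num_classes), labels: (N,) integer class indices."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def total_loss(first_loss, second_loss, w1=1.0, w2=1.0):
    """Assumed combination: weighted sum of the two loss values."""
    return w1 * first_loss + w2 * second_loss

# Toy values: 2 samples with 5 key points each, 3 samples with 8 orientation classes.
pred_kp = np.zeros((2, 5, 2))
true_kp = np.ones((2, 5, 2))
logits = np.zeros((3, 8))
labels = np.array([0, 3, 7])

loss = total_loss(keypoint_loss(pred_kp, true_kp), orientation_loss(logits, labels))
```

In a training loop, the gradient of this combined value with respect to the network parameters would drive the parameter update, so both tasks shape the shared feature extractor.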

9. A human body orientation detection apparatus, comprising an extraction module and a processing module, wherein:
the extraction module is configured to perform feature extraction on an image to be processed to obtain features of the image to be processed; and
the processing module is configured to determine human body key points and an initial human body orientation based on the features of the image to be processed, and to determine a final human body orientation according to the determined human body key points and the initial human body orientation.

10. The apparatus of claim 9,
wherein, when determining the final human body orientation according to the determined human body key points and the initial human body orientation, the processing module is configured to determine the initial human body orientation as the final human body orientation in response to the human body orientation characterized by the determined human body key points being consistent with the initial human body orientation.

11. The apparatus of claim 9,
wherein, when determining the final human body orientation according to the determined human body key points and the initial human body orientation, the processing module is configured to determine the human body orientation characterized by the determined human body key points as the final human body orientation in response to that orientation being inconsistent with the initial human body orientation.

12. The apparatus according to any one of claims 9 to 11,
wherein performing feature extraction on the image to be processed to obtain the features of the image to be processed, and determining the human body key points and the initial human body orientation based on the features of the image to be processed, are performed by a neural network; the neural network is obtained by training with a first sample image and a second sample image, wherein the first sample image includes a first human body image and labeled human body key points, and the second sample image includes a second human body image and a labeled human body orientation.

13. The apparatus of claim 12,
further comprising a training module configured to train the neural network with the first sample image and the second sample image, the training comprising:
performing feature extraction on the first sample image and the second sample image to obtain features of the first sample image and the second sample image; performing pedestrian key point detection according to the features of the first sample image to obtain human body key points of the first sample image; performing orientation detection based on the features of the second sample image to obtain a human body orientation of the second sample image; and
adjusting network parameter values of the neural network according to the detected human body key points, the labeled human body key points, the detected human body orientation, and the labeled human body orientation.

14. The apparatus of claim 13,
wherein, when performing feature extraction on the first sample image and the second sample image to obtain features of the first sample image and the second sample image, the training module is configured to perform:
stitching the first sample image and the second sample image, and performing feature extraction on the stitched image data to obtain features of the stitched image data; and
dividing the features of the stitched image data into the features of the first sample image and the features of the second sample image according to the manner in which the first sample image and the second sample image were stitched.

15. The apparatus of claim 14,
wherein, when stitching the first sample image and the second sample image, the training module is configured to stitch the first sample image and the second sample image along the batch dimension; and
the training module is further configured to, before stitching the first sample image and the second sample image, adjust the first sample image and the second sample image to be identical in the three dimensions of channel, height, and width.

16. The apparatus of claim 13,
wherein, when adjusting the network parameter values of the neural network according to the detected human body key points, the labeled human body key points, the detected human body orientation, and the labeled human body orientation, the training module is configured to perform:
obtaining a first loss value of the neural network according to the detected human body key points and the labeled human body key points, wherein the first loss value represents a difference between the detected human body key points and the labeled human body key points;
obtaining a second loss value of the neural network according to the detected human body orientation and the labeled human body orientation, wherein the second loss value represents a difference between the detected human body orientation and the labeled human body orientation; and
adjusting the network parameter values of the neural network according to the first loss value and the second loss value.

17. An electronic device, comprising:
a processor and a memory for storing a computer program executable on the processor,
wherein the processor is configured to execute the computer program to perform the method according to any one of claims 1 to 8.

18. A computer storage medium storing a computer program,
wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 8.