KR102591325B1

KR102591325B1 - Apparatus and Method for Estimating Human Pose

Info

Publication number: KR102591325B1
Application number: KR1020200105426A
Authority: KR
Inventors: 윤영우; 김도형; 김재홍; 이재연; 장민수
Original assignee: 한국전자통신연구원
Priority date: 2020-08-21
Filing date: 2020-08-21
Publication date: 2023-10-20
Also published as: KR20220023553A

Abstract

An apparatus and method for estimating human posture are disclosed. A human posture estimation device according to an embodiment of the present invention includes a memory in which at least one program and a plurality of posture estimation models are recorded, and a processor that executes the program. The program performs the steps of extracting a sample image from an input image, estimating a posture by inputting the extracted sample image into each of the plurality of posture estimation models, selecting one posture estimation model based on the joint position confidence of the posture estimated by each of the plurality of posture estimation models, and estimating the posture of the input image using the selected posture estimation model. Each of the plurality of posture estimation models may be trained in advance on image data sets captured under different conditions.

Description

Apparatus and Method for Estimating Human Pose

The described embodiment relates to technology for estimating the posture of a person in an image.

A model trained for typical image-based human posture estimation receives an image of a person as input and learns to estimate the position of each body part, such as the nose, neck, elbows, and wrists.

However, the recognition results of a posture estimation model depend on the characteristics of its training data. If a posture estimation model is trained on data containing only images of people standing upright, its recognition rate may be high for standing people but low for images of people lying down or sitting.

In other words, a learning-based posture estimation model is affected by the distribution of the data used for training: the recognition rate is high for common postures that are well represented in the training data, but may be low for special postures that appear only rarely.

The purpose of the described embodiment is to enable accurate recognition of even special postures taken by a person in an image.

A human posture estimation device according to an embodiment includes a memory in which at least one program and a plurality of posture estimation models are recorded, and a processor that executes the program. The program performs the steps of extracting a sample image from an input image, estimating a posture by inputting the extracted sample image into each of the plurality of posture estimation models, selecting one posture estimation model based on the joint position confidence of the posture estimated by each of the plurality of posture estimation models, and estimating the posture of the input image using the selected posture estimation model. Each of the plurality of posture estimation models may be trained in advance on image data sets captured under different conditions.

According to the embodiment, even special postures taken by a person in an image can be accurately recognized.

FIG. 1 is a diagram illustrating a human posture estimation device according to an embodiment.
FIG. 2 is a diagram illustrating a plurality of posture estimation models according to an embodiment.
FIG. 3 is a flowchart illustrating a human posture estimation method according to an embodiment.
FIG. 4 is an example of a recognition result output screen showing the result of each posture estimation model for a sample image according to an embodiment.
FIG. 5 is an example diagram illustrating the selection of a posture estimation model according to an embodiment.
FIG. 6 is a diagram illustrating a sample image extraction step according to an embodiment.
FIG. 7 is a diagram showing the configuration of a computer system according to an embodiment.

The advantages and features of the present invention, and methods for achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only to make the disclosure of the present invention complete and to fully inform those of ordinary skill in the art to which the present invention pertains of the scope of the invention, and the present invention is defined only by the scope of the claims. Like reference numerals refer to like elements throughout the specification.

Although terms such as "first" and "second" are used to describe various components, the components are not limited by these terms. The terms are used only to distinguish one component from another. Accordingly, a first component mentioned below may also be a second component within the technical spirit of the present invention.

The terms used in this specification are for describing embodiments and are not intended to limit the present invention. In this specification, singular forms include plural forms unless the context specifically states otherwise. As used in the specification, "comprises" or "comprising" means that the stated component or step does not exclude the presence or addition of one or more other components or steps.

Unless otherwise defined, all terms used in this specification may be interpreted with the meaning commonly understood by those of ordinary skill in the art to which the present invention pertains. In addition, terms defined in commonly used dictionaries are not to be interpreted in an idealized or excessive manner unless clearly and specifically defined.

Hereinafter, a human posture estimation device and method according to an embodiment are described in detail with reference to FIGS. 1 to 7.

FIG. 1 is a diagram illustrating a human posture estimation device according to an embodiment, and FIG. 2 is a diagram illustrating a plurality of posture estimation models according to an embodiment.

Referring to FIG. 1, the human posture estimation device 1 according to an embodiment may receive an image of a person and estimate the positions of the person's joints. To this end, the human posture estimation device 1 may estimate joint positions using a human posture estimation model trained in advance on images of people, with the joint positions in those images as ground truth.

Depending on the embodiment, the human posture estimation device 1 may include a plurality of pre-trained posture estimation models 10, as shown in FIG. 2.

Referring to FIG. 2, each of the posture estimation models 10 is trained on a data set consisting of many (image, joint position) pairs. The weights of a neural network are updated by backpropagation so that the network estimates the joint positions paired with an input image.

Depending on the embodiment, each of the plurality of posture estimation models 10 may be trained on one of the data sets 20, which have mutually different characteristics.

The different data sets 20 may be distinguished not only by human posture but also by the shooting environment.

For example, data set 1 may contain a large proportion of images of yoga postures, data set 2 a large proportion of images of people captured from above, and data set 3 a large proportion of images captured in low-light environments.

Accordingly, posture estimation model 1, trained on data set 1, may have a high recognition rate for yoga postures; posture estimation model 2, trained on data set 2, may have a high recognition rate for people in images captured from above; and posture estimation model 3, trained on data set 3, may have a high recognition rate for images of low-light environments.

Therefore, the human posture estimation device 1 according to the embodiment selects, from among the plurality of posture estimation models 10, the model best suited to the environmental conditions under which the recognition target image was captured, and uses it to estimate the posture of the person in that image. As a result, even uncommon, special postures can be accurately recognized.

The human posture estimation method performed by the human posture estimation device 1 will now be described in detail with reference to FIGS. 3 to 6.

FIG. 3 is a flowchart illustrating a human posture estimation method according to an embodiment, FIG. 4 is an example of a recognition result output screen showing the result of each posture estimation model for a sample image according to an embodiment, FIG. 5 is an example diagram illustrating the selection of a posture estimation model according to an embodiment, and FIG. 6 is a diagram illustrating a sample image extraction step according to an embodiment.

Referring to FIG. 3, the human posture estimation method according to the embodiment may perform a step of extracting a sample image from an input image (S110), a step of estimating a posture by inputting the extracted sample image into each of a plurality of posture estimation models (S120), a step of selecting one posture estimation model based on the joint position confidence of the posture estimated by each of the plurality of posture estimation models (S130), and a step of estimating the posture of the input image using the selected posture estimation model (S140).
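The four steps S110 to S140 can be sketched as a single function. This is a minimal illustration, not the patented implementation: the model interface (any callable that maps an image to a list of `(x, y, confidence)` joints), the function name `estimate_with_best_model`, and the `sample_period` parameter are all assumptions introduced here.

```python
from typing import Callable, List, Sequence, Tuple

# A joint is (x, y, confidence); a posture is a list of K joints.
Joint = Tuple[float, float, float]
Pose = List[Joint]
# Hypothetical model interface: any callable mapping an image to a posture.
PoseModel = Callable[[object], Pose]


def estimate_with_best_model(frames: Sequence[object],
                             models: Sequence[PoseModel],
                             sample_period: int = 10) -> Tuple[int, List[Pose]]:
    """Run S110-S140 once and return (selected model index, postures)."""
    if not frames or not models:
        raise ValueError("frames and models must be non-empty")
    # S110: extract sample images at a fixed period.
    samples = frames[::sample_period]
    # S120: every model estimates a posture for every sample image.
    per_model = [[model(img) for img in samples] for model in models]
    # S130: pick the model whose joint confidences sum highest.
    totals = [sum(c for pose in poses for (_x, _y, c) in pose)
              for poses in per_model]
    best = max(range(len(models)), key=totals.__getitem__)
    # S140: estimate the whole input with the selected model only.
    return best, [models[best](img) for img in frames]
```

Only the sample images pass through every model; the full input is processed by the single selected model, which is where the approach saves computation.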

In the step of extracting sample images from the input image (S110), M sample images are extracted from the input image.

Here, the input image may be an image provided by the user that was captured in the environment where the posture estimation technology is to be used, or an image of a similar environment. M may be one or more, and the sample images may be extracted at a predetermined period.
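Periodic sample extraction can be sketched as a slice over the frame sequence. The function name `extract_samples` and its parameters are illustrative assumptions; the source states only that M is at least one and that samples may be taken at a predetermined period.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def extract_samples(frames: Sequence[T], period: int, max_samples: int) -> List[T]:
    """Take one frame every `period` frames, up to `max_samples` (M >= 1)."""
    if period < 1 or max_samples < 1:
        raise ValueError("period and max_samples must be at least 1")
    # Slice with a stride, then cap the number of samples at M.
    return list(frames[::period][:max_samples])
```

For a 100-frame input, `extract_samples(frames, 30, 3)` returns frames 0, 30, and 60.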

In the step of estimating the posture (S120), the M extracted sample images are input to each of the N posture estimation models 10, and a posture is estimated for each of the M images.

The posture estimated for the i-th of the M sample images by the j-th of the N posture estimation models can be expressed as in Equation 1.

The posture consists of the position, given as coordinates (x, y), and the confidence of each of K joints, as in Equation 2. Here, the confidence represents the accuracy of the joint position estimate.
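One way to write Equations 1 and 2 that is consistent with the description is given below. The notation is an assumption chosen for illustration and is not taken from the original: $I_i$ denotes the i-th sample image, $f_j$ the j-th posture estimation model, $P_{i,j}$ the resulting posture, and $(x_{i,j,k}, y_{i,j,k}, c_{i,j,k})$ the position and confidence of the k-th joint.

```latex
\begin{align}
P_{i,j} &= f_j(I_i), \qquad i = 1,\dots,M,\quad j = 1,\dots,N \tag{1} \\
P_{i,j} &= \left\{ \left(x_{i,j,k},\, y_{i,j,k},\, c_{i,j,k}\right) \right\}_{k=1}^{K} \tag{2}
\end{align}
```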

In the step of selecting a posture estimation model (S130), the optimal posture estimation model is selected based on the posture estimation results of S120. The following two embodiments are possible.

According to one embodiment, the human posture estimation device 1 may automatically select one of the posture estimation models 10 based on the confidence values calculated in S120 (S131).

That is, the human posture estimation device 1 calculates the total confidence of each posture estimation model as in Equation 3, and selects the posture estimation model with the highest total confidence as in Equation 4.
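The automatic selection can be sketched in a few lines. In assumed notation (the symbol names are not from the original), the total confidence of model j is $C_j = \sum_{i=1}^{M}\sum_{k=1}^{K} c_{i,j,k}$, where $c_{i,j,k}$ is the confidence of joint k estimated by model j for sample image i, and the selected model is $j^{*} = \arg\max_j C_j$. The function names and the `(x, y, confidence)` tuple layout below are likewise illustrative.

```python
from typing import List, Sequence, Tuple

Joint = Tuple[float, float, float]  # (x, y, confidence)
Pose = Sequence[Joint]


def total_confidence(poses: Sequence[Pose]) -> float:
    """Sum joint confidences over all sample images for one model."""
    return sum(c for pose in poses for (_x, _y, c) in pose)


def select_model(poses_per_model: Sequence[Sequence[Pose]]) -> int:
    """Return the index of the model with the highest total confidence."""
    if not poses_per_model:
        raise ValueError("at least one model is required")
    totals: List[float] = [total_confidence(p) for p in poses_per_model]
    return max(range(len(totals)), key=totals.__getitem__)


# One sample image, two joints per posture, two candidate models.
model_a = [[(10.0, 20.0, 0.9), (12.0, 40.0, 0.8)]]
model_b = [[(11.0, 21.0, 0.4), (13.0, 39.0, 0.5)]]
print(select_model([model_a, model_b]))  # prints 0
```

When two models tie, this sketch keeps the first one; the source does not say how ties are resolved.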

According to another embodiment, the human posture estimation device 1 may generate and display a recognition result output screen showing the result of each posture estimation model for the sample images, and then receive the user's selection of a posture estimation model (S133 to S135).

For example, referring to FIG. 4, the recognition result output screen may display the joint position estimation result of each posture estimation model for a sample image. The results in FIG. 4 differ between models: the elbow 401a estimated by model 1 and the elbow 401b estimated by model 2 are at different positions. Likewise, the pelvis 402a estimated by model 4 and the pelvis 402b estimated by model 6 may differ.

In this case, the human posture estimation device 1 may sort the posture estimation models shown on the screen by total confidence (S133).

The user can then review the recognition result of each displayed posture estimation model and select one posture estimation model (S135).
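The user-driven selection (S133 to S135) reduces to ranking the models for display and mapping the user's choice back to a model. The sketch below assumes the total confidences are already available as a dictionary keyed by a model name; the names `rank_models` and `choose_model` and the 1-based on-screen position are hypothetical.

```python
from typing import Dict, List, Tuple


def rank_models(totals: Dict[str, float]) -> List[Tuple[str, float]]:
    """Order models by total confidence, highest first, for display (S133)."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def choose_model(ranked: List[Tuple[str, float]], choice: int) -> str:
    """Return the model picked by its 1-based position on screen (S135)."""
    if not 1 <= choice <= len(ranked):
        raise ValueError("choice is out of range")
    return ranked[choice - 1][0]
```

Sorting by total confidence puts the most likely candidates first, while the final decision stays with the user, who can see errors that a confidence score alone may miss.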

Once the optimal posture estimation model has been selected as described above, the human posture estimation device 1 obtains the posture estimation result by inputting the input image only to the selected model (S140). That is, as shown in FIG. 5, the input image is input to the selected posture estimation model 11 to obtain the posture estimation result.

Meanwhile, even if the optimal posture estimation model is selected initially as described above, the shooting environment of the input image may change over time. For example, shooting may begin indoors and then move outdoors. In that case, a posture estimation model suited to images captured under indoor lighting may not accurately recognize images captured under outdoor lighting. Accordingly, the human posture estimation method according to the embodiment may further include a step of reselecting the posture estimation model.

That is, referring to FIG. 3, the human posture estimation device 1 determines whether the period has arrived (S150).

The period T may be constant, as shown in FIG. 6, or may be set randomly depending on the configuration.

When the period has arrived and image estimation has not been completed (S160), the human posture estimation device 1 performs S110 to S140 again. If sample images were randomly extracted and stored at the start, the repetition of the step of extracting sample images from the input image (S110) may be omitted.
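The reselection loop (S150, S160) can be sketched as follows. Several details are assumptions made for illustration: the period is counted in frames, the random period is drawn uniformly from 1 to twice the base period, and `select` is a hypothetical callback standing in for a rerun of S110 to S130.

```python
import random
from typing import Callable, List


def next_period(base: int, randomize: bool) -> int:
    """Frames until the next reselection: constant, or random if requested.

    The uniform range [1, 2 * base] is an illustrative choice only.
    """
    if base < 1:
        raise ValueError("base must be at least 1")
    return random.randint(1, 2 * base) if randomize else base


def run_with_reselection(num_frames: int,
                         select: Callable[[int], int],
                         base_period: int,
                         randomize: bool = False) -> List[int]:
    """Return the model index used for each frame.

    `select(start)` is a hypothetical callback that reruns S110-S130 on the
    input starting at frame `start` and returns the chosen model index.
    """
    used: List[int] = []
    current = select(0)                      # initial selection
    remaining = next_period(base_period, randomize)
    for frame in range(num_frames):          # stops once estimation is done (S160)
        if remaining == 0:                   # S150: the period has arrived
            current = select(frame)
            remaining = next_period(base_period, randomize)
        used.append(current)                 # S140 with the current model
        remaining -= 1
    return used
```

With a constant period of 3 frames and a callback that returns `start // 3`, six frames are processed by model 0 three times and then by model 1 three times.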

FIG. 7 is a diagram showing the configuration of a computer system according to an embodiment.

The human posture estimation device according to the embodiment may be implemented in a computer system 1000, such as a computer-readable recording medium.

The computer system 1000 may include one or more processors 1010, a memory 1030, a user interface input device 1040, a user interface output device 1050, and storage 1060, which communicate with each other via a bus 1020. The computer system 1000 may further include a network interface 1070 connected to a network 1080. The processor 1010 may be a central processing unit or a semiconductor device that executes programs or processing instructions stored in the memory 1030 or the storage 1060. The memory 1030 and the storage 1060 may be storage media including at least one of volatile media, non-volatile media, removable media, non-removable media, communication media, and information transfer media. For example, the memory 1030 may include ROM 1031 or RAM 1032.

Although embodiments of the present invention have been described above with reference to the accompanying drawings, those of ordinary skill in the art to which the present invention pertains will understand that the present invention can be implemented in other specific forms without changing its technical idea or essential features. Therefore, the embodiments described above should be understood in all respects as illustrative and not restrictive.

1: Human posture estimation device
10: Posture estimation models

Claims

A human posture estimation device comprising:
a memory in which at least one program and a plurality of posture estimation models are recorded; and
a processor that executes the program,
wherein the program performs the steps of:
extracting a sample image from an input image;
estimating a posture by inputting the extracted sample image into each of the plurality of posture estimation models;
selecting one posture estimation model based on the joint position confidence estimated by each of the plurality of posture estimation models; and
estimating the posture of the input image using only the one selected posture estimation model,
wherein, when a predetermined period has arrived and image estimation has not been completed, the program repeats the steps of:
extracting a sample image from the input image;
estimating a posture by inputting the extracted sample image into each of the plurality of posture estimation models;
selecting one posture estimation model based on the joint position confidence estimated by each of the plurality of posture estimation models; and
estimating the posture of the input image using only the one selected posture estimation model,
wherein the predetermined period is set to be constant or random, and
wherein each of the plurality of posture estimation models is trained in advance on image data sets captured under different conditions.

The device of claim 1, wherein the step of selecting a posture estimation model comprises:
calculating the total confidence of each posture estimation model and selecting the posture estimation model with the highest total confidence.

The device of claim 1, wherein the step of selecting a posture estimation model comprises:
generating and displaying a recognition result output screen showing the joint position estimation result of each of the posture estimation models for the sample image, and then receiving a selection of a posture estimation model from the user.

The device of claim 3, wherein the step of selecting a posture estimation model comprises:
calculating the total confidence of each posture estimation model, and sorting and displaying the joint position estimation results of the posture estimation models on the recognition result output screen based on the total confidence.

(Deleted)

A human posture estimation method comprising the steps of:
extracting a sample image from an input image;
estimating a posture by inputting the extracted sample image into each of a plurality of posture estimation models;
selecting one posture estimation model based on the joint position confidence estimated by each of the plurality of posture estimation models; and
estimating the posture of the input image using only the one selected posture estimation model,
wherein, when a predetermined period has arrived and image estimation has not been completed, the following steps are repeated:
extracting a sample image from the input image;
estimating a posture by inputting the extracted sample image into each of the plurality of posture estimation models;
selecting one posture estimation model based on the joint position confidence estimated by each of the plurality of posture estimation models; and
estimating the posture of the input image using only the one selected posture estimation model,
wherein the predetermined period is set to be constant or random, and
wherein each of the plurality of posture estimation models is trained in advance on image data sets captured under different conditions.

The method of claim 6, wherein the step of selecting a posture estimation model comprises:
calculating the total confidence of each posture estimation model and selecting the posture estimation model with the highest total confidence.

The method of claim 6, wherein the step of selecting a posture estimation model comprises:
generating and displaying a recognition result output screen showing the joint position estimation result of each of the posture estimation models for the sample image, and then receiving a selection of a posture estimation model from the user.

The method of claim 8, wherein the step of selecting a posture estimation model comprises:
calculating the total confidence of each posture estimation model, and sorting and displaying the joint position estimation results of the posture estimation models on the recognition result output screen based on the total confidence.

(Deleted)