KR102667880B1

KR102667880B1 - beauty educational content generating apparatus and method therefor

Info

Publication number: KR102667880B1
Application number: KR1020210134970A
Authority: KR
Inventors: 김은혜
Original assignee: (주)은혜컴퍼니
Priority date: 2021-05-10
Filing date: 2021-10-12
Publication date: 2024-06-20
Also published as: KR102314103B1; KR20220152908A

Abstract

The present invention relates to an apparatus and method for generating beauty education content. The apparatus may include: a motion information collection unit that uses a plurality of depth cameras to capture, from all directions, the movement of at least one person performing a beauty procedure, and collects motion information including image data and transmission/reception time data corresponding to the image data; a behavior information generation unit that inputs the image data of each depth camera included in the motion information into an artificial-neural-network-based image analysis model to generate first behavior information including the movement path, movement speed, and movement angle of each body part, analyzes the transmission/reception time data of each depth camera to generate a plurality of consecutive depth data, velocity data, and amplitude data, removes noise according to the magnitude of the amplitude by applying a cumulative distribution function and a noise removal function to the amplitude data, and generates second behavior information by median-averaging the noise-removed depth, velocity, and amplitude data for each body part obtained from each depth camera; a final behavior information generation unit that aligns the first and second behavior information in time to calculate an error and removes the calculated error using a continuous weighted median filter to produce final behavior information; a 3D image generation unit that inputs the movement path, movement speed, and movement angle of each body part over time included in the final behavior information, together with the image data, into a motion image generation model to generate 3D composite image data of the person's movement that can be rotated through 360 degrees; and a beauty content generation unit that analyzes the movement path, movement speed, and movement angle of each body part to generate informational text and overlays that text as subtitles on the generated 3D composite image data to produce beauty content.

Description

Beauty educational content generating apparatus and method therefor

The present invention relates to a technology for generating beauty education content. More specifically, its purpose is to provide an apparatus and method for automatically generating beauty content that films a beauty expert's movements with a plurality of depth cameras, analyzes the captured data, and automatically generates beauty education content as a 360-degree 3D video. So that each action can be clearly identified, the content renders a human figure in 3D by connecting body parts and overlays, as subtitles, informational text generated by analyzing the movement path, movement speed, and movement angle of each body part, so that viewers can easily learn beauty movements composed of complex motions.

With K-POP in the lead, Korean dramas and films are steadily gaining popularity worldwide, and interest is accordingly growing in the beauty field known as K-Beauty, including Korean makeup methods, massage methods, and cosmetic procedures.

Mastery of beauty movements is a basic prerequisite for working in this field, and until now the only way to learn these movements was to attend an educational institution and take an in-person course at considerable cost.

However, this is a practically difficult option for learners in rural areas or overseas, it is relatively expensive, and it is subject to many restrictions, particularly in the current environment of strengthened social distancing due to the coronavirus.

To overcome these restrictions, a great deal of beauty education video content has been uploaded to video-based social networks such as YouTube. Producing such videos, however, requires a camera operator, an editor, and others in addition to the person actually performing the beauty movements, and the script for the subtitles must also be written and applied separately.

Accordingly, there is a growing need for technology that lets beauty experts easily create beauty content without separate help from others.

Republic of Korea Patent Publication No. 10-2020-0089640 (2020.07.27.)

An object of the present invention is to provide automatic beauty content generation technology that lets a beauty expert easily create beauty content without separate help from others. Once a plurality of depth cameras are installed, the beauty expert only needs to perform the beauty movements: the depth cameras capture the performance, the captured image data and transmission/reception data are analyzed, 3D composite image data of the person's movement is generated from behavior information including the movement path, movement speed, and movement angle of each body part, and informational text generated by analyzing that path, speed, and angle information is overlaid as subtitles. In this way, beauty-related video content in which the expert's movements are rendered as 3D video can be generated without help from others and provided to trainees.

According to an embodiment of the present invention, an apparatus for automatically generating beauty content may include: a motion information collection unit that uses a plurality of depth cameras to capture, from all directions, the movement of at least one person performing a beauty procedure, and collects motion information including image data and transmission/reception time data corresponding to the image data; a behavior information generation unit that inputs the image data of each depth camera included in the motion information into an artificial-neural-network-based image analysis model to generate first behavior information including the movement path, movement speed, and movement angle of each body part, analyzes the transmission/reception time data of each depth camera to generate a plurality of consecutive depth data, velocity data, and amplitude data, removes noise according to the magnitude of the amplitude by applying a cumulative distribution function and a noise removal function to the amplitude data, and generates second behavior information by median-averaging the noise-removed depth, velocity, and amplitude data for each body part obtained from each depth camera; a final behavior information generation unit that aligns the first and second behavior information in time to calculate an error and removes the calculated error using a continuous weighted median filter to produce final behavior information; a 3D image generation unit that inputs the movement path, movement speed, and movement angle of each body part over time included in the final behavior information, together with the image data, into a motion image generation model to generate 3D composite image data of the person's movement that can be rotated through 360 degrees; and a beauty content generation unit that analyzes the movement path, movement speed, and movement angle of each body part to generate informational text and overlays that text as subtitles on the generated 3D composite image data to produce beauty content.
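As an illustration of the quantities named in the first behavior information, the following Python sketch derives a path length, per-step speeds, and turning angles from the timestamped 3D positions of one body part. It is only a minimal sketch under stated assumptions: the function name `kinematics`, the use of metres and seconds, and the reading of "movement angle" as the angle between consecutive displacement vectors are not defined in the text, and in the described apparatus these values are produced by a neural-network image analysis model rather than by direct differencing.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def kinematics(times: Sequence[float],
               points: Sequence[Point]) -> Tuple[float, List[float], List[float]]:
    """Return (path_length, speeds, turn_angles_deg) for one tracked body part.

    Hypothetical helper: 'movement angle' is assumed to mean the angle
    between consecutive displacement vectors.
    """
    if len(times) != len(points) or len(points) < 2:
        raise ValueError("need at least two timestamped points")

    steps = []                 # displacement vector of each time step
    lengths: List[float] = []  # length of each displacement
    speeds: List[float] = []   # length divided by elapsed time
    for i in range(1, len(points)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        step = tuple(b - a for a, b in zip(points[i - 1], points[i]))
        length = math.sqrt(sum(c * c for c in step))
        steps.append(step)
        lengths.append(length)
        speeds.append(length / dt)

    angles: List[float] = []
    for i in range(1, len(steps)):
        if lengths[i - 1] == 0.0 or lengths[i] == 0.0:
            continue  # direction is undefined while the part is stationary
        dot = sum(a * b for a, b in zip(steps[i - 1], steps[i]))
        # Clamp to [-1, 1] to guard acos against rounding error.
        cos_angle = max(-1.0, min(1.0, dot / (lengths[i - 1] * lengths[i])))
        angles.append(math.degrees(math.acos(cos_angle)))

    return sum(lengths), speeds, angles
```

For example, a fingertip that moves one unit along x and then one unit along y, one second per step, yields a path length of 2, two speeds of 1, and a single 90-degree turn.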

According to an embodiment of the present invention, the behavior information generation unit may further include a first behavior information generation unit and a second behavior information generation unit. The first behavior information generation unit includes an artificial-neural-network-based image analysis model formed of a feature point extraction module and a behavior information analysis module. It inputs the image data into the feature point extraction module to extract feature points, labels the extracted feature points by preset body part, inputs the per-body-part feature points and the image data into the behavior information analysis module to generate a feature map including the movement path, movement speed, and movement angle of each feature point, and applies an output activation function to the generated feature map to output the first behavior information. The second behavior information generation unit analyzes the transmission/reception time data of each depth camera to generate continuous depth data, velocity data, and amplitude data for each preset body part, removes the noise detected by applying a cumulative distribution function and a noise function to the amplitude data according to the magnitude of the amplitude, uses a continuous probability distribution function to normalize the data to a constant amplitude regardless of how much light the surface absorbs, and averages the resulting per-body-part depth, velocity, and amplitude data from each depth camera with a median filter to generate the second behavior information.
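The amplitude-based noise removal and median averaging described above can be sketched roughly as follows. The text does not give the cumulative distribution function, the noise function, or any thresholds, so this sketch substitutes an empirical CDF over the observed amplitudes and simply discards readings whose amplitude falls in the lowest fraction; `empirical_cdf`, `fuse_depth`, and `min_quantile` are hypothetical names and values, not the patented method.

```python
import statistics
from typing import List, Sequence, Tuple


def empirical_cdf(values: Sequence[float], x: float) -> float:
    """Fraction of observed values that are less than or equal to x."""
    return sum(v <= x for v in values) / len(values)


def fuse_depth(samples: List[Tuple[float, float]],
               min_quantile: float = 0.25) -> float:
    """Fuse (depth, amplitude) readings of one body part from several cameras.

    Assumption: a weak return (low amplitude) is treated as noise. Readings
    whose amplitude lies at or below `min_quantile` of the empirical CDF are
    dropped, and the median of the remaining depths is returned.
    """
    if not samples:
        raise ValueError("no samples to fuse")
    amplitudes = [a for _, a in samples]
    kept = [d for d, a in samples if empirical_cdf(amplitudes, a) > min_quantile]
    if not kept:  # every reading was rejected: fall back to all depths
        kept = [d for d, _ in samples]
    return statistics.median(kept)
```

The median is used for the final step because, unlike the mean, it is not pulled toward a single camera that reports a grossly wrong depth.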

According to an embodiment of the present invention, the final behavior information generation unit inputs the first and second behavior information into an artificial-neural-network-based final behavior information generation model, compares the movement path, movement speed, and movement angle of each body part in the two, and calculates the differences as errors. It applies a weight to each item of data in which an error occurred and median-averages the resulting values to remove the error, thereby generating final behavior information with a single movement path, movement speed, and movement angle per body part. An artificial-neural-network-based judgment model then evaluates accuracy by comparing the 3D image generated from the final behavior information with the image data. A first expected value is set for the reference behavior information used to judge that the generated 3D image is close enough to the image data for the actions to be considered matching, and the difference between the generated 3D image and the first expected value is calculated as a first difference value. A second expected value is set for the reference behavior information used to judge that the generated 3D image is not close enough to be considered matching, and the difference between the generated 3D image and the second expected value is calculated as a second difference value. The sum of the first and second difference values is taken as the classification loss of the artificial neural network constituting the final behavior information generation model, the weights are fixed so that this classification loss is minimized, and the weights applied to each item of erroneous data in the final behavior information generation model are updated accordingly.
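The weighted median step can be illustrated with a short Python sketch. The text names a "continuous weighted median filter" without defining its window or weights, so the 3-sample window, the `(1, 2, 1)` kernel, and the edge handling below are assumptions chosen only to make the idea concrete; in the described apparatus the weights are learned by the final behavior information generation model.

```python
from typing import List, Sequence


def weighted_median(values: Sequence[float], weights: Sequence[float]) -> float:
    """Smallest value at which the cumulative weight reaches half the total."""
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and equal length")
    half = sum(weights) / 2.0
    acc = 0.0
    for value, weight in sorted(zip(values, weights)):
        acc += weight
        if acc >= half:
            return value
    return max(values)  # only reached if rounding prevents acc from reaching half


def weighted_median_filter(signal: Sequence[float],
                           kernel: Sequence[float] = (1.0, 2.0, 1.0)) -> List[float]:
    """Slide a weighted median over the signal, repeating the edge samples."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    out: List[float] = []
    for i in range(len(signal)):
        window = [signal[min(max(i + k - radius, 0), last)]
                  for k in range(len(kernel))]
        out.append(weighted_median(window, kernel))
    return out
```

Applied to a speed trace such as `[1, 1, 9, 1, 1]`, the filter replaces the isolated spike with its neighbours' value while leaving the rest unchanged, which is the behaviour wanted when one of the two behavior estimates briefly disagrees with the other.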

According to an embodiment of the present invention, the 3D image generation unit includes a motion image generation model that is implemented as a convolutional artificial neural network made up of a plurality of computation layers and that includes a background generation module, a motion image generation module, and a 3D image synthesis module. When the image data is input, the 3D image synthesis module extracts a plurality of background images excluding the objects in the image data, preprocesses the extracted background data by matching it to a reference image, analyzes the preprocessed background data to set descriptor data for each section, derives the descriptor data shared between spatially adjacent images, and registers the images on the basis of the derived descriptor data to generate background image data that can be rotated through 360 degrees. The motion image generation module connects the preset body parts to build a 3D human figure and generates motion image data in which each body part of the figure moves according to the movement path, movement speed, and movement angle given for it in the final behavior information. The 3D image synthesis module then composites the motion image of the 3D virtual figure onto the 360-degree rotatable background according to position and angle to generate the composite image data.

According to an embodiment of the present invention, the motion information collection unit performs depth measurement in two stages when collecting the transmission/reception time data. The first depth measurement uses a low modulation frequency to measure the preset per-body-part regions of interest at low measurement quality. The second depth measurement uses a high modulation frequency, based on the first measurement's results for those regions of interest, to measure depth at high measurement quality and so raise measurement precision. The low modulation frequency is selected based on Equation 1,

[Equation 1]

The high modulation frequency is selected as a value inversely proportional to the standard deviation measured when the low modulation frequency is used. If the standard deviation is smaller than a preset threshold, the signal-to-noise ratio is judged to be high, and a relatively higher frequency may be used than when the standard deviation exceeds the threshold. The second depth measurement may be performed multiple times so that the repeated measurements raise measurement precision.
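The second-stage frequency choice can be sketched in Python as below. Equation 1 is not reproduced in this text, so the low modulation frequency is not computed here; only the stated inverse relationship between the first-stage standard deviation and the high modulation frequency is shown. The proportionality constant `k_hz_m`, the clipping bounds, and the function names are hypothetical values chosen for illustration.

```python
import statistics
from typing import Sequence


def select_high_frequency(sigma_low_m: float,
                          k_hz_m: float = 1.0e5,
                          f_min_hz: float = 2.0e7,
                          f_max_hz: float = 1.0e8) -> float:
    """Pick the second-stage modulation frequency as k / sigma.

    sigma_low_m is the depth standard deviation (metres) observed during
    the low-frequency pass. The constant and the bounds are assumptions.
    """
    if sigma_low_m <= 0.0:
        raise ValueError("standard deviation must be positive")
    # A smaller deviation (higher SNR) gives a higher frequency, clipped to bounds.
    return max(f_min_hz, min(f_max_hz, k_hz_m / sigma_low_m))


def refine_depth(repeated_depths_m: Sequence[float]) -> float:
    """Average repeated second-stage measurements of one region of interest."""
    return statistics.fmean(repeated_depths_m)
```

A region measured with a 2 mm deviation in the first pass would, under these assumed constants, be re-measured at about 50 MHz, while a noisier region would be held at the lower bound.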

With the apparatus and method for automatically generating beauty content implemented according to an embodiment of the present invention, it is enough to set up, on stands, the positions of a plurality of depth cameras able to capture a beauty expert's movements. Without separate help from others, the depth cameras capture the movements, the captured image data and transmission/reception data are analyzed, 3D composite image data of the person's movement is generated from behavior information including the movement path, movement speed, and movement angle of each body part, and informational text generated by analyzing that information is overlaid as subtitles. Beauty-related video content in which the expert's movements are rendered as 3D video can thus be generated quickly and easily and provided to trainees. Because the result is not footage of the actual person but a 3D video in which movements can be identified for each detailed body part, accompanied by informational subtitles, trainees can grasp the training content more clearly.

FIG. 1 is a configuration diagram of an apparatus for automatically generating beauty content implemented according to an embodiment of the present invention.
FIG. 2 is a detailed configuration diagram of the behavior information generation unit shown in FIG. 1.
FIG. 3 is a diagram showing an image analysis model implemented according to an embodiment of the present invention.
FIG. 4 is a diagram showing an image analysis model including a feature point extraction module and a behavior information analysis module implemented according to an embodiment of the present invention.
FIG. 5 is a diagram showing the detailed modules of a motion image generation model implemented according to an embodiment of the present invention.
FIG. 6 is a diagram showing a motion image generation model implemented according to an embodiment of the present invention.
FIG. 7 is a diagram showing a motion image generation model including a background generation module, a motion image generation module, and a 3D image synthesis module implemented according to an embodiment of the present invention.
FIG. 8 is a flowchart of a method for automatically generating beauty content according to an embodiment of the present invention.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings so that a person of ordinary skill in the art to which the invention pertains can readily practice it. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described here.

The terms used in the present invention are used only to describe specific embodiments and are not intended to limit the invention. Singular expressions include plural expressions unless the context clearly indicates otherwise.

In the present invention, terms such as "comprise" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood as not excluding in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used here, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention pertains.

Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art and, unless expressly defined in the present invention, are not to be interpreted in an idealized or excessively formal sense.

It will also be understood that each block of the drawings, and combinations of the flowchart drawings, can be carried out by computer program instructions. Because these instructions can be loaded onto the processor of a general-purpose computer, a special-purpose computer, or other programmable data processing equipment, the instructions executed by that processor create means for performing the functions described in the flowchart block(s).

These computer program instructions can also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, so the instructions stored in that memory can produce an article of manufacture containing instruction means that perform the functions described in the flowchart block(s).

The computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps is performed on that equipment to produce a computer-executed process, and the instructions that operate the equipment can provide steps for executing the functions described in the flowchart block(s).

In addition, each block may represent a module, segment, or portion of code that contains one or more executable instructions for carrying out the specified logical function(s).

It should also be noted that in some alternative embodiments the functions mentioned in the blocks may occur out of order. For example, two blocks shown in succession may in fact be executed substantially simultaneously, or the blocks may sometimes be executed in reverse order, depending on the functions involved.

The term "unit" as used in this embodiment means software or a hardware component such as an FPGA (Field-Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit), and a "unit" performs certain roles.

A "unit" is not, however, limited to software or hardware. A "unit" may be configured to reside in an addressable storage medium and may be configured to run on one or more processors.

Thus, as an example, a "unit" includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables. The functions provided by the components and "units" may be combined into a smaller number of components and "units" or further separated into additional components and "units". In addition, the components and "units" may be implemented to run on one or more CPUs within a device or a secure multimedia card.

The embodiments of the present invention are described in detail mainly with reference to a specific system, but the main subject matter claimed in this specification can also be applied, without departing significantly from the scope disclosed here, to other communication systems and services with a similar technical background, at the discretion of a person skilled in the relevant art.

이하, 도면을 참조하여 본 발명의 실시 예에 따른 미용 콘텐츠 자동 생성 장치 및 그 방법에 대하여 설명한다.Hereinafter, an apparatus and method for automatically generating beauty content according to an embodiment of the present invention will be described with reference to the drawings.

도 1은 본 발명의 실시예에 따라 구현된 미용 콘텐츠 자동 생성 장치(10)의 구성도이다.Figure 1 is a configuration diagram of an automatic beauty content generation device 10 implemented according to an embodiment of the present invention.

도 1을 참조하면 미용 콘텐츠 자동 생성 장치(10)는 동작 정보 수집부(100), 행동 정보 생성부(200), 최종 행동 정보 생성부(300), 3D 영상 생성부(400), 미용 콘텐츠 생성부(500)를 포함할 수 있다.Referring to Figure 1, the beauty content automatic generation device 10 includes a motion information collection unit 100, a behavior information generation unit 200, a final behavior information generation unit 300, a 3D image generation unit 400, and beauty content generation. It may include unit 500.

The motion information collection unit 100 may use a plurality of depth cameras to capture, from all directions, the movement of at least one person performing a beauty procedure, and may collect motion information that includes image data and, corresponding to the image data, transmission/reception time data for each preset body part.

According to an embodiment of the present invention, the depth camera may be a camera that emits a pulse-modulated infrared (IR) beam toward a target from the origin of a spherical coordinate system, scans horizontally (pan, φ) and vertically (tilt, θ), and acquires three-dimensional image information of the target within the background based on the time taken for the beam to be reflected by back scattering at the point-wise distribution of discrete points (r, θ, φ) on the sphere surface and to return to the origin, that is, the transmission/reception time.

According to an embodiment of the present invention, the plurality of depth cameras may be installed at set angles so that the movement of at least one person can be captured from all directions, and may collect image data together with transmission/reception time data for each preset body part that corresponds in time to the image data.

Here, the preset body parts are parts of the human body from which a person's motion can be identified. They may be set mainly around the extremities of the body and the movable joints, for example the fingertips and knuckles of both hands, the wrists, elbows, shoulders, neck, both ends of each facial feature, the hip joints, knees, ankles, and the toe tips and toe joints of both feet.

본 발명의 일 실시예에 따르면 동작 정보 수집부(100)는 송수신 시간 데이터를 수집함에 있어, 2단계의 깊이 측정을 수행할 수 있다.According to one embodiment of the present invention, the motion information collection unit 100 may perform two-stage depth measurement when collecting transmission and reception time data.

본 발명의 일 실시예에 따르면 2단계의 깊이 측정은 제1 깊이 측정 단계 및 제2 깊이 측정 단계로 구분할 수 있다.According to one embodiment of the present invention, the two-stage depth measurement can be divided into a first depth measurement stage and a second depth measurement stage.

The depth measurement is divided into two stages, as in the above embodiment, because any single modulation frequency has a drawback: measuring depth with a low modulation frequency widens the maximum range but lowers the measurement quality, whereas measuring with a high modulation frequency improves the measurement quality but narrows the maximum range. The two-stage approach is intended to compensate for these drawbacks as far as possible.

본 발명의 일 실시예에 따르면 제1 깊이 측정은 낮은 변조 주파수를 이용하여 깊이를 측정하여 미리 설정된 신체 부위별 관심 영역에 대하여 낮은 측정 품질로 측정을 수행할 수 있다.According to an embodiment of the present invention, the first depth measurement may measure depth using a low modulation frequency and perform measurement with low measurement quality for a preset region of interest for each body part.

According to an embodiment of the present invention, the second depth measurement may measure depth using a high modulation frequency, based on the result of the first depth measurement for the region of interest of each body part, thereby raising the measurement precision with high measurement quality.

According to an embodiment of the present invention, the first depth measurement, which uses a low modulation frequency so as to have a wide maximum depth range, may measure approximate depth over a wide depth range. Because measurement quality is proportional to the modulation frequency, the result of the first depth measurement provides low measurement quality over a wide region of interest.

According to an embodiment of the present invention, the maximum depth range of the second depth measurement may be set based on the precision of the first depth measurement, and a relatively higher frequency may be selected to provide high measurement quality over a narrow region of interest, thereby compensating for the error in the first depth measurement result.

본 발명의 일 실시예에 따르면 낮은 변조 주파수는 수학식 1를 기반으로 선정될 수 있다.According to one embodiment of the present invention, a low modulation frequency can be selected based on Equation 1.

The quantities in Equation 1 are the low modulation frequency, the speed of light, and the maximum depth range.
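Equation 1 itself is not reproduced in this text. As a hedged illustration only, the sketch below uses the standard unambiguous-range relation for continuous-wave time-of-flight sensing, f = c / (2 · d_max), which involves exactly the three quantities named above. Whether Equation 1 has this exact form is an assumption, and the function names are hypothetical.

```python
# Assumption: Equation 1 follows the standard continuous-wave ToF relation
# f = c / (2 * d_max). Function names are illustrative, not from the patent.

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def low_modulation_frequency(max_depth_m: float) -> float:
    """Modulation frequency (Hz) whose unambiguous range equals max_depth_m."""
    if max_depth_m <= 0:
        raise ValueError("max_depth_m must be positive")
    return SPEED_OF_LIGHT / (2.0 * max_depth_m)


def max_unambiguous_depth(frequency_hz: float) -> float:
    """Inverse relation: maximum unambiguous depth (m) for a frequency (Hz)."""
    if frequency_hz <= 0:
        raise ValueError("frequency_hz must be positive")
    return SPEED_OF_LIGHT / (2.0 * frequency_hz)


# Example: frequency that covers a 7.5 m range, and the round trip back.
f_low = low_modulation_frequency(7.5)
print(f"{f_low / 1e6:.2f} MHz covers {max_unambiguous_depth(f_low):.2f} m")
```

Under this relation a lower frequency gives a longer range, which matches the trade-off described for the first depth measurement.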

According to an embodiment of the present invention, the high modulation frequency may be selected as a value inversely proportional to the standard deviation measured when the low modulation frequency is used. If the standard deviation is smaller than a preset threshold, the signal-to-noise ratio is judged to be high, and a relatively higher frequency may be selected than when the standard deviation exceeds the threshold.

Here, the standard deviation may be calculated based on Equation 2.

In Equation 2, the result is the standard deviation to be calculated, dp is the depth measured in the first depth measurement, μ is the mean of dp over the region of interest (RoI), and N is the number of pixels in the region of interest, which is a natural number.
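The definitions above (per-pixel depth dp, RoI mean μ, pixel count N) are enough to sketch the computation. The following is a minimal sketch that assumes the population form of the standard deviation (division by N) and a hypothetical proportionality constant `k` for the inverse relation between the standard deviation and the high modulation frequency. Neither the exact form of Equation 2 nor any value of `k` is given in the text.

```python
import math


def roi_depth_std(depths):
    """Population standard deviation of first-stage depths inside one RoI.

    Assumption: Equation 2 divides by N (the number of RoI pixels).
    """
    n = len(depths)
    if n == 0:
        raise ValueError("the region of interest contains no pixels")
    mu = sum(depths) / n  # mean depth over the RoI
    return math.sqrt(sum((d - mu) ** 2 for d in depths) / n)


def high_modulation_frequency(sigma, k):
    """High frequency chosen inversely proportional to sigma.

    k is a hypothetical tuning constant; the patent does not specify one.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return k / sigma


roi = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative depths in metres
sigma = roi_depth_std(roi)
print(sigma, high_modulation_frequency(sigma, k=1.0e8))
```

A smaller spread in the first-stage depths yields a higher second-stage frequency, as the description requires.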

본 발명의 일 실시예에 따르면 제1 깊이 측정 단계 및 제2 깊이 측정 단계를 다수 수행하여 반복된 측정을 통해 측정 정밀도를 상승시킬 수 있다.According to an embodiment of the present invention, measurement precision can be increased through repeated measurements by performing multiple first and second depth measurement steps.

행동 정보 생성부(200)는 동작 정보에 포함된 각 깊이 카메라별 영상 데이터를 인공 신경망 기반의 영상 분석 모델에 입력하여 각 신체 부위별 동선, 이동 속도 및 이동 각도를 포함하는 제1 행동 정보를 생성할 수 있다.The behavior information generator 200 inputs the image data for each depth camera included in the motion information into an artificial neural network-based image analysis model to generate first behavior information including the movement line, movement speed, and movement angle for each body part. can do.

According to an embodiment of the present invention, the artificial neural network-based image analysis model may receive the image data of each depth camera at its input layer and output movement line, movement speed, and movement angle information for each preset body part, and may be trained to generate the first behavior information by grouping this information by body part.

The behavior information generation unit 200 may also analyze the transmission/reception time data of each depth camera included in the motion information to generate a plurality of consecutive depth data, speed data, and amplitude data; remove noise using a cumulative distribution function and a noise removal function according to the magnitude of the amplitude, based on the amplitude data; and generate second behavior information by averaging the noise-removed depth data, speed data, and amplitude data of each depth camera to a median value.

본 발명의 일 실시예에 따르면 각 깊이 카메라 별로 생성된 신체 부위별 송수신 시간 데이터를 분석하여 각 신체 부위별로 시간의 흐름에 따라 연속된 깊이 데이터, 속도 데이터, 진폭 데이터를 생성할 수 있다.According to an embodiment of the present invention, continuous depth data, speed data, and amplitude data can be generated for each body part over time by analyzing transmission and reception time data for each body part generated for each depth camera.

Here, depth data may refer to data from which the three-dimensional position of the region of interest can be recognized, and speed data may refer to the movement speed calculated from the change in position over time based on the depth data. Amplitude data may refer to the intensity of the light reflected back from a surface: the light emitted by the depth camera is attenuated to a different degree depending on the surface, so the reflected light intensity differs from surface to surface.
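As an illustration of how depth and speed data could be derived, the sketch below converts a round-trip (transmission/reception) time to a depth with d = c · t / 2 and computes speed from consecutive 3D positions by finite differences. The function names and the sample values are hypothetical, and the patent does not state which numerical scheme it uses.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def depth_from_round_trip(round_trip_s):
    """Depth (m) from the time light takes to reach the surface and return."""
    return SPEED_OF_LIGHT * round_trip_s / 2.0


def speeds_from_positions(times_s, positions_m):
    """Speed (m/s) between each pair of consecutive 3D positions."""
    if len(times_s) != len(positions_m):
        raise ValueError("times and positions must have the same length")
    speeds = []
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Euclidean distance travelled divided by elapsed time.
        speeds.append(math.dist(positions_m[i - 1], positions_m[i]) / dt)
    return speeds


print(depth_from_round_trip(2.0e-8))
print(speeds_from_positions([0.0, 0.5, 1.0],
                            [(0.0, 0.0, 0.0), (0.3, 0.4, 0.0), (0.3, 0.4, 1.0)]))
```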

본 발명의 일 실시예에 따르면 진폭 데이터를 이용하여 진폭의 크기에 따라 누적 분포 함수 및 노이즈 함수를 이용하여 노이즈를 제거할 수 있다. According to an embodiment of the present invention, noise can be removed using amplitude data using a cumulative distribution function and a noise function depending on the size of the amplitude.

According to the above embodiment, the amplitude data may indicate how bright the light emitted by the depth camera is after reflection, based on the characteristic that the light intensity at a surface differs according to the degree of amplitude of that surface.

본 발명의 일 실시예에 따르면 깊이 카메라에 사용되는 비이상적인 파형으로 인해 생기는 잡음과 객체의 경계에 생기는 노이즈를 제거하기 위해 누적 분포 함수 및 노이즈 함수를 사용하여 필터링을 수행할 수 있다.According to an embodiment of the present invention, filtering can be performed using a cumulative distribution function and a noise function to remove noise caused by a non-ideal waveform used in a depth camera and noise generated at the boundary of an object.

According to an embodiment of the present invention, the cumulative distribution function may be computed based on Equation 3.

Φ(x)는 누적 분포 함수를 나타내며 NF는 노이즈 함수를 의미할 수 있다.Φ(x) represents the cumulative distribution function and NF may represent the noise function.

본 발명의 일 실시예에 따르면 노이즈 함수(NF)는 수학식 4를 기반으로 수행될 수 있다.According to one embodiment of the present invention, the noise function (NF) can be performed based on Equation 4.

행동 정보 생성부(200)는 도 2를 참고하여 더 자세하게 설명하도록 한다.The behavior information generator 200 will be described in more detail with reference to FIG. 2 .

The final behavior information generation unit 300 may match the first behavior information and the second behavior information on a time basis to calculate an error, and may remove the calculated error using a continuous weighted median filter to generate the final behavior information.

According to an embodiment of the present invention, the first behavior information and the second behavior information may be input into an artificial neural network-based final behavior information generation model. The model compares the movement line, movement speed, and movement angle of each body part contained in the first and second behavior information and calculates the differences as errors. It then applies a weight to each data item in which an error occurred and averages the resulting values to a median value to remove the error, thereby generating final behavior information having a single set of depth data, speed data, and amplitude data for each body part.

본 발명의 일 실시예에 따르면 최종 행동 정보 생성 모델은 동작 분석 모듈 및 정합 모듈을 포함할 수 있다.According to one embodiment of the present invention, the final behavior information generation model may include a motion analysis module and a matching module.

According to an embodiment of the present invention, the motion analysis module may receive the depth data, speed data, and amplitude data for each body part generated for each depth camera and included in the second behavior information, and may output movement line, movement speed, and movement angle information for each body part.

According to an embodiment of the present invention, the matching module may receive the movement line, movement speed, and movement angle of each body part included in the first behavior information and those of each body part included in the second behavior information, and may calculate the differences between them as errors.
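The time-based matching described above can be sketched as pairing samples from the two sources by nearest timestamp and taking their difference as the error. This is only an illustrative stand-in for the neural-network-based matching module. The function name, the `tolerance_s` parameter, and the sample values are assumptions.

```python
def match_by_time(first, second, tolerance_s=0.02):
    """Pair each (time, value) sample in `first` with the nearest-in-time
    sample in `second`; return (time, v1, v2, v1 - v2) for pairs whose
    timestamps differ by at most tolerance_s (a hypothetical threshold)."""
    matched = []
    if not second:
        return matched
    for t1, v1 in first:
        t2, v2 = min(second, key=lambda sample: abs(sample[0] - t1))
        if abs(t2 - t1) <= tolerance_s:
            matched.append((t1, v1, v2, v1 - v2))
    return matched


# Illustrative speed samples (s, m/s) for one body part from each source.
first_info = [(0.0, 1.0), (0.1, 1.2), (0.2, 1.5)]
second_info = [(0.01, 1.1), (0.11, 1.2), (0.5, 9.0)]
for row in match_by_time(first_info, second_info):
    print(row)
```

Samples with no counterpart inside the tolerance are simply left unmatched in this sketch; how the actual system treats them is not stated in the text.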

According to the above embodiment, a weight may be applied to each data item in which an error occurred, and the resulting values may be averaged to a median value to remove the error, so that a single movement line, movement speed, and movement angle is output for each body part and final behavior information including them is generated.

Here, weights may be assigned per body part to the data in which an error occurred, and also per source, that is, to the first behavior information and to the second behavior information; the final behavior information may then be generated by averaging the weighted values to their median value. According to an embodiment of the present invention, the weight values may be updated by an artificial neural network-based judgment model so that accuracy is relatively improved.
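The text does not define the continuous weighted median filter in detail. As a hedged stand-in, the sketch below implements a plain discrete (lower) weighted median: candidate values are sorted, and the first value at which the cumulative weight reaches half of the total weight is returned. The function name and the sample weights are hypothetical.

```python
def weighted_median(values, weights):
    """Lower weighted median: the smallest value whose cumulative weight
    reaches at least half of the total weight."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= total / 2.0:
            return value
    return max(values)  # not reached for valid input; kept as a safeguard


# Conflicting speed estimates for one body part, with illustrative weights.
print(weighted_median([1.0, 2.0, 10.0], [1.0, 1.0, 1.0]))
print(weighted_median([1.0, 2.0, 10.0], [1.0, 1.0, 5.0]))
```

With equal weights this reduces to the ordinary median; raising the weight of one source pulls the result toward that source's estimate.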

본 발명의 일 실시예에 따르면 최종 행동 정보에 따라 생성된 3D 영상과 영상 데이터를 비교하여 정확도를 평가하는 인공 신경망 기반의 판단 모델을 이용할 수 있다.According to an embodiment of the present invention, an artificial neural network-based decision model that evaluates accuracy by comparing 3D images generated according to final action information and image data can be used.

According to the above embodiment, using the artificial neural network-based judgment model, a first expected value may be set for reference behavior information used to judge that the 3D image generated from the final behavior information can be evaluated as close enough to the image data for the behavior to be regarded as matching, and the difference between the 3D image generated from the final behavior information and the first expected value may be calculated as a first difference value.

Likewise, a second expected value may be set for reference behavior information used to judge that the 3D image generated from the final behavior information cannot be evaluated as close enough to the image data for the behavior to be regarded as matching, and the difference between the 3D image generated from the final behavior information and the second expected value may be calculated as a second difference value.

According to the above embodiment, the sum of the first difference value and the second difference value may be calculated as the discrimination loss value of the artificial neural network constituting the final behavior information generation model, the weights may be fixed at the values that minimize the discrimination loss value, and the weights that the final behavior information generation model applies to each data item in which an error occurred may be updated accordingly.

3D 영상 생성부(400)는 최종 행동 정보에 포함된 시간의 흐름에 따른 각 신체 부위별 동선, 이동 속도 및 이동 각도 및 영상 데이터를 동작 영상 생성 모델에 입력하여 360도 회전이 가능한 사람의 움직임에 대한 3D 합성 영상 데이터를 생성할 수 있다.The 3D image generator 400 inputs the movement line, movement speed, movement angle, and image data for each body part over time included in the final action information into the motion image generation model to determine the movement of the person capable of 360 degree rotation. 3D synthetic image data can be generated.

According to an embodiment of the present invention, the movement line, movement speed, and movement angle of each body part over time, included in the final behavior information, may be input into the motion image generation model, which takes each body part as a base point and connects the points to generate a 3D image in the shape of a person; by reflecting the values that change over time, it may generate a 3D image that represents the person's movement.

본 발명의 일 실시예에 따르면 영상 데이터를 동작 영상 생성 모델에 입력하여 영상 데이터에 포함된 배경 이미지를 이용하여 배경화면 영상 데이터를 생성할 수 있다.According to an embodiment of the present invention, background screen image data can be generated by inputting image data into a motion image generation model and using a background image included in the image data.

3D 영상 생성부(400)에 대해서는 도 5를 참조하면 더 자세하게 설명하도록 한다.The 3D image generator 400 will be described in more detail with reference to FIG. 5 .

The beauty content generation unit 500 may analyze the information on the movement line, movement speed, and movement angle of each body part to generate informational text, and may generate beauty content by adding subtitles based on the generated informational text to the generated 3D synthetic image data.

According to an embodiment of the present invention, the informational text may be text-form information generated by analyzing the movement line, movement speed, and movement angle of each body part so that it matches the video over time; it may present, converted into text, the numerical values of the direction, speed, and angle in which a body part moves.
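Converting the numerical values into time-aligned text can be as simple as formatting one caption per sample. The sketch below is a minimal illustration; the caption wording, the units, and the function name are assumptions and are not prescribed by the text.

```python
def caption_line(t_s, body_part, direction, speed_m_s, angle_deg):
    """Format one time-stamped caption from numeric motion values."""
    return (f"[{t_s:.2f}s] {body_part}: moving {direction} "
            f"at {speed_m_s:.2f} m/s, angle {angle_deg:.1f} deg")


# Illustrative samples: (time, body part, direction, speed, angle).
samples = [
    (1.5, "right wrist", "left", 0.25, 30.0),
    (2.0, "right elbow", "up", 0.1, 12.5),
]
for sample in samples:
    print(caption_line(*sample))
```

Each caption carries its own timestamp, so a subtitle renderer can place it on the matching frame of the 3D synthetic video.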

도 2는 도 1에 도시된 행동 정보 생성부(200)의 세부 구성도이다. FIG. 2 is a detailed configuration diagram of the behavior information generator 200 shown in FIG. 1.

The behavior information generation unit 200 may include a first behavior information generation unit 210 and a second behavior information generation unit 220, and the first behavior information generation unit 210 may include an artificial neural network-based image analysis model composed of a feature point extraction module and a behavior information analysis module.

본 발명의 일 실시예에 따르면 제1 행동 정보 생성부(210)는 특징점 추출 모듈에 영상 데이터를 입력하여 특징점을 추출하고, 추출된 특징점을 미리 설정 해놓은 신체 부위별로 라벨링할 수 있다.According to an embodiment of the present invention, the first behavior information generator 210 may input image data into a feature point extraction module to extract feature points, and label the extracted feature points for each preset body part.

본 발명의 일 실시예에 따르면 특징점 추출 모듈은 미리 설정된 신체 부위를 특정할 수 있는 특징점을 영상으로부터 추출할 수 있도록 학습된 모델일 수 있다.According to one embodiment of the present invention, the feature point extraction module may be a model learned to extract feature points that can specify a preset body part from an image.

본 발명의 일 실시예에 따르면 추출된 특징점 중 적어도 하나의 특징점을 신체 부위에 매칭시켜 라벨링할 수 있다.According to an embodiment of the present invention, at least one feature point among the extracted feature points can be matched to a body part and labeled.

또한 행동 정보 분석 모듈에 라벨링된 특징점 및 영상 데이터를 입력하여 각 특징점 별 동선, 이동 속도 및 이동 각도를 포함하는 특징 맵을 생성하고, 생성된 특징 맵에 출력 활성화 함수를 적용하여 제1 행동 정보를 출력할 수 있다.Additionally, by inputting labeled feature points and image data into the behavior information analysis module, a feature map including the movement line, movement speed, and movement angle for each feature point is generated, and an output activation function is applied to the generated feature map to generate first behavior information. Can be printed.

본 발명의 일 실시예에 따르면 특징 맵은 특징점 별 동선, 이동 속도 및 이동 각도를 그룹화하여 다수의 그룹을 생성하고 이를 특징점에 매칭되는 신체 부위로 정렬해서 생성한 데이터일 수 있다.According to an embodiment of the present invention, the feature map may be data generated by grouping the movement line, movement speed, and movement angle for each feature point to create a plurality of groups and sorting them into body parts matching the feature points.
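The grouping and ordering described for the feature map can be sketched with plain dictionaries: per-feature-point samples are grouped under the body part that each feature point was labeled with, and the groups are then ordered by body part. The data layout and all names below are hypothetical; in the described system this step is performed inside the neural network model.

```python
from collections import defaultdict


def build_feature_map(samples, labels):
    """Group (time, movement line point, speed, angle) samples by the body
    part their feature point is labeled with; unlabeled points are skipped."""
    groups = defaultdict(list)
    for feature_id, t, point, speed, angle in samples:
        body_part = labels.get(feature_id)
        if body_part is not None:
            groups[body_part].append((t, point, speed, angle))
    # Order groups by body part name and each group's samples by time.
    return {part: sorted(groups[part]) for part in sorted(groups)}


labels = {7: "right wrist", 3: "left elbow"}
samples = [
    (7, 0.1, (0.2, 0.1, 1.0), 0.30, 15.0),
    (3, 0.0, (0.5, 0.4, 1.1), 0.05, 2.0),
    (7, 0.0, (0.2, 0.0, 1.0), 0.25, 10.0),
    (9, 0.0, (0.0, 0.0, 0.0), 0.00, 0.0),  # unlabeled feature point
]
print(build_feature_map(samples, labels))
```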

The second behavior information generation unit 220 may analyze the transmission/reception time data of each depth camera to generate continuous depth data, speed data, and amplitude data for each preset body part.

본 발명의 일 실시예에 따르면 제2 행동 정보 생성부(220)는 진폭 데이터를 이용하여 진폭의 크기에 따라 누적 분포 함수 및 노이즈 함수를 이용하여 검출된 노이즈를 제거하고, 연속형 확률 분포 함수를 이용하여 표면의 빛의 흡수량에 상관없이 일정한 진폭으로 정규화 시킨 각 깊이 카메라 별로 생성된 신체 부위별 깊이 데이터, 속도 데이터, 진폭 데이터를 중간값 필터로 평균화하여 제2 행동 정보를 생성할 수 있다.According to one embodiment of the present invention, the second behavior information generator 220 uses amplitude data to remove detected noise using a cumulative distribution function and a noise function according to the size of the amplitude, and uses a continuous probability distribution function. Second behavioral information can be generated by averaging the depth data, speed data, and amplitude data for each body part generated by each depth camera, which are normalized to a constant amplitude regardless of the amount of light absorption on the surface, using a median filter.

본 발명의 일 실시예에 따르면 노이즈 함수(NF)는 하기의 수학식 4을 기반으로 수행될 수 있다.According to one embodiment of the present invention, the noise function (NF) can be performed based on Equation 4 below.

본 발명의 일 실시예에 따르면 중간값 필터는 오차가 발생한 데이터 각각에 가중치를 적용하여 산출된 값들을 중간 값으로 평균화하여 오차를 제거할 수 있는 필터를 의미할 수 있다.According to an embodiment of the present invention, the median filter may refer to a filter that can remove errors by applying weights to each data in which errors occur and averaging the calculated values to a median value.
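The per-camera fusion step can be illustrated with an unweighted, component-wise median across cameras, which discards a single outlying camera reading. This is a simplified sketch: the description also mentions weights, which are omitted here, and the data layout and names are assumptions.

```python
from statistics import median


def fuse_cameras(per_camera):
    """Component-wise median of (depth, speed, amplitude) across cameras,
    computed separately for every body part seen by at least one camera."""
    collected = {}
    for readings in per_camera:
        for body_part, triple in readings.items():
            collected.setdefault(body_part, []).append(triple)
    return {
        part: tuple(median(component) for component in zip(*triples))
        for part, triples in collected.items()
    }


cameras = [
    {"wrist": (1.0, 0.20, 50.0)},
    {"wrist": (1.2, 0.30, 55.0)},
    {"wrist": (5.0, 0.25, 52.0), "elbow": (2.0, 0.10, 40.0)},  # 5.0 is an outlier
]
print(fuse_cameras(cameras))
```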

도 3은 본 발명의 일 실시예에 따라 구현된 영상 분석 모델을 나타낸 도면이다.Figure 3 is a diagram showing an image analysis model implemented according to an embodiment of the present invention.

도 3을 참조하면 본 발명의 일 실시예에 따라 구현된 영상 분석 모델이 도시되어 있으며 영상 분석 모델은 합성곱 연산망으로 형성될 수 있으며, 각 깊이 카메라가 수집한 복수의 영상 데이터를 입력층에 입력 받고 제1 행동 정보를 출력할 수 있다.Referring to FIG. 3, an image analysis model implemented according to an embodiment of the present invention is shown. The image analysis model can be formed as a convolutional network, and a plurality of image data collected by each depth camera is input to the input layer. It is possible to receive input and output first action information.

도 4는 본 발명의 일 실시예에 따라 구현된 특징점 추출 모듈 및 행동 정보 분석 모듈을 포함한 영상 분석 모델을 나타낸 도면이다.Figure 4 is a diagram showing an image analysis model including a feature point extraction module and a behavior information analysis module implemented according to an embodiment of the present invention.

도 4를 참조하면 본 발명의 일 실시예에 따라 구현된 영상 분석 모델에 포함된 특징점 추출 모듈과 행동 정보 분석 모듈의 데이터 흐름이 나타나 있다.Referring to FIG. 4, the data flow of the feature point extraction module and the behavioral information analysis module included in the image analysis model implemented according to an embodiment of the present invention is shown.

According to an embodiment of the present invention, when the plurality of image data collected by the depth cameras is input to the feature point extraction module, feature points that can identify the preset body parts are extracted from the images, and labeled feature point information is generated by matching at least one of the extracted feature points to a body part and labeling it.

본 발명의 일 실시예에 따르면 행동 정보 분석 모듈에 라벨링된 특징점 정보와 각 깊이 카메라에서 수집된 복수의 영상 데이터를 입력하면 각 특징점 별 동선, 이동 속도 및 이동 각도를 포함하는 제1 행동 정보가 생성될 수 있다.According to an embodiment of the present invention, when labeled feature point information and a plurality of image data collected from each depth camera are input to the behavior information analysis module, first behavior information including the movement line, movement speed, and movement angle for each feature point is generated. It can be.

도 5는 본 발명의 일 실시예에 따라 구현된 동작 영상 생성 모델의 세부 모듈을 나타낸 도면이다.Figure 5 is a diagram showing detailed modules of a motion image generation model implemented according to an embodiment of the present invention.

Referring to FIG. 5, the 3D image generation unit 400 may include a motion image generation model that is implemented as a convolutional artificial neural network composed of a plurality of computational layers and that includes a background screen generation module, a motion image generation module, and a 3D image synthesis module.

According to an embodiment of the present invention, when the image data is input, the background screen generation module may extract a plurality of background images, excluding the objects contained in the image data, and may perform preprocessing by matching the extracted background images to a reference image.

여기서 레퍼런스 이미지는 복수로 생성된 배경 이미지에 매칭되어 전방위 영상으로 조합되기 위해 그 위치 및 크기에 대하여 미리 설정된 기준 이미지를 의미할 수 있다.Here, the reference image may refer to a reference image whose position and size are preset to match multiple generated background images and combine them into an omnidirectional image.

상기 실시예에 따르면 전처리가 수행된 복수의 배경 데이터를 분석하여 구간별로 디스크립터 데이터를 설정하며, 복수의 이미지 데이터 중 공간상 연결되는 이미지 데이터 간에 서로 공유되는 디스크립터 데이터를 도출하고, 도출된 디스크립터 데이터를 기준으로 복수의 이미지를 정합하여 360도 회전이 가능한 배경화면 영상 데이터를 생성할 수 있다.According to the above embodiment, descriptor data is set for each section by analyzing a plurality of background data on which preprocessing has been performed, descriptor data shared between spatially connected image data among the plurality of image data is derived, and the derived descriptor data is By matching multiple images as a standard, background screen image data that can be rotated 360 degrees can be generated.

여기서 디스크립터 데이터는 두 이미지 간 유사도를 측정하기 위하여 이미지에서 의미 있는 특징들을 적절한 숫자로 변환하여 산출된 특징을 대표할 수 있는 숫자에 대한 데이터를 의미할 수 있다.Here, descriptor data may refer to data about numbers that can represent the features calculated by converting meaningful features in the image into appropriate numbers in order to measure the similarity between two images.

According to an embodiment of the present invention, characteristic parts of each image region may be found in the acquired background data (scale-space extrema detection) and classified as feature points. From the classified feature points, reliable key feature points may be selected, and the final feature points may be chosen preferentially based on criteria such as the pixel values (intensity) of the key feature points and the position or size of those key feature points that correspond to object corners.

According to the above embodiment, gradients may be computed for the region surrounding a final feature point to obtain the overall direction in which the pixels of the surrounding region point, the region may be rotated so that this direction becomes 0 degrees, and the part corresponding to the surrounding region may be set as the descriptor data.
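The orientation step above resembles the orientation assignment used in SIFT-style descriptors. The sketch below is a simplified illustration of that idea, not the patent's implementation: it computes central-difference gradients over a small grayscale patch, accumulates a magnitude-weighted histogram of gradient directions, and returns the dominant direction. Rotating the patch by the negative of this angle would bring the dominant direction to 0 degrees. The bin count and function name are assumptions.

```python
import math


def dominant_orientation(patch, bins=36):
    """Dominant gradient direction (degrees, lower edge of the winning bin)
    of a 2D grayscale patch given as a list of equal-length rows."""
    height, width = len(patch), len(patch[0])
    bin_width = 360.0 / bins
    histogram = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]  # horizontal gradient
            gy = patch[y + 1][x] - patch[y - 1][x]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            histogram[int(angle // bin_width) % bins] += magnitude
    best_bin = max(range(bins), key=histogram.__getitem__)
    return best_bin * bin_width


# A patch whose intensity grows along x has its gradient pointing at 0 degrees;
# one that grows along y (down the rows) points at 90 degrees.
ramp_x = [[float(x) for x in range(5)] for _ in range(5)]
ramp_y = [[float(y)] * 5 for y in range(5)]
print(dominant_orientation(ramp_x), dominant_orientation(ramp_y))
```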

본 발명의 일 실시예에 따르면 디스크립터 데이터는 최종 특징점의 주변 영역의 픽셀 값들이 포함될 수 있으며, 최종 특징점을 기준으로 주변 영역의 픽셀 값들의 방향정보에 기반한 히스토그램 정보도 포함되므로, 디스크립터 데이터를 이용하여 촬영각에 따라 변경된 복수의 이미지 데이터를 대비하여 동일한 구간을 나타내는 타겟 포인트를 식별함으로써 서로 이웃하는 이미지 데이터들을 정확하게 정합하여 배경화면 영상 데이터를 생성할 수 있다.According to an embodiment of the present invention, the descriptor data may include pixel values in the surrounding area of the final feature point, and histogram information based on direction information of pixel values in the surrounding area based on the final feature point is also included, so the descriptor data can be used to By comparing a plurality of image data that changes depending on the shooting angle and identifying a target point representing the same section, neighboring image data can be accurately matched to generate background screen image data.
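Identifying the same target point in two neighbouring images amounts to finding descriptors that are close to each other. The sketch below is a minimal nearest-neighbour matcher using Euclidean distance; the distance threshold `max_distance`, the function name, and the toy two-dimensional descriptors are illustrative assumptions, and real descriptors would be much longer vectors.

```python
import math


def match_descriptors(desc_a, desc_b, max_distance):
    """For each descriptor in desc_a, find the nearest descriptor in desc_b
    and keep the pair (index_a, index_b, distance) if it is close enough."""
    matches = []
    if not desc_b:
        return matches
    for i, da in enumerate(desc_a):
        j, distance = min(
            ((j, math.dist(da, db)) for j, db in enumerate(desc_b)),
            key=lambda pair: pair[1],
        )
        if distance <= max_distance:
            matches.append((i, j, distance))
    return matches


image_a = [(0.0, 0.0), (10.0, 10.0)]
image_b = [(9.0, 10.0), (0.0, 1.0), (50.0, 50.0)]
print(match_descriptors(image_a, image_b, max_distance=2.0))
```

The matched index pairs are the shared points around which neighbouring images could then be aligned and stitched.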

본 발명의 일 실시예에 따르면 동작 영상 생성 모듈은 미리 설정된 신체 부위를 중심으로 각 신체 부위를 연결해 사람의 형상을 3D로 생성하고, 생성된 사람의 형상을 상기 최종 행동 정보에 포함된 각 신체 부위별 동선, 이동 속도 및 이동 각도에 따라 각 신체 부위별로 움직이는 동작 영상 데이터를 생성할 수 있다.According to an embodiment of the present invention, the motion image generation module connects each body part around a preset body part to generate a 3D human shape, and matches the generated human shape to each body part included in the final action information. Motion image data can be generated for each body part according to the movement line, movement speed, and movement angle.

본 발명의 일 실시예에 따르면 3D 영상 합성 모듈은 상기 360도 회전이 가능한 배경화면에 3D 가상 인물의 동작 영상을 위치 및 각도에 따라 합성하여 합성 영상 데이터를 생성할 수 있다.According to one embodiment of the present invention, the 3D image synthesis module can generate synthesized image data by synthesizing the motion image of the 3D virtual character on the background screen that can rotate 360 degrees according to the position and angle.

FIG. 6 is a diagram showing a motion image generation model implemented according to an embodiment of the present invention.

Referring to FIG. 6, a motion image generation model implemented on the basis of a convolutional neural network according to an embodiment of the present invention is shown. When the image data for each depth camera and the movement line, movement speed, and movement angle information for each body part are input to the motion image generation model, 3D synthetic image data can be output.

FIG. 7 is a diagram showing a motion image generation model, implemented according to an embodiment of the present invention, that includes a background screen generation module, a motion image generation module, and a 3D image synthesis module.

Referring to FIG. 7, the data flow between the background screen generation module, the motion image generation module, and the 3D image synthesis module included in the motion image generation model implemented according to an embodiment of the present invention is shown.

According to an embodiment of the present invention, the image data for each depth camera is input to the background screen generation module, which outputs background screen image data, and the movement line, movement speed, and movement angle information for each body part is input to the motion image generation module, which outputs motion image data.

According to the above embodiment, the background screen image data and the motion image data output from the background screen generation module and the motion image generation module, respectively, can be input to the 3D image synthesis module to obtain 3D synthetic image data.

FIG. 8 is a flowchart of a method for automatically generating beauty content according to an embodiment of the present invention.

The movement of at least one person performing a beauty procedure is photographed from all directions to collect motion information including image data and transmission/reception time data corresponding to the image data (S10).

According to an embodiment of the present invention, a plurality of depth cameras can be used to photograph, from all directions, the movement of at least one person performing a beauty procedure, and to collect motion information including image data and transmission/reception time data for each preset body part corresponding to the image data.

According to an embodiment of the present invention, the depth camera may be a camera that emits a pulse-modulated infrared (IR) beam toward a target from the origin of a spherical coordinate system and scans horizontally (pan, φ) and vertically (tilt, θ). The beam is reflected by back scattering at the point-wise distribution of discrete points (r, θ, φ) on the sphere surface, and the time taken to return to the origin, that is, the transmission/reception time, is used to obtain three-dimensional image information of the target within the background.
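The relation between the transmission/reception time and a 3D point can be sketched as follows. This is a minimal illustration under stated assumptions: the range is taken as half the round-trip distance travelled at the speed of light, the tilt angle θ is treated as the polar angle measured from the z-axis, and the pan angle φ as the azimuth in the x-y plane. The function name `tof_to_point` and this axis convention are hypothetical and are not specified in the source.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def tof_to_point(round_trip_s, theta, phi):
    """Convert a round-trip time and scan angles (radians) to Cartesian (x, y, z).

    Assumed convention: theta is the polar angle from +z, phi the azimuth from +x.
    """
    r = C * round_trip_s / 2.0  # the beam travels to the target and back
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return (x, y, z)

# A target 3 m away returns the pulse after 2 * 3 / C seconds.
t_3m = 2.0 * 3.0 / C
point_on_x = tof_to_point(t_3m, math.pi / 2.0, 0.0)
point_on_z = tof_to_point(t_3m, 0.0, 0.0)
```

Repeating this conversion for every scanned (θ, φ) pair yields the point-wise 3D samples from which depth data for each body part can be derived.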

According to an embodiment of the present invention, a two-stage depth measurement can be performed when collecting the transmission/reception time data.

According to an embodiment of the present invention, the two-stage depth measurement can be divided into a first depth measurement stage and a second depth measurement stage.

According to an embodiment of the present invention, the maximum depth range of the second depth measurement can be set based on the precision of the first depth measurement. By selecting a relatively higher frequency, the second depth measurement provides high measurement quality over a narrow region of interest and can thereby compensate for the error in the first depth measurement result.
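The coarse-to-fine idea can be sketched under the assumption of a continuous-wave time-of-flight camera, for which the unambiguous range at modulation frequency f is c / (2f). The sketch below is not the frequency selection rule of the specification (its Equation 1 is not reproduced here); it only shows how a coarse low-frequency depth can resolve the wrap-around of a more precise high-frequency measurement. The function names and the example frequencies are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s

def unambiguous_range(mod_freq_hz):
    """Maximum depth measurable without phase wrap-around at this frequency."""
    return C / (2.0 * mod_freq_hz)

def refine_depth(coarse_depth_m, fine_wrapped_m, high_freq_hz):
    """Combine a coarse depth with a precise but wrapped high-frequency depth.

    The coarse value only needs to be accurate to within half of the
    high-frequency unambiguous range for the wrap count to be correct.
    """
    span = unambiguous_range(high_freq_hz)
    wraps = round((coarse_depth_m - fine_wrapped_m) / span)
    return fine_wrapped_m + wraps * span

# Assumed example: 10 MHz for the first stage, 100 MHz for the second stage.
low_span = unambiguous_range(10e6)    # roughly 15 m
high_span = unambiguous_range(100e6)  # roughly 1.5 m
true_depth = 4.0
wrapped = true_depth % high_span      # what the high frequency alone would report
refined = refine_depth(4.05, wrapped, 100e6)  # 4.05 m is the noisy coarse estimate
```

In this sketch the narrow range of the second stage is usable precisely because the first stage has already located the region of interest to within that range.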

The image data for each depth camera included in the motion information is input to an artificial neural network-based image analysis model to generate first behavior information including the movement line, movement speed, and movement angle of each body part (S20).

According to an embodiment of the present invention, the image data for each depth camera included in the motion information can be input to an artificial neural network-based image analysis model to generate first behavior information including the movement line, movement speed, and movement angle of each body part. According to an embodiment, the artificial neural network-based image analysis model receives the image data for each depth camera at its input layer, outputs movement line, movement speed, and movement angle information for each preset body part, and can be trained to generate the first behavior information by grouping this information by body part.

The transmission/reception time data included in the motion information is analyzed to generate a plurality of consecutive depth data, speed data, and amplitude data; noise is removed, and the data are then averaged to a median value to generate second behavior information (S30).

According to an embodiment of the present invention, the transmission/reception time data for each depth camera included in the motion information is analyzed to generate a plurality of consecutive depth data, speed data, and amplitude data. Noise is removed using a cumulative distribution function and a noise removal function according to the magnitude of the amplitude in the amplitude data, and the noise-free depth data, speed data, and amplitude data for each depth camera are averaged to a median value to generate the second behavior information.
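A minimal sketch of this step is given below. It substitutes an empirical cumulative distribution function of the amplitudes for the cumulative distribution function of the specification (whose Equation 3 is not reproduced here), discards samples whose amplitude falls in the lowest quantile, and then takes the median across cameras. The quantile threshold of 0.2, the sample layout, and all function names are assumptions made for illustration only.

```python
from bisect import bisect_right
from statistics import median

def empirical_cdf(values):
    """Return a function giving the fraction of values less than or equal to v."""
    ordered = sorted(values)
    n = len(ordered)
    return lambda v: bisect_right(ordered, v) / n

def drop_low_amplitude(samples, min_quantile=0.2):
    """Keep (depth, amplitude) samples whose amplitude lies above min_quantile.

    Low-amplitude returns are treated as unreliable; this threshold is assumed.
    """
    cdf = empirical_cdf([amp for _, amp in samples])
    return [(d, amp) for d, amp in samples if cdf(amp) > min_quantile]

def fuse_cameras(values_per_camera):
    """Combine one value per camera into a single value using the median."""
    return median(values_per_camera)

# One body part seen by one camera: the 5.0 m reading has a very weak return.
samples = [(1.0, 10.0), (1.1, 12.0), (5.0, 1.0), (0.9, 11.0), (1.0, 9.0)]
kept = drop_low_amplitude(samples)
camera_depth = median(d for d, _ in kept)

# The same body part as estimated by three cameras, one of them an outlier.
fused_depth = fuse_cameras([1.0, 1.02, 3.0])
```

The median is used for the fusion because a single outlying camera does not shift it, which matches the stated aim of consolidating per-camera data to a median value.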

The first behavior information and the second behavior information are aligned on a time basis to calculate an error, and the calculated error is removed using a continuous weighted median filter to generate final behavior information (S40).

According to an embodiment of the present invention, the first behavior information and the second behavior information can be aligned on a time basis to calculate an error, and the calculated error can be removed using a continuous weighted median filter to generate the final behavior information.

According to an embodiment of the present invention, the first behavior information and the second behavior information are input to an artificial neural network-based final behavior information generation model. The movement line, movement speed, and movement angle of each body part included in the first and second behavior information are compared, and the differences are calculated as errors. A weight is applied to each data item in which an error occurred, and the weighted values are averaged to a median value to remove the error, thereby generating final behavior information having a single set of depth data, speed data, and amplitude data for each body part.
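The weighted median at the core of this step can be sketched as follows. The specification does not give the weights or the window length, so the three-sample window, the weights (1, 2, 1), and the edge handling by replication are assumptions, as are the function names. The sketch shows a plain sliding-window weighted median filter, not the trained final behavior information generation model.

```python
def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half of the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    acc = 0.0
    for value, weight in pairs:
        acc += weight
        if acc >= half:
            return value
    return pairs[-1][0]  # only reached if all weights are zero

def weighted_median_filter(signal, window_weights):
    """Slide a weighted median over the signal, replicating the edge samples."""
    half = len(window_weights) // 2
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [
        weighted_median(padded[i:i + len(window_weights)], window_weights)
        for i in range(len(signal))
    ]

# A movement speed track in which one time step disagrees strongly.
speed = [1.0, 1.0, 9.0, 1.0, 1.0]
smoothed = weighted_median_filter(speed, [1, 2, 1])
```

Unlike a weighted mean, the weighted median returns one of the observed values, so an isolated erroneous sample is replaced rather than blended into its neighbours.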

The movement line, movement speed, and movement angle of each body part over time included in the final behavior information, together with the image data, are input to the motion image generation model to generate 3D synthetic image data (S50).

According to an embodiment of the present invention, the movement line, movement speed, and movement angle of each body part over time included in the final behavior information, together with the image data, can be input to the motion image generation model to generate 360-degree rotatable 3D synthetic image data of the person's movement.

The information on the movement line, movement speed, and movement angle of each body part is analyzed to generate informational text, and subtitles based on the informational text are added to the 3D synthetic image data to generate beauty content (S60).

According to an embodiment of the present invention, the information on the movement line, movement speed, and movement angle of each body part can be analyzed to generate informational text, and subtitles based on the generated informational text can be added to the generated 3D synthetic image data to generate beauty content.

The embodiments of the present invention are not implemented only through the apparatus and/or method described above. Although the embodiments of the present invention have been described in detail, the scope of the present invention is not limited thereto, and various modifications and improvements made by those skilled in the art using the basic concept of the present invention as defined in the following claims also fall within the scope of the present invention.

Claims

a motion information collection unit that photographs, from all directions using a plurality of depth cameras, the movement of at least one person performing a beauty procedure, and collects motion information including image data and transmission/reception time data corresponding to the image data;
a behavior information generator that inputs the image data for each depth camera included in the motion information to an artificial neural network-based image analysis model to generate first behavior information including the movement line, movement speed, and movement angle of each body part, analyzes the transmission/reception time data for each depth camera included in the motion information to generate a plurality of consecutive depth data, speed data, and amplitude data, removes noise using a cumulative distribution function and a noise removal function according to the magnitude of the amplitude in the amplitude data, and generates second behavior information by averaging to a median value the noise-free depth data, speed data, and amplitude data for each body part generated for each depth camera;
a final behavior information generator that aligns the first behavior information and the second behavior information on a time basis to calculate an error, and removes the calculated error using a continuous weighted median filter to generate final behavior information;
a 3D image generator that inputs the movement line, movement speed, and movement angle of each body part over time included in the final behavior information, together with the image data, to a motion image generation model to generate 360-degree rotatable 3D synthetic image data of the person's movement; and
a beauty content generator that analyzes the information on the movement line, movement speed, and movement angle of each body part to generate informational text, and adds subtitles based on the generated informational text to the generated 3D synthetic image data to generate beauty content,
wherein the behavior information generator includes:
a first behavior information generator that includes an artificial neural network-based image analysis model formed by a feature point extraction module and a behavior information analysis module, inputs the image data to the feature point extraction module to extract feature points, labels the extracted feature points by preset body part, inputs the feature points for each body part and the image data to the behavior information analysis module to generate a feature map including the movement line, movement speed, and movement angle of each feature point, and applies an output activation function to the generated feature map to output first behavior information; and
a second behavior information generator that analyzes the transmission/reception time data for each depth camera to generate consecutive depth data, speed data, and amplitude data for each preset body part, removes noise detected using a cumulative distribution function and a noise function according to the magnitude of the amplitude in the amplitude data, normalizes the depth data, speed data, and amplitude data for each body part generated by each depth camera to a constant amplitude, regardless of the amount of light absorbed at the surface, using a continuous probability distribution function, and generates second behavior information by averaging,
wherein the second behavior information generator performs filtering using the cumulative distribution function and the noise function to remove noise caused by the non-ideal waveform used in the depth camera and noise generated at object boundaries, the cumulative distribution function being computed based on Equation 3,
[Equation 3]

wherein the final behavior information generator:
inputs the first and second behavior information to an artificial neural network-based final behavior information generation model, compares the movement line, movement speed, and movement angle of each body part included in the first and second behavior information, calculates the differences as errors, applies a weight to each data item in which an error occurred, and averages the weighted values to a median value to remove the error, thereby generating final behavior information having the movement line, movement speed, and movement angle for each body part,
uses an artificial neural network-based judgment model that evaluates accuracy by comparing the 3D image generated according to the final behavior information with the image data,
sets a first expected value for standard behavior information, corresponding to a judgment that the 3D image generated according to the final behavior information is close enough to be regarded as matching the behavior in the image data, and calculates the difference between the 3D image generated according to the final behavior information and the first expected value as a first difference value,
sets a second expected value for the standard behavior information, corresponding to a judgment that the 3D image generated according to the final behavior information is not close enough to be regarded as matching the behavior in the image data, and calculates the difference between the 3D image generated according to the final behavior information and the second expected value as a second difference value, and
calculates the sum of the first difference value and the second difference value as a classification loss value of the artificial neural network constituting the final behavior information generation model, fixes the weights so that the classification loss value is minimized, and updates with the fixed weights the weights applied to each data item in which an error occurred in the final behavior information generation model,
wherein the motion information collection unit:
performs, in collecting the transmission/reception time data, a two-stage depth measurement using two modulation frequencies,
the first depth measurement measuring depth using the relatively lower of the two modulation frequencies, with low measurement quality, over a preset region of interest for each body part, and
the second depth measurement measuring depth using the relatively higher of the two modulation frequencies, based on the result of the first depth measurement for the region of interest of each body part, thereby increasing measurement precision with high measurement quality,
The low modulation frequency is selected based on Equation 1,
[Equation 1]

wherein the high modulation frequency is selected as a value inversely proportional to the standard deviation measured when using the low modulation frequency, such that when the standard deviation is less than a preset threshold, the signal-to-noise ratio is determined to be high and a relatively higher frequency can be used than when the standard deviation is greater than the preset threshold, and
wherein the second depth measurement is performed multiple times to increase measurement precision through repeated measurement, the apparatus being an apparatus for automatically generating beauty content.

The apparatus of claim 1, wherein the 3D image generator:
includes a motion image generation model that is implemented as a convolutional artificial neural network consisting of a plurality of computational layers and that includes a background screen generation module, a motion image generation module, and a 3D image synthesis module,
wherein the 3D image synthesis module, when the image data is input, extracts a plurality of background data excluding the objects included in the image data, performs pre-processing by matching the extracted plurality of background data to a reference image, analyzes the pre-processed background data to set descriptor data for each section, derives descriptor data shared between spatially connected image data among the plurality of image data, and registers the plurality of images based on the derived descriptor data to generate 360-degree rotatable background screen image data,
wherein the motion image generation module connects the body parts around a preset body part to generate a 3D human shape, and generates motion image data in which the generated human shape moves, body part by body part, according to the movement line, movement speed, and movement angle of each body part included in the final behavior information, and
wherein the 3D image synthesis module composites the motion image of the 3D virtual person onto the 360-degree rotatable background screen according to position and angle to generate synthesized image data.