KR20240008265A - Method and apparatus for generating motion of character based on musculoskeletal system model

Info

Publication number: KR20240008265A
Application number: KR1020230089013A
Authority: KR
Inventors: 이윤상; 김민관
Original assignee: 한양대학교 산학협력단
Priority date: 2022-07-11
Filing date: 2023-07-10
Publication date: 2024-01-18

Abstract

A method for generating the motion of a character based on a musculoskeletal model includes: receiving, by an image processing device, state information of the character; inputting, by the image processing device, the state information of the character into a reinforcement learning-based model; and controlling, by the image processing device, the state of the character based on the output value of the reinforcement learning-based model.

Description

METHOD AND APPARATUS FOR GENERATING MOTION OF CHARACTER BASED ON MUSCULOSKELETAL SYSTEM MODEL

The technology described below relates to a method for generating the motion of a character based on a musculoskeletal model.

Character animation is used in various fields such as games, films, and computer graphics. Techniques such as motion capture have been used to represent a character's movements. However, motion capture requires expensive equipment to acquire motion. Methods that generate animation using physics-simulated characters have therefore also been developed. With a physics-simulated character, motions that interact with various objects in unfamiliar environments can be obtained.

Korean Registered Patent Publication No. 10-0856824 B1

Among physics-simulated characters, there are techniques that produce images using characters based on a musculoskeletal model. A musculoskeletal model is a model that replicates the human musculoskeletal system. Conventionally, the motion of a musculoskeletal model-based character was generated by computing a torque for each joint. The prior art applied these torques directly to the joints so that the character followed a reference motion. However, real human joints cannot generate torque by themselves. Instead, joints are moved by the contraction and relaxation of muscles.

The technology described below provides a method for generating the motion of a simulated character based on the principle by which humans contract and relax their muscles. It further provides a method for generating the character's motion using a reinforcement learning-based model.

The character consists of joints, path points formed on the joints, and muscle actuators connected between the path points. The muscle actuators move the character by applying forces between the path points. The reinforcement learning-based model outputs the action value that receives the maximum reward value given the input state information. The state information of the character includes rigid body state information and muscle state information of the character. The reward value increases the better the motion reflects human gait characteristics. The action value includes a value indicating whether each muscle actuator contracts or relaxes.

With the technology described below, the motion of a musculoskeletal model-based character can be generated. In particular, using a reinforcement learning-based model allows the character to produce more human-like motion.

Figure 1 shows the overall process by which the image processing device 100 generates the motion of a musculoskeletal model-based character.
Figure 2 is a flowchart 200 of one embodiment of the musculoskeletal model-based character motion generation method.
Figure 3 shows one embodiment in which muscle actuators act.
Figure 4 shows one embodiment of a muscle actuator.
Figure 5 shows the magnitude of force (fiber force) as a function of muscle fiber length (fiber length).
Figure 6 shows the relationship between the time derivative of the muscle fiber length (velocity) and force.
Figure 7 shows one embodiment of how the reinforcement learning model operates.
Figure 8 shows the character used in the simulation.
Figure 9 shows the result of generating character motion with the trained reinforcement learning-based model.
Figure 10 shows the result of generating character motion when only the metabolic equivalent of task (MET) is used as the energy reward.
Figure 11 shows the result of generating character motion when only the cost of transport (CoT) is used as the energy reward.
Figure 12 shows the result when no energy reward is used at all.
Figure 13 is a graph comparing the reward values of the training results of Figures 10 to 12.
Figure 14 shows the experimental result when the initial posture of the character is not specified.
Figure 15 shows the configuration of one embodiment of the image processing device.

The technology described below may be modified in various ways and may have various embodiments. Specific embodiments of the technology may be shown in the drawings of the specification. However, this is for the purpose of explanation and is not intended to limit the technology to specific embodiments. It should therefore be understood that all modifications, equivalents, and substitutes within the spirit and technical scope of the technology described below are included in it.

In the terms used below, singular expressions should be understood to include plural expressions unless the context clearly indicates otherwise, and terms such as "include" indicate the presence of the stated features, numbers, steps, operations, components, parts, or combinations thereof, and should not be understood to exclude the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Before the drawings are described in detail, it should be made clear that the division of components in this specification is merely a division by the main function each component is responsible for. That is, two or more components described below may be combined into one component, or one component may be divided into two or more components by more finely divided functions. In addition to its own main function, each component described below may additionally perform some or all of the functions of other components, and some of the main functions of each component may of course be performed exclusively by another component.

In addition, when a method or an operating method is performed, the steps of the method may occur in an order different from the stated order unless a specific order is clearly described in context. That is, the steps may occur in the stated order, may be performed substantially simultaneously, or may be performed in the reverse order.

The overall process by which an image processing device performs the method for generating the motion of a musculoskeletal model-based character is described below.

Figure 1 shows the overall process by which the image processing device 100 generates the motion of a musculoskeletal model-based character.

The image processing device 100 may receive state information of the character. The image processing device 100 may input the state information of the character into a reinforcement learning-based model. The image processing device 100 may control the state of the character based on the output value of the reinforcement learning-based model.

The character may consist of joints, path points formed on the joints, and muscle actuators connected between the path points. The muscle actuators may move the character by applying forces between the path points.

The state information of the character may include rigid body state information and muscle state information of the character.

The reinforcement learning-based model is a model that outputs the action value that receives the maximum reward value based on the input state information.

The reward value may increase the better the motion reflects human gait characteristics.

The action value may include a value indicating whether a muscle actuator contracts or relaxes.

The method for generating character motion based on a musculoskeletal model is described in detail below.

Figure 2 is a flowchart 200 of one embodiment of the musculoskeletal model-based character motion generation method.

The image processing device may receive state information of the character (210).

The character may consist of joints, path points formed on the joints, and muscle actuators connected between the path points. There may be a plurality of joints.

The muscle actuators may move the character by applying forces between the path points. That is, a muscle actuator may simulate the force with which a muscle relaxes or contracts. Figure 3 shows one embodiment in which muscle actuators act. The character in Figure 3 consists of three joints. A path point is formed on each of the three joints. Muscle actuators are formed between the path points. The muscle actuators apply forces between the path points to move the joints and, ultimately, the character. The muscle actuator is described in detail later.

The rigid body state information of the character may include information on the positions and velocities of the character's joints. The positions and velocities of the joints may be expressed in the frame of the character's root (pelvis). The velocity of a joint may mean its angular velocity. Alternatively, the rigid body state information may include information on the position and velocity of the character's body.

The muscle state information of the character may include length information of the muscle actuators. Specifically, the muscle state information may mean the length of the muscle fiber in each muscle actuator. Details are described later.

The image processing device may input the state information of the character into the reinforcement learning-based model (220).

The reinforcement learning-based model may be a model that outputs the action value that can receive the maximum reward value based on the input state information of the character.

The reward value may increase the better the motion reflects human gait characteristics. That is, the reward value may increase as the character walks forward without falling, using a natural, human-like gait.

In one embodiment, the reward value may increase when at least one of the following is satisfied: the character's pelvis does not rotate; the character's center of mass does not deviate to the left or right; the character's upper body does not tilt; the character moves at the desired speed; the activation values of the character's muscles are minimized; the character's energy consumption is minimized; and the character maintains its balance.

The action value may include a value indicating whether a muscle actuator contracts or relaxes. Specifically, it may include the activation value of the muscle actuator.

The reinforcement learning-based model is described in detail later.

The image processing device may control the character based on the output value of the reinforcement learning-based model (230).

Specifically, the image processing device may control the character by applying the output value of the reinforcement learning-based model to the character's muscle actuators. The force exerted by a muscle actuator may be calculated from the muscle fiber length information and the activation value. The process of controlling the character is described in detail later.

The muscle actuator is described in detail below.

Figure 4 shows one embodiment of a muscle actuator.

The muscle actuator may replicate a real human muscle. The muscle actuator consists of a tendon and a muscle fiber.

The muscle fiber may include an active part and a passive part. The active part may model the force with which the muscle contracts. The passive part models the force with which the muscle tries to return to its original state. Specifically, the passive part may model the elastic force that restores the muscle to its original state when it is stretched beyond a certain length.

The length of the muscle actuator may be calculated from the length of the tendon and the length of the muscle fiber.

Equation 1 is used to calculate the length l_mtu of the muscle actuator.

In Equation 1, l_mtu may mean the length of the muscle actuator, l_m the length of the muscle fiber, l_t the length of the tendon, and α the angle between the tendon and the muscle fiber (pennation angle).
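The image of Equation 1 is not reproduced in this text. A form consistent with the symbol definitions above and with the usual geometry of a pennated muscle-tendon unit would be the following. It is offered as an assumption, not as the equation from the original filing.

```latex
l_{mtu} = l_t + l_m \cos\alpha
```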

The length of the tendon may be constant. That is, the simulation may be run with the tendon length fixed at its resting length. This reduces the amount of computation and thereby improves simulation speed. Each muscle actuator in the character may have a different tendon resting length.

The magnitude of the force generated by the muscle actuator (f_mtu) may be calculated under the assumption that the force generated by the muscle fiber (f_m) and the force generated by the tendon (f_t) are equal. That is, it may be calculated under the assumption that the force generated by the tendon and the force generated by the muscle fiber balance each other in magnitude and direction. Therefore, the relationship in Equation 2 may hold.

In Equation 2, f_mtu may mean the force generated by the muscle actuator, f_t the force generated by the tendon, f_m the force generated by the muscle fiber, f_ce the force generated by the active part of the muscle fiber, f_pe the force generated by the passive part of the muscle fiber, and α the angle between the tendon and the muscle fiber (pennation angle).
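The image of Equation 2 is likewise missing from this text. Under the stated equilibrium assumption, and taking the fiber force to be the sum of its active and passive parts projected onto the tendon direction, the relationship would plausibly read as follows. This is a reconstruction from the definitions above, not a copy of the original equation.

```latex
f_{mtu} = f_t = f_m \cos\alpha = \left( f_{ce} + f_{pe} \right) \cos\alpha
```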

Equation 4 can be derived by applying Equation 3 to Equation 2. Equation 4 is used to calculate the force produced by the muscle actuator as a function of the muscle fiber length. Details of Equation 4 can be found in the paper 'Locomotion Control for Many-Muscle Humanoids' (Yoonsang Lee, 2014).

In Equation 3, f_ce may mean the force produced by the active part, and f_pe the force produced by the passive part. In Equations 3 and 4, l_m may mean the length of the muscle fiber, and l̇_m the time derivative of the muscle fiber length, that is, the rate at which the muscle fiber length changes.

In Equation 4, g_al denotes the relationship between the active part of the muscle and the muscle fiber length. g_av denotes the relationship between the active part of the muscle and the time derivative of the muscle fiber length, that is, the rate of change of the fiber length. g_pl denotes the relationship between the passive part of the muscle and the muscle fiber length. f_mtu means the force the muscle actuator can produce. cos(α) is the cosine of the angle between the tendon and the muscle fiber (pennation angle).

In Equation 4, the activation value (act) may take a value between 0 and 1. It may be 0 when the muscle is relaxed and 1 when the muscle is contracted.
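Equations 3 and 4 are not reproduced in this text either. Based only on the symbols defined above, one plausible form is shown below. The multiplicative combination of act, g_al and g_av is an assumption borrowed from common Hill-type muscle models, and any scaling by a maximum isometric force is omitted because the text does not mention one.

```latex
f_{ce} = act \cdot g_{al}(l_m) \cdot g_{av}(\dot{l}_m), \qquad f_{pe} = g_{pl}(l_m)

f_{mtu} = \left( act \cdot g_{al}(l_m) \cdot g_{av}(\dot{l}_m) + g_{pl}(l_m) \right) \cos\alpha
```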

Figure 5 shows the magnitude of force (fiber force) as a function of muscle fiber length (fiber length). Specifically, it shows how the force of the muscle fiber changes with g_al(l_m) and g_pl(l_m) in Equation 4.

g_al(l_m), the force of the active part, first increases and then decreases as the length increases. g_pl(l_m), the force of the passive part, has no effect at first as the length increases, and then rises sharply beyond a certain length.
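The qualitative curve shapes described for Figure 5 can be sketched in Python as below. The Gaussian and exponential forms, their constants, and the use of a fiber length normalized by an optimal length are all illustrative assumptions; the actual curves of the filing are not given in this text. The force function follows the Hill-type form assumed for Equation 4.

```python
import math


def g_al(l_norm: float) -> float:
    """Active force-length curve: rises, peaks at l_norm = 1, then falls (assumed Gaussian)."""
    return math.exp(-((l_norm - 1.0) ** 2) / 0.45)  # 0.45 is a hypothetical width


def g_pl(l_norm: float, k: float = 5.0, l_max: float = 1.6) -> float:
    """Passive force-length curve: zero up to l_norm = 1, then steeply increasing."""
    if l_norm <= 1.0:
        return 0.0
    # Normalized so that the passive force equals 1 at the hypothetical length l_max.
    return math.expm1(k * (l_norm - 1.0)) / math.expm1(k * (l_max - 1.0))


def muscle_force(act: float, l_norm: float, g_av_value: float = 1.0,
                 alpha: float = 0.0) -> float:
    """Actuator force from activation, normalized fiber length and pennation angle."""
    if not 0.0 <= act <= 1.0:
        raise ValueError("activation must lie in [0, 1]")
    active = act * g_al(l_norm) * g_av_value
    passive = g_pl(l_norm)
    return (active + passive) * math.cos(alpha)
```

With `g_av_value` left at 1 the function describes the isometric case, so a relaxed muscle at its optimal length produces no force, while a stretched relaxed muscle still produces passive force.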

Figure 6 shows the relationship between the time derivative of the muscle fiber length (velocity) and force. Specifically, it shows how the force of the muscle fiber changes with g_av(l̇_m) in Equation 4.

It can be seen that the force of the muscle fiber increases as the fiber length changes more quickly.

The method of controlling the character using the muscle actuators is described below.

The muscle actuators are located between path points attached to the body. The force generated by a muscle actuator may be applied in the direction that pulls the path points together (contraction) or pushes them apart (relaxation). When the path points move, the joints move accordingly. At this time, the forces between the path points may balance and cancel each other.

The force exerted by a muscle actuator may be calculated from the muscle fiber length, the rate of change of the muscle fiber length, and the activation value, using Equation 4. The length of the muscle actuator and its rate of change may be calculated from the state information of the character. The activation value may be calculated from the output value of the reinforcement learning model.

The muscle fiber length is needed to calculate the force exerted by a muscle actuator. As described above, the tendon length of a muscle actuator is fixed, so the muscle fiber length may be calculated by subtracting the tendon length from the muscle actuator length.
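A minimal sketch of this fiber-length computation follows. It assumes the reconstructed form of Equation 1, l_mtu = l_t + l_m cos(α), so the subtraction described in the text corresponds to the case α = 0. The function and parameter names are hypothetical.

```python
import math


def fiber_length(l_mtu: float, l_tendon_rest: float, pennation_angle: float = 0.0) -> float:
    """Fiber length from the actuator length and the fixed tendon resting length."""
    projected = l_mtu - l_tendon_rest  # fiber length projected onto the tendon direction
    if projected < 0.0:
        raise ValueError("actuator is shorter than the tendon resting length")
    return projected / math.cos(pennation_angle)


def fiber_velocity(l_m_now: float, l_m_prev: float, dt: float) -> float:
    """Finite-difference estimate of the fiber length rate of change."""
    return (l_m_now - l_m_prev) / dt
```

Because the tendon length is constant, only the actuator length has to be read from the character state at each simulation step.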

The reinforcement learning model is described in detail below.

The reinforcement learning-based model may be a model that receives state values and outputs action values. More specifically, it may be a model that outputs the action values that obtain the maximum reward.

Figure 7 shows one embodiment of how the reinforcement learning model operates. The reinforcement learning model may receive the state information of the character as the state value. It may output the activation values of the muscle actuators as the action value and receive the corresponding reward value. The reward may indicate how well human gait characteristics are reflected. The muscle simulation proceeds based on the output value of the reinforcement learning model, and the state information of the character may be updated accordingly.
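The interaction loop described for Figure 7 can be sketched as follows. The `policy`, `simulate` and `reward` callables are hypothetical stand-ins for the trained model, the muscle simulation and the reward function; clamping the action to [0, 1] reflects the stated range of the activation value.

```python
from typing import Callable, List, Sequence, Tuple

State = List[float]
Action = List[float]


def clamp_activations(raw: Sequence[float]) -> Action:
    """Keep every muscle activation inside the valid range [0, 1]."""
    return [min(1.0, max(0.0, a)) for a in raw]


def rollout(policy: Callable[[State], Sequence[float]],
            simulate: Callable[[State, Action], State],
            reward: Callable[[State, Action], float],
            state: State, n_steps: int) -> Tuple[State, float]:
    """Run state -> action -> simulation -> reward for n_steps and sum the reward."""
    total = 0.0
    for _ in range(n_steps):
        action = clamp_activations(policy(state))
        state = simulate(state, action)   # the muscle simulation updates the state
        total += reward(state, action)
    return state, total


# Toy stand-ins: a one-dimensional "character" pushed by two opposing muscles.
toy_policy = lambda s: [1.5, -0.2]
toy_simulate = lambda s, a: [s[0] + 0.1 * (a[0] - a[1])]
toy_reward = lambda s, a: s[0]
final_state, total_reward = rollout(toy_policy, toy_simulate, toy_reward, [0.0], 3)
```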

The reward value may increase the better the character reflects human gait characteristics.

Equation 5 is one of the equations used to determine the reward value.

In Equation 5, r_ori is a term that keeps the character's pelvis from rotating. That is, r_ori imposes a larger penalty the more the pelvis is rotated. r_ori may be calculated using Equation 6.

In Equation 6, w_ori may mean a weight. θ relates to the three rotational degrees of freedom (x, y, z) and may mean how far the pelvis has rotated in each direction.

In Equation 5, r_dev is a term that keeps the character's center of mass from deviating to the left or right. That is, r_dev imposes a larger penalty the further the center of mass deviates sideways. r_dev may be calculated using Equation 7.

In Equation 7, w_dev may mean a weight. COM_z means the z-axis position of the center of mass; specifically, it may mean how far the character's center of mass has deviated with respect to the z-axis.

In Equation 5, r_up is a term that keeps the character's upper body from tilting. It allows the upper body (torso) to stay upright in a straight line. r_up may be calculated using Equation 8.

In Equation 8, w_up means a weight. y_torso means the vector in the y-axis direction of the character's current upper body (torso). y_global is <0, 1, 0>, the unit vector pointing straight along the y-axis.
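Equations 6 to 8 are not reproduced in this text, so their exact functional forms are unknown. The sketch below assumes simple weighted penalties (squared rotation angles, squared lateral offset, and one minus the alignment between the torso axis and the global up axis) purely to illustrate what each term measures. The forms and the default weights are hypothetical.

```python
from typing import Sequence

Y_GLOBAL = (0.0, 1.0, 0.0)  # global up direction named in the text


def r_ori(theta_xyz: Sequence[float], w_ori: float = 1.0) -> float:
    """Penalty that grows with the pelvis rotation about x, y and z."""
    return -w_ori * sum(t * t for t in theta_xyz)


def r_dev(com_z: float, w_dev: float = 1.0) -> float:
    """Penalty that grows with the lateral (z) offset of the center of mass."""
    return -w_dev * com_z * com_z


def r_up(y_torso: Sequence[float], w_up: float = 1.0) -> float:
    """Penalty that grows as the unit torso y-axis tilts away from the global y-axis."""
    alignment = sum(a * b for a, b in zip(y_torso, Y_GLOBAL))
    return -w_up * (1.0 - alignment)
```

Each term is zero for an upright, centered, unrotated character and becomes more negative as the posture degrades, which matches the penalty behavior the text describes.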

In Equation 5, r_vel is a term that makes the character move at the desired speed. That is, r_vel imposes a larger penalty the greater the difference between the character's current speed and the target speed. Its purpose is to make the character move forward at the target speed.

r_vel may be calculated using Equation 9.

In Equation 9, w_vel may mean a weight. v_desired means the target speed and may mean a value that starts at 0 m/s and accelerates to 1.5 m/s over 1 second. v_current means the character's current speed.
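The target-speed schedule can be sketched as below. The text states only the endpoints (0 m/s at the start, 1.5 m/s after 1 second), so the linear ramp is an assumption, as is the squared-difference penalty used for r_vel, since Equation 9 is not reproduced here.

```python
def desired_velocity(t: float, v_final: float = 1.5, ramp_time: float = 1.0) -> float:
    """Target forward speed in m/s at time t, ramping linearly from 0 to v_final."""
    if t <= 0.0:
        return 0.0
    return v_final * min(t / ramp_time, 1.0)


def r_vel(v_current: float, t: float, w_vel: float = 1.0) -> float:
    """Penalty that grows with the gap between the current and target speed."""
    diff = desired_velocity(t) - v_current
    return -w_vel * diff * diff
```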

In Equation 5, r_eng is a term that minimizes energy consumption. r_eng may be calculated from physical quantities such as the degree of muscle activation, the muscle volume, and the muscle fiber velocity.

r_eng may be calculated using two kinds of energy reward.

The first is the metabolic equivalent of task (MET). MET may be the rate of metabolic energy consumption normalized by mass. MET may be a reward calculated at every episode step (dense reward).

The second is the cost of transport (CoT). CoT may be the amount of energy required to travel a unit distance. CoT may be a reward calculated only once, at the end of each episode (sparse reward).
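The difference between the dense MET reward and the sparse CoT reward can be illustrated with the sketch below. It assumes that a per-step metabolic power estimate is already available and that each reward is simply the negated quantity; the metabolic model itself and any weighting used in the filing are not given in this text.

```python
from typing import List, Sequence


def energy_rewards(powers_w: Sequence[float], mass_kg: float, distance_m: float,
                   dt: float, mode: str) -> List[float]:
    """Per-step energy rewards for one episode under the 'met' or 'cot' scheme."""
    n = len(powers_w)
    if n == 0:
        return []
    if mode == "met":
        # Dense: metabolic power normalized by body mass, penalized at every step.
        return [-(p / mass_kg) for p in powers_w]
    if mode == "cot":
        # Sparse: energy per unit distance, penalized once at the final step.
        energy = sum(p * dt for p in powers_w)
        rewards = [0.0] * n
        rewards[-1] = -energy / max(distance_m, 1e-6)  # guard against zero distance
        return rewards
    raise ValueError("mode must be 'met' or 'cot'")
```

For example, with two steps of 70 W and 140 W, a 70 kg character, 0.5 s steps and 2 m traveled, the MET scheme penalizes both steps while the CoT scheme penalizes only the last one.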

이하 강화학습 모델을 학습시키는 실시예 중 하나를 살펴 본다. Below, we will look at one example of training a reinforcement learning model.

정적인 인간과 비슷한 보행 동작을 달성하기 위해 강화학습 모델은 두 단계에 걸쳐서 학습되었다. 먼저 보상값 중 에너지 보상으로 MET를 이용한 뒤, 이후 에너지 보상으로 CoT를 이용하였다. 구체적으로 매 에피소드 스텝마다 주어지는 보상인 MET 로 수렴할 때까지 학습한 뒤 에피소드 마지막에 한 번만 부여되는 CoT 보상으로 전환해서 학습된 정책을 개선하고 동작을 안정되게 했다.To achieve static human-like walking movements, the reinforcement learning model was trained in two steps. First, among the compensation values, MET was used as energy compensation, and then CoT was used as energy compensation. Specifically, we learned until convergence with MET, which is a reward given at every episode step, and then switched to CoT reward, which is given only once at the end of the episode, to improve the learned policy and stabilize the operation.

In the early stage of training, most of the muscle activation values output by the policy consume a large amount of energy and produce unstable motion. The MET reward helps stabilize the motion quickly while the policy learns to maintain balance over multiple episode steps. Once training with MET has converged, switching to the CoT reward gives a higher reward for traveling farther with the same energy expenditure, which can help increase the travel distance effectively.
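The two-stage schedule can be sketched as a stage-dependent energy term plus a convergence check that triggers the switch. Everything below is an assumption made for illustration: the stage labels, the use of the negated quantity as the reward, and the plateau test in `has_converged` (the patent does not state how convergence is detected).

```python
def has_converged(returns, window=5, tol=0.01):
    """Hypothetical plateau test: the mean return of the last `window`
    iterations changed by at most `tol` (relative) versus the window before."""
    if len(returns) < 2 * window:
        return False
    recent = sum(returns[-window:]) / window
    previous = sum(returns[-2 * window:-window]) / window
    return abs(recent - previous) <= tol * max(abs(previous), 1e-8)


def energy_reward(stage, met, cot, is_last_step):
    """Energy term for one step.

    Stage "MET": dense, paid at every step.
    Stage "CoT": sparse, paid only on the last step of the episode.
    Negating the quantity so that lower energy scores higher is an assumption.
    """
    if stage == "MET":
        return -met
    if stage == "CoT":
        return -cot if is_last_step else 0.0
    raise ValueError(f"unknown stage: {stage!r}")
```

A training loop would start with `stage = "MET"` and set `stage = "CoT"` once `has_converged` returns true for the history of episode returns.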

At the start of an episode, the human character's pose is not one with both feet on the ground; instead, either the left or the right leg is raised at a random angle. This is because the character's pose at the start of the episode can play an important role in learning a human-like walking motion. That is, starting with one leg raised makes the policy more likely to explore motions that move forward by alternating the legs.
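The randomized initial pose can be sketched as follows. The text states only that one of the two legs is raised at a random angle; the joint names, the uniform distribution, and the 45-degree upper bound are assumptions for illustration.

```python
import random


def sample_initial_pose(rng, max_hip_flexion_deg=45.0):
    """Raise either the left or the right leg by a random hip angle.

    The angle range and the uniform distribution are assumed values.
    """
    side = rng.choice(["left", "right"])
    angle = rng.uniform(0.0, max_hip_flexion_deg)
    pose = {"left_hip_flexion_deg": 0.0, "right_hip_flexion_deg": 0.0}
    pose[f"{side}_hip_flexion_deg"] = angle  # only one leg leaves the ground
    return pose
```

Calling `sample_initial_pose(random.Random(0))` at every episode reset yields a pose in which one hip stays at 0 degrees and the other is flexed by a random amount.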

After building the reinforcement learning-based model described above, the researchers ran experiments that generate character motion with it. PPO (Proximal Policy Optimization) was used as the reinforcement learning algorithm. The experiments used 32 CPU cores, and learning a stable walking policy took about 10 days. The network's hidden layers are 32 x 32 x 32 in size, and ReLU was used as the activation function.
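The network shape described above (three hidden layers of 32 units with ReLU) can be sketched in plain Python. This is only a deterministic forward pass, not PPO itself. The state dimension `STATE_DIM`, the weight initialization, and the sigmoid on the output layer are assumptions; the sigmoid is chosen here only because muscle activation values lie between 0 and 1. The 120 outputs match the number of muscles in the simulated character.

```python
import math
import random

STATE_DIM = 64    # assumed size of the state vector (not given in the text)
N_MUSCLES = 120   # number of muscles in the simulated character


def init_mlp(sizes, rng):
    """Create (weights, biases) for each layer; the init scheme is assumed."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = math.sqrt(2.0 / n_in)
        w = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((w, [0.0] * n_out))
    return layers


def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0.0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def forward(layers, x):
    """ReLU on hidden layers, sigmoid on the output layer (assumed)."""
    for i, (w, b) in enumerate(layers):
        x = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return [sigmoid(v) for v in x]


policy = init_mlp([STATE_DIM, 32, 32, 32, N_MUSCLES], random.Random(0))
activations = forward(policy, [0.0] * STATE_DIM)
```

Each entry of `activations` is one muscle's activation value in the range 0 to 1.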

The results of the experiments are described below.

Figure 8 shows the character used in the simulation.

The character used in the simulation weighs 75 kg and consists of 16 bodies. It also has 31 degrees of freedom and 120 muscles. In Figure 8, the blue lines indicate muscles in a relaxed state.

Figure 9 shows the result of generating character motion using the trained reinforcement learning-based model.

Figure 9(a) shows one gait cycle. Figure 9(b) shows a total of four gait cycles. Red lines indicate contracted muscles, and blue lines indicate relaxed muscles. Figure 9 confirms that the character can walk with human-like motion.

Figure 10 shows the character motion generated when only the metabolic equivalent of task (MET) was used as the energy reward.

The walking motion itself is similar to Figure 9, but on the third step the character fails to keep its balance when the foot lands and falls.

Figure 11 shows the character motion generated when only the cost of transport (CoT) was used as the energy reward. Because CoT is calculated only once, when the episode ends, it does not provide enough information for learning. Accordingly, compared with MET, which is given at every step of the episode, the character fails to behave stably and falls quickly.

Figure 12 shows the result when no energy reward was used at all. Overall, compared with the previous results, the muscles exert much more force. The character also tries to advance by jumping forward but quickly loses its balance and falls.

Figure 13 is a graph comparing the reward values of the training results of Figures 10 to 12.

When no energy reward is used (no energy), energy consumption is not reduced. As a result, the motion is also unstable, so a high reward is not obtained. In particular, the policy has difficulty entering a stable region early in training. This is because minimizing energy consumption does not merely limit the average magnitude of force; it can also help find the region of the state and action space that is close to human-like movement.

Using an energy reward yields a better reward. In particular, giving the energy reward at every step produces better results, in both reward growth and learning, than giving it once at the end of the episode.

Figure 14 shows the experimental result when the character's initial pose is not specified, that is, when training starts with both feet on the ground.

When the initial pose is not specified, the character learns to jump with both feet. This appears to be because, in the early stage of training, exploring a jump with both feet is easier than lifting one foot.

The image processing device is described below.

Figure 15 shows the configuration of one embodiment of the image processing device 300.

The image processing device 300 may correspond to the image processing device 100 described with reference to Figure 1. That is, the image processing device 300 may be a device that performs the method for generating motion of a character based on a musculoskeletal model.

The image processing device 300 may be physically implemented in various forms. For example, the image processing device 300 may take the form of a PC, a laptop, a smart device, a server, or a chipset dedicated to data processing.

The image processing device 300 may include an input device 310, a storage device 320, a computing device 330, an output device 340, an interface device 350, and a communication device 360.

The input device 310 may include an interface device (keyboard, mouse, touch screen, etc.) that receives commands or data. The input device 310 may include a component that receives information through a separate storage device (USB, CD, hard disk, etc.). The input device 310 may receive data through a separate measuring device or a separate DB. The input device 310 may receive data through wired or wireless communication.

The input device 310 may receive the information and models needed to execute the method for generating motion of a character based on a musculoskeletal model. The input device 310 may receive the state information of the character. The input device 310 may receive the reinforcement learning-based model.

The storage device 320 may store information received through the input device 310. The storage device 320 may store information generated while the computing device 330 performs its computations. That is, the storage device 320 may include memory. The storage device 320 may store the results calculated by the computing device 330.

The storage device 320 may store the information and models needed to execute the method for generating motion of a character based on a musculoskeletal model. The storage device 320 may store the state information of the character and the reinforcement learning-based model.

The computing device 330 may be a device that processes data and performs computations, such as a processor, an AP, or a chip with an embedded program. The computing device 330 may generate a control signal that controls the image processing device 300.

The computing device 330 may perform the computations needed to carry out the method for generating motion of a character based on a musculoskeletal model.

The computing device 330 may input the state information of the character into the reinforcement learning-based model. The computing device 330 may control the state of the character based on the output value of the reinforcement learning-based model.
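The flow performed by the computing device 330 (state in, model output, character control) can be sketched as below. This is an assumed outline rather than the disclosed implementation: `build_state`, `control_step`, the `policy` callable, and the `apply_activations` callback are hypothetical names, and clipping the output to the range 0 to 1 is an assumption that follows from the stated range of the muscle activation value.

```python
def build_state(joint_positions, joint_velocities, muscle_lengths):
    """Concatenate rigid-body state (joint positions and velocities) with
    muscle state (muscle lengths) into one flat state vector."""
    return list(joint_positions) + list(joint_velocities) + list(muscle_lengths)


def control_step(state, policy, apply_activations):
    """Run the model on the state and hand the resulting muscle
    activations to the simulator through a callback."""
    raw = policy(state)
    # Keep each activation in [0, 1]: 0 is relaxed, 1 is fully contracted.
    activations = [min(1.0, max(0.0, a)) for a in raw]
    apply_activations(activations)
    return activations
```

In use, `policy` would be the trained reinforcement learning-based model and `apply_activations` would forward the values to the muscle actuators of the simulated character.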

The output device 340 may be a device that outputs information. The output device 340 may output the interface needed for data processing, the input data, analysis results, and the like. The output device 340 may be physically implemented in various forms, such as a display or a device that outputs documents.

The interface device 350 may be a device that receives commands and data from the outside. The interface device 350 may store the information and models needed to carry out the method for generating motion of a character based on a musculoskeletal model, as received from a physically connected input device or an external storage device. The interface device 350 may receive a control signal for controlling the image processing device 300. The interface device 350 may output the results analyzed by the image processing device 300.

The communication device 360 may refer to a component that receives and transmits information over a wired or wireless network. The communication device 360 may receive a control signal needed to control the image processing device 300. The communication device 360 may transmit the results analyzed by the image processing device 300.

The above-described method for generating motion of a character based on a musculoskeletal model may be implemented as a program (or application) that includes an algorithm executable on a computer.

The program may be stored and provided in a transitory or non-transitory computer readable medium.

A non-transitory readable medium refers to a medium that stores data semi-permanently and can be read by a device, rather than a medium that stores data for a short time, such as a register, cache, or memory. Specifically, the various applications or programs described above may be stored and provided in a non-transitory readable medium such as a CD, DVD, hard disk, Blu-ray disc, USB, memory card, ROM (read-only memory), PROM (programmable read-only memory), EPROM (erasable PROM), EEPROM (electrically erasable PROM), or flash memory.

A transitory readable medium refers to various types of RAM, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synclink DRAM (SLDRAM), and direct Rambus RAM (DRRAM).

The present embodiments and the drawings attached to this specification merely illustrate part of the technical idea included in the above-described technology. It is evident that all modifications and specific embodiments that a person skilled in the art can readily infer within the scope of the technical idea contained in the specification and drawings fall within the scope of rights of the above-described technology.

Claims

Receiving, by an image processing device, state information of a character;
Inputting, by the image processing device, the state information of the character into a reinforcement learning-based model; and
Controlling, by the image processing device, the state of the character based on the output value of the reinforcement learning-based model,
The character consists of a joint, a path point formed on the joint, and a muscle actuator connected between the path points,
The muscle actuator applies force between path points to move the character,
The reinforcement learning-based model is a model that outputs an action value that receives the maximum reward value based on the input state information,
The state information of the character includes rigid body state information and muscle state information of the character,
The reward value increases the more closely the motion reflects human walking characteristics,
A method for generating motion of a character based on a musculoskeletal model, wherein the action value includes a value for whether the muscle actuator will contract or relax.

According to claim 1,
A method for generating motion of a character based on a musculoskeletal model, wherein the rigid body state information of the character includes information about the positions and speeds of joints of the character.

According to claim 1,
A method for generating motion of a character based on a musculoskeletal model, wherein the muscle state information of the character includes length information of the muscle actuator.

According to claim 1,
A method for generating motion of a character based on a musculoskeletal model, wherein the reward value increases when at least one of the following is satisfied: the character's pelvis does not rotate, the character's center of gravity does not sway from side to side, the character does not tilt, the character moves at the desired speed, the activation values of the character's muscles are minimized, the character's energy consumption is minimized, and the character maintains balance.

According to claim 4,
The character's energy consumption includes the metabolic equivalent of task (MET) or the cost of transport (CoT),
The MET includes the rate of change in energy consumption normalized by mass,
A method for generating motion of a character based on a musculoskeletal model, wherein the CoT includes the amount of energy required to move a unit distance.

According to claim 5,
A method for generating motion of a character based on a musculoskeletal model, wherein the reinforcement learning-based model is first trained using the MET as the energy consumption in the reward value, and is then trained again using the CoT as the energy consumption in the reward value.

According to claim 1,
The muscle actuator consists of tendons and muscle fibers,
A method for generating motion of a character based on a musculoskeletal model, wherein the muscle fibers include an active part that models the force with which the muscle contracts and a passive part that models the force with which the muscle tries to return to its original state.

According to claim 7,
A method for generating motion of a character based on a musculoskeletal model, wherein the force (f_mtu) produced by the muscle actuator is calculated based on the following equation.
[Equation]

In the above equation, l_m may denote the length of the muscle fiber, and $\dot{l}_m$ may denote the time derivative of the muscle fiber length, in other words the rate of change of the muscle fiber length. g_al denotes the relationship between the active part of the muscle fiber and the muscle fiber length. g_av denotes the relationship between the active part of the muscle fiber and the time derivative of the muscle fiber length, in other words the relationship between the active part of the muscle and the rate of change of the muscle fiber length. g_pl denotes the relationship between the passive part of the muscle and the muscle fiber length. f_mtu denotes the force that the muscle actuator can produce. In cos(α), α denotes the angle (muscle angle) between the tendon and the muscle fiber. The activation value (act) may have a value between 0 and 1. That is, it may be 0 when the muscle is relaxed and 1 when the muscle is contracted.

An input device that receives character status information;
a computing device that inputs the character's state information into a reinforcement learning-based model, and allows the image processing device to control the character's state based on an output value of the reinforcement learning-based model; and
A storage device that stores the status information of the character and the reinforcement learning-based model,
The character consists of a joint, a path point formed on the joint, and a muscle actuator connected between the path points,
The muscle actuator applies force between path points to move the character,
The reinforcement learning-based model is a model that outputs an action value that receives the maximum reward value based on the input state information,
The state information of the character includes rigid body state information and muscle state information of the character,
The reward value increases the more closely the motion reflects human walking characteristics,
An apparatus for generating motion of a character based on a musculoskeletal model, wherein the action value includes a value for whether the muscle actuator will contract or relax.

According to claim 9,
An apparatus for generating motion of a character based on a musculoskeletal model, wherein the rigid body state information of the character includes information about the position and speed of joints of the character.

According to claim 9,
An apparatus for generating motion of a character based on a musculoskeletal model, wherein the muscle state information of the character includes length information of the muscle actuator.

According to claim 9,
An apparatus for generating motion of a character based on a musculoskeletal model, wherein the reward value increases when at least one of the following is satisfied: the character's pelvis does not rotate, the character's center of gravity does not sway from side to side, the character does not tilt, the character moves at the desired speed, the activation values of the character's muscles are minimized, the character's energy consumption is minimized, and the character maintains balance.

According to claim 12,
The character's energy consumption includes the metabolic equivalent of task (MET) or the cost of transport (CoT),
The MET includes the rate of change in energy consumption normalized by mass,
An apparatus for generating motion of a character based on a musculoskeletal model, wherein the CoT includes the amount of energy required to move a unit distance.

According to claim 13,
An apparatus for generating motion of a character based on a musculoskeletal model, wherein the reinforcement learning-based model is first trained using the MET as the energy consumption in the reward value, and is then trained again using the CoT as the energy consumption in the reward value.

According to claim 9,
The muscle actuator consists of tendons and muscle fibers,
An apparatus for generating motion of a character based on a musculoskeletal model, wherein the muscle fibers include an active part that models the force with which the muscle contracts and a passive part that models the force with which the muscle tries to return to its original state.

According to claim 15,
An apparatus for generating motion of a character based on a musculoskeletal model, wherein the force (f_mtu) produced by the muscle actuator is calculated based on the following equation.
[Equation]

In the above equation, l_m may denote the length of the muscle fiber.

A computer-readable recording medium recording a program for executing the method for generating motion of a musculoskeletal model-based character according to claim 1.