KR102645536B1

KR102645536B1 - Animation processing methods and devices, computer storage media, and electronic devices

Info

Publication number: KR102645536B1
Application number: KR1020227002917A
Authority: KR
Inventors: 룽 장
Original assignee: 텐센트 테크놀로지(센젠) 컴퍼니 리미티드
Priority date: 2020-01-15
Filing date: 2020-11-02
Publication date: 2024-03-07
Also published as: CN111292401A; JP7407919B2; EP4009282A1; JP2022553167A; KR20220025023A; US11790587B2; CN111292401B; US20220139020A1; US20240005582A1; EP4009282A4; WO2021143289A1

Abstract

The present disclosure provides an animation processing method and apparatus, and relates to the field of artificial intelligence. The method includes: obtaining a terrain feature in a graphical user interface at a current moment, and obtaining state information and task information corresponding to a virtual character in an animation segment at the current moment; inputting the terrain feature, the state information, and the task information into an animation processing model, and performing feature extraction on them using the animation processing model to obtain joint action information corresponding to the virtual character at a next moment; and determining a joint torque according to the joint action information, performing rendering based on the joint torque to obtain gesture adjustment information corresponding to the virtual character at the current moment, and processing the animation segment according to the gesture adjustment information.

Description

Animation processing methods and devices, computer storage media, and electronic devices

This disclosure claims priority to Chinese Patent Application No. 202010043321.5, filed on January 15, 2020, the content of which is incorporated herein by reference in its entirety.

The present disclosure relates to the field of artificial intelligence technologies, and in particular to an animation processing method, an animation processing apparatus, a computer storage medium, and an electronic device.

With the continuous development of artificial intelligence, artificial intelligence technologies are being applied in more and more fields, such as medicine, finance, and graphic design. Game design, for example, has gradually evolved from the original 2D game design to the current 3D game design.

Currently, in game production, an animator generally designs a plurality of animation segments, and a game engine then blends and switches between these segments to produce the final in-game effect. An animation is an expression of character behavior, and a complete animation segment records and plays back the actions of a character object over a period of time. However, an animation produced by an animator plays back less naturally and less vividly than an animation rendered in real time by a physics engine, and it cannot interact with the player: for example, it cannot carry out a variable target task or adapt to dynamic terrain.

The information disclosed in the background above is only intended to aid understanding of the background of the present disclosure, and may therefore include information that does not constitute related art known to a person of ordinary skill in the art.

Embodiments of the present disclosure provide an animation processing method, an animation processing apparatus, a computer storage medium, and an electronic device.

An embodiment of the present disclosure provides an animation processing method applicable to an electronic device. The animation processing method includes: obtaining a terrain feature in a graphical user interface at a current moment, and obtaining state information and task information corresponding to a virtual character in an animation segment at the current moment; inputting the terrain feature, the state information, and the task information into an animation processing model, and performing feature extraction on the terrain feature, the state information, and the task information using the animation processing model to obtain joint action information corresponding to the virtual character at a next moment; determining a joint torque according to the joint action information; and obtaining, based on the joint torque, gesture adjustment information corresponding to the virtual character at the current moment, and processing the animation segment according to the gesture adjustment information.

An embodiment of the present disclosure provides an animation processing apparatus. The animation processing apparatus includes: an information acquisition module configured to obtain a terrain feature in a graphical user interface at a current moment and to obtain state information and task information corresponding to a virtual character in an animation segment at the current moment; a model processing module configured to input the terrain feature, the state information, and the task information into an animation processing model, and to perform feature extraction on the terrain feature, the state information, and the task information using the animation processing model to obtain joint action information corresponding to the virtual character at a next moment; and a gesture adjustment module configured to determine a joint torque according to the joint action information, obtain, based on the joint torque, gesture adjustment information corresponding to the virtual character at the current moment, and process the animation segment according to the gesture adjustment information.

An embodiment of the present disclosure provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements the animation processing method according to the foregoing embodiment.

An embodiment of the present disclosure provides an electronic device. The electronic device includes one or more processors and a storage apparatus configured to store one or more programs. When executed by the one or more processors, the one or more programs cause the one or more processors to implement the animation processing method according to the foregoing embodiment.

In the technical solutions provided in the embodiments of the present disclosure, a terrain feature in a graphical user interface at a current moment, and state information and task information corresponding to a virtual character in an animation segment at the current moment, are first obtained. Feature extraction is then performed on the terrain feature, the state information, and the task information using an animation processing model to obtain joint action information corresponding to the virtual character at a next moment. Finally, a joint torque is determined according to the joint action information, gesture adjustment information corresponding to the virtual character at the current moment is obtained based on the joint torque, and the animation segment is processed according to the gesture adjustment information. In this way, not only can the animation segment be simulated, but the actions and gestures of the virtual character can also be adjusted according to different terrain features and task information. On the one hand, the fidelity of the animation is improved; on the other hand, interaction between the user and the virtual character is realized, and the self-adaptivity of the virtual character is improved.
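The data flow summarized above can be sketched for a single moment as follows. This is a minimal illustration only, not the disclosed implementation: the names Observation, process_moment, toy_policy, toy_torque, and toy_physics are hypothetical, and the three toy functions are simple stand-ins for the animation processing model, the torque computation, and the physics-based update, whose actual forms are not given in this passage.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Observation:
    """Inputs gathered at the current moment (all fields are illustrative)."""
    terrain: List[float]  # terrain feature, e.g. sampled ground heights
    state: List[float]    # character state, e.g. one angle per joint
    task: List[float]     # task information, e.g. a target direction


# Hypothetical callable types standing in for the three processing stages.
Policy = Callable[[Observation], List[float]]
TorqueFn = Callable[[Sequence[float], Sequence[float]], List[float]]
PhysicsFn = Callable[[Sequence[float], Sequence[float]], List[float]]


def process_moment(obs: Observation, policy: Policy,
                   to_torque: TorqueFn, physics: PhysicsFn) -> List[float]:
    """Run one moment: model -> joint action -> joint torque -> adjusted pose."""
    joint_action = policy(obs)                       # action for the next moment
    joint_torque = to_torque(joint_action, obs.state)
    return physics(obs.state, joint_torque)          # pose after applying torque


def toy_policy(obs: Observation) -> List[float]:
    # Assumption: shift every joint by (task value - mean terrain height).
    offset = obs.task[0] - sum(obs.terrain) / len(obs.terrain)
    return [angle + offset for angle in obs.state]


def toy_torque(action: Sequence[float], state: Sequence[float]) -> List[float]:
    # Assumption: torque is simply the gap between target and current angle.
    return [a - s for a, s in zip(action, state)]


def toy_physics(state: Sequence[float], torque: Sequence[float]) -> List[float]:
    # Assumption: each joint moves by half of the applied torque.
    return [s + 0.5 * t for s, t in zip(state, torque)]
```

Passing the stages in as callables keeps the sketch independent of any particular model or physics engine; each stand-in can be swapped out without changing process_moment.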

It should be understood that the above general description and the following detailed description are merely illustrative and explanatory, and do not limit the present disclosure.

The accompanying drawings are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure, and are used together with the specification to explain the principles of the present disclosure. Obviously, the accompanying drawings described below show only some embodiments of the present disclosure, and a person of ordinary skill in the art may derive other drawings from them without creative effort. In the accompanying drawings:
Figure 1 is a schematic diagram of an example system architecture to which the technical solutions of embodiments of the present disclosure can be applied.
Figure 2 schematically shows the configuration structure of a virtual character in conventional skinned animation.
Figure 3 schematically shows a schematic flowchart of an animation processing method according to an embodiment of the present disclosure.
Figure 4 schematically shows a schematic diagram of a scene obtained after a game scene and a real scene are integrated according to an embodiment of the present disclosure.
Figure 5 schematically illustrates a schematic diagram of an interface with densely-notched terrain according to an embodiment of the present disclosure.
Figure 6 schematically shows a schematic diagram of an interface of a hybrid-obstacle terrain according to an embodiment of the present disclosure.
Figure 7 schematically shows action information of a first frame of a walking action of a human-shaped character according to an embodiment of the present disclosure.
Figure 8 schematically shows a schematic diagram of a terrain interface according to an embodiment of the present disclosure.
Figure 9 schematically shows a schematic structural diagram of an animation processing model according to an embodiment of the present disclosure.
Figure 10 schematically shows a schematic structural diagram of a first control network according to an embodiment of the present disclosure.
Figure 11 schematically shows a schematic structural diagram of a second control network according to an embodiment of the present disclosure.
Figure 12 schematically shows a schematic flowchart of reinforcement learning according to an embodiment of the present disclosure.
Figure 13 schematically shows an architecture diagram of an algorithmic framework of an animation processing model according to an embodiment of the present disclosure.
Figure 14 schematically shows an action sequence of a virtual character running on flat ground and controlled by an animation processing model, according to an embodiment of the present disclosure.
Figures 15A to 15E schematically show an action sequence of a human-shaped virtual character running on densely-notched terrain according to an embodiment of the present disclosure.
Figures 16A to 16L schematically show an action sequence of a human-shaped virtual character running on hybrid-obstacle terrain according to an embodiment of the present disclosure.
Figure 17 schematically shows a block diagram of an animation processing device according to an embodiment of the present disclosure.
Figure 18 shows a schematic structural diagram of a computer system of an electronic device adapted to implement an embodiment of the present disclosure.

Example implementations are now described more thoroughly with reference to the accompanying drawings. However, the present disclosure may be implemented in various forms and is not limited to the embodiments described herein. Rather, these implementations are provided so that the present disclosure is more thorough and complete and fully conveys the ideas of the example implementations to a person skilled in the art.

In addition, the described features, structures, or characteristics may be combined in one or more embodiments in any suitable manner. In the following description, many details are provided to give a full understanding of the embodiments of the present disclosure. However, a person skilled in the art should recognize that the technical solutions of the present disclosure may be implemented without one or more of the specific details, or that other methods, units, apparatuses, or steps may be used. In other cases, well-known methods, apparatuses, implementations, or operations are not shown or described in detail, so as not to obscure aspects of the present disclosure.

The block diagrams shown in the accompanying drawings are merely functional entities and do not necessarily correspond to physically independent entities. That is, the functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor apparatuses and/or microcontroller apparatuses.

The flowcharts shown in the accompanying drawings are merely illustrative. They need not include all content and operations/steps, and the operations/steps need not be performed in the described order. For example, some operations/steps may be further divided, while others may be combined or partially combined. The actual execution order may therefore vary depending on the actual case.

Figure 1 is a schematic diagram of an example system architecture to which the technical solutions of embodiments of the present disclosure can be applied.

As shown in Figure 1, the system architecture 100 may include a terminal device 101, a network 102, and a server 103. The network 102 is the medium that provides a communication link between the terminal device 101 and the server 103. The network 102 may include various connection types, for example, a wired communication link or a wireless communication link.

It should be understood that the numbers of terminal devices, networks, and servers in Figure 1 are merely examples. There may be any number of terminal devices, networks, and servers, depending on actual requirements. For example, the server 103 may be a server cluster including a plurality of servers. The terminal device 101 may be a terminal device including a display screen, such as a notebook computer, a portable computer, or a desktop computer.

In an embodiment of the present disclosure, a game application runs on the terminal device 101, and the game application includes an animation segment. While the game application is running, an obstacle may be set for the virtual character using the relevant controls of the game application; alternatively, a real scene may be photographed using a photographing unit of the terminal device 101 and integrated into the game screen to set an obstacle for the virtual character. In addition, the user may set a task for the virtual character according to the scene in the animation segment, for example, having the virtual character move in a target direction or toward a target point. The terminal device 101 may send, over the network 102, the terrain feature in the graphical user interface at the current moment, together with the task information and state information corresponding to the virtual character in the animation segment at the current moment, to the server 103. The server 103 processes the terrain feature, the task information, and the state information to obtain gesture adjustment information corresponding to the virtual character at the current moment. In this way, not only can the animation segment be simulated, but the virtual character can also be self-adaptive and complete the set task.

In some embodiments, feature extraction may be performed on the terrain feature, the state information, and the task information using the animation processing model to obtain joint action information corresponding to the virtual character at the next moment. A joint torque may be determined based on the joint action information, and the joint torque may be applied to the corresponding joint using a physics engine to perform rendering, so as to obtain gesture adjustment information corresponding to the virtual character at the current moment. The state information corresponding to the virtual character may be gesture information corresponding to the virtual character at the initial moment of the animation segment, or state information determined based on the joint action information of the previous moment. The animation segment has a certain duration. Gesture adjustment information corresponding to the virtual character at a plurality of moments may be obtained by repeating the foregoing operations, a target action sequence may be determined according to the gesture adjustment information at the plurality of moments, and the target action sequence may form an animation segment. This animation segment resembles the animation segment of the running game and has high fidelity; the difference is that the virtual character in it can adapt by itself to the terrain set by the user and complete the task set by the user. That is, the technical solutions of the embodiments of the present disclosure improve the interaction between the user and the virtual character and improve the self-adaptivity of the virtual character, thereby further improving the user experience.
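The repeated torque-and-update cycle described above can be illustrated for a single joint. The sketch below rests on assumptions that this passage does not state: it uses a proportional-derivative (PD) controller to turn a target joint angle into a torque, and a semi-implicit Euler integrator as a stand-in for the physics engine. The names pd_torque and rollout, the gains kp and kd, the time step, and the unit inertia are all hypothetical.

```python
from typing import Callable, List


def pd_torque(target: float, angle: float, velocity: float,
              kp: float = 50.0, kd: float = 5.0) -> float:
    """Assumed PD rule: pull towards the target angle and damp the velocity."""
    return kp * (target - angle) - kd * velocity


def rollout(policy: Callable[[float], float], angle0: float, steps: int,
            dt: float = 0.01, inertia: float = 1.0) -> List[float]:
    """Repeat the per-moment update and collect the resulting joint angles.

    policy maps the current joint angle (the state) to a target angle
    (the joint action for the next moment).
    """
    angle, velocity = angle0, 0.0
    sequence: List[float] = []
    for _ in range(steps):
        target = policy(angle)                       # joint action
        torque = pd_torque(target, angle, velocity)  # joint torque
        # Semi-implicit Euler: update velocity first, then the angle.
        velocity += (torque / inertia) * dt
        angle += velocity * dt
        sequence.append(angle)                       # pose at this moment
    return sequence
```

For example, rollout(lambda angle: 1.0, 0.0, 2000) drives the joint from 0 toward a constant target of 1 radian; the returned list plays the role of the target action sequence for this one joint.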

The animation processing method provided in the embodiments of the present disclosure may be performed by a server, in which case the animation processing apparatus is disposed in the server. In other embodiments of the present disclosure, however, the animation processing method may instead be performed by a terminal device.

The server 103 may be an independent physical server, a server cluster or distributed system including a plurality of physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (CDN), and big data and artificial intelligence platforms.

In the related art in this field, taking a 3D game as an example, character animation in a 3D game generally means skinned animation. Figure 2 shows the composition structure of a virtual character in skinned animation. As shown in Figure 2, a virtual character in skinned animation includes bones, a skin, and animations. The bones form a movable skeleton made up of joints; they are a movable virtual body that drives the entire character but is not rendered in the game. The skin is a triangular mesh that wraps the bones, and each vertex of the mesh is controlled by one or more bones. An animation is the change in the position or orientation of each bone at given time points, which in three-dimensional space is generally represented by a matrix. Typically, an animator uses 3D animation production software to design and produce a large number of animation segments in advance, and during gameplay the program plays the animation segment required by the scene at the appropriate time. Where particularly necessary, the program performs animation post-processing before rendering; for example, the exact positions of the virtual character's hands and feet are calculated from the actual environment at that time using an inverse kinematics (IK) method, in order to adjust the action. However, the effect of post-processing is limited, and in general the quality of an animation depends almost entirely on the skill of the animator. An animation produced directly by an animator is in effect simply played back in the game, without a physics engine simulating the physical laws of the real world, so the character's actions are not sufficiently natural or vivid. The industry currently has several machine learning solutions for training a physical animation AI. However, the learning effect is generally poor, and one model can learn only one action, performed in a single way.
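The statement that each mesh vertex is controlled by one or more bones is commonly realized with linear blend skinning, in which a vertex is transformed by every influencing bone and the results are averaged by weight. The sketch below assumes that technique, which this passage does not name, and simplifies it to two dimensions; the Transform tuple and the functions apply_transform and skin_vertex are hypothetical names, and each bone transform is assumed to be already expressed relative to the bind pose.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
# Assumed 2D bone transform: (rotation in radians, translation x, translation y).
Transform = Tuple[float, float, float]


def apply_transform(bone: Transform, point: Point) -> Point:
    """Rotate the point about the origin, then translate it."""
    angle, tx, ty = bone
    c, s = math.cos(angle), math.sin(angle)
    x, y = point
    return (c * x - s * y + tx, s * x + c * y + ty)


def skin_vertex(vertex: Point, bones: Sequence[Transform],
                weights: Sequence[float]) -> Point:
    """Linear blend skinning: weighted sum of the vertex moved by each bone."""
    if len(bones) != len(weights):
        raise ValueError("one weight is required per bone")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    x = y = 0.0
    for bone, weight in zip(bones, weights):
        bx, by = apply_transform(bone, vertex)
        x += weight * bx
        y += weight * by
    return (x, y)
```

A vertex influenced equally by an unmoved bone and a bone rotated by 90 degrees ends up halfway between the two transformed positions, which is how a skin bends smoothly around a joint.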

In addition, the main way of implementing animation in modern game production is to play back animation segments produced by an animator, which is in practice applicable only to predefined, enumerable scenes and has no ability to adapt to the environment by itself. A character's self-adaptivity to the environment means that, in an unknown environment, the character animation can present a gesture that matches the environment. "Unknown" here is relative to the environment assumed during pre-production of the animation: the actual environment differs from it to a greater or lesser degree when the animation segment is used. In addition, a collision caused by external interference can be perceived, so that the character shows deviation and correction of its action, giving a strong sense of realism. Realizing self-adaptivity to the environment requires at least the IK technique, so that the character's extremities can be aligned in position with the environment or a target; and if the character's feedback to the environment is to be vivid, "physics" (that is, simulation of rigid body dynamics) must additionally be introduced in order to calculate an appropriate speed and a smooth transition process for the character's actions. Generally, the terrain is fixed, the process of the character moving over the terrain is produced as an animation, and appropriate corrections are made to the unnatural parts. This procedure is still essentially animation playback, and the character's movement over the terrain is unnatural.
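Aligning an extremity with a target position, as IK is said to do above, can be shown with the standard analytic solution for a planar two-bone limb. This is a generic textbook construction offered as a sketch, not a method taken from this document; the function names two_bone_ik and end_position, and the choice of a planar limb rooted at the origin, are assumptions made for illustration.

```python
import math
from typing import Tuple


def two_bone_ik(l1: float, l2: float, x: float, y: float) -> Tuple[float, float]:
    """Return (root angle, middle-joint angle) placing the limb tip at (x, y).

    l1 and l2 are the bone lengths. Targets outside the reachable range
    are clamped to the nearest reachable distance.
    """
    dist = math.hypot(x, y)
    dist = max(abs(l1 - l2), min(l1 + l2, dist))
    # Law of cosines gives the bend at the middle joint.
    cos_mid = (dist * dist - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    cos_mid = max(-1.0, min(1.0, cos_mid))
    mid = math.acos(cos_mid)
    # Root angle: direction to the target minus the offset caused by the bend.
    root = math.atan2(y, x) - math.atan2(l2 * math.sin(mid),
                                         l1 + l2 * math.cos(mid))
    return root, mid


def end_position(l1: float, l2: float, root: float, mid: float) -> Tuple[float, float]:
    """Forward kinematics, used here to check the IK result."""
    jx, jy = l1 * math.cos(root), l1 * math.sin(root)
    return (jx + l2 * math.cos(root + mid), jy + l2 * math.sin(root + mid))
```

Solving for a target and feeding the angles back through end_position reproduces the target whenever it is within reach. As the paragraph above notes, this fixes only the position; the speed and smoothness of the motion are what the physics simulation is meant to supply.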

In view of the existing problems in the related art, embodiments of the present disclosure provide an animation processing method. The method is implemented based on artificial intelligence (AI). Artificial intelligence is the theory, methods, technologies, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, AI is a comprehensive technology of computer science that seeks to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. AI studies the design principles and implementation methods of various intelligent machines so that the machines can perceive, reason, and make decisions.

AI technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. Basic AI technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. AI software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.

Computer vision (CV) is a science that studies how to enable a machine to "see". Specifically, it uses cameras and computers in place of human eyes to implement machine vision such as recognition, tracking, and measurement of a target, and further performs graphics processing, so that the computer processes the target into an image that is more suitable for human eyes to observe or for transmission to an instrument for detection. As a scientific discipline, CV studies related theories and technologies and attempts to build AI systems that can obtain information from images or multidimensional data. CV technologies generally include image processing, image recognition, image semantic understanding, image retrieval, optical character recognition (OCR), video processing, video semantic understanding, video content/behavior recognition, 3D object reconstruction, 3D technology, virtual reality, augmented reality, synchronous positioning, and map construction, and further include common biometric feature recognition technologies such as face recognition and fingerprint recognition.

Machine learning (ML) is a multi-field interdiscipline that involves probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. ML specializes in studying how a computer simulates or implements human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures, so as to keep improving its performance. ML is the core of AI and the fundamental way to make computers intelligent, and it is applied across the various fields of AI. ML and deep learning generally include technologies such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstrations.

With the research and progress of AI technology, AI is being studied and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, autonomous driving, unmanned aerial vehicles, robots, smart medical care, and smart customer service. It is believed that, as technology develops, AI will be applied in more fields and play an increasingly important role.

The solutions provided in the embodiments of the present disclosure involve the image processing technology of AI and are described through the following embodiments.

The embodiments of the present disclosure first provide an animation processing method. FIG. 3 schematically shows a flowchart of an animation processing method according to an embodiment of the present disclosure. The animation processing method may be performed by a server, and the server may be the server 103 shown in FIG. 1. Processing of a game animation is used as an example. Referring to FIG. 3, the animation processing method includes at least steps S310 to S330.

In step S310: a terrain feature in the graphical user interface at the current moment is obtained, and state information and task information corresponding to the virtual character in the animation segment at the current moment are obtained.

In an embodiment of the present disclosure, to make the game more enjoyable and to strengthen interaction between the user and the virtual character, the user may, during the game, manually set obstacles for the virtual character and thereby form new terrain in the graphical user interface. For example, if the virtual character walks straight along a flat road in the original animation segment, the user may set a roadblock on the moving path of the virtual character, and the roadblock may be an obstacle such as a stone, a step, or a pit. Alternatively, the user may set an obstacle in the air, for example an eave or a flying bird on the moving path of the virtual character, and the virtual character needs to avoid such obstacles to keep moving forward. To make the technical solutions of the embodiments of the present disclosure clear, the following description uses roadblocks as an example, and a roadblock may be an obstacle such as a notch, a protuberance, or a step in the ground.

In an embodiment of the present disclosure, a roadblock may be set by the user through a control built into the game, or may be set according to a real scene. In some embodiments, a roadblock setting button may be provided in the game interaction interface. When the user triggers the roadblock setting button, a list may pop up, and the user selects from the list the roadblock to be set for the virtual character. After the user confirms the selection, the corresponding roadblock appears in the game interface. In an augmented reality game, the real scene is captured using a photographing unit of the terminal device used by the user, and the game engine may integrate the real scene with the game scene. FIG. 4 is a schematic diagram of a scene obtained after a game scene and a real scene are integrated. As shown in FIG. 4, there is a demon spirit V in the game scene, there are a plurality of steps S in the real scene, and a plurality of electric scooters M are placed on the platform of the topmost step. By integrating the game scene with the real scene, the demon spirit V can be placed on a step S of the real scene.

In an embodiment of the present disclosure, setting obstacles in a game generally means setting roadblocks, and terrains generated using a relatively large number of obstacles include densely-notched terrain and hybrid-obstacle terrain. FIG. 5 is a schematic diagram of an interface of a densely-notched terrain. As shown in FIG. 5, the densely-notched terrain is designed so that there are a plurality of successive notches in the ground G, the notches have different widths, and there is a certain interval between two notches. FIG. 6 is a schematic diagram of an interface of a hybrid-obstacle terrain. As shown in FIG. 6, the hybrid-obstacle terrain includes obstacles such as notches C, steps D, and protuberances E on a ground G of a certain length. The obstacles differ in height and width, and there is a certain interval between two obstacles.

In an embodiment of the present disclosure, in addition to setting roadblocks for the virtual character, a task may also be set for the virtual character while it moves. For example, if there is a football in front of the virtual character, the task may be set to "kick the football", the coordinate position of the football is used as a target point, and the task information is determined according to the target point. A target velocity direction may also be set for the virtual character to drive it to move in a particular direction, and the task information may then be determined according to the target velocity direction.
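As a minimal sketch of the two kinds of task information described above, the following Python functions turn a target point or a target velocity direction into a small numeric vector. The text does not specify how task information is encoded, so the function names, the choice of a root-relative offset, and the ground-plane unit vector are all assumptions made for illustration.

```python
import math


def task_from_target_point(root_pos, target_pos):
    """Hypothetical encoding: the target point (e.g. the football)
    expressed as an offset from the character's root position."""
    return tuple(t - r for r, t in zip(root_pos, target_pos))


def task_from_velocity_direction(dx, dz):
    """Hypothetical encoding: the target velocity direction as a
    unit vector in the ground plane."""
    norm = math.hypot(dx, dz)
    if norm == 0.0:
        raise ValueError("target direction must be non-zero")
    return (dx / norm, dz / norm)


# Example: a football 3 m ahead and 4 m to the side of the root.
kick_task = task_from_target_point((1.0, 0.0, 2.0), (4.0, 0.0, 6.0))
heading_task = task_from_velocity_direction(3.0, 4.0)
```

Either vector could then be passed to the animation processing model as the task information for the current moment.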

In an embodiment of the present disclosure, the gestures and actions of the virtual character are correlated with one another in continuous time and space. Take a human-shaped virtual character taking a step as an example: if the right foot of the character is lifted at the current moment, the right foot tends to land at the next moment. Therefore, the joint action information of the virtual character at the next moment needs to be determined by processing the state information of the virtual character at the current moment. The state information describes the state of each joint of the virtual character and may include the gesture, velocity, and phase of the joints. Accordingly, to determine how to change the gesture of the virtual character at the current moment so that it avoids obstacles and completes the task, the terrain feature in the graphical user interface at the current moment and the state information and task information corresponding to the virtual character at the current moment may be obtained, and this information may be processed to obtain the corresponding gesture adjustment information.

In an embodiment of the present disclosure, an animator may save an animation segment in different formats when producing it. To extract the state information of the virtual character from the animation segment, the animation segment is first converted into a file in the FBX format or the BVH format using software such as MotionBuilder or 3ds Max, and the state information is then extracted. In an actual implementation, when the current moment is the initial moment of the animation segment, the state information is determined according to the gesture information of the virtual character at the initial moment of the animation segment, and the gesture information at the initial moment may be used directly as the state information. When the current moment is not the initial moment of the animation segment, the state information is determined according to the joint action information corresponding to the virtual character at the previous moment, and that joint action information may be used directly as the state information.

In an embodiment of the present disclosure, a human-shaped virtual character is used as an example. The human-shaped virtual character has a total of 15 joints: the root joint, chest, neck, right leg, left leg, right knee, left knee, right ankle, left ankle, right shoulder, left shoulder, right elbow, left elbow, right hand, and left hand. The root joint generally refers to the pelvis position and is denoted as root. Generally, the bones and joints of a human-shaped virtual character form a parent-child hierarchical structure. For example, the shoulder is a parent joint, the elbow is a child joint of the shoulder, and the wrist is a child joint of the elbow. The position of a child joint is obtained by applying a corresponding translation to the position of its parent joint. Therefore, the position coordinates of child joints do not need to be recorded: once the position coordinates of the topmost root joint are known and the translations are applied according to the bone sizes set by the animator when designing the animation, the position coordinates of the child joints can be obtained. For an action, the gesture information of the character's joints is recorded in the animation segment, and the current action of the virtual character can be constructed once the position and rotation of each required joint are known. In addition to the position and rotation of the root joint, the corresponding rotations of the other joints are recorded to form the current complete gesture of the virtual character. FIG. 7 shows the action information of the first frame of a walking action of a human-shaped character. As shown in FIG. 7, the action information of the first frame is divided into three lines. The first number in the first line, 0.0333333, is the duration of the first frame in seconds, and the following three numbers (001389296, 0.8033880000000001, 0.0036694320000000002) are the coordinates of the root joint in the first frame in three-dimensional space. The four numbers in the second line (0.5306733251792894, -0.5324986777087051, -0.4638864011202557, -0.46865807049205305) are the rotation information of the root joint in the first frame. The four numbers in the third line (0.7517762842400346, 0.0012912812309982618, -0.0033740637622359164, 0.6594083459744481) are the rotation of the first child joint corresponding to the root joint; the rotation information of the remaining child joints is omitted in FIG. 7. The rotation information is expressed as unit quaternions. A unit quaternion can be used to represent a rotation in three-dimensional space. It is equivalent to the commonly used three-dimensional orthogonal matrix and Euler angles, but avoids the gimbal lock problem of the Euler angle representation. If the Cartesian coordinates of a point in three-dimensional space are (x, y, z), the point is represented by the pure quaternion xi+yj+zk (analogous to a pure imaginary number, that is, a quaternion whose real component is 0). The geometric meaning of i, j, or k can be understood as a rotation: rotation i is a rotation from the positive direction of the X-axis toward the positive direction of the Y-axis in the plane where the X-axis and the Y-axis intersect, rotation j is a rotation from the positive direction of the Z-axis toward the positive direction of the X-axis in the plane where the Z-axis and the X-axis intersect, rotation k is a rotation from the positive direction of the Y-axis toward the positive direction of the Z-axis in the plane where the Y-axis and the Z-axis intersect, and -i, -j, and -k are the reverse rotations of rotation i, rotation j, and rotation k, respectively.
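To make the unit-quaternion representation concrete, the sketch below rotates a point by a quaternion using the pure-quaternion form xi+yj+zk described above, and checks that the root rotation quoted from FIG. 7 has unit length. The component order (w, x, y, z) is an assumption; the text does not state which of the four numbers is the real part.

```python
import math


def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def quat_norm(q):
    return math.sqrt(sum(c * c for c in q))


def rotate(q, point):
    """Rotate a 3D point by unit quaternion q: embed the point as the
    pure quaternion (0, x, y, z) and compute q * p * conjugate(q)."""
    w, x, y, z = q
    p = (0.0, point[0], point[1], point[2])
    r = quat_mul(quat_mul(q, p), (w, -x, -y, -z))
    return r[1:]


# Root rotation of the first frame as quoted from FIG. 7
# (component order assumed, which does not affect the norm).
root_rotation = (0.5306733251792894, -0.5324986777087051,
                 -0.4638864011202557, -0.46865807049205305)

# A quarter turn about the Z-axis carries the X-axis onto the Y-axis.
quarter_turn_z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
rotated = rotate(quarter_turn_z, (1.0, 0.0, 0.0))
```

Because only four numbers are stored per joint and no rotation order has to be chosen, this representation avoids the gimbal lock that can occur with Euler angles.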

In an embodiment of the present disclosure, the state information input to the animation processing model may be a 197-dimensional vector, consisting of a 106-dimensional gesture, a 90-dimensional velocity, and a 1-dimensional phase. In some embodiments, the gesture records the position and rotation information of the 15 joints of the human-shaped character, where each position is expressed as three-dimensional coordinates and each rotation as a unit quaternion, for 15*7 = 105 dimensions in total. In addition, the y-axis value of the root joint coordinates of the virtual character at the current moment (1 dimension) is also recorded for alignment with the world coordinate system, which brings the gesture to 106 dimensions. The velocity records the linear velocity and angular velocity of each joint, each expressed as a vector of length 3 corresponding to the x-axis, y-axis, and z-axis velocities, for 15*(3+3) = 90 dimensions in total. The phase records the position of the current moment within the total duration of the animation segment and has 1 dimension.
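The dimension count above (105 + 1 + 90 + 1 = 197) can be checked with a short sketch that flattens per-joint data into a single state vector. The function name and the ordering of components inside the vector are assumptions for illustration; only the sizes come from the text.

```python
NUM_JOINTS = 15


def build_state(positions, rotations, root_height,
                linear_velocities, angular_velocities, phase):
    """Flatten per-joint data into one state vector.

    positions / linear_velocities / angular_velocities: 15 triples.
    rotations: 15 unit quaternions (4 values each).
    root_height: y-axis value of the root joint (1 value).
    phase: position of the current moment within the segment (1 value).
    """
    for name, seq, width in (
        ("positions", positions, 3),
        ("rotations", rotations, 4),
        ("linear_velocities", linear_velocities, 3),
        ("angular_velocities", angular_velocities, 3),
    ):
        if len(seq) != NUM_JOINTS or any(len(v) != width for v in seq):
            raise ValueError(f"{name} must hold {NUM_JOINTS} vectors of length {width}")

    gesture = []
    for pos, rot in zip(positions, rotations):
        gesture.extend(pos)        # 3 values per joint
        gesture.extend(rot)        # 4 values per joint
    gesture.append(root_height)    # 15*7 + 1 = 106

    velocity = []
    for lin, ang in zip(linear_velocities, angular_velocities):
        velocity.extend(lin)
        velocity.extend(ang)       # 15*(3+3) = 90

    return gesture + velocity + [phase]   # 106 + 90 + 1 = 197


state = build_state(
    positions=[(0.0, 0.0, 0.0)] * NUM_JOINTS,
    rotations=[(1.0, 0.0, 0.0, 0.0)] * NUM_JOINTS,
    root_height=0.8,
    linear_velocities=[(0.0, 0.0, 0.0)] * NUM_JOINTS,
    angular_velocities=[(0.0, 0.0, 0.0)] * NUM_JOINTS,
    phase=0.25,
)
```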

In an embodiment of the present disclosure, the terrain feature may be a two-dimensional matrix. Each element of the matrix is the relative height difference between the terrain height at the corresponding point and the height of the current position of the virtual character, and the matrix covers the heights of a region within a preset range in front of the virtual character. The size of the matrix and the area of terrain it covers may be adjusted according to the actual application scenario. For example, the size of the two-dimensional matrix is set to 100*100, and the area of terrain covered is set to 10m*10m. The embodiments of the present disclosure are not limited thereto. FIG. 8 is a schematic diagram of an interface of a terrain. As shown in FIG. 8, the terrain is a rectangular region, the virtual character A is located at the midpoint of the left sideline, and the arrow indicates the movement direction of the virtual character A. Because the virtual character A only moves forward horizontally without turning, and is parallel to the obstacle B and as high as the obstacle B in the vertical direction, the terrain feature may be determined as a 100*1 matrix that covers the terrain within 10m in front of the virtual character.
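A relative-height matrix of this kind could be sampled as in the following sketch. The `height_at` callback, the cell-center sampling, and the placement of the grid (starting at the character and centered on its heading) are assumptions for illustration; the text only fixes the matrix size and the covered area.

```python
def terrain_feature(height_at, origin, char_height,
                    rows=100, cols=100, length=10.0, width=10.0):
    """Sample terrain heights in front of the character.

    height_at(x, z): hypothetical callback returning terrain height.
    origin: (x, z) of the character; the character is assumed to face +x.
    Returns rows x cols relative heights (terrain minus character height).
    """
    ox, oz = origin
    dx = length / rows
    dz = width / cols
    feature = []
    for i in range(rows):
        x = ox + (i + 0.5) * dx                    # cell center ahead of the character
        row = []
        for j in range(cols):
            z = oz - width / 2 + (j + 0.5) * dz    # centered on the heading
            row.append(height_at(x, z) - char_height)
        feature.append(row)
    return feature


def step_obstacle(x, z):
    """Example terrain: a 0.5 m high block between 2 m and 3 m ahead."""
    return 0.5 if 2.0 <= x < 3.0 else 0.0


# The 100*1 case: the character only walks straight ahead.
strip = terrain_feature(step_obstacle, origin=(0.0, 0.0), char_height=0.0,
                        rows=100, cols=1)
```

With `cols=1` the single column lies on the character's heading, which corresponds to the 100*1 matrix covering 10m in front of the character.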

When the terrain feature in the graphical user interface at the current moment and the state information and task information corresponding to the virtual character in the animation segment at the current moment are obtained, the terrain feature, state information, and task information at the current moment may be received from the terminal, or the server itself may determine them according to the graphical user interface and the received setting information.

In step S320: the terrain feature, the state information, and the task information are input to an animation processing model, and feature extraction is performed on the terrain feature, the state information, and the task information using the animation processing model, to obtain joint action information corresponding to the virtual character at the next moment.

In an embodiment of the present disclosure, after the terrain feature, the state information, and the task information at the current moment are obtained, they may be input to the animation processing model, and feature extraction is performed on them using the animation processing model to obtain the joint action information corresponding to the virtual character at the next moment. The joint action information describes the action that each joint may perform at the next moment when the virtual character faces the terrain feature and the task at the current moment. The joint action information may be the rotation information of the joints, where each rotation is represented with a length of 4 and the rotation information covers the joints other than the root joint, for (15-1)*4 = 56 dimensions in total. The gesture of the root joint can be obtained through simulation by the physics engine after the other joints move and rotate under the effect of torque. For example, when a human-shaped virtual character walks forward on flat ground, after the physics engine performs the movement and rotation according to the torque determined from the rotation information of joints such as the legs and knees, the backward static friction force applied to the feet is transmitted successively to the lower legs, knees, thighs, and the root joint, and the root joint is pushed forward under the effect of the force. Therefore, the action information of the root joint may be omitted from the joint action information.
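The 56-dimensional joint action vector can be unpacked into one length-4 rotation per non-root joint, as in the sketch below. The joint ordering inside the vector and the renormalization step are assumptions for illustration; the text only states that the root joint is excluded and that each rotation has length 4.

```python
import math

# The 14 non-root joints named in the description (order assumed).
NON_ROOT_JOINTS = [
    "chest", "neck",
    "right_leg", "left_leg", "right_knee", "left_knee",
    "right_ankle", "left_ankle",
    "right_shoulder", "left_shoulder", "right_elbow", "left_elbow",
    "right_hand", "left_hand",
]


def split_action(action):
    """Split a flat 56-value action into per-joint rotations.

    Each 4-value slice is renormalized so that it can be used as a unit
    quaternion even if the network output is slightly off unit length.
    """
    expected = 4 * len(NON_ROOT_JOINTS)   # (15 - 1) * 4 = 56
    if len(action) != expected:
        raise ValueError(f"expected {expected} values, got {len(action)}")
    rotations = {}
    for k, joint in enumerate(NON_ROOT_JOINTS):
        q = action[4 * k: 4 * k + 4]
        norm = math.sqrt(sum(c * c for c in q))
        if norm == 0.0:
            raise ValueError(f"zero rotation for joint {joint}")
        rotations[joint] = tuple(c / norm for c in q)
    return rotations


# Example: every joint gets the (unnormalized) rotation (2, 0, 0, 0).
rotations = split_action([2.0, 0.0, 0.0, 0.0] * 14)
```

The root joint does not appear in the result; as described above, its pose would come from the physics simulation rather than from the action vector.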

In an embodiment of the present disclosure, feature extraction may be performed on the obtained terrain feature, state information, and task information using an animation processing model based on reinforcement learning, to obtain the joint action information corresponding to the virtual character at the next moment. FIG. 9 is a schematic structural diagram of the animation processing model. As shown in FIG. 9, the animation processing model 900 includes a first control network 901 and a second control network 902. The first control network 901 may be a high-level controller (HLC) configured to guide the actions of key joints of the virtual character, where the key joints are some of the joints that correspond to the terrain feature and to the state information and task information of the virtual character. For example, when a human-shaped virtual character is running, it is mainly the action of the legs that changes, and the thighs drive the lower legs and feet, so the key joints are the thigh joints. Similarly, when a human-shaped virtual character is throwing, it is mainly the action of the arms and hands that changes, and the elbows drive the wrists and hands, so the key joints are the elbows. The second control network 902 may be a low-level controller (LLC) configured to output the joint action information corresponding to all joints. Setting up the first control network and the second control network separately makes it easier to adapt to complex animation scenes and complex tasks. In addition, the first control network is mainly configured to guide a specific action, and the second control network is mainly configured to control the movement of the character. A plurality of first control networks for different specific actions may be connected to one trained second control network. For example, the trained second control network may output the joint action information of a foot action of the virtual character according to target state information of the feet, and the action corresponding to the target state of the feet may be an action of the virtual character kicking a ball or an action of the virtual character jumping. In this case, the same second control network may be connected both to a first control network that guides the virtual character to kick a ball and to a first control network that guides the virtual character to jump. Processing the animation segment with an animation processing model that includes the first control network and the second control network can improve the effect and fidelity of actions and allows adaptation to various terrains, which improves self-adaptivity to the environment.

In some embodiments, the first control network 901 performs feature extraction on the terrain features and on the state information and task information corresponding to the virtual character at the current moment, to obtain target state information corresponding to key joints. The target state information is then determined as target task information, and the state information and the target task information are input to the second control network 902. The second control network 902 performs feature extraction on the state information and the target task information corresponding to the virtual character, to obtain joint action information corresponding to all joints of the virtual character. Take a human-shaped virtual character crossing an obstacle as an example. To cross obstacles of different heights successfully, the human-shaped virtual character lifts its legs at different angles and takes steps of different sizes. The first control network 901 may output, according to the terrain features, the task information, and the state information, the rotations of the two thigh joints and the velocity direction of the root joint on the plane of the character. These rotations and this velocity direction are the target state information corresponding to the key joints, and the output is used as the target task of the second control network 902 to guide the human-shaped virtual character to lift its legs. Correspondingly, the output of the first control network 901 may be a 10-dimensional vector, that is, two unit quaternions measuring the rotations of the two thighs and a unit vector of length 2 (two components). Of course, in addition to the rotations of the two thigh joints and the velocity direction of the root joint on the plane of the character, the target state information may further include the rotations of the two hand joints, the rotations of the two shoulder joints, and the like. The target state information differs according to the obstacle type and the task information.
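The 10-dimensional output described above can be illustrated with a minimal sketch. It assumes the layout is two quaternions (4 components each) followed by a 2-component direction; the component order and the function names `pack_target` and `unpack_target` are illustrative assumptions, not taken from the disclosure.

```python
import math

def normalize(v):
    """Scale a vector to unit Euclidean length."""
    n = math.sqrt(sum(x * x for x in v))
    if n == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / n for x in v]

def pack_target(left_thigh_quat, right_thigh_quat, root_dir_2d):
    """Build a 10-dimensional vector: 4 + 4 + 2 components (assumed order)."""
    return (normalize(left_thigh_quat)
            + normalize(right_thigh_quat)
            + normalize(root_dir_2d))

def unpack_target(a_h):
    """Split a 10-dimensional vector back into its three parts."""
    if len(a_h) != 10:
        raise ValueError("expected a 10-dimensional vector")
    return a_h[0:4], a_h[4:8], a_h[8:10]

# Example values are arbitrary; normalization makes each part a unit vector.
a_h = pack_target([1.0, 0.0, 0.0, 0.0], [2.0, 0.0, 2.0, 0.0], [3.0, 4.0])
left_q, right_q, root_dir = unpack_target(a_h)
```

Normalizing each part separately keeps the two rotations valid unit quaternions and the direction a unit vector, whatever raw values a network produces.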

In addition, FIG. 10 shows a schematic structural diagram of the first control network. As shown in FIG. 10, the first control network 901 includes a convolution unit 1001, a first fully connected layer 1002, a second fully connected layer 1003, and a third fully connected layer 1004. The convolution unit 1001 may include a plurality of convolution layers of different sizes. As shown in the figure, the size of the first set of convolution layers is 8*8, and the sizes of the second and third sets of convolution layers are both 4*4. The first fully connected layer 1002, the second fully connected layer 1003, and the third fully connected layer 1004 differ in size and include 64, 1024, and 512 neurons, respectively. After the terrain feature T, the task information g_H, and the state information s_H are input to the first control network, the convolution unit 1001 first performs feature extraction on the terrain feature T to obtain first feature information; the first fully connected layer 1002 then performs feature combination on the first feature information to obtain second feature information; next, the second fully connected layer 1003 performs feature combination on the second feature information, the state information s_H, and the task information g_H to obtain third feature information; finally, the third fully connected layer 1004 performs feature combination on the third feature information to obtain the target state information a_H.
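The data flow through the first control network can be sketched in plain Python as follows. This is only an illustration under several assumptions that the text does not state: the terrain feature is a 32x32 single-channel height map, 8*8 and 4*4 are read as kernel sizes, the strides are 2, 2, and 1, ReLU is the activation, the state and task dimensions are 6 and 2, and a final linear layer maps the 512 neurons to the 10-dimensional output. The weights are random, so the output is meaningless apart from its shape.

```python
import random

def conv2d(image, kernel, stride):
    """Valid 2D cross-correlation of a single-channel square image, then ReLU."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    acc += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(max(0.0, acc))
        out.append(row)
    return out

def dense(x, weights, relu=True):
    """Fully connected layer without bias: one weight row per output neuron."""
    y = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return [max(0.0, v) for v in y] if relu else y

def rand_matrix(rows, cols, rng):
    """Random weights, scaled by the fan-in."""
    scale = 1.0 / cols ** 0.5
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def build_params(state_dim, task_dim, rng):
    # Flattened convolution output: 32 -> 13 -> 5 -> 2, so 2 * 2 = 4 features.
    return {
        "k1": rand_matrix(8, 8, rng),
        "k2": rand_matrix(4, 4, rng),
        "k3": rand_matrix(4, 4, rng),
        "fc1": rand_matrix(64, 4, rng),
        "fc2": rand_matrix(1024, 64 + state_dim + task_dim, rng),
        "fc3": rand_matrix(512, 1024, rng),
        "out": rand_matrix(10, 512, rng),  # assumed output projection
    }

def high_level_forward(terrain, s_h, g_h, p):
    f = conv2d(terrain, p["k1"], stride=2)  # 32x32 -> 13x13
    f = conv2d(f, p["k2"], stride=2)        # 13x13 -> 5x5
    f = conv2d(f, p["k3"], stride=1)        # 5x5 -> 2x2
    first = [v for row in f for v in row]   # first feature information
    second = dense(first, p["fc1"])         # second feature information (64)
    third = dense(second + list(s_h) + list(g_h), p["fc2"])  # third feature information (1024)
    hidden = dense(third, p["fc3"])         # 512 neurons
    return dense(hidden, p["out"], relu=False)  # 10-dimensional a_H

rng = random.Random(0)
terrain = [[rng.random() for _ in range(32)] for _ in range(32)]
params = build_params(state_dim=6, task_dim=2, rng=rng)
a_h = high_level_forward(terrain, [0.0] * 6, [1.0, 0.0], params)
```

The point of the sketch is the order of operations: only the terrain passes through the convolution unit, and the state and task information join at the second fully connected layer.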

FIG. 11 shows a schematic structural diagram of the second control network. As shown in FIG. 11, the second control network 902 includes a fourth fully connected layer 1101 and a fifth fully connected layer 1102, which differ in size. In some embodiments, the fourth fully connected layer 1101 may include 1024 neurons, and the fifth fully connected layer 1102 may include 512 neurons. After the first control network 901 outputs the target state information a_H, the target state information a_H may be regarded as the target task information g_L of the second control network 902 and is input to the second control network 902 together with the state information s_L. The fourth fully connected layer 1101 performs feature combination on the state information s_L and the target task information g_L to obtain fourth feature information, and the fifth fully connected layer 1102 then performs feature combination on the fourth feature information to obtain the joint action information a_L.

In the embodiments of the present disclosure, the first control network 901 guides the action of the key joints of the virtual character, that is, a specific action, whereas the second control network 902 outputs the joint action information of all joints of the virtual character to form a continuous action, that is, to control the movement of the character. The calling periods of the first control network 901 and the second control network 902 are therefore different. The first control network 901 needs to be called only when the action of the character or the state of a key joint changes. By contrast, as long as the virtual character is moving, each joint corresponds to its own joint action information, so the second control network 902 needs to be called constantly. Take a virtual character crossing a roadblock as an example: the first control network 901 needs to be called only when the virtual character takes a step, whereas the second control network 902 is called constantly to control the virtual character to act continuously. Setting different calling periods for the two networks saves time and resources, which improves the processing efficiency of the animation processing model and, in turn, the efficiency of action generation. In the embodiments of the present disclosure, with respect to the PD controller, the calling frequency of the first control network 901 is 2 Hz, the calling frequency of the second control network 902 is 30 Hz, and the physical simulation frequency is 3000 Hz. In actual use, whether the first control network 901 needs to be called is determined according to the terrain features, the task information, and the state information at the current moment, whereas the second control network 902 needs to be called constantly to predict the joint action information of the virtual character at the next moment. When the first control network 901 is not called, the input of the second control network 902 remains unchanged.
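The relationship between the three frequencies can be sketched as a loop driven by the simulation clock. The frequencies come from the text; the fixed-interval schedule for the first control network is a simplifying assumption (the text also describes calling it only when needed), and the tuples stand in for real network outputs.

```python
SIM_HZ = 3000   # physical simulation frequency (from the text)
LOW_HZ = 30     # calling frequency of the second control network (from the text)
HIGH_HZ = 2     # calling frequency of the first control network (from the text)

def run_schedule(seconds):
    """Count how often each component runs when driven by the simulation clock."""
    low_every = SIM_HZ // LOW_HZ     # 100 simulation steps per low-level call
    high_every = SIM_HZ // HIGH_HZ   # 1500 simulation steps per high-level call
    counts = {"high": 0, "low": 0, "sim": 0}
    target = None  # latest high-level output, held between high-level calls
    for step in range(int(seconds * SIM_HZ)):
        if step % high_every == 0:
            target = ("target", counts["high"])  # placeholder for a_H
            counts["high"] += 1
        if step % low_every == 0:
            _action = ("action", target)         # placeholder for a_L from the held target
            counts["low"] += 1
        counts["sim"] += 1                       # one physics step
    return counts

counts = run_schedule(1.0)
```

Holding `target` between high-level calls mirrors the statement that the input of the second control network is unchanged while the first control network is not called.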

In the embodiments of the present disclosure, before the animation processing model is used to perform feature extraction on the terrain features, the state information, and the task information, a to-be-trained animation processing model needs to be trained to obtain a stable animation processing model. A commonly adopted approach to training an animation processing model is to input the terrain features into the model. However, this approach has only a moderate effect: training is prone to failure and the actions of the character are somewhat stiff, so the approach adapts only to relatively simple terrain. Therefore, in the embodiments of the present disclosure, the terrain features input to the model are split and hierarchical reinforcement learning is adopted, to strengthen the sensitivity of the animation processing model to terrain and action and to migrate to more complex terrain. Reinforcement learning is a field of machine learning that emphasizes how to act based on the environment so as to maximize the expected benefit. The movement control problem has become a standard benchmark for reinforcement learning, and deep reinforcement learning methods have been shown to adapt to a variety of tasks, including manipulation and movement.

In the embodiments of the present disclosure, reinforcement learning involves several basic concepts: environment, intelligent agent, state, action, reward, value function, and policy. The environment is an external system; the intelligent agent is located in this system, can perceive the system, and can perform a specific action according to the perceived state. The intelligent agent is a system embedded in the environment and can perform tasks that change the state. The state is the state information of the current environment at a given moment. An action is a behavior performed by the agent. A reward is a scalar that represents the environment's reward for the current action or state. The reward defines an immediate benefit, whereas the value function defines a long-term benefit, which can be regarded as an accumulated reward and is usually denoted by V. A policy is a mapping from the current environment state to a behavior and is usually denoted by π: a state is input, and the model outputs the action to be performed in that state. FIG. 12 shows a schematic flowchart of reinforcement learning. As shown in FIG. 12, at a moment t, the current state S_t is input to the intelligent agent, and the intelligent agent outputs an action A_t according to the current policy. The action A_t is performed to interact with the environment, and according to how well the target is completed, the environment feeds back a reward R_t and the state S_{t+1} of the intelligent agent at the next moment t+1. The intelligent agent adjusts the policy according to the reward and outputs the action A_{t+1} at the next moment. This process is repeated and the policy is adjusted continuously, so that a policy π capable of completing the target is finally obtained through training.
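The interaction loop between agent and environment can be illustrated with a toy example. The environment below (positions on a line with a goal at one end) and the function names are hypothetical and unrelated to the animation task; the sketch only shows how states, actions, and rewards are exchanged step by step. Policy adjustment is omitted.

```python
def env_step(state, action, goal=4):
    """Toy environment: positions 0..goal on a line; reward 1.0 on reaching the goal."""
    next_state = min(max(state + action, 0), goal)
    reward = 1.0 if next_state == goal else 0.0
    done = next_state == goal
    return next_state, reward, done

def rollout(policy, start=0, max_steps=20):
    """Run one episode and record (S_t, A_t, reward, S_{t+1}) for every step."""
    state, trajectory = start, []
    for _ in range(max_steps):
        action = policy(state)                      # agent acts under the current policy
        next_state, reward, done = env_step(state, action)  # environment feeds back
        trajectory.append((state, action, reward, next_state))
        state = next_state
        if done:
            break
    return trajectory

# A fixed policy that always moves right, used only to exercise the loop.
trajectory = rollout(lambda s: 1)
```

In training, the recorded rewards would be used to adjust the policy between or during episodes.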

In the embodiments of the present disclosure, the animation processing model is trained based on an AC framework. The AC framework integrates a value function estimation algorithm with a policy search algorithm and includes two networks: an actor network and a critic network. The actor network trains the current policy and is configured to output an action; the critic network learns the value function and is configured to output the current state value V(s), which is used to evaluate the quality of the state. FIG. 13 shows an architecture diagram of the algorithm framework of the animation processing model. As shown in FIG. 13, the framework includes an actor network 1301, a critic network 1302, and an environment 1303. The actor network 1301 outputs an action according to the current state and policy; the environment 1303 provides feedback in the form of a reward according to the action output by the actor network 1301; and the critic network 1302 performs an evaluation according to the state generated after the action is performed and the reward fed back by the environment 1303, determines the current state value, and feeds the current state value back to the actor network 1301 so that the actor network 1301 adjusts the policy. This process is repeated and training continues until the animation processing model is stable. The learning standard for the current state value output by the critic network 1302 is obtained by applying the temporal difference method to the series of rewards fed back by the environment 1303, and is used to guide the learning of the critic network. In some embodiments, taking path simulation as an example, rewards R_1 to R_i corresponding to the nodes on a path can be obtained, where i is the number of nodes on the path. To obtain the state value V(S_t) corresponding to a node t on the path, where t ranges from 1 to i, the value V(S_t) of S_t is updated according to the obtained reward R and the estimated state value of the subsequent state. After the update has been performed several times, a stable value function is obtained. After a path is sampled, the value function can be updated multiple times, and the evaluation algorithm used may be V(S_t) ← V(S_t) + α(R_{t+1} + V(S_{t+1}) − V(S_t)), where α is a coefficient. The temporal difference method is a central idea of reinforcement learning. Like the Monte Carlo method, it can learn directly from experience without complete knowledge of the environment, because it combines the sampling of the Monte Carlo method (that is, performing experiments) with the bootstrapping of the dynamic programming method (that is, estimating the current value function from the value function of the subsequent state). Like the dynamic programming method, it can improve an existing estimate without waiting for the whole event to end, which improves learning efficiency.
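A temporal difference update of the kind described, in which the value of a state is moved toward the obtained reward plus the estimated value of the subsequent state, can be sketched in tabular form. The undiscounted form V(S_t) ← V(S_t) + α(R_{t+1} + V(S_{t+1}) − V(S_t)) is assumed here; the three-node path, the state names, and the reward values are made up for illustration.

```python
def td0_update(values, path, alpha=0.1):
    """One sweep of V(S_t) <- V(S_t) + alpha * (R_{t+1} + V(S_{t+1}) - V(S_t)).

    `path` is a list of (state, reward, next_state) tuples; a next_state of
    None marks the end of the path and contributes a value of 0.
    """
    for state, reward, next_state in path:
        v_next = 0.0 if next_state is None else values.get(next_state, 0.0)
        v_now = values.get(state, 0.0)
        values[state] = v_now + alpha * (reward + v_next - v_now)
    return values

# Hypothetical sampled path: only the last transition yields a reward.
path = [("s1", 0.0, "s2"), ("s2", 0.0, "s3"), ("s3", 1.0, None)]

first_sweep = td0_update({}, path, alpha=0.1)

values = {}
for _ in range(500):
    td0_update(values, path, alpha=0.1)
```

After one sweep only the last state has a non-zero value; repeated sweeps propagate the reward backward along the path until the estimates stabilize, which is the "updated several times" behavior described in the text.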

In the embodiments of the present disclosure, during training of the model, the physics engine contains two characters: a kinematic character and a physical character. The kinematic character has no physical attributes and is used only to perform the actions in the action segments designed by an animation designer; it is only necessary to make the joints of the kinematic character perform the reference actions in the animation segments by using a kinematics method. The physical character learns by using the kinematic character as a standard and a template, has physical attributes, and can be controlled by torque; the physical attributes may be torque, velocity, gravity, collision effects, and the like. In addition, for the physical character, the torque of each joint is calculated from the gesture output by the model, and the action is simulated in the physics engine. After each action is performed, the physics engine simulates the conditions of the environment, producing a realistic effect. At each moment, calculating the reward amounts to measuring the differences between the two characters in current gesture, velocity, angular velocity, and the like; the smaller the difference, the greater the reward. The final reward is obtained as a weighted sum of a plurality of reward components, and the weights can be adjusted as required. The environment provides a reward according to the quality of the gesture simulation, which stimulates the character to keep its gesture consistent with the gesture of the reference action: the closer the two gestures, the greater the reward; otherwise, the reward is smaller.

In the embodiments of the present disclosure, the reward is determined according to Formula (1), which is as follows:

r_t = w^I r_t^I + w^G r_t^G    (1)

where r_t^I is the reward value for simulating the action at moment t, r_t^G is the reward value for completing the target task at moment t, the weight w^I represents the proportion assigned to simulating the action, and the weight w^G represents the proportion assigned to completing the task. In engineering practice, fixed values may be set for w^I and w^G.

To make the actions of the physical character and the kinematic character consistent, several criteria may be set for fitting the physical character to the kinematic character. The simulation reward r_t^I covers similarity in kinematic terms and comprises five parts: a gesture reward r_t^p, a velocity reward r_t^v, an extremity joint reward r_t^e, a root joint gesture reward r_t^r, and a centroid gesture reward r_t^c. The gesture and the velocity are those of each joint. If the actions of the two characters are to be consistent, the gestures and velocities must be consistent, so a gesture reward and a velocity reward may be set. The extremity joints are the hands and feet. To align the extremity joints of the physical character with those of the kinematic character, an extremity joint reward is set for the extremity joints. The root joint is the top joint of all joints. If the actions of the two characters are to be consistent, the root joints must also be consistent, so a root joint gesture reward may be set. In addition, for the physical character to walk more stably without shaking, the centroid of the physical character must be consistent with the centroid of the kinematic character, so a centroid gesture reward may be set. Setting the foregoing rewards ensures that the actions of the physical character and the kinematic character are as consistent as possible. The weights corresponding to these rewards are w^p, w^v, w^e, w^r, and w^c. Items belonging to the kinematic character are marked with * in the upper right corner. Taking the gesture component as an example, q_j^* is the gesture of the j-th joint of the kinematic character, and q_j is the gesture of the j-th joint of the simulated character. Formula (1) can be converted into Formula (2) as follows:

r_t = w^I (w^p r_t^p + w^v r_t^v + w^e r_t^e + w^r r_t^r + w^c r_t^c) + w^G r_t^G    (2)

where w^I and w^G are the weights of the simulation reward and of the task reward r_t^G, respectively.

Here, the gesture reward r_t^p describes the similarity between gestures and is expressed in terms of the difference between the position and rotation of each joint and the target values; the velocity reward r_t^v describes the similarity between velocities and is expressed in terms of the difference between the linear velocity of each joint and the target value; the extremity joint reward r_t^e describes the similarity between extremity joint gestures and is expressed in terms of the differences in the positions of the hand joints and the foot joints; the root joint reward r_t^r describes the similarity between root joints; and the centroid reward r_t^c describes the similarity between centroid velocities.
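The weighted combination of reward components can be sketched as follows. The structure (a weighted sum of five simulation components, combined with a task reward) follows the description; everything else is an assumption for illustration: the exponential mapping from error to similarity, the dictionary keys, and all numeric weights are hypothetical and are not the values used in the disclosure.

```python
import math

def similarity(error, scale=1.0):
    """Map a non-negative error to (0, 1]; a smaller error gives a larger reward.

    The exponential form is an assumption, chosen only because it decreases
    monotonically with the error.
    """
    return math.exp(-scale * error)

def total_reward(errors, task_reward, sub_weights, w_imitation, w_task):
    """Weighted sum of the simulation components, combined with the task reward."""
    imitation = sum(sub_weights[name] * similarity(errors[name]) for name in sub_weights)
    return w_imitation * imitation + w_task * task_reward

# Hypothetical weights, chosen only so that each group sums to 1.
SUB_WEIGHTS = {"gesture": 0.4, "velocity": 0.1, "extremity": 0.2, "root": 0.2, "centroid": 0.1}

perfect = {name: 0.0 for name in SUB_WEIGHTS}
r_perfect = total_reward(perfect, 1.0, SUB_WEIGHTS, w_imitation=0.5, w_task=0.5)

worse = dict(perfect, gesture=2.0)
r_worse = total_reward(worse, 1.0, SUB_WEIGHTS, w_imitation=0.5, w_task=0.5)
```

With zero error in every component and a full task reward, the total reaches its maximum; increasing any single error lowers it, which matches the statement that a smaller difference yields a greater reward.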

The task reward r_t^G describes how well the character achieves the target and generally measures the difference between the actual situation of the character's movement and the target. For example, when the target is a movement direction g_t, r_t^G may be calculated as the angle difference θ between the forward direction v_t on the ground and the target g_t, as shown in the following Formula (3).
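The angle difference θ between a forward direction and a target direction on the ground plane can be computed as in the sketch below. It assumes both directions are given as 2D vectors; how the reward is then derived from θ is not shown, because that formula is not recoverable from this text.

```python
import math

def angle_between(v, g):
    """Unsigned angle in radians between two 2D ground-plane directions."""
    cross = v[0] * g[1] - v[1] * g[0]
    dot = v[0] * g[0] + v[1] * g[1]
    return abs(math.atan2(cross, dot))

theta_right_angle = angle_between((1.0, 0.0), (0.0, 1.0))
theta_aligned = angle_between((2.0, 0.0), (5.0, 0.0))
theta_opposite = angle_between((1.0, 0.0), (-1.0, 0.0))
```

Using `atan2` on the cross and dot products avoids the need to normalize the vectors first and stays numerically stable when the directions are nearly parallel.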

If the virtual character fails to learn the action and falls down, the current training path ends, and the reward value is 0.

In the embodiments of the present disclosure, the to-be-trained animation processing model includes a to-be-trained first control network and a to-be-trained second control network. Before training, a plurality of animation segment samples may be obtained. The animation segment samples include different terrain features and different task information corresponding to the virtual character, and the virtual character has different gestures and actions corresponding to the different terrain features and task information. In the embodiments of the present disclosure, the first control network of the animation processing model outputs the target state information corresponding to the key joints according to the terrain features, the task information, and the state information; the target state information is then input, as target task information, to the second control network, which processes it and outputs the joint action information. The model can thus be used to handle complex tasks. If the first control network and the second control network were trained simultaneously and the target state information output by the first control network contained an error, the erroneous target state information would be input to the second control network, and the animation processing model would be trained in reverse according to the joint action information output by the second control network. The animation processing model would consequently be unstable and unable to handle complex tasks efficiently. Therefore, to ensure that the animation processing model can handle complex tasks, the to-be-trained first control network and the to-be-trained second control network need to be trained separately. Training of the to-be-trained first control network is completed by training it while it is connected to the second control network whose parameters are fixed, thereby obtaining the first control network.

In the embodiments of the present disclosure, the animation processing model is trained based on the AC algorithm framework, and the to-be-trained first control network and the to-be-trained second control network in the animation processing model are trained separately. Each of the two networks may therefore be configured to include a pair of AC networks: the to-be-trained first control network includes a first to-be-trained actor sub-network and a first to-be-trained critic sub-network, and the to-be-trained second control network includes a second to-be-trained actor sub-network and a second to-be-trained critic sub-network. Further, the structure of the first to-be-trained actor sub-network may be set to be the same as that of the first to-be-trained critic sub-network, and the structure of the second to-be-trained actor sub-network may be set to be the same as that of the second to-be-trained critic sub-network. For the structures of the first to-be-trained actor sub-network and the first to-be-trained critic sub-network, refer to FIG. 10; for the structures of the second to-be-trained actor sub-network and the second to-be-trained critic sub-network, refer to FIG. 11. The only differences lie in the input information and the output information. After training of the to-be-trained first control network and the to-be-trained second control network is completed, only the first actor sub-network and the second actor sub-network need to be called: the first actor sub-network outputs the target state information a_H corresponding to the key joints according to the input terrain features, task information, and state information, and the second actor sub-network then outputs the joint action information a_L of all joints of the virtual character according to the target state information and the state information.

Take a human-shaped virtual character avoiding obstacles as an example. During training of the to-be-trained second control network, training may be performed on flat ground by using an animation segment set. The animation segment set includes a plurality of animation segment samples covering the different leg-lifting and stepping gestures of the virtual character in front of obstacles of different heights. The samples have similar initial actions, and each contains only one step. For example, the animation segment set contains 15 animation segment samples in total, each 0.5 seconds long. During training, the most appropriate animation segment sample may be selected from the plurality of animation segment samples for training.

After the animation segment set is obtained, each animation segment sample may be mirrored; that is, a step with the character's right leg is converted into a step with the left leg, and a step with the left leg is converted into a step with the right leg. The initial gesture and the foot-landing gesture of each action segment are counted.

Because the second control network uses the output of the first control network as its target task, an initial gesture of the virtual character may be preset when the to-be-trained second control network is trained. A target animation segment sample may be determined from the animation segment set according to the initial gesture, and a target task may be determined according to the target animation segment sample, so that the to-be-trained second control network learns from the target animation segment sample. When the virtual character has completed one step and is ready to start the next step, the foregoing operations may be repeated to obtain a target animation segment sample that is the same as or similar to the initial gesture, and the to-be-trained second control network is trained with it. To determine the target animation segment sample, the initial gesture may be compared with each animation segment sample in the animation segment set to obtain the similarity between the initial gesture and the gesture of the virtual character in each animation segment sample. The similarities are then ranked in descending order to form a sequence, and the animation segment sample corresponding to the highest similarity is used as the target animation segment sample. Alternatively, a preset number of similarities may be taken consecutively from the sequence, and any one of the animation segment samples corresponding to those similarities is used as the target animation segment sample. The preset number may be set according to actual requirements; for example, it may be 3 or 5.

After the target animation segment sample is determined, a state information sample corresponding to the key joints is extracted from the target animation segment sample and used as target task information, and joint action information samples corresponding to all joints of the virtual character are obtained at the same time. The target task information is input into the to-be-trained second control network for training. When the joint action information output by the to-be-trained second control network is the same as or similar to the joint action information sample, training of the to-be-trained second control network is complete. The joint action information sample may be the rotations of the two thighs of the kinematic character when a foot lands, together with the velocity direction of the root joint in the plane. One thigh rotation describes the rotation of the thigh joint corresponding to the landing foot, and the other describes the rotation of the thigh joint corresponding to the non-landing foot that is about to land. The current target task information may be determined according to the two thigh rotations and the velocity direction of the root joint in the plane, and is input into the to-be-trained second control network for training.
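The mirroring and similarity-ranking steps described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the dictionary pose representation, the `left_`/`right_` joint naming, the distance-based similarity measure, and the function names are all assumptions, and the "preset number" variant is read here as choosing among the most similar samples.

```python
import math
import random
from typing import Dict, List, Optional

# Assumed representation: joint name -> joint angle in radians.
Pose = Dict[str, float]


def mirror_pose(pose: Pose) -> Pose:
    """Swap left_* and right_* joints so a right-leg step becomes a left-leg step.

    Simplification: real mirroring may also need to negate some rotation axes.
    """
    mirrored: Pose = {}
    for name, angle in pose.items():
        if name.startswith("left_"):
            mirrored["right_" + name[len("left_"):]] = angle
        elif name.startswith("right_"):
            mirrored["left_" + name[len("right_"):]] = angle
        else:
            mirrored[name] = angle
    return mirrored


def similarity(a: Pose, b: Pose) -> float:
    """Map the Euclidean distance between two poses to a value in (0, 1]."""
    distance = math.sqrt(sum((a[joint] - b[joint]) ** 2 for joint in a))
    return math.exp(-distance)


def select_target_sample(initial_pose: Pose,
                         sample_poses: List[Pose],
                         preset_number: int = 1,
                         rng: Optional[random.Random] = None) -> int:
    """Return the index of the sample to use as the target animation segment.

    preset_number == 1 picks the most similar sample; a larger value (for
    example 3 or 5) picks any one of the preset_number most similar samples.
    """
    ranked = sorted(range(len(sample_poses)),
                    key=lambda i: similarity(initial_pose, sample_poses[i]),
                    reverse=True)  # descending similarity
    candidates = ranked[:preset_number]
    if preset_number == 1:
        return candidates[0]
    return (rng or random).choice(candidates)


samples = [
    {"left_thigh": 0.0, "right_thigh": 0.5},
    {"left_thigh": 0.4, "right_thigh": 0.1},
    {"left_thigh": 0.9, "right_thigh": 0.9},
]
initial = {"left_thigh": 0.35, "right_thigh": 0.1}
best_index = select_target_sample(initial, samples)
```

In this toy data the second sample is closest to the preset initial gesture, so it would serve as the target animation segment sample.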

To ensure the stability of the second control network, training may be performed over a plurality of running paths of the character, and the maximum length of a running path may be set to a numeric value such as 200 seconds (200 s). Based on reinforcement learning, the to-be-trained second control network evaluates the action or state of the virtual character after the virtual character completes a running path, determines a state value, and adjusts the action according to the state value until the maximum state value is obtained. After training, the second control network can perform the corresponding stepping action when different target task information is input.

In an embodiment of the present disclosure, after training of the second control network is complete, the to-be-trained first control network may be trained. Only one running animation segment sample is used for this training, and the maximum length of each path may likewise be limited to 200 seconds. At the start of each path, a terrain feature sample, a character state sample, and a task information sample of the current moment may be input; the to-be-trained first control network performs feature extraction on these samples and outputs action information. The action information may be input, as the target task, into the trained second control network, and the second control network outputs the corresponding joint action information to control the action of the character. Similarly, after the virtual character completes a running path, the to-be-trained first control network, based on reinforcement learning, may determine the state value corresponding to the state of the virtual character according to the reward fed back by the environment. When the state value reaches a preset value or the maximum value, training of the to-be-trained first control network is complete.

In step S330, joint torque is determined according to the joint action information, rendering is performed based on the joint torque to obtain gesture adjustment information corresponding to the virtual character, and the animation segment is processed according to the gesture adjustment information.

In an embodiment of the present disclosure, after the joint action information output by the animation processing model is obtained, joint torque may be determined according to the joint action information. The joint torque is then applied, using a physical engine, to the corresponding joints of the rigid body structure, and rendering is performed to obtain the gesture adjustment information corresponding to the virtual character. The animation segment is processed according to the gesture adjustment information.

In an embodiment of the present disclosure, character gestures in motion animation are generally controlled using methods based on inverse kinematics. For physics-based character gesture control, however, controlling the character in real time with a kinematic method produces no real physical effect, and interactions such as collisions cannot be perceived. Torque is therefore generally used to control character actions. There are three main methods for controlling a physical character in real time:

(1) Torque control: the model directly outputs the torque applied to each joint. This method is easy to implement. However, the control effect is poor, dynamic control is unstable, shaking occurs easily, and the action is not natural enough.

(2) Position control: the model provides a target position for each joint, and the character is then dynamically controlled to reach the corresponding position using a proportional-derivative (PD) controller. Compared with torque control, position control is more stable. The model outputs the gesture of each joint, the distribution variance is relatively small, the samples are small, and the model converges quickly. However, conventional PD control still exhibits relatively large shaking.

(3) Velocity control: the model directly provides a target velocity for each joint, and the character is then dynamically controlled to reach the target velocity using a PD control algorithm. The effect and model convergence speed of velocity control are substantially the same as those of position control.

In general, however, a position controller is used, which is equivalent to hierarchical control: a decision network obtains the current character state and outputs the target position for the next moment, and the character is then dynamically controlled to reach the target gesture using a PD controller. In actual engineering, the control cycle of the PD controller is set to 100. This method performs well in terms of both model convergence speed and robustness. However, when a conventional PD controller is used, shaking is relatively large and the gesture is not very standard.
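The hierarchical arrangement described above can be illustrated with a small single-joint simulation. This is only a sketch under stated assumptions: the setting of 100 is read here as 100 PD updates per decision, the joint is a unit-inertia one-degree-of-freedom joint, the gains and time step are arbitrary, and `decision_network` is a hypothetical stand-in for the trained policy.

```python
def pd_torque(kp, kd, position, velocity, target_position):
    """Conventional PD law: pull toward the target and damp the velocity."""
    return -kp * (position - target_position) - kd * velocity


def run_decision_step(position, velocity, target_position,
                      kp=400.0, kd=40.0, inertia=1.0, dt=0.001, pd_cycles=100):
    """Track one target from the decision network for pd_cycles PD updates."""
    for _ in range(pd_cycles):
        torque = pd_torque(kp, kd, position, velocity, target_position)
        velocity += (torque / inertia) * dt  # semi-implicit Euler: velocity first
        position += velocity * dt            # then position with the new velocity
    return position, velocity


def decision_network(position):
    """Hypothetical stand-in policy: always requests a joint angle of 1.0 rad."""
    return 1.0


position, velocity = 0.0, 0.0
for _ in range(10):  # ten high-level decisions
    target = decision_network(position)
    position, velocity = run_decision_step(position, velocity, target)
```

The outer loop runs at the decision rate and the inner loop at the PD rate, which is the sense in which position control acts as a two-level controller.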

To overcome the shortcomings of existing gesture control methods, an embodiment of the present disclosure provides stable PD control based on inverse kinematics. The formula for determining torque using a conventional PD controller is shown in formula (4).

The quantities in formula (4) are the torque output; the current position q of the joint of the virtual character at the current moment; the target position of the joint of the virtual character; the velocity of the joint at the current moment; the proportional coefficient k_p; the derivative gain coefficient k_d; and the number n of control cycles of the PD control.
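As a sketch only, a conventional PD law consistent with the quantities listed for formula (4) can be written as follows. The notation is assumed here rather than taken from the text: $\tau^{n}$ is the torque output, $q^{n}$ the current joint position, $\hat{q}^{n}$ the target position, and $\dot{q}^{n}$ the current joint velocity, all at control cycle $n$.

```latex
\tau^{n} = -k_p \left( q^{n} - \hat{q}^{n} \right) - k_d \, \dot{q}^{n}
```

The first term pulls the joint toward the target in proportion to the position error, and the second term damps the motion in proportion to the joint velocity.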

While a physical character is being controlled, the controller needs to reduce the difference from the target gesture quickly, so k_p needs to be set relatively large. In this case, the stability problems associated with a high proportional gain tend to occur. Stable PD control can solve this problem well. Stable PD control calculates the torque output using the position after the next time period, which is equivalent to taking the initial state into account while comparing the difference from the target, thereby improving the stability of the physical properties. In some embodiments, the current position and the target position of the joint are determined according to the joint action information; the current velocity and the current acceleration of the joint are determined according to the current position, and the target velocity of the joint is determined according to the target position; a first position and a first velocity corresponding to the joint after the next control cycle are determined according to the current velocity and the current acceleration; and the joint torque is calculated according to the proportional coefficient, the derivative gain coefficient, the current position, the target position, the target velocity, the first position, and the first velocity. For the calculation formula, see formula (5).

The quantities in formula (5) are the torque output; the proportional coefficient k_p; the derivative gain coefficient k_d; the current position of the joint; the first position of the joint, reached at the current velocity after one time period; the target position of the joint; the current velocity of the joint; the first velocity of the joint, reached at the current acceleration after one time period; the target velocity of the joint; and the number n of control cycles of the controller.
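As a sketch only, one form of the stable PD law that uses exactly the quantities listed for formula (5) is given below. The notation is assumed here: $\delta t$ is the length of one control cycle, $q^{n}$, $\dot{q}^{n}$ and $\ddot{q}^{n}$ are the current position, velocity and acceleration of the joint, $\delta t\,\dot{q}^{n}$ and $\delta t\,\ddot{q}^{n}$ play the roles of the first position and first velocity, and $\hat{q}^{n+1}$ and $\hat{\dot{q}}^{n+1}$ are the target position and target velocity.

```latex
\tau^{n} = -k_p \left( q^{n} + \delta t \, \dot{q}^{n} - \hat{q}^{n+1} \right)
           - k_d \left( \dot{q}^{n} + \delta t \, \ddot{q}^{n} - \hat{\dot{q}}^{n+1} \right)
```

Compared with the conventional PD law, the errors are evaluated on the state predicted one control cycle ahead rather than on the current state, which is what allows a large k_p without the instability of a high proportional gain.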

In an embodiment of the present disclosure, a plurality of torques corresponding to one piece of joint action information may be determined using stable PD control based on inverse kinematics. The torques are then respectively applied to the corresponding joints using the physical engine, and the angular velocities and final gestures are calculated according to the rotation axes and anchor points of the joints, simulating the real conditions of joint rotation. In this way, the gesture adjustment information corresponding to the virtual character at the current moment is obtained; the gesture adjustment information may be an action sequence. Stable PD control based on inverse kinematics can improve calculation accuracy, reduce shaking, and improve the action effect of the virtual character.

In an embodiment of the present disclosure, the foregoing solution is repeated over consecutive time periods until simulation of the last image frame of the animation segment is complete, so that the gesture adjustment information corresponding to the virtual character at each moment, that is, in each image frame, is obtained. The gesture adjustment information relates to the gesture of the virtual character determined according to the terrain features newly added in the graphical user interface and the task features set for the virtual character. A target action sequence may be determined according to the gesture adjustment information corresponding to the virtual character at each moment. From the user's perspective, the animation effect presented by the target action sequence is more vivid than that of the original animation segment: the virtual character can avoid newly set obstacles and complete the corresponding tasks, and the user experience is better.

FIG. 14 (A) to (J) show an action sequence of a virtual character that is controlled by the animation processing model and runs on flat ground. As shown in FIG. 14 (A) to (J), the leg lifting, stepping, foot landing, and arm swinging actions of the virtual character are more natural and vivid.

FIGS. 15A to 15E show an action sequence of a human-shaped virtual character running on dense notch terrain. As shown in FIGS. 15A to 15E, two human-shaped virtual characters are included: a white human-shaped virtual character W and a black human-shaped virtual character B. The white character W is the human-shaped virtual character in the original animation segment, and the black character B is the human-shaped virtual character controlled by the animation processing model. It can be seen from FIGS. 15A to 15E that the actions of W and B are the same, except for a difference in stepping at the notches C. As shown in FIGS. 15A, 15B, 15D, and 15E, the black character B controlled by the animation processing model successfully completes the entire run across the dense notch terrain G.

FIGS. 16A to 16L show an action sequence of a human-shaped virtual character running on hybrid obstacle terrain. As shown in FIGS. 16A to 16L, the ground G of the hybrid obstacle terrain includes notches C, protrusions E, and steps D. As in FIG. 15, the figures include the white human-shaped virtual character W of the original animation segment and the black human-shaped virtual character B controlled by the animation processing model. FIGS. 16A to 16E show the action sequence of the human-shaped virtual character crossing a notch, FIGS. 16F to 16K show the action sequence of the human-shaped virtual character crossing a protrusion, and FIG. 16L shows the action sequence of the human-shaped virtual character crossing a step. It can be seen that the black character B crosses the notches, protrusions, and steps well, whereas the running effect of the white character W is poor. For example, the feet of the white character W may be above a notch or below a protrusion or step, which produces an unrealistic animation effect.

The animation processing method according to the embodiments of the present disclosure is applicable to any game or animation design that requires physical animation. With this method, an animation segment designed by an animator can be simulated, and obstacles and tasks can be set for the virtual character during the simulation. The animation processing model determines the joint action information corresponding to the virtual character at the next moment according to the terrain features and the task information and state information corresponding to the virtual character at the current moment. For example, at the current moment the virtual user has landed the left foot and lifted the right foot, the terrain feature is a protrusion on the virtual user's movement path, and the task information is that the velocity direction is forward. From this information, the animation processing model outputs the joint action information of the virtual character at the next moment, so that the virtual character successfully crosses the protrusion after performing actions at a plurality of moments. Finally, joint torque is determined according to the joint action information and applied to the thighs and feet using the physical engine, and rendering is performed to obtain the action of the virtual character crossing the protrusion.

The animation processing method according to the embodiments of the present disclosure is applicable to any type of game animation. An augmented reality game is used as an example. Based on FIG. 4, which is a schematic diagram of the scene obtained after a game scene is integrated with a real scene, the evil spirit V in the game animation is the virtual character, the environment in which the evil spirit is located is the steps S in the real scene, and a row of electric scooters M stands behind the evil spirit. With the animation processing method according to the embodiments of the present disclosure, the user can set a task for the evil spirit V, such as going down the steps or going around the electric scooters M. A vivid action sequence corresponding to the evil spirit can be obtained according to the state information and task information of the evil spirit V and the terrain features in the graphical user interface. In terms of visual effect, the evil spirit V can jump from the current step to the next step or successfully go around the electric scooters M, without its feet appearing below the steps S or its body overlapping the electric scooters M. The action is therefore more vivid, and the self-adaptation to the environment is stronger.

According to the animation processing method of the embodiments of the present disclosure, the animation processing model outputs, according to the terrain features and the state information and task information of the virtual character at each moment, the joint action information of the next adjacent moment. The joint torque determined according to the joint action information is applied to the corresponding joints using the physical engine, and rendering is performed to obtain a vivid action sequence. The animation generated from the vivid action sequence is more natural and vivid than the animation designed by the animator. In addition, different terrains and tasks are added during processing, and interaction between the user and the virtual character in the game is implemented, so that the virtual character is self-adaptive and its ability to perceive terrain is improved. Actions performed by the virtual character on flat ground can be migrated to complex terrain, which makes the game more entertaining, further improves the user experience, and reduces the production cost of game animation.

The following describes apparatus embodiments of the present disclosure, which may be used to perform the animation processing method of the foregoing embodiments. For details not disclosed in the apparatus embodiments, refer to the foregoing animation processing method of the present disclosure.

FIG. 17 is a schematic block diagram of an animation processing apparatus according to an embodiment of the present disclosure.

Referring to FIG. 17, an animation processing apparatus 1700 according to an embodiment of the present disclosure includes an information acquisition module 1701, a model processing module 1702, and a gesture adjustment module 1703.

The information acquisition module 1701 is configured to obtain terrain features in the graphical user interface at the current moment, and to obtain state information and task information corresponding to the virtual character in the animation segment at the current moment. The model processing module 1702 is configured to input the terrain features, the state information, and the task information into the animation processing model, and to perform feature extraction on the terrain features, the state information, and the task information using the animation processing model, to obtain joint action information corresponding to the virtual character at the next moment. The gesture adjustment module 1703 is configured to determine joint torque according to the joint action information, perform rendering based on the joint torque to obtain gesture adjustment information corresponding to the virtual character at the current moment, and process the animation segment according to the gesture adjustment information.

In an embodiment of the present disclosure, the animation processing apparatus 1700 is further configured to: when the current moment is the initial moment of the animation segment, determine the state information according to the gesture information of the virtual character at the initial moment of the animation segment; and when the current moment is not the initial moment of the animation segment, determine the state information according to the joint action information corresponding to the virtual character at the previous moment.

In an embodiment of the present disclosure, the animation processing apparatus 1700 is further configured to obtain, based on the animation segment, gesture adjustment information corresponding to the virtual character at a plurality of moments, and to determine a target action sequence according to the gesture adjustment information at the plurality of moments.

In an embodiment of the present disclosure, the terrain features are features of a self-defined terrain or features of a real terrain; the state information includes the gesture, velocity, and phase of each joint of the virtual character; and the task information includes a target velocity direction or target point coordinates corresponding to the virtual character.

In an embodiment of the present disclosure, the animation processing model includes a first control network and a second control network. The model processing module 1702 includes: a first feature extraction unit, configured to input the terrain features, the state information, and the task information into the first control network, and to perform feature extraction on the terrain features, the state information, and the task information using the first control network, to obtain target state information corresponding to the key joints; and a second feature extraction unit, configured to use the target state information as target task information, input the state information and the target task information into the second control network, and perform feature extraction on the state information and the target task information using the second control network, to obtain the joint action information.

In an embodiment of the present disclosure, the first control network includes a convolution unit, a first fully connected layer, a second fully connected layer, and a third fully connected layer. The first feature extraction unit is configured to: perform feature extraction on the terrain features using the convolution unit, to obtain first feature information corresponding to the terrain; perform feature combination on the first feature information using the first fully connected layer, to obtain second feature information; perform feature combination on the second feature information, the state information, and the task information using the second fully connected layer, to obtain third feature information; and perform feature combination on the third feature information using the third fully connected layer, to obtain the target state information.

In an embodiment of the present disclosure, the second control network includes a fourth fully connected layer and a fifth fully connected layer. The second feature extraction unit is configured to: perform feature combination on the state information and the target task information using the fourth fully connected layer to obtain fourth feature information; and perform feature combination on the fourth feature information using the fifth fully connected layer to obtain the joint action information.
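A comparable sketch of the second control network is given below. As before, this is an assumption-laden illustration: the activation, layer widths, and initialization are not taken from the disclosure, and `dense`, `init`, and `second_control_network` are hypothetical names.

```python
import random

def dense(x, w, b, relu=True):
    """Fully connected layer: y = W x + b, optionally followed by ReLU."""
    y = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
    return [max(0.0, v) for v in y] if relu else y

def init(rows, cols, rng):
    """Small random weight matrix and zero bias (assumed initialization)."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return w, [0.0] * rows

def second_control_network(state, target_task, params):
    # Fourth fully connected layer combines state and target task information.
    f4 = dense(state + target_task, *params["fc4"])
    # Fifth fully connected layer maps to one action value per joint.
    return dense(f4, *params["fc5"], relu=False)

rng = random.Random(0)
state_dim, target_dim, hidden, num_joints = 4, 3, 8, 5  # assumed sizes
low_params = {
    "fc4": init(hidden, state_dim + target_dim, rng),
    "fc5": init(num_joints, hidden, rng),
}
joint_action = second_control_network([0.0] * state_dim, [0.5] * target_dim,
                                      low_params)
```

In a full pipeline, the `target_task` argument would be the target state information produced by the first control network.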

In an embodiment of the present disclosure, the gesture adjustment module 1703 is configured to: determine a current position and a target position of a joint according to the joint action information; determine a current velocity and a current acceleration of the joint according to the current position, and determine a target velocity of the joint according to the target position; determine a first position and a first velocity corresponding to the joint after a next control period according to the current velocity and the current acceleration; and calculate the joint torque according to a proportional coefficient, a derivative gain coefficient, the current position, the target position, the target velocity, the first position, and the first velocity.
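One plausible reading of this torque calculation is a proportional-derivative (PD) controller evaluated on the predicted state one control period ahead, in the style of stable PD control. The sketch below assumes a single scalar joint angle, a forward-Euler prediction, and finite-difference velocity estimates; the function names and the exact form of the formula are assumptions rather than text from the disclosure.

```python
def finite_difference(curr, prev, dt):
    """Estimate a rate of change from two consecutive samples."""
    return (curr - prev) / dt

def pd_joint_torque(q, q_dot, q_ddot, q_target, q_dot_target, kp, kd, dt):
    """PD torque computed on the state predicted after one control period."""
    q_first = q + dt * q_dot           # first position (predicted)
    q_dot_first = q_dot + dt * q_ddot  # first velocity (predicted)
    return -kp * (q_first - q_target) - kd * (q_dot_first - q_dot_target)

# Example: a joint at rest at 0 rad that should move to 1 rad.
dt = 0.01
q_prev, q_curr = 0.0, 0.0
q_dot = finite_difference(q_curr, q_prev, dt)  # current velocity
torque = pd_joint_torque(q_curr, q_dot, 0.0, q_target=1.0, q_dot_target=0.0,
                         kp=10.0, kd=1.0, dt=dt)
```

Using the predicted position and velocity instead of the current ones tends to tolerate larger gains at a given time step, which may be why the paragraph introduces the first position and first velocity.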

In an embodiment of the present disclosure, the gesture adjustment module 1703 is configured to input the joint torque into a physics engine, apply the joint torque to the corresponding joint using the physics engine, and perform rendering to generate the gesture adjustment information.

In an embodiment of the present disclosure, the animation processing device 1700 further includes a training module configured to train a to-be-trained animation processing model to obtain the animation processing model, before feature extraction is performed on the terrain features, the state information, and the task information using the animation processing model.

In an embodiment of the present disclosure, the to-be-trained animation processing model includes a first control network to be trained and a second control network to be trained. The training module includes: a first training unit, configured to obtain a terrain feature sample, a character state sample, and a task information sample, and to train the first control network to be trained according to the terrain feature sample, the character state sample, and the task information sample to obtain the first control network; and a second training unit, configured to train the second control network to be trained according to state information samples corresponding to the key joints of the virtual character and joint action information samples corresponding to all joints in an animation segment sample, to obtain the second control network. The first control network to be trained and the second control network to be trained are trained separately. When the first control network to be trained is trained, it is connected to the second control network, the parameters of which are held fixed.
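The idea of training the first control network while it is connected to a second control network with fixed parameters can be illustrated with a deliberately tiny example. Everything here is an assumption made for brevity: each network is reduced to a single scalar weight, and plain gradient descent on a squared error is swapped in for whatever training objective the disclosure actually uses. The function name `train_first_network` is hypothetical.

```python
def train_first_network(samples, w_low, lr=0.05, epochs=200):
    """Fit the high-level weight while the low-level weight stays frozen."""
    w_high = 0.0
    for _ in range(epochs):
        for x, y in samples:
            target_state = w_high * x      # output of the first network
            action = w_low * target_state  # frozen second network
            # Gradient of (action - y)^2 with respect to w_high only.
            grad = 2.0 * (action - y) * w_low * x
            w_high -= lr * grad            # w_low is never updated
    return w_high

w_low_fixed = 2.0                                   # pretrained, held fixed
data = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5)]      # toy targets: y = 3x
w_high_trained = train_first_network(data, w_low_fixed)
```

Because the gradient flows through the frozen second network but updates only the first, the first network learns to emit targets that the already-trained second network can turn into the desired actions.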

In an embodiment of the present disclosure, the second training unit is configured to: obtain a plurality of animation segment samples, and determine a target animation segment sample from the plurality of animation segment samples according to an initial gesture of the virtual character; obtain state information samples corresponding to the key joints from the target animation segment sample, and use the state information samples as target task information; obtain joint action information samples corresponding to all joints of the virtual character; and train the second control network to be trained according to the target task information and the joint action information samples.

In an embodiment of the present disclosure, the first control network to be trained includes a first to-be-trained actor sub-network and a first to-be-trained critic sub-network, and the second control network to be trained includes a second to-be-trained actor sub-network and a second to-be-trained critic sub-network. The structure of the first to-be-trained actor sub-network is the same as that of the first to-be-trained critic sub-network, and the structure of the second to-be-trained actor sub-network is the same as that of the second to-be-trained critic sub-network.

FIG. 18 is a schematic structural diagram of a computer system of an electronic device adapted to implement embodiments of the present disclosure.

The computer system 1800 of the electronic device shown in FIG. 18 is merely an example and does not limit the functions or scope of use of the embodiments of the present disclosure.

As shown in FIG. 18, the computer system 1800 includes a central processing unit (CPU) 1801, which performs various appropriate actions and processing according to a program stored in a read-only memory (ROM) 1802 or a program loaded from a storage part 1808 into a random access memory (RAM) 1803, thereby implementing the animation processing method of the foregoing embodiments. The RAM 1803 also stores various programs and data required for system operation. The CPU 1801, the ROM 1802, and the RAM 1803 are connected to one another through a bus 1804. An input/output (I/O) interface 1805 is also connected to the bus 1804.

The following components are connected to the I/O interface 1805: an input part 1806 including a keyboard, a mouse, and the like; an output part 1807 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage part 1808 including a hard disk and the like; and a communication part 1809 including a network interface card such as a local area network (LAN) card or a modem. The communication part 1809 performs communication processing over a network such as the Internet. A drive 1810 is also connected to the I/O interface 1805 as needed. A removable medium 1811, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1810 as needed, so that a computer program read from the removable medium is installed into the storage part 1808 as needed.

In particular, according to an embodiment of the present disclosure, the processes described with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the present disclosure includes a computer program product that includes a computer program carried on a computer-readable medium, and the computer program contains program code for performing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication part 1809, and/or installed from the removable medium 1811. When the computer program is executed by the CPU 1801, the various functions defined in the system of the present application are executed.

In embodiments of the present disclosure, the computer-readable medium may be a computer-readable signal medium, a computer-readable storage medium, or any combination thereof. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or component, or any combination thereof. More specific examples of the computer-readable storage medium include, but are not limited to: an electrical connection having one or more wires, a portable computer magnetic disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination thereof. In the present disclosure, the computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, the computer-readable signal medium may include a data signal in baseband or propagated as part of a carrier wave, the data signal carrying computer-readable program code. A data signal propagated in this manner may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination thereof. The computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium. The computer-readable medium may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted using any suitable medium, including but not limited to a wireless medium, a wired medium, or any suitable combination thereof.

The flowcharts and block diagrams in the accompanying drawings illustrate possible system architectures, functions, and operations that may be implemented by systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a part of code, and the module, program segment, or part of code contains one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions noted in the boxes may occur in an order different from that noted in the accompanying drawings. For example, two boxes shown in succession may in practice be performed substantially in parallel, and sometimes in the reverse order, depending on the functions involved. Each box in a block diagram and/or flowchart, and any combination of boxes in a block diagram and/or flowchart, may be implemented using a dedicated hardware-based system configured to perform the specified function or operation, or using a combination of dedicated hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented in software or in hardware, and the described units may also be provided in a processor. In some cases, the name of a unit does not constitute a limitation on the unit itself.

According to another aspect, the present disclosure further provides a computer-readable medium. The computer-readable medium may be included in the animation processing device described in the foregoing embodiments, or may exist alone without being assembled into an electronic device. The computer-readable medium carries one or more programs which, when executed by an electronic device, cause the electronic device to implement the methods described in the foregoing embodiments.

Although the foregoing detailed description discusses a plurality of modules or units of a device configured to perform actions, such a division is not mandatory. In fact, according to implementations of the present disclosure, the features and functions of two or more modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided and embodied in a plurality of modules or units.

From the foregoing description of the implementations, a person skilled in the art will readily understand that the example implementations described herein may be implemented in software, or in software combined with the necessary hardware. Accordingly, the technical solutions according to the embodiments of the present disclosure may be embodied in the form of a software product. The software product may be stored on a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, or the like) or on a network, and includes several instructions that cause a computing device (which may be a personal computer, a server, a touch terminal, a network device, or the like) to perform the methods according to the embodiments of the present disclosure.

Other embodiments of the present disclosure will be apparent to a person skilled in the art from consideration of the specification and practice of the disclosure herein. The present disclosure is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common general knowledge or customary technical means in the art that are not disclosed herein.

It should be understood that the present disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from the scope of the present disclosure. The scope of the present disclosure is limited only by the appended claims.

Claims

An animation processing method, applicable to an electronic device, the method comprising:
obtaining terrain features in a graphical user interface at a current moment, and obtaining state information and task information corresponding to a virtual character in an animation segment at the current moment;
inputting the terrain features, the state information, and the task information into an animation processing model, and performing feature extraction on the terrain features, the state information, and the task information using the animation processing model to obtain joint action information corresponding to the virtual character at a next moment;
determining a joint torque according to the joint action information; and
obtaining gesture adjustment information corresponding to the virtual character at the current moment based on the joint torque, and processing the animation segment according to the gesture adjustment information,
wherein the animation processing model includes a first control network and a second control network, and
wherein the inputting of the terrain features, the state information, and the task information into the animation processing model and the performing of feature extraction using the animation processing model to obtain the joint action information comprises:
inputting the terrain features, the state information, and the task information into the first control network, and performing feature extraction on the terrain features, the state information, and the task information using the first control network to obtain target state information corresponding to a key joint, the key joint being a joint corresponding to the terrain features and to the state information and the task information of the virtual character;
determining the target state information as target task information; and
inputting the state information and the target task information into the second control network, and performing feature extraction on the state information and the target task information using the second control network to obtain the joint action information.

The animation processing method according to claim 1, further comprising:
when the current moment is an initial moment of the animation segment, determining the state information according to gesture information of the virtual character at the initial moment of the animation segment; and
when the current moment is not the initial moment of the animation segment, determining the state information according to joint action information corresponding to the virtual character at a previous moment.

The animation processing method according to claim 2, further comprising:
obtaining gesture adjustment information corresponding to the virtual character at a plurality of moments based on the animation segment; and
determining a target action sequence according to the gesture adjustment information at the plurality of moments.

The animation processing method according to claim 1, wherein:
the terrain features are self-defined terrain features or actual terrain features;
the state information includes a gesture, a velocity, and a phase of each joint of the virtual character; and
the task information includes a target velocity direction or target point coordinates corresponding to the virtual character.

delete

The animation processing method according to claim 1, wherein:
the first control network includes a convolution unit, a first fully connected layer, a second fully connected layer, and a third fully connected layer; and
the performing of feature extraction on the terrain features, the state information, and the task information using the first control network to obtain the target state information corresponding to the key joint comprises:
performing feature extraction on the terrain features using the convolution unit to obtain first feature information corresponding to the terrain;
performing feature combination on the first feature information using the first fully connected layer to obtain second feature information;
performing feature combination on the second feature information, the state information, and the task information using the second fully connected layer to obtain third feature information; and
performing feature combination on the third feature information using the third fully connected layer to obtain the target state information.

The animation processing method according to claim 1, wherein:
the second control network includes a fourth fully connected layer and a fifth fully connected layer; and
the performing of feature extraction on the state information and the target task information using the second control network to obtain the joint action information comprises:
performing feature combination on the state information and the target task information using the fourth fully connected layer to obtain fourth feature information; and
performing feature combination on the fourth feature information using the fifth fully connected layer to obtain the joint action information.

The animation processing method according to claim 1, wherein the determining of the joint torque according to the joint action information comprises:
determining a current position and a target position of a joint according to the joint action information;
determining a current velocity and a current acceleration of the joint according to the current position, and determining a target velocity of the joint according to the target position;
determining a first position and a first velocity corresponding to the joint after a next control period according to the current velocity and the current acceleration; and
calculating the joint torque according to a proportional coefficient, a derivative gain coefficient, the current position, the target position, the target velocity, the first position, and the first velocity.

The animation processing method according to claim 1, wherein the obtaining of the gesture adjustment information corresponding to the virtual character at the current moment based on the joint torque comprises:
inputting the joint torque into a physics engine, applying the joint torque to a corresponding joint using the physics engine, and performing rendering to generate the gesture adjustment information.

The animation processing method according to claim 1, further comprising, before the performing of feature extraction on the terrain features, the state information, and the task information using the animation processing model:
training a to-be-trained animation processing model to obtain the animation processing model.

The animation processing method according to claim 10, wherein:
the to-be-trained animation processing model includes a first control network to be trained and a second control network to be trained; and
the training of the to-be-trained animation processing model to obtain the animation processing model comprises:
obtaining a terrain feature sample, a character state sample, and a task information sample;
training the first control network to be trained according to the terrain feature sample, the character state sample, and the task information sample to obtain the first control network; and
training the second control network to be trained according to state information samples corresponding to key joints of the virtual character and joint action information samples corresponding to all joints in an animation segment sample to obtain the second control network,
wherein the first control network to be trained and the second control network to be trained are trained separately, and when the first control network to be trained is trained, it is connected to the second control network, the parameters of which are held fixed.

The animation processing method according to claim 11, wherein the training of the second control network to be trained according to the state information samples corresponding to the key joints of the virtual character and the joint action information samples corresponding to all joints in the animation segment sample to obtain the second control network comprises:
obtaining a plurality of animation segment samples;
determining a target animation segment sample from the plurality of animation segment samples according to an initial gesture of the virtual character;
obtaining state information samples corresponding to the key joints from the target animation segment sample, and using the state information samples as target task information;
obtaining the joint action information samples corresponding to all joints of the virtual character; and
training the second control network to be trained according to the target task information and the joint action information samples.

The animation processing method according to claim 11, wherein:
the first control network to be trained includes a first to-be-trained actor sub-network and a first to-be-trained critic sub-network;
the second control network to be trained includes a second to-be-trained actor sub-network and a second to-be-trained critic sub-network; and
the structure of the first to-be-trained actor sub-network is the same as that of the first to-be-trained critic sub-network, and the structure of the second to-be-trained actor sub-network is the same as that of the second to-be-trained critic sub-network.

An animation processing device, comprising:
an information acquisition module configured to obtain terrain features in a graphical user interface at a current moment, and to obtain state information and task information corresponding to a virtual character in an animation segment at the current moment;
a model processing module configured to input the terrain features, the state information, and the task information into an animation processing model, and to perform feature extraction on the terrain features, the state information, and the task information using the animation processing model to obtain joint action information corresponding to the virtual character at a next moment; and
a gesture adjustment module configured to determine a joint torque according to the joint action information, obtain gesture adjustment information corresponding to the virtual character at the current moment based on the joint torque, and process the animation segment according to the gesture adjustment information,
wherein the animation processing model includes a first control network and a second control network, and
wherein the model processing module comprises:
a first feature extraction unit configured to input the terrain features, the state information, and the task information into the first control network, and to perform feature extraction on the terrain features, the state information, and the task information using the first control network to obtain target state information corresponding to a key joint, the key joint being a joint corresponding to the terrain features and to the state information and the task information of the virtual character; and
a second feature extraction unit configured to determine the target state information as target task information, input the state information and the target task information into the second control network, and perform feature extraction on the state information and the target task information using the second control network to obtain the joint action information.

An electronic device, comprising:
one or more processors; and
a storage device configured to store one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the animation processing method according to any one of claims 1 to 4 and 6 to 13.

A computer-readable storage medium storing a computer program,
wherein the computer program, when executed by a processor, implements the animation processing method according to any one of claims 1 to 4 and 6 to 13.