CN116342763A - Intelligent multi-mode animation creation system and creation method - Google Patents

Intelligent multi-mode animation creation system and creation method

Info

Publication number
CN116342763A
Authority
CN
China
Prior art keywords
animation
scene
character
mode
authoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310181768.2A
Other languages
Chinese (zh)
Inventor
孙国玉
于鹏
刘大可
黄心渊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Communication University of China
Original Assignee
Communication University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Communication University of China filed Critical Communication University of China
Priority to CN202310181768.2A priority Critical patent/CN116342763A/en
Publication of CN116342763A publication Critical patent/CN116342763A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 13/00 - Animation
    • G06T 13/20 - 3D [Three Dimensional] animation
    • G06T 13/40 - 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 - Manipulating 3D models or images for computer graphics
    • G06T 19/006 - Mixed reality
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/30 - Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The application discloses an intelligent multi-modal animation creation system and a creation method. The system comprises: a scenario design module, configured to generate authoring operation instructions according to the creator's authoring intention and to build the animation story situation according to those instructions; a narrative construction module, configured to create narrative content within the animation story situation according to the key narrative elements and a narrative paradigm; and a multi-modal driving module, configured to collect the creator's multi-modal action interaction information and map it onto the body motion states of character objects in the animation story situation, thereby completing intelligent multi-modal animation creation. By adopting game design methods and multi-modal interaction technology, the system enhances teenagers' understanding of and interest in animation art and enables them to produce animation quickly, without constraints of time and space, even without experience using professional tools.

Description

Intelligent multi-mode animation creation system and creation method
Technical Field
The application relates to the technical field of artificial intelligence and multi-modal interaction, and in particular to an intelligent multi-modal animation creation system and a creation method.
Background
WebGL (Web Graphics Library) is a 3D drawing protocol that combines JavaScript with OpenGL ES 2.0. By adding a JavaScript binding for OpenGL ES 2.0, WebGL provides hardware-accelerated 3D rendering for the HTML5 Canvas, so that Web developers can present 3D scenes and models smoothly in the browser with the help of the system graphics card and can create complex navigation and data visualizations.
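For illustration only (not part of the original disclosure), a minimal TypeScript sketch of acquiring a hardware-accelerated WebGL context from an HTML5 canvas is shown below; the canvas id is an assumed example name.

```typescript
// Minimal WebGL bootstrap: acquire a hardware-accelerated rendering context
// from an HTML5 canvas and clear it. The canvas id is an assumed example name.
const canvas = document.getElementById('animation-canvas') as HTMLCanvasElement;
const gl = canvas.getContext('webgl');
if (!gl) {
  throw new Error('WebGL is not supported in this browser');
}

// Match the drawing buffer to the canvas and clear to a sky-blue color.
gl.viewport(0, 0, canvas.width, canvas.height);
gl.clearColor(0.53, 0.81, 0.92, 1.0);
gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);
```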
PoseNet is a real-time pose detection technique that detects human poses in images or video. It can operate in two modes: single-pose detection (one person) and multi-pose detection (several people). As a deep-learning TensorFlow model, it estimates human posture by detecting body parts such as elbows, hips, wrists, knees, and ankles, and forms a skeletal structure of the pose by connecting these points. PoseNet is trained on the MobileNet architecture. MobileNet is a convolutional neural network developed by Google and trained on the ImageNet dataset, used mainly for image classification and object estimation; it is a lightweight model that uses depthwise separable convolutions to deepen the network while reducing parameters and computation cost and improving accuracy. The pre-trained model runs in the browser, which distinguishes PoseNet from libraries that rely on server-side APIs. PoseNet provides 17 usable key points, from the eyes and ears down to the knees and ankles.
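As a minimal, hedged sketch of how such in-browser pose estimation is typically invoked with the published @tensorflow-models/posenet package (the video element id and the score threshold below are assumptions, not values from the patent):

```typescript
import '@tensorflow/tfjs';
import * as posenet from '@tensorflow-models/posenet';

// Load the pre-trained PoseNet model (MobileNet backbone) and estimate a
// single pose from a webcam <video> element.
async function detectPose(): Promise<void> {
  const video = document.getElementById('webcam') as HTMLVideoElement;
  const net = await posenet.load();
  const pose = await net.estimateSinglePose(video, { flipHorizontal: true });

  // PoseNet returns 17 keypoints (eyes, ears, shoulders, elbows, wrists,
  // hips, knees, ankles), each with a position and a confidence score.
  for (const kp of pose.keypoints) {
    if (kp.score > 0.2) {
      console.log(`${kp.part}: (${kp.position.x.toFixed(1)}, ${kp.position.y.toFixed(1)})`);
    }
  }
}

detectPose();
```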
A game engine refers to the core components of an editable computer game system or an interactive real-time graphics application. Such systems provide game designers with the tools required to write games, so that designers can produce game programs easily and quickly without starting from scratch. Most engines support multiple operating platforms, such as Linux, Mac OS X, and Microsoft Windows. A game engine typically comprises the following systems: a rendering engine (i.e., the "renderer", including two-dimensional and three-dimensional image engines), a physics engine, collision detection, sound effects, a script engine, computer animation, artificial intelligence, a network engine, and scene management. A game engine is a set of machine-readable code (instructions) designed for machines that run a certain class of games; like an engine, it controls the running of the game. A game work can be divided into two major parts: the game engine and the game resources. The resources include images, sound, animation, and so on, which can be summarized as: game = engine (program code) + resources (images, sound, animation, etc.). The game engine calls these resources in sequence as required by the game design.
At present, computer animation production software is relatively mature, but such software requires users to have professional artistic literacy and extensive operating experience, so teenage users often cannot master it quickly. How to lower the difficulty of animation creation and improve its efficiency is a problem that urgently needs to be solved by those skilled in the art.
Disclosure of Invention
The application provides an intelligent multi-modal animation creation system and an intelligent multi-modal animation creation method that adopt game design methods and multi-modal interaction technology to enhance teenagers' understanding of and interest in animation art. Animation can be produced quickly, without constraints of time and space, even without experience using professional tools.
Embodiments of a first aspect of the present application provide an intelligent multi-modal animation authoring system, comprising: a scenario design module, configured to generate authoring operation instructions according to the creator's authoring intention and to build the animation story situation according to those instructions; a narrative construction module, configured to create narrative content within the animation story situation according to the key narrative elements and a narrative paradigm; and a multi-modal driving module, configured to collect the creator's multi-modal action interaction information and map it onto the body motion states of character objects in the animation story situation, thereby completing intelligent multi-modal animation creation.
Optionally, in one embodiment of the present application, the scenario design module includes: a 3D scene rendering unit, configured to construct a three-dimensional rendering space and present the animation story situation; a scene object selection unit, configured to classify scene objects in the three-dimensional rendering space and let users select scene objects that match their authoring intention; and a scene object operation unit, configured to provide a manipulation interface for adjusting the spatial positions of scene objects within the three-dimensional rendering space.
Optionally, in one embodiment of the present application, the scene object selection unit is specifically configured to divide scene objects into background objects, foreground objects, character objects, and prop objects, where a background object is a selectable texture located farthest back in the scene; a foreground object is one of a number of graphic textures smaller than the background; a character object is a movable character image in the narrative content; and a prop object is a scene object with a special function.
Optionally, in one embodiment of the present application, the narrative construction module comprises: a comic-style storyboard unit, configured to divide an animation segment into several storyboard panels based on the narrative paradigm; a character deployment unit, configured to set the character objects and background layout in each storyboard panel; and a recording unit, configured to record the corresponding animation for the character objects in each storyboard panel.
Optionally, in one embodiment of the present application, the character deployment unit is further configured to divide the presence of a character object in the scene into an on-stage mode and an off-stage mode, and to add and remove character objects in the scene by switching between the on-stage and off-stage modes.
Optionally, in one embodiment of the present application, the multi-modal driving module includes: an AI recognition unit, configured to recognize the character-object animation recorded by the creator and obtain the creator's multi-modal action interaction information; a motion redirection unit, configured to match the multi-modal interaction information with the actions of the character object, so as to map the actions in the multi-modal interaction information onto action behaviors specified for the character object; and an animation driving unit, configured to determine the relation between the character object's bones and its motion, so as to drive the bones to move according to the character object's action behaviors.
Optionally, in one embodiment of the present application, the motion redirection unit is further configured to define the skeletal structure of a character object, define the lines connecting a key point of the character object with its child node and its parent node as two bones, calculate the planar angle between the two bones, and transfer that angle value to the local rotation value of the target joint, where the skeletal structure of the character object follows a planar body construction mode.
Optionally, in one embodiment of the present application, the animation driving unit is further configured to map the character's facial expression through image-based facial recognition and emotion classification.
Embodiments of a second aspect of the present application provide an intelligent multi-modal animation authoring method, comprising the following steps: generating authoring operation instructions according to the creator's authoring intention, and building the animation story situation according to those instructions; creating narrative content within the animation story situation according to the key narrative elements and a narrative paradigm; and collecting the creator's multi-modal action interaction information and mapping it onto the body motion states of character objects in the animation story situation, thereby completing intelligent multi-modal animation creation.
Optionally, in one embodiment of the present application, generating the authoring operation instructions according to the creator's authoring intention includes: selecting scene objects corresponding to the authoring intention from a preset material library according to the creator's intention, and generating the authoring operation instructions by freely placing and/or collaging the scene objects.
The intelligent multi-modal animation creation system and method provided by the embodiments of the application adopt game design methods and multi-modal interaction technology to enhance teenagers' understanding of and interest in animation art. Target users can quickly produce animation works, without constraints of time and space, even without experience using professional tools.
Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is an example diagram of an intelligent multi-modal animation authoring system in accordance with an embodiment of the present application;
FIG. 2 is a schematic diagram of natural interaction based animated resource coding and classification in accordance with an embodiment of the present application;
FIG. 3 is a schematic diagram of the functional relationship of an animation module and other external modules according to an embodiment of the present application;
FIG. 4 is a flow chart of an intelligent multi-modal animation authoring method provided in accordance with an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the drawings are exemplary and intended for the purpose of explaining the present application and are not to be construed as limiting the present application.
FIG. 1 is an exemplary diagram of an intelligent multi-modal animation authoring system in accordance with an embodiment of the present application.
As shown in FIG. 1, the intelligent multi-modal animation authoring system 10 comprises: a scenario design module 100, a narrative construction module 200, and a multi-modal driving module 300.
The scenario design module 100 is configured to generate authoring operation instructions according to the creator's authoring intention and to build the animation story situation according to those instructions. The narrative construction module 200 is configured to create narrative content within the animation story situation according to the key narrative elements and the narrative paradigm. The multi-modal driving module 300 is configured to collect the creator's multi-modal action interaction information and map it onto the body motion states of character objects in the animation story situation, thereby completing intelligent multi-modal animation creation.
The scenario design module lets the creator quickly build the animated narrative context. The authoring system provides a large library of picture materials and collage resources and allows the creator to freely place and collage the scene they have imagined. The scenario design module is intuitive to operate, so an experimenter can create a variety of distinctive animation scenes without drawing skills or experience with professional software.
The narrative construction module lets the creator configure the key narrative elements (characters, items, story chains, actions, and so on) to create narrative content within the designed context. The animation creation system adopts a simplified design and uses the narrative paradigms of the four-panel comic and the stage play to guide the creator in building a story line.
The multi-modal driving module captures and records the creator's body-based multi-modal interaction information, which derives from the creator's intuitive actions and most truthfully represents the motion they intend. After acquiring the data, the multi-modal driving module further cleans and filters it and maps it onto the body motion states of the animated characters.
Optionally, in one embodiment of the present application, the scenario design module includes: a 3D scene rendering unit, configured to construct a three-dimensional rendering space and present the animation story situation; a scene object selection unit, configured to classify scene objects in the three-dimensional rendering space and let users select scene objects that match their authoring intention; and a scene object operation unit, configured to provide a manipulation interface for adjusting the spatial positions of scene objects within the three-dimensional rendering space.
The 3D scene rendering unit is a three-dimensional rendering space built on WebGL technology, in which all dynamic graphic content is presented. The scene object selection unit divides the elements of the three-dimensional rendering space into four categories; the creator selects a scene object matching the authoring intention from the database and drags it into the scene. The scene object operation unit adjusts the left-right, up-down, and depth positions of scene objects through a specially designed manipulation interface, placing and manipulating the spatial positions of all objects in the scene so that the creator has fine control over them.
The 3D scene rendering unit includes an orthographic projection mode, a tilted-camera depth display mode, and a scene layering mode. Orthographic projection is the projection mode used in the system: it removes the near/far size difference between front and rear objects in the scene, focusing the user's attention on left-right positions within the same plane and reducing the cognitive load caused by scene complexity. The left-right direction of the projection space is the X direction of the renderer's world coordinates, the up-down direction is the Y direction, and the front-back depth direction is the Z direction. On top of the orthographic projection mode, the system pitches the renderer's camera to -20 degrees and correspondingly tilts every scene object by -20 degrees about the ground axis, so that differences in vertical position and depth can both be perceived when objects are placed under the orthographic view. In the scene layering mode, the background consists of two mutually perpendicular images: a vertical one representing the distant sky and a horizontal one representing the near ground. A background may also be of a type with no horizontal ground, such as a sky or underwater scene, in which case characters and props are not affected by gravity. Foreground objects are distributed at various positions in the scene and can be freely placed and edited by the creator.
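The patent does not name a specific renderer; as an illustrative sketch only, the described setup (orthographic projection, a camera pitched to -20 degrees, and a perpendicular sky/ground background pair) could be assembled with three.js roughly as follows. All sizes and colors are assumptions.

```typescript
import * as THREE from 'three';

// Orthographic camera: left/right map to world X, top/bottom to world Y,
// near/far to depth along world Z, matching the projection described above.
const aspect = window.innerWidth / window.innerHeight;
const viewSize = 10;
const camera = new THREE.OrthographicCamera(
  -viewSize * aspect, viewSize * aspect,
  viewSize, -viewSize,
  0.1, 1000
);
camera.position.set(0, 5, 30);
camera.rotation.x = THREE.MathUtils.degToRad(-20); // pitch the camera down 20 degrees

const scene = new THREE.Scene();

// Layered background: a vertical plane for the distant sky and a horizontal
// plane for the near ground, perpendicular to each other.
const sky = new THREE.Mesh(
  new THREE.PlaneGeometry(40, 20),
  new THREE.MeshBasicMaterial({ color: 0x87ceeb })
);
sky.position.set(0, 10, -20);

const ground = new THREE.Mesh(
  new THREE.PlaneGeometry(40, 40),
  new THREE.MeshBasicMaterial({ color: 0x88aa66 })
);
ground.rotation.x = -Math.PI / 2; // lay the ground flat

scene.add(sky, ground);

const renderer = new THREE.WebGLRenderer();
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);
renderer.render(scene, camera);
```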
As shown in FIG. 2, the scene object selection unit classifies scene objects into four classes: background objects, foreground objects, character objects, and prop objects. A background object is a selectable texture located farthest back in the scene; a scene has a single background object. Foreground objects are graphic textures that are small relative to the background; they can be freely selected and dragged into the scene any number of times to decorate it, and their depth position can be adjusted to create a sense of depth. Character objects are movable character avatars that can be driven by the creator and can be placed anywhere in the scene without limitation. Prop objects are scene objects that can be driven by the creator and have their own special functions; they too can be placed anywhere in the scene without limitation.
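A possible data model for this four-way classification is sketched below in TypeScript; the type and field names are assumptions introduced for illustration, not identifiers from the patent.

```typescript
// Scene objects fall into four categories with the constraints described above.
type SceneObjectKind = 'background' | 'foreground' | 'character' | 'prop';

interface SceneObject {
  kind: SceneObjectKind;
  textureUrl: string;
  // Position in the orthographic scene: x = left/right, y = up/down, z = depth.
  position: { x: number; y: number; z: number };
  drivable: boolean; // character and prop objects can be driven by the creator
}

class SceneGraph {
  private objects: SceneObject[] = [];

  add(obj: SceneObject): void {
    if (obj.kind === 'background') {
      // Only a single background object is allowed per scene.
      this.objects = this.objects.filter(o => o.kind !== 'background');
    }
    this.objects.push(obj);
  }

  byKind(kind: SceneObjectKind): SceneObject[] {
    return this.objects.filter(o => o.kind === kind);
  }
}
```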
Optionally, in one embodiment of the present application, the narrative construction module comprises: a comic-style storyboard unit, configured to divide an animation segment into several storyboard panels based on the narrative paradigm; a character deployment unit, configured to set the character objects and background layout in each storyboard panel; and a recording unit, configured to record the corresponding animation for the character objects in each storyboard panel.
The comic-style storyboard unit divides the animation segment into four fixed storyboard panels based on the narrative mode of the four-panel comic; the creator places character objects on stage in each panel, which is handled by the character deployment unit. After arranging the background layout and scene angle for each panel, the creator can record the corresponding animation for that single panel, which is handled by the recording unit.
To simplify the narrative design process, the comic-style storyboard unit defines four panels for the user, taking the four-panel comics that teenagers know and love as the paradigm for animation storyboard design. This restrictive design focuses the user's attention on designing a clever and interesting storyline rather than on the problem of montage between shots. Panel selection and switching use a tabbed visual design, so different panels can be selected conveniently in the same window and the differences in characters and scenes between panels can be inspected.
Optionally, in one embodiment of the present application, the character deployment unit is further configured to divide the presence of a character object in the scene into an on-stage mode and an off-stage mode, and to add and remove character objects in the scene by switching between the on-stage and off-stage modes.
The character deployment unit divides the space in which characters exist in the scene into two systems: an on-stage system and a backstage (off-stage) system. The purposes are (1) to realize the appearance and disappearance of character objects and prop objects, (2) to let the creator better perceive the dynamic environment of the scene, and (3) to let scene objects be reused and conveniently added or removed. A character waiting backstage enters the scene when its on-stage button is clicked, and otherwise returns backstage. The on-stage and off-stage actions of characters can be recorded by the animation system, which realizes the appearance and disappearance of objects. When a character backstage is deleted, its stored animation content is deleted as well. When a new storyboard panel is opened, all characters from the previous panel return backstage to await dispatch by the creator.
The recording unit imitates the UI layout of a video game: it avoids complex function areas and nested panels and only superimposes a single flattened UI layer on the animation scene rendered in real time. The core recording start button is located in the lower-left corner, and the camera in the upper-right corner captures the content. The animation creation system simplifies the recording process: for any character in a single storyboard panel, the system provides no editing of the recorded result, and the creator simply deletes it and records again.
Optionally, in one embodiment of the present application, the multi-modal driving module includes: an AI recognition unit, configured to recognize the character-object animation recorded by the creator and obtain the creator's multi-modal action interaction information; a motion redirection unit, configured to match the multi-modal interaction information with the actions of the character object, so as to map the actions in the multi-modal interaction information onto action behaviors specified for the character object; and an animation driving unit, configured to determine the relation between the character object's bones and its motion, so as to drive the bones to move according to the character object's action behaviors.
The AI recognition unit recognizes the creator's physical action and expression information. The system treats the creator's bodily multi-modal expression as the primary data source driving the animation. The motion redirection unit processes the data acquired by the AI recognition unit and maps it to specified action behaviors. The animation driving unit determines the skeleton and motion relationship of the character object, which is the technical basis for driving the virtual character's motion from the multi-modal information.
The AI recognition unit comprises two modules: an intelligent algorithm model and data cleaning. The intelligent algorithm model mainly uses the PoseNet pose recognition model. PoseNet is a TensorFlow-based machine learning model that allows real-time human body pose estimation in the browser and can evaluate either a single pose or multiple poses. Pose estimation is performed in two stages: (1) an RGB image is input and passed through a convolutional neural network; (2) a single-pose or multi-pose decoding algorithm outputs the decoded poses, pose confidence scores, keypoint positions, and keypoint confidence scores. When estimating body posture, PoseNet selects 17 key points on the body, covering essentially all movable joints of an animated character. The keypoint coordinates output directly by the PoseNet model are affected by the camera's capture quality and can show obvious jitter and data loss. The raw data is therefore cleaned first: key points with a confidence score below 0.2 are discarded, which filters out most overlapping and occluded points. On this basis, the filtered discrete coordinate data is temporally smoothed (currently with a Lerp smoothing algorithm).
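The cleaning step described above can be sketched as follows (TypeScript); the 0.2 threshold comes from the description, while the smoothing factor `alpha` and all identifiers are assumptions.

```typescript
interface Keypoint {
  part: string;
  score: number;
  position: { x: number; y: number };
}

// Discard low-confidence keypoints (threshold 0.2 per the description), which
// filters out most overlapping and occluded points.
function cleanKeypoints(keypoints: Keypoint[], minScore = 0.2): Keypoint[] {
  return keypoints.filter(kp => kp.score >= minScore);
}

// Temporal Lerp smoothing: blend each new frame toward the previous smoothed
// value to suppress camera jitter. `alpha` is an assumed tuning value.
function lerp(a: number, b: number, t: number): number {
  return a + (b - a) * t;
}

function smoothKeypoints(
  previous: Map<string, { x: number; y: number }>,
  current: Keypoint[],
  alpha = 0.3
): Map<string, { x: number; y: number }> {
  const smoothed = new Map<string, { x: number; y: number }>();
  for (const kp of current) {
    const prev = previous.get(kp.part);
    smoothed.set(kp.part, prev
      ? { x: lerp(prev.x, kp.position.x, alpha), y: lerp(prev.y, kp.position.y, alpha) }
      : { ...kp.position });
  }
  return smoothed;
}
```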
Optionally, in one embodiment of the present application, the motion redirection unit is further configured to define the skeletal structure of a character object, define the lines connecting a key point of the character object with its child node and its parent node as two bones, calculate the planar angle between the two bones, and transfer that angle value to the local rotation value of the target joint, where the skeletal structure of the character object follows a planar body construction mode.
The motion redirection unit comprises an object bone structure and a data mapping module. The object bone structure defines a planar body construction model with two features: (1) compared with rigid bones, the 3D bones support warping and bending of the picture; (2) all body parts are laid out without overlap on the same texture map. The goal is for one set of bones to fit as many character images as possible. The first skeleton type adapts poorly across character forms: its skin area is limited to the region occupied by the character's own body parts, and map replacement can only swap the head and body. The second skeleton type targets a specific class of characters whose bodies are particular objects such as garbage bags, steamed buns, fried dough sticks, vegetables, or fruits, and whose limbs are matchstick lines; a matchstick-man skeleton system is designed for them, and a designer completes a new character simply by replacing the object content in the texture map. For the differences in hand and foot positions caused by objects of different shapes, the matchstick-man system can set different hand and foot spacings when the character is assembled in the engine, without changing the model skin. The third skeleton mode will in the future be adapted to most bipedal human or animal images and will support larger changes of appearance and clothing. The rigid-body skeleton mode is the most common: it is based on rigid bones and places no particular requirements on the size, length, or width of the skinned bones and limbs, so texture maps are bound directly to the bones without skinning, and limbs of any aspect ratio fit naturally into the character system.
The smoothed keypoint data is processed by the data mapping module and can be used in different interactive applications. Keypoint data is currently used for (1) motion mapping of individual points and (2) overall mapping of the human joints. The original keypoint coordinates are projections of the user's body in the camera screen coordinate system, whereas the bone-length ratios of different characters in the animation system differ. To obtain a normalized body motion pose so that user actions can be redirected to any character image, the lines connecting a key point with its child node and its parent node are defined as two bones, the planar angle between the two bones is calculated, and the angle value is transferred to the local rotation value of the target joint.
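A minimal sketch of this angle-based retargeting is given below (TypeScript); the joint names in the comments and the `localRotation` field are assumptions used only to illustrate the calculation.

```typescript
interface Vec2 { x: number; y: number; }

// Planar angle of the bone pointing from `from` to `to`, in radians.
function boneAngle(from: Vec2, to: Vec2): number {
  return Math.atan2(to.y - from.y, to.x - from.x);
}

// Retarget one joint: treat the parent->keypoint and keypoint->child segments
// as two bones, take the angle between them, and write it to the target
// joint's local rotation value.
function retargetJoint(
  parent: Vec2, keypoint: Vec2, child: Vec2,
  targetJoint: { localRotation: number }
): void {
  const upper = boneAngle(parent, keypoint); // e.g. shoulder -> elbow
  const lower = boneAngle(keypoint, child);  // e.g. elbow -> wrist
  // A relative angle is independent of the performer's limb lengths, so the
  // same pose can drive characters with different bone-length ratios.
  targetJoint.localRotation = lower - upper;
}

// Example: drive a character's elbow from smoothed keypoint positions.
const elbowJoint = { localRotation: 0 };
retargetJoint({ x: 100, y: 120 }, { x: 140, y: 160 }, { x: 180, y: 150 }, elbowJoint);
```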
The animation driving unit comprises three modules: model skinning, skeletal animation, and an animation state machine. The model skinning module contains no facial organ components. The skin is continuous at the elbows and knees but disconnected at the shoulders and hips. With the skeleton structure unchanged, texture maps can be replaced directly, which guarantees that the character system switches skins completely and without error; however, only part of the area of each body-map component is rendered effectively, and anything beyond that area is not displayed. The hands and feet are left-right symmetric and use the texture of the same region. The overall structure of the bipedal skeleton approximates that of a standard humanoid model, but the model requires its rest pose to be deflected slightly to the right so that the character object can turn left and right. The tail is an extra bone extending from the waist joint, and an ear can also be given a bone node; a skeletal spring system gives the ears and tail an elastic effect.
Optionally, in one embodiment of the present application, the animation driving unit is further configured to map the character's facial expression through image-based facial recognition and emotion classification.
Under the influence of user interaction, all bones of the character objects in the context can be transformed in motion. The system designs a composite motion capture scheme. For the facial organs (eyes, mouth, nose, eyebrows, etc.), the character's facial expression is mapped using two input modes: image-based facial recognition (continuous) and emotion classification recognition (discrete states). Character motion is realized through three layers of interaction input. The base layer is the basic displacement and animation states generated by keyboard and mouse operations, such as walking, jumping, and crouching; this control style resembles directional control in a video game, so users quickly become familiar with the animation situation. The second layer is trajectory animation based on image recognition: the algorithm recognizes the motion trajectory of a finger and maps it onto the object, and the motion data of this layer overrides the data of the layer below. The third layer is image-based pose recognition: the algorithm recognizes the motion of the body skeleton and maps it onto the character object; the motion data of this layer is remapped with the user experience as the core. The final motion operation mode of the animation system is: the user moves the animated character freely around the scene with the up/down/left/right keys; with their hands facing the camera, the user can carry the animated character into the air for flight; and during flight the body actions performed by the creator are shown on the animated character.
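The three-layer override behaviour can be sketched as a simple priority resolution (TypeScript); the layer names and priorities are assumptions that mirror the description, not identifiers from the patent.

```typescript
// Three interaction layers: keyboard/mouse base layer, finger-trajectory
// layer, and full-body pose layer. A higher, active layer overrides lower ones.
type MotionSample = { x: number; y: number; jointRotations?: Map<string, number> };

interface MotionLayer {
  name: 'keyboard' | 'trajectory' | 'pose';
  priority: number;              // 0 = base layer; higher overrides lower
  sample(): MotionSample | null; // null when the layer is currently inactive
}

function resolveMotion(layers: MotionLayer[]): MotionSample | null {
  let best: { priority: number; sample: MotionSample } | null = null;
  for (const layer of layers) {
    const sample = layer.sample();
    if (sample !== null && (best === null || layer.priority > best.priority)) {
      best = { priority: layer.priority, sample };
    }
  }
  return best ? best.sample : null;
}
```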
By adopting game design methods and multi-modal interaction technology, the system of the application enhances teenagers' understanding of and interest in animation art. Animation can be produced quickly, without constraints of time and space, even without experience using professional tools.
The authoring system of the present application is described in detail below through a specific embodiment in conjunction with FIG. 3.
1) Project creation
Before using the system, a new project instance should be created by clicking the New button (the background switches to blue sky), or a project instance currently stored in the cloud should be loaded by clicking the Read button. Starting animation editing directly without clicking New or Read may cause a system error. A cloud project also cannot be read on top of a newly created project; refresh the page before clicking Read. A project being edited can be saved to the cloud at any time by clicking the Save button, and the saved content can be edited further after clicking Read. A multi-user system has not yet been built: only one saved copy exists in the cloud, and it may be overwritten by other testers.
2) Resource browsing system
The resource browsing system comprises four selectable categories: background, foreground, character, and prop. Clicking the Background button switches the background style directly. Clicking the Foreground button selects the desired foreground pattern. Clicking the Character button selects the desired character. Clicking the Prop button selects the desired prop.
After an object has been generated in the scene, clicking the object activates the movement controls (a yellow circle). The large yellow circle indicates the object's position on the virtual ground plane, the small green circle directly above the object adjusts its height, and the small blue circle at the side adjusts its left-right position.
3) Storyboard window system
Four storyboard panels are generated by default in the storyboard interface, and each panel is switched through the tab buttons at the top. Each panel can select its own user-defined background + foreground combination; this function is not yet complete, and all panels currently share one background + foreground combination. Clicking the camera button at the lower right enters the single-panel interface, from which one can switch back to the storyboard tab interface.
4) Timeline system
The timeline of the multi-panel interface consists of 4 equal-width regions; the full extent of each region maps to the system's default maximum time for a single panel (10 s), and if a panel's animation is shorter than 10 s, the recorded span is shown as a green rectangular overlay. The user can drag the timeline freely within the green region to preview the animation of each panel. Clicking the play button on the right plays/pauses the animation.
The timeline of the single-panel interface is a single line running from left to right over the panel's default maximum time (10 s); if the panel's animation is shorter than 10 s, the user can drag the timeline only within the range that contains recorded animation. Clicking the play button on the right plays/pauses the animation.
5) Backstage/on-stage system
To make it easy to control each character in the scene, a backstage/on-stage system is designed for selecting the character objects used in each panel. Only character and prop objects are managed by the backstage system. Within one animation project, several character objects of the same type may be created; in the demo project shown in the figure above, two "Piglet Cookie" character objects are created. Each can appear in every panel as an independent actor, and actors that are not on stage appear translucent; during animation playback, characters that are not on stage do not appear at all. The list of all actors participating in the performance and their backstage status can be seen in the backstage system UI.
As shown in the figure above, two piglets are on stage in the animation project: the first appears in the first panel and the second in the second panel. The actor currently on stage is highlighted in red in the backstage list. Clicking the stage button on its avatar toggles between the backstage and on-stage states. Clicking the delete button on the avatar removes the actor, and its animation performance data in every panel is deleted as well.
6) Animation recording system
Clicking the camera button in the lower-right corner of the window enters the animation editing interface for the selected panel. The functions of the UI on the right are: a pose (Pose) animation recording button, a movement (Move) animation recording button, a save-animation button, and a discard-animation button.
The flow for recording a character's pose animation in a given panel is: (1) select the animation object; (2) drag the timeline to the point in time at which you want to start recording; the starting point must lie within the time range that already has recorded animation, i.e. within the range over which the timeline can be dragged, and if it lies somewhere in the middle of a recorded animation, the new segment will overwrite the old one; (3) click the pose animation recording button; after recording starts, the timeline advances with time, and the poses captured by the camera are recorded into a temporary animation data object (animation data_new); (4) click the pose animation recording button again to stop recording.
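A hedged sketch of this recording flow is given below (TypeScript): frames are appended to a temporary animation data object while recording, and on stop the new segment overwrites any old frames in the same time range. The class and field names, including `animationDataNew`, are illustrative stand-ins.

```typescript
interface PoseFrame { time: number; jointRotations: Map<string, number>; }

class PoseRecorder {
  private recording = false;
  private startTime = 0;
  readonly animationDataNew: PoseFrame[] = []; // temporary animation data object

  constructor(private readonly track: PoseFrame[], private readonly maxDuration = 10) {}

  start(atTime: number): void {
    this.recording = true;
    this.startTime = atTime;
    this.animationDataNew.length = 0;
  }

  capture(elapsed: number, jointRotations: Map<string, number>): void {
    const t = this.startTime + elapsed;
    if (!this.recording || t > this.maxDuration) return; // 10 s panel limit
    this.animationDataNew.push({ time: t, jointRotations });
  }

  stop(): void {
    this.recording = false;
    if (this.animationDataNew.length === 0) return;
    const from = this.animationDataNew[0].time;
    const to = this.animationDataNew[this.animationDataNew.length - 1].time;
    // The new segment overwrites old frames in the overlapping time range.
    const kept = this.track.filter(f => f.time < from || f.time > to);
    this.track.length = 0;
    this.track.push(...kept, ...this.animationDataNew);
    this.track.sort((a, b) => a.time - b.time);
  }
}
```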
According to the intelligent multi-modal animation creation system provided by the embodiments of the application, game design methods and multi-modal interaction technology enhance teenagers' understanding of and interest in animation art. Target users can quickly produce animation works, without constraints of time and space, even without experience using professional tools.
Next, an intelligent multi-modal animation authoring method according to an embodiment of the present application will be described with reference to the accompanying drawings.
FIG. 4 is a flow chart of an intelligent multi-modal animation authoring method in accordance with an embodiment of the present application.
As shown in fig. 4, the intelligent multi-modal animation authoring method includes the steps of:
step S101, generating an authoring operation instruction according to the authoring intention of an author, and building an animation story generating situation according to the authoring operation instruction.
Optionally, in one embodiment of the present application, generating the authoring operation instruction according to the authoring intention of the author includes: and selecting a scene object corresponding to the authoring intention in a preset material library according to the authoring intention of the creator, and generating an authoring operation instruction by freely placing or and/or collaging the scene object.
Based on the game interactive interface, the user can quickly design and build the imagined animation story generation situation in the brain.
Step S102, creating the narrative content in the animation story occurrence situation according to the narrative key elements and the narrative paradigm.
The narrative key elements comprise at least one of character objects, items, story chains, and character actions. The narrative paradigm includes at least one of four-grid comics and stage plays.
Embodiments of the present application provide a set of interactive metaphors based on four-lattice comics and stage shows that assist users in constructing storyline and narrative chains in context.
Step S103, collecting multi-modal action interaction information of the creator, and mapping the multi-modal action interaction information into the body motion state of the character object in the animation story occurrence situation to complete the creation of the intelligent multi-modal painting.
According to the embodiment of the application, through a series of multi-mode interaction interfaces, the body expression of the user is converted into the character motion, and the recording of the animation narrative is realized.
It should be noted that the foregoing explanation of the embodiment of the intelligent multi-modal animation authoring system is also applicable to the intelligent multi-modal animation authoring method of the embodiment, and will not be repeated herein.
According to the intelligent multi-mode animation creation method provided by the embodiment of the application, the game design method and the multi-mode interaction technology are adopted, so that the understanding and interest of teenagers on animation art are enhanced. The target user can quickly produce the animation work without space-time limitation without professional tool use experience.
In the description of the present specification, a description referring to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of these terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or N embodiments or examples. In addition, the different embodiments or examples described in this specification, and the features of those different embodiments or examples, may be combined and integrated by those skilled in the art without contradiction.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "N" is at least two, such as two, three, etc., unless explicitly defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more N executable instructions for implementing specific logical functions or steps of the process, and further implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.

Claims (10)

1. An intelligent multi-modal animation authoring system, comprising:
a scenario design module, configured to generate authoring operation instructions according to the creator's authoring intention and to build the animation story situation according to those instructions;
a narrative construction module, configured to create narrative content within the animation story situation according to the key narrative elements and a narrative paradigm; and
a multi-modal driving module, configured to collect the creator's multi-modal action interaction information and map it onto the body motion states of character objects in the animation story situation, thereby completing intelligent multi-modal animation creation.
2. The system of claim 1, wherein the scenario design module comprises:
a 3D scene rendering unit, configured to construct a three-dimensional rendering space and present the animation story situation;
a scene object selection unit, configured to classify scene objects in the three-dimensional rendering space and let users select scene objects matching their authoring intention; and
a scene object operation unit, configured to provide a manipulation interface for adjusting the spatial positions of scene objects within the three-dimensional rendering space.
3. The system of claim 2, wherein the scene object selection unit is specifically configured to divide scene objects into background objects, foreground objects, character objects, and prop objects, a background object being a selectable texture located farthest back in the scene; a foreground object being one of a number of graphic textures smaller than the background; a character object being a movable character image in the narrative content; and a prop object being a scene object with a special function.
4. The system of claim 1, wherein the narrative construction module comprises:
a comic-style storyboard unit, configured to divide an animation segment into several storyboard panels based on the narrative paradigm;
a character deployment unit, configured to set the character objects and background layout in each storyboard panel; and
a recording unit, configured to record the corresponding animation for the character objects in each storyboard panel.
5. The system of claim 4, wherein the character deployment unit is further configured to divide the presence of a character object in the scene into an on-stage mode and an off-stage mode, and to add and remove character objects in the scene by switching between the on-stage and off-stage modes.
6. The system of claim 1, wherein the multi-modal driving module comprises:
an AI recognition unit, configured to recognize the character-object animation recorded by the creator and obtain the creator's multi-modal action interaction information;
a motion redirection unit, configured to match the multi-modal interaction information with the actions of the character object, so as to map the actions in the multi-modal interaction information onto action behaviors specified for the character object; and
an animation driving unit, configured to determine the relation between the character object's bones and its motion, so as to drive the bones to move according to the character object's action behaviors.
7. The system of claim 6, wherein the motion redirection unit is further configured to define the skeletal structure of a character object, define the lines connecting a key point of the character object with its child node and its parent node as two bones, calculate the planar angle between the two bones, and transfer that angle value to the local rotation value of the target joint, wherein the skeletal structure of the character object follows a planar body construction model.
8. The system of claim 6, wherein the animation driving unit is further configured to map the character's facial expression through image-based facial recognition and emotion classification.
9. An intelligent multi-modal animation authoring method, comprising the following steps:
generating authoring operation instructions according to the creator's authoring intention, and building the animation story situation according to those instructions;
creating narrative content within the animation story situation according to the key narrative elements and a narrative paradigm; and
collecting the creator's multi-modal action interaction information and mapping it onto the body motion states of character objects in the animation story situation, thereby completing intelligent multi-modal animation creation.
10. The method of claim 9, wherein generating the authoring operation instructions according to the creator's authoring intention comprises:
selecting scene objects corresponding to the authoring intention from a preset material library according to the creator's intention, and generating the authoring operation instructions by freely placing and/or collaging the scene objects.
CN202310181768.2A 2023-02-21 2023-02-21 Intelligent multi-mode animation creation system and creation method Pending CN116342763A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310181768.2A CN116342763A (en) 2023-02-21 2023-02-21 Intelligent multi-mode animation creation system and creation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310181768.2A CN116342763A (en) 2023-02-21 2023-02-21 Intelligent multi-mode animation creation system and creation method

Publications (1)

Publication Number Publication Date
CN116342763A true CN116342763A (en) 2023-06-27

Family

ID=86881465

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310181768.2A Pending CN116342763A (en) 2023-02-21 2023-02-21 Intelligent multi-mode animation creation system and creation method

Country Status (1)

Country Link
CN (1) CN116342763A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117034385A (en) * 2023-08-30 2023-11-10 四开花园网络科技(广州)有限公司 AI system supporting creative design of humanoid roles
CN117034385B (en) * 2023-08-30 2024-04-02 四开花园网络科技(广州)有限公司 AI system supporting creative design of humanoid roles

Similar Documents

Publication Publication Date Title
US11461950B2 (en) Object creation using body gestures
US6331861B1 (en) Programmable computer graphic objects
Barnes et al. Video puppetry: a performative interface for cutout animation
KR20190025691A (en) How and where to make a video
Thalmann et al. EG 2007 Course on Populating Virtual Environments with Crowds.
CN115331265A (en) Training method of posture detection model and driving method and device of digital person
CN116342763A (en) Intelligent multi-mode animation creation system and creation method
Sénécal et al. Modelling life through time: cultural heritage case studies
US11625900B2 (en) Broker for instancing
Çimen Animation models for interactive AR characters
Walther-Franks et al. An interaction approach to computer animation
US11715249B2 (en) Hierarchies to generate animation control rigs
Shiratori User Interfaces for Character Animation and Character Interaction
Sabiston Extracting 3d motion from hand-drawn animated figures
Willett Tools for Live 2D Animation
Perez Data-Based Motion Planning for Full-Body Virtual Human Interaction with the Environment
Magnenat-Thalmann et al. Virtual humans
Cosmas et al. Creative tools for producing realistic 3D facial expressions and animation
Vacchi et al. Neo euclide: A low-cost system for performance animation and puppetry
Magnenat-Thalmann Living in both the real and virtual worlds
Kiss 3D Character Centred Online Editing Modalities for VRML-based Virtual Environments
Welsman-Dinelle The animation canvas: a sketch-based visual language for motion editing
Lockwood High Degree of Freedom Input and Large Displays for Accessible Animation
Boquet Bertran Automatic and guided rigging of 3D characters
Tsekleves et al. Semi-automated human body 3D animator

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination