CN116342763A - Intelligent multi-mode animation creation system and creation method - Google Patents
- Publication number
- CN116342763A (application number CN202310181768.2A)
- Authority
- CN
- China
- Prior art keywords
- animation
- scene
- character
- mode
- authoring
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
- G06T19/006—Mixed reality
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Graphics (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Processing Or Creating Images (AREA)
Abstract
The application discloses an intelligent multi-modal animation creation system and creation method. The system comprises: a scenario design module for generating authoring operation instructions according to the creator's authoring intention and building an animation story context according to those instructions; a narrative construction module for creating narrative content within the animation story context according to narrative key elements and a narrative paradigm; and a multi-modal driving module for collecting the creator's multi-modal action interaction information and mapping it onto the body motion state of a character object in the animation story context, thereby completing intelligent multi-modal animation creation. By adopting game design methods and multi-modal interaction technology, the system strengthens teenagers' understanding of and interest in animation art, and lets them produce animations quickly, without constraints of time or place, even without experience with professional tools.
Description
Technical Field
The application relates to the technical field of artificial intelligence and multi-modal interaction, and in particular to an intelligent multi-modal animation creation system and creation method.
Background
WebGL (Web Graphics Library) is a 3D drawing protocol that combines JavaScript with OpenGL ES 2.0. By adding a JavaScript binding for OpenGL ES 2.0, WebGL provides hardware-accelerated 3D rendering for the HTML5 Canvas, so that Web developers can present 3D scenes and models smoothly in the browser with the help of the system graphics card and build complex navigation and data visualization.
PoseNet is a real-time pose detection technique that can detect human poses in images or videos. It can operate in two modes: single-pose detection (one human body) and multi-pose detection (multiple human bodies). As a deep learning TensorFlow model, it estimates human pose by detecting body parts such as elbows, hips, wrists, knees and ankles, and forms a skeleton structure of the pose by connecting these points. PoseNet is trained on the MobileNet architecture. MobileNet is a convolutional neural network developed by Google and trained on the ImageNet dataset, mainly used for image classification and object estimation; it is a lightweight model that uses depthwise separable convolutions to reduce parameters and computation cost while preserving accuracy. The pre-trained model runs in the browser, which distinguishes PoseNet from other libraries that rely on server-side APIs. PoseNet provides a total of 17 usable keypoints, from the eyes and ears down to the knees and ankles.
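As a rough illustration of the browser-side pose pipeline described above, the sketch below loads the @tensorflow-models/posenet package and estimates a single pose from a webcam video element on each frame. The patent does not contain this code; the model configuration values and function names are illustrative assumptions.

```typescript
import '@tensorflow/tfjs';                      // registers the browser backend
import * as posenet from '@tensorflow-models/posenet';

// Minimal sketch: single-pose estimation from a <video> element, once per frame.
async function runPoseLoop(video: HTMLVideoElement): Promise<void> {
  // MobileNet-based PoseNet model, loaded and run entirely in the browser.
  const net = await posenet.load({
    architecture: 'MobileNetV1',
    outputStride: 16,
    inputResolution: { width: 257, height: 257 },
    multiplier: 0.75,
  });

  async function onFrame(): Promise<void> {
    // Returns 17 keypoints, each with a body-part name, 2D position and score.
    const pose = await net.estimateSinglePose(video, { flipHorizontal: true });
    for (const kp of pose.keypoints) {
      console.log(kp.part, kp.position.x, kp.position.y, kp.score);
    }
    requestAnimationFrame(onFrame);
  }
  requestAnimationFrame(onFrame);
}
```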
A game engine is the core component of an editable computer game system or interactive real-time graphics application. Such systems provide game designers with the tools required to write games, so that they can produce game programs easily and quickly without starting from zero. Most engines support a variety of operating platforms, such as Linux, Mac OS X and Microsoft Windows. A game engine typically comprises the following systems: a rendering engine (the "renderer", including two-dimensional and three-dimensional image engines), a physics engine, collision detection, sound effects, a script engine, computer animation, artificial intelligence, a network engine and scene management. A game engine is a set of machine-recognizable code (instructions) designed for machines that run a certain class of games; like an engine, it controls the running of the game. A game work can be divided into two major parts, the game engine and the game resources. The game resources include images, sound, animation and so on; one formula is: game = engine (program code) + resources (images, sound, animation, etc.). The game engine then calls these resources in sequence as required by the game design.
At present, computer animation production software is relatively mature, but such software requires users to have professional artistic literacy and a great deal of software operation experience, so teenage users often cannot master it quickly. How to reduce the difficulty of animation production and improve its efficiency is a problem that needs to be solved by those skilled in the art.
Disclosure of Invention
The application provides an intelligent multi-modal animation creation system and method that adopt game design methods and multi-modal interaction technology, so that teenagers' understanding of and interest in animation art are strengthened and animations can be produced quickly, without constraints of time or place, even without experience with professional tools.
Embodiments of a first aspect of the present application provide an intelligent multi-modal animation authoring system comprising: a scenario design module for generating authoring operation instructions according to the creator's authoring intention and building an animation story context according to those instructions; a narrative construction module for creating narrative content within the animation story context according to narrative key elements and a narrative paradigm; and a multi-modal driving module for collecting the creator's multi-modal action interaction information and mapping it onto the body motion state of a character object in the animation story context, thereby completing intelligent multi-modal animation creation.
Optionally, in one embodiment of the present application, the scenario design module includes: a 3D scene rendering unit for constructing a three-dimensional rendering space and presenting the animation story context; a scene object selection unit for classifying the scene objects in the three-dimensional rendering space and letting the user select scene objects corresponding to the authoring intention; and a scene object operation unit for providing a manipulation interaction interface with which to adjust the spatial position of scene objects in the three-dimensional rendering space.
Optionally, in one embodiment of the present application, the scene object selection unit is specifically configured to divide scene objects into background objects, foreground objects, character objects and prop objects, where a background object is a selectable texture located farthest back in the scene; a foreground object is one of a number of graphic textures smaller than the background; a character object is a movable character image in the narrative content; and a prop object is a scene object with a special function.
Optionally, in one embodiment of the present application, the narrative construction module comprises: a comic-style storyboard unit for dividing the animation into a plurality of shot sections based on the narrative paradigm; a character deployment unit for setting the character objects and background layout in each shot section; and a recording unit for recording the corresponding animation for the character objects in each shot section.
Optionally, in an embodiment of the present application, the character deployment unit is further configured to divide the presence space of character objects in the scene into an on-stage mode and an off-stage mode, and to add and remove character objects from the scene by switching between the on-stage mode and the off-stage mode.
Optionally, in one embodiment of the present application, the multi-modal driving module includes: an AI recognition unit for recognizing the character-object animation recorded by the creator to obtain the creator's multi-modal action interaction information; a motion redirection unit for matching the multi-modal interaction information with the actions of the character object, so as to map the actions in the multi-modal interaction information to the action behaviors specified for the character object; and an animation driving unit for determining the relation between the bones and the motions of the character object, so as to drive the bones to move according to the character object's action behaviors.
Optionally, in one embodiment of the present application, the motion redirection unit is further configured to define the skeleton structure of the character object, define the lines connecting a keypoint of the character object to its child node and its parent node as two bones, calculate the plane angle between the two bones, and transfer that plane angle value to the local rotation value of the target joint, where the skeleton structure of the character object follows a planar body-construction mode.
Optionally, in an embodiment of the present application, the animation driving unit is further configured to map the character's facial expression through image-based facial recognition and emotion classification.
An embodiment of a second aspect of the present application provides an intelligent multi-modal animation authoring method comprising the steps of: generating authoring operation instructions according to the creator's authoring intention, and building an animation story context according to those instructions; creating narrative content within the animation story context according to narrative key elements and a narrative paradigm; and collecting the creator's multi-modal action interaction information and mapping it onto the body motion state of a character object in the animation story context, thereby completing intelligent multi-modal animation creation.
Optionally, in one embodiment of the present application, generating the authoring operation instructions according to the creator's authoring intention includes: selecting scene objects corresponding to the authoring intention from a preset material library according to the creator's authoring intention, and generating the authoring operation instructions by freely placing and/or collaging the scene objects.
The intelligent multi-modal animation creation system and method provided by the embodiments of the application adopt game design methods and multi-modal interaction technology, so that teenagers' understanding of and interest in animation art are strengthened. Target users can quickly produce animation works, without constraints of time or place, even without experience with professional tools.
Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is an example diagram of an intelligent multi-modal animation authoring system in accordance with an embodiment of the present application;
FIG. 2 is a schematic diagram of natural interaction based animated resource coding and classification in accordance with an embodiment of the present application;
FIG. 3 is a schematic diagram of the functional relationship of an animation module and other external modules according to an embodiment of the present application;
FIG. 4 is a flow chart of an intelligent multi-modal animation authoring method provided in accordance with an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the drawings are exemplary and intended for the purpose of explaining the present application and are not to be construed as limiting the present application.
FIG. 1 is an exemplary diagram of an intelligent multi-modal animation authoring system in accordance with an embodiment of the present application.
As shown in fig. 1, the intelligent multi-modal animation authoring system 10 comprises: a scenario design module 100, a narrative construction module 200, and a multi-modal driving module 300.
The scenario design module 100 is configured to generate authoring operation instructions according to the creator's authoring intention and to build an animation story context according to those instructions. The narrative construction module 200 is configured to author narrative content within the animation story context according to narrative key elements and a narrative paradigm. The multi-modal driving module 300 is configured to collect the creator's multi-modal action interaction information and map it onto the body motion state of a character object in the animation story context, thereby completing intelligent multi-modal animation creation.
The scenario design module lets the creator quickly build an animated narrative context. The authoring system provides a large library of picture materials and collage resources, allowing the creator to freely place in the scene what they have imagined. The scenario design module is intuitive to operate, so a creator can build a wide variety of animation scenes without drawing skills or experience with professional software.
With the narrative construction module, the creator configures narrative key elements such as characters, items, story chains and actions to create narrative content within the designed context. The animation creation system adopts a simplified design approach, using the narrative paradigms of the four-panel comic and the stage play to guide the creator in building a story line.
The multi-modal driving module captures and records the creator's body-based multi-modal interaction information, which derives from the creator's intuitive actions and most faithfully expresses the motion they intend. After acquiring the data, the multi-modal driving module further cleans and filters it and maps it onto the body motion state of the animated character.
Optionally, in one embodiment of the present application, the scenario design module includes: a 3D scene rendering unit for constructing a three-dimensional rendering space and presenting the animation story context; a scene object selection unit for classifying scene objects in the three-dimensional rendering space and letting the user select scene objects corresponding to the authoring intention; and a scene object operation unit for providing a manipulation interaction interface with which to adjust the spatial position of scene objects in the three-dimensional rendering space.
The 3D scene rendering unit is a three-dimensional rendering space built on WebGL, in which all dynamic graphic content is presented. The scene object selection unit divides the elements of the three-dimensional rendering space into four categories; the creator selects scene objects matching the authoring intention from the database and drags them into the scene. The scene object operation unit adjusts the left-right, up-down and depth position of a scene object through a specially designed manipulation interaction interface, handling the placement of every object in the scene so that the creator has fine control over scene objects.
The 3D scene rendering unit includes an orthographic projection mode, a tilted-camera depth display mode and a scene layering mode. Orthographic projection is the projection mode used in the system: it removes the near/far size difference between front and back objects in the scene, so the user's cognitive attention is focused on left-right positions within the same plane and the cognitive load of scene complexity is reduced. The left-right direction of the projection space is the X direction of the renderer's world coordinates, the up-down direction is the Y direction of the renderer's world coordinates, and the front-back depth direction is the Z direction of the renderer's world coordinates. On top of the orthographic projection mode, the system sets the pitch of the renderer's camera to -20 degrees, and all scene objects are correspondingly deflected by -20 degrees about the ground axis, so that both vertical position and depth remain visible when objects are placed under the orthographic view. The layered-scene background consists of two mutually perpendicular pictures: a vertical one representing the distant sky and a horizontal one representing the near ground. Some backgrounds have no horizontal ground plane, such as sky or underwater scenes; in these scenes gravity does not act on characters and props. Foreground elements are distributed at various positions in the scene and can be freely specified and edited by the creator.
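The projection setup above can be sketched as follows, assuming Three.js as the WebGL layer (the patent only specifies WebGL; the library choice, and every numeric value other than the -20° pitch, are illustrative assumptions).

```typescript
import * as THREE from 'three';

// Orthographic camera with a -20° pitch, as described for the depth display mode.
function createStageCamera(viewWidth: number, viewHeight: number): THREE.OrthographicCamera {
  const halfW = viewWidth / 2;
  const halfH = viewHeight / 2;

  // Orthographic projection: objects keep the same size regardless of depth,
  // so the user's attention stays on left/right (X) and up/down (Y) placement.
  const camera = new THREE.OrthographicCamera(-halfW, halfW, halfH, -halfH, 0.1, 1000);

  // Tilt the camera pitch to -20 degrees so that depth (Z) placement still
  // shows up as a slight vertical offset under the orthographic view.
  camera.rotation.x = THREE.MathUtils.degToRad(-20);
  camera.position.set(0, 10, 100);
  return camera;
}

// Layered background: a vertical plane for the distant sky and a horizontal
// plane for the near ground, perpendicular to each other.
function createLayeredBackground(scene: THREE.Scene, skyTex: THREE.Texture, groundTex: THREE.Texture): void {
  const sky = new THREE.Mesh(
    new THREE.PlaneGeometry(200, 100),
    new THREE.MeshBasicMaterial({ map: skyTex }),
  );
  sky.position.set(0, 50, -100);      // far vertical plane

  const ground = new THREE.Mesh(
    new THREE.PlaneGeometry(200, 200),
    new THREE.MeshBasicMaterial({ map: groundTex }),
  );
  ground.rotation.x = -Math.PI / 2;   // lie flat, perpendicular to the sky plane
  scene.add(sky, ground);
}
```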
As shown in fig. 2, the scene object selection unit classifies scene objects into four classes: background objects, foreground objects, character objects and prop objects. A background object is a selectable texture located farthest back in the scene, and a scene has a single background object. A foreground object is a graphic texture that is small relative to the background; foreground objects can be selected and dragged into the scene repeatedly to decorate it, and their depth position can be adjusted to create a sense of depth. Character objects are movable character figures that can be driven by the creator and placed anywhere in the scene without limitation. Prop objects are scene objects that can be driven by the creator and have their own special functions; they too can be placed anywhere in the scene without limitation.
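A minimal data model for these four categories might look like the sketch below; the type and field names are assumptions for illustration, not terms from the patent.

```typescript
// Illustrative data model for the four scene-object categories.
type SceneObjectKind = 'background' | 'foreground' | 'character' | 'prop';

interface SceneObject {
  kind: SceneObjectKind;
  textureUrl: string;
  position: { x: number; y: number; z: number };  // z = depth layer
}

interface CharacterObject extends SceneObject {
  kind: 'character';
  onStage: boolean;          // on-stage / off-stage state, see the staging system below
  recordedClips: string[];   // ids of pose animations recorded per shot
}

// A scene holds exactly one background and any number of the other kinds.
interface AnimationScene {
  background: SceneObject & { kind: 'background' };
  foreground: SceneObject[];
  characters: CharacterObject[];
  props: SceneObject[];
}
```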
Optionally, in one embodiment of the present application, the narrative construction module comprises: a comic-style storyboard unit for dividing the animation into a plurality of shot sections based on the narrative paradigm; a character deployment unit for setting the character objects and background layout in each shot section; and a recording unit for recording the corresponding animation for the character objects in each shot section.
Based on the narrative paradigm of the four-panel comic, the comic-style storyboard unit divides the animation into four fixed shot sections. In each shot section the creator places character objects on the stage; this part is realized by the character deployment unit. After the creator arranges the background layout and scene angle for each shot, the corresponding animation can be recorded for that single shot, which is handled by the recording unit.
To simplify the narrative design process, the comic-style storyboard unit defines four shot units for the user to pick up and use, taking the four-panel comic format that teenagers know and love as the paradigm for animation storyboard design. This restrictive design focuses the user's attention on designing a clever and interesting storyline rather than on shot montage. Shot selection and switching use a tab-style visual design, so that different shots can be selected conveniently in the same window and the differences in characters and scenes between shots can be compared.
Optionally, in an embodiment of the present application, the character deployment unit is further configured to divide the presence space of character objects in the scene into an on-stage mode and an off-stage mode, and to add and remove character objects from the scene by switching between the on-stage mode and the off-stage mode.
The character deployment unit divides the space in which characters exist in the scene into two systems: on-stage and backstage. The purpose is (1) to realize the appearance and disappearance of character objects and prop objects; (2) to let the creator better perceive the dynamic scene environment; and (3) to make scene objects reusable and easy to add and delete. A character that is backstage appears in the scene when its on-stage button is clicked, and otherwise returns backstage. The on-stage and off-stage actions of a character can be recorded by the animation system, realizing the appearance and disappearance of objects. When a backstage character is deleted, its stored animation content is deleted as well. In a newly created shot, all characters of the original shot return backstage to await dispatch by the creator.
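The on-stage/backstage bookkeeping described here can be sketched roughly as follows; the names and data layout are assumptions.

```typescript
// Sketch of the on-stage / backstage state described above.
interface Actor {
  id: string;
  onStage: boolean;
  clipsByShot: Map<number, unknown>;  // recorded animation data per shot
}

function toggleStage(actor: Actor): void {
  // Going on stage makes the actor visible in the scene; going off stage
  // returns it to the backstage list but keeps its recorded clips.
  actor.onStage = !actor.onStage;
}

function deleteActor(actors: Actor[], id: string): Actor[] {
  // Deleting a backstage actor also discards its stored animation content.
  return actors.filter(a => a.id !== id);
}

function enterNewShot(actors: Actor[]): void {
  // In a newly created shot, all actors return backstage awaiting dispatch.
  for (const a of actors) a.onStage = false;
}
```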
The recording unit imitates the UI layout of a video game, avoiding complex function areas and nested panels: only a single flattened UI layer is superimposed on the animation scene rendered in real time. The core record button is located at the lower-left corner, and the content captured by the camera is shown at the upper-right corner. The animation creation system simplifies the recording process: for any character in a single shot the system provides no editing of the recorded result, and the creator can simply delete the recording and start again.
Optionally, in one embodiment of the present application, the multi-modal driving module includes: an AI recognition unit for recognizing the character-object animation recorded by the creator to obtain the creator's multi-modal action interaction information; a motion redirection unit for matching the multi-modal interaction information with the actions of the character object, so as to map the actions in the multi-modal interaction information to the action behaviors specified for the character object; and an animation driving unit for determining the relation between the bones and the motions of the character object, so as to drive the bones to move according to the character object's action behaviors.
The AI recognition unit recognizes the creator's body-action expression information; the system regards the creator's physical multi-modal expression as the primary data source driving the animation. The motion redirection unit processes the data acquired by the AI recognition unit and maps it to the specified action behaviors. The animation driving unit determines the skeleton and motion relationship of the character object, which is the technical basis for driving the virtual character with multi-modal information.
The AI recognition unit comprises two modules: an intelligent algorithm model and data cleaning. The intelligent algorithm model mainly uses the PoseNet pose recognition model. PoseNet is a TensorFlow-based machine learning model that allows real-time human pose estimation in the browser, and it can evaluate either a single pose or multiple poses. Pose estimation is performed in two stages: (1) an RGB image is input and parsed by a convolutional neural network; (2) a single-pose or multi-pose decoding algorithm outputs the decoded poses, pose confidence scores, keypoint positions and keypoint confidence scores. When estimating body pose, PoseNet selects 17 keypoints on the body, covering essentially all the movable joints of an animated character. The keypoint coordinate data output directly by the PoseNet model is affected by the capture quality of the camera and can exhibit obvious jitter and data loss. The raw data is therefore cleaned first: keypoints with a confidence score below 0.2 are discarded, which filters out most overlapping and occluded points. On this basis, the filtered discrete coordinate data is temporally smoothed (currently with a lerp smoothing algorithm).
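A minimal sketch of this cleaning step is shown below: keypoints under the 0.2 confidence threshold mentioned above are discarded, and the rest are temporally smoothed by linear interpolation. The smoothing factor and the data structures are assumptions.

```typescript
interface Keypoint { part: string; score: number; position: { x: number; y: number } }

// Sketch of the data-cleaning step: drop low-confidence keypoints, then
// temporally smooth the survivors with linear interpolation (lerp).
const SCORE_THRESHOLD = 0.2;  // threshold taken from the text
const SMOOTHING = 0.35;       // assumed factor: 0 = keep previous, 1 = jump to new

const lerp = (a: number, b: number, t: number): number => a + (b - a) * t;

function cleanAndSmooth(
  raw: Keypoint[],
  previous: Map<string, { x: number; y: number }>,
): Map<string, { x: number; y: number }> {
  const smoothed = new Map(previous);
  for (const kp of raw) {
    // Discard occluded / overlapping points reported with low confidence.
    if (kp.score < SCORE_THRESHOLD) continue;
    const prev = previous.get(kp.part) ?? kp.position;
    smoothed.set(kp.part, {
      x: lerp(prev.x, kp.position.x, SMOOTHING),
      y: lerp(prev.y, kp.position.y, SMOOTHING),
    });
  }
  return smoothed;
}
```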
Optionally, in one embodiment of the present application, the motion redirection unit is further configured to define the skeleton structure of the character object, define the lines connecting a keypoint of the character object to its child node and its parent node as two bones, calculate the plane angle between the two bones, and transmit that plane angle value to the local rotation value of the target joint, where the skeleton structure of the character object follows a planar body-construction mode.
The motion redirection unit comprises an object bone structure and a data mapping module. The object bone structure defines a planar body-construction model whose features are that (1) compared with rigid bones, the 3D bones support warping and bending of the pictures, and (2) all body parts are distributed without overlap on the same texture map. The aim is to let the same set of bones adapt to as many character images as possible. The first skeleton type adapts poorly across character forms: its skin area is limited to the region occupied by the character's own body parts, and texture replacement can only swap the head and body. The second skeleton type targets a specific class of characters whose bodies are particular objects, such as garbage bags, steamed buns, fried dough sticks, vegetables or fruit, and whose limbs are made of matchsticks. A stick-figure skeleton system is designed so that a designer can complete a replacement simply by swapping the content of the specific object in the texture map; for the differences in hand and foot positions caused by objects of different shapes, the stick-figure system can set different hand and foot distances when the character is built in the engine, without changing the model skin. The third skeleton mode will in the future be adapted to most bipedal human or animal figures and will support larger changes of appearance and clothing. The rigid-body skeleton mode is the most common: it places no specific requirements on the size, length or width of the skinned bones and limbs, so the texture map is bound directly to the skeleton without a skin, and limbs of any aspect ratio can be matched naturally into the character system.
The smoothed keypoint data is processed by the data mapping module and can be used in different interactive applications. Keypoint data is currently used for (1) motion mapping of individual points and (2) overall mapping of the human joints. The original keypoint coordinates are projections of the user's body in the camera screen coordinate system, whereas the bone-length ratios of the various characters in the animation system differ. To obtain a normalized body motion pose so that the user's actions can be redirected to any character image, the lines connecting a keypoint to its child node and its parent node are defined as two bones, the plane angle between the two bones is calculated, and the angle value is transferred to the local rotation value of the target joint.
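The angle-transfer rule described above can be sketched as follows; joint naming and the sign convention are assumptions, and only the 2D (plane) case from the text is handled.

```typescript
type Vec2 = { x: number; y: number };

// Treat the lines from a keypoint to its parent and to its child as two bones,
// measure the planar angle between them, and write it to the target joint's
// local rotation. Because only the angle is transferred (not raw coordinates),
// the motion is independent of the character's bone-length proportions.
function angleBetweenBones(parent: Vec2, joint: Vec2, child: Vec2): number {
  const toParent = Math.atan2(parent.y - joint.y, parent.x - joint.x);
  const toChild = Math.atan2(child.y - joint.y, child.x - joint.x);
  let angle = toChild - toParent;
  // Normalize into (-PI, PI] so the joint takes the shorter rotation.
  while (angle <= -Math.PI) angle += 2 * Math.PI;
  while (angle > Math.PI) angle -= 2 * Math.PI;
  return angle;
}

interface Joint { name: string; localRotation: number }

function retargetJoint(
  joints: Map<string, Joint>,
  keypoints: Map<string, Vec2>,
  jointName: string, parentName: string, childName: string,
): void {
  const p = keypoints.get(parentName);
  const j = keypoints.get(jointName);
  const c = keypoints.get(childName);
  const joint = joints.get(jointName);
  if (!p || !j || !c || !joint) return;  // skip if a keypoint was filtered out
  joint.localRotation = angleBetweenBones(p, j, c);
}
```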
The animation driving unit comprises three modules: model skinning, skeletal animation and an animation state machine. The model skinning module does not contain facial organ components. The skin is continuous at the elbows and knees but disconnected at the shoulders and hips. With the skeleton structure unchanged, the texture map can be replaced directly, so the character system can switch skins completely and without error; however, only part of the area of each body-map component is rendered effectively, and anything beyond that area is not displayed. The hands and feet are left-right symmetric and use the texture of the same area. The overall bipedal skeleton structure approximates a standard humanoid model, but the model requires its rest pose to be deflected slightly to the right so that the character object can switch between facing left and right. The tail extends from the waist joint as an extra bone, and the ears may also be given bone nodes; a skeletal spring system produces the elastic effect of the ears and tail.
Optionally, in an embodiment of the present application, the animation driving unit is further configured to map the character's facial expression through image-based facial recognition and emotion classification.
Under the influence of user interaction, all the bones of the character objects in the context can be transformed in motion. The system designs a composite motion capture scheme. For the facial organs (eyes, mouth, nose, eyebrows, etc.), the character's facial expression is mapped using two input modes: image-based facial recognition (continuous) and emotion classification recognition (discrete states). Character motion is realized by three layers of interaction input. The base layer is the basic displacement and animation states generated by keyboard and mouse operation, such as walking, jumping and squatting; this control scheme resembles the directional controls of a video game, so the user quickly becomes familiar with the animation context. The second layer is trajectory animation based on image recognition: the algorithm recognizes the motion trajectory of a finger and maps it onto the object, and this layer's motion data overrides the data of the layer below. The third layer is image-based pose recognition: the algorithm recognizes the motion of the body skeleton and maps it onto the character object, and this layer's data is remapped with the user experience at its core. The final motion operation mode of the animation system is: the user moves the animated character freely in the scene with the up/down/left/right keys; by raising their hands toward the camera the user can make the animated character fly, and during flight the body movements made by the creator are shown on the animated character's body.
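The three-layer override behaviour described above can be sketched as a simple priority resolver; the layer interface and names are assumptions.

```typescript
// Three motion-input layers: keyboard/mouse base layer, finger-trajectory
// recognition, and full-body pose recognition. A higher layer, when active,
// overrides the layers below it.
type MotionSample = { x: number; y: number; pose?: Map<string, number> };

interface InputLayer {
  priority: number;               // higher wins (base = 0, trajectory = 1, pose = 2)
  sample(): MotionSample | null;  // null when this layer is currently inactive
}

function resolveMotion(layers: InputLayer[]): MotionSample | null {
  let best: { priority: number; sample: MotionSample } | null = null;
  for (const layer of layers) {
    const s = layer.sample();
    if (s !== null && (best === null || layer.priority > best.priority)) {
      best = { priority: layer.priority, sample: s };
    }
  }
  return best ? best.sample : null;
}
```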
By adopting game design methods and multi-modal interaction technology, the system of the application strengthens teenagers' understanding of and interest in animation art, and allows animations to be produced quickly, without constraints of time or place, even without experience with professional tools.
The authoring system of the present application is described in detail by way of one specific embodiment in conjunction with FIG. 3.
1) Project creation
Before using the system, a new project instance should be created by clicking the New button (the background switches to blue sky), or a project instance currently stored in the cloud should be loaded by clicking the Read button. Starting animation editing directly without clicking New or Read may cause a system error. A cloud project also cannot be loaded on top of a newly created project; refresh the page and then click Read. A stored project can be edited at any time: clicking the Save button stores the current progress to the cloud, and clicking Read loads the stored content for further editing. At present no multi-user system has been built; only one saved copy exists in the cloud, and it may be overwritten by other testers.
2) Resource browsing system
The resource browsing system comprises four selectable categories: background, foreground, character and prop. Clicking the background button switches the background style directly. Clicking the foreground button selects the desired foreground pattern. Clicking the character button selects the desired character. Clicking the prop button selects the desired prop.
After an object is generated in the scene, clicking the object activates its movement control (a yellow circle). The large yellow circle indicates the object's position on the virtual ground, the small green circle directly above the object adjusts its height, and the small blue circle at the side adjusts its left-right position.
3) Storyboard window system
Four shots are generated by default in the storyboard interface, and shots are switched via the tab buttons at the top. Each shot can in principle use its own user-defined background + foreground combination; this function is not yet complete, and at present all shots share one background + foreground combination. Clicking the camera button at the lower right enters the single-shot interface, from which one can switch back to the storyboard interface.
4) Timeline system
The timeline of the multi-shot interface consists of four equal-width regions, each mapped to the system's default maximum duration for a single shot (10 s); the portion of a shot that actually contains animation (at most 10 s) is covered by a green rectangular area. The user can drag freely within the green area to view the animation of each shot. Clicking the play button on the right plays or pauses the animation.
The timeline of the single-shot interface is a single time line running from left to right over the default maximum duration of a single shot (10 s); since the animation of a shot never exceeds 10 s, the user can only drag the timeline within the range that contains recorded animation. Clicking the play button on the right plays or pauses the animation.
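A sketch of the timeline mapping: each shot is capped at the 10 s default mentioned above, the green coverage width follows the recorded duration, and dragging is clamped to the recorded range. Function names are assumptions.

```typescript
const MAX_SHOT_SECONDS = 10;  // default maximum duration of a single shot

// Map a drag position (0..1 within one shot's timeline region) to a time,
// clamped to the duration that actually has recorded animation.
function dragToTime(fraction: number, recordedSeconds: number): number {
  const t = fraction * MAX_SHOT_SECONDS;
  const limit = Math.min(recordedSeconds, MAX_SHOT_SECONDS);
  return Math.min(Math.max(t, 0), limit);
}

// Width fraction of the green coverage rectangle for a shot.
function coverageFraction(recordedSeconds: number): number {
  return Math.min(recordedSeconds, MAX_SHOT_SECONDS) / MAX_SHOT_SECONDS;
}
```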
5) Backstage/on-stage system
To make it easy to control each character in the scene, a backstage/on-stage system is designed for selecting the character objects used in each shot. Only character and prop objects are managed by the backstage system. In one animation project, multiple character objects of the same type may be created; in the project shown in the figure above, two piglet character objects are created. Each of them can go on stage in every shot as an independent actor, and actors that are not on stage appear translucent; during animation playback, characters that are not on stage are not shown. The list of all actors participating in the performance and their backstage status can be seen in the backstage system UI.
As shown in the figure above, two piglets in the animation project are on stage: the first is on stage in the first shot and the second in the second shot. The actor currently on stage is highlighted in red in the backstage list. Clicking the stage button on its avatar toggles between the backstage and on-stage states. Clicking the delete button on the avatar removes the actor, and its animation performance data in every shot is deleted as well.
6) Animation recording system
Clicking the camera button in the lower-right corner of the window enters the animation editing interface of the selected shot. The functions of the right-hand UI are: a pose animation recording button, a movement animation recording button, a save-animation button and a discard-animation button.
The recording flow for the pose animation of a character in a given shot is as follows: (1) select the animation object; (2) drag to the point in time at which the animation should start recording. The starting point must lie within the time range that already has recorded animation, i.e. within the range over which the timeline can be dragged; if the starting point lies somewhere in the middle of a recorded animation, the new animation segment will overwrite the old one. (3) Click the pose animation recording button. Once recording starts, the timeline advances with time, and the pose motions captured by the camera are recorded into a temporary animation data object (animationData_new). (4) Click the pose animation recording button again to stop recording.
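A rough sketch of this recording flow, including the overwrite-on-overlap behaviour of step (2); apart from the temporary buffer (called animationData_new in the text), the class and field names are assumptions.

```typescript
interface PoseFrame { time: number; jointRotations: Record<string, number> }

class ShotRecorder {
  private clip: PoseFrame[] = [];       // stored animation for this shot
  private pending: PoseFrame[] = [];    // temporary buffer ("animationData_new")

  addFrame(frame: PoseFrame): void {
    // Called for each camera-captured pose while recording is active.
    this.pending.push(frame);
  }

  stopAndCommit(): void {
    if (this.pending.length === 0) return;
    const start = this.pending[0].time;
    const end = this.pending[this.pending.length - 1].time;
    // The new segment overwrites any old frames it overlaps.
    this.clip = this.clip
      .filter(f => f.time < start || f.time > end)
      .concat(this.pending)
      .sort((a, b) => a.time - b.time);
    this.pending = [];
  }

  discard(): void {
    // No in-place editing: the creator deletes and re-records instead.
    this.pending = [];
  }
}
```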
The intelligent multi-modal animation creation system provided by the embodiments of the application adopts game design methods and multi-modal interaction technology, so that teenagers' understanding of and interest in animation art are strengthened. Target users can quickly produce animation works, without constraints of time or place, even without experience with professional tools.
Next, an intelligent multi-modal animation authoring method according to an embodiment of the present application will be described with reference to the accompanying drawings.
FIG. 4 is a flow chart of an intelligent multi-modal animation authoring method in accordance with an embodiment of the present application.
As shown in fig. 4, the intelligent multi-modal animation authoring method includes the steps of:
step S101, generating an authoring operation instruction according to the authoring intention of an author, and building an animation story generating situation according to the authoring operation instruction.
Optionally, in one embodiment of the present application, generating the authoring operation instruction according to the authoring intention of the author includes: and selecting a scene object corresponding to the authoring intention in a preset material library according to the authoring intention of the creator, and generating an authoring operation instruction by freely placing or and/or collaging the scene object.
Based on the game interactive interface, the user can quickly design and build the imagined animation story generation situation in the brain.
Step S102, creating the narrative content in the animation story occurrence situation according to the narrative key elements and the narrative paradigm.
The narrative key elements comprise at least one of character objects, items, story chains, and character actions. The narrative paradigm includes at least one of four-grid comics and stage plays.
Embodiments of the present application provide a set of interactive metaphors based on four-lattice comics and stage shows that assist users in constructing storyline and narrative chains in context.
Step S103, collecting multi-modal action interaction information of the creator, and mapping the multi-modal action interaction information into the body motion state of the character object in the animation story occurrence situation to complete the creation of the intelligent multi-modal painting.
According to the embodiment of the application, through a series of multi-mode interaction interfaces, the body expression of the user is converted into the character motion, and the recording of the animation narrative is realized.
It should be noted that the foregoing explanation of the embodiment of the intelligent multi-modal animation authoring system is also applicable to the intelligent multi-modal animation authoring method of the embodiment, and will not be repeated herein.
The intelligent multi-modal animation creation method provided by the embodiments of the application adopts game design methods and multi-modal interaction technology, so that teenagers' understanding of and interest in animation art are strengthened. Target users can quickly produce animation works, without constraints of time or place, even without experience with professional tools.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or N embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "N" is at least two, such as two, three, etc., unless explicitly defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more N executable instructions for implementing specific logical functions or steps of the process, and further implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.
Claims (10)
1. An intelligent multimodal animation authoring system comprising:
a scenario design module for generating authoring operation instructions according to the creator's authoring intention and building an animation story context according to those instructions;
a narrative construction module for creating narrative content within the animation story context according to narrative key elements and a narrative paradigm;
and a multi-modal driving module for collecting the creator's multi-modal action interaction information and mapping it onto the body motion state of a character object in the animation story context, thereby completing intelligent multi-modal animation creation.
2. The system of claim 1, wherein the scenario design module comprises:
a 3D scene rendering unit for constructing a three-dimensional rendering space and presenting the animation story context;
a scene object selection unit for classifying the scene objects in the three-dimensional rendering space and letting the user select scene objects corresponding to the authoring intention;
and a scene object operation unit for providing a manipulation interaction interface with which to adjust the spatial position of scene objects in the three-dimensional rendering space.
3. The system of claim 2, wherein the scene object selection unit is specifically configured to divide scene objects into background objects, foreground objects, character objects and prop objects, a background object being a selectable texture located farthest back in the scene; a foreground object being one of a number of graphic textures smaller than the background; a character object being a movable character image in the narrative content; and a prop object being a scene object with a special function.
4. The system of claim 1, wherein the narrative construction module comprises:
a comic-style storyboard unit for dividing the animation into a plurality of shot sections based on the narrative paradigm;
a character deployment unit for setting the character objects and background layout in each shot section;
and a recording unit for recording the corresponding animation for the character objects in each shot section.
5. The system of claim 4, wherein the character deployment unit is further configured to divide the presence space of character objects in the scene into an on-stage mode and an off-stage mode, and to add and remove character objects from the scene by switching between the on-stage mode and the off-stage mode.
6. The system of claim 1, wherein the multi-modal driving module comprises:
an AI recognition unit for recognizing the character-object animation recorded by the creator to obtain the creator's multi-modal action interaction information;
a motion redirection unit for matching the multi-modal interaction information with the actions of the character object, so as to map the actions in the multi-modal interaction information to the action behaviors specified for the character object;
and an animation driving unit for determining the relation between the bones and the motions of the character object, so as to drive the bones to move according to the character object's action behaviors.
7. The system of claim 6, wherein the motion redirection unit is further configured to define the skeleton structure of the character object, define the lines connecting a keypoint of the character object to its child node and its parent node as two bones, calculate the plane angle between the two bones, and transfer that plane angle value to the local rotation value of a target joint, wherein the skeleton structure of the character object follows a planar body-construction model.
8. The system of claim 6, wherein the animation driving unit is further configured to map the character's facial expression through image-based facial recognition and emotion classification.
9. An intelligent multi-modal animation authoring method comprising the steps of:
generating authoring operation instructions according to the creator's authoring intention, and building an animation story context according to those instructions;
creating narrative content within the animation story context according to narrative key elements and a narrative paradigm;
and collecting the creator's multi-modal action interaction information and mapping it onto the body motion state of a character object in the animation story context, thereby completing intelligent multi-modal animation creation.
10. The method of claim 9, wherein generating the authoring operation instructions according to the creator's authoring intention comprises:
selecting scene objects corresponding to the authoring intention from a preset material library according to the creator's authoring intention, and generating the authoring operation instructions by freely placing and/or collaging the scene objects.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310181768.2A CN116342763A (en) | 2023-02-21 | 2023-02-21 | Intelligent multi-mode animation creation system and creation method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310181768.2A CN116342763A (en) | 2023-02-21 | 2023-02-21 | Intelligent multi-mode animation creation system and creation method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116342763A true CN116342763A (en) | 2023-06-27 |
Family
ID=86881465
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310181768.2A Pending CN116342763A (en) | 2023-02-21 | 2023-02-21 | Intelligent multi-mode animation creation system and creation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116342763A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117034385A (en) * | 2023-08-30 | 2023-11-10 | 四开花园网络科技(广州)有限公司 | AI system supporting creative design of humanoid roles |
CN117034385B (en) * | 2023-08-30 | 2024-04-02 | 四开花园网络科技(广州)有限公司 | AI system supporting creative design of humanoid roles |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||