CN117475049B - Virtual image adaptation method and system - Google Patents

Virtual image adaptation method and system

Info

Publication number
CN117475049B
Authority
CN
China
Prior art keywords
vector
layout
outputting
data
keyword
Prior art date
Legal status
Active
Application number
CN202311799886.6A
Other languages
Chinese (zh)
Other versions
CN117475049A (en)
Inventor
杨海宁
邓泽西
栾德龙
张丙锐
Current Assignee
One Station Development Beijing Cloud Computing Technology Co ltd
Original Assignee
One Station Development Beijing Cloud Computing Technology Co ltd
Priority date
Filing date
Publication date
Application filed by One Station Development Beijing Cloud Computing Technology Co ltd
Priority to CN202311799886.6A
Publication of CN117475049A
Application granted
Publication of CN117475049B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 13/00 Animation
    • G06T 13/20 3D [Three Dimensional] animation
    • G06T 13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N 3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/02 Knowledge representation; Symbolic representation
    • G06N 5/022 Knowledge engineering; Knowledge acquisition

Abstract

The invention relates to an avatar adaptation method and system in the field of virtual scene data adaptation, comprising the steps of: acquiring script data; capturing action data and first keyword data in the script data; constructing a synthetic vector from the action data and the first keyword data and placing the synthetic vector in a relational layout; and outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies. By fusing the action data and the first keyword data of the script data into a single synthetic vector and mapping that vector onto a relational layout to obtain a position result, the invention can output a different avatar for different script data, so that the avatar better fits the content of the script and the user's viewing experience is improved.

Description

Virtual image adaptation method and system
Technical Field
The present invention relates to the field of virtual scene data adaptation, and in particular, to a method and a system for adapting an avatar.
Background
On virtual platforms, people commonly configure an avatar as the visible embodiment of an artificial intelligence session, so that the user appears to be talking with the avatar. A conventional avatar, however, is single and fixed: whether it expresses happiness, makes an expression, or speaks, its appearance stays the same and cannot be adapted to the specific content being expressed.
In current virtual dialogue scenes, after the user issues a voice instruction, the virtual platform generates a reply with an artificial intelligence algorithm, and each reply corresponds to script data to be read aloud. When the avatar reads that script data, it delivers the entire dialogue with a single appearance and even a single intonation, speed, and loudness. The user therefore gains no sense of immersion from the avatar, and the avatar's appearance does not adapt to the content the script data expresses.
Accordingly, there is a need for an avatar adaptation method and system capable of adapting different avatars to the scenario data, so as to increase the user's sense of identification and immersion.
Disclosure of Invention
The invention aims to provide an avatar adaptation method and system capable of adapting different avatars to the scenario data, so as to increase the user's sense of identification and immersion.
The invention relates to an avatar adaptation method, which comprises the following steps:
Acquiring script data;
capturing action data and first keyword data in the script data;
constructing a synthetic vector by utilizing the action data and the first keyword data, and configuring the synthetic vector in a relational layout;
outputting an avatar corresponding to the region where the synthetic vector is located in the relation layout;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises:
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds a first threshold, and if so, expanding the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection.
In the avatar adaptation method of the invention, it is further judged whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection is reduced; it is then judged whether the end point of the synthetic vector lies within a region of the relational layout, and if so, the avatar corresponding to that region is output, while if not, the avatar of the region whose midpoint is closest to the end point of the synthetic vector is output.
In the avatar adaptation method of the invention, it may likewise be judged whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection is reduced; it is then judged whether the end point of the synthetic vector lies within a region of the relational layout, and if so, the avatar corresponding to that region is output, while if not, the avatar of the region whose edge is closest to the end point of the synthetic vector is output.
In the avatar adaptation method of the invention, outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies comprises:
judging whether the end point of the synthetic vector lies in two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; and if not, outputting the avatar corresponding to the region with the larger range.
In the avatar adaptation method of the invention, outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies further comprises the steps of:
pre-storing, for the avatar, dialogue voice with a reference tone, a reference speed, and a reference loudness;
constructing a first triangle using the motion data vector, the first keyword vector, and the synthetic vector, and determining the percentage d1 of the first triangle's area that lies in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant;
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if the X-axis, outputting a synthesized tone according to the following formula: synthesized tone = reference tone x 2^((d1 + d2 - d3 - d4)/c), where c is an adjustment coefficient; if the Y-axis, outputting a synthesized speed according to the following formula: synthesized speed = reference speed x 2^((d1 + d4 - d2 - d3)/c), where c is an adjustment coefficient; and if the sum does not exceed the first threshold, outputting a synthesized loudness according to the following formula: synthesized loudness = reference loudness x e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed, and the synthesized loudness.
The invention also relates to an avatar adaptation system, which includes:
an input module for acquiring script data and capturing the action data and first keyword data in the script data;
a vector synthesis module for constructing a synthetic vector from the action data and the first keyword data and placing the synthetic vector in a relational layout; and
an output module for outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises:
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds a first threshold, and if so, expanding the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection.
In the avatar adaptation system of the invention, it is judged whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection is reduced; it is then judged whether the end point of the synthetic vector lies within a region of the relational layout, and if so, the avatar corresponding to that region is output, while if not, the avatar of the region whose midpoint is closest to the end point of the synthetic vector is output.
In the avatar adaptation system of the invention, it may likewise be judged whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection is reduced; it is then judged whether the end point of the synthetic vector lies within a region of the relational layout, and if so, the avatar corresponding to that region is output, while if not, the avatar of the region whose edge is closest to the end point of the synthetic vector is output.
In the avatar adaptation system of the invention, outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies comprises:
judging whether the end point of the synthetic vector lies in two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; and if not, outputting the avatar corresponding to the region with the larger range.
In the avatar adaptation system of the invention, outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies further comprises the steps of:
pre-storing, for the avatar, dialogue voice with a reference tone, a reference speed, and a reference loudness;
constructing a first triangle using the motion data vector, the first keyword vector, and the synthetic vector, and determining the percentage d1 of the first triangle's area that lies in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant;
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if the X-axis, outputting a synthesized tone according to the following formula: synthesized tone = reference tone x 2^((d1 + d2 - d3 - d4)/c), where c is an adjustment coefficient; if the Y-axis, outputting a synthesized speed according to the following formula: synthesized speed = reference speed x 2^((d1 + d4 - d2 - d3)/c), where c is an adjustment coefficient; and if the sum does not exceed the first threshold, outputting a synthesized loudness according to the following formula: synthesized loudness = reference loudness x e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed, and the synthesized loudness.
The invention differs from the prior art in that the avatar adaptation method fuses the action data and the first keyword data of the script data into a single synthetic vector and then maps that vector onto a relational layout to obtain a position result, so that a corresponding avatar can be output for different script data; the avatar thereby better fits the content of the script, and the user's viewing experience is improved.
An avatar adaptation method of the present invention will be further described with reference to the accompanying drawings.
Drawings
FIG. 1 is a schematic illustration of a first state of a relational layout of an avatar adaptation method;
FIG. 2 is a schematic diagram of a second state of a relational layout of an avatar adaptation method;
fig. 3 is a flowchart illustrating an avatar adaptation method.
Detailed Description
Referring to figs. 1 and 3, the avatar adaptation method of the present invention includes:
Acquiring script data;
capturing action data and first keyword data in the script data;
constructing a synthetic vector by utilizing the action data and the first keyword data, and configuring the synthetic vector in a relational layout;
outputting an avatar corresponding to the region where the synthetic vector is located in the relation layout;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises:
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds a first threshold, and if so, expanding the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection.
By fusing the action data and the first keyword data of the script data into a single synthetic vector and mapping that vector onto a relational layout to obtain a position result, the invention can output a different avatar for different script data, so that the avatar better fits the content of the script and the user's viewing experience is improved.
Each relational layout region has a preset X-axis and Y-axis coordinate system with four quadrants, similar to the four quadrants of a conventional XY coordinate system.
Expanding the range may turn a circular region into its circumscribing triangle, circumscribing square, or circumscribing hexagon. Conversely, the expansion may turn a triangle, square, or hexagon into its circumscribed circle. Alternatively, the region may be expanded about its center by a factor a of its area, where a may be 0.1% to 1000%; a is preferably the ratio of the shorter of the two overlapping projection lengths of the motion data vector and the first keyword vector to the longer one, and may further preferably be 10%.
That is, by comparing the sum of the overlapping projection lengths of the motion data vector and the first keyword vector on the X-axis and the Y-axis against the first threshold, the invention can gauge the degree of contradiction between the two vectors in a given dimension. If the sum exceeds the first threshold, the two vectors are judged to be in greater contradiction, and the range of the corresponding relational layout region is expanded accordingly. This reduces the positional deviation of the final synthetic vector caused by the contradiction and makes it more likely that the synthetic vector lands in the region corresponding to the vector with the longer overlapping projection.
For example, as shown in fig. 2, the relational layout includes, from the inside out, a P region, a Q region, an R region, an S region, and a T region.
Of course, the avatars may also be different expressions of the same character; that is, by pre-storing different expressions, the same base avatar can be displayed differently for different regions.
In the coordinate system, the X-axis represents emotional happiness or sadness, and the Y-axis represents positivity or negativity. The database stores a motion data vector and a first keyword vector corresponding to each piece of motion data and first keyword data. For example, the action data in fig. 2 is "listen to songs", whose pre-stored vector is (4, 3), a generally happy and positive term; the first keyword data in fig. 2 is "recognition spectrum", whose pre-stored vector is (1, 4), also a generally happy, positive term. Their synthetic vector is then (5, 7), which lies within the R region of fig. 2. The first threshold length may be 0.1 to 100000, preferably 3.
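This lookup-and-compose step can be sketched as follows (a minimal illustration in Python; the vocabulary table and its entries are assumptions for the example, as the patent only states that the database pre-stores one vector per word):

```python
# Sketch of composing the synthetic vector from pre-stored word vectors.
# WORD_VECTORS is a hypothetical stand-in for the database described above.
WORD_VECTORS = {
    "listen to songs": (4.0, 3.0),       # action data: happy, positive
    "recognition spectrum": (1.0, 4.0),  # first keyword data: happy, positive
}

def synthetic_vector(action_word: str, keyword: str) -> tuple[float, float]:
    """Place the motion data vector at the origin and the first keyword
    vector tip-to-tail after it; return the synthetic vector's end point."""
    ax, ay = WORD_VECTORS[action_word]
    kx, ky = WORD_VECTORS[keyword]
    return (ax + kx, ay + ky)

print(synthetic_vector("listen to songs", "recognition spectrum"))  # (5.0, 7.0)
```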
The invention constructs the synthetic vector by utilizing the action data and the first keyword data.
The character interface of the invention selects the intelligent digital human model corresponding to the script and supports adjusting the character's orientation, position, and size. The invention also supports sound configuration.
Because of the tip-to-tail construction of the vectors, the sum of the X-axis overlap length and the Y-axis overlap length can be understood as the overlap length on one of the two axes, since the two vectors can only overlap on the X-axis or on the Y-axis. Comparing this length with a first threshold yields a useful test; the first threshold may be, for example, the length of the pre-stored vector for the word "hello", or the diagonal length or side length of the P region in fig. 2. In general the first threshold serves as a common criterion: if the overlap length exceeds it, most of the combined length is overlap, the synthetic vector is short, and the motion data vector and the first keyword vector contradict each other.
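One plausible reading of the "overlapping projection length", given the tip-to-tail construction described below, is the length of the interval shared by the two vectors' projections on an axis, which is non-zero only where the second vector backtracks against the first. The sketch below encodes that reading; it is an interpretation, not the patent's verbatim definition:

```python
def overlap_on_axis(m: float, k: float) -> float:
    """Overlap of the projection intervals of two tip-to-tail vectors on one
    axis: the motion component spans [0, m], the keyword component spans
    [m, m + k]; the intervals only overlap when k points back against m."""
    a_lo, a_hi = min(0.0, m), max(0.0, m)
    b_lo, b_hi = min(m, m + k), max(m, m + k)
    return max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))

def total_overlap(motion, keyword) -> float:
    """Sum of the X-axis and Y-axis overlap lengths, to be compared with
    the first threshold (3 in the example above)."""
    return (overlap_on_axis(motion[0], keyword[0])
            + overlap_on_axis(motion[1], keyword[1]))

# A hypothetical sad keyword (-2, 1) contradicts "listen to songs" (4, 3)
# on the X (happiness) axis: the X projections overlap by 2.
print(total_overlap((4.0, 3.0), (-2.0, 1.0)) > 3.0)  # False (overlap is 2.0)
```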
The term "constructing a synthetic vector using the action data and the first keyword data and configuring the synthetic vector in a relational layout" is understood to mean that the start point of the synthetic vector is placed at the origin of the relational layout's coordinate system. Specifically, the start point of the motion data vector is placed at the origin, the start point of the first keyword data vector is placed at the end point of the motion data vector, and the end point of the first keyword data vector coincides with the end point of the synthetic vector.
Referring to figs. 2 and 3, as a further explanation of the invention, the judging unit judges whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, reduces the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection; it then judges whether the end point of the synthetic vector lies within a region of the relational layout, and if so, outputs the avatar corresponding to that region, while if not, outputs the avatar of the region whose midpoint is closest to the end point of the synthetic vector.
The invention can thus expand a region's range when the first threshold is exceeded and reduce it when the threshold is not exceeded. If, after reduction, the end point of the synthetic vector no longer falls inside any region, the nearest region is taken as its landing region, with "nearest" defined by the region midpoint closest to the end point of the synthetic vector; the influence of region size on the finally output avatar is thereby taken into account.
For example, of two regions, one may have an edge near the end point of the synthetic vector but a midpoint farther away; under this rule, the avatar of the region whose midpoint is nearer to the end point is output.
The "longer one of the overlapping projection lengths of the motion data vector and the first keyword vector" can be understood as follows: in fig. 2, the Y-axis projection of the motion data vector is longer than that of the first keyword vector, so the motion data vector is the longer one there; in fig. 1, the X-axis projection of the motion data vector is longer than that of the first keyword vector, so the motion data vector is again the longer one. Within the relational layout, some regions correspond to the motion data vector and some to the first keyword vector, and it is the region corresponding to the longer vector that is reduced. "Reducing the region's range" can in turn be understood as shrinking the area by 10% in equal proportion about the same center point, or as replacing a circular region with its inscribed square or triangle, or a square or triangular region with its inscribed circle.
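A minimal sketch of the equal-proportion reduction for a circular region, assuming the 10% figure refers to area so that the radius scales by the square root of 0.9 (the inscribed-shape variants would swap the geometry instead of rescaling it):

```python
import math
from dataclasses import dataclass

@dataclass
class CircleRegion:
    cx: float   # centre x
    cy: float   # centre y
    r: float    # radius
    resized: bool = False  # remembers that the region was expanded/reduced

    def shrink_area(self, fraction: float = 0.10) -> "CircleRegion":
        """Reduce the area by `fraction` about the same centre point;
        the radius therefore scales by sqrt(1 - fraction)."""
        return CircleRegion(self.cx, self.cy,
                            self.r * math.sqrt(1.0 - fraction), resized=True)

p_region = CircleRegion(0.0, 0.0, 2.0)
print(round(p_region.shrink_area().r, 3))  # 1.897
```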
"Judging whether the end point position of the synthetic vector is located in a region of the relational layout" refers to the end point of the synthetic vector, i.e., the arrow tip of the synthetic vector in fig. 1 or fig. 2.
Each region of the relational layout stores the avatar corresponding to that region.
Referring to figs. 2 and 3, as a modification of the invention, the judging unit may judge whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, reduce the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection; it then judges whether the end point of the synthetic vector lies within a region of the relational layout, and if so, outputs the avatar corresponding to that region, while if not, outputs the avatar of the region whose edge is closest to the end point of the synthetic vector.
The invention can thus expand a region's range when the first threshold is exceeded and reduce it when the threshold is not exceeded. If, after reduction, the end point of the synthetic vector no longer falls inside any region, the nearest region is taken as its landing region, with "nearest" defined by the region edge closest to the end point of the synthetic vector; the influence of region size on the finally output avatar is thereby taken into account.
The edge may be taken as any point on it; in other words, treating each edge as containing infinitely many points, the avatar of the region whose edge contains the point closest to the end point of the synthetic vector is output.
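The two fallback rules (nearest midpoint in one embodiment, nearest edge in the other) can be sketched for circular regions as follows; for a circle, the distance from a point to the edge is the absolute difference between the distance to the centre and the radius. The region values are illustrative:

```python
import math

Region = tuple[float, float, float]  # (centre_x, centre_y, radius), circles assumed

def _dist(p: tuple[float, float], q: tuple[float, float]) -> float:
    return math.hypot(p[0] - q[0], p[1] - q[1])

def nearest_by_midpoint(end: tuple[float, float], regions: list[Region]) -> Region:
    """Midpoint rule: the region whose centre is closest to the end point wins."""
    return min(regions, key=lambda r: _dist(end, (r[0], r[1])))

def nearest_by_edge(end: tuple[float, float], regions: list[Region]) -> Region:
    """Edge rule: the region whose boundary is closest to the end point wins."""
    return min(regions, key=lambda r: abs(_dist(end, (r[0], r[1])) - r[2]))

# The two rules can disagree, which is why the embodiments are distinct:
regions = [(0.0, 0.0, 6.0), (8.0, 0.0, 1.0)]
print(nearest_by_midpoint((6.05, 0.0), regions))  # (8.0, 0.0, 1.0): nearer centre
print(nearest_by_edge((6.05, 0.0), regions))      # (0.0, 0.0, 6.0): nearer boundary
```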
Referring to figs. 2 and 3, as a further explanation of the invention, outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies comprises the steps of:
judging whether the end point of the synthetic vector lies in two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; and if not, outputting the avatar corresponding to the region with the larger range.
The region in which the end point of the synthetic vector falls can be understood as follows. The regions of the relational layout are not necessarily mutually disjoint: each region is essentially a preset regular shape such as a rectangle, triangle, or circle, and to avoid blank positions belonging to no region, two or more regions may be configured to overlap at a given coordinate. The end point of the synthetic vector may therefore land in two or more regions of the relational layout, or otherwise in a single region, provided every position of the layout is covered by at least one region. If, in a special case, the end point lands in no region at all, the region whose midpoint or edge is nearest to the end point is taken as its region, and the avatar corresponding to that region is output.
"Outputting the larger of the regions whose range was expanded or reduced" is understood as follows: if only one region was expanded or reduced, it is taken as the larger one; if two regions were expanded, or two were reduced, or one was expanded and the other reduced, the region that ends up larger after the resizing is output.
That is, if the end point of the synthetic vector falls within the overlap of an expanded region and an unexpanded region, the expanded region is determined to be its landing region.
"Outputting the avatar corresponding to the region of the relational layout" likewise follows the nearest-midpoint or nearest-edge principle. In other words, the end point of the synthetic vector does not necessarily fall inside a region of the relational layout; when it falls outside, the region with the nearest midpoint or edge is found, and the corresponding avatar is output.
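A sketch of this tie-break, assuming each candidate region containing the end point reports its area and whether its range was expanded or reduced (the field names are illustrative):

```python
def pick_region(candidates: list[tuple[float, bool, str]]) -> tuple[float, bool, str]:
    """candidates: (area, was_resized, avatar_id) for each region that
    contains the synthetic vector's end point.  Expanded/reduced regions
    take priority; within the chosen group the larger area wins."""
    resized = [c for c in candidates if c[1]]
    pool = resized if resized else candidates
    return max(pool, key=lambda c: c[0])

# An expanded region beats an untouched one even when the latter is larger:
print(pick_region([(9.0, False, "calm"), (4.0, True, "happy")]))  # (4.0, True, 'happy')
```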
Referring to figs. 2 and 3, as a further explanation of the invention, the step of outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies includes:
pre-storing, for the avatar, dialogue voice with a reference tone, a reference speed, and a reference loudness;
constructing a first triangle using the motion data vector, the first keyword vector, and the synthetic vector, and determining the percentage d1 of the first triangle's area that lies in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant;
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if the X-axis, outputting a synthesized tone according to the following formula: synthesized tone = reference tone x 2^((d1 + d2 - d3 - d4)/c), where c is an adjustment coefficient; if the Y-axis, outputting a synthesized speed according to the following formula: synthesized speed = reference speed x 2^((d1 + d4 - d2 - d3)/c), where c is an adjustment coefficient; and if the sum does not exceed the first threshold, outputting a synthesized loudness according to the following formula: synthesized loudness = reference loudness x e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed, and the synthesized loudness.
In this way, the proportions in which the first triangle, constructed from the motion data vector, the first keyword vector, and the synthetic vector, occupies the four quadrants of the coordinate system serve as adjustment indexes for the avatar's dialogue voice: intonation is adjusted to express whether the avatar is happy, speech speed to express whether the dialogue is positive, and loudness to express whether the avatar's attitude is affirmative.
If no condition is triggered, the synthesized tone is the reference tone, the synthesized speed is the reference speed, and the synthesized loudness is the reference loudness.
In adjusting the tone of the reference audio, d1 + d2 + d3 + d4 = 100%.
The adjustment coefficient c can be obtained from the value pre-stored for the avatar in the database. The adjustment coefficient c may be 0.1 to 5, for example 3. For a younger avatar, c is relatively smaller, i.e., the degree of change of the synthesized tone is greater, fitting a livelier, less staid character; for an older avatar, c is relatively larger, i.e., the degree of change of the synthesized tone is smaller, fitting a steadier character. The adjustment coefficient c may preferably be 1.
Here, synthesized tone = reference tone x 2^((d1 + d2 - d3 - d4)/c). In this formula, d1 + d2 has a maximum value of 100% and d3 + d4 has a maximum value of 100%. Since people's speaking pitch generally lies between 100 Hz and 300 Hz, the formula adjusts the reference tone up or down according to how happy the expressed content is, so that the adjusted synthesized tone is more emotionally expressive and better matches the meaning represented by the triangle's area proportions in the four quadrants.
For example, with a reference tone of 150 Hz, the formula maps the tone into the range of 75 to 300 Hz, reflecting degrees of happiness.
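The published formula is rendered as an image in the source; the form used here, synthesized tone = reference tone x 2^((d1 + d2 - d3 - d4)/c), is an inference from the stated constraints (with c = 1 a 150 Hz reference spans exactly 75 to 300 Hz, and a larger c damps the change) rather than a verbatim quotation. A sketch under that assumption:

```python
def synthesized_tone(reference_hz: float, d1: float, d2: float,
                     d3: float, d4: float, c: float = 1.0) -> float:
    """Reconstructed tone formula (an inference, see the note above).
    d1..d4 are the triangle's per-quadrant area shares as fractions that
    sum to 1; the happy half (d1 + d2) raises pitch, the sad half lowers it."""
    return reference_hz * 2.0 ** ((d1 + d2 - d3 - d4) / c)

# The extremes reproduce the 150 Hz worked example:
print(synthesized_tone(150.0, 1.0, 0.0, 0.0, 0.0))  # 300.0, all area on the happy side
print(synthesized_tone(150.0, 0.0, 0.0, 1.0, 0.0))  # 75.0, all area on the sad side
```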
In the coordinate system, the X-axis represents emotional happiness or sadness, and the Y-axis represents positivity or negativity. When the proportion of the triangle in the first and second quadrants, on the right, is large, the reference tone should be raised, and vice versa.
The synthesized speed is output according to the following formula: synthesized speed = reference speed x 2^((d1 + d4 - d2 - d3)/c).
Again, in the coordinate system the X-axis represents emotional happiness or sadness, and the Y-axis represents positivity or negativity. When the proportion of the triangle in the first and fourth quadrants, on the upper side, is large, the reference speed should be raised, and vice versa. For example, with an original reference speed of 1x, once more of the triangle lies in the first and fourth quadrants, the language tends to be more positive, and the speed should be raised to better express that positivity.
For a younger avatar, c is relatively smaller, i.e., the degree of change of the synthesized speech rate is greater, fitting a livelier, less staid character; for an older avatar, c is relatively larger, i.e., the degree of change of the synthesized speech rate is smaller, fitting a steadier character. During speed adjustment, the adjustment coefficient c may take a larger value, so that the reference speech rate is adjusted within 0.8 to 1.2 times speed, more preferably 0.9 to 1.1 times. In other words, the pre-stored adjustment coefficient c used for the speech rate may differ from the one used for adjusting the reference tone.
Assuming the area of the first triangle is 100 square centimeters and the portion of it lying in the first quadrant of the relational layout is 10 square centimeters, d1 is 10%.
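The per-quadrant shares d1 to d4 can be estimated without exact polygon clipping; the Monte Carlo sketch below samples points uniformly inside the first triangle and tallies the quadrant each falls in, using the right/upper quadrant grouping the text implies (Q1 upper-right, Q2 lower-right, Q3 lower-left, Q4 upper-left). It is an illustrative approximation, not the patent's prescribed computation:

```python
import random

def quadrant_shares(a, b, c, n: int = 100_000) -> list[float]:
    """Estimate d1..d4: the fraction of triangle abc's area lying in each
    quadrant of the relational layout's coordinate system."""
    counts = [0, 0, 0, 0]
    for _ in range(n):
        # Uniform point inside a triangle via reflected barycentric coordinates.
        u, v = random.random(), random.random()
        if u + v > 1.0:
            u, v = 1.0 - u, 1.0 - v
        x = a[0] + u * (b[0] - a[0]) + v * (c[0] - a[0])
        y = a[1] + u * (b[1] - a[1]) + v * (c[1] - a[1])
        if x >= 0.0:
            counts[0 if y >= 0.0 else 1] += 1  # right side: Q1 above, Q2 below
        else:
            counts[3 if y >= 0.0 else 2] += 1  # left side: Q4 above, Q3 below
    return [k / n for k in counts]

# First triangle from the worked example: origin, motion tip (4, 3), end (5, 7).
print(quadrant_shares((0.0, 0.0), (4.0, 3.0), (5.0, 7.0)))  # ~[1.0, 0.0, 0.0, 0.0]
```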
If the sum of the overlapping projection lengths exceeds the first preset threshold, the loudness is reduced; if not, the loudness is increased.
According to the percentages of the quadrants occupied by the triangle of the synthetic vector, the invention configures a tone and timbre matching the happiness or sadness associated with the four quadrants.
In the coordinate system, which is preset for each relational layout, the X-axis represents emotional happiness or sadness and the Y-axis represents positivity or negativity.
As shown in figs. 1 and 3, the avatar adaptation system of the present invention includes:
an input module for acquiring script data and capturing the action data and first keyword data in the script data;
a vector synthesis module for constructing a synthetic vector from the action data and the first keyword data and placing the synthetic vector in a relational layout; and
an output module for outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises:
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds a first threshold, and if so, expanding the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection.
By fusing the action data and the first keyword data of the script data into a single synthetic vector and mapping that vector onto a relational layout to obtain a position result, the invention can output a different avatar for different script data, so that the avatar better fits the content of the script and the user's viewing experience is improved.
Referring to figs. 2 and 3, as a further explanation of the invention, the judging unit judges whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, reduces the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection; it then judges whether the end point of the synthetic vector lies within a region of the relational layout, and if so, outputs the avatar corresponding to that region, while if not, outputs the avatar of the region whose midpoint is closest to the end point of the synthetic vector.
The invention can thus expand a region's range when the first threshold is exceeded and reduce it when the threshold is not exceeded. If, after reduction, the end point of the synthetic vector no longer falls inside any region, the nearest region is taken as its landing region, with "nearest" defined by the region midpoint closest to the end point of the synthetic vector; the influence of region size on the finally output avatar is thereby taken into account.
Referring to figs. 2 and 3, as a further explanation of the invention, the judging unit may judge whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, reduce the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection; it then judges whether the end point of the synthetic vector lies within a region of the relational layout, and if so, outputs the avatar corresponding to that region, while if not, outputs the avatar of the region whose edge is closest to the end point of the synthetic vector.
The invention can thus expand a region's range when the first threshold is exceeded and reduce it when the threshold is not exceeded. If, after reduction, the end point of the synthetic vector no longer falls inside any region, the nearest region is taken as its landing region, with "nearest" defined by the region edge closest to the end point of the synthetic vector; the influence of region size on the finally output avatar is thereby taken into account.
Referring to figs. 2 and 3, as a further explanation of the invention, outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies comprises the steps of:
judging whether the end point of the synthetic vector lies in two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; and if not, outputting the avatar corresponding to the region with the larger range.
The region in which the end point of the synthetic vector falls can be understood as follows. The regions of the relational layout are not necessarily mutually disjoint: each region is essentially a preset regular shape such as a rectangle, triangle, or circle, and to avoid blank positions belonging to no region, two or more regions may be configured to overlap at a given coordinate. The end point of the synthetic vector may therefore land in two or more regions of the relational layout, or otherwise in a single region, provided every position of the layout is covered by at least one region. If, in a special case, the end point lands in no region at all, the region whose midpoint or edge is nearest to the end point is taken as its region, and the avatar corresponding to that region is output.
Referring to figs. 2 and 3, as a further explanation of the invention, the step of outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies includes:
pre-storing, for the avatar, dialogue voice with a reference tone, a reference speed, and a reference loudness;
constructing a first triangle using the motion data vector, the first keyword vector, and the synthetic vector, and determining the percentage d1 of the first triangle's area that lies in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant;
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if the X-axis, outputting a synthesized tone according to the following formula: synthesized tone = reference tone x 2^((d1 + d2 - d3 - d4)/c), where c is an adjustment coefficient; if the Y-axis, outputting a synthesized speed according to the following formula: synthesized speed = reference speed x 2^((d1 + d4 - d2 - d3)/c), where c is an adjustment coefficient; and if the sum does not exceed the first threshold, outputting a synthesized loudness according to the following formula: synthesized loudness = reference loudness x e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed, and the synthesized loudness.
In this way, the proportions in which the first triangle, constructed from the motion data vector, the first keyword vector, and the synthetic vector, occupies the four quadrants of the coordinate system serve as adjustment indexes for the avatar's dialogue voice: intonation is adjusted to express whether the avatar is happy, speech speed to express whether the dialogue is positive, and loudness to express whether the avatar's attitude is affirmative.
The above examples are only illustrative of the preferred embodiments of the present invention and are not intended to limit the scope of the present invention, and various modifications and improvements made by those skilled in the art to the technical solution of the present invention should fall within the scope of protection defined by the claims of the present invention without departing from the spirit of the present invention.

Claims (4)

1. A method for adapting an avatar, characterized by comprising:
Acquiring script data;
capturing action data and first keyword data in the script data;
constructing a synthetic vector by utilizing the action data and the first keyword data, and configuring the synthetic vector in a relational layout;
outputting an avatar corresponding to the region where the synthetic vector is located in the relation layout;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises:
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds a first threshold, and if so, expanding the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection;
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, reducing the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection; and judging whether the end point of the synthetic vector lies within a region of the relational layout, if so, outputting the avatar corresponding to that region, and if not, outputting the avatar corresponding to the region of the relational layout whose midpoint is closest to the end point of the synthetic vector;
wherein outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies comprises the steps of:
judging whether the end point of the synthetic vector lies in two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; and if not, outputting the avatar corresponding to the region with the larger range;
wherein the step of outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies includes:
pre-storing, for the avatar, dialogue voice with a reference tone, a reference speed, and a reference loudness;
constructing a first triangle using the motion data vector, the first keyword vector, and the synthetic vector, and determining the percentage d1 of the first triangle's area that lies in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant;
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if the X-axis, outputting a synthesized tone according to the following formula: synthesized tone = reference tone x 2^((d1 + d2 - d3 - d4)/c), where c is an adjustment coefficient; if the Y-axis, outputting a synthesized speed according to the following formula: synthesized speed = reference speed x 2^((d1 + d4 - d2 - d3)/c), where c is an adjustment coefficient; and if the sum does not exceed the first threshold, outputting a synthesized loudness according to the following formula: synthesized loudness = reference loudness x e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed, and the synthesized loudness.
2. A method for adapting an avatar, characterized by comprising:
Acquiring script data;
capturing action data and first keyword data in the script data;
constructing a synthetic vector by utilizing the action data and the first keyword data, and configuring the synthetic vector in a relational layout;
outputting an avatar corresponding to the region where the synthetic vector is located in the relation layout;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises:
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds a first threshold, and if so, expanding the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection;
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, reducing the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection; and judging whether the end point of the synthetic vector lies within a region of the relational layout, if so, outputting the avatar corresponding to that region, and if not, outputting the avatar corresponding to the region of the relational layout whose edge is closest to the end point of the synthetic vector;
wherein outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies comprises the steps of:
judging whether the end point of the synthetic vector lies in two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; and if not, outputting the avatar corresponding to the region with the larger range;
wherein the step of outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies includes:
pre-storing, for the avatar, dialogue voice with a reference tone, a reference speed, and a reference loudness;
constructing a first triangle using the motion data vector, the first keyword vector, and the synthetic vector, and determining the percentage d1 of the first triangle's area that lies in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant;
Judging whether the sum of the X-axis projection length of the motion data vector generated from the motion data and the Y-axis projection length of the first keyword data vector generated from the first keyword data exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if on the X-axis, outputting a synthesized tone according to the formula: synthesized tone = reference tone × […], where c is an adjustment coefficient; if on the Y-axis, outputting a synthesized speed according to the formula: synthesized speed = reference speed × […], where c is an adjustment coefficient; if the sum does not exceed the first threshold, outputting a synthesized loudness according to the formula: synthesized loudness = reference loudness × e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed and the synthesized loudness.
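A minimal sketch of the region adjustment and avatar selection described in the claim above, assuming axis-aligned rectangular layout regions. The Region class, the expand/shrink factors (1.2 and 0.8), and the caller-supplied mapping from the longer projection to its region are all assumptions; the claim does not specify any of them.

```python
import math

class Region:
    """Axis-aligned rectangular region of the relational layout (assumed shape)."""
    def __init__(self, avatar: str, x0: float, y0: float, x1: float, y1: float):
        self.avatar = avatar
        self.x0, self.y0, self.x1, self.y1 = x0, y0, x1, y1
        self.scaled = False  # set once the region has been expanded or reduced

    def scale(self, factor: float) -> None:
        # Expand (factor > 1) or reduce (factor < 1) the range about the centre.
        cx, cy = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        hw = (self.x1 - self.x0) / 2 * factor
        hh = (self.y1 - self.y0) / 2 * factor
        self.x0, self.x1, self.y0, self.y1 = cx - hw, cx + hw, cy - hh, cy + hh
        self.scaled = True

    def contains(self, p) -> bool:
        return self.x0 <= p[0] <= self.x1 and self.y0 <= p[1] <= self.y1

    def area(self) -> float:
        return (self.x1 - self.x0) * (self.y1 - self.y0)

    def edge_distance(self, p) -> float:
        # Distance from p to the region boundary; 0 if p lies inside.
        dx = max(self.x0 - p[0], 0.0, p[0] - self.x1)
        dy = max(self.y0 - p[1], 0.0, p[1] - self.y1)
        return math.hypot(dx, dy)

def select_avatar(regions, longer_proj_region, x_proj, y_proj,
                  endpoint, first_threshold,
                  expand=1.2, shrink=0.8):   # factors are assumptions
    """regions: all Region objects of the layout; longer_proj_region: the
    region tied to the longer projection (mapping supplied by the caller)."""
    longer_proj_region.scale(expand if x_proj + y_proj > first_threshold
                             else shrink)
    hits = [r for r in regions if r.contains(endpoint)]
    if not hits:
        # End point in no region: fall back to the nearest region edge.
        return min(regions, key=lambda r: r.edge_distance(endpoint)).avatar
    if len(hits) == 1:
        return hits[0].avatar
    # Two or more regions: prefer the larger among the expanded/reduced
    # regions, otherwise simply the larger region.
    pool = [r for r in hits if r.scaled] or hits
    return max(pool, key=lambda r: r.area()).avatar
```

Under these assumptions the tie-breaking mirrors the claim: regions whose range was expanded or reduced take precedence, with area as the final comparison.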
3. An avatar adaptation system, characterized by comprising:
an input module for acquiring script data and capturing action data and first keyword data in the script data;
a vector synthesis module for constructing a synthetic vector using the action data and the first keyword data and configuring the synthetic vector in a relational layout;
an output module for outputting the avatar corresponding to the region of the relational layout in which the synthetic vector is located;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises
Judging whether the sum of the X-axis projection length of the motion data vector generated from the motion data and the Y-axis projection length of the first keyword data vector generated from the first keyword data exceeds a first threshold; if so, expanding the range of the relational layout region corresponding to the longer of the two projection lengths;
judging whether the sum of the X-axis projection length of the motion data vector and the Y-axis projection length of the first keyword data vector exceeds the first threshold, and if not, reducing the range of the relational layout region corresponding to the longer of the two projection lengths; judging whether the end point of the synthetic vector lies within a region of the relational layout; if so, outputting the avatar corresponding to that region; if not, outputting the avatar corresponding to the region of the relational layout whose midpoint is closest to the end point of the synthetic vector;
Outputting the avatar corresponding to the region of the relational layout in which the synthetic vector is located comprises the steps of:
judging whether the end point of the synthetic vector falls within two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region in which it falls; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; if not, outputting the avatar corresponding to the region with the larger range;
the step of outputting the avatar corresponding to the region of the relational layout in which the synthetic vector is located includes:
pre-storing dialogue voice with a reference tone, a reference speed, and a reference loudness corresponding to the avatar;
constructing a first triangle from the motion data vector, the first keyword vector, and the synthetic vector, and obtaining the percentage d1 of the first triangle's area lying in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant (a quadrant-percentage sketch follows this claim);
Judging whether the sum of the X-axis projection length of the motion data vector generated from the motion data and the Y-axis projection length of the first keyword data vector generated from the first keyword data exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if on the X-axis, outputting a synthesized tone according to the formula: synthesized tone = reference tone × […], where c is an adjustment coefficient; if on the Y-axis, outputting a synthesized speed according to the formula: synthesized speed = reference speed × […], where c is an adjustment coefficient; if the sum does not exceed the first threshold, outputting a synthesized loudness according to the formula: synthesized loudness = reference loudness × e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed and the synthesized loudness.
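The quadrant percentages d1–d4 used by the claims can be obtained by clipping the first triangle against each quadrant of the layout's coordinate system. The sketch below uses Sutherland–Hodgman clipping plus the shoelace formula; taking the triangle's vertices to be the tips of the motion, first-keyword, and synthetic vectors is an assumption, since the claims only say the triangle is constructed using the three vectors.

```python
# Percentage of the first triangle's area falling in each quadrant
# (d1..d4), via Sutherland-Hodgman clipping and the shoelace formula.

def _clip(poly, inside, cut):
    """Clip polygon `poly` against one half-plane; `cut` returns the
    boundary intersection of an edge crossing the half-plane."""
    out = []
    for i, cur in enumerate(poly):
        prev = poly[i - 1]
        if inside(cur):
            if not inside(prev):
                out.append(cut(prev, cur))
            out.append(cur)
        elif inside(prev):
            out.append(cut(prev, cur))
    return out

def _area(poly):
    if len(poly) < 3:
        return 0.0
    s = sum(x0 * y1 - x1 * y0
            for (x0, y0), (x1, y1) in zip(poly, poly[1:] + poly[:1]))
    return abs(s) / 2.0

def _cut_x(p, q):   # intersection with the line x = 0
    t = -p[0] / (q[0] - p[0])
    return (0.0, p[1] + t * (q[1] - p[1]))

def _cut_y(p, q):   # intersection with the line y = 0
    t = -p[1] / (q[1] - p[1])
    return (p[0] + t * (q[0] - p[0]), 0.0)

def quadrant_percentages(triangle):
    """triangle: three (x, y) vertices. Returns (d1, d2, d3, d4)."""
    total = _area(triangle)
    if total == 0.0:
        return (0.0, 0.0, 0.0, 0.0)
    ds = []
    for sx, sy in [(1, 1), (-1, 1), (-1, -1), (1, -1)]:  # quadrants I..IV
        part = _clip(triangle, lambda p: sx * p[0] >= 0, _cut_x)
        part = _clip(part, lambda p: sy * p[1] >= 0, _cut_y)
        ds.append(_area(part) / total)
    return tuple(ds)

# Example (assumed vertices: tips of the motion, first keyword, and
# synthetic vectors): a triangle straddling all four quadrants.
print(quadrant_percentages([(2.0, 1.0), (-1.0, 1.5), (0.5, -2.0)]))
```

The four percentages always sum to 1 when the triangle lies entirely within the coordinate plane, which makes them a convenient weighting for the tone/speed adjustment that follows.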
4. An avatar adaptation system, characterized by comprising:
an input module for acquiring script data and capturing action data and first keyword data in the script data;
a vector synthesis module for constructing a synthetic vector using the action data and the first keyword data and configuring the synthetic vector in a relational layout;
an output module for outputting the avatar corresponding to the region of the relational layout in which the synthetic vector is located;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises
Judging whether the sum of the X-axis projection length of the motion data vector generated from the motion data and the Y-axis projection length of the first keyword data vector generated from the first keyword data exceeds a first threshold; if so, expanding the range of the relational layout region corresponding to the longer of the two projection lengths;
judging whether the sum of the X-axis projection length of the motion data vector and the Y-axis projection length of the first keyword data vector exceeds the first threshold, and if not, reducing the range of the relational layout region corresponding to the longer of the two projection lengths; judging whether the end point of the synthetic vector lies within a region of the relational layout; if so, outputting the avatar corresponding to that region; if not, outputting the avatar corresponding to the region of the relational layout whose edge is closest to the end point of the synthetic vector;
Outputting the avatar corresponding to the region of the relational layout in which the synthetic vector is located comprises the steps of:
judging whether the end point of the synthetic vector falls within two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region in which it falls; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; if not, outputting the avatar corresponding to the region with the larger range;
the step of outputting the avatar corresponding to the region of the relational layout in which the synthetic vector is located includes:
pre-storing dialogue voice with a reference tone, a reference speed, and a reference loudness corresponding to the avatar;
constructing a first triangle from the motion data vector, the first keyword vector, and the synthetic vector, and obtaining the percentage d1 of the first triangle's area lying in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant;
Judging whether the sum of the X-axis projection length of the motion data vector generated from the motion data and the Y-axis projection length of the first keyword data vector generated from the first keyword data exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if on the X-axis, outputting a synthesized tone according to the formula: synthesized tone = reference tone × […], where c is an adjustment coefficient; if on the Y-axis, outputting a synthesized speed according to the formula: synthesized speed = reference speed × […], where c is an adjustment coefficient; if the sum does not exceed the first threshold, outputting a synthesized loudness according to the formula: synthesized loudness = reference loudness × e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed and the synthesized loudness.
CN202311799886.6A 2023-12-26 2023-12-26 Virtual image adaptation method and system Active CN117475049B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311799886.6A CN117475049B (en) 2023-12-26 2023-12-26 Virtual image adaptation method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311799886.6A CN117475049B (en) 2023-12-26 2023-12-26 Virtual image adaptation method and system

Publications (2)

Publication Number Publication Date
CN117475049A CN117475049A (en) 2024-01-30
CN117475049B true CN117475049B (en) 2024-03-08

Family

ID=89636497

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311799886.6A Active CN117475049B (en) 2023-12-26 2023-12-26 Virtual image adaptation method and system

Country Status (1)

Country Link
CN (1) CN117475049B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019200584A1 (en) * 2018-04-19 2019-10-24 Microsoft Technology Licensing, Llc Generating response in conversation
CN111785246A (en) * 2020-06-30 2020-10-16 联想(北京)有限公司 Virtual character voice processing method and device and computer equipment
CN116312456A (en) * 2023-01-06 2023-06-23 北京红棉小冰科技有限公司 Voice dialogue script generation method and device and electronic equipment
CN116805963A (en) * 2023-06-30 2023-09-26 杭州海康威视数字技术股份有限公司 Virtual scene light source generation method, device, equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI521469B (en) * 2012-06-27 2016-02-11 Reallusion Inc Two - dimensional Roles Representation of Three - dimensional Action System and Method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019200584A1 (en) * 2018-04-19 2019-10-24 Microsoft Technology Licensing, Llc Generating response in conversation
CN110998725A (en) * 2018-04-19 2020-04-10 微软技术许可有限责任公司 Generating responses in a conversation
CN111785246A (en) * 2020-06-30 2020-10-16 联想(北京)有限公司 Virtual character voice processing method and device and computer equipment
CN116312456A (en) * 2023-01-06 2023-06-23 北京红棉小冰科技有限公司 Voice dialogue script generation method and device and electronic equipment
CN116805963A (en) * 2023-06-30 2023-09-26 杭州海康威视数字技术股份有限公司 Virtual scene light source generation method, device, equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Deep Learning Based Real-time Daily Human Activity Recognition and Its Implementation in a Smartphone; Tsige Tadesse Alemayoh; 2019 16th International Conference on Ubiquitous Robots (UR); 2019-07-25; full text *
Haptic behavior of a virtual 3D paintbrush based on variable stiffness and force feedback; 黄磊; 侯增选; 李楠楠; 张迪靖; 苏金辉; Journal of Hunan University (Natural Sciences); 2020-08-25 (08); full text *
Key technologies for interaction between humans and virtual characters in virtual environments; 翁冬冬; 薛雅琼; ZTE Technology Journal; (06); full text *

Also Published As

Publication number Publication date
CN117475049A (en) 2024-01-30

Similar Documents

Publication Publication Date Title
US6307576B1 (en) Method for automatically animating lip synchronization and facial expression of animated characters
WO2017168870A1 (en) Information processing device and information processing method
CN108146360A (en) Method, apparatus, mobile unit and the readable storage medium storing program for executing of vehicle control
CN109410973B (en) Sound changing processing method, device and computer readable storage medium
CN106486121A (en) It is applied to the voice-optimizing method and device of intelligent robot
US20150256930A1 (en) Masking sound data generating device, method for generating masking sound data, and masking sound data generating system
CN105468582B (en) A kind of method and device for correcting of the numeric string based on man-machine interaction
CN108055617A (en) A kind of awakening method of microphone, device, terminal device and storage medium
JP4588531B2 (en) SEARCH DEVICE, PROGRAM, AND SEARCH METHOD
CN109377979B (en) Method and system for updating welcome language
CN117475049B (en) Virtual image adaptation method and system
CN113448433A (en) Emotion responsive virtual personal assistant
KR20200145776A (en) Method, apparatus and program of voice correcting synthesis
CN117558259A (en) Digital man broadcasting style control method and device
US7219061B1 (en) Method for detecting the time sequences of a fundamental frequency of an audio response unit to be synthesized
CN115019817A (en) Voice awakening method and device, electronic equipment and storage medium
CN109977411B (en) Data processing method and device and electronic equipment
KR102114365B1 (en) Speech recognition method and apparatus
JP6343895B2 (en) Voice control device, voice control method and program
CN114734942A (en) Method and device for adjusting sound effect of vehicle-mounted sound equipment
CN114168713A (en) Intelligent voice AI pacifying method
JP6566076B2 (en) Speech synthesis method and program
JP3464435B2 (en) Speech synthesizer
CN116246633B (en) Wireless intelligent Internet of things conference system
US20050250554A1 (en) Method for eliminating musical tone from becoming wind shear sound

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant