CN117475049B - Virtual image adaptation method and system - Google Patents

Virtual image adaptation method and system

Info

Publication number
CN117475049B
Authority
CN
China
Prior art keywords
vector
layout
outputting
data
keyword
Prior art date
Legal status
Active
Application number
CN202311799886.6A
Other languages
Chinese (zh)
Other versions
CN117475049A (en)
Inventor
杨海宁
邓泽西
栾德龙
张丙锐
Current Assignee
One Station Development Beijing Cloud Computing Technology Co ltd
Original Assignee
One Station Development Beijing Cloud Computing Technology Co ltd
Priority date
Filing date
Publication date
Application filed by One Station Development Beijing Cloud Computing Technology Co ltd
Priority to CN202311799886.6A
Publication of CN117475049A
Application granted
Publication of CN117475049B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 13/00 Animation
    • G06T 13/20 3D [Three Dimensional] animation
    • G06T 13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N 3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/02 Knowledge representation; Symbolic representation
    • G06N 5/022 Knowledge engineering; Knowledge acquisition

Abstract

The invention relates to an avatar adaptation method and system in the field of virtual scene data adaptation, comprising the steps of: acquiring script data; capturing action data and first keyword data in the script data; constructing a synthetic vector from the action data and the first keyword data and placing the synthetic vector in a relational layout; and outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies. By fusing the action data and the first keyword data of the script data into a single synthetic vector and mapping that vector onto a relational layout to obtain a position result, the invention can output a different avatar for different script data, so that the avatar better fits the content of the script and the user's viewing experience is improved.

Description

Virtual image adaptation method and system
Technical Field
The present invention relates to the field of virtual scene data adaptation, and in particular, to a method and a system for adapting an avatar.
Background
On virtual platforms, people commonly configure an avatar as the visible embodiment of an artificial intelligence session, so that the user appears to be talking with the avatar. A conventional avatar, however, is single and fixed: whether it expresses happiness, makes an expression, or speaks, its appearance stays the same and cannot be adapted to the specific content being expressed.
In current virtual dialogue scenes, after the user issues a voice instruction, the virtual platform generates a reply with an artificial intelligence algorithm, and each reply corresponds to script data to be read aloud. When the avatar reads that script data, it delivers the entire dialogue with a single appearance and even a single intonation, speed, and loudness. The user therefore gains no sense of immersion from the avatar, and the avatar's appearance does not adapt to the content the script data expresses.
Accordingly, there is a need for an avatar adaptation method and system capable of adapting different avatars to the scenario data, so as to increase the user's sense of identification and immersion.
Disclosure of Invention
The invention aims to provide an avatar adaptation method and system capable of adapting different avatars to the scenario data, so as to increase the user's sense of identification and immersion.
The invention relates to an avatar adaptation method, which comprises the following steps:
Acquiring script data;
capturing action data and first keyword data in the script data;
constructing a synthetic vector by utilizing the action data and the first keyword data, and configuring the synthetic vector in a relational layout;
outputting an avatar corresponding to the region where the synthetic vector is located in the relation layout;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises:
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds a first threshold, and if so, expanding the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection.
In the avatar adaptation method of the invention, it is further judged whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection is reduced; it is then judged whether the end point of the synthetic vector lies within a region of the relational layout, and if so, the avatar corresponding to that region is output, while if not, the avatar of the region whose midpoint is closest to the end point of the synthetic vector is output.
In the avatar adaptation method of the invention, it may likewise be judged whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection is reduced; it is then judged whether the end point of the synthetic vector lies within a region of the relational layout, and if so, the avatar corresponding to that region is output, while if not, the avatar of the region whose edge is closest to the end point of the synthetic vector is output.
In the avatar adaptation method of the invention, outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies comprises:
judging whether the end point of the synthetic vector lies in two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; and if not, outputting the avatar corresponding to the region with the larger range.
In the avatar adaptation method of the invention, outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies further comprises the steps of:
pre-storing, for the avatar, dialogue voice with a reference tone, a reference speed, and a reference loudness;
constructing a first triangle using the motion data vector, the first keyword vector, and the synthetic vector, and determining the percentage d1 of the first triangle's area that lies in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant;
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if the X-axis, outputting a synthesized tone according to the following formula: synthesized tone = reference tone x 2^((d1 + d2 - d3 - d4)/c), where c is an adjustment coefficient; if the Y-axis, outputting a synthesized speed according to the following formula: synthesized speed = reference speed x 2^((d1 + d4 - d2 - d3)/c), where c is an adjustment coefficient; and if the sum does not exceed the first threshold, outputting a synthesized loudness according to the following formula: synthesized loudness = reference loudness x e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed, and the synthesized loudness.
The invention also relates to an avatar adaptation system, which includes:
an input module for acquiring script data and capturing the action data and first keyword data in the script data;
a vector synthesis module for constructing a synthetic vector from the action data and the first keyword data and placing the synthetic vector in a relational layout; and
an output module for outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises:
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds a first threshold, and if so, expanding the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection.
In the avatar adaptation system of the invention, it is judged whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection is reduced; it is then judged whether the end point of the synthetic vector lies within a region of the relational layout, and if so, the avatar corresponding to that region is output, while if not, the avatar of the region whose midpoint is closest to the end point of the synthetic vector is output.
In the avatar adaptation system of the invention, it may likewise be judged whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection is reduced; it is then judged whether the end point of the synthetic vector lies within a region of the relational layout, and if so, the avatar corresponding to that region is output, while if not, the avatar of the region whose edge is closest to the end point of the synthetic vector is output.
In the avatar adaptation system of the invention, outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies comprises:
judging whether the end point of the synthetic vector lies in two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; and if not, outputting the avatar corresponding to the region with the larger range.
In the avatar adaptation system of the invention, outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies further comprises the steps of:
pre-storing, for the avatar, dialogue voice with a reference tone, a reference speed, and a reference loudness;
constructing a first triangle using the motion data vector, the first keyword vector, and the synthetic vector, and determining the percentage d1 of the first triangle's area that lies in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant;
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if the X-axis, outputting a synthesized tone according to the following formula: synthesized tone = reference tone x 2^((d1 + d2 - d3 - d4)/c), where c is an adjustment coefficient; if the Y-axis, outputting a synthesized speed according to the following formula: synthesized speed = reference speed x 2^((d1 + d4 - d2 - d3)/c), where c is an adjustment coefficient; and if the sum does not exceed the first threshold, outputting a synthesized loudness according to the following formula: synthesized loudness = reference loudness x e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed, and the synthesized loudness.
The invention differs from the prior art in that the avatar adaptation method fuses the action data and the first keyword data of the script data into a single synthetic vector and then maps that vector onto a relational layout to obtain a position result, so that a corresponding avatar can be output for different script data; the avatar thereby better fits the content of the script, and the user's viewing experience is improved.
An avatar adaptation method of the present invention will be further described with reference to the accompanying drawings.
Drawings
FIG. 1 is a schematic illustration of a first state of a relational layout of an avatar adaptation method;
FIG. 2 is a schematic diagram of a second state of a relational layout of an avatar adaptation method;
fig. 3 is a flowchart illustrating an avatar adaptation method.
Detailed Description
Referring to figs. 1 and 3, the avatar adaptation method of the present invention includes:
Acquiring script data;
capturing action data and first keyword data in the script data;
constructing a synthetic vector by utilizing the action data and the first keyword data, and configuring the synthetic vector in a relational layout;
outputting an avatar corresponding to the region where the synthetic vector is located in the relation layout;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises:
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds a first threshold, and if so, expanding the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection.
By fusing the action data and the first keyword data of the script data into a single synthetic vector and mapping that vector onto a relational layout to obtain a position result, the invention can output a different avatar for different script data, so that the avatar better fits the content of the script and the user's viewing experience is improved.
Each relational layout region has a preset X-axis and Y-axis coordinate system with four quadrants, similar to the four quadrants of a conventional XY coordinate system.
Expanding the range may turn a circular region into its circumscribing triangle, circumscribing square, or circumscribing hexagon. Conversely, the expansion may turn a triangle, square, or hexagon into its circumscribed circle. Alternatively, the region may be expanded about its center by a factor a of its area, where a may be 0.1% to 1000%; a is preferably the ratio of the shorter of the two overlapping projection lengths of the motion data vector and the first keyword vector to the longer one, and may further preferably be 10%.
That is, by comparing the sum of the overlapping projection lengths of the motion data vector and the first keyword vector on the X-axis and the Y-axis against the first threshold, the invention can gauge the degree of contradiction between the two vectors in a given dimension. If the sum exceeds the first threshold, the two vectors are judged to be in greater contradiction, and the range of the corresponding relational layout region is expanded accordingly. This reduces the positional deviation of the final synthetic vector caused by the contradiction and makes it more likely that the synthetic vector lands in the region corresponding to the vector with the longer overlapping projection.
For example, as shown in fig. 2, the relational layout includes, from the inside out, a P region, a Q region, an R region, an S region, and a T region.
Of course, the avatars may also be different expressions of the same character; that is, by pre-storing different expressions, the same base avatar can be displayed differently for different regions.
In the coordinate system, the X-axis represents emotional happiness or sadness, and the Y-axis represents positivity or negativity. The database stores a motion data vector and a first keyword vector corresponding to each piece of motion data and first keyword data. For example, the action data in fig. 2 is "listen to songs", whose pre-stored vector is (4, 3), a generally happy and positive term; the first keyword data in fig. 2 is "recognition spectrum", whose pre-stored vector is (1, 4), also a generally happy, positive term. Their synthetic vector is then (5, 7), which lies within the R region of fig. 2. The first threshold length may be 0.1 to 100000, preferably 3.
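This lookup-and-compose step can be sketched as follows (a minimal illustration in Python; the vocabulary table and its entries are assumptions for the example, as the patent only states that the database pre-stores one vector per word):

```python
# Sketch of composing the synthetic vector from pre-stored word vectors.
# WORD_VECTORS is a hypothetical stand-in for the database described above.
WORD_VECTORS = {
    "listen to songs": (4.0, 3.0),       # action data: happy, positive
    "recognition spectrum": (1.0, 4.0),  # first keyword data: happy, positive
}

def synthetic_vector(action_word: str, keyword: str) -> tuple[float, float]:
    """Place the motion data vector at the origin and the first keyword
    vector tip-to-tail after it; return the synthetic vector's end point."""
    ax, ay = WORD_VECTORS[action_word]
    kx, ky = WORD_VECTORS[keyword]
    return (ax + kx, ay + ky)

print(synthetic_vector("listen to songs", "recognition spectrum"))  # (5.0, 7.0)
```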
The invention constructs the synthetic vector by utilizing the action data and the first keyword data.
The character interface of the invention selects the intelligent digital human model corresponding to the script and supports adjusting the character's orientation, position, and size. The invention also supports sound configuration.
Because of the tip-to-tail construction of the vectors, the sum of the X-axis overlap length and the Y-axis overlap length can be understood as the overlap length on one of the two axes, since the two vectors can only overlap on the X-axis or on the Y-axis. Comparing this length with a first threshold yields a useful test; the first threshold may be, for example, the length of the pre-stored vector for the word "hello", or the diagonal length or side length of the P region in fig. 2. In general the first threshold serves as a common criterion: if the overlap length exceeds it, most of the combined length is overlap, the synthetic vector is short, and the motion data vector and the first keyword vector contradict each other.
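One plausible reading of the "overlapping projection length", given the tip-to-tail construction described below, is the length of the interval shared by the two vectors' projections on an axis, which is non-zero only where the second vector backtracks against the first. The sketch below encodes that reading; it is an interpretation, not the patent's verbatim definition:

```python
def overlap_on_axis(m: float, k: float) -> float:
    """Overlap of the projection intervals of two tip-to-tail vectors on one
    axis: the motion component spans [0, m], the keyword component spans
    [m, m + k]; the intervals only overlap when k points back against m."""
    a_lo, a_hi = min(0.0, m), max(0.0, m)
    b_lo, b_hi = min(m, m + k), max(m, m + k)
    return max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))

def total_overlap(motion, keyword) -> float:
    """Sum of the X-axis and Y-axis overlap lengths, to be compared with
    the first threshold (3 in the example above)."""
    return (overlap_on_axis(motion[0], keyword[0])
            + overlap_on_axis(motion[1], keyword[1]))

# A hypothetical sad keyword (-2, 1) contradicts "listen to songs" (4, 3)
# on the X (happiness) axis: the X projections overlap by 2.
print(total_overlap((4.0, 3.0), (-2.0, 1.0)) > 3.0)  # False (overlap is 2.0)
```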
The term "constructing a synthetic vector using the action data and the first keyword data and configuring the synthetic vector in a relational layout" is understood to mean that the start point of the synthetic vector is placed at the origin of the relational layout's coordinate system. Specifically, the start point of the motion data vector is placed at the origin, the start point of the first keyword data vector is placed at the end point of the motion data vector, and the end point of the first keyword data vector coincides with the end point of the synthetic vector.
Referring to figs. 2 and 3, as a further explanation of the invention, the judging unit judges whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, reduces the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection; it then judges whether the end point of the synthetic vector lies within a region of the relational layout, and if so, outputs the avatar corresponding to that region, while if not, outputs the avatar of the region whose midpoint is closest to the end point of the synthetic vector.
The invention can thus expand a region's range when the first threshold is exceeded and reduce it when the threshold is not exceeded. If, after reduction, the end point of the synthetic vector no longer falls inside any region, the nearest region is taken as its landing region, with "nearest" defined by the region midpoint closest to the end point of the synthetic vector; the influence of region size on the finally output avatar is thereby taken into account.
For example, of two regions, one may have an edge near the end point of the synthetic vector but a midpoint farther away; under this rule, the avatar of the region whose midpoint is nearer to the end point is output.
The "longer one of the overlapping projection lengths of the motion data vector and the first keyword vector" can be understood as follows: in fig. 2, the Y-axis projection of the motion data vector is longer than that of the first keyword vector, so the motion data vector is the longer one there; in fig. 1, the X-axis projection of the motion data vector is longer than that of the first keyword vector, so the motion data vector is again the longer one. Within the relational layout, some regions correspond to the motion data vector and some to the first keyword vector, and it is the region corresponding to the longer vector that is reduced. "Reducing the region's range" can in turn be understood as shrinking the area by 10% in equal proportion about the same center point, or as replacing a circular region with its inscribed square or triangle, or a square or triangular region with its inscribed circle.
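A minimal sketch of the equal-proportion reduction for a circular region, assuming the 10% figure refers to area so that the radius scales by the square root of 0.9 (the inscribed-shape variants would swap the geometry instead of rescaling it):

```python
import math
from dataclasses import dataclass

@dataclass
class CircleRegion:
    cx: float   # centre x
    cy: float   # centre y
    r: float    # radius
    resized: bool = False  # remembers that the region was expanded/reduced

    def shrink_area(self, fraction: float = 0.10) -> "CircleRegion":
        """Reduce the area by `fraction` about the same centre point;
        the radius therefore scales by sqrt(1 - fraction)."""
        return CircleRegion(self.cx, self.cy,
                            self.r * math.sqrt(1.0 - fraction), resized=True)

p_region = CircleRegion(0.0, 0.0, 2.0)
print(round(p_region.shrink_area().r, 3))  # 1.897
```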
"Judging whether the end point position of the synthetic vector is located in a region of the relational layout" refers to the end point of the synthetic vector, i.e., the arrow tip of the synthetic vector in fig. 1 or fig. 2.
Each region of the relational layout stores the avatar corresponding to that region.
Referring to figs. 2 and 3, as a modification of the invention, the judging unit may judge whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, reduce the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection; it then judges whether the end point of the synthetic vector lies within a region of the relational layout, and if so, outputs the avatar corresponding to that region, while if not, outputs the avatar of the region whose edge is closest to the end point of the synthetic vector.
The invention can thus expand a region's range when the first threshold is exceeded and reduce it when the threshold is not exceeded. If, after reduction, the end point of the synthetic vector no longer falls inside any region, the nearest region is taken as its landing region, with "nearest" defined by the region edge closest to the end point of the synthetic vector; the influence of region size on the finally output avatar is thereby taken into account.
The edge may be taken as any point on it; in other words, treating each edge as containing infinitely many points, the avatar of the region whose edge contains the point closest to the end point of the synthetic vector is output.
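The two fallback rules (nearest midpoint in one embodiment, nearest edge in the other) can be sketched for circular regions as follows; for a circle, the distance from a point to the edge is the absolute difference between the distance to the centre and the radius. The region values are illustrative:

```python
import math

Region = tuple[float, float, float]  # (centre_x, centre_y, radius), circles assumed

def _dist(p: tuple[float, float], q: tuple[float, float]) -> float:
    return math.hypot(p[0] - q[0], p[1] - q[1])

def nearest_by_midpoint(end: tuple[float, float], regions: list[Region]) -> Region:
    """Midpoint rule: the region whose centre is closest to the end point wins."""
    return min(regions, key=lambda r: _dist(end, (r[0], r[1])))

def nearest_by_edge(end: tuple[float, float], regions: list[Region]) -> Region:
    """Edge rule: the region whose boundary is closest to the end point wins."""
    return min(regions, key=lambda r: abs(_dist(end, (r[0], r[1])) - r[2]))

# The two rules can disagree, which is why the embodiments are distinct:
regions = [(0.0, 0.0, 6.0), (8.0, 0.0, 1.0)]
print(nearest_by_midpoint((6.05, 0.0), regions))  # (8.0, 0.0, 1.0): nearer centre
print(nearest_by_edge((6.05, 0.0), regions))      # (0.0, 0.0, 6.0): nearer boundary
```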
Referring to figs. 2 and 3, as a further explanation of the invention, outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies comprises the steps of:
judging whether the end point of the synthetic vector lies in two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; and if not, outputting the avatar corresponding to the region with the larger range.
The region in which the end point of the synthetic vector falls can be understood as follows. The regions of the relational layout are not necessarily mutually disjoint: each region is essentially a preset regular shape such as a rectangle, triangle, or circle, and to avoid blank positions belonging to no region, two or more regions may be configured to overlap at a given coordinate. The end point of the synthetic vector may therefore land in two or more regions of the relational layout, or otherwise in a single region, provided every position of the layout is covered by at least one region. If, in a special case, the end point lands in no region at all, the region whose midpoint or edge is nearest to the end point is taken as its region, and the avatar corresponding to that region is output.
"Outputting the larger of the regions whose range was expanded or reduced" is understood as follows: if only one region was expanded or reduced, it is taken as the larger one; if two regions were expanded, or two were reduced, or one was expanded and the other reduced, the region that ends up larger after the resizing is output.
That is, if the end point of the synthetic vector falls within the overlap of an expanded region and an unexpanded region, the expanded region is determined to be its landing region.
"Outputting the avatar corresponding to the region of the relational layout" likewise follows the nearest-midpoint or nearest-edge principle. In other words, the end point of the synthetic vector does not necessarily fall inside a region of the relational layout; when it falls outside, the region with the nearest midpoint or edge is found, and the corresponding avatar is output.
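A sketch of this tie-break, assuming each candidate region containing the end point reports its area and whether its range was expanded or reduced (the field names are illustrative):

```python
def pick_region(candidates: list[tuple[float, bool, str]]) -> tuple[float, bool, str]:
    """candidates: (area, was_resized, avatar_id) for each region that
    contains the synthetic vector's end point.  Expanded/reduced regions
    take priority; within the chosen group the larger area wins."""
    resized = [c for c in candidates if c[1]]
    pool = resized if resized else candidates
    return max(pool, key=lambda c: c[0])

# An expanded region beats an untouched one even when the latter is larger:
print(pick_region([(9.0, False, "calm"), (4.0, True, "happy")]))  # (4.0, True, 'happy')
```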
Referring to figs. 2 and 3, as a further explanation of the invention, the step of outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies includes:
pre-storing, for the avatar, dialogue voice with a reference tone, a reference speed, and a reference loudness;
constructing a first triangle using the motion data vector, the first keyword vector, and the synthetic vector, and determining the percentage d1 of the first triangle's area that lies in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant;
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if the X-axis, outputting a synthesized tone according to the following formula: synthesized tone = reference tone x 2^((d1 + d2 - d3 - d4)/c), where c is an adjustment coefficient; if the Y-axis, outputting a synthesized speed according to the following formula: synthesized speed = reference speed x 2^((d1 + d4 - d2 - d3)/c), where c is an adjustment coefficient; and if the sum does not exceed the first threshold, outputting a synthesized loudness according to the following formula: synthesized loudness = reference loudness x e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed, and the synthesized loudness.
In this way, the proportions in which the first triangle, constructed from the motion data vector, the first keyword vector, and the synthetic vector, occupies the four quadrants of the coordinate system serve as adjustment indexes for the avatar's dialogue voice: intonation is adjusted to express whether the avatar is happy, speech speed to express whether the dialogue is positive, and loudness to express whether the avatar's attitude is affirmative.
If no condition is triggered, the synthesized tone is the reference tone, the synthesized speed is the reference speed, and the synthesized loudness is the reference loudness.
In adjusting the tone of the reference audio, d1 + d2 + d3 + d4 = 100%.
The adjustment coefficient c can be obtained from the value pre-stored for the avatar in the database. The adjustment coefficient c may be 0.1 to 5, for example 3. For a younger avatar, c is relatively smaller, i.e., the degree of change of the synthesized tone is greater, fitting a livelier, less staid character; for an older avatar, c is relatively larger, i.e., the degree of change of the synthesized tone is smaller, fitting a steadier character. The adjustment coefficient c may preferably be 1.
Here, synthesized tone = reference tone x 2^((d1 + d2 - d3 - d4)/c). In this formula, d1 + d2 has a maximum value of 100% and d3 + d4 has a maximum value of 100%. Since people's speaking pitch generally lies between 100 Hz and 300 Hz, the formula adjusts the reference tone up or down according to how happy the expressed content is, so that the adjusted synthesized tone is more emotionally expressive and better matches the meaning represented by the triangle's area proportions in the four quadrants.
For example, with a reference tone of 150 Hz, the formula maps the tone into the range of 75 to 300 Hz, reflecting degrees of happiness.
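The published formula is rendered as an image in the source; the form used here, synthesized tone = reference tone x 2^((d1 + d2 - d3 - d4)/c), is an inference from the stated constraints (with c = 1 a 150 Hz reference spans exactly 75 to 300 Hz, and a larger c damps the change) rather than a verbatim quotation. A sketch under that assumption:

```python
def synthesized_tone(reference_hz: float, d1: float, d2: float,
                     d3: float, d4: float, c: float = 1.0) -> float:
    """Reconstructed tone formula (an inference, see the note above).
    d1..d4 are the triangle's per-quadrant area shares as fractions that
    sum to 1; the happy half (d1 + d2) raises pitch, the sad half lowers it."""
    return reference_hz * 2.0 ** ((d1 + d2 - d3 - d4) / c)

# The extremes reproduce the 150 Hz worked example:
print(synthesized_tone(150.0, 1.0, 0.0, 0.0, 0.0))  # 300.0, all area on the happy side
print(synthesized_tone(150.0, 0.0, 0.0, 1.0, 0.0))  # 75.0, all area on the sad side
```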
In the coordinate system, the X-axis represents emotional happiness or sadness, and the Y-axis represents positivity or negativity. When the proportion of the triangle in the first and second quadrants, on the right, is large, the reference tone should be raised, and vice versa.
The synthesized speed is output according to the following formula: synthesized speed = reference speed x 2^((d1 + d4 - d2 - d3)/c).
Again, in the coordinate system the X-axis represents emotional happiness or sadness, and the Y-axis represents positivity or negativity. When the proportion of the triangle in the first and fourth quadrants, on the upper side, is large, the reference speed should be raised, and vice versa. For example, with an original reference speed of 1x, once more of the triangle lies in the first and fourth quadrants, the language tends to be more positive, and the speed should be raised to better express that positivity.
For a younger avatar, c is relatively smaller, i.e., the degree of change of the synthesized speech rate is greater, fitting a livelier, less staid character; for an older avatar, c is relatively larger, i.e., the degree of change of the synthesized speech rate is smaller, fitting a steadier character. During speed adjustment, the adjustment coefficient c may take a larger value, so that the reference speech rate is adjusted within 0.8 to 1.2 times speed, more preferably 0.9 to 1.1 times. In other words, the pre-stored adjustment coefficient c used for the speech rate may differ from the one used for adjusting the reference tone.
Assuming the area of the first triangle is 100 square centimeters and the portion of it lying in the first quadrant of the relational layout is 10 square centimeters, d1 is 10%.
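The per-quadrant shares d1 to d4 can be estimated without exact polygon clipping; the Monte Carlo sketch below samples points uniformly inside the first triangle and tallies the quadrant each falls in, using the right/upper quadrant grouping the text implies (Q1 upper-right, Q2 lower-right, Q3 lower-left, Q4 upper-left). It is an illustrative approximation, not the patent's prescribed computation:

```python
import random

def quadrant_shares(a, b, c, n: int = 100_000) -> list[float]:
    """Estimate d1..d4: the fraction of triangle abc's area lying in each
    quadrant of the relational layout's coordinate system."""
    counts = [0, 0, 0, 0]
    for _ in range(n):
        # Uniform point inside a triangle via reflected barycentric coordinates.
        u, v = random.random(), random.random()
        if u + v > 1.0:
            u, v = 1.0 - u, 1.0 - v
        x = a[0] + u * (b[0] - a[0]) + v * (c[0] - a[0])
        y = a[1] + u * (b[1] - a[1]) + v * (c[1] - a[1])
        if x >= 0.0:
            counts[0 if y >= 0.0 else 1] += 1  # right side: Q1 above, Q2 below
        else:
            counts[3 if y >= 0.0 else 2] += 1  # left side: Q4 above, Q3 below
    return [k / n for k in counts]

# First triangle from the worked example: origin, motion tip (4, 3), end (5, 7).
print(quadrant_shares((0.0, 0.0), (4.0, 3.0), (5.0, 7.0)))  # ~[1.0, 0.0, 0.0, 0.0]
```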
If the sum of the overlapping projection lengths exceeds the first preset threshold, the loudness is reduced; if not, the loudness is increased.
According to the percentages of the quadrants occupied by the triangle of the synthetic vector, the invention configures a tone and timbre matching the happiness or sadness associated with the four quadrants.
In the coordinate system, which is preset for each relational layout, the X-axis represents emotional happiness or sadness and the Y-axis represents positivity or negativity.
As shown in figs. 1 and 3, the avatar adaptation system of the present invention includes:
an input module for acquiring script data and capturing the action data and first keyword data in the script data;
a vector synthesis module for constructing a synthetic vector from the action data and the first keyword data and placing the synthetic vector in a relational layout; and
an output module for outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises:
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds a first threshold, and if so, expanding the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection.
By fusing the action data and the first keyword data of the script data into a single synthetic vector and mapping that vector onto a relational layout to obtain a position result, the invention can output a different avatar for different script data, so that the avatar better fits the content of the script and the user's viewing experience is improved.
Referring to figs. 2 and 3, as a further explanation of the invention, the judging unit judges whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, reduces the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection; it then judges whether the end point of the synthetic vector lies within a region of the relational layout, and if so, outputs the avatar corresponding to that region, while if not, outputs the avatar of the region whose midpoint is closest to the end point of the synthetic vector.
The invention can thus expand a region's range when the first threshold is exceeded and reduce it when the threshold is not exceeded. If, after reduction, the end point of the synthetic vector no longer falls inside any region, the nearest region is taken as its landing region, with "nearest" defined by the region midpoint closest to the end point of the synthetic vector; the influence of region size on the finally output avatar is thereby taken into account.
Referring to figs. 2 and 3, as a further explanation of the invention, the judging unit may judge whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, reduce the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection; it then judges whether the end point of the synthetic vector lies within a region of the relational layout, and if so, outputs the avatar corresponding to that region, while if not, outputs the avatar of the region whose edge is closest to the end point of the synthetic vector.
The invention can thus expand a region's range when the first threshold is exceeded and reduce it when the threshold is not exceeded. If, after reduction, the end point of the synthetic vector no longer falls inside any region, the nearest region is taken as its landing region, with "nearest" defined by the region edge closest to the end point of the synthetic vector; the influence of region size on the finally output avatar is thereby taken into account.
Referring to figs. 2 and 3, as a further explanation of the invention, outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies comprises the steps of:
judging whether the end point of the synthetic vector lies in two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; and if not, outputting the avatar corresponding to the region with the larger range.
The region in which the end point of the synthetic vector falls can be understood as follows. The regions of the relational layout are not necessarily mutually disjoint: each region is essentially a preset regular shape such as a rectangle, triangle, or circle, and to avoid blank positions belonging to no region, two or more regions may be configured to overlap at a given coordinate. The end point of the synthetic vector may therefore land in two or more regions of the relational layout, or otherwise in a single region, provided every position of the layout is covered by at least one region. If, in a special case, the end point lands in no region at all, the region whose midpoint or edge is nearest to the end point is taken as its region, and the avatar corresponding to that region is output.
Referring to figs. 2 and 3, as a further explanation of the invention, the step of outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies includes:
pre-storing, for the avatar, dialogue voice with a reference tone, a reference speed, and a reference loudness;
constructing a first triangle using the motion data vector, the first keyword vector, and the synthetic vector, and determining the percentage d1 of the first triangle's area that lies in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant;
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if the X-axis, outputting a synthesized tone according to the following formula: synthesized tone = reference tone x 2^((d1 + d2 - d3 - d4)/c), where c is an adjustment coefficient; if the Y-axis, outputting a synthesized speed according to the following formula: synthesized speed = reference speed x 2^((d1 + d4 - d2 - d3)/c), where c is an adjustment coefficient; and if the sum does not exceed the first threshold, outputting a synthesized loudness according to the following formula: synthesized loudness = reference loudness x e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed, and the synthesized loudness.
In this way, the proportions in which the first triangle, constructed from the motion data vector, the first keyword vector, and the synthetic vector, occupies the four quadrants of the coordinate system serve as adjustment indexes for the avatar's dialogue voice: intonation is adjusted to express whether the avatar is happy, speech speed to express whether the dialogue is positive, and loudness to express whether the avatar's attitude is affirmative.
The above examples are only illustrative of the preferred embodiments of the present invention and are not intended to limit the scope of the present invention, and various modifications and improvements made by those skilled in the art to the technical solution of the present invention should fall within the scope of protection defined by the claims of the present invention without departing from the spirit of the present invention.

Claims (4)

1. A method for adapting an avatar, characterized by comprising:
Acquiring script data;
capturing action data and first keyword data in the script data;
constructing a synthetic vector by utilizing the action data and the first keyword data, and configuring the synthetic vector in a relational layout;
outputting an avatar corresponding to the region where the synthetic vector is located in the relation layout;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises:
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds a first threshold, and if so, expanding the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection;
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, reducing the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection; and judging whether the end point of the synthetic vector lies within a region of the relational layout, if so, outputting the avatar corresponding to that region, and if not, outputting the avatar corresponding to the region of the relational layout whose midpoint is closest to the end point of the synthetic vector;
wherein outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies comprises the steps of:
judging whether the end point of the synthetic vector lies in two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; and if not, outputting the avatar corresponding to the region with the larger range;
wherein the step of outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies includes:
pre-storing, for the avatar, dialogue voice with a reference tone, a reference speed, and a reference loudness;
constructing a first triangle using the motion data vector, the first keyword vector, and the synthetic vector, and determining the percentage d1 of the first triangle's area that lies in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant;
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if the X-axis, outputting a synthesized tone according to the following formula: synthesized tone = reference tone x 2^((d1 + d2 - d3 - d4)/c), where c is an adjustment coefficient; if the Y-axis, outputting a synthesized speed according to the following formula: synthesized speed = reference speed x 2^((d1 + d4 - d2 - d3)/c), where c is an adjustment coefficient; and if the sum does not exceed the first threshold, outputting a synthesized loudness according to the following formula: synthesized loudness = reference loudness x e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed, and the synthesized loudness.
2. A method for adapting an avatar, characterized by comprising:
Acquiring script data;
capturing action data and first keyword data in the script data;
constructing a synthetic vector by utilizing the action data and the first keyword data, and configuring the synthetic vector in a relational layout;
outputting an avatar corresponding to the region where the synthetic vector is located in the relation layout;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises:
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds a first threshold, and if so, expanding the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection;
judging whether the sum of the projection length over which the motion data vector generated from the motion data and the first keyword data vector generated from the first keyword data overlap on the X-axis and the projection length over which they overlap on the Y-axis exceeds the first threshold, and if not, reducing the range of the relational layout region corresponding to whichever of the motion data vector and the first keyword vector has the longer overlapping projection; and judging whether the end point of the synthetic vector lies within a region of the relational layout, if so, outputting the avatar corresponding to that region, and if not, outputting the avatar corresponding to the region of the relational layout whose edge is closest to the end point of the synthetic vector;
wherein outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies comprises the steps of:
judging whether the end point of the synthetic vector lies in two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; and if not, outputting the avatar corresponding to the region with the larger range;
wherein the step of outputting the avatar corresponding to the region of the relational layout in which the synthetic vector lies includes:
pre-storing, for the avatar, dialogue voice with a reference tone, a reference speed, and a reference loudness;
constructing a first triangle using the motion data vector, the first keyword vector, and the synthetic vector, and determining the percentage d1 of the first triangle's area that lies in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant;
Judging whether the sum of the X-axis projection length of the motion data vector generated from the motion data and the Y-axis projection length of the first keyword data vector generated from the first keyword data exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if on the X-axis, outputting a synthesized tone according to the formula: synthesized tone = reference tone × […], where c is an adjustment coefficient; if on the Y-axis, outputting a synthesized speed according to the formula: synthesized speed = reference speed × […], where c is an adjustment coefficient; if the sum does not exceed the first threshold, outputting a synthesized loudness according to the formula: synthesized loudness = reference loudness × e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed and the synthesized loudness.
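A minimal sketch of the region adjustment and avatar selection described in the claim above, assuming axis-aligned rectangular layout regions. The Region class, the expand/shrink factors (1.2 and 0.8), and the caller-supplied mapping from the longer projection to its region are all assumptions; the claim does not specify any of them.

```python
import math

class Region:
    """Axis-aligned rectangular region of the relational layout (assumed shape)."""
    def __init__(self, avatar: str, x0: float, y0: float, x1: float, y1: float):
        self.avatar = avatar
        self.x0, self.y0, self.x1, self.y1 = x0, y0, x1, y1
        self.scaled = False  # set once the region has been expanded or reduced

    def scale(self, factor: float) -> None:
        # Expand (factor > 1) or reduce (factor < 1) the range about the centre.
        cx, cy = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        hw = (self.x1 - self.x0) / 2 * factor
        hh = (self.y1 - self.y0) / 2 * factor
        self.x0, self.x1, self.y0, self.y1 = cx - hw, cx + hw, cy - hh, cy + hh
        self.scaled = True

    def contains(self, p) -> bool:
        return self.x0 <= p[0] <= self.x1 and self.y0 <= p[1] <= self.y1

    def area(self) -> float:
        return (self.x1 - self.x0) * (self.y1 - self.y0)

    def edge_distance(self, p) -> float:
        # Distance from p to the region boundary; 0 if p lies inside.
        dx = max(self.x0 - p[0], 0.0, p[0] - self.x1)
        dy = max(self.y0 - p[1], 0.0, p[1] - self.y1)
        return math.hypot(dx, dy)

def select_avatar(regions, longer_proj_region, x_proj, y_proj,
                  endpoint, first_threshold,
                  expand=1.2, shrink=0.8):   # factors are assumptions
    """regions: all Region objects of the layout; longer_proj_region: the
    region tied to the longer projection (mapping supplied by the caller)."""
    longer_proj_region.scale(expand if x_proj + y_proj > first_threshold
                             else shrink)
    hits = [r for r in regions if r.contains(endpoint)]
    if not hits:
        # End point in no region: fall back to the nearest region edge.
        return min(regions, key=lambda r: r.edge_distance(endpoint)).avatar
    if len(hits) == 1:
        return hits[0].avatar
    # Two or more regions: prefer the larger among the expanded/reduced
    # regions, otherwise simply the larger region.
    pool = [r for r in hits if r.scaled] or hits
    return max(pool, key=lambda r: r.area()).avatar
```

Under these assumptions the tie-breaking mirrors the claim: regions whose range was expanded or reduced take precedence, with area as the final comparison.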
3. An avatar adaptation system, characterized by comprising:
an input module for acquiring script data and capturing action data and first keyword data in the script data;
a vector synthesis module for constructing a synthetic vector using the action data and the first keyword data and configuring the synthetic vector in a relational layout;
an output module for outputting the avatar corresponding to the region of the relational layout in which the synthetic vector is located;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises
Judging whether the sum of the X-axis projection length of the motion data vector generated from the motion data and the Y-axis projection length of the first keyword data vector generated from the first keyword data exceeds a first threshold; if so, expanding the range of the relational layout region corresponding to the longer of the two projection lengths;
judging whether the sum of the X-axis projection length of the motion data vector and the Y-axis projection length of the first keyword data vector exceeds the first threshold, and if not, reducing the range of the relational layout region corresponding to the longer of the two projection lengths; judging whether the end point of the synthetic vector lies within a region of the relational layout; if so, outputting the avatar corresponding to that region; if not, outputting the avatar corresponding to the region of the relational layout whose midpoint is closest to the end point of the synthetic vector;
Outputting the avatar corresponding to the region of the relational layout in which the synthetic vector is located comprises the steps of:
judging whether the end point of the synthetic vector falls within two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region in which it falls; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; if not, outputting the avatar corresponding to the region with the larger range;
the step of outputting the avatar corresponding to the region of the relational layout in which the synthetic vector is located includes:
pre-storing dialogue voice with a reference tone, a reference speed, and a reference loudness corresponding to the avatar;
constructing a first triangle from the motion data vector, the first keyword vector, and the synthetic vector, and obtaining the percentage d1 of the first triangle's area lying in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant (a quadrant-percentage sketch follows this claim);
Judging whether the sum of the X-axis projection length of the motion data vector generated from the motion data and the Y-axis projection length of the first keyword data vector generated from the first keyword data exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if on the X-axis, outputting a synthesized tone according to the formula: synthesized tone = reference tone × […], where c is an adjustment coefficient; if on the Y-axis, outputting a synthesized speed according to the formula: synthesized speed = reference speed × […], where c is an adjustment coefficient; if the sum does not exceed the first threshold, outputting a synthesized loudness according to the formula: synthesized loudness = reference loudness × e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed and the synthesized loudness.
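The quadrant percentages d1–d4 used by the claims can be obtained by clipping the first triangle against each quadrant of the layout's coordinate system. The sketch below uses Sutherland–Hodgman clipping plus the shoelace formula; taking the triangle's vertices to be the tips of the motion, first-keyword, and synthetic vectors is an assumption, since the claims only say the triangle is constructed using the three vectors.

```python
# Percentage of the first triangle's area falling in each quadrant
# (d1..d4), via Sutherland-Hodgman clipping and the shoelace formula.

def _clip(poly, inside, cut):
    """Clip polygon `poly` against one half-plane; `cut` returns the
    boundary intersection of an edge crossing the half-plane."""
    out = []
    for i, cur in enumerate(poly):
        prev = poly[i - 1]
        if inside(cur):
            if not inside(prev):
                out.append(cut(prev, cur))
            out.append(cur)
        elif inside(prev):
            out.append(cut(prev, cur))
    return out

def _area(poly):
    if len(poly) < 3:
        return 0.0
    s = sum(x0 * y1 - x1 * y0
            for (x0, y0), (x1, y1) in zip(poly, poly[1:] + poly[:1]))
    return abs(s) / 2.0

def _cut_x(p, q):   # intersection with the line x = 0
    t = -p[0] / (q[0] - p[0])
    return (0.0, p[1] + t * (q[1] - p[1]))

def _cut_y(p, q):   # intersection with the line y = 0
    t = -p[1] / (q[1] - p[1])
    return (p[0] + t * (q[0] - p[0]), 0.0)

def quadrant_percentages(triangle):
    """triangle: three (x, y) vertices. Returns (d1, d2, d3, d4)."""
    total = _area(triangle)
    if total == 0.0:
        return (0.0, 0.0, 0.0, 0.0)
    ds = []
    for sx, sy in [(1, 1), (-1, 1), (-1, -1), (1, -1)]:  # quadrants I..IV
        part = _clip(triangle, lambda p: sx * p[0] >= 0, _cut_x)
        part = _clip(part, lambda p: sy * p[1] >= 0, _cut_y)
        ds.append(_area(part) / total)
    return tuple(ds)

# Example (assumed vertices: tips of the motion, first keyword, and
# synthetic vectors): a triangle straddling all four quadrants.
print(quadrant_percentages([(2.0, 1.0), (-1.0, 1.5), (0.5, -2.0)]))
```

The four percentages always sum to 1 when the triangle lies entirely within the coordinate plane, which makes them a convenient weighting for the tone/speed adjustment that follows.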
4. An avatar adaptation system, characterized by comprising:
an input module for acquiring script data and capturing action data and first keyword data in the script data;
a vector synthesis module for constructing a synthetic vector using the action data and the first keyword data and configuring the synthetic vector in a relational layout;
an output module for outputting the avatar corresponding to the region of the relational layout in which the synthetic vector is located;
wherein outputting the avatar corresponding to the position of the synthetic vector in the relational layout comprises
Judging whether the sum of the X-axis projection length of the motion data vector generated from the motion data and the Y-axis projection length of the first keyword data vector generated from the first keyword data exceeds a first threshold; if so, expanding the range of the relational layout region corresponding to the longer of the two projection lengths;
judging whether the sum of the X-axis projection length of the motion data vector and the Y-axis projection length of the first keyword data vector exceeds the first threshold, and if not, reducing the range of the relational layout region corresponding to the longer of the two projection lengths; judging whether the end point of the synthetic vector lies within a region of the relational layout; if so, outputting the avatar corresponding to that region; if not, outputting the avatar corresponding to the region of the relational layout whose edge is closest to the end point of the synthetic vector;
Outputting the avatar corresponding to the region of the relational layout in which the synthetic vector is located comprises the steps of:
judging whether the end point of the synthetic vector falls within two or more regions of the relational layout; if not, outputting the avatar corresponding to the single region in which it falls; if so, judging whether any of those regions has had its range expanded or reduced; if so, outputting the avatar corresponding to the larger of the expanded or reduced regions; if not, outputting the avatar corresponding to the region with the larger range;
the step of outputting the avatar corresponding to the region of the relational layout in which the synthetic vector is located includes:
pre-storing dialogue voice with a reference tone, a reference speed, and a reference loudness corresponding to the avatar;
constructing a first triangle from the motion data vector, the first keyword vector, and the synthetic vector, and obtaining the percentage d1 of the first triangle's area lying in the first quadrant of the relational layout's coordinate system, the percentage d2 in the second quadrant, the percentage d3 in the third quadrant, and the percentage d4 in the fourth quadrant;
Judging whether the sum of the X-axis projection length of the motion data vector generated from the motion data and the Y-axis projection length of the first keyword data vector generated from the first keyword data exceeds the first threshold; if so, judging whether the projection exceeding the first threshold lies on the X-axis or the Y-axis; if on the X-axis, outputting a synthesized tone according to the formula: synthesized tone = reference tone × […], where c is an adjustment coefficient; if on the Y-axis, outputting a synthesized speed according to the formula: synthesized speed = reference speed × […], where c is an adjustment coefficient; if the sum does not exceed the first threshold, outputting a synthesized loudness according to the formula: synthesized loudness = reference loudness × e, where e is a loudness coefficient;
and outputting the dialogue voice of the avatar according to the synthesized tone, the synthesized speed and the synthesized loudness.
CN202311799886.6A 2023-12-26 2023-12-26 Virtual image adaptation method and system Active CN117475049B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311799886.6A CN117475049B (en) 2023-12-26 2023-12-26 Virtual image adaptation method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311799886.6A CN117475049B (en) 2023-12-26 2023-12-26 Virtual image adaptation method and system

Publications (2)

Publication Number Publication Date
CN117475049A CN117475049A (en) 2024-01-30
CN117475049B true CN117475049B (en) 2024-03-08

Family

ID=89636497

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311799886.6A Active CN117475049B (en) 2023-12-26 2023-12-26 Virtual image adaptation method and system

Country Status (1)

Country Link
CN (1) CN117475049B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019200584A1 (en) * 2018-04-19 2019-10-24 Microsoft Technology Licensing, Llc Generating response in conversation
CN111785246A (en) * 2020-06-30 2020-10-16 联想(北京)有限公司 Virtual character voice processing method and device and computer equipment
CN116312456A (en) * 2023-01-06 2023-06-23 北京红棉小冰科技有限公司 Voice dialogue script generation method and device and electronic equipment
CN116805963A (en) * 2023-06-30 2023-09-26 杭州海康威视数字技术股份有限公司 Virtual scene light source generation method, device, equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI521469B (en) * 2012-06-27 2016-02-11 Reallusion Inc Two - dimensional Roles Representation of Three - dimensional Action System and Method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019200584A1 (en) * 2018-04-19 2019-10-24 Microsoft Technology Licensing, Llc Generating response in conversation
CN110998725A (en) * 2018-04-19 2020-04-10 微软技术许可有限责任公司 Generating responses in a conversation
CN111785246A (en) * 2020-06-30 2020-10-16 联想(北京)有限公司 Virtual character voice processing method and device and computer equipment
CN116312456A (en) * 2023-01-06 2023-06-23 北京红棉小冰科技有限公司 Voice dialogue script generation method and device and electronic equipment
CN116805963A (en) * 2023-06-30 2023-09-26 杭州海康威视数字技术股份有限公司 Virtual scene light source generation method, device, equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Deep Learning Based Real-time Daily Human Activity Recognition and Its Implementation in a Smartphone; Tsige Tadesse Alemayoh; 2019 16th International Conference on Ubiquitous Robots (UR); 2019-07-25; full text *
Haptic behavior of a virtual 3D paintbrush based on variable stiffness and force feedback; 黄磊; 侯增选; 李楠楠; 张迪靖; 苏金辉; Journal of Hunan University (Natural Sciences); 2020-08-25 (08); full text *
Key technologies for interaction between humans and virtual characters in virtual environments; 翁冬冬; 薛雅琼; ZTE Technology Journal; (06); full text *

Also Published As

Publication number Publication date
CN117475049A (en) 2024-01-30

Similar Documents

Publication Publication Date Title
US6307576B1 (en) Method for automatically animating lip synchronization and facial expression of animated characters
WO2017168870A1 (en) Information processing device and information processing method
CN108146360A (en) Method, apparatus, mobile unit and the readable storage medium storing program for executing of vehicle control
CN109410973B (en) Sound changing processing method, device and computer readable storage medium
CN106486121A (en) It is applied to the voice-optimizing method and device of intelligent robot
US20150256930A1 (en) Masking sound data generating device, method for generating masking sound data, and masking sound data generating system
CN105468582B (en) A kind of method and device for correcting of the numeric string based on man-machine interaction
CN108055617A (en) A kind of awakening method of microphone, device, terminal device and storage medium
JP4588531B2 (en) SEARCH DEVICE, PROGRAM, AND SEARCH METHOD
CN109377979B (en) Method and system for updating welcome language
CN117475049B (en) Virtual image adaptation method and system
CN113448433A (en) Emotion responsive virtual personal assistant
KR20200145776A (en) Method, apparatus and program of voice correcting synthesis
CN117558259A (en) Digital man broadcasting style control method and device
US7219061B1 (en) Method for detecting the time sequences of a fundamental frequency of an audio response unit to be synthesized
CN115019817A (en) Voice awakening method and device, electronic equipment and storage medium
CN109977411B (en) Data processing method and device and electronic equipment
KR102114365B1 (en) Speech recognition method and apparatus
JP6343895B2 (en) Voice control device, voice control method and program
CN114734942A (en) Method and device for adjusting sound effect of vehicle-mounted sound equipment
CN114168713A (en) Intelligent voice AI pacifying method
JP6566076B2 (en) Speech synthesis method and program
JP3464435B2 (en) Speech synthesizer
CN116246633B (en) Wireless intelligent Internet of things conference system
US20050250554A1 (en) Method for eliminating musical tone from becoming wind shear sound

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant