CN116962788A - Animation expression generating method, equipment and readable storage medium - Google Patents

Animation expression generating method, equipment and readable storage medium

Info

Publication number
CN116962788A
CN116962788A (Application No. CN202310952281.XA)
Authority
CN
China
Prior art keywords
text
barrage
feature vector
video
information
Prior art date
Legal status
Pending
Application number
CN202310952281.XA
Other languages
Chinese (zh)
Inventor
李怀德
刘一民
王琦
潘兴浩
谢于贵
Current Assignee
China Mobile Communications Group Co Ltd
MIGU Video Technology Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Video Technology Co Ltd
MIGU Culture Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, MIGU Video Technology Co Ltd and MIGU Culture Technology Co Ltd
Priority to CN202310952281.XA
Publication of CN116962788A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/488 Data services, e.g. news ticker
    • H04N21/4884 Data services, e.g. news ticker for displaying subtitles

Abstract

The invention provides an animation expression generating method, a device and a readable storage medium. The method comprises: obtaining the association between text barrage information and video content information according to the video content information and the text barrage information in the video information; determining a first category of the animation expression to be generated according to the attribute of a first text barrage, the first text barrage being any text barrage in the text barrage information; and generating a target animation expression according to the association between the text barrage information and the video content information and the first category. By mining the association between the text barrage information and the video content information, the scheme automatically generates animation expressions and combines text barrages with animation expressions, which makes video watching more interesting and improves the user's experience when watching the video.

Description

Animation expression generating method, equipment and readable storage medium
Technical Field
The present invention relates to the field of video playing technologies, and in particular, to an animation expression generating method, an animation expression generating device, and a readable storage medium.
Background
At present, when a user watches a video and wants to express an opinion about the content, the user can do so by sending a barrage. A bullet screen (barrage) is a comment caption that pops up on the video playing interface during playback. This form of interaction not only lets users express how they feel about a program and increases their sense of participation, but also lets them see other users' bullet-screen comments on the program, increasing interaction while watching.
However, current video barrages only display text, and much of the barrage content is irrelevant to the video content, so video watching lacks interest and the user's viewing experience is poor.
Disclosure of Invention
The invention aims to provide an animation expression generating method, equipment and a readable storage medium, which can combine a text barrage with an animation expression so as to improve the experience of a user watching a video.
In order to solve the above technical problems, an embodiment of the present invention provides an animation expression generating method, including:
according to video content information and text barrage information in the video information, obtaining the relevance between the text barrage information and the video content information;
determining a first category of an animation expression to be generated according to the attribute of a first text barrage, wherein the first text barrage is any text barrage in the text barrage information;
and generating a target animation expression according to the correlation between the text barrage information and the video content information and the first category.
The method for obtaining the association between the video content information and the text barrage information according to the video content information and the text barrage information in the video information comprises the following steps:
extracting video content information and text barrage information in the video information;
obtaining a video feature vector according to the video content information;
obtaining bullet screen feature vectors according to the text bullet screen information;
obtaining an association matrix according to the video feature vector and the barrage feature vector; the association matrix is arranged with the barrage elements of the barrage feature vector along the horizontal direction and the video elements of the video feature vector along the vertical direction; each element in the association matrix is used for representing the association between the barrage element and the video element at the corresponding position.
The obtaining the association matrix according to the video feature vector and the barrage feature vector comprises the following steps:
acquiring a pre-established prior matrix related to the barrage and the video;
and calculating the value of each element in the association matrix according to the prior matrix and the preset optimal transmission distance to obtain the association matrix.
Wherein the generating a target animation expression according to the association between the video content information and the text bullet screen information and the first category includes:
according to the association matrix, obtaining an association value corresponding to the first text barrage;
and obtaining, from the animation expressions of the first category, a target animation expression whose aesthetic value matches the association value.
Wherein the method further comprises:
calculating a first running track of the target animation expression according to the position information of the first text barrage corresponding to the target animation expression and the position information of a first target in video content information related to the first text barrage;
and controlling the target animation expression to move from the position of the first text barrage to the position of the first target according to the first running track.
Wherein the method further comprises:
receiving a first input of a user to the target animation expression;
if the first text barrage corresponding to the target animation expression is not displayed on the screen, responding to the first input, and displaying the first text barrage;
and if the first text barrage corresponding to the target animation expression is displayed on the screen, responding to the first input, and controlling the target animation expression to move from the position of the first target to the position of the first text barrage according to a second running track.
Wherein, the obtaining the video feature vector according to the video content information includes:
extracting targets from the video content information, and carrying out feature coding on each extracted target to obtain a first feature vector of each target;
performing feature coding on each pre-stored text name to obtain a second feature vector of each text name;
and obtaining the video feature vector based on each first feature vector and each second feature vector, wherein the video feature vector is used for representing a target and entity names corresponding to the target.
Wherein said obtaining said video feature vector based on each said first feature vector and each said second feature vector comprises:
obtaining an association vector according to each first feature vector and each second feature vector, wherein the association vector is used for representing the association between the target and the text name;
obtaining a third feature vector of each target according to the association vector and each first feature vector;
and obtaining the video feature vector according to each third feature vector and each second feature vector.
The embodiment of the invention also provides an animation expression generating device which comprises a memory, a processor and a program which is stored in the memory and can run on the processor; and the processor realizes the animation expression generating method when executing the program.
The embodiment of the invention also provides a readable storage medium, on which a program is stored, which when executed by a processor, implements the steps in the animation expression generation method.
The technical scheme of the invention has the following beneficial effects:
In the scheme, the association between the text barrage information and the video content information is obtained according to the video content information and the text barrage information in the video information; a first category of the animation expression to be generated is determined according to the attribute of a first text barrage, the first text barrage being any text barrage in the text barrage information; and a target animation expression is generated according to the association between the text barrage information and the video content information and the first category. In this way, by mining the association between the text barrage information and the video content information, animation expressions are generated automatically and text barrages are combined with animation expressions, which makes video watching more interesting and improves the user's experience when watching the video.
Drawings
FIG. 1 is a flowchart of an animation expression generation method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating analysis of determining attributes of a text bullet screen according to an embodiment of the present invention;
FIG. 3 is a schematic diagram showing correspondence between association values and aesthetic values of an animation expression in an association matrix according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an interface of video content information according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of a video feature vector according to the present invention;
FIG. 6 is a schematic diagram of the implementation principle of an association matrix according to an embodiment of the present invention;
FIG. 7 is a motion diagram of an animated expression according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an animation expression generating device according to an embodiment of the invention.
Detailed Description
In order to make the technical problems to be solved, the technical solutions and the advantages of the present invention more apparent, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
Aiming at the problems of lack of interest in video viewing and poor viewing experience of users in the prior art, the invention provides an animation expression generating method, equipment and a readable storage medium.
The method for generating the animation expression provided by the embodiment of the invention is further described below.
Fig. 1 is a schematic flow chart of an animation expression generating method according to an embodiment of the present invention. The method may include:
step 101, obtaining the relevance between the text barrage information and the video content information according to the video content information and the text barrage information in the video information;
the method specifically comprises the following steps:
step 1011, extracting video content information and text barrage information in the video information;
here, the video content information may be understood as image content in the video information.
Step 1012, obtaining a video feature vector according to the video content information;
specifically, a video feature vector can be obtained based on a feature vector obtained by feature encoding a target identified in video content information and a feature vector obtained by feature encoding a priori text name, wherein the video feature vector is used for representing the target and an entity name corresponding to the target. The specific implementation process can be described in the following examples.
Step 1013, obtaining a barrage feature vector according to the text barrage information;
the feature vector of the barrage can be obtained by carrying out feature coding on the key words of each barrage in the text barrage information. The specific implementation process can be described in the following examples.
Step 1014, obtaining an association matrix according to the video feature vector and the barrage feature vector; the association matrix is arranged with the barrage elements of the barrage feature vector along the horizontal direction and the video elements of the video feature vector along the vertical direction; each element in the association matrix is used for representing the association between the barrage element and the video element at the corresponding position.
It should be noted that, the association matrix is used to characterize the association between the text bullet screen information and the video content information.
Step 102, determining a first category of an animation expression to be generated according to an attribute of a first text barrage, wherein the first text barrage is any text barrage in the text barrage information;
It should be noted that different attributes of a text barrage correspond to different categories of animation expressions, and animation expressions of the different categories are pre-stored in the device that performs the method of the invention.
Here, to determine the attribute of the first text barrage, the first text barrage is parsed into a structured syntax tree. Fig. 2 shows such a structured syntax tree parsed according to fixed rules, in which ADV denotes an adverbial structure, ATT an attributive (modifier) relationship, HED the head (core) relationship, SBV a subject-verb relationship, and RAD a right-adjunct relationship. In the example of Fig. 2, the first text barrage "Zhang San is awesome" is decomposed into a subject-predicate relationship and so on, with the root node pointing to the core word "awesome", indicating that "awesome" is the core of the entire sentence. The sentence components pointed to by edges starting from the core word each depend on it through some dependency relationship. The grammatical structure of the sentence can therefore be analysed with the core word as a clue, and sentiment units are extracted from the text barrage according to predefined rules, where a sentiment unit comprises the comment object, sentiment words, negation words, degree adverbs and the like corresponding to the text. The attribute of the text barrage is one of positive, neutral and negative.
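As a minimal illustration of this step, the following Python sketch classifies a barrage as positive, neutral or negative from a generic dependency-parse result (a list of (word, head, relation) triples). The parse format, the tiny sentiment and negation lexicons and the scoring rule are illustrative assumptions standing in for the pre-defined rule base; they are not taken from the patent.

# Hedged sketch: classify a text barrage as positive / neutral / negative
# from a dependency parse.  Lexicons and labels below are placeholders.

POSITIVE_WORDS = {"awesome", "great", "brilliant"}   # hypothetical lexicon
NEGATIVE_WORDS = {"terrible", "boring", "awful"}     # hypothetical lexicon
NEGATION_WORDS = {"not", "never"}                    # hypothetical lexicon

def barrage_attribute(parse):
    """parse: list of (word, head_index, relation) triples, e.g.
    [("Zhang San", 3, "SBV"), ("is", 3, "ADV"), ("awesome", 0, "HED")]"""
    score = 0
    negated = False
    for word, head, rel in parse:
        w = word.lower()
        if w in NEGATION_WORDS:
            negated = True
        elif w in POSITIVE_WORDS:
            score += 1
        elif w in NEGATIVE_WORDS:
            score -= 1
    if negated:
        score = -score                  # a negation word flips the polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Example: "Zhang San is awesome" -> "positive"
print(barrage_attribute([("Zhang San", 3, "SBV"), ("is", 3, "ADV"),
                         ("awesome", 0, "HED")]))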
And step 103, generating a target animation expression according to the correlation between the text barrage information and the video content information and the first category.
The method specifically comprises the following steps:
step 1031, obtaining an association value corresponding to the first text bullet screen according to the association matrix;
step 1032, obtaining, from the animation expressions of the first category, a target animation expression whose aesthetic value matches the association value.
The higher the association value, the more closely the text barrage is considered to be tied to the video content, and the more eye-catching and elaborate the corresponding animation expression should be.
Here, the specific procedure for evaluating the aesthetic degree of the animation expression is as follows:
(1) Computing the phase consistency feature
The striking, vivid appearance of an animation expression is usually reflected in its local features and colour saliency, which are the parts to which human visual perception responds most strongly. Therefore, the phase consistency feature is first extracted for the animation expression.
For any one-dimensional signal g (x), the phase consistency feature is calculated as follows:
wherein, the liquid crystal display device comprises a liquid crystal display device,and->Is a log-Gabor filter, e n (x),o n (x) The values obtained using the two filters, respectively, and then the transformed amplitude A is calculated n (x):
Value of phase consistency characteristic PC x The calculation mode of (2) is as follows:
For a two-dimensional image, the filters are applied in both the horizontal and vertical directions.
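The following numpy sketch computes the phase consistency feature of a one-dimensional signal with log-Gabor filters, following the standard formulation reconstructed above; the filter parameters (number of scales, minimum wavelength, bandwidth) are illustrative defaults rather than values specified by the patent.

import numpy as np

def phase_congruency_1d(g, n_scales=4, min_wavelength=3.0, mult=2.0,
                        sigma_on_f=0.55, eps=1e-4):
    """Phase consistency PC(x) of a 1-D signal using log-Gabor filters."""
    N = len(g)
    G = np.fft.fft(g)
    freqs = np.fft.fftfreq(N)              # signed frequencies in cycles/sample
    abs_f = np.abs(freqs)
    abs_f[0] = 1.0                         # avoid log(0); DC is zeroed below

    sum_e, sum_o, sum_amp = np.zeros(N), np.zeros(N), np.zeros(N)
    wavelength = float(min_wavelength)
    for _ in range(n_scales):
        f0 = 1.0 / wavelength
        log_gabor = np.exp(-(np.log(abs_f / f0) ** 2)
                           / (2 * np.log(sigma_on_f) ** 2))
        log_gabor[0] = 0.0                 # remove the DC component
        # One-sided (analytic) filter: the inverse FFT then gives the even
        # response as the real part and the odd response as the imaginary part.
        analytic = np.where(freqs > 0, 2.0 * log_gabor, 0.0)
        resp = np.fft.ifft(G * analytic)
        e_n, o_n = resp.real, resp.imag
        sum_e += e_n
        sum_o += o_n
        sum_amp += np.sqrt(e_n ** 2 + o_n ** 2)   # A_n(x)
        wavelength *= mult
    energy = np.sqrt(sum_e ** 2 + sum_o ** 2)
    return energy / (sum_amp + eps)               # PC(x), roughly in [0, 1]

# Example usage on a simple test signal.
sig = np.sin(np.linspace(0, 8 * np.pi, 256))
pc = phase_congruency_1d(sig)
print(pc.shape, float(pc.max()))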
(2) Computing gradient features
After the phase consistency feature is calculated, the positions to which human attention is most drawn can be determined, and gradient features are then computed at these positions to capture the fine-grained variation of the animation expression.
The gradient feature is calculated as follows:
T(x) = sqrt(T_x(x)^2 + T_y(x)^2)
where T(x) denotes the gradient, T_x(x) the gradient in the x-axis direction, and T_y(x) the gradient in the y-axis direction.
(3) Aesthetic value calculation of animation expression
After the phase consistency characteristic and the gradient characteristic are obtained, the aesthetic value of each animation expression can be calculated.
Aes = α·T(x) + (1 − α)·PC(x)
where α represents a weighting factor.
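A minimal numpy sketch of this aesthetic-value computation is given below; averaging the per-pixel values into a single score per expression and normalising the gradient term are assumptions made for illustration only.

import numpy as np

def aesthetic_value(image, pc_map, alpha=0.5):
    """Aes = alpha * T(x) + (1 - alpha) * PC(x), averaged over the expression.

    image:  2-D grayscale array of the animation expression frame
    pc_map: 2-D phase-consistency map of the same shape
    alpha:  weighting factor between the gradient and phase-consistency terms
    """
    t_y, t_x = np.gradient(image.astype(float))     # T_y(x), T_x(x)
    grad = np.sqrt(t_x ** 2 + t_y ** 2)             # T(x) = sqrt(T_x^2 + T_y^2)
    if grad.max() > 0:                              # put both terms on a [0, 1] scale
        grad = grad / grad.max()
    aes_map = alpha * grad + (1.0 - alpha) * pc_map
    return float(aes_map.mean())                    # one aesthetic value per expression

# Example with a toy expression image and a placeholder phase-consistency map.
img = np.zeros((32, 32)); img[8:24, 8:24] = 1.0
print(aesthetic_value(img, np.zeros_like(img), alpha=0.7))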
Finally, the animation expression likely to be selected under the given association value is estimated; that is, the target animation expression whose aesthetic value matches the association value is obtained from the animation expressions of the first category according to
P{Aes | U(t, v)}
where P(·) is a conditional probability that is solved by a Bayesian method.
As shown in Fig. 3, when the association value is 0.9, the most celebratory "like" expression (expression 1) is selected; when the association value is 0.6, an ordinary "like" expression (expression 2) is selected.
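The patent estimates the selection through the conditional probability P{Aes | U(t, v)} solved by a Bayesian method; the simple nearest-aesthetic-value rule below is only a hedged stand-in for that matching step, with hypothetical expression identifiers and values.

def pick_target_expression(association_value, candidates):
    """candidates: list of (expression_id, aesthetic_value) for the first
    category, with aesthetic values normalised to [0, 1].  Returns the
    expression whose aesthetic value is closest to the association value,
    so a barrage tied more closely to the video gets a more elaborate
    expression."""
    return min(candidates, key=lambda c: abs(c[1] - association_value))[0]

# Example (hypothetical values): a 0.9 association picks the most elaborate expression.
print(pick_target_expression(0.9, [("expression_1", 0.95), ("expression_2", 0.6)]))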
It should be noted that decomposing the video content information involves two processes: identifying targets in the video content and matching each target to an entity name. As shown in Fig. 4, target C and target D in the video content information need to be matched to actual entity names; that is, it is necessary not only to determine the position of a target in the video content information but also to associate the target with an actual entity name. In Fig. 4, the left target C is Zhang San and the right target D is Li Si, and these correspondences need to be established. To achieve this, in an optional embodiment, step 1012 of obtaining the video feature vector according to the video content information may specifically include:
1) Extracting targets from the video content information, and carrying out feature coding on each extracted target to obtain a first feature vector of each target;
wherein, first, the target in the video content information is identified (namely, the position of the target in the video content information is determined), and then the target is extracted; and finally, carrying out feature coding on each extracted target to obtain a first feature vector of each target, wherein the first feature vector can be represented by V.
2) Performing feature coding on each pre-stored text name to obtain a second feature vector of each text name;
here, the a priori text name is a text name related to the video content information. The second feature vector may be denoted by T.
3) And obtaining the video feature vector based on each first feature vector and each second feature vector, wherein the video feature vector is used for representing a target and entity names corresponding to the target.
The method specifically comprises the following steps:
3-1) obtaining an association vector according to each first feature vector and each second feature vector, wherein the association vector is used for representing the association between the target and the text name;
in order to avoid the condition that the correlation between the first feature vector and the second feature vector is smaller or irrelevant, whether the first feature vector is correlated with the second feature vector or not is judged through the first convolutional neural network.
Specifically, after each first feature vector and each second feature vector are respectively input into a first convolutional neural network, classification is performed through a full connection layer FC to generate a classification vector, and then the classification vector is passed through a gating function G to score elements in the classification vector, and a correlation vector R is obtained after passing through the gating function G.
Here, the first convolutional neural network is used to analyze the relevance of the object in the video content information to the text name.
Specifically, the gating function G produces the correlation vector
R = (x_{i,j} = r)
where x_{i,j} denotes the element in row i and column j of the vector, and r is the relationship probability obtained through the gating function, which normalises the scores over the k elements of the classification vector, where k represents the number of elements in the vector.
The principle of the gating function can be illustrated by Fig. 4, in which the image background contains a large amount of identification information. After the targets in the video content information are detected, the gating function determines which persons in the video content information matter: Zhang San (target C) and Li Si (target D) are the main information in the whole frame, while the remaining identification information has a low degree of correlation with the video content information.
3-2) obtaining a third feature vector of each target according to the association vector and each first feature vector;
specifically, the association vector R is multiplied by each first feature vector V to obtain a third feature vector R' of each target. At this time, the third feature vector may describe the degree of correlation of all objects in the video content information with the text name.
3-3) obtaining the video feature vector according to each third feature vector and each second feature vector.
Specifically, each third feature vector R' and each second feature vector T are respectively input into the second convolutional neural network to obtain the video feature vector.
The network structure of the second convolutional neural network is completely consistent with that of the first convolutional neural network, and weights are shared. Here, the whole implementation process of this embodiment can be seen in fig. 5.
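A highly simplified sketch of this target-to-name association is given below. It replaces the two weight-sharing convolutional networks with a plain dot-product similarity, uses a softmax as the gating function, and pairs each target with its highest-scoring name; the array shapes, the 0.5 relevance threshold and the scaling used for the third feature vectors are assumptions for illustration only.

import numpy as np

def softmax(scores, axis=-1):
    z = scores - scores.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def build_video_feature_pairs(V, T, min_relevance=0.5):
    """V: (num_targets, d) first feature vectors of the detected targets.
    T: (num_names, d)   second feature vectors of the pre-stored text names.
    Returns (target index, name index, relevance-weighted features) for each
    target that is sufficiently related to some entity name."""
    scores = V @ T.T                  # stand-in for the first CNN + fully connected layer
    R = softmax(scores, axis=1)       # gating function G: relation probabilities r
    relevance = R.max(axis=1)         # strength of each target's best name match
    third = relevance[:, None] * V    # third feature vectors R' (relevance-weighted V)
    pairs = []
    for i, rel in enumerate(relevance):
        if rel >= min_relevance:      # drop background targets with weak name relations
            pairs.append((i, int(R[i].argmax()), third[i]))
    return pairs

# Example with random features for two targets and two names (e.g. "Zhang San", "Li Si").
rng = np.random.default_rng(0)
print(build_video_feature_pairs(rng.normal(size=(2, 8)), rng.normal(size=(2, 8))))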
In an optional embodiment, step 1013, obtaining the barrage feature vector according to the text barrage information may include:
(1) Word segmentation processing is carried out on the text barrage information to obtain a plurality of words;
(2) Determining keywords of each barrage in the text barrage information based on the plurality of words;
It should be noted that word segmentation of the text barrage information yields a plurality of words, i.e., a sequence of words, and some of these words are of low importance, so keywords need to be identified among them.
Wherein, this step may specifically include:
for each barrage in the text barrage information, the following steps are executed:
(2-1) obtaining a first number of a first word in the barrage, wherein the first word is one of the plurality of words and is a word whose number of occurrences in the barrage is greater than a preset number of times;
as can be seen from the above description, the first word is a high-frequency word with a higher occurrence number in a bullet screen.
(2-2) obtaining a second number of the first word in a first barrage set, the first barrage set including all or part of the barrages;
Here, the second number R_w is calculated from n and u, where n represents the number of barrages contained in the first barrage set and u represents the number of barrages in the first barrage set that contain the first word (for example, in the inverse-document-frequency form R_w = log(n / u)).
It should be noted that determining the keyword generally requires counting how many of all the barrages contain the first word, i.e., the first barrage set includes all the barrages; to improve computational efficiency, the first barrage set may instead include only part of the barrages, which can be determined by the following embodiment:
the method further comprises, before obtaining the second number of the first words in the first bullet screen set:
sorting all the barrages by length, and dividing the ranking into intervals of N consecutive ranks each to obtain a plurality of interval sets, wherein N is a positive integer greater than 1;
selecting half of the barrages from each interval set to form the first barrage set.
For example, suppose there are 30 barrages in total and N = 10. After the barrages are ranked from longest to shortest, ranks 1-10 form the first interval set, ranks 11-20 the second, and ranks 21-30 the third, giving 3 interval sets. Half of the barrages are then randomly selected from each interval set, i.e., 5 barrages per set, so that 15 barrages in total form the first barrage set (a sketch of this sampling, together with the keyword scoring, follows step (3) below). Halving the number of barrages in this way correspondingly improves the computational efficiency.
(2-3) performing a product operation on the first number and the second number to obtain a target value;
Here, P_w denotes the first number and R_w the second number; the target value F_w can be calculated by the following formula:
F_w = P_w · R_w
and (2-4) if the target value is greater than a preset threshold value, determining that the first word is a keyword of the barrage.
After the keywords are determined, the barrage is represented by its keywords. In this way, only the degree of association between the keywords and the video content elements needs to be computed, which reduces the amount of calculation.
(3) And carrying out feature coding on the keywords of each barrage to obtain the barrage feature vector.
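The sketch below strings together the interval sampling of the first barrage set and the keyword scoring F_w = P_w · R_w described above; the interval size of 10 and the one-half fraction follow the worked example, while the log(n/u) form of R_w, the minimum occurrence count and the keyword threshold are assumptions for illustration.

import math
import random
from collections import Counter

def sample_first_barrage_set(all_barrages, interval_size=10, fraction=0.5, seed=0):
    """Sort the barrages by length, split the ranking into intervals of
    interval_size ranks, and randomly keep `fraction` of each interval."""
    rng = random.Random(seed)
    ranked = sorted(all_barrages, key=len, reverse=True)
    selected = []
    for start in range(0, len(ranked), interval_size):
        interval = ranked[start:start + interval_size]
        selected += rng.sample(interval, max(1, int(len(interval) * fraction)))
    return selected

def barrage_keywords(words, first_barrage_set, min_count=1, threshold=0.5):
    """words: segmented words of one barrage; first_barrage_set: sampled
    barrages, each given as a list of words.  Returns this barrage's keywords."""
    counts = Counter(words)
    n = len(first_barrage_set)
    keywords = []
    for w, p_w in counts.items():
        if p_w < min_count:                             # placeholder for the preset occurrence count
            continue
        u = sum(1 for b in first_barrage_set if w in b)  # barrages that contain the first word
        r_w = math.log(n / u) if u else 0.0              # assumed IDF-style second number R_w
        if p_w * r_w > threshold:                        # target value F_w vs. threshold
            keywords.append(w)
    return keywords

# Usage: sample the first barrage set once, then score each barrage's words against it.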
In addition to extracting keywords, each barrage also needs to be classified: the basic attribute of each barrage, such as like or dislike, is judged by template matching against a pre-established rule base.
In an optional embodiment, the step 1014 of obtaining the association matrix according to the video feature vector and the barrage feature vector may specifically include:
step 10141, obtaining a pre-established prior matrix related to barrages and videos;
step 10142, calculating the value of each element in the association matrix according to the prior matrix and the preset optimal transmission distance to obtain the association matrix;
Specifically, the association matrix C can be obtained from the cost matrix M and the preset optimal transmission distance d, for example by requiring the total transport cost to match d over the admissible set:
Σ_{i,j} C_{ij} · M_{ij} = d,  C ∈ U(t, v)
where M is the matrix obtained by inverting the prior matrix and represents the lack of correlation between barrage elements and video elements, M ∈ R^{m×n}, and M_{ij}, the element in row i and column j of M, is a known quantity; d is the preset optimal transmission distance, also a known quantity; and C_{ij} represents the association value of the j-th element given the presence of the i-th element.
The association matrix C can be solved through this calculation; all possible values of C, i.e., the solution space, can also be obtained, and the solution space is denoted by U(t, v), where:
C = {c_1, c_2, …, c_k}, k = m × n, C ∈ R^{m×n}
The association matrix is arranged with the elements of the barrage feature vector along the horizontal direction and the elements of the video feature vector along the vertical direction. In the barrage feature vector, x_i denotes the i-th element, which may be called a barrage element; in the video feature vector, y_j denotes the j-th element, which may be called a video element; m denotes the number of video elements and n the number of barrage elements. After the association matrix is established, the relationship between each barrage and the video content can be obtained by updating the association matrix. Each association value lies in [0, 1], and the closer it is to 1, the higher the degree of association. A threshold μ can be set empirically; when an association value is greater than μ, the text barrage is deemed to be associated with the video content, i.e., C_{ij} > μ.
In this embodiment, the prior matrix is established because the text barrage information often contains a great deal of prior information, such as athletes' nicknames. A rule base therefore needs to be pre-established and can be updated manually. The prior probability is obtained through the rule base, and the probability of each barrage element under the current video element is then calculated on the training set; this probability value is the value of the corresponding element in the association matrix. The larger the value, the higher the correlation.
The purpose of this embodiment is to associate barrage elements with video elements so as to determine where in the video the animation expression should subsequently be added. The problem can be abstracted as calculating the correlation between barrage elements and video elements. For example, if Zhang San appears in the video, the keyword "Zhang San" in a text barrage has a high correlation with that video element, and the animation expression is then added at Zhang San's position.
The principle of the association matrix is shown in fig. 6, and association of the barrage elements and the video elements is realized by calculating association values of the barrage elements and the video elements.
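The entropic Sinkhorn iteration below is one common way to obtain a transport plan C ∈ U(t, v) from a cost matrix M and the marginal weights of the barrage and video elements; it is used here purely as an illustrative stand-in for the optimal-transport calculation described above, and the regularisation strength, iteration count and final rescaling to [0, 1] are assumptions.

import numpy as np

def correlation_matrix(M, t, v, reg=0.1, n_iter=200):
    """Entropic optimal-transport sketch for the association matrix C.

    M: (n, m) cost matrix obtained by inverting the prior matrix
       (rows = barrage elements, columns = video elements)
    t: (n,) weights of the barrage elements (sums to 1)
    v: (m,) weights of the video elements  (sums to 1)
    Returns C in U(t, v); larger C[i, j] means stronger association."""
    K = np.exp(-M / reg)                   # Gibbs kernel of the cost matrix
    u = np.ones_like(t)
    for _ in range(n_iter):                # Sinkhorn scaling iterations
        w = v / (K.T @ u)
        u = t / (K @ w)
    C = u[:, None] * K * w[None, :]
    return C / C.max()                     # rescale so association values lie in [0, 1]

# Example: two barrage elements and two video elements; low cost = strong prior link.
M = np.array([[0.2, 0.9], [0.8, 0.1]])
t = np.array([0.5, 0.5]); v = np.array([0.5, 0.5])
print(correlation_matrix(M, t, v))

# A pair (i, j) is considered associated when C[i, j] exceeds the threshold mu.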
In an alternative embodiment, the method of the present invention may further comprise:
i) Calculating a first running track of the target animation expression according to the position information of the first text barrage corresponding to the target animation expression and the position information of a first target in the video content information related to the first text barrage;
here, after the target animation expression is acquired, a relationship between the target animation expression and the first text bullet screen is established.
Here, the first running track of the target animation expression is calculated as follows. Let the position information of the first text barrage corresponding to the target animation expression, i.e., the centre coordinate point of the first text barrage, be (x_1, y_1), and let the position information of the first target in the video content information related to the first text barrage, i.e., the centre coordinate of the first target, be (x_2, y_2); A and B are the angles at these two points respectively, and a third point (x_3, y_3) is used to fix the curvature of the curve. After the calculation is completed, a Bézier curve, i.e., the first running track, is generated from these points (a sketch of this construction is given after step II) below).
And II) controlling the target animation expression to move from the position of the first text barrage to the position of the first target according to the first running track.
It should be noted that the target animation expression first appears in a fade-in manner from the position of the first text bullet screen, and then moves in a jump manner from the position of the first text bullet screen to the position of the first target according to the first running track. Here, the corresponding effect diagram can be seen in fig. 7.
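The sketch below builds the running track as a quadratic Bézier curve through the barrage centre (x_1, y_1), the curvature-fixing point (x_3, y_3) and the target centre (x_2, y_2); treating the track as a quadratic Bézier and ignoring the angles A and B is an assumption, and the coordinates in the example are hypothetical.

import numpy as np

def first_running_track(p_barrage, p_target, p_control, steps=50):
    """Quadratic Bezier curve from the centre of the first text barrage
    (x_1, y_1) to the centre of the first target (x_2, y_2); the third point
    (x_3, y_3) fixes the curvature of the curve."""
    p0, p2, p1 = map(np.asarray, (p_barrage, p_target, p_control))
    s = np.linspace(0.0, 1.0, steps)[:, None]
    return (1 - s) ** 2 * p0 + 2 * (1 - s) * s * p1 + s ** 2 * p2  # track points

# Example: move the expression along the returned points from the barrage
# position to the target position (coordinates are illustrative).
track = first_running_track((100, 40), (480, 260), (300, 80))
print(track[0], track[-1])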
Further, the method of the invention further comprises:
receiving a first input of a user to the target animation expression;
optionally, the first input is a click input.
If the first text barrage corresponding to the target animation expression is not displayed on the screen, responding to the first input, and displaying the first text barrage;
that is, when the user clicks on the target animation expression, if the target animation expression has scrolled out of the screen (i.e., is not displayed on the screen), the first text bullet screen corresponding to the target animation expression is displayed and generated.
And if the first text barrage corresponding to the target animation expression is displayed on the screen, responding to the first input, and controlling the target animation expression to move from the position of the first target to the position of the first text barrage according to a second running track.
Here, when the user clicks the target animation expression, if the first text bullet screen corresponding to the target animation expression is still displayed on the screen, the target animation expression jumps to the position of the first text bullet screen again in a curve form (second running track).
It should be noted that the calculation principle of the second moving track is the same as that of the first moving track, except that the position of the first text bullet screen may be different at this time.
According to the animation expression generating method of the embodiment of the present invention, the association between the text barrage information and the video content information is obtained according to the video content information and the text barrage information in the video information; a first category of the animation expression to be generated is determined according to the attribute of a first text barrage, the first text barrage being any text barrage in the text barrage information; and a target animation expression is generated according to the association between the text barrage information and the video content information and the first category. By mining the association between the text barrage information and the video content information, animation expressions are thus generated automatically and text barrages are combined with animation expressions, which makes video watching more interesting and improves the user's experience when watching the video.
As shown in fig. 8, the embodiment of the present invention further provides an animation expression generating device, which includes:
a first processing module 801, configured to obtain the association between text barrage information and video content information according to the video content information and the text barrage information in the video information;
A second processing module 802, configured to determine a first category of an animation expression to be generated according to an attribute of a first text bullet screen, where the first text bullet screen is any text bullet screen in the text bullet screen information;
an animation expression generating module 803 for generating a target animation expression according to the first category and the correlation between the text bullet screen information and the video content information
Optionally, the first processing module 801 includes:
the information extraction unit is used for extracting video content information and text barrage information in the video information;
the first processing unit is used for obtaining video feature vectors according to the video content information;
the second processing unit is used for obtaining bullet screen feature vectors according to the text bullet screen information;
the third processing unit is used for obtaining an association matrix according to the video feature vector and the barrage feature vector; the association matrix is arranged with the barrage elements of the barrage feature vector along the horizontal direction and the video elements of the video feature vector along the vertical direction; each element in the association matrix is used for representing the association between the barrage element and the video element at the corresponding position.
Optionally, the third processing unit is specifically configured to:
acquiring a pre-established prior matrix related to the barrage and the video;
and calculating the value of each element in the association matrix according to the prior matrix and the preset optimal transmission distance to obtain the association matrix.
Optionally, the animation expression generating module 803 includes:
the fourth processing unit is used for obtaining an association value corresponding to the first text barrage according to the association matrix;
and the expression generating unit is used for acquiring, from the animation expressions of the first category, a target animation expression whose aesthetic value matches the association value.
Optionally, the device of the embodiment of the present invention further includes:
the track calculation module is used for calculating a first running track of the target animation expression according to the position information of the first text barrage corresponding to the target animation expression and the position information of the first target in the video content information related to the first text barrage;
and the third processing module is used for controlling the target animation expression to move from the position of the first text bullet screen to the position of the first target according to the first running track.
Optionally, the device of the embodiment of the present invention further includes:
the receiving module is used for receiving a first input of a user on the target animation expression;
the display module is used for responding to the first input and displaying the first text barrage under the condition that the first text barrage corresponding to the target animation expression is not displayed on the screen;
and the fourth processing module is used for responding to the first input and controlling the target animation expression to move from the position of the first target to the position of the first text barrage according to the second running track under the condition that the first text barrage corresponding to the target animation expression is displayed on the screen.
Optionally, the first processing unit is specifically configured to:
extracting targets from the video content information, and carrying out feature coding on each extracted target to obtain a first feature vector of each target;
performing feature coding on each pre-stored text name to obtain a second feature vector of each text name;
and obtaining the video feature vector based on each first feature vector and each second feature vector, wherein the video feature vector is used for representing a target and entity names corresponding to the target.
Optionally, the first processing unit is specifically configured to:
obtaining an association vector according to each first feature vector and each second feature vector, wherein the association vector is used for representing the association between the target and the text name;
obtaining a third feature vector of each target according to the association vector and each first feature vector;
and obtaining the video feature vector according to each third feature vector and each second feature vector.
The implementation embodiments of the animation expression generating method are applicable to the embodiment of the animation expression generating device, and the same technical effects can be achieved.
The embodiment of the invention also provides an animation expression generating device which comprises a memory, a processor and a program which is stored in the memory and can run on the processor; the processor implements the animation expression generation method as described above when executing the program.
The implementation embodiments of the animation expression generating method are applicable to the embodiment of the animation expression generating device, and the same technical effects can be achieved.
The embodiment of the invention also provides a readable storage medium, on which a program is stored, which when executed by a processor, implements the steps in the animation expression generation method.
The implementation embodiments of the above method are all applicable to the embodiment of the readable storage medium, and the same technical effects can be achieved.
It should be noted that many of the functional components described in this specification have been referred to as modules, in order to more particularly emphasize their implementation independence.
In an embodiment of the invention, the modules may be implemented in software for execution by various types of processors. An identified module of executable code may, for instance, comprise one or more physical or logical blocks of computer instructions which may, for instance, be organized as an object, procedure, or function. Nevertheless, the executables of an identified module need not be physically located together, but may comprise disparate instructions stored in different bits which, when joined logically together, comprise the module and achieve the stated purpose for the module.
Indeed, a module of executable code may be a single instruction, or many instructions, and may even be distributed over several different code segments, among different programs, and across several memory devices. Likewise, operational data may be identified within modules and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices.
Where a module can be implemented in software, then, taking into account the level of existing hardware technology, one skilled in the art may also, cost aside, build corresponding hardware circuitry to achieve the same functions, including conventional very-large-scale integration (VLSI) circuits or gate arrays and existing semiconductors such as logic chips, transistors, or other discrete components. A module may likewise be implemented in programmable hardware devices such as field-programmable gate arrays, programmable array logic, or programmable logic devices.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that various modifications and changes can be made without departing from the principles of the present invention, and such modifications and changes should also be considered as being within the scope of the present invention.

Claims (10)

1. An animation expression generating method is characterized by comprising the following steps:
according to video content information and text barrage information in the video information, obtaining the relevance between the text barrage information and the video content information;
determining a first category of an animation expression to be generated according to the attribute of a first text barrage, wherein the first text barrage is any text barrage in the text barrage information;
and generating a target animation expression according to the correlation between the text barrage information and the video content information and the first category.
2. The method of claim 1, wherein the obtaining the association between the video content information and the text bullet screen information based on the video content information and the text bullet screen information in the video information comprises:
extracting video content information and text barrage information in the video information;
obtaining a video feature vector according to the video content information;
obtaining bullet screen feature vectors according to the text bullet screen information;
obtaining an association matrix according to the video feature vector and the barrage feature vector; the association matrix is arranged with the barrage elements of the barrage feature vector along the horizontal direction and the video elements of the video feature vector along the vertical direction; each element in the association matrix is used for representing the association between the barrage element and the video element at the corresponding position.
3. The method of claim 2, wherein the obtaining an association matrix from the video feature vector and the barrage feature vector comprises:
acquiring a pre-established prior matrix related to the barrage and the video;
and calculating the value of each element in the association matrix according to the prior matrix and the preset optimal transmission distance to obtain the association matrix.
4. The method of claim 3, wherein the generating a target animated expression based on the first category and the association between the video content information and the text bullet screen information comprises:
according to the association matrix, obtaining an association value corresponding to the first text barrage;
and obtaining, from the animation expressions of the first category, a target animation expression whose aesthetic value matches the association value.
5. The method according to claim 1, wherein the method further comprises:
calculating a first running track of the target animation expression according to the position information of the first text barrage corresponding to the target animation expression and the position information of a first target in video content information related to the first text barrage;
and controlling the target animation expression to move from the position of the first text barrage to the position of the first target according to the first running track.
6. The method of claim 5, wherein the method further comprises:
receiving a first input of a user to the target animation expression;
if the first text barrage corresponding to the target animation expression is not displayed on the screen, responding to the first input, and displaying the first text barrage;
and if the first text barrage corresponding to the target animation expression is displayed on the screen, responding to the first input, and controlling the target animation expression to move from the position of the first target to the position of the first text barrage according to a second running track.
7. The method of claim 2, wherein said obtaining said video feature vector from said video content information comprises:
extracting targets from the video content information, and carrying out feature coding on each extracted target to obtain a first feature vector of each target;
performing feature coding on each pre-stored text name to obtain a second feature vector of each text name;
and obtaining the video feature vector based on each first feature vector and each second feature vector, wherein the video feature vector is used for representing a target and entity names corresponding to the target.
8. The method of claim 7, wherein the deriving the video feature vector based on each of the first feature vector and each of the second feature vector comprises:
obtaining an association vector according to each first feature vector and each second feature vector, wherein the association vector is used for representing the association between the target and the text name;
obtaining a third feature vector of each target according to the association vector and each first feature vector;
and obtaining the video feature vector according to each third feature vector and each second feature vector.
9. An animated expression generating device comprising a memory, a processor, and a program stored on the memory and executable on the processor; the animation expression generation method according to any one of claims 1 to 8 is implemented when the processor executes the program.
10. A readable storage medium having stored thereon a program, which when executed by a processor, implements the steps in the animation expression generation method of any of claims 1 to 8.
CN202310952281.XA — filed 2023-07-31 — Animation expression generating method, equipment and readable storage medium — Pending — published as CN116962788A

Priority Applications (1)

Application Number: CN202310952281.XA — Priority Date: 2023-07-31 — Filing Date: 2023-07-31 — Title: Animation expression generating method, equipment and readable storage medium

Publications (1)

Publication Number: CN116962788A — Publication Date: 2023-10-27

Family

ID=88442471

Country Status (1)

Country: CN — Publication: CN116962788A

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination