CN114298018A - Video title generation method and device and storage medium - Google Patents


Info

Publication number
CN114298018A
Authority
CN
China
Prior art keywords
video, matching result, target, video frame, keyword
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111610447.7A
Other languages
Chinese (zh)
Inventor
吴庆双
周效军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, MIGU Culture Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202111610447.7A priority Critical patent/CN114298018A/en
Publication of CN114298018A publication Critical patent/CN114298018A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application discloses a video title generation method and related equipment, which can generate, for a video, a title closely tied to current hot topics. The method comprises the following steps: determining content characteristics corresponding to a target video; acquiring a keyword set of hot spot content in the previous period; matching each keyword in the keyword set with the content characteristics corresponding to the target video respectively to obtain a matching result set; and determining the video title of the target video according to the matching result set.

Description

Video title generation method and device and storage medium
Technical Field
The present application relates to the field of internet technologies, and in particular, to a method and an apparatus for generating a video title, and a storage medium.
Background
In the self-media era, the importance of short-video titles is clear: a good title brings higher traffic, and different titles can produce markedly different results even for the same content. Short-video titles are currently edited manually for the most part and are usually related to the content. Some automatically generated titles simply pad the short-video title with a random selection from a list of generic words.
Manual title editing places high demands on the video creator, on the copywriting skill of the video publisher or operator, and on their sensitivity to hot topics; if manual editing is used for mass-produced short videos, it becomes even harder to meet the demand of exposing videos as quickly as they are produced.
In the conventional automatic title generation method, the background system randomly pads the short-video title from a list of generic words. The generated titles are featureless and uninspired and do not reflect the video content well; although this meets the need for batch production of short videos, such uniform titles do little to promote the exposure and spread of the videos.
Disclosure of Invention
The embodiment of the application provides a method and a device for generating a video title and a storage medium, which can generate, for a video, a title closely tied to current hot topics, improve the readability and interest of the video title, and reduce the workload of operators.
A first aspect of the present application provides a method for generating a video title, which may include:
determining content characteristics corresponding to a target video;
acquiring a keyword set of hot spot content in the previous period;
matching each keyword in the keyword set with the content characteristics corresponding to the target video respectively to obtain a matching result set;
and determining the video title of the target video according to the matching result set.
In one possible design, the determining the video title of the target video according to the matching result set includes:
determining a video frame set corresponding to the matching result set in the target video;
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain a first coincidence degree set;
determining a first keyword set corresponding to a first target coincidence degree, wherein the first target coincidence degree is the coincidence degree with the minimum value in the first coincidence degree set;
and generating a video title of the target video according to the first keyword set.
In one possible design, the determining a set of video frames in the target video corresponding to the set of matching results includes:
matching each video frame in the target video with each matching result in the matching result set respectively to obtain a video frame set corresponding to each matching result in the matching result set;
and eliminating the video frame set corresponding to the matching result with the video frame number smaller than a preset threshold value in the matching result set to obtain the video frame set corresponding to the matching result set.
In one possible design, the method further includes:
if the number of video frames in the video frame set corresponding to each matching result in the matching result set is smaller than the preset threshold, determining the frame number ratio of the video frame corresponding to each matching result in the matching result set;
calculating a standard deviation value corresponding to each matching result in the matching result set;
determining the weighted average value of each matching result in the matching result set according to the frame number ratio and the standard deviation value;
calculating the coincidence degree between the video frame set corresponding to the target matching result with the largest weighted average value and the video frame set corresponding to each matching result in other matching result subsets to obtain a second coincidence degree set, wherein the other matching result subsets are sets of the matching results in the matching result sets except the target matching result;
determining a second keyword set corresponding to a second target coincidence degree, wherein the second target coincidence degree is the coincidence degree with the minimum value in the second coincidence degree set;
and generating a video title of the target video according to the second keyword set.
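Since the text leaves the weighting open, the fallback scoring in this design might look like the following sketch; the 50/50 weights, the use of per-frame accuracies for the standard deviation, and the sign convention (lower spread scores higher) are all assumptions not fixed by the text:

```python
from statistics import pstdev

def weighted_scores(result_frame_accs, total_frames, w_ratio=0.5, w_std=0.5):
    """Fallback scoring when every matching result covers fewer frames than
    the threshold: combine each result's frame-count ratio with the standard
    deviation of its per-frame accuracies into one weighted score.
    `result_frame_accs` maps each matching result to the list of per-frame
    accuracies of the frames it appears in (an assumed data shape)."""
    scores = {}
    for label, accs in result_frame_accs.items():
        ratio = len(accs) / total_frames          # frame number ratio
        std = pstdev(accs) if len(accs) > 1 else 0.0
        scores[label] = w_ratio * ratio - w_std * std  # penalize high spread
    return scores
```

The target matching result would then be the key with the largest score.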
In one possible design, the content features corresponding to the target video include a first set of person names and accuracy rates, a second set of person actions and accuracy rates, a third set of objects and accuracy rates, and a fourth set of scenes and accuracy rates corresponding to the target video, and the method further includes:
if each keyword in the keyword set is not successfully matched with the first set, the second set, the third set and the fourth set, determining a video frame set corresponding to the first set, a video frame set corresponding to the second set, a video frame set corresponding to the third set and a video frame set corresponding to the fourth set in the target video;
calculating the coincidence degree between the video frame sets corresponding to any two of the video frame set corresponding to the first set, the video frame set corresponding to the second set, the video frame set corresponding to the third set and the video frame set corresponding to the fourth set to obtain a third coincidence degree set;
determining a third keyword set corresponding to a third target coincidence degree, wherein the third target coincidence degree is the coincidence degree with the minimum value in the third coincidence degree set;
and determining the video title of the target video according to the third keyword set.
In one possible design, after determining the first keyword set corresponding to the first target coincidence degree, the method further includes:
determining a target video frame set corresponding to the first keyword set in the target video;
calculating the coincidence degree between the target video frame set and other video frame sets to obtain a fourth coincidence degree set, wherein the other video frame sets are video frame sets except the video frame set corresponding to the first target coincidence degree in the video frame sets corresponding to the matching result set;
determining a fourth keyword set corresponding to a fourth target coincidence degree, wherein the fourth target coincidence degree is the coincidence degree with the minimum value in the fourth coincidence degree set;
and generating a video title of the target video according to the first keyword set and the fourth keyword set.
In one possible design, the calculating the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain a first coincidence degree set includes:
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set through the following formula:
PD(F1, F2) = ( Σ |Fx[F1] − Fx[F2]| ) / n, where the sum runs over every video frame Fx contained in both F1 and F2, and n is the number of such shared frames;
wherein PD(F1, F2) is the coincidence degree between the video frame sets of any two matching results in the matching result set, F1 is the video frame set corresponding to any one matching result in the matching result set, F2 is any video frame set other than F1 among the video frame sets corresponding to the matching result set, Fx is any video frame contained in both F1 and F2, Fx[F1] is the position of Fx in F1, and Fx[F2] is the position of Fx in F2.
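As a concrete reading of the formula, a minimal Python sketch of the coincidence degree, with positions taken as indices within each ordered frame set; treating sets with no shared frames as infinitely far apart is an assumption, since the patent does not define PD for that case:

```python
def overlap_degree(f1, f2):
    """PD(F1, F2): mean absolute difference between the positions of the
    frames shared by the two ordered frame sets. A smaller value means the
    two matching results co-occur in more of the same pictures."""
    shared = [f for f in f1 if f in set(f2)]
    if not shared:
        return float("inf")  # assumption: no shared frames -> maximally disjoint
    pos1 = {f: i for i, f in enumerate(f1)}  # Fx[F1]
    pos2 = {f: i for i, f in enumerate(f2)}  # Fx[F2]
    return sum(abs(pos1[f] - pos2[f]) for f in shared) / len(shared)
```

For two sets that list the same frames in the same order the degree is 0, matching the text's reading that a smaller PD means more simultaneous appearances.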
A second aspect of the present application provides a video title generation apparatus, including:
the first determining unit is used for determining content characteristics corresponding to the target video;
the acquisition unit is used for acquiring a keyword set of hot spot content in the previous period;
the matching unit is used for matching each keyword in the keyword set with the content characteristics corresponding to the target video respectively to obtain a matching result set;
and the second determining unit is used for determining the video title of the target video according to the matching result set.
In one possible design, the second determining unit is specifically configured to:
determining a video frame set corresponding to the matching result set in the target video;
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain a first coincidence degree set;
determining a first keyword set corresponding to a first target coincidence degree, wherein the first target coincidence degree is the coincidence degree with the minimum value in the first coincidence degree set;
and generating a video title of the target video according to the first keyword set.
In one possible design, the determining, by the second determining unit, a set of video frames in the target video corresponding to the set of matching results includes:
matching each video frame in the target video with each matching result in the matching result set respectively to obtain a video frame set corresponding to each matching result in the matching result set;
and eliminating the video frame set corresponding to the matching result with the video frame number smaller than a preset threshold value in the matching result set to obtain the video frame set corresponding to the matching result set.
In one possible design, the second determination unit is further configured to:
if the number of video frames in the video frame set corresponding to each matching result in the matching result set is smaller than the preset threshold, determining the frame number ratio of the video frame corresponding to each matching result in the matching result set;
calculating a standard deviation value corresponding to each matching result in the matching result set;
determining the weighted average value of each matching result in the matching result set according to the frame number ratio and the standard deviation value;
calculating the coincidence degree between the video frame set corresponding to the target matching result with the largest weighted average value and the video frame set corresponding to each matching result in other matching result subsets to obtain a second coincidence degree set, wherein the other matching result subsets are sets of the matching results in the matching result sets except the target matching result;
determining a second keyword set corresponding to a second target coincidence degree, wherein the second target coincidence degree is the coincidence degree with the minimum value in the second coincidence degree set;
and generating a video title of the target video according to the second keyword set.
In one possible design, the content features corresponding to the target video include a first set of person names and accuracy rates, a second set of person actions and accuracy rates, a third set of objects and accuracy rates, and a fourth set of scenes and accuracy rates corresponding to the target video, and the second determining unit is further configured to:
if each keyword in the keyword set is not successfully matched with the first set, the second set, the third set and the fourth set, determining a video frame set corresponding to the first set, a video frame set corresponding to the second set, a video frame set corresponding to the third set and a video frame set corresponding to the fourth set in the target video;
calculating the coincidence degree between the video frame sets corresponding to any two of the video frame set corresponding to the first set, the video frame set corresponding to the second set, the video frame set corresponding to the third set and the video frame set corresponding to the fourth set to obtain a third coincidence degree set;
determining a third keyword set corresponding to a third target coincidence degree, wherein the third target coincidence degree is the coincidence degree with the minimum value in the third coincidence degree set;
and determining the video title of the target video according to the third keyword set.
In one possible design, the second determination unit is further configured to:
determining a target video frame set corresponding to the first keyword set in the target video;
calculating the coincidence degree between the target video frame set and other video frame sets to obtain a fourth coincidence degree set, wherein the other video frame sets are video frame sets except the video frame set corresponding to the first target coincidence degree in the video frame sets corresponding to the matching result set;
determining a fourth keyword set corresponding to a fourth target coincidence degree, wherein the fourth target coincidence degree is the coincidence degree with the minimum value in the fourth coincidence degree set;
and generating a video title of the target video according to the first keyword set and the fourth keyword set.
In one possible design, the calculating, by the second determining unit, a coincidence degree between video frame sets of any two matching results in the matching result set, and obtaining a first coincidence degree set includes:
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set through the following formula:
PD(F1, F2) = ( Σ |Fx[F1] − Fx[F2]| ) / n, where the sum runs over every video frame Fx contained in both F1 and F2, and n is the number of such shared frames;
wherein PD(F1, F2) is the coincidence degree between the video frame sets of any two matching results in the matching result set, F1 is the video frame set corresponding to any one matching result in the matching result set, F2 is any video frame set other than F1 among the video frame sets corresponding to the matching result set, Fx is any video frame contained in both F1 and F2, Fx[F1] is the position of Fx in F1, and Fx[F2] is the position of Fx in F2.
A third aspect of the present application provides a computing device comprising: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the steps of the video title generation method.
A fourth aspect of the present application provides a computer-readable storage medium having at least one executable instruction stored therein, which when run on a computing device, causes the computing device to perform the method for generating a video title according to the first aspect of the present application.
A fifth aspect of the present application discloses a computer program product, which, when run on a computer, causes the computer to execute the method for generating a video title of the first aspect of the present application.
A sixth aspect of the present application discloses an application distribution platform for distributing a computer program product, wherein when the computer program product runs on a computer, the computer is caused to execute the method for generating a video title according to the first aspect of the present application.
According to the technical scheme, the embodiment of the application has the following advantages:
In the embodiments provided by the application, the keyword set of the hot content in the previous period is crawled, the matches between the keywords in the keyword set and each object in the video whose title is to be generated are determined, and the video frame set corresponding to each matching result is determined, so that the coincidence degree between the video frame sets can be calculated, the related keywords selected according to the coincidence degree, and the video title of the video generated from those keywords. In this way, video titles closely tied to current hot topics can be generated for videos in batches, which improves the readability and interest of the video titles, reduces the workload of operators, and speeds up video production and online exposure, thereby improving how well videos spread.
Drawings
The drawings are only for purposes of illustrating embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
fig. 1 is a schematic flowchart of a method for generating a video title in an embodiment of the present application;
fig. 2 is a schematic view of a virtual structure of a video title generation apparatus according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a server in an embodiment of the present application.
Detailed Description
To enable a person skilled in the art to better understand the present application, the technical solutions in the embodiments of the present application are described below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. The embodiments in the present application shall fall within the protection scope of the present application.
The following describes a method for generating a video title provided by the present application in more detail from the perspective of a video title generating device, which may be a server or a service unit in the server.
Referring to fig. 1, fig. 1 is a schematic diagram of an embodiment of a method for generating a video title according to an embodiment of the present application, including:
101. and determining the content characteristics corresponding to the target video.
In this embodiment, when a creator publishes a new target video, the video title generation device may, after receiving it, analyze the target video to determine its content features. The content features include: a first set of person names (names of the people appearing in the target video) and accuracy rates; a second set of person actions (actions of the people appearing in the target video) and accuracy rates; a third set of objects (objects appearing in the target video, such as a basketball stand or a table tennis table) and accuracy rates; and a fourth set of scenes (scenes appearing in the target video, such as a football field or a basketball court) and accuracy rates. The accuracy rate is the recognition accuracy obtained by identifying each frame of the target video, and the target video is the video whose title is to be generated. Specifically, the video title generation device may input the target video into different neural networks to obtain the sets of persons, actions, scenes and objects in the target video together with the corresponding accuracy rates, as described in detail below:
First, a neural-network-based face recognition algorithm detects and recognizes the target video frame by frame and outputs the set of person names whose accuracy rate exceeds 95% together with the accuracy rates. This first set is denoted Ng = {N1:P1, N2:P2, ..., Nx:Px}, where N is a person name and P is the accuracy rate corresponding to that name, and the first set is stored in the background data cache.
Then, a region-based 3D convolutional network for temporal action detection detects and recognizes the target video frame by frame to find the potential action intervals in the target video and determine the action types, and outputs the set of person actions with accuracy rates above 95% (the second set), denoted Ag = {A1:P1, A2:P2, ..., Ax:Px}, where A is a person action and P is the corresponding accuracy rate. The second set is then stored in the background data cache.
Next, an object detection model detects and recognizes the target video frame by frame to obtain the set of objects with accuracy rates above 95% (the third set), denoted Og = {O1:P1, O2:P2, ..., Ox:Px}, where O is an object and P is the corresponding accuracy rate. The third set is stored in the background data cache.
Finally, a scene classification algorithm detects and recognizes the target video frame by frame to obtain the scenes appearing in it, yielding the set of scenes with accuracy rates above 95% (the fourth set), denoted Sg = {S1:P1, S2:P2, ..., Sx:Px}, where S is a scene and P is the corresponding accuracy rate. The fourth set is stored in the background data cache.
The above models may be trained in advance on training samples, and the accuracy threshold may take other values, for example 90%; it is not specifically limited here.
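The four recognition passes above all reduce to the same aggregation step: keep every recognized label whose accuracy exceeds the threshold and remember its best score. A minimal sketch, with the per-frame model outputs stubbed as plain dicts (the actual face/action/object/scene models are outside the scope of this sketch):

```python
def collect_features(frame_results, threshold=0.95):
    """Aggregate per-frame recognizer output into a {label: best_accuracy}
    set, keeping only labels whose accuracy exceeds the threshold.
    `frame_results` is a list of {label: confidence} dicts, one per frame,
    as a stand-in for the per-frame output of a recognition model."""
    best = {}
    for labels in frame_results:
        for label, conf in labels.items():
            if conf > threshold:
                best[label] = max(best.get(label, 0.0), conf)
    return best
```

Running this once per recognizer yields the Ng, Ag, Og and Sg sets to be stored in the background data cache.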
102. And acquiring a keyword set of hot spot content in the previous period.
In this embodiment, the video title generation device may obtain the keyword set of the hot content in the previous period; that is, a timed task may be set in the background every morning to crawl the previous day's hot content from a hot-search list or trending list, process it with natural language processing to extract the corresponding keyword set, and denote it Hg = {H1, H2, ..., Hx}. It can be understood that the period may be set to 1 day or to 12 hours, or of course to some other duration; likewise, the hot content may be crawled from the hot-search or trending leaderboards, from other leaderboards, or from a self-configured source, without specific limitation.
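The crawl-and-extract step might be sketched as follows; a real system would use proper word segmentation and a real leaderboard source, so the whitespace tokenizer and stopword list here are purely illustrative assumptions:

```python
import re

def extract_keywords(hot_items, stopwords=frozenset({"the", "a", "of"})):
    """Toy keyword extraction: split each trending headline into tokens,
    lowercase them, and drop stopwords and duplicates, preserving order.
    `hot_items` stands in for the crawled hot-search / trending entries."""
    keywords = []
    for item in hot_items:
        for tok in re.findall(r"\w+", item.lower()):
            if tok not in stopwords and tok not in keywords:
                keywords.append(tok)
    return keywords
```

The returned list plays the role of Hg; scheduling it daily would be handled by the background timed task mentioned above.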
It should be noted that the content features corresponding to the target video may be determined through step 101 and the keyword set obtained through step 102, but there is no fixed execution order between the two steps: step 101 may be executed first, step 102 may be executed first, or the two may be executed simultaneously, without specific limitation.
103. And matching each keyword in the keyword set with the content characteristics corresponding to the target video respectively to obtain a matching result set.
In this embodiment, after obtaining the keyword set of the previous period and the content features corresponding to the target video, the video title generation device may match each keyword in the keyword set against the first set, the second set, the third set and the fourth set respectively to obtain a matching result set. Specifically, it may traverse each keyword in Hg and match it in turn against each element of Ng, Ag, Og and Sg. For example, if the first keyword H1 in Hg is "Zhan X" and N1 in Ng is also "Zhan X", the matching result is added to the matching result set Rg, which is then Rg = {N1:P1}; by analogy, H2 is matched against Ng, Ag, Og and Sg, and assuming the matching result A2:P2 is obtained, it is added to Rg, giving Rg = {N1:P1, A2:P2}. Matching in turn finally yields:
Rg = {N1:P1, ..., Nx1:Px1, A1:P1, ..., Ax2:Px2, O1:P1, ..., Ox3:Px3, S1:P1, ..., Sx4:Px4}
Each x in Rg may be any number. For example, if x1 = 1, x2 = 2, x3 = 2 and x4 = 2, then Rg = {N1:P1, A1:P1, A2:P2, O1:P1, O2:P2, S1:P2, S2:P2}, i.e. the keyword set of the previous period's hot content contains one person, two actions, two objects and two scenes of the target video.
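The matching step above, taken literally as exact label lookup (an assumption; the text does not specify the matching criterion), can be sketched as:

```python
def match_keywords(keywords, feature_sets):
    """Traverse the keyword set H_g and match each keyword, in turn, against
    the name/action/object/scene sets N_g, A_g, O_g, S_g (passed as dicts
    mapping label -> accuracy); collect the hits into the matching result
    set R_g."""
    r_g = {}
    for kw in keywords:
        for features in feature_sets:  # N_g, A_g, O_g, S_g in order
            if kw in features:
                r_g[kw] = features[kw]
    return r_g
```

With the example above, a keyword equal to N1 contributes N1:P1 to Rg, and unmatched keywords contribute nothing.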
104. And determining the video title of the target video according to the matching result set.
In this embodiment, after determining the matching result set, the video title generation apparatus may determine the video title of the target video according to the matching result set. The following is a detailed description of determining a video title of a target video according to a matching result set:
step 1, determining a video frame set corresponding to the matching result set in the target video.
In this step, the video title generating device may determine a video frame set corresponding to the matching result set in the target video, that is, match each video frame in the target video with each matching result in the matching result set, so as to obtain a video frame set corresponding to each matching result in the matching result set; and eliminating the video frame set corresponding to the matching result with the video frame number smaller than the preset threshold value in the matching result set to obtain the video frame set corresponding to the matching result set.
The following takes Rg = {N1:P1, A1:P1, A2:P2, O1:P1, O2:P2, S1:P2, S2:P2} as an example to explain how to determine the video frame sets corresponding to the matching results in the target video:
the video title generation device can traverse all video frames in the target video, detect tasks, actions, objects and scenes in each frame respectively, and detect all N1,A1,A2,O1,O2,S1,S2Marking the video frame to obtain the video frame and N in the target video1,A1,A2,O1,O2,S1,S2A set of relationships RFAssuming that all frames of the target video are FARequires a set of relationships RFThe number of frames corresponding to the medium task, the action, the object and the scene is FAMore than 50% (of course, the actual condition may be determined)Set, for example 60%, without being limited in particular), for example at N1,A1,A2,O1,O2,S1,S2In, S2If the frame number ratio of the corresponding video frame is lower than 50%, S is set2Filtering the corresponding video frames, and finally obtaining a filtered relation set as follows:
Figure RE-GDA0003537624310000101
wherein, F is a video frame in the target video and corresponds to N1,A1,A2,O1,O2,S1The position where the object appears in the target video.
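The 50% frame-ratio filter described above can be sketched as follows; the threshold is a parameter, matching the text's note that it may be set to other values such as 60%:

```python
def filter_by_frame_ratio(result_frames, total_frames, min_ratio=0.5):
    """Drop matching results whose marked video frames cover less than
    `min_ratio` of the whole target video. `result_frames` maps each
    matching result label to the list of frames it appears in."""
    return {label: frames
            for label, frames in result_frames.items()
            if len(frames) / total_frames >= min_ratio}
```

In the running example, S2's frames fall below the ratio and are removed, leaving the filtered relation set.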
And 2, calculating the contact ratio between the video frame sets of any two matching results in the matching result set to obtain a first contact ratio set.
In this step, after obtaining the video frame sets corresponding to the matching result set, the video title generation device may calculate the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain the first coincidence degree set. Specifically, the coincidence degree between the video frame sets of any two matching results may be calculated by the following formula:
PD(F1, F2) = ( Σ |Fx[F1] − Fx[F2]| ) / n, where the sum runs over every video frame Fx contained in both F1 and F2, and n is the number of such shared frames;
wherein PD(F1, F2) is the coincidence degree between the video frame sets of any two matching results in the matching result set, F1 is the video frame set corresponding to any one matching result in the matching result set, F2 is any video frame set other than F1 among the video frame sets corresponding to the matching result set, Fx is any video frame contained in both F1 and F2, Fx[F1] is the position of Fx in F1, and Fx[F2] is the position of Fx in F2.
Taking the matching results N1, A1, A2, O1, O2, S1, S2 as an example, calculate the coincidence degree between the video frame set FN1 = {F1, F2, ..., Fn} corresponding to N1 and the video frame set FA1 = {F1, F2, ..., Fn, Fn+2} corresponding to A1. Let Fx[FN1] be the position of Fx in FN1 and Fx[FA1] its position in FA1; then Fx[FN1] − Fx[FA1] is the difference between the position of video frame Fx in the video frame set of N1 and its position in the video frame set of A1, and the formula finally gives:
PD(FN1, FA1) = ( Σ |Fx[FN1] − Fx[FA1]| ) / n, summed over the frames shared by FN1 and FA1,
namely the average of the absolute values of the differences between the positions of the same video frames in FN1 and FA1, which is the coincidence degree between the video frame set FN1 of N1 and the video frame set FA1 of A1. The smaller the coincidence degree PD between two video frame sets, the more pictures of the target video contain N1 and A1 simultaneously.
Then the coincidence degree between FN1 and the video frame set FA2 of A2 is calculated by the same formula:
PD(FN1, FA2) = ( Σ |Fx[FN1] − Fx[FA2]| ) / n, summed over the frames shared by FN1 and FA2,
wherein PD(FN1, FA2) is the coincidence degree between FN1 and FA2, and Fx[FN1] − Fx[FA2] is the difference between the position of video frame Fx in the video frame set of N1 and its position in the video frame set of A2.
Finally, in the same way for every pairwise combination, PD(FN1, FO1), ..., PD(FO2, FS1) are calculated, yielding the first coincidence degree set.
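Computing PD for every pair of matching results and selecting the pair with the smallest coincidence degree (steps 2 and 3 together) can be sketched as follows; `overlap_degree` restates the formula above, with disjoint sets treated as infinitely far apart by assumption:

```python
from itertools import combinations

def overlap_degree(f1, f2):
    """Mean absolute position difference over the frames shared by the
    two ordered frame sets (the PD formula from the text)."""
    shared = [f for f in f1 if f in set(f2)]
    if not shared:
        return float("inf")  # assumption: no shared frames
    pos1 = {f: i for i, f in enumerate(f1)}
    pos2 = {f: i for i, f in enumerate(f2)}
    return sum(abs(pos1[f] - pos2[f]) for f in shared) / len(shared)

def min_overlap_pair(result_frames):
    """Compute PD for every pair of matching results and return the pair
    of labels with the smallest coincidence degree (the first keyword set)."""
    pairs = {(a, b): overlap_degree(result_frames[a], result_frames[b])
             for a, b in combinations(result_frames, 2)}
    return min(pairs, key=pairs.get)
```

The returned pair corresponds to the keywords most likely to appear simultaneously in the target video.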
And step 3, determining a first keyword set corresponding to the first target coincidence degree.
In this step, after determining the first coincidence degree set, the video title generation device may first determine the first target coincidence degree, namely the coincidence degree with the minimum value in the first coincidence degree set, and then determine the first keyword set corresponding to it. For example, if PD(FN1, FO1) is finally the smallest, N1 is "Zhan X" and O1 is "basketball", this means that "Zhan X" and "basketball" are the most likely to appear simultaneously in the target video and that the total duration of the video frames corresponding to them accounts for more than 50% of the target video; "Zhan X" and "basketball" then constitute the first keyword set.
And 4, generating a video title of the target video according to the first keyword set.
In this step, after obtaining the first keyword set, the video title generation device may generate a video title through natural language understanding according to the first keyword set. The manner of generating the video title is not particularly limited, as long as a video title can be generated through natural language understanding. In addition, after the video title of the target video is generated, the target video can be published; when a search keyword input by a user matches the video title of the target video, the target video is pushed to the user, thereby realizing accurate pushing.
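The embodiment deliberately leaves the generation method open; as a minimal stand-in for the natural language step, a template-based sketch (the template strings and keywords are invented for illustration, and a real NLU/NLG model could replace them):

```python
def generate_title(keywords):
    """Turn the first keyword set into a headline with a fixed template;
    a placeholder for the natural language understanding step."""
    if len(keywords) >= 2:
        return f"When {keywords[0]} meets {keywords[1]}: don't miss this moment"
    return f"A moment of {keywords[0]} worth watching"

title = generate_title(["Zhan X", "basketball"])
```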
It should be noted that with a title that fits the video content, when a user searches for videos related to certain content, the content the user wants to see can be surfaced better by matching the user's input against the generated titles, improving the relevance of search results. Alternatively, when some hot content emerges, the video material already in the resource library can be combined with the video's keywords to generate titles related to the hot content, so that past video content is recommended again as material for the hot topic. For example, a user previously uploaded a video of a cute pet cat, and the video content keywords (cat, wrecking the home) are identified; according to the trending search topic "what bad intentions could a cat have", the historical video is given a new title. Through the above steps, the backend matches this video so that when a user searches that topic, the video can be recommended to the user again, making full use of video material, fitting the user's search intent, and increasing the video's exposure speed.
In one embodiment, the video title generation apparatus further performs the following operations:
if the number of video frames in the video frame set corresponding to each matching result in the matching result set is smaller than a preset threshold, determining the frame number ratio of the video frame corresponding to each matching result in the matching result set;
calculating a standard deviation value corresponding to each matching result in the matching result set;
determining the weighted average value of each matching result in the matching result set according to the frame number ratio and the standard deviation value;
calculating the coincidence degree between the video frame set corresponding to the target matching result with the largest weighted average value and the video frame set corresponding to each matching result in the other matching result subset, to obtain a second coincidence degree set, wherein the other matching result subset is the set of matching results in the matching result set other than the target matching result;

determining a second keyword set corresponding to a second target coincidence degree, wherein the second target coincidence degree is the coincidence degree with the smallest value in the second coincidence degree set;

generating a video title of the target video according to the second keyword set.
In this embodiment, if the number of video frames in the video frame set corresponding to each matching result in the matching result set is less than the preset threshold, it indicates that for every matching result the number of matched video frames in the target video is below the preset threshold; in that case Re is empty. The standard deviation value corresponding to each matching result in the matching result set is then calculated. For example, the matching result set RF corresponds to the following video frames:
(a listing, for each of N1, A1, A2, O1, O2, S1 and S2, of its corresponding video frames; the original formula image is not reproduced)
The frame number ratio of each object N1, A1, A2, O1, O2, S1, S2 is then calculated in turn. Taking N1 as an example, its frame number ratio is

P(N1) = n / FA

where n is the number of frames in the video frame set corresponding to N1 and FA is the total number of frames of the target video. The frame number ratios of N1, A1, A2, O1, O2, S1, S2 can be calculated in this way and ranked in ascending order; for example, the resulting ranking is {1, 3, 2, 7, 5, 6, 4}.
Then, according to the standard deviation formula, the standard deviation values corresponding to N1, A1, A2, O1, O2, S1, S2 are calculated. For example, the standard deviation of N1 is

σ(N1) = sqrt( (1/n) Σi (Fi - F̄)² )

where Fi is the position of the i-th frame in the video frame set corresponding to N1 and F̄ is the mean of those positions. The standard deviations of all the objects are likewise ranked in ascending order; the ranks corresponding to N1, A1, A2, O1, O2, S1, S2 are {2, 5, 4, 1, 3, 7, 6}.
Then the weighted average value of each matching result in the matching result set is determined according to the frame number ratio and the standard deviation value. Specifically, weights may be set for the frame number ratio and the standard deviation, for example 30% and 70% respectively (these may of course be adjusted according to actual conditions and are not specifically limited), so that the weighted average of N1 can be calculated by the following formula:

W(N1) = 30% × P(N1) + 70% × σ(N1)

where P(N1) is the frame number ratio of N1, σ(N1) is the standard deviation value of N1, and W(N1) is the weighted average of N1. The weighted averages of the remaining objects in the matching result set can be calculated based on the same formula.
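A sketch of this weighting step; feeding the raw frame-number ratio and the standard deviation of the matched frame positions into a 30%/70% combination is an assumption, since the embodiment's formula images are not reproduced and the text also mentions rank orderings that an implementation might combine instead:

```python
import statistics

def weighted_score(frames, total_frames, w_ratio=0.3, w_std=0.7):
    """Weighted average of one matching result, combining its frame-number
    ratio and the standard deviation of its matched frame positions."""
    ratio = len(frames) / total_frames
    std = statistics.pstdev(frames) if len(frames) > 1 else 0.0
    return w_ratio * ratio + w_std * std

# Two matching results with equally many frames, spread differently.
frame_sets = {"N1": [0, 1, 2, 3], "S1": [0, 10, 20, 30]}
scores = {k: weighted_score(v, 40) for k, v in frame_sets.items()}
target = max(scores, key=scores.get)  # result treated as the main content
```

Under this reading, frames spread across the whole video (larger standard deviation) suggest an object that appears throughout, making it a better candidate for the video's main content.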
Then the coincidence degree between the video frame set corresponding to the target matching result with the largest weighted average value and the video frame set corresponding to each matching result in the other matching result subset is calculated to obtain the second coincidence degree set. For example, if W(S1) turns out to have the largest value, the probability that S1 is one of the main contents expressed by the target video is considered the largest, and the second coincidence degree set is obtained from the coincidence degrees between the video frame set corresponding to S1 and the video frame sets of the other objects. The method for calculating the coincidence degree has been described in detail above and is not repeated here.
Finally, after the second coincidence degree set is obtained, the coincidence degree with the smallest value in the second coincidence degree set can be selected as the second target coincidence degree, the second keyword set corresponding to the second target coincidence degree is determined, and the video title of the target video can then be generated according to the second keyword set.
In one embodiment, if the matching of Hg against the sets Ng, Ag, Og and Sg produces no result, that is, the objects recognized from the target video have no intersection with the hot words of the previous period, the video title generation device can generate the video title of the target video in the following manner:
if each keyword in the keyword set is not successfully matched with the first set, the second set, the third set and the fourth set, determining a video frame set corresponding to the first set, a video frame set corresponding to the second set, a video frame set corresponding to the third set and a video frame set corresponding to the fourth set in the target video;
calculating the coincidence degree between the video frame sets corresponding to any two of the video frame set corresponding to the first set, the video frame set corresponding to the second set, the video frame set corresponding to the third set and the video frame set corresponding to the fourth set to obtain a third coincidence degree set;
determining a third keyword set corresponding to a third target coincidence degree, wherein the third target coincidence degree is the coincidence degree with the smallest value in the third coincidence degree set;
and determining the video title of the target video according to the third keyword set.
In this embodiment, if each keyword in the keyword set fails to match the first set, the second set, the third set and the fourth set, the video frames in the target video may be traversed to find the video frame set corresponding to the first set, the video frame set corresponding to the second set, the video frame set corresponding to the third set and the video frame set corresponding to the fourth set in the target video, and the coincidence degree between any two of these video frame sets may be calculated to obtain a third coincidence degree set. The calculation of the coincidence degree has been described in detail above and is not repeated here.
A third keyword set corresponding to the third target coincidence degree, i.e. the coincidence degree with the smallest value in the third coincidence degree set, is then determined, and the video title of the target video is generated according to the third keyword set.
In one embodiment, after determining the first keyword set corresponding to the first target coincidence degree, the video title generation device further performs the following operations:
determining a target video frame set corresponding to the first keyword set in the target video;
calculating the coincidence degree between the target video frame set and other video frame sets to obtain a fourth coincidence degree set, wherein the other video frame sets are video frame sets except the video frame set corresponding to the first target coincidence degree in the video frame sets corresponding to the matching result set;
determining a fourth keyword set corresponding to a fourth target coincidence degree, wherein the fourth target coincidence degree is the coincidence degree with the smallest value in the fourth coincidence degree set;
and generating a video title of the target video according to the first keyword set and the fourth keyword set.
In this embodiment, after the video title generation device determines the first keyword set corresponding to the first target coincidence degree, the video frame set corresponding to the keywords in the first keyword set may be used as a reference: the coincidence degree between this video frame set and each of the other video frame sets corresponding to the matching result set is calculated to obtain a fourth coincidence degree set (the coincidence degree calculation has been described above and is not repeated here). The fourth keyword set corresponding to the fourth target coincidence degree, i.e. the coincidence degree with the smallest value in the fourth coincidence degree set, is then determined, and the video title of the target video is generated according to the first keyword set and the fourth keyword set. In this way, more words relevant to the target video can be obtained, and a title that better fits the video content is generated.
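A sketch of this extension step: the coincidence degree is re-implemented locally so the snippet stands alone, and the frame sets and keywords are illustrative:

```python
def pd(fa, fb):
    # Degree of coincidence: mean absolute position difference over shared frames.
    shared = [f for f in fa if f in set(fb)]
    return (sum(abs(fa.index(f) - fb.index(f)) for f in shared) / len(shared)
            if shared else float("inf"))

def extend_keywords(first_kws, frame_sets):
    """Add to the first keyword set the keyword whose frame set coincides
    best (smallest PD) with the frames already covered by the first set."""
    base = sorted({f for kw in first_kws for f in frame_sets[kw]})
    others = {kw: fs for kw, fs in frame_sets.items() if kw not in first_kws}
    best = min(others, key=lambda kw: pd(base, others[kw]))
    return first_kws + [best]

frame_sets = {"Zhan X": [0, 1, 2], "basketball": [0, 1, 2, 3],
              "arena": [0, 1, 2], "referee": [7, 8]}
kws = extend_keywords(["Zhan X", "basketball"], frame_sets)
```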
In summary, in the embodiments provided by this application, by crawling the keyword set of the hot content of the previous period, matching the keywords in the keyword set against each object in the video whose title is to be generated, and determining the video frame set corresponding to each matching result, the coincidence degree between the video frame sets can be calculated, keywords associated with those in the keyword set can be determined according to the coincidence degree, and the video title can then be generated from these keywords. In this way, video titles closely tied to current hot topics can be generated for videos in batches, which improves the readability and appeal of video titles, reduces the workload of operators, and speeds up video production and online exposure, thereby improving video spread.
The method of generating a video title has been described above; the video title generation apparatus of the embodiments of the present application is described below:
referring to fig. 2, fig. 2 is a schematic diagram of an embodiment of a video title generation apparatus according to an embodiment of the present application, where the video title generation apparatus 200 includes:
a first determining unit 201, configured to determine a content feature corresponding to a target video;
an obtaining unit 202, configured to obtain a keyword set of hot spot content in a previous period;
a matching unit 203, configured to match each keyword in the keyword set with a content feature corresponding to the target video, respectively, so as to obtain a matching result set;
a second determining unit 204, configured to determine a video title of the target video according to the matching result set.
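The four units above can be sketched as a single pipeline; the content-feature representation (label mapped to recognition accuracy), the matching rule and the title template are all illustrative assumptions:

```python
def generate_video_title(content_features, hot_keywords):
    """End-to-end flow of the four units: match the previous period's hot
    keywords against the video's content features, then build a title
    from the matching result set."""
    # Matching unit: a hot keyword matches if it was recognized in the video.
    matches = [kw for kw in hot_keywords if kw in content_features]
    # Second determining unit, radically simplified: join matches into a title.
    if matches:
        return " and ".join(matches) + ": the highlights you searched for"
    return None

# Content features as label -> recognition accuracy (assumed representation).
features = {"Zhan X": 0.97, "dunk": 0.91, "basketball": 0.88, "arena": 0.80}
title = generate_video_title(features, ["basketball", "World Cup", "Zhan X"])
```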
In one possible design, the second determining unit 204 is specifically configured to:
determining a video frame set corresponding to the matching result set in the target video;
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain a first coincidence degree set;
determining a first keyword set corresponding to a first target coincidence degree, wherein the first target coincidence degree is the coincidence degree with the smallest value in the first coincidence degree set;
and generating a video title of the target video according to the first keyword set.
In one possible design, the determining, by the second determining unit 204, a set of video frames in the target video corresponding to the set of matching results includes:
matching each video frame in the target video with each matching result in the matching result set respectively to obtain a video frame set corresponding to each matching result in the matching result set;
and eliminating the video frame set corresponding to the matching result with the video frame number smaller than a preset threshold value in the matching result set to obtain the video frame set corresponding to the matching result set.
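A minimal sketch of this per-result frame-set construction and threshold filtering, assuming each video frame is represented by the set of labels a frame-level recognizer produced for it:

```python
def frame_sets_for_matches(video_frames, matches, threshold):
    """Collect, for each matching result, the indices of the frames that
    contain it, then drop results matched in fewer than `threshold` frames."""
    sets = {m: [i for i, labels in enumerate(video_frames) if m in labels]
            for m in matches}
    return {m: fs for m, fs in sets.items() if len(fs) >= threshold}

# Each frame is the set of labels recognized in it (illustrative data).
frames = [{"cat"}, {"cat", "sofa"}, {"sofa"}, {"cat"}, set()]
kept = frame_sets_for_matches(frames, ["cat", "sofa", "dog"], threshold=2)
```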
In one possible design, the second determining unit 204 is further configured to:
if the number of video frames in the video frame set corresponding to each matching result in the matching result set is smaller than the preset threshold, determining the frame number ratio of the video frame corresponding to each matching result in the matching result set;
calculating a standard deviation value corresponding to each matching result in the matching result set;
determining the weighted average value of each matching result in the matching result set according to the frame number ratio and the standard deviation value;
calculating the coincidence degree between the video frame set corresponding to the target matching result with the largest weighted average value and the video frame set corresponding to each matching result in other matching result subsets to obtain a second coincidence degree set, wherein the other matching result subsets are sets of the matching results in the matching result sets except the target matching result;
determining a second keyword set corresponding to a second target coincidence degree, wherein the second target coincidence degree is the coincidence degree with the smallest value in the second coincidence degree set;
and generating a video title of the target video according to the second keyword set.
In one possible design, the content features corresponding to the target video include a first set of person names and accuracy rates, a second set of person actions and accuracy rates, a third set of objects and accuracy rates, and a fourth set of scenes and accuracy rates corresponding to the target video, and the second determining unit 204 is further configured to:
if each keyword in the keyword set is not successfully matched with the first set, the second set, the third set and the fourth set, determining a video frame set corresponding to the first set, a video frame set corresponding to the second set, a video frame set corresponding to the third set and a video frame set corresponding to the fourth set in the target video;
calculating the coincidence degree between the video frame sets corresponding to any two of the video frame set corresponding to the first set, the video frame set corresponding to the second set, the video frame set corresponding to the third set and the video frame set corresponding to the fourth set to obtain a third coincidence degree set;
determining a third keyword set corresponding to a third target coincidence degree, wherein the third target coincidence degree is the coincidence degree with the smallest value in the third coincidence degree set;
and determining the video title of the target video according to the third keyword set.
In one possible design, the second determining unit 204 is further configured to:
determining a target video frame set corresponding to the first keyword set in the target video;
calculating the coincidence degree between the target video frame set and other video frame sets to obtain a fourth coincidence degree set, wherein the other video frame sets are video frame sets except the video frame set corresponding to the first target coincidence degree in the video frame sets corresponding to the matching result set;
determining a fourth keyword set corresponding to a fourth target coincidence degree, wherein the fourth target coincidence degree is the coincidence degree with the smallest value in the fourth coincidence degree set;
and generating a video title of the target video according to the first keyword set and the fourth keyword set.
In one possible design, the calculation by the second determining unit 204 of the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain the first coincidence degree set includes:
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set through the following formula:
PD(F1, F2) = (1/n) Σx |Fx[F1] - Fx[F2]|, summed over the n video frames Fx contained in both F1 and F2
wherein PD(F1, F2) is the coincidence degree between the video frame sets of any two matching results in the matching result set, F1 is the video frame set corresponding to any one matching result in the matching result set, F2 is any video frame set other than F1 among the video frame sets corresponding to the matching result set, Fx is any video frame contained in both F1 and F2, Fx[F1] is the position of Fx in F1, and Fx[F2] is the position of Fx in F2.
The embodiment of the application also provides a computing device, which may be a server. Referring to fig. 3, fig. 3 is a schematic structural diagram of a server 300 according to an embodiment of the present invention. The server may vary considerably in configuration or performance, and may include one or more central processing units (CPUs) 322 (e.g., one or more processors), a memory 332, and one or more storage media 330 (e.g., one or more mass storage devices) storing an application 342 or data 344. The memory 332 and the storage media 330 may be transient or persistent storage. The program stored on a storage medium 330 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Still further, the central processor 322 may be configured to communicate with the storage medium 330 and execute on the server 300 the series of instruction operations in the storage medium 330.
The server 300 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input-output interfaces 358, and/or one or more operating systems 341, such as Windows Server, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, and the like.
The steps performed by the video title generation apparatus in the above-described embodiment may be based on the server structure shown in fig. 3.
The present application further provides a computer-readable storage medium, in which at least one executable instruction is stored, and when the executable instruction is run on a computing device, the computing device is caused to execute the method for generating a video title according to any of the above embodiments.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that a computer can access, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), among others.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A method for generating a video title, comprising:
determining content characteristics corresponding to a target video;
acquiring a keyword set of hot spot content in the previous period;
matching each keyword in the keyword set with the content characteristics corresponding to the target video respectively to obtain a matching result set;
and determining the video title of the target video according to the matching result set.
2. The method of claim 1, wherein the determining a video title of the target video according to the set of matching results comprises:
determining a video frame set corresponding to the matching result set in the target video;
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain a first coincidence degree set;
determining a first keyword set corresponding to a first target coincidence degree, wherein the first target coincidence degree is the coincidence degree with the smallest value in the first coincidence degree set;
and generating a video title of the target video according to the first keyword set.
3. The method of claim 2, wherein the determining the set of video frames in the target video corresponding to the set of matching results comprises:
matching each video frame in the target video with each matching result in the matching result set respectively to obtain a video frame set corresponding to each matching result in the matching result set;
and eliminating the video frame set corresponding to the matching result with the video frame number smaller than a preset threshold value in the matching result set to obtain the video frame set corresponding to the matching result set.
4. The method of claim 2, further comprising:
if the number of video frames in the video frame set corresponding to each matching result in the matching result set is smaller than the preset threshold, determining the frame number ratio of the video frame corresponding to each matching result in the matching result set;
calculating a standard deviation value corresponding to each matching result in the matching result set;
determining the weighted average value of each matching result in the matching result set according to the frame number ratio and the standard deviation value;
calculating the coincidence degree between the video frame set corresponding to the target matching result with the largest weighted average value and the video frame set corresponding to each matching result in other matching result subsets to obtain a second coincidence degree set, wherein the other matching result subsets are sets of the matching results in the matching result sets except the target matching result;
determining a second keyword set corresponding to a second target coincidence degree, wherein the second target coincidence degree is the coincidence degree with the smallest value in the second coincidence degree set;
and generating a video title of the target video according to the second keyword set.
5. The method of claim 1, wherein the content features corresponding to the target video comprise a first set of person names and accuracy rates, a second set of person actions and accuracy rates, a third set of objects and accuracy rates, and a fourth set of scenes and accuracy rates corresponding to the target video, and wherein the method further comprises:
if each keyword in the keyword set is not successfully matched with the first set, the second set, the third set and the fourth set, determining a video frame set corresponding to the first set, a video frame set corresponding to the second set, a video frame set corresponding to the third set and a video frame set corresponding to the fourth set in the target video;
calculating the coincidence degree between the video frame sets corresponding to any two of the video frame set corresponding to the first set, the video frame set corresponding to the second set, the video frame set corresponding to the third set and the video frame set corresponding to the fourth set to obtain a third coincidence degree set;
determining a third keyword set corresponding to a third target coincidence degree, wherein the third target coincidence degree is the coincidence degree with the smallest value in the third coincidence degree set;
and determining the video title of the target video according to the third keyword set.
6. The method according to any one of claims 2 to 5, wherein after determining the first keyword set corresponding to the first target coincidence degree, the method further comprises:
determining a target video frame set corresponding to the first keyword set in the target video;
calculating the coincidence degree between the target video frame set and other video frame sets to obtain a fourth coincidence degree set, wherein the other video frame sets are video frame sets except the video frame set corresponding to the first target coincidence degree in the video frame sets corresponding to the matching result set;
determining a fourth keyword set corresponding to a fourth target coincidence degree, wherein the fourth target coincidence degree is the coincidence degree with the smallest value in the fourth coincidence degree set;
and generating a video title of the target video according to the first keyword set and the fourth keyword set.
7. The method according to any one of claims 2 to 5, wherein calculating the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain the first coincidence degree set comprises:
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set through the following formula:
PD(F1, F2) = (1/n) Σx |Fx[F1] - Fx[F2]|, summed over the n video frames Fx contained in both F1 and F2
wherein PD(F1, F2) is the coincidence degree between the video frame sets of any two matching results in the matching result set, F1 is the video frame set corresponding to any one matching result in the matching result set, F2 is any video frame set other than F1 among the video frame sets corresponding to the matching result set, Fx is any video frame contained in both F1 and F2, Fx[F1] is the position of Fx in F1, and Fx[F2] is the position of Fx in F2.
8. A video title generation apparatus, comprising:
the first determining unit is used for determining content characteristics corresponding to the target video;
the acquisition unit is used for acquiring a keyword set of hot spot content in the previous period;
the matching unit is used for matching each keyword in the keyword set with the content characteristics corresponding to the target video respectively to obtain a matching result set;
and the second determining unit is used for determining the video title of the target video according to the matching result set.
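The four units of claim 8 can be summarized as a minimal pipeline sketch. All helper callables (extract_features, match_keyword, compose_title) are hypothetical stand-ins for implementations the claim deliberately leaves unspecified; only the four-step structure comes from the claim.

```python
def generate_video_title(target_video, hotspot_keywords,
                         extract_features, match_keyword, compose_title):
    """Sketch of claim 8's apparatus as a function pipeline.

    The three helper callables are assumptions; the claim only fixes
    the order of operations, not how each unit works internally.
    """
    # First determining unit: content features of the target video.
    features = extract_features(target_video)
    # Matching unit: match each hotspot keyword against the features.
    matching_results = [match_keyword(kw, features) for kw in hotspot_keywords]
    # Second determining unit: derive the video title from the matching results.
    return compose_title(matching_results)
```

For example, with a bag-of-words feature extractor and a title composed from the matched keywords, a video whose features contain "cat" but not "dog" would yield a title built from "cat" alone.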
9. A computing device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;
the memory is used for storing at least one executable instruction which causes the processor to execute the steps of the video title generation method according to any one of claims 1-7.
10. A computer-readable storage medium having stored therein at least one executable instruction that, when executed on a computing device, causes the computing device to perform the method of generating a video title according to any one of claims 1 to 7.
CN202111610447.7A 2021-12-27 2021-12-27 Video title generation method and device and storage medium Pending CN114298018A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111610447.7A CN114298018A (en) 2021-12-27 2021-12-27 Video title generation method and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111610447.7A CN114298018A (en) 2021-12-27 2021-12-27 Video title generation method and device and storage medium

Publications (1)

Publication Number Publication Date
CN114298018A true CN114298018A (en) 2022-04-08

Family

ID=80969687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111610447.7A Pending CN114298018A (en) 2021-12-27 2021-12-27 Video title generation method and device and storage medium

Country Status (1)

Country Link
CN (1) CN114298018A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114938477A (en) * 2022-06-23 2022-08-23 阿里巴巴(中国)有限公司 Video topic determination method, device and equipment
CN114938477B (en) * 2022-06-23 2024-05-03 阿里巴巴(中国)有限公司 Video topic determination method, device and equipment

Similar Documents

Publication Publication Date Title
US20230012732A1 (en) Video data processing method and apparatus, device, and medium
CN109905772B (en) Video clip query method, device, computer equipment and storage medium
WO2017181612A1 (en) Personalized video recommendation method and device
CN107862022B (en) Culture resource recommendation system
CN110309795B (en) Video detection method, device, electronic equipment and storage medium
CN106951571B (en) Method and device for labeling application with label
CN108319723A (en) A kind of picture sharing method and device, terminal, storage medium
CN106844685B (en) Method, device and server for identifying website
CN109471978B (en) Electronic resource recommendation method and device
US9286379B2 (en) Document quality measurement
US10346496B2 (en) Information category obtaining method and apparatus
WO2018196553A1 (en) Method and apparatus for obtaining identifier, storage medium, and electronic device
CN110413867B (en) Method and system for content recommendation
US9245035B2 (en) Information processing system, information processing method, program, and non-transitory information storage medium
CN108959323B (en) Video classification method and device
CN111783712A (en) Video processing method, device, equipment and medium
CN112052387A (en) Content recommendation method and device and computer readable storage medium
CN113407773A (en) Short video intelligent recommendation method and system, electronic device and storage medium
CN113779381A (en) Resource recommendation method and device, electronic equipment and storage medium
CN111708909A (en) Video tag adding method and device, electronic equipment and computer-readable storage medium
CN114154013A (en) Video recommendation method, device, equipment and storage medium
CN112328833A (en) Label processing method and device and computer readable storage medium
CN114298018A (en) Video title generation method and device and storage medium
CN110929169A (en) Position recommendation method based on improved Canopy clustering collaborative filtering algorithm
CN110990705B (en) News processing method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination