CN114298018A - Video title generation method and device and storage medium - Google Patents


Info

Publication number
CN114298018A
Authority
CN
China
Prior art keywords
video, matching result, target, video frame, keyword
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111610447.7A
Other languages
Chinese (zh)
Inventor
吴庆双
周效军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, MIGU Culture Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202111610447.7A priority Critical patent/CN114298018A/en
Publication of CN114298018A publication Critical patent/CN114298018A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application discloses a video title generation method and related equipment, which can generate, for a video, a title closely tied to current hot topics. The method comprises the following steps: determining content characteristics corresponding to a target video; acquiring a keyword set of hot spot content in the previous period; matching each keyword in the keyword set with the content characteristics corresponding to the target video respectively to obtain a matching result set; and determining the video title of the target video according to the matching result set.

Description

Video title generation method and device and storage medium
Technical Field
The present application relates to the field of internet technologies, and in particular, to a method and an apparatus for generating a video title, and a storage medium.
Background
In the self-media era, the importance of short-video titles is clear: a good title brings higher traffic, and different titles can produce markedly different results even for the same content. Short-video titles are currently edited manually for the most part and are usually related to the content. Some automatically generated titles simply pad the short-video title with a random selection from a list of generic words.
Manual title editing places high demands on the video creator, on the copywriting skill of the video publisher or operator, and on their sensitivity to hot topics; if manual editing is used for mass-produced short videos, it becomes even harder to meet the demand of exposing videos as quickly as they are produced.
In the conventional automatic title generation method, the background system randomly pads the short-video title from a list of generic words. The generated titles are featureless and uninspired and do not reflect the video content well; although this meets the need for batch production of short videos, such uniform titles do little to promote the exposure and spread of the videos.
Disclosure of Invention
The embodiment of the application provides a method and a device for generating a video title and a storage medium, which can generate, for a video, a title closely tied to current hot topics, improve the readability and interest of the video title, and reduce the workload of operators.
A first aspect of the present application provides a method for generating a video title, which may include:
determining content characteristics corresponding to a target video;
acquiring a keyword set of hot spot content in the previous period;
matching each keyword in the keyword set with the content characteristics corresponding to the target video respectively to obtain a matching result set;
and determining the video title of the target video according to the matching result set.
In one possible design, the determining the video title of the target video according to the matching result set includes:
determining a video frame set corresponding to the matching result set in the target video;
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain a first coincidence degree set;
determining a first keyword set corresponding to a first target coincidence degree, wherein the first target coincidence degree is the coincidence degree with the minimum value in the first coincidence degree set;
and generating a video title of the target video according to the first keyword set.
In one possible design, the determining a set of video frames in the target video corresponding to the set of matching results includes:
matching each video frame in the target video with each matching result in the matching result set respectively to obtain a video frame set corresponding to each matching result in the matching result set;
and eliminating the video frame set corresponding to the matching result with the video frame number smaller than a preset threshold value in the matching result set to obtain the video frame set corresponding to the matching result set.
In one possible design, the method further includes:
if the number of video frames in the video frame set corresponding to each matching result in the matching result set is smaller than the preset threshold, determining the frame number ratio of the video frame corresponding to each matching result in the matching result set;
calculating a standard deviation value corresponding to each matching result in the matching result set;
determining the weighted average value of each matching result in the matching result set according to the frame number ratio and the standard deviation value;
calculating the coincidence degree between the video frame set corresponding to the target matching result with the largest weighted average value and the video frame set corresponding to each matching result in other matching result subsets to obtain a second coincidence degree set, wherein the other matching result subsets are sets of the matching results in the matching result sets except the target matching result;
determining a second keyword set corresponding to a second target coincidence degree, wherein the second target coincidence degree is the coincidence degree with the minimum value in the second coincidence degree set;
and generating a video title of the target video according to the second keyword set.
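Since the text leaves the weighting open, the fallback scoring in this design might look like the following sketch; the 50/50 weights, the use of per-frame accuracies for the standard deviation, and the sign convention (lower spread scores higher) are all assumptions not fixed by the text:

```python
from statistics import pstdev

def weighted_scores(result_frame_accs, total_frames, w_ratio=0.5, w_std=0.5):
    """Fallback scoring when every matching result covers fewer frames than
    the threshold: combine each result's frame-count ratio with the standard
    deviation of its per-frame accuracies into one weighted score.
    `result_frame_accs` maps each matching result to the list of per-frame
    accuracies of the frames it appears in (an assumed data shape)."""
    scores = {}
    for label, accs in result_frame_accs.items():
        ratio = len(accs) / total_frames          # frame number ratio
        std = pstdev(accs) if len(accs) > 1 else 0.0
        scores[label] = w_ratio * ratio - w_std * std  # penalize high spread
    return scores
```

The target matching result would then be the key with the largest score.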
In one possible design, the content features corresponding to the target video include a first set of person names and accuracy rates, a second set of person actions and accuracy rates, a third set of objects and accuracy rates, and a fourth set of scenes and accuracy rates corresponding to the target video, and the method further includes:
if each keyword in the keyword set is not successfully matched with the first set, the second set, the third set and the fourth set, determining a video frame set corresponding to the first set, a video frame set corresponding to the second set, a video frame set corresponding to the third set and a video frame set corresponding to the fourth set in the target video;
calculating the coincidence degree between the video frame sets corresponding to any two of the video frame set corresponding to the first set, the video frame set corresponding to the second set, the video frame set corresponding to the third set and the video frame set corresponding to the fourth set to obtain a third coincidence degree set;
determining a third keyword set corresponding to a third target coincidence degree, wherein the third target coincidence degree is the coincidence degree with the minimum value in the third coincidence degree set;
and determining the video title of the target video according to the third keyword set.
In one possible design, after determining the first keyword set corresponding to the first target coincidence degree, the method further includes:
determining a target video frame set corresponding to the first keyword set in the target video;
calculating the coincidence degree between the target video frame set and other video frame sets to obtain a fourth coincidence degree set, wherein the other video frame sets are video frame sets except the video frame set corresponding to the first target coincidence degree in the video frame sets corresponding to the matching result set;
determining a fourth keyword set corresponding to a fourth target coincidence degree, wherein the fourth target coincidence degree is the coincidence degree with the minimum value in the fourth coincidence degree set;
and generating a video title of the target video according to the first keyword set and the fourth keyword set.
In one possible design, the calculating the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain a first coincidence degree set includes:
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set through the following formula:
PD(F1, F2) = ( Σ |Fx[F1] − Fx[F2]| ) / n, where the sum runs over every video frame Fx contained in both F1 and F2, and n is the number of such shared frames;
wherein PD(F1, F2) is the coincidence degree between the video frame sets of any two matching results in the matching result set, F1 is the video frame set corresponding to any one matching result in the matching result set, F2 is any video frame set other than F1 among the video frame sets corresponding to the matching result set, Fx is any video frame contained in both F1 and F2, Fx[F1] is the position of Fx in F1, and Fx[F2] is the position of Fx in F2.
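As a concrete reading of the formula, a minimal Python sketch of the coincidence degree, with positions taken as indices within each ordered frame set; treating sets with no shared frames as infinitely far apart is an assumption, since the patent does not define PD for that case:

```python
def overlap_degree(f1, f2):
    """PD(F1, F2): mean absolute difference between the positions of the
    frames shared by the two ordered frame sets. A smaller value means the
    two matching results co-occur in more of the same pictures."""
    shared = [f for f in f1 if f in set(f2)]
    if not shared:
        return float("inf")  # assumption: no shared frames -> maximally disjoint
    pos1 = {f: i for i, f in enumerate(f1)}  # Fx[F1]
    pos2 = {f: i for i, f in enumerate(f2)}  # Fx[F2]
    return sum(abs(pos1[f] - pos2[f]) for f in shared) / len(shared)
```

For two sets that list the same frames in the same order the degree is 0, matching the text's reading that a smaller PD means more simultaneous appearances.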
A second aspect of the present application provides a video title generation apparatus, including:
the first determining unit is used for determining content characteristics corresponding to the target video;
the acquisition unit is used for acquiring a keyword set of hot spot content in the previous period;
the matching unit is used for matching each keyword in the keyword set with the content characteristics corresponding to the target video respectively to obtain a matching result set;
and the second determining unit is used for determining the video title of the target video according to the matching result set.
In one possible design, the second determining unit is specifically configured to:
determining a video frame set corresponding to the matching result set in the target video;
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain a first coincidence degree set;
determining a first keyword set corresponding to a first target coincidence degree, wherein the first target coincidence degree is the coincidence degree with the minimum value in the first coincidence degree set;
and generating a video title of the target video according to the first keyword set.
In one possible design, the determining, by the second determining unit, a set of video frames in the target video corresponding to the set of matching results includes:
matching each video frame in the target video with each matching result in the matching result set respectively to obtain a video frame set corresponding to each matching result in the matching result set;
and eliminating the video frame set corresponding to the matching result with the video frame number smaller than a preset threshold value in the matching result set to obtain the video frame set corresponding to the matching result set.
In one possible design, the second determination unit is further configured to:
if the number of video frames in the video frame set corresponding to each matching result in the matching result set is smaller than the preset threshold, determining the frame number ratio of the video frame corresponding to each matching result in the matching result set;
calculating a standard deviation value corresponding to each matching result in the matching result set;
determining the weighted average value of each matching result in the matching result set according to the frame number ratio and the standard deviation value;
calculating the coincidence degree between the video frame set corresponding to the target matching result with the largest weighted average value and the video frame set corresponding to each matching result in other matching result subsets to obtain a second coincidence degree set, wherein the other matching result subsets are sets of the matching results in the matching result sets except the target matching result;
determining a second keyword set corresponding to a second target coincidence degree, wherein the second target coincidence degree is the coincidence degree with the minimum value in the second coincidence degree set;
and generating a video title of the target video according to the second keyword set.
In one possible design, the content features corresponding to the target video include a first set of person names and accuracy rates, a second set of person actions and accuracy rates, a third set of objects and accuracy rates, and a fourth set of scenes and accuracy rates corresponding to the target video, and the second determining unit is further configured to:
if each keyword in the keyword set is not successfully matched with the first set, the second set, the third set and the fourth set, determining a video frame set corresponding to the first set, a video frame set corresponding to the second set, a video frame set corresponding to the third set and a video frame set corresponding to the fourth set in the target video;
calculating the coincidence degree between the video frame sets corresponding to any two of the video frame set corresponding to the first set, the video frame set corresponding to the second set, the video frame set corresponding to the third set and the video frame set corresponding to the fourth set to obtain a third coincidence degree set;
determining a third keyword set corresponding to a third target coincidence degree, wherein the third target coincidence degree is the coincidence degree with the minimum value in the third coincidence degree set;
and determining the video title of the target video according to the third keyword set.
In one possible design, the second determination unit is further configured to:
determining a target video frame set corresponding to the first keyword set in the target video;
calculating the coincidence degree between the target video frame set and other video frame sets to obtain a fourth coincidence degree set, wherein the other video frame sets are video frame sets except the video frame set corresponding to the first target coincidence degree in the video frame sets corresponding to the matching result set;
determining a fourth keyword set corresponding to a fourth target coincidence degree, wherein the fourth target coincidence degree is the coincidence degree with the minimum value in the fourth coincidence degree set;
and generating a video title of the target video according to the first keyword set and the fourth keyword set.
In one possible design, the calculating, by the second determining unit, a coincidence degree between video frame sets of any two matching results in the matching result set, and obtaining a first coincidence degree set includes:
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set through the following formula:
PD(F1, F2) = ( Σ |Fx[F1] − Fx[F2]| ) / n, where the sum runs over every video frame Fx contained in both F1 and F2, and n is the number of such shared frames;
wherein PD(F1, F2) is the coincidence degree between the video frame sets of any two matching results in the matching result set, F1 is the video frame set corresponding to any one matching result in the matching result set, F2 is any video frame set other than F1 among the video frame sets corresponding to the matching result set, Fx is any video frame contained in both F1 and F2, Fx[F1] is the position of Fx in F1, and Fx[F2] is the position of Fx in F2.
A third aspect of the present application provides a computing device comprising: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the steps of the video title generation method.
A fourth aspect of the present application provides a computer-readable storage medium having at least one executable instruction stored therein, which when run on a computing device, causes the computing device to perform the method for generating a video title according to the first aspect of the present application.
A fifth aspect of the present application discloses a computer program product, which, when run on a computer, causes the computer to execute the method for generating a video title of the first aspect of the present application.
A sixth aspect of the present application discloses an application distribution platform for distributing a computer program product, wherein when the computer program product runs on a computer, the computer is caused to execute the method for generating a video title according to the first aspect of the present application.
According to the technical scheme, the embodiment of the application has the following advantages:
In the embodiments provided by the application, the keyword set of the hot content in the previous period is crawled, the matches between the keywords in the keyword set and each object in the video whose title is to be generated are determined, and the video frame set corresponding to each matching result is determined, so that the coincidence degree between the video frame sets can be calculated, the related keywords selected according to the coincidence degree, and the video title of the video generated from those keywords. In this way, video titles closely tied to current hot topics can be generated for videos in batches, which improves the readability and interest of the video titles, reduces the workload of operators, and speeds up video production and online exposure, thereby improving how well videos spread.
Drawings
The drawings are only for purposes of illustrating embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
fig. 1 is a schematic flowchart of a method for generating a video title in an embodiment of the present application;
fig. 2 is a schematic view of a virtual structure of a video title generation apparatus according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a server in an embodiment of the present application.
Detailed Description
To enable a person skilled in the art to better understand the present application, the technical solutions in the embodiments of the present application are described below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. The embodiments in the present application shall fall within the protection scope of the present application.
The following describes a method for generating a video title provided by the present application in more detail from the perspective of a video title generating device, which may be a server or a service unit in the server.
Referring to fig. 1, fig. 1 is a schematic diagram of an embodiment of a method for generating a video title according to an embodiment of the present application, including:
101. and determining the content characteristics corresponding to the target video.
In this embodiment, when a creator publishes a new target video, the video title generation device may, after receiving it, analyze the target video to determine its content features. The content features include: a first set of person names (names of the people appearing in the target video) and accuracy rates; a second set of person actions (actions of the people appearing in the target video) and accuracy rates; a third set of objects (objects appearing in the target video, such as a basketball stand or a table tennis table) and accuracy rates; and a fourth set of scenes (scenes appearing in the target video, such as a football field or a basketball court) and accuracy rates. The accuracy rate is the recognition accuracy obtained by identifying each frame of the target video, and the target video is the video whose title is to be generated. Specifically, the video title generation device may input the target video into different neural networks to obtain the sets of persons, actions, scenes and objects in the target video together with the corresponding accuracy rates, as described in detail below:
First, a neural-network-based face recognition algorithm detects and recognizes the target video frame by frame and outputs the set of person names whose accuracy rate exceeds 95% together with the accuracy rates. This first set is denoted Ng = {N1:P1, N2:P2, ..., Nx:Px}, where N is a person name and P is the accuracy rate corresponding to that name, and the first set is stored in the background data cache.
Then, a region-based 3D convolutional network for temporal action detection detects and recognizes the target video frame by frame to find the potential action intervals in the target video and determine the action types, and outputs the set of person actions with accuracy rates above 95% (the second set), denoted Ag = {A1:P1, A2:P2, ..., Ax:Px}, where A is a person action and P is the corresponding accuracy rate. The second set is then stored in the background data cache.
Next, an object detection model detects and recognizes the target video frame by frame to obtain the set of objects with accuracy rates above 95% (the third set), denoted Og = {O1:P1, O2:P2, ..., Ox:Px}, where O is an object and P is the corresponding accuracy rate. The third set is stored in the background data cache.
Finally, a scene classification algorithm detects and recognizes the target video frame by frame to obtain the scenes appearing in it, yielding the set of scenes with accuracy rates above 95% (the fourth set), denoted Sg = {S1:P1, S2:P2, ..., Sx:Px}, where S is a scene and P is the corresponding accuracy rate. The fourth set is stored in the background data cache.
The above models may be trained in advance on training samples, and the accuracy threshold may take other values, for example 90%; it is not specifically limited here.
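The four recognition passes above all reduce to the same aggregation step: keep every recognized label whose accuracy exceeds the threshold and remember its best score. A minimal sketch, with the per-frame model outputs stubbed as plain dicts (the actual face/action/object/scene models are outside the scope of this sketch):

```python
def collect_features(frame_results, threshold=0.95):
    """Aggregate per-frame recognizer output into a {label: best_accuracy}
    set, keeping only labels whose accuracy exceeds the threshold.
    `frame_results` is a list of {label: confidence} dicts, one per frame,
    as a stand-in for the per-frame output of a recognition model."""
    best = {}
    for labels in frame_results:
        for label, conf in labels.items():
            if conf > threshold:
                best[label] = max(best.get(label, 0.0), conf)
    return best
```

Running this once per recognizer yields the Ng, Ag, Og and Sg sets to be stored in the background data cache.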
102. And acquiring a keyword set of hot spot content in the previous period.
In this embodiment, the video title generation device may obtain the keyword set of the hot content in the previous period; that is, a timed task may be set in the background every morning to crawl the previous day's hot content from a hot-search list or trending list, process it with natural language processing to extract the corresponding keyword set, and denote it Hg = {H1, H2, ..., Hx}. It can be understood that the period may be set to 1 day or to 12 hours, or of course to some other duration; likewise, the hot content may be crawled from the hot-search or trending leaderboards, from other leaderboards, or from a self-configured source, without specific limitation.
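The crawl-and-extract step might be sketched as follows; a real system would use proper word segmentation and a real leaderboard source, so the whitespace tokenizer and stopword list here are purely illustrative assumptions:

```python
import re

def extract_keywords(hot_items, stopwords=frozenset({"the", "a", "of"})):
    """Toy keyword extraction: split each trending headline into tokens,
    lowercase them, and drop stopwords and duplicates, preserving order.
    `hot_items` stands in for the crawled hot-search / trending entries."""
    keywords = []
    for item in hot_items:
        for tok in re.findall(r"\w+", item.lower()):
            if tok not in stopwords and tok not in keywords:
                keywords.append(tok)
    return keywords
```

The returned list plays the role of Hg; scheduling it daily would be handled by the background timed task mentioned above.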
It should be noted that the content features corresponding to the target video may be determined through step 101 and the keyword set obtained through step 102, but there is no fixed execution order between the two steps: step 101 may be executed first, step 102 may be executed first, or the two may be executed simultaneously, without specific limitation.
103. And matching each keyword in the keyword set with the content characteristics corresponding to the target video respectively to obtain a matching result set.
In this embodiment, after obtaining the keyword set of the previous period and the content features corresponding to the target video, the video title generation device may match each keyword in the keyword set against the first set, the second set, the third set and the fourth set respectively to obtain a matching result set. Specifically, it may traverse each keyword in Hg and match it in turn against each element of Ng, Ag, Og and Sg. For example, if the first keyword H1 in Hg is "Zhan X" and N1 in Ng is also "Zhan X", the matching result is added to the matching result set Rg, which is then Rg = {N1:P1}; by analogy, H2 is matched against Ng, Ag, Og and Sg, and assuming the matching result A2:P2 is obtained, it is added to Rg, giving Rg = {N1:P1, A2:P2}. Matching in turn finally yields:
Rg = {N1:P1, ..., Nx1:Px1, A1:P1, ..., Ax2:Px2, O1:P1, ..., Ox3:Px3, S1:P1, ..., Sx4:Px4}
Each x in Rg may be any number. For example, if x1 = 1, x2 = 2, x3 = 2 and x4 = 2, then Rg = {N1:P1, A1:P1, A2:P2, O1:P1, O2:P2, S1:P2, S2:P2}, i.e. the keyword set of the previous period's hot content contains one person, two actions, two objects and two scenes of the target video.
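The matching step above, taken literally as exact label lookup (an assumption; the text does not specify the matching criterion), can be sketched as:

```python
def match_keywords(keywords, feature_sets):
    """Traverse the keyword set H_g and match each keyword, in turn, against
    the name/action/object/scene sets N_g, A_g, O_g, S_g (passed as dicts
    mapping label -> accuracy); collect the hits into the matching result
    set R_g."""
    r_g = {}
    for kw in keywords:
        for features in feature_sets:  # N_g, A_g, O_g, S_g in order
            if kw in features:
                r_g[kw] = features[kw]
    return r_g
```

With the example above, a keyword equal to N1 contributes N1:P1 to Rg, and unmatched keywords contribute nothing.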
104. And determining the video title of the target video according to the matching result set.
In this embodiment, after determining the matching result set, the video title generation apparatus may determine the video title of the target video according to the matching result set. The following is a detailed description of determining a video title of a target video according to a matching result set:
step 1, determining a video frame set corresponding to the matching result set in the target video.
In this step, the video title generating device may determine a video frame set corresponding to the matching result set in the target video, that is, match each video frame in the target video with each matching result in the matching result set, so as to obtain a video frame set corresponding to each matching result in the matching result set; and eliminating the video frame set corresponding to the matching result with the video frame number smaller than the preset threshold value in the matching result set to obtain the video frame set corresponding to the matching result set.
The following takes Rg = {N1:P1, A1:P1, A2:P2, O1:P1, O2:P2, S1:P2, S2:P2} as an example to explain how to determine the video frame sets corresponding to the matching results in the target video:
the video title generation device can traverse all video frames in the target video, detect tasks, actions, objects and scenes in each frame respectively, and detect all N1,A1,A2,O1,O2,S1,S2Marking the video frame to obtain the video frame and N in the target video1,A1,A2,O1,O2,S1,S2A set of relationships RFAssuming that all frames of the target video are FARequires a set of relationships RFThe number of frames corresponding to the medium task, the action, the object and the scene is FAMore than 50% (of course, the actual condition may be determined)Set, for example 60%, without being limited in particular), for example at N1,A1,A2,O1,O2,S1,S2In, S2If the frame number ratio of the corresponding video frame is lower than 50%, S is set2Filtering the corresponding video frames, and finally obtaining a filtered relation set as follows:
Figure RE-GDA0003537624310000101
wherein, F is a video frame in the target video and corresponds to N1,A1,A2,O1,O2,S1The position where the object appears in the target video.
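The 50% frame-ratio filter described above can be sketched as follows; the threshold is a parameter, matching the text's note that it may be set to other values such as 60%:

```python
def filter_by_frame_ratio(result_frames, total_frames, min_ratio=0.5):
    """Drop matching results whose marked video frames cover less than
    `min_ratio` of the whole target video. `result_frames` maps each
    matching result label to the list of frames it appears in."""
    return {label: frames
            for label, frames in result_frames.items()
            if len(frames) / total_frames >= min_ratio}
```

In the running example, S2's frames fall below the ratio and are removed, leaving the filtered relation set.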
And 2, calculating the contact ratio between the video frame sets of any two matching results in the matching result set to obtain a first contact ratio set.
In this step, after obtaining the video frame sets corresponding to the matching result set, the video title generation device may calculate the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain the first coincidence degree set. Specifically, the coincidence degree between the video frame sets of any two matching results may be calculated by the following formula:
PD(F1, F2) = ( Σ |Fx[F1] − Fx[F2]| ) / n, where the sum runs over every video frame Fx contained in both F1 and F2, and n is the number of such shared frames;
wherein PD(F1, F2) is the coincidence degree between the video frame sets of any two matching results in the matching result set, F1 is the video frame set corresponding to any one matching result in the matching result set, F2 is any video frame set other than F1 among the video frame sets corresponding to the matching result set, Fx is any video frame contained in both F1 and F2, Fx[F1] is the position of Fx in F1, and Fx[F2] is the position of Fx in F2.
Taking the matching results N1, A1, A2, O1, O2, S1, S2 as an example, calculate the coincidence degree between the video frame set FN1 = {F1, F2, ..., Fn} corresponding to N1 and the video frame set FA1 = {F1, F2, ..., Fn, Fn+2} corresponding to A1. Let Fx[FN1] be the position of Fx in FN1 and Fx[FA1] its position in FA1; then Fx[FN1] − Fx[FA1] is the difference between the position of video frame Fx in the video frame set of N1 and its position in the video frame set of A1, and the formula finally gives:
PD(FN1, FA1) = ( Σ |Fx[FN1] − Fx[FA1]| ) / n, summed over the frames shared by FN1 and FA1,
namely the average of the absolute values of the differences between the positions of the same video frames in FN1 and FA1, which is the coincidence degree between the video frame set FN1 of N1 and the video frame set FA1 of A1. The smaller the coincidence degree PD between two video frame sets, the more pictures of the target video contain N1 and A1 simultaneously.
Then the coincidence degree between FN1 and the video frame set FA2 of A2 is calculated by the same formula:
PD(FN1, FA2) = ( Σ |Fx[FN1] − Fx[FA2]| ) / n, summed over the frames shared by FN1 and FA2,
wherein PD(FN1, FA2) is the coincidence degree between FN1 and FA2, and Fx[FN1] − Fx[FA2] is the difference between the position of video frame Fx in the video frame set of N1 and its position in the video frame set of A2.
Finally, in the same way for every pairwise combination, PD(FN1, FO1), ..., PD(FO2, FS1) are calculated, yielding the first coincidence degree set.
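Computing PD for every pair of matching results and selecting the pair with the smallest coincidence degree (steps 2 and 3 together) can be sketched as follows; `overlap_degree` restates the formula above, with disjoint sets treated as infinitely far apart by assumption:

```python
from itertools import combinations

def overlap_degree(f1, f2):
    """Mean absolute position difference over the frames shared by the
    two ordered frame sets (the PD formula from the text)."""
    shared = [f for f in f1 if f in set(f2)]
    if not shared:
        return float("inf")  # assumption: no shared frames
    pos1 = {f: i for i, f in enumerate(f1)}
    pos2 = {f: i for i, f in enumerate(f2)}
    return sum(abs(pos1[f] - pos2[f]) for f in shared) / len(shared)

def min_overlap_pair(result_frames):
    """Compute PD for every pair of matching results and return the pair
    of labels with the smallest coincidence degree (the first keyword set)."""
    pairs = {(a, b): overlap_degree(result_frames[a], result_frames[b])
             for a, b in combinations(result_frames, 2)}
    return min(pairs, key=pairs.get)
```

The returned pair corresponds to the keywords most likely to appear simultaneously in the target video.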
And step 3, determining a first keyword set corresponding to the first target coincidence degree.
In this step, after determining the first coincidence degree set, the video title generation device may first determine the first target coincidence degree, namely the coincidence degree with the minimum value in the first coincidence degree set, and then determine the first keyword set corresponding to it. For example, if PD(FN1, FO1) is finally the smallest, N1 is "Zhan X" and O1 is "basketball", this means that "Zhan X" and "basketball" are the most likely to appear simultaneously in the target video and that the total duration of the video frames corresponding to them accounts for more than 50% of the target video; "Zhan X" and "basketball" then constitute the first keyword set.
And 4, generating a video title of the target video according to the first keyword set.
In this step, after obtaining the first keyword set, the video title generation device may generate a video title through natural language understanding according to the first keyword set. The manner of generating the video title is not particularly limited, as long as a video title can be generated through natural language understanding. In addition, after the video title of the target video is generated, the target video can be published; when a search keyword input by a user matches the video title of the target video, the target video is pushed to the user, thereby realizing accurate pushing.
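The embodiment deliberately leaves the generation method open; as a minimal stand-in for the natural language step, a template-based sketch (the template strings and keywords are invented for illustration, and a real NLU/NLG model could replace them):

```python
def generate_title(keywords):
    """Turn the first keyword set into a headline with a fixed template;
    a placeholder for the natural language understanding step."""
    if len(keywords) >= 2:
        return f"When {keywords[0]} meets {keywords[1]}: don't miss this moment"
    return f"A moment of {keywords[0]} worth watching"

title = generate_title(["Zhan X", "basketball"])
```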
It should be noted that with a title that fits the video content, when a user searches for videos related to certain content, the content the user wants to see can be surfaced better by matching the user's input against the generated titles, improving the relevance of search results. Alternatively, when some hot content emerges, the video material already in the resource library can be combined with the video's keywords to generate titles related to the hot content, so that past video content is recommended again as material for the hot topic. For example, a user previously uploaded a video of a cute pet cat, and the video content keywords (cat, wrecking the home) are identified; according to the trending search topic "what bad intentions could a cat have", the historical video is given a new title. Through the above steps, the backend matches this video so that when a user searches that topic, the video can be recommended to the user again, making full use of video material, fitting the user's search intent, and increasing the video's exposure speed.
In one embodiment, the video title generation apparatus further performs the following operations:
if the number of video frames in the video frame set corresponding to each matching result in the matching result set is smaller than a preset threshold, determining the frame number ratio of the video frame corresponding to each matching result in the matching result set;
calculating a standard deviation value corresponding to each matching result in the matching result set;
determining the weighted average value of each matching result in the matching result set according to the frame number ratio and the standard deviation value;
calculating the coincidence degree between the video frame set corresponding to the target matching result with the largest weighted average value and the video frame set corresponding to each matching result in the other matching result subset, to obtain a second coincidence degree set, wherein the other matching result subset is the set of matching results in the matching result set other than the target matching result;

determining a second keyword set corresponding to a second target coincidence degree, wherein the second target coincidence degree is the coincidence degree with the smallest value in the second coincidence degree set;

generating a video title of the target video according to the second keyword set.
In this embodiment, if the number of video frames in the video frame set corresponding to each matching result in the matching result set is less than the preset threshold, it indicates that for every matching result the number of matched video frames in the target video is below the preset threshold; in that case Re is empty. The standard deviation value corresponding to each matching result in the matching result set is then calculated. For example, the matching result set RF corresponds to the following video frames:
(a listing, for each of N1, A1, A2, O1, O2, S1 and S2, of its corresponding video frames; the original formula image is not reproduced)
The frame number ratio of each object N1, A1, A2, O1, O2, S1, S2 is then calculated in turn. Taking N1 as an example, its frame number ratio is

P(N1) = n / FA

where n is the number of frames in the video frame set corresponding to N1 and FA is the total number of frames of the target video. The frame number ratios of N1, A1, A2, O1, O2, S1, S2 can be calculated in this way and ranked in ascending order; for example, the resulting ranking is {1, 3, 2, 7, 5, 6, 4}.
Then, according to the standard deviation formula, the standard deviation values corresponding to N1, A1, A2, O1, O2, S1, S2 are calculated. For example, the standard deviation of N1 is

σ(N1) = sqrt( (1/n) Σi (Fi - F̄)² )

where Fi is the position of the i-th frame in the video frame set corresponding to N1 and F̄ is the mean of those positions. The standard deviations of all the objects are likewise ranked in ascending order; the ranks corresponding to N1, A1, A2, O1, O2, S1, S2 are {2, 5, 4, 1, 3, 7, 6}.
Then the weighted average value of each matching result in the matching result set is determined according to the frame number ratio and the standard deviation value. Specifically, weights may be set for the frame number ratio and the standard deviation, for example 30% and 70% respectively (these may of course be adjusted according to actual conditions and are not specifically limited), so that the weighted average of N1 can be calculated by the following formula:

W(N1) = 30% × P(N1) + 70% × σ(N1)

where P(N1) is the frame number ratio of N1, σ(N1) is the standard deviation value of N1, and W(N1) is the weighted average of N1. The weighted averages of the remaining objects in the matching result set can be calculated based on the same formula.
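A sketch of this weighting step; feeding the raw frame-number ratio and the standard deviation of the matched frame positions into a 30%/70% combination is an assumption, since the embodiment's formula images are not reproduced and the text also mentions rank orderings that an implementation might combine instead:

```python
import statistics

def weighted_score(frames, total_frames, w_ratio=0.3, w_std=0.7):
    """Weighted average of one matching result, combining its frame-number
    ratio and the standard deviation of its matched frame positions."""
    ratio = len(frames) / total_frames
    std = statistics.pstdev(frames) if len(frames) > 1 else 0.0
    return w_ratio * ratio + w_std * std

# Two matching results with equally many frames, spread differently.
frame_sets = {"N1": [0, 1, 2, 3], "S1": [0, 10, 20, 30]}
scores = {k: weighted_score(v, 40) for k, v in frame_sets.items()}
target = max(scores, key=scores.get)  # result treated as the main content
```

Under this reading, frames spread across the whole video (larger standard deviation) suggest an object that appears throughout, making it a better candidate for the video's main content.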
Then the coincidence degree between the video frame set corresponding to the target matching result with the largest weighted average value and the video frame set corresponding to each matching result in the other matching result subset is calculated to obtain the second coincidence degree set. For example, if W(S1) turns out to have the largest value, the probability that S1 is one of the main contents expressed by the target video is considered the largest, and the second coincidence degree set is obtained from the coincidence degrees between the video frame set corresponding to S1 and the video frame sets of the other objects. The method for calculating the coincidence degree has been described in detail above and is not repeated here.
Finally, after the second coincidence degree set is obtained, the coincidence degree with the smallest value in the second coincidence degree set can be selected as the second target coincidence degree, the second keyword set corresponding to the second target coincidence degree is determined, and the video title of the target video can then be generated according to the second keyword set.
In one embodiment, if the matching of Hg against the sets Ng, Ag, Og and Sg produces no result, that is, the objects recognized from the target video have no intersection with the hot words of the previous period, the video title generation device can generate the video title of the target video in the following manner:
if each keyword in the keyword set is not successfully matched with the first set, the second set, the third set and the fourth set, determining a video frame set corresponding to the first set, a video frame set corresponding to the second set, a video frame set corresponding to the third set and a video frame set corresponding to the fourth set in the target video;
calculating the coincidence degree between the video frame sets corresponding to any two of the video frame set corresponding to the first set, the video frame set corresponding to the second set, the video frame set corresponding to the third set and the video frame set corresponding to the fourth set to obtain a third coincidence degree set;
determining a third keyword set corresponding to a third target coincidence degree, wherein the third target coincidence degree is the coincidence degree with the smallest value in the third coincidence degree set;
and determining the video title of the target video according to the third keyword set.
In this embodiment, if each keyword in the keyword set fails to match the first set, the second set, the third set and the fourth set, the video frames in the target video may be traversed to find the video frame set corresponding to the first set, the video frame set corresponding to the second set, the video frame set corresponding to the third set and the video frame set corresponding to the fourth set in the target video, and the coincidence degree between any two of these video frame sets may be calculated to obtain a third coincidence degree set. The calculation of the coincidence degree has been described in detail above and is not repeated here.
A third keyword set corresponding to the third target coincidence degree, i.e. the coincidence degree with the smallest value in the third coincidence degree set, is then determined, and the video title of the target video is generated according to the third keyword set.
In one embodiment, after determining the first keyword set corresponding to the first target coincidence degree, the video title generation device further performs the following operations:
determining a target video frame set corresponding to the first keyword set in the target video;
calculating the coincidence degree between the target video frame set and other video frame sets to obtain a fourth coincidence degree set, wherein the other video frame sets are video frame sets except the video frame set corresponding to the first target coincidence degree in the video frame sets corresponding to the matching result set;
determining a fourth keyword set corresponding to a fourth target coincidence degree, wherein the fourth target coincidence degree is the coincidence degree with the smallest value in the fourth coincidence degree set;
and generating a video title of the target video according to the first keyword set and the fourth keyword set.
In this embodiment, after the video title generation device determines the first keyword set corresponding to the first target coincidence degree, the video frame set corresponding to the keywords in the first keyword set may be used as a reference: the coincidence degree between this video frame set and each of the other video frame sets corresponding to the matching result set is calculated to obtain a fourth coincidence degree set (the coincidence degree calculation has been described above and is not repeated here). The fourth keyword set corresponding to the fourth target coincidence degree, i.e. the coincidence degree with the smallest value in the fourth coincidence degree set, is then determined, and the video title of the target video is generated according to the first keyword set and the fourth keyword set. In this way, more words relevant to the target video can be obtained, and a title that better fits the video content is generated.
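A sketch of this extension step: the coincidence degree is re-implemented locally so the snippet stands alone, and the frame sets and keywords are illustrative:

```python
def pd(fa, fb):
    # Degree of coincidence: mean absolute position difference over shared frames.
    shared = [f for f in fa if f in set(fb)]
    return (sum(abs(fa.index(f) - fb.index(f)) for f in shared) / len(shared)
            if shared else float("inf"))

def extend_keywords(first_kws, frame_sets):
    """Add to the first keyword set the keyword whose frame set coincides
    best (smallest PD) with the frames already covered by the first set."""
    base = sorted({f for kw in first_kws for f in frame_sets[kw]})
    others = {kw: fs for kw, fs in frame_sets.items() if kw not in first_kws}
    best = min(others, key=lambda kw: pd(base, others[kw]))
    return first_kws + [best]

frame_sets = {"Zhan X": [0, 1, 2], "basketball": [0, 1, 2, 3],
              "arena": [0, 1, 2], "referee": [7, 8]}
kws = extend_keywords(["Zhan X", "basketball"], frame_sets)
```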
In summary, in the embodiments provided by this application, by crawling the keyword set of the hot content of the previous period, matching the keywords in the keyword set against each object in the video whose title is to be generated, and determining the video frame set corresponding to each matching result, the coincidence degree between the video frame sets can be calculated, keywords associated with those in the keyword set can be determined according to the coincidence degree, and the video title can then be generated from these keywords. In this way, video titles closely tied to current hot topics can be generated for videos in batches, which improves the readability and appeal of video titles, reduces the workload of operators, and speeds up video production and online exposure, thereby improving video spread.
The method of generating a video title has been described above; the video title generation apparatus of the embodiments of the present application is described below:
referring to fig. 2, fig. 2 is a schematic diagram of an embodiment of a video title generation apparatus according to an embodiment of the present application, where the video title generation apparatus 200 includes:
a first determining unit 201, configured to determine a content feature corresponding to a target video;
an obtaining unit 202, configured to obtain a keyword set of hot spot content in a previous period;
a matching unit 203, configured to match each keyword in the keyword set with a content feature corresponding to the target video, respectively, so as to obtain a matching result set;
a second determining unit 204, configured to determine a video title of the target video according to the matching result set.
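The four units above can be sketched as a single pipeline; the content-feature representation (label mapped to recognition accuracy), the matching rule and the title template are all illustrative assumptions:

```python
def generate_video_title(content_features, hot_keywords):
    """End-to-end flow of the four units: match the previous period's hot
    keywords against the video's content features, then build a title
    from the matching result set."""
    # Matching unit: a hot keyword matches if it was recognized in the video.
    matches = [kw for kw in hot_keywords if kw in content_features]
    # Second determining unit, radically simplified: join matches into a title.
    if matches:
        return " and ".join(matches) + ": the highlights you searched for"
    return None

# Content features as label -> recognition accuracy (assumed representation).
features = {"Zhan X": 0.97, "dunk": 0.91, "basketball": 0.88, "arena": 0.80}
title = generate_video_title(features, ["basketball", "World Cup", "Zhan X"])
```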
In one possible design, the second determining unit 204 is specifically configured to:
determining a video frame set corresponding to the matching result set in the target video;
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain a first coincidence degree set;
determining a first keyword set corresponding to a first target coincidence degree, wherein the first target coincidence degree is the coincidence degree with the smallest value in the first coincidence degree set;
and generating a video title of the target video according to the first keyword set.
In one possible design, the determining, by the second determining unit 204, a set of video frames in the target video corresponding to the set of matching results includes:
matching each video frame in the target video with each matching result in the matching result set respectively to obtain a video frame set corresponding to each matching result in the matching result set;
and eliminating the video frame set corresponding to the matching result with the video frame number smaller than a preset threshold value in the matching result set to obtain the video frame set corresponding to the matching result set.
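A minimal sketch of this per-result frame-set construction and threshold filtering, assuming each video frame is represented by the set of labels a frame-level recognizer produced for it:

```python
def frame_sets_for_matches(video_frames, matches, threshold):
    """Collect, for each matching result, the indices of the frames that
    contain it, then drop results matched in fewer than `threshold` frames."""
    sets = {m: [i for i, labels in enumerate(video_frames) if m in labels]
            for m in matches}
    return {m: fs for m, fs in sets.items() if len(fs) >= threshold}

# Each frame is the set of labels recognized in it (illustrative data).
frames = [{"cat"}, {"cat", "sofa"}, {"sofa"}, {"cat"}, set()]
kept = frame_sets_for_matches(frames, ["cat", "sofa", "dog"], threshold=2)
```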
In one possible design, the second determining unit 204 is further configured to:
if the number of video frames in the video frame set corresponding to each matching result in the matching result set is smaller than the preset threshold, determining the frame number ratio of the video frame corresponding to each matching result in the matching result set;
calculating a standard deviation value corresponding to each matching result in the matching result set;
determining the weighted average value of each matching result in the matching result set according to the frame number ratio and the standard deviation value;
calculating the coincidence degree between the video frame set corresponding to the target matching result with the largest weighted average value and the video frame set corresponding to each matching result in other matching result subsets to obtain a second coincidence degree set, wherein the other matching result subsets are sets of the matching results in the matching result sets except the target matching result;
determining a second keyword set corresponding to a second target coincidence degree, wherein the second target coincidence degree is the coincidence degree with the smallest value in the second coincidence degree set;
and generating a video title of the target video according to the second keyword set.
In one possible design, the content features corresponding to the target video include a first set of person names and accuracy rates, a second set of person actions and accuracy rates, a third set of objects and accuracy rates, and a fourth set of scenes and accuracy rates corresponding to the target video, and the second determining unit 204 is further configured to:
if each keyword in the keyword set is not successfully matched with the first set, the second set, the third set and the fourth set, determining a video frame set corresponding to the first set, a video frame set corresponding to the second set, a video frame set corresponding to the third set and a video frame set corresponding to the fourth set in the target video;
calculating the coincidence degree between the video frame sets corresponding to any two of the video frame set corresponding to the first set, the video frame set corresponding to the second set, the video frame set corresponding to the third set and the video frame set corresponding to the fourth set to obtain a third coincidence degree set;
determining a third keyword set corresponding to a third target coincidence degree, wherein the third target coincidence degree is the coincidence degree with the smallest value in the third coincidence degree set;
and determining the video title of the target video according to the third keyword set.
In one possible design, the second determining unit 204 is further configured to:
determining a target video frame set corresponding to the first keyword set in the target video;
calculating the coincidence degree between the target video frame set and other video frame sets to obtain a fourth coincidence degree set, wherein the other video frame sets are video frame sets except the video frame set corresponding to the first target coincidence degree in the video frame sets corresponding to the matching result set;
determining a fourth keyword set corresponding to a fourth target coincidence degree, wherein the fourth target coincidence degree is the coincidence degree with the smallest value in the fourth coincidence degree set;
and generating a video title of the target video according to the first keyword set and the fourth keyword set.
In one possible design, the calculation by the second determining unit 204 of the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain the first coincidence degree set includes:
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set through the following formula:
PD(F1, F2) = (1/n) Σx |Fx[F1] - Fx[F2]|, summed over the n video frames Fx contained in both F1 and F2
wherein PD(F1, F2) is the coincidence degree between the video frame sets of any two matching results in the matching result set, F1 is the video frame set corresponding to any one matching result in the matching result set, F2 is any video frame set other than F1 among the video frame sets corresponding to the matching result set, Fx is any video frame contained in both F1 and F2, Fx[F1] is the position of Fx in F1, and Fx[F2] is the position of Fx in F2.
The embodiment of the application also provides a computing device, which may be a server. Referring to fig. 3, fig. 3 is a schematic structural diagram of a server 300 according to an embodiment of the present invention. The server may vary considerably in configuration or performance, and may include one or more central processing units (CPUs) 322 (e.g., one or more processors), a memory 332, and one or more storage media 330 (e.g., one or more mass storage devices) storing an application 342 or data 344. The memory 332 and the storage media 330 may be transient or persistent storage. The program stored on a storage medium 330 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Still further, the central processor 322 may be configured to communicate with the storage medium 330 and execute on the server 300 the series of instruction operations in the storage medium 330.
The server 300 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input-output interfaces 358, and/or one or more operating systems 341, such as Windows Server, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, and the like.
The steps performed by the video title generation apparatus in the above-described embodiment may be based on the server structure shown in fig. 3.
The present application further provides a computer-readable storage medium, in which at least one executable instruction is stored, and when the executable instruction is run on a computing device, the computing device is caused to execute the method for generating a video title according to any of the above embodiments.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that a computer can access, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), among others.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A method for generating a video title, comprising:
determining content characteristics corresponding to a target video;
acquiring a keyword set of hot spot content in the previous period;
matching each keyword in the keyword set with the content characteristics corresponding to the target video respectively to obtain a matching result set;
and determining the video title of the target video according to the matching result set.
2. The method of claim 1, wherein the determining a video title of the target video according to the set of matching results comprises:
determining a video frame set corresponding to the matching result set in the target video;
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain a first coincidence degree set;
determining a first keyword set corresponding to a first target coincidence degree, wherein the first target coincidence degree is the coincidence degree with the smallest value in the first coincidence degree set;
and generating a video title of the target video according to the first keyword set.
3. The method of claim 2, wherein the determining the set of video frames in the target video corresponding to the set of matching results comprises:
matching each video frame in the target video with each matching result in the matching result set respectively to obtain a video frame set corresponding to each matching result in the matching result set;
and eliminating the video frame set corresponding to the matching result with the video frame number smaller than a preset threshold value in the matching result set to obtain the video frame set corresponding to the matching result set.
4. The method of claim 2, further comprising:
if the number of video frames in the video frame set corresponding to each matching result in the matching result set is smaller than the preset threshold, determining the frame number ratio of the video frame corresponding to each matching result in the matching result set;
calculating a standard deviation value corresponding to each matching result in the matching result set;
determining the weighted average value of each matching result in the matching result set according to the frame number ratio and the standard deviation value;
calculating the coincidence degree between the video frame set corresponding to the target matching result with the largest weighted average value and the video frame set corresponding to each matching result in other matching result subsets to obtain a second coincidence degree set, wherein the other matching result subsets are sets of the matching results in the matching result sets except the target matching result;
determining a second keyword set corresponding to a second target coincidence degree, wherein the second target coincidence degree is the coincidence degree with the smallest value in the second coincidence degree set;
and generating a video title of the target video according to the second keyword set.
5. The method of claim 1, wherein the content features corresponding to the target video comprise a first set of person names and accuracy rates, a second set of person actions and accuracy rates, a third set of objects and accuracy rates, and a fourth set of scenes and accuracy rates corresponding to the target video, and wherein the method further comprises:
if each keyword in the keyword set is not successfully matched with the first set, the second set, the third set and the fourth set, determining a video frame set corresponding to the first set, a video frame set corresponding to the second set, a video frame set corresponding to the third set and a video frame set corresponding to the fourth set in the target video;
calculating the coincidence degree between the video frame sets corresponding to any two of the video frame set corresponding to the first set, the video frame set corresponding to the second set, the video frame set corresponding to the third set and the video frame set corresponding to the fourth set to obtain a third coincidence degree set;
determining a third keyword set corresponding to a third target coincidence degree, wherein the third target coincidence degree is the coincidence degree with the smallest value in the third coincidence degree set;
and determining the video title of the target video according to the third keyword set.
6. The method according to any one of claims 2 to 5, wherein after determining the first keyword set corresponding to the first target coincidence degree, the method further comprises:
determining a target video frame set corresponding to the first keyword set in the target video;
calculating the coincidence degree between the target video frame set and other video frame sets to obtain a fourth coincidence degree set, wherein the other video frame sets are video frame sets except the video frame set corresponding to the first target coincidence degree in the video frame sets corresponding to the matching result set;
determining a fourth keyword set corresponding to a fourth target coincidence degree, wherein the fourth target coincidence degree is the coincidence degree with the smallest value in the fourth coincidence degree set;
and generating a video title of the target video according to the first keyword set and the fourth keyword set.
7. The method according to any one of claims 2 to 5, wherein calculating the coincidence degree between the video frame sets of any two matching results in the matching result set to obtain the first coincidence degree set comprises:
calculating the coincidence degree between the video frame sets of any two matching results in the matching result set through the following formula:
PD(F1, F2) = (1/n) Σx |Fx[F1] - Fx[F2]|, summed over the n video frames Fx contained in both F1 and F2
wherein PD(F1, F2) is the coincidence degree between the video frame sets of any two matching results in the matching result set, F1 is the video frame set corresponding to any one matching result in the matching result set, F2 is any video frame set other than F1 among the video frame sets corresponding to the matching result set, Fx is any video frame contained in both F1 and F2, Fx[F1] is the position of Fx in F1, and Fx[F2] is the position of Fx in F2.
8. A video title generation apparatus, comprising:
the first determining unit is used for determining content characteristics corresponding to the target video;
the acquisition unit is used for acquiring a keyword set of hot spot content in the previous period;
the matching unit is used for matching each keyword in the keyword set with the content characteristics corresponding to the target video respectively to obtain a matching result set;
and the second determining unit is used for determining the video title of the target video according to the matching result set.
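The four units of claim 8 can be summarized as a minimal pipeline sketch. All helper callables (extract_features, match_keyword, compose_title) are hypothetical stand-ins for implementations the claim deliberately leaves unspecified; only the four-step structure comes from the claim.

```python
def generate_video_title(target_video, hotspot_keywords,
                         extract_features, match_keyword, compose_title):
    """Sketch of claim 8's apparatus as a function pipeline.

    The three helper callables are assumptions; the claim only fixes
    the order of operations, not how each unit works internally.
    """
    # First determining unit: content features of the target video.
    features = extract_features(target_video)
    # Matching unit: match each hotspot keyword against the features.
    matching_results = [match_keyword(kw, features) for kw in hotspot_keywords]
    # Second determining unit: derive the video title from the matching results.
    return compose_title(matching_results)
```

For example, with a bag-of-words feature extractor and a title composed from the matched keywords, a video whose features contain "cat" but not "dog" would yield a title built from "cat" alone.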
9. A computing device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;
the memory is used for storing at least one executable instruction which causes the processor to execute the steps of the video title generation method according to any one of claims 1-7.
10. A computer-readable storage medium having stored therein at least one executable instruction that, when executed on a computing device, causes the computing device to perform the method of generating a video title according to any one of claims 1 to 7.
CN202111610447.7A 2021-12-27 2021-12-27 Video title generation method and device and storage medium Pending CN114298018A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111610447.7A CN114298018A (en) 2021-12-27 2021-12-27 Video title generation method and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111610447.7A CN114298018A (en) 2021-12-27 2021-12-27 Video title generation method and device and storage medium

Publications (1)

Publication Number Publication Date
CN114298018A true CN114298018A (en) 2022-04-08

Family

ID=80969687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111610447.7A Pending CN114298018A (en) 2021-12-27 2021-12-27 Video title generation method and device and storage medium

Country Status (1)

Country Link
CN (1) CN114298018A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114938477A (en) * 2022-06-23 2022-08-23 阿里巴巴(中国)有限公司 Video topic determination method, device and equipment
CN114938477B (en) * 2022-06-23 2024-05-03 阿里巴巴(中国)有限公司 Video topic determination method, device and equipment

Similar Documents

Publication Publication Date Title
US20230012732A1 (en) Video data processing method and apparatus, device, and medium
CN109905772B (en) Video clip query method, device, computer equipment and storage medium
WO2017181612A1 (en) Personalized video recommendation method and device
CN107862022B (en) Culture resource recommendation system
CN110309795B (en) Video detection method, device, electronic equipment and storage medium
CN106951571B (en) Method and device for labeling application with label
CN108319723A (en) A kind of picture sharing method and device, terminal, storage medium
CN106844685B (en) Method, device and server for identifying website
CN109471978B (en) Electronic resource recommendation method and device
US9286379B2 (en) Document quality measurement
US10346496B2 (en) Information category obtaining method and apparatus
WO2018196553A1 (en) Method and apparatus for obtaining identifier, storage medium, and electronic device
CN110413867B (en) Method and system for content recommendation
US9245035B2 (en) Information processing system, information processing method, program, and non-transitory information storage medium
CN108959323B (en) Video classification method and device
CN111783712A (en) Video processing method, device, equipment and medium
CN112052387A (en) Content recommendation method and device and computer readable storage medium
CN113407773A (en) Short video intelligent recommendation method and system, electronic device and storage medium
CN113779381A (en) Resource recommendation method and device, electronic equipment and storage medium
CN111708909A (en) Video tag adding method and device, electronic equipment and computer-readable storage medium
CN114154013A (en) Video recommendation method, device, equipment and storage medium
CN112328833A (en) Label processing method and device and computer readable storage medium
CN114298018A (en) Video title generation method and device and storage medium
CN110929169A (en) Position recommendation method based on improved Canopy clustering collaborative filtering algorithm
CN110990705B (en) News processing method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination