
CN112328834A - Video association method and device, electronic equipment and storage medium

Info

Publication number: CN112328834A
Application number: CN202011248108.4A
Authority: CN
Inventors: 王铭喜
Original assignee: Beijing Xiaomi Mobile Software Co Ltd
Current assignee: Beijing Xiaomi Mobile Software Co Ltd
Priority date: 2020-11-10
Filing date: 2020-11-10
Publication date: 2021-02-05

Abstract

The disclosure relates to a video association method, a video association apparatus, an electronic device and a storage medium, which address the problem of inconvenient video association in the related art. The method comprises: acquiring a video to be identified; determining target information from the video to be identified, the target information comprising at least one of text information, voice information and actor image information; determining whether a target film and television work matching the target information exists in an information base; and, if such a target film and television work exists in the information base, taking the target film and television work as the original video for the video to be identified and associating the video to be identified with the original video. The text information comprises bullet screen information and subtitle information, and the voice information comprises lines and captions. As a result, the same identification information need not be added to both the video to be identified and the original video, which reduces the workload of video association and makes it more convenient.

Description

Video association method and device, electronic equipment and storage medium

Technical Field

The present disclosure relates to the field of video association technologies, and in particular, to a video association method and apparatus, an electronic device, and a storage medium.

Background

With the popularity of short videos, many users clip highlights from movies, TV series, variety shows and other film and television works and recommend them to other users, or make commentary videos and share them to their friend circle or to a short video platform, thereby promoting the movies, TV series and other film and television works.

In video promotion, identification information is usually added to an original video, and when a corresponding video to be identified is produced, identification information consistent with that of the original video is added to it, which completes the association between the video to be identified and the original video. Then, when a user performs an operation to obtain the original video on a recommended video to be identified in the user interface, the original video is looked up through the consistent identification information and the playing interface jumps to the original video.

Disclosure of Invention

In order to overcome the problems in the related art, the present disclosure provides a video association method, an apparatus, an electronic device, and a storage medium, so as to solve the problem of inconvenient video association in the related art.

According to a first aspect of the embodiments of the present disclosure, there is provided a video association method, including:

acquiring a video to be identified;

determining target information from the video to be identified, the target information comprising at least one of: text information, voice information, and actor image information;

determining whether a target film and television work matched with the target information exists in an information base;

if the target film and television works matched with the target information exist in the information base, taking the target film and television works as original videos aiming at the videos to be identified, and associating the videos to be identified with the original videos;

the text information comprises bullet screen information and subtitle information, and the voice information comprises lines and captions.

Optionally, when the target information is the text information, the determining whether a target movie and television work matched with the target information exists in an information base includes:

and matching the text information with the titles of the film and television works in the information base, and determining whether the titles of the film and television works in the information base have the target film and television works matched with the text information.

Optionally, in a case that the target information is the actor image information, the determining whether a target movie and television work matching the target information exists in the information base includes:

determining identity information of the actor according to the actor image information;

determining the film and television works in which the actor has appeared according to the identity information of the actor;

and matching the works in which the actor has appeared with the film and television works in the information base, and determining whether a target film and television work matching those works exists in the information base.

Optionally, in a case that the target information is the voice information, the determining whether a target movie and television work matched with the target information exists in an information base includes:

converting the voice information into text information;

Optionally, in a case that the target information includes the text information, the voice information, and the actor image information, the determining whether a target movie work matching the target information exists in the information base includes:

matching the text information with the title of the film and television works in the information base, and determining whether the title of the film and television works in the information base has a target film and television work matched with the text information;

if the target film and television works matched with the text information do not exist in the information base, determining the identity information of the actor according to the actor image information;

determining the film and television works in which the actor has appeared according to the identity information of the actor;

matching the works in which the actor has appeared with the film and television works in the information base, and determining whether a target film and television work matching those works exists in the information base;

if the target film and television works matched with the works in which the actor has appeared do not exist in the information base, converting the voice information into text information;

According to a second aspect of the embodiments of the present disclosure, there is provided a video association apparatus, the apparatus comprising:

the acquisition module is configured to acquire a video to be identified;

a determination module configured to determine target information from the video to be identified, the target information including at least one of: text information, voice information, and actor image information;

the matching module is configured to determine whether a target film and television work matched with the target information exists in an information base;

the association module is configured to take the target film and television works as original videos for the videos to be identified and associate the videos to be identified with the original videos if the target film and television works matched with the target information exist in the information base;

Optionally, the matching module is specifically configured to, when the target information is the text information, match the text information with the titles of the movie works in the information base, and determine whether the title of the movie work in the information base has the target movie work matched with the text information.

Optionally, the matching module comprises:

a first determining sub-module configured to determine identity information of the actor from the actor image information in a case where the target information is the actor image information;

a second determining submodule configured to determine the film and television works in which the actor has appeared according to the identity information of the actor;

and a third determining submodule configured to match the works in which the actor has appeared with the film and television works in the information base, and determine whether a target film and television work matching those works exists in the information base.

Optionally, the matching module is specifically configured to, in a case that the target information is the voice information, convert the voice information into text information;

Optionally, the matching module comprises:

the fourth determining sub-module is configured to match the text information with the film names of the film and television works in an information base under the condition that the target information comprises the text information, the voice information and the actor image information, and determine whether the film and television works in the information base have the target film and television works matched with the text information or not;

a fifth determining sub-module, configured to determine identity information of the actor according to the actor image information if the target movie work matched with the text information does not exist in the information base;

a sixth determining submodule configured to determine the film and television works in which the actor has appeared according to the identity information of the actor;

a seventh determining sub-module, configured to match the works in which the actor has appeared with the film and television works in the information base, and determine whether a target film and television work matching those works exists in the information base;

the conversion sub-module is configured to convert the voice information into text information if the target film and television works matched with the works in which the actor has appeared do not exist in the information base;

and the eighth determining submodule is configured to match the text information with the titles of the film and television works in the information base, and determine whether the titles of the film and television works in the information base have the target film and television works matched with the text information.

According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to:

acquiring a video to be identified;

According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the video association method provided by the first aspect of the present disclosure.

The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:

A video to be identified is acquired and target information is determined from it, the target information including at least one of text information, voice information and actor image information, wherein the text information comprises bullet screen information and subtitle information and the voice information comprises lines and captions. It is then determined whether a target film and television work matching the target information exists in the information base; if such a work exists, it is taken as the original video for the video to be identified, and the video to be identified is associated with the original video. The same identification information therefore need not be added to both the video to be identified and the original video, which reduces the workload of video association and makes it more convenient.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.

Fig. 1 is a flow chart illustrating a video association method according to an example embodiment.

Fig. 2 is a flowchart illustrating one implementation of step S13 in fig. 1, according to an example embodiment.

Fig. 3 is another flowchart illustrating an implementation of step S13 of fig. 1 according to an example embodiment.

Fig. 4 is a flowchart illustrating another implementation of step S13 of fig. 1 according to an example embodiment.

Fig. 5 is a block diagram illustrating a video association apparatus according to an example embodiment.

Fig. 6 is a block diagram illustrating an electronic device in accordance with an example embodiment.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

It should be noted that in the present disclosure, the terms "first", "second", and the like in the description and in the claims, as well as in the drawings, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. Likewise, the terms "S133", "S134", and the like are used to distinguish the steps and are not necessarily to be construed as performing method steps in a particular order or sequence.

Before introducing the video association method, apparatus, electronic device and storage medium provided by the present disclosure, an application scenario of the present disclosure is first introduced. The video association method provided by the present disclosure may be applied to an electronic device, which may be a server or a terminal device, which may be, for example, a smartphone, a PC (Personal Computer), or the like.

The inventor found that identification information consistent with that of the original video must be added to every video to be identified that is produced, so the workload of adding identification information is large and video association is inconvenient. Moreover, if identification information is not added to the original video, or an error occurs when it is added, association accuracy is low: the original video cannot be found, or the wrong original video is found.

To solve the above technical problem, the present disclosure provides a video association method. Fig. 1 is a flow chart illustrating a video association method according to an exemplary embodiment, which includes the following steps, as shown in fig. 1.

In step S11, a video to be recognized is acquired.

In step S12, target information is determined from the video to be recognized, the target information including at least one of: text information, voice information, and actor image information.

In step S13, it is determined whether a target movie work matching the target information exists in the information base.

In step S14, if there is a target movie work matching the target information in the information base, the target movie work is used as an original video for the video to be recognized, and the video to be recognized is associated with the original video.
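Steps S11 to S14 can be read as a small pipeline. The following is a minimal sketch under stated assumptions, not the implementation described by the disclosure: the names `TargetInfo`, `associate_video`, `extract` and `match` are hypothetical, the extraction and matching steps are passed in as plain callables, and the association is recorded in an ordinary dictionary.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class TargetInfo:
    """Hypothetical container for the target information of step S12."""
    text: list[str] = field(default_factory=list)            # bullet screen / subtitle text
    speech: list[str] = field(default_factory=list)          # spoken lines, already transcribed
    actor_images: list[bytes] = field(default_factory=list)  # cropped actor images


def associate_video(
    video_id: str,
    extract: Callable[[str], TargetInfo],
    match: Callable[[TargetInfo], Optional[str]],
    associations: dict[str, str],
) -> Optional[str]:
    """Run S12 to S14 for a video that has already been acquired (S11)."""
    info = extract(video_id)       # S12: determine target information
    original = match(info)         # S13: look for a matching work in the information base
    if original is not None:       # S14: record the association only when a match exists
        associations[video_id] = original
    return original


# Usage with stand-in callables (assumed data, for illustration only).
links: dict[str, str] = {}
found = associate_video(
    "clip-1",
    extract=lambda _vid: TargetInfo(text=["Journey to the West"]),
    match=lambda info: info.text[0] if info.text else None,
    associations=links,
)
```

Passing `extract` and `match` as parameters keeps the sketch independent of any particular recognition technology, which the text leaves open.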

Optionally, taking the case where the electronic device is a server as an example, the server acquires the video to be identified in response to the user completing production of the video to be identified on a terminal device. For example, when the user finishes clipping a video segment, the server takes the clipped segment as the video to be identified; likewise, when the user finishes recording a video segment, the server takes the recorded segment as the video to be identified.

Specifically, the information base stores original videos, which may be movies, TV dramas, variety shows and other film and television works made by a producer, as well as videos to be identified made by other users. In this way, the currently produced video to be identified can be associated both with works made by a producer and with videos made by other users, which can improve the promotion effect.

Optionally, bullet screen information may be acquired from the video to be identified, the text in the bullet screen recognized, and that text matched with the titles of the film and television works in the information base. If an exactly matching title is found in the information base, the target film and television work corresponding to that title is taken as the original video for the video to be identified.
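The exact-title check on bullet screen text might look like the sketch below. It assumes the bullet screen comments have already been recognized as strings; the function name `match_bullet_screen` and the sample data are hypothetical, and trimming surrounding whitespace before comparison is an added assumption rather than something the text specifies.

```python
from typing import Iterable, Optional


def match_bullet_screen(comments: Iterable[str], titles: Iterable[str]) -> Optional[str]:
    """Return the first comment that is identical to a title in the information base."""
    title_set = {title.strip() for title in titles}
    for comment in comments:
        candidate = comment.strip()
        if candidate in title_set:   # only a completely consistent title counts
            return candidate
    return None


# Usage with assumed sample data.
titles = ["Journey to the West", "Kaifeng Fu"]
comments = ["so funny", " Kaifeng Fu ", "Journey to the West is better"]
hit = match_bullet_screen(comments, titles)
```

The third comment mentions a title but is not identical to it, so it would not count under the exact-match rule.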

Optionally, the subtitle information may be a title, credits, lyrics, dialogue, captions, a description, a character introduction, a place name, a year and the like of the film and television work. For example, the subtitle information may be dialogue displayed at the bottom of the playing interface, a character introduction or a title displayed on either side of the playing interface, narration displayed at the top of the playing interface, or the credits at the beginning or end of the work.

In a specific implementation, the text information, the voice information and the actor image information may all be selected as the target information. For example, the text information and the voice information may first be used to determine whether a first target film and television work matching the video to be identified exists in the information base; if it does, it may further be determined whether a second target film and television work matching the actor image information exists among the first target film and television works. In this way, the accuracy of video association can be improved.

According to the above technical solution, a video to be identified is acquired and target information is determined from it, the target information comprising at least one of text information, voice information and actor image information, wherein the text information comprises bullet screen information and subtitle information and the voice information comprises lines and captions. It is then determined whether a target film and television work matching the target information exists in the information base; if such a work exists, it is taken as the original video for the video to be identified, and the video to be identified is associated with the original video. The same identification information therefore need not be added to both the video to be identified and the original video, which reduces the workload of video association and makes it more convenient.

Illustratively, in the case where the text information is subtitle information, the title information in the subtitle information added to the video to be identified is first matched with the titles of the film and television works in the information base. For example, the title information may be a title that the user adds to the video playing interface when producing the video to be identified, so that other users can search for the original video directly by that title in the user interface. Alternatively, it may be title information that the user enters in the caption introducing the plot of the short video when sharing the video to be identified. For example, when the user shares the video to be identified to the friend circle and enters "highlight clip from episode 5 of Journey to the West" in the corresponding field, the video can be associated with the original video through "Journey to the West, episode 5".

In a possible implementation, the original video may also be a video to be identified made by another user. For example, when a first target user shares the video to be identified to the friend circle and enters "highlight clip from episode 5 of Journey to the West" in the corresponding field, the video may be associated, through "Journey to the West", with a video to be identified that a second target user made from episode 6 of "Journey to the West".

Further, if a first target title matching the title information in the added subtitle information exists in the information base, the film and television work corresponding to the first target title is taken as the original video for the video to be identified; if no such first target title exists in the information base, the subtitle information displayed on either side of the playing interface is matched with the titles of the film and television works in the information base. Illustratively, the subtitle "Kaifeng Fu" displayed on the left side of the playing interface is matched with the titles of the film and television works in the information base.

If a second target title matched with the subtitle information displayed on the two sides of the playing interface exists in the information base, taking the film and television works corresponding to the second target title as an original video aiming at the video to be identified; and if the second target title matched with the subtitle information displayed on the two sides of the playing interface does not exist in the information base, matching the dialogue used for being displayed at the bottom of the playing interface with the title of the film and television works in the information base.

In one possible implementation, the dialogue displayed at the bottom of the playing interface is matched with the dialogue of the film and television works in the information base. Illustratively, the subtitle "he is not you are" displayed at the bottom of the playing interface is matched with the dialogue of the film and television works in the information base. Of course, that subtitle may also be matched specifically against the dialogue of "Kaifeng Fu" in the information base, which can improve the accuracy of video matching.

And if the third target title matched with the dialogue displayed at the bottom of the playing interface exists in the information base, taking the film and television works corresponding to the third target title as the original video aiming at the video to be identified.

If the third target title matched with the dialogue displayed at the bottom of the playing interface does not exist in the information base, the matching can be further carried out through the actor image information.

With the above technical solution, the title of the video to be identified can be confirmed through several different kinds of text information, and the video to be identified can then be associated with the original video. This avoids the situation where a single piece of text information fails to match the original video, and improves the success rate of short video association.
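The order in which the different kinds of subtitle text are tried can be sketched as follows. This is an illustration under assumptions: the information base is modeled as a hypothetical dictionary from title to a set of dialogue lines, the function name `match_by_subtitles` is invented for the example, and the bottom-of-screen dialogue is compared against stored dialogue, which is only one of the variants the text mentions.

```python
from typing import Mapping, Optional, Sequence, Set


def match_by_subtitles(
    added_title: Optional[str],
    side_subtitles: Sequence[str],
    bottom_dialogue: Sequence[str],
    info_base: Mapping[str, Set[str]],
) -> Optional[str]:
    """Try the user-added title, then side subtitles, then bottom dialogue."""
    # 1. Title information added by the user.
    if added_title is not None and added_title in info_base:
        return added_title
    # 2. Subtitles shown on either side of the playing interface.
    for subtitle in side_subtitles:
        if subtitle in info_base:
            return subtitle
    # 3. Dialogue shown at the bottom, compared with each work's stored dialogue.
    for line in bottom_dialogue:
        for title, dialogue in info_base.items():
            if line in dialogue:
                return title
    # No text-based match; the caller may fall back to actor image information.
    return None


# Usage with assumed sample data.
info_base = {
    "Kaifeng Fu": {"he is not you are"},
    "Journey to the West": set(),
}
by_side = match_by_subtitles(None, ["Kaifeng Fu"], [], info_base)
by_dialogue = match_by_subtitles(None, [], ["he is not you are"], info_base)
no_match = match_by_subtitles("Unknown Title", [], ["unrelated line"], info_base)
```

Returning `None` at the end leaves room for the actor-image matching that the text describes as the next fallback.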

It can be understood that the first target title matched with the title information in the added subtitle information exists in the information base, and the subtitle information for displaying on two sides of the playing interface can be matched with the first target title, so that the accuracy of short video association can be improved.

Alternatively, referring to fig. 2, in the case that the target information is the actor image information, in step S13, the determining whether a target movie work matching the target information exists in the information base includes the following steps:

in step S131, identity information of the actor is determined based on the actor image information.

In step S132, the film and television works in which the actor has appeared are determined according to the identity information of the actor.

In step S133, the works in which the actor has appeared are matched with the film and television works in the information base, and it is determined whether a target film and television work matching those works exists in the information base.

In one embodiment, multiple actor images may be determined from multiple picture frames by image recognition techniques; these may be images of the same actor or images of several different actors. When the images all show the same actor, the actor's identity information can be determined from the multiple images, which can improve the accuracy of identity confirmation. For example, several images of actor A may be determined from a plurality of picture frames, and the identity information of actor A determined from those images. The film and television works in which actor A has appeared are then determined and matched with the film and television works in the information base.

When the images show several different actors, the identity information of each actor can be determined from the corresponding image, and the film and television works in which each actor has appeared are then determined separately. The target film and television work in which all of these actors appear is then identified and taken as the original video for the video to be identified, which can improve the accuracy of video association.

For example, images of actor A, actor B and actor D may be determined from a plurality of picture frames, and film and television works A in which actor A has appeared, film and television works B in which actor B has appeared, and film and television works D in which actor D has appeared are determined respectively. Then, from film and television works A, B and D, the target film and television work in which actors A, B and D all appear is determined. Finally, that target film and television work is taken as the original video for the video to be identified.
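Finding the work in which several recognized actors all appear amounts to a set intersection. The sketch below assumes actor identities have already been obtained from the images; the function name `common_works`, the `filmography` mapping and the sample titles are hypothetical.

```python
from typing import Iterable, Mapping, Set


def common_works(
    actor_ids: Iterable[str],
    filmography: Mapping[str, Set[str]],
    info_base_titles: Set[str],
) -> Set[str]:
    """Works in which every listed actor appears and which exist in the information base."""
    work_sets = [set(filmography.get(actor, set())) for actor in actor_ids]
    if not work_sets:
        return set()
    shared = set.intersection(*work_sets)   # works common to all recognized actors
    return shared & info_base_titles        # keep only works present in the information base


# Usage with assumed sample data.
filmography = {
    "actor A": {"Work X", "Work Y"},
    "actor B": {"Work Y", "Work Z"},
    "actor D": {"Work Y", "Work W"},
}
common = common_works(["actor A", "actor B", "actor D"], filmography, {"Work Y", "Work Z"})
```

If more than one work survives the intersection, some further signal, such as the text or voice information, would be needed to choose among them; the sketch simply returns the whole set.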

Alternatively, referring to fig. 3, in the case that the target information is the voice information, in step S13, the determining whether the target movie and television work matching the target information exists in the information base includes:

in step S134, the speech information is converted into text information.

In step S135, the text information is matched with the titles of the film and television works in the information base, and it is determined whether the title of the film and television work in the information base has the target film and television work matched with the text information.

Optionally, the voice information in the video to be identified is obtained, the lines spoken by the characters are converted into text information through speech-to-text technology, and it is recognized whether the lines contain title information; if so, the title information is matched with the titles of the film and television works in the information base to determine whether a target film and television work matching the title information exists among them.

Alternatively, the lines may be matched directly with the titles of the film and television works in the information base to determine whether a target film and television work matching the lines exists among those titles.
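Steps S134 and S135 can be sketched as a transcription step followed by a title search. The speech-to-text engine is not specified in the text, so the sketch takes it as a hypothetical `transcribe` callable; `match_speech_to_title` is likewise an invented name, and detecting a title by substring search (longest title first) is an assumption made for illustration.

```python
from typing import Callable, Iterable, Optional


def match_speech_to_title(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    titles: Iterable[str],
) -> Optional[str]:
    """Convert speech to text (S134), then look for a known title in it (S135)."""
    transcript = transcribe(audio)
    # Check longer titles first so that a short title contained in a longer one
    # does not win by accident.
    for title in sorted(titles, key=len, reverse=True):
        if title and title in transcript:
            return title
    return None


# Usage with a stand-in transcriber (assumed data, no real speech engine involved).
def fake_transcribe(_audio: bytes) -> str:
    return "this clip is from Journey to the West episode five"


spoken_title = match_speech_to_title(b"", fake_transcribe, ["Journey", "Journey to the West"])
```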

Alternatively, referring to fig. 4, in the case that the target information includes the text information, the voice information, and the actor image information, in step S13, the determining whether a target movie work matching the target information exists in the information base includes:

in step S1301, the text information is matched with the titles of the film and television works in the information base, and it is determined whether the title of the film and television work in the information base has a target film and television work matched with the text information.

In step S1302, if the target movie work matching the text information does not exist in the information base, the identity information of the actor is determined according to the actor image information.

In step S1303, the film and television works in which the actor has appeared are determined based on the identity information of the actor.

In step S1304, the works in which the actor has appeared are matched with the film and television works in the information base, and it is determined whether a target film and television work matching those works exists in the information base.

In step S1305, if no target film and television work matching the works in which the actor has appeared exists in the information base, the voice information is converted into text information.

In step S1306, the text information is matched with the titles of the movie works in the information base, and it is determined whether there is a target movie work matched with the text information in the titles of the movie works in the information base.
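Steps S1301 to S1306 describe an ordered fallback: text first, then actor images, then voice, each tried only if the previous one found nothing. A minimal sketch of that control flow is given below; `match_with_fallback` and the stage functions are hypothetical stand-ins, and each stage is wrapped in a zero-argument callable so that later, more expensive stages are not evaluated once a match is found.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Matcher = Callable[[], Optional[str]]


def match_with_fallback(stages: Sequence[Tuple[str, Matcher]]) -> Optional[Tuple[str, str]]:
    """Run the stages in order and return (stage name, title) for the first match."""
    for name, matcher in stages:
        title = matcher()
        if title is not None:
            return name, title
    return None


# Usage with stand-in stages that record when they are called (assumed data).
calls: List[str] = []


def by_text() -> Optional[str]:
    calls.append("text")
    return None                  # S1301: no title found in the text information


def by_actor() -> Optional[str]:
    calls.append("actor")
    return "Kaifeng Fu"          # S1302 to S1304: match found via actor images


def by_voice() -> Optional[str]:
    calls.append("voice")
    return "Journey to the West" # S1305 to S1306: not reached in this run


result = match_with_fallback([("text", by_text), ("actor", by_actor), ("voice", by_voice)])
```

In this run the voice stage is never called, which mirrors the conditional wording of steps S1302 and S1305.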

In a possible implementation manner, after the video to be identified is associated with the original video, the associated video to be identified may be displayed to the user on the user interface, and played in a carousel manner. When a user executes the operation of playing the original video associated with the video to be identified, the user can directly jump to the original video, and the video popularization efficiency is improved.

Based on the same inventive concept, the present disclosure also provides a video association apparatus 600 for performing the steps of the video association method provided by the above method embodiments. The apparatus 600 may implement the video association method in software, hardware, or a combination of the two. Fig. 5 is a block diagram illustrating a video association apparatus according to an exemplary embodiment. As shown in fig. 5, the apparatus 600 includes: an obtaining module 610, a determining module 620, a matching module 630 and an associating module 640.

Wherein the obtaining module 610 is configured to obtain a video to be identified;

the determining module 620 is configured to determine target information from the video to be identified, the target information including at least one of: text information, voice information, and actor image information;

the matching module 630 is configured to determine whether a target movie and television work matching the target information exists in the information base;

the association module 640 is configured to, if a target movie work matching the target information exists in the information base, regard the target movie work as an original video for the video to be identified, and associate the video to be identified with the original video;

The apparatus obtains the video to be identified and determines target information from it, the target information comprising at least one of: text information, voice information, and actor image information, wherein the text information comprises bullet screen information and subtitle information, and the voice information comprises lines and captions. The apparatus then determines whether a target film and television work matching the target information exists in the information base; if so, the target film and television work is taken as the original video for the video to be identified, and the video to be identified is associated with the original video. Because the same identification information does not need to be added to both the video to be identified and the original video, the workload of video association is reduced and the convenience of video association is improved.
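The division of labor among the four modules can be illustrated with the following sketch. It assumes each module is supplied as a plain callable and that the association is stored as a dictionary; `TargetInfo`, `VideoAssociationApparatus`, and `run` are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class TargetInfo:
    """At least one of the three kinds of target information is set."""
    text: Optional[str] = None
    voice: Optional[bytes] = None
    actor_image: Optional[bytes] = None


@dataclass
class VideoAssociationApparatus:
    obtain: Callable[[str], bytes]                # role of obtaining module 610
    determine: Callable[[bytes], TargetInfo]      # role of determining module 620
    match: Callable[[TargetInfo], Optional[str]]  # role of matching module 630
    associations: Dict[str, str] = field(default_factory=dict)

    def associate(self, video_id: str, original_id: str) -> None:
        # Role of associating module 640: link the clip to its original.
        self.associations[video_id] = original_id

    def run(self, video_id: str) -> Optional[str]:
        video = self.obtain(video_id)
        info = self.determine(video)
        original = self.match(info)
        if original is not None:
            self.associate(video_id, original)
        return original
```

Passing the modules in as callables mirrors the statement that they may be implemented in software, hardware, or a combination of the two; nothing is associated when the matching step finds no target work.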

Optionally, the matching module 630 is specifically configured to, when the target information is the text information, match the text information with the titles of the movie works in the information base, and determine whether the title of the movie work in the information base has the target movie work matched with the text information.
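The disclosure does not state how the text information is compared with the titles. As one possible reading, the sketch below normalizes each bullet screen comment or subtitle line, checks which titles it contains, and picks the title mentioned in the most lines. The normalization rule, the majority vote, and the names `normalize` and `best_title_match` are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List, Optional


def normalize(text: str) -> str:
    """Lowercase the text and keep only letters and digits."""
    return "".join(ch for ch in text.casefold() if ch.isalnum())


def best_title_match(text_lines: Iterable[str], titles: List[str]) -> Optional[str]:
    """Return the title mentioned in the most lines, or None if none is mentioned."""
    normalized_titles = [(title, normalize(title)) for title in titles]
    counts: Counter = Counter()
    for line in text_lines:
        normalized_line = normalize(line)
        for title, normalized_title in normalized_titles:
            # Skip titles that normalize to an empty string.
            if normalized_title and normalized_title in normalized_line:
                counts[title] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

Counting mentions across lines is a guard against a single stray comment that happens to name an unrelated work; a production system would likely need stricter matching for short or common titles.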

Optionally, the matching module 630 includes: a first determination submodule, a second determination submodule and a third determination submodule.

Wherein the first determining sub-module is configured to determine identity information of the actor from the actor image information in a case where the target information is the actor image information;

the second determining submodule is configured to determine the film and television works of the actor according to the identity information of the actor;

the third determining submodule is configured to match the actor's film and television works with the film and television works in the information base, and determine whether a target film and television work matching the actor's film and television works exists in the information base.
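The second and third determining submodules amount to looking up the actor's works and intersecting them with the information base. A minimal sketch follows, assuming the actor has already been identified by name and that a filmography mapping is available; `candidate_works_for_actor` and its parameters are hypothetical.

```python
from typing import Dict, List, Set


def candidate_works_for_actor(
    actor_name: str,
    filmography: Dict[str, List[str]],  # hypothetical actor -> works mapping
    info_base: Set[str],                # titles held in the information base
) -> List[str]:
    """Return the actor's works that also exist in the information base."""
    return [work for work in filmography.get(actor_name, []) if work in info_base]
```

An actor may have several works in the information base, in which case this returns more than one candidate. The disclosure does not say how such ties are resolved, so any selection rule applied to the returned list would be a further assumption.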

Optionally, the matching module 630 is specifically configured to, in a case that the target information is the voice information, convert the voice information into text information;

Optionally, the matching module 630 includes: a fourth determination submodule, a fifth determination submodule, a sixth determination submodule, a seventh determination submodule, a conversion submodule, and an eighth determination submodule.

The fourth determining submodule is configured to match the text information with the titles of the film and television works in an information base under the condition that the target information comprises the text information, the voice information and the actor image information, and determine whether the title of the film and television work in the information base has a target film and television work matched with the text information;

the fifth determining submodule is configured to determine identity information of the actor according to the actor image information if the target film and television work matched with the text information does not exist in the information base;

the sixth determining submodule is configured to determine the film and television works of the actor according to the identity information of the actor;

the seventh determining submodule is configured to match the actor's film and television works with the film and television works in the information base, and determine whether a target film and television work matching the actor's film and television works exists in the information base;

the conversion sub-module is configured to convert the voice information into text information if no target film and television work matching the actor's film and television works exists in the information base;

the eighth determining submodule is configured to match the text information with the titles of the film and television works in the information base, and determine whether the titles of the film and television works in the information base have the target film and television works matched with the text information.

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

It should be noted that, for convenience and brevity of description, the embodiments described in the specification all belong to the preferred embodiments, and the related parts are not necessarily essential to the present invention, for example, the matching module 630 and the associating module 640 may be independent devices or may be the same device when being implemented specifically, and the disclosure is not limited thereto.

The present disclosure also provides a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the video association method provided by the present disclosure.

The present disclosure also provides an electronic device, comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to:

acquiring a video to be identified;

Fig. 6 is a block diagram illustrating an apparatus 1900 for performing a video association method according to an exemplary embodiment. For example, the apparatus 1900 may be provided as a server. Referring to Fig. 6, the device 1900 includes a processing component 1922, which further includes one or more processors, and memory resources, represented by memory 1932, for storing instructions executable by the processing component 1922, such as application programs. The application programs stored in memory 1932 may include one or more modules, each corresponding to a set of instructions. Further, the processing component 1922 is configured to execute the instructions to perform the steps of the video association method described above.

The device 1900 may also include a power component 1926 configured to perform power management of the device 1900, a wired or wireless network interface 1950 configured to connect the device 1900 to a network, and an input/output (I/O) interface 1958. The device 1900 may operate based on an operating system stored in memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A video association method, comprising:

acquiring a video to be identified;

2. The method of claim 1, wherein in the case that the target information is the text information, the determining whether a target movie and television work matching the target information exists in the information base comprises:

3. The method according to claim 1, wherein in a case where the target information is the actor image information, the determining whether a target movie work matching the target information exists in the information base includes:

4. The method of claim 1, wherein in the case that the target information is the voice information, the determining whether a target movie and television work matching the target information exists in the information base comprises:

converting the voice information into text information;

5. The method of claim 1, wherein in the case that the target information includes the text information, the voice information, and the actor image information, the determining whether a target movie work matching the target information exists in the information base comprises:

6. A video association apparatus, the apparatus comprising:

the acquisition module is configured to acquire a video to be identified;

7. The apparatus of claim 6, wherein the matching module comprises:

8. The apparatus of claim 6, wherein the matching module comprises:

9. An electronic device, comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to:

acquiring a video to be identified;

10. A computer-readable storage medium, on which computer program instructions are stored, which program instructions, when executed by a processor, carry out the steps of the method according to any one of claims 1 to 5.