CN110427499B

CN110427499B - Method and device for processing multimedia resources, storage medium and electronic device

Info

Publication number: CN110427499B
Application number: CN201810387697.0A
Authority: CN
Inventors: 罗刚
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2018-04-26
Filing date: 2018-04-26
Publication date: 2023-08-29
Anticipated expiration: 2038-04-26
Also published as: CN110427499A

Abstract

The invention discloses a method and device for processing multimedia resources, a storage medium, and an electronic device. The method comprises: identifying a target object in a multimedia resource; acquiring a target resource related to the target object; determining a target position according to the position of the target object in the playing progress of the multimedia resource; and associating the target resource with the target position, so that the target resource can be presented when the multimedia resource is played to the target position. The invention solves the technical problem in the related art that multimedia resource content is presented in only a single way.

Description

Method and device for processing multimedia resources, storage medium and electronic device

Technical Field

The present invention relates to the field of computers, and in particular, to a method and an apparatus for processing multimedia resources, a storage medium, and an electronic device.

Background

Existing multimedia resources, such as documentaries, popular science videos, and PPT (PowerPoint) presentations, basically present content in the form of video, audio (such as narration), pictures, text (such as subtitles), and the like. Once a multimedia resource has been produced, its content is fixed: people of different ages and knowledge levels receive the same information from the same resource, and the viewer's capacity to absorb knowledge and existing range of knowledge are not taken into account.

In view of the above problems, no effective solution has been proposed at present.

Disclosure of Invention

The embodiment of the invention provides a processing method and device of multimedia resources, a storage medium and an electronic device, which at least solve the technical problem that the content display mode of the multimedia resources in the related technology is single.

According to an aspect of an embodiment of the present invention, there is provided a method for processing a multimedia resource, including: identifying a target object in the multimedia resource; acquiring a target resource related to the target object; determining a target position according to the position of the target object in the playing progress of the multimedia resource; and associating the target resource with the target position, so that the target resource can be presented when the multimedia resource is played to the target position.

According to another aspect of the embodiment of the present invention, there is also provided a processing apparatus for a multimedia resource, including: an identification unit for identifying a target object in the multimedia resource; a first acquisition unit configured to acquire a target resource related to a target object; the first determining unit is used for determining a target position according to the position of the target object in the playing progress of the multimedia resource; and the association unit is used for associating the target resource with the target position so that the target resource can be displayed under the condition that the multimedia resource is played to the target position.

According to a further aspect of embodiments of the present application, there is also provided a storage medium having stored therein a computer program, wherein the computer program is arranged to perform the above method when run.

According to still another aspect of the embodiments of the present application, there is also provided an electronic device including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the above method by the computer program.

In the embodiment of the application, the target object in the multimedia resource is identified, and the acquired target resource related to the target object is associated with the target position in the playing progress of the multimedia resource, so that the target resource can be displayed when the multimedia resource is played to the target position. This solves the technical problem in the related art that multimedia resource content is presented in only a single way.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:

FIG. 1 is a schematic view of an application environment of a method for processing multimedia resources according to an embodiment of the present invention;

FIG. 2 is a flow chart of an alternative method of processing multimedia resources according to an embodiment of the present invention;

FIG. 3 is a flow chart of an alternative method for processing multimedia resources according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of an alternative multimedia resource processing device according to an embodiment of the present invention;

FIG. 5 is a schematic structural view of an alternative electronic device according to an embodiment of the present invention.

Detailed Description

In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of protection of the present invention.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

Example 1

According to an aspect of an embodiment of the present invention, a method for processing a multimedia resource is provided. Optionally, the method may be applied to, but is not limited to, the application environment shown in FIG. 1. As shown in FIG. 1, the terminal 102 stores multimedia resources and can use an application to send a processing request to the server 106 through the network 104. The server 106 feeds back an identification model to the terminal 102 through the network 104, and the terminal 102 identifies a target object in the multimedia resource according to the received identification model. The terminal 102 can then send the server 106, through the network 104, a request to search for target resources related to the target object. After receiving the target resource fed back by the server 106, the terminal 102 determines a target position according to the position of the target object in the playing progress of the multimedia resource, and associates the target resource with the target position so that the target resource can be presented when the multimedia resource is played to the target position.

In the embodiment of the invention, the target object in the multimedia resource is identified, and the acquired target resource related to the target object is associated with the target position in the playing progress of the multimedia resource, so that the target resource can be displayed when the multimedia resource is played to the target position. This solves the technical problem in the related art that multimedia resource content is presented in only a single way.

Optionally, in this embodiment, the above terminal may include, but is not limited to, at least one of: a mobile phone, a tablet computer, etc. The network may include, but is not limited to, a wireless network, such as Bluetooth, Wi-Fi, or another network that enables wireless communication. The server may include, but is not limited to, at least one of: a PC or another device that provides computing services. The above is merely an example, and the present embodiment is not limited thereto.

Example 2

As an alternative embodiment, as shown in fig. 2, the method for processing a multimedia resource may include:

S202, identifying a target object in a multimedia resource;

S204, acquiring target resources related to the target object;

S206, determining a target position according to the position of the target object in the playing progress of the multimedia resource;

S208, associating the target resource with the target position so that the target resource can be displayed under the condition that the multimedia resource is played to the target position.

The multimedia resource is a resource carrying a combination of multiple media forms, including at least two of: audio, image, text, video (i.e., continuously played image frames), etc. For example, the multimedia resource may be a documentary, a popular science video, a PPT, and the like.

The multimedia resource may carry information about the target object in any media form (pictures, audio, text, etc.). For example, the target object may be an animal, a plant, a famous painting, a celebrity, a piece of music, a scenic spot, or a landmark building, and the multimedia resource may include a picture of the target object or a picture describing it, part or all of the target object itself (such as an excerpt of a piece of music), narration audio introducing the target object, and so on.

Whether a target object is present in the multimedia resource is determined by recognizing at least one of the pictures, audio, text, etc. in the multimedia resource. Optionally, step S202 may be performed with recognition models obtained in advance through machine learning training, each of which is used to recognize whether a target object exists in the multimedia resource. Accordingly, identifying the target object in the multimedia resource in step S202 may include: obtaining an identification model for identifying the target object in the multimedia resource, and identifying the target object in the multimedia resource with the identification model.

It should be noted that the recognition model may be an audio recognition model, an image recognition model, a text recognition model, or another model for recognizing a particular media form, and is used to recognize the corresponding media form in the multimedia resource. Optionally, when the multimedia resource includes a video, whether the target object is present in the multimedia resource is determined by identifying the target object in the image frames of the video, that is, the target object in an image frame is identified by an image recognition model.

When a target object is identified in the multimedia resource, a target resource related to the target object is acquired, for example, text, images, audio, or other resources that introduce knowledge about the target object. The target resource may be a resource stored in advance in a resource library and mapped to the target object, where the resource library may be produced by manual editing; alternatively, the target resource may be found by searching a network. After resources related to the target object have been retrieved from the resource library or the network, the most relevant ones, or those carried in a specified media form, may be selected as the target resource.

After the target resource related to the target object is acquired, the target resource is associated to the playing progress of the multimedia resource, specifically, the target position can be determined according to the position of the target object in the playing progress of the multimedia resource, and the target resource is associated with the target position, so that the target resource can be displayed under the condition that the multimedia resource is played to the target position.
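The flow of steps S202 to S208 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the embodiment: the names `Detection`, `process_resource`, `recognize`, and `find_resource` are hypothetical, positions are assumed to be seconds of playing progress, and the recognition model and the resource lookup are passed in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Detection:
    """A target object found at a position (seconds) in the playing progress."""
    name: str
    position: float


def process_resource(
    media: object,
    recognize: Callable[[object], List[Detection]],
    find_resource: Callable[[str], Optional[str]],
) -> Dict[float, List[str]]:
    """Map each target position to the target resources to be shown there."""
    associations: Dict[float, List[str]] = {}
    for det in recognize(media):                # S202: identify target objects
        resource = find_resource(det.name)      # S204: acquire the related resource
        if resource is None:
            continue                            # nothing to show for this object
        target_position = det.position          # S206: position in the playing progress
        # S208: associate the target resource with the target position
        associations.setdefault(target_position, []).append(resource)
    return associations


# Hypothetical stand-ins for a recognition model and a resource library.
def fake_recognize(media: object) -> List[Detection]:
    return [Detection("cat", 12.0), Detection("dog", 47.5), Detection("yak", 60.0)]


library = {
    "cat": "Cats are small carnivorous mammals.",
    "dog": "Dogs are domesticated canids.",
}

result = process_resource("documentary.mp4", fake_recognize, library.get)
```

In this sketch an object with no matching resource (here "yak") is simply skipped, so nothing is presented at its position.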

Optionally, the identification model for identifying the target object in the multimedia resource may be obtained through the following steps:

S1, determining a target group of the multimedia resource;

S2, determining a target object to be identified in the multimedia resource according to the target group of the multimedia resource;

S3, selecting an identification model for identifying the target object from the identification model library.

The target objects that need to be identified may differ between target groups. For example, a popular science animal documentary for children may need to identify almost every animal, whereas a documentary aimed at older or adult viewers may not need to identify every animal, or may need to identify animal species more specifically, because that target group already has a certain knowledge background.

For example, if the target group of an animal documentary (the multimedia resource) is children, the documentary is processed with image identification models that identify cats, dogs, and so on; if the target group is adults, it is processed with image identification models that identify a specific breed of cat, a specific breed of dog, and so on.
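The selection in steps S1 to S3 can be sketched as a pair of lookups. The sketch below is illustrative only: the group names, the object names, the model identifiers, and the function `select_models` are all hypothetical and do not come from the embodiment.

```python
from typing import Dict, List

# Hypothetical mapping from target group to the objects worth identifying.
OBJECTS_BY_GROUP: Dict[str, List[str]] = {
    "children": ["cat", "dog"],
    "adults": ["persian cat", "shepherd dog"],
}

# Hypothetical identification model library: object name -> model identifier.
MODEL_LIBRARY: Dict[str, str] = {
    "cat": "img-model-cat",
    "dog": "img-model-dog",
    "persian cat": "img-model-persian-cat",
    "shepherd dog": "img-model-shepherd-dog",
}


def select_models(target_group: str) -> Dict[str, str]:
    """S1-S3: target group -> objects to identify -> identification models."""
    objects = OBJECTS_BY_GROUP.get(target_group, [])   # S2: objects for this group
    # S3: keep only the objects for which the library holds a model
    return {name: MODEL_LIBRARY[name] for name in objects if name in MODEL_LIBRARY}


children_models = select_models("children")
adult_models = select_models("adults")
```

An unknown target group yields an empty selection in this sketch; a real system might instead fall back to a default set of models.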

After determining the target object to be identified in the multimedia resource, selecting an identification model corresponding to the target object from an identification model library. The recognition model library includes one or more recognition models, each model for recognizing a corresponding target object. Each recognition model in the recognition model library is pre-generated, specifically, before selecting a recognition model for recognizing the target object in the recognition model library, the method may further include the steps of:

S1, establishing a media training set corresponding to a target object, wherein the media training set comprises at least one media sample pair, and each media sample pair comprises a media sample and training information for indicating whether the target object appears in the media sample;

S2, performing machine learning training by adopting a media training set to obtain a recognition model for recognizing the target object;

S3, adding the recognition model for recognizing the target object into a recognition model library.

For example, to generate an image recognition model for recognizing a cat, an image training set corresponding to the cat is created. The image training set includes a plurality of image sample pairs, each of which includes an image and information indicating whether a cat is present in the image. A neural network model is trained with the image training set to obtain an image recognition model for recognizing the cat, and this image recognition model is added to the recognition model library.
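The data flow of steps S1 to S3 (sample pairs, training, adding to the library) can be sketched as follows. Note the substitutions: instead of a neural network trained on images, the sketch uses a nearest-centroid classifier over toy two-dimensional feature vectors, purely to keep the example self-contained. The names `MediaSamplePair`, `train`, and `model_library`, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class MediaSamplePair:
    """A media sample (here a toy feature vector) plus its training information."""
    sample: Sequence[float]
    contains_target: bool


def _centroid(vectors: List[Sequence[float]]) -> List[float]:
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def _sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(training_set: List[MediaSamplePair]) -> Callable[[Sequence[float]], bool]:
    """S2: fit a nearest-centroid classifier (a stand-in for a neural network).

    Assumes the training set holds at least one positive and one negative pair.
    """
    pos = _centroid([p.sample for p in training_set if p.contains_target])
    neg = _centroid([p.sample for p in training_set if not p.contains_target])
    return lambda sample: _sq_dist(sample, pos) < _sq_dist(sample, neg)


# S1: build a training set for the target object "cat" (values are made up).
cat_training_set = [
    MediaSamplePair([0.9, 0.8], True),
    MediaSamplePair([0.8, 0.9], True),
    MediaSamplePair([0.1, 0.2], False),
    MediaSamplePair([0.2, 0.1], False),
]

# S3: add the trained model to the recognition model library.
model_library: Dict[str, Callable[[Sequence[float]], bool]] = {}
model_library["cat"] = train(cat_training_set)
```

Keying the library by the target object's name makes the later selection step a simple dictionary lookup.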

Optionally, acquiring the target resource related to the target object in step S204 may include the following steps:

S1, searching for media resources related to the target object by taking the name of the target object as a keyword;

S2, acquiring the target resource from the search results.
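A minimal sketch of the keyword search and selection above, assuming an in-memory resource list in place of a real search service. The list `RESOURCES`, the functions `search` and `pick_target_resource`, and the rule "take the first result in the requested media form" are assumptions made for illustration.

```python
import re
from typing import Dict, List, Optional, Set

# Hypothetical resource library: each entry has a media form and a description.
RESOURCES: List[Dict[str, str]] = [
    {"form": "text", "content": "The cat is a small domesticated carnivore."},
    {"form": "image", "content": "cat_portrait.png: photo of a cat"},
    {"form": "text", "content": "The dog is a domesticated descendant of the wolf."},
]


def _words(text: str) -> Set[str]:
    """Split text into lowercase words so 'cat' does not match 'domesticated'."""
    return set(re.findall(r"[a-z]+", text.lower()))


def search(keyword: str) -> List[Dict[str, str]]:
    """S1: search resources using the target object's name as a single-word keyword."""
    kw = keyword.lower()
    return [r for r in RESOURCES if kw in _words(r["content"])]


def pick_target_resource(
    results: List[Dict[str, str]], form: str = "text"
) -> Optional[Dict[str, str]]:
    """S2: take the first search result carried in the specified media form."""
    for r in results:
        if r["form"] == form:
            return r
    return None


cat_results = search("cat")
cat_text = pick_target_resource(cat_results)
cat_image = pick_target_resource(cat_results, form="image")
```

Matching on whole words rather than substrings avoids false hits such as "cat" inside "domesticated"; a production search would rank results by relevance rather than take the first match.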

Optionally, if the multimedia resource needs to be played, the displayed content may be filtered, where the filtering basis is a description file (user portrait) of the user and/or a selection operation, and the user may correspond to a terminal playing the multimedia resource or an account number currently logged in by a client playing the multimedia resource. The filtering step may be performed during playing, that is, filtering the display content in the multimedia resource to be played later while playing, or the filtering step may be performed before or after playing.

As an alternative embodiment, after associating the target resource with the target location in step S208, the following steps may be performed:

S1, receiving a play request of a user, wherein the play request is used for indicating to play multimedia resources;

S2, acquiring a description file and/or a selection operation of a user, wherein the description file can be a file used for describing personal characteristics of the user, such as a user portrait, and the selection operation can be a selection operation input by the user through interaction equipment;

S3, determining a filtering rule according to the description file and/or the selection operation, wherein the filtering rule is used for filtering the target resource and/or the target position;

S4, playing the multimedia resource, wherein in the process of playing the multimedia resource, the target resource filtered by the filtering rule is displayed at the target position filtered by the filtering rule.
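Steps S2 to S4 can be sketched as building a predicate from the user's description file and selection operation and applying it to the associated resources. The fields used here (`knowledge_level`, `muted_objects`, `level`) are hypothetical; the embodiment does not specify what a description file contains.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Annotation:
    position: float    # target position, in seconds of playing progress
    object_name: str
    level: str         # hypothetical difficulty tag, e.g. "basic" or "advanced"
    resource: str


@dataclass
class UserProfile:
    """Hypothetical description file (user portrait) plus selection operation."""
    knowledge_level: str
    muted_objects: Set[str] = field(default_factory=set)  # objects the user opted out of


def build_filter(profile: UserProfile) -> Callable[[Annotation], bool]:
    """S3: derive a filtering rule from the description file and the selection."""
    def keep(a: Annotation) -> bool:
        return (
            a.level == profile.knowledge_level
            and a.object_name not in profile.muted_objects
        )
    return keep


def annotations_to_show(
    all_annotations: List[Annotation], profile: UserProfile
) -> List[Annotation]:
    """S4: the annotations that remain to be displayed during playback."""
    return list(filter(build_filter(profile), all_annotations))


annotations = [
    Annotation(12.0, "cat", "basic", "Cats are mammals."),
    Annotation(12.0, "cat", "advanced", "Felis catus belongs to the family Felidae."),
    Annotation(47.5, "dog", "basic", "Dogs are mammals."),
]
profile = UserProfile(knowledge_level="basic", muted_objects={"dog"})
shown = annotations_to_show(annotations, profile)
```

Because the rule is a plain predicate, it can be applied before playback or lazily while playing, which matches the two timings the text allows.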

Optionally, in the process of playing the multimedia resource, the target resource of the text, the picture and even the video can be displayed on the upper layer of the multimedia resource, and further, the target resource can be displayed in a knowledge chain display mode.

Optionally, associating the target resource with the target position in step S208 may include the following steps:

S1, marking the target position in the playing progress of the multimedia resource to obtain a time axis of the target object;

S2, associating the target resource with the time axis of the target object, that is, associating the target resource with the moment at which the corresponding target object appears in the time axis, to generate a time axis of the target resource.
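The two time axes can be sketched as lists of spans. In the sketch below, `ObjectSpan`, `ResourceSpan`, and `associate` are hypothetical names, and times are assumed to be seconds from the start of the multimedia resource.

```python
from typing import Dict, List, NamedTuple


class ObjectSpan(NamedTuple):
    """One entry of the target object's time axis."""
    start: float
    end: float
    object_name: str


class ResourceSpan(NamedTuple):
    """One entry of the target resource's time axis."""
    start: float
    end: float
    resource: str


def associate(
    object_axis: List[ObjectSpan], resources: Dict[str, str]
) -> List[ResourceSpan]:
    """S2: attach each resource to the moments at which its object appears."""
    return [
        ResourceSpan(span.start, span.end, resources[span.object_name])
        for span in object_axis
        if span.object_name in resources
    ]


# S1: the marked target positions form the time axis of the target objects.
object_axis = [
    ObjectSpan(10.0, 15.0, "cat"),
    ObjectSpan(40.0, 50.0, "dog"),
    ObjectSpan(70.0, 72.0, "cat"),
]
resource_axis = associate(object_axis, {"cat": "Cats are mammals."})
```

An object that appears several times yields several spans carrying the same resource, and objects without a resource are left out of the resource time axis.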

An alternative implementation of the foregoing embodiment in a specific application scenario is shown in FIG. 3 and is described below.

In FIG. 3, "object recognition" refers to object recognition data (i.e., the above-described recognition model library) built by a machine learning system. For example, for popular science animal videos aimed at children, only the kind of animal (e.g., dog) needs to be recognized, whereas more specialized animal videos call for more specialized, fine-grained information (e.g., shepherd dog), so that the species, genus, and so on of the animal can be recognized. The object recognition data is a set of recognition models obtained by machine learning training on a large amount of learning material (pairs of image training samples).

And analyzing the video picture through the identification model, and identifying the target object in the original video. Alternatively, the recognition model may be a recognition model corresponding to a target group selected from object recognition data that is pre-established (may be updated periodically) according to the target group of the original video.

After the target object in the video is identified, a keyword time axis (object time axis) corresponding to the video progress is generated from the position of the video image frame including the target object in the video playback progress, the object time axis marking the period in which the target object appears.
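One way to turn per-frame recognition results into the periods marked on the object time axis is to merge runs of nearby frame indices. This is a sketch under assumptions: the function `frames_to_periods`, the frame rate `fps`, and the tolerance `max_gap` are not taken from the embodiment.

```python
from typing import List, Tuple


def frames_to_periods(
    frame_indices: List[int], fps: float, max_gap: int = 1
) -> List[Tuple[float, float]]:
    """Merge indices of frames containing the object into (start, end) periods.

    Frames whose indices differ by at most `max_gap` fall into the same period.
    Times are in seconds; the end of a period is the end of its last frame.
    """
    periods: List[Tuple[float, float]] = []
    if not frame_indices:
        return periods
    frames = sorted(frame_indices)
    start = prev = frames[0]
    for f in frames[1:]:
        if f - prev > max_gap:
            # gap too large: close the current period and open a new one
            periods.append((start / fps, (prev + 1) / fps))
            start = f
        prev = f
    periods.append((start / fps, (prev + 1) / fps))
    return periods


# The object was recognized in frames 0-2 and 10-11 of a 10 fps video.
cat_periods = frames_to_periods([0, 1, 2, 10, 11], fps=10)
```

Raising `max_gap` tolerates frames in which recognition briefly fails, at the cost of merging appearances that are genuinely separate.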

Further, enhancement information (the target resource) corresponding to these keywords is searched for. In the "object information enhancement" step, the enhancement information related to the target object is associated with the object time axis to generate an "enhancement information time axis". When the video is played, enhancement information such as pictures and text introductions is presented on a layer above the video during the period in which the corresponding target object appears.
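During playback, the player needs to know which enhancement information, if any, applies at the current time. A minimal sketch using binary search is shown below; it assumes the time axis is sorted by start time and that periods do not overlap, and the names `TIMELINE` and `enhancement_at` are hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical enhancement information time axis: (start, end, text) in seconds,
# sorted by start time and assumed not to overlap.
TIMELINE: List[Tuple[float, float, str]] = [
    (10.0, 15.0, "Cats are small carnivorous mammals."),
    (40.0, 50.0, "Dogs are domesticated canids."),
]


def enhancement_at(
    timeline: List[Tuple[float, float, str]], t: float
) -> Optional[str]:
    """Return the enhancement information to overlay at playback time t, if any."""
    starts = [start for start, _, _ in timeline]
    i = bisect_right(starts, t) - 1          # last period starting at or before t
    if i >= 0 and timeline[i][0] <= t < timeline[i][1]:
        return timeline[i][2]
    return None


at_12 = enhancement_at(TIMELINE, 12.0)
at_20 = enhancement_at(TIMELINE, 20.0)
```

The end of each period is treated as exclusive here, so the overlay disappears exactly when the period ends.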

Optionally, as shown in FIG. 3, when the video is played, the "enhancement information time axis" may also be filtered according to the user configuration (selection operation) and/or the user portrait (description file) to obtain a filtered "user enhancement information time axis". The original video is then played together with the "user enhancement information time axis", so that the filtered enhancement information, such as pictures and text introductions, is displayed on a layer above the video during the filtered time periods.

This alternative implementation uses machine learning and other technologies to identify objects (animals, plants, scenery, or landmark buildings) in a video before or during playback. When playback reaches an identification point, additional content can be shown to the user on the picture in the form of introductory text, pictures, or knowledge point links, and the user can obtain further or associated information by following these knowledge points. Meanwhile, the user can establish a filtering rule according to his or her own interests, capacity to absorb information, and so on, so that only the enhanced content the user wants is displayed.

By establishing object recognition data of different levels and ranges, the method meets the product-side requirement of generating enhanced content of different depths and ranges: a suitable machine learning system can be selected according to conditions such as the video content and the target group to generate the target enhanced content, which greatly increases the amount of information carried by the video. In addition, filtering the enhanced content according to user selection and the user portrait gives the user a more convenient and personalized channel to knowledge and improves the knowledge experience of watching the video.

The technical scheme of this alternative implementation combines mature machine learning technology with existing search products and technologies: the target object in the video is identified, information about the target object itself and extended information are collected, and finally the original video and the additional extended information are output to the user, thereby achieving the effect of expanding knowledge.

It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present invention is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present invention. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present invention.

From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method of the various embodiments of the present invention.

Example 3

According to another aspect of the embodiment of the present invention, there is also provided a multimedia resource processing apparatus for implementing the above method for processing a multimedia resource. As shown in FIG. 4, the apparatus includes an identifying unit 402, a first acquiring unit 404, a first determining unit 406, and an associating unit 408. The identifying unit 402 is configured to identify a target object in the multimedia resource; the first acquiring unit 404 is configured to acquire a target resource related to the target object; the first determining unit 406 is configured to determine a target position according to the position of the target object in the playing progress of the multimedia resource; and the associating unit 408 is configured to associate the target resource with the target position, so that the target resource can be displayed when the multimedia resource is played to the target position.

Optionally, in this embodiment, the identifying unit includes: the first acquisition module is used for acquiring an identification model for identifying a target object in the multimedia resource, wherein the identification model is a model obtained by machine learning training in advance; and the identification module is used for identifying the target object in the multimedia resource through the identification model.

Optionally, in this embodiment, the first obtaining module includes: a first determining module for determining a target population of multimedia resources; the second determining module is used for determining a target object to be identified in the multimedia resource according to the target group of the multimedia resource; and the third determining module is used for selecting the recognition model for recognizing the target object from the recognition model library.

Optionally, in this embodiment, the apparatus further includes: an establishing unit, configured to establish a media training set corresponding to the target object before the identification model for identifying the target object is selected from the identification model library, wherein the media training set comprises at least one media sample pair, and each media sample pair comprises a media sample and training information indicating whether the target object appears in the media sample; a training unit, configured to perform machine learning training with the media training set to obtain an identification model for identifying the target object; and an adding unit, configured to add the identification model for identifying the target object to the identification model library.

Optionally, in this embodiment, the first acquisition unit includes: the searching module is used for searching media resources related to the target object by taking the name of the target object as a keyword; and the second acquisition module is used for acquiring the target resource from the search result.

Optionally, in this embodiment, the apparatus further includes: the receiving unit is used for receiving a play request of a user after the target resource is associated with the target position; the second acquisition unit is used for acquiring the description file and/or the selection operation of the user; the second determining unit is used for determining a filtering rule according to the description file and/or the selection operation, wherein the filtering rule is used for filtering the target resource and/or the target position; and the playing unit is used for playing the multimedia resources, wherein in the process of playing the multimedia resources, the target resources filtered by the filtering rules are displayed at the target positions filtered by the filtering rules.

Optionally, in this embodiment, the association unit includes: the marking module is used for marking the target position in the playing progress of the multimedia resource and obtaining a time axis of the target object; and the association module is used for associating the target resource with the time axis of the target object to generate the time axis of the target resource.

Optionally, in this embodiment, the multimedia resource includes a video, and the identifying unit is configured to identify a target object in an image frame of the video.

Example 4

According to a further aspect of embodiments of the present invention there is also provided a storage medium having stored therein a computer program, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.

Alternatively, in the present embodiment, the above-described storage medium may be configured to store a computer program for performing the steps of:

S1, identifying a target object in a multimedia resource;

S2, acquiring target resources related to a target object;

S3, determining a target position according to the position of the target object in the playing progress of the multimedia resource;

S4, associating the target resource with the target position so that the target resource can be displayed under the condition that the multimedia resource is played to the target position.

Alternatively, in the present embodiment, the above-described storage medium may be configured to store a computer program for performing the steps of: identifying a target object in a multimedia asset includes: acquiring an identification model for identifying a target object in a multimedia resource, wherein the identification model is a model obtained by machine learning training in advance; the target object in the multimedia asset is identified by the identification model.

Alternatively, in the present embodiment, the above-described storage medium may be configured to store a computer program for performing the steps of: the obtaining of the identification model for identifying the target object in the multimedia resource comprises: determining a target group of multimedia resources; determining a target object to be identified in the multimedia resource according to a target group of the multimedia resource; and selecting the recognition model for recognizing the target object from the recognition model library.

Alternatively, in the present embodiment, the above-described storage medium may be configured to store a computer program for performing the steps of: before selecting the recognition model for recognizing the target object in the recognition model library, the method further comprises: establishing a media training set corresponding to the target object, wherein the media training set comprises at least one media sample pair, and each media sample pair comprises a media sample and training information for indicating whether the target object appears in the media sample; performing machine learning training by adopting a media training set to obtain an identification model for identifying a target object; the recognition model for recognizing the target object is added to the recognition model library.

Alternatively, in the present embodiment, the above-described storage medium may be configured to store a computer program for performing the steps of: acquiring the target resource related to the target object comprises: searching media resources related to the target object by taking the name of the target object as a keyword; and acquiring target resources from the search results.

Optionally, in this embodiment, the storage medium may be configured to store a computer program for performing the following steps: after the target resource is associated with the target position, the method further comprises: receiving a play request from a user; acquiring a description file of the user and/or a selection operation of the user; determining a filtering rule according to the description file and/or the selection operation, wherein the filtering rule is used for filtering the target resource and/or the target position; and playing the multimedia resource, wherein, while the multimedia resource is being played, the target resource filtered by the filtering rule is displayed at the target position filtered by the filtering rule.
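One way to picture how a filtering rule could be derived from the description file and/or the selection operation is sketched below. Everything specific here is an assumption: the `age` field in the profile, the `level` tag on each annotation, the age threshold of 12, and the convention that the rule is a predicate returning True for annotations that are kept.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set


@dataclass
class Annotation:
    position_s: float  # target position in the playing progress, in seconds
    object_name: str
    resource: str
    level: str  # hypothetical difficulty tag: "basic" or "advanced"


# Assumption: a filtering rule is a predicate; True means "keep and display".
FilterRule = Callable[[Annotation], bool]


def build_filter_rule(
    profile: Optional[dict] = None,
    selected_objects: Optional[Set[str]] = None,
) -> FilterRule:
    """Derive a rule from a user profile and/or an explicit selection."""
    allowed_levels = {"basic", "advanced"}
    # Hypothetical policy: young viewers only see basic material.
    if profile is not None and profile.get("age", 18) < 12:
        allowed_levels = {"basic"}

    def rule(a: Annotation) -> bool:
        if a.level not in allowed_levels:
            return False
        if selected_objects is not None and a.object_name not in selected_objects:
            return False
        return True

    return rule


ANNOTATIONS = [
    Annotation(12.5, "giraffe", "giraffe.txt", "basic"),
    Annotation(42.0, "panda", "panda-genetics.txt", "advanced"),
    Annotation(90.0, "panda", "panda.txt", "basic"),
]

child_rule = build_filter_rule(profile={"age": 8})
print([a.position_s for a in ANNOTATIONS if child_rule(a)])
```

Because the rule sees the whole annotation, the same predicate can filter by resource, by position, or by both.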

Optionally, in this embodiment, the storage medium may be configured to store a computer program for performing the following steps: associating the target resource with the target position comprises: marking the target position in the playing progress of the multimedia resource to obtain a time axis of the target object; and associating the target resource with the time axis of the target object to generate a time axis of the target resource.
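The two time axes described above can be sketched with plain dictionaries and lists. The representation is an assumption: positions are seconds into the playing progress, each object has one resource, and the function names `build_object_timeline` and `build_resource_timeline` are illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_object_timeline(detections: List[Tuple[float, str]]) -> Dict[str, List[float]]:
    """Group detections of the form (position in seconds, object name) by object."""
    timeline: Dict[str, List[float]] = defaultdict(list)
    for position, name in sorted(detections):
        timeline[name].append(position)
    return dict(timeline)


def build_resource_timeline(
    object_timeline: Dict[str, List[float]],
    resources: Dict[str, str],
) -> List[Tuple[float, str, str]]:
    """Attach each object's resource to every position at which the object appears."""
    entries = [
        (position, name, resources[name])
        for name, positions in object_timeline.items()
        if name in resources
        for position in positions
    ]
    return sorted(entries)


detections = [(42.0, "panda"), (12.5, "giraffe"), (90.0, "panda")]
resources = {"panda": "panda.txt", "giraffe": "giraffe.txt"}

object_timeline = build_object_timeline(detections)
resource_timeline = build_resource_timeline(object_timeline, resources)
print(resource_timeline)
```

Keeping the result sorted by position makes it cheap to find what is due at a given point in playback.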

Optionally, in this embodiment, the storage medium may be configured to store a computer program for performing the following steps: the multimedia resource comprises a video, and identifying the target object in the multimedia resource comprises: identifying the target object in an image frame of the video.
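For a video, frame-level identification could be turned into playback positions roughly as follows. The sketch assumes a constant frame rate, represents frames as strings so that a keyword matcher can stand in for a recognition model, and adds a merging step (keeping only the first position of each run of detections) that is not part of the embodiment. `detect_positions` and `merge_runs` are hypothetical names.

```python
from typing import Callable, Iterable, List


def detect_positions(
    frames: Iterable[str],
    fps: float,
    model: Callable[[str], bool],
    step: int = 1,
) -> List[float]:
    """Return playback positions (seconds) of sampled frames where the model fires."""
    positions = []
    for index, frame in enumerate(frames):
        # Only every `step`-th frame is inspected; index / fps is its timestamp.
        if index % step == 0 and model(frame):
            positions.append(index / fps)
    return positions


def merge_runs(positions: List[float], max_gap: float) -> List[float]:
    """Keep the first position of each run of detections at most max_gap apart."""
    starts = []
    previous = None
    for p in positions:
        if previous is None or p - previous > max_gap:
            starts.append(p)
        previous = p
    return starts


frames = ["sky", "panda", "panda", "tree", "tree", "panda"]
raw = detect_positions(frames, fps=2.0, model=lambda f: f == "panda")
print(raw, merge_runs(raw, max_gap=0.5))
```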

Optionally, in this embodiment, those skilled in the art will understand that all or part of the steps of the methods in the above embodiments may be performed by a program that instructs the hardware of a terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: a flash drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and the like.

Example 5

According to still another aspect of the embodiments of the present invention, there is also provided an electronic device for implementing the above multimedia resource processing method. As shown in FIG. 5, the electronic device includes: a processor 802, a memory 804, a display 806, a user interface 808, a transmission device 810, a sensor 812, and the like. A computer program is stored in the memory, and the processor is configured to perform, by means of the computer program, the steps of any of the method embodiments described above.

Optionally, in this embodiment, the electronic device may be located in at least one of a plurality of network devices of a computer network.

Optionally, in this embodiment, the processor may be configured to execute the following steps by means of a computer program:

S1, identifying a target object in a multimedia resource;

S2, acquiring target resources related to a target object;

S3, determining a target position according to the position of the target object in the playing progress of the multimedia resource;

S4, associating the target resource with the target position, so that the target resource is displayed when the multimedia resource is played to the target position.

Optionally, in this embodiment, the processor may be configured to execute the following steps by means of a computer program: identifying the target object in the multimedia resource comprises: acquiring a recognition model for identifying the target object in the multimedia resource, wherein the recognition model is obtained in advance by machine learning training; and identifying the target object in the multimedia resource by means of the recognition model.

Optionally, in this embodiment, the processor may be configured to execute the following steps by means of a computer program: acquiring the recognition model for identifying the target object in the multimedia resource comprises: determining a target group of the multimedia resource; determining, according to the target group of the multimedia resource, the target object to be identified in the multimedia resource; and selecting, from a recognition model library, the recognition model for identifying the target object.

Optionally, in this embodiment, the processor may be configured to execute the following steps by means of a computer program: before the recognition model for identifying the target object is selected from the recognition model library, the method further comprises: establishing a media training set corresponding to the target object, wherein the media training set comprises at least one media sample pair, and each media sample pair comprises a media sample and training information indicating whether the target object appears in the media sample; performing machine learning training with the media training set to obtain the recognition model for identifying the target object; and adding the recognition model for identifying the target object to the recognition model library.

Optionally, in this embodiment, the processor may be configured to execute the following steps by means of a computer program: acquiring the target resource related to the target object comprises: searching for media resources related to the target object by using the name of the target object as a keyword; and acquiring the target resource from the search results.

Optionally, in this embodiment, the processor may be configured to execute the following steps by means of a computer program: after the target resource is associated with the target position, the method further comprises: receiving a play request from a user; acquiring a description file of the user and/or a selection operation of the user; determining a filtering rule according to the description file and/or the selection operation, wherein the filtering rule is used for filtering the target resource and/or the target position; and playing the multimedia resource, wherein, while the multimedia resource is being played, the target resource filtered by the filtering rule is displayed at the target position filtered by the filtering rule.
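The display half of the steps above, finding which target resources become due as playback advances, might look like the sketch below. It assumes the player reports its position in periodic ticks and that the time axis has already been filtered and sorted by position; `due_resources` and the tick values are hypothetical. The standard-library `bisect` module does the lookup.

```python
import bisect
from typing import List, Tuple

Entry = Tuple[float, str]  # (target position in seconds, target resource)


def due_resources(timeline: List[Entry], prev_s: float, now_s: float) -> List[str]:
    """Return resources whose target position lies in the interval (prev_s, now_s].

    The timeline must be sorted by position.
    """
    positions = [position for position, _ in timeline]
    lo = bisect.bisect_right(positions, prev_s)
    hi = bisect.bisect_right(positions, now_s)
    return [resource for _, resource in timeline[lo:hi]]


timeline = [(12.5, "giraffe.txt"), (42.0, "panda.txt"), (90.0, "panda.txt")]

# Hypothetical player ticks: each pair is (previous position, current position).
for prev_s, now_s in [(12.0, 13.0), (13.0, 41.0), (41.5, 42.0)]:
    print(now_s, due_resources(timeline, prev_s, now_s))
```

Using a half-open interval means a resource is reported exactly once even when a tick lands precisely on a target position.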

Optionally, in this embodiment, the processor may be configured to execute the following steps by means of a computer program: associating the target resource with the target position comprises: marking the target position in the playing progress of the multimedia resource to obtain a time axis of the target object; and associating the target resource with the time axis of the target object to generate a time axis of the target resource.

Optionally, in this embodiment, the processor may be configured to execute the following steps by means of a computer program: the multimedia resource comprises a video, and identifying the target object in the multimedia resource comprises: identifying the target object in an image frame of the video.

Optionally, those skilled in the art will understand that the structure shown in FIG. 5 is only schematic. The electronic device may be a terminal device such as a smartphone (e.g. an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile internet device (MID), or a PAD. FIG. 5 does not limit the structure of the electronic device. For example, the electronic device may include more or fewer components (e.g., network interfaces) than shown in FIG. 5, or have a configuration different from that shown in FIG. 5.

The memory 804 may be used to store software programs and modules, such as the program instructions/modules corresponding to the method and apparatus for processing multimedia resources in the embodiments of the present invention. The processor 802 runs the software programs and modules stored in the memory 804 to execute various functional applications and data processing, that is, to implement the above method for processing multimedia resources. The memory 804 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 804 may further include memory located remotely from the processor 802, and such remote memory may be connected to the terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device 810 is used to receive or transmit data via a network. Specific examples of the network described above may include wired networks and wireless networks. In one example, the transmission device 810 includes a network adapter (Network Interface Controller, NIC) that can connect to other network devices and routers via a network cable to communicate with the internet or a local area network. In one example, the transmission device 810 is a Radio Frequency (RF) module for communicating with the internet wirelessly.

The numbering of the foregoing embodiments of the present application is for description only and does not indicate the relative merits of the embodiments.

If the integrated units in the above embodiments are implemented in the form of software functional units and sold or used as separate products, they may be stored in the above computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application.

In the foregoing embodiments of the present application, each embodiment is described with its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.

In the several embodiments provided by the present application, it should be understood that the disclosed client may be implemented in other manners. The apparatus embodiments described above are merely exemplary. For example, the division of the units is merely a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces, units, or modules, and may be electrical or take other forms.

The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and refinements without departing from the principles of the present invention, and such modifications and refinements are also to be regarded as falling within the scope of protection of the present invention.

Claims

1. A method for processing a multimedia resource, comprising:

identifying a target object in the multimedia asset;

acquiring a target resource related to the target object;

marking the position of the target object in the playing progress of the multimedia resource as a target position;

associating the target resource with the target position to generate an enhancement information time axis, wherein the enhancement information time axis is marked with the target resources associated with each of the target positions at which different target objects appear;

determining the target resource associated with the target position based on the enhancement information time axis when the playing progress of the multimedia resource is played to the target position;

and under the condition that the target resource does not meet the filtering rule set by the account number currently logged in by the client for playing the multimedia resource, displaying the target resource at the upper layer of the multimedia resource.

2. The method of claim 1, wherein identifying a target object in a multimedia asset comprises:

acquiring an identification model for identifying the target object in the multimedia resource, wherein the identification model is a model obtained by machine learning training in advance;

identifying the target object in the multimedia resource through the identification model.

3. The method of claim 2, wherein obtaining an identification model for identifying the target object in the multimedia asset comprises:

determining a target population of the multimedia resource;

determining the target object to be identified in the multimedia resource according to the target group of the multimedia resource;

selecting, from an identification model library, the identification model for identifying the target object.

4. A method according to claim 3, wherein prior to selecting the recognition model for recognition of the target object in a library of recognition models, the method further comprises:

establishing a media training set corresponding to the target object, wherein the media training set comprises at least one media sample pair, and each media sample pair comprises a media sample and training information for indicating whether the target object appears in the media sample;

performing machine learning training by adopting the media training set to obtain an identification model for identifying the target object;

adding an identification model for identifying the target object into the identification model library.

5. The method of claim 1, wherein obtaining a target resource associated with the target object comprises:

searching for media resources related to the target object by taking the name of the target object as a keyword;

and acquiring the target resource from the search result.

6. The method of claim 1, wherein after associating the target resource with the target location, the method further comprises:

receiving a play request of a user;

acquiring a description file and/or a selection operation of the user;

determining the filtering rule according to the description file and/or the selection operation, wherein the filtering rule is used for filtering the target resource and/or the target position;

and playing the multimedia resource, wherein the target resource filtered by the filtering rule is displayed at the target position filtered by the filtering rule in the process of playing the multimedia resource.

7. The method of any of claims 1-6, wherein the multimedia asset comprises a video, and identifying a target object in the multimedia asset comprises:

identifying the target object in an image frame of the video.

8. A processing apparatus for a multimedia resource, comprising:

an identification unit, configured to identify a target object in the multimedia resource;

a first obtaining unit, configured to obtain a target resource related to the target object;

a first determining unit, configured to mark, as a target position, a position where the target object appears in the playing progress of the multimedia resource;

an associating unit configured to associate the target resource to the target location, so as to generate an enhancement information time axis, where the enhancement information time axis is marked with the target resource associated with each of the target locations where different target objects appear;

the device is also for: determining the target resource associated with the target position based on the enhancement information time axis when the playing progress of the multimedia resource is played to the target position; and under the condition that the target resource does not meet the filtering rule set by the account number currently logged in by the client for playing the multimedia resource, displaying the target resource at the upper layer of the multimedia resource.

9. The apparatus of claim 8, wherein the identification unit comprises:

the first acquisition module is used for acquiring an identification model for identifying the target object in the multimedia resource, wherein the identification model is a model obtained by machine learning training in advance;

and an identification module, configured to identify the target object in the multimedia resource through the identification model.

10. The apparatus of claim 9, wherein the first acquisition module comprises:

a first determining module, configured to determine a target population of the multimedia resource;

a second determining module, configured to determine, according to a target group of the multimedia resource, the target object that needs to be identified in the multimedia resource;

and a third determining module for selecting the recognition model for recognizing the target object from a recognition model library.

11. The apparatus of claim 10, wherein the apparatus further comprises:

a building unit, configured to build a media training set corresponding to the target object before the recognition model for recognizing the target object is selected from a recognition model library, where the media training set includes at least one media sample pair, and each media sample pair includes a media sample and training information for indicating whether the target object appears in the media sample;

the training unit is used for performing machine learning training by adopting the media training set to obtain an identification model for identifying the target object;

and an adding unit, configured to add the identification model for identifying the target object to the identification model library.

12. The apparatus of claim 8, wherein the first acquisition unit comprises:

the searching module is used for searching media resources related to the target object by taking the name of the target object as a keyword;

and the second acquisition module is used for acquiring the target resource from the search result.

13. The apparatus of claim 8, wherein the apparatus further comprises:

the receiving unit is used for receiving a play request of a user after the target resource is associated with the target position;

the second acquisition unit is used for acquiring the description file and/or the selection operation of the user;

the second determining unit is used for determining the filtering rule according to the description file and/or the selection operation, wherein the filtering rule is used for filtering the target resource and/or the target position;

and the playing unit is used for playing the multimedia resource, wherein the target resource filtered by the filtering rule is displayed at the target position filtered by the filtering rule in the process of playing the multimedia resource.

14. The apparatus according to any one of claims 8 to 13, wherein the multimedia asset comprises a video, and the identifying unit is configured to identify the target object in an image frame of the video.

15. A storage medium having a computer program stored therein, wherein the computer program is arranged to perform the method of any of claims 1 to 7 when run.

16. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method according to any of the claims 1 to 7 by means of the computer program.