CN111125388B - Method, device and equipment for detecting multimedia resources and storage medium

Info

Publication number: CN111125388B
Application number: CN201911403935.3A
Authority: CN
Inventors: 申世伟
Original assignee: Beijing Dajia Internet Information Technology Co Ltd
Current assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority date: 2019-12-30
Filing date: 2019-12-30
Publication date: 2023-12-15
Anticipated expiration: 2039-12-30
Also published as: CN111125388A

Abstract

The disclosure relates to a method, a device, equipment and a storage medium for detecting multimedia resources. The detection method is used for detecting a target multimedia resource from a multimedia resource set to be detected, and comprises the following steps: acquiring the recall rate expected by the current multimedia resource detection; determining, according to a pre-stored mapping relation between recall rates and threshold combinations, the threshold combination used by a plurality of classification models corresponding to the recall rate, wherein the threshold combination is the combination that obtains the highest detection accuracy under the recall rate; for any multimedia resource in the multimedia resource set, classifying the multimedia resource through the plurality of classification models respectively to obtain classification information respectively output by the plurality of classification models; and if the classification information output by at least one classification model of the plurality of classification models is greater than or equal to the classification threshold corresponding to that classification model, determining that the multimedia resource is the target multimedia resource. According to the embodiments, the detection accuracy of multimedia resources is improved.

Description

Method, device and equipment for detecting multimedia resources and storage medium

Technical Field

The present disclosure relates to the field of communications, and in particular, to a method, an apparatus, a device, and a storage medium for detecting a multimedia resource.

Background

As the amount of short video upload increases, supervision of short video content becomes increasingly important. The short video platform has nearly billions of videos uploaded by users every day, and if all the videos are audited manually, huge labor cost is consumed. At present, a man-machine collaborative auditing mode is adopted, a machine samples videos possibly having problems by using a deep learning algorithm and transmits the videos to professional auditing personnel for auditing, so that huge labor cost is reduced.

However, it is difficult to sample all problematic videos from hundreds of millions of video samples using a single classification model. In the related art, video samples are therefore sampled using multiple classification models; however, how to detect problematic videos from a huge number of videos with high performance remains a technical problem to be solved.

Disclosure of Invention

The disclosure provides a method, a device, equipment and a storage medium for detecting multimedia resources, so as to improve the detection accuracy of the multimedia resources. The technical scheme of the present disclosure is as follows:

According to a first aspect of embodiments of the present disclosure, there is provided a method for detecting a multimedia resource, the method being configured to detect a target multimedia resource from a set of multimedia resources to be detected; the method comprises the following steps:

acquiring a recall rate expected by current multimedia resource detection, wherein the recall rate is the ratio of the detected data volume of the target multimedia resource to the actual data volume, and the actual data volume is the data volume of the target multimedia resource actually included in the multimedia resource set;

determining threshold combinations used by a plurality of classification models corresponding to the recall rates according to a mapping relation of the pre-stored recall rates and the threshold combinations; wherein the threshold combination comprises classification thresholds respectively used by the plurality of classification models, and the threshold combination is a combination for obtaining highest detection accuracy under the recall;

for any multimedia resource in the multimedia resource set, respectively carrying out classification processing on the multimedia resource through the plurality of classification models to obtain classification information respectively output by the plurality of classification models, wherein the classification information is used for representing the probability that the multimedia resource is the target multimedia resource;

and if the classification information output by at least one classification model of the plurality of classification models is greater than or equal to the classification threshold corresponding to that classification model, determining that the multimedia resource is the target multimedia resource.

In an embodiment, before determining the threshold combination used by the multiple classification models corresponding to the recall according to the mapping relation between the pre-stored recall and the threshold combination, the method for detecting the multimedia resource further includes:

obtaining multimedia resource sample data;

obtaining all threshold combinations of the plurality of classification models according to a preset traversal rule in a preset interval;

updating the plurality of classification models by using each threshold combination, and detecting the multimedia resource sample data by using the updated plurality of classification models to obtain recall rate and detection accuracy corresponding to each threshold combination;

and recording and storing the highest detection accuracy rate under each recall rate and the corresponding threshold combination thereof.

In an embodiment, the detecting the multimedia resource sample data by using the updated multiple classification models to obtain the recall rate and the detection accuracy rate corresponding to each threshold combination includes:

for each threshold combination, acquiring target multimedia resource data to be detected by using each updated classification model;

combining all the acquired target multimedia resource data to be detected to obtain a first data set;

detecting the multimedia resource sample data by using the updated multiple classification models to obtain multiple target multimedia resource sample data;

combining the plurality of target multimedia resource sample data to obtain a second data set;

calculating a first ratio of the amount of sample data in the second data set that is marked in advance as the target multimedia resource to the amount of target multimedia resource sample data in the multimedia resource sample data, and taking the first ratio as the recall rate corresponding to the current threshold combination;

and calculating a second ratio of the amount of sample data in the second data set that is marked in advance as the target multimedia resource to the amount of data in the first data set, and taking the second ratio as the detection accuracy corresponding to the current threshold combination.

In an embodiment, the merging the obtained all the target multimedia resource data to be detected includes:

Removing repeated data from all the acquired target multimedia resource data to be detected; or alternatively

The merging the plurality of target multimedia resource sample data includes:

and removing repeated data from the plurality of target multimedia resource sample data.

In an embodiment, the method for detecting a multimedia resource further includes:

and if the classification information output by all the classification models is smaller than the corresponding classification threshold value, determining that the multimedia resource is not the target multimedia resource.

According to a second aspect of embodiments of the present disclosure, there is provided a multimedia resource detection apparatus for detecting a target multimedia resource from a set of multimedia resources to be detected; the device comprises:

an acquisition module configured to acquire a recall rate expected by current multimedia resource detection, wherein the recall rate is a ratio of a detected data volume of the target multimedia resource to an actual data volume, and the actual data volume is a data volume of the target multimedia resource actually included in the multimedia resource set;

the determining module is configured to determine threshold combinations used by a plurality of classification models corresponding to the recall acquired by the acquiring module according to a mapping relation of pre-stored recall and threshold combinations; wherein the threshold combination comprises classification thresholds respectively used by the plurality of classification models, and the threshold combination is a combination for obtaining highest detection accuracy under the recall;

The processing module is configured to respectively classify the multimedia resources through the plurality of classification models combined by the threshold value determined by the determining module for any multimedia resource in the multimedia resource set to obtain classification information respectively output by the plurality of classification models, wherein the classification information is used for representing the probability that the multimedia resource is the target multimedia resource;

a first detection module configured to determine that the multimedia resource is the target multimedia resource if the classification information, obtained by the processing module, that is output by at least one classification model of the plurality of classification models is greater than or equal to the classification threshold corresponding to that classification model.

In an embodiment, the device for detecting a multimedia resource further includes:

the first obtaining module is configured to obtain multimedia resource sample data before the determining module determines threshold combinations used by a plurality of classification models corresponding to the recall according to the mapping relation between the pre-stored recall and the threshold combinations;

the second obtaining module is configured to obtain all threshold combinations of the plurality of classification models according to a preset traversal rule in a preset interval;

The updating detection module is configured to update the plurality of classification models by using each threshold combination obtained by the second obtaining module, and detect the multimedia resource sample data by using the updated plurality of classification models to obtain recall rate and detection accuracy rate corresponding to each threshold combination;

the record storage module is configured to record and store the highest detection accuracy rate and the corresponding threshold combination of the highest detection accuracy rate under each recall rate obtained by the update detection module.

In one embodiment, the update detection module includes:

the first acquisition submodule is configured to acquire target multimedia resource data to be detected by using each updated classification model according to each threshold combination;

the first merging sub-module is configured to merge all the target multimedia resource data to be detected, which are acquired by the first acquisition submodule, to obtain a first data set;

the detection sub-module is configured to detect the multimedia resource sample data by utilizing the updated multiple classification models to obtain multiple target multimedia resource sample data;

the second merging sub-module is configured to merge the plurality of target multimedia resource sample data obtained by the detection sub-module to obtain a second data set;

a first calculating sub-module, configured to calculate a first ratio of the amount of sample data, in the second data set obtained by the second merging sub-module, that is marked in advance as the target multimedia resource to the amount of target multimedia resource sample data in the multimedia resource sample data, and to use the first ratio as the recall rate corresponding to the current threshold combination;

and a second calculating sub-module, configured to calculate a second ratio of the amount of sample data, in the second data set obtained by the second merging sub-module, that is marked in advance as the target multimedia resource to the amount of data in the first data set obtained by the first merging sub-module, and to use the second ratio as the detection accuracy corresponding to the current threshold combination.

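In an embodiment, the first merging sub-module is configured to:

remove repeated data from all the acquired target multimedia resource data to be detected; or alternatively

The second merging sub-module is configured to:

remove repeated data from the plurality of target multimedia resource sample data.

In an embodiment, the device for detecting a multimedia resource further includes:

the second detection module, configured to determine that the multimedia resource is not the target multimedia resource if the classification information output by all the classification models obtained by the processing module is smaller than the corresponding classification threshold value.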

According to a third aspect of the embodiments of the present disclosure, there is provided a multimedia resource detection apparatus, including:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to execute the instructions to implement the method for detecting a multimedia resource described above.

According to a fourth aspect of embodiments of the present disclosure, there is provided a storage medium, wherein instructions in the storage medium, when executed by a processor of a multimedia resource detection device, enable the multimedia resource detection device to perform the above-described method of detecting a multimedia resource.

The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:

the recall rate expected by the current multimedia resource detection is obtained, and the threshold combination used by the plurality of classification models corresponding to the recall rate is determined according to the pre-stored mapping relation between recall rates and threshold combinations. Because the threshold combination is the combination that obtains the highest detection accuracy under the recall rate, detecting multimedia resources with the plurality of classification models that adopt this threshold combination yields the highest detection accuracy at that recall rate, so that the detection accuracy of the multimedia resources is improved.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure and do not constitute an undue limitation on the disclosure.

Fig. 1 is a flowchart illustrating a method for detecting a multimedia asset according to an exemplary embodiment of the present disclosure.

Fig. 2 is a flowchart illustrating a method of recording and saving the highest detection accuracy at each recall rate and its corresponding threshold combination according to an exemplary embodiment of the present disclosure.

Fig. 3 is a block diagram of a multimedia asset detection device according to an exemplary embodiment of the present disclosure.

Fig. 4 is a block diagram of another multimedia asset detection device according to an exemplary embodiment of the present disclosure.

Fig. 5 is a block diagram of another multimedia asset detection device according to an exemplary embodiment of the present disclosure.

Fig. 6 is a block diagram of another multimedia asset detection device according to an exemplary embodiment of the present disclosure.

Fig. 7 is a block diagram of a multimedia asset detection device according to an exemplary embodiment of the present disclosure.

Fig. 8 is a block diagram illustrating a device for detecting multimedia resources according to an exemplary embodiment of the present disclosure.

Detailed Description

In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.

Fig. 1 is a flowchart of a method for detecting a multimedia resource according to an exemplary embodiment of the present disclosure, and as shown in fig. 1, the method for detecting a multimedia resource is applicable to a multimedia resource detecting apparatus, and is used for detecting a target multimedia resource from a multimedia resource set to be detected, and the method for detecting a multimedia resource includes the steps of:

In step S101, a recall rate expected by the current multimedia asset detection is acquired, where the recall rate is a ratio of a detected data amount of the target multimedia asset to an actual data amount, and the actual data amount is a data amount of the target multimedia asset actually included in the multimedia asset set.

In this embodiment, the multimedia resource detection device may obtain a current service requirement, which may be a desired detected data amount of the target multimedia resource.

The target multimedia resource refers to a multimedia resource with problems, and the multimedia resource to be detected currently can include, but is not limited to, video or image.

After obtaining the expected detection data amount of the target multimedia resource, the multimedia resource detection device can calculate the ratio of the expected detection data amount of the target multimedia resource to the data amount of the target multimedia resource actually included in the current multimedia resource set; this ratio is the recall rate expected by the current multimedia resource detection.
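The calculation described in step S101 can be sketched as follows. This is a minimal illustration only; the function name `expected_recall` and its argument names are hypothetical and do not appear in the disclosure.

```python
def expected_recall(desired_detected: int, actual_targets: int) -> float:
    """Recall rate = desired detected amount of target resources
    divided by the amount of target resources actually in the set."""
    if actual_targets <= 0:
        raise ValueError("actual_targets must be positive")
    if not 0 <= desired_detected <= actual_targets:
        raise ValueError("desired_detected must lie in [0, actual_targets]")
    return desired_detected / actual_targets


if __name__ == "__main__":
    # Hypothetical figures: the service wants 80 of 100 problematic videos found.
    print(expected_recall(80, 100))
```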

In step S102, determining a threshold combination used by a plurality of classification models corresponding to the recall according to a mapping relation between the pre-stored recall and the threshold combination; the threshold combination comprises classification thresholds respectively used by a plurality of classification models, and the threshold combination is a combination for obtaining the highest detection accuracy under the recall rate.

In order to determine the threshold combinations used by the multiple classification models corresponding to the recall, in this embodiment, the highest detection accuracy at each recall and its corresponding threshold combination may be pre-stored.

For example, if the number of classification models is n, the pre-stored correspondence may be (C1, P1, R11, R12, ..., R1n), (C2, P2, R21, R22, ..., R2n), ..., (Cm, Pm, Rm1, Rm2, ..., Rmn), where Ci is the i-th recall rate, Pi is the highest detection accuracy under Ci, and Ri1, Ri2, ..., Rin are the classification thresholds corresponding to the n classification models, that is, Ri1, Ri2, ..., Rin form the classification threshold combination of the n classification models, where i is an integer greater than or equal to 1 and less than or equal to m.

In this embodiment, after acquiring the recall rate desired for detection, the threshold combination used by the plurality of classification models corresponding to the recall rate may be determined from the mapping relation of the pre-stored recall rate and the threshold combination.
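A possible form of this lookup is sketched below. The table contents are invented for illustration, and the disclosure does not say what happens when the desired recall rate is not stored exactly; this sketch assumes the entry with the smallest stored recall rate that is not below the desired one is used. The names `TABLE` and `lookup_thresholds` are hypothetical.

```python
from bisect import bisect_left
from typing import List, Tuple

# Each row: (recall C_i, highest accuracy P_i, thresholds (R_i1, ..., R_in)).
# Hypothetical values for n = 2 models, sorted by ascending recall.
Row = Tuple[float, float, Tuple[float, ...]]
TABLE: List[Row] = [
    (0.70, 0.92, (0.9, 0.8)),
    (0.80, 0.85, (0.8, 0.7)),
    (0.90, 0.71, (0.6, 0.5)),
]


def lookup_thresholds(table: List[Row], desired_recall: float) -> Tuple[float, ...]:
    """Return the threshold combination stored for the desired recall.

    Assumption: if there is no exact match, fall forward to the next
    higher stored recall so the desired recall is still met.
    """
    recalls = [row[0] for row in table]
    i = bisect_left(recalls, desired_recall)
    if i == len(table):
        raise LookupError("no stored recall reaches the desired recall")
    return table[i][2]


if __name__ == "__main__":
    print(lookup_thresholds(TABLE, 0.80))
    print(lookup_thresholds(TABLE, 0.75))
```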

Because the threshold combination is the combination that obtains the highest detection accuracy under the recall rate, detecting multimedia resources with the plurality of classification models that adopt this threshold combination yields the highest detection accuracy at that recall rate; that is, the detection accuracy can be improved.

In step S103, for any multimedia resource in the multimedia resource set, the multimedia resource is classified by a plurality of classification models, so as to obtain classification information output by the classification models, where the classification information is used to represent a probability that the multimedia resource is a target multimedia resource.

In step S104, if the classification information output by at least one classification model of the plurality of classification models is greater than or equal to the classification threshold corresponding to that classification model, it is determined that the multimedia resource is the target multimedia resource.

Optionally, the method for detecting the multimedia resource may further include: if the classification information output by all the classification models is smaller than the corresponding classification threshold value, determining that the multimedia resource is not the target multimedia resource.

For example, the plurality of classification models are model 1, model 2, ..., model n, the classification thresholds corresponding to model 1, model 2, ..., model n are R1, R2, ..., Rn, respectively, and the classification information obtained after video 1 to be detected is input into the plurality of classification models is T1, T2, ..., Tn, respectively. If T1 < R1, T2 < R2, ..., Tn < Rn, video 1 to be detected is not a target video, that is, video 1 to be detected has no problem. If at least one model reaches its threshold, for example T1 ≥ R1, video 1 to be detected is a target video, that is, video 1 to be detected has a problem.
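The decision rule of step S104 and its optional complement can be sketched as a single predicate. The function name `is_target` is hypothetical; `scores` stands for T1, ..., Tn and `thresholds` for R1, ..., Rn.

```python
from typing import Sequence


def is_target(scores: Sequence[float], thresholds: Sequence[float]) -> bool:
    """True if at least one model's output reaches its own threshold.

    Returns False only when every output is below its threshold.
    """
    if len(scores) != len(thresholds):
        raise ValueError("one score and one threshold are needed per model")
    return any(t >= r for t, r in zip(scores, thresholds))


if __name__ == "__main__":
    # Hypothetical outputs of two models for one video.
    print(is_target([0.2, 0.9], [0.5, 0.9]))  # second model reaches its threshold
    print(is_target([0.2, 0.3], [0.5, 0.9]))  # no model reaches its threshold
```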

In the above embodiment, the recall rate expected by the current multimedia resource detection is obtained, and the threshold combination used by the plurality of classification models corresponding to the recall rate is determined according to the pre-stored mapping relation between recall rates and threshold combinations. Because the threshold combination is the combination that obtains the highest detection accuracy under the recall rate, detecting multimedia resources with the plurality of classification models that adopt this threshold combination yields the highest detection accuracy at that recall rate, so that the detection accuracy of the multimedia resources is improved.

Fig. 2 is a flowchart of recording and saving the highest detection accuracy at each recall and the corresponding threshold combination according to an exemplary embodiment of the present disclosure, as shown in fig. 2, before the step S102, the method for detecting a multimedia resource may further include:

in step S201, multimedia asset sample data is obtained.

The multimedia asset sample data includes problematic multimedia asset sample data and non-problematic multimedia asset sample data, and the multimedia asset sample data may be marked with a label; for example, for problematic multimedia asset sample data, i.e., target multimedia asset sample data, the label may be set to 1.

In step S202, all the threshold combinations of the plurality of classification models are obtained according to the preset traversal rule within the preset interval.

All threshold combinations of the plurality of classification models may be obtained according to the preset traversal rule as follows: for each classification model, the classification threshold is increased step by step by a preset threshold increment. The preset interval is [0, 1], and the threshold increment may be x, where x ∈ [0, 1], mx ∈ [0, 1] and m is a positive integer; in other words, the integer multiples of x that are used as thresholds also lie in [0, 1].

For example, for n classification models, the classification threshold of the i-th classification model is denoted as ti. Assuming that the threshold increment x is 0.1, tn is any one of 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, t(n-1) is any one of 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, and so on. When the threshold of each classification model traverses all of these values, all threshold combinations of the n classification models are obtained (11^n combinations in this example).

In this embodiment, the threshold increment x of 0.1 is for illustration only; x may take any value in [0, 1] such that the integer multiples of x used as thresholds lie in [0, 1], for example, x may be 0.15, 0.2, 0.23, etc.
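The traversal of step S202 amounts to a Cartesian product over the per-model threshold values. The sketch below is one way to enumerate it; the function names are hypothetical, and the requirement x > 0 is an assumption added here so that the enumeration terminates.

```python
from itertools import product
from typing import Iterator, List, Tuple


def threshold_values(step: float) -> List[float]:
    """Return 0, x, 2x, ... for every multiple of x that stays within [0, 1]."""
    if not 0 < step <= 1:
        # step == 0 would never advance, so it is excluded here (assumption).
        raise ValueError("step must be in (0, 1]")
    values = []
    m = 0
    # Round to suppress floating-point drift such as 0.30000000000000004.
    while round(m * step, 10) <= 1.0:
        values.append(round(m * step, 10))
        m += 1
    return values


def all_threshold_combinations(n_models: int, step: float) -> Iterator[Tuple[float, ...]]:
    """Yield every combination (t1, ..., tn) of per-model thresholds."""
    return product(threshold_values(step), repeat=n_models)


if __name__ == "__main__":
    combos = list(all_threshold_combinations(2, 0.1))
    print(len(combos))  # 11 values per model, 2 models
```

Because the number of combinations grows exponentially with the number of models, a coarser increment keeps the search tractable when n is large.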

In step S203, a plurality of classification models are updated by using each threshold combination, and the updated plurality of classification models are used to detect the multimedia resource sample data, so as to obtain the recall rate and the detection accuracy rate corresponding to each threshold combination.

The method comprises the steps of acquiring target multimedia resource data to be detected by using each updated classification model according to each threshold combination, and combining all acquired target multimedia resource data to be detected to obtain a first data set; detecting the multimedia resource sample data by utilizing the updated multiple classification models to obtain multiple target multimedia resource sample data, and combining the multiple target multimedia resource sample data to obtain a second data set; and calculating the recall rate and the detection accuracy rate corresponding to each threshold combination based on the first data set, the second data set and the target multimedia resource sample data in the multimedia resource sample data.

The method for merging all the acquired target multimedia resource data to be detected may be: and removing repeated data from all the acquired target multimedia resource data needing to be detected. The method for merging the plurality of target multimedia resource sample data is as follows: duplicate data is removed from the plurality of target multimedia asset sample data.

In this embodiment, the plurality of classification models are updated by using each threshold combination, and the updated classification models are used to detect the multimedia resource sample data; if the detected multimedia resource sample data is target multimedia resource sample data, its label may be set to 1.

In this embodiment, a first ratio of the amount of sample data of the second data set, which is marked as the target multimedia resource in advance, to the amount of sample data of the target multimedia resource in the sample data of the multimedia resource may be used as the recall rate, and a second ratio of the amount of sample data of the second data set, which is marked as the target multimedia resource in advance, to the amount of data in the first data set may be used as the detection accuracy rate.
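The two ratios can be sketched for a single threshold combination as follows. This is a simplified reading, not the disclosure's exact procedure: it assumes that the first and second data sets coincide with the deduplicated union of the samples flagged by any model, so the second ratio becomes the fraction of flagged samples that carry the target label. The function name `evaluate` and the sample data are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def evaluate(
    scores: Dict[str, Sequence[float]],
    labels: Dict[str, int],
    thresholds: Sequence[float],
) -> Tuple[float, float]:
    """Return (recall, detection accuracy) for one threshold combination."""
    flagged = set()
    for j, r in enumerate(thresholds):
        # Set union merges the per-model results and removes duplicates.
        flagged |= {sid for sid, s in scores.items() if s[j] >= r}
    positives = {sid for sid, y in labels.items() if y == 1}
    hits = flagged & positives
    recall = len(hits) / len(positives) if positives else 0.0
    accuracy = len(hits) / len(flagged) if flagged else 0.0
    return recall, accuracy


if __name__ == "__main__":
    # Hypothetical outputs of two models on four labelled samples.
    scores = {"a": (0.9, 0.1), "b": (0.2, 0.8), "c": (0.3, 0.2), "d": (0.7, 0.6)}
    labels = {"a": 1, "b": 1, "c": 1, "d": 0}
    print(evaluate(scores, labels, (0.5, 0.5)))
```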

In step S204, the highest detection accuracy at each recall and its corresponding threshold combination are recorded and saved.

For the same recall, if the currently obtained detection accuracy is greater than the existing detection accuracy, recording the currently obtained detection accuracy, otherwise, maintaining the previously recorded detection accuracy.

And when the highest detection accuracy rate under each recall rate is recorded and stored, recording and storing a threshold combination corresponding to the highest accuracy rate.
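The bookkeeping of step S204 can be sketched as a dictionary keyed by recall rate. Rounding the recall rate to four decimal places so that near-equal values share a key is an assumption of this sketch, and the name `record_best` and the sample results are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Thresholds = Tuple[float, ...]


def record_best(
    results: Iterable[Tuple[Thresholds, float, float]],
) -> Dict[float, Tuple[float, Thresholds]]:
    """Map each recall to (highest accuracy, thresholds that achieved it).

    `results` yields (thresholds, recall, accuracy) for every combination.
    """
    best: Dict[float, Tuple[float, Thresholds]] = {}
    for thresholds, recall, accuracy in results:
        key = round(recall, 4)  # assumption: bucket recalls by rounding
        # Keep the earlier record unless the new accuracy is strictly higher.
        if key not in best or accuracy > best[key][0]:
            best[key] = (accuracy, thresholds)
    return best


if __name__ == "__main__":
    results = [
        ((0.5, 0.5), 0.8, 0.60),
        ((0.4, 0.6), 0.8, 0.70),
        ((0.9, 0.9), 0.5, 0.95),
        ((0.3, 0.7), 0.8, 0.65),
    ]
    for recall, (accuracy, thresholds) in sorted(record_best(results).items()):
        print(recall, accuracy, thresholds)
```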

Optionally, a change curve may be generated according to the highest detection accuracy at each recall rate. For example, with the recall rate as the horizontal axis and the detection accuracy as the vertical axis, a plurality of points are determined from each recall rate and its corresponding highest detection accuracy, and the points are connected to generate the change curve, so that the corresponding detection accuracy can be conveniently determined from the curve and a given recall rate.

The highest detection accuracy at each recall and its corresponding threshold combination may be stored in a database.

In the above embodiment, all the threshold combinations of the plurality of classification models are obtained according to a preset traversal rule in a preset interval, each threshold combination is utilized to update the plurality of classification models, the updated plurality of classification models are utilized to detect the multimedia resource sample data, the recall rate and the detection accuracy corresponding to each threshold combination are obtained, the highest detection accuracy under each recall rate and the corresponding threshold combination thereof are recorded and stored, and conditions are provided for the subsequent determination of the threshold combinations used by the plurality of classification models corresponding to the recall rate according to the recall rate.

Fig. 3 is a block diagram of a multimedia asset detection device for detecting a target multimedia asset from a set of multimedia assets to be detected, according to an exemplary embodiment of the present disclosure. Referring to fig. 3, the apparatus includes:

The obtaining module 31 is configured to obtain a recall rate expected for the current multimedia asset detection, where the recall rate is a ratio of a detected data amount of the target multimedia asset to an actual data amount, and the actual data amount is a data amount of the target multimedia asset actually included in the multimedia asset set.

The determining module 32 is configured to determine, according to a pre-stored mapping relationship between recall rates and threshold combinations, the threshold combination used by the plurality of classification models corresponding to the recall rate acquired by the obtaining module 31; the threshold combination comprises the classification thresholds respectively used by the plurality of classification models, and the threshold combination is the combination that obtains the highest detection accuracy under the recall rate.

The processing module 33 is configured to, for any multimedia resource in the multimedia resource set, classify the multimedia resource through the plurality of classification models that adopt the threshold combination determined by the determining module 32, to obtain classification information respectively output by the plurality of classification models, where the classification information is used to represent a probability that the multimedia resource is a target multimedia resource.

The first detection module 34 is configured to determine that the multimedia resource is the target multimedia resource if the classification information, obtained by the processing module 33, that is output by at least one classification model of the plurality of classification models is greater than or equal to the classification threshold corresponding to that classification model.

Fig. 4 is a block diagram of another multimedia resource detection device according to an exemplary embodiment of the present disclosure, and as shown in fig. 4, the multimedia resource detection device may further include, on the basis of the embodiment shown in fig. 3:

The first obtaining module 35 is configured to obtain multimedia resource sample data before the determining module 32 determines, according to the pre-stored mapping relationship between recall rates and threshold combinations, the threshold combination used by the plurality of classification models corresponding to the recall rate.

The second obtaining module 36 is configured to obtain all threshold combinations of the plurality of classification models according to a preset traversal rule within a preset interval.

The update detection module 37 is configured to update a plurality of classification models by using each threshold combination obtained by the second obtaining module 36, and detect the multimedia resource sample data by using the updated plurality of classification models, so as to obtain a recall rate and a detection accuracy rate corresponding to each threshold combination.

The record keeping module 38 is configured to record and keep the highest detection accuracy at each recall and its corresponding threshold combination obtained by the update detection module 37.
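The work of modules 36 to 38 amounts to an exhaustive search over threshold combinations. The sketch below is one hedged way to express it: `build_recall_mapping`, the `threshold_grid` argument (standing in for the preset traversal rule within the preset interval), the `evaluate` callback, and the rounding of recall to two decimals so that combinations can share a key are all assumptions introduced for the example.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Tuple

Combo = Tuple[float, ...]


def build_recall_mapping(
    threshold_grid: Sequence[float],
    n_models: int,
    evaluate: Callable[[Combo], Tuple[float, float]],
) -> Dict[float, Tuple[float, Combo]]:
    """Map each observed recall to (best accuracy, threshold combination)."""
    best: Dict[float, Tuple[float, Combo]] = {}
    # Traverse every combination of one grid value per classification model.
    for combo in product(threshold_grid, repeat=n_models):
        # `evaluate` is assumed to return (recall, detection accuracy) on the sample data.
        recall, accuracy = evaluate(combo)
        key = round(recall, 2)  # assumed bucketing so recalls can be compared
        # Keep only the combination with the highest accuracy at this recall.
        if key not in best or accuracy > best[key][0]:
            best[key] = (accuracy, combo)
    return best


# Illustrative evaluator only: constant recall, accuracy grows with the thresholds.
def toy_evaluate(combo: Combo) -> Tuple[float, float]:
    return 0.5, sum(combo)


table = build_recall_mapping([0.1, 0.5, 0.9], 2, toy_evaluate)
```

The search cost grows as the grid size raised to the number of models, so a coarse grid is a natural choice when many classification models are combined.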

Fig. 5 is a block diagram of another multimedia resource detection device according to an exemplary embodiment of the present disclosure, and as shown in fig. 5, the update detection module 37 may include:

the first obtaining sub-module 371 is configured to obtain, for each threshold combination, target multimedia resource data to be detected using each classification model after updating.

The first merging sub-module 372 is configured to merge all the target multimedia resource data to be detected obtained by the first obtaining sub-module 371, so as to obtain a first data set.

The detection sub-module 373 is configured to detect the multimedia resource sample data by using the updated multiple classification models, so as to obtain multiple target multimedia resource sample data.

The second merging sub-module 374 is configured to merge the plurality of target multimedia resource sample data obtained by the detection sub-module 373 to obtain a second data set.

The first calculating sub-module 375 is configured to calculate a first ratio of the amount of sample data in the second data set obtained by the second merging sub-module 374 that was marked in advance as the target multimedia resource to the amount of target multimedia resource sample data in the multimedia resource sample data, and to use the first ratio as the recall rate corresponding to the current threshold combination.

The second calculating sub-module 376 is configured to calculate a second ratio of the amount of sample data in the second data set obtained by the second merging sub-module 374 that was marked in advance as the target multimedia resource to the data amount of the first data set obtained by the first merging sub-module 372, and to use the second ratio as the detection accuracy corresponding to the current threshold combination.
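The two ratios computed by sub-modules 375 and 376 can be illustrated with a short sketch. It assumes, purely for the example, that resources are identified by integer IDs, that merging is a set union (so an item flagged by several models is counted once), and that the helper names `merge` and `recall_and_accuracy` are hypothetical.

```python
from typing import Iterable, Set, Tuple


def merge(per_model_results: Iterable[Set[int]]) -> Set[int]:
    """Merge the per-model detections into one data set (assumed: set union)."""
    merged: Set[int] = set()
    for result in per_model_results:
        merged |= result
    return merged


def recall_and_accuracy(
    first_set: Set[int],
    second_set: Set[int],
    labeled_targets: Set[int],
) -> Tuple[float, float]:
    """Return (recall, detection accuracy) for one threshold combination."""
    # Samples in the second data set that were marked in advance as targets.
    hits = second_set & labeled_targets
    # First ratio: hits over all pre-marked targets in the sample data.
    recall = len(hits) / len(labeled_targets) if labeled_targets else 0.0
    # Second ratio: hits over the size of the first data set.
    accuracy = len(hits) / len(first_set) if first_set else 0.0
    return recall, accuracy


# Illustrative IDs only: model A flags {1, 2}, model B flags {2, 3}.
detected = merge([{1, 2}, {2, 3}])
recall, accuracy = recall_and_accuracy(detected, detected, {1, 2, 4, 5})
```

Here the same merged set is passed as both the first and the second data set, which is a simplification made for the example; the function keeps them as separate parameters because the text above defines them separately.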

Fig. 6 is a block diagram of another multimedia resource detection device according to an exemplary embodiment of the present disclosure, as shown in fig. 6, the device may further include, on the basis of the embodiment shown in fig. 3:

the second detection module 35 is configured to determine that the multimedia resource is not the target multimedia resource if the classification information output by all the classification models obtained by the processing module 33 is smaller than the corresponding classification threshold.

The specific manner in which the various modules of the apparatus in the above embodiments perform their operations has been described in detail in connection with the method embodiments, and will not be repeated here.

Fig. 7 is a block diagram of a multimedia asset detection device according to an exemplary embodiment of the present disclosure. As shown in fig. 7, the multimedia resource detection device includes a processor 710 and a memory 720 for storing instructions executable by the processor 710; the processor 710 is configured to execute the instructions to implement the method for detecting a multimedia resource. In addition to the processor 710 and the memory 720 shown in fig. 7, the multimedia resource detection device may further include other hardware according to its actual function, which will not be described herein.

In an exemplary embodiment, a storage medium is also provided, such as the memory 720, comprising instructions executable by the processor 710 to perform the method of detecting a multimedia asset described above. Optionally, the storage medium may be a non-transitory computer-readable storage medium, for example, a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

Fig. 8 is a block diagram illustrating an apparatus for detecting multimedia resources according to an exemplary embodiment. For example, apparatus 800 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, exercise device, personal digital assistant, or like terminal device.

Referring to fig. 8, apparatus 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.

The processing component 802 generally controls overall operation of the apparatus 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interactions between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.

One of the processors 820 in the processing component 802 may be configured to:

acquiring a recall rate expected by current multimedia resource detection, wherein the recall rate is the ratio of the detected data volume of the target multimedia resource to the actual data volume, and the actual data volume is the data volume of the target multimedia resource actually included in the multimedia resource set;

determining threshold combinations used by a plurality of classification models corresponding to the recall rates according to the pre-stored mapping relation between the recall rates and the threshold combinations; the threshold combination comprises classification thresholds which are respectively and correspondingly used by a plurality of classification models, and the threshold combination is a combination for obtaining the highest detection accuracy under the recall rate;

for any multimedia resource in the multimedia resource set, respectively carrying out classification processing on the multimedia resource through a plurality of classification models to obtain classification information respectively output by the plurality of classification models, wherein the classification information is used for representing the probability that the multimedia resource is a target multimedia resource;

determining that the multimedia resource is the target multimedia resource if at least one classification model of the plurality of classification models satisfies the following condition: the classification information output by that classification model is greater than or equal to the classification threshold corresponding to that classification model.

The memory 804 is configured to store various types of data to support operations at the device 800. Examples of such data include instructions for any application or method operating on the device 800, contact data, phonebook data, messages, pictures, videos, and the like. The memory 804 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.

The power component 806 provides power to the various components of the device 800. The power component 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 800.

The multimedia component 808 includes a screen that provides an output interface between the device 800 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action, but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. The front camera and/or the rear camera may receive external multimedia data when the device 800 is in an operational mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.

The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 further includes a speaker for outputting audio signals.

The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.

The sensor component 814 includes one or more sensors for providing status assessments of various aspects of the apparatus 800. For example, the sensor component 814 may detect an on/off state of the device 800 and the relative positioning of components, such as the display and keypad of the device 800; the sensor component 814 may also detect a change in position of the device 800 or of one of its components, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in temperature of the device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

The communication component 816 is configured to facilitate communication between the apparatus 800 and other devices, in either a wired or a wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 816 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, the apparatus 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for performing the above-described method of detecting a multimedia asset.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure that follow the general principles of the disclosure and include such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.

It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A method for detecting a multimedia asset, the method being used for detecting a target multimedia asset from a set of multimedia assets to be detected, the method comprising:

If at least one classification model of the plurality of classification models satisfies: the classification information output by the classification model is greater than or equal to a classification threshold value corresponding to the classification model, and the multimedia resource is determined to be the target multimedia resource;

before determining the threshold combination used by the multiple classification models corresponding to the recall according to the mapping relation of the pre-stored recall and the threshold combination, the method for detecting the multimedia resource further comprises the following steps:

obtaining multimedia resource sample data;

recording and storing the highest detection accuracy rate under each recall rate and the corresponding threshold combination thereof;

the method for detecting the multimedia resource sample data by using the updated multiple classification models to obtain recall rate and detection accuracy corresponding to each threshold combination comprises the following steps:

2. The method for detecting multimedia resources according to claim 1, wherein the merging all the acquired target multimedia resource data to be detected includes:

The merging the plurality of target multimedia resource sample data includes:

3. The method for detecting a multimedia resource according to claim 1, wherein the method for detecting a multimedia resource further comprises:

4. A multimedia asset detection device for detecting a target multimedia asset from a set of multimedia assets to be detected, the device comprising:

a first detection module configured to, if at least one classification model of the plurality of classification models satisfies: the classification information output by the classification model obtained by the processing module is greater than or equal to a classification threshold value corresponding to the classification model, and the multimedia resource is determined to be the target multimedia resource;

wherein, the device for detecting the multimedia resource further comprises:

the record storage module is configured to record and store the highest detection accuracy rate under each recall rate and the corresponding threshold combination obtained by the update detection module;

wherein the update detection module comprises:

the first merging sub-module is configured to merge all the target multimedia resource data to be detected, which are acquired by the first acquisition sub-module, to obtain a first data set;

5. The apparatus according to claim 4, wherein the first merging sub-module is configured to:

The second merging sub-module is configured to:

6. The apparatus for detecting a multimedia resource according to claim 4, wherein the apparatus for detecting a multimedia resource further comprises:

7. A multimedia asset detection device, comprising:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to execute the instructions to implement the method of detecting a multimedia asset as claimed in any one of claims 1 to 3.

8. A storage medium, characterized in that instructions in the storage medium, when executed by a processor of a multimedia asset detection device, enable the multimedia asset detection device to perform the method of detection of a multimedia asset as claimed in any one of claims 1 to 3.