CN111125388A - Multimedia resource detection method, device and equipment and storage medium - Google Patents

Publication number
CN111125388A
Authority
CN
China
Prior art keywords
multimedia resource
classification
threshold
data
classification models
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911403935.3A
Other languages
Chinese (zh)
Other versions
CN111125388B (en
Inventor
申世伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Reach Best Technology Co Ltd
Original Assignee
Reach Best Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Reach Best Technology Co Ltd
Priority to CN201911403935.3A
Publication of CN111125388A
Application granted
Publication of CN111125388B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/45 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Abstract

The disclosure relates to a method, an apparatus, a device and a storage medium for detecting multimedia resources. The detection method detects a target multimedia resource from a set of multimedia resources to be detected, and comprises the following steps: acquiring the recall rate expected for the current multimedia resource detection; determining, according to a pre-stored mapping relation between recall rates and threshold combinations, the threshold combination used by a plurality of classification models at that recall rate, wherein the threshold combination is the combination that obtains the highest detection accuracy at the recall rate; for any multimedia resource in the set, classifying the multimedia resource through the plurality of classification models respectively to obtain the classification information output by each classification model; and if, for at least one of the classification models, the classification information output by the classification model is greater than or equal to the classification threshold used by that classification model, determining that the multimedia resource is the target multimedia resource. The embodiments improve the detection accuracy for multimedia resources.

Description

Multimedia resource detection method, device and equipment and storage medium
Technical Field
The present disclosure relates to the field of communications, and in particular, to a method, an apparatus, and a device for detecting multimedia resources and a storage medium.
Background
With the increasing amount of short video uploads, the supervision of short video content becomes increasingly important. Hundreds of millions of videos of the short video platform are uploaded by users every day, and if all videos are audited manually, huge labor cost is consumed. At present, a man-machine cooperation auditing mode is adopted, and a machine samples videos which possibly have problems by using a deep learning algorithm and then sends the videos to professional auditors for auditing so as to reduce huge labor cost.
However, it is difficult to sample all problematic videos from billions of video samples by using a single classification model. In the related art, a plurality of classification models are used to sample video samples, but how to detect problematic videos from a large number of videos with high performance remains a technical problem to be solved.
Disclosure of Invention
The disclosure provides a method, a device and equipment for detecting multimedia resources and a storage medium, so as to improve the detection accuracy of the multimedia resources. The technical scheme of the disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, a method for detecting a multimedia resource is provided, where the method is used to detect a target multimedia resource from a set of multimedia resources to be detected; the method comprises the following steps:
acquiring a recall rate expected by the detection of the current multimedia resource, wherein the recall rate is the ratio of the detected data volume of the target multimedia resource to the actual data volume, and the actual data volume is the data volume of the target multimedia resource actually included in the multimedia resource set;
determining threshold value combinations used by a plurality of classification models corresponding to the recall rate according to a mapping relation between the pre-stored recall rate and the threshold value combinations; wherein the threshold combination comprises classification thresholds respectively used by the plurality of classification models, and the threshold combination is a combination for obtaining the highest detection accuracy at the recall rate;
for any multimedia resource in the multimedia resource set, classifying the multimedia resource through the plurality of classification models respectively to obtain classification information output by the plurality of classification models respectively, wherein the classification information is used for expressing the probability that the multimedia resource is the target multimedia resource;
if, for at least one classification model of the plurality of classification models, the classification information output by the classification model is greater than or equal to the classification threshold correspondingly used by that classification model, determining that the multimedia resource is the target multimedia resource.
In an embodiment, before determining the threshold combinations used by the classification models corresponding to the recall rates according to the mapping relationship between the pre-stored recall rates and the threshold combinations, the method for detecting the multimedia resources further includes:
obtaining multimedia resource sample data;
in a preset interval, all threshold combinations of the multiple classification models are obtained according to a preset traversal rule;
updating the plurality of classification models by using each threshold combination, and detecting the multimedia resource sample data by using the plurality of updated classification models to obtain the recall rate and the detection accuracy rate corresponding to each threshold combination;
and recording and storing the highest detection accuracy and the corresponding threshold combination under each recall rate.
In an embodiment, the detecting the multimedia resource sample data by using the updated classification models to obtain the recall rate and the detection accuracy rate corresponding to each threshold combination includes:
acquiring target multimedia resource data which needs to be detected by using each updated classification model aiming at each threshold combination;
merging all acquired target multimedia resource data needing to be detected to obtain a first data set;
detecting the multimedia resource sample data by using the updated classification models to obtain a plurality of target multimedia resource sample data;
merging the sample data of the plurality of target multimedia resources to obtain a second data set;
calculating a first ratio of the amount of sample data in the second data set that was marked in advance as the target multimedia resource to the amount of target multimedia resource sample data in the multimedia resource sample data, and taking the first ratio as the recall rate corresponding to the current threshold combination;
and calculating a second ratio of the amount of sample data in the second data set that was marked in advance as the target multimedia resource to the amount of data in the first data set, and taking the second ratio as the detection accuracy corresponding to the current threshold combination.
In an embodiment, the merging all the acquired target multimedia resource data that need to be detected includes:
removing repeated data from all the acquired target multimedia resource data needing to be detected; or
The merging the multiple target multimedia resource sample data includes:
and removing repeated data from the plurality of target multimedia resource sample data.
In an embodiment, the method for detecting a multimedia resource further includes:
and if the classification information output by all the classification models is smaller than the corresponding classification threshold, determining that the multimedia resource is not the target multimedia resource.
According to a second aspect of the embodiments of the present disclosure, there is provided a multimedia resource detection apparatus, configured to detect a target multimedia resource from a set of multimedia resources to be detected; the device comprises:
an obtaining module configured to obtain a recall rate expected by detection of a current multimedia resource, where the recall rate is a ratio of a detected data volume of the target multimedia resource to an actual data volume of the target multimedia resource, and the actual data volume is a data volume of the target multimedia resource actually included in the multimedia resource set;
the determining module is configured to determine threshold combinations used by a plurality of classification models corresponding to the recall rates acquired by the acquiring module according to a mapping relation between pre-stored recall rates and the threshold combinations; wherein the threshold combination comprises classification thresholds respectively used by the plurality of classification models, and the threshold combination is a combination for obtaining the highest detection accuracy at the recall rate;
a processing module configured to, for any multimedia resource in the multimedia resource set, perform classification processing on the multimedia resource by using the plurality of classification models of the threshold combination determined by the determining module, respectively, to obtain classification information output by the plurality of classification models, respectively, where the classification information is used to represent a probability that the multimedia resource is the target multimedia resource;
a first detection module configured to determine that the multimedia resource is the target multimedia resource if, for at least one classification model of the plurality of classification models, the classification information output by the classification model and obtained by the processing module is greater than or equal to the classification threshold correspondingly used by that classification model.
In an embodiment, the apparatus for detecting multimedia resources further includes:
the first obtaining module is configured to obtain multimedia resource sample data before the determining module determines the threshold value combination used by the plurality of classification models corresponding to the recall rate according to the mapping relation between the pre-stored recall rate and the threshold value combination;
the second obtaining module is configured to obtain all threshold combinations of the multiple classification models according to a preset traversal rule in a preset interval;
the updating detection module is configured to update the plurality of classification models by using each threshold combination obtained by the second obtaining module, and detect the multimedia resource sample data by using the plurality of updated classification models to obtain a recall rate and a detection accuracy rate corresponding to each threshold combination;
and the record storage module is configured to record and store, for each recall rate obtained by the updating detection module, the highest detection accuracy and the threshold combination corresponding to that highest detection accuracy.
In one embodiment, the update detection module includes:
the first obtaining submodule is configured to obtain target multimedia resource data which needs to be detected by using each updated classification model aiming at each threshold combination;
the first merging submodule is configured to merge all target multimedia resource data which need to be detected and are acquired by the first obtaining submodule to obtain a first data set;
the detection submodule is configured to detect the multimedia resource sample data by using the updated classification models to obtain a plurality of target multimedia resource sample data;
the second merging submodule is configured to merge the multiple target multimedia resource sample data obtained by the detection submodule to obtain a second data set;
the first calculation submodule is configured to calculate a first ratio of the amount of sample data in the second data set obtained by the second merging submodule that was marked in advance as the target multimedia resource to the amount of target multimedia resource sample data in the multimedia resource sample data, and to take the first ratio as the recall rate corresponding to the current threshold combination;
and the second calculation submodule is configured to calculate a second ratio of the amount of sample data in the second data set obtained by the second merging submodule that was marked in advance as the target multimedia resource to the amount of data in the first data set obtained by the first merging submodule, and to take the second ratio as the detection accuracy corresponding to the current threshold combination.
In an embodiment, the first merging submodule is configured to:
removing repeated data from all the acquired target multimedia resource data needing to be detected; or
The second merge sub-module configured to:
and removing repeated data from the plurality of target multimedia resource sample data.
In an embodiment, the apparatus for detecting multimedia resources further includes:
the second detection module is configured to determine that the multimedia resource is not the target multimedia resource if the classification information output by all the classification models obtained by the processing module is smaller than the corresponding classification threshold.
According to a third aspect of the embodiments of the present disclosure, there is provided a multimedia resource detection apparatus, including:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the method for detecting the multimedia resource.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a storage medium storing instructions that, when executed by a processor of a multimedia resource detection device, enable the multimedia resource detection device to perform the above-mentioned multimedia resource detection method.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
The recall rate expected for the current multimedia resource detection is obtained, and the threshold combination used by a plurality of classification models at that recall rate is determined according to the pre-stored mapping relation between recall rates and threshold combinations. Because the threshold combination is the combination that obtains the highest detection accuracy at the recall rate, detecting multimedia resources with the plurality of classification models using that threshold combination yields the highest detection accuracy available at that recall rate, so the detection accuracy for multimedia resources can be improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
Fig. 1 is a flowchart illustrating a method for detecting a multimedia resource according to an exemplary embodiment of the present disclosure.
Fig. 2 is a flowchart illustrating recording and saving the highest detection accuracy and its corresponding threshold combination at each recall rate, according to an exemplary embodiment of the present disclosure.
Fig. 3 is a block diagram illustrating a multimedia asset detection apparatus according to an exemplary embodiment of the present disclosure.
Fig. 4 is a block diagram illustrating another multimedia asset detection apparatus according to an exemplary embodiment of the present disclosure.
Fig. 5 is a block diagram illustrating another multimedia asset detection apparatus according to an exemplary embodiment of the present disclosure.
Fig. 6 is a block diagram illustrating another multimedia asset detection apparatus according to an exemplary embodiment of the present disclosure.
Fig. 7 is a block diagram illustrating a multimedia asset detection device according to an exemplary embodiment of the present disclosure.
Fig. 8 is a block diagram of an apparatus for multimedia resource detection according to an exemplary embodiment of the disclosure.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Fig. 1 is a flowchart illustrating a method for detecting a multimedia resource according to an exemplary embodiment of the present disclosure, where as shown in fig. 1, the method for detecting a multimedia resource is applicable to a multimedia resource detection device, and the method is used to detect a target multimedia resource from a set of multimedia resources to be detected, and the method for detecting a multimedia resource includes the following steps:
in step S101, a recall rate expected by the detection of the current multimedia resource is obtained, where the recall rate is a ratio of a detected data volume of the target multimedia resource to an actual data volume of the target multimedia resource, and the actual data volume is a data volume of the target multimedia resource actually included in the multimedia resource set.
In this embodiment, the multimedia resource detection device may obtain a current service requirement, where the current service requirement may be a detection data amount expected by the target multimedia resource.
The target multimedia resource refers to a multimedia resource with a problem, and the current multimedia resource to be detected may include, but is not limited to, a video or an image.
After obtaining the detection data amount expected for the target multimedia resource, the multimedia resource detection device may calculate the ratio of that expected detection data amount to the data amount of the target multimedia resource actually included in the current multimedia resource set; this ratio is the recall rate expected for the current multimedia resource detection.
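The ratio described above can be sketched as follows. This is only an illustration: the function name `expected_recall`, its parameter names, and the example figures are hypothetical and do not come from the disclosure, and the sketch assumes the actual amount of target resources in the set is known or estimated beforehand.

```python
def expected_recall(expected_detected: int, actual_targets: int) -> float:
    """Return the expected recall rate: the expected detected amount of
    target resources divided by the amount actually present in the set."""
    if actual_targets <= 0:
        raise ValueError("actual_targets must be positive")
    if expected_detected < 0:
        raise ValueError("expected_detected must not be negative")
    return expected_detected / actual_targets


# Hypothetical figures: 800 problematic videos expected to be detected
# out of 1000 actually present in the set.
recall = expected_recall(expected_detected=800, actual_targets=1000)
```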
In step S102, determining a threshold combination used by a plurality of classification models corresponding to a recall rate according to a pre-stored mapping relationship between the recall rate and the threshold combination; the threshold combination comprises a plurality of classification thresholds which are respectively correspondingly used by the classification models, and the threshold combination is the combination which obtains the highest detection accuracy under the recall rate.
In order to determine the threshold combinations used by the classification models corresponding to the recall rates, in this embodiment, the highest detection accuracy rate and the corresponding threshold combination at each recall rate may be stored in advance.
For example, if the number of classification models is n, the pre-stored correspondence may be (C1, P1, R11, R12 … R1n), (C2, P2, R21, R22 … R2n) … (Cm, Pm, Rm1, Rm2 … Rmn), where Ci is the i-th recall rate, Pi is the highest detection accuracy at Ci, and Ri1, Ri2 … Rin are the classification thresholds used by the n classification models, i.e., (Ri1, Ri2 … Rin) is the threshold combination for the n classification models, and i is an integer greater than or equal to 1 and less than or equal to m.
In this embodiment, after obtaining the recall rate expected by detection, the threshold combinations used by the classification models corresponding to the recall rate may be determined according to the mapping relationship between the pre-stored recall rate and the threshold combinations.
Since the threshold combination is a combination that achieves the highest detection accuracy at the recall rate, the detection accuracy achieved by detecting the multimedia resource using the plurality of classification models using the threshold combination is the highest, i.e., the detection accuracy can be improved by detecting the multimedia resource using the plurality of classification models using the threshold combination.
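A minimal sketch of such a lookup is given below. The names `Entry`, `lookup_thresholds` and `TABLE`, as well as all numeric values, are hypothetical. The disclosure does not say what happens when the requested recall rate has no exact entry; the sketch assumes the stored entry with the smallest recall rate that is still at least the requested one is used.

```python
from typing import NamedTuple, Sequence, Tuple


class Entry(NamedTuple):
    recall: float                  # Ci
    accuracy: float                # Pi, highest accuracy found at Ci
    thresholds: Tuple[float, ...]  # (Ri1, ..., Rin), one per model


def lookup_thresholds(table: Sequence[Entry], wanted_recall: float) -> Entry:
    """Pick the stored entry for the requested recall rate.

    ASSUMPTION (not in the source): without an exact match, fall back to
    the entry with the smallest recall rate >= wanted_recall.
    """
    candidates = [e for e in table if e.recall >= wanted_recall]
    if not candidates:
        raise LookupError("no stored entry reaches the requested recall rate")
    return min(candidates, key=lambda e: e.recall)


# Hypothetical pre-stored mapping for n = 2 classification models.
TABLE = [
    Entry(0.70, 0.90, (0.8, 0.9)),
    Entry(0.80, 0.85, (0.7, 0.8)),
    Entry(0.90, 0.75, (0.5, 0.6)),
]

chosen = lookup_thresholds(TABLE, 0.75)
```

In this example the request for a recall rate of 0.75 resolves to the entry stored for 0.80, whose threshold combination is then applied to both models.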
In step S103, for any multimedia resource in the multimedia resource set, the multimedia resource is classified by the plurality of classification models, so as to obtain classification information output by the plurality of classification models, where the classification information is used to indicate a probability that the multimedia resource is a target multimedia resource.
In step S104, if, for at least one of the classification models, the classification information output by the classification model is greater than or equal to the classification threshold correspondingly used by that classification model, it is determined that the multimedia resource is the target multimedia resource.
Optionally, the method for detecting a multimedia resource may further include: and if the classification information output by all the classification models is smaller than the corresponding classification threshold, determining that the multimedia resource is not the target multimedia resource.
For example, the plurality of classification models are model 1, model 2 … model n, and the classification thresholds used by them are R1, R2 … Rn, respectively. The classification information obtained after the video 1 to be detected is input into the classification models is T1, T2 … Tn, respectively. If T1 < R1, T2 < R2 … and Tn < Rn, the video 1 to be detected is not the target video, that is, it has no problem. If at least one model meets its threshold, for example T1 ≥ R1, the video 1 to be detected is the target video, that is, it has a problem.
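The decision rule in this example can be written as a short sketch. The function name `is_target` and the sample scores are hypothetical; the comparison itself (a resource is flagged as soon as any Ti is greater than or equal to its Ri) follows steps S103 and S104.

```python
from typing import Sequence


def is_target(scores: Sequence[float], thresholds: Sequence[float]) -> bool:
    """Return True if at least one model's score Ti reaches its threshold Ri."""
    if len(scores) != len(thresholds):
        raise ValueError("one threshold is required per classification model")
    return any(t >= r for t, r in zip(scores, thresholds))


# Hypothetical outputs of two models for two videos, thresholds (0.5, 0.8).
flagged = is_target([0.2, 0.9], [0.5, 0.8])  # second model reaches 0.8
clean = is_target([0.2, 0.7], [0.5, 0.8])    # both models stay below
```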
In the above embodiment, the recall rate expected for the current multimedia resource detection is obtained, and the threshold combination used by the plurality of classification models at that recall rate is determined according to the pre-stored mapping relationship between recall rates and threshold combinations. Because the threshold combination is the combination that obtains the highest detection accuracy at the recall rate, detecting the multimedia resource with the plurality of classification models using that threshold combination yields the highest detection accuracy available at that recall rate, which improves the detection accuracy for multimedia resources.
Fig. 2 is a flowchart illustrating an exemplary embodiment of the present disclosure to record and store the highest detection accuracy and the corresponding threshold combination for each recall rate, and as shown in fig. 2, before the step S102, the method for detecting a multimedia resource may further include:
in step S201, multimedia resource sample data is obtained.
The multimedia resource sample data includes multimedia resource sample data with a problem and multimedia resource sample data without a problem, and the multimedia resource sample data may be marked by a label (label), for example, for the multimedia resource sample data with a problem, that is, the target multimedia resource sample data, the label may be set to 1.
In step S202, all threshold combinations of the multiple classification models are obtained according to a preset traversal rule within a preset interval.
Obtaining all threshold combinations of the plurality of classification models according to the preset traversal rule may be: for each classification model, increasing the classification threshold step by step by a preset threshold variation. The preset interval is [0,1], and the threshold variation may be x, where x ∈ [0,1] and mx ∈ [0,1] for a positive integer m; in other words, the integer multiples of x that are traversed also lie in [0,1].
For example, for n classification models, the classification threshold of the i-th classification model is denoted as ti. Assuming that the threshold variation x is 0.1, tn is any one of 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1, t(n-1) is any one of 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1, and so on. When the threshold of each classification model traverses all of these values, all threshold combinations of the n classification models are obtained.
In this embodiment, the threshold variation x of 0.1 is for illustration only. The value of x may be any value in [0,1] whose traversed integer multiples also lie in [0,1]; for example, x may be 0.15, 0.2, 0.23, and the like.
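The traversal can be sketched with a Cartesian product over the per-model threshold values. The function names `threshold_values` and `all_combinations` are hypothetical. The sketch assumes a strictly positive step (a step of 0 would never advance) and rounds each value to suppress floating-point drift; neither detail is stated in the disclosure.

```python
import math
from itertools import product
from typing import Iterator, List, Tuple


def threshold_values(step: float) -> List[float]:
    """Return 0, x, 2x, ... for every integer multiple of x within [0, 1]."""
    if not 0 < step <= 1:
        raise ValueError("step must be in (0, 1]")  # assumption: x > 0
    count = math.floor(1.0 / step + 1e-9)  # largest m with m * x <= 1
    return [round(k * step, 10) for k in range(count + 1)]


def all_combinations(n_models: int, step: float) -> Iterator[Tuple[float, ...]]:
    """Yield every threshold combination (t1, ..., tn) on the grid."""
    return product(threshold_values(step), repeat=n_models)


values = threshold_values(0.1)
combos = list(all_combinations(n_models=2, step=0.1))
```

With x = 0.1 each model has 11 candidate thresholds, so n models give 11 to the power n combinations; the exhaustive traversal is therefore practical only for a small number of models or a coarse step.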
In step S203, the multiple classification models are updated by using each threshold combination, and the multimedia resource sample data is detected by using the updated multiple classification models, so as to obtain a recall rate and a detection accuracy rate corresponding to each threshold combination.
For each threshold combination, the target multimedia resource data that needs to be detected is acquired by using each updated classification model, and all the acquired target multimedia resource data that needs to be detected is merged to obtain a first data set; the multimedia resource sample data is detected by using the updated classification models to obtain a plurality of pieces of target multimedia resource sample data, which are merged to obtain a second data set; and the recall rate and the detection accuracy corresponding to each threshold combination are calculated based on the first data set, the second data set, and the target multimedia resource sample data in the multimedia resource sample data.
All the acquired target multimedia resource data that needs to be detected may be merged by removing repeated data from it. Likewise, the plurality of pieces of target multimedia resource sample data may be merged by removing repeated data from them.
In this embodiment, the classification models are updated by using each threshold combination, and the multimedia resource sample data is detected by using the updated classification models. If a piece of multimedia resource sample data is detected to be target multimedia resource sample data, its label may be set to 1.
In this embodiment, a first ratio of the amount of sample data in the second data set that was marked in advance as the target multimedia resource to the amount of target multimedia resource sample data in the multimedia resource sample data may be used as the recall rate, and a second ratio of the amount of sample data in the second data set that was marked in advance as the target multimedia resource to the amount of data in the first data set may be used as the detection accuracy.
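The two ratios can be sketched as follows for one threshold combination. The function name `evaluate`, the sample identifiers, scores and labels are all hypothetical. The disclosure does not spell out how the first data set differs from the second; the sketch ASSUMES that its size equals the number of de-duplicated samples flagged by the models, which makes the detection accuracy the usual precision.

```python
from typing import Dict, Sequence, Set, Tuple


def evaluate(
    scores: Dict[str, Sequence[float]],
    labels: Dict[str, int],
    thresholds: Sequence[float],
) -> Tuple[float, float]:
    """Return (recall, accuracy) for one threshold combination."""
    # Samples flagged by each model, then merged with duplicates removed
    # (the "second data set").
    per_model = [
        {sid for sid, s in scores.items() if s[i] >= thr}
        for i, thr in enumerate(thresholds)
    ]
    second_set: Set[str] = set().union(*per_model)
    true_hits = sum(1 for sid in second_set if labels[sid] == 1)
    total_targets = sum(1 for v in labels.values() if v == 1)
    recall = true_hits / total_targets if total_targets else 0.0
    # ASSUMPTION: the first data set has the same size as the flagged set.
    accuracy = true_hits / len(second_set) if second_set else 0.0
    return recall, accuracy


# Hypothetical sample data: per-sample scores of two models and labels
# (1 = problematic, i.e. target multimedia resource sample data).
scores = {
    "v1": (0.9, 0.2), "v2": (0.4, 0.7), "v3": (0.1, 0.1),
    "v4": (0.8, 0.9), "v5": (0.85, 0.1),
}
labels = {"v1": 1, "v2": 0, "v3": 1, "v4": 1, "v5": 0}
recall, accuracy = evaluate(scores, labels, thresholds=(0.8, 0.6))
```

Here the models flag v1, v2, v4 and v5 after de-duplication; two of them are labeled 1 out of three labeled samples in total, giving a recall rate of 2/3 and an accuracy of 1/2.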
In step S204, the highest detection accuracy and the corresponding threshold combination at each recall rate are recorded and saved.
For the same recall rate, the currently obtained detection accuracy is recorded if it is greater than the previously recorded detection accuracy; otherwise, the previously recorded detection accuracy is kept.
When the highest detection accuracy at each recall rate is recorded and saved, the threshold combination corresponding to that highest detection accuracy is recorded and saved as well.
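The bookkeeping in step S204 can be sketched as a single pass over the evaluation results. The function name `best_per_recall` and the example results are hypothetical. Because recall rates are floating-point values, the sketch assumes they are rounded to a fixed number of digits before being used as keys; the disclosure does not specify how recall rates are grouped.

```python
from typing import Dict, Iterable, Tuple

Thresholds = Tuple[float, ...]
Result = Tuple[float, float, Thresholds]  # (recall, accuracy, thresholds)


def best_per_recall(
    results: Iterable[Result], ndigits: int = 2
) -> Dict[float, Tuple[float, Thresholds]]:
    """Keep, for each recall rate, the highest accuracy and its thresholds."""
    best: Dict[float, Tuple[float, Thresholds]] = {}
    for recall, accuracy, thresholds in results:
        key = round(recall, ndigits)  # assumption: bucket by rounded recall
        # Overwrite only when the new accuracy beats the recorded one.
        if key not in best or accuracy > best[key][0]:
            best[key] = (accuracy, thresholds)
    return best


# Hypothetical evaluation results for four threshold combinations.
results = [
    (0.80, 0.70, (0.5, 0.6)),
    (0.80, 0.75, (0.6, 0.5)),
    (0.90, 0.60, (0.3, 0.4)),
    (0.80, 0.72, (0.4, 0.7)),
]
table = best_per_recall(results)
```

The resulting dictionary plays the role of the pre-stored mapping between recall rates and threshold combinations that step S102 consults.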
Alternatively, a variation curve may be generated from the highest detection accuracy at each recall rate. For example, with the recall rate as the horizontal axis and the detection accuracy as the vertical axis, a point is determined for each recall rate and its corresponding highest detection accuracy, and the points are connected to generate the variation curve, so that the detection accuracy corresponding to a given recall rate can be conveniently read from the curve.
The highest detection accuracy at each recall rate and its corresponding threshold combination may be stored in a database.
In the above embodiment, all the threshold combinations of the multiple classification models are obtained according to the preset traversal rule in the preset interval, the multiple classification models are updated by using each threshold combination, the multimedia resource sample data is detected by using the updated multiple classification models, the recall rate and the detection accuracy rate corresponding to each threshold combination are obtained, the highest detection accuracy rate and the threshold combination corresponding to each recall rate are recorded and stored, and conditions are provided for subsequently determining the threshold combinations used by the multiple classification models corresponding to the recall rate according to the recall rate.
Fig. 3 is a block diagram of a multimedia resource detection apparatus for detecting a target multimedia resource from a set of multimedia resources to be detected according to an exemplary embodiment of the present disclosure. Referring to fig. 3, the apparatus includes:
the obtaining module 31 is configured to obtain a recall rate expected for the current multimedia resource detection, where the recall rate is the ratio of the detected data volume of the target multimedia resource to an actual data volume, and the actual data volume is the data volume of the target multimedia resource actually included in the multimedia resource set.
The determining module 32 is configured to determine, according to a pre-stored mapping relationship between recall rates and threshold combinations, the threshold combination used by the plurality of classification models corresponding to the recall rate obtained by the obtaining module 31; the threshold combination comprises the classification thresholds respectively used by the plurality of classification models, and is the combination that obtains the highest detection accuracy at that recall rate.
The processing module 33 is configured to, for any multimedia resource in the multimedia resource set, classify the multimedia resource with each of the plurality of classification models that use the threshold combination determined by the determining module 32, to obtain classification information output by each of the plurality of classification models, where the classification information indicates the probability that the multimedia resource is the target multimedia resource.
The first detection module 34 is configured to determine that the multimedia resource is the target multimedia resource if at least one classification model of the plurality of classification models satisfies the following condition: the classification information output by that classification model, as obtained by the processing module 33, is greater than or equal to the classification threshold used by that classification model.
In the above embodiment, the recall rate expected for the current multimedia resource detection is obtained, and the threshold combination used by the plurality of classification models corresponding to that recall rate is determined according to the pre-stored mapping relationship between recall rates and threshold combinations. Because the threshold combination is the combination that obtains the highest detection accuracy at that recall rate, detecting multimedia resources with the plurality of classification models using this threshold combination yields the highest detection accuracy, and the detection accuracy of the multimedia resources is thereby improved.
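The cooperation of the modules described above can be sketched as follows. All names here are hypothetical: each classification model is stood in for by a plain function returning a probability, the mapping is a dictionary from recall rate to threshold combination, and choosing the nearest recorded recall rate when the expected one is not stored exactly is an assumption that the disclosure does not state.

```python
def lookup_thresholds(mapping, expected_recall):
    """Pick the threshold combination stored for the nearest recall rate (assumption)."""
    nearest = min(mapping, key=lambda recall: abs(recall - expected_recall))
    return mapping[nearest]

def is_target(probabilities, thresholds):
    """True if at least one model's probability reaches that model's own threshold."""
    return any(p >= t for p, t in zip(probabilities, thresholds))

def detect(resources, models, mapping, expected_recall):
    """Return the resources judged to be target multimedia resources."""
    thresholds = lookup_thresholds(mapping, expected_recall)
    return [
        r for r in resources
        if is_target([model(r) for model in models], thresholds)
    ]

# Toy stand-ins for two classification models (hypothetical field names).
models = [lambda r: r["image_score"], lambda r: r["text_score"]]
mapping = {0.8: (0.7, 0.6), 0.9: (0.5, 0.4)}
resources = [
    {"id": 1, "image_score": 0.55, "text_score": 0.1},
    {"id": 2, "image_score": 0.2, "text_score": 0.3},
    {"id": 3, "image_score": 0.1, "text_score": 0.4},
]
detected_ids = [r["id"] for r in detect(resources, models, mapping, 0.9)]
```

A resource for which every model's output stays below its threshold is simply left out of the result, which corresponds to determining that it is not the target multimedia resource.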
Fig. 4 is a block diagram of another multimedia resource detection apparatus according to an exemplary embodiment of the disclosure, and as shown in fig. 4, based on the embodiment shown in fig. 3, the multimedia resource detection apparatus may further include:
the first obtaining module 35 is configured to obtain multimedia resource sample data before the determining module 32 determines the threshold combinations used by the plurality of classification models corresponding to the recall rates according to the pre-stored mapping relationship between the recall rates and the threshold combinations.
The second obtaining module 36 is configured to obtain all threshold combinations of the plurality of classification models according to a preset traversal rule within a preset interval.
The update detecting module 37 is configured to update the plurality of classification models by using each threshold combination obtained by the second obtaining module 36, and detect the multimedia resource sample data by using the plurality of updated classification models, so as to obtain a recall rate and a detection accuracy rate corresponding to each threshold combination.
The record keeping module 38 is configured to record and keep the highest detection accuracy and the corresponding threshold combination thereof at each recall rate obtained by the update detection module 37.
Fig. 5 is a block diagram of another multimedia resource detection apparatus according to an exemplary embodiment of the disclosure. As shown in fig. 5, on the basis of the embodiment shown in fig. 4, the update detection module 37 may include:
the first obtaining submodule 371 is configured to obtain, for each threshold combination, the target multimedia resource data that needs to be detected with each updated classification model.
The first merging submodule 372 is configured to merge all the target multimedia resource data that needs to be detected, as obtained by the first obtaining submodule 371, to obtain a first data set.
The detection submodule 373 is configured to detect the multimedia resource sample data with the updated classification models to obtain a plurality of target multimedia resource sample data.
The second merging submodule 374 is configured to merge the plurality of target multimedia resource sample data obtained by the detection submodule 373 to obtain a second data set.
The first calculating submodule 375 is configured to calculate a first ratio of the sample data size marked in advance as the target multimedia resource in the second data set obtained by the second merging submodule 374 to the sample data size of the target multimedia resource in the multimedia resource sample data, and to take the first ratio as the recall rate corresponding to the current threshold combination.
The second calculating submodule 376 is configured to calculate a second ratio of the sample data size marked in advance as the target multimedia resource in the second data set obtained by the second merging submodule 374 to the data size in the first data set obtained by the first merging submodule 372, and to take the second ratio as the detection accuracy corresponding to the current threshold combination.
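The two ratios can be illustrated with set operations. This sketch makes a simplifying assumption that is not stated in the disclosure: the first data set and the second data set are both taken to be the same deduplicated union of the samples flagged by the individual models. The sample identifiers are made up.

```python
def merge(per_model_hits):
    """Union of the per-model hit sets; using a set removes repeated samples."""
    merged = set()
    for hits in per_model_hits:
        merged |= set(hits)
    return merged

def recall_and_accuracy(per_model_hits, labelled_targets):
    """Return (recall, detection accuracy) for one threshold combination."""
    detected = merge(per_model_hits)
    true_hits = detected & labelled_targets  # detected samples marked in advance as target
    recall = len(true_hits) / len(labelled_targets) if labelled_targets else 0.0
    accuracy = len(true_hits) / len(detected) if detected else 0.0
    return recall, accuracy

# Two models flag overlapping samples; "x" is a false detection, "d" is missed.
per_model_hits = [{"a", "b", "x"}, {"b", "c"}]
labelled_targets = {"a", "b", "c", "d"}
ratios = recall_and_accuracy(per_model_hits, labelled_targets)
```

In this toy case three of the four labelled targets are detected and three of the four detected samples are labelled targets, so both ratios come out to 0.75.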
Fig. 6 is a block diagram of another multimedia resource detection apparatus according to an exemplary embodiment of the disclosure, and as shown in fig. 6, on the basis of the embodiment shown in fig. 3, the apparatus may further include:
the second detecting module 35 is configured to determine that the multimedia resource is not the target multimedia resource if the classification information output by all the classification models obtained by the processing module 33 is smaller than the corresponding classification threshold.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 7 is a block diagram illustrating a multimedia resource detection device according to an exemplary embodiment of the present disclosure. As shown in fig. 7, the multimedia resource detection device includes a processor 710 and a memory 720 for storing instructions executable by the processor 710, wherein the processor 710 is configured to execute the instructions to implement the multimedia resource detection method described above. In addition to the processor 710 and the memory 720 shown in fig. 7, the multimedia resource detection device may further include other hardware according to its actual functions, which is not described in detail herein.
In an exemplary embodiment, a storage medium comprising instructions is also provided, such as the memory 720 comprising instructions, where the instructions are executable by the processor 710 to perform the multimedia resource detection method. Optionally, the storage medium may be a non-transitory computer-readable storage medium, for example a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.
Fig. 8 is a block diagram illustrating an apparatus for multimedia asset detection according to an exemplary embodiment. For example, the apparatus 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or other terminal device.
Referring to fig. 8, the apparatus 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
One of the processors 820 in the processing component 802 may be configured to:
acquiring a recall rate expected by the current multimedia resource detection, wherein the recall rate is the ratio of the detection data volume of a target multimedia resource to the actual data volume, and the actual data volume is the data volume of the target multimedia resource actually included in a multimedia resource set;
determining threshold value combinations used by a plurality of classification models corresponding to the recall rate according to a mapping relation between the pre-stored recall rate and the threshold value combinations; the threshold combination comprises a plurality of classification thresholds which are respectively and correspondingly used by the classification models, and the threshold combination is the combination which obtains the highest detection accuracy under the recall rate;
for any multimedia resource in the multimedia resource set, classifying the multimedia resource through a plurality of classification models respectively to obtain classification information output by the plurality of classification models respectively, wherein the classification information is used for expressing the probability that the multimedia resource is a target multimedia resource;
determining that the multimedia resource is the target multimedia resource if at least one classification model of the plurality of classification models satisfies the following condition: the classification information output by that classification model is greater than or equal to the classification threshold used by that classification model.
The memory 804 is configured to store various types of data to support operation at the device 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power components 806 provide power to the various components of device 800. The power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the device 800. For example, the sensor assembly 814 may detect the open/closed state of the device 800 and the relative positioning of components, such as the display and keypad of the apparatus 800. The sensor assembly 814 may also detect a change in position of the apparatus 800 or of a component of the apparatus 800, the presence or absence of user contact with the apparatus 800, the orientation or acceleration/deceleration of the apparatus 800, and a change in temperature of the apparatus 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communications between the apparatus 800 and other devices in a wired or wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, communications component 816 further includes a Near Field Communications (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described multimedia resource detection method.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method for detecting a target multimedia resource from a set of multimedia resources to be detected, the method comprising:
acquiring a recall rate expected by the detection of the current multimedia resource, wherein the recall rate is the ratio of the detected data volume of the target multimedia resource to the actual data volume, and the actual data volume is the data volume of the target multimedia resource actually included in the multimedia resource set;
determining threshold value combinations used by a plurality of classification models corresponding to the recall rate according to a mapping relation between the pre-stored recall rate and the threshold value combinations; wherein the threshold combination comprises classification thresholds respectively used by the plurality of classification models, and the threshold combination is a combination for obtaining the highest detection accuracy at the recall rate;
for any multimedia resource in the multimedia resource set, classifying the multimedia resource through the plurality of classification models respectively to obtain classification information output by the plurality of classification models respectively, wherein the classification information is used for expressing the probability that the multimedia resource is the target multimedia resource;
determining that the multimedia resource is the target multimedia resource if at least one classification model of the plurality of classification models satisfies the following condition: the classification information output by that classification model is greater than or equal to the classification threshold used by that classification model.
2. The method according to claim 1, wherein before determining the threshold combinations used by the classification models corresponding to the recall rates according to the pre-stored mapping relationship between the recall rates and the threshold combinations, the method further comprises:
obtaining multimedia resource sample data;
obtaining all threshold combinations of the plurality of classification models within a preset interval according to a preset traversal rule;
updating the plurality of classification models by using each threshold combination, and detecting the multimedia resource sample data by using the plurality of updated classification models to obtain the recall rate and the detection accuracy rate corresponding to each threshold combination;
and recording and storing the highest detection accuracy and the corresponding threshold combination under each recall rate.
3. The method according to claim 2, wherein the detecting the multimedia resource sample data by using the updated classification models to obtain the recall rate and the detection accuracy rate corresponding to each threshold combination comprises:
acquiring target multimedia resource data which needs to be detected by using each updated classification model aiming at each threshold combination;
merging all acquired target multimedia resource data needing to be detected to obtain a first data set;
detecting the multimedia resource sample data by using the updated classification models to obtain a plurality of target multimedia resource sample data;
merging the sample data of the plurality of target multimedia resources to obtain a second data set;
calculating a first ratio of the sample data size of the target multimedia resource marked in advance in the second data set to the sample data size of the target multimedia resource in the multimedia resource sample data, and taking the first ratio as the recall rate corresponding to the current threshold combination;
and calculating a second ratio of the sample data size marked as the target multimedia resource in the second data set in advance to the data size in the first data set, and taking the second ratio as the detection accuracy corresponding to the current threshold combination.
4. The method according to claim 3, wherein the merging all the acquired target multimedia resource data needing to be detected comprises:
removing repeated data from all the acquired target multimedia resource data needing to be detected; or
the merging the plurality of target multimedia resource sample data comprises:
removing repeated data from the plurality of target multimedia resource sample data.
5. The method for detecting a multimedia resource according to claim 1, further comprising:
and if the classification information output by all the classification models is smaller than the corresponding classification threshold, determining that the multimedia resource is not the target multimedia resource.
6. A device for detecting a multimedia resource, the device being configured to detect a target multimedia resource from a set of multimedia resources to be detected, the device comprising:
an obtaining module configured to obtain a recall rate expected by detection of a current multimedia resource, where the recall rate is a ratio of a detected data volume of the target multimedia resource to an actual data volume of the target multimedia resource, and the actual data volume is a data volume of the target multimedia resource actually included in the multimedia resource set;
the determining module is configured to determine threshold combinations used by a plurality of classification models corresponding to the recall rates acquired by the acquiring module according to a mapping relation between pre-stored recall rates and the threshold combinations; wherein the threshold combination comprises classification thresholds respectively used by the plurality of classification models, and the threshold combination is a combination for obtaining the highest detection accuracy at the recall rate;
a processing module configured to, for any multimedia resource in the multimedia resource set, perform classification processing on the multimedia resource by using the plurality of classification models of the threshold combination determined by the determining module, respectively, to obtain classification information output by the plurality of classification models, respectively, where the classification information is used to represent a probability that the multimedia resource is the target multimedia resource;
a first detection module configured to determine that the multimedia resource is the target multimedia resource if at least one classification model of the plurality of classification models satisfies the following condition: the classification information output by that classification model, as obtained by the processing module, is greater than or equal to the classification threshold used by that classification model.
7. The apparatus for detecting multimedia resources according to claim 6, further comprising:
the first obtaining module is configured to obtain multimedia resource sample data before the determining module determines the threshold value combination used by the plurality of classification models corresponding to the recall rate according to the mapping relation between the pre-stored recall rate and the threshold value combination;
the second obtaining module is configured to obtain all threshold combinations of the multiple classification models according to a preset traversal rule in a preset interval;
the updating detection module is configured to update the plurality of classification models by using each threshold combination obtained by the second obtaining module, and detect the multimedia resource sample data by using the plurality of updated classification models to obtain a recall rate and a detection accuracy rate corresponding to each threshold combination;
and the record storage module is configured to record and store, for each recall rate obtained by the updating detection module, the highest detection accuracy and its corresponding threshold combination.
8. The apparatus for detecting multimedia resources according to claim 7, wherein the update detection module comprises:
the first obtaining submodule is configured to obtain target multimedia resource data which needs to be detected by using each updated classification model aiming at each threshold combination;
the first merging submodule is configured to merge all target multimedia resource data which need to be detected and are acquired by the first obtaining submodule to obtain a first data set;
the detection submodule is configured to detect the multimedia resource sample data by using the updated classification models to obtain a plurality of target multimedia resource sample data;
the second merging submodule is configured to merge the multiple target multimedia resource sample data obtained by the detection submodule to obtain a second data set;
the first calculating submodule is configured to calculate a first ratio of the sample data size marked in advance as the target multimedia resource in the second data set obtained by the second merging submodule to the sample data size of the target multimedia resource in the multimedia resource sample data, and to take the first ratio as the recall rate corresponding to the current threshold combination;
and the second calculating submodule is configured to calculate a second ratio of the sample data size marked in advance as the target multimedia resource in the second data set obtained by the second merging submodule to the data size in the first data set obtained by the first merging submodule, and to take the second ratio as the detection accuracy corresponding to the current threshold combination.
9. A multimedia asset detection device, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the method of detecting a multimedia asset of any of claims 1 to 5.
10. A storage medium, characterized in that instructions in the storage medium, when executed by a processor of a multimedia asset detection device, enable the multimedia asset detection device to perform a multimedia asset detection method according to any of claims 1 to 5.
CN201911403935.3A 2019-12-30 2019-12-30 Method, device and equipment for detecting multimedia resources and storage medium Active CN111125388B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911403935.3A CN111125388B (en) 2019-12-30 2019-12-30 Method, device and equipment for detecting multimedia resources and storage medium

Publications (2)

Publication Number Publication Date
CN111125388A true CN111125388A (en) 2020-05-08
CN111125388B CN111125388B (en) 2023-12-15

Family

ID=70505886

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911403935.3A Active CN111125388B (en) 2019-12-30 2019-12-30 Method, device and equipment for detecting multimedia resources and storage medium

Country Status (1)

Country Link
CN (1) CN111125388B (en)

Also Published As

Publication number Publication date
CN111125388B (en) 2023-12-15

Similar Documents

Publication Publication Date Title
CN108496317B (en) Method and device for searching public resource set of residual key system information
CN110827253A (en) Training method and device of target detection model and electronic equipment
KR102004079B1 (en) Image type identification method, apparatus, program and recording medium
CN106919629B (en) Method and device for realizing information screening in group chat
CN107480785B (en) Convolutional neural network training method and device
CN111125388B (en) Method, device and equipment for detecting multimedia resources and storage medium
CN107316207B (en) Method and device for acquiring display effect information
CN109565650B (en) Method and device for broadcasting and receiving configuration information of synchronous signal block
CN110312300B (en) Control method, control device and storage medium
CN109214175B (en) Method, device and storage medium for training classifier based on sample characteristics
CN112445832A (en) Data anomaly detection method and device, electronic equipment and storage medium
CN108629814B (en) Camera adjusting method and device
CN107707759B (en) Terminal control method, device and system, and storage medium
CN105227426B (en) Application interface switching method and device and terminal equipment
US11570693B2 (en) Method and apparatus for sending and receiving system information, and user equipment and base station
CN110913276B (en) Data processing method, device, server, terminal and storage medium
CN108012258B (en) Data traffic management method and device for virtual SIM card, terminal and server
CN107885464B (en) Data storage method, device and computer readable storage medium
CN111859097A (en) Data processing method and device, electronic equipment and storage medium
CN111382242A (en) Information providing method, device and readable medium
CN107122356B (en) Method and device for displaying face value and electronic equipment
CN114124866A (en) Session processing method, device, electronic equipment and storage medium
CN110149310B (en) Flow intrusion detection method, device and storage medium
CN113870195A (en) Target map detection model training and map detection method and device
CN109088920B (en) Evaluation method, device and equipment of intelligent sound box and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant