WO2021237570A1 - Image auditing method and apparatus, device, and storage medium - Google Patents

Info

Publication number
WO2021237570A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
file
feature vector
threshold
image file
Prior art date
Application number
PCT/CN2020/092923
Other languages
French (fr)
Chinese (zh)
Inventor
罗茂
Original Assignee
深圳市欢太科技有限公司
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市欢太科技有限公司, Oppo广东移动通信有限公司
Priority to CN202080100202.7A (CN115443490A)
Priority to PCT/CN2020/092923 (WO2021237570A1)
Publication of WO2021237570A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition

Definitions

  • the embodiments of this application relate to Internet technology, and relate to but not limited to image review methods and devices, equipment, and storage media.
  • the image review method provided by the embodiments of the application includes: using a target classification model to perform feature extraction on an image file to be reviewed to obtain a corresponding feature vector, wherein the target classification model is obtained through training with multiple sample image files and corresponding multiple image transformation files; determining the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in a review set; and determining, according to the relationship between the determined similarity and a first threshold, whether the image file to be reviewed is a violation file.
  • the image review device includes: a feature extraction module configured to use a target classification model to perform feature extraction on an image file to be reviewed to obtain a corresponding feature vector, wherein the target classification model is obtained through training with multiple sample image files and corresponding multiple image transformation files; a first determining module configured to determine the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in the review set; and a review module configured to determine, according to the relationship between the determined similarity and the first threshold, whether the image file to be reviewed is a violation file.
  • the electronic device provided by an embodiment of the present application includes a memory and a processor. The memory stores a computer program that can run on the processor, and when the processor executes the program, the steps in the image review method described in any of the embodiments of the present application are implemented.
  • the computer-readable storage medium provided by the embodiment of the present application has a computer program stored thereon, and when the computer program is executed by a processor, it implements the steps in any one of the image review methods described in the embodiment of the present application.
  • the electronic device uses the target classification model to perform feature extraction on the image file to be reviewed to obtain the corresponding feature vector; wherein, the target classification model is obtained through training of multiple sample image files and corresponding multiple image transformation files.
  • FIG. 1 is a schematic diagram of an exemplary application scenario of an image review method according to an embodiment of this application
  • FIG. 2 is a schematic diagram of the implementation process of the image review method according to the embodiment of the application.
  • FIG. 3 is a schematic diagram of the training process of the target classification model according to the embodiment of the application.
  • FIG. 4 is a schematic diagram of the implementation process of the method for generating a review set according to an embodiment of the application
  • FIG. 5 is a schematic diagram of an implementation process of a method for determining a first threshold value according to an embodiment of the application
  • FIG. 6 is a schematic diagram of the implementation process of another image review method according to an embodiment of the application.
  • FIG. 7A is a schematic structural diagram of MobileNetV2 according to an embodiment of the application.
  • FIG. 7B is a schematic structural diagram of a feature extraction structure according to an embodiment of the application.
  • FIG. 8 is a schematic diagram of the implementation process of another image review method according to an embodiment of the application.
  • FIG. 9 is a schematic diagram of the implementation process of yet another image review method according to an embodiment of the application.
  • FIG. 10 is a schematic diagram of the implementation process of another image review method according to an embodiment of the application.
  • FIG. 11 is a schematic diagram of the implementation process of another image review method according to an embodiment of the application.
  • FIG. 12 is a schematic diagram of a transformation operation performed on an original picture according to an embodiment of the application.
  • FIG. 13 is a simplified structural diagram of MobileNetV2 according to an embodiment of the application.
  • FIG. 14 is a schematic diagram of the curve of the sigmoid function.
  • FIG. 15 is a schematic diagram of a process of image matching according to an embodiment of the application.
  • FIG. 16 shows the recall and wrong_recall corresponding to candidate thresholds from 35 to 70 according to an embodiment of the application.
  • FIG. 17 shows the recall and wrong_recall corresponding to candidate thresholds from 50 to 55 according to an embodiment of the application.
  • FIG. 18 is a schematic flowchart of a picture review system according to an embodiment of the application.
  • FIG. 19 is a schematic diagram of the Mobilehashnet algorithm flow in the picture review system according to an embodiment of the application.
  • FIG. 20A is a schematic diagram of the structure of an image file review device according to an embodiment of the application.
  • FIG. 20B is a schematic structural diagram of another image file review device according to an embodiment of the application.
  • FIG. 21 is a schematic diagram of a hardware entity of an electronic device according to an embodiment of the application.
  • the terms "first/second/third" in the embodiments of the present application only distinguish similar or different objects and do not represent a specific order of the objects. Understandably, "first/second/third" may be interchanged in a specific order or sequence where permitted, so that the embodiments of the present application described herein can be implemented in a sequence other than that illustrated or described herein.
  • FIG. 1 is a schematic diagram of an exemplary application scenario 100 of an image review method provided by an embodiment of the present application.
  • the scene 100 includes a terminal 101, an image review device 102 and a second database 103.
  • the image review device 102 is used to review the image file 104 input by the user at the terminal 101 to determine whether the file is a violation file. If it is a violation file, storing the file in the second database 103 is forbidden; otherwise, that is, if the file is a compliant file, the file is allowed to be stored in the second database 103 so that the user or other users can retrieve, browse, or download it.
  • the terminal 101 may be a mobile terminal with wireless communication capabilities, such as a mobile phone, a tablet computer, or a notebook computer, or it may be a desktop computer with computing functions that is inconvenient to move.
  • the image review device 102 may be configured in the terminal 101, or may be configured independently of the terminal 101. There may be one or more image review devices 102 in the application scene 100. Multiple image review devices 102 can review the image files input by different users in parallel, thereby increasing the data processing speed.
  • the second database 103 can also be configured in the image reviewing device 102 when the image reviewing device 102 is configured on the network side.
  • when the terminal 101, the image review device 102, and the second database 103 are separate devices, the terminal 101 and the image review device 102 can communicate through a network, and the image review device 102 and the second database 103 can also communicate with each other through the network.
  • the network may be a wireless network or a wired network; the embodiments of the present application do not specifically limit the communication mode.
  • the embodiment of the application provides an image review method, which can be applied to electronic equipment with an image review device.
  • the electronic equipment can be a computer device, a notebook computer, any node server in a distributed computing architecture, a mobile terminal, etc.
  • the functions implemented by the image review method can be implemented by invoking program codes by the processor in the electronic device.
  • the program codes can be stored in a computer storage medium. It can be seen that the electronic device at least includes a processor and a storage medium.
  • FIG. 2 is a schematic diagram of the implementation process of the image review method according to the embodiment of the application. As shown in FIG. 2, the method may include the following steps 201 to 203:
  • Step 201 Use the target classification model to perform feature extraction on the image file to be reviewed to obtain the corresponding feature vector; wherein the target classification model is obtained through training of multiple sample image files and corresponding multiple image transformation files.
  • the target classification model may be a deep learning model, for example, a neural network model.
  • the model can be a lightweight neural network model, such as MobileNetV2.
  • the model can also be a non-lightweight neural network model.
  • the electronic device can obtain the target classification model through steps 301 to 304 in the following embodiment.
  • the so-called image transformation file refers to a file obtained by performing transformation processing such as inversion, rotation, liquefaction, scaling, cropping, mosaic, noise, color change, or occlusion on a sample image file, or a combination of these transformation methods.
  • the image file to be reviewed may be of various types.
  • the image file to be reviewed is an image or a piece of video (for example, a short video, a live video, a movie, a TV series, etc.).
  • the electronic device can randomly sample one or more video frame images from the video, and then perform feature extraction on these images through the target classification model to obtain the feature vector corresponding to the video.
  • Step 202 Determine the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in the review set.
  • the review set may differ with the type of the image file to be reviewed. That is, when the image file to be reviewed is an image, each reference feature vector in the corresponding review set is extracted by the electronic device from a single image; when the image file to be reviewed is a video, each reference feature vector in the corresponding review set is extracted by the electronic device from multiple images. In general, the dimension of the feature vector of the image file to be reviewed is consistent with the dimension of the reference feature vector, although this is not a strict rule and the two dimensions may also differ.
  • the parameter types that characterize the similarity can be varied, for example, it can be Hamming distance, Euclidean distance, or cosine similarity.
  • Step 203 Determine whether the to-be-reviewed image file is a violation file according to the determined relationship between the similarity and the first threshold.
  • the review set generated from compliant reference image files (referred to below as the compliance set) and the review set generated from violating reference image files (referred to below as the violation set) correspond to different judgment criteria.
  • the similarity is characterized by the Hamming distance.
  • the Hamming distance between two strings of equal length refers to the number of different characters at the corresponding positions of the two strings. Therefore, the smaller the Hamming distance, the more similar the two feature vectors, and the more similar the corresponding two image files.
  • for the violation set, in one example, the ratio of the number of similarities less than the first threshold to the total number of similarities is determined, and when the ratio is greater than the second threshold, the image file to be reviewed is determined to be a violation file.
  • for the compliance set, in one example, when the ratio is greater than the second threshold, the image file to be reviewed is determined to be a compliance file.
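  • the Hamming distance used above as the similarity measure can be illustrated with a minimal sketch. The function name `hamming_distance` and the sample inputs are illustrative choices, not part of the application.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Works for strings as well as for binary feature vectors.
d_strings = hamming_distance("karolin", "kathrin")
d_vectors = hamming_distance([1, 0, 1, 1, 0], [1, 1, 1, 0, 0])
```

  • a smaller return value means that the two feature vectors, and therefore the two image files, are more similar.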
  • the electronic device can implement step 203 through steps 604 to 606 in the following embodiment.
  • the electronic device can also implement step 203 through steps 802 to 809 in the following embodiment.
  • the similarity characterizes the number of different features between two feature vectors.
  • the review set is a violation set. Each time the electronic device determines the similarity with a reference feature vector, it counts the number of similarities determined so far that are less than the first threshold. If this number is greater than or equal to the third threshold, the calculation of similarities is stopped, the image file to be reviewed is determined to be a violation file, and this is output as the review result.
  • the electronic device can also determine whether the image file to be reviewed is a violation file through steps 902 to 904 in the following embodiment.
  • in the embodiments of the present application, the electronic device uses the target classification model to perform feature extraction on the image file to be reviewed to obtain the corresponding feature vector, wherein the target classification model is obtained through training with multiple sample image files and corresponding multiple image transformation files. In this way, even if the image file to be reviewed has been obtained from an original file through transformations such as rotation, liquefaction, and deformation, a feature vector consistent with that of the original file can still be extracted, so that arbitrarily transformed image files are accurately identified and the robustness of the image review method is enhanced.
  • the electronic device may train the target classification model, generate the review set, and determine the first threshold in advance.
  • training the target classification model may include the following steps 301 to 304. It should be noted that the electronic device may perform steps 301 to 304 before performing feature extraction on the image file to be reviewed, or when it is configured to have an image review function.
  • Step 301 Obtain the type label of each sample image file.
  • the sample image files include illegal image files and compliant image files.
  • Violating image files, for example, can be files related to terror, violence, pornography, and gambling.
  • Compliant image files, for example, may be files related to natural scenery and buildings.
  • the electronic device can sample some violating sample files from the first database, which collects a variety of violating image files, and sample some compliant sample files from the second database, which collects a variety of compliant image files.
  • a certain number of image files are selected from the first database and the second database as the sample files. For example, select 100 illegal images and 100 compliant images from these two databases as sample image files.
  • Step 302 Perform transformation processing on each of the sample image files according to multiple transformation rules to obtain a set of image transformation files corresponding to the files.
  • the transformation rules can be various.
  • the basic transformation rules include flip, rotation, liquefaction, zoom, crop, mosaic, noise, color change, and occlusion.
  • the combined transformation rule is a combination of at least two basic transformation rules. Taking the above 9 basic transformation rules as an example, there are 502 combined transformation rules, namely $\sum_{k=2}^{9}\binom{9}{k} = 2^{9} - 9 - 1 = 502$. In an example, the electronic device may perform transformation processing on the sample image file according to 100 different transformation rules to obtain 100 image transformation files corresponding to the file.
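  • the count of 502 combined rules can be checked by enumerating every unordered combination of at least two of the nine basic rules. The sketch below is only a verification aid; the list name `BASIC_RULES` and the helper `combined_rules` are illustrative.

```python
from itertools import combinations

# The nine basic transformation rules named in the text.
BASIC_RULES = [
    "flip", "rotation", "liquefaction", "zoom", "crop",
    "mosaic", "noise", "color change", "occlusion",
]


def combined_rules(rules):
    """Yield every unordered combination of at least two basic rules."""
    for size in range(2, len(rules) + 1):
        yield from combinations(rules, size)


n_combined = sum(1 for _ in combined_rules(BASIC_RULES))
```

  • the enumeration ignores the order in which the rules are applied; if order mattered, the number of combined rules would be larger.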
  • Step 303 Assign the type label of each sample image file to each image transformation file in the corresponding image transformation file set.
  • the type tags of the image file after conversion and the image file before conversion should be consistent. For example, if the illegal image file has been liquefied, the liquefied file is still illegal, and its nature remains unchanged. Therefore, the type label of the image transformation file corresponding to each sample image file can be consistent with the type label of the sample image file.
  • Step 304 Train a specific neural network model according to each of the sample image files, each of the image transformation files, and respective corresponding type labels to obtain the target classification model.
  • each sample image file is transformed according to multiple transformation rules to obtain the image transformation file set of the corresponding file; the type label of each sample image file is assigned to the corresponding image transformation file set Each of the image transformation files; according to each of the sample image files, each of the image transformation files and respective corresponding type tags, a specific neural network model is trained to obtain the target classification model.
  • the training samples include image transformation files obtained by performing multiple transformations on the sample image files, which can enrich the diversity of training samples and make the target classification model obtained by training have better robustness.
  • as a result, when the input image file is a transformed file, the model can still accurately extract its feature vector, so that whether the file is a violation file can be accurately identified.
  • the feature vectors extracted with this model from an image file before and after transformation processing are basically the same. Therefore, even if the input image file has undergone transformation processing, the electronic device can accurately identify whether it is a violation file.
  • the type label of each sample image file is assigned to each image transformation file in the corresponding image transformation file set; in this way, while the diversity of training samples is ensured, manual labeling costs are reduced, because the type label of each image transformation file does not need to be labeled manually.
  • the electronic device can automatically obtain a large number of rich and diverse training samples by transforming and processing the sample image files.
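  • steps 302 and 303 can be sketched as follows, under simplifying assumptions: images are small nested lists of pixel values, and only three toy transformations (`flip_horizontal`, `rotate_90`, `occlude_top_left`) stand in for the transformation rules. All names here are hypothetical and chosen for illustration.

```python
def flip_horizontal(img):
    """Mirror each row of the image."""
    return [row[::-1] for row in img]


def rotate_90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def occlude_top_left(img):
    """Blank out the top-left pixel as a toy occlusion."""
    out = [list(row) for row in img]
    out[0][0] = 0
    return out


# Hypothetical stand-ins for the transformation rules of step 302.
TRANSFORMS = [flip_horizontal, rotate_90, occlude_top_left]


def build_training_set(samples):
    """samples: list of (image, type_label) pairs.

    Returns the originals plus one transformed copy per rule; each
    transformed copy inherits the label of its source (step 303).
    """
    training = []
    for image, label in samples:
        training.append((image, label))
        for transform in TRANSFORMS:
            training.append((transform(image), label))
    return training
```

  • because labels are inherited, only the original sample image files need to be labeled by hand.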
  • the electronic device may load the generated audit set into the cache in advance. There is no restriction on the timing of loading.
  • for example, the electronic device can load the generated review set before using the target classification model to extract the features of the image file to be reviewed; for another example, it can load the generated review set after extracting the features of the image file to be reviewed and before determining the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in the review set; for yet another example, it can load the generated review set when it is configured to have the image review function.
  • the method for generating an audit set may include the following steps 401 and 402:
  • Step 401 Using the target classification model, perform feature extraction on multiple reference image files to obtain feature vectors of corresponding files.
  • the multiple reference image files may be violation files, for example, all or part of the files in the first database, and the audit set obtained based on this is the violation set.
  • the multiple reference image files may be compliance files, for example, all or part of the files in the second database.
  • the nature of the audit set is different, that is, the compliance set and the violation set. In the image review stage, the corresponding judgment criteria are also different.
  • when the multiple reference image files are only part of the files in the database, they may be files randomly extracted from the database by the electronic device, or some representative files in the database, such as files with higher priority.
  • Step 402 Use the feature vector of each reference image file as a reference feature vector to generate the review set.
  • the review set is loaded into the buffer area in advance.
  • the electronic device does not need to perform feature extraction on the multiple reference image files to generate the review set; instead, it can directly use the pre-generated review set to perform the image review. In this way, the time consumption of the feature extraction process can be saved, so that the time for reviewing the image can be saved.
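  • a minimal sketch of steps 401 and 402 with the pre-loading described above. The extractor `toy_extract` is a hypothetical stand-in for the trained target classification model, and the in-memory dictionary `_CACHE` is an assumed caching mechanism; the application does not specify either.

```python
def build_review_set(reference_files, extract_features):
    """Steps 401-402: one reference feature vector per reference file."""
    return [tuple(extract_features(f)) for f in reference_files]


def toy_extract(pixels):
    # Hypothetical stand-in for the trained model: 1 where a pixel is bright.
    return [1 if p > 127 else 0 for p in pixels]


_CACHE = {}


def get_review_set(name, reference_files, extract_features):
    """Build the review set once, then reuse the cached copy."""
    if name not in _CACHE:
        _CACHE[name] = build_review_set(reference_files, extract_features)
    return _CACHE[name]
```

  • later calls with the same name return the cached review set, so the reference files are not passed through the extractor again at review time.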
  • the electronic device may load the determined first threshold into the cache in advance. There is no restriction on the timing of loading. For example, the electronic device may load the determined first threshold before determining whether the image file to be reviewed is a violation file; for another example, it may load the determined first threshold before performing feature extraction on the image file to be reviewed; for yet another example, it may load the determined first threshold when it is configured to have an image review function.
  • the method for determining the first threshold may include the following steps 501 to 503:
  • Step 501 Taking each of a plurality of different candidate thresholds in turn as the first threshold, determine according to the image review method whether each of a plurality of verification image files is a violation file, so as to obtain a review result set corresponding to each candidate threshold.
  • the plurality of verification image files may include a violation image file and a compliance image file.
  • the verification image file is different from the file used to train the neural network model.
  • the multiple verification image files may also include files obtained after the electronic device performs various transformation processes on the original image files.
  • the transformation rules used in the transformation processing may be the same as the transformation rules used in the model training stage.
  • through step 501, a review result set can be obtained for each candidate threshold. As shown in Table 1, the review result set corresponding to candidate threshold 1 is the content of the second column.

    Table 1
                            Candidate threshold 1   Candidate threshold 2   ...   Candidate threshold N
    Verify image file 1     1                       1                       ...   1
    Verify image file 2     0                       1                       ...   1
    ...                     ...                     ...                     ...   ...
    Verify image file M     1                       1                       ...   0
  • Step 502 Determine the correct recall rate and the error recall rate under the corresponding candidate threshold according to each audit result set and the type label of each verified image file.
  • here, TN represents the number of violation files reviewed as violation files, FP represents the number of violation files reviewed as compliance files, and FN represents the number of compliance files reviewed as violation files.
  • Step 503 Determine the candidate thresholds corresponding to the correct recall rate and the false recall rate that meet specific conditions as the first threshold.
  • the candidate threshold corresponding to the minimum error recall rate is selected as the first threshold.
  • the electronic device may adopt a grid search method to gradually approach the optimal value, so as to select the first threshold from a plurality of candidate thresholds.
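  • steps 501 to 503 can be sketched as a search over candidate thresholds. The formulas below are assumptions, since the application's own formulas are not reproduced in this text: recall is taken as TN / (TN + FP), and wrong_recall as FN divided by the number of compliant verification files. The review rule is also simplified to "flag a file when its smallest Hamming distance to the violation set is below the threshold", and the parameter `min_recall` is a hypothetical stand-in for the "specific conditions".

```python
def rates(distances, is_violation, threshold):
    """Return (recall, wrong_recall) for one candidate threshold.

    distances[i]: smallest Hamming distance of verification file i
    to the violation set; is_violation[i]: its true type label.
    """
    flagged = [d < threshold for d in distances]
    tn = sum(f and v for f, v in zip(flagged, is_violation))
    fp = sum((not f) and v for f, v in zip(flagged, is_violation))
    fn = sum(f and (not v) for f, v in zip(flagged, is_violation))
    n_compliant = sum(not v for v in is_violation)
    recall = tn / (tn + fp) if (tn + fp) else 0.0             # assumed formula
    wrong_recall = fn / n_compliant if n_compliant else 0.0   # assumed formula
    return recall, wrong_recall


def select_threshold(distances, is_violation, candidates, min_recall=0.9):
    """Pick the candidate with the lowest wrong_recall among those whose
    recall reaches min_recall; ties go to the higher recall, then to the
    earlier candidate. Returns None if no candidate qualifies."""
    best_key, best_threshold = None, None
    for threshold in candidates:
        recall, wrong = rates(distances, is_violation, threshold)
        if recall < min_recall:
            continue
        key = (wrong, -recall)
        if best_key is None or key < best_key:
            best_key, best_threshold = key, threshold
    return best_threshold
```

  • a coarse pass over a wide range followed by a finer pass around the best value, as in the 35 to 70 and 50 to 55 ranges of FIG. 16 and FIG. 17, can reuse `select_threshold` with a different `candidates` sequence.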
  • FIG. 6 is a schematic diagram of the implementation process of the image review method according to the embodiment of the application. As shown in FIG. 6, the method may include the following steps 601 to 606:
  • Step 601 Obtain a feature vector extraction structure of the target classification model.
  • the feature vector extraction structure includes the layers of the target classification model from the input layer to the non-linear activation layer, wherein the target classification model is obtained through training with a plurality of sample image files and the corresponding multiple image transformation files.
  • the target classification model can be a lightweight neural network model MobileNetV2.
  • the structure of the network includes a "bottleneck structure", a conv2d layer, a sigmoid activation layer, an n×1-dimensional fully connected layer (Dense), and a normalized exponential layer (softmax).
  • the "bottleneck structure", the conv2d layer, and the sigmoid activation layer may be used as the feature vector extraction structure.
  • Step 602 Use the feature vector extraction structure to perform feature extraction on the image file to be reviewed to obtain a corresponding feature vector.
  • the output of the nonlinear activation layer of the feature vector extraction structure is the feature vector corresponding to the file.
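  • one way the sigmoid output could be turned into a vector suitable for Hamming-distance comparison is to binarize it. This is an assumption made for illustration: the text states only that the activation layer output is the feature vector, and the cut-off of 0.5 and the helper names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function; adequate for moderate input magnitudes."""
    return 1.0 / (1.0 + math.exp(-x))


def to_binary_code(pre_activations, cut=0.5):
    """Map each sigmoid output to 1 if it reaches the cut-off, else 0."""
    return [1 if sigmoid(x) >= cut else 0 for x in pre_activations]


code = to_binary_code([-2.0, 0.0, 3.5, -0.1])
```

  • with binary codes of equal length, the similarity between two files reduces to counting the positions at which their codes differ.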
  • Step 603 Determine the similarity between the feature vector of the image file to be reviewed and each reference feature vector in the review set, wherein the similarity is used to represent the number of different features between the two feature vectors.
  • Step 604 Determine the number of similarities less than the first threshold.
  • the similarity is the Hamming distance.
  • Step 605 Determine the ratio of the number to the total number of similarities.
  • Step 606 Determine whether the to-be-reviewed image file is a violation file according to the relationship between the ratio and the second threshold.
  • when the review set is the violation set, the image file to be reviewed is determined to be a violation file when the ratio is greater than the second threshold, and is determined to be a compliant file when the ratio is less than or equal to the second threshold.
  • when the review set is the compliance set, the image file to be reviewed is determined to be a compliant file when the ratio is greater than the second threshold, and is determined to be a violation file when the ratio is less than or equal to the second threshold.
  • the number of similarities less than the first threshold is counted, and the ratio between this number and the total number of similarities is determined; whether the image file to be reviewed is a violation file is then determined according to the relationship between the ratio and the second threshold. Compared with obtaining the review result from the similarity with a single reference feature vector, the review result obtained in this way is more reliable and the recognition accuracy is higher.
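  • steps 603 to 606 can be sketched as follows, assuming binary feature vectors of equal length and the Hamming distance as the similarity. The function names and the boolean flag `set_is_violation` are illustrative, not taken from the application.

```python
def hamming(a, b):
    """Number of positions at which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def review_by_ratio(query, review_set, first_threshold, second_threshold,
                    set_is_violation=True):
    """Return True if the query is judged to be a violation file."""
    if not review_set:
        raise ValueError("review set must not be empty")
    # Step 603: similarity with every reference feature vector.
    distances = [hamming(query, ref) for ref in review_set]
    # Step 604: number of similarities below the first threshold.
    close = sum(d < first_threshold for d in distances)
    # Step 605: ratio of that number to the total number of similarities.
    ratio = close / len(distances)
    # Step 606: the meaning of a high ratio depends on the kind of set.
    resembles_set = ratio > second_threshold
    return resembles_set if set_is_violation else not resembles_set
```

  • the same function covers both kinds of review set: a high ratio against the violation set indicates a violation file, while a high ratio against the compliance set indicates a compliant file.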
  • FIG. 8 is a schematic diagram of the implementation process of the image review method according to the embodiment of the application. As shown in FIG. 8, the method may include the following steps 801 to 809:
  • Step 801 Use the target classification model to perform feature extraction on the image file to be reviewed to obtain the corresponding feature vector; wherein the target classification model is obtained through training of multiple sample image files and corresponding multiple image transformation files.
  • a target classification model usually consists of multiple sequentially connected layers.
  • the first layer generally takes an image as input, and extracts features from the image through specific operations.
  • each subsequent layer takes the features extracted by the previous layer as input and, by transforming them in a specific form, obtains more complex features.
  • this hierarchical feature extraction process is cumulative, which gives the neural network powerful feature extraction capabilities.
  • the neural network can transform the initial input image into higher-level abstract features.
  • in the image review method, when feature extraction is performed on the image file to be reviewed through the target classification model, the extracted feature vector is basically unchanged no matter how complex the transformations applied to the original file to obtain the image file to be reviewed. In this way, the image review method has strong robustness, and even if a violation file is transformed before being uploaded to the network, it can still be accurately identified.
  • Step 802 Determine the similarity between the feature vector of the image file to be reviewed and the i-th reference feature vector in the review set, where i is greater than 0 and less than or equal to the total number of reference feature vectors in the review set;
  • Step 803 Determine whether the image file to be reviewed is a violation file according to the relationship between the similarity corresponding to the i-th reference feature vector and the first threshold; if so, go to step 804; otherwise, go to step 807;
  • the so-called similarity corresponding to the i-th reference feature vector refers to the similarity between the feature vector of the image file to be reviewed and the i-th reference feature vector.
  • Step 804 Count the first determination number, that is, the number of times the image file to be reviewed has been determined to be a violation file;
  • Step 805 Determine whether the first determination number is greater than the third threshold; if yes, go to step 806; otherwise, set i = i + 1 and return to step 802;
  • Step 806 Output that the image file to be reviewed is a violation file.
  • when the first determination number is greater than the third threshold, it is sufficient to reliably determine that the image file to be reviewed is a violation file, and there is no need to continue to calculate the similarity between the feature vector of the image file to be reviewed and the remaining reference feature vectors, thereby reducing the amount of calculation and shortening the review time.
  • for example, suppose the third threshold is 900 and the similarity is represented by the Hamming distance. If, after the similarities corresponding to the 1st to 1000th reference feature vectors have been determined, the first determination number is 901, that is, 901 of these similarities are less than the first threshold, the image review process can be ended and the review result that the image file to be reviewed is a violation file is output. There is no need to continue to calculate the similarity with the remaining 9,000 reference feature vectors.
  • Step 807 Count the second number of determinations, that is, the number of times the image file to be reviewed has been determined to be a compliant file;
  • Step 808 Determine whether the second number of determinations is greater than the fourth threshold; if yes, go to step 809; otherwise, increment i by 1 and go back to step 802;
  • the fourth threshold is greater than the third threshold. In this way, the false detection rate of illegal files can be reduced.
  • Step 809 Output that the image file to be reviewed is a compliant file.
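  • Steps 802 to 809 can be read as a loop with two counters and two early exits. The following is a minimal sketch under stated assumptions, not the implementation of the embodiment: the function names are hypothetical, feature vectors are assumed to be lists of 0/1 values, the similarity is assumed to be the Hamming distance, and returning None when the review set is exhausted is an assumption, since that case is not described in the steps above.

```python
from typing import Optional, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def review_with_counts(
    query: Sequence[int],
    references: Sequence[Sequence[int]],
    first_threshold: int,
    third_threshold: int,
    fourth_threshold: int,
) -> Optional[bool]:
    """Return True (violation), False (compliant), or None (undecided, an assumption)."""
    violation_count = 0   # "first number of determinations"
    compliant_count = 0   # "second number of determinations"
    for reference in references:                          # step 802
        if hamming(query, reference) < first_threshold:   # step 803
            violation_count += 1                          # step 804
            if violation_count > third_threshold:         # step 805
                return True                               # step 806
        else:
            compliant_count += 1                          # step 807
            if compliant_count > fourth_threshold:        # step 808
                return False                              # step 809
    return None  # review set exhausted before either counter passed its threshold
```

  • Either exit stops the traversal early, which is where the saving in calculation comes from.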
  • FIG. 9 is a schematic diagram of the implementation process of the image review method of the embodiment of the application. As shown in FIG. 9, the method may include the following steps 901 to 904:
  • Step 901 using the target classification model to perform feature extraction on the image file to be reviewed to obtain the corresponding feature vector; wherein the target classification model is obtained through training of multiple sample image files and corresponding multiple image transformation files;
  • Step 902 Determine the similarity between the feature vector of the image file to be reviewed and the i-th reference feature vector in the review set, where i is greater than 0 and less than or equal to the total number of reference feature vectors in the review set, the reference image file corresponding to each reference feature vector is a violation file, and the similarity is used to characterize the number of different features between the two feature vectors;
  • Step 903 Determine whether the similarity corresponding to the i-th reference feature vector is less than the first threshold; if yes, go to step 904; otherwise, increment i by 1 and go back to step 902;
  • Step 904 Determine that the image file to be reviewed is a violation file, and output the review result.
  • when the similarity corresponding to a reference feature vector is less than the first threshold, the review process is ended and the review result that the image file to be reviewed is a violation file is output; otherwise, the next reference feature vector is traversed, until it is determined that the image file to be reviewed is a violation file.
  • the review result output is that the image file to be reviewed is a compliant file.
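  • The flow of steps 902 to 904 stops at the first reference feature vector that is close enough. A minimal sketch, assuming bit-vector features and a Hamming-distance similarity; the function name is hypothetical, and returning False after the whole review set has been traversed without a match is an assumption about when the compliant result is output.

```python
from typing import Sequence


def review_first_match(
    query: Sequence[int],
    references: Sequence[Sequence[int]],
    first_threshold: int,
) -> bool:
    """Return True as soon as one reference vector is closer than first_threshold."""
    for reference in references:                                  # step 902
        distance = sum(x != y for x, y in zip(query, reference))  # differing features
        if distance < first_threshold:                            # step 903
            return True                                           # step 904
    return False  # assumption: no reference matched, so the file is treated as compliant
```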
  • the input picture, that is, the picture to be reviewed
  • the picture in the violation gallery, that is, an example of the first database
  • Commonly used similarity algorithms include the perceptual hash (pHash) algorithm and the Scale-Invariant Feature Transform (SIFT) algorithm.
  • the pHash algorithm is a rule algorithm designed manually.
  • the basic principle of the algorithm is to obtain the hash value of the input picture, and then calculate the hash "distance" between the input picture and a picture in the illegal library to obtain the similarity of the two pictures; when the similarity is greater than the set threshold, the match is considered successful.
  • the implementation process of the algorithm is as follows:
  • Reduce the size of the input picture; simplify the color of the reduced picture; calculate the average gray value of the simplified picture; compare the gray level of each pixel with the average; calculate the hash value from the comparison results; calculate, based on the hash value, the Hamming distance to a picture in the illegal library; when the Hamming distance is less than the set threshold, it is determined that the matching is successful and the input picture is an illegal picture.
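  • The hand-designed hashing steps listed above can be sketched in a few lines. This is an illustrative sketch only: it assumes the picture is already a grayscale grid (a list of rows of gray values) whose sides are multiples of the target size, uses block averaging as the size reduction, and uses hypothetical function names.

```python
from typing import List

Gray = List[List[float]]


def shrink(gray: Gray, size: int = 8) -> Gray:
    """Reduce a grayscale picture to size x size by averaging equal blocks."""
    block_h, block_w = len(gray) // size, len(gray[0]) // size
    small = []
    for r in range(size):
        row = []
        for c in range(size):
            block = [
                gray[y][x]
                for y in range(r * block_h, (r + 1) * block_h)
                for x in range(c * block_w, (c + 1) * block_w)
            ]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def mean_hash(gray: Gray, size: int = 8) -> List[int]:
    """One bit per reduced pixel: 1 if the pixel is brighter than the average."""
    pixels = [p for row in shrink(gray, size) for p in row]
    average = sum(pixels) / len(pixels)
    return [1 if p > average else 0 for p in pixels]


def hamming(a: List[int], b: List[int]) -> int:
    """Number of differing bits between two hashes of equal length."""
    return sum(x != y for x, y in zip(a, b))
```

  • A uniform brightness shift leaves every pixel on the same side of the average, so the hash is unchanged, whereas a hash built this way depends on pixel positions and therefore changes under flipping or rotation.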
  • the SIFT algorithm is used to detect and describe the local features in the picture. It finds extreme points in the spatial scale and extracts its position, scale, and rotation invariants. The description and detection of local features can help to identify objects. SIFT features are based on some local appearance points of interest on the object and have nothing to do with the size and rotation of the picture.
  • the algorithm factors (ie, image feature extraction operators) of the pHash algorithm and the SIFT algorithm are both artificially designed, so they can only meet specific matching scenarios.
  • the pHash algorithm can only maintain the invariance of scale scaling and color change;
  • the SIFT algorithm can only maintain the invariance of rotation, scale scaling, brightness change, affine, and noise.
  • the neural network model is mainly used to directly calculate whether the two pictures match.
  • the implementation process is shown in Figure 10, which is divided into a training phase and a prediction phase.
  • the basic process of the training phase includes the following steps 1001 to 1004:
  • Step 1001 design a model structure (including convolutional layer, fully connected layer, pooling layer, etc.) to obtain an initial similarity model, that is, a neural network model;
  • Step 1002 prepare a large amount of image data as training samples
  • Step 1003 Perform data enhancement processing on each picture in the training samples, for example, rotate, mirror, and render the pictures separately; two pictures obtained by applying different data transformations to the same picture are combined into a positive sample (1), and other transformed pictures are used as negative samples (0).
  • Step 1004 Update the initial similarity model through a gradient-descent-family optimization algorithm and the training samples after data enhancement, to obtain the trained similarity model, that is, the target classification model.
  • the basic process of the prediction phase includes steps 1005 to 1007:
  • Step 1005 Calculate the similarity between the input picture and each picture in the illegal library;
  • Step 1006 Determine whether the ratio of the number of similarities less than the first threshold to the total number of similarities is greater than the second threshold; if so, go to step 1007;
  • step 1007 it is considered that the matching is successful, and it is determined that the input picture is a violation picture.
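  • The decision in steps 1006 and 1007 is a ratio test over all computed similarities. A minimal sketch, assuming the similarities are distance-like values (smaller means more alike); the function name and the handling of an empty library are assumptions.

```python
from typing import Sequence


def review_by_ratio(
    similarities: Sequence[float],
    first_threshold: float,
    second_threshold: float,
) -> bool:
    """True when the share of similarities below first_threshold exceeds second_threshold."""
    if not similarities:          # assumption: an empty library can never match
        return False
    below = sum(1 for s in similarities if s < first_threshold)
    return below / len(similarities) > second_threshold   # step 1006 -> step 1007
```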
  • the deep learning model contains multiple convolution kernels obtained through gradient descent.
  • the convolution kernel has a strong ability to express image features and basically meets all image transformation scenarios.
  • it is necessary to cyclically perform matching calculations with all pictures in the gallery, plus the computational consumption of the neural network model itself, and its resource consumption is unacceptable.
  • a deep neural network is used to extract image features to obtain the image hash, which is an example of feature vector; compare the similarity of the two image hashes to determine whether the matching is successful.
  • the process may include the following Step1 to Step4:
  • Step1 Data preparation. Prepare 200 original pictures, as shown in Figure 12, perform picture transformation operations such as flipping, rotating, scaling, cropping, liquefying, mosaicing, noise, discoloration, and occlusion on each original picture, or a combination of them. Perform 100 different transformation operations on each picture, so that a total of 20,000 samples are obtained.
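  • The data preparation in Step1 amounts to applying every transformation to every original picture and giving each result the class of its original, so that 200 originals with 100 transformations each yield 20,000 labeled samples. A minimal sketch with tiny list-based pictures and three simple transformations; all names are hypothetical, and the real transformations (liquefying, mosaicing, occlusion and so on) are not reproduced here.

```python
from typing import Callable, List, Sequence, Tuple

Picture = List[List[int]]


def flip_horizontal(picture: Picture) -> Picture:
    """Mirror each row left to right."""
    return [row[::-1] for row in picture]


def flip_vertical(picture: Picture) -> Picture:
    """Reverse the order of the rows."""
    return [row[:] for row in picture[::-1]]


def rotate_90(picture: Picture) -> Picture:
    """Rotate the picture 90 degrees clockwise."""
    return [list(row) for row in zip(*picture[::-1])]


def build_samples(
    originals: Sequence[Picture],
    transforms: Sequence[Callable[[Picture], Picture]],
) -> List[Tuple[Picture, int]]:
    """Pair every transformed picture with the index of its original as class label."""
    samples = []
    for label, picture in enumerate(originals):
        for transform in transforms:
            samples.append((transform(picture), label))
    return samples
```

  • Because the labels come from the originals themselves, no manual annotation is involved.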
  • Step2 Design the model.
  • the lightweight deep neural network MobileNetV2 is selected as the feature extractor. Before training the model, modify the MobileNetV2 network structure.
  • the original structure of MobileNetV2 is shown in Table 2 below.
  • the header "Input” is the input size of the structure layer
  • “Operator” is the structure type of the layer.
  • C is the dimension of the output feature layer of this layer
  • n is the number of repetitions of this layer
  • s is the stride of the depthwise convolution kernel.
  • the input size of the 11th layer of MobileNetV2 is fixed at 1×1×1280, and k convolution kernels of size 1×1 are used for the convolution calculation, so as to output a 1-dimensional vector of length k. Finally, a softmax activation layer is connected to calculate the probabilities of the k categories.
  • the MobileNetV2 structure is modified as follows: between the conv2d layer and the softmax layer, a sigmoid activation layer and an n×1 dimensional fully connected layer (Dense) are added.
  • the modified MobileNetV2 structure is shown in Figure 7A.
  • Step3 Model training stage.
  • a picture classification model that is, a specific neural network model.
  • k = 200
  • n is the dimension of the hash that needs to be encoded (for example, 300).
  • the model loss function is a multi-category cross-entropy loss (categorical_crossentropy), the optimization algorithm is Adam, the learning rate is fixed at 0.001, and the accuracy of the trained model is >99.5%.
  • Step4 Matching stage.
  • the output of the model is a 1-dimensional vector with a length of n (for example, 300).
  • the activation function is sigmoid
  • the value range of the sigmoid output is (0, 1).
  • the output is binarized at 0.5: outputs greater than 0.5 become 1 and the remaining outputs become 0, which finally yields a hash vector with a length of 300 and values of 0 or 1, that is, a feature vector.
  • the reason why the extracted feature vector is called a hash vector is because even if the input image is a transformed image of the original image, the feature vector extracted by Mobilehashnet is still consistent with the feature vector of the original image.
  • the Hamming distance of the two pictures can be calculated according to the hash vector of the picture. The smaller the distance, the more similar the two pictures.
  • to implement matching, a first threshold can be specified. When the Hamming distance is lower than the first threshold, the two pictures are considered to be the same picture and the matching is successful; otherwise, the matching fails.
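  • The matching stage reduces to three small operations: binarize the sigmoid outputs, count differing bits, and compare the count with the first threshold. A minimal sketch; the function names are hypothetical, and mapping an output of exactly 0.5 to 0 is an assumption.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function; its output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def to_hash(activations: Sequence[float], cutoff: float = 0.5) -> List[int]:
    """Binarize sigmoid outputs: 1 above the cutoff, 0 otherwise (assumed tie rule)."""
    return [1 if a > cutoff else 0 for a in activations]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two hash vectors differ."""
    if len(a) != len(b):
        raise ValueError("hash vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_match(a: Sequence[int], b: Sequence[int], first_threshold: int) -> bool:
    """Two pictures match when their Hamming distance is below the first threshold."""
    return hamming_distance(a, b) < first_threshold
```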
  • the preparation process of the validation set is the same as the above training set. Prepare several pictures in the non-training set, perform data enhancement, and calculate the correct recall rate (recall) and wrong recall rate (wrong_recall) of the matching model under different candidate thresholds.
  • a grid search method can be used to gradually approach the optimal value.
  • the grid search results are shown in Figure 16 and Figure 17; among them, Figure 16 shows the corresponding recall and wrong_recall when the candidate threshold is 35 to 70, and Figure 17 shows the corresponding recall and wrong_recall when the candidate threshold is 50 to 55.
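  • A coarse-then-fine grid search over candidate thresholds can be sketched as follows. Several points are assumptions rather than statements of the embodiment: recall is taken as the share of pairs that should match whose Hamming distance is below the threshold, wrong_recall as the share of pairs that should not match but fall below it, and the selection score (recall minus wrong_recall) is only one possible criterion. The function names and the distances in the usage note are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def rates(
    match_distances: Sequence[int],
    nonmatch_distances: Sequence[int],
    threshold: int,
) -> Tuple[float, float]:
    """Return (recall, wrong_recall) for one candidate threshold (assumed definitions)."""
    recall = sum(d < threshold for d in match_distances) / len(match_distances)
    wrong_recall = sum(d < threshold for d in nonmatch_distances) / len(nonmatch_distances)
    return recall, wrong_recall


def grid_search(
    match_distances: Sequence[int],
    nonmatch_distances: Sequence[int],
    candidates: Iterable[int],
) -> int:
    """Pick the first candidate with the highest recall - wrong_recall (assumed score)."""
    best_threshold, best_score = None, float("-inf")
    for threshold in candidates:
        recall, wrong_recall = rates(match_distances, nonmatch_distances, threshold)
        score = recall - wrong_recall
        if score > best_score:
            best_threshold, best_score = threshold, score
    return best_threshold
```

  • With made-up distances `[10, 20, 30, 52]` for matching pairs and `[58, 80, 90, 120]` for non-matching pairs, a coarse pass over `range(35, 71, 5)` followed by a finer pass around the coarse result mirrors the two-stage search described above.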
  • the hash dimension directly determines the number of convolution kernels of the 2D convolutional layer (conv2d 1×1) in the modified MobileNetV2 structure and the output dimension n of the activation layer. Because this layer is at the end of the network structure, its size directly affects the learning ability of the model. If the hash dimension is too small, the model underfits and the upper limit on the size of the library is lowered; if the dimension is too large, both the time needed to generate a hash and the time needed to calculate the Hamming distance increase, so a reasonable hash dimension needs to be chosen.
  • Mobilehashnet uses deep neural networks to extract image features, which theoretically has performance advantages.
  • the matching performance of the Mobilehashnet algorithm, the pHash algorithm and the SIFT algorithm is compared under different image transformation methods. The experimental results are shown in Table 3.
  • the pHash algorithm is basically unable to match under image transformations such as flipping, rotating, and zooming; the SIFT algorithm scores low under all types of image transformation.
  • the Mobilehashnet algorithm achieves 100% recall under the flipping, distorting, cutting, mosaic, and noise transformations, and under the other image transformations its recall value is higher and its wrong_recall value is lower.
  • training can be performed without manually labeling a large number of samples, and a large number of training samples are automatically obtained through image data enhancement technology.
  • the Mobilehashnet algorithm provided by the embodiments of this application extracts image features by using a deep neural network, generates image hashes based on these features, and performs image matching. Compared with the related image matching/similarity algorithm, it effectively improves the correct recall rate, reduces the false recall rate, and does not require a large amount of manual data annotation.
  • the picture review system reviews the pictures uploaded by users to prevent the spread of a large number of illegal pictures. Due to the complexity of image content, as shown in Figure 18, the process of the image review system includes an illegal library matching model, an image classification model, a face recognition model, a text recognition model, and a text classification model. The pictures to be reviewed are reviewed by each model in turn. When the results of all models are "normal”, the review result can be "normal", that is, a compliant picture; otherwise, it is a violating picture.
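  • The chained review can be pictured as a sequence of checks in which a picture is "normal" only if every model says so. A minimal sketch in which each model is a hypothetical stand-in callable returning "normal" or another verdict; stopping at the first non-normal verdict is an assumption, since the text only says the models are applied in turn.

```python
from typing import Callable, Optional, Sequence, Tuple

Model = Callable[[str], str]


def review_picture(
    picture: str,
    models: Sequence[Tuple[str, Model]],
) -> Tuple[str, Optional[str]]:
    """Run the models in order; report the first model whose verdict is not 'normal'."""
    for name, model in models:
        if model(picture) != "normal":
            return "violation", name   # assumed early exit on the first failed check
    return "normal", None              # all models returned "normal"
```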
  • the illegal library matching model in the image review system can be implemented by the Mobilehashnet algorithm provided in the embodiment of this application, which ensures a high correct recall rate and a low error recall rate for matching.
  • the implementation process of this algorithm is shown in Figure 19: extract the hash vector of the picture to be reviewed; determine the Hamming distance between this hash vector and each hash vector in the illegal hash library corresponding to the illegal library, that is, calculate the Hamming distances in batches; and judge whether each Hamming distance is greater than the first threshold, so as to obtain the recall result, that is, the correct recall rate and the false recall rate.
  • the offending hash library can be obtained when the system is initialized, and only one hash calculation is required for matching, that is, only the feature extraction of the image to be reviewed is required.
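  • Because the hash library is built once at initialization, reviewing a picture costs one feature extraction plus a batch of cheap bit comparisons. A minimal sketch of that batch step; packing each hash into a Python integer and using XOR with a bit count is an implementation choice assumed here, and the function names are hypothetical.

```python
from typing import List, Sequence


def pack(bits: Sequence[int]) -> int:
    """Pack a 0/1 hash vector into a single integer, first bit most significant."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def build_library(reference_hashes: Sequence[Sequence[int]]) -> List[int]:
    """Done once at initialization: pack every reference hash."""
    return [pack(h) for h in reference_hashes]


def batch_hamming(query_bits: Sequence[int], library: Sequence[int]) -> List[int]:
    """Hamming distance from the query hash to every packed hash in the library."""
    query = pack(query_bits)
    return [bin(query ^ reference).count("1") for reference in library]
```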
  • the image file review device provided by the embodiments of the present application, including the modules it includes and the units included in each module, can be implemented by the processor in the terminal; of course, it can also be implemented by specific logic circuits. In the implementation process, the processor can be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), or a field programmable gate array (FPGA), etc.
  • FIG. 20A is a schematic structural diagram of an image file review device according to an embodiment of the application.
  • the device 200 includes a feature extraction module 201, a first determination module 202, and a review module 203, wherein:
  • the feature extraction module 201 is configured to use the target classification model to perform feature extraction on the image file to be reviewed to obtain the corresponding feature vector; wherein the target classification model is obtained through training of multiple sample image files and corresponding multiple image transformation files ;
  • the first determining module 202 is configured to determine the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in the review set;
  • the review module 203 is configured to determine whether the image file to be reviewed is a violation file according to the determined relationship between the similarity and the first threshold.
  • the feature extraction module 201 is configured to obtain a feature vector extraction structure of the target classification model, and the feature vector extraction structure includes the input layer to the non-linear activation layer of the target classification model;
  • the type of the target classification model is a neural network model; the feature vector extraction structure is used to perform feature extraction on the image file to be reviewed to obtain a corresponding feature vector.
  • the image auditing device 200 further includes: a tag acquisition module 204, configured to acquire the type label of each sample image file; a transformation processing module 205, configured to perform transformation processing on each of the sample image files according to a variety of transformation rules to obtain an image transformation file set of the corresponding file; a tag labeling module 206, configured to assign the type label of each sample image file to each image transformation file in the corresponding image transformation file set; and a model training module 207, configured to train a specific neural network model according to each of the sample image files, each of the image transformation files, and their corresponding type labels to obtain the target classification model.
  • the review module 203 is configured to: determine the number of similarities less than the first threshold, where the similarity is used to characterize the number of different features between two feature vectors; determine the ratio of this number to the total number of similarities; and determine, according to the relationship between the ratio and the second threshold, whether the image file to be reviewed is a violation file.
  • the first determining module 202 is configured to determine the similarity between the feature vector of the image file to be reviewed and the i-th reference feature vector in the review set, where i is greater than 0 and less than or equal to the total number of reference feature vectors in the review set; the similarity is used to characterize the number of different features between two feature vectors, and the reference image file corresponding to the reference feature vector is a violation file; accordingly,
  • the review module 203 is configured to determine that the image file to be reviewed is a violation file when the similarity corresponding to the i-th reference feature vector is less than the first threshold.
  • the first determining module 202 is further configured to: when the similarity corresponding to the i-th reference feature vector is greater than or equal to the first threshold, determine the similarity between the feature vector of the image file to be reviewed and the (i+1)-th reference feature vector in the review set, so as to determine whether the image file to be reviewed is a violation file.
  • the image review device 200 further includes: a loading module 208 configured to load the generated review set; correspondingly, the feature extraction module 201 is further configured to: use the target classification model to perform feature extraction on multiple reference image files to obtain the feature vector of the corresponding file; and use the feature vector of each reference image file as a reference feature vector to generate the review set.
  • the loading module 208 is configured to load the determined first threshold
  • the device further includes a second determination module, configured to use the feature extraction module, the first determination module, and the review module of the device under the assumption that the first threshold is a plurality of different candidate thresholds.
  • in the embodiments of the present application, if the above-mentioned image review method is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer readable storage medium.
  • the technical solutions of the embodiments of the present application, in essence or in the part that contributes to the related technologies, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes a number of instructions to enable the electronic device to execute all or part of the method described in each embodiment of the present application.
  • the aforementioned storage media include: U disk, mobile hard disk, read only memory (Read Only Memory, ROM), magnetic disk or optical disk and other media that can store program codes. In this way, the embodiments of the present application are not limited to any specific combination of hardware and software.
  • FIG. 21 is a schematic diagram of the hardware entity of the electronic device according to an embodiment of the application.
  • the electronic device 210 includes a memory 211 and a processor 212.
  • the memory 211 stores a computer program that can be run on the processor 212, and the processor 212 implements the steps in the image review method provided in the foregoing embodiments when executing the program.
  • the memory 211 is configured to store instructions and applications executable by the processor 212, and can also cache data to be processed or already processed by the processor 212 and each module in the electronic device 210 (for example, image data, audio data, voice communication data, and video communication data); it can be implemented by flash memory (FLASH) or random access memory (Random Access Memory, RAM).
  • an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps in the image review method provided in the above-mentioned embodiments are implemented.
  • the disclosed device and method can be implemented in other ways.
  • the embodiments of the touch screen system described above are merely illustrative; for example, the division of the modules is only a logical function division, and there may be other division methods in actual implementation, such as: multiple modules or components can be combined, or can be integrated into another system, or some features can be ignored or not implemented.
  • the coupling, or direct coupling, or communication connection between the components shown or discussed can be indirect coupling or communication connection through some interfaces, devices or modules, and can be electrical, mechanical, or in other forms.
  • modules described above as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical modules; they may be located in one place or distributed on multiple network units; some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional modules in the embodiments of the present application may all be integrated into one processing unit, or each module may be individually used as a unit, or two or more modules may be integrated into one unit; the above-mentioned integrated module can be implemented in the form of hardware, or in the form of hardware plus software functional units.
  • the foregoing program can be stored in a computer readable storage medium.
  • when the program is executed, the steps of the foregoing method embodiments are performed; and the foregoing storage medium includes: various media that can store program codes, such as a removable storage device, a read only memory (Read Only Memory, ROM), a magnetic disk, or an optical disk.
  • if the aforementioned integrated unit of this application is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer readable storage medium.
  • the technical solutions of the embodiments of the present application, in essence or in the part that contributes to the related technologies, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes a number of instructions to enable the electronic device to execute all or part of the method described in each embodiment of the present application.
  • the aforementioned storage media include: removable storage devices, ROMs, magnetic disks, or optical disks and other media that can store program codes.

Abstract

An image auditing method, comprising: using a target classification model to perform feature extraction on an image file to be audited to obtain a corresponding feature vector, wherein the target classification model is obtained by training using multiple sample image files and corresponding multiple image transformation files; determining the similarity between the feature vector of the image file to be audited and at least one reference feature vector in an auditing set; and determining, according to the relationship between the determined similarity and a first threshold, whether the image file to be audited is an offending file. Also provided are an image auditing apparatus, a device, and a storage medium.

Description

Image review method and device, equipment, and storage medium
Technical field
The embodiments of this application relate to Internet technology, and relate to but are not limited to image review methods and devices, equipment, and storage media.
Background
In the Internet content review business, "bad guys" deliberately transform violating image files in various ways to "fool" the image review device and then spread the violating image files on the Internet. Image files can be transformed in many ways, for example through basic transformations such as rotation, liquefaction, deformation, noise, and rendering, or combinations of them. Transforming violating image files before uploading them to the Internet therefore poses a very large technical challenge to the image review device.
Summary of the invention
The image review method and device, equipment, and storage medium provided in the embodiments of this application are implemented as follows:
The image review method provided by the embodiments of this application includes: using a target classification model to perform feature extraction on an image file to be reviewed to obtain a corresponding feature vector, wherein the target classification model is obtained by training with multiple sample image files and corresponding multiple image transformation files; determining the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in a review set; and determining, according to the relationship between the determined similarity and a first threshold, whether the image file to be reviewed is a violation file.
The image review device provided by the embodiments of this application includes: a feature extraction module, configured to use a target classification model to perform feature extraction on an image file to be reviewed to obtain a corresponding feature vector, wherein the target classification model is obtained by training with multiple sample image files and corresponding multiple image transformation files; a first determining module, configured to determine the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in a review set; and a review module, configured to determine, according to the relationship between the determined similarity and a first threshold, whether the image file to be reviewed is a violation file.
The electronic device provided by the embodiments of this application includes a memory and a processor. The memory stores a computer program that can run on the processor, and the processor, when executing the program, implements the steps in the image review method described in any of the embodiments of this application.
The computer-readable storage medium provided by the embodiments of this application has a computer program stored thereon, and when the computer program is executed by a processor, it implements the steps in the image review method described in any of the embodiments of this application.
In the embodiments of this application, the electronic device uses the target classification model to perform feature extraction on the image file to be reviewed to obtain the corresponding feature vector, wherein the target classification model is obtained by training with multiple sample image files and corresponding multiple image transformation files. In this way, even if the image file to be reviewed is a file obtained by applying transformations such as rotation, liquefaction, and deformation to an original file, a feature vector consistent with that of the original file can still be extracted, so that arbitrarily transformed image files can be accurately identified, which in turn enhances the robustness of the image review method.
Description of the Drawings
FIG. 1 is a schematic diagram of an exemplary application scenario of the image review method according to an embodiment of this application;
FIG. 2 is a schematic flowchart of an implementation of the image review method according to an embodiment of this application;
FIG. 3 is a schematic diagram of the training process of the target classification model according to an embodiment of this application;
FIG. 4 is a schematic flowchart of an implementation of the method for generating a review set according to an embodiment of this application;
FIG. 5 is a schematic flowchart of an implementation of the method for determining the first threshold according to an embodiment of this application;
FIG. 6 is a schematic flowchart of an implementation of another image review method according to an embodiment of this application;
FIG. 7A is a schematic structural diagram of MobileNetV2 according to an embodiment of this application;
FIG. 7B is a schematic structural diagram of the feature extraction structure according to an embodiment of this application;
FIG. 8 is a schematic flowchart of an implementation of a further image review method according to an embodiment of this application;
FIG. 9 is a schematic flowchart of an implementation of yet another image review method according to an embodiment of this application;
FIG. 10 is a schematic flowchart of an implementation of another image review method according to an embodiment of this application;
FIG. 11 is a schematic flowchart of an implementation of a further image review method according to an embodiment of this application;
FIG. 12 is a schematic diagram of transformation operations performed on an original picture according to an embodiment of this application;
FIG. 13 is a schematic structural diagram of the simplified MobileNetV2 according to an embodiment of this application;
FIG. 14 is a schematic diagram of the curve of the sigmoid function;
FIG. 15 is a schematic flowchart of picture matching according to an embodiment of this application;
FIG. 16 shows the recall and wrong_recall corresponding to candidate thresholds from 35 to 70 according to an embodiment of this application;
FIG. 17 shows the recall and wrong_recall corresponding to candidate thresholds from 50 to 55 according to an embodiment of this application;
FIG. 18 is a schematic flowchart of the picture review system according to an embodiment of this application;
FIG. 19 is a schematic flowchart of the Mobilehashnet algorithm in the picture review system according to an embodiment of this application;
FIG. 20A is a schematic structural diagram of an image file review apparatus according to an embodiment of this application;
FIG. 20B is a schematic structural diagram of another image file review apparatus according to an embodiment of this application;
FIG. 21 is a schematic diagram of the hardware entity of an electronic device according to an embodiment of this application.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the specific technical solutions of this application are described in further detail below with reference to the drawings of the embodiments. The following embodiments illustrate this application and do not limit its scope.
Unless otherwise defined, all technical and scientific terms used herein have the meanings commonly understood by those skilled in the technical field of this application. The terms used herein serve only to describe the embodiments of this application and are not intended to limit it.
The following description refers to "some embodiments", which describe a subset of all possible embodiments. It should be understood that "some embodiments" may refer to the same subset or to different subsets of all possible embodiments, and that these may be combined with one another where no conflict arises.
It should be noted that the terms "first\second\third" in the embodiments of this application merely distinguish similar or different objects and do not imply a particular ordering of those objects. Where permitted, "first\second\third" may be interchanged in a specific order or sequence, so that the embodiments described herein can be implemented in an order other than the one illustrated or described.
An exemplary application scenario of the image review method provided by the embodiments of this application is described first.
FIG. 1 is a schematic diagram of an exemplary application scenario 100 of the image review method provided by an embodiment of this application. As shown in FIG. 1, the scenario 100 includes a terminal 101, an image review apparatus 102, and a second database 103. The image review apparatus 102 reviews the image file 104 that a user inputs at the terminal 101 to determine whether the file is a violation file. If it is a violation file, storing the file in the second database 103 is prohibited. Otherwise, that is, if the file is a compliant file, the file is allowed to be stored in the second database 103 so that this user or other users can retrieve, browse, or download it.
It should be noted that the terminal 101 may be a mobile terminal with wireless communication capability, such as a mobile phone, a tablet computer, or a notebook computer, or it may be a less portable device with computing capability, such as a desktop computer.
The image review apparatus 102 may be configured in the terminal 101 or independently of it. The application scenario 100 may contain one or more image review apparatuses 102. Multiple image review apparatuses 102 can review image files input by different users in parallel, which increases data processing speed.
The second database 103 may be configured independently of the image review apparatus 102 and the terminal 101. Alternatively, when the image review apparatus 102 is configured on the network side, the second database 103 may be configured in the image review apparatus 102.
When the terminal 101, the image review apparatus 102, and the second database 103 reside on separate devices, the terminal 101 and the image review apparatus 102 may communicate over a network, as may the image review apparatus 102 and the second database 103. The network may be wireless or wired; the embodiments of this application do not specifically limit the communication mode.
An embodiment of this application provides an image review method that can be applied to an electronic device having an image review apparatus. The electronic device may be a computer device, a notebook computer, any node server in a distributed computing architecture, a mobile terminal, or the like. The functions of the image review method can be implemented by a processor in the electronic device invoking program code, and the program code can be stored in a computer storage medium. The electronic device therefore includes at least a processor and a storage medium.
FIG. 2 is a schematic flowchart of an implementation of the image review method according to an embodiment of this application. As shown in FIG. 2, the method may include the following steps 201 to 203:
Step 201: use a target classification model to extract features from the image file to be reviewed and obtain the corresponding feature vector, where the target classification model is trained on multiple sample image files and the corresponding image transformation files of multiple kinds.
It should be noted that the target classification model may be a deep learning model, for example a neural network model. The number of layers in the model is not limited. The model may be a lightweight neural network model such as MobileNetV2, or it may be a non-lightweight neural network model. The electronic device can train the target classification model through steps 301 to 304 of the embodiment below.
An image transformation file is a file obtained by applying to a sample image file a transformation such as flipping, rotation, liquefaction, scaling, cropping, mosaic, noise, color change, or occlusion, or a combination of these transformations.
The image file to be reviewed can take various forms, for example a single image or a video (such as a short video, a live video, a movie, or a TV series). When the image file to be reviewed is a video, the electronic device may randomly sample one or more video frame images from the video and then extract features from these images with the target classification model to obtain the feature vector corresponding to the video.
Step 202: determine the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in a review set.
Generally, to ensure review accuracy, the review set used when the image file to be reviewed is a single image may differ from the one used when it is a video. That is, when the image file to be reviewed is a single image, each reference feature vector in the corresponding review set is extracted by the electronic device from a single image; when the image file to be reviewed is a video, each reference feature vector in the corresponding review set is extracted by the electronic device from multiple images. In short, the dimension of the feature vector of the image file to be reviewed matches the dimension of the reference feature vectors. The method is not limited to this rule, however, and the two feature vectors may also have different dimensions.
The similarity can be represented by various types of parameters, for example the Hamming distance, the Euclidean distance, or the cosine similarity.
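The three measures named above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the function names and the toy vectors are hypothetical and are not taken from the application. Note that for the Hamming and Euclidean distances a smaller value means greater similarity, whereas for cosine similarity a larger value does.

```python
import math


def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length inputs")
    return sum(x != y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance between two vectors of the same dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy feature vectors (hypothetical values, for illustration only).
query = [1, 0, 1, 1]
reference = [1, 1, 1, 0]
print(hamming_distance(query, reference))    # 2 positions differ
print(euclidean_distance(query, reference))
print(cosine_similarity(query, reference))
```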
Step 203: determine whether the image file to be reviewed is a violation file according to the relationship between the determined similarity and a first threshold.
A review set generated from compliant reference image files (hereinafter, for brevity, the compliance set) and a review set generated from violating reference image files (hereinafter the violation set) call for different judgment criteria.
Take the case where the similarity is represented by the Hamming distance. The Hamming distance between two equal-length strings is the number of positions at which the corresponding characters differ. The smaller the Hamming distance, therefore, the more similar the two feature vectors and the more similar the two corresponding image files. For a violation set, in one example, the ratio of the number of similarities smaller than the first threshold to the total number of similarities is determined, and when this ratio is greater than a second threshold, the image file to be reviewed is determined to be a violation file. For a compliance set, in one example, when this ratio is greater than the second threshold, the image file to be reviewed is determined to be a compliant file.
There are various ways to determine whether the image file to be reviewed is a violation file. For example, the electronic device can do so through steps 604 to 606 of the embodiment below. As another example, the electronic device can do so through steps 802 to 809 of the embodiment below. Where the similarity represents the number of differing features between two feature vectors and the review set is a violation set, the electronic device may, each time it determines the similarity to a reference feature vector, count the similarities determined so far that are smaller than the first threshold; if this count is greater than or equal to a third threshold, the device stops computing similarities, determines that the image file to be reviewed is a violation file, and outputs this as the review result.
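The early-stopping variant described above can be sketched as follows. This is a minimal illustration assuming a violation set and Hamming distance as the similarity; the function name, parameter names, and toy vectors are hypothetical.

```python
def is_violation_early_stop(query, reference_vectors, first_threshold, third_threshold):
    """Return True as soon as enough close matches are found in a violation set.

    A match is "close" when the Hamming distance to a reference vector is
    smaller than first_threshold. Comparison stops once the running count of
    close matches reaches third_threshold.
    """
    close_matches = 0
    for reference in reference_vectors:
        distance = sum(q != r for q, r in zip(query, reference))
        if distance < first_threshold:
            close_matches += 1
            if close_matches >= third_threshold:
                return True  # stop early: remaining references are not compared
    return False


# Toy violation set (hypothetical values).
violation_set = [[1, 0, 1, 1], [1, 0, 1, 0], [0, 1, 0, 0]]
print(is_violation_early_stop([1, 0, 1, 1], violation_set, first_threshold=2, third_threshold=2))
```

In this toy run the first two references are at distances 0 and 1, so the count reaches the third threshold before the last reference is examined.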
As a further example, the electronic device can determine whether the image file to be reviewed is a violation file through steps 902 to 904 of the embodiment below.
In this embodiment, the electronic device uses the target classification model to extract features from the image file to be reviewed and obtain the corresponding feature vector, where the target classification model is trained on multiple sample image files and the corresponding image transformation files of multiple kinds. As a result, even if the image file to be reviewed was produced by applying transformations such as rotation, liquefaction, or deformation to an original file, the model can still extract a feature vector consistent with that of the original file. Arbitrarily transformed image files can therefore be identified accurately, which makes the image review method more robust.
In some embodiments, before reviewing the image file to be reviewed, the electronic device may train the target classification model, generate the review set, and determine the first threshold in advance, as follows.
The training process of the target classification model, as shown in FIG. 3, may include the following steps 301 to 304. It should be noted that the electronic device may perform steps 301 to 304 before extracting features from the image file to be reviewed, or when it is configured with the image review function.
Step 301: obtain the type label of each sample image file.
The sample image files include violating image files and compliant image files. Violating image files may be, for example, files related to terror, violence, pornography, or gambling. Compliant image files may be, for example, files related to natural scenery or buildings. The electronic device may sample some violating sample files from a first database that collects a wide variety of violating image files, and sample some compliant sample files from a second database that collects a wide variety of compliant image files.
To reduce the labeling workload for the sample image files, typically only a certain number of image files are selected from the first database and the second database as sample files. For example, 100 violating images and 100 compliant images are selected from the two databases as sample image files.
Step 302: transform each sample image file according to multiple transformation rules to obtain the image transformation file set of that file.
Transformation rules can take many forms. For example, the basic transformation rules include flipping, rotation, liquefaction, scaling, cropping, mosaic, noise, color change, and occlusion. A combined transformation rule is a combination of at least two basic transformation rules. Taking the 9 basic transformation rules above as an example, there are 502 combined transformation rules, namely
$$\sum_{k=2}^{9} C_9^k = C_9^2 + C_9^3 + \cdots + C_9^9 = 502$$
In one example, the electronic device may transform a sample image file according to 100 different transformation rules to obtain the 100 image transformation files corresponding to that file.
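The count of 502 can be checked by enumerating every unordered combination of at least two of the nine basic rules. The sketch below assumes that a combined rule is an unordered set of basic rules, which is what the stated count implies; the rule names are illustrative labels only.

```python
from itertools import combinations
from math import comb

# Illustrative labels for the nine basic transformation rules.
BASIC_RULES = [
    "flip", "rotate", "liquify", "scale", "crop",
    "mosaic", "noise", "color_change", "occlusion",
]

# Every unordered combination of at least two basic rules.
combined_rules = [
    combo
    for k in range(2, len(BASIC_RULES) + 1)
    for combo in combinations(BASIC_RULES, k)
]

# Same count via binomial coefficients: 2^9 minus the empty set and the 9 singletons.
closed_form = sum(comb(len(BASIC_RULES), k) for k in range(2, len(BASIC_RULES) + 1))

print(len(combined_rules), closed_form)  # 502 502
```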
Step 303: assign the type label of each sample image file to every image transformation file in the corresponding image transformation file set.
The type label of an image file after transformation should be the same as before transformation. For example, if a violating image file is liquefied, the liquefied file is still a violation; its nature does not change. The type label of each image transformation file can therefore be the same as that of the sample image file from which it was derived.
Step 304: train a specific neural network model on each sample image file, each image transformation file, and their respective type labels to obtain the target classification model.
In this embodiment, each sample image file is transformed according to multiple transformation rules to obtain the image transformation file set of that file; the type label of each sample image file is assigned to every image transformation file in the corresponding set; and a specific neural network model is trained on each sample image file, each image transformation file, and their respective type labels to obtain the target classification model.
On the one hand, the training samples therefore include image transformation files obtained by applying multiple transformations to the sample image files, which enriches the diversity of the training samples and makes the trained target classification model more robust. When image files are reviewed with this model, the review withstands transformed files: even if a user flips, rotates, scales, crops, or applies a mosaic to a file before inputting it, the model can still accurately extract the feature vector of the transformed file, so whether the file is a violation file can be identified accurately. Put simply, the model extracts essentially the same feature vector from an image file before and after transformation, so the electronic device can accurately identify whether a file is a violation file even when the input is a transformed file.
On the other hand, in this embodiment the type label of each sample image file is assigned to every image transformation file in the corresponding set. This reduces manual labeling cost while preserving the diversity of the training samples, since no type label needs to be assigned by hand to each image transformation file. By transforming the sample image files, the electronic device automatically obtains a large and varied set of training samples.
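Steps 301 to 303 can be sketched as a small data-preparation routine in which every transformed copy inherits the label of its source file. This is a toy illustration only: images are represented as nested lists, and the two transformations shown (a horizontal flip and a 90-degree clockwise rotation) are hypothetical stand-ins for the much larger rule set described above. Model training (step 304) is not shown.

```python
def flip_horizontal(image):
    """Mirror each row of a nested-list image."""
    return [row[::-1] for row in image]


def rotate_90(image):
    """Rotate a nested-list image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


# Hypothetical rule table: two basic rules and one combined rule.
TRANSFORMS = {
    "flip": flip_horizontal,
    "rotate": rotate_90,
    "flip+rotate": lambda image: rotate_90(flip_horizontal(image)),
}


def build_training_set(labeled_samples, transforms=TRANSFORMS):
    """Return (image, label) pairs for each sample and each transformed copy.

    The label is annotated once per sample and copied to every transformed
    file, so no transformed file needs manual labeling.
    """
    training_set = []
    for image, label in labeled_samples:
        training_set.append((image, label))
        for transform in transforms.values():
            training_set.append((transform(image), label))
    return training_set


samples = [([[1, 2], [3, 4]], "violation"), ([[5, 6], [7, 8]], "compliant")]
training_set = build_training_set(samples)
print(len(training_set))  # 2 samples x (1 original + 3 transformed) = 8
```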
In some embodiments, the electronic device may load the generated review set into a cache in advance. The timing of the loading is not limited. For example, the electronic device may load the generated review set before using the target classification model to extract features from the image file to be reviewed. As another example, it may load the review set after extracting features from the image file to be reviewed and before determining the similarity between the feature vector of that file and at least one reference feature vector in the review set. As a further example, it may load the review set when it is configured with the image review function.
In some embodiments, the method for generating the review set, as shown in FIG. 4, may include the following steps 401 and 402:
Step 401: use the target classification model to extract features from multiple reference image files and obtain the feature vector of each file.
In some embodiments, the multiple reference image files may be violation files, for example all or some of the files in the first database, in which case the resulting review set is a violation set. In other embodiments, the multiple reference image files may be compliant files, for example all or some of the files in the second database. As mentioned above, review sets of different natures, namely compliance sets and violation sets, call for different judgment criteria in the image review stage.
When the multiple reference image files are only some of the files in a database, they may be files that the electronic device draws at random from the database, or representative files in the database, such as files with a relatively high priority.
Step 402: use the feature vector of each reference image file as a reference feature vector to generate the review set.
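Steps 401 and 402 amount to mapping a feature extractor over the reference files once and keeping the results. The sketch below assumes a hypothetical `extract_features` callable in place of the trained target classification model; the toy extractor (one bit per pixel, set when the pixel is above the image mean) exists only so the example runs and is not the method of the application.

```python
def toy_extractor(image):
    """Hypothetical stand-in for the trained model: one bit per pixel."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def build_review_set(reference_files, extract_features):
    """Step 401: extract a feature vector per reference file.
    Step 402: collect those vectors as the reference feature vectors."""
    return [extract_features(image) for image in reference_files]


# Toy reference files (hypothetical values).
reference_files = [[[1, 2], [3, 4]], [[9, 1], [1, 9]]]
review_set = build_review_set(reference_files, toy_extractor)
print(review_set)  # [[0, 0, 1, 1], [1, 0, 0, 1]]
```

Because the review set depends only on the reference files and the model, it can be computed once and kept in memory, which is the caching behavior the surrounding text describes.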
In this embodiment, the review set is loaded into the cache in advance. While reviewing the image file to be reviewed, the electronic device therefore does not need to extract features from the multiple reference image files to generate the review set; it simply uses the pre-generated review set. This saves the time that feature extraction would consume and thus shortens the image review.
In some embodiments, the electronic device may load the determined first threshold into the cache in advance. The timing of the loading is not limited. For example, the electronic device may load the determined first threshold before determining whether the image file to be reviewed is a violation file. As another example, it may load the first threshold before extracting features from the image file to be reviewed. As a further example, it may load the first threshold when it is configured with the image review function.
In some embodiments, the method for determining the first threshold, as shown in FIG. 5, may include the following steps 501 to 503:
Step 501: taking each of multiple different candidate thresholds in turn as the first threshold, determine according to the image review method whether each of multiple verification image files is a violation file, thereby obtaining the review result set corresponding to each candidate threshold.
In some embodiments, the multiple verification image files may include violating image files and compliant image files. The verification image files are different from the files used to train the neural network model. The multiple verification image files may also include files obtained by the electronic device applying various transformations to original image files. The transformation rules used may be the same as those used in the model training stage.
Performing step 501 yields the review result set obtained with each candidate threshold, as shown in Table 1, where the review result set corresponding to candidate threshold 1 is the content of the second column of Table 1.
Table 1
 | Candidate threshold 1 | Candidate threshold 2 | ... | Candidate threshold N
Verification image file 1 | 1 | 1 | ... | 1
Verification image file 2 | 0 | 1 | ... | 1
... | ... | ... | ... | ...
Verification image file M | 1 | 1 | ... | 0
In the candidate threshold columns, "1" indicates that the review result of the corresponding file is a compliant file, and "0" indicates that the review result of the corresponding file is a violation file.
Step 502: determine the correct recall rate and the error recall rate under each candidate threshold according to the corresponding review result set and the type label of each verification image file.
In one example, the correct recall rate is calculated by formula (1):
Figure PCTCN2020092923-appb-000002
The error recall rate is calculated by formula (2):
Figure PCTCN2020092923-appb-000003
In formulas (1) and (2), TN denotes the number of violation files reviewed as violation files; FP denotes the number of violation files reviewed as compliant files; and FN denotes the number of compliant files reviewed as violation files.
Step 503: determine, as the first threshold, the candidate threshold whose correct recall rate and error recall rate satisfy a specific condition.
The choice of candidate threshold as the first threshold directly determines the recognition accuracy of the image review method. The candidate threshold should therefore be chosen so that the error recall rate is as low as possible while a high correct recall rate is maintained. For example, among the candidate thresholds whose correct recall rate is greater than or equal to a minimum correct recall rate (such as 0.85), the one with the smallest error recall rate is selected as the first threshold.
In some embodiments, the electronic device may use a grid search to approach the optimal value step by step and thereby select the first threshold from the multiple candidate thresholds.
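The selection rule in step 503 can be sketched as a filter followed by a minimum. The sketch assumes that the correct recall rate and error recall rate have already been computed for each candidate threshold; the function name is hypothetical, and the numbers in the example are made up for illustration and are not results from the application.

```python
def select_first_threshold(metrics, min_recall=0.85):
    """Pick the candidate with the lowest error recall among those whose
    correct recall is at least min_recall.

    metrics maps each candidate threshold to a (recall, wrong_recall) pair.
    Returns None when no candidate reaches min_recall.
    """
    eligible = {
        threshold: rates
        for threshold, rates in metrics.items()
        if rates[0] >= min_recall
    }
    if not eligible:
        return None
    return min(eligible, key=lambda threshold: eligible[threshold][1])


# Made-up (recall, wrong_recall) values for three candidate thresholds.
example_metrics = {50: (0.80, 0.01), 52: (0.86, 0.02), 54: (0.90, 0.05)}
print(select_first_threshold(example_metrics))  # 52
```

A grid search could call this twice: once over a coarse range of candidates and once over a finer range around the coarse winner.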
An embodiment of this application provides a further image review method. FIG. 6 is a schematic flowchart of the implementation of the image review method according to an embodiment of this application. As shown in FIG. 6, the method may include the following steps 601 to 606:
Step 601: obtain the feature vector extraction structure of the target classification model, where the feature vector extraction structure includes the layers of the target classification model from the input layer through the nonlinear activation layer, and the target classification model is obtained by training on multiple sample image files and their corresponding multiple image transformation files.
For example, the target classification model may be the lightweight neural network model MobileNetV2. As shown in FIG. 7A, the structure of the network includes a "bottleneck structure", a conv2d layer, a sigmoid activation layer, an n×1-dimensional fully connected (Dense) layer, and a normalized exponential (softmax) layer. In some embodiments, as shown in FIG. 7B, the "bottleneck structure", the conv2d layer, and the sigmoid activation layer may be used as the feature vector extraction structure.
Step 602: use the feature vector extraction structure to perform feature extraction on the image file to be reviewed, obtaining the corresponding feature vector.
In other words, the output of the nonlinear activation layer of the feature vector extraction structure is the feature vector of that file.
Step 603: determine the similarity between the feature vector of the image file to be reviewed and each reference feature vector in the review set, where the similarity represents the number of features that differ between the two feature vectors;
Step 604: determine the number of similarities that are less than the first threshold.
For example, the similarity is the Hamming distance.
Step 605: determine the ratio of that number to the total number of similarities;
Step 606: determine, according to the relationship between the ratio and a second threshold, whether the image file to be reviewed is a violation file.
Understandably, when the review set is a violation set, the image file to be reviewed is determined to be a violation file if the ratio is greater than the second threshold, and a compliant file if the ratio is less than or equal to the second threshold.
When the review set is a compliant set, the image file to be reviewed is determined to be a compliant file if the ratio is greater than the second threshold, and a violation file if the ratio is less than or equal to the second threshold.
In this embodiment of the application, the number of similarities less than the first threshold is counted, the ratio of that number to the total number of determined similarities is computed, and whether the image file to be reviewed is a violation file is determined from the relationship between the ratio and the second threshold. Compared with deriving the review result from the similarity to a single reference feature vector, the review result obtained this way is more reliable and the recognition accuracy is higher.
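Steps 603 to 606 can be sketched as follows for the case where the review set is a violation set and the similarity is the Hamming distance. The function names and the sample vectors are hypothetical and only illustrate the decision rule.

```python
from typing import Sequence


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length bit vectors differ."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_violation_by_ratio(query: Sequence[int],
                          violation_set: Sequence[Sequence[int]],
                          first_threshold: int,
                          second_threshold: float) -> bool:
    """Steps 603-606, assuming every reference vector comes from a violation file."""
    if not violation_set:
        raise ValueError("the review set must not be empty")
    # Step 603: one distance per reference feature vector.
    distances = [hamming_distance(query, ref) for ref in violation_set]
    # Step 604: how many distances are below the first threshold.
    close = sum(d < first_threshold for d in distances)
    # Step 605: ratio of close references to all references.
    ratio = close / len(distances)
    # Step 606: compare the ratio with the second threshold.
    return ratio > second_threshold


query = [1, 0, 1, 1]
violation_set = [[1, 0, 1, 1], [1, 0, 1, 0], [0, 1, 0, 0]]  # distances 0, 1, 4
print(is_violation_by_ratio(query, violation_set, first_threshold=2, second_threshold=0.5))
```

For a compliant review set the final comparison would be interpreted the other way round, as described above.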
An embodiment of this application provides a further image review method. FIG. 8 is a schematic flowchart of the implementation of the image review method according to an embodiment of this application. As shown in FIG. 8, the method may include the following steps 801 to 809:
Step 801: use the target classification model to perform feature extraction on the image file to be reviewed, obtaining the corresponding feature vector, where the target classification model is obtained by training on multiple sample image files and their corresponding multiple image transformation files.
Understandably, a target classification model usually consists of multiple sequentially connected layers. The first layer generally takes the image as input and extracts features from it through specific operations. Each subsequent layer takes the features extracted by the previous layer as input and transforms them in a specific way to obtain somewhat more complex features. This hierarchical feature extraction can be stacked, which gives the neural network its strong feature extraction capability. After many layers of transformation, the neural network has turned the initial input image into higher-level abstract features.
This process of abstraction, from simple to complex and from low level to high level, can be understood through an everyday example. When learning English, letters combine into words, words combine into sentences, analyzing sentences reveals their semantics, and analyzing the semantics reveals the idea or purpose being expressed. Semantics and ideas of this kind are higher-level abstractions.
Therefore, in the embodiments of this application, when the target classification model extracts features from the image file to be reviewed, the extracted feature vector remains essentially unchanged no matter how complex the transformations that were applied to the original file to produce the file under review. This makes the image review method robust: even if a violation file is transformed before being uploaded to the network, it can still be identified accurately.
Step 802: determine the similarity between the feature vector of the image file to be reviewed and the i-th reference feature vector in the review set, where i is greater than 0 and less than or equal to the total number of reference feature vectors in the review set;
Step 803: determine, according to the relationship between the similarity corresponding to the i-th reference feature vector and the first threshold, whether the image file to be reviewed is a violation file; if so, perform step 804; otherwise, perform step 807;
The similarity corresponding to the i-th reference feature vector is the similarity between the feature vector of the image file to be reviewed and the i-th reference feature vector.
Step 804: update the first determination count, that is, the number of times the image file to be reviewed has been determined to be a violation file;
Step 805: determine whether the first determination count is greater than a third threshold; if so, perform step 806; otherwise, increment i by 1 and return to step 802;
Step 806: output that the image file to be reviewed is a violation file.
Understandably, once the first determination count is greater than the third threshold, the image file to be reviewed can reliably be determined to be a violation file. There is then no need to compute the similarity between its feature vector and the remaining reference feature vectors, which saves computation and shortens the review time.
For example, suppose the review set contains 10,000 reference feature vectors, the third threshold is 900, and the similarity is expressed as the Hamming distance. Suppose further that when the similarity corresponding to the 1000th reference feature vector has been computed, the first determination count is 901; that is, 901 of the similarities corresponding to the 1st to 1000th reference feature vectors are less than the first threshold. The image review process can end at this point and output the review result that the image file to be reviewed is a violation file, without computing the similarity to the remaining 9,000 reference feature vectors.
Step 807: update the second determination count, that is, the number of times the image file to be reviewed has been determined to be a compliant file;
Step 808: determine whether the second determination count is greater than a fourth threshold; if so, perform step 809; otherwise, increment i by 1 and return to step 802;
In some embodiments, the fourth threshold is greater than the third threshold. This reduces the false detection rate for violation files.
Step 809: output that the image file to be reviewed is a compliant file.
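The early-exit loop of steps 802 to 809 can be sketched as follows, assuming a violation review set and Hamming distances that are computed lazily. The function name, the threshold values other than 900, and the "undecided" outcome for an exhausted review set are assumptions, since the text does not say what happens when neither count exceeds its threshold.

```python
from typing import Iterable, Tuple


def review_with_early_exit(distances: Iterable[int],
                           first_threshold: int,
                           third_threshold: int,
                           fourth_threshold: int) -> Tuple[str, int]:
    """Return (result, number of reference vectors examined)."""
    violation_count = 0   # first determination count (steps 804-805)
    compliant_count = 0   # second determination count (steps 807-808)
    examined = 0
    for distance in distances:            # step 802, one reference at a time
        examined += 1
        if distance < first_threshold:    # step 803: looks like a violation
            violation_count += 1
            if violation_count > third_threshold:
                return "violation", examined      # step 806
        else:
            compliant_count += 1
            if compliant_count > fourth_threshold:
                return "compliant", examined      # step 809
    return "undecided", examined  # assumption: not specified in the text


def example_distances():
    # Mirrors the worked example: 10,000 references, third threshold 900.
    yield from [100] * 99     # 99 references that are not close
    yield from [0] * 901      # 901 close references; the last one is reference no. 1000
    yield from [100] * 9000   # never examined


result = review_with_early_exit(example_distances(), first_threshold=52,
                                third_threshold=900, fourth_threshold=5000)
print(result)  # ('violation', 1000)
```

Because the distances come from a generator, the remaining 9,000 comparisons are never carried out, which is the saving the example above describes.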
An embodiment of this application provides a further image review method. FIG. 9 is a schematic flowchart of the implementation of the image review method according to an embodiment of this application. As shown in FIG. 9, the method may include the following steps 901 to 904:
Step 901: use the target classification model to perform feature extraction on the image file to be reviewed, obtaining the corresponding feature vector, where the target classification model is obtained by training on multiple sample image files and their corresponding multiple image transformation files;
Step 902: determine the similarity between the feature vector of the image file to be reviewed and the i-th reference feature vector in the review set, where i is greater than 0 and less than or equal to the total number of reference feature vectors in the review set, the reference image file corresponding to each reference feature vector is a violation file, and the similarity represents the number of features that differ between the two feature vectors;
Step 903: determine whether the similarity corresponding to the i-th reference feature vector is less than the first threshold; if so, perform step 904; otherwise, increment i by 1 and return to step 902;
Step 904: determine that the image file to be reviewed is a violation file, and output this review result.
Compared with steps 802 to 809 above, here the review process ends as soon as the similarity corresponding to the i-th reference feature vector is less than the first threshold, and the review result that the image file to be reviewed is a violation file is output; otherwise, the next reference feature vector is examined, until the image file to be reviewed is determined to be a violation file. Of course, in some embodiments, if every reference feature vector in the review set has been traversed and each corresponding similarity is greater than or equal to the first threshold, the review result that the image file to be reviewed is a compliant file is output.
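Steps 902 to 904, together with the fallback to a compliant result once the whole review set has been traversed, can be sketched as follows. The function name and the sample vectors are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def review_first_match(query: Sequence[int],
                       violation_set: Sequence[Sequence[int]],
                       first_threshold: int) -> Tuple[str, Optional[int]]:
    """Stop at the first reference vector closer than first_threshold.

    Returns the review result and the 1-based index i of the matching
    reference vector (None when no reference matched)."""
    for i, ref in enumerate(violation_set, start=1):
        # Step 902: Hamming distance to the i-th reference feature vector.
        distance = sum(x != y for x, y in zip(query, ref))
        # Step 903: a distance below the first threshold ends the review.
        if distance < first_threshold:
            return "violation", i          # step 904
    # Every similarity was >= first_threshold: report a compliant file.
    return "compliant", None


refs = [[0, 0, 0, 0], [1, 1, 0, 1], [1, 0, 1, 1]]
print(review_first_match([1, 0, 1, 1], refs, first_threshold=1))
```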
In the related art, whether an input picture (that is, the picture to be reviewed) is a violation is determined by computing its similarity to the pictures in a violation gallery (that is, one example of the first database). Commonly used similarity algorithms include the perceptual hash (pHash) algorithm and the Scale-Invariant Feature Transform (SIFT) algorithm.
The pHash algorithm is a manually designed, rule-based algorithm. Its basic principle is to obtain the hash value of the input picture and then compute the hash "distance" between the input picture and a picture in the violation gallery, which gives the similarity of the two pictures; when the similarity is greater than a set threshold, the match is considered successful. The algorithm is implemented as follows:
Reduce the size of the input picture; simplify the colors of the reduced picture; compute the average value of the simplified picture; compare the gray level of each pixel with the average; compute the hash value from the comparison results; compute the Hamming distance between this hash value and that of a picture in the violation gallery; when the Hamming distance is less than the set threshold, the match is considered successful and the input picture is a violation picture.
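The hashing steps listed above can be sketched in plain Python as follows. The steps as listed resemble what is often called an average hash; widely used pHash implementations typically add a discrete cosine transform, which is not shown here. The sketch assumes the color simplification has already been done, so the input is a 2D grid of gray levels whose height and width are multiples of the target size; all function names are hypothetical.

```python
def shrink(gray, size=8):
    """Downscale a 2D gray-level grid to size x size by block averaging."""
    block_h, block_w = len(gray) // size, len(gray[0]) // size
    return [[sum(gray[r * block_h + i][c * block_w + j]
                 for i in range(block_h) for j in range(block_w)) / (block_h * block_w)
             for c in range(size)]
            for r in range(size)]


def average_hash(gray, size=8):
    """Pack one bit per reduced pixel: 1 if the pixel is at least the mean."""
    pixels = [p for row in shrink(gray, size) for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p >= mean else 0)
    return bits


def hash_distance(h1, h2):
    """Hamming distance between two packed hashes."""
    return bin(h1 ^ h2).count("1")


# A 16x16 gradient, a uniformly brighter copy, and an inverted copy.
image = [[r * 16 + c for c in range(16)] for r in range(16)]
brighter = [[p + 10 for p in row] for row in image]
inverted = [[255 - p for p in row] for row in image]

print(hash_distance(average_hash(image), average_hash(brighter)))  # 0
print(hash_distance(average_hash(image), average_hash(inverted)))  # 64
```

The brighter copy hashes identically because every pixel and the mean shift by the same amount, while the inverted copy flips every bit; this illustrates why a hand-designed hash tolerates some transformations and fails on others.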
The SIFT algorithm detects and describes local features in a picture. It searches for extreme points in scale space and extracts their position, scale, and rotation invariants. Describing and detecting local features helps to identify objects; SIFT features are based on interest points of local appearance on the object and are independent of the size and rotation of the picture.
However, the algorithm factors (that is, the picture feature extraction operators) of both the pHash algorithm and the SIFT algorithm are designed by hand, so each algorithm only suits specific matching scenarios. The pHash algorithm is invariant only to scaling and color change; the SIFT algorithm is invariant only to rotation, scaling, brightness change, affine transformation, and noise.
On this basis, an exemplary application of the embodiments of this application in a practical application scenario is described below.
The end-to-end deep learning matching algorithm mainly uses a neural network model to compute directly whether two pictures match. As shown in FIG. 10, the implementation is divided into a training phase and a prediction phase. The basic flow of the training phase includes the following steps 1001 to 1004:
Step 1001: design the model structure (including convolutional layers, fully connected layers, pooling layers, and so on) to obtain the initial similarity model, that is, a neural network model;
Step 1002: prepare a large amount of picture data as training samples;
Step 1003: perform data augmentation on each picture in the training samples, for example by rotating, mirroring, and rendering the picture; two pictures obtained from the same picture through different transformations are combined into a positive sample (1), and other transformed pictures serve as negative samples (0).
Step 1004: update the initial similarity model using an optimization algorithm from the gradient descent family and the augmented training samples, obtaining the trained similarity model, that is, the target classification model.
The basic flow of the prediction phase, as shown in FIG. 10, includes steps 1005 to 1007:
Step 1005: compute the similarity between the input picture and each picture in the violation gallery;
Step 1006: determine whether the ratio of the number of similarities less than the first threshold to the total number of similarities is greater than the second threshold; if so, perform step 1007;
Step 1007: consider the match successful and determine that the input picture is a violation picture.
In the end-to-end deep learning matching algorithm, the deep learning model contains multiple convolution kernels learned by gradient descent. These kernels are highly expressive of picture features and cover essentially all picture transformation scenarios. However, in the prediction phase each input picture has to be matched against every picture in the gallery in a loop; combined with the computational cost of the neural network model itself, the resource consumption is unacceptable.
In the embodiments of this application, the characteristics of hashing and deep learning are combined: a deep neural network extracts picture features to obtain a picture hash, which is one example of a feature vector, and the similarity of two picture hashes is compared to determine whether the match succeeds.
The implementation flow of the image review method provided by the embodiments of this application is described in detail below. As shown in FIG. 11, the flow may include the following Step 1 to Step 4:
Step 1) Data preparation. Prepare 200 original pictures. As shown in FIG. 12, apply picture transformation operations such as flipping, rotation, scaling, cropping, liquefying, mosaic, noise, color change, and occlusion, or combinations of them, to each original picture. Applying 100 different transformation operations to each picture yields 20,000 samples in total.
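The data preparation step can be sketched as follows on tiny gray-level grids. Only three of the listed transformations are shown, the helper names are hypothetical, and the random compositions are not checked for uniqueness, so this is an illustration of the idea rather than the applicant's pipeline. Each transformed picture inherits the index of its original picture as its class label.

```python
import random


def flip_horizontal(img):
    return [row[::-1] for row in img]


def rotate_90(img):
    # Rotate clockwise: reverse the row order, then transpose.
    return [list(col) for col in zip(*img[::-1])]


def add_noise(img, rng, amplitude=10):
    # Perturb each gray level and clamp it to the range 0..255.
    return [[min(255, max(0, p + rng.randint(-amplitude, amplitude))) for p in row]
            for row in img]


def augment(originals, variants_per_image, seed=0):
    """Return (transformed picture, label) pairs; the label is the original's index."""
    rng = random.Random(seed)
    ops = [flip_horizontal, rotate_90, lambda im: add_noise(im, rng)]
    samples = []
    for label, img in enumerate(originals):
        for _ in range(variants_per_image):
            out = img
            # Apply a random combination of one or more transformations.
            for op in rng.sample(ops, k=rng.randint(1, len(ops))):
                out = op(out)
            samples.append((out, label))
    return samples


originals = [[[0, 50], [100, 150]], [[200, 10], [20, 30]], [[5, 5], [250, 250]]]
samples = augment(originals, variants_per_image=4)
print(len(samples))   # 3 originals x 4 variants = 12
print(200 * 100)      # the counts in the text: 200 originals x 100 variants = 20000
```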
Step 2) Design the model. The lightweight deep neural network MobileNetV2 is chosen as the feature extractor. Before the model is trained, the MobileNetV2 network structure is modified. The original structure of MobileNetV2 is shown in Table 2 below, where the column "Input" is the input size of the layer, "Operator" is the structure type of the layer, "c" is the number of output feature channels of the layer, "n" is the number of times the layer is repeated, and "s" is the stride of the depthwise convolution kernel.
Table 2

Num  Input      Operator        c     n  s
1    224²×3     Conv2d          32    1  2
2    112²×32    bottleneck      16    1  1
3    112²×16    bottleneck      24    2  2
4    56²×24     bottleneck      32    3  2
5    28²×32     bottleneck      64    4  2
6    14²×64     bottleneck      96    3  1
7    14²×96     bottleneck      160   3  2
8    7²×160     bottleneck      320   1  1
9    7²×320     Conv2d 1×1      1280  1  1
10   7²×1280    Avgpool 7×7     -     1  -
11   1×1×1280   Conv2d 1×1      k     -  -
12   k×1        Active-Softmax  k     -  -
The input size of layer 11 of MobileNetV2 is fixed at 1×1×1280. It applies k convolution kernels of size 1×1 and thus outputs a one-dimensional vector of length k. Finally, a softmax activation layer is connected to compute the probabilities of the k classes.
For ease of description, layers 1 to 10 in Table 2 are referred to collectively as the "bottleneck structure". The simplified MobileNetV2 structure is shown in FIG. 13.
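As a consistency check on Table 2, the input sizes can be traced through rows 1 to 9. The sketch assumes that the stride s is applied once per row (by the first of the n repeated layers) and that padding keeps the spatial size at the input size divided by the stride; the names `ROWS` and `trace_shapes` are hypothetical.

```python
# (operator, output channels c, repeats n, stride s) for rows 1 to 9 of Table 2
ROWS = [
    ("conv2d", 32, 1, 2),
    ("bottleneck", 16, 1, 1),
    ("bottleneck", 24, 2, 2),
    ("bottleneck", 32, 3, 2),
    ("bottleneck", 64, 4, 2),
    ("bottleneck", 96, 3, 1),
    ("bottleneck", 160, 3, 2),
    ("bottleneck", 320, 1, 1),
    ("conv2d 1x1", 1280, 1, 1),
]


def trace_shapes(size=224, channels=3):
    """Return the (spatial size, channels) pair seen at the input of each row,
    followed by the output of the last row."""
    shapes = [(size, channels)]
    for _operator, c, _n, s in ROWS:
        size = -(-size // s)   # ceiling division: the stride shrinks the spatial size
        channels = c
        shapes.append((size, channels))
    return shapes


for row_number, (size, channels) in enumerate(trace_shapes(), start=1):
    print(f"input of row {row_number}: {size}x{size}x{channels}")
```

The last line printed, 7x7x1280, is the input of the 7×7 average pooling in row 10, which in turn produces the 1×1×1280 input of row 11.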
The MobileNetV2 structure is modified as follows: a sigmoid activation layer and an n×1-dimensional fully connected (Dense) layer are added between the conv2d layer and the softmax layer. The modified MobileNetV2 structure is shown in FIG. 7A.
Step 3) Model training.
The 20,000 pictures obtained in Step 1 are used as training samples and the 200 original pictures as the labels of the training samples to train a picture classification model, that is, the specific neural network model. In terms of FIG. 7A, k = 200 and n is the dimension of the hash to be encoded (for example, 300). The modified MobileNetV2 classification model shown in FIG. 7A is trained.
The loss function of the model is the multi-class cross-entropy loss (categorical_crossentropy), the optimization algorithm is Adam, the learning rate is fixed at 0.001, and the trained model reaches an accuracy above 99.5%.
Step 4) Matching.
The model parameters obtained in Step 3 are loaded. To obtain the hash value of a picture, the last two layers of the model, namely the Dense layer and the softmax layer, are removed; the resulting model is shown in FIG. 7B. For ease of description, this model is called "Mobilehashnet", which is one example of the feature vector extraction structure. The picture review method implemented on the basis of this model is called the Mobilehashnet algorithm.
At this point the output of the model is a one-dimensional vector of length n (for example, 300), as shown in FIG. 14. Because the activation function is sigmoid, each output value lies in the range (0, 1). The output is then binarized, with values below 0.5 mapped to 0 and values above 0.5 mapped to 1, which finally yields a hash vector of length 300 whose elements are 0 or 1, that is, the feature vector.
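The binarization rule can be sketched as follows. The text does not say how an output of exactly 0.5 is treated; mapping it to 0 is an assumption made here, and the function name is hypothetical.

```python
from typing import List, Sequence


def binarize(activations: Sequence[float], cutoff: float = 0.5) -> List[int]:
    """Map each sigmoid output to a hash bit: 1 if above the cutoff, else 0.

    Assumption: a value exactly equal to the cutoff becomes 0."""
    return [1 if a > cutoff else 0 for a in activations]


print(binarize([0.91, 0.08, 0.5, 0.73]))  # [1, 0, 0, 1]
```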
It should be noted that the extracted feature vector is called a hash vector because, even when the input picture is a transformed version of an original picture, the feature vector extracted by Mobilehashnet remains consistent with the feature vector of the original picture.
As shown in FIG. 15, once the hash vectors of picture 1 and picture 2 have been obtained, the Hamming distance between the two pictures can be computed from them. The smaller the distance, the more similar the two pictures. Matching can be implemented by specifying a first threshold: when the Hamming distance is below the first threshold, the two pictures are considered to be the same picture and the match succeeds; otherwise, the match fails.
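The comparison of two hash vectors can be sketched as follows. Packing the bits into a Python integer so that the distance becomes an XOR followed by a bit count is an implementation choice made here, not something the text prescribes; the function names are hypothetical.

```python
from typing import Sequence


def pack_bits(bits: Sequence[int]) -> int:
    """Pack a 0/1 hash vector into a single integer, first bit most significant."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def hamming(h1: int, h2: int) -> int:
    """Number of differing bits between two packed hashes."""
    return bin(h1 ^ h2).count("1")


def is_match(h1: int, h2: int, first_threshold: int) -> bool:
    """The pictures match when the Hamming distance is below the first threshold."""
    return hamming(h1, h2) < first_threshold


hash_1 = pack_bits([1, 0, 1, 1, 0, 0, 1, 0])
hash_2 = pack_bits([1, 0, 0, 1, 0, 1, 1, 0])   # differs in two positions
print(hamming(hash_1, hash_2))                   # 2
print(is_match(hash_1, hash_2, first_threshold=3))
```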
It should be noted that the first threshold has to be obtained in advance through validation. The validation set is prepared in the same way as the training set described above: several pictures that are not in the training set are prepared and augmented, and the correct recall rate (recall) and false recall rate (wrong_recall) of the matching model are computed for different candidate thresholds.
A good matching model should keep the false recall rate as low as possible while maintaining the correct recall rate. In some embodiments, a grid search can be used to approach the optimal value step by step. The grid search results are shown in FIG. 16 and FIG. 17: FIG. 16 shows recall and wrong_recall for candidate thresholds from 35 to 70, and FIG. 17 shows them for candidate thresholds from 50 to 55.
In one example, the candidate threshold 52 gives recall = 0.85 and wrong_recall = 0.15, which is a good choice. The reason is that, provided recall is greater than or equal to 0.85, the smaller wrong_recall is the better, so the candidate threshold with the smallest wrong_recall can be taken as the first threshold.
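The validation sweep can be sketched as follows. The text does not define recall and wrong_recall precisely; the sketch assumes that recall is the fraction of picture pairs that should match whose Hamming distance is below the threshold, and wrong_recall is the fraction of pairs that should not match whose distance is below the threshold. The function name and the validation data are made up for illustration.

```python
from typing import List, Tuple


def recall_rates(pairs: List[Tuple[int, bool]], threshold: int) -> Tuple[float, float]:
    """pairs holds (Hamming distance, should_match) for each validation pair.

    Assumes at least one pair of each kind is present."""
    positives = [d for d, should_match in pairs if should_match]
    negatives = [d for d, should_match in pairs if not should_match]
    recall = sum(d < threshold for d in positives) / len(positives)
    wrong_recall = sum(d < threshold for d in negatives) / len(negatives)
    return recall, wrong_recall


# Made-up validation distances, for illustration only.
pairs = [(10, True), (30, True), (48, True), (60, True),
         (40, False), (70, False), (90, False), (120, False)]

# Coarse grid over candidate thresholds 35, 40, ..., 70.
for threshold in range(35, 71, 5):
    recall, wrong_recall = recall_rates(pairs, threshold)
    print(threshold, recall, wrong_recall)
```

A finer grid, for example every integer from 50 to 55, would then be evaluated around the most promising coarse candidate.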
The hash dimension directly determines the number of convolution kernels of the 2D convolutional layer (conv2d 1×1) in the modified MobileNetV2 structure and the output dimension n of the activation layer. Because these layers sit at the end of the network, their size directly affects the learning capacity of the model. A hash dimension that is too small leads to underfitting and lowers the limit on the size of the gallery; a dimension that is too large increases both the time needed to generate a hash and the time needed to compute the Hamming distance. A reasonable hash dimension therefore has to be chosen.
In one example, the hash dimension n is set to 1.5 times the number of original pictures (that is, the number of classes): n = 1.5 × 200 = 300.
Understandably, compared with relying on purely hand-designed calculation factors, Mobilehashnet extracts picture features with a deep neural network and therefore has a performance advantage in theory. To show this more directly, the matching performance of the Mobilehashnet algorithm was compared with that of the pHash algorithm and the SIFT algorithm under different picture transformations. The experimental results are shown in Table 3.
Table 3
[Table 3 appears as images in the original publication: Figure PCTCN2020092923-appb-000004 and Figure PCTCN2020092923-appb-000005.]
The comparison in Table 3 shows that the pHash algorithm is essentially unable to match pictures under transformations such as flipping, rotation, and scaling, and that the recall of the SIFT algorithm is low for every type of picture transformation. By contrast, the Mobilehashnet algorithm of the embodiments of this application reaches a recall of 100% under flipping, distortion, cutting, mosaic, and noise, and for the remaining transformations its recall is high and its wrong_recall is low.
In the related art, a picture classification model is trained on a large number of manually labeled samples. The Mobilehashnet algorithm provided by the embodiments of this application can be trained without large-scale manual labeling, because a large number of training samples are obtained automatically through picture data augmentation.
The Mobilehashnet algorithm provided by the embodiments of this application extracts picture features with a deep neural network, generates picture hashes from these features, and performs picture matching. Compared with related picture matching and similarity algorithms, it effectively raises the correct recall rate, lowers the false recall rate, and does not require large amounts of manually labeled data.
A picture review system reviews pictures uploaded by users to prevent the spread of large numbers of illegal or non-compliant pictures. Because picture content is complex, the picture review system flow, as shown in FIG. 18, includes a violation gallery matching model, a picture classification model, a face recognition model, a text recognition model, and a text classification model. The picture to be reviewed passes through each model in turn. Only when the results of all models are "normal" is the review result "normal", that is, the picture is compliant; otherwise, it is a violation picture.
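The sequential review flow can be sketched as follows. The checker functions stand in for the five models and are placeholders; in a real system each would wrap the corresponding model.

```python
from typing import Callable, Sequence

# A checker returns True when its model considers the picture "normal".
Checker = Callable[[bytes], bool]


def review_picture(picture: bytes, checkers: Sequence[Checker]) -> str:
    """Run the checkers in order; the picture is "normal" only if all of them pass."""
    for check in checkers:
        if not check(picture):
            return "violation"   # one failing model is enough
    return "normal"


# Placeholder checkers, in the order given in the text.
def gallery_match_ok(picture: bytes) -> bool:
    return True


def classification_ok(picture: bytes) -> bool:
    return True


def face_recognition_ok(picture: bytes) -> bool:
    return True


def text_recognition_ok(picture: bytes) -> bool:
    return True


def text_classification_ok(picture: bytes) -> bool:
    return True


pipeline = [gallery_match_ok, classification_ok, face_recognition_ok,
            text_recognition_ok, text_classification_ok]
print(review_picture(b"example bytes", pipeline))
```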
The violation gallery matching model in the picture review system can be implemented with the Mobilehashnet algorithm provided by the embodiments of this application, which gives matching a high correct recall rate and a low false recall rate. The implementation flow of the algorithm is shown in FIG. 19: extract the hash vector of the picture to be reviewed; determine the Hamming distance between that hash vector and each hash vector in the violation hash library corresponding to the violation gallery, that is, compute the Hamming distances in batch; and determine whether each Hamming distance is greater than the first threshold, thereby obtaining the recall result, namely the correct recall rate and the false recall rate.
In some embodiments, the violation hash library can be built when the system is initialized, so matching requires only one hash computation, namely feature extraction on the picture to be reviewed.
Based on the foregoing embodiments, the image file review apparatus provided by the embodiments of this application, including its modules and the units contained in each module, may be implemented by a processor in a terminal, or of course by specific logic circuits. In implementation, the processor may be a central processing unit (CPU), a microprocessor unit (MPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), or the like.
图20A为本申请实施例影像文件审核装置的结构示意图,如图20A所示,所述装置200包括特征提取模块201、第一确定模块202和审核模块203,其中:FIG. 20A is a schematic structural diagram of an image file review device according to an embodiment of the application. As shown in FIG. 20A, the device 200 includes a feature extraction module 201, a first determination module 202, and an review module 203, wherein:
特征提取模块201,配置为利用目标分类模型对待审影像文件进行特征提取,得到对应的特征向量;其中,所述目标分类模型是通过多个样本影像文件和对应的多种影像变换文件训练得到的;The feature extraction module 201 is configured to use the target classification model to perform feature extraction on the image file to be reviewed to obtain the corresponding feature vector; wherein the target classification model is obtained through training of multiple sample image files and corresponding multiple image transformation files ;
第一确定模块202,配置为确定所述待审影像文件的特征向量与审核集合中的至少一个参考特征向量之间的相似度;The first determining module 202 is configured to determine the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in the review set;
审核模块203,配置为根据确定的所述相似度与第一阈值之间的关系,确定所述待审影像文件是否是违规文件。The review module 203 is configured to determine whether the image file to be reviewed is a violation file according to the determined relationship between the similarity and the first threshold.
在一些实施例中,特征提取模块201,配置为:获取所述目标分类模型的特征向量提取结构,所述特征向量提取结构包括所述目标分类模型的输入层至非线性激活层;其中,所述目标分类模型的类型为神经网络模型;利用所述特征向量提取结构,对所述待审影像文件进行特征提取,得到对应的特征向量。In some embodiments, the feature extraction module 201 is configured to obtain a feature vector extraction structure of the target classification model, and the feature vector extraction structure includes the input layer to the non-linear activation layer of the target classification model; The type of the target classification model is a neural network model; the feature vector extraction structure is used to perform feature extraction on the image file to be reviewed to obtain a corresponding feature vector.
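One way to picture the feature vector extraction structure is as the trained classifier truncated after its non-linear activation layer. The sketch below is only an illustration under stated assumptions: the toy three-layer model, the layer names, the weights, and the 0.5 binarization cut-off are all hypothetical and are not taken from the application.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Sequence[float]], Vector]


def dense(weights: Sequence[Sequence[float]], bias: Sequence[float]) -> Layer:
    """Build a fully connected layer from a weight matrix and a bias vector."""
    def apply(x: Sequence[float]) -> Vector:
        return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    return apply


def sigmoid(x: Sequence[float]) -> Vector:
    """Element-wise non-linear activation."""
    return [1.0 / (1.0 + math.exp(-v)) for v in x]


# Hypothetical toy classifier: input -> hidden -> activation -> classifier head.
full_model: List[Tuple[str, Layer]] = [
    ("hidden", dense([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]], [0.0, -1.0, 0.0])),
    ("activation", sigmoid),
    ("classifier", dense([[1.0, 1.0, 1.0]], [0.0])),
]


def feature_extractor(model: List[Tuple[str, Layer]], last_layer: str) -> List[Layer]:
    """Keep the layers from the input up to and including the named layer."""
    kept: List[Layer] = []
    for name, layer in model:
        kept.append(layer)
        if name == last_layer:
            return kept
    raise ValueError(f"layer {last_layer!r} not found")


def extract_hash(layers: List[Layer], x: Sequence[float]) -> List[int]:
    """Run the truncated model, then binarize at 0.5 (an assumed cut-off)."""
    out: Sequence[float] = x
    for layer in layers:
        out = layer(out)
    return [1 if v >= 0.5 else 0 for v in out]


extractor = feature_extractor(full_model, "activation")
bits = extract_hash(extractor, [2.0, 1.0])
```

The classifier head is needed only during training; at review time the truncated structure alone produces the vector that is compared against the review set.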
In some embodiments, as shown in FIG. 20B, the image review apparatus 200 further includes: a label acquisition module 204, configured to obtain the type label of each sample image file; a transformation processing module 205, configured to transform each sample image file according to multiple transformation rules to obtain the image transformation file set of the corresponding file; a labeling module 206, configured to assign the type label of each sample image file to every image transformation file in the corresponding image transformation file set; and a model training module 207, configured to train a specific neural network model on each sample image file, each image transformation file, and their respective type labels to obtain the target classification model.
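The label acquisition, transformation, and labeling modules together amount to building an augmented training set in which every transformed file inherits its source file's label. A minimal sketch follows, assuming images are small grayscale pixel grids; the three transformation rules (`flip_horizontal`, `rotate_90`, `brighten`) and the label strings are hypothetical examples, since this passage does not name the rules.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]  # grayscale pixels, row-major


def flip_horizontal(img: Image) -> Image:
    """Mirror each row."""
    return [row[::-1] for row in img]


def rotate_90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def brighten(img: Image, delta: int = 10) -> Image:
    """Add a constant to every pixel, clipped at 255."""
    return [[min(255, p + delta) for p in row] for row in img]


# Hypothetical transformation rules.
TRANSFORM_RULES: Dict[str, Callable[[Image], Image]] = {
    "flip": flip_horizontal,
    "rotate": rotate_90,
    "brighten": brighten,
}


def build_training_set(samples: List[Tuple[Image, str]]) -> List[Tuple[Image, str]]:
    """Return each sample plus its transformed copies, all sharing the sample's label."""
    training: List[Tuple[Image, str]] = []
    for image, label in samples:
        training.append((image, label))
        for rule in TRANSFORM_RULES.values():
            training.append((rule(image), label))
    return training


samples = [([[0, 50], [100, 250]], "violating"), ([[5, 5], [5, 5]], "normal")]
training_set = build_training_set(samples)
```

Training on the originals and their transformed copies under one label encourages the model to produce nearby feature vectors for an image and its edited variants.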
In some embodiments, the review module 203 is configured to: determine the number of similarities that are less than the first threshold, where the similarity characterizes the number of differing features between two feature vectors; determine the ratio of that number to the total number of similarities; and determine, according to the relationship between the ratio and a second threshold, whether the image file to be reviewed is a violating file.
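The ratio-based decision can be sketched as below. The text states only that the decision depends on the relationship between the ratio and the second threshold; treating a ratio above the second threshold as a violation is an assumption made for this illustration, as are the function name and the sample values.

```python
from typing import Sequence


def is_violating_by_ratio(
    distances: Sequence[int], first_threshold: int, second_threshold: float
) -> bool:
    """Flag the file when enough reference vectors are close to it.

    `distances` holds the number of differing features against each reference
    vector. Assumption: a ratio greater than `second_threshold` means violation.
    """
    if not distances:
        return False
    close = sum(1 for d in distances if d < first_threshold)
    ratio = close / len(distances)
    return ratio > second_threshold


# Two of the four distances are below 3, so the ratio is 0.5.
flagged = is_violating_by_ratio([1, 7, 2, 9], first_threshold=3, second_threshold=0.4)
```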
In some embodiments, the first determining module 202 is configured to determine the similarity between the feature vector of the image file to be reviewed and the i-th reference feature vector in the review set, where i is greater than 0 and less than or equal to the total number of reference feature vectors in the review set; the similarity characterizes the number of differing features between two feature vectors, and the reference image file corresponding to each reference feature vector is a violating file. Correspondingly, the review module 203 is configured to determine that the image file to be reviewed is a violating file when the similarity corresponding to the i-th reference feature vector is less than the first threshold.
In some embodiments, the first determining module 202 is further configured to: when the similarity corresponding to the i-th reference feature vector is greater than or equal to the first threshold, determine the similarity between the feature vector of the image file to be reviewed and the (i+1)-th reference feature vector in the review set, so as to determine whether the image file to be reviewed is a violating file.
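The two paragraphs above describe a sequential scan with early exit: stop at the first reference vector that is close enough, otherwise move on to the next one. A minimal sketch, assuming hash vectors packed into integers; the function name `first_match` and the sample values are hypothetical.

```python
from typing import Optional, Sequence


def first_match(query: int, review_set: Sequence[int], first_threshold: int) -> Optional[int]:
    """Return the 1-based index i of the first reference vector whose distance
    to the query is below the threshold, or None if no reference matches."""
    for i, ref in enumerate(review_set, start=1):
        distance = bin(query ^ ref).count("1")  # number of differing features
        if distance < first_threshold:
            return i  # violating file: no need to check i+1 onward
    return None


review_set = [0b01001101, 0b10110110, 0b10110010]
match_index = first_match(0b10110011, review_set, first_threshold=3)
```

Compared with computing every distance up front, this variant does less work whenever a match occurs early in the review set.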
In some embodiments, as shown in FIG. 20B, the image review apparatus 200 further includes a loading module 208, configured to load the generated review set. Correspondingly, the feature extraction module 201 is further configured to: perform feature extraction on multiple reference image files using the target classification model to obtain the feature vector of each file; and use the feature vector of each reference image file as a reference feature vector to generate the review set.
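Generating the review set once and loading it later might look like the sketch below. It is illustrative only: `toy_extract` is a placeholder standing in for the target classification model, and storing the set as a JSON file named `review_set.json` is an assumption, not something the application specifies.

```python
import json
import os
import tempfile
from typing import Callable, List, Sequence

HashVector = List[int]


def toy_extract(pixels: Sequence[int]) -> HashVector:
    """Placeholder extractor: one bit per pixel, set when the pixel is >= the mean."""
    mean = sum(pixels) / len(pixels)
    return [1 if p >= mean else 0 for p in pixels]


def build_review_set(
    reference_files: Sequence[Sequence[int]],
    extract: Callable[[Sequence[int]], HashVector],
) -> List[HashVector]:
    """Extract one reference feature vector per reference image file."""
    return [extract(f) for f in reference_files]


def save_review_set(review_set: List[HashVector], path: str) -> None:
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(review_set, fh)


def load_review_set(path: str) -> List[HashVector]:
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


# Hypothetical reference images, flattened to four pixels each.
reference_files = [[10, 200, 30, 220], [90, 80, 70, 60]]
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "review_set.json")
    save_review_set(build_review_set(reference_files, toy_extract), path)
    loaded_review_set = load_review_set(path)
```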
In some embodiments, the loading module 208 is configured to load the determined first threshold.
Correspondingly, the apparatus further includes a second determining module, configured to: with the first threshold assumed to take each of multiple different candidate thresholds in turn, use the feature extraction module, the first determining module, and the review module of the apparatus to determine whether each of multiple validation image files is a violating file, thereby obtaining the set of review results corresponding to each candidate threshold; determine, according to each set of review results and the type label of each validation image file, the correct recall rate and the false recall rate under the corresponding candidate threshold; and determine as the first threshold the candidate threshold whose correct recall rate and false recall rate satisfy a specific condition.
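The threshold search can be sketched as a sweep over candidate thresholds on a labeled validation set. Several details here are assumptions made for illustration: the correct recall rate is taken as the fraction of violating validation files that are flagged, the false recall rate as the fraction of non-violating files that are flagged, a file is flagged when its smallest distance is below the threshold, and the "specific condition" is modeled as a minimum correct recall plus a maximum false recall.

```python
from typing import List, Optional, Sequence, Tuple


def recall_rates(
    distances_per_file: Sequence[Sequence[int]], labels: Sequence[bool], threshold: int
) -> Tuple[float, float]:
    """Return (correct recall rate, false recall rate) for one candidate threshold.

    `labels[k]` is True when validation file k is a violating file.
    """
    flagged = [min(ds) < threshold for ds in distances_per_file]
    positives = sum(1 for is_bad in labels if is_bad)
    negatives = len(labels) - positives
    true_hits = sum(1 for f, is_bad in zip(flagged, labels) if f and is_bad)
    false_hits = sum(1 for f, is_bad in zip(flagged, labels) if f and not is_bad)
    correct = true_hits / positives if positives else 0.0
    false = false_hits / negatives if negatives else 0.0
    return correct, false


def choose_first_threshold(
    distances_per_file: Sequence[Sequence[int]],
    labels: Sequence[bool],
    candidates: Sequence[int],
    min_correct: float,
    max_false: float,
) -> Optional[int]:
    """Smallest candidate meeting the assumed condition, or None if none does."""
    for t in sorted(candidates):
        correct, false = recall_rates(distances_per_file, labels, t)
        if correct >= min_correct and false <= max_false:
            return t
    return None


# Hypothetical validation data: distances of each file to the review set.
validation_distances: List[List[int]] = [[1], [2], [4], [3], [8], [9]]
validation_labels = [True, True, True, False, False, False]
chosen = choose_first_threshold(
    validation_distances, validation_labels, [2, 3, 5], min_correct=0.6, max_false=0.1
)
```

Raising the threshold flags more files, so the correct recall rate and the false recall rate both rise; the condition picks the trade-off point.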
The above description of the apparatus embodiments is similar to the description of the method embodiments and has similar beneficial effects. For technical details not disclosed in the apparatus embodiments of this application, refer to the description of the method embodiments of this application.
It should be noted that, in the embodiments of this application, if the above image review method is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, in essence or in the part that contributes to the related art, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause an electronic device to execute all or part of the methods described in the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disc. Thus, the embodiments of this application are not limited to any specific combination of hardware and software.
Correspondingly, an embodiment of this application provides an electronic device. FIG. 21 is a schematic diagram of the hardware entity of the electronic device according to an embodiment of this application. As shown in FIG. 21, the electronic device 210 includes a memory 211 and a processor 212; the memory 211 stores a computer program that can run on the processor 212, and the processor 212 implements the steps of the image review method provided in the above embodiments when executing the program.
It should be noted that the memory 211 is configured to store instructions and applications executable by the processor 212, and can also cache data to be processed or already processed by the processor 212 and by the modules in the electronic device 210 (for example, image data, audio data, voice communication data, and video communication data). It can be implemented by flash memory (FLASH) or random access memory (RAM).
Correspondingly, an embodiment of this application provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the image review method provided in the above embodiments.
It should be pointed out that the above description of the storage medium, chip, and terminal device embodiments is similar to the description of the method embodiments and has similar beneficial effects. For technical details not disclosed in the storage medium, chip, and terminal device embodiments of this application, refer to the description of the method embodiments of this application.
It should be understood that references throughout the specification to "one embodiment", "an embodiment", or "some embodiments" mean that a specific feature, structure, or characteristic related to the embodiment is included in at least one embodiment of this application. Therefore, occurrences of "in one embodiment", "in an embodiment", or "in some embodiments" in various places throughout the specification do not necessarily refer to the same embodiment. In addition, these specific features, structures, or characteristics may be combined in one or more embodiments in any suitable manner. It should also be understood that, in the various embodiments of this application, the sequence numbers of the above processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of this application. The sequence numbers of the above embodiments are for description only and do not indicate the relative merits of the embodiments.
It should be noted that, in this document, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, method, article, or apparatus that includes that element.
In the several embodiments provided in this application, it should be understood that the disclosed device and method can be implemented in other ways. The embodiments of the touch screen system described above are merely illustrative. For example, the division into modules is only a division by logical function, and other divisions are possible in actual implementation: multiple modules or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or modules, and may be electrical, mechanical, or of another form.
The modules described above as separate components may or may not be physically separate, and the components displayed as modules may or may not be physical modules; they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of this application may all be integrated into one processing unit, each module may serve as a separate unit, or two or more modules may be integrated into one unit. The integrated modules may be implemented in the form of hardware or in the form of hardware plus software functional units.
A person of ordinary skill in the art can understand that all or part of the steps of the above method embodiments can be completed by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM), a magnetic disk, or an optical disc.
Alternatively, if the above integrated unit of this application is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, in essence or in the part that contributes to the related art, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause an electronic device to execute all or part of the methods described in the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a removable storage device, a ROM, a magnetic disk, or an optical disc.
The methods disclosed in the several method embodiments provided in this application can be combined arbitrarily, provided there is no conflict, to obtain new method embodiments.
The features disclosed in the several product embodiments provided in this application can be combined arbitrarily, provided there is no conflict, to obtain new product embodiments.
The features disclosed in the several method or device embodiments provided in this application can be combined arbitrarily, provided there is no conflict, to obtain new method embodiments or device embodiments.
The above are only implementations of this application, but the protection scope of this application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (18)

  1. An image review method, the method comprising:
    performing feature extraction on an image file to be reviewed using a target classification model to obtain a corresponding feature vector, wherein the target classification model is obtained by training on multiple sample image files and the corresponding multiple kinds of image transformation files;
    determining the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in a review set;
    determining, according to the relationship between the determined similarity and a first threshold, whether the image file to be reviewed is a violating file.
  2. The method according to claim 1, wherein the target classification model is a neural network model, and the performing feature extraction on the image file to be reviewed using the target classification model to obtain the corresponding feature vector comprises:
    obtaining a feature vector extraction structure of the target classification model, the feature vector extraction structure comprising the input layer through the non-linear activation layer of the target classification model;
    performing feature extraction on the image file to be reviewed using the feature vector extraction structure to obtain the corresponding feature vector.
  3. The method according to claim 1 or 2, wherein the training process of the target classification model comprises:
    obtaining the type label of each sample image file;
    transforming each sample image file according to multiple transformation rules to obtain the image transformation file set of the corresponding file;
    assigning the type label of each sample image file to every image transformation file in the corresponding image transformation file set;
    training a specific neural network model on each sample image file, each image transformation file, and their respective type labels to obtain the target classification model.
  4. The method according to claim 1, wherein the determining, according to the relationship between the determined similarity and the first threshold, whether the image file to be reviewed is a violating file comprises:
    determining the number of similarities that are less than the first threshold, wherein the similarity characterizes the number of differing features between two feature vectors;
    determining the ratio of the number to the total number of similarities;
    determining, according to the relationship between the ratio and a second threshold, whether the image file to be reviewed is a violating file.
  5. The method according to claim 1, wherein the similarity characterizes the number of differing features between two feature vectors, and the reference image file corresponding to the reference feature vector is a violating file;
    the determining the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in the review set comprises:
    determining the similarity between the feature vector of the image file to be reviewed and the i-th reference feature vector in the review set, wherein i is greater than 0 and less than or equal to the total number of reference feature vectors in the review set;
    correspondingly, the determining, according to the relationship between the determined similarity and the first threshold, whether the image file to be reviewed is a violating file comprises:
    determining that the image file to be reviewed is a violating file when the similarity corresponding to the i-th reference feature vector is less than the first threshold.
  6. The method according to claim 5, further comprising:
    when the similarity corresponding to the i-th reference feature vector is greater than or equal to the first threshold, determining the similarity between the feature vector of the image file to be reviewed and the (i+1)-th reference feature vector in the review set, so as to determine whether the image file to be reviewed is a violating file.
  7. The method according to any one of claims 1 to 6, further comprising: loading the generated review set;
    wherein the method for generating the review set comprises:
    performing feature extraction on multiple reference image files using the target classification model to obtain the feature vector of each file;
    using the feature vector of each reference image file as a reference feature vector to generate the review set.
  8. The method according to any one of claims 1 to 6, further comprising: loading the determined first threshold; wherein the method for determining the first threshold comprises:
    with the first threshold assumed to take each of multiple different candidate thresholds in turn, determining, according to the image review method, whether each of multiple validation image files is a violating file, thereby obtaining the set of review results corresponding to each candidate threshold;
    determining, according to each set of review results and the type label of each validation image file, the correct recall rate and the false recall rate under the corresponding candidate threshold;
    determining as the first threshold the candidate threshold whose correct recall rate and false recall rate satisfy a specific condition.
  9. An image review apparatus, comprising:
    a feature extraction module, configured to perform feature extraction on an image file to be reviewed using a target classification model to obtain a corresponding feature vector, wherein the target classification model is obtained by training on multiple sample image files and the corresponding multiple kinds of image transformation files;
    a first determining module, configured to determine the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in a review set;
    a review module, configured to determine, according to the relationship between the determined similarity and a first threshold, whether the image file to be reviewed is a violating file.
  10. The apparatus according to claim 9, wherein the feature extraction module is configured to:
    obtain a feature vector extraction structure of the target classification model, the feature vector extraction structure comprising the input layer through the non-linear activation layer of the target classification model, wherein the target classification model is a neural network model;
    perform feature extraction on the image file to be reviewed using the feature vector extraction structure to obtain the corresponding feature vector.
  11. The apparatus according to claim 9 or 10, further comprising:
    a label acquisition module, configured to obtain the type label of each sample image file;
    a transformation processing module, configured to transform each sample image file according to multiple transformation rules to obtain the image transformation file set of the corresponding file;
    a labeling module, configured to assign the type label of each sample image file to every image transformation file in the corresponding image transformation file set;
    a model training module, configured to train a specific neural network model on each sample image file, each image transformation file, and their respective type labels to obtain the target classification model.
  12. The apparatus according to claim 9, wherein the review module is configured to:
    determine the number of similarities that are less than the first threshold, wherein the similarity characterizes the number of differing features between two feature vectors;
    determine the ratio of the number to the total number of similarities;
    determine, according to the relationship between the ratio and a second threshold, whether the image file to be reviewed is a violating file.
  13. The apparatus according to claim 9, wherein:
    the first determining module is configured to determine the similarity between the feature vector of the image file to be reviewed and the i-th reference feature vector in the review set;
    wherein i is greater than 0 and less than or equal to the total number of reference feature vectors in the review set; the similarity characterizes the number of differing features between two feature vectors, and the reference image file corresponding to the reference feature vector is a violating file;
    correspondingly, the review module is configured to determine that the image file to be reviewed is a violating file when the similarity corresponding to the i-th reference feature vector is less than the first threshold.
  14. The apparatus according to claim 13, wherein the first determining module is further configured to:
    when the similarity corresponding to the i-th reference feature vector is greater than or equal to the first threshold, determine the similarity between the feature vector of the image file to be reviewed and the (i+1)-th reference feature vector in the review set, so as to determine whether the image file to be reviewed is a violating file.
  15. The apparatus according to any one of claims 9 to 14, further comprising:
    a loading module, configured to load the generated review set;
    correspondingly, the feature extraction module is further configured to: perform feature extraction on multiple reference image files using the target classification model to obtain the feature vector of each file; and use the feature vector of each reference image file as a reference feature vector to generate the review set.
  16. The apparatus according to any one of claims 9 to 14, further comprising:
    a loading module, configured to load the determined first threshold;
    correspondingly, the apparatus further comprises a second determining module, configured to: with the first threshold assumed to take each of multiple different candidate thresholds in turn, use the feature extraction module, the first determining module, and the review module of the apparatus to determine whether each of multiple validation image files is a violating file, thereby obtaining the set of review results corresponding to each candidate threshold; determine, according to each set of review results and the type label of each validation image file, the correct recall rate and the false recall rate under the corresponding candidate threshold; and determine as the first threshold the candidate threshold whose correct recall rate and false recall rate satisfy a specific condition.
  17. An electronic device, comprising a memory and a processor, the memory storing a computer program that can run on the processor, wherein the processor implements the steps of the image review method according to any one of claims 1 to 8 when executing the program.
  18. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the image review method according to any one of claims 1 to 8.
PCT/CN2020/092923 2020-05-28 2020-05-28 Image auditing method and apparatus, device, and storage medium WO2021237570A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080100202.7A CN115443490A (en) 2020-05-28 2020-05-28 Image auditing method and device, equipment and storage medium
PCT/CN2020/092923 WO2021237570A1 (en) 2020-05-28 2020-05-28 Image auditing method and apparatus, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021237570A1 (en) 2021-12-02

Family

ID=78745395

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/092923 WO2021237570A1 (en) 2020-05-28 2020-05-28 Image auditing method and apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN115443490A (en)
WO (1) WO2021237570A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116866666B (en) * 2023-09-05 2023-12-08 天津市北海通信技术有限公司 Video stream picture processing method and device in rail transit environment
CN117292395A (en) * 2023-09-27 2023-12-26 自然资源部地图技术审查中心 Training method and training device for drawing-examining model and drawing-examining method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101359372A (en) * 2008-09-26 2009-02-04 腾讯科技(深圳)有限公司 Training method and device of classifier, and method apparatus for recognising sensitization picture
CN108960782A (en) * 2018-07-10 2018-12-07 北京木瓜移动科技股份有限公司 content auditing method and device
CN109561322A (en) * 2018-12-27 2019-04-02 广州市百果园信息技术有限公司 A kind of method, apparatus, equipment and the storage medium of video audit
US10402699B1 (en) * 2015-12-16 2019-09-03 Hrl Laboratories, Llc Automated classification of images using deep learning—back end
CN110377775A (en) * 2019-07-26 2019-10-25 Oppo广东移动通信有限公司 A kind of picture examination method and device, storage medium
CN110738697A (en) * 2019-10-10 2020-01-31 福州大学 Monocular depth estimation method based on deep learning
CN111079816A (en) * 2019-12-11 2020-04-28 北京金山云网络技术有限公司 Image auditing method and device and server

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114443880A (en) * 2022-01-24 2022-05-06 南昌市安厦施工图设计审查有限公司 Picture examination method and picture examination system for large sample picture of fabricated building
CN114612839A (en) * 2022-03-18 2022-06-10 壹加艺术(武汉)文化有限公司 Short video analysis processing method, system and computer storage medium
CN114612839B (en) * 2022-03-18 2023-10-31 壹加艺术(武汉)文化有限公司 Short video analysis processing method, system and computer storage medium
CN115297360A (en) * 2022-09-14 2022-11-04 百鸣(北京)信息技术有限公司 Intelligent auditing system for multimedia software video uploading
CN115994772A (en) * 2023-02-22 2023-04-21 中信联合云科技有限责任公司 Book data processing method and system, book rapid goods laying method and electronic equipment
CN115994772B (en) * 2023-02-22 2024-03-08 中信联合云科技有限责任公司 Book data processing method and system, book rapid goods laying method and electronic equipment
CN116452836A (en) * 2023-05-10 2023-07-18 武汉精阅数字传媒科技有限公司 New media material content acquisition system based on image data processing
CN116452836B (en) * 2023-05-10 2023-11-28 杭州元媒科技有限公司 New media material content acquisition system based on image data processing

Also Published As

Publication number Publication date
CN115443490A (en) 2022-12-06

Similar Documents

Publication Publication Date Title
WO2021237570A1 (en) Image auditing method and apparatus, device, and storage medium
WO2020119350A1 (en) Video classification method and apparatus, and computer device and storage medium
WO2020199468A1 (en) Image classification method and device, and computer readable storage medium
CN107463605B (en) Method and device for identifying low-quality news resource, computer equipment and readable medium
US8818916B2 (en) System and method for linking multimedia data elements to web pages
US10621755B1 (en) Image file compression using dummy data for non-salient portions of images
US9576221B2 (en) Systems, methods, and devices for image matching and object recognition in images using template image classifiers
RU2668717C1 (en) Generation of marking of document images for training sample
US20230376527A1 (en) Generating congruous metadata for multimedia
Murray et al. A deep architecture for unified aesthetic prediction
CN110427895A (en) A kind of video content similarity method of discrimination based on computer vision and system
US10380267B2 (en) System and method for tagging multimedia content elements
CN111651636A (en) Video similar segment searching method and device
CN110163061B (en) Method, apparatus, device and computer readable medium for extracting video fingerprint
KR101647691B1 (en) Method for hybrid-based video clustering and server implementing the same
WO2021012493A1 (en) Short video keyword extraction method and apparatus, and storage medium
CN113434716B (en) Cross-modal information retrieval method and device
CN113221918B (en) Target detection method, training method and device of target detection model
Phadikar et al. Content-based image retrieval in DCT compressed domain with MPEG-7 edge descriptor and genetic algorithm
WO2021179631A1 (en) Convolutional neural network model compression method, apparatus and device, and storage medium
US10504002B2 (en) Systems and methods for clustering of near-duplicate images in very large image collections
US11537636B2 (en) System and method for using multimedia content as search queries
US20130191368A1 (en) System and method for using multimedia content as search queries
Kapadia et al. Improved CBIR system using Multilayer CNN
US20230222762A1 (en) Adversarially robust visual fingerprinting and image provenance models

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20937851; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
32PN EP: public notification in the EP bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 25.04.2023))
122 EP: PCT application non-entry in European phase (Ref document number: 20937851; Country of ref document: EP; Kind code of ref document: A1)