WO2021237570A1 - Image review method and apparatus, device, and storage medium - Google Patents


Info

Publication number
WO2021237570A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
file
feature vector
threshold
image file
Application number
PCT/CN2020/092923
Other languages
English (en)
Chinese (zh)
Inventor
罗茂
Original Assignee
深圳市欢太科技有限公司
Oppo广东移动通信有限公司
Application filed by 深圳市欢太科技有限公司 and Oppo广东移动通信有限公司
Priority to CN202080100202.7A (published as CN115443490A)
Priority to PCT/CN2020/092923 (published as WO2021237570A1)
Publication of WO2021237570A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition

Definitions

  • The embodiments of this application relate to Internet technology, and relate to, but are not limited to, image review methods and apparatuses, devices, and storage media.
  • The image review method provided by an embodiment of the application includes: performing feature extraction on an image file to be reviewed using a target classification model to obtain a corresponding feature vector, wherein the target classification model is obtained by training with multiple sample image files and corresponding multiple image transformation files; determining the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in a review set; and determining, according to the relationship between the determined similarity and a first threshold, whether the image file to be reviewed is a violation file.
  • The image review device includes: a feature extraction module configured to use a target classification model to perform feature extraction on an image file to be reviewed to obtain a corresponding feature vector, wherein the target classification model is obtained by training with a plurality of sample image files and corresponding multiple image transformation files; a first determining module configured to determine the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in a review set; and a review module configured to determine, according to the relationship between the determined similarity and a first threshold, whether the image file to be reviewed is a violation file.
  • the electronic device provided by an embodiment of the present application includes a memory and a processor.
  • the memory stores a computer program that can run on the processor.
  • When the processor executes the program, the steps in the image review method described in any embodiment of the present application are implemented.
  • the computer-readable storage medium provided by the embodiment of the present application has a computer program stored thereon, and when the computer program is executed by a processor, it implements the steps in any one of the image review methods described in the embodiment of the present application.
  • the electronic device uses the target classification model to perform feature extraction on the image file to be reviewed to obtain the corresponding feature vector; wherein, the target classification model is obtained through training of multiple sample image files and corresponding multiple image transformation files.
  • FIG. 1 is a schematic diagram of an exemplary application scenario of an image review method according to an embodiment of this application
  • FIG. 2 is a schematic diagram of the implementation process of the image review method according to the embodiment of the application.
  • FIG. 3 is a schematic diagram of the training process of the target classification model according to the embodiment of the application.
  • FIG. 4 is a schematic diagram of the implementation process of the method for generating a review set according to an embodiment of the application
  • FIG. 5 is a schematic diagram of an implementation process of a method for determining a first threshold value according to an embodiment of the application
  • FIG. 6 is a schematic diagram of the implementation process of another image review method according to an embodiment of the application.
  • FIG. 7A is a schematic structural diagram of MobileNetV2 according to an embodiment of the application.
  • FIG. 7B is a schematic structural diagram of a feature extraction structure according to an embodiment of the application.
  • FIG. 8 is a schematic diagram of the implementation process of another image review method according to an embodiment of the application.
  • FIG. 9 is a schematic diagram of the implementation process of yet another image review method according to an embodiment of the application.
  • FIG. 10 is a schematic diagram of the implementation process of another image review method according to an embodiment of the application.
  • FIG. 11 is a schematic diagram of the implementation process of another image review method according to an embodiment of the application.
  • FIG. 12 is a schematic diagram of a transformation operation performed on an original picture according to an embodiment of the application.
  • FIG. 13 is a simplified structural diagram of MobileNetV2 according to an embodiment of the application.
  • FIG. 14 is a schematic diagram of the curve of the sigmoid function;
  • FIG. 15 is a schematic diagram of a process of image matching according to an embodiment of the application.
  • FIG. 16 shows the correct recall rate (recall) and the false recall rate (wrong_recall) corresponding to candidate thresholds from 35 to 70 in an embodiment of the application;
  • FIG. 17 shows the correct recall rate (recall) and the false recall rate (wrong_recall) corresponding to candidate thresholds from 50 to 55 in an embodiment of the application;
  • FIG. 18 is a schematic flowchart of a picture review system according to an embodiment of the application.
  • FIG. 19 is a schematic diagram of the Mobilehashnet algorithm flow in the picture review system according to an embodiment of the application.
  • FIG. 20A is a schematic diagram of the structure of an image file review device according to an embodiment of the application.
  • FIG. 20B is a schematic structural diagram of another image file review device according to an embodiment of the application.
  • FIG. 21 is a schematic diagram of a hardware entity of an electronic device according to an embodiment of the application.
  • The terms "first", "second", and "third" in the embodiments of the present application are only used to distinguish similar or different objects, and do not represent a specific order of objects. Understandably, "first", "second", and "third" can be interchanged in a specific order or sequence when permitted, so that the embodiments of the present application described herein can be implemented in a sequence other than that illustrated or described herein.
  • FIG. 1 is a schematic diagram of an exemplary application scenario 100 of an image review method provided by an embodiment of the present application.
  • the scene 100 includes a terminal 101, an image review device 102 and a second database 103.
  • The image review device 102 is used to review the image file 104 input by the user at the terminal 101 to determine whether the file is a violation file. If it is a violation file, storing the file in the second database 103 is forbidden; otherwise, if it is not a violation file, that is, the file is a compliant file, the file is allowed to be stored in the second database 103 so that the user or other users can retrieve, browse, or download the file.
  • The terminal 101 may be a mobile terminal with wireless communication capabilities, such as a mobile phone, a tablet computer, or a notebook computer, or a device with computing functions that is inconvenient to move, such as a desktop computer.
  • the image review device 102 may be configured in the terminal 101, or may be configured independently of the terminal 101. There may be one or more image review devices 102 in the application scene 100. Multiple image review devices 102 can review the image files input by different users in parallel, thereby increasing the data processing speed.
  • the second database 103 can also be configured in the image reviewing device 102 when the image reviewing device 102 is configured on the network side.
  • When the terminal 101, the image review device 102, and the second database 103 are configured as independent devices, the terminal 101 and the image review device 102 can communicate through the network, and the image review device 102 and the second database 103 can also communicate with each other through the network.
  • the communication may be performed through a network, and the network may be a wireless network or a wired network, and the embodiment of the present application does not specifically limit the communication mode here.
  • the embodiment of the application provides an image review method, which can be applied to electronic equipment with an image review device.
  • The electronic equipment can be a computer device, a notebook computer, any node server in a distributed computing architecture, a mobile terminal, or the like.
  • the functions implemented by the image review method can be implemented by invoking program codes by the processor in the electronic device.
  • the program codes can be stored in a computer storage medium. It can be seen that the electronic device at least includes a processor and a storage medium.
  • FIG. 2 is a schematic diagram of the implementation process of the image review method according to the embodiment of the application. As shown in FIG. 2, the method may include the following steps 201 to 203:
  • Step 201 Use the target classification model to perform feature extraction on the image file to be reviewed to obtain the corresponding feature vector; wherein the target classification model is obtained through training of multiple sample image files and corresponding multiple image transformation files.
  • the target classification model may be a deep learning model, for example, a neural network model.
  • the model can be a lightweight neural network model, such as MobileNetV2.
  • the model can also be a non-lightweight neural network model.
  • The training of the target classification model can be implemented by the electronic device through steps 301 to 304 in the following embodiment.
  • The so-called image transformation file refers to a file obtained by performing transformation processing on a sample image file, such as flipping, rotation, liquefaction, scaling, cropping, mosaic, noise, color change, or occlusion, or a combination of these transformation methods.
  • the image file to be reviewed may be of various types.
  • the image file to be reviewed is an image or a piece of video (for example, a short video, a live video, a movie, a TV series, etc.).
  • the electronic device can randomly sample one or more video frame images from the video, and then perform feature extraction on these images through the target classification model to obtain the feature vector corresponding to the video.
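The frame-sampling step for video input can be sketched as follows; the function name, the number of sampled frames, and the use of Python's standard `random` module are illustrative assumptions, not part of the embodiment:

```python
import random

def sample_frame_indices(total_frames, num_samples, seed=None):
    """Randomly pick distinct frame indices from a video; the sampled
    frames would then be fed to the target classification model."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(total_frames), k=num_samples))

# E.g. pick 5 frames from a 3000-frame video.
print(sample_frame_indices(total_frames=3000, num_samples=5, seed=42))
```

The extracted per-frame feature vectors together form the feature representation of the video file.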
  • Step 202 Determine the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in the review set.
  • Depending on the type of the image file to be reviewed, the corresponding review set can be different. That is to say, when the image file to be reviewed is an image, the reference feature vectors in the corresponding review set are extracted by the electronic device from single images. When the image file to be reviewed is a piece of video, a reference feature vector in the corresponding review set is extracted by the electronic device from multiple images. In general, the dimension of the feature vector of the image file to be reviewed is consistent with the dimension of the reference feature vectors. Of course, it is not limited to this rule; the dimensions of the two feature vectors can also be different.
  • the parameter types that characterize the similarity can be varied, for example, it can be Hamming distance, Euclidean distance, or cosine similarity.
  • Step 203 Determine whether the to-be-reviewed image file is a violation file according to the determined relationship between the similarity and the first threshold.
  • The review set generated based on compliant reference image files (hereinafter referred to as the compliance set) and the review set generated based on violating reference image files (hereinafter referred to as the violation set) correspond to different judgment criteria.
  • the similarity is characterized by the Hamming distance.
  • the Hamming distance between two strings of equal length refers to the number of different characters at the corresponding positions of the two strings. Therefore, the smaller the Hamming distance, the more similar the two feature vectors, and the more similar the corresponding two image files.
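A minimal sketch of the Hamming-distance computation between two equal-length binary feature vectors (the vectors shown are made up for illustration):

```python
def hamming_distance(vec_a, vec_b):
    """Number of positions at which two equal-length binary vectors differ."""
    if len(vec_a) != len(vec_b):
        raise ValueError("feature vectors must have the same dimension")
    return sum(a != b for a, b in zip(vec_a, vec_b))

# Two 8-bit feature vectors differing in two positions.
v1 = [1, 0, 1, 1, 0, 0, 1, 0]
v2 = [1, 1, 1, 0, 0, 0, 1, 0]
print(hamming_distance(v1, v2))  # → 2
```

The smaller the distance, the more similar the two image files are.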
  • For the violation set, the ratio of the number of similarities less than the first threshold to the total number of similarities is determined, and when the ratio is greater than the second threshold, the image file to be reviewed is determined to be a violation file.
  • For the compliance set, in one example, when the ratio is greater than the second threshold, the image file to be reviewed is determined to be a compliance file.
  • This determination can be implemented by the electronic device through steps 604 to 606 in the following embodiment.
  • It can also be implemented by the electronic device through steps 802 to 809 in the following embodiment.
  • the similarity characterizes the number of different features between two feature vectors.
  • Suppose the review set is a violation set. Each time the electronic device determines the similarity with a reference feature vector, it counts the similarities obtained so far that are less than the first threshold. If the count is greater than or equal to the third threshold, the calculation of similarity is stopped, the image file to be reviewed is determined to be a violation file, and this is output as the review result.
  • the electronic device can also determine whether the image file to be reviewed is a violation file through steps 902 to 904 in the following embodiment.
  • The electronic device uses the target classification model to perform feature extraction on the image file to be reviewed to obtain the corresponding feature vector, wherein the target classification model is trained with multiple sample image files and corresponding multiple image transformation files. In this way, even if the image file to be reviewed has undergone multiple transformation processes, such as rotation, liquefaction, and deformation of the original file, the model can still extract a feature vector consistent with that of the original file, so as to accurately identify arbitrarily transformed image files and enhance the robustness of the image review method.
  • the electronic device may pre-train to obtain the target classification model, generate the review set, and determine the first threshold; wherein,
  • the following steps 301 to 304 may be included. It should be noted that the electronic device may perform the following steps 301 to 304 before performing feature extraction on the image file to be reviewed. The electronic device may also execute the following steps 301 to 304 when it is configured to have an image review function.
  • Step 301 Obtain the type label of each sample image file.
  • the sample image files include illegal image files and compliant image files.
  • Violation image files, for example, can be files related to terrorism, violence, pornography, and gambling.
  • Compliant image files for example, may be files related to natural scenery and buildings.
  • The electronic device can sample some violation sample files from a first database that collects a variety of violation image files, and sample some compliant sample files from a second database that collects a variety of compliant image files.
  • A certain number of image files are selected from the first database and the second database as the sample image files. For example, 100 violation images and 100 compliant images are selected from these two databases as sample image files.
  • Step 302 Perform transformation processing on each of the sample image files according to multiple transformation rules to obtain a set of image transformation files corresponding to the files.
  • the transformation rules can be various.
  • the basic transformation rules include flip, rotation, liquefaction, zoom, crop, mosaic, noise, color change, and occlusion.
  • The combined transformation rule is a combination of at least two basic transformation rules. Taking the above 9 basic transformation rules as an example, there are C(9,2) + C(9,3) + ... + C(9,9) = 2^9 - 9 - 1 = 502 combined transformation rules. In an example, the electronic device may perform transformation processing on the sample image file according to 100 different transformation rules to obtain 100 image transformation files corresponding to the file.
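The count of 502 combined rules can be checked directly by enumeration; the `BASIC_RULES` list and the enumeration below are illustrative:

```python
from itertools import combinations

BASIC_RULES = ["flip", "rotation", "liquefaction", "zoom", "crop",
               "mosaic", "noise", "color change", "occlusion"]

# A combined rule applies at least two of the 9 basic rules.
combined_rules = [c for k in range(2, len(BASIC_RULES) + 1)
                  for c in combinations(BASIC_RULES, k)]
print(len(combined_rules))  # → 502
```

This matches 2^9 minus the empty combination and the 9 single-rule combinations.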
  • Step 303 Assign the type label of each sample image file to each image transformation file in the corresponding image transformation file set.
  • The type labels of an image file after transformation and before transformation should be consistent. For example, if a violation image file has been liquefied, the liquefied file is still a violation file; its nature remains unchanged. Therefore, the type label of each image transformation file can be consistent with the type label of its corresponding sample image file.
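Step 303's label inheritance can be sketched as follows; the function name and the file names are hypothetical:

```python
def label_transformed_files(sample_labels, transforms_of):
    """Each image transformation file inherits the type label of the
    sample image file it was derived from (step 303)."""
    labels = dict(sample_labels)
    for sample, transformed_files in transforms_of.items():
        for tf in transformed_files:
            labels[tf] = sample_labels[sample]
    return labels

labels = label_transformed_files(
    {"img_001": "violation", "img_002": "compliant"},
    {"img_001": ["img_001_flip", "img_001_rotate"],
     "img_002": ["img_002_crop"]})
print(labels["img_001_rotate"])  # → violation
```

No manual labeling of the transformed files is required; only the sample files need labels.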
  • Step 304 Train a specific neural network model according to each of the sample image files, each of the image transformation files, and respective corresponding type labels to obtain the target classification model.
  • each sample image file is transformed according to multiple transformation rules to obtain the image transformation file set of the corresponding file; the type label of each sample image file is assigned to the corresponding image transformation file set Each of the image transformation files; according to each of the sample image files, each of the image transformation files and respective corresponding type tags, a specific neural network model is trained to obtain the target classification model.
  • the training samples include image transformation files obtained by performing multiple transformations on the sample image files, which can enrich the diversity of training samples and make the target classification model obtained by training have better robustness.
  • In this way, the model can accurately extract the feature vector of a transformed file, so as to accurately identify whether the file is a violation file.
  • The feature vectors extracted by this model from the image files before and after transformation processing are basically the same. Therefore, even if the input image file is a file after transformation processing, the electronic device can accurately identify whether the file is a violation file.
  • The type label of each sample image file is assigned to each image transformation file in the corresponding image transformation file set; in this way, under the premise of ensuring the diversity of training samples, manual labeling costs are reduced, as there is no need to manually label the type label of each image transformation file.
  • the electronic device can automatically obtain a large number of rich and diverse training samples by transforming and processing the sample image files.
  • The electronic device may load the generated review set into the cache in advance; there is no restriction on the timing of loading.
  • For example, the electronic device can load the generated review set before using the target classification model to extract features of the image file to be reviewed. For another example, the electronic device can load the generated review set after extracting the features of the image file to be reviewed but before determining the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in the review set. For another example, the electronic device can load the generated review set when it is configured to have the image review function.
  • The method for generating a review set may include the following steps 401 and 402:
  • Step 401 Using the target classification model, perform feature extraction on multiple reference image files to obtain feature vectors of corresponding files.
  • The multiple reference image files may be violation files, for example, all or part of the files in the first database; the review set obtained on this basis is the violation set.
  • The multiple reference image files may also be compliance files, for example, all or part of the files in the second database; the review set obtained on this basis is the compliance set.
  • Depending on the nature of the review set (compliance set or violation set), the corresponding judgment criteria in the image review stage also differ.
  • When the multiple reference image files are part of the files in a database, they may be files randomly extracted from the database by the electronic device, or representative files in the database, such as files with higher priority.
  • Step 402 Use the feature vector of each reference image file as a reference feature vector to generate the review set.
  • the review set is loaded into the buffer area in advance.
  • the electronic device does not need to perform feature extraction on the multiple reference image files to generate the review set; instead, it can directly use the pre-generated review set to perform the image review. In this way, the time consumption of the feature extraction process can be saved, so that the time for reviewing the image can be saved.
  • The electronic device may load the determined first threshold into the cache in advance; there is no restriction on the timing of loading. For example, the electronic device may load the determined first threshold before determining whether the image file to be reviewed is a violation file; for another example, the electronic device may also load the determined first threshold before performing feature extraction on the image file to be reviewed; for another example, the electronic device can also load the determined first threshold when it is configured to have the image review function.
  • the method for determining the first threshold may include the following steps 501 to 503:
  • Step 501 Taking each of a plurality of different candidate thresholds in turn as the first threshold, determine, according to the image review method, whether a plurality of verification image files are violation files, so as to obtain the review result set corresponding to each candidate threshold.
  • the plurality of verification image files may include a violation image file and a compliance image file.
  • the verification image file is different from the file used to train the neural network model.
  • the multiple verification image files may also include files obtained after the electronic device performs various transformation processes on the original image files.
  • the transformation rules used in the transformation processing may be the same as the transformation rules used in the model training stage.
  • Through step 501, a review result set obtained based on each candidate threshold can be obtained.
  • In an example, the review result set corresponding to candidate threshold 1 is the content in the second column of Table 1.
  • Table 1:

                                  Candidate threshold 1   Candidate threshold 2   ...   Candidate threshold N
      Verification image file 1             1                       1              ...             1
      Verification image file 2             0                       1              ...             1
      ...                                  ...                     ...             ...            ...
      Verification image file M             1                       1              ...             0
  • Step 502 According to each review result set and the type label of each verification image file, determine the correct recall rate (recall) and the false recall rate (wrong_recall) under the corresponding candidate threshold.
  • TN represents the number of violation files reviewed as violation files;
  • FP represents the number of violation files reviewed as compliance files;
  • FN represents the number of compliance files reviewed as violation files.
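The text names these counts but not the formulas. A plausible sketch, assuming recall is the share of violation files correctly recalled and wrong_recall is the share of compliance files wrongly recalled, with `compliant_ok` (compliance files reviewed as compliant) a count introduced here for illustration:

```python
def recall_metrics(tn, fp, fn, compliant_ok):
    """Correct and false recall rates under one candidate threshold.

    tn: violation files reviewed as violation files
    fp: violation files reviewed as compliance files
    fn: compliance files reviewed as violation files
    compliant_ok: compliance files reviewed as compliance files
        (illustrative; not named in the text)
    """
    recall = tn / (tn + fp)                  # share of violations correctly recalled
    wrong_recall = fn / (fn + compliant_ok)  # share of compliant files wrongly recalled
    return recall, wrong_recall

r, w = recall_metrics(tn=90, fp=10, fn=5, compliant_ok=95)
print(r, w)  # → 0.9 0.05
```

Step 503 would then pick the candidate threshold whose (recall, wrong_recall) pair meets the specific conditions, e.g. the minimum wrong_recall.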
  • Step 503 Determine the candidate thresholds corresponding to the correct recall rate and the false recall rate that meet specific conditions as the first threshold.
  • the candidate threshold corresponding to the minimum error recall rate is selected as the first threshold.
  • the electronic device may adopt a grid search method to gradually approach the optimal value, so as to select the first threshold from a plurality of candidate thresholds.
  • FIG. 6 is a schematic diagram of the implementation process of the image review method according to the embodiment of the application. As shown in FIG. 6, the method may include the following steps 601 to 606:
  • Step 601 Obtain a feature vector extraction structure of the target classification model.
  • The feature vector extraction structure includes the layers from the input layer to the nonlinear activation layer of the target classification model; wherein the target classification model is trained with a plurality of sample image files and the corresponding plurality of image transformation files.
  • the target classification model can be a lightweight neural network model MobileNetV2.
  • The structure of the network includes a "bottleneck structure", a conv2d layer, a sigmoid activation layer, an n×1-dimensional fully connected layer (Dense), and a normalized exponential layer (softmax).
  • the "bottleneck structure", the conv2d layer, and the sigmoid activation layer may be used as the feature vector extraction structure.
  • Step 602 Use the feature vector extraction structure to perform feature extraction on the image file to be reviewed to obtain a corresponding feature vector.
  • the output of the nonlinear activation layer of the feature vector extraction structure is the feature vector corresponding to the file.
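As an illustrative sketch (not the patent's implementation): since the sigmoid activation layer squashes each activation into (0, 1), its output can be thresholded into a binary feature vector suitable for Hamming-distance matching. The 0.5 cutoff is an assumption introduced here:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def binarize_features(activations, cutoff=0.5):
    """Map raw activations through sigmoid, then threshold to bits.

    The sigmoid squashes each activation into (0, 1); thresholding
    (cutoff is illustrative) yields a binary hash whose Hamming distance
    to reference vectors can be computed cheaply.
    """
    return [1 if sigmoid(a) >= cutoff else 0 for a in activations]

print(binarize_features([-2.0, 0.3, 1.5, -0.1]))  # → [0, 1, 1, 0]
```

The resulting bit vector plays the role of the feature vector compared against the review set.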
  • Step 603 Determine the similarity between the feature vector of the image file to be reviewed and each reference feature vector in the review set; wherein the similarity is used to represent the number of different features between the two feature vectors;
  • Step 604 Determine the number of similarities less than the first threshold, where the similarity is used to characterize the number of different features between two feature vectors.
  • the similarity is the Hamming distance.
  • Step 605 Determine the ratio of the number to the total number of similarities
  • Step 606 Determine whether the to-be-reviewed image file is a violation file according to the relationship between the ratio and the second threshold.
  • When the review set is a violation set: when the ratio is greater than the second threshold, the image file to be reviewed is determined to be a violation file; when the ratio is less than or equal to the second threshold, the file is determined to be a compliant file.
  • When the review set is a compliance set: when the ratio is greater than the second threshold, the image file to be reviewed is determined to be a compliant file; when the ratio is less than or equal to the second threshold, the file is determined to be a violation file.
  • The number of similarities less than the first threshold is counted, and the ratio between this number and the total number of similarities is determined; according to the relationship between the ratio and the second threshold, it is determined whether the image file to be reviewed is a violation file. In this way, compared with obtaining the review result based on the similarity with only one reference feature vector, the review result is more reliable and the recognition accuracy is higher.
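A minimal sketch of the ratio-based decision in steps 604 to 606 against a violation set; the threshold values and distances below are made up for illustration:

```python
def review_by_ratio(distances, first_threshold, second_threshold):
    """Flag the file as a violation when the share of reference vectors
    within the first (distance) threshold exceeds the second (ratio)
    threshold (steps 604-606, violation set)."""
    close = sum(1 for d in distances if d < first_threshold)
    ratio = close / len(distances)
    return "violation" if ratio > second_threshold else "compliant"

# 7 of 10 reference distances fall below the distance threshold of 10.
dists = [3, 5, 12, 7, 2, 15, 8, 9, 1, 30]
print(review_by_ratio(dists, first_threshold=10, second_threshold=0.5))  # → violation
```

For a compliance set the two outcomes would simply be swapped.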
  • FIG. 8 is a schematic diagram of the implementation process of the image review method according to the embodiment of the application. As shown in FIG. 8, the method may include the following steps 801 to 809:
  • Step 801 Use the target classification model to perform feature extraction on the image file to be reviewed to obtain the corresponding feature vector; wherein the target classification model is obtained through training of multiple sample image files and corresponding multiple image transformation files.
  • a target classification model usually consists of multiple sequentially connected layers.
  • the first layer generally takes an image as input, and extracts features from the image through specific operations.
  • Each subsequent layer takes the features extracted by the previous layer as input and, by transforming them in a specific form, obtains more complex features.
  • This hierarchical feature extraction process can be accumulated, which gives the neural network powerful feature extraction capabilities.
  • the neural network can transform the initial input image into higher-level abstract features.
  • In the image review method, when feature extraction is performed on the image file to be reviewed through the target classification model, no matter how the original file was transformed to obtain the image file to be reviewed, the extracted feature vector is basically unchanged. In this way, the image review method has strong robustness: even if a violation file is transformed and then uploaded to the network, it can still be accurately identified.
  • Step 802 Determine the similarity between the feature vector of the image file to be reviewed and the i-th reference feature vector in the review set; where i is greater than 0 and less than or equal to the total number of reference feature vectors in the review set;
  • Step 803 Determine whether the image file to be reviewed is a violation file according to the relationship between the similarity corresponding to the i-th reference feature vector and the first threshold; if so, go to step 804; otherwise, go to step 807;
  • the so-called similarity corresponding to the i-th reference feature vector refers to the similarity between the feature vector of the image file to be reviewed and the i-th reference feature vector.
  • Step 804 Increment the first determination count, that is, the number of times the image file to be reviewed has been determined to be a violation file;
  • Step 805 Determine whether the first determination count is greater than the third threshold; if yes, go to step 806; otherwise, let i = i + 1 and go back to step 802;
  • Step 806 Output that the image file to be reviewed is a violation file.
  • When the first determination count is greater than the third threshold, it is sufficient to reliably determine that the image file to be reviewed is a violation file, and there is no need to continue to calculate the similarity between the feature vector of the image file to be reviewed and the remaining reference feature vectors, thereby reducing the amount of calculation and shortening the review time.
  • Suppose the third threshold is 900 and the similarity is represented by the Hamming distance. If the first determination count reaches 901, that is, among the similarities corresponding to the 1st to 1000th reference feature vectors, 901 similarities are less than the first threshold, then the image review process can be ended and the review result that the image file to be reviewed is a violation file is output; there is no need to continue to calculate the similarity with the remaining 9,000 reference feature vectors.
  • Step 807: Count the second determination count, that is, the number of times the image file to be reviewed has been determined to be a compliant file;
  • Step 808: Determine whether the second determination count is greater than the fourth threshold; if yes, go to step 809; otherwise, set i = i + 1 and go back to step 802;
  • In some embodiments, the fourth threshold is greater than the third threshold; in this way, the false detection rate of violation files can be reduced.
  • Step 809: Output that the image file to be reviewed is a compliant file.
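The counting logic of steps 802 to 809 can be sketched as follows. This is an illustrative reading of the flow, not the claimed implementation; the function and parameter names (`review`, `first_threshold`, `third_threshold`, `fourth_threshold`) are assumptions made here, similarity is taken as a Hamming distance between 0/1 hash vectors consistent with the example above, and the fallback when neither count crosses its threshold is an assumption the text leaves open.

```python
def review(feature, reference_vectors, first_threshold, third_threshold, fourth_threshold):
    """Classify one image hash against the review set (steps 802-809, sketched)."""
    violation_count = 0   # first determination count (step 804)
    compliant_count = 0   # second determination count (step 807)
    for ref in reference_vectors:
        # Step 802: similarity as the Hamming distance (smaller = more similar).
        distance = sum(a != b for a, b in zip(feature, ref))
        if distance < first_threshold:              # step 803: this reference matched
            violation_count += 1
            if violation_count > third_threshold:   # steps 805 -> 806: stop early
                return "violation"
        else:
            compliant_count += 1
            if compliant_count > fourth_threshold:  # steps 808 -> 809: stop early
                return "compliant"
    # Assumed fallback if the set is exhausted without either count crossing.
    return "violation" if violation_count > compliant_count else "compliant"
```

With the example values above (third threshold 900), the loop stops as soon as 901 references have matched, skipping the remaining distance calculations.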
  • FIG. 9 is a schematic diagram of the implementation process of the image review method of the embodiment of the application. As shown in FIG. 9, the method may include the following steps 901 to 904:
  • Step 901: Use the target classification model to perform feature extraction on the image file to be reviewed to obtain the corresponding feature vector, where the target classification model is obtained by training with multiple sample image files and corresponding multiple image transformation files;
  • Step 902: Determine the similarity between the feature vector of the image file to be reviewed and the i-th reference feature vector in the review set, where i is greater than 0 and less than or equal to the total number of reference feature vectors in the review set; the reference image file corresponding to each reference feature vector is a violation file, and the similarity is used to characterize the number of differing features between two feature vectors;
  • Step 903: Determine whether the similarity corresponding to the i-th reference feature vector is less than the first threshold; if yes, go to step 904; otherwise, set i = i + 1 and go back to step 902;
  • Step 904 Determine that the image file to be reviewed is a violation file, and output the review result.
  • In this way, as soon as the similarity corresponding to some reference feature vector is less than the first threshold, the review process ends and the result that the image file to be reviewed is a violation file is output; otherwise, the next reference feature vector is traversed, until it is determined that the image file to be reviewed is a violation file.
  • If no reference feature vector matches after the whole review set has been traversed, the output is that the image file to be reviewed is a compliant file.
  • Take the matching of the input picture (that is, the picture to be reviewed) against the pictures in the violation gallery (that is, an example of the first database) as an example.
  • Commonly used similarity algorithms include the perceptual hash (pHash) algorithm and the Scale-Invariant Feature Transform (SIFT) algorithm.
  • The pHash algorithm is a manually designed rule-based algorithm.
  • The basic principle of the algorithm is to compute the hash value of the input picture, and then calculate the hash "distance" between the input picture and a picture in the violation gallery to obtain the similarity of the two pictures; when the similarity is greater than the set threshold, the match is considered successful.
  • the implementation process of the algorithm is as follows:
  • Reduce the size of the input picture; simplify the colors of the reduced picture; calculate the average value of the simplified picture; compare each pixel's grayscale with the average; compute the hash value from the grayscale comparison; and, based on the hash values, compute the Hamming distance to a picture in the violation gallery. When the Hamming distance is less than the set threshold, the matching is determined to be successful and the input picture is a violating picture.
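The pHash steps just listed can be sketched as a minimal average-hash-style pipeline in plain Python on a 2D grayscale image. The resize is a naive nearest-neighbor reduction, and a full pHash would also apply a DCT, which is omitted here; all function names and the 8×8 size are illustrative assumptions.

```python
def reduce_image(pixels, size=8):
    # Reduce the size of the input picture (naive nearest-neighbor sampling).
    h, w = len(pixels), len(pixels[0])
    return [[pixels[i * h // size][j * w // size] for j in range(size)]
            for i in range(size)]

def average_hash(small):
    # Compare each pixel's grayscale with the average to get the hash bits.
    flat = [p for row in small for p in row]
    avg = sum(flat) / len(flat)
    return [1 if p > avg else 0 for p in flat]

def hamming(h1, h2):
    # Number of differing bits between two hash values.
    return sum(a != b for a, b in zip(h1, h2))
```

Matching then amounts to checking whether `hamming` between the input picture's hash and a gallery picture's hash is below the set threshold.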
  • The SIFT algorithm is used to detect and describe local features in a picture: it finds extreme points in scale space and extracts their position, scale, and rotation invariants. The description and detection of local features can help identify objects. SIFT features are based on local points of interest on the object's appearance and are independent of the size and rotation of the picture.
  • the algorithm factors (ie, image feature extraction operators) of the pHash algorithm and the SIFT algorithm are both artificially designed, so they can only meet specific matching scenarios.
  • the pHash algorithm only maintains invariance to scale scaling and color change;
  • the SIFT algorithm only maintains invariance to rotation, scale scaling, brightness change, affine transformation, and noise.
  • In a related scheme, the neural network model is mainly used to directly calculate whether two pictures match.
  • the implementation process is shown in Figure 10, which is divided into a training phase and a prediction phase.
  • the basic process of the training phase includes the following steps 1001 to 1004:
  • Step 1001: Design a model structure (including convolutional layers, fully connected layers, pooling layers, etc.) to obtain an initial similarity model, that is, a neural network model;
  • Step 1002: Prepare a large amount of image data as training samples;
  • Step 1003: Perform data enhancement processing on each picture in the training samples, for example, rotating, mirroring, and rendering the pictures; the two pictures obtained from different data transformations of the same picture form a positive sample pair (label 1), while pictures derived from different originals form negative sample pairs (label 0);
  • Step 1004: Update the initial similarity model through a gradient-descent-family optimization algorithm and the augmented training samples to obtain the trained similarity model, that is, the target classification model.
  • the basic process of the prediction phase includes steps 1005 to 1007:
  • Step 1005: Calculate the similarity between the input picture and each picture in the violation gallery;
  • Step 1006: Determine whether the ratio of the number of similarities less than the first threshold to the total number of similarities is greater than the second threshold; if so, go to step 1007;
  • Step 1007: The matching is considered successful, and the input picture is determined to be a violating picture.
  • the deep learning model contains multiple convolution kernels obtained through gradient descent.
  • the convolution kernel has a strong ability to express image features and basically meets all image transformation scenarios.
  • However, it is necessary to cyclically perform the matching calculation against every picture in the gallery; together with the computational cost of the neural network model itself, the resource consumption is unacceptable.
  • In the embodiments of this application, a deep neural network is used to extract image features to obtain an image hash, which is an example of a feature vector; the similarity of two image hashes is then compared to determine whether the matching succeeds.
  • The process may include the following steps 1) to 4):
  • Step 1: Data preparation. Prepare 200 original pictures, as shown in Figure 12, and perform picture transformation operations such as flipping, rotating, scaling, cropping, liquifying, mosaicing, adding noise, discoloring, and occluding on each original picture, or a combination of these. Perform 100 different transformation operations on each picture, so that a total of 20,000 samples are obtained.
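The data-preparation step can be illustrated with a toy augmentation sketch on 2D grayscale images. The specific transforms shown (flip, 90° rotation, corner occlusion, additive noise) and all helper names are assumptions standing in for the flipping, rotating, cropping, and mosaicing operations named above; the point is that every transformed copy inherits the label of its original, which is how 200 originals yield 20,000 labeled samples without manual annotation.

```python
import random

def flip(img):
    # Horizontal flip of a 2D grayscale image (list of rows).
    return [row[::-1] for row in img]

def rotate90(img):
    # 90-degree clockwise rotation.
    return [list(row) for row in zip(*img[::-1])]

def occlude(img, k=2):
    # Zero out a k x k corner block (a crude stand-in for occlusion/cropping).
    out = [row[:] for row in img]
    for i in range(k):
        for j in range(k):
            out[i][j] = 0
    return out

def noise(img, rng):
    # Small additive noise, clipped to the 8-bit range.
    return [[min(255, p + rng.randint(0, 5)) for p in row] for row in img]

def augment(img, n=100, seed=0):
    """Produce n randomly transformed copies of one original picture."""
    rng = random.Random(seed)
    ops = [flip, rotate90, occlude, lambda im: noise(im, rng)]
    return [rng.choice(ops)(img) for _ in range(n)]
```

Applying `augment(original, n=100)` to each of 200 originals gives the 20,000-sample set described in the embodiment.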
  • Step 2: Design the model.
  • the lightweight deep neural network MobileNetV2 is selected as the feature extractor. Before training the model, modify the MobileNetV2 network structure.
  • the original structure of MobileNetV2 is shown in Table 2 below.
  • In the table header, "Input" is the input size of each structure layer, "Operator" is the structure type of the layer, c is the dimension of the layer's output features, n is the number of repetitions of the layer, and s is the stride of the depthwise convolution kernel.
  • The input size of the 11th layer of MobileNetV2 is fixed at 1 × 1 × 1280, and k convolution kernels of size 1 × 1 are used for the convolution calculation so as to output a 1-dimensional vector of length k. Finally, a softmax activation layer is connected to compute the probabilities of the k categories.
  • the MobileNetV2 structure is modified as follows: between the conv2d layer and the softmax layer, a sigmoid activation layer and an n ⁇ 1 dimensional fully connected layer (Dense) are added.
  • the modified MobileNetV2 structure is shown in Figure 7A.
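The modified head can be sketched in numpy under stated assumptions: the MobileNetV2 backbone is elided and represented by its pooled 1 × 1 × 1280 feature, a 1 × 1 convolution on a 1 × 1 feature map reduces to a matrix product, the sigmoid output of length n serves as the hash layer, and the added Dense layer maps to k classes for the training-time softmax. The weights below are random placeholders, not trained values, and the exact layer ordering is this sketch's reading of Figure 7A.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())  # subtract max for numerical stability
    return e / e.sum()

def modified_head(pooled, w_conv, w_dense):
    """pooled: length-1280 feature after global pooling (backbone elided).

    w_conv:  (1280, n) weights of the conv2d 1x1 layer with n kernels.
    w_dense: (n, k) weights of the added Dense layer to k classes.
    """
    # conv2d 1x1 on a 1x1 feature map is a matrix product; the sigmoid
    # activation of length n is the hash layer used at matching time.
    hash_layer = sigmoid(pooled @ w_conv)
    # The added Dense + softmax produce class probabilities during training.
    return hash_layer, softmax(hash_layer @ w_dense)
```

At matching time only `hash_layer` is read out; the softmax branch is used to train the network as a k-way classifier.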
  • Step 3: Model training stage.
  • The modified network is trained as a picture classification model, that is, a specific neural network model, where k is 200 (the number of original pictures) and n is the dimension of the hash to be encoded (for example, 300).
  • The model loss function is the multi-class cross-entropy loss (categorical_crossentropy), the optimization algorithm is Adam, the learning rate is fixed at 0.001, and the accuracy of the trained model exceeds 99.5%.
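The loss named above, categorical_crossentropy, can be written out directly; this is the standard definition shown for reference, with the batch-mean convention assumed. In a training framework it would be paired with the Adam optimizer at the fixed learning rate of 0.001 stated above.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean multi-class cross-entropy over a batch of one-hot labels.

    y_true: (batch, k) one-hot labels; y_pred: (batch, k) predicted probabilities.
    """
    y_pred = np.clip(y_pred, eps, 1.0)  # guard against log(0)
    return float(-(y_true * np.log(y_pred)).sum(axis=-1).mean())
```

A uniform prediction over k classes gives a loss of ln(k), and a perfect prediction gives a loss of 0, which is the quantity the Adam updates drive down during training.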
  • Step 4: Matching stage.
  • The output of the model is a 1-dimensional vector of length n (for example, 300). Since the activation function is sigmoid, the value range of the output is (0, 1).
  • The output is binarized according to the rule that values less than or equal to 0.5 become 0 and values greater than 0.5 become 1; finally, a hash vector of length 300 whose values are 0 or 1 is obtained, that is, a feature vector.
  • The extracted feature vector is called a hash vector because, even if the input image is a transformed version of the original image, the feature vector extracted by Mobilehashnet remains consistent with the feature vector of the original image.
  • The Hamming distance between two pictures can be calculated from their hash vectors; the smaller the distance, the more similar the two pictures.
  • To realize matching, a first threshold can be specified: when the Hamming distance is lower than the first threshold, the two pictures are considered to be the same picture and the matching succeeds; otherwise, the matching fails.
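The binarization and matching rules above amount to a few lines; the names `to_hash`, `hamming_distance`, and `is_same_picture` are illustrative assumptions, not names from the embodiment.

```python
def to_hash(sigmoid_outputs):
    # Binarize the sigmoid layer: values <= 0.5 become 0, values > 0.5 become 1.
    return [1 if v > 0.5 else 0 for v in sigmoid_outputs]

def hamming_distance(h1, h2):
    # Number of differing bits between two hash vectors.
    return sum(a != b for a, b in zip(h1, h2))

def is_same_picture(h1, h2, first_threshold):
    # Matching succeeds only when the distance is below the first threshold.
    return hamming_distance(h1, h2) < first_threshold
```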
  • The preparation process of the validation set is the same as that of the training set: prepare several pictures outside the training set, perform data enhancement, and calculate the correct recall rate (recall) and wrong recall rate (wrong_recall) of the matching model under different candidate thresholds.
  • a grid search method can be used to gradually approach the optimal value.
  • The grid search results are shown in Figure 16 and Figure 17; Figure 16 shows the recall and wrong_recall corresponding to candidate thresholds from 35 to 70, and Figure 17 shows the recall and wrong_recall corresponding to candidate thresholds from 50 to 55.
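The threshold selection described above can be sketched as a simple sweep over candidate thresholds. Here `positive_distances` (distances between transformed copies and their originals, which should match) and `negative_distances` (distances between unrelated pictures, which should not) are assumed inputs produced from the validation set; the function name is illustrative.

```python
def grid_search(candidates, positive_distances, negative_distances):
    """For each candidate threshold, compute (recall, wrong_recall).

    recall:       fraction of genuine pairs matched (distance below threshold).
    wrong_recall: fraction of unrelated pairs wrongly matched.
    """
    results = {}
    for t in candidates:
        recall = sum(d < t for d in positive_distances) / len(positive_distances)
        wrong_recall = sum(d < t for d in negative_distances) / len(negative_distances)
        results[t] = (recall, wrong_recall)
    return results
```

The chosen first threshold is the candidate that keeps recall high while keeping wrong_recall low, which is what Figures 16 and 17 plot over the ranges 35 to 70 and 50 to 55.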
  • The hash dimension directly determines the number of convolution kernels of the 2d convolutional layer (conv2d 1 × 1) in the modified MobileNetV2 structure and the output dimension n of the activation layer. Since it sits at the end of the network structure, its size directly affects the learning ability of the model. If the hash dimension is too small, the model underfits and the supported library size is reduced; if it is too large, both the time to generate a hash and the time to compute the Hamming distance increase. A reasonable hash dimension therefore needs to be chosen.
  • Mobilehashnet uses deep neural networks to extract image features, which theoretically has performance advantages.
  • The matching performance of the Mobilehashnet algorithm, the pHash algorithm, and the SIFT algorithm is compared under different image transformation methods; the experimental results are shown in Table 3.
  • The pHash algorithm is basically unable to match under image transformations such as flipping, rotating, and zooming, while the SIFT algorithm performs poorly under all types of image changes.
  • The Mobilehashnet algorithm achieves 100% recall under the flipping, distorting, cutting, mosaic, and noise transformations, and under the other image transformations its recall value is higher and its wrong_recall value is lower.
  • Training can be performed without manually labeling a large number of samples: a large number of training samples is obtained automatically through image data enhancement technology.
  • the Mobilehashnet algorithm provided by the embodiments of this application extracts image features by using a deep neural network, generates image hashes based on these features, and performs image matching. Compared with the related image matching/similarity algorithm, it effectively improves the correct recall rate, reduces the false recall rate, and does not require a large amount of manual data annotation.
  • The picture review system reviews the pictures uploaded by users to prevent the spread of a large number of violating pictures. Due to the complexity of image content, as shown in Figure 18, the image review system comprises a violation library matching model, an image classification model, a face recognition model, a text recognition model, and a text classification model. The picture to be reviewed is reviewed by each model in turn; only when the results of all models are "normal" is the review result "normal", that is, a compliant picture; otherwise, it is a violating picture.
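The serial review chain of Figure 18 can be sketched as a short-circuiting pipeline; the model list, verdict strings, and function name below are illustrative assumptions rather than the system's actual interfaces.

```python
def review_picture(picture, models):
    """Run the picture through each model in turn (violation-library matching,
    image classification, face recognition, text recognition, text classification).

    Each model returns "normal" or a violation verdict; the picture is compliant
    only if every model in the chain returns "normal".
    """
    for model in models:
        verdict = model(picture)
        if verdict != "normal":
            return verdict  # short-circuit: flagged as a violating picture
    return "normal"
```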
  • the illegal library matching model in the image review system can be implemented by the Mobilehashnet algorithm provided in the embodiment of this application, which ensures a high correct recall rate and a low error recall rate for matching.
  • The implementation process of this algorithm is shown in Figure 19: extract the hash vector of the picture to be reviewed; determine the Hamming distance between this hash vector and each hash vector in the violation hash library, that is, calculate the Hamming distances in batches; and judge whether each Hamming distance is greater than the first threshold, so as to obtain the recall result, that is, the correct recall rate and the false recall rate.
  • The violation hash library can be built when the system is initialized, so matching requires only one hash calculation, that is, only feature extraction of the image to be reviewed.
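The batch Hamming-distance calculation against a precomputed violation hash library vectorizes naturally; this numpy sketch assumes 0/1 hash vectors, and the names `batch_hamming` and `review_against_library` are illustrative.

```python
import numpy as np

def batch_hamming(query_hash, violation_hashes):
    """query_hash: (n,) 0/1 vector for the picture under review.

    violation_hashes: (m, n) 0/1 matrix built once at system initialization.
    Broadcasting the comparison yields all m Hamming distances in one pass.
    """
    return np.sum(violation_hashes != query_hash, axis=1)

def review_against_library(query_hash, violation_hashes, first_threshold):
    # One hash extraction per query picture; the library hashes are precomputed.
    distances = batch_hamming(query_hash, violation_hashes)
    return "violation" if np.any(distances < first_threshold) else "normal"
```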
  • The image file review device provided by the embodiments of the present application, including the modules it comprises and the units included in each module, can be implemented by the processor in a terminal; of course, it can also be implemented by a specific logic circuit. In the implementation process, the processor can be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), or a field programmable gate array (FPGA), etc.
  • FIG. 20A is a schematic structural diagram of an image file review device according to an embodiment of the application.
  • the device 200 includes a feature extraction module 201, a first determination module 202, and a review module 203, wherein:
  • the feature extraction module 201 is configured to use the target classification model to perform feature extraction on the image file to be reviewed to obtain the corresponding feature vector, wherein the target classification model is obtained by training with multiple sample image files and corresponding multiple image transformation files;
  • the first determining module 202 is configured to determine the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in the review set;
  • the review module 203 is configured to determine whether the image file to be reviewed is a violation file according to the determined relationship between the similarity and the first threshold.
  • the feature extraction module 201 is configured to obtain a feature vector extraction structure of the target classification model, and the feature vector extraction structure includes the input layer to the non-linear activation layer of the target classification model;
  • the type of the target classification model is a neural network model; the feature vector extraction structure is used to perform feature extraction on the image file to be reviewed to obtain a corresponding feature vector.
  • In some embodiments, the image review device 200 further includes: a tag acquisition module 204, configured to acquire the type tag of each sample image file; a transformation processing module 205, configured to perform transformation processing on each sample image file according to a variety of transformation rules to obtain an image transformation file set of the corresponding file; a tag labeling module 206, configured to assign the type label of each sample image file to each image transformation file in the corresponding image transformation file set; and a model training module 207, configured to train a specific neural network model according to each sample image file, each image transformation file, and their corresponding type labels to obtain the target classification model.
  • the review module 203 is configured to: determine the number of similarities less than the first threshold, where the similarity is used to characterize the number of differing features between two feature vectors; determine the ratio of that number to the total number of similarities; and determine, according to the relationship between the ratio and the second threshold, whether the image file to be reviewed is a violation file.
  • the first determining module 202 is configured to determine the similarity between the feature vector of the image file to be reviewed and the i-th reference feature vector in the review set, where i is greater than 0 and less than or equal to the total number of reference feature vectors in the review set; the similarity is used to characterize the number of differing features between two feature vectors, and the reference image file corresponding to each reference feature vector is a violation file. Correspondingly, the review module 203 is configured to determine that the image file to be reviewed is a violation file when the similarity corresponding to the i-th reference feature vector is less than the first threshold.
  • the first determining module 202 is further configured to: when the similarity corresponding to the i-th reference feature vector is greater than or equal to the first threshold, determine the similarity between the feature vector of the image file to be reviewed and the (i+1)-th reference feature vector in the review set, so as to determine whether the image file to be reviewed is a violation file.
  • In some embodiments, the image review device 200 further includes a loading module 208 configured to load the generated review set; correspondingly, the feature extraction module 201 is further configured to: use the target classification model to perform feature extraction on multiple reference image files to obtain the feature vector of each file, and use the feature vector of each reference image file as a reference feature vector to generate the review set.
  • In some embodiments, the loading module 208 is further configured to load the determined first threshold;
  • correspondingly, the device further includes a second determination module, configured to invoke the feature extraction module, the first determination module, and the review module of the device under the assumption that the first threshold takes each of a plurality of different candidate thresholds.
  • In the embodiments of the present application, if the above-mentioned image review method is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solutions of the embodiments of the present application, in essence, or the part that contributes to the related technologies, can be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes a number of instructions to enable an electronic device to execute all or part of the method described in each embodiment of the present application.
  • The aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (Read Only Memory, ROM), a magnetic disk, an optical disk, and other media that can store program code. In this way, the embodiments of the present application are not limited to any specific combination of hardware and software.
  • FIG. 21 is a schematic diagram of the hardware entity of the electronic device according to an embodiment of the application.
  • the electronic device 210 includes a memory 211 and a processor 212.
  • The memory 211 stores a computer program that can be run on the processor 212, and the processor 212, when executing the program, implements the steps of the image review method provided in the foregoing embodiments.
  • The memory 211 is configured to store instructions and applications executable by the processor 212, and can also cache data to be processed or already processed by the processor 212 and each module in the electronic device 210 (for example, image data, audio data, voice communication data, and video communication data); it can be implemented by flash memory (FLASH) or random access memory (RAM).
  • an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps in the image review method provided in the above-mentioned embodiments are implemented.
  • the disclosed device and method can be implemented in other ways.
  • The embodiments of the touch screen system described above are merely illustrative; for example, the division of the modules is only a logical function division, and there may be other division methods in actual implementation, for example: multiple modules or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • The coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or modules, and may be electrical, mechanical, or in other forms.
  • The modules described above as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical modules; they may be located in one place or distributed over multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • The functional modules in the embodiments of the present application may all be integrated into one processing unit, or each module may be used individually as a unit, or two or more modules may be integrated into one unit; the above-mentioned integrated module can be implemented in the form of hardware, or in the form of hardware plus software functional units.
  • the foregoing program can be stored in a computer readable storage medium.
  • When the program is executed, the steps of the foregoing method embodiments are performed; the foregoing storage medium includes: a removable storage device, a read-only memory (Read Only Memory, ROM), a magnetic disk, an optical disk, and other media that can store program code.
  • If the aforementioned integrated unit of this application is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solutions of the embodiments of the present application, in essence, or the part that contributes to the related technologies, can be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes a number of instructions to enable an electronic device to execute all or part of the method described in each embodiment of the present application.
  • the aforementioned storage media include: removable storage devices, ROMs, magnetic disks, or optical disks and other media that can store program codes.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An image review method is provided, comprising the steps of: using a target classification model to perform feature extraction on an image file to be reviewed so as to obtain a corresponding feature vector, the target classification model being obtained by training with multiple sample image files and corresponding multiple image transformation files; determining the similarity between the feature vector of the image file to be reviewed and at least one reference feature vector in a review set; and determining, according to the relationship between the determined similarity and a first threshold, whether the image file to be reviewed is a violating file. An image review apparatus, a device, and a storage medium are also provided.
PCT/CN2020/092923 2020-05-28 2020-05-28 Procédé et appareil d'audit d'image, dispositif, et support de stockage WO2021237570A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080100202.7A CN115443490A (zh) 2020-05-28 2020-05-28 影像审核方法及装置、设备、存储介质
PCT/CN2020/092923 WO2021237570A1 (fr) 2020-05-28 2020-05-28 Procédé et appareil d'audit d'image, dispositif, et support de stockage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/092923 WO2021237570A1 (fr) 2020-05-28 2020-05-28 Procédé et appareil d'audit d'image, dispositif, et support de stockage

Publications (1)

Publication Number Publication Date
WO2021237570A1 true WO2021237570A1 (fr) 2021-12-02

Family

ID=78745395

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/092923 WO2021237570A1 (fr) 2020-05-28 2020-05-28 Procédé et appareil d'audit d'image, dispositif, et support de stockage

Country Status (2)

Country Link
CN (1) CN115443490A (fr)
WO (1) WO2021237570A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114443880A (zh) * 2022-01-24 2022-05-06 南昌市安厦施工图设计审查有限公司 一种装配式建筑的大样图审图方法及审图系统
CN114612839A (zh) * 2022-03-18 2022-06-10 壹加艺术(武汉)文化有限公司 一种短视频分析处理方法、系统及计算机存储介质
CN115297360A (zh) * 2022-09-14 2022-11-04 百鸣(北京)信息技术有限公司 一种多媒体软件视频上传智能审核系统
CN115994772A (zh) * 2023-02-22 2023-04-21 中信联合云科技有限责任公司 图书资料处理方法及系统、图书快速铺货方法、电子设备
CN116452836A (zh) * 2023-05-10 2023-07-18 武汉精阅数字传媒科技有限公司 一种基于图像数据处理的新媒体素材内容采集系统

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116866666B (zh) * 2023-09-05 2023-12-08 天津市北海通信技术有限公司 轨道交通环境下的视频流画面处理方法及装置
CN117292395B (zh) * 2023-09-27 2024-05-24 自然资源部地图技术审查中心 审图模型的训练方法和训练装置及审图的方法和装置

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101359372A (zh) * 2008-09-26 2009-02-04 腾讯科技(深圳)有限公司 分类器的训练方法及装置、识别敏感图片的方法及装置
CN108960782A (zh) * 2018-07-10 2018-12-07 北京木瓜移动科技股份有限公司 内容审核方法以及装置
CN109561322A (zh) * 2018-12-27 2019-04-02 广州市百果园信息技术有限公司 一种视频审核的方法、装置、设备和存储介质
US10402699B1 (en) * 2015-12-16 2019-09-03 Hrl Laboratories, Llc Automated classification of images using deep learning—back end
CN110377775A (zh) * 2019-07-26 2019-10-25 Oppo广东移动通信有限公司 一种图片审核方法及装置、存储介质
CN110738697A (zh) * 2019-10-10 2020-01-31 福州大学 基于深度学习的单目深度估计方法
CN111079816A (zh) * 2019-12-11 2020-04-28 北京金山云网络技术有限公司 图像的审核方法、装置和服务器

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101359372A (zh) * 2008-09-26 2009-02-04 腾讯科技(深圳)有限公司 分类器的训练方法及装置、识别敏感图片的方法及装置
US10402699B1 (en) * 2015-12-16 2019-09-03 Hrl Laboratories, Llc Automated classification of images using deep learning—back end
CN108960782A (zh) * 2018-07-10 2018-12-07 北京木瓜移动科技股份有限公司 内容审核方法以及装置
CN109561322A (zh) * 2018-12-27 2019-04-02 广州市百果园信息技术有限公司 一种视频审核的方法、装置、设备和存储介质
CN110377775A (zh) * 2019-07-26 2019-10-25 Oppo广东移动通信有限公司 一种图片审核方法及装置、存储介质
CN110738697A (zh) * 2019-10-10 2020-01-31 福州大学 基于深度学习的单目深度估计方法
CN111079816A (zh) * 2019-12-11 2020-04-28 北京金山云网络技术有限公司 图像的审核方法、装置和服务器

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114443880A (zh) * 2022-01-24 2022-05-06 南昌市安厦施工图设计审查有限公司 一种装配式建筑的大样图审图方法及审图系统
CN114612839A (zh) * 2022-03-18 2022-06-10 壹加艺术(武汉)文化有限公司 一种短视频分析处理方法、系统及计算机存储介质
CN114612839B (zh) * 2022-03-18 2023-10-31 壹加艺术(武汉)文化有限公司 一种短视频分析处理方法、系统及计算机存储介质
CN115297360A (zh) * 2022-09-14 2022-11-04 百鸣(北京)信息技术有限公司 一种多媒体软件视频上传智能审核系统
CN115994772A (zh) * 2023-02-22 2023-04-21 中信联合云科技有限责任公司 图书资料处理方法及系统、图书快速铺货方法、电子设备
CN115994772B (zh) * 2023-02-22 2024-03-08 中信联合云科技有限责任公司 图书资料处理方法及系统、图书快速铺货方法、电子设备
CN116452836A (zh) * 2023-05-10 2023-07-18 武汉精阅数字传媒科技有限公司 一种基于图像数据处理的新媒体素材内容采集系统
CN116452836B (zh) * 2023-05-10 2023-11-28 杭州元媒科技有限公司 一种基于图像数据处理的新媒体素材内容采集系统

Also Published As

Publication number Publication date
CN115443490A (zh) 2022-12-06

Similar Documents

Publication Publication Date Title
WO2021237570A1 (fr) Procédé et appareil d'audit d'image, dispositif, et support de stockage
WO2020119350A1 (fr) Procédé et appareil de classification de vidéos, dispositif informatique et support d'informations
WO2020199468A1 (fr) Procédé et dispositif de classification d'image et support de stockage lisible par ordinateur
CN107463605B (zh) 低质新闻资源的识别方法及装置、计算机设备及可读介质
US10831814B2 (en) System and method for linking multimedia data elements to web pages
Hua et al. Clickage: Towards bridging semantic and intent gaps via mining click logs of search engines
US10621755B1 (en) Image file compression using dummy data for non-salient portions of images
US9576221B2 (en) Systems, methods, and devices for image matching and object recognition in images using template image classifiers
Thyagharajan et al. A review on near-duplicate detection of images using computer vision techniques
US20230376527A1 (en) Generating congruous metadata for multimedia
Murray et al. A deep architecture for unified aesthetic prediction
CN110427895A (zh) 一种基于计算机视觉的视频内容相似度判别方法及系统
US10380267B2 (en) System and method for tagging multimedia content elements
KR101647691B1 (ko) 하이브리드 기반의 영상 클러스터링 방법 및 이를 운용하는 서버
CN111651636A (zh) 视频相似片段搜索方法及装置
WO2021179631A1 (fr) Procédé, appareil et dispositif de compression de modèle de réseau neuronal convolutionnel, et support de stockage
CN113434716B (zh) 一种跨模态信息检索方法和装置
CN110163061B (zh) 用于提取视频指纹的方法、装置、设备和计算机可读介质
WO2021012493A1 (fr) Procédé et appareil d'extraction de mot-clé de vidéo courte, et support d'informations
CN113221918B (zh) 目标检测方法、目标检测模型的训练方法及装置
Phadikar et al. Content-based image retrieval in DCT compressed domain with MPEG-7 edge descriptor and genetic algorithm
US10504002B2 (en) Systems and methods for clustering of near-duplicate images in very large image collections
US20230222762A1 (en) Adversarially robust visual fingerprinting and image provenance models
US11537636B2 (en) System and method for using multimedia content as search queries
Kapadia et al. Improved CBIR system using Multilayer CNN

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20937851

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 25.04.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 20937851

Country of ref document: EP

Kind code of ref document: A1