CN113704532B - Method and system for improving picture retrieval recall rate - Google Patents


Info

Publication number
CN113704532B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011341433.5A
Other languages
Chinese (zh)
Other versions
CN113704532A (en)
Inventor
史国杰
曹靖诚
张继东
刘硙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi Digital Life Technology Co Ltd
Original Assignee
Tianyi Digital Life Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianyi Digital Life Technology Co Ltd
Priority to CN202011341433.5A
Publication of CN113704532A
Application granted
Publication of CN113704532B
Active (current legal status)
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10068Endoscopic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30092Stomach; Gastric

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Library & Information Science (AREA)
  • Biomedical Technology (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method and a system for improving the recall rate of picture retrieval. The method comprises the following steps: converting a first picture and a second picture into pixel values, respectively, wherein the first picture is used as a reference picture; calculating a hash value string of the first picture based on differences in pixel values of adjacent pixels; calculating hash value strings of the second picture and of a plurality of rotated and mirrored pictures of the second picture, respectively, based on differences in pixel values of adjacent pixels; determining, based on the hash value strings, the similarity between the first picture and each of the second picture and its rotated and mirrored pictures; and taking whichever of the second picture and its rotated and mirrored pictures has the highest similarity with the first picture as the main direction of the second picture, and substituting that picture together with the first picture into a picture identification model.

Description

Method and system for improving picture retrieval recall rate
Technical Field
The invention relates to the field of artificial intelligence and image recognition and processing, in particular to a method and a system for improving the recall rate of picture retrieval by using a hash algorithm for main direction positioning.
Background
With the development of science and technology, artificial intelligence is increasingly widely used, for example in license plate recognition and face recognition. Most of the information in such applications is acquired through images, and the currently popular field of artificial intelligence in particular requires a large number of pictures for training and learning, so picture recognition is indispensable.
Fig. 1 is an exemplary architecture diagram of a prior art model for identifying similar pictures. As shown in Fig. 1, the prior art already includes a variety of models that have been trained to recognize similar pictures, such as picture recognition model 102 in Fig. 1. Such a model may provide a recognition result of whether two pictures are similar. For example, in Fig. 1, a picture A and a picture B are provided as inputs to the picture recognition model 102, where picture A is typically a reference picture, and the picture recognition model 102 outputs a recognition result, i.e., whether picture B is similar or dissimilar to picture A as the reference picture.
Existing image similarity recognition methods mainly comprise similar-picture retrieval schemes based on global features and local features, such as SIFT and SURF. These are gradually being replaced by deep learning feature extraction, which generally uses classical convolutional neural networks (CNNs) such as VGG, ResNet and Xception, pre-trained on the large-scale picture classification dataset ImageNet. The output of a certain convolution layer or fully connected layer is extracted as the feature representation of a picture, and pictures are then compared by Euclidean distance or cosine similarity. This approach captures stronger semantics and achieves higher precision, but its recall is not good enough. The main reason is that the pre-trained model is based on classification, so less salient features can be discarded, and the extracted features can differ considerably under rotation, mirroring, different lens focal lengths and the like. The most intuitive symptom in picture similarity recognition is that when the input is a rotated picture, even one rotated by a small angle (about 30 degrees), it cannot be recognized effectively, so the recall of similar pictures is extremely low. The conventional way to optimize these methods is to improve the network, which may bring a series of problems such as increased algorithm complexity and excessively long development time.
A Chinese patent application (201410848431.3) entitled "a similar picture detection method and apparatus" proposes a hash-based picture similarity detection method: the picture is converted to grayscale and divided into 8×8 regions, the average value of each region is obtained, and the averages are quantized to form a HASH string. However, the method does not address the low recall and limited precision that arise when similar pictures are identified by extracted feature values, so that application cannot completely solve the problem of similar picture identification.
Chinese patent application (201811449694.1) entitled "image similarity determination method based on pre-screening method and PHash" proposes a method that performs a variance calculation on images using a color variance algorithm and calculates the variance difference between two images based on color variance, which completes a pre-screening process. If the variance difference based on the color variance is larger than a variance threshold, the images are judged to be dissimilar and the steps end; if the variance difference is smaller than or equal to the variance threshold, the images are hashed with the PHASH algorithm and the PHASH-based Hamming distance between the two images is calculated; if the PHASH-based Hamming distance is less than the Hamming distance threshold, the images are determined to be similar; otherwise, the images are judged to be dissimilar. The method can improve the efficiency of picture similarity retrieval, but the improvement is limited because it relies on a series of complex algorithms and calculations; it has no anti-rotation capability for identifying rotated pictures, and it makes no notable contribution to improving picture recall.
Chinese patent application (201710029935.6) entitled "HASH image retrieval method based on deep learning and local feature fusion" combines a deep learning network with HASH image retrieval: it extracts two kinds of features, then performs image retrieval using an approximate nearest neighbor search strategy, and also performs retrieval calculation on mirror images. However, the method still applies different processing strategies to all the pictures, so although operation speed and precision are improved to a certain extent, the recall of rotated pictures is still not guaranteed.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
In order to solve the above problems, the invention introduces a hash-based method for identifying the main direction of a picture. More specifically, the method may include converting the picture to grayscale to obtain pixel values, comparing each pair of adjacent pixel values according to a certain rule to generate a HASH, obtaining the HASH of the picture to be compared in the same manner, and obtaining the HASHes of its different angles through rotation calculation. The advantages of searching for similar pictures by HASH comparison are mainly high speed, high accuracy and high recall. The HASH needs to be calculated from the picture pixels only once; the HASHes of the remaining rotations can be obtained directly by rotating the HASH already obtained. When the similarity of the comparison result is extremely high and the aspect ratios are the same, the pictures can be directly judged to be similar, which further increases the processing speed. In particular, the method has higher accuracy when identifying rotated and mirrored pictures as similar, so the recall of similar pictures is improved.
According to one aspect of the present invention, there is provided a method for enhancing picture retrieval recall, wherein the method comprises:
converting a first picture and a second picture into pixel values, respectively, wherein the first picture is used as a reference picture;
calculating a hash value string of the first picture based on differences in pixel values of adjacent pixels;
calculating hash value strings of the second picture and of a plurality of rotated and mirrored pictures of the second picture, respectively, based on differences in pixel values of adjacent pixels;
determining, based on the hash value strings, the similarity between the first picture and each of the second picture and the plurality of rotated and mirrored pictures; and
taking whichever of the second picture and the plurality of rotated and mirrored pictures has the highest similarity with the first picture as the main direction of the second picture, and substituting that picture together with the first picture into a picture identification model.
According to a further embodiment of the invention, the method further comprises: if the similarity to the first picture of none of the second picture and the plurality of rotated and mirrored pictures is higher than a preset threshold value, skipping the steps of taking the main direction and substituting into the picture identification model, and directly judging that the second picture is dissimilar to the first picture.
According to a further embodiment of the present invention, converting the first picture and the second picture into pixel values, respectively, further comprises: converting the first picture and the second picture to grayscale to obtain pixel values.
According to a further embodiment of the present invention, calculating a hash value string based on differences in pixel values of adjacent pixels further includes: comparing each pixel value of the picture with adjacent pixel values pixel by pixel, and obtaining a plurality of hash values based on the degree of difference between the pixel values; and concatenating the plurality of hash values into a hash value string.
According to a further embodiment of the present invention, determining the similarity between the second picture and the plurality of rotated and mirrored pictures and the first picture, respectively, based on the hash value string further comprises:
calculating an error sum of each picture, wherein the error sum is the sum of the absolute differences of adjacent hash values in the hash value string; and
calculating the difference between the error sum of each of the second picture and the plurality of rotated and mirrored pictures and the error sum of the first picture, wherein a smaller difference indicates a higher similarity between that picture and the first picture.
According to a further embodiment of the present invention, determining the similarity between the second picture and the plurality of rotated and mirrored pictures and the first picture, respectively, based on the hash value string further comprises:
calculating, for each of the second picture and the plurality of rotated and mirrored pictures, the ratio of the number of positions at which its hash value string equals the hash value string of the first picture to the total length of the hash value string of the first picture, wherein a greater ratio indicates a higher similarity between that picture and the first picture.
According to a further embodiment of the present invention, at least one of the hash value strings of the rotated and mirrored pictures of the second picture is obtained by correspondingly rotating and/or mirroring the hash value string of the second picture.
According to another aspect of the present invention, there is provided a system for identifying similar pictures, wherein the system comprises:
a picture main direction determination module configured to:
convert a first picture and a second picture into pixel values, respectively, wherein the first picture is used as a reference picture;
calculate a hash value string of the first picture based on differences in pixel values of adjacent pixels;
calculate hash value strings of the second picture and of a plurality of rotated and mirrored pictures of the second picture, respectively, based on differences in pixel values of adjacent pixels;
determine, based on the hash value strings, the similarity between the first picture and each of the second picture and the plurality of rotated and mirrored pictures; and
take whichever of the second picture and the plurality of rotated and mirrored pictures has the highest similarity with the first picture as the main direction of the second picture, and provide that picture together with the first picture to a picture recognition model; and
a picture recognition model configured to extract feature values of a set of pictures provided by the picture main direction determination module and to recognize whether the set of pictures are similar.
According to a further embodiment of the invention, the picture main direction determination module is further configured to: if the similarity to the first picture of none of the second picture and the plurality of rotated and mirrored pictures is higher than a preset threshold value, skip the steps of taking the main direction and substituting into the picture recognition model, and directly judge that the second picture is dissimilar to the first picture.
According to a further embodiment of the present invention, converting the first picture and the second picture into pixel values, respectively, further comprises: converting the first picture and the second picture to grayscale to obtain pixel values, and
calculating a hash value string based on differences in pixel values of adjacent pixels further includes: comparing each pixel value of the picture with adjacent pixel values pixel by pixel, and obtaining a plurality of hash values based on the degree of difference between the pixel values; and concatenating the plurality of hash values into a hash value string.
Compared with the scheme in the prior art, the picture retrieval method provided by the invention has at least the following advantages:
(1) High accuracy. The method solves the problem that a neural network has poor recognition and recall rates for rotated and mirrored pictures when identifying similar pictures.
(2) High efficiency. For similar-picture identification algorithms based mainly on neural networks, the processing speed for similar pictures is improved by more than 90%.
(3) Stability. Pixel differences are calculated to form HASH strings; the technology is mature and can ensure the stability of the similarity recognition function.
(4) Flexible adaptation. The hash-based main direction positioning algorithm can be fitted in front of the convolution layers of any CNN model.
(5) The accuracy, efficiency and recall with which a neural network identifies rotated and mirrored pictures are improved, resource and time costs are saved, and faster processing and higher recall are obtained.
These and other features and advantages will become apparent upon reading the following detailed description and upon reference to the associated drawings. It is to be understood that both the foregoing general description and the following detailed description are explanatory only and are not restrictive of aspects as claimed.
Drawings
So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only certain typical aspects of this invention and are therefore not to be considered limiting of its scope, for the description may admit to other equally effective aspects.
Fig. 1 is an exemplary architecture diagram of a prior art model for identifying similar pictures.
Fig. 2 is an example flow diagram of a method for enhancing picture retrieval recall in accordance with one embodiment of the present invention.
Fig. 3 shows a schematic diagram of an example HASH function.
Fig. 4 is a schematic diagram of vertically mirroring a HASH string.
Fig. 5 is an exemplary architecture diagram of a system for identifying similar pictures according to one embodiment of the invention.
Detailed Description
The features of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings.
In order to solve the problem that rotation and mirror images cannot be effectively identified in the prior art, the invention provides a method capable of improving the identification accuracy and the identification efficiency of similar images and further improving the recall rate of image retrieval. Fig. 2 is an example flow diagram of a method for enhancing picture retrieval recall in accordance with one embodiment of the present invention.
The method starts at step 202, in which a first picture and a second picture are respectively converted into pixel values. The first picture is used as a reference picture, and the second picture is the picture to be identified, i.e. the picture whose similarity to the reference picture is to be determined. For convenience of description, hereinafter the first picture will be referred to as picture A, and the picture to be identified as picture B.
As is well known, a picture is typically represented by a color value for each pixel it contains, where the color values may have different color formats, e.g., RGB, CMYK, etc. Through experimentation, the inventors have found that using color values such as RGB does not significantly improve the calculation of the method of the present invention, but rather increases its complexity. Thus, converting a picture to pixel values may preferably comprise removing the color from the picture, i.e. converting it to grayscale, and then obtaining its pixel values. The range of the pixel values after grayscale conversion depends on the set number of gray levels, which may be set to 8, 16, 32, 64, 128, 256, or the like, for example. Subsequently, the method proceeds to step 204.
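The grayscale conversion of step 202 can be illustrated with a minimal sketch. The function name `to_gray_levels`, the input layout (rows of RGB tuples) and the luma weights are assumptions made for illustration; the description does not specify how the gray value is derived from the color value.

```python
def to_gray_levels(rgb_rows, levels=256):
    """Convert rows of (R, G, B) tuples (each 0-255) to gray values in 0..levels-1."""
    if not 2 <= levels <= 256:
        raise ValueError("levels must be between 2 and 256")
    gray_rows = []
    for row in rgb_rows:
        gray_row = []
        for r, g, b in row:
            # Assumed weighting (BT.601-style luma); integer math keeps results exact.
            gray = (299 * r + 587 * g + 114 * b) // 1000
            # Map the 0..255 gray value onto the requested number of gray levels.
            gray_row.append(gray * levels // 256)
        gray_rows.append(gray_row)
    return gray_rows


picture = [[(255, 255, 255), (0, 0, 0)], [(255, 0, 0), (0, 0, 255)]]
print(to_gray_levels(picture))            # 256 gray levels
print(to_gray_levels(picture, levels=8))  # 8 gray levels
```

Fewer gray levels compress the value range, so any difference threshold used later would presumably need to be scaled to match.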
In step 204, a hash value string for the first picture is calculated based on the differences in pixel values of neighboring pixels. As an example, the pixel values of picture A may be processed by comparing each two adjacent pixels in each row, obtaining a different hash (HASH) value according to the difference between the two pixel values, and finally obtaining a HASH string. Fig. 3 shows a schematic diagram of an example HASH function. The upper right-hand corner of Fig. 3 shows a close-up view of the grayscale pixels of one picture. Starting from the leftmost pixel of the first row, each pixel value is compared with that of its right-hand neighbor, one by one. As an example, assume the following HASH function rule: HASH writes "2" when the left pixel value is greater than or equal to the right pixel value plus a threshold (e.g., 4); otherwise HASH writes "1" when the left pixel value is greater than or equal to the right pixel value minus the threshold; and in all other cases HASH writes "0". In this example, the comparison of the first pixel of the first row with its neighbor satisfies left >= right - 4 but not left >= right + 4, so the HASH value is 1. The second pixel is then compared with the third pixel on its right, and so on, finally yielding a HASH value string consisting of 0, 1 and 2, which is denoted AHASH. It will be understood by those skilled in the art that a different number of intervals, for example more than the three intervals 0, 1 and 2, may be set as required, and that the specific difference threshold of each interval may be set according to requirements and the value range of the pixel values. Subsequently, the method proceeds to step 206.
In step 206, a hash value string of the second picture, and hash value strings of a plurality of rotated and mirrored pictures of the second picture, are calculated based on the differences in pixel values of adjacent pixels. First, in the same manner as for the first picture, a HASH value string of the second picture may be calculated and recorded as BHASH. Then, the hash value strings of the second picture rotated by 90 degrees, 180 degrees and 270 degrees, and of its horizontally mirrored and vertically mirrored pictures, are calculated and recorded as BHASH_90, BHASH_180, BHASH_270, BHASH_IMA_0 and BHASH_IMA_180, respectively.
In addition, since BHASH is a set of pixel value comparison results stored in a certain order, the HASH results of some of the rotated or mirrored pictures can, depending on the HASH function used, be obtained directly by rotating or mirroring the HASH result of the original picture. Fig. 4 is a schematic diagram of vertically mirroring a HASH string. As shown in Fig. 4, assume that the HASH values obtained from an original picture are arranged by pixel rows and columns as shown in (a) of Fig. 4. If the HASH function obtains HASH values by comparing left and right adjacent pixel values, the HASH result of the vertically mirrored picture is as shown in (b) of Fig. 4, and can therefore be obtained by directly mirroring the HASH values. Subsequently, the method proceeds to step 208.
In step 208, the similarity between the first picture and each of the second picture and its rotated and mirrored pictures is determined based on the hash value strings. According to one embodiment of the invention, the similarity between two pictures may be determined based on the error sums of their hash value strings. The error sum of a HASH value string is calculated by taking the absolute difference between each character and the next, from the first character to the last, and summing the differences. For example, assuming AHASH is 012012012, the error sum of AHASH is SE_AHASH = |0-1|+|1-2|+|2-0|+|0-1|+|1-2|+|2-0|+|0-1|+|1-2| = 10. Similarly, the error sum of BHASH may be calculated as SE_BHASH, and the error sums of the HASH value strings of the rotated and mirrored pictures as SE_BHASH_90, SE_BHASH_180, SE_BHASH_270, SE_BHASH_IMA_0 and SE_BHASH_IMA_180. Subsequently, SE_BHASH, SE_BHASH_90, SE_BHASH_180, SE_BHASH_270, SE_BHASH_IMA_0 and SE_BHASH_IMA_180 are each compared with SE_AHASH; the closer a picture's error sum is to SE_AHASH, the more similar that picture is to the first picture.
According to a further embodiment of the invention, the similarity between two pictures may be determined based on the ratio of the number of positions at which the two hash value strings are equal to the total length. For example, assuming that AHASH and BHASH each have a length of 64 bits, if 48 corresponding bits are the same, the same-bit ratio is 48/64 = 75%. The higher the ratio, the more similar the two pictures are.
In step 210, whichever of the second picture and its rotated and mirrored pictures has the highest similarity to the first picture is taken as the main direction of the second picture, and that picture and the first picture are substituted into the picture identification model. As described in step 208, the similarity to the first picture of the second picture and of each of its rotated and mirrored pictures is quantified based on the hash value strings, and the most similar one is taken as the main direction of the second picture, i.e. its direction is considered to be the same as the main direction of the first picture. For example, if the horizontal mirror image of the second picture is found to be most similar to the first picture, the horizontal mirror image replaces the second picture and is substituted together with the first picture into the neural network, for example the picture recognition model 102 in Fig. 1, where subsequent operations such as feature value calculation are performed. Determining the main direction before substitution into the picture identification model effectively overcomes the inability of the neural network to effectively identify rotated and mirrored pictures. With the effective identification of rotated and mirrored pictures, the precision and recall of picture retrieval are improved.
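The example rule of step 204 can be sketched as follows. The function names `row_diff_hash` and `hash_string` and the list-of-rows input are hypothetical; the three-valued rule and the threshold of 4 follow the example in the text.

```python
def row_diff_hash(gray_rows, threshold=4):
    """Compare each pixel with its right-hand neighbour and emit 0, 1 or 2 per pair."""
    hash_rows = []
    for row in gray_rows:
        codes = []
        for left, right in zip(row, row[1:]):
            if left >= right + threshold:
                codes.append(2)   # left is clearly larger than right
            elif left >= right - threshold:
                codes.append(1)   # left and right are within the threshold
            else:
                codes.append(0)   # left is clearly smaller than right
        hash_rows.append(codes)
    return hash_rows


def hash_string(hash_rows):
    """Concatenate the per-row codes into one hash value string."""
    return "".join(str(code) for row in hash_rows for code in row)


gray = [[10, 12, 30, 5], [50, 40, 41, 60]]
print(hash_string(row_diff_hash(gray)))  # prints 102210
```

Keeping the codes as a grid before concatenation is a design choice of this sketch; it preserves the row and column layout that Fig. 4 relies on.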
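The candidate pictures of step 206 can be generated from a grayscale grid as in the sketch below. The function name and dictionary keys are hypothetical, rotation is assumed to be clockwise, and "horizontal mirror" is assumed to mean a left-right flip; the text does not fix these conventions.

```python
def candidate_orientations(gray_rows):
    """Return picture B plus its three rotations and two mirror images."""
    g = [list(row) for row in gray_rows]
    return {
        "B": g,
        "B_90": [list(col) for col in zip(*g[::-1])],   # 90 degrees clockwise
        "B_180": [row[::-1] for row in g[::-1]],        # 180 degrees
        "B_270": [list(col) for col in zip(*g)][::-1],  # 270 degrees clockwise
        "B_IMA_0": [row[::-1] for row in g],            # left-right mirror (assumed)
        "B_IMA_180": [list(row) for row in g[::-1]],    # top-bottom mirror (assumed)
    }


for name, grid in candidate_orientations([[1, 2], [3, 4]]).items():
    print(name, grid)
```

For a non-square picture the 90 and 270 degree candidates swap width and height, so their hash value strings are laid out differently from that of the reference picture.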
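The vertical-mirror case of Fig. 4 can be checked with a short sketch, assuming the row-wise left/right comparison rule from the earlier example. The helper names are hypothetical. Whether other orientations can be derived the same way depends on the HASH function, as noted above, so only the vertical mirror is shown.

```python
def row_diff_hash(gray_rows, threshold=4):
    """Ternary code per left/right neighbour pair, kept as a grid (one list per row)."""
    grid = []
    for row in gray_rows:
        codes = []
        for left, right in zip(row, row[1:]):
            if left >= right + threshold:
                codes.append(2)
            elif left >= right - threshold:
                codes.append(1)
            else:
                codes.append(0)
        grid.append(codes)
    return grid


def mirror_vertically(grid):
    """Flip a grid top to bottom (works for pixel grids and hash grids alike)."""
    return [list(row) for row in reversed(grid)]


gray = [[10, 12, 30, 5], [50, 40, 41, 60], [7, 7, 7, 7]]
recomputed = row_diff_hash(mirror_vertically(gray))  # hash of the mirrored picture
derived = mirror_vertically(row_diff_hash(gray))     # mirrored hash of the original
print(recomputed == derived)  # prints True
```

A vertical flip only reorders rows and leaves every left/right pair intact, which is why the hash grid can be flipped instead of being recomputed from pixels.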
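The error-sum comparison can be sketched as below. The function names and the short sample strings are illustrative assumptions, not values from the patent.

```python
def error_sum(hash_str):
    """Sum of absolute differences between each hash character and the next one."""
    digits = [int(ch) for ch in hash_str]
    return sum(abs(a - b) for a, b in zip(digits, digits[1:]))


def closest_by_error_sum(ref_hash, candidates):
    """Return the candidate name whose error sum is closest to that of the reference."""
    ref = error_sum(ref_hash)
    return min(candidates, key=lambda name: abs(error_sum(candidates[name]) - ref))


print(error_sum("0121"))  # |0-1| + |1-2| + |2-1| = 3
print(closest_by_error_sum("0121", {"B": "2020", "B_90": "0120"}))  # prints B_90
```

Because the error sum collapses a whole string into one number, two quite different strings can share the same value, so this measure is best read as a coarse filter.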
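The same-position ratio can be sketched as follows. The function name is hypothetical, and the sketch assumes both strings have equal length, which in practice would require the compared pictures to share the same dimensions.

```python
def same_position_ratio(ref_hash, other_hash):
    """Fraction of positions at which two equal-length hash strings agree."""
    if not ref_hash or len(ref_hash) != len(other_hash):
        raise ValueError("hash strings must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(ref_hash, other_hash) if a == b)
    return matches / len(ref_hash)


a_hash = "0121" * 16  # 64 characters
# Keep the first 48 characters and change every one of the last 16.
b_hash = a_hash[:48] + "".join("0" if ch == "2" else "2" for ch in a_hash[48:])
print(same_position_ratio(a_hash, b_hash))  # prints 0.75
```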
Optionally, the method may further include a dissimilarity determination step 212. More specifically, if the similarity between the second picture and each of the rotated and mirrored pictures is found to be very low, the second picture and the first picture may be directly determined to be dissimilar, and the process of determining the main direction and substituting the picture identification model may be skipped, and the process proceeds to step 210. For example, in the example of calculating the error sum, a threshold may be set for the difference of the error sum of the second picture and the first picture, and step 210 may be entered only if there is at least one of the error sum of the second picture and its respective rotated and mirrored pictures and the difference of the error sum from the first picture is greater than the threshold. When comparing the number of the same bits as the corresponding bits, it may be set that step 210 is performed only when at least one comparison result is 75% or more. On the contrary, the second picture and the first picture can be directly judged to be dissimilar, a large amount of operation processes are saved, the obviously dissimilar pictures can be screened out before entering the identification model, and the overall operation efficiency of picture identification can be greatly improved, for example, the overall operation efficiency is improved by more than 90%.
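The gating between steps 210 and 212 can be sketched for the same-bit-ratio variant. The function name `pick_main_direction`, the dictionary layout and the sample scores are hypothetical; the 75% threshold is the example value given above.

```python
def pick_main_direction(similarities, threshold=0.75):
    """Return the name of the most similar orientation, or None if none reaches the threshold."""
    if not similarities:
        raise ValueError("no candidate orientations given")
    best = max(similarities, key=similarities.get)
    if similarities[best] < threshold:
        return None  # step 212: judged dissimilar, the recognition model is skipped
    return best      # step 210: this orientation is passed on to the model


print(pick_main_direction({"B": 0.41, "B_90": 0.92, "B_IMA_0": 0.55}))  # prints B_90
print(pick_main_direction({"B": 0.30, "B_90": 0.42, "B_IMA_0": 0.25}))  # prints None
```

Returning `None` instead of raising keeps the dissimilar case an ordinary outcome, which matches its role as an early exit rather than an error.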
It should be noted that the similarity determined in step 208 does not fully represent the actual degree of similarity between the two pictures; it is only a coarse estimate. Nevertheless, step 208 makes it possible to determine at least which orientation of the picture is most likely to be similar to the reference picture (i.e. to determine the main direction of the picture). It also addresses the recognition of small-angle rotations, because even when the picture is rotated by a small angle, the hash value string at one of the orientations is still closer to the hash value string of the reference picture. Furthermore, clearly dissimilar pictures can be excluded quickly and easily by comparing hash value strings.
Fig. 5 is an exemplary architecture diagram of a system for identifying similar pictures according to one embodiment of the invention. As shown in fig. 5, system 500 includes a picture main direction determination module 502 and a picture recognition model 504. Picture A, which is the reference picture, and picture B, which is to be identified, are first provided as inputs to the picture main direction determination module 502. The picture main direction determination module 502 may determine the main direction of picture B using, for example, the method described above in connection with fig. 2, and directly determine that picture B is dissimilar to picture A when none of the directions reaches the similarity threshold with respect to picture A. Subsequently, the picture main direction determination module 502 supplies the picture in the main direction of picture B, together with picture A, to the picture recognition model 504, which further determines whether the two are similar and outputs the recognition result accordingly.
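The data flow between the two components of system 500 can be sketched in Python as follows. This is only an illustration of the wiring: both stages are passed in as callables, and the names `identify_similar`, `find_main_direction` and `recognition_model`, as well as the convention of returning `None` when no direction reaches the threshold, are assumptions rather than part of the described system.

```python
# Illustrative sketch only; the interfaces of both stages are assumptions.
from typing import Callable, Optional, Sequence

Picture = Sequence[Sequence[int]]

def identify_similar(
    picture_a: Picture,
    picture_b: Picture,
    find_main_direction: Callable[[Picture, Picture], Optional[Picture]],
    recognition_model: Callable[[Picture, Picture], bool],
) -> bool:
    """Route a picture pair through the two stages of the system."""
    # Stage 1 (module 502): orient picture B, or report that no direction is close.
    oriented_b = find_main_direction(picture_a, picture_b)
    if oriented_b is None:
        # No direction of B reached the threshold: skip the model entirely.
        return False
    # Stage 2 (model 504): the model has the final say on similarity.
    return recognition_model(picture_a, oriented_b)
```

Keeping the hash-based stage in front of the model means the comparatively expensive recognition model is only invoked for pairs that survive the cheap comparison.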
The method for improving the picture retrieval recall rate of a CNN has been described above. The method addresses the low recall that arises when a neural network performs picture similarity recognition, and in particular the very low recall of neural networks on rotated and mirrored pictures. Based on the principle and characteristics of picture rotation, the method calculates hash values of the picture in multiple directions after rotation, determines the main direction of the picture by comparing these hash values, and then inputs the correspondingly rotated picture into the neural network to obtain the similarity result. This addresses the low recognition rate, low recognition accuracy and low recall rate that neural networks encounter when searching for similar pictures, completes the function of the similar picture recognition system, and ensures the recognition effect for similar pictures. By using the method of the invention, the operation efficiency of a neural-network-based picture similarity recognition algorithm can be improved by more than 90%, and the picture recall ratio can be improved by more than 20%.
What has been described above includes examples of aspects of the claimed subject matter. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the claimed subject matter, but one of ordinary skill in the art may recognize that many further combinations and permutations of the claimed subject matter are possible. Accordingly, the disclosed subject matter is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims.

Claims (6)

1. A method for enhancing picture retrieval recall, the method comprising:
converting a first picture and a second picture into pixel values, respectively, wherein the first picture is used as a reference picture;
calculating a hash value string of the first picture based on differences in pixel values of adjacent pixels;
calculating hash value strings of the second picture and of a plurality of rotated and mirrored pictures of the second picture, respectively, based on differences in pixel values of adjacent pixels, wherein at least one of the hash value strings of the plurality of rotated and mirrored pictures of the second picture is obtained by correspondingly rotating and/or mirroring the hash value string of the second picture;
determining similarities between the first picture and each of the second picture and the plurality of rotated and mirrored pictures based on the hash value strings; and
taking the one of the second picture and the plurality of rotated and mirrored pictures that has the highest similarity to the first picture as a main direction of the second picture and substituting that picture and the first picture into a picture identification model, and, if none of the second picture and the plurality of rotated and mirrored pictures has a similarity to the first picture higher than a preset threshold, skipping the determination of the main direction and the substitution into the picture identification model and directly determining that the second picture is dissimilar to the first picture,
wherein calculating the hash value string based on the differences in pixel values of adjacent pixels further comprises:
comparing each pixel value of the picture with adjacent pixel values pixel by pixel, and obtaining a plurality of hash values based on the degree of difference of the pixel values; and
concatenating the plurality of hash values into a hash value string.
2. The method of claim 1, wherein converting the first picture and the second picture into pixel values, respectively, further comprises:
converting the first picture and the second picture to grayscale to obtain the pixel values.
3. The method of claim 1, wherein determining the similarities between the first picture and each of the second picture and the plurality of rotated and mirrored pictures based on the hash value strings further comprises:
calculating an error sum of each picture, wherein the error sum is a sum of absolute differences of adjacent hash values in the hash value string; and
calculating a difference between the error sum of each of the second picture and the plurality of rotated and mirrored pictures and the error sum of the first picture, wherein a smaller difference indicates a higher similarity between that picture and the first picture.
4. The method of claim 1, wherein determining the similarities between the first picture and each of the second picture and the plurality of rotated and mirrored pictures based on the hash value strings further comprises:
calculating, for each of the second picture and the plurality of rotated and mirrored pictures, a ratio of the number of bits in its hash value string that are equal to the corresponding bits in the hash value string of the first picture to the total length of the hash value string of the first picture, wherein a greater ratio indicates a higher similarity between that picture and the first picture.
5. A system for identifying similar pictures, the system comprising:
a picture main direction determination module configured to:
convert a first picture and a second picture into pixel values, respectively, wherein the first picture is used as a reference picture;
calculate a hash value string of the first picture based on differences in pixel values of adjacent pixels;
calculate hash value strings of the second picture and of a plurality of rotated and mirrored pictures of the second picture, respectively, based on differences in pixel values of adjacent pixels, wherein at least one of the hash value strings of the plurality of rotated and mirrored pictures of the second picture is obtained by correspondingly rotating and/or mirroring the hash value string of the second picture;
determine similarities between the first picture and each of the second picture and the plurality of rotated and mirrored pictures based on the hash value strings; and
take the one of the second picture and the plurality of rotated and mirrored pictures that has the highest similarity to the first picture as a main direction of the second picture and provide that picture and the first picture to a picture recognition model, and, if none of the second picture and the plurality of rotated and mirrored pictures has a similarity to the first picture higher than a preset threshold, skip the determination of the main direction and the provision to the picture recognition model and directly determine that the second picture is dissimilar to the first picture; and
a picture recognition model configured to extract feature values of a set of pictures provided by the picture main direction determination module and to recognize whether the set of pictures are similar,
wherein calculating the hash value string based on the differences in pixel values of adjacent pixels further comprises:
comparing each pixel value of the picture with adjacent pixel values pixel by pixel, and obtaining a plurality of hash values based on the degree of difference of the pixel values; and
concatenating the plurality of hash values into a hash value string.
6. The system of claim 5, wherein converting the first picture and the second picture into pixel values, respectively, further comprises: converting the first picture and the second picture to grayscale to obtain the pixel values.
CN202011341433.5A 2020-11-25 2020-11-25 Method and system for improving picture retrieval recall rate Active CN113704532B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011341433.5A CN113704532B (en) 2020-11-25 2020-11-25 Method and system for improving picture retrieval recall rate

Publications (2)

Publication Number Publication Date
CN113704532A CN113704532A (en) 2021-11-26
CN113704532B true CN113704532B (en) 2024-04-26

Family

ID=78646662

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011341433.5A Active CN113704532B (en) 2020-11-25 2020-11-25 Method and system for improving picture retrieval recall rate

Country Status (1)

Country Link
CN (1) CN113704532B (en)

Citations (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003242509A (en) * 2001-12-13 2003-08-29 Toshiba Corp Pattern recognition device and method
CN104317902A (en) * 2014-10-24 2015-01-28 西安电子科技大学 Image retrieval method based on local locality preserving iterative quantization hash
JP2015056077A (en) * 2013-09-12 2015-03-23 Kddi株式会社 Image retrieval device, system, program, and method using image based binary feature vector
US9384519B1 (en) * 2013-12-12 2016-07-05 Zazzle Inc. Finding similar images based on extracting keys from images
CN106599028A (en) * 2016-11-02 2017-04-26 华南理工大学 Book content searching and matching method based on video image processing
CN106650829A (en) * 2017-01-04 2017-05-10 华南理工大学 Picture similarity calculation method
CN106886599A (en) * 2017-02-28 2017-06-23 北京京东尚科信息技术有限公司 Image search method and device
CN106980641A (en) * 2017-02-09 2017-07-25 上海交通大学 The quick picture retrieval system of unsupervised Hash and method based on convolutional neural networks
CN108287833A (en) * 2017-01-09 2018-07-17 北京艺鉴通科技有限公司 It is a kind of for the art work identification to scheme to search drawing method
CN109284411A (en) * 2017-07-19 2019-01-29 哈尔滨工业大学深圳研究生院 One kind being based on having supervision hypergraph discretized image binary-coding method
CN110263205A (en) * 2019-06-06 2019-09-20 温州大学 A kind of search method for ginseng image
CN110287348A (en) * 2019-04-15 2019-09-27 南京邮电大学 A kind of GIF format picture searching method based on machine learning
WO2019214269A1 (en) * 2018-05-07 2019-11-14 深圳壹账通智能科技有限公司 Verification picture processing method and apparatus, and computer device and storage medium
CN110569878A (en) * 2019-08-08 2019-12-13 上海汇付数据服务有限公司 Photograph background similarity clustering method based on convolutional neural network and computer
CN110598022A (en) * 2019-08-05 2019-12-20 华中科技大学 Image retrieval system and method based on robust deep hash network
CN110599486A (en) * 2019-09-20 2019-12-20 福州大学 Method and system for detecting video plagiarism
CN110807473A (en) * 2019-10-12 2020-02-18 浙江大华技术股份有限公司 Target detection method, device and computer storage medium
CN110942002A (en) * 2019-11-18 2020-03-31 中山大学 Unmanned aerial vehicle aerial photography video frame positioning method based on rotation invariant perceptual hashing
CN111340109A (en) * 2020-02-25 2020-06-26 深圳市景阳科技股份有限公司 Image matching method, device, equipment and storage medium
CN111368687A (en) * 2020-02-28 2020-07-03 成都市微泊科技有限公司 Sidewalk vehicle illegal parking detection method based on target detection and semantic segmentation
WO2020147857A1 (en) * 2019-01-18 2020-07-23 上海极链网络科技有限公司 Method and system for extracting, storing and retrieving mass video features
CN111666442A (en) * 2020-06-02 2020-09-15 腾讯科技(深圳)有限公司 Image retrieval method and device and computer equipment
CN111967033A (en) * 2020-08-28 2020-11-20 深圳康佳电子科技有限公司 Picture encryption method, device, terminal and storage medium based on face recognition

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150169644A1 (en) * 2013-01-03 2015-06-18 Google Inc. Shape-Gain Sketches for Fast Image Similarity Search

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220210

Address after: Room 1423, No. 1256 and 1258, Wanrong Road, Jing'an District, Shanghai 200072

Applicant after: Tianyi Digital Life Technology Co.,Ltd.

Address before: 201702 3rd floor, 158 Shuanglian Road, Qingpu District, Shanghai

Applicant before: Tianyi Smart Family Technology Co.,Ltd.

GR01 Patent grant