CN111191065B - Homologous image determining method and device - Google Patents

Info

Publication number
CN111191065B
Authority
CN
China
Prior art keywords
images
evaluation
feature
target
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911312848.7A
Other languages
Chinese (zh)
Other versions
CN111191065A (en)
Inventor
胡江明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Haier Uplus Intelligent Technology Beijing Co Ltd
Original Assignee
Haier Uplus Intelligent Technology Beijing Co Ltd

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50: Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Abstract

The invention provides a method and a device for determining homologous images. The method comprises the following steps: acquiring a target image to be processed and a plurality of images in a database; determining a feature vector of the target image and acquiring feature tensors of the plurality of images; determining a target evaluation vector from the feature vector of the target image and the feature tensors of the plurality of images, wherein each component of the target evaluation vector represents the probability that the corresponding one of the plurality of images is homologous with the target image; and determining each image whose component of the target evaluation vector is greater than or equal to a preset threshold to be a homologous image of the target image. The invention thereby addresses the problem in the related art that homologous images are difficult to retrieve accurately and rapidly from a database, so that homologous images can be determined from the database rapidly and accurately.

Description

Homologous image determining method and device
Technical Field
The invention relates to the field of intelligent household appliances, in particular to a method and a device for determining homologous images.
Background
Homologous pictures or images are pictures or images that share the same source on the same platform. For example, after a picture is uploaded to a platform, it may be redistributed after various transformations such as rotation, scaling, shearing, shading, filtering, blurring and occlusion; all of the resulting pictures are homologous pictures. A picture obtained by photographing an original picture is also homologous with that original. However, pictures of the same scene that are photographed or drawn by different people are not homologous pictures.
In the internet era, users generate large amounts of data. After a data source enters the internet, users process the original data source according to their own needs, which produces large amounts of new data. For example, when a user uploads a picture to a social networking site, the picture is modified during propagation by different users through compression, cropping, format changes, added user information, light Photoshop (PS) retouching and other operations, which produces a large number of similar pictures and makes the picture harder to trace. Simple fingerprints such as the existing MD5 cannot be used to retrieve such homologous images, because any modification changes the digest; only content-based image retrieval methods or retrieval methods based on semantic features can be used.
Current techniques for determining homologous pictures are mainly based on picture similarity recognition, text labels and watermarks. Most picture similarity techniques use methods such as SIFT, which are computationally expensive and have low accuracy in identifying homologous pictures. The main reason is that methods such as SIFT rely on fuzzy matching and cannot accurately extract feature points from smooth targets, so neither the exact similarity of the pictures nor their homology can be guaranteed. Text-label techniques are simpler, but labeling involves a large workload, is difficult to implement across platforms, and the labels can be lost through user operations. In the internet big data era, it is therefore difficult for these traditional techniques to determine homologous pictures accurately and rapidly.
For the problem in the related art that homologous images are difficult to retrieve accurately and rapidly from a database, no solution has yet been proposed.
Disclosure of Invention
Embodiments of the invention provide a method and a device for determining homologous images, which at least solve the problem in the related art that homologous images are difficult to retrieve accurately and rapidly from a database.
According to an embodiment of the present invention, there is provided a homologous image determining method including:
acquiring a target image to be processed and a plurality of images in a database;
determining a feature vector of the target image and acquiring feature tensors of the plurality of images;
determining a target evaluation vector from the feature vector of the target image and the feature tensors of the plurality of images, wherein each component of the target evaluation vector represents the probability that the corresponding one of the plurality of images is homologous with the target image;
and determining each image whose component of the target evaluation vector is greater than or equal to a preset threshold to be a homologous image of the target image.
Optionally, determining the target evaluation vector from the feature vector of the target image and the feature tensors of the plurality of images comprises:
expanding the feature vector of the target image into a feature tensor of the same dimension as the feature tensors of the plurality of images;
splicing the expanded feature tensor of the target image with the feature tensors of the plurality of images to obtain a first spliced feature tensor;
inputting the first spliced feature tensor into a pre-trained evaluation model to obtain an evaluation matrix output by the evaluation model;
determining a first evaluation vector according to the evaluation matrix;
determining the first evaluation vector as the target evaluation vector.
Optionally, after determining the first evaluation vector according to the evaluation matrix, the method further comprises:
determining the cosine similarity between the feature vector of the target image and the feature tensors of the plurality of images to obtain a second evaluation vector;
in which case determining the target evaluation vector from the feature vector of the target image and the feature tensors of the plurality of images comprises:
determining the product of the first evaluation vector and the second evaluation vector as the target evaluation vector.
Optionally, inputting the first spliced feature tensor into the pre-trained evaluation model to obtain the evaluation matrix output by the evaluation model includes:
inputting the first spliced feature tensor into a first fully connected layer of the target evaluation model to obtain a first feature tensor output by the first fully connected layer;
inputting the first feature tensor into a second fully connected layer of the target evaluation model to obtain a second feature tensor output by the second fully connected layer;
inputting the second feature tensor into a softmax layer of the target evaluation model to obtain the evaluation matrix output by the softmax layer, wherein the evaluation matrix is a two-dimensional matrix; the row index of the evaluation matrix corresponds to the index of each of the plurality of images in the database, and the column index indicates whether the image is homologous: the first column holds the probability that each image in the database is not homologous with the target image, and the second column holds the probability that each image in the database is homologous with the target image;
determining the first evaluation vector according to the evaluation matrix comprises:
selecting the second column vector from the evaluation matrix and determining the second column vector as the first evaluation vector.
Optionally, determining the feature vector of the target image includes:
inputting the target image into a pre-trained target neural network model to obtain the feature vector of the target image output by the target neural network model.
Optionally, before acquiring the target image to be processed and the plurality of images in the database, the method further comprises:
acquiring a predetermined number of original images and, for each original image, a corresponding group of images, wherein the group of images is a set containing equal numbers of images that are homologous and non-homologous with the original image;
training an original neural network model using the predetermined number of original images and the corresponding groups of images to obtain the target neural network model, wherein the predetermined number of original images and the corresponding groups of images are input into the original neural network model, the feature vector of the original image output by the trained target neural network model and the feature vector actually corresponding to the original image satisfy a predetermined objective function, and the feature tensor of the group of images output by the trained target neural network model and the feature tensor actually corresponding to the group of images satisfy the predetermined objective function.
Optionally, after training the original neural network model using the predetermined number of original images and the corresponding groups of images to obtain the target neural network model, the method further includes:
expanding the feature vector of the original image into a feature tensor of the same dimension as the feature tensor of the group of images;
splicing the expanded feature tensor of the original image with the feature tensor of the group of images to obtain a second spliced feature tensor;
training an original evaluation model according to the second spliced feature tensor to obtain the target evaluation model, wherein the second spliced feature tensor is input into the original evaluation model, and the evaluation matrix output by the trained target evaluation model for the second spliced feature tensor and the evaluation matrix actually corresponding to the second spliced feature tensor satisfy a predetermined function.
According to another embodiment of the present invention, there is also provided a homologous image determining apparatus including:
a first acquisition module, configured to acquire a target image to be processed and a plurality of images in a database;
a first determining module, configured to determine a feature vector of the target image and acquire feature tensors of the plurality of images;
a second determining module, configured to determine a target evaluation vector from the feature vector of the target image and the feature tensors of the plurality of images, wherein each component of the target evaluation vector represents the probability that the corresponding one of the plurality of images is homologous with the target image;
and a third determining module, configured to determine each image whose component of the target evaluation vector is greater than or equal to a preset threshold to be a homologous image of the target image.
Optionally, the second determining module includes:
an expansion sub-module, configured to expand the feature vector of the target image into a feature tensor of the same dimension as the feature tensors of the plurality of images;
a splicing sub-module, configured to splice the expanded feature tensor of the target image with the feature tensors of the plurality of images to obtain a first spliced feature tensor;
an input sub-module, configured to input the first spliced feature tensor into a pre-trained evaluation model to obtain an evaluation matrix output by the evaluation model;
a first determination sub-module, configured to determine a first evaluation vector according to the evaluation matrix;
and a second determination sub-module, configured to determine the first evaluation vector as the target evaluation vector.
Optionally, the apparatus further comprises:
a third determination sub-module, configured to determine the cosine similarity between the feature vector of the target image and the feature tensors of the plurality of images to obtain a second evaluation vector;
a fourth determination sub-module, configured to determine the target evaluation vector from the feature vector of the target image and the feature tensors of the plurality of images, which includes:
a fifth determination sub-module, configured to determine the product of the first evaluation vector and the second evaluation vector as the target evaluation vector.
Optionally, the input sub-module is further configured to:
input the first spliced feature tensor into a first fully connected layer of the target evaluation model to obtain a first feature tensor output by the first fully connected layer;
input the first feature tensor into a second fully connected layer of the target evaluation model to obtain a second feature tensor output by the second fully connected layer;
input the second feature tensor into a softmax layer of the target evaluation model to obtain the evaluation matrix output by the softmax layer, wherein the evaluation matrix is a two-dimensional matrix; the row index of the evaluation matrix corresponds to the index of each of the plurality of images in the database, and the column index indicates whether the image is homologous: the first column holds the probability that each image in the database is not homologous with the target image, and the second column holds the probability that each image in the database is homologous with the target image;
the second determination sub-module is further configured to select the second column vector from the evaluation matrix and determine the second column vector as the first evaluation vector.
Optionally, the first determining module is further configured to:
input the target image into a pre-trained target neural network model to obtain the feature vector of the target image output by the target neural network model.
Optionally, the apparatus further comprises:
a second acquisition module, configured to acquire a predetermined number of original images and, for each original image, a corresponding group of images, wherein the group of images is a set containing equal numbers of images that are homologous and non-homologous with the original image;
a first training module, configured to train an original neural network model using the predetermined number of original images and the corresponding groups of images to obtain the target neural network model, wherein the predetermined number of original images and the corresponding groups of images are input into the original neural network model, the feature vector of the original image output by the trained target neural network model and the feature vector actually corresponding to the original image satisfy a predetermined objective function, and the feature tensor of the group of images output by the trained target neural network model and the feature tensor actually corresponding to the group of images satisfy the predetermined objective function.
Optionally, the apparatus further comprises:
an expansion module, configured to expand the feature vector of the original image into a feature tensor of the same dimension as the feature tensor of the group of images;
a splicing module, configured to splice the expanded feature tensor of the original image with the feature tensor of the group of images to obtain a second spliced feature tensor;
a second training module, configured to train an original evaluation model according to the second spliced feature tensor to obtain the target evaluation model, wherein the second spliced feature tensor is input into the original evaluation model, and the evaluation matrix output by the trained target evaluation model for the second spliced feature tensor and the evaluation matrix actually corresponding to the second spliced feature tensor satisfy a predetermined function.
According to a further embodiment of the invention, there is also provided a computer-readable storage medium having stored therein a computer program, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
According to a further embodiment of the invention, there is also provided an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
According to the invention, a target image to be processed and a plurality of images in a database are acquired; a feature vector of the target image is determined and feature tensors of the plurality of images are acquired; a target evaluation vector is determined from the feature vector of the target image and the feature tensors of the plurality of images, wherein each component of the target evaluation vector represents the probability that the corresponding one of the plurality of images is homologous with the target image; and each image whose component of the target evaluation vector is greater than or equal to a preset threshold is determined to be a homologous image of the target image. This addresses the problem in the related art that homologous images are difficult to retrieve accurately and rapidly from a database, so that homologous images can be determined from the database rapidly and accurately.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a block diagram of the hardware structure of a mobile terminal for a homologous image determining method according to an embodiment of the present application;
FIG. 2 is a flowchart of a homologous image determining method according to an embodiment of the present application;
FIG. 3 is a first flowchart of a method for determining a homologous picture based on a convolutional neural network according to an embodiment of the present application;
FIG. 4 is a second flowchart of a method for determining a homologous picture based on a convolutional neural network according to an embodiment of the present application;
FIG. 5 is a block diagram of a homologous image determining apparatus according to an embodiment of the present application;
FIG. 6 is a first block diagram of a homologous image determining apparatus according to a preferred embodiment of the present application;
FIG. 7 is a second block diagram of a homologous image determining apparatus according to a preferred embodiment of the present application.
Detailed Description
The application will be described in detail hereinafter with reference to the drawings in conjunction with embodiments. It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
Example 1
The method according to the first embodiment of the present application may be implemented in a mobile terminal, a computer terminal or a similar computing device. Taking a mobile terminal as an example, fig. 1 is a block diagram of a hardware structure of a mobile terminal according to an embodiment of the present application, where, as shown in fig. 1, the mobile terminal 10 may include one or more (only one is shown in fig. 1) processors 102 (the processors 102 may include, but are not limited to, a microprocessor MCU or a programmable logic device FPGA or the like) and a memory 104 for storing data, and optionally, the mobile terminal may further include a transmission device 106 for a communication function and an input/output device 108. It will be appreciated by those skilled in the art that the structure shown in fig. 1 is merely illustrative and not limiting of the structure of the mobile terminal described above. For example, the mobile terminal 10 may also include more or fewer components than shown in FIG. 1 or have a different configuration than shown in FIG. 1.
The memory 104 may be used to store a computer program, for example, a software program and modules of application software, such as the computer program corresponding to the homologous image determining method in an embodiment of the present invention. The processor 102 executes the computer program stored in the memory 104 to perform various functional applications and data processing, that is, to implement the method described above. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the mobile terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is configured to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by the communication provider of the mobile terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC for short) that can connect to other network devices through a base station in order to communicate with the internet. In another example, the transmission device 106 may be a Radio Frequency (RF) module, which is configured to communicate with the internet wirelessly.
Based on the above mobile terminal or network architecture, this embodiment provides a method for determining a homologous image. FIG. 2 is a flowchart of a homologous image determining method according to an embodiment of the present invention. As shown in FIG. 2, the flow includes the following steps:
Step S202, acquiring a target image to be processed and a plurality of images in a database;
Step S204, determining a feature vector of the target image and acquiring feature tensors of the plurality of images;
Step S206, determining a target evaluation vector from the feature vector of the target image and the feature tensors of the plurality of images, wherein each component of the target evaluation vector represents the probability that the corresponding one of the plurality of images is homologous with the target image;
Step S208, determining each image whose component of the target evaluation vector is greater than or equal to a preset threshold to be a homologous image of the target image.
Through steps S202 to S208, every database image is scored against the target image and the images whose scores reach the preset threshold are returned as homologous images. This addresses the problem in the related art that homologous images are difficult to determine accurately and rapidly, so that homologous images can be determined rapidly and accurately.
In the embodiment of the present invention, the target evaluation vector may be determined in various manners. In an optional embodiment, step S206 may specifically include: expanding the feature vector of the target image into a feature tensor of the same dimension as the feature tensors of the plurality of images; splicing the expanded feature tensor of the target image with the feature tensors of the plurality of images to obtain a first spliced feature tensor; inputting the first spliced feature tensor into a pre-trained evaluation model to obtain an evaluation matrix output by the evaluation model; determining a first evaluation vector according to the evaluation matrix; and determining the first evaluation vector as the target evaluation vector.
In another optional embodiment, after the first evaluation vector is determined according to the evaluation matrix, the cosine similarity between the feature vector of the target image and the feature tensors of the plurality of images is determined to obtain a second evaluation vector. Correspondingly, step S206 specifically includes: determining the product of the first evaluation vector and the second evaluation vector as the target evaluation vector.
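As a non-limiting illustration, the expansion and splicing steps may be sketched as follows. The sketch assumes that "expanding" means repeating the target feature vector once per database image and that "splicing" means concatenation along the feature axis; the embodiment fixes neither choice, and the name `expand_and_splice` is hypothetical.

```python
def expand_and_splice(target_vec, db_features):
    """Return one spliced row per database image.

    target_vec:  feature vector of the target image, length d
    db_features: feature tensor of the database images, n rows of length d
    """
    # Expansion: repeat the target vector once per database image -> (n, d).
    expanded = [list(target_vec) for _ in db_features]
    # Splicing (assumed here to be along the feature axis) -> (n, 2d).
    return [t_row + list(db_row) for t_row, db_row in zip(expanded, db_features)]


stitched = expand_and_splice([1.0, 2.0], [[3.0, 4.0], [5.0, 6.0]])
# stitched == [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 5.0, 6.0]]
```

Each row of the result pairs the target features with the features of one database image, so an evaluation model can score all database images in a single pass.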
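A minimal sketch of this combination is given below. It assumes that the cosine similarity is computed row by row against the feature tensor and that the "product" is taken component by component, so that the result is again a vector with one entry per database image. The function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def combine_scores(first_eval, target_vec, db_features):
    """Multiply the model's first evaluation vector by the cosine-based second one."""
    second_eval = [cosine_similarity(target_vec, row) for row in db_features]
    # Component-wise product (an assumption about what "product" means here).
    return [a * b for a, b in zip(first_eval, second_eval)]
```

Under this reading, a database image keeps a high final score only when both the evaluation model and the cosine similarity rate it as close to the target image.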
Further, inputting the first spliced feature tensor into the pre-trained evaluation model to obtain the evaluation matrix output by the evaluation model may specifically include:
inputting the first spliced feature tensor into a first fully connected layer of the target evaluation model to obtain a first feature tensor output by the first fully connected layer;
inputting the first feature tensor into a second fully connected layer of the target evaluation model to obtain a second feature tensor output by the second fully connected layer;
inputting the second feature tensor into a softmax layer of the target evaluation model to obtain the evaluation matrix output by the softmax layer, wherein the evaluation matrix is a two-dimensional matrix; the row index of the evaluation matrix corresponds to the index of each of the plurality of images in the database, and the column index indicates whether the image is homologous: the first column holds the probability that each image in the database is not homologous with the target image, and the second column holds the probability that each image in the database is homologous with the target image;
correspondingly, determining the first evaluation vector according to the evaluation matrix specifically includes:
selecting the second column vector from the evaluation matrix and determining the second column vector as the first evaluation vector.
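The forward pass described above may be sketched as follows. This is an illustration under stated assumptions rather than the trained model: the weights are passed in by the caller, the layer sizes are arbitrary, and the ReLU between the two fully connected layers is an assumption, since the embodiment does not name an activation function. All function names are hypothetical.

```python
import math


def dense(rows, weights, bias):
    """Fully connected layer: rows (n, d_in) x weights (d_in, d_out) + bias (d_out)."""
    columns = list(zip(*weights))
    return [[sum(x * w for x, w in zip(row, col)) + b
             for col, b in zip(columns, bias)] for row in rows]


def relu(rows):
    # Assumed activation; not specified in the embodiment.
    return [[max(0.0, x) for x in row] for row in rows]


def softmax(rows):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in rows:
        m = max(row)
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def evaluate(stitched, w1, b1, w2, b2):
    """Return the (n, 2) evaluation matrix and its second column."""
    hidden = relu(dense(stitched, w1, b1))        # first fully connected layer
    logits = dense(hidden, w2, b2)                # second fully connected layer, 2 outputs
    evaluation_matrix = softmax(logits)           # column 0: non-homologous, column 1: homologous
    first_eval = [row[1] for row in evaluation_matrix]
    return evaluation_matrix, first_eval
```

Because each row of the softmax output sums to 1, the second column alone carries the homologous probability for every database image, which is why it is taken as the first evaluation vector.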
In the embodiment of the present invention, step S204 may specifically include: inputting the target image into a pre-trained target neural network model to obtain the feature vector of the target image output by the target neural network model.
Before the target image to be processed and the plurality of images in the database are acquired, the target neural network model and the target evaluation model are trained. Specifically, a predetermined number of original images and, for each original image, a corresponding group of images are acquired, wherein the group of images is a set containing equal numbers of images that are homologous and non-homologous with the original image.
An original neural network model is then trained using the predetermined number of original images and the corresponding groups of images to obtain the target neural network model, wherein the predetermined number of original images and the corresponding groups of images are input into the original neural network model, the feature vector of the original image output by the trained target neural network model and the feature vector actually corresponding to the original image satisfy a predetermined objective function, and the feature tensor of the group of images output by the trained target neural network model and the feature tensor actually corresponding to the group of images satisfy the predetermined objective function.
Further, after the target neural network model is obtained, the feature vector of the original image is expanded into a feature tensor of the same dimension as the feature tensor of the group of images; the expanded feature tensor of the original image is spliced with the feature tensor of the group of images to obtain a second spliced feature tensor; and an original evaluation model is trained according to the second spliced feature tensor to obtain the target evaluation model, wherein the second spliced feature tensor is input into the original evaluation model, and the evaluation matrix output by the trained target evaluation model for the second spliced feature tensor and the evaluation matrix actually corresponding to the second spliced feature tensor satisfy a predetermined function.
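The embodiment does not specify the predetermined function used to compare the output evaluation matrix with the actual one. Purely as an illustrative assumption, the sketch below uses cross-entropy against a one-hot reference matrix, and further assumes that the homologous images come first within each group. The names `build_reference_matrix` and `cross_entropy` are hypothetical.

```python
import math


def build_reference_matrix(num_homologous, num_non_homologous):
    """One-hot reference evaluation matrix: column 0 = non-homologous, column 1 = homologous.

    Assumes the group lists its homologous images first (an illustrative convention).
    """
    homologous_rows = [[0.0, 1.0] for _ in range(num_homologous)]
    non_homologous_rows = [[1.0, 0.0] for _ in range(num_non_homologous)]
    return homologous_rows + non_homologous_rows


def cross_entropy(predicted, reference, eps=1e-12):
    """Mean cross-entropy between predicted and reference evaluation matrices."""
    total = 0.0
    for p_row, r_row in zip(predicted, reference):
        total -= sum(r * math.log(p + eps) for p, r in zip(p_row, r_row))
    return total / len(predicted)
```

With a balanced group, as described above, the two columns of the reference matrix contain the same number of ones, so neither class dominates a loss of this kind.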
The determination of homologous images is described in detail below, taking as an example the case in which the target evaluation model serves as feature similarity evaluation model A and a cosine similarity calculation model serves as feature similarity evaluation model B.
Fig. 3 is a first flowchart of a method for determining homologous pictures based on a convolutional neural network according to an embodiment of the present invention. As shown in Fig. 3, the method includes:
S301, establishing a feature extraction model, which specifically includes: inputting an original image and a group of images into the feature extraction model to obtain a feature vector corresponding to the original image and a feature tensor corresponding to the group of images, and expanding the feature vector corresponding to the original image to the same dimensions as the feature tensor corresponding to the group of images, wherein the group of images is a set containing equal numbers of homologous images and non-homologous images corresponding to the original image.
S302, establishing a feature similarity evaluation model A, which specifically includes: inputting the tensor obtained by the feature extraction model into the feature similarity evaluation model A, comparing similarity, and assigning scores to generate an evaluation vector A, wherein the evaluation vector A corresponds to the first similarity evaluation vector and the evaluation vector B corresponds to the second similarity evaluation vector.
S303, establishing a feature similarity evaluation model B by the same process as S302 to generate an evaluation vector B.
S304, multiplying the two evaluation vectors element-wise to obtain a final evaluation vector, and sorting the results in descending order.
S305, partitioning the results by a threshold, wherein the images falling within the threshold are homologous images; the threshold in the training stage is set to 0.5.
S306, obtaining the feature vectors corresponding to the first target image and the second target image based on the feature extraction model.
S307, based on the feature similarity evaluation models A and B, obtaining the evaluation vector A and the evaluation vector B for the tensor obtained by the feature extraction model, and multiplying them element-wise to obtain a final evaluation vector, wherein the final evaluation vector corresponds to the target similarity evaluation vector. A similarity value can be calculated from the target similarity evaluation vector: if the similarity value is within the threshold range, the images are determined to be homologous images; otherwise, they are determined to be non-homologous images. The threshold setting depends on the specific data set.
In addition to identifying whether a plurality of images or pictures are homologous, the embodiment of the invention can also take a single picture and search the database for pictures homologous to it. This is described in detail below, taking the retrieval of homologous pictures from a database as an example.
Fig. 4 is a second flowchart of a method for determining homologous pictures based on a convolutional neural network according to an embodiment of the present invention. As shown in Fig. 4, the method includes:
S401, establishing a feature extraction model, which specifically includes: inputting an original image and a group of pictures into the feature extraction model to obtain a feature vector corresponding to the original image and a feature tensor corresponding to the group of pictures, and expanding the feature vector corresponding to the original image to the same dimensions as the feature tensor corresponding to the group of pictures, wherein the group of pictures is a set containing equal numbers of homologous pictures and non-homologous pictures corresponding to the original image.
S402, establishing a feature similarity evaluation model A, which specifically includes: inputting the tensor obtained by the feature extraction model into the feature similarity evaluation model A, comparing similarity, and assigning scores to generate an evaluation vector A.
S403, establishing a feature similarity evaluation model B by the same process as S402 to generate an evaluation vector B.
S404, multiplying the two evaluation vectors element-wise to obtain a final evaluation vector, and sorting the results in descending order.
S405, partitioning the results by a threshold, wherein the pictures falling within the threshold are homologous pictures; the threshold in the training stage is set to 0.5.
S406, based on the feature extraction model, obtaining the feature tensor corresponding to each original picture and the feature tensors corresponding to all pictures in a database.
S407, based on the feature similarity evaluation models A and B, obtaining the evaluation vectors A and B for the tensors obtained by the feature extraction model, multiplying them element-wise to obtain final evaluation vectors, and sorting in descending order, wherein the pictures falling within the threshold are the retrieved homologous pictures; the threshold is set according to the specific data set.
In step S301 or S401 above, a feature extraction model is established. Specifically, the CNN uses the standard convolution structure of the VGG network. Every convolution layer performs the same operation; only the data each layer receives differs. Let the output vector of the preceding convolution layer be X; the following convolution layer then computes $f(W \otimes X + b)$, where W denotes the parameters of the convolution kernel, $\otimes$ denotes the convolution acting on the corresponding region of the input data, b is the bias of the layer, and f is the activation function. The convolution structure is followed by two fully connected (FC) layers. If the final output vector of the convolution structure is X, a fully connected layer computes $y = f(AX + b)$, where A denotes the weight parameters of the layer, b denotes the bias of the layer, and f is the activation function. An original picture and a picture set formed by packing n/2 homologous pictures and n/2 non-homologous pictures are input into the feature extraction model. In the embodiment of the invention, a 1 × 512 feature vector is generated for the original image, and an n × 512 feature tensor is generated for the set of n pictures. The feature vector corresponding to the original image is expanded into an n × 512 feature tensor and then spliced with the n × 512 feature tensor corresponding to the picture set to obtain an n × 1024 feature tensor. In the embodiment of the present invention, n = 120. The above is the process of establishing the feature extraction model.
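The expansion and splicing step described above can be sketched as follows. This is a minimal illustration, assuming NumPy arrays stand in for the outputs of the feature extraction model; the function name `splice_features` and the random inputs are hypothetical and do not appear in the embodiment.

```python
import numpy as np

def splice_features(query_vec, group_feats):
    """Tile a (1, d) query feature vector to (n, d) and concatenate it
    with the (n, d) feature tensor of the picture set, giving (n, 2d)."""
    query_vec = np.asarray(query_vec, dtype=np.float32).reshape(1, -1)
    group_feats = np.asarray(group_feats, dtype=np.float32)
    n, d = group_feats.shape
    if query_vec.shape[1] != d:
        raise ValueError("query and group features must share the feature dimension")
    expanded = np.repeat(query_vec, n, axis=0)               # (n, d)
    return np.concatenate([expanded, group_feats], axis=1)   # (n, 2d)

# Placeholder features with the sizes given in the embodiment (n = 120, d = 512).
rng = np.random.default_rng(0)
query = rng.standard_normal((1, 512))
group = rng.standard_normal((120, 512))
spliced = splice_features(query, group)
print(spliced.shape)  # (120, 1024)
```

Each row of the result pairs the same original-image feature with one picture-set feature, which is the form the evaluation model takes as input.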
Similarity analysis is then performed on the extracted feature tensor. In step S302 or S402 above, a feature similarity evaluation model A is established. Specifically, FC1 is a fully connected layer, which has 512 nodes in the embodiment of the present invention; FC2 is a fully connected layer with only two nodes; and a final softmax layer performs binary classification to obtain, for each picture, the probability that it is a homologous picture. The softmax function is
$S_i = \frac{\exp(e_i)}{\sum_j \exp(e_j)}$
where $S_i$ denotes the probability value of the i-th node and $e_j$ denotes the value of the j-th node. The feature tensor obtained in step S301 is input into FC1, and an n × 2 feature tensor is obtained after FC1, FC2 and the softmax layer. The second column of this tensor is the desired prediction result, so it is taken as the final evaluation vector A, of size n × 1. In the embodiment of the present invention, n = 120. The above is the process of establishing the feature similarity evaluation model A.
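A forward pass through evaluation model A can be sketched as below. The layer sizes (512 nodes, then 2 nodes, then softmax) follow the embodiment; the ReLU activation after FC1, the random weights, and the names `softmax` and `evaluate_a` are assumptions made only for illustration, since the text does not specify them.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax; subtracting the row maximum avoids overflow."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def evaluate_a(spliced, w1, b1, w2, b2):
    """FC1 -> FC2 -> softmax; return the second column as evaluation vector A."""
    hidden = np.maximum(spliced @ w1 + b1, 0.0)  # FC1 (ReLU is an assumption)
    logits = hidden @ w2 + b2                    # FC2, two output nodes
    probs = softmax(logits)                      # (n, 2) evaluation matrix
    return probs[:, 1]                           # probability of "homologous"

# Untrained placeholder weights, used only to show the shapes involved.
rng = np.random.default_rng(0)
n = 120
spliced = rng.standard_normal((n, 1024))
w1, b1 = rng.standard_normal((1024, 512)) * 0.01, np.zeros(512)
w2, b2 = rng.standard_normal((512, 2)) * 0.01, np.zeros(2)
eval_a = evaluate_a(spliced, w1, b1, w2, b2)
print(eval_a.shape)  # (120,)
```

The sketch returns a one-dimensional array of length n rather than an explicit n × 1 column; the content is the same.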
In step S303 or S403 above, the feature similarity evaluation model B is established. Specifically, the cosine similarity is calculated on the feature tensor obtained in S301 or S401, using the formula:
$\cos(A, B) = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}$
where A denotes the feature vector corresponding to the original image and B denotes the feature vector corresponding to any one picture in the picture set. Evaluation model B yields an n × 1 evaluation vector B. The above is the process of establishing the feature similarity evaluation model B.
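The cosine similarity between one original-image feature and every picture-set feature can be computed in a single vectorised step. The following is a minimal sketch; the function name `evaluate_b` and the small `eps` guard against zero-length vectors are illustrative additions, not part of the embodiment.

```python
import numpy as np

def evaluate_b(query_vec, group_feats, eps=1e-12):
    """Cosine similarity between one query vector and each row of group_feats."""
    q = np.asarray(query_vec, dtype=np.float64).reshape(-1)
    g = np.asarray(group_feats, dtype=np.float64)
    dots = g @ q                                          # A . B for every row
    norms = np.linalg.norm(g, axis=1) * np.linalg.norm(q)  # ||A|| * ||B||
    return dots / np.maximum(norms, eps)

query = np.array([1.0, 0.0])
group = np.array([[2.0, 0.0],    # parallel to the query
                  [0.0, 3.0],    # orthogonal to the query
                  [-1.0, 0.0]])  # opposite to the query
eval_b = evaluate_b(query, group)
# cosine values: 1.0, 0.0, -1.0
```

Cosine similarity ranges over [-1, 1]; the text does not say how negative values are treated before the two evaluation vectors are combined.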
Then the element-wise product of the evaluation vectors A and B, i.e. A × B, is calculated and used as the final evaluation vector. The final evaluation vector is sorted in descending order, and the pictures falling within the preset threshold are taken as the retrieved homologous pictures. In the embodiment of the present invention, the threshold is set to 0.5.
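Combining the two evaluation vectors, sorting, and thresholding can be sketched as follows. The sketch assumes the product is taken element-wise and that a picture is kept when its score is greater than or equal to the threshold, as the apparatus description states; the function name `rank_homologous` and the sample scores are hypothetical.

```python
import numpy as np

def rank_homologous(eval_a, eval_b, threshold=0.5):
    """Return the final evaluation vector and the indices of pictures whose
    score is >= threshold, ordered from highest to lowest score."""
    final = np.asarray(eval_a, dtype=np.float64) * np.asarray(eval_b, dtype=np.float64)
    order = np.argsort(-final, kind="stable")   # descending order of score
    keep = order[final[order] >= threshold]
    return final, keep

eval_a = np.array([0.9, 0.2, 0.8, 0.6])
eval_b = np.array([0.9, 0.9, 0.5, 1.0])
final, keep = rank_homologous(eval_a, eval_b, threshold=0.5)
# final is approximately [0.81, 0.18, 0.40, 0.60]; keep holds indices 0 and 3
```

Multiplying the scores means a picture is retained only when both evaluation models rate it highly, which is one plausible reading of why the two vectors are combined this way.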
The above process is only the model training process, which yields a trained homologous picture retrieval model.
Next, the real dataset is processed based on the trained homologous picture retrieval model.
Specifically, in step S306 or S406, the n × 512 feature tensor corresponding to all the pictures is obtained based on the feature extraction model. The feature tensor corresponding to each original picture is then acquired in turn and spliced to obtain the n × 1024 feature tensor corresponding to that original picture. In the embodiment of the invention, the database contains 2,000,000 pictures, there are 1000 original pictures to be searched, each original picture has 60 homologous pictures, and the resulting feature tensor has dimensions 2000000 × 1024.
Finally, in step S307 or S407, based on the feature similarity evaluation models A and B, the evaluation vectors A and B corresponding to each feature tensor generated in S306 or S406 are calculated, and the element-wise product of the evaluation vector A and the evaluation vector B, i.e. A × B, is calculated to obtain the final evaluation vector. The final evaluation vector is sorted in descending order, and the pictures falling within the threshold are taken as the retrieved homologous pictures; in the embodiment of the invention, the threshold is set to 0.00003.
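For a database of the size mentioned above, scoring can be done in chunks so that only part of the feature tensor is processed at a time. The following sketch is one possible arrangement, not the embodiment's implementation: the chunking, the function names `retrieve` and `cosine_score`, the chunk size, and the threshold of 0.9 are all assumptions, and the scorer uses cosine similarity alone as a stand-in for the combined A × B score.

```python
import numpy as np

def retrieve(query_vec, db_feats, score_fn, threshold, chunk_size=100_000):
    """Score the database chunk by chunk and return (indices, scores) of the
    pictures whose score is >= threshold, sorted from best to worst."""
    hit_idx, hit_score = [], []
    for start in range(0, db_feats.shape[0], chunk_size):
        chunk = db_feats[start:start + chunk_size]
        scores = score_fn(query_vec, chunk)           # one score per picture
        mask = scores >= threshold
        hit_idx.append(np.nonzero(mask)[0] + start)   # back to database indices
        hit_score.append(scores[mask])
    if not hit_idx:                                   # empty database
        return np.array([], dtype=np.int64), np.array([], dtype=np.float64)
    idx, score = np.concatenate(hit_idx), np.concatenate(hit_score)
    order = np.argsort(-score, kind="stable")
    return idx[order], score[order]

def cosine_score(query_vec, feats):
    """Stand-in scorer (cosine similarity only); a full scorer would also
    multiply in the output of evaluation model A."""
    q = np.asarray(query_vec, dtype=np.float64).reshape(-1)
    denom = np.linalg.norm(feats, axis=1) * np.linalg.norm(q)
    return (feats @ q) / np.maximum(denom, 1e-12)

# Small synthetic database; the query is a slightly perturbed copy of entry 42.
rng = np.random.default_rng(0)
db = rng.standard_normal((1000, 512))
query = db[42] + 0.01 * rng.standard_normal(512)
idx, score = retrieve(query, db, cosine_score, threshold=0.9, chunk_size=256)
print(idx, score)
```

The appropriate threshold depends on the data set and on the scorer, as the text notes; the value used here applies only to this synthetic example.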
That is, when any original picture is input, the corresponding homologous picture can be found in the database by the above method.
The embodiment of the invention has high feasibility for searching the homologous pictures in a large amount of data, greatly improves the searching speed and precision compared with the traditional method, and has good generalization because the network model can be trained by different data sets.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.
Example 2
This embodiment further provides a homologous image determining apparatus, which is used to implement the foregoing embodiments and preferred embodiments; what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 5 is a block diagram of a homologous image determination apparatus according to an embodiment of the present invention, as illustrated in fig. 5, comprising:
a first acquiring module 52, configured to acquire a target image to be processed and a plurality of images in a database;
a first determining module 54, configured to determine feature vectors of the target image and obtain feature tensors of the plurality of images;
a second determining module 56, configured to determine a target evaluation vector of feature vectors of the target image and feature tensors of the plurality of images, where each component of the target evaluation vector represents a probability that the plurality of images and the target image are homologous images;
a third determining module 58, configured to determine an image corresponding to a component of the target evaluation vector greater than or equal to a preset threshold as a homologous image of the image to be processed.
Fig. 6 is a block diagram of a homologous image determination apparatus according to a preferred embodiment of the present invention, as shown in fig. 6, the second determination module 56 includes:
an expansion sub-module 62 for expanding the feature vector of the target image into a feature tensor of the same dimension as the feature tensors of the plurality of images;
a stitching sub-module 64, configured to stitch the feature tensor of the expanded target image and the feature tensors of the multiple images to obtain a first stitched feature tensor;
an input sub-module 66, configured to input the first stitching feature tensor into a pre-trained evaluation model, to obtain an evaluation matrix output by the evaluation model;
a first determination sub-module 68 for determining the first evaluation vector from the evaluation matrix;
a second determination sub-module 610 is configured to determine the first evaluation vector as the target evaluation vector.
Fig. 7 is a second block diagram of a homologous image determination apparatus according to a preferred embodiment of the present invention. As shown in Fig. 7, the apparatus further comprises:
a third determining sub-module 72, configured to determine cosine similarity between the feature vector of the target image and the feature tensors of the plurality of images, to obtain a second evaluation vector;
a fourth determination sub-module 74, configured to determine the target evaluation vector of the feature vector of the target image and the feature tensors of the plurality of images, the fourth determination sub-module 74 comprising:
a fifth determination sub-module 76, configured to determine the product of the first evaluation vector and the second evaluation vector as the target evaluation vector.
Optionally, the input sub-module 66 is further configured to:
input the first spliced characteristic tensor into a first fully connected layer of the target evaluation model to obtain a first characteristic tensor output by the first fully connected layer;
input the first characteristic tensor into a second fully connected layer of the target evaluation model to obtain a second characteristic tensor output by the second fully connected layer;
input the second characteristic tensor into a softmax layer of the target evaluation model to obtain an evaluation matrix output by the softmax layer, wherein the evaluation matrix is a two-dimensional matrix; the row index of the evaluation matrix corresponds to the numbering of the plurality of images in the database, and the column index corresponds to whether an image is a homologous image: the first column of the evaluation matrix gives the probability that each image in the database is a non-homologous image of the target image, and the second column gives the probability that each image in the database is a homologous image of the target image;
the second determining sub-module is further configured to select the second column vector from the evaluation matrix and determine it as the first evaluation vector.
Optionally, the first determining module 54 is further configured to:
input the target image into a pre-trained target neural network model to obtain the feature vector corresponding to the target image output by the target neural network model.
Optionally, the apparatus further comprises:
a second acquisition module, configured to acquire a predetermined number of original images and a group of images corresponding to the original images, where the group of images is a set of the same number of homologous images and non-homologous images corresponding to the original images;
the first training module is configured to train an original neural network model by using the predetermined number of original images and a set of images corresponding to the original images to obtain the target neural network model, wherein the predetermined number of original images and the set of images corresponding to the original images are input into the original neural network model, the feature vector of the original image output by the trained target neural network model and the feature vector actually corresponding to the original image satisfy a predetermined objective function, and the feature tensor of the set of images output by the trained target neural network model and the feature tensor actually corresponding to the set of images satisfy the predetermined objective function.
Optionally, the apparatus further comprises:
the expansion module is used for expanding the feature vector of the original image into a feature tensor with the same dimension as the feature tensor of the group of images;
the splicing module is used for splicing the characteristic tensor of the original image after expansion with the characteristic tensor of the group of images to obtain a second spliced characteristic tensor after splicing;
the second training module is used for training the original evaluation model according to the second spliced characteristic tensor to obtain the target evaluation model, wherein the second spliced characteristic tensor is input into the original evaluation model, and the evaluation matrix of the second spliced characteristic tensor output by the trained target evaluation model and the evaluation matrix corresponding to the second spliced characteristic tensor meet a preset function.
It should be noted that each of the above modules may be implemented by software or hardware, and for the latter, it may be implemented by, but not limited to: the modules are all located in the same processor; alternatively, the above modules may be located in different processors in any combination.
Example 3
Embodiments of the present invention also provide a computer readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
Optionally, in this embodiment, the above storage medium may be configured to store a computer program for performing the following steps:
S1, acquiring a target image to be processed and a plurality of images in a database;
S2, determining feature vectors of the target image, and acquiring feature tensors of the plurality of images;
S3, determining a target evaluation vector of a feature vector of the target image and a feature tensor of the plurality of images, wherein each component of the target evaluation vector represents the probability that the plurality of images and the target image are homologous images;
S4, determining an image corresponding to the component of the target evaluation vector which is larger than or equal to a preset threshold value as a homologous image of the image to be processed.
Optionally, in this embodiment, the storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing a computer program.
Example 4
An embodiment of the invention also provides an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, where the transmission device is connected to the processor, and the input/output device is connected to the processor.
Optionally, in this embodiment, the above processor may be configured to execute the following steps by means of a computer program:
S1, acquiring a target image to be processed and a plurality of images in a database;
S2, determining feature vectors of the target image, and acquiring feature tensors of the plurality of images;
S3, determining a target evaluation vector of a feature vector of the target image and a feature tensor of the plurality of images, wherein each component of the target evaluation vector represents the probability that the plurality of images and the target image are homologous images;
S4, determining an image corresponding to the component of the target evaluation vector which is larger than or equal to a preset threshold value as a homologous image of the image to be processed.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments and optional implementations; they are not repeated here.
It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented on a general-purpose computing device; they may be concentrated on a single computing device or distributed across a network of computing devices. Optionally, they may be implemented in program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from that given here; alternatively, the modules or steps may each be fabricated as an individual integrated circuit module, or several of them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description covers only the preferred embodiments of the present invention and is not intended to limit the present invention; those skilled in the art can make various modifications and variations to the present invention. Any modification, equivalent replacement, or improvement made within the principle of the present invention shall fall within the protection scope of the present invention.

Claims (9)

1. A method for determining a homologous image, comprising:
acquiring a target image to be processed and a plurality of images in a database;
determining feature vectors of the target image and acquiring feature tensors of the plurality of images;
determining a target evaluation vector of a feature vector of the target image and a feature tensor of the plurality of images, wherein each component of the target evaluation vector represents a probability that the plurality of images and the target image are homologous images;
determining an image corresponding to a component of the target evaluation vector which is greater than or equal to a preset threshold as a homologous image of the target image to be processed;
wherein determining the feature vector of the target image and the target evaluation vector of the feature tensor of the plurality of images comprises:
expanding the feature vector of the target image into a feature tensor of the same dimension as the feature tensors of the plurality of images;
splicing the expanded characteristic tensor of the target image with the characteristic tensors of the plurality of images to obtain a first spliced characteristic tensor;
inputting the first spliced characteristic tensor into a pre-trained evaluation model to obtain an evaluation matrix output by the evaluation model;
determining a first evaluation vector according to the evaluation matrix;
the first evaluation vector is determined as the target evaluation vector.
2. The method of claim 1, wherein after determining a first evaluation vector from the evaluation matrix, the method further comprises:
determining cosine similarity between the feature vector of the target image and feature tensors of the plurality of images to obtain a second evaluation vector;
determining a feature vector of the target image and a target evaluation vector of feature tensors of the plurality of images comprises:
a product of the first evaluation vector and the second evaluation vector is determined as the target evaluation vector.
3. The method of claim 1, wherein inputting the first spliced characteristic tensor into a pre-trained evaluation model to obtain an evaluation matrix output by the evaluation model comprises:
inputting the first spliced characteristic tensor into a first fully connected layer of the evaluation model to obtain a first characteristic tensor output by the first fully connected layer;
inputting the first characteristic tensor into a second fully connected layer of the evaluation model to obtain a second characteristic tensor output by the second fully connected layer;
inputting the second characteristic tensor into a softmax layer of the evaluation model to obtain an evaluation matrix output by the softmax layer, wherein the evaluation matrix is a two-dimensional matrix; the row index of the evaluation matrix corresponds to the numbers of a plurality of images in a database, the column index corresponds to whether the images are homologous images or not, the first column of the evaluation matrix corresponds to the probability that each image in the database is a non-homologous image with the target image, and the second column corresponds to the probability that each image in the database is a homologous image with the target image;
determining the first evaluation vector from the evaluation matrix comprises:
and selecting a second column vector from the evaluation matrix to determine the second column vector as the first evaluation vector.
4. The method of claim 1, wherein determining the feature vector of the target image comprises:
and inputting the target image into a pre-trained target neural network model to obtain a feature vector corresponding to the target image output by the target neural network model.
5. The method of claim 4, wherein prior to acquiring the target image to be processed and the plurality of images in the database, the method further comprises:
acquiring a predetermined number of original images and a group of images corresponding to the original images, wherein the group of images are a set of homologous images and non-homologous images corresponding to the original images in the same number;
training an original neural network model by using the preset number of original images and a group of images corresponding to the original images to obtain the target neural network model, wherein the preset number of original images and the group of images corresponding to the original images are input into the original neural network model, the feature vectors of the original images and the feature vectors actually corresponding to the original images output by the trained target neural network model meet a preset target function, and the feature tensors of the group of images and the feature tensors actually corresponding to the group of images output by the trained target neural network model meet the preset target function.
6. The method of claim 5, wherein after training an original neural network model using the predetermined number of original images and a set of images corresponding to the original images to obtain the target neural network model, the method further comprises:
expanding the feature vector of the original image into a feature tensor of the same dimension as the feature tensor of the set of images;
splicing the expanded characteristic tensor of the original image with the characteristic tensor of the group of images to obtain a second spliced characteristic tensor after splicing;
training an original evaluation model according to the second spliced characteristic tensor to obtain the evaluation model, wherein the second spliced characteristic tensor is input into the original evaluation model, and an evaluation matrix of the second spliced characteristic tensor output by the trained evaluation model and an evaluation matrix corresponding to the second spliced characteristic tensor meet a preset function.
7. A homologous image determination apparatus, comprising:
the first acquisition module is used for acquiring a target image to be processed and a plurality of images in the database;
a first determining module, configured to determine feature vectors of the target image, and acquire feature tensors of the plurality of images;
a second determining module, configured to determine a target evaluation vector of feature vectors of the target image and feature tensors of the plurality of images, where each component of the target evaluation vector represents a probability that the plurality of images and the target image are homologous images;
a third determining module, configured to determine an image corresponding to a component of the target evaluation vector that is greater than or equal to a preset threshold as a homologous image of the target image to be processed;
wherein the second determining module further comprises:
an expansion sub-module for expanding the feature vector of the target image into a feature tensor of the same dimension as the feature tensors of the plurality of images;
the splicing sub-module is used for splicing the characteristic tensor of the target image after expansion with the characteristic tensors of the plurality of images to obtain a first spliced characteristic tensor;
the input sub-module is used for inputting the first spliced characteristic tensor into a pre-trained evaluation model to obtain an evaluation matrix output by the evaluation model;
a first determination submodule for determining a first evaluation vector according to the evaluation matrix;
and the second determination submodule is used for determining the first evaluation vector as the target evaluation vector.
8. A computer-readable storage medium, characterized in that the storage medium has stored therein a computer program, wherein the computer program is arranged to execute the method of any of the claims 1 to 6 when run.
9. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to run the computer program to perform the method of any of the claims 1 to 6.
CN201911312848.7A 2019-12-18 2019-12-18 Homologous image determining method and device Active CN111191065B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911312848.7A CN111191065B (en) 2019-12-18 2019-12-18 Homologous image determining method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911312848.7A CN111191065B (en) 2019-12-18 2019-12-18 Homologous image determining method and device

Publications (2)

Publication Number Publication Date
CN111191065A CN111191065A (en) 2020-05-22
CN111191065B true CN111191065B (en) 2023-10-31

Family

ID=70710102

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911312848.7A Active CN111191065B (en) 2019-12-18 2019-12-18 Homologous image determining method and device

Country Status (1)

Country Link
CN (1) CN111191065B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111723868B (en) * 2020-06-22 2023-07-21 海尔优家智能科技(北京)有限公司 Method, device and server for removing homologous pictures
CN112541417B (en) * 2020-12-03 2022-09-16 山东众阳健康科技集团有限公司 Efficient decoding method used in character detection

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9256950B1 (en) * 2014-03-06 2016-02-09 Google Inc. Detecting and modifying facial features of persons in images
CN106548134A (en) * 2016-10-17 2017-03-29 沈阳化工大学 GA optimizes palmprint and the vena metacarpea fusion identification method that SVM and normalization combine
CN107944238A (en) * 2017-11-15 2018-04-20 中移在线服务有限公司 Identity identifying method, server and system
CN108615007A (en) * 2018-04-23 2018-10-02 深圳大学 Three-dimensional face identification method, device and the storage medium of feature based tensor
CN110347854A (en) * 2019-06-13 2019-10-18 西安理工大学 Image search method based on target positioning
WO2019233421A1 (en) * 2018-06-04 2019-12-12 京东数字科技控股有限公司 Image processing method and device, electronic apparatus, and storage medium

Also Published As

Publication number Publication date
CN111191065A (en) 2020-05-22

Similar Documents

Publication Publication Date Title
CN112199375B (en) Cross-modal data processing method and device, storage medium and electronic device
CN110866140B (en) Image feature extraction model training method, image searching method and computer equipment
CN108304435B (en) Information recommendation method and device, computer equipment and storage medium
CN109685121B (en) Training method of image retrieval model, image retrieval method and computer equipment
KR101531618B1 (en) Method and system for comparing images
JP5934653B2 (en) Image classification device, image classification method, program, recording medium, integrated circuit, model creation device
CN110321422A (en) Method, method for pushing, device and the equipment of on-line training model
CN110033023B (en) Image data processing method and system based on picture book recognition
CN111581414B (en) Method, device, equipment and storage medium for identifying, classifying and searching clothes
CN111062871A (en) Image processing method and device, computer equipment and readable storage medium
CN112000819A (en) Multimedia resource recommendation method and device, electronic equipment and storage medium
WO2019137185A1 (en) Image screening method and apparatus, storage medium and computer device
Cheng et al. A data-driven point cloud simplification framework for city-scale image-based localization
CN110765882B (en) Video tag determination method, device, server and storage medium
CN105117399B (en) Image searching method and device
CN111191065B (en) Homologous image determining method and device
CN111507285A (en) Face attribute recognition method and device, computer equipment and storage medium
CN113705596A (en) Image recognition method and device, computer equipment and storage medium
CN111222399B (en) Method and device for identifying object identification information in image and storage medium
CN114358109A (en) Feature extraction model training method, feature extraction model training device, sample retrieval method, sample retrieval device and computer equipment
CN113033507B (en) Scene recognition method and device, computer equipment and storage medium
CN111401193A (en) Method and device for obtaining expression recognition model and expression recognition method and device
US10909167B1 (en) Systems and methods for organizing an image gallery
CN107193979B (en) Method for searching homologous images
CN113821657A (en) Artificial intelligence-based image processing model training method and image processing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant