CN116244458B - Method for target generator training, similar sample pair generation, retrieval model training and trademark retrieval

Method for target generator training, similar sample pair generation, retrieval model training and trademark retrieval

Info

Publication number
CN116244458B
Authority
CN
China
Prior art keywords
image, similarity, generator, trademark, similar
Prior art date
Legal status
Active
Application number
CN202211619479.8A
Other languages
Chinese (zh)
Other versions
CN116244458A
Inventor
李建武
胡文宇
韩荣华
周竞智
胡云姣
Current Assignee
Shenzhen Yiya Technology Co ltd
Beijing Institute of Technology BIT
Original Assignee
Shenzhen Yiya Technology Co ltd
Beijing Institute of Technology BIT
Priority date
Filing date
Publication date
Application filed by Shenzhen Yiya Technology Co ltd and Beijing Institute of Technology BIT
Priority to CN202211619479.8A
Publication of CN116244458A
Application granted
Publication of CN116244458B


Classifications

    • G06F16/583: Information retrieval of still image data, characterised by using metadata automatically derived from the content
    • G06N3/02, G06N3/08: Computing arrangements based on biological models; neural networks and their learning methods
    • G06V10/761: Image or video pattern matching; proximity, similarity or dissimilarity measures in feature spaces
    • G06V10/774: Machine learning for image or video recognition; generating sets of training patterns
    • G06V10/82: Image or video recognition or understanding using neural networks
    • Y02T10/40: Engine management systems (automatically assigned cross-sectional tag)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the application relates to the technical field of computer vision and discloses a training method for a target generator. The method introduces a similarity coefficient: a generator produces a similar image from an input image and the similarity coefficient, another generator restores the similar image, and a discriminator is trained on the input image and the restored image. The similar image is then input into the trained discriminator, which outputs a similarity value. The parameters of the generator are adjusted using the difference between the similarity coefficient and the similarity value, together with the Euclidean distance between the input image and the restored image, so that the subsequently obtained difference and Euclidean distance decrease, training the generator's performance and yielding a target generator capable of generating images whose similarity equals the similarity coefficient. In this way, the method and device solve the problem of low generation efficiency of similar samples of trademark images.

Description

Method for target generator training, similar sample pair generation, retrieval model training and trademark retrieval
Technical Field
The embodiments of the application relate to the technical field of computer vision, and in particular to methods for training a target generator, generating similar sample pairs, training an image retrieval model, and retrieving trademark images.
Background
A trademark is a brand mark composed of elements such as numbers, characters, graphics, letters, sounds, color combinations, and three-dimensional signs, used to identify and distinguish the source of goods or services. Trademark registration requires a duplication check to prevent trademark infringement and to ensure the validity of the trademark for its owner.
The main difficulty in current trademark image retrieval is acquiring the training set: it must contain a large number of similar sample pairs, each annotated with the similarity of its two trademark images. Owing to the characteristics of trademark images, manual labeling struggles to express this similarity quantitatively, so similar samples of trademark images are generated inefficiently.
Disclosure of Invention
In view of the above problems, the embodiments of the present application provide methods for training a target generator, generating similar sample pairs, training an image retrieval model, and retrieving trademark images, which address the problem of low efficiency in generating similar samples of trademark images in the prior art.
According to an aspect of an embodiment of the present application, there is provided a target generator training method including:
Acquiring a first image and a second image; obtaining a similarity coefficient, wherein the similarity coefficient is used for representing the similarity of the first image and the second image; inputting the similarity coefficient into a first generator and a second generator;
inputting the first image into a first generator to generate a first similar image; inputting the first similar image into a second generator to generate a first restored image; inputting the first image and the first restored image into a second discriminator, and outputting a first similarity value of the first image and a second similarity value of the first restored image by the second discriminator; adjusting the second discriminator according to the first similarity value and the second similarity value to increase the first similarity value determined subsequently and decrease the second similarity value, so that the second discriminator is used for discriminating the similarity between the input image and the first image;
inputting the second image into the second generator to generate a second similar image; inputting the second similar image into the first generator to generate a second restored image; inputting the second image and the second restored image into the first discriminator, and outputting a third similarity value of the second image and a fourth similarity value of the second restored image by the first discriminator; adjusting the first discriminator according to the third similarity value and the fourth similarity value to increase the subsequently determined third similarity value and decrease the fourth similarity value, so that the first discriminator is used for discriminating the similarity between the input image and the second image;
inputting the first similar image into the second discriminator, and outputting a fifth similarity value of the first similar image by the second discriminator; determining a first difference between the fifth similarity value and the similarity coefficient; inputting the second similar image into the first discriminator, and outputting a sixth similarity value of the second similar image by the first discriminator; determining a second difference between the sixth similarity value and the similarity coefficient; determining a first Euclidean distance between the first image and the first restored image; determining a second Euclidean distance between the second image and the second restored image;
judging whether the first difference and the second difference are smaller than a first preset threshold and, at the same time, whether the first Euclidean distance and the second Euclidean distance are smaller than a second preset threshold; if not, adjusting parameters of the first generator according to the first difference and the second Euclidean distance, adjusting parameters of the second generator according to the second difference and the first Euclidean distance, and jumping back to the step of inputting the first image into the first generator to generate a first similar image, so that the subsequently determined first difference, second difference, first Euclidean distance and second Euclidean distance decrease; if so, determining at least one of the first generator and the second generator as a target generator, the target generator being used for generating, from an input image, an image whose similarity to it equals the similarity coefficient.
Based on the similarity coefficient, the first generator generates a first similar image of the first image and a second restored image of the second image, and the second generator generates a first restored image of the first image and a second similar image of the second image. The second discriminator then adjusts its parameters by discriminating between the first image and the first restored image, and the first discriminator adjusts its parameters by discriminating between the second image and the second restored image, improving the discrimination performance of both discriminators for subsequent generator training. Next, the parameters of the first generator are adjusted using the first difference between the similarity value of the first similar image output by the second discriminator and the similarity coefficient, together with the second Euclidean distance between the second restored image and the second image; the parameters of the second generator are adjusted using the second difference between the similarity value of the second similar image output by the first discriminator and the similarity coefficient, together with the first Euclidean distance between the first restored image and the first image. The steps from generating the first similar image onward are repeated until the first difference and the second difference are smaller than the first preset threshold and the first Euclidean distance and the second Euclidean distance are smaller than the second preset threshold. In this way, matched to the characteristics of trademark images, an adjustable similarity coefficient is introduced to train the generators: the similarity between the generated image and the input image is adjusted while the image content stays unchanged, yielding a target generator that can generate images whose similarity equals the similarity coefficient. The similarity of trademark images can thus be controlled manually and similar samples generated quickly, improving the generation efficiency of similar samples of trademark images.
In an alternative way, acquiring the first image and the second image includes: acquiring a first image set and a second image set; determining one image of the first image set as the first image; and determining one image of the second image set as the second image. Correspondingly, when the judgment succeeds, determining at least one of the first generator and the second generator as the target generator includes: determining another image in the first image set as the first image, determining another image in the second image set as the second image, and jumping back to the step of inputting the first image into the first generator to generate a first similar image, until all images in the first image set and the second image set have been traversed, whereupon at least one of the first generator and the second generator is determined as the target generator.
By the method, the images of the first image set and the second image set can be input into the first generator and the second generator for training, so that the first generator and the second generator fully learn the characteristics of the image set, the quality of the first generator and the second generator is improved, and more real images are generated.
In an alternative approach, the minimized sum of the adversarial loss functions of the first generator and the second generator is:

Loss_GAN(G, F, D_X, D_Y) = E_{x~P_data(x)}[log(S - D_X(G(x)))] + E_{y~P_data(y)}[log(S - D_Y(F(y)))]

where S is the similarity coefficient, G is the first generator, F is the second generator, D_X is the second discriminator, D_Y is the first discriminator, x is the first image, y is the second image, E denotes expectation, x~P_data(x) denotes sampling of the first image, and y~P_data(y) denotes sampling of the second image.
By minimizing the sum of the adversarial loss functions of the first generator and the second generator, the generators and the discriminators evolve against each other, so that the first generator can generate more realistic images in sample space Y and the second generator more realistic images in sample space X.
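As an illustration, the following PyTorch sketch computes this similarity-weighted adversarial term. The patent supplies only the formula above; the function name, the convention that both generators take the similarity coefficient S as an extra input, and the clamping used to keep the logarithm finite are assumptions.

```python
import torch

def similarity_adversarial_loss(G, F, D_X, D_Y, x, y, S, eps=1e-8):
    """Similarity-weighted adversarial term for both generators.

    Mirrors Loss_GAN above: E[log(S - D_X(G(x)))] + E[log(S - D_Y(F(y)))].
    The discriminators are assumed to output similarity values in [0, 1].
    """
    term_x = torch.log((S - D_X(G(x, S))).clamp(min=eps)).mean()
    term_y = torch.log((S - D_Y(F(y, S))).clamp(min=eps)).mean()
    return term_x + term_y  # the generators are trained to minimize this sum
```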
According to another aspect of the embodiments of the present application, there is provided a method for generating a similar sample pair, including: acquiring a sample image; inputting the sample image into a target generator, the target generator being trained by the target generator training method provided in any one of the above embodiments; outputting a similar image through the target generator, the similarity between the similar image and the sample image being equal to the similarity coefficient; determining the sample image together with the similar image as a similar sample pair; and determining the similarity coefficient as the similarity score label of the similar sample pair.
By inputting sample images into the target generator, similar images with different degrees of similarity can be generated and similar sample pairs produced quickly, with the similarity of the two images given directly. This improves the construction efficiency of similar sample pairs and provides guidance for the training of subsequent models, thereby improving the efficiency of image retrieval.
According to another aspect of the embodiment of the present application, there is provided an image retrieval model training method, including: obtaining a similar sample pair set, wherein the similar sample pair set comprises a plurality of similar sample pairs, and the similar sample pairs are generated by the similar sample pair generation method provided by the embodiment; and inputting the similar sample pairs into a neural network for training to obtain an image retrieval model.
By inputting all similar sample pairs in the similar sample pair set into the neural network for training, an image retrieval model that computes the similarity of any two images can be obtained, improving image retrieval efficiency.
According to another aspect of the embodiments of the present application, there is provided a trademark image retrieval method, including: acquiring an existing trademark image library containing a plurality of existing trademark images; acquiring a trademark image to be retrieved; inputting the trademark image to be retrieved and the existing trademark images into an image retrieval model, the image retrieval model being trained by the image retrieval model training method provided in the above embodiment; and outputting, through the image retrieval model, a similarity score between the trademark image to be retrieved and each existing trademark image.
By inputting the trademark image to be retrieved and each existing trademark image into the image retrieval model, registered trademark images highly similar to the trademark image to be retrieved can be retrieved directly, improving trademark image retrieval efficiency.
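A minimal sketch of such a retrieval step, assuming an image retrieval model that maps a pair of images to a similarity score in [0, 1]; the function name and interface are illustrative, not taken from the patent.

```python
import torch

@torch.no_grad()
def rank_trademarks(model, query_img, registered_imgs, top_k=10):
    """Score a trademark image to be retrieved against every existing
    trademark image; model(a, b) is assumed to return a similarity
    score in [0, 1], matching the similarity-score labels used in
    training."""
    scores = [model(query_img, reg).item() for reg in registered_imgs]
    ranked = sorted(enumerate(scores), key=lambda t: t[1], reverse=True)
    return ranked[:top_k]  # indices and scores of the closest registered marks
```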
In an alternative manner, after acquiring the trademark image to be retrieved, the method includes: acquiring registration information of the trademark image to be retrieved; and determining, according to the registration information, the existing trademark images belonging to the same field as the trademark to be retrieved. Inputting the trademark image to be retrieved and the existing trademark images into the image retrieval model then includes: inputting the trademark image to be retrieved and the existing trademark images belonging to the same field into the image retrieval model. Outputting the similarity scores through the image retrieval model then includes: outputting, through the image retrieval model, the similarity scores between the trademark image to be retrieved and the existing trademark images belonging to the same field.
By using the registration information to restrict the search to existing trademark images in the same field as the trademark image to be retrieved, the number of existing trademark images to be examined is reduced, improving trademark image retrieval efficiency.
In an alternative way, outputting the similarity score between the trademark image to be retrieved and each existing trademark image through the image retrieval model includes: preprocessing the trademark image to be retrieved and the existing trademark image by the image retrieval model; determining, by the image retrieval model, projected representations of the trademark image to be retrieved and the existing trademark image from the preprocessed images; and determining, by the image retrieval model, the similarity score of the trademark image to be retrieved and the existing trademark image from the projected representations.
Because the trademark image to be retrieved and the existing trademark images are preprocessed before entering the encoder of the image retrieval model for deeper feature extraction, the encoder's processing workload is reduced, further improving trademark image retrieval efficiency.
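The scoring function itself is not specified in the patent; one plausible sketch computes a score in [0, 1] from the two projected representations using cosine similarity, which is an assumption:

```python
import torch.nn.functional as nnf

def similarity_score(proj_query, proj_registered):
    """Similarity score from two projected representations; cosine
    similarity rescaled from [-1, 1] to [0, 1] is an illustrative
    choice, not taken from the patent."""
    cos = nnf.cosine_similarity(proj_query, proj_registered, dim=-1)
    return (cos + 1) / 2
```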
In an alternative manner, determining, by the image retrieval model, the projected representations of the trademark image to be retrieved and the existing trademark image from the preprocessed images includes: serializing the preprocessed trademark image to be retrieved and the existing trademark image to obtain image blocks; determining a vector representation F1 of the image blocks through a flattened linear projection layer; determining the global dependency F2 of F1 through multi-head self-attention; residual-connecting F1 and F2 and applying layer normalization to determine feature F3; computing F3 with a multi-layer perceptron to determine feature F4; residual-connecting F3 and F4 and applying layer normalization to determine feature F5; jumping back to the multi-head self-attention step and repeating K times to obtain feature F6; and feeding F6 into a fully connected layer to determine the projected representations of the trademark image to be retrieved and the existing trademark image.
By the method, the image retrieval model can extract the deeper features of the trademark image to be retrieved and the existing trademark image, so that finer retrieval can be performed on the trademark content, and the trademark image retrieval quality is improved.
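A compact PyTorch sketch of an encoder following the F1 to F6 pipeline just described. Patch size, dimensions, weight sharing across the K repetitions, and mean pooling before the fully connected layer are assumptions; only the ordering of operations comes from the text.

```python
import torch
import torch.nn as nn

class ProjectionEncoder(nn.Module):
    """Transformer-style encoder for the F1..F6 feature pipeline."""
    def __init__(self, patch=16, dim=256, heads=8, K=6, out_dim=128):
        super().__init__()
        # flattened linear projection realized as a patch-embedding conv
        self.proj = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))
        self.norm2 = nn.LayerNorm(dim)
        self.K = K                          # repetitions of the attention block
        self.fc = nn.Linear(dim, out_dim)   # fully connected layer -> projection

    def forward(self, img):
        f1 = self.proj(img).flatten(2).transpose(1, 2)  # image blocks -> F1
        for _ in range(self.K):             # jump back K times to the attention step
            f2, _ = self.attn(f1, f1, f1)   # global dependency F2
            f3 = self.norm1(f1 + f2)        # residual connection + layer norm -> F3
            f4 = self.mlp(f3)               # multi-layer perceptron -> F4
            f1 = self.norm2(f3 + f4)        # residual connection + layer norm -> F5
        f6 = f1.mean(dim=1)                 # pooled feature F6 (pooling is assumed)
        return self.fc(f6)                  # projected representation
```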
According to another aspect of an embodiment of the present application, there is provided a target generator training apparatus including:
the first acquisition module is used for acquiring a first image and a second image; the second acquisition module is used for acquiring a similarity coefficient, wherein the similarity coefficient is used for representing the similarity between the first image and the second image; the input module is used for inputting the similarity coefficient into the first generator and the second generator;
the first generation module is used for inputting the first image into the first generator to generate a first similar image; the first restoration module is used for inputting the first similar image into the second generator to generate a first restored image; the first discrimination module is used for inputting the first image and the first restored image into the second discriminator, the second discriminator outputting a first similarity value of the first image and a second similarity value of the first restored image; the first adjustment module is used for adjusting the second discriminator according to the first similarity value and the second similarity value so as to increase the subsequently determined first similarity value and decrease the second similarity value, so that the second discriminator is used for discriminating the similarity between the input image and the first image;
the second generation module is used for inputting a second image into the second generator to generate a second similar image; the second restoration module is used for inputting the second similar image into the first generator to generate a second restored image; a second discriminating module for inputting the second image and the second restored image into the first discriminator, and outputting a third similarity value of the second image and a fourth similarity value of the second restored image by the first discriminator; the second adjusting module is used for adjusting the first discriminator according to the third similarity value and the fourth similarity value so as to increase the third similarity value which is determined subsequently and decrease the fourth similarity value, so that the first discriminator is used for discriminating the similarity between the input image and the second image;
the third discrimination module is used for inputting the first similar image into the second discriminator, the second discriminator outputting a fifth similarity value of the first similar image; the first determining module is configured to determine a first difference between the fifth similarity value and the similarity coefficient; the fourth discrimination module is used for inputting the second similar image into the first discriminator, the first discriminator outputting a sixth similarity value of the second similar image; the second determining module is configured to determine a second difference between the sixth similarity value and the similarity coefficient; the third determining module is configured to determine a first Euclidean distance between the first image and the first restored image; and the fourth determining module is configured to determine a second Euclidean distance between the second image and the second restored image;
the judging module is used for judging whether the first difference value and the second difference value are smaller than a first preset threshold value or not and judging whether the first Euclidean distance and the second Euclidean distance are smaller than a second preset threshold value or not at the same time;
the third adjusting module is used for adjusting parameters of the first generator according to the first difference value and the second Euclidean distance when the first difference value and the second difference value are larger than or equal to a first preset threshold value or the first Euclidean distance and the second Euclidean distance are larger than or equal to a second preset threshold value, adjusting parameters of the second generator according to the second difference value and the first Euclidean distance, and jumping to the step of inputting the first image into the first generator to generate a first similar image so that the first difference value, the second difference value, the first Euclidean distance and the second Euclidean distance which are determined later are reduced;
And a fifth determining module, configured to determine at least one of the first generator and the second generator as a target generator when the first difference value and the second difference value are smaller than a first preset threshold value and the first euclidean distance and the second euclidean distance are smaller than a second preset threshold value, where the target generator is used to generate an image with similarity equal to a similarity coefficient.
The foregoing is only an overview of the technical solutions of the embodiments of the present application. To make the technical means of the embodiments clearer and implementable according to the contents of the specification, specific embodiments of the present application are set forth below.
Drawings
The drawings are only for purposes of illustrating embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 is a schematic flow chart of a target generator training method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of a target generator training method according to another embodiment of the present application;
FIG. 3 is a schematic flow chart of a similar sample pair generation method according to an embodiment of the present application;
FIG. 4 is a schematic flow chart of an image retrieval model training method according to an embodiment of the present application;
FIG. 5 is a schematic flow chart of a trademark image retrieval method according to an embodiment of the present application;
FIG. 6 is a schematic flow chart of a trademark image retrieval method according to another embodiment of the present application;
FIG. 7 is a schematic flow chart of a trademark image retrieval method according to another embodiment of the present application;
FIG. 8 is a schematic flow chart of a trademark image retrieval method according to another embodiment of the present application;
FIG. 9 is a schematic structural diagram of a target generator training apparatus according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein.
A trademark is a brand mark composed of elements such as numbers, characters, graphics, letters, sounds, color combinations, and three-dimensional signs, used to identify and distinguish the source of goods or services. Trademark registration requires a duplication check to prevent trademark infringement and to ensure the validity of the trademark for its owner.
Traditional trademark queries use classification codes, i.e. tree-structured queries according to the international trademark standard classification. With the development of computer vision, the constituent elements of a trademark can serve as model features for classification following the feature extraction principle, so deep learning can realize content-based trademark image retrieval. The main difficulty in current trademark image retrieval is acquiring the training set, which must contain a large number of similar sample pairs annotated with the similarity of the two trademark images. Owing to the characteristics of trademark images, manual labeling struggles to express this similarity quantitatively, so similar samples of trademark images are generated inefficiently.
In view of the above problems, the inventors provide a target generator training method that introduces a similarity coefficient. A generator produces a similar image from an input image and the similarity coefficient; another generator restores the similar image; a discriminator is trained on the input image and the restored image; the similar image is then input into the trained discriminator, which outputs a similarity value; and the generator's parameters are adjusted using the difference between the similarity coefficient and the similarity value together with the Euclidean distance between the input image and the restored image, so that both quantities shrink in subsequent iterations. Training the generator in this way yields a target generator capable of generating images whose similarity to the input equals the similarity coefficient. Matched to the characteristics of trademark images, the method adjusts the similarity between the generated image and the input image while keeping the image content unchanged, so the similarity of trademark images can be controlled manually, similar samples generated quickly, and the generation efficiency of similar samples of trademark images improved.
The target generator training method provided by the application is based on the CycleGAN model and can be used for generating image datasets, image-to-image translation, photo editing, photo restoration, and the like; the embodiments of the application mainly take the generation of a trademark image dataset as an example to describe the concept, and this application scenario does not limit the protection scope. A generative adversarial network (GAN, Generative Adversarial Networks) is a deep learning model built from (at least) two modules, a generator (Generative Model) and a discriminator (Discriminative Model), whose mutual game learning produces good outputs. CycleGAN is essentially two mirror-symmetric GANs forming a ring network; the two GANs share two generators and each has its own discriminator, i.e. two discriminators and two generators in total. The generator aims to produce pictures realistic enough to deceive the discriminator, while the discriminator aims to tell the generator's pictures apart from real pictures. The discriminator and the generator thus form a dynamic "gaming process".
To solve the problem of low efficiency in generating similar image samples, an embodiment of the present application provides a target generator training method. FIG. 1 shows a flowchart of the target generator training method provided by the embodiment of the present application; the method includes the following steps:
step 110: a first image and a second image are acquired.
CycleGAN has two generators, a first generator G and a second generator F, and two discriminators, a first discriminator D_Y and a second discriminator D_X. A CycleGAN generator converts one type of image into another, and the discriminators judge whether a generated image is a real image. The training process therefore requires two real samples, a first image x and a second image y, belonging to the two sample spaces X and Y respectively. An image in sample space X can be fed to a generator to produce an image of sample space Y.
Step 120: a similarity coefficient is obtained, the similarity coefficient being used for representing the similarity between the first image and the second image.
An adjustable similarity coefficient S is introduced, determined according to the similarity between the first image x and the second image y; through the similarity coefficient S, the similarity between the generated image and the original image can be adjusted.
Step 130: the similarity coefficient is input to the first generator and the second generator.
After the first generator G and the second generator F obtain the similarity coefficient S, a similar image may be generated according to the similarity coefficient S.
Step 140: the first image is input to a first generator, which generates a first similar image.
To enable the first generator to generate images of sample space Y, the first image x is input into the first generator G and a first similar image G(x) is generated according to the similarity coefficient S; the first similar image G(x) is an image of sample space Y.
Step 150: the first similar image is input to a second generator, which generates a first restored image.
To ensure that the image generated by the second generator differs from the input first image x only in style while keeping the same content, the first similar image G(x) is input into the second generator F and a first restored image F(G(x)) is generated according to the similarity coefficient S; F(G(x)) is an image of sample space X.
Step 160: the first image and the first restored image are input into a second discriminator, and the second discriminator outputs a first similarity value of the first image and a second similarity value of the first restored image.
The discriminator evaluates an image and outputs the probability that it is a real image, in the range [0, 1]. Specifically, the smaller the result, the lower the probability that the image is real, and the larger the result, the higher the probability; a result of 1 means the picture is certainly real, and a result of 0 means it cannot be real.
To determine whether the generated first restored image F(G(x)) is a real image of sample space X, the first image x and the first restored image F(G(x)) are input into the second discriminator D_X, obtaining a first similarity value D_X(x) as the discrimination result for the first image x and a second similarity value D_X(F(G(x))) as the discrimination result for the first restored image F(G(x)). The task of the second discriminator D_X is to distinguish the first image x from the first restored image F(G(x)), judging the first image x as true and the first restored image F(G(x)) as false, i.e. the first similarity value D_X(x) approaches 1 and the second similarity value D_X(F(G(x))) approaches 0.
Step 170: adjusting the second discriminator according to the first similarity value and the second similarity value to increase the subsequently determined first similarity value and decrease the second similarity value, so that the second discriminator is used for discriminating the similarity between the input image and the first image.
To improve the discrimination capability of the second discriminator D_X, it is adjusted according to the output first similarity value D_X(x) and second similarity value D_X(F(G(x))), so that when it subsequently discriminates the first image x and the first restored image F(G(x)), the re-output first similarity value D_X(x) increases toward 1 and the re-output second similarity value D_X(F(G(x))) decreases toward 0. In this way, when an image is subsequently input into the second discriminator D_X, it can discriminate the similarity between the input image and the first image x.
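A minimal sketch of one such adjustment step for the second discriminator D_X, assuming a log-likelihood objective consistent with the discriminator loss given later in this description; the function name and optimizer are illustrative.

```python
import torch

def update_discriminator_DX(D_X, x, restored_x, optimizer, eps=1e-8):
    """One adjustment of D_X: push D_X(x) toward 1 and
    D_X(F(G(x))) toward 0."""
    real_score = D_X(x)                     # first similarity value
    fake_score = D_X(restored_x.detach())   # second similarity value
    # maximize log D_X(x) + log(1 - D_X(F(G(x)))), i.e. minimize the negative
    loss = -(torch.log(real_score + eps).mean()
             + torch.log(1 - fake_score + eps).mean())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```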
Step 180: the second image is input to a second generator, which generates a second similar image.
Correspondingly, to enable the second generator to generate images of sample space X in the reverse direction, the second image y is input into the second generator F and a second similar image F(y) is generated according to the similarity coefficient S; the second similar image F(y) is an image of sample space X.
Step 190: the second similar image is input to the first generator, and a second restored image is generated.
To ensure that the image generated by the first generator differs from the input second image y only in style while keeping the same content, the second similar image F(y) is input into the first generator G and a second restored image G(F(y)) is generated according to the similarity coefficient S; G(F(y)) is an image of sample space Y.
Step 200: the second image and the second restored image are input into the first discriminator, and the third similarity value of the second image and the fourth similarity value of the second restored image are output by the first discriminator.
To determine whether the generated second restored image G(F(y)) is a real image of sample space Y, the second image y and the second restored image G(F(y)) are input into the first discriminator D_Y, obtaining a third similarity value D_Y(y) as the discrimination result for the second image y and a fourth similarity value D_Y(G(F(y))) as the discrimination result for the second restored image G(F(y)). The task of the first discriminator D_Y is to distinguish the second image y from the second restored image G(F(y)), judging the second image y as true and the second restored image G(F(y)) as false, i.e. the third similarity value D_Y(y) approaches 1 and the fourth similarity value D_Y(G(F(y))) approaches 0.
Step 210: adjusting the first discriminator according to the third similarity value and the fourth similarity value to increase the subsequently determined third similarity value and decrease the fourth similarity value, so that the first discriminator is used for discriminating the similarity between the input image and the second image.
To improve the discrimination capability of the first discriminator D_Y, it is adjusted according to the output third similarity value D_Y(y) and fourth similarity value D_Y(G(F(y))), so that when it subsequently discriminates the second image y and the second restored image G(F(y)), the re-output third similarity value D_Y(y) increases toward 1 and the re-output fourth similarity value D_Y(G(F(y))) decreases toward 0. In this way, when an image is subsequently input into the first discriminator D_Y, it can discriminate the similarity between the input image and the second image y.
Step 220: the first similar image is input into the second discriminator, and a fifth similarity value of the first similar image is output by the second discriminator.
To determine whether the first similar image G(x) is an image of sample space Y, the first similar image G(x) is input into the trained second discriminator D_X, which, based on the first image x, outputs a fifth similarity value D_X(G(x)) representing the similarity between the first image x and the first similar image G(x).
Step 230: a first difference between the fifth similarity value and the similarity coefficient is determined.
Since the similarity between the first image x and the second image y equals the similarity coefficient S, the first difference S - D_X(G(x)) between the similarity coefficient S and the fifth similarity value D_X(G(x)) reveals the gap between the first similar image G(x) generated by the first generator G and the images of sample space Y, allowing the performance of the first generator G to be assessed.
Step 240: the second similar image is input into the first discriminator, and a sixth similarity value of the second similar image is output by the first discriminator.
To determine whether the second similar image F(y) is an image of sample space X, the second similar image F(y) is input into the trained first discriminator D_Y, which, based on the second image y, outputs a sixth similarity value D_Y(F(y)) representing the similarity between the second image y and the second similar image F(y).
Step 250: a second difference between the sixth similarity value and the similarity coefficient is determined.
Similarly, since the similarity between the first image x and the second image y equals the similarity coefficient S, the second difference S - D_Y(F(y)) between the similarity coefficient S and the sixth similarity value D_Y(F(y)) reveals the gap between the second similar image F(y) generated by the second generator F and the images of sample space X, allowing the performance of the second generator F to be assessed.
Step 260: a first euclidean distance of the first image and the first restored image is determined.
The Euclidean distance measures point-to-point distance and can characterize how well two two-dimensional images match: the larger the distance, the worse the match. Determining the first Euclidean distance between the first image x and the first restored image F(G(x)) gives the degree of matching, i.e. the similarity, between them, further assessing the performance of the second generator F.
Step 270: a second euclidean distance of the second image and the second restored image is determined.
Similarly, determining the second Euclidean distance between the second image y and the second restored image G(F(y)) gives the degree of matching, i.e. the similarity, between them, further assessing the performance of the first generator G.
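For concreteness, a pixel-wise L2 distance between an input image and its restored counterpart might be computed as follows; the patent does not specify whether the distance is taken over pixels or features, so the pixel-wise form is an assumption.

```python
import torch

def euclidean_distance(img_a, img_b):
    """Pixel-wise Euclidean (L2) distance between two equally sized image
    tensors; smaller values mean the restored image matches the input
    more closely."""
    return torch.norm(img_a - img_b, p=2).item()
```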
Step 280: judging whether the first difference and the second difference are smaller than a first preset threshold and, at the same time, whether the first Euclidean distance and the second Euclidean distance are smaller than a second preset threshold.
Comparing the first difference S - D_X(G(x)) and the second difference S - D_Y(F(y)) with the first preset threshold, and the first and second Euclidean distances with the second preset threshold, determines whether the first generator G and the second generator F are fully trained. The two preset thresholds may be equal, e.g. both set to 0.05, or different, e.g. a first preset threshold of 0.05 and a second preset threshold of 0.04. If either of the differences is greater than or equal to the first preset threshold, or either of the Euclidean distances is greater than or equal to the second preset threshold, step 290 is executed; otherwise step 300 is executed.
Step 290: adjusting parameters of the first generator according to the first difference and the second Euclidean distance, adjusting parameters of the second generator according to the second difference and the first Euclidean distance, and jumping back to the step of inputting the first image into the first generator to generate a first similar image, so that the subsequently determined first difference, second difference, first Euclidean distance and second Euclidean distance decrease.
Since the first difference S - D_X(G(x)) is determined from the first similar image G(x) and the second Euclidean distance from the second restored image G(F(y)), while the second difference S - D_Y(F(y)) is determined from the second similar image F(y) and the first Euclidean distance from the first restored image F(G(x)), the parameters of the first generator G are adjusted according to the first difference S - D_X(G(x)) and the second Euclidean distance, and the parameters of the second generator F according to the second difference S - D_Y(F(y)) and the first Euclidean distance. The procedure then jumps to step 140 and executes step 140 and the subsequent steps, so that the newly obtained first difference S - D_X(G(x)), second difference S - D_Y(F(y)), first Euclidean distance and second Euclidean distance decrease, until step 280 determines that both differences are smaller than the first preset threshold and both Euclidean distances are smaller than the second preset threshold.
Step 300: at least one of the first generator and the second generator is determined as a target generator for generating an image with a similarity equal to a similarity coefficient from the input image.
After training, either of the first generator G and the second generator F may be determined as the target generator, or both may be. An image input into the target generator yields an image whose similarity to it equals the similarity coefficient S.
Further, by employing first images x and second images y with different degrees of similarity, the similarity coefficient S input into the first generator G and the second generator F can be varied, so that target generators for generating images with different degrees of similarity can be trained.
The first generator G generates, based on the similarity coefficient S, a first similar image G(x) of the first image x and a second restored image G(F(y)) of the second image y; the second generator F generates a first restored image F(G(x)) of the first image x and a second similar image F(y) of the second image y. The second discriminator D_X then adjusts its parameters by discriminating the first image x and the first restored image F(G(x)), and the first discriminator D_Y adjusts its parameters by discriminating the second image y and the second restored image G(F(y)), improving the discrimination performance of both discriminators for subsequent generator training. Next, the parameters of the first generator G are adjusted using the first difference S - D_X(G(x)) between the similarity value of the first similar image G(x) output by the second discriminator D_X and the similarity coefficient S, together with the second Euclidean distance between the second restored image G(F(y)) and the second image y; the parameters of the second generator F are adjusted using the second difference S - D_Y(F(y)) between the similarity value of the second similar image F(y) output by the first discriminator D_Y and the similarity coefficient S, together with the first Euclidean distance between the first restored image F(G(x)) and the first image x. The steps from generating the first similar image G(x) onward are repeated until the first difference S - D_X(G(x)) and the second difference S - D_Y(F(y)) are smaller than the first preset threshold and the first and second Euclidean distances are smaller than the second preset threshold, yielding the target generator. In this way, matched to the characteristics of trademark images, the adjustable similarity coefficient S is introduced to train the generators, adjusting the similarity between the generated image and the input image while keeping the image content unchanged. The result is a target generator that can generate images whose similarity equals the similarity coefficient S, so the similarity of trademark images can be controlled manually and similar samples generated quickly, improving the generation efficiency of similar samples of trademark images.
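Putting the pieces together, the following sketch runs one iteration of steps 140 to 300 for a single image pair. It assumes the discriminator adjustments of steps 160 to 210 have already been made for this round, and for brevity uses a single optimizer over both generators (common in CycleGAN implementations), whereas the patent adjusts each generator with its own difference/distance pair; the threshold values are illustrative.

```python
import torch

def train_step(G, F, D_X, D_Y, x, y, S, opt, thr1=0.05, thr2=0.05):
    """One pass of steps 140-300 for the image pair (x, y).

    Returns True once all four quantities fall below their thresholds,
    i.e. G and/or F can serve as the target generator.
    """
    fake_y = G(x, S)       # first similar image G(x)
    rec_x = F(fake_y, S)   # first restored image F(G(x))
    fake_x = F(y, S)       # second similar image F(y)
    rec_y = G(fake_x, S)   # second restored image G(F(y))

    d1 = (S - D_X(fake_y)).abs().mean()  # first difference  S - D_X(G(x))
    d2 = (S - D_Y(fake_x)).abs().mean()  # second difference S - D_Y(F(y))
    e1 = torch.norm(rec_x - x, p=2)      # first Euclidean distance
    e2 = torch.norm(rec_y - y, p=2)      # second Euclidean distance

    if d1 < thr1 and d2 < thr1 and e1 < thr2 and e2 < thr2:
        return True

    # G is steered by (first difference, second Euclidean distance),
    # F by (second difference, first Euclidean distance)
    loss = (d1 + e2) + (d2 + e1)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return False
```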
To improve the quality of the generators, optionally, referring to FIG. 2, which shows a flowchart of a target generator training method according to another embodiment of the present application, step 110 includes the following steps:
Step 111: a first image set and a second image set are acquired.
Step 112: one of the images in the first image set is determined to be the first image.
Step 113: one of the images in the second image set is determined to be the second image.
Step 300 comprises the steps of:
Step 310: another image in the first image set is determined as the first image, another image in the second image set is determined as the second image, and the procedure jumps back to the step of inputting the first image into the first generator to generate a first similar image, until all images in the first image set and the second image set have been traversed, whereupon at least one of the first generator and the second generator is determined as the target generator.
Specifically, the first image set X and the second image set Y each contain a number of real samples. One real sample is taken from the first image set X as the first image x and input into the first generator G to generate the first similar image G(x). Likewise, one real sample is taken from the second image set Y as the second image y and input into the second generator F to generate the second similar image F(y).
To improve the quality of the first generator and the second generator, when the first difference S - D_X(G(x)) and the second difference S - D_Y(F(y)) are smaller than the first preset threshold and the first and second Euclidean distances are smaller than the second preset threshold, another real sample in the first image set X is redetermined as the first image x and another real sample in the second image set Y as the second image y; the procedure jumps to step 140 and continues with the subsequent steps until all real samples in the first image set X and the second image set Y have been traversed, and the finally trained first generator G or second generator F (or both) is taken as the target generator. Through such repeated training, the first generator G and the second generator F fully learn the characteristics of the first image set X and the second image set Y and generate more realistic images.
In this way, the images of the first image set X and the second image set Y can be input into the CycleGAN to train the first generator G and the second generator F, so that they fully learn the characteristics of the image sets X and Y, improving the quality of both generators so that they generate more realistic images; a traversal sketch follows.
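Under the same assumptions, the traversal of both image sets (steps 111 to 310) can be sketched by reusing train_step from above; pairing the two sets with zip is an illustrative choice, not specified by the patent.

```python
def train_over_image_sets(G, F, D_X, D_Y, set_X, set_Y, S, opt,
                          thr1=0.05, thr2=0.05):
    """Traverse all real samples in both image sets, training on each
    pair until the thresholds of step 280 are met."""
    for x, y in zip(set_X, set_Y):
        while not train_step(G, F, D_X, D_Y, x, y, S, opt, thr1, thr2):
            pass  # keep adjusting G and F on this pair
    return G, F   # either (or both) may be taken as the target generator
```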
To ensure that the generators and the discriminators evolve against each other, and thus that the generators can generate more realistic images, according to some embodiments of the present application the minimized sum of the adversarial loss functions of the first generator and the second generator is optionally:

Loss_GAN(G, F, D_X, D_Y) = E_{x~P_data(x)}[log(S - D_X(G(x)))] + E_{y~P_data(y)}[log(S - D_Y(F(y)))]

where S is the similarity coefficient, G is the first generator, F is the second generator, D_X is the second discriminator, D_Y is the first discriminator, x is the first image, y is the second image, E denotes expectation, x~P_data(x) denotes sampling of the first image, and y~P_data(y) denotes sampling of the second image.
For the first generator G to generate more realistic images of sample space Y, the fifth similarity value D_X(G(x)) of the first similar image should approach the similarity coefficient S, so that log(S - D_X(G(x))) approaches minus infinity. Likewise, for the second generator F to generate more realistic images of sample space X, the sixth similarity value D_Y(F(y)) of the second similar image should approach the similarity coefficient S, so that log(S - D_Y(F(y))) approaches minus infinity. Finally, through the softmax activation function, Loss_GAN is driven to 0.
In addition, to ensure that the generator's output pictures differ from the input pictures only in style while keeping the content identical, the minimized cycle consistency loss function is:

Loss_cyc = E_{x~P_data(x)}[||F(G(x)) - x||_1] + E_{y~P_data(y)}[||G(F(y)) - y||_1]
By minimizing ||F(G(x)) - x||_1, the first restored image F(G(x)) output by the second generator F differs from the first image x only in style while keeping the content identical. Likewise, by minimizing ||G(F(y)) - y||_1, the second restored image G(F(y)) output by the first generator G differs from the second image y only in style while keeping the content identical.
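A minimal sketch of this cycle consistency term, assuming batched image tensors and mean-reduced L1 norms, could read:

import torch

def cycle_consistency_loss(x: torch.Tensor, f_g_x: torch.Tensor,
                           y: torch.Tensor, g_f_y: torch.Tensor) -> torch.Tensor:
    """L1 cycle loss: ||F(G(x)) - x||_1 + ||G(F(y)) - y||_1.

    f_g_x is the first restored image F(G(x)); g_f_y is the second
    restored image G(F(y)).
    """
    return (f_g_x - x).abs().mean() + (g_f_y - y).abs().mean()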
To improve the discrimination capability of the discriminators, in some embodiments, the loss of the second discriminator D_X is:

Loss_{D_X} = E_{x~P_data(x)}[log D_X(x)] + E_{x~P_data(x)}[log(1 - D_X(F(G(x))))]

Similarly, the loss of the first discriminator D_Y is:

Loss_{D_Y} = E_{y~P_data(y)}[log D_Y(y)] + E_{y~P_data(y)}[log(1 - D_Y(G(F(y))))]

wherein Loss_{D_X} denotes the loss of the second discriminator D_X and Loss_{D_Y} denotes the loss of the first discriminator D_Y.
When training the second discriminator D_X, D_X(x) should be maximized while D_X(F(G(x))) is minimized, so that 1 - D_X(F(G(x))) is maximized. When training the first discriminator D_Y, D_Y(y) should be maximized while D_Y(G(F(y))) is minimized, so that 1 - D_Y(G(F(y))) is maximized. In this way, the discrimination capability of the first discriminator D_Y and the second discriminator D_X can be improved.
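The two discriminator objectives could be sketched as follows. Following the description above, the "fake" inputs are the restored images F(G(x)) and G(F(y)); the eps constant is an added numerical safeguard and an assumption of this sketch.

import torch

def discriminator_objectives(d_x_real: torch.Tensor, d_x_cycle: torch.Tensor,
                             d_y_real: torch.Tensor, d_y_cycle: torch.Tensor,
                             eps: float = 1e-8):
    """Objectives to *maximize* when training D_X and D_Y.

    d_x_real: D_X(x)            d_x_cycle: D_X(F(G(x)))
    d_y_real: D_Y(y)            d_y_cycle: D_Y(G(F(y)))
    """
    # Maximize log D_X(x) + log(1 - D_X(F(G(x)))), and likewise for D_Y.
    obj_dx = (torch.log(d_x_real + eps) + torch.log(1.0 - d_x_cycle + eps)).mean()
    obj_dy = (torch.log(d_y_real + eps) + torch.log(1.0 - d_y_cycle + eps)).mean()
    return obj_dx, obj_dy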
By minimizing the sum of the adversarial loss functions of the first generator G and the second generator F, the generators and the discriminators evolve together, ensuring that the first generator G generates more realistic images of the sample space Y and the second generator F generates more realistic images of the sample space X.
In order to improve the construction efficiency of similar sample pairs, according to another aspect of the embodiments of the present application, there is further provided a similar sample pair generation method. Fig. 3 shows a flowchart of the similar sample pair generation method provided by an embodiment of the present application; the method includes the following steps:
step 410: a sample image is acquired.
Step 420: the sample image is input into a target generator, and the target generator is trained by the target generator training method in any one of the above embodiments.
Step 430: the similarity between the similar image and the sample image is equal to the similarity coefficient by the object generator outputting the similar image.
Step 440: the sample image is determined together with the similar image as a similar sample pair.
Step 450: and determining the similarity coefficient as a similarity score label of the similar sample.
A sample image is obtained by randomly acquiring an image; the sample image is then input into the trained target generator, which outputs a similar image of the sample image. Since the target generator is obtained by training the Cycle GAN with the similarity coefficient S introduced, the similarity between the similar image and the sample image is equal to the similarity coefficient S. When the target generator outputs the similar image, the sample image and the similar image form a similar sample pair. To facilitate obtaining the similarity between the sample image and the similar image, the similarity coefficient S is used as the similarity score label of the similar sample pair, characterizing the similarity between the sample image and the similar image.
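A possible sketch of this pair-generation step, treating the trained target generator as a callable PyTorch module over a batch of sample images (an assumption about its interface), is:

import torch

@torch.no_grad()
def make_similar_pairs(target_generator: torch.nn.Module,
                       sample_images: torch.Tensor,
                       s: float):
    """Builds (sample, similar, label) triples; the label is the
    similarity coefficient S the target generator was trained with."""
    target_generator.eval()
    similar_images = target_generator(sample_images)
    labels = torch.full((sample_images.size(0),), s)
    return sample_images, similar_images, labels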
By inputting sample images into the target generator, similar images with different similarities can be generated, similar sample pairs can be constructed quickly, and the similarity of the two images is given directly, improving the construction efficiency of similar sample pairs; the pairs then guide the training and learning of subsequent models, thereby improving the efficiency of image retrieval.
In order to improve the efficiency of image retrieval, according to another aspect of the embodiment of the present application, there is further provided an image retrieval model training method, and fig. 4 shows the image retrieval model training method provided by the embodiment of the present application, where the method includes:
step 510: a set of similar sample pairs including a plurality of similar sample pairs generated by the similar sample pair generation method provided in the above embodiment is obtained.
Step 520: and inputting the similar sample pairs into a neural network for training to obtain an image retrieval model.
The neural network may be, but is not limited to, a Vision Transformer model, a convolutional neural network (Convolutional Neural Network, CNN), a recurrent neural network (Recurrent Neural Network, RNN), or the like. The embodiments of the present application mainly take the Vision Transformer as an example to describe the concept of the present application; this application scenario does not limit the protection scope of the present application. The similar sample pair set comprises a plurality of similar sample pairs, each formed by a sample image and a similar image generated by the target generator. All similar sample pairs in the similar sample pair set are input into the Vision Transformer model for training; following the feature extraction principle, the model learns from the constituent elements of the similar sample pairs, yielding an image retrieval model, i.e., a trained Vision Transformer model, capable of calculating the similarity of any two images for deep-learning, content-based image retrieval, thereby improving image retrieval efficiency.
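The embodiments do not spell out the training objective, so the following sketch makes an illustrative assumption: the retrieval model is treated as a module that scores a pair of images, and its score is regressed onto the pair's similarity score label with an MSE loss.

import torch
from torch.utils.data import DataLoader

def train_retrieval_model(model: torch.nn.Module, pair_dataset,
                          epochs: int = 10, lr: float = 1e-4) -> torch.nn.Module:
    """Hypothetical training loop over (image_a, image_b, label) triples."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loader = DataLoader(pair_dataset, batch_size=32, shuffle=True)
    for _ in range(epochs):
        for img_a, img_b, label in loader:
            score = model(img_a, img_b)  # predicted similarity of the pair
            loss = torch.nn.functional.mse_loss(score, label)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model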
By inputting all similar sample pairs in the similar sample pair set into the neural network for training, an image retrieval model for calculating the similarity of any two images can be obtained, thereby improving image retrieval efficiency.
In order to improve the efficiency of trademark image retrieval, according to another aspect of the embodiment of the present application, there is further provided a trademark image retrieval method, and fig. 5 shows a flowchart of the trademark image retrieval method provided by the embodiment of the present application, and the method includes the following steps:
step 610: an existing trademark image library is obtained, and the trademark image library comprises a plurality of existing trademark images.
Step 620: and acquiring a trademark image to be retrieved.
Step 630: and inputting the trademark images to be searched and the existing trademark images into an image search model, wherein the image search model is obtained through training by the image search model training method provided by the embodiment.
Step 640: and outputting the similarity scores of the trademark images to be searched and each existing trademark image through the image searching model.
In order to prevent trademark infringement and protect the legitimate rights of trademarks and their owners, a trademark image needs to be input into the image retrieval model for retrieval. First, all registered existing trademark images are acquired from the existing trademark image library; then, the trademark image to be retrieved is acquired. Next, the trademark image to be retrieved and each existing trademark image are input into the image retrieval model, which calculates the similarity score between the trademark image to be retrieved and each existing trademark image; finally, all retrieved similar images are output in descending order of similarity score.
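The retrieval step then reduces to scoring the query against every registered image and sorting, along the lines of the sketch below (the per-pair model interface is the same assumption as above):

import torch

@torch.no_grad()
def rank_trademarks(model: torch.nn.Module,
                    query_image: torch.Tensor,
                    existing_images: list):
    """Scores the query against each registered image and returns the
    indices ordered from most to least similar, with their scores."""
    scores = torch.stack([model(query_image, img) for img in existing_images])
    order = torch.argsort(scores, descending=True)
    return order, scores[order]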
By inputting the trademark image to be retrieved and each existing trademark image into the image retrieval model, registered trademark images with high similarity to the trademark image to be retrieved can be retrieved directly, improving trademark image retrieval efficiency.
In order to reduce the number of existing trademark images required for trademark image retrieval and improve trademark image retrieval efficiency, according to some embodiments of the present application, optionally, fig. 6 shows a flowchart of a trademark image retrieval method according to another embodiment of the present application; as shown in the figure, the method includes, after step 620, the following steps:
step 621: registration information of the trademark image to be retrieved is acquired.
Step 622: and determining an existing trademark image belonging to the same field as the trademark to be searched according to the registration information.
Step 630 includes the steps of:
step 631: and inputting the trademark image to be searched and the existing trademark image belonging to the same field as the trademark to be searched into an image searching model.
Step 640 includes the steps of:
step 641: and outputting the trademark image to be searched and the similarity score of the existing trademark image belonging to the same field with the trademark to be searched through the image searching model.
After the trademark image to be retrieved is obtained, its registration information is acquired and the international standard class number is extracted from the registration information. The major class of the trademark image to be retrieved is then looked up according to the international standard class number, and the existing trademark images in the same field are found in the trademark image library. Since trademark retrieval only needs to be performed within this field, searching existing trademark images in other fields is avoided, improving trademark image retrieval efficiency. Further, the trademark image to be retrieved and each existing trademark image in the same field are input into the image retrieval model for retrieval; the image retrieval model calculates the similarity score of each pair of trademark images, and finally all similar images in the same field are output in descending order of similarity score.
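A simple pre-filter by international class number could look like the sketch below; the layout of the library entries and the key names are hypothetical:

def filter_same_field(query_class: str, image_library: list) -> list:
    """Keeps only registered marks whose international standard class
    number matches the class extracted from the query's registration
    information. Each entry is assumed to be a dict such as
    {"image": ..., "nice_class": "25"}."""
    return [entry["image"] for entry in image_library
            if entry["nice_class"] == query_class]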
By using the registration information to acquire the existing trademark images in the same field as the trademark image to be retrieved, the number of existing trademark images that need to be searched can be reduced, improving trademark image retrieval efficiency.
In order to further improve the efficiency of trademark image retrieval, according to some embodiments of the present application, optionally, referring to fig. 7, fig. 7 shows a flowchart of a trademark image retrieval method according to another embodiment of the present application, where, as shown in the figure, a similarity score between a trademark image and each existing trademark image is output through an image retrieval model, including:
step 741: the image retrieval model preprocesses the trademark image to be retrieved and the existing trademark image.
When the trademark image to be retrieved and an existing trademark image are input into the image retrieval model, the model preprocesses the two trademark images to unify their format: each image is padded into a square and resized to a unified size, and then converted from an RGB image into a Tensor-format image.
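The preprocessing just described, padding to a square, resizing to a unified resolution, and converting to a tensor, could be sketched with torchvision as follows; the 224-pixel target size and the white padding color are assumptions:

import torchvision.transforms.functional as TF
from PIL import Image

def preprocess(img: Image.Image, size: int = 224):
    """Pads an RGB image to a square, resizes it to a unified size, and
    converts it to a Tensor-format image."""
    w, h = img.size
    side = max(w, h)
    canvas = Image.new("RGB", (side, side), (255, 255, 255))
    canvas.paste(img, ((side - w) // 2, (side - h) // 2))  # center on square
    return TF.to_tensor(TF.resize(canvas, [size, size]))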
Step 742: and determining the projection representation of the trademark image to be searched and the existing trademark image by the image searching model according to the preprocessed trademark image to be searched and the existing trademark image.
After the image retrieval model preprocesses the trademark image to be retrieved and the existing trademark image, the two trademark images are input into the encoder of the image retrieval model through a flattened linear projection layer, yielding an image representation x_i of the trademark image to be retrieved and an image representation x_j of the existing trademark image; a nonlinear projection layer then maps the image representation x_i to a projection representation z_i and the image representation x_j to a projection representation z_j.
Step 743: and determining the similarity scores of the trademark images to be searched and the existing trademark images according to the projection representation by the image searching model.
After the projection representations of the trademark image to be retrieved and the existing trademark image are obtained, the cosine similarity S_{i,j} of the projection representations z_i and z_j can be calculated; the cosine similarity S_{i,j} is the similarity score of the trademark image to be retrieved and the existing trademark image, calculated according to the following formula:

S_{i,j} = (z_i · z_j) / (τ ||z_i|| ||z_j||)

wherein τ is an adjustable temperature parameter used to control the cosine similarity within [-1, 1], and ||z_i|| and ||z_j|| are the norms of the vectors.
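The scoring formula translates directly into code. In the sketch below the default temperature is a common choice from the contrastive-learning literature, not a value given in the embodiments:

import torch

def similarity_score(z_i: torch.Tensor, z_j: torch.Tensor,
                     tau: float = 0.07) -> torch.Tensor:
    """Temperature-scaled cosine similarity S_ij between two projection
    representations (1-D vectors)."""
    cosine = torch.dot(z_i, z_j) / (z_i.norm() * z_j.norm())
    return cosine / tau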
Inputting the preprocessed trademark image to be retrieved and the preprocessed existing trademark image into the encoder of the image retrieval model for deeper retrieval reduces the image-processing work of the encoder, further improving trademark image retrieval efficiency.
In order to improve quality of trademark image retrieval, according to some embodiments of the present application, optionally, please refer to fig. 8, fig. 8 shows a flowchart of a trademark image retrieval method according to another embodiment of the present application, where determining, by an image retrieval model, a projection representation of a trademark image to be retrieved and an existing trademark image according to a preprocessed trademark image to be retrieved and the existing trademark image, includes:
step 742a: and carrying out serialization operation on the preprocessed trademark image to be searched and the existing trademark image by using the image searching model to obtain image blocks of the trademark image to be searched and the existing trademark image.
In the image retrieval model, in order to process the two-dimensional trademark image to be retrieved and existing trademark image, a serialization operation is performed on the input Tensor-format trademark images: each trademark image x ∈ R^{H×W×C} is segmented, reshaped, and flattened into a sequence of image blocks x_p ∈ R^{N×(P^2·C)}, yielding the image blocks of the trademark image to be retrieved and of the existing trademark image. Here (H, W) is the resolution of the original trademark image, C is the number of channels of the trademark image, and (P, P) is the resolution of each divided image block; the number of image blocks N satisfies the relation N = HW/P^2.
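This serialization step can be sketched as a pure tensor reshaping. The function below assumes a single image tensor in (C, H, W) layout with H and W divisible by P:

import torch

def patchify(x: torch.Tensor, p: int) -> torch.Tensor:
    """Splits an image tensor (C, H, W) into N = HW / P^2 flattened
    image blocks, each of dimension P*P*C."""
    c, h, w = x.shape
    assert h % p == 0 and w % p == 0, "H and W must be divisible by P"
    patches = x.unfold(1, p, p).unfold(2, p, p)  # (C, H/P, W/P, P, P)
    patches = patches.permute(1, 2, 0, 3, 4)     # (H/P, W/P, C, P, P)
    return patches.reshape(-1, c * p * p)        # (N, P*P*C)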
Step 742b: determining, by the image retrieval model, a vector representation F_1 of the image blocks through a flattened linear projection layer.
In the image retrieval model, in order to extract image features, the two-dimensional image must be converted into a one-dimensional sequence. The image blocks of the trademark image to be retrieved and of the existing trademark image are therefore mapped to D dimensions after passing through the flattened linear projection layer. In this process, to preserve the spatial position information among the input image blocks, position information E_pos is added to each vector; furthermore, an additional learnable classification header x_class is appended to the mapped vector sequence for class prediction. The vector representation F_1 is thus obtained, with the following formula:

F_1 = [x_class; x_p^1 E; x_p^2 E; …; x_p^N E] + E_pos

wherein x_p^i E is the product of the i-th image block in the sequence and the projection matrix E, with E ∈ R^{(P^2·C)×D} and E_pos ∈ R^{(N+1)×D}.
Step 742c: determining, by the image retrieval model, the global dependency F_2 of F_1 through multi-head self-attention.
First, the vector representation F_1 undergoes three linear transformations to obtain the query vector Q, the key vector K, and the value vector V, with Q ∈ R^{(N+1)×D}, K ∈ R^{(N+1)×D}, and V ∈ R^{(N+1)×D}. The three linear transformations are:

Q = F_1 W_Q, K = F_1 W_K, V = F_1 W_V

where N denotes the number of image blocks and each W is a matrix of size D×D; multiplying F_1 by W amounts to a linear transformation.
Then, the correlation matrix score between the image blocks is calculated, with score ∈ R^{(N+1)×(N+1)}; this matrix determines the degree of attention between the image block at one position and the image blocks at the other positions. The calculation formula is:

score = softmax(QK^T / √D)

wherein softmax is the activation function, √D is a scaling factor used to alleviate the gradient-vanishing problem caused by the softmax activation function, and T denotes the matrix transpose.
Then, the correlation matrix and the value vector V are matrix-multiplied to obtain the global dependency of one angle, namely one head in the multi-head self-attention mechanism, head_1, with head_1 ∈ R^{(N+1)×D}. The calculation formula is:

head_1 = score · V
in order to fully extract the global dependency relationship of each image block from multiple angles, repeating all the operations to obtain h different heads, and performing splicing operation, namely head, on the h heads for better fusion of multi-angle information 1 ,head 2 ,…,head h
Finally, to keep the input and output dimensions consistent, the concatenated features undergo a linear transformation to obtain the feature F_2, with F_2 ∈ R^{(N+1)×D}; F_2 is the global dependency of each image block within one image, extracted from multiple angles. The calculation formula is:

F_2 = concat(head_1, head_2, …, head_h)W

where concat denotes the matrix concatenation operation.
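Step 742c can be sketched as a compact PyTorch module. One simplification relative to the text above: this sketch splits the dimension D across the h heads, the standard Transformer formulation, rather than giving each head the full dimension D:

import math
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    """Minimal multi-head self-attention over a single (N+1, D) sequence."""

    def __init__(self, d: int, h: int):
        super().__init__()
        assert d % h == 0, "D must be divisible by the number of heads"
        self.h = h
        self.w_q = nn.Linear(d, d, bias=False)
        self.w_k = nn.Linear(d, d, bias=False)
        self.w_v = nn.Linear(d, d, bias=False)
        self.w_o = nn.Linear(d, d, bias=False)  # fuses the concatenated heads

    def forward(self, f1: torch.Tensor) -> torch.Tensor:
        n, d = f1.shape
        d_head = d // self.h
        q = self.w_q(f1).view(n, self.h, d_head).transpose(0, 1)
        k = self.w_k(f1).view(n, self.h, d_head).transpose(0, 1)
        v = self.w_v(f1).view(n, self.h, d_head).transpose(0, 1)
        # score = softmax(Q K^T / sqrt(d_head)); the scaling tempers
        # softmax saturation and the resulting vanishing gradients.
        score = torch.softmax(q @ k.transpose(-2, -1) / math.sqrt(d_head), dim=-1)
        heads = (score @ v).transpose(0, 1).reshape(n, d)  # concatenate heads
        return self.w_o(heads)  # F_2, shape (N+1, D)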
Step 742d: performing, by the image retrieval model, a residual connection of F_1 and F_2, and determining the feature F_3 through a layer normalization operation.
Specifically, F_1 and F_2 undergo a residual connection followed by layer normalization to obtain the feature F_3, with F_3 ∈ R^{(N+1)×D}. The calculation formula is:

F_3 = LayerNorm(F_1 + F_2)

wherein LayerNorm denotes layer normalization.
Step 742e: determining, by the image retrieval model, the feature F_4 by performing a multi-layer perceptron calculation on F_3.
F_3 undergoes a multi-layer perceptron (Multilayer Perceptron, MLP) calculation to obtain the feature F_4. The calculation formula is:

F_4 = MLP(F_3)
Step 742f: performing, by the image retrieval model, a residual connection of F_3 and F_4, and determining the feature F_5 through a layer normalization operation.
Specifically, F_3 and F_4 undergo a residual connection followed by a layer normalization operation to obtain the feature F_5, with F_5 ∈ R^{(N+1)×D}. The calculation formula is:

F_5 = LayerNorm(F_3 + F_4)
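Steps 742c through 742f together form one encoder block. The sketch below composes the attention module from the previous sketch with the two residual/LayerNorm stages; the MLP width ratio and GELU activation are assumptions, since the text only names a multi-layer perceptron:

import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    """One iteration of steps 742c-742f (post-norm ordering, as described)."""

    def __init__(self, d: int, h: int, mlp_ratio: int = 4):
        super().__init__()
        self.attn = MultiHeadSelfAttention(d, h)  # sketch from step 742c
        self.norm1 = nn.LayerNorm(d)
        self.norm2 = nn.LayerNorm(d)
        self.mlp = nn.Sequential(
            nn.Linear(d, mlp_ratio * d),
            nn.GELU(),
            nn.Linear(mlp_ratio * d, d),
        )

    def forward(self, f1: torch.Tensor) -> torch.Tensor:
        f3 = self.norm1(f1 + self.attn(f1))  # F_3 = LayerNorm(F_1 + F_2)
        f5 = self.norm2(f3 + self.mlp(f3))   # F_5 = LayerNorm(F_3 + F_4)
        return f5

Stacking K such blocks yields the deeper feature F_6 of step 742g.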
Step 742g: jumping, by the image retrieval model, to step 742c and executing steps 742c through 742f K times to obtain the feature F_6.
To better extract the features of the image, steps 742c through 742f are repeated K times, using a deeper network to extract the deeper feature F_6, with F_6 ∈ R^{(N+1)×D}; F_6 is the output of the K-th iteration of steps 742c through 742f.
Step 742h: determining, by the image retrieval model, the projection representations of the trademark image to be retrieved and the existing trademark image by inputting F_6 into a fully connected layer.
The feature F_6 extracted by the image retrieval model is input into a fully connected layer whose number of output neurons equals the vector-space dimension D, yielding the feature F_7, with F_7 ∈ R^{1×D}; F_7 is the projection representation of the trademark image to be retrieved or of the existing trademark image. The calculation formula is:

F_7 = FC(flatten(F_6))

wherein flatten denotes tiling the two-dimensional feature into a one-dimensional vector and FC denotes the fully connected layer.
In this way, the image retrieval model can extract deeper features of the trademark image to be retrieved and the existing trademark image, enabling finer retrieval of the trademark content and improving trademark image retrieval quality.
Fig. 9 shows a schematic structural diagram of a training device for a target generator according to an embodiment of the present application. As shown in the figure, the target generator training apparatus 800 includes: the first obtaining module 801, the second obtaining module 802, the input module 803, the first generating module 804, the first restoring module 805, the first discriminating module 806, the first adjusting module 807, the second generating module 808, the second restoring module 809, the second discriminating module 810, the second adjusting module 811, the third discriminating module 812, the first determining module 813, the fourth discriminating module 814, the second determining module 815, the third determining module 816, the fourth determining module 817, the judging module 818, the third adjusting module 819, and the fifth determining module 820.
A first acquiring module 801, configured to acquire a first image and a second image. A second obtaining module 802 is configured to obtain a similarity coefficient, where the similarity coefficient is used to characterize a similarity between the first image and the second image. An input module 803 for inputting the similarity coefficient into the first generator and the second generator.
The first generating module 804 is configured to input the first image into a first generator and generate a first similar image. The first restoration module 805 is configured to input the first similar image into the second generator and generate a first restored image.
The first discriminating module 806 is configured to input the first image and the first restored image into the second discriminator, and output the first similarity value of the first image and the second similarity value of the first restored image by the second discriminator. The first adjusting module 807 is configured to adjust the second discriminator according to the first similarity value and the second similarity value, so that the first similarity value determined subsequently increases, and the second similarity value decreases, so that the second discriminator is configured to discriminate the similarity between the input image and the first image.
A second generation module 808 is configured to input the second image into a second generator to generate a second similar image. A second restoration module 809 for inputting the second similar image into the first generator to generate a second restored image.
The second discriminating module 810 is configured to input the second image and the second restored image into the first discriminator, and output a third similarity value of the second image and a fourth similarity value of the second restored image by the first discriminator. The second adjustment module 811 is configured to adjust the first discriminator according to the third similarity value and the fourth similarity value, so as to increase the third similarity value determined subsequently and decrease the fourth similarity value, thereby enabling the first discriminator to discriminate the similarity between the input image and the second image.
The third discriminating module 812 is configured to input the first similar image into the second discriminator, and output a fifth similar value of the first similar image by the second discriminator. The first determining module 813 is configured to determine a first difference between the fifth similarity value and the similarity coefficient.
The fourth discriminating module 814 is configured to input the second similar image into the first discriminator, and output a sixth similarity value of the second similar image by the first discriminator. A second determining module 815 is configured to determine a second difference between the sixth similarity value and the similarity coefficient.
A third determining module 816 is configured to determine a first euclidean distance of the first image and the first restored image. A fourth determination module 817 is configured to determine a second euclidean distance of the second image and the second restored image.
The determining module 818 is configured to determine whether the first difference and the second difference are smaller than a first preset threshold, and determine whether the first euclidean distance and the second euclidean distance are smaller than a second preset threshold.
And the third adjusting module 819 is configured to, when the first difference and the second difference are greater than or equal to a first preset threshold or the first Euclidean distance and the second Euclidean distance are greater than or equal to a second preset threshold, adjust parameters of the first generator according to the first difference and the second Euclidean distance, adjust parameters of the second generator according to the second difference and the first Euclidean distance, and jump to the step of inputting the first image into the first generator to generate a first similar image, so that the first difference, the second difference, the first Euclidean distance, and the second Euclidean distance determined subsequently are reduced.
And a fifth determining module 820, configured to determine at least one of the first generator and the second generator as a target generator when the first difference value and the second difference value are smaller than a first preset threshold value and the first euclidean distance and the second euclidean distance are smaller than a second preset threshold value, where the target generator is configured to generate an image with a similarity equal to the similarity coefficient.
In an alternative manner, the first obtaining module 801 is further configured to obtain a first image set and a second image set, where one image in the first image set is determined to be the first image, and one image in the second image set is determined to be the second image. The fifth determining module 820 is further configured to determine the other image in the first image set as the first image, determine the other image in the second image set as the second image when the first difference and the second difference are smaller than a first preset threshold and the first euclidean distance and the second euclidean distance are smaller than a second preset threshold, skip to the step of inputting the first image into the first generator to generate the first similar image until at least one of the first generator and the second generator is determined as the target generator after traversing all the images in the first image set and the second image set.
In the target generator training apparatus, based on the similarity coefficient S, the first generation module 804 generates the first similar image G(x) of the first image x, the first restoration module 805 restores the first restored image F(G(x)) of the first image x, the second generation module 808 generates the second similar image F(y) of the second image y, and the second restoration module 809 restores the second restored image G(F(y)) of the second image y. Then, the first adjustment module 807 adjusts the parameters of the second discriminator by discriminating the first image x and the first restored image F(G(x)), and the second adjustment module 811 adjusts the parameters of the first discriminator by discriminating the second image y and the second restored image G(F(y)), improving the discrimination performance of the first discrimination module 806 and the second discrimination module 810 for use in subsequent generator training. Next, the parameters of the first generator are adjusted according to the first difference S - D_X(G(x)) between the similarity value of the first similar image G(x) output by the third discrimination module 812 and the similarity coefficient S, together with the second Euclidean distance between the second restored image G(F(y)) and the second image y; the parameters of the second generator are adjusted according to the second difference S - D_Y(F(y)) between the similarity value of the second similar image F(y) output by the fourth discrimination module 814 and the similarity coefficient S, together with the first Euclidean distance between the first restored image F(G(x)) and the first image x; this improves the performance of the first generation module 804 and the second generation module 808. The step of generating the first similar image G(x) is repeated until the first difference S - D_X(G(x)) and the second difference S - D_Y(F(y)) are smaller than the first preset threshold and the first Euclidean distance and the second Euclidean distance are smaller than the second preset threshold, yielding the target generator. Through the target generator training apparatus 800, in line with the characteristics of trademark images, an adjustable similarity coefficient S is introduced to train the generators, and the similarity between the generated image and the input image is adjusted while the image content remains unchanged, so that a target generator capable of generating images with similarity equal to the similarity coefficient S is obtained; the similarity of trademark images can thus be controlled manually and similar samples generated quickly, improving the generation efficiency of similar trademark image samples.
Fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application; the specific implementation of the electronic device is not limited by the embodiments of the present application.
As shown in fig. 10, the electronic device may include: a processor 902, a communication interface (Communications Interface) 904, a memory 906, and a communication bus 908.
Wherein: processor 902, communication interface 904, and memory 906 communicate with each other via a communication bus 908. A communication interface 904 for communicating with network elements of other devices, such as clients or other servers. The processor 902 is configured to execute the program 910, and specifically may execute the target generator training method in any of the method embodiments described above, and/or the similar sample pair generating method, and/or the image retrieval model training method, and/or the relevant steps in the trademark image retrieval method embodiments.
The embodiment of the application provides a computer readable storage medium, which stores executable instructions that, when executed on an electronic device, cause the electronic device to execute the target generator training method in any of the method embodiments described above, and/or the similar sample pair generation method, and/or the image retrieval model training method, and/or the trademark image retrieval method.
Embodiments of the present application provide a computer program that may be invoked by a processor to cause an electronic device to perform the target generator training method, and/or the similar sample pair generation method, and/or the image retrieval model training method, and/or the brand image retrieval method of any of the method embodiments described above.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The required structure for constructing such a system is apparent from the description above. In addition, embodiments of the present application are not directed to any particular programming language. It will be appreciated that the teachings of the present application described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided for disclosure of enablement and best mode of the present application.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the application may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the application, various features of the embodiments of the application are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., the claimed application requires more features than are expressly recited in each claim.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component, and they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
It should be noted that the above-mentioned embodiments illustrate rather than limit the application, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The application may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. do not denote any order. These words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specifically stated.

Claims (10)

1. A method of training a target generator, the method comprising:
acquiring a first image and a second image;
Obtaining a similarity coefficient, wherein the similarity coefficient is used for representing the similarity of the first image and the second image;
inputting the similarity coefficient into a first generator and a second generator;
inputting the first image into the first generator to generate a first similar image;
inputting the first similar image into the second generator to generate a first restored image;
inputting the first image and the first restored image into a second discriminator, and outputting a first similarity value of the first image and a second similarity value of the first restored image by the second discriminator;
adjusting the second discriminator according to the first similarity value and the second similarity value to increase the first similarity value which is determined subsequently and decrease the second similarity value, so that the second discriminator is used for discriminating the similarity between the input image and the first image;
inputting the second image into the second generator to generate a second similar image;
inputting the second similar image into the first generator to generate a second restored image;
inputting the second image and the second restored image into a first discriminator, and outputting a third similarity value of the second image and a fourth similarity value of the second restored image by the first discriminator;
Adjusting the first discriminator according to the third similarity value and the fourth similarity value to increase the third similarity value which is determined subsequently and decrease the fourth similarity value, so that the first discriminator is used for discriminating the similarity degree of the input image and the second image;
inputting the first similar image into the second discriminator, and outputting a fifth similar value of the first similar image by the second discriminator;
determining a first difference between the fifth similarity value and the similarity coefficient;
inputting the second similar image into the first discriminator, and outputting a sixth similar value of the second similar image by the first discriminator;
determining a second difference between the sixth similarity value and the similarity coefficient;
determining a first euclidean distance of the first image and the first restored image;
determining a second euclidean distance of the second image and the second restored image;
judging whether the first difference value and the second difference value are smaller than a first preset threshold value or not, and judging whether the first Euclidean distance and the second Euclidean distance are smaller than a second preset threshold value or not at the same time;
if not, adjusting parameters of the first generator according to the first difference value and the second Euclidean distance, adjusting parameters of the second generator according to the second difference value and the first Euclidean distance, and jumping to the step of inputting the first image into the first generator to generate a first similar image so as to reduce the first difference value, the second difference value, the first Euclidean distance and the second Euclidean distance which are determined later;
If yes, at least one of the first generator and the second generator is determined to be a target generator, and the target generator is used for generating an image with similarity equal to the similarity coefficient according to an input image.
2. The method of claim 1, wherein the acquiring the first image and the second image comprises:
acquiring a first image set and a second image set;
determining one of the first set of images as the first image;
determining one of the second set of images as the second image;
if yes, determining at least one of the first generator and the second generator as a target generator, wherein the target generator is used for generating an image with similarity equal to the similarity coefficient according to an input image, and comprises the following steps:
if yes, determining the other image in the first image set as the first image, determining the other image in the second image set as the second image, jumping to the step of inputting the first image into the first generator to generate a first similar image, and determining at least one of the first generator and the second generator as a target generator after traversing all images in the first image set and the second image set.
3. The target generator training method of claim 1, wherein the minimized sum of the adversarial loss functions of the first generator and the second generator is:

Loss_GAN = E_{x~P_data(x)}[log(S - D_X(G(x)))] + E_{y~P_data(y)}[log(S - D_Y(F(y)))]

wherein S is the similarity coefficient, G is the first generator, F is the second generator, D_X is the second discriminator, D_Y is the first discriminator, x is the first image, y is the second image, E is the expectation, x~P_data(x) corresponds to sampling the first image, and y~P_data(y) corresponds to sampling the second image.
4. A method for generating similar sample pairs, the method comprising:
acquiring a sample image;
inputting the sample image into a target generator, wherein the target generator is obtained by training the target generator training method according to any one of claims 1-3;
outputting, by the target generator, a similar image, a similarity between the similar image and the sample image being equal to the similarity coefficient;
determining the sample image and the similar image together as a similar sample pair;
and determining the similarity coefficient as a similarity score label of the similar sample pair.
5. The image retrieval model training method is characterized by comprising the following steps of:
obtaining a set of similar sample pairs comprising a plurality of similar sample pairs, the similar sample pairs generated by the similar sample pair generation method of claim 4;
and inputting the similar sample pairs into a neural network for training to obtain an image retrieval model.
6. A trademark image retrieval method, characterized in that the trademark image retrieval method comprises:
acquiring an existing trademark image library, wherein the trademark image library comprises a plurality of existing trademark images;
acquiring a trademark image to be searched;
inputting the trademark image to be searched and the existing trademark image into an image search model, wherein the image search model is obtained by training the image search model training method according to claim 5;
and outputting similarity scores of the trademark images to be searched and each existing trademark image through the image searching model.
7. The trademark image retrieval method of claim 6, characterized by comprising, after the trademark image to be retrieved is obtained:
acquiring registration information of the trademark image to be retrieved;
Determining the existing trademark image belonging to the same field as the trademark to be searched according to the registration information;
the step of inputting the trademark image to be searched and the existing trademark image into an image search model, wherein the image search model is obtained by training the image search model training method according to claim 5, and comprises the following steps:
inputting the trademark image to be searched and the existing trademark image belonging to the same field as the trademark to be searched into the image searching model;
the outputting, by the image retrieval model, the similarity score between the trademark image to be retrieved and each of the existing trademark images includes:
and outputting the trademark image to be searched and the similarity score of the existing trademark image belonging to the same field with the trademark to be searched through the image search model.
8. The trademark image retrieval method according to claim 6 or 7, wherein the outputting, by the image retrieval model, of the similarity scores of the trademark image to be retrieved and each of the existing trademark images comprises:
preprocessing the trademark image to be searched and the existing trademark image by the image searching model;
Determining projection representations of the trademark image to be searched and the existing trademark image by the image searching model according to the preprocessed trademark image to be searched and the existing trademark image;
and determining the similarity scores of the trademark image to be searched and the existing trademark image according to the projection representation by the image searching model.
9. The trademark image retrieval method of claim 8, wherein the determining, by the image retrieval model, a projected representation of the trademark image to be retrieved and the existing trademark image from the preprocessed trademark image to be retrieved and the existing trademark image includes:
carrying out serialization operation on the preprocessed trademark image to be searched and the existing trademark image by the image searching model to obtain image blocks of the trademark image to be searched and the existing trademark image;
determining a vector representation F of the image block by the image retrieval model through a flattened linear projection layer 1
Determining said F by said image retrieval model in a multi-headed self-attention manner 1 Global dependency of F 2
The F is retrieved by the image retrieval model 1 And said F 2 Residual connection is carried out, and the characteristic F is determined through layer normalization operation 3
By the image retrieval model through the F 3 Calculating by a multi-layer perceptron, and determining a characteristic F 4
The F is retrieved by the image retrieval model 3 With said F 4 Residual connection is carried out, and the characteristic F is determined through layer normalization operation 5
Jumping from said image retrieval model to said determining said F by said image retrieval model by multi-headed self-attention means 1 Global dependency of F 2 And K times to obtain feature F 6
Retrieving, by the image retrieval model, the F 6 The input full connection layer determines the projected representations of the brand image to be retrieved and the existing brand image.
10. A goal generator training apparatus, the goal generator training apparatus comprising:
the first acquisition module is used for acquiring a first image and a second image;
the second acquisition module is used for acquiring a similarity coefficient, wherein the similarity coefficient is used for representing the similarity between the first image and the second image;
the input module is used for inputting the similarity coefficient into the first generator and the second generator;
a first generation module for inputting the first image into the first generator to generate a first similar image;
A first restoration module for inputting the first similar image into the second generator to generate a first restored image;
a first discriminating module for inputting the first image and the first restored image into a second discriminator, and outputting a first similarity value of the first image and a second similarity value of the first restored image by the second discriminator;
the first adjustment module is used for adjusting the second discriminator according to the first similarity value and the second similarity value so as to increase the first similarity value which is determined subsequently and decrease the second similarity value, so that the second discriminator is used for discriminating the similarity between an input image and the first image;
a second generation module for inputting the second image into the second generator to generate a second similar image;
a second restoration module for inputting the second similar image into the first generator to generate a second restored image;
a second discriminating module for inputting the second image and the second restored image into a first discriminator, and outputting a third similarity value of the second image and a fourth similarity value of the second restored image by the first discriminator;
The second adjustment module is used for adjusting the first discriminator according to the third similarity value and the fourth similarity value so as to increase the third similarity value which is determined subsequently and decrease the fourth similarity value, so that the first discriminator is used for discriminating the similarity between the input image and the second image;
a third discrimination module, configured to input the first similar image into the second discriminator, and output, by the second discriminator, a fifth similar value of the first similar image;
a first determining module, configured to determine a first difference between the fifth similarity value and the similarity coefficient;
a fourth discrimination module for inputting the second similar image into the first discriminator, and outputting a sixth similar value of the second similar image by the first discriminator;
a second determining module, configured to determine a second difference between the sixth similarity value and the similarity coefficient;
a third determining module, configured to determine a first euclidean distance between the first image and the first restored image;
a fourth determining module, configured to determine a second euclidean distance between the second image and the second restored image;
the judging module is used for judging whether the first difference value and the second difference value are smaller than a first preset threshold value or not and judging whether the first Euclidean distance and the second Euclidean distance are smaller than a second preset threshold value or not at the same time;
A third adjustment module, configured to, when the first difference value and the second difference value are greater than or equal to the first preset threshold value or the first Euclidean distance and the second Euclidean distance are greater than or equal to the second preset threshold value, adjust parameters of the first generator according to the first difference value and the second Euclidean distance, adjust parameters of the second generator according to the second difference value and the first Euclidean distance, and jump to the step of inputting the first image into the first generator to generate a first similar image, so that the first difference value, the second difference value, the first Euclidean distance and the second Euclidean distance determined subsequently are reduced;
a fifth determining module, configured to determine at least one of the first generator and the second generator as a target generator when the first difference value and the second difference value are smaller than the first preset threshold value and the first euclidean distance and the second euclidean distance are smaller than the second preset threshold value, where the target generator is configured to generate an image with similarity equal to the similarity coefficient.
CN202211619479.8A 2022-12-16 2022-12-16 Method for generating training, generating sample pair, searching model training and trademark searching Active CN116244458B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211619479.8A CN116244458B (en) 2022-12-16 2022-12-16 Method for generating training, generating sample pair, searching model training and trademark searching


Publications (2)

Publication Number Publication Date
CN116244458A CN116244458A (en) 2023-06-09
CN116244458B true CN116244458B (en) 2023-08-25

Family

ID=86623222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211619479.8A Active CN116244458B (en) 2022-12-16 2022-12-16 Method for generating training, generating sample pair, searching model training and trademark searching

Country Status (1)

Country Link
CN (1) CN116244458B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110110116A (en) * 2019-04-02 2019-08-09 浙江工业大学 A kind of trademark image retrieval method for integrating depth convolutional network and semantic analysis
CN112258381A (en) * 2020-09-29 2021-01-22 北京达佳互联信息技术有限公司 Model training method, image processing method, device, equipment and storage medium
CN114943655A (en) * 2022-05-19 2022-08-26 上海应用技术大学 Image restoration system for generating confrontation network structure based on cyclic depth convolution
CN115311526A (en) * 2022-10-11 2022-11-08 江苏智云天工科技有限公司 Defect sample generation method and system based on improved Cycle GAN network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109902665A (en) * 2019-03-28 2019-06-18 北京达佳互联信息技术有限公司 Similar face retrieval method, apparatus and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Unpaired image-to-image translation based on an improved CycleGAN model; He Jianhua; Long Faning; Zhu Xiaoshu; Journal of Yulin Normal University (Issue 02); full text *

Also Published As

Publication number Publication date
CN116244458A (en) 2023-06-09

Similar Documents

Publication Publication Date Title
Zhu et al. Cms-rcnn: contextual multi-scale region-based cnn for unconstrained face detection
Akhtar et al. Attack to fool and explain deep networks
CN109492610B (en) Pedestrian re-identification method and device and readable storage medium
CN114972016A (en) Image processing method, image processing apparatus, computer device, storage medium, and program product
Shen et al. MCCG: A ConvNeXt-based Multiple-Classifier Method for Cross-view Geo-localization
Diwan et al. Unveiling Copy-Move Forgeries: Enhancing Detection With SuperPoint Keypoint Architecture
Ganapathi et al. Learning to localize image forgery using end-to-end attention network
CN116244458B (en) Method for generating training, generating sample pair, searching model training and trademark searching
Özyurt et al. A new method for classification of images using convolutional neural network based on Dwt-Svd perceptual hash function
Usmani et al. Efficient deepfake detection using shallow vision transformer
ElAdel et al. Fast beta wavelet network-based feature extraction for image copy detection
Bhoir et al. A decision-making tool for creating and identifying face sketches
CN114627531A (en) Face recognition method based on face reconstruction and Gabor occlusion dictionary
Nawaz et al. Faceswap based deepfakes detection.
Mawgoud et al. Localization of facial images manipulation in digital forensics via convolutional neural networks
Arıcan et al. Object Detection With RGB-D Data Using Depth Oriented Gradients
Das et al. Image splicing detection using feature based machine learning methods and deep learning mechanisms
Mei et al. Unsupervised Point Cloud Representation Learning by Clustering and Neural Rendering
Chougala et al. A REVIEW ON COPY MOVE FORGERY DETECTION IN DOCUMENT IMAGES
Zhu et al. Deep Structural Feature Learning: Re-Identification of simailar vehicles In Structure-Aware Map Space
Blümer et al. Detection of deepfakes using background-matching
US20240104893A1 (en) Multiple hypothesis transformation matching for robust verification of object identification
Sun et al. An ontology-based hybrid methodology for image synthesis and identification with convex objects
Archana et al. Image forgery detection in forensic science using optimization based deep learning models
Poornima et al. Face Sketch Recognition Using Image Synthesis And Deep Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: Li Jianwu

Inventor after: Hu Wenyu

Inventor after: Han Ronghua

Inventor after: Zhou Jingzhi

Inventor after: Hu Yunjiao

Inventor before: Li Jianwu

Inventor before: Hu Wenyu

Inventor before: Han Ronghua

Inventor before: Zhou Jingzhi

Inventor before: Hu Yunjiao
