CN113313192A - Image similarity judgment method and system - Google Patents

Info

Publication number
CN113313192A
Authority
CN
China
Prior art keywords
similarity
images
calculated
image group
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110660611.9A
Other languages
Chinese (zh)
Inventor
陈思宇 (Chen Siyu)
王铭宇 (Wang Mingyu)
王雷 (Wang Lei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Star Innovation Technology Co., Ltd.
Original Assignee
Chengdu Star Innovation Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Chengdu Star Innovation Technology Co., Ltd.
Priority to CN202110660611.9A
Publication of CN113313192A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/08: Detecting or categorising vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image similarity judgment method comprising the following steps: inputting the images of an image group whose similarity is to be calculated into a trained first discrimination model to obtain a first similarity evaluation value of the images in the image group; processing the images of the image group through a second model to obtain a second similarity evaluation value of the images; and inputting the first similarity evaluation value and the second similarity evaluation value into a trained evaluation model to obtain similarity evaluation data of the images in the image group, so that the similarity of the articles in the images can be accurately evaluated.

Description

Image similarity judgment method and system
Technical Field
The invention relates to the technical field of image detection, in particular to an image similarity judgment method and system.
Background
With the rapid increase in the number of vehicles traveling on urban roads, vehicle detection has become an important means of supervising vehicle operation and safeguarding urban road traffic safety. Judging the similarity of vehicles is a key problem in accurately detecting and tracking the same vehicle.
Disclosure of Invention
An aspect of an embodiment of the present specification provides an image similarity determination method, including: inputting images in an image group with similarity to be calculated into a trained first discrimination model to obtain a first similarity evaluation value of the images in the image group with the similarity to be calculated; processing the images of the image group with the similarity to be calculated through a second model to obtain a second similarity evaluation value of the images in the image group with the similarity to be calculated; and inputting the first similarity evaluation value and the second similarity evaluation value into a trained evaluation model to obtain similarity evaluation data of the images in the image group with the similarity to be calculated.
An aspect of an embodiment of the present specification provides an image similarity determination system, including: the first prediction module is used for inputting the images in the image group with the similarity to be calculated into a trained first discrimination model to obtain a first similarity evaluation value of the images in the image group with the similarity to be calculated; the second prediction module is used for processing the images of the image group with the similarity to be calculated through a second model to obtain a second similarity evaluation value of the images in the image group with the similarity to be calculated; and the judging module is used for inputting the first similarity evaluation value and the second similarity evaluation value into a trained evaluation model to obtain similarity evaluation data of the images in the image group with the similarity to be calculated.
An aspect of embodiments of the present specification provides an image similarity determination apparatus, including at least one storage medium for storing computer instructions and at least one processor; the at least one processor is used for executing the computer instructions to realize the operation corresponding to the image similarity judgment method.
An aspect of the embodiments of the present specification provides a computer-readable storage medium, where the storage medium stores computer instructions, and when a computer reads the computer instructions in the storage medium, the computer implements the image similarity determination method.
Drawings
The present description is further illustrated by way of exemplary embodiments, which are described in detail with reference to the accompanying drawings. These embodiments are not intended to be limiting, and in these embodiments like numerals indicate like structures, wherein:
FIG. 1 is a schematic diagram of an application scenario of an image similarity determination system according to some embodiments of the present application;
FIG. 2 is a block diagram of an image similarity determination system according to some embodiments of the present description;
FIG. 3 is a flow diagram of an image similarity determination method according to some embodiments of the present description;
FIG. 4 is a schematic diagram of a structure of a first discrimination model shown in accordance with some embodiments of the present description;
FIG. 5 is a schematic diagram of a second model process flow shown in accordance with some embodiments of the present description;
FIG. 6 is a schematic flow diagram of an image similarity determination method shown in accordance with some embodiments of the present description.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings used in the description of the embodiments will be briefly described below. It is obvious that the drawings in the following description are only examples or embodiments of the present description, and that for a person skilled in the art, the present description can also be applied to other similar scenarios on the basis of these drawings without inventive effort. Unless otherwise apparent from the context, or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
It should be understood that "system", "device", "unit" and/or "module" as used in this specification is a method for distinguishing different components, elements, parts or assemblies at different levels. However, other words may be substituted by other expressions if they accomplish the same purpose.
As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; the steps and elements do not form an exclusive list, and a method or apparatus may also include other steps or elements.
Flow charts are used in this description to illustrate operations performed by a system according to embodiments of the present description. It should be understood that the operations are not necessarily performed in the exact order shown. Rather, the various steps may be processed in reverse order or simultaneously. Other operations may also be added to the processes, or one or more steps may be removed from them.
Embodiments of the present application may be applied to the identification of various static or dynamic items, such as vehicles, including but not limited to bicycles, electric vehicles, automobiles, motorcycles, airplanes, and the like. The method and the device can evaluate the degree of similarity among articles according to image information of the articles, thereby enabling judgments about the articles. It should be understood that the application scenarios of the system and method of the present application are merely examples or embodiments, and those skilled in the art can also apply the present application to other similar scenarios without inventive effort. Although the present application is described primarily in the context of vehicles, and particularly automobiles, it should be noted that its principles are applicable to other articles; the status, presence or absence, and location of an article may also be determined in accordance with these principles.
In the present application, the determination of the vehicle similarity is merely an example. It should be noted that detecting similarity between two vehicles is for illustrative purposes only and is not intended to limit the scope of the present application. In some embodiments, the present disclosure may be applied to other similar scenarios, such as, but not limited to, identification of products, and the like.
As shown in fig. 1, an application scenario 100 of the image similarity determination system according to the present specification may include a computing system 110, a network 120, a storage device 130, and a user terminal 140.
The computing system 110 may be used to determine the similarity of the images within a group of images whose similarity is to be calculated, thereby determining whether the items in the images are the same item. In some embodiments, it may specifically compare two vehicle pictures to determine whether the vehicles in the pictures are the same, so as to implement vehicle monitoring; this monitoring technique may be applied by, for example, vehicle regulatory departments and traffic management departments. The computing system 110 may determine the similarity of the vehicles based on the acquired data.
Computing system 110 refers to a system having computing capabilities. In some embodiments, computing system 110 may be a single server or a group of servers. The group of servers can be centralized or distributed (e.g., computing system 110 can be a distributed system). In some embodiments, the computing system 110 may be local or remote. For example, computing system 110 may access information and/or data stored in user terminal 140 and/or storage device 130 via network 120. Alternatively, computing system 110 may be directly connected to user terminal 140 and/or storage device 130 to access stored information and/or data. In some embodiments, the computing system 110 may be implemented on a cloud platform. By way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an internal cloud, a multi-tiered cloud, and the like, or any combination thereof. In some embodiments, computing system 110 may execute on a computing device that includes one or more components.
In some embodiments, computing system 110 may include a processing device 112. Processing device 112 may process information and/or data to perform one or more functions described herein. For example, the processing device 112 may acquire data of images within a group of images for which a similarity is to be calculated. The processing device 112 may process the acquired data to determine whether images in the image group whose similarity is to be calculated are similar.
In some embodiments, processing device 112 may execute program instructions. The processing device 112 may include various conventional general-purpose central processing units (CPUs), graphics processing units (GPUs), microprocessors, application-specific integrated circuits (ASICs), or other types of integrated circuits.
Network 120 may facilitate the exchange of information and/or data. In some embodiments, computing system 110 may obtain image data of a group of images whose similarity is to be calculated from storage device 130 and/or user terminal 140 via network 120. In some embodiments, the computing system 110 may communicate image similarity data of that group of images to the storage device 130 and/or the user terminal 140 via the network 120. In some embodiments, the network 120 may be a wired network or a wireless network, or any combination thereof. By way of example only, network 120 may include a cable network, a wired network, a fiber optic network, a telecommunications network, an intranet, the internet, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), a metropolitan area network (MAN), a public switched telephone network (PSTN), a Bluetooth network, a ZigBee network, a near field communication (NFC) network, the like, or any combination of the above. In some embodiments, network 120 may include one or more network access points. For example, network 120 may include wired or wireless network access points, such as base stations and/or internet exchange points 120-1, 120-2.
In some embodiments, the user terminal 140 may belong to a user related to the item in the image (e.g., the owner of the vehicle, vehicle regulatory personnel, or traffic management personnel). The user terminal 140 may include a mobile device 140-1, a tablet 140-2, a laptop computer 140-3, etc., or a combination thereof. In some embodiments, mobile device 140-1 may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, and the like, or any combination thereof. In some embodiments, the smart home devices may include smart lighting devices, smart appliance control devices, smart monitoring devices, smart televisions, smart cameras, interphones, and the like, or any combination thereof. In some embodiments, the wearable device may include a smart bracelet, smart footwear, smart glasses, smart helmet, smart watch, smart apparel, smart backpack, smart accessory, or the like, or any combination thereof. In some embodiments, the smart mobile device may include a smart phone, a personal digital assistant (PDA), a gaming device, a navigation device, a POS machine, or the like, or any combination thereof. In some embodiments, the virtual reality device and/or the augmented reality device may include a virtual reality helmet, virtual reality glasses, virtual reality eyeshields, an augmented reality helmet, augmented reality glasses, augmented reality eyeshields, and the like, or any combination thereof. For example, the virtual reality device and/or augmented reality device may include Google Glass, Oculus Rift, Gear VR, and the like.
In some embodiments, the user terminal 140 may send image data of the image group whose similarity is to be calculated to the computing system 110. In some embodiments, the user terminal 140 may acquire the image similarity data or determination results for that image group from the computing system 110.
Storage device 130 may store data and/or instructions. In some embodiments, storage device 130 may store data obtained from user terminal 140 and/or computing system 110. For example, the storage device 130 may store image data of a group of images whose similarity is to be calculated, obtained from the user terminal 140. In some embodiments, storage device 130 may store data and/or instructions that computing system 110 may execute or use to perform the example methods described herein. For example, the storage device 130 may store instructions that the processing device 112 may execute to predict whether the images within a group of images whose similarity is to be calculated are similar. In some embodiments, storage device 130 may include mass storage, removable storage, volatile read-write memory, read-only memory (ROM), and the like, or any combination thereof. Exemplary mass storage may include magnetic disks, optical disks, solid state drives, and the like. Exemplary removable memories may include flash drives, floppy disks, optical disks, memory cards, compact disks, magnetic tape, and the like. Exemplary volatile read-write memories may include random access memory (RAM). Exemplary random access memories may include dynamic random access memory (DRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), static random access memory (SRAM), thyristor random access memory (T-RAM), zero-capacitance random access memory (Z-RAM), and the like. Exemplary read-only memories may include mask read-only memory (MROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM), digital versatile disc read-only memory, and the like. In some embodiments, the storage device 130 may be implemented on a cloud platform. By way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an internal cloud, a multi-tiered cloud, and the like, or any combination thereof.
In some embodiments, storage device 130 may be connected to network 120 to communicate with computing system 110 and/or user terminal 140. Computing system 110 may access data or instructions stored in storage device 130 via network 120. In some embodiments, storage device 130 may be directly connected to or in communication with computing system 110. In some embodiments, storage device 130 may be part of computing system 110.
Fig. 2 is a block diagram of an image similarity determination system according to some embodiments of the present description.
As shown in fig. 2, the image similarity determination system may include a first prediction module 210, a second prediction module 220, and a determination module 230.
In some embodiments, the first prediction module 210 is configured to input images in an image group with a similarity to be calculated into a trained first discriminant model, so as to obtain a first similarity evaluation value of the images in the image group with the similarity to be calculated.
In some embodiments, the second prediction module 220 is configured to process the images of the image group with the similarity to be calculated through a second model, so as to obtain a second similarity evaluation value of the images in the image group with the similarity to be calculated.
In some embodiments, the determination module 230 is configured to input the first similarity evaluation value and the second similarity evaluation value into a trained evaluation model, so as to obtain similarity evaluation data of the images in the image group with the similarity to be calculated.
In some embodiments, the image similarity determination system may further include a training module 240, where the training module 240 is configured to train the constructed first discrimination model, second model, and evaluation model according to the acquired data, so as to obtain the trained first discrimination model, second model, and evaluation model.
In some embodiments, the first discrimination model, the second model, and the evaluation model may each be obtained by training on their respective training samples.
In some embodiments, the training sample images may include existing vehicle images. The existing vehicle images can be acquired in various manners, such as vehicle images captured by driving recorders, vehicle images uploaded by users, vehicle images acquired by electronic monitoring devices, and the like. In some embodiments, data enhancement may be performed on existing vehicle images to increase the number of sample images. Methods of data enhancement include, but are not limited to, flipping, rotating, scaling, cropping, translating, adding noise, and the like. In some embodiments, the state data of a sample image may be labeled, which may be done manually or by a computer program; for example, the score of a vehicle may be tallied by the user based on historical data. For example only, the model may be trained with the sample image as input and the corresponding vehicle state as the correct criterion (Ground Truth), while the model parameters are adjusted in reverse based on the difference between the predicted output of the model (e.g., the predicted vehicle state) and the correct criterion. When a predetermined condition is satisfied, for example, the number of training sample images reaches a predetermined number, the prediction accuracy of the model exceeds a predetermined accuracy threshold, or the value of the loss function (Loss Function) falls below a predetermined value, the training process is stopped and the trained model is obtained. For more details of the first discrimination model and the second model in this specification, refer to fig. 4 and 5; they are not repeated here.
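By way of a hedged illustration only, the generic training procedure described above might be sketched as follows; the PyTorch framing, the loader of (image, label) batches, the learning rate, and the stopping threshold are all illustrative assumptions rather than details fixed by this specification.

```python
import torch
import torch.nn as nn

def train_model(model, loader, max_epochs=20, acc_threshold=0.95):
    """Minimal sketch: predict, compare with the correct criterion
    (Ground Truth), back-propagate the difference, and stop once a
    predetermined accuracy condition is satisfied."""
    criterion = nn.BCELoss()                        # two-class cross-entropy
    optimizer = torch.optim.Adam(model.parameters(),
                                 lr=1e-3, weight_decay=1e-4)  # L2 regularization
    for epoch in range(max_epochs):
        correct, total = 0, 0
        for images, labels in loader:               # labels: 1 = positive sample
            optimizer.zero_grad()
            pred = model(images)                    # predicted probability in [0, 1]
            loss = criterion(pred, labels.float())
            loss.backward()                         # reverse-adjust the parameters
            optimizer.step()
            correct += ((pred > 0.5) == labels.bool()).sum().item()
            total += labels.numel()
        if correct / total > acc_threshold:         # predetermined condition met
            break
    return model
```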
It should be understood that the system and its modules shown in FIG. 2 may be implemented in a variety of ways. For example, in some embodiments, the system and its modules may be implemented in hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portion may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the methods and systems described above may be implemented using computer-executable instructions and/or embodied in processor control code, such code being provided, for example, on a carrier medium such as a diskette, CD- or DVD-ROM, a programmable memory such as read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The system and its modules in this specification may be implemented not only by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field programmable gate arrays and programmable logic devices, but also by software executed by various types of processors, or by a combination of the above hardware circuits and software (e.g., firmware).
It should be noted that the above description of the image similarity determination system 200 and its modules is only for convenience of description and does not limit this specification to the scope of the illustrated embodiments. It will be appreciated by those skilled in the art that, given the teachings of the present system, modules may be combined arbitrarily or connected to other modules as sub-systems without departing from such teachings. For example, the first prediction module 210, the second prediction module 220, the determination module 230, and the training module 240 in fig. 2 may be different modules in one system, or a single module may implement the functions of two or more of them. As another example, the modules in the image similarity determination system 200 may share one storage module, or each module may have its own storage module. Such variations are within the scope of the present disclosure.
Fig. 3 is a flowchart of an image similarity determination method according to some embodiments of the present description. As shown in fig. 3, the process 300 may include the following steps:
step 310, inputting the images in the image group with the similarity to be calculated into the trained first discrimination model, and obtaining a first similarity evaluation value of the images in the image group with the similarity to be calculated.
In particular, this step may be performed by the first prediction module 210.
In some embodiments, the images in the image group whose similarity is to be calculated may be captured and uploaded by a dedicated person. In some embodiments, obtaining the images includes capturing a vehicle video and extracting vehicle pictures through video segmentation. In some embodiments, the images of the vehicle may be captured directly by a monitoring device, a camera device, or the like.
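As a minimal sketch of the video-segmentation route mentioned above, the following snippet samples still pictures from a captured vehicle video with OpenCV; the file paths and the sampling stride are hypothetical.

```python
import cv2  # OpenCV

def extract_frames(video_path, out_prefix, stride=30):
    """Segment a captured vehicle video into still pictures by saving
    every `stride`-th frame."""
    cap = cv2.VideoCapture(video_path)
    index = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:                                   # end of video
            break
        if index % stride == 0:
            cv2.imwrite(f"{out_prefix}_{saved:05d}.jpg", frame)
            saved += 1
        index += 1
    cap.release()
    return saved

# e.g. extract_frames("vehicle.mp4", "frames/car")  # hypothetical paths
```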
For a detailed description of the first discrimination model, reference is made to the related content of fig. 4, which is not repeated here.
And 320, processing the images of the image group with the similarity to be calculated through a second model to obtain a second similarity evaluation value of the images in the image group with the similarity to be calculated.
In particular, this step may be performed by the second prediction module 220.
To ensure the accuracy of the detection result, in some embodiments, the acquired data is further processed by a second model to obtain a second similarity evaluation value of the images in the image group with the similarity to be calculated.
For a detailed description of the second model, reference is made to the related content of fig. 5, which is not repeated here.
And 330, inputting the first similarity evaluation value and the second similarity evaluation value into a trained evaluation model to obtain similarity evaluation data of the images in the image group with the similarity to be calculated.
In particular, this step may be performed by the determination module 230.
In some embodiments, the accuracy of the determination may be further improved by combining the first similarity evaluation value and the second similarity evaluation value to obtain a final similarity determination result.
For a detailed description of the evaluation model, reference is made to the relevant contents of fig. 6, which are not described herein again.
Fig. 4 is a schematic structural diagram of a first discriminant model according to some embodiments of the present disclosure.
In some embodiments, the first discrimination model includes several convolutional neural network models and a fully connected layer, and the number of convolutional neural network models is not less than the number of images in the image group with the similarity to be calculated.
Correspondingly, inputting the images in the image group with the similarity to be calculated into the trained first discrimination model to obtain the first similarity evaluation value specifically includes: inputting the images in the image group with the similarity to be calculated into the convolutional neural network models, respectively;
preprocessing the feature vectors output by each convolutional neural network model and inputting the preprocessed result into the fully connected layer;
and acquiring the first similarity evaluation value output by the fully connected layer.
As shown in fig. 4, in some embodiments, the convolutional neural network model of the first discrimination model may be constructed from two or more convolutional neural networks with the same structure, and the number of constructed networks may equal the number of images to be determined.
For example, when two images are to be determined, a twin (Siamese) network may be constructed from two convolutional neural networks with the same structure, with weights shared between the two branches. After the networks are trained, the two images are used as the inputs of the two branches, each yielding a feature vector after computation; the similarity of the features is then measured by their distance and used as the basis for judging whether the two images are similar.
In some embodiments, the constructed convolutional neural network is formed by alternately arranging a plurality of first convolutional network blocks and a plurality of second convolutional network blocks, wherein each first convolutional network block sequentially comprises a convolutional layer, a pooling layer, and an activation layer, and each second convolutional network block sequentially comprises a convolutional layer and an activation layer.
In some embodiments, each convolutional neural network of the first discrimination model may output a corresponding feature value based on the input vehicle image data; the feature value may be in vector form or matrix form, which is not limited in this specification.
In some embodiments, the features output by the convolutional neural network models require preprocessing before being input into the fully connected layer. In some embodiments, the preprocessing subtracts the two feature values to obtain a feature difference value; the feature difference value is then input into the fully connected layer to obtain the output first similarity evaluation value.
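For concreteness, a minimal sketch of this first discrimination model is given below: one shared-weight convolutional backbone (first-type and second-type blocks arranged alternately) is applied to both images, the two feature vectors are subtracted, and the difference is fed through a fully connected layer to produce the first similarity evaluation value. The channel widths, kernel sizes, and block count are illustrative assumptions, not values fixed by the specification.

```python
import torch.nn as nn

class FirstBlock(nn.Sequential):
    """First-type block: convolutional layer, pooling layer, activation layer."""
    def __init__(self, c_in, c_out):
        super().__init__(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.MaxPool2d(2),
                         nn.ReLU())

class SecondBlock(nn.Sequential):
    """Second-type block: convolutional layer, activation layer."""
    def __init__(self, c_in, c_out):
        super().__init__(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.ReLU())

class TwinDiscriminationModel(nn.Module):
    """Twin (Siamese) first discrimination model with weight sharing."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(                # alternating block types
            FirstBlock(3, 32), SecondBlock(32, 32),
            FirstBlock(32, 64), SecondBlock(64, 64),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.fc = nn.Sequential(nn.Linear(64, 1), nn.Sigmoid())

    def forward(self, img_a, img_b):
        feat_a = self.backbone(img_a)    # the same backbone (shared weights)
        feat_b = self.backbone(img_b)    # is applied to both images
        diff = feat_a - feat_b           # preprocessing: subtract the feature vectors
        return self.fc(diff).squeeze(1)  # first similarity evaluation value
```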
The training of the first discrimination model may be implemented by the training module 240, and the first discrimination model may be obtained by training on historical vehicle-similarity data. For example only, the model may be trained with historical basic information as input and the appropriate similarity value corresponding to that information as the correct criterion (Ground Truth), while the model parameters are adjusted in reverse according to the difference between the prediction output of the model and the correct criterion. When a predetermined condition is satisfied, for example, the number of training samples reaches a predetermined number, the prediction accuracy of the model exceeds a predetermined accuracy threshold, or the loss function (Loss Function) value falls below a predetermined value, the training process is stopped and the trained model is designated as the first discrimination model.
It should be noted that, in some embodiments, the constructed convolutional neural networks have the same structure and weight parameters, so only one of the networks needs to be trained. Two-class cross-entropy loss can be used as the optimization target during training, with regularization used to alleviate overfitting. The loss is defined as follows:
$$L = -\frac{1}{n}\sum_{i=1}^{n}\left[\, p_i \log q_i + (1 - p_i)\log(1 - q_i) \,\right]$$
where n represents the number of training pictures, p_i represents the true probability (label) of the i-th sample, and q_i represents the predicted probability for that sample.
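For reference, this two-class cross entropy can be evaluated directly; the small sketch below assumes hard labels p_i in {0, 1} and predicted probabilities q_i.

```python
import numpy as np

def two_class_cross_entropy(p, q, eps=1e-12):
    """Cross entropy over n training pictures; p holds the true labels
    (0 or 1) and q the predicted probabilities."""
    p = np.asarray(p, dtype=float)
    q = np.clip(np.asarray(q, dtype=float), eps, 1.0 - eps)  # guard log(0)
    return -np.mean(p * np.log(q) + (1.0 - p) * np.log(1.0 - q))

print(two_class_cross_entropy([1, 0, 1], [0.9, 0.2, 0.6]))  # ~0.28
```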
FIG. 5 is a schematic diagram illustrating a second model process flow according to some embodiments of the present description.
In some embodiments, the second model processes the images as follows:
reducing each image of the image group with the similarity to be calculated to 1/n of its original size;
converting the reduced images into grayscale images;
calculating the mask mean of each grayscale image and acquiring its pixel values;
processing the pixel values based on a preset rule to obtain a processing result;
and performing a hash calculation on the processing result, taking the obtained hash value as the second similarity evaluation value.
The training of the second model may be performed by the training module 240, and the second model may be obtained from historical vehicle-similarity data. For example only, the model may be trained with historical basic information as input and the appropriate similarity value corresponding to that information as the correct criterion (Ground Truth), while the model parameters are adjusted in reverse according to the difference between the prediction output of the model and the correct criterion. When a predetermined condition is met, for example, the number of training samples reaches a predetermined number, the prediction accuracy of the model exceeds a predetermined accuracy threshold, or the loss function (Loss Function) value falls below a predetermined value, the training process is stopped and the trained model is designated as the second model.
In some embodiments, the second model uses perceptual hashing to compensate for false detections by the twin network and to improve the accuracy of similarity detection. Perceptual hashing mainly exploits the low-frequency information in picture pixels. In some embodiments, differences caused by pictures of different proportions and sizes are discarded by first reducing the pictures to 1/8 of their original size; the image is then converted to grayscale; the mask mean is then calculated and the image pixels are filtered (pixel values greater than the mean are set to 1, otherwise 0); finally, a hash comparison value is calculated and used as the second similarity evaluation value.
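A minimal sketch of this perceptual-hash pipeline follows; the 8x8 thumbnail size and the use of the fraction of matching hash bits as the second similarity evaluation value are common choices assumed here, not values fixed by the specification.

```python
import numpy as np
from PIL import Image

def perceptual_hash(path, size=8):
    """Reduce the picture, convert it to grayscale, and threshold each
    pixel at the mask mean (greater than the mean -> 1, otherwise 0)."""
    img = Image.open(path).convert("L").resize((size, size))
    pixels = np.asarray(img, dtype=np.float32)
    return (pixels > pixels.mean()).flatten()         # 64-bit hash as a bit vector

def second_similarity(path_a, path_b):
    """Second similarity evaluation value from the hash comparison."""
    ha, hb = perceptual_hash(path_a), perceptual_hash(path_b)
    return float(np.count_nonzero(ha == hb)) / ha.size

# e.g. second_similarity("car_view1.jpg", "car_view2.jpg")  # near 1.0 when similar
```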
Fig. 6 is a schematic flow diagram illustrating an image similarity determination method according to some embodiments of the present description.
In some embodiments, the evaluation model processes the input data as follows:
judging whether the first similarity evaluation value satisfies a preset relationship with a first threshold;
judging whether the second similarity evaluation value satisfies a preset relationship with a second threshold;
and if both preset relationships are satisfied, determining that the images in the image group with the similarity to be calculated are similar.
In some embodiments, the preset relationship that the first similarity evaluation value needs to satisfy with the first threshold may be that the first similarity evaluation value is not less than the first threshold, and in some embodiments, the first threshold may be 0.7.
In some embodiments, the preset relationship that the second similarity evaluation value needs to satisfy with the second threshold may be that the second similarity evaluation value is not less than the second threshold, and in some embodiments, the second threshold may be 0.9.
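Putting the two preset relations together, the evaluation step reduces to a pair of threshold comparisons, as in the sketch below, which uses the example thresholds of 0.7 and 0.9 given above.

```python
def images_similar(first_value, second_value,
                   first_threshold=0.7, second_threshold=0.9):
    """The images are judged similar only when both similarity
    evaluation values are not less than their respective thresholds."""
    return first_value >= first_threshold and second_value >= second_threshold
```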
It should be noted that although the first discrimination model, the second model, and the evaluation model are described separately above, in some embodiments at least two of them may be combined into one model, which comprehensively determines vehicle similarity from different vehicle image data and outputs the similarity determination result. For example, the combined model takes images of a plurality of vehicles as input and outputs the similarity of the vehicles in those images or the similarity determination result. The process of model training and the process of determining vehicle similarity may be performed separately. In some embodiments, the training process may be performed on the computing system 110, or on another device, with the trained model then deployed to the computing system 110. In some embodiments, the vehicle similarities determined from various different kinds of data reflecting vehicle similarity may also be integrated. For example, the similarities determined from the different kinds of vehicle image data may each be given a corresponding weight, and the vehicle similarities output by the different models may then be weighted and summed to obtain the final vehicle similarity.
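Where the outputs of several models are integrated as described, the weighted summation might look like the following sketch; the example scores and weights are purely illustrative.

```python
def fused_similarity(model_scores, weights):
    """Weighted summation of the similarities produced by different models."""
    assert len(model_scores) == len(weights) and sum(weights) > 0
    return sum(s * w for s, w in zip(model_scores, weights)) / sum(weights)

# e.g. fused_similarity([0.82, 0.95], [0.6, 0.4])  # illustrative weights
```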
The similarity calculation method of the embodiments of the present specification has beneficial effects including, but not limited to, the following: 1. A supervised deep learning algorithm serves as the main feature extractor; the excellent robustness of deep learning allows the method to adapt to the complex and changeable environments of real scenes. 2. Pixel changes are not used as the basis for judging vehicle similarity; instead, the similarity problem is converted into a feature-distance measurement problem, which removes, to the greatest extent, the influence of changes in vehicle angle and position. 3. The twin neural network frees the detection process from dependence on a template database, reducing the resource consumption of building one: the two vehicle images to be distinguished are simply passed through the same feature extraction network, and similarity is then judged from the distance between their features.
In practice, the proposed scheme was tested with both moving and stationary vehicles. Analysis of the test results shows that when the vehicle does not move, the error rate of the scheme does not exceed 2.3%, indicating that the features extracted by the twin network are effective and can be used to distinguish different vehicles. When movement occurs, the accuracy of the scheme reaches 100%, showing that deep learning can overcome the inability of traditional methods to cope with pixel offsets and complex environments.
The embodiment of the present specification further provides an image similarity determination apparatus, including at least one storage medium and at least one processor, where the at least one storage medium is used to store computer instructions; the at least one processor is configured to execute the foregoing image similarity determination method, which includes: inputting images in an image group with similarity to be calculated into a trained first discrimination model to obtain a first similarity evaluation value of the images in the image group with the similarity to be calculated; processing the images of the image group with the similarity to be calculated through a second model to obtain a second similarity evaluation value of the images in the image group with the similarity to be calculated; and inputting the first similarity evaluation value and the second similarity evaluation value into a trained evaluation model to obtain similarity evaluation data of the images in the image group with the similarity to be calculated.
The embodiment of the specification also provides a computer-readable storage medium. The storage medium stores computer instructions, and after the computer instructions in the storage medium are read by a computer, the computer implements the aforementioned image similarity determination method, wherein the method includes: inputting images in an image group with similarity to be calculated into a trained first discrimination model to obtain a first similarity evaluation value of the images in the image group with the similarity to be calculated; processing the images of the image group with the similarity to be calculated through a second model to obtain a second similarity evaluation value of the images in the image group with the similarity to be calculated; and inputting the first similarity evaluation value and the second similarity evaluation value into a trained evaluation model to obtain similarity evaluation data of the images in the image group with the similarity to be calculated.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be regarded as illustrative only and not as limiting the present specification. Various modifications, improvements and adaptations to the present description may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present specification and thus fall within the spirit and scope of the exemplary embodiments of the present specification.
Also, this specification uses specific words to describe its embodiments. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the specification. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment," "one embodiment," or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, certain features, structures, or characteristics of one or more embodiments of the specification may be combined as appropriate.
Moreover, those skilled in the art will appreciate that aspects of the present description may be illustrated and described in terms of several patentable species or situations, including any new and useful combination of processes, machines, manufacture, or materials, or any new and useful improvement thereof. Accordingly, aspects of this description may be performed entirely by hardware, entirely by software (including firmware, resident software, micro-code, etc.), or by a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of the present description may be embodied as a computer product, including computer-readable program code, on one or more computer-readable media.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the operation of various portions of this specification may be written in any one or more programming languages, including object-oriented programming languages such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, and Python, conventional procedural programming languages such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, and ABAP, dynamic programming languages such as Python, Ruby, and Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or processing device. In the latter scenario, the remote computer may be connected to the user's computer through any network, such as a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet), or in a cloud computing environment, or as a service, such as software as a service (SaaS).
Additionally, the order in which the elements and sequences of the process are recited in the specification, the use of alphanumeric characters, or other designations, is not intended to limit the order in which the processes and methods of the specification occur, unless otherwise specified in the claims. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing processing device or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the present specification, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not to be interpreted as requiring more features than are expressly recited in each claim. Indeed, claimed embodiments may have fewer than all of the features of a single embodiment disclosed above.
Numerals describing the number of components, attributes, etc. are used in some embodiments, it being understood that such numerals used in the description of the embodiments are modified in some instances by the use of the modifier "about", "approximately" or "substantially". Unless otherwise indicated, "about", "approximately" or "substantially" indicates that the number allows a variation of ± 20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may vary depending upon the desired properties of the individual embodiments. In some embodiments, the numerical parameter should take into account the specified significant digits and employ a general digit preserving approach. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the range are approximations, in the specific examples, such numerical values are set forth as precisely as possible within the scope of the application.
For each patent, patent application publication, and other material, such as articles, books, specifications, publications, and documents, cited in this specification, the entire contents are hereby incorporated by reference. Application history documents that are inconsistent with or conflict with the contents of this specification are excluded, as are documents (currently or later appended to this specification) that limit the broadest scope of the claims of this specification. It should be understood that if the descriptions, definitions, and/or uses of terms in the accompanying materials of this specification are inconsistent with or conflict with those stated in this specification, the descriptions, definitions, and/or uses of terms in this specification shall control.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present disclosure. Other variations are also possible within the scope of the present description. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the specification can be considered consistent with the teachings of the specification. Accordingly, the embodiments of the present description are not limited to only those embodiments explicitly described and depicted herein.

Claims (10)

1. An image similarity judging method is characterized by comprising the following steps:
inputting images in an image group with similarity to be calculated into a trained first discrimination model to obtain a first similarity evaluation value of the images in the image group with the similarity to be calculated;
processing the images of the image group with the similarity to be calculated through a second model to obtain a second similarity evaluation value of the images in the image group with the similarity to be calculated;
and inputting the first similarity evaluation value and the second similarity evaluation value into a trained evaluation model to obtain similarity evaluation data of the images in the image group with the similarity to be calculated.
2. The method according to claim 1, wherein the first discriminant model comprises a plurality of convolutional neural network models and a full connection layer, and the number of the convolutional neural network models is not less than the number of images included in the image group with the similarity to be calculated.
3. The method according to claim 2, wherein the inputting the images in the image group with the similarity to be calculated into the trained first discriminant model to obtain the first similarity evaluation value of the images in the image group with the similarity to be calculated comprises:
respectively inputting the images in the image group with the similarity to be calculated into a convolutional neural network model;
preprocessing the characteristic vectors output by each convolution neural network model and inputting the preprocessed characteristic vectors into the full connection layer;
and acquiring the first similarity evaluation value output by the full connection layer.
4. The method of claim 2 or 3, wherein the weight matrices used within the plurality of convolutional neural network models are the same, and the structures of the plurality of convolutional neural network models are identical.
5. The method of claim 2 or 3, wherein the convolutional neural network model is composed of a plurality of first convolutional network blocks and a plurality of second convolutional network blocks arranged at intervals.
6. The method of claim 5, wherein the first convolutional network block sequentially comprises a convolutional layer, a pooling layer, and an activation layer, and wherein the second convolutional network block sequentially comprises a convolutional layer and an activation layer.
7. The method according to claim 1, wherein the subjecting the images in the image group with the similarity to be calculated to the second model processing to obtain a second similarity evaluation value of the images in the image group with the similarity to be calculated comprises:
respectively reducing the images of the image group with the similarity to be calculated to 1/n of the original size;
converting the reduced image into a gray image;
calculating a mask mean value of the gray level image and acquiring a pixel value of the gray level image;
processing the pixel value based on a preset rule to obtain a processing result;
and performing hash calculation on the processing result, and taking the obtained hash value as the second similarity evaluation value.
8. The method according to claim 1, wherein the inputting the first similarity evaluation value and the second similarity evaluation value into a trained evaluation model to obtain similarity evaluation data of the images in the image group with the similarity to be calculated comprises:
judging whether the first similarity evaluation value and a first threshold value meet a preset relation or not;
judging whether the second similarity evaluation value and a second threshold value meet a preset relation or not;
and if the preset relations are all satisfied, determining that the images in the image group with the similarity to be calculated are similar.
9. An image similarity determination system, comprising:
the first prediction module is used for inputting the images in the image group with the similarity to be calculated into a trained first discrimination model to obtain a first similarity evaluation value of the images in the image group with the similarity to be calculated;
the second prediction module is used for processing the images of the image group with the similarity to be calculated through a second model to obtain a second similarity evaluation value of the images in the image group with the similarity to be calculated;
and the judging module is used for inputting the first similarity evaluation value and the second similarity evaluation value into a trained evaluation model to obtain similarity evaluation data of the images in the image group with the similarity to be calculated.
10. An image similarity judging device comprises a processor and a memory; the memory is configured to store instructions, and the instructions, when executed by the processor, cause the apparatus to implement operations corresponding to the image similarity determination method according to any one of claims 1 to 8.
Application CN202110660611.9A, priority date 2021-06-15, filing date 2021-06-15: Image similarity judgment method and system. Status: Pending. Publication: CN113313192A (en).

Priority Applications (1)

Application Number: CN202110660611.9A; Priority Date: 2021-06-15; Filing Date: 2021-06-15; Title: Image similarity judgment method and system

Applications Claiming Priority (1)

Application Number: CN202110660611.9A; Priority Date: 2021-06-15; Filing Date: 2021-06-15; Title: Image similarity judgment method and system

Publications (1)

Publication Number: CN113313192A (en)
Publication Date: 2021-08-27

Family

Family ID: 77378728

Family Applications (1)

Application Number: CN202110660611.9A; Title: Image similarity judgment method and system; Status: Pending; Publication: CN113313192A (en)

Country Status (1)

Country: CN; Link: CN113313192A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114781496A (en) * 2022-04-01 2022-07-22 北京百度网讯科技有限公司 Optimizing sampling method and device and electronic equipment
CN114781496B (en) * 2022-04-01 2023-11-07 北京百度网讯科技有限公司 Optimizing sampling method and device and electronic equipment

Similar Documents

Publication Publication Date Title
CN111080628B (en) Image tampering detection method, apparatus, computer device and storage medium
CN111126258B (en) Image recognition method and related device
CN111178183B (en) Face detection method and related device
Tian et al. A dual neural network for object detection in UAV images
CN111860872A (en) System and method for anomaly detection
CN109460787B (en) Intrusion detection model establishing method and device and data processing equipment
CN111241989A (en) Image recognition method and device and electronic equipment
CN114897779A (en) Cervical cytology image abnormal area positioning method and device based on fusion attention
CN113469088A (en) SAR image ship target detection method and system in passive interference scene
CN112507922A (en) Face living body detection method and device, electronic equipment and storage medium
CN110555339A (en) target detection method, system, device and storage medium
CN111325265B (en) Detection method and device for tampered image
CN109635823A (en) The method and apparatus and engineering machinery of elevator disorder cable for identification
CN113408897A (en) Data resource sharing method applied to big data service and big data server
CN111259711A (en) Lip movement identification method and system
CN113971821A (en) Driver information determination method and device, terminal device and storage medium
CN111898693A (en) Visibility classification model training method, visibility estimation method and device
CN113313192A (en) Image similarity judgment method and system
CN111968154A (en) HOG-LBP and KCF fused pedestrian tracking method
CN111832463A (en) Deep learning-based traffic sign detection method
Zhang et al. A YOLOv3-Based Industrial Instrument Classification and Reading Recognition Method
CN116152542A (en) Training method, device, equipment and storage medium for image classification model
CN114155363A (en) Converter station vehicle identification method and device, computer equipment and storage medium
CN117150369B (en) Training method of overweight prediction model and electronic equipment
Gizatullin et al. Automatic car license plate detection based on the image weight model

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination