CN115661472A

CN115661472A - Image duplicate checking method and device, computer equipment and storage medium

Info

Publication number: CN115661472A
Application number: CN202211420321.8A
Authority: CN
Inventors: 叶向荣; 汪文娟
Original assignee: Ping An Property and Casualty Insurance Company of China Ltd
Current assignee: Ping An Property and Casualty Insurance Company of China Ltd
Priority date: 2022-11-15
Filing date: 2022-11-15
Publication date: 2023-01-31

Abstract

The embodiments of the application belong to the field of artificial intelligence and relate to an image duplicate checking method, which comprises the following steps: acquiring an image to be checked; performing feature extraction on the image to be checked based on a preset convolutional neural network to obtain a feature vector of the image to be checked; determining, based on a vector similarity search engine and from all images stored in a preset image database, an image set whose similarity to the feature vector of the image to be checked meets a preset condition; performing image registration on the image to be checked and the image set based on a scale-invariant feature transform algorithm to obtain a registration result of the image to be checked and the image set; and generating a duplicate checking result of the image to be checked based on the registration result. The application also provides an image duplicate checking device, computer equipment and a storage medium. In addition, the application relates to blockchain technology, and the duplicate checking result can be stored in a blockchain. The method and the device can accurately determine whether the image to be checked is a duplicate image, and effectively improve the precision and efficiency of image duplicate checking.

Description

Image duplicate checking method and device, computer equipment and storage medium

Technical Field

The present application relates to the field of artificial intelligence technologies, and in particular, to an image duplicate checking method and apparatus, a computer device, and a storage medium.

Background

In some business fields, such as insurance underwriting, a user may repeatedly apply for insurance using the same insurance product. To detect such fraudulent applications, the insurance images submitted by the user generally need to be checked for duplicates and reviewed. In the traditional method, reviewers search for similar images and inspect them visually. When the number of images is very large, this manual review is inefficient and inaccurate. Moreover, existing methods that rely on image search alone to find similar images cannot effectively guarantee the accuracy of image duplicate checking.

Disclosure of Invention

The embodiment of the application aims to provide an image duplicate checking method, an image duplicate checking device, computer equipment and a storage medium, so as to solve the problems of low efficiency and low accuracy of the existing image duplicate checking and auditing mode.

In order to solve the above technical problem, an embodiment of the present application provides an image duplicate checking method, which adopts the following technical scheme:

acquiring an image to be checked;

performing feature extraction on the image to be checked based on a preset convolutional neural network to obtain a feature vector of the image to be checked;

determining, based on a vector similarity search engine and from all images stored in a preset image database, an image set whose similarity to the feature vector of the image to be checked meets a preset condition;

performing image registration on the image to be checked and the image set based on a scale-invariant feature transform algorithm to obtain a registration result of the image to be checked and the image set;

and generating a duplicate checking result of the image to be checked based on the registration result.

Further, the step of determining, based on the vector similarity search engine and from all images stored in a preset image database, an image set whose similarity to the feature vector of the image to be checked meets a preset condition specifically includes:

generating, through the vector similarity search engine, a first similarity between each image in the image database and the image to be checked based on the Euclidean distance between the feature vector of that image and the feature vector of the image to be checked;

sorting all the first similarities in descending order of value to obtain a corresponding sorting result;

sequentially acquiring a preset number of target similarities, starting from the first one in the sorting result;

selecting, from all images in the image database, the target images corresponding to the target similarities;

and taking all the target images as the image set.

Further, the step of performing image registration on the image to be checked and the image set based on the scale-invariant feature transform algorithm to obtain a registration result of the image to be checked and the image set specifically includes:

performing feature extraction on the image to be checked based on the scale-invariant feature transform algorithm to obtain a first specified feature vector of the image to be checked;

performing feature extraction on all the images in the image set based on the scale-invariant feature transform algorithm to obtain second specified feature vectors in one-to-one correspondence with the images in the image set;

determining a second similarity between the image to be checked and each image in the image set based on the first specified feature vector and all the second specified feature vectors;

and generating a registration result of the image to be checked and the image set based on all the second similarities.

Further, the step of generating the registration result of the image to be checked and the image set based on all the second similarities specifically includes:

acquiring a preset similarity threshold;

comparing each second similarity with the similarity threshold;

if all the second similarities are smaller than the similarity threshold, generating a first registration result indicating that no similar image matching the image to be checked exists in the image set;

and if at least one of the second similarities is not smaller than the similarity threshold, generating a second registration result indicating that a similar image matching the image to be checked exists in the image set.
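
The threshold comparison above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `RegistrationResult` type, the function name and the use of string image identifiers are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RegistrationResult:
    """Hypothetical container for the registration result."""
    matched: bool            # True when at least one similar image exists
    matched_ids: List[str]   # images whose second similarity reaches the threshold


def register_by_threshold(second_similarities: Dict[str, float],
                          similarity_threshold: float) -> RegistrationResult:
    """Compare each second similarity with the preset threshold.

    A similarity that is "not smaller than" the threshold counts as a match,
    so the comparison is >= rather than >.
    """
    matched_ids = [image_id
                   for image_id, similarity in second_similarities.items()
                   if similarity >= similarity_threshold]
    return RegistrationResult(matched=bool(matched_ids), matched_ids=matched_ids)
```

An empty `matched_ids` list corresponds to the first registration result, and a non-empty list to the second.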

Further, the step of performing feature extraction on the image to be checked based on a preset convolutional neural network to obtain a feature vector of the image to be checked specifically includes:

performing image feature extraction on the image to be checked through the convolutional layer of the convolutional neural network to obtain corresponding initial features;

performing feature filtering processing on the initial features through a pooling layer of the convolutional neural network to obtain processed initial features;

and taking the processed initial features as feature vectors of the image to be checked.

Further, after the step of acquiring the image to be checked, the method further includes:

acquiring field acquisition parameter information of a target object corresponding to the image to be checked;

calculating and generating a self factor evaluation value and an environment factor evaluation value of the target object based on the field acquisition parameter information;

and generating an underwriting evaluation result of the target object based on the self factor evaluation value and the environment factor evaluation value.

Further, the step of generating an underwriting evaluation result of the target object based on the self-factor evaluation value and the environment factor evaluation value specifically includes:

acquiring a first preset weight corresponding to the self factor evaluation value; and

acquiring a second preset weight corresponding to the environmental factor evaluation value;

performing weighted calculation on the self factor evaluation value and the environment factor evaluation value based on the first preset weight and the second preset weight to generate an underwriting evaluation score corresponding to the target object;

and determining an underwriting evaluation result corresponding to the underwriting evaluation score based on a preset score evaluation table.
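
The weighted calculation and the table lookup can be illustrated with a short sketch. The weights, the score bands and the result labels used in the usage note are hypothetical values chosen for illustration; the source does not specify them.

```python
from typing import List, Tuple


def underwriting_score(self_value: float, env_value: float,
                       self_weight: float, env_weight: float) -> float:
    """Weighted sum of the self factor and environment factor evaluation values."""
    return self_weight * self_value + env_weight * env_value


def lookup_result(score: float, score_table: List[Tuple[float, str]]) -> str:
    """Map a score to a result using (minimum score, label) bands.

    The table layout is an assumption: the highest band whose minimum
    score does not exceed the score wins.
    """
    for min_score, label in sorted(score_table, reverse=True):
        if score >= min_score:
            return label
    raise ValueError("score is below every band in the score table")
```

For example, with the assumed weights 0.6 and 0.4 and the assumed table `[(80, "accept"), (60, "manual review"), (0, "decline")]`, evaluation values of 90 and 70 give a score of about 82, which falls in the "accept" band.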

In order to solve the above technical problem, an embodiment of the present application further provides an image duplicate checking device, which adopts the following technical solutions:

the first acquisition module is used for acquiring an image to be checked;

the extraction module is used for performing feature extraction on the image to be checked based on a preset convolutional neural network to obtain the feature vector of the image to be checked;

the determining module is used for determining, based on a vector similarity search engine and from all images stored in a preset image database, an image set whose similarity to the feature vector of the image to be checked meets a preset condition;

the processing module is used for performing image registration on the image to be checked and the image set based on a scale-invariant feature transform algorithm to obtain a registration result of the image to be checked and the image set;

and the first generation module is used for generating a duplicate checking result of the image to be checked based on the registration result.

In order to solve the above technical problem, an embodiment of the present application further provides a computer device, which adopts the following technical solutions:

acquiring an image to be checked;

In order to solve the foregoing technical problem, an embodiment of the present application further provides a computer-readable storage medium, which adopts the following technical solutions:

acquiring an image to be checked;

Compared with the prior art, the embodiment of the application mainly has the following beneficial effects:

After the image to be checked is acquired, feature extraction is performed on it based on a preset convolutional neural network to obtain the corresponding feature vector. A vector similarity search engine is then called to determine, from all images stored in a preset image database, an image set whose similarity to the feature vector of the image to be checked meets a preset condition; that is, a subset of images similar to the image to be checked is screened out. A registration result of the image to be checked and the screened image set is then determined based on a scale-invariant feature transform algorithm, and the duplicate checking result of the image to be checked is finally generated based on the registration result. Whether the image to be checked is a duplicate image can thus be determined accurately from the duplicate checking result, which effectively improves the precision and efficiency of image duplicate checking.

Drawings

In order to more clearly illustrate the solution of the present application, the drawings needed for describing the embodiments of the present application will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.

FIG. 1 is an exemplary system architecture diagram to which the present application may be applied;

FIG. 2 is a flow diagram of one embodiment of an image duplicate checking method according to the present application;

FIG. 3 is a schematic block diagram of an embodiment of an image duplicate checking device according to the present application;

FIG. 4 is a block diagram of one embodiment of a computer device according to the present application.

Detailed Description

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the foregoing drawings are used for distinguishing between different objects and not for describing a particular sequential order.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein may be combined with other embodiments.

In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.

As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. Network 104 is the medium used to provide communication links between terminal devices 101, 102, 103 and server 105. Network 104 may include various connection types, such as wired links, wireless communication links, or fiber optic cables.

The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages and the like. Various communication client applications, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103.

The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, desktop computers, and the like.

The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.

It should be noted that the image duplicate checking method provided in the embodiment of the present application is generally executed by a server/terminal device, and accordingly, the image duplicate checking apparatus is generally disposed in the server/terminal device.

It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

With continuing reference to FIG. 2, a flow diagram of one embodiment of an image duplication checking method according to the present application is shown. The image duplicate checking method comprises the following steps:

Step S201, acquiring an image to be checked.

In the present embodiment, the electronic device on which the image duplicate checking method runs (e.g., the server/terminal device shown in fig. 1) may acquire the image to be checked through a wired or wireless connection. It should be noted that the wireless connection may include, but is not limited to, a 3G/4G/5G connection, a WiFi connection, a Bluetooth connection, a WiMAX connection, a Zigbee connection, a UWB (ultra-wideband) connection, and other wireless connection means now known or developed in the future.

The image to be checked is an image on which a similar-image duplicate check needs to be performed; checking it makes it possible to confirm whether insurance fraud is involved. The image to be checked may be an image of the subject matter to be insured, entered by an insurance salesperson. For example, in a dairy cow breeding insurance scenario, the image to be checked is an image containing facial features of the cow, such as its facial organs and facial markings. In addition, if several images require a duplicate check, the above process can be executed for each of them, so that similar images are searched for in parallel or serially.

Step S202, performing feature extraction on the image to be checked based on a preset convolutional neural network to obtain the feature vector of the image to be checked.

In this embodiment, the convolutional neural network includes a convolutional layer and a pooling layer, and the convolutional layer of the convolutional neural network may be used to perform image feature extraction on the image to be checked, and then perform feature filtering processing on initial features output by the convolutional layer through the pooling layer of the convolutional neural network, so as to obtain a feature vector of the image to be checked.
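
As a rough illustration of the convolution-then-pooling flow described above, the following pure-Python sketch applies a set of kernels to a grayscale image and flattens the pooled maps into a feature vector. The kernel values would come from a trained network; the ones in the usage note are placeholders, and the function names are assumptions introduced here.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image without padding (cross-correlation,
    which is what deep-learning frameworks usually call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def max_pool(feature_map: Matrix, size: int = 2) -> Matrix:
    """Keep the maximum of each non-overlapping size x size window."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in cols]
            for r in rows]


def extract_feature_vector(image: Matrix, kernels: List[Matrix],
                           pool_size: int = 2) -> List[float]:
    """Convolve with every kernel, pool, and flatten into one vector."""
    features: List[float] = []
    for kernel in kernels:
        pooled = max_pool(conv2d_valid(image, kernel), pool_size)
        features.extend(value for row in pooled for value in row)
    return features
```

A 5x5 image and one 2x2 kernel produce a 4x4 convolution output, a 2x2 pooled map, and therefore a 4-element feature vector.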

Step S203, determining, based on the vector similarity search engine and from all images stored in a preset image database, an image set whose similarity to the feature vector of the image to be checked meets a preset condition.

In this embodiment, the vector similarity search engine is specifically Milvus, an open-source feature vector similarity search engine that is easy to use, reliable, scalable, stable and fast. Milvus handles massive vector data well; it integrates several widely known open-source libraries in the field of vector similarity computation (Faiss, SPTAG, etc.) and obtains optimal search performance by scheduling data and hardware computing power appropriately. Milvus uses GPUs (Nvidia) to accelerate indexing and querying, which can greatly improve single-machine performance. In addition to providing near-real-time vector search, Milvus can filter on scalar data. As data and query volumes grow, Milvus also provides a cluster sharding solution that supports read-write separation, horizontal scaling, dynamic capacity expansion and similar functions, enabling support for very large data scales. Furthermore, Milvus manages vector data efficiently, providing add, delete, update and query operations on both vector and non-vector data. The image database stores the image data used for duplicate checking and may be, for example, a database storing claim images.

Specifically, the vector similarity search engine Milvus determines the first similarity between each image in the image database and the image to be checked based on the Euclidean distance between their respective feature vectors, and then determines the image set whose first similarity meets the preset condition. The preset condition is not specifically limited and may be set according to actual requirements. For example, the image set may consist of a specified number of images with the highest first similarity, or of the images whose first similarity is not lower than a specified value. Neither the specified number nor the specified value is specifically limited; both can be set according to actual requirements. Because the image set is determined from the image database using the vector similarity search engine, the duplicate checking result can subsequently be obtained by comparing only the image set with the image to be checked. This avoids running the duplicate check against the massive data in the image database, effectively reduces the amount of data processed, and helps speed up generation of the duplicate checking result.
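
The two example preset conditions (a specified number of most similar images, or a minimum first similarity) can be sketched with a brute-force search standing in for the search engine. This is not the Milvus API; the function names and the mapping from Euclidean distance to similarity, `1 / (1 + distance)`, are assumptions made for illustration, since the source does not state how the distance is converted.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def first_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Assumed mapping: distance 0 gives 1.0, larger distances approach 0.
    return 1.0 / (1.0 + euclidean_distance(a, b))


def select_image_set(query: Sequence[float],
                     database: Dict[str, Sequence[float]],
                     top_n: Optional[int] = None,
                     min_similarity: Optional[float] = None
                     ) -> List[Tuple[str, float]]:
    """Return (image id, first similarity) pairs, most similar first,
    filtered by whichever preset condition is supplied."""
    scored = sorted(((image_id, first_similarity(query, vector))
                     for image_id, vector in database.items()),
                    key=lambda pair: pair[1], reverse=True)
    if min_similarity is not None:
        scored = [pair for pair in scored if pair[1] >= min_similarity]
    if top_n is not None:
        scored = scored[:top_n]
    return scored
```

A real deployment would delegate the distance computation and ranking to the search engine's index instead of scanning every vector.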

Step S204, performing image registration on the image to be checked and the image set based on a scale-invariant feature transform algorithm to obtain a registration result of the image to be checked and the image set.

In this embodiment, the scale-invariant feature transform (SIFT) is a computer vision algorithm used to detect and describe local features in an image; it searches for extreme points in scale space and extracts descriptors that are invariant to position, scale and rotation. When the SIFT algorithm is used for image matching, images captured by a non-fixed camera, or from an approximately fixed angle, that exhibit scale deformation and inconsistent shooting angles can be rotated, scaled and cropped before their differences are compared, which reduces the false alarm rate of direct comparison.

Specifically, feature extraction may be performed on the image to be checked based on the SIFT algorithm to obtain a corresponding first specified feature vector, and on all the images in the image set to obtain second specified feature vectors in one-to-one correspondence with those images. A second similarity between the image to be checked and each image in the image set is then determined based on the first specified feature vector and all the second specified feature vectors, and a registration result of the image to be checked and the image set is generated based on all the second similarities. After the image set is obtained, the second similarities are thus determined by taking into account the distribution of SIFT feature vectors of the image to be checked and of every image in the image set, and the registration result is output based on them, which effectively improves the accuracy of the image similarity query.
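
One ingredient of comparing two sets of SIFT descriptors is finding matching point pairs by Euclidean distance. The sketch below pairs each query descriptor with its nearest candidate descriptor and keeps the pair only if it passes a nearest-to-second-nearest ratio test. The ratio test and its 0.75 default are a commonly used criterion adopted here as an assumption; the source does not state which matching criterion is applied.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_descriptors(query_desc: Sequence[Sequence[float]],
                      candidate_desc: Sequence[Sequence[float]],
                      ratio: float = 0.75) -> List[Tuple[int, int]]:
    """Return (query index, candidate index) pairs of matched descriptors."""
    matches: List[Tuple[int, int]] = []
    for qi, q in enumerate(query_desc):
        # Distances from this query descriptor to every candidate, nearest first.
        dists = sorted((euclidean(q, c), ci)
                       for ci, c in enumerate(candidate_desc))
        if not dists:
            continue
        # With a single candidate there is no second neighbour to compare
        # against, so the nearest one is accepted as is.
        if len(dists) == 1 or dists[0][0] < ratio * dists[1][0]:
            matches.append((qi, dists[0][1]))
    return matches
```

The number of matched pairs returned here is the kind of quantity that feeds into the second similarity.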

Step S205, generating a duplicate checking result of the image to be checked based on the registration result.

In this embodiment, the registration result indicates either that no similar image matching the image to be checked exists in the image set, or that such a similar image exists in the image set. Specifically, if the first registration result is generated, i.e. no similar image matching the image to be checked exists in the image set, a first duplicate checking result is generated indicating that the image to be checked has no duplicate image. If the second registration result is generated, i.e. similar images matching the image to be checked exist in the image set, those similar images are taken as duplicate images of the image to be checked, and a second duplicate checking result is generated indicating that the image to be checked has duplicate images. In addition, after the duplicate checking result is obtained, it can be displayed so that the user can quickly see the result for the current image to be checked, which improves the user experience.

After the image to be checked is acquired, feature extraction is performed on it based on a preset convolutional neural network to obtain the corresponding feature vector. A vector similarity search engine is then called to determine, from all images stored in a preset image database, an image set whose similarity to the feature vector of the image to be checked meets a preset condition; that is, a subset of images similar to the image to be checked is screened out. A registration result of the image to be checked and the screened image set is then determined based on the scale-invariant feature transform algorithm, and the duplicate checking result of the image to be checked is finally generated based on the registration result. Whether the image to be checked is a duplicate image can thus be determined accurately from the duplicate checking result, which effectively improves the precision and efficiency of image duplicate checking.
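
The overall flow of steps S201 to S205 can be summarised as a coarse-to-fine pipeline. The sketch below only fixes the order of the stages; the three callables (`extract_features`, `search_similar`, `register`) are hypothetical placeholders for the convolutional neural network, the vector similarity search engine and the SIFT-based registration, respectively.

```python
from typing import Any, Callable, Dict, List, Sequence


def check_duplicate(image: Any,
                    extract_features: Callable[[Any], Sequence[float]],
                    search_similar: Callable[[Sequence[float]], List[str]],
                    register: Callable[[Any, List[str]], List[str]]
                    ) -> Dict[str, object]:
    """Run the coarse search first, then the finer registration on its output."""
    feature_vector = extract_features(image)        # S202: CNN feature vector
    candidate_ids = search_similar(feature_vector)  # S203: coarse candidate set
    duplicate_ids = register(image, candidate_ids)  # S204: registration on candidates
    # S205: the duplicate checking result follows from the registration result.
    return {"has_duplicate": bool(duplicate_ids), "duplicates": duplicate_ids}
```

Because registration only sees the candidates returned by the search stage, its cost depends on the size of the image set rather than on the size of the whole database.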

In some optional implementations, step S203 includes the following steps:

Generating, through the vector similarity search engine, a first similarity between each image in the image database and the image to be checked based on the Euclidean distance between the feature vector of that image and the feature vector of the image to be checked.

In this embodiment, the vector similarity search engine is specifically Milvus. To determine the image set whose first similarity meets the preset condition through Milvus, feature extraction may be performed in advance on all images of the image database to be queried using the convolutional neural network, and the extracted feature vectors are registered in Milvus to build an index. After the image to be checked is acquired, Milvus may perform a vector search with the feature vector of the image to be checked, using the Euclidean distance as the similarity measure, to obtain the corresponding image set.

Sorting all the first similarities in descending order of value to obtain a corresponding sorting result.

In this embodiment, the images may be sorted in descending order of first similarity to generate the corresponding sorting result.

Sequentially acquiring a preset number of target similarities, starting from the first one in the sorting result.

In this embodiment, the value of the preset number is not limited and may be set according to actual requirements. For example, the first N images in descending order of first similarity may be taken, where N is the preset number and may be set to 10, for example.

Selecting, from all the images in the image database, the target images respectively corresponding to the target similarities.

In this embodiment, based on the correspondence between the target similarity and the image, the target images corresponding to the target similarities may be screened from all the images stored in the image database.

And taking all the target images as the image set.

In this embodiment, after all the target images are obtained, all the target images can be used as the image set.

In this way, the vector similarity search engine determines, from all images in the preset image database, an image set whose first similarity to the feature vector of the image to be checked meets the preset requirement, which achieves a preliminary screening of the images in the image database. Subsequently, only the image to be checked and the obtained image set need to be processed based on the scale-invariant feature transform algorithm to generate the duplicate checking result quickly and accurately. This effectively reduces the amount of data processed during duplicate checking while ensuring the accuracy of the generated result.

In some optional implementations of this embodiment, step S204 includes the following steps:

Performing feature extraction on the image to be checked based on the scale-invariant feature transform algorithm to obtain a first specified feature vector of the image to be checked.

In this embodiment, the specified feature vector may specifically be a SIFT feature vector. The process of performing feature extraction on the image to be checked based on the scale-invariant feature transform algorithm to obtain the first specified feature vector may include: calculating a saliency map of the image to be checked based on a visual attention mechanism to obtain a region of interest of the image to be checked; and performing feature extraction on the region of interest based on the scale-invariant feature transform algorithm to obtain the first specified feature vector. The feature points extracted by the scale-invariant feature transform algorithm are invariant to illumination change, scaling and image rotation, and also remain stable to a certain degree under viewpoint change, affine transformation and noise interference.

Specifically, the process of performing feature extraction on the region of interest based on the scale-invariant feature transform algorithm to obtain the first specified feature vector may include: acquiring, based on the scale-invariant feature transform algorithm, the positions of the SIFT feature points and the SIFT feature descriptors of the region of interest; clustering the SIFT feature points of the region of interest based on a K-means algorithm, and screening out redundant SIFT feature points according to the generated cluster centers to obtain target SIFT feature points; and constructing the first specified feature vector based on the positions of the target SIFT feature points in the region of interest and the target SIFT feature descriptors corresponding to them. When extracting SIFT features from the image to be checked, the aim is to obtain a small set of feature points that represents the image as well as possible. Therefore, a visual attention mechanism, the scale-invariant feature transform algorithm and the K-means algorithm are combined: the visual saliency map of the image to be checked is obtained using the visual saliency mechanism; SIFT feature points are extracted from the salient region of the image, which avoids the excessive number of SIFT feature points that extraction from the whole image would produce; finally, the feature points are clustered using the K-means algorithm and a subset that better represents the image is selected, which speeds up image matching.
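
The clustering step can be sketched in pure Python on keypoint positions. The rule used here to drop redundant points, keeping only the keypoint closest to each cluster center, is an assumption; the source says only that redundant points are screened according to the generated cluster centers. The function names, the iteration count and the seeded initialisation are likewise illustrative.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def squared_distance(a: Point, b: Point) -> float:
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans(points: Sequence[Point], k: int,
           iterations: int = 20, seed: int = 0) -> List[Point]:
    """Plain Lloyd's algorithm; returns the k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(list(points), k)
    for _ in range(iterations):
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its cluster; keep the old
        # center if a cluster happens to be empty.
        centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers


def select_representative_keypoints(points: Sequence[Point], k: int,
                                    seed: int = 0) -> List[int]:
    """Indices of the keypoints nearest to each cluster center (assumed rule)."""
    centers = kmeans(points, k, seed=seed)
    chosen = {min(range(len(points)),
                  key=lambda i: squared_distance(points[i], center))
              for center in centers}
    return sorted(chosen)
```

The descriptors belonging to the returned indices, together with the point positions, would then form the reduced feature set used for matching.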

And extracting the features of all the images in the image set based on the scale invariant feature transformation algorithm to obtain second specified feature vectors which are in one-to-one correspondence with the images in the image set.

In this embodiment, the process of extracting the second specified feature vector is the same as the process of extracting the first specified feature vector, and is not repeated here.

And determining a second similarity between the image to be checked and each image in the image set based on the first specified feature vector and all the second specified feature vectors.

In this embodiment, the process of calculating the second similarity between the image to be checked and each image in the image set includes: calculating the Euclidean distance between each feature point in the first specified feature vector and each feature point in the second specified feature vector; acquiring the matching point pairs between the first specified feature vector and the second specified feature vector based on the Euclidean distances corresponding to each feature point in the first specified feature vector; acquiring a second area of a single pixel of a specified image in the image set according to the number of feature points in the first specified feature vector, the first area of a single pixel of the image to be checked and a scale factor, where the scale factor is the scale factor of the matching point pairs between the first specified feature vector and the second specified feature vector, and the specified image is any one of the images contained in the image set; acquiring, according to the second area, the total matching area of the image to be checked that matches the specified image, and acquiring the total area of the image to be checked; and generating an area ratio of the total matching area to the total area, and taking the area ratio as the second similarity between the image to be checked and the specified image. Here, the second area equals the first area of a single pixel of the image to be checked multiplied by the scale factor; the total matching area equals the second area of a single pixel of the specified image multiplied by the number of matching point pairs between the first specified feature vector and the second specified feature vector; and the total area of the image to be checked equals the number of feature points in the first specified feature vector multiplied by the first area of a single pixel of the image to be checked.
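The area-ratio calculation can be sketched as follows. The sketch reads the relations between the quantities in the paragraph above as products (second area = first area × scale factor, and so on), which is an interpretation of the translated text; the function and parameter names are hypothetical.

```python
def second_similarity(num_feature_points, num_matching_pairs,
                      first_pixel_area, scale_factor):
    """Area ratio used as the second similarity (assumed product relations)."""
    if num_feature_points <= 0:
        raise ValueError("the first specified feature vector has no feature points")
    # Area of a single pixel of the specified image.
    second_pixel_area = first_pixel_area * scale_factor
    # Area of the image to be checked that is matched by the specified image.
    matching_total_area = second_pixel_area * num_matching_pairs
    # Total area of the image to be checked.
    total_area = num_feature_points * first_pixel_area
    return matching_total_area / total_area
```

Under these assumptions the first pixel area cancels out, so the ratio reduces to the scale factor times the fraction of feature points that found a match.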

Specifically, the process of acquiring the matching point pairs between the first specified feature vector and the second specified feature vector based on the Euclidean distances corresponding to each feature point in the first specified feature vector may include: for each feature point in the first specified feature vector, calculating the absolute value of the difference between its two smallest Euclidean distances; judging whether the absolute value of the difference falls within a preset value range; and if so, taking the feature point and the feature point at the smallest Euclidean distance from it as a matching point pair. Determining the matching point pairs from the difference between the Euclidean distances effectively mitigates the influence of unstable factors such as occlusion and illumination.
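A minimal sketch of this matching rule is given below. It assumes that the "preset value range" is a lower bound `min_gap` on the difference between the two smallest distances, and that descriptors are plain numeric tuples; both are illustrative assumptions.

```python
import math


def find_matching_pairs(first_descriptors, second_descriptors, min_gap):
    """Return (i, j) index pairs of matched feature points.

    Assumption: a match is accepted when the gap between the smallest and the
    second-smallest Euclidean distance is at least min_gap.
    """
    pairs = []
    if len(second_descriptors) < 2:
        return pairs
    for i, d1 in enumerate(first_descriptors):
        distances = sorted(
            (math.dist(d1, d2), j) for j, d2 in enumerate(second_descriptors)
        )
        best, best_j = distances[0]
        second_best, _ = distances[1]
        if abs(second_best - best) >= min_gap:
            pairs.append((i, best_j))
    return pairs
```

A feature point whose two nearest candidates are almost equally close is ambiguous and is therefore not matched.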

In this embodiment, if the obtained second similarity between the image to be checked and the specified image is greater than a preset similarity threshold, the specified image is determined to be a duplicate of the image to be checked, and a duplicate checking result indicating that a duplicate image exists for the image to be checked is generated.

In this application, features are extracted from the image to be checked and from all images in the image set based on the scale-invariant feature transform algorithm to obtain the corresponding first specified feature vector and second specified feature vectors, and the similarity between the first specified feature vector and each second specified feature vector is then calculated. The registration result of the image to be checked and the image set can be generated accurately from the resulting second similarities, and the duplicate checking result of the image to be checked can subsequently be generated from the registration result, so that whether the image to be checked is a duplicate can be determined accurately, which effectively improves the precision and efficiency of image duplicate checking.

In some optional implementations, generating the registration result of the image to be checked and the image set based on all the second similarities includes the following steps:

and acquiring a preset similarity threshold.

In this embodiment, a similarity threshold corresponding to the SIFT feature vector may be preset. If the similarity between the SIFT features of any image in the image set and the SIFT features of the image to be checked is not smaller than the similarity threshold, that image and the image to be checked can be regarded as matching duplicate images.

And comparing each second similarity with the similarity threshold.

In this embodiment, each second similarity may be compared with the similarity threshold separately, so that the comparisons can run in parallel, which improves the efficiency of the similarity comparison and speeds up generation of the registration result.

And if all the second similarities are smaller than the similarity threshold, generating a first registration result indicating that no similar image matching the image to be checked exists in the image set.

In this embodiment, if the second similarities of all images in the image set are smaller than the similarity threshold, the registration result of the image to be checked is that no similar image matching the image to be checked exists in the image set.

And if at least one target similarity among all the second similarities is not smaller than the similarity threshold, generating a second registration result indicating that a similar image matching the image to be checked exists in the image set.

In this embodiment, after the second similarity corresponding to the SIFT feature vector between the image to be checked and each image in the image set is obtained, the second similarity of each image in the image set may be compared with the similarity threshold. If a second similarity is not smaller than the similarity threshold, the image number corresponding to that second similarity may be recorded; the image with that number is a duplicate of the image to be checked, and a registration result indicating that a similar image matching the image to be checked exists in the image set may be generated.
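The threshold comparison can be sketched in a few lines. The dictionary layout of the result and the name `registration_result` are illustrative assumptions; only the "not smaller than the threshold" rule comes from the text.

```python
def registration_result(second_similarities, threshold):
    """second_similarities: dict mapping image number -> second similarity."""
    # Record the numbers of all images whose similarity reaches the threshold.
    matched = [number for number, similarity in second_similarities.items()
               if similarity >= threshold]
    return {"has_match": bool(matched), "matched_image_numbers": matched}
```

An empty list of matched image numbers corresponds to the first registration result, and a non-empty list to the second.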

In this application, the first specified feature vector of the image to be checked is compared with the second specified feature vector of each image in the image set to obtain the corresponding feature similarity, and the registration result is obtained by comparing the feature similarity with the preset similarity threshold, which improves the accuracy of the registration result. A corresponding duplicate checking result can then be generated from the registration result, which effectively improves the efficiency of image duplicate checking.

In some optional implementations, step S202 includes the following steps:

and carrying out image feature extraction on the image to be checked through the convolutional layer of the convolutional neural network to obtain corresponding initial features.

In this embodiment, feature extraction may be performed on the image to be checked through a convolutional neural network to obtain a corresponding feature vector, where the convolutional neural network includes a convolutional layer and a pooling layer. The process of extracting the initial features of the image to be checked through the convolutional layer includes: constructing a matrix from the acquired image to be checked, and establishing a convolution kernel matrix over this matrix, where the convolution kernel has weight coefficients and a bias value; the elements covered by the convolution kernel are then summed with these weights and the bias value is added to form a new matrix, and the new matrix is taken as the convolution feature, namely the initial feature.
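The weighted sum plus bias described here can be illustrated with a small pure-Python sketch. It uses a single channel, stride 1 and no padding, and slides the kernel without flipping it, as deep learning frameworks commonly do; these choices and the name `conv2d` are assumptions made for illustration.

```python
def conv2d(image, kernel, bias=0.0):
    """Valid (no padding), stride-1 convolution of a 2-D list with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Weighted sum of the window under the kernel, plus the bias value.
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw)) + bias
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]
```

For a 3×3 input and a 2×2 kernel the output is a 2×2 matrix, which plays the role of the initial feature.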

And performing feature filtering processing on the initial features through a pooling layer of the convolutional neural network to obtain the processed initial features.

In this embodiment, after the convolutional layer has extracted the initial features, the initial features are further filtered in the pooling layer. Down-sampling the initial features and reducing their dimensionality in the pooling layer produces the processed initial features, which reduces the amount of data to be computed while preserving the essential image features. The pooling layer operation uses the ReLU piecewise function to introduce nonlinearity into the neural network model, and replaces the value at each single point of the feature map with a statistic of the feature map over its neighboring region.
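As an illustration, the sketch below applies ReLU and then max pooling to a small feature map. The text only says that a point is replaced by a statistic of its neighboring region; choosing the maximum over non-overlapping windows is an assumption, as are the function names.

```python
def relu(feature_map):
    """Element-wise ReLU: negative values become zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; trailing rows/columns that do not fill a window are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]
```

A 4×4 feature map is reduced to 2×2, which is the kind of down-sampling the paragraph above refers to.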

In this application, feature extraction is performed on the image to be checked through the convolutional layer and the pooling layer of the convolutional neural network to obtain the corresponding feature vector, which makes it possible to quickly determine, from all images stored in the preset image database and based on a vector similarity search engine, the image set whose similarity to the feature vector of the image to be checked meets the preset condition, and helps ensure the accuracy of the generated image set.

In some optional implementation manners of this embodiment, after step S201, the electronic device may further perform the following steps:

and acquiring field acquisition parameter information of the target object corresponding to the image to be checked.

In this embodiment, the field acquisition parameter information is collected by a salesperson on site while assisting with the insurance of the target object, and may specifically include the age, weight, limb proportions and breed data of the target object, the area of the shed housing the target object, an additionally captured full-body image of the target object, a panoramic image of the shed, and the geographical coordinates of the farm. For example, if the target object is a cow, the corresponding field acquisition parameter information is the age, weight, limb proportions and breed data of the cow, the area of the cow shed, an additionally captured full-body image of the cow, the panoramic image of the shed, and the geographical coordinates of the farm.

And calculating and generating a self factor evaluation value and an environment factor evaluation value of the target object based on the field acquisition parameter information.

In this embodiment, the process of generating the self factor evaluation value of the target object includes: looking up the industry breeding health standard corresponding to the target object according to the specified indexes of the target object, namely the age, weight, limb proportions and breed data of the livestock; judging whether the specified indexes all fall within the standard range; querying a preset state data table with the obtained values to obtain the corresponding state value; and recording the state value as the growth state value of the target object. The state data table is created in advance and stores the numerical ranges of the specified indexes of the target object and the state value corresponding to each numerical range.
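The state data table lookup can be sketched as a range query. The index names, ranges and state values in `STATE_TABLE` are invented placeholders, and how the per-index state values are combined into a single growth state value is left open because the text does not specify it.

```python
# Hypothetical state data table: index name -> list of (low, high, state value).
STATE_TABLE = {
    "age_months": [(0, 6, 60), (6, 60, 90), (60, 240, 70)],
    "weight_kg": [(0, 300, 60), (300, 800, 90)],
}


def lookup_state_value(index_name, value, table=STATE_TABLE):
    """Return the state value whose range [low, high) contains value, else None."""
    for low, high, state in table[index_name]:
        if low <= value < high:
            return state
    # The value falls outside every stored range.
    return None
```

For example, `lookup_state_value("age_months", 24)` selects the second range of the placeholder table.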

The environment factor evaluation value of the target object may include one or more of a breeding density value, a disaster risk value, and a claim risk value. Specifically, the animals in the collected shed panoramic image can be counted automatically, and the breeding density is calculated from the count and the area of the shed and recorded as the breeding density value. The disaster grade and occurrence frequency for the corresponding coordinates can be retrieved from the national meteorological disaster monitoring database according to the geographical coordinates of the farm and recorded as the disaster risk value. The frequency of claim settlement in the area within a certain threshold range of the geographical coordinates can be obtained by querying the insurance application database and recorded as the claim risk value. The threshold range may be set according to actual requirements.

In this embodiment, a weight value of the self-factor evaluation value and a weight value corresponding to the environment factor evaluation value may be obtained, and then a weighted calculation process may be performed on the self-factor evaluation value and the environment factor evaluation value based on the obtained weight values, so as to generate an underwriting evaluation result of the target object.

According to the method and the device, the self-factor evaluation value and the environment factor evaluation value of the target object are obtained by calculating the field acquisition parameter information of the target object, and then the underwriting evaluation result of the target object is generated based on the self-factor evaluation value and the environment factor evaluation value, so that the health degree index of the target object is analyzed according to the field acquisition parameter information of the target object, and therefore a corresponding underwriting suggestion can be generated according to the obtained underwriting evaluation result for reference of an underwriter, the problem that the potential risk of the target object cannot be rapidly and comprehensively positioned when the underwriter conducts manual auditing is effectively solved, the intelligence of underwriting processing is improved, and the operation efficiency of underwriting flow is also improved.

In some optional implementations of this embodiment, the generating an underwriting evaluation result of the target object based on the self-factor evaluation value and the environment factor evaluation value includes the following steps:

and acquiring a first preset weight corresponding to the self factor evaluation value.

In this embodiment, the value of the first preset weight is not specifically limited, and may be set according to actual use requirements.

And acquiring a second preset weight corresponding to the environmental factor evaluation value.

In this embodiment, the value of the second preset weight is not specifically limited, and may be set according to actual use requirements.

And performing weighted calculation on the self factor evaluation value and the environment factor evaluation value based on the first preset weight and the second preset weight to generate an underwriting evaluation score corresponding to the target object.

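In this embodiment, the weighted calculation may specifically be a weighted sum.

And determining an underwriting evaluation result corresponding to the underwriting evaluation score based on a preset score evaluation table.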

In this embodiment, the process of determining the underwriting evaluation result corresponding to the underwriting evaluation score based on the preset score evaluation table may include: creating in advance a score evaluation table that stores a plurality of evaluation score ranges and the evaluation result corresponding to each evaluation score range. After the underwriting evaluation score is obtained, the evaluation result corresponding to the underwriting evaluation score is obtained by querying the score evaluation table and is used as the underwriting evaluation result of the target object. Specifically, the evaluation score range containing the underwriting evaluation score is found in the score evaluation table, and the evaluation result associated with that range is obtained and used as the underwriting evaluation result of the target object. For example, the score evaluation table includes two evaluation score ranges: the evaluation result corresponding to the first range is that the claim risk is low and underwriting can be approved, and the evaluation result corresponding to the second range is that the claim risk is high and underwriting cannot be approved. If the calculated underwriting evaluation score of the target object is within the first evaluation score range, an underwriting evaluation result of low claim risk, underwriting approved, is generated. If the calculated underwriting evaluation score is within the second evaluation score range, an underwriting evaluation result of high claim risk, underwriting not approved, is generated.
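The weighted sum and the score evaluation table lookup can be sketched together. The weights, score ranges and result strings below are placeholders chosen for illustration; the application leaves all of them to be set according to actual requirements.

```python
def underwriting_score(self_value, env_value, w_self, w_env):
    """Weighted sum of the self factor and environment factor evaluation values."""
    return w_self * self_value + w_env * env_value


# Hypothetical score evaluation table: (low, high, evaluation result).
SCORE_TABLE = [
    (60.0, 100.0, "low claim risk, underwriting may be approved"),
    (0.0, 60.0, "high claim risk, underwriting is not approved"),
]


def underwriting_result(score, table=SCORE_TABLE):
    """Return the result of the first range that contains the score."""
    for low, high, result in table:
        if low <= score <= high:
            return result
    raise ValueError("score is outside every evaluation score range")
```

With equal weights of 0.5, evaluation values of 80 and 50 give a score of 65, which falls in the first placeholder range.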

After the self factor evaluation value and the environment factor evaluation value of the target object are obtained, the underwriting evaluation score of the target object can be calculated quickly and accurately using the first preset weight corresponding to the self factor evaluation value and the second preset weight corresponding to the environment factor evaluation value, and the underwriting evaluation result corresponding to the underwriting evaluation score can then be determined. The health index of the target object is thus analyzed from its field acquisition parameter information, and a corresponding underwriting suggestion can be generated from the underwriting evaluation result for the underwriter's reference. This effectively avoids the problem that an underwriter cannot quickly and comprehensively locate the potential risks of the target object during manual review, and improves both the intelligence of underwriting and the efficiency of the underwriting workflow.

It should be emphasized that, in order to further ensure the privacy and security of the duplicate checking result, the duplicate checking result may also be stored in a node of a block chain.

The block chain referred to in this application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A block chain (Blockchain) is essentially a decentralized database: a series of data blocks linked to one another by cryptographic methods, where each data block contains the information of a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The block chain may include a block chain underlying platform, a platform product service layer, an application service layer, and the like.
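The idea of data blocks linked by a cryptographic method can be illustrated with a toy hash chain built on the Python standard library. This sketch covers only hash linking; it has no consensus mechanism, networking or distributed storage, and the block layout and the names `make_block` and `verify_chain` are illustrative assumptions rather than the platform used by the application.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64


def make_block(payload, previous_hash):
    """Build a block whose hash covers its payload and the previous block's hash."""
    body = json.dumps({"payload": payload, "previous_hash": previous_hash},
                      sort_keys=True)
    return {
        "payload": payload,
        "previous_hash": previous_hash,
        "hash": hashlib.sha256(body.encode("utf-8")).hexdigest(),
    }


def verify_chain(chain):
    """Check that every block links to its predecessor and has not been altered."""
    previous_hash = GENESIS_HASH
    for block in chain:
        expected = make_block(block["payload"], block["previous_hash"])["hash"]
        if block["previous_hash"] != previous_hash or block["hash"] != expected:
            return False
        previous_hash = block["hash"]
    return True
```

Changing a stored duplicate checking result after the fact changes the recomputed hash, so `verify_chain` returns `False` for the altered chain.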

The embodiments of this application can acquire and process related data based on artificial intelligence technology. Artificial Intelligence (AI) is the theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, sense the environment, acquire knowledge, and use that knowledge to obtain the best result.

The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by computer readable instructions directing the relevant hardware. The computer readable instructions can be stored in a computer readable storage medium and, when executed, can include the processes of the method embodiments described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk or a Read-Only Memory (ROM), or may be a Random Access Memory (RAM).

It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

With further reference to fig. 3, as an implementation of the method shown in fig. 2, the present application provides an embodiment of an image duplicate checking apparatus, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.

As shown in fig. 3, the image duplication checking apparatus 300 according to the present embodiment includes: a first acquisition module 301, an extraction module 302, a determination module 303, a processing module 304, and a first generation module 305. Wherein:

a first obtaining module 301, configured to obtain an image to be checked;

an extraction module 302, configured to perform feature extraction on the image to be checked based on a preset convolutional neural network, so as to obtain a feature vector of the image to be checked;

a determining module 303, configured to determine, based on a vector similarity search engine, an image set whose similarity to the feature vector of the image to be checked meets a preset condition, from all images stored in a preset image database;

the processing module 304 is configured to perform image registration on the image to be checked and the image set based on a scale-invariant feature transform algorithm, so as to obtain a registration result of the image to be checked and the image set;

a first generating module 305, configured to generate a duplicate checking result of the image to be checked based on the registration result.

In this embodiment, the operations performed by the modules or units are in one-to-one correspondence with the steps of the image duplicate checking method in the foregoing embodiment, and are not described herein again.

In some optional implementations of this embodiment, the determining module 303 includes:

a generating submodule, configured to generate, by the vector similarity search engine, a first similarity between each image in the image database and the image to be checked based on the Euclidean distance between the feature vector of each image in the image database and the feature vector of the image to be checked;

the sorting submodule is used for sorting all the first similarities in descending numerical order to obtain a corresponding sorting result;

the first obtaining submodule is used for sequentially obtaining a preset number of target similarities, starting from the first-ranked first similarity in the sorting result;

the screening submodule is used for screening out target images which respectively correspond to the target similarity from all the images in the image database;

a first determining sub-module for taking all the target images as the image set.
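The behaviour of these submodules can be sketched as a brute-force top-N search. A production vector similarity search engine would normally use an index rather than scanning every vector, and the conversion of Euclidean distance to similarity as 1 / (1 + distance) is an assumption made only so that a smaller distance yields a larger similarity; `select_image_set` is a hypothetical name.

```python
import math


def select_image_set(query_vector, database_vectors, preset_number):
    """database_vectors: dict mapping image id -> feature vector.

    Returns the ids of the preset_number most similar images.
    """
    scored = [
        # Assumed similarity: decreases monotonically with Euclidean distance.
        (1.0 / (1.0 + math.dist(query_vector, vector)), image_id)
        for image_id, vector in database_vectors.items()
    ]
    # Sort the first similarities in descending order.
    scored.sort(key=lambda item: item[0], reverse=True)
    # Take the preset number of target similarities from the top of the ranking.
    return [image_id for _, image_id in scored[:preset_number]]
```

For a query at the origin, the two nearest stored vectors are returned, closest first.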

In some optional implementations of this embodiment, the processing module 304 includes:

the first extraction submodule is used for extracting the features of the image to be checked based on the scale-invariant feature transformation algorithm to obtain a first specified feature vector of the image to be checked;

the second extraction submodule is used for extracting the features of all the images in the image set based on the scale-invariant feature transformation algorithm to obtain second specified feature vectors which are in one-to-one correspondence with the images in the image set;

a second determining submodule, configured to determine, based on the first specified feature vector and all the second specified feature vectors, a second similarity between the image to be checked and each image in the image set;

and the first generation submodule is used for generating a registration result of the image to be checked and the image set based on all the second similarities.

In this embodiment, the operations performed by the modules or units are respectively corresponding to the steps of the image duplicate checking method in the foregoing embodiment one to one, and are not described herein again.

In some optional implementations of this embodiment, the first generation submodule includes:

the acquisition unit is used for acquiring a preset similarity threshold;

the comparison unit is used for carrying out numerical comparison on each second similarity and the similarity threshold value respectively;

a first generating unit, configured to generate a first registration result indicating that no similar image matching the image to be checked exists in the image set if all the second similarities are smaller than the similarity threshold;

and the second generating unit is used for generating a second registration result indicating that a similar image matching the image to be checked exists in the image set if at least one target similarity among all the second similarities is not smaller than the similarity threshold.

In some optional implementations of this embodiment, the extracting module 302 includes:

the third extraction submodule is used for extracting image features of the image to be checked through the convolutional layer of the convolutional neural network to obtain corresponding initial features;

the filtering submodule is used for carrying out feature filtering processing on the initial features through a pooling layer of the convolutional neural network to obtain processed initial features;

and the third determining submodule is used for taking the processed initial features as feature vectors of the image to be checked.

In some optional implementations of this embodiment, the image duplication checking apparatus further includes:

the second acquisition module is used for acquiring the field acquisition parameter information of the target object corresponding to the image to be checked;

the second generation module is used for calculating and generating a self factor evaluation value and an environment factor evaluation value of the target object based on the field acquisition parameter information;

and the third generation module is used for generating an underwriting evaluation result of the target object based on the self factor evaluation value and the environment factor evaluation value.

In some optional implementations of this embodiment, the third generating module includes:

the second obtaining submodule is used for obtaining a first preset weight corresponding to the self factor evaluation value;

the third obtaining submodule is used for obtaining a second preset weight corresponding to the environment factor evaluation value;

the second generation submodule is used for carrying out weighted calculation on the self factor evaluation value and the environment factor evaluation value on the basis of the first preset weight and the second preset weight so as to generate an underwriting evaluation score corresponding to the target object;

and the fourth determining submodule is used for determining an underwriting evaluation result corresponding to the underwriting evaluation score based on a preset score evaluation table.

In order to solve the technical problem, the embodiment of the application further provides computer equipment. Referring to fig. 4, fig. 4 is a block diagram of a basic structure of a computer device according to the present embodiment.

The computer device 4 comprises a memory 41, a processor 42, and a network interface 43, which are communicatively connected to each other via a system bus. It is noted that only a computer device 4 having components 41-43 is shown, but it should be understood that not all of the shown components are required, and that more or fewer components may be implemented instead. As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.

The computer device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The computer equipment can carry out man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch panel or voice control equipment and the like.

The memory 41 includes at least one type of readable storage medium, including flash memory, hard disks, multimedia cards, card-type memory (e.g., SD or DX memory, etc.), Random Access Memory (RAM), Static Random Access Memory (SRAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Programmable Read-Only Memory (PROM), magnetic memory, magnetic disks, optical disks, etc. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or a memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, provided on the computer device 4. Of course, the memory 41 may also include both an internal storage unit of the computer device 4 and an external storage device thereof. In this embodiment, the memory 41 is generally used for storing the operating system installed in the computer device 4 and various types of application software, such as the computer readable instructions of the image duplicate checking method. Further, the memory 41 may also be used to temporarily store various types of data that have been output or are to be output.

In some embodiments, the processor 42 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to execute the computer readable instructions stored in the memory 41 or to process data, for example, to execute the computer readable instructions of the image duplicate checking method.

The network interface 43 may comprise a wireless network interface or a wired network interface, and the network interface 43 is generally used for establishing a communication connection between the computer device 4 and other electronic devices.

In the embodiment of the application, after the image to be checked is acquired, feature extraction is performed on it based on a preset convolutional neural network to obtain a corresponding feature vector. A vector similarity search engine is then called to determine, from all images stored in a preset image database, an image set whose similarity to the feature vector of the image to be checked meets a preset condition; that is, a subset of images similar to the image to be checked is screened out. Image registration is then performed between the image to be checked and the screened image set based on a scale-invariant feature transform algorithm to obtain a registration result, and the duplicate checking result of the image to be checked is generated based on the registration result. Whether the image to be checked is a duplicate image can thus be accurately determined from the duplicate checking result, which effectively improves the precision and efficiency of image duplicate checking.
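The coarse screening step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the function names `first_similarity` and `screen_candidates`, the mapping from Euclidean distance to similarity (`1 / (1 + d)`), and the use of a top-k cut-off as the "preset condition" are all hypothetical, and a plain in-memory dictionary stands in for the vector similarity search engine.

```python
import math
from typing import Dict, List, Sequence, Tuple


def first_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Turn a Euclidean distance into a similarity in (0, 1].

    The 1 / (1 + d) mapping is an assumption chosen for illustration.
    """
    return 1.0 / (1.0 + math.dist(a, b))


def screen_candidates(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Score every stored vector, sort in descending order, keep the top_k.

    top_k is a hypothetical stand-in for the preset condition.
    """
    scored = [(image_id, first_similarity(query, vec))
              for image_id, vec in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy feature vectors; in practice these would come from the CNN.
database = {
    "img_a": [0.0, 0.0, 1.0],
    "img_b": [0.9, 0.1, 0.0],
    "img_c": [1.0, 0.0, 0.0],
}
candidates = screen_candidates([1.0, 0.0, 0.0], database, top_k=2)
```

Only the images returned by the screening step would be passed on to the more expensive registration step, which is where the efficiency gain described above comes from.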

The present application further provides another embodiment, namely a computer-readable storage medium storing computer-readable instructions that are executable by at least one processor to cause the at least one processor to perform the steps of the image duplicate checking method described above.

In the embodiment of the application, after the image to be checked is acquired, feature extraction is performed on it based on a preset convolutional neural network to obtain a corresponding feature vector. A vector similarity search engine is then called to determine, from all images stored in a preset image database, an image set whose similarity to the feature vector of the image to be checked meets a preset condition; that is, a subset of images similar to the image to be checked is screened out. Image registration is then performed between the image to be checked and the screened image set based on a scale-invariant feature transform algorithm to obtain a registration result, and the duplicate checking result of the image to be checked is generated based on the registration result. Whether the image to be checked is a duplicate image can thus be accurately determined from the duplicate checking result, which effectively improves the precision and efficiency of image duplicate checking.
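As a rough sketch of how the stored instructions might chain the four stages, the following treats each stage as a pluggable callable. All names here (`check_duplicate`, `DuplicateCheckResult`, the three stage parameters) are hypothetical, and the decision rule (an image counts as a duplicate when any registration score reaches the threshold) is an assumption made for illustration rather than the rule defined in the application. The stage stubs in the usage example return fixed values in place of a real convolutional neural network, search engine, and scale-invariant feature transform registration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class DuplicateCheckResult:
    is_duplicate: bool
    matched_ids: List[str]


def check_duplicate(
    image: str,
    extract_features: Callable[[str], Sequence[float]],
    search_similar: Callable[[Sequence[float]], List[str]],
    register: Callable[[str, str], float],
    threshold: float,
) -> DuplicateCheckResult:
    """Run extraction, screening, registration, and the final decision."""
    feature_vector = extract_features(image)        # CNN feature extraction
    candidate_ids = search_similar(feature_vector)  # coarse screening
    # Registration score per candidate (assumed to lie in [0, 1]).
    scores = {cid: register(image, cid) for cid in candidate_ids}
    # Assumed rule: any score at or above the threshold marks a duplicate.
    matched = [cid for cid, score in scores.items() if score >= threshold]
    return DuplicateCheckResult(is_duplicate=bool(matched), matched_ids=matched)


# Usage with stub stages that return fixed, made-up values.
result = check_duplicate(
    image="query.jpg",
    extract_features=lambda img: [1.0, 0.0],
    search_similar=lambda vec: ["img_b", "img_c"],
    register=lambda img, cid: {"img_b": 0.35, "img_c": 0.92}[cid],
    threshold=0.8,
)
```

Passing the stages in as callables keeps the orchestration independent of any particular network, search engine, or registration library.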

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.

It should be understood that the above-described embodiments are merely some, and not all, embodiments of the present application, and that the drawings illustrate preferred embodiments of the present application without limiting the scope of the appended claims. This application may be embodied in many different forms, and the embodiments are provided so that this disclosure will be thorough and complete. Although the present application has been described in detail with reference to the foregoing embodiments, it will be apparent to one skilled in the art that modifications can be made to the embodiments described in the foregoing detailed description, or equivalents can be substituted for some of the features described therein. All equivalent structures made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, likewise fall within the protection scope of the present application.

Claims

1. An image duplicate checking method, characterized by comprising the following steps:

acquiring an image to be checked;

2. The image duplicate checking method according to claim 1, wherein the step of determining, by the vector similarity search engine, an image set whose similarity to the feature vector of the image to be checked satisfies a preset condition from all images stored in a preset image database specifically includes:

generating, by the vector similarity search engine, a first similarity between each image in the image database and the image to be checked based on the Euclidean distance between the feature vector of that image and the feature vector of the image to be checked;

sorting all the first similarities in descending numerical order to obtain a corresponding sorting result;

and taking all the target images as the image set.

3. The image duplicate checking method according to claim 1, wherein the step of performing image registration on the image to be checked and the image set based on the scale-invariant feature transform algorithm to obtain a registration result of the image to be checked and the image set specifically comprises:

4. The image duplicate checking method according to claim 3, wherein the step of generating the registration result of the image to be checked and the image set based on all the second similarities specifically comprises:

acquiring a preset similarity threshold;

5. The image duplicate checking method according to claim 1, wherein the step of extracting features of the image to be checked based on a preset convolutional neural network to obtain a feature vector of the image to be checked specifically comprises:

6. The image duplicate checking method according to claim 1, further comprising, after the step of acquiring the image to be checked:

7. The image duplicate checking method according to claim 6, wherein the step of generating the underwriting evaluation result of the subject matter based on the self-factor evaluation value and the environment-factor evaluation value specifically comprises:

acquiring a first preset weight corresponding to the self-factor evaluation value; and

8. An image duplicate checking device, comprising:

the first acquisition module is used for acquiring an image to be checked;

the extraction module is used for extracting the features of the image to be checked based on a preset convolutional neural network to obtain the feature vector of the image to be checked;

the determining module is used for determining an image set which has similarity with the feature vector of the image to be checked and meets a preset condition from all images stored in a preset image database based on a vector similarity search engine;

the processing module is used for performing image registration on the image to be checked and the image set based on a scale-invariant feature transform algorithm to obtain a registration result of the image to be checked and the image set;

and the first generation module is used for generating a duplicate checking result of the image to be checked based on the registration result.

9. A computer device comprising a memory and a processor, the memory having computer readable instructions stored therein, wherein the processor, when executing the computer readable instructions, implements the steps of the image duplicate checking method according to any one of claims 1 to 7.

10. A computer-readable storage medium having computer-readable instructions stored thereon which, when executed by a processor, implement the steps of the image duplicate checking method according to any one of claims 1 to 7.