CN116304141A

CN116304141A - Repeated picture detection method, device, equipment and medium based on deep learning

Info

Publication number: CN116304141A
Application number: CN202310137043.3A
Authority: CN
Inventors: 许垒; 沈鹏; 李银锋; 刘海伦; 王月宝; 黄明星; 胡尧; 周晓波
Original assignee: Beijing Shuidi Technology Group Co ltd
Current assignee: Beijing Shuidi Technology Group Co ltd
Priority date: 2023-02-13
Filing date: 2023-02-13
Publication date: 2023-06-23

Abstract

The application relates to the technical field of machine learning and artificial intelligence, and provides a repeated picture detection method, device, equipment and medium based on deep learning. The method comprises the following steps: acquiring a picture to be detected; extracting features of the picture to be detected by using a pre-trained deep learning model for the image field to obtain a first feature vector; performing similarity calculation between the first feature vector and each second feature vector, corresponding to a historical picture, in a vector retrieval library to obtain a similarity calculation result; and determining, according to the similarity calculation result, whether the picture to be detected repeats a historical picture. With this technical scheme, historical pictures that repeat the picture to be detected can be retrieved quickly and accurately, which improves detection efficiency.

Description

Repeated picture detection method, device, equipment and medium based on deep learning

[ field of technology ]

The application relates to the technical field of machine learning and artificial intelligence, in particular to a repeated picture detection method, device, equipment and medium based on deep learning.

[ background Art ]

In recent years, with the rapid development of the Internet, crowdfunding platforms have gradually emerged. When a person unfortunately falls ill, they can upload their medical record report and hospital examination sheets to the platform to initiate a crowdfunding case and raise funds. The platform reviews the authenticity and validity of the materials, and the reviewed pictures are published on the platform together with the case. However, some people upload pictures taken from historical cases, write false fundraising copy, and submit it to the platform to raise funds fraudulently, which causes public opinion crises and seriously damages the company's reputation. In the traditional method, the uploaded case pictures are manually compared with the historical picture library one by one; timeliness falls short of expectations, and the labor cost is too high for the company to bear.

[ invention ]

The embodiment of the application provides a repeated picture detection method, device, equipment and medium based on deep learning, and aims to solve the technical problems of low efficiency and high labor cost of manually identifying case pictures in related technologies.

In a first aspect, an embodiment of the present application provides a repeated picture detection method based on deep learning, including:

acquiring a picture to be detected;

extracting features of the picture to be detected by using a pre-trained deep learning model for the image field to obtain a first feature vector;

performing similarity calculation on the first feature vector and each second feature vector in a vector retrieval library corresponding to the historical picture to obtain a similarity calculation result;

and determining whether the picture to be detected is repeated with the historical picture according to the similarity calculation result.

In one embodiment, preferably, the method further comprises:

extracting features of each historical picture by using the pre-trained deep learning model for the image field to obtain the second feature vector, and storing the second feature vector into the vector retrieval library.

In one embodiment, preferably, the method further comprises:

when it is determined that the picture to be detected repeats a historical picture, determining that the case corresponding to the picture to be detected is a false case that does not pass review.

In one embodiment, preferably, performing similarity calculation between the first feature vector and each second feature vector in the vector retrieval library corresponding to the historical picture includes:

and respectively calculating cosine similarity values of the first feature vector and each second feature vector.

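In one embodiment, the cosine similarity is preferably calculated using the following formula:

$$\cos(A, B) = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}} \, \sqrt{\sum_{i=1}^{n} B_i^{2}}}$$

wherein A represents the first feature vector, B represents the second feature vector, and n is their common dimension.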
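As an illustration only, the cosine similarity between a first feature vector A and a second feature vector B (their dot product divided by the product of their Euclidean norms) can be sketched in plain Python. The function name `cosine_similarity` and the error handling for zero vectors are assumptions made for this sketch and are not part of the application.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between vectors a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    # Numerator: dot product of the two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    # Denominator: product of the two Euclidean norms.
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption of this sketch: reject zero vectors instead of returning 0.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

The result lies in the range [-1, 1]; vectors pointing in the same direction give a value of 1 regardless of their lengths.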

In one embodiment, preferably, determining whether the picture to be detected is repeated with the history picture according to the similarity calculation result includes:

and when the cosine similarity value of the second feature vector and the first feature vector of any target historical picture is larger than a preset value, determining that the picture to be detected and the target historical picture are repeated.

In one embodiment, preferably, the method further comprises:

outputting all target historical pictures when the number of the target historical pictures which are repeated with the pictures to be detected is smaller than or equal to a preset number threshold value;

when the number of the target historical pictures which are repeated with the pictures to be detected is larger than a preset number threshold, all the target historical pictures are arranged in a descending order according to cosine similarity values, and the preset number of target historical pictures arranged in front are output.
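The thresholding and output rules of the two preceding embodiments can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `select_duplicates`, the dictionary input mapping a historical picture identifier to its cosine similarity value, and the default values 0.9 and 10 are hypothetical choices for illustration.

```python
from typing import Dict, List, Tuple


def select_duplicates(
    similarities: Dict[str, float],
    threshold: float = 0.9,   # hypothetical preset value
    max_results: int = 10,    # hypothetical preset number threshold
) -> List[Tuple[str, float]]:
    """Return the target historical pictures to output for one query picture."""
    # A historical picture is a target when its similarity is strictly
    # greater than the preset value.
    hits = [(pid, s) for pid, s in similarities.items() if s > threshold]
    # Within the preset number threshold: output all target pictures.
    if len(hits) <= max_results:
        return hits
    # Otherwise: sort by similarity in descending order and keep the first ones.
    hits.sort(key=lambda item: item[1], reverse=True)
    return hits[:max_results]
```

For example, with similarities of 0.95, 0.50, 0.99 and 0.90 and a preset value of 0.9, only the pictures scoring 0.95 and 0.99 are targets, because 0.90 is not strictly greater than the preset value.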

In one embodiment, preferably, the method further comprises:

acquiring a training picture, wherein the training picture is a labeled picture;

for each picture, performing a translation and/or cropping operation on the picture, and setting the corresponding label value to the maximum, so as to obtain a training picture pair;

training the deep learning model for the image field by using the training picture pairs to obtain the pre-trained deep learning model for the image field.

In a second aspect, an embodiment of the present application provides a repeated picture detection apparatus based on deep learning, including:

the acquisition module is used for acquiring the picture to be detected;

the extraction module is used for extracting features of the picture to be detected by using a pre-trained deep learning model for the image field so as to obtain a first feature vector;

the calculation module is used for carrying out similarity calculation on the first feature vector and each second feature vector in the vector retrieval library corresponding to the historical picture so as to obtain a similarity calculation result;

and the determining module is used for determining whether the picture to be detected is repeated with the historical picture according to the similarity calculation result.

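In one embodiment, preferably, the extraction module is further configured to:

extract features of each historical picture by using the pre-trained deep learning model for the image field to obtain the second feature vector, and store the second feature vector into the vector retrieval library.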

In one embodiment, preferably, the apparatus further comprises:

the processing module is used for determining, when it is determined that the picture to be detected repeats a historical picture, that the case corresponding to the picture to be detected is a false case that does not pass review.

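In one embodiment, preferably, the calculation module is configured to:

calculate cosine similarity values between the first feature vector and each second feature vector, respectively.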

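In one embodiment, preferably, the determining module is configured to:

determine that the picture to be detected repeats a target historical picture when the cosine similarity value between the second feature vector of that target historical picture and the first feature vector is larger than a preset value.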

In one embodiment, preferably, the apparatus further comprises:

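the output module is used for outputting all target historical pictures when the number of the target historical pictures which are repeated with the picture to be detected is smaller than or equal to a preset number threshold value; and, when that number is larger than the preset number threshold, arranging all the target historical pictures in descending order of cosine similarity value and outputting the preset number of top-ranked target historical pictures.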

In one embodiment, preferably, the apparatus further comprises:

the picture acquisition module is used for acquiring training pictures, wherein the training pictures are labeled pictures;

the picture processing module is used for performing a translation and/or cropping operation on each picture, and setting the corresponding label value to the maximum, so as to obtain a training picture pair;

and the training module is used for training the deep learning model for the image field by using the training picture pairs so as to obtain the pre-trained deep learning model for the image field.

In a third aspect, a computer device is provided, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the above repeated picture detection method based on deep learning when executing the computer program.

In a fourth aspect, a computer readable storage medium is provided, the computer readable storage medium storing a computer program, which when executed by a processor, implements the steps of the repeated picture detection method based on deep learning.

In the scheme realized by the repeated picture detection method, device, equipment and medium based on deep learning, a picture to be detected is acquired; features of the picture to be detected are extracted by using a pre-trained deep learning model for the image field to obtain a first feature vector; similarity calculation is performed between the first feature vector and each second feature vector in a vector retrieval library corresponding to the historical picture to obtain a similarity calculation result; and whether the picture to be detected repeats a historical picture is determined according to the similarity calculation result. In this method, a deep learning model for the image field is used to extract a dense vector representation of each picture, so that the specific semantic information of the picture can be captured and pictures that differ only by simple geometric transformations are handled well. Because the extracted first feature vector of the picture to be detected is compared with the second feature vectors already stored in the vector retrieval library, the query response time in the query stage can be greatly reduced.

[ description of the drawings ]

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 shows a schematic flow chart of a repeated picture detection method based on deep learning according to one embodiment of the present application.

FIG. 2 illustrates a schematic diagram of a deep learning model of an image domain according to one embodiment of the present application.

Fig. 3 shows a schematic flow chart of a repeated picture detection method based on deep learning according to one embodiment of the present application.

Fig. 4 shows a schematic flow chart of a repeated picture detection method based on deep learning according to another embodiment of the present application.

Fig. 5 shows a specific flowchart of a repeated picture detection method based on deep learning according to one embodiment of the present application.

FIG. 6 illustrates a schematic diagram of deep learning model extraction feature vectors for an image domain according to one embodiment of the present application.

Fig. 7 illustrates a repeating picture selection schematic according to one embodiment of the present application.

Fig. 8 shows a block diagram of a repeated picture detection apparatus based on deep learning according to another embodiment of the present application.

FIG. 9 illustrates a block diagram of a computer device, according to one embodiment of the present application.

[ detailed description of the invention ]

For a better understanding of the technical solutions of the present application, embodiments of the present application are described in detail below with reference to the accompanying drawings.

It should be understood that the described embodiments are merely some, but not all, of the embodiments of the present application. All other embodiments, based on the embodiments herein, which would be apparent to one of ordinary skill in the art without making any inventive effort, are intended to be within the scope of the present application.

The terminology used in the embodiments of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

Some embodiments of the present application are described in detail below with reference to the accompanying drawings. The following embodiments and features of the embodiments may be combined with each other without conflict.

It should be noted that the embodiments of the present application may acquire and process related data based on artificial intelligence technology. Artificial intelligence refers to using a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, sense the environment, acquire knowledge, and use that knowledge to obtain optimal results.

Artificial intelligence infrastructure technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.

The key to the repeated picture query method is how to represent a picture as a dense vector. The most obvious approach is to operate directly on the raw pixels of the images; however, input images often have a high resolution, so computing a similarity value between two pictures in this way incurs an unacceptable computational cost. Another approach uses traditional image processing: the input images are first resized so that all images are scaled to the same size, the mean of all pixel values is computed, the difference between each pixel value and that mean is taken, and finally each pixel is binarized. Meanwhile, the closer two pictures are visually, the smaller the cosine distance between their dense vectors should be. The problem to be solved by the invention is therefore the vector representation of pictures.
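For orientation, the traditional image processing approach just described (resize, compute the pixel mean, take the difference, binarize) resembles what is commonly called an average hash. The sketch below is an assumption-laden illustration, not the application's method: pictures are modeled as row-major lists of grayscale values, resizing uses nearest-neighbor sampling, and the comparison of two hashes by Hamming distance is added only to make the example usable. All function names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, row-major, values 0..255


def resize_nearest(img: Image, size: int) -> Image:
    """Scale an image to size x size using nearest-neighbor sampling."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def average_hash(img: Image, size: int = 8) -> List[int]:
    """Resize, compute the pixel mean, then binarize each pixel against it."""
    small = resize_nearest(img, size)
    pixels = [p for row in small for p in row]
    mean = sum(pixels) / len(pixels)
    # The sign of (pixel - mean) decides the bit for that pixel.
    return [1 if p - mean >= 0 else 0 for p in pixels]


def hamming_distance(h1: List[int], h2: List[int]) -> int:
    """Count differing bits (an assumed way to compare two hashes)."""
    return sum(b1 != b2 for b1, b2 in zip(h1, h2))
```

A uniform brightness change leaves such a hash unchanged, while mirroring the picture flips the bits, which hints at why a learned dense vector is preferred in the application.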

Referring to fig. 1, fig. 1 shows a schematic flow chart of a repeated picture detection method based on deep learning according to an embodiment of the present application.

As shown in fig. 1, a flow of a repeated picture detection method based on deep learning according to an embodiment of the present application includes:

Step S101, obtaining a picture to be detected; optionally, the pictures to be detected can be pictures in crowdfunding cases, such as medical record reports, hospital examination sheets, photos of patients and the like.

Step S102, extracting features of the picture to be detected by using a pre-trained deep learning model for the image field to obtain a first feature vector;

the deep learning model for the image field can be a ResNet model, a CLIP model, or another deep learning model for the image field; in each case, the data it receives are pictures. The above technical solution of the present invention will be described in detail below by taking a ResNet model as an example.

As shown in fig. 2, the ResNet model is formed by combining a plurality of residual units, wherein each layer is a convolutional neural network. The deep learning model for the image field receives a picture as input and outputs the category to which the picture belongs.

Step S103, performing similarity calculation on the first feature vector and each second feature vector in a vector retrieval library corresponding to the historical picture to obtain a similarity calculation result;

the vector retrieval library can be the Faiss similarity search library. Faiss (full name Facebook AI Similarity Search) is a clustering and similarity search library open-sourced by the Facebook AI team; it provides efficient similarity search and clustering for dense vectors, supports search over billions of vectors, and is currently one of the more mature approximate nearest neighbor search libraries. Of course, the vector retrieval library may be any other retrieval library that has a vector similarity search function and can realize efficient search, and is not particularly limited herein.

Step S104, determining whether the picture to be detected is repeated with the historical picture according to the similarity calculation result.

In this embodiment, whether the history picture and the picture to be detected are similar or not may be determined according to the cosine similarity between the two pictures. Specifically, a preset value, for example, 90%, may be set, and when the similarity between the picture to be detected and the history picture is greater than 90%, the two pictures are considered to be similar. When the similarity between the picture to be detected and the historical picture is smaller than or equal to 90%, the picture to be detected and the historical picture are not similar. Specifically, the preset value may be set according to actual requirements.
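Steps S103 and S104 can be illustrated with a small in-memory stand-in for the vector retrieval library. This is a sketch under stated assumptions, not the application's implementation and not the Faiss API: the class `VectorLibrary`, its methods `add` and `search`, and the helper `is_repeated` are hypothetical names. Vectors are normalized to unit length when stored, so that a plain dot product equals the cosine similarity.

```python
import math
from typing import Dict, List, Sequence, Tuple


def _normalize(vec: Sequence[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


class VectorLibrary:
    """Hypothetical in-memory stand-in for a vector retrieval library."""

    def __init__(self) -> None:
        self._vectors: Dict[str, List[float]] = {}

    def add(self, picture_id: str, vector: Sequence[float]) -> None:
        # Store unit-length vectors so a dot product equals cosine similarity.
        self._vectors[picture_id] = _normalize(vector)

    def search(self, query: Sequence[float], k: int) -> List[Tuple[str, float]]:
        # Brute-force comparison against every stored second feature vector.
        q = _normalize(query)
        scored = [
            (pid, sum(a * b for a, b in zip(q, vec)))
            for pid, vec in self._vectors.items()
        ]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:k]


def is_repeated(library: VectorLibrary, query: Sequence[float],
                threshold: float = 0.9) -> bool:
    """Step S104: repeated if the best similarity exceeds the preset value."""
    best = library.search(query, k=1)
    return bool(best) and best[0][1] > threshold
```

A dedicated library would replace the brute-force loop in `search` with an index structure; the thresholding in `is_repeated` would stay the same.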

In one embodiment, preferably, the method further comprises:

collecting the pictures of all historically initiated cases, extracting their feature vectors with the deep learning model for the image field, and storing the feature vectors into the vector retrieval library.

As shown in fig. 3, in one embodiment, preferably, the method further comprises:

Step S301, when it is determined that the picture to be detected repeats a historical picture, determining that the case corresponding to the picture to be detected is a false case that does not pass review.

In this embodiment, when it is determined that the picture to be detected has a repeated picture, its corresponding case may be determined to be a false case that does not pass review. Of course, to further ensure the accuracy of the judgment, the number of pictures to be detected in one case that have repeated pictures may be counted, and the case is determined to be a false case only when multiple pictures have repeated pictures, for example, when 3 pictures in one case all have repeated pictures.
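The case-level judgment described above can be sketched as a simple count. The function name `is_false_case`, the input layout (each picture of the case mapped to the list of historical pictures it repeats) and the parameter `min_repeated_pictures` are assumptions for illustration; a value of 1 corresponds to rejecting on any repeated picture, and 3 corresponds to the stricter example in the text.

```python
from typing import Dict, List


def is_false_case(
    duplicates_per_picture: Dict[str, List[str]],
    min_repeated_pictures: int = 1,
) -> bool:
    """Decide whether a case is false from per-picture duplicate results."""
    # Count the pictures of the case that repeat at least one historical picture.
    repeated = sum(1 for matches in duplicates_per_picture.values() if matches)
    return repeated >= min_repeated_pictures
```

Raising `min_repeated_pictures` trades recall for precision: fewer cases are rejected, but each rejection rests on more evidence.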

In the present invention, preferably, the similarity between the picture to be detected and the history picture can be determined by cosine similarity calculation, and of course, the similarity can also be determined by other similarity calculation methods.

In one embodiment, preferably, the method further comprises:

In this embodiment, if the picture to be detected repeats historical pictures, those historical pictures can be output. For example, with a preset number threshold of 10, all repeated pictures are output when their number is smaller than or equal to 10; when their number is larger, the pictures are ranked by similarity and the top 10 (TOP10) are output, so that reviewers can conveniently check and verify them further.

As shown in fig. 4, in one embodiment, preferably, the method further comprises:

Step S401, obtaining training pictures, wherein the training pictures are labeled pictures;

Step S402, performing a translation and/or cropping operation on each picture, and setting the corresponding label value to the maximum, so as to obtain a training picture pair;

Step S403, training the deep learning model for the image field by using the training picture pairs, so as to obtain the pre-trained deep learning model for the image field.

In this embodiment, in order to improve the query effect for material pictures, a large number of material pictures are collected and picture pairs are constructed. Specifically, operations such as random translation and/or cropping are performed on an original picture, and the training picture pairs are labeled in advance: when the two pictures of a pair are obtained from the same picture by translation/cropping, the label is set to 0.99; otherwise, the label is set to 0.09. The picture pairs are then used to further train the deep learning model for the image field.
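The construction of labeled training picture pairs can be sketched as follows. Only the label values 0.99 and 0.09 and the use of translation/cropping come from the text; everything else is an assumption of this sketch, including the list-of-lists picture model, the one-pixel shift range, the fixed crop, and the way a negative pair is formed by pairing a picture with a different picture from the collection.

```python
import random
from typing import List, Tuple

Image = List[List[int]]
POSITIVE_LABEL = 0.99  # pair derived from the same picture
NEGATIVE_LABEL = 0.09  # pair of different pictures


def translate(img: Image, dx: int, dy: int, fill: int = 0) -> Image:
    """Shift the picture by (dx, dy), filling uncovered pixels with `fill`."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            nr, nc = r + dy, c + dx
            if 0 <= nr < h and 0 <= nc < w:
                out[nr][nc] = img[r][c]
    return out


def crop(img: Image, top: int, left: int, height: int, width: int) -> Image:
    """Cut out a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def build_pairs(images: List[Image],
                rng: random.Random) -> List[Tuple[Image, Image, float]]:
    """Return (picture, other picture, label) triples for training."""
    pairs = []
    for i, img in enumerate(images):
        h, w = len(img), len(img[0])
        # Positive pair: the picture and a randomly translated or cropped copy.
        if rng.random() < 0.5:
            augmented = translate(img, rng.randint(-1, 1), rng.randint(-1, 1))
        else:
            augmented = crop(img, 0, 0, max(1, h - 1), max(1, w - 1))
        pairs.append((img, augmented, POSITIVE_LABEL))
        # Negative pair (assumed scheme): the picture and a different picture.
        if len(images) > 1:
            j = rng.choice([k for k in range(len(images)) if k != i])
            pairs.append((img, images[j], NEGATIVE_LABEL))
    return pairs
```

Passing a seeded `random.Random` instance keeps the generated pairs reproducible between training runs.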

The above technical solution of the present invention is described in detail below by way of a specific embodiment in which the deep learning model for the image field is a ResNet model and the vector retrieval library is the Faiss library.

As shown in fig. 5, the feature vector of each picture in the historically initiated cases is extracted by the pre-trained ResNet model. As shown in fig. 6, after the feature vectors of the pictures are extracted by the ResNet model, they are stored in the Faiss similarity search library, which is also called the historical dense vector library.

When a picture to be detected needs to be queried, the same procedure is followed: the ResNet model is used to extract the feature vector of the picture to be detected, and cosine similarity values are calculated against the vectors in the historical dense vector library one by one. When a value is greater than the preset value (for example, 0.9), the corresponding historical picture is considered a repeated picture and its URL is returned, as shown in fig. 7. When the number of repeated pictures is large, the TOP1 picture can be selected for return.

As shown in fig. 8, in a second aspect, an embodiment of the present application provides a repeated picture detection apparatus based on deep learning, including:

an obtaining module 81, configured to obtain a picture to be detected;

the extracting module 82 is configured to perform feature extraction on the picture to be detected by using a pre-trained deep learning model for the image field, so as to obtain a first feature vector;

the calculating module 83 is configured to perform similarity calculation on the first feature vector and each second feature vector in the vector search library corresponding to the history picture, so as to obtain a similarity calculation result;

a determining module 84, configured to determine whether the picture to be detected is repeated with the historical picture according to the similarity calculation result.

In one embodiment, preferably, the extraction module is further configured to:

In one embodiment, preferably, the apparatus further comprises:

In one embodiment, preferably, the computing module is configured to:

In one embodiment, preferably, the determining module is configured to:

In one embodiment, preferably, the apparatus further comprises:

It should be noted that, for convenience and brevity of description, the specific working process of the apparatus and each module described above may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.

The apparatus described above may be implemented in the form of a computer program which is executable on a computer device as shown in fig. 9.

Referring to fig. 9, the computer device includes a processor, a memory, and a network interface connected by a system bus, wherein the memory may include a storage medium and an internal memory.

The storage medium may store an operating system and a computer program. The computer program includes program instructions that, when executed, cause a processor to perform any of the repeated picture detection methods based on deep learning provided in the embodiments of the present application.

The processor is used to provide computing and control capabilities to support the operation of the entire computer device.

The internal memory provides an environment for running the computer program stored in the storage medium; when executed by the processor, the computer program causes the processor to perform any of the repeated picture detection methods based on deep learning. The storage medium may be nonvolatile or volatile.

The network interface is used for network communication such as transmitting assigned tasks and the like. It will be appreciated by those skilled in the art that the structure shown in fig. 9 is merely a block diagram of a portion of the structure associated with the present application and is not limiting of the computer device to which the present application applies, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.

It should be appreciated that the processor may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field-programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. Wherein the general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.

The computer device of embodiments of the present application exists in a variety of forms including, but not limited to:

(1) Mobile communication devices, which are characterized by mobile communication capabilities and whose main purpose is to provide voice and data communication. Such terminals include smart phones (e.g., iPhone), multimedia phones, feature phones, and low-end phones, among others.

(2) Ultra-mobile personal computer devices, which belong to the category of personal computers, have computing and processing functions, and generally also support mobile Internet access. Such terminals include PDA, MID and UMPC devices, such as the iPad.

(3) Portable entertainment devices, which can display and play multimedia content. Such devices include audio and video players (e.g., iPod), handheld game consoles, e-book readers, smart toys, and portable car navigation devices.

(4) Servers, whose architecture is similar to that of a general-purpose computer; because a server must provide highly reliable services, it has high requirements for processing capacity, stability, reliability, security, scalability, manageability and the like.

(5) Other electronic devices with data interaction function.

In addition, embodiments of the present application provide a computer-readable storage medium storing computer-executable instructions for performing the steps of:

acquiring a picture to be detected;

In one embodiment, preferably, the method further comprises:

In one embodiment, preferably, the method further comprises:

It should be noted that, the functions or steps that can be implemented by the computer readable storage medium or the electronic device may correspond to the relevant descriptions in the foregoing method embodiments, and are not described herein for avoiding repetition.

The technical scheme of the application has been explained in detail above with reference to the drawings. Through the technical scheme of the application, historical pictures that repeat the picture to be detected can be retrieved quickly and accurately, which improves detection efficiency.

It should be understood that the term "and/or" as used herein merely describes an association relationship between associated objects, meaning that three relationships may exist. For example, A and/or B may represent: A exists alone, A and B exist together, or B exists alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship.

It should be understood that although the terms first, second, etc. may be used in embodiments of the present application to describe feature vectors, these feature vectors should not be limited by these terms. These terms are only used to distinguish the feature vectors from each other. For example, the first feature vector may also be referred to as a second feature vector, and similarly, the second feature vector may also be referred to as a first feature vector, without departing from the scope of the embodiments of the present application.

Depending on the context, the word "if" as used herein may be interpreted as "when …" or "upon …" or "in response to a determination" or "in response to detection". Similarly, the phrase "if determined" or "if detected (stated condition or event)" may be interpreted as "when determined" or "in response to determination" or "when detected (stated condition or event)" or "in response to detection (stated condition or event)", depending on the context.

In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the elements is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.

In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in hardware plus software functional units.

Those skilled in the art will appreciate that all or part of the above described methods may be implemented by a computer program stored on a non-transitory computer readable storage medium; when executed, the program may perform the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM), among others.

The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims

1. The repeated picture detection method based on the deep learning is characterized by comprising the following steps of:

acquiring a picture to be detected;

2. The repeated picture detection method according to claim 1, wherein the method further comprises:

3. The method for detecting repeated pictures according to claim 1, wherein said performing similarity calculation between the first feature vector and each second feature vector in the vector search library corresponding to the history picture comprises:

4. The repeated picture detection method according to claim 3, wherein determining whether the picture to be detected is repeated with a history picture according to the similarity calculation result comprises:

5. The repeated picture detection method according to claim 4, wherein the method further comprises:

6. The repeated picture detection method according to claim 1, wherein the method further comprises:

7. The repeated picture detection method according to any one of claims 1 to 6, wherein the method further comprises:

8. A repeated picture detection device based on deep learning, characterized by comprising:

the acquisition module is used for acquiring the picture to be detected;

9. A computer device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor;

wherein the memory stores instructions executable by the at least one processor, the instructions being arranged to perform the method of any of the preceding claims 1 to 7.

10. A computer readable storage medium having stored thereon computer executable instructions for performing the method flow of any one of claims 1 to 7.