CN115131632B - Low-consumption image retrieval method and system for training sample depth optimization - Google Patents


Info

Publication number
CN115131632B
CN115131632B (application CN202211043682.5A)
Authority
CN
China
Prior art keywords
image
sample
samples
preset
similarity
Prior art date
Legal status: Active (assumed; not a legal conclusion)
Application number
CN202211043682.5A
Other languages
Chinese (zh)
Other versions
CN115131632A (en)
Inventor
吴昊
叶舟
Current Assignee
Beijing Normal University
Original Assignee
Beijing Normal University
Priority date
Filing date
Publication date
Application filed by Beijing Normal University
Priority to CN202211043682.5A
Publication of CN115131632A
Application granted
Publication of CN115131632B
Status: Active
Anticipated expiration


Classifications

    • G06V 10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/7753: Incorporation of unlabelled data, e.g. multiple instance learning [MIL]
    • G06F 16/583: Retrieval characterised by metadata automatically derived from the content
    • G06N 3/088: Non-supervised learning, e.g. competitive learning
    • G06V 10/462: Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V 10/761: Proximity, similarity or dissimilarity measures
    • G06V 10/764: Recognition or understanding using classification, e.g. of video objects
    • G06V 10/82: Recognition or understanding using neural networks

Abstract

The invention provides a low-consumption image retrieval method and system for training sample depth optimization, in the technical field of image retrieval. A preset number of positive and negative sample images are retrieved; each sample image is examined with the HC saliency detection method, and any sample whose salient region covers less than a first preset proportion of the whole image is deleted; the remaining samples are processed with dilated convolution kernels, pairwise similarities are computed, and samples whose similarity reaches a preset first threshold are retained; the resulting training samples are used to train an SVM model; a picture to be examined is fed into the retrieval decision model; when the score computed by the SVM model is greater than or equal to a second preset threshold, the picture is judged to belong to the target class and displayed as a retrieval result; when the score is below the second preset threshold, the picture is judged not to belong to the target class and is excluded. The method reduces the number of samples required and the consumption of human effort and computing resources.

Description

Low-consumption image retrieval method and system for training sample depth optimization
Technical Field
The invention relates to the technical field of image retrieval, in particular to a low-consumption image retrieval method and system for training sample depth optimization.
Background
In the era of digital media and big data, images play an important role in many fields such as daily life, education, psychology, scientific research and medicine, and have brought great progress to society. Meanwhile, with the explosive growth in the number of digital images, retrieving a target image from a massive image library has become a valuable and meaningful problem. Compared with deep-learning-based image retrieval, SVM-based retrieval can achieve high precision with far fewer training samples. However, low-quality training samples not only markedly reduce the practical value of an SVM model but also increase the consumption of training samples, computing resources and human effort. A low-consumption image retrieval method for training sample depth optimization is therefore urgently needed.
Disclosure of Invention
The invention aims to provide a low-consumption image retrieval method for training sample depth optimization, which reduces the number of samples required and the consumption of human effort and computing resources.
The embodiment of the invention is realized by the following steps:
In a first aspect, an embodiment of the present application provides a low-consumption image retrieval method for training sample depth optimization, which includes: retrieving a preset number of positive and negative sample images from a public image library on the Internet; examining each sample image with the HC saliency detection method and deleting any sample whose salient region covers less than a first preset proportion of the whole image; processing the sample images with convolution kernels of preset dilation rates, computing the similarity between positive samples and between negative samples, and retaining the samples whose similarity reaches a preset first threshold to obtain the training samples; training an SVM model on the training samples to obtain a retrieval decision model; feeding the picture to be examined into the retrieval decision model; when the score computed by the SVM model is greater than or equal to a second preset threshold, judging the picture to belong to the target class and displaying it as a retrieval result; and when the score is below the second preset threshold, judging the picture not to belong to the target class and excluding it.
In some embodiments of the present invention, the step of processing the sample images with convolution kernels of preset dilation rates, computing the similarity between positive samples and between negative samples, and retaining the samples whose similarity reaches a preset first threshold to obtain the training samples comprises: processing the samples with a convolution kernel of dilation rate 1, computing the pairwise similarities, and retaining the samples whose similarity reaches the preset first threshold to obtain first training samples; processing the first training samples with a convolution kernel of dilation rate 2, computing the pairwise similarities, and retaining those whose similarity reaches the first threshold to obtain second training samples; and processing the second training samples with a convolution kernel of dilation rate 3, computing the pairwise similarities, and retaining those whose similarity reaches the first threshold to obtain the final training samples.
In some embodiments of the present invention, the step of computing the similarity between positive samples and between negative samples comprises: deep auto-encoding the positive and negative sample images; computing the Euclidean distance between the encoded representations of different sample images; and identifying sample images whose Euclidean distance falls within a preset range as similar images, of which only one is selected and kept.
In some embodiments of the present invention, examining each sample image with the HC saliency detection method and deleting any sample whose salient region covers less than a first preset proportion of the whole image comprises: quantizing the color channels of the sample image and counting the distinct colors and the total number of pixels of each; sorting the colors by pixel count in descending order while recording the corresponding colors; finding the high-frequency colors whose pixels cover no less than a second preset proportion of the image and the remaining low-frequency colors covering no more than a third preset proportion; assigning the pixels of each low-frequency color to the high-frequency color nearest to it in LAB color space; computing a saliency value for each color and assigning it to every pixel of that color to generate a saliency map; and computing the proportion of pixels whose saliency exceeds a preset saliency threshold, deleting the sample image if this proportion is below the first preset proportion of the whole image, and otherwise continuing to the next step.
In some embodiments of the invention, after the saliency map is generated, it is further normalized and smoothed with linear spatial filtering.
In some embodiments of the invention, the color channels of the sample image are quantized with a color-space reduction algorithm.
In some embodiments of the invention, the first predetermined ratio is one third.
In a second aspect, an embodiment of the present application provides a low-consumption image retrieval system for training sample depth optimization, which includes an acquisition module for retrieving a preset number of positive and negative sample images from a public image library on the Internet; a saliency detection module for examining each sample image with the HC saliency detection method and deleting any sample whose salient region covers less than a first preset proportion of the whole image; a sample processing module for processing the sample images with convolution kernels of preset dilation rates, computing the similarity between positive samples and between negative samples, and retaining the samples whose similarity reaches a preset first threshold to obtain the training samples; a decision model module for training an SVM model on the training samples to obtain a retrieval decision model; and a judging module for feeding the picture to be examined into the retrieval decision model, judging the picture to belong to the target class and displaying it as a retrieval result when the score computed by the SVM model is greater than or equal to a second preset threshold, and judging it not to belong to the target class and excluding it when the score is below the second preset threshold.
In a third aspect, an embodiment of the present application provides an electronic device including at least one processor, at least one memory, and a data bus, wherein the processor and the memory communicate with each other through the data bus; the memory stores program instructions executable by the processor, and the processor calls these instructions to perform the low-consumption image retrieval method for training sample depth optimization.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements a low-consumption image retrieval method for training sample depth optimization.
Compared with the prior art, the embodiment of the invention has at least the following advantages or beneficial effects:
Saliency detection and multi-rate dilated convolution are used to deeply optimize the training sample set, yielding higher-quality training samples. Training and learning on these higher-quality samples reduces the number of samples required and the consumption of human effort and computing resources.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
FIG. 1 is a schematic flow chart of a low-consumption image retrieval method for training sample depth optimization according to the present invention;
FIG. 2 is a flow chart of similarity calculation between samples in the present invention;
FIG. 3 is a schematic flow chart of a method for detecting HC significance in the present invention;
FIG. 4 is a schematic structural diagram of a low-consumption image retrieval system for training sample depth optimization according to the present invention;
fig. 5 is a schematic structural diagram of an electronic device according to the present invention.
Icon: 1. an acquisition module; 2. a significance detection module; 3. a sample processing module; 4. a decision model module; 5. a judgment module; 6. a processor; 7. a memory; 8. a data bus.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not construed as indicating or implying relative importance.
It should be noted that, in this document, relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between them. Also, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may include other elements not expressly listed or inherent to it. Without further limitation, an element introduced by "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
In the description of the present application, it should be noted that the terms "upper", "lower", "inner", "outer", and the like indicate orientations or positional relationships based on orientations or positional relationships shown in the drawings or orientations or positional relationships conventionally found in use of products of the application, and are used only for convenience in describing the present application and for simplification of description, but do not indicate or imply that the referred devices or elements must have a specific orientation, be constructed in a specific orientation, and be operated, and thus should not be construed as limiting the present application.
In the description of the present application, it is also to be noted that, unless otherwise explicitly specified or limited, the terms "disposed" and "connected" are to be interpreted broadly, e.g., as fixedly connected, detachably connected, or integrally connected; as mechanically or electrically connected; as connected directly or indirectly through intervening media; or as internal communication between two elements. The specific meaning of these terms in the present application can be understood case by case by those of ordinary skill in the art.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments and features of the embodiments described below can be combined with one another without conflict.
Example 1
Referring to fig. 1, an embodiment of the present application provides a low-consumption image retrieval method for training sample depth optimization. In this design, saliency detection and multi-rate dilated convolution are used to deeply optimize the training sample set, yielding higher-quality training samples. Training and learning on these samples reduces the number of samples required and the consumption of human effort and computing resources.
S1: retrieving a preset number of positive and negative sample images from a public image library on the Internet;
For a particular retrieval category (for example, giraffe images), a number of positive and negative sample images (around 300 each) are retrieved from the image library.
S2: examining each sample image with the HC saliency detection method and deleting any sample whose salient region covers less than a first preset proportion of the whole image;
The purpose of saliency detection with the HC model is to highlight the salient object within a picture so that a computer can identify it more accurately.
S3: processing the sample images with convolution kernels of preset dilation rates, computing the similarity between positive samples and between negative samples, and retaining the samples whose similarity reaches a preset first threshold to obtain the training samples;
The advantage of dilated convolution is that, while still producing refined training samples, it enlarges the receptive field without the information loss of pooling, so the output of each convolution covers a larger context. Dilated convolution is well suited to problems that need global image information, or long-sequence information in speech and text.
S4: training an SVM (support vector machine) model on the training samples to obtain a retrieval decision model;
Compared with deep-learning-based image retrieval, an SVM-based method can achieve high-precision retrieval with far fewer training samples.
S5: feeding the picture to be examined into the retrieval decision model; when the score computed by the SVM model is greater than or equal to a second preset threshold, judging the picture to belong to the target class and displaying it as a retrieval result; and when the score is below the second preset threshold, judging the picture not to belong to the target class and excluding it.
Pictures that match are returned as retrieval results; any other picture is simply skipped and retrieval continues with the next one.
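The decision step S5 can be sketched as follows, assuming a linear SVM whose weights `w` and bias `b` were obtained in S4 (the feature vectors, weights, and threshold below are hypothetical toy values):

```python
import numpy as np

def retrieve(features, w, b, second_threshold):
    """Score each candidate image's feature vector with the SVM decision
    function w.x + b and keep the indices scoring at or above the
    second preset threshold (the others are excluded)."""
    scores = features @ w + b
    return [i for i, s in enumerate(scores) if s >= second_threshold]

# Three candidate pictures described by 2-D feature vectors.
features = np.array([[1.0, 0.2], [0.1, 0.9], [0.8, 0.8]])
w, b = np.array([1.0, -1.0]), 0.0
kept = retrieve(features, w, b, second_threshold=0.5)
```

Only index 0 scores at or above 0.5 here, so only that picture would be displayed as a retrieval result.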
In some embodiments of the present invention, the step of processing the sample images with convolution kernels of preset dilation rates, computing the similarity between positive samples and between negative samples, and retaining the samples whose similarity reaches a preset first threshold comprises: processing the samples with a convolution kernel of dilation rate 1, computing the pairwise similarities, and retaining the samples whose similarity reaches the first threshold to obtain first training samples; processing the first training samples with a convolution kernel of dilation rate 2, computing the pairwise similarities, and retaining those whose similarity reaches the first threshold to obtain second training samples; and processing the second training samples with a convolution kernel of dilation rate 3, computing the pairwise similarities, and retaining those whose similarity reaches the first threshold to obtain the final training samples.
Processing the samples with convolution kernels of several different dilation rates screens the pictures in multiple ways, yielding higher-quality training samples.
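The three-stage screening above can be sketched as a small cascade; `convolve` and `keep_similar` are hypothetical stand-ins for the dilated-convolution and similarity-filtering steps, and the toy numbers below are purely illustrative:

```python
def refine_cascade(samples, convolve, keep_similar, rates=(1, 2, 3)):
    """At each dilation rate, process every surviving sample, then keep
    only those that pass the similarity filter; survivors of rate 3
    become the final training samples."""
    for rate in rates:
        processed = [convolve(s, rate) for s in samples]
        samples = keep_similar(processed)
    return samples

# Toy stand-ins: "convolution" scales a number; the filter keeps positives.
survivors = refine_cascade(
    [3, -1, 5],
    convolve=lambda s, r: s * r,
    keep_similar=lambda xs: [x for x in xs if x > 0],
)
```

Each stage sees only the survivors of the previous one, so weak samples are discarded as early as possible.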
Referring to fig. 2, in some embodiments of the present invention, the step of calculating the similarity between the positive samples and the similarity between the negative samples respectively comprises:
S301: deep auto-encoding the sample images of the positive and negative samples;
Auto-encoding is an unsupervised neural-network technique; its lightweight data processing saves computing resources.
S302: computing the Euclidean distance between the encoded representations of different sample images;
The similarity is measured with the Euclidean distance.
S303: identifying sample images whose Euclidean distance falls within a preset range as similar images, and selecting and keeping only one of them.
Sample images with high similarity (i.e., Euclidean distance within the preset range) are identified, and a single representative of each group is retained.
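Assuming each image has already been reduced to an auto-encoder code vector, the retention step can be sketched as a greedy near-duplicate filter (names and the distance bound `eps` are illustrative):

```python
import numpy as np

def keep_representatives(codes, eps):
    """Greedily keep one sample per group of auto-encoder codes whose
    pairwise Euclidean distance is within `eps` (near-duplicates)."""
    kept = []
    for i, c in enumerate(codes):
        if all(np.linalg.norm(c - codes[j]) > eps for j in kept):
            kept.append(i)
    return kept

# Codes 0 and 1 are near-duplicates; code 2 is distinct.
codes = np.array([[0.0, 0.0], [0.05, 0.0], [3.0, 3.0]])
kept = keep_representatives(codes, eps=0.1)
```

Only the first member of each near-duplicate group survives, shrinking the training set without losing variety.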
Referring to fig. 3, in some embodiments of the present invention, examining a sample image with the HC saliency detection method and deleting it if its salient region covers less than the first preset proportion of the whole image comprises the following steps:
S201: quantizing the color channels of the sample image and counting the distinct colors and the total number of pixels of each;
Saliency detection mainly identifies high-frequency colors, so the number of distinct colors and the pixel count of each must be found first.
S202: sorting the colors by total pixel count in descending order while recording the corresponding colors;
Sorting lets the colors with the largest pixel counts be processed first, which also simplifies the saliency computation.
S203: finding the high-frequency colors whose pixels cover no less than a second preset proportion of the image, and the remaining low-frequency colors covering no more than a third preset proportion;
The high- and low-frequency colors are screened by preset coverage proportions: for example, high-frequency colors whose pixels together cover no less than 95% of the image, and the remaining low-frequency colors covering no more than 5%.
S204: assigning the pixels of each low-frequency color to the high-frequency color nearest to it in LAB color space;
Since colors that are close in LAB space differ least perceptually, the pixels of each low-frequency color are reassigned to the nearest high-frequency color, which also speeds up the subsequent computation as much as possible.
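A possible numpy rendering of this nearest-color reassignment (the LAB values below are made up for illustration):

```python
import numpy as np

def reassign_low_freq(low_lab, high_lab):
    """Map each low-frequency color to the index of the nearest
    high-frequency color in LAB space (Euclidean color distance)."""
    # dists[i, j] = distance from low-frequency color i to high-frequency color j
    dists = np.linalg.norm(low_lab[:, None, :] - high_lab[None, :, :], axis=2)
    return dists.argmin(axis=1)

high = np.array([[50.0, 10.0, 10.0], [80.0, 0.0, 0.0]])
low = np.array([[79.0, 1.0, 0.0]])
nearest = reassign_low_freq(low, high)
```

Here the single low-frequency color is absorbed by the second high-frequency color, so only the high-frequency palette remains for the saliency computation.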
S205: computing a saliency value for each color and assigning it to every pixel of that color to generate a saliency map; computing the proportion of pixels whose saliency exceeds a preset saliency threshold; deleting the sample image if this proportion is below the first preset proportion of the whole image, and otherwise continuing to the next step.
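In the HC method, a color's saliency is commonly computed as its contrast with every other color, weighted by how frequent those colors are; a simplified sketch (plain Euclidean distance on LAB-like vectors, toy values):

```python
import numpy as np

def color_saliency(colors, freqs):
    """HC-style global contrast: the saliency of color i is the
    frequency-weighted sum of its distances to all other colors."""
    # dists[i, j] = color distance between colors i and j
    dists = np.linalg.norm(colors[:, None, :] - colors[None, :, :], axis=2)
    return dists @ freqs

colors = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
freqs = np.array([0.5, 0.5])   # each color covers half of the pixels
sal = color_saliency(colors, freqs)
```

Each pixel then inherits the saliency of its (quantized) color, producing the saliency map whose above-threshold pixel proportion is checked in S205.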
In some embodiments of the invention, the step after generating the saliency map further comprises: and carrying out normalization processing and linear spatial filtering on the saliency map.
In some embodiments of the invention, the color channels of the sample image are quantized with a color-space reduction algorithm.
A three-channel 8-bit image has 256 x 256 x 256 possible colors, which makes histogram-based acceleration impractical. Each channel is therefore quantized to 12 levels, leaving 12 x 12 x 12 = 1728 possible colors and greatly increasing the computation speed.
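The 12-level channel quantization can be sketched as follows (integer binning is one reasonable choice; the exact mapping is not specified by the patent):

```python
import numpy as np

def quantize_channels(img, levels=12):
    """Reduce each 8-bit channel to `levels` bins, so a 3-channel image
    has at most levels**3 (12*12*12 = 1728) distinct colors."""
    return np.minimum(img.astype(np.int64) * levels // 256, levels - 1)

# One pixel: channel values 0, 128, 255 map to bins 0, 6, 11.
pixel = np.array([[[0, 128, 255]]], dtype=np.uint8)
q = quantize_channels(pixel)
```

Histograms over 1728 bins are cheap, which is what enables the fast per-color saliency computation above.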
In some embodiments of the invention, the first predetermined ratio is one third.
Example 2
Referring to fig. 4, the low-consumption image retrieval system for training sample depth optimization provided by the present invention includes an acquisition module 1 for retrieving a preset number of positive and negative sample images from a public image library on the Internet; a saliency detection module 2 for examining each sample image with the HC saliency detection method and deleting any sample whose salient region covers less than a first preset proportion of the whole image; a sample processing module 3 for processing the sample images with convolution kernels of preset dilation rates, computing the similarity between positive samples and between negative samples, and retaining the samples whose similarity reaches a preset first threshold to obtain the training samples; a decision model module 4 for training an SVM model on the training samples to obtain a retrieval decision model; and a judging module 5 for feeding the picture to be examined into the retrieval decision model, judging the picture to belong to the target class and displaying it as a retrieval result when the score computed by the SVM model is greater than or equal to a second preset threshold, and judging it not to belong to the target class and excluding it when the score is below the second preset threshold.
Example 3
Referring to fig. 5, an electronic device provided by the present invention includes at least one processor 6, at least one memory 7, and a data bus 8, wherein the processor 6 and the memory 7 communicate with each other through the data bus 8; the memory 7 stores program instructions executable by the processor 6, and the processor 6 calls these instructions to perform the low-consumption image retrieval method for training sample depth optimization. For example, to realize:
retrieving a preset number of positive sample images and negative sample images from an image library published on the Internet; detecting each sample image by using the HC saliency detection method, and deleting a sample image if the area of its salient region is lower than a first preset proportion of the whole image; processing the sample images by using convolution kernels with preset dilation (void) rates, calculating the similarity between positive samples and the similarity between negative samples respectively, and retaining the samples whose similarity reaches a preset first threshold to obtain training samples; training an SVM (support vector machine) model on the training samples to obtain a retrieval decision model; inputting the picture to be detected into the retrieval decision model; when the score calculated based on the SVM model is greater than or equal to a second preset threshold, judging the picture to be detected to be a target-category image, and displaying the target-category image as a retrieval result; and when the score calculated based on the SVM model is smaller than the second preset threshold, judging the picture to be detected to be a non-target-category image and excluding it.
Example 4
The present invention provides a computer-readable storage medium on which a computer program is stored; when executed by the processor 6, the program implements the low-consumption image retrieval method for training sample depth optimization, for example, to realize:
retrieving a preset number of positive sample images and negative sample images from an image library published on the Internet; detecting each sample image by using the HC saliency detection method, and deleting a sample image if the area of its salient region is lower than a first preset proportion of the whole image; processing the sample images by using convolution kernels with preset dilation (void) rates, calculating the similarity between positive samples and the similarity between negative samples respectively, and retaining the samples whose similarity reaches a preset first threshold to obtain training samples; training an SVM model on the training samples to obtain a retrieval decision model; inputting the picture to be detected into the retrieval decision model; when the score calculated based on the SVM model is greater than or equal to a second preset threshold, judging the picture to be detected to be a target-category image, and displaying the target-category image as a retrieval result; and when the score calculated based on the SVM model is smaller than the second preset threshold, judging the picture to be detected to be a non-target-category image and excluding it.
The memory 7 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and the like.
The processor 6 may be an integrated circuit chip having signal processing capabilities. The processor 6 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The above description covers only preferred embodiments of the present application and is not intended to limit it; those skilled in the art may make various modifications and changes to the present application. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall fall within its protection scope.
It will be evident to those skilled in the art that the application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Claims (10)

1. A low-consumption image retrieval method for training sample depth optimization, characterized by comprising the following steps:
retrieving a preset number of positive sample images and negative sample images from an image library published on the Internet;
detecting each sample image by using the HC saliency detection method, and deleting a sample image if the area of its salient region is lower than a first preset proportion of the whole image;
processing the sample images by using convolution kernels with preset dilation (void) rates, calculating the similarity between positive samples and the similarity between negative samples respectively, and retaining the samples whose similarity reaches a preset first threshold to obtain training samples;
training an SVM (support vector machine) model on the training samples to obtain a retrieval decision model;
inputting the picture to be detected into the retrieval decision model; when the score calculated based on the SVM model is greater than or equal to a second preset threshold, judging the picture to be detected to be a target-category image, and displaying the target-category image as a retrieval result;
and when the score calculated based on the SVM model is smaller than the second preset threshold, judging the picture to be detected to be a non-target-category image and excluding it.
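The final decision step of claim 1 reduces to comparing an SVM decision value with the second preset threshold. The sketch below uses a linear decision function w·x + b with made-up weights, bias, and threshold; it illustrates the thresholding logic only, not the trained model.

```python
# Minimal sketch of the SVM-based decision step from claim 1.
# The weight vector, bias, feature vector, and threshold are
# illustrative placeholders, not values from the patent.

def svm_score(weights, bias, features):
    # Decision value of a linear SVM: w . x + b
    return sum(w * x for w, x in zip(weights, features)) + bias

def classify(score, second_threshold):
    # Score >= threshold -> target-category image (shown as a result);
    # otherwise the picture is excluded as a non-target image.
    return "target" if score >= second_threshold else "non-target"

w, b = [0.5, -0.25, 1.0], 0.1
s = svm_score(w, b, [1.0, 2.0, 0.5])   # 0.5 - 0.5 + 0.5 + 0.1 = 0.6
```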
2. The low-consumption image retrieval method for training sample depth optimization according to claim 1, wherein the step of processing the sample images by using convolution kernels with preset dilation rates, calculating the similarity between positive samples and the similarity between negative samples respectively, and retaining the samples whose similarity reaches a preset first threshold to obtain training samples comprises:
processing the samples by using a convolution kernel with a dilation rate of 1, calculating the similarity between positive samples and the similarity between negative samples respectively, and retaining the samples whose similarity reaches the preset first threshold to obtain first training samples;
processing the first training samples by using a convolution kernel with a dilation rate of 2, calculating the similarity between positive samples and the similarity between negative samples respectively, and retaining the first training samples whose similarity reaches the preset first threshold to obtain second training samples;
and processing the second training samples by using a convolution kernel with a dilation rate of 3, calculating the similarity between positive samples and the similarity between negative samples respectively, and retaining the second training samples whose similarity reaches the preset first threshold to obtain the final training samples.
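The "void ratio" (porosity) in claim 2 is the dilation rate of a dilated (atrous) convolution: rate r places the kernel taps r samples apart, enlarging the receptive field without adding weights. The 1-D, valid-padding sketch below only demonstrates how rates 1, 2, and 3 space the taps; it is not the patent's 2-D filtering.

```python
# How the dilation ("void") rate spaces the taps of a convolution
# kernel, as in the rate-1/2/3 cascade of claim 2. 1-D, valid
# padding, purely illustrative.

def dilated_conv1d(signal, kernel, rate):
    # Effective kernel span: (len(kernel) - 1) * rate + 1
    span = (len(kernel) - 1) * rate + 1
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(k * signal[start + i * rate]
                       for i, k in enumerate(kernel)))
    return out

sig = [1, 2, 3, 4, 5, 6, 7]
k = [1, 0, -1]                      # simple difference-style kernel
r1 = dilated_conv1d(sig, k, 1)      # taps 1 apart
r2 = dilated_conv1d(sig, k, 2)      # taps 2 apart, wider receptive field
```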
3. The low-consumption image retrieval method for training sample depth optimization according to claim 2, wherein the step of calculating the similarity between positive samples and the similarity between negative samples respectively comprises:
performing deep self-encoding on the sample images of the positive samples and the negative samples respectively;
calculating the Euclidean distances between different sample images after self-encoding;
and identifying sample images whose Euclidean distance falls within a preset range as similar images, and selecting and retaining one of the similar images.
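Claim 3 keeps one representative from each group of samples whose encoded vectors lie within a preset Euclidean distance of each other. The sketch below assumes the deep self-encoding has already been done and works on hand-made stand-in vectors; the greedy keep-first strategy is one plausible reading of "selecting and retaining one of the similar images", not necessarily the patented one.

```python
import math

# Sketch of the similarity step of claim 3: samples whose self-encoded
# feature vectors lie within a preset Euclidean distance are treated as
# similar, and only one of each similar group is kept. The encodings
# are hand-made stand-ins for real autoencoder outputs.

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dedupe(encodings, max_dist):
    kept = []
    for vec in encodings:
        # Keep vec only if it is not near any already-kept sample.
        if all(euclidean(vec, k) > max_dist for k in kept):
            kept.append(vec)
    return kept

encs = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]]
survivors = dedupe(encs, max_dist=0.5)   # first two are near-duplicates
```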
4. The low-consumption image retrieval method for training sample depth optimization according to claim 1, wherein the step of detecting each sample image by using the HC saliency detection method, and deleting a sample image if the area of its salient region is lower than a first preset proportion of the whole image, comprises:
quantizing the color channels of the sample image, and determining the number of color types in the sample image and the total number of pixels corresponding to each;
sorting the color types by their total pixel counts from large to small, while recording the corresponding colors;
finding the high-frequency color types whose pixels cover no less than a second preset proportion of the image, and the remaining low-frequency color types whose pixels cover no more than a third preset proportion of the image;
reassigning the pixels of each low-frequency color to the high-frequency color closest to it in the LAB color space;
calculating a saliency value for each color and assigning it to each pixel in the image to generate a saliency map; calculating the proportion of pixels in the image whose saliency value is greater than a preset saliency threshold; and deleting the sample image if this proportion is lower than the first preset proportion of the whole image, otherwise continuing to the next step.
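In histogram-based contrast (HC) saliency, the saliency value of a color is its distance to every other quantized color weighted by that color's pixel frequency, so rare colors far from the dominant ones score high. The toy histogram below illustrates the per-color computation; the 3-vectors stand in for LAB values and the frequencies are invented.

```python
import math

# Toy version of the per-color saliency value used in HC saliency
# detection (claim 4): S(c) = sum_j f_j * D(c, c_j), the distance of
# color c to every color, weighted by pixel frequency. The colors and
# frequencies are illustrative, not from the patent.

def color_dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def saliency_values(colors, freqs):
    # freqs are pixel-count fractions that sum to 1.
    return [sum(f * color_dist(c, cj)
                for cj, f in zip(colors, freqs))
            for c in colors]

colors = [(0, 0, 0), (10, 0, 0), (0, 10, 0)]
freqs = [0.6, 0.3, 0.1]
sal = saliency_values(colors, freqs)   # the rarest, most distant color scores highest
```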
5. The low-consumption image retrieval method for training sample depth optimization according to claim 4, wherein the step of generating the saliency map further comprises:
performing normalization processing and linear spatial filtering on the saliency map.
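The post-processing of claim 5 can be sketched as min-max normalization to [0, 1] followed by a simple linear (box) filter. A 1-D list stands in for the 2-D saliency map, and the 3-tap mean filter is one example of a linear spatial filter, not the specific filter the patent may use.

```python
# Sketch of claim 5: normalize the saliency map to [0, 1], then apply
# a linear spatial filter (a 3-tap box filter here) to smooth it.
# 1-D for brevity; a real saliency map is 2-D.

def normalize(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]  # assumes hi > lo

def box_filter(values):
    # 3-tap mean filter with edge replication.
    padded = [values[0]] + values + [values[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3
            for i in range(len(values))]

smap = [2.0, 6.0, 4.0, 10.0]
smooth = box_filter(normalize(smap))
```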
6. The low-consumption image retrieval method for training sample depth optimization according to claim 4, wherein the step of quantizing the color channels of the sample image comprises:
quantizing the color channels of the sample image by using a color space reduction algorithm.
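A typical color-space reduction for HC-style saliency quantizes each 8-bit RGB channel from 256 levels down to a handful, collapsing the color histogram to a tractable size. The sketch below uses 12 levels per channel, the figure from the original HC paper; treat the level count, and this uniform scheme, as assumptions rather than the patent's algorithm.

```python
# Sketch of claim 6's color-space reduction: quantize each 8-bit RGB
# channel to a small number of levels. 12 levels per channel is an
# assumption borrowed from the original HC method.

def quantize_channel(value, levels=12):
    # Map 0..255 uniformly onto the bins 0..levels-1.
    return min(value * levels // 256, levels - 1)

def quantize_pixel(rgb, levels=12):
    return tuple(quantize_channel(c, levels) for c in rgb)

q = quantize_pixel((255, 128, 0))   # a coarse color bin per channel
```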
7. The low-consumption image retrieval method for training sample depth optimization according to claim 4, wherein the first preset proportion is one third.
8. A low-consumption image retrieval system for training sample depth optimization, comprising:
an acquisition module, configured to retrieve a preset number of positive sample images and negative sample images from an image library published on the Internet;
a saliency detection module, configured to detect each sample image by using the HC saliency detection method, and delete a sample image if the area of its salient region is lower than a first preset proportion of the whole image;
a sample processing module, configured to process the sample images by using convolution kernels with preset dilation (void) rates, calculate the similarity between positive samples and the similarity between negative samples respectively, and retain the samples whose similarity reaches a preset first threshold to obtain training samples;
a decision model module, configured to train an SVM model on the training samples to obtain a retrieval decision model;
and a judging module, configured to input the picture to be detected into the retrieval decision model, judge the picture to be detected to be a target-category image and display it as a retrieval result when the score calculated based on the SVM model is greater than or equal to a second preset threshold, and judge the picture to be detected to be a non-target-category image and exclude it when the score calculated based on the SVM model is smaller than the second preset threshold.
9. An electronic device comprising at least one processor, at least one memory, and a data bus; wherein: the processor and the memory complete mutual communication through the data bus; the memory stores program instructions executable by the processor, the processor calling the program instructions to perform the method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN202211043682.5A 2022-08-29 2022-08-29 Low-consumption image retrieval method and system for training sample depth optimization Active CN115131632B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211043682.5A CN115131632B (en) 2022-08-29 2022-08-29 Low-consumption image retrieval method and system for training sample depth optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211043682.5A CN115131632B (en) 2022-08-29 2022-08-29 Low-consumption image retrieval method and system for training sample depth optimization

Publications (2)

Publication Number Publication Date
CN115131632A CN115131632A (en) 2022-09-30
CN115131632B true CN115131632B (en) 2022-11-04

Family

ID=83387921

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211043682.5A Active CN115131632B (en) 2022-08-29 2022-08-29 Low-consumption image retrieval method and system for training sample depth optimization

Country Status (1)

Country Link
CN (1) CN115131632B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110832596A (en) * 2017-10-16 2020-02-21 因美纳有限公司 Deep convolutional neural network training method based on deep learning
CN111915589A (en) * 2020-07-31 2020-11-10 天津大学 Stereo image quality evaluation method based on hole convolution

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11720790B2 (en) * 2019-05-22 2023-08-08 Electronics And Telecommunications Research Institute Method of training image deep learning model and device thereof

Also Published As

Publication number Publication date
CN115131632A (en) 2022-09-30


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant