CN112200204A - Image feature characterization method and device

Image feature characterization method and device

Info

Publication number
CN112200204A
Authority
CN
China
Prior art keywords
image
feature map
pixel points
pixel
feature
Prior art date
Legal status
Granted
Application number
CN202011419950.XA
Other languages
Chinese (zh)
Other versions
CN112200204B (en)
Inventor
Inventor not disclosed
Current Assignee
Shanghai Mido Technology Co ltd
Original Assignee
Shanghai Mdata Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Mdata Information Technology Co ltd
Priority to CN202011419950.XA
Publication of CN112200204A
Application granted
Publication of CN112200204B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V10/507 Summing image-intensity values; Histogram projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention aims to provide an image feature characterization method and device. The three methods HASH, HIST and CNN are fused, and the 3 resulting characterizations are combined into the characterization vector of an image, so that more complete and richer image features can be obtained. The complementary fusion of the HASH, HIST and CNN characterizations improves the accuracy of image search, makes it possible to retrieve the original image from a blurred or small picture, and is also effective when searching for the original image from a partial picture. By acquiring characterizations of different dimensions of the image, the problem that blurred and small pictures cannot retrieve the original image is solved.

Description

Image feature characterization method and device
Technical Field
The invention relates to the field of computers, in particular to an image feature characterization method and device.
Background
Image characterization means representing an image with a vector of some dimension; the similarity between two pictures can then be computed from their vectors in order to judge how alike they are.
Search by image: the user inputs a picture, and the system displays original pictures ranked in descending order of their similarity to the input picture. When the user inputs a blurred or small picture, the system often fails to find the original picture to display.
Common picture characterization methods are as follows:
1) Traditional HASH characterization: a hash algorithm represents an image with a fingerprint-like binary vector, and similarity is judged via the Hamming distance, i.e., by checking whether the characters at corresponding positions are the same; the more positions that match, the more similar the images. Disadvantages: the representation is single-mode, every element is 0 or 1, similarity measures such as the Euclidean distance cannot be used, only the structural information of the image is considered, it is effective for gray-scale images, and it is not very sensitive to color images.
2) HIST characterization: a histogram algorithm represents an image by its color distribution, and similarity is computed with the Bhattacharyya coefficient, i.e., by summing the square roots of the products of the elements at corresponding positions. Disadvantages: the characterization dimensionality is large, only the color variation of the image is considered, and it is insensitive to the structural information of the image.
3) CNN characterization: a deep-learning convolution algorithm extracts features through a series of multi-layer convolution operations and uses a 512-dimensional vector as the image characterization; similarity is computed by multiplying the elements at corresponding positions and summing. Disadvantage: when a small or blurred picture is characterized, the characterization is often not representative.
None of the above characterization methods can, on its own, solve the problem of retrieving the original image from a blurred or small picture. For reference, the three similarity computations are sketched below.
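
A minimal Python sketch of the three similarity measures described above; the function names, and the assumption that the inputs are numpy vectors, are illustrative only:

import numpy as np

def hamming_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """HASH: fraction of positions whose bits match in two 64-element 0/1 vectors."""
    return float(np.mean(a == b))

def bhattacharyya_coefficient(p: np.ndarray, q: np.ndarray) -> float:
    """HIST: sum of square roots of products of corresponding histogram bins."""
    return float(np.sum(np.sqrt(p * q)))

def dot_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """CNN: multiply corresponding elements and sum."""
    return float(np.dot(u, v))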
Disclosure of Invention
The object of the invention is to provide an image feature characterization method and device.
According to an aspect of the present invention, there is provided an image feature characterization method, the method comprising:
calculating a HASH characterization value, a HIST characterization value and a CNN characterization value corresponding to an image to be searched;
calculating a HASH characterization value, a HIST characterization value and a CNN characterization value corresponding to an original image in a gallery;
splicing the HASH characterization value, the HIST characterization value and the CNN characterization value of the image to be searched to obtain a fusion characterization value of the image to be searched;
and splicing the HASH characterization value, the HIST characterization value and the CNN characterization value of the original image to obtain a fusion characterization value of the original image.
Further, in the above method, after the HASH characterization value, the HIST characterization value and the CNN characterization value of the original image are spliced to obtain the fusion characterization value of the original image, the method further includes:
displaying the original images similar to the image to be searched based on the fusion characterization value of the image to be searched and the fusion characterization values of the original images.
Further, in the above method, calculating the CNN characterization value corresponding to the image to be searched includes:
step S1111, inputting the image to be searched, and using a convolution kernel of 3 × 3 pixels to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the image, so as to obtain a first feature map of 224 × 224 × 64;
step S1112, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the first feature map, so as to obtain a second feature map of 224 × 224 × 64;
step S1113, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the second feature map, so as to obtain a third feature map of 112 × 112 × 128;
step S1114, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the third feature map, so as to obtain a fourth feature map of 112 × 112 × 128;
step S1115, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the fourth feature map, so as to obtain a fifth feature map of 56 × 56 × 256;
step S1116, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the fifth feature map, so as to obtain a sixth feature map of 56 × 56 × 256;
step S1117, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the sixth feature map, so as to obtain a seventh feature map of 56 × 56 × 256;
step S1118, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the seventh feature map, so as to obtain an eighth feature map of 28 × 28 × 512;
step S1119, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the eighth feature map, so as to obtain a ninth feature map of 28 × 28 × 512;
step S11110, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the ninth feature map, so as to obtain a tenth feature map of 28 × 28 × 512;
step S11111, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the tenth feature map, so as to obtain an eleventh feature map of 14 × 14 × 512;
step S11112, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the eleventh feature map, so as to obtain a twelfth feature map of 7 × 7 × 512;
step S11113, averaging the twelfth feature map of 7 × 7 × 512 over its first two dimensions (7 × 7) to obtain 512 feature values;
step S11114, calculating the standard deviation and variance of the 512 feature values, and performing a normalization operation on the 512 feature values based on the calculated standard deviation and variance, so as to obtain the CNN characterization value corresponding to the image to be searched.
Further, in the above method, averaging the twelfth feature map of 7 × 7 × 512 over its first two dimensions (7 × 7) to fuse the local information of the image and obtain the 512 feature values includes:
averaging over the first dimension (7) of the 7 × 7 × 512 twelfth feature map to obtain an intermediate feature value of 1 × 7 × 512;
and averaging over the second dimension (7) of the 1 × 7 × 512 intermediate feature value to obtain the 512 feature values.
Further, in the above method, calculating the CNN characterization value corresponding to the original image in the gallery includes:
step S2111, inputting the original image, and using a convolution kernel of 3 × 3 pixels to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the image, so as to obtain a first feature map of 224 × 224 × 64;
step S2112, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the first feature map, so as to obtain a second feature map of 224 × 224 × 64;
step S2113, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the second feature map, so as to obtain a third feature map of 112 × 112 × 128;
step S2114, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the third feature map, so as to obtain a fourth feature map of 112 × 112 × 128;
step S2115, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the fourth feature map, so as to obtain a fifth feature map of 56 × 56 × 256;
step S2116, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the fifth feature map, so as to obtain a sixth feature map of 56 × 56 × 256;
step S2117, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the sixth feature map, so as to obtain a seventh feature map of 56 × 56 × 256;
step S2118, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the seventh feature map, so as to obtain an eighth feature map of 28 × 28 × 512;
step S2119, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the eighth feature map, so as to obtain a ninth feature map of 28 × 28 × 512;
step S21110, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the ninth feature map, so as to obtain a tenth feature map of 28 × 28 × 512;
step S21111, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the tenth feature map, so as to obtain an eleventh feature map of 14 × 14 × 512;
step S21112, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the eleventh feature map, so as to obtain a twelfth feature map of 7 × 7 × 512;
step S21113, averaging the twelfth feature map of 7 × 7 × 512 over its first two dimensions (7 × 7) to obtain 512 feature values;
step S21114, calculating the standard deviation and variance of the 512 feature values, and performing a normalization operation on the 512 feature values based on the calculated standard deviation and variance, so as to obtain the CNN characterization value corresponding to the original image.
Further, in the above method, averaging the twelfth feature map of the original image of 7 × 7 × 512 over its first two dimensions (7 × 7) to fuse the local information of the image and obtain the 512 feature values includes:
averaging over the first dimension (7) of the 7 × 7 × 512 twelfth feature map to obtain an intermediate feature value of 1 × 7 × 512;
and averaging over the second dimension (7) of the 1 × 7 × 512 intermediate feature value to obtain the 512 feature values.
Further, in the above method, calculating the HASH characterization value corresponding to the image to be searched includes:
step S1121, scaling the image to be searched to a first image of 8 × 8 pixels;
step S1122, converting the first image into a first gray-scale image;
step S1123, calculating the mean pixel value over the pixels of the first gray-scale image;
step S1124, calculating the hash value of each pixel of the first gray-scale image, setting the corresponding position in the binary feature map of the first gray-scale image to 1 if the pixel's hash value is greater than or equal to the mean pixel value, and to 0 if it is less than the mean pixel value, so as to obtain a 64-bit binary first feature map;
step S1125, performing a normalization operation on the 64-bit binary first feature map to obtain the HASH characterization value corresponding to the image to be searched.
Further, in the above method, calculating the HASH characterization value corresponding to the original image includes:
step S2121, scaling the original image to a second image of 8 × 8 pixels;
step S2122, converting the second image into a second gray-scale image;
step S2123, calculating the mean pixel value over the pixels of the second gray-scale image;
step S2124, calculating the hash value of each pixel of the second gray-scale image, setting the corresponding position in the binary feature map of the second gray-scale image to 1 if the pixel's hash value is greater than or equal to the mean pixel value, and to 0 if it is less than the mean pixel value, so as to obtain a 64-bit binary second feature map;
step S2125, performing a normalization operation on the 64-bit binary second feature map to obtain the HASH characterization value corresponding to the original image.
Further, in the above method, obtaining the HIST characterization value corresponding to the image to be searched includes:
step S1131, scaling the image to be searched to a first color image of 224 × 224 × 3, where the 3 in 224 × 224 × 3 denotes 3 channels: a red channel, a green channel and a blue channel;
step S1132, counting, for each of the red, green and blue channels of the first color image, the number of pixels taking each pixel value from 0 to 255; over the 3 channels there are 256 × 3 = 768 pixel-value categories, yielding the pixel counts of the 768 categories as a set of 768 values;
step S1133, performing a normalization operation on the set of 768 values to obtain the HIST characterization value corresponding to the image to be searched.
Further, in the above method, obtaining the HIST characterization value corresponding to the original image includes:
step S2131, scaling the original image to a second color image of 224 × 224 × 3, where the 3 in 224 × 224 × 3 denotes 3 channels: a red channel, a green channel and a blue channel;
step S2132, counting, for each of the red, green and blue channels of the second color image, the number of pixels taking each pixel value from 0 to 255; over the 3 channels there are 768 pixel-value categories, yielding the pixel counts of the 768 categories as a set of 768 values;
step S2133, performing a normalization operation on the set of 768 values to obtain the HIST characterization value corresponding to the original image.
Further, in the above method, splicing the HASH characterization value, the HIST characterization value and the CNN characterization value of the image to be searched to obtain the fusion characterization value of the image to be searched includes:
weighting the HASH characterization value, the HIST characterization value and the CNN characterization value of the image to be searched respectively;
and serially splicing the weighted CNN, HASH and HIST characterization values of the image to be searched, in that order, to obtain the fusion characterization value of the image to be searched.
Further, in the above method, splicing the HASH characterization value, the HIST characterization value and the CNN characterization value of the original image to obtain the fusion characterization value of the original image includes:
weighting the HASH characterization value, the HIST characterization value and the CNN characterization value of the original image respectively;
and serially splicing the weighted CNN, HASH and HIST characterization values of the original image, in that order, to obtain the fusion characterization value of the original image.
According to another aspect of the present invention, there is also provided a computer readable medium having computer readable instructions stored thereon, the computer readable instructions being executable by a processor to implement the method of any one of the above.
According to another aspect of the present invention, there is also provided an apparatus for information processing at a network device, the apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to perform any of the methods described above.
Compared with the prior art, the method of the invention fuses the three methods HASH, HIST and CNN, combining the 3 characterizations into the characterization vector of the image, so that more complete and richer image features can be obtained. The complementary fusion of the HASH, HIST and CNN characterizations improves the accuracy of image search, makes it possible to retrieve the original image from a blurred or small picture, and is also effective when searching for the original image from a partial picture. By acquiring characterizations of different dimensions of the image, the problem that blurred and small pictures cannot retrieve the original image is solved.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments made with reference to the following drawings:
fig. 1 is a schematic diagram illustrating an image feature characterization method according to an embodiment of the present invention.
The same or similar reference numbers in the drawings identify the same or similar elements.
Detailed Description
The present invention is described in further detail below with reference to the attached drawing figures.
In a typical configuration of the present application, the terminal, the device serving the network, and the trusted party each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media) such as modulated data signals and carrier waves.
As shown in fig. 1, the present invention provides an image feature characterization method, the method comprising:
step S1, calculating a HASH characterization value, a HIST characterization value and a CNN characterization value corresponding to the image to be searched, and calculating a HASH characterization value, a HIST characterization value and a CNN characterization value corresponding to an original image in the gallery.
Here, the HASH characterization value may be a vector of 64 elements, each equal to 0 or 1;
the HIST characterization value may be a vector of 768 elements, each a decimal between 0 and 1;
the CNN characterization value may be a vector of 512 elements, each a decimal between 0 and 1;
step S2, splicing the HASH characterization value, the HIST characterization value and the CNN characterization value of the image to be searched to obtain the fusion characterization value of the image to be searched; and splicing the HASH characterization value, the HIST characterization value and the CNN characterization value of the original image to obtain the fusion characterization value of the original image.
Before comparing images for similarity, the acquired images (the image to be searched and the original images in the gallery) must each be processed to obtain their respective fusion characterization values; the similarity comparison is then performed on those fusion characterization values.
There may be many original images in the gallery; for each one, the corresponding HASH, HIST and CNN characterization values are spliced to obtain that original image's fusion characterization value.
Considering that the characterization of an image should simultaneously account for its texture, structure, color, semantics and other information, the invention fuses the three methods HASH, HIST and CNN, combining the 3 characterizations into the characterization vector of the image, so that more complete and richer image features can be obtained. The complementary fusion of the HASH, HIST and CNN characterizations improves the accuracy of image search, makes it possible to retrieve the original image from a blurred or small picture, and is also effective when searching for the original image from a partial picture. By acquiring characterizations of different dimensions of the image, the problem that blurred and small pictures cannot retrieve the original image is solved.
In an embodiment of the image feature characterization method, after step S2 splices the HASH, HIST and CNN characterization values of the image to be searched into the fusion characterization value of the image to be searched, and splices the HASH, HIST and CNN characterization values of the original image into the fusion characterization value of the original image, the method further includes:
step S3, displaying the original images similar to the image to be searched based on the fusion characterization value of the image to be searched and the fusion characterization values of the original images.
In the invention, displaying the original images similar to the image to be searched based on these fusion characterization values achieves the effect of finding the original image corresponding to a blurred or small picture (the image to be searched).
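
A minimal Python sketch of this retrieval step; fuse_characterizations() (defined further below), the gallery dictionary and the use of a plain dot product as the similarity are illustrative assumptions, not the patent's prescribed implementation:

import numpy as np

def search(query_vec: np.ndarray, gallery: dict, top_k: int = 5):
    """Rank gallery originals by dot-product similarity of fused characterization values."""
    scores = {name: float(np.dot(query_vec, vec)) for name, vec in gallery.items()}
    # The highest-scoring originals are the ones shown to the user first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]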
In an embodiment of the image feature characterization method of the present invention, in step S1, calculating the CNN characterization value corresponding to the image to be searched includes:
step S1111, inputting the image to be searched, and using a convolution kernel of 3 × 3 pixels (size) to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the image, so as to obtain a first feature map (feature map 1) of 224 × 224 × 64 (size);
step S1112, using the 3 × 3 (size) convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the first feature map, so as to obtain a second feature map (feature map 2) of 224 × 224 × 64 (size);
here, the operation producing the 224 × 224 × 64 feature maps is looped 2 times;
step S1113, using the 3 × 3 (size) convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the second feature map (feature map 2), so as to obtain a third feature map (feature map 3) of 112 × 112 × 128 (size);
step S1114, using the 3 × 3 (size) convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the third feature map (feature map 3), so as to obtain a fourth feature map (feature map 4) of 112 × 112 × 128 (size);
here, the operation producing the 112 × 112 × 128 feature maps is looped 2 times;
step S1115, using the 3 × 3 (size) convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the fourth feature map (feature map 4), so as to obtain a fifth feature map (feature map 5) of 56 × 56 × 256 (size);
step S1116, using the 3 × 3 (size) convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the fifth feature map (feature map 5), so as to obtain a sixth feature map (feature map 6) of 56 × 56 × 256 (size);
step S1117, using the 3 × 3 (size) convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the sixth feature map (feature map 6), so as to obtain a seventh feature map (feature map 7) of 56 × 56 × 256 (size);
here, the operation producing the 56 × 56 × 256 feature maps is looped 3 times;
step S1118, using the 3 × 3 (size) convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the seventh feature map (feature map 7), so as to obtain an eighth feature map (feature map 8) of 28 × 28 × 512 (size);
step S1119, using the 3 × 3 (size) convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the eighth feature map (feature map 8), so as to obtain a ninth feature map (feature map 9) of 28 × 28 × 512 (size);
step S11110, using the 3 × 3 (size) convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the ninth feature map (feature map 9), so as to obtain a tenth feature map (feature map 10) of 28 × 28 × 512 (size);
here, the operation producing the 28 × 28 × 512 feature maps is looped 3 times;
step S11111, using the 3 × 3 (size) convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the tenth feature map (feature map 10), so as to obtain an eleventh feature map (feature map 11) of 14 × 14 × 512 (size);
here, the operation producing the 14 × 14 × 512 feature map is looped 1 time;
step S11112, using the 3 × 3 (size) convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the eleventh feature map (feature map 11), so as to obtain a twelfth feature map (feature map 12) of 7 × 7 × 512 (size);
here, the operation producing the 7 × 7 × 512 feature map is looped 1 time;
step S11113, averaging the twelfth feature map (feature map 12) of 7 × 7 × 512 over its first two dimensions (7 × 7) to fuse the local information of the image, so as to obtain 512 feature values;
step S11114, calculating the standard deviation and variance of the 512 feature values, and performing a normalization operation on the 512 feature values based on the calculated standard deviation and variance, so as to obtain the CNN characterization value corresponding to the image to be searched; the CNN characterization value contains 512 elements, for example [0.04, 0.41, 0.01, ..., 0.78].
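
The feature-map sizes above (224 × 224 × 64 down to 7 × 7 × 512) trace the convolutional stack of a VGG-16-style network, although the patent does not name its network. As an assumption, the sketch below uses torchvision's pretrained VGG-16 as a stand-in (its pooling occurs only at stage boundaries rather than at every step), and reads the normalization based on standard deviation and variance as z-score scaling:

import torch
from torchvision import models, transforms
from PIL import Image

# Simplified preprocessing: resize to 224 x 224 and convert to a [0, 1] tensor.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def cnn_characterization(path: str) -> torch.Tensor:
    # VGG-16 convolutional stack: 224 x 224 x 3 input -> 512 x 7 x 7 feature map.
    backbone = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features.eval()
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        fmap = backbone(x)                  # shape: 1 x 512 x 7 x 7
    vec = fmap.mean(dim=(2, 3)).squeeze(0)  # average the two 7 x 7 spatial dims -> 512 values
    return (vec - vec.mean()) / vec.std()   # assumed z-score normalization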
In an embodiment of the image feature characterization method of the present invention, in step S11113, averaging the twelfth feature map (feature map 12) of 7 × 7 × 512 over its first two dimensions (7 × 7) to fuse the local information of the image and obtain the 512 feature values includes:
first, averaging over the first dimension (7) of the 7 × 7 × 512 map to obtain an intermediate feature value of 1 × 7 × 512;
then, averaging over the second dimension (7) of the 1 × 7 × 512 intermediate feature value to obtain a 1 × 512 value; since 1 × 512 = 512, this yields the 512 feature values.
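
A minimal numpy sketch of this two-step averaging, using a random stand-in for feature map 12:

import numpy as np

feature_map_12 = np.random.rand(7, 7, 512)           # hypothetical 7 x 7 x 512 feature map
step_1 = feature_map_12.mean(axis=0, keepdims=True)  # average first dimension -> 1 x 7 x 512
step_2 = step_1.mean(axis=1)                         # average second dimension -> 1 x 512
assert step_2.shape == (1, 512)                      # 1 x 512 = 512 feature values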
In an embodiment of the image feature characterization method of the present invention, calculating the CNN characterization value corresponding to the original image in the gallery includes:
step S2111, inputting the original image, and using a convolution kernel of 3 × 3 pixels to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the image, so as to obtain a first feature map of 224 × 224 × 64;
step S2112, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the first feature map, so as to obtain a second feature map of 224 × 224 × 64;
step S2113, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the second feature map, so as to obtain a third feature map of 112 × 112 × 128;
step S2114, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the third feature map, so as to obtain a fourth feature map of 112 × 112 × 128;
step S2115, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the fourth feature map, so as to obtain a fifth feature map of 56 × 56 × 256;
step S2116, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the fifth feature map, so as to obtain a sixth feature map of 56 × 56 × 256;
step S2117, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the sixth feature map, so as to obtain a seventh feature map of 56 × 56 × 256;
step S2118, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the seventh feature map, so as to obtain an eighth feature map of 28 × 28 × 512;
step S2119, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the eighth feature map, so as to obtain a ninth feature map of 28 × 28 × 512;
step S21110, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the ninth feature map, so as to obtain a tenth feature map of 28 × 28 × 512;
step S21111, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the tenth feature map, so as to obtain an eleventh feature map of 14 × 14 × 512;
step S21112, using the 3 × 3 convolution kernel to sequentially perform a convolution operation, a pooling operation and activation by a nonlinear function on the eleventh feature map, so as to obtain a twelfth feature map of 7 × 7 × 512;
step S21113, averaging the twelfth feature map of 7 × 7 × 512 over its first two dimensions (7 × 7) to obtain 512 feature values;
step S21114, calculating the standard deviation and variance of the 512 feature values, and performing a normalization operation on the 512 feature values based on the calculated standard deviation and variance, so as to obtain the CNN characterization value corresponding to the original image.
Here, the CNN characterization values of the image to be searched and of the original image are each obtained according to the above steps.
In an embodiment of the image feature characterization method of the present invention, averaging the twelfth feature map of the original image of 7 × 7 × 512 over its first two dimensions (7 × 7) to fuse the local information of the image and obtain the 512 feature values includes:
averaging over the first dimension (7) of the 7 × 7 × 512 twelfth feature map to obtain an intermediate feature value of 1 × 7 × 512;
and averaging over the second dimension (7) of the 1 × 7 × 512 intermediate feature value to obtain the 512 feature values.
In an embodiment of the image feature characterization method of the present invention, in step S1, obtaining the HASH characterization value corresponding to the image to be searched includes:
step S1121, scaling the image to be searched to a first image of 8 × 8 pixels (size), so that the first image has 64 pixels;
step S1122, converting the first image into a first gray-scale image;
step S1123, calculating the mean pixel value over the pixels of the first gray-scale image;
step S1124, calculating the hash value of each pixel of the first gray-scale image, setting the corresponding position in the binary feature map of the first gray-scale image to 1 if the pixel's hash value is greater than or equal to the mean pixel value, and to 0 if it is less than the mean pixel value, so as to obtain a 64-bit binary first feature map;
here, the 64-bit binary feature map obtained is a set of 64 values, each 0 or 1;
step S1125, performing a normalization operation on the 64-bit binary first feature map to obtain the HASH characterization value corresponding to the image to be searched; the HASH characterization value contains 64 elements, for example [1, 0, 1, ..., 0].
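
A minimal Python sketch of steps S1121 to S1125 using PIL and numpy; treating each pixel's "hash value" as its gray-scale intensity is an assumption, since the patent does not define that function further:

import numpy as np
from PIL import Image

def hash_characterization(path: str) -> np.ndarray:
    # Scale to 8 x 8, convert to gray-scale, threshold against the mean pixel value.
    gray = np.asarray(Image.open(path).convert("L").resize((8, 8)), dtype=np.float64)
    bits = (gray >= gray.mean()).astype(np.float64)  # 1 where >= mean, else 0
    return bits.ravel()  # 64-bit binary feature map; already within [0, 1]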
Calculating the HASH characterization value corresponding to the original image includes:
step S2121, scaling the original image to a second image of 8 × 8 pixels;
step S2122, converting the second image into a second gray-scale image;
step S2123, calculating the mean pixel value over the pixels of the second gray-scale image;
step S2124, calculating the hash value of each pixel of the second gray-scale image, setting the corresponding position in the binary feature map of the second gray-scale image to 1 if the pixel's hash value is greater than or equal to the mean pixel value, and to 0 if it is less than the mean pixel value, so as to obtain a 64-bit binary second feature map;
step S2125, performing a normalization operation on the 64-bit binary second feature map to obtain the HASH characterization value corresponding to the original image.
Here, the HASH characterization values of the image to be searched and of the original image are each obtained according to the above steps.
In an embodiment of the image feature characterization method of the present invention, in step S1, obtaining the HIST characterization value corresponding to the image to be searched includes:
step S1131, scaling the image to be searched to a first color image of 224 × 224 × 3 (size), where the 3 in 224 × 224 × 3 denotes 3 channels: R (red), G (green) and B (blue);
step S1132, counting, for each of the R (red), G (green) and B (blue) channels of the first color image, the number of pixels taking each pixel value from 0 to 255 (i.e., the number of pixels in each channel sharing the same pixel value); over the 3 channels there are 256 + 256 + 256 = 768 pixel-value categories, yielding the pixel counts of the 768 categories as a set of 768 values;
step S1133, performing a normalization operation on the set of 768 values to obtain the HIST characterization value corresponding to the image to be searched; the HIST characterization value contains 768 elements, for example [0.01, 0.41, 0.01, ..., 0.88].
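
A minimal Python sketch of steps S1131 to S1133; normalizing the counts so they sum to 1 is an assumption, since the exact normalization operation is not specified:

import numpy as np
from PIL import Image

def hist_characterization(path: str) -> np.ndarray:
    rgb = np.asarray(Image.open(path).convert("RGB").resize((224, 224)))
    counts = np.concatenate(
        [np.bincount(rgb[..., c].ravel(), minlength=256) for c in range(3)]
    )  # 256 bins per channel x 3 channels = 768 category counts
    return counts / counts.sum()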
Obtaining the HIST characterization value corresponding to the original image includes:
step S2131, scaling the original image to a second color image of 224 × 224 × 3, where the 3 in 224 × 224 × 3 denotes 3 channels: a red channel, a green channel and a blue channel;
step S2132, counting, for each of the red, green and blue channels of the second color image, the number of pixels taking each pixel value from 0 to 255; over the 3 channels there are 768 pixel-value categories, yielding the pixel counts of the 768 categories as a set of 768 values;
step S2133, performing a normalization operation on the set of 768 values to obtain the HIST characterization value corresponding to the original image.
Here, the HIST characterization values of the image to be searched and of the original image are each obtained according to the above steps.
In an embodiment of the image feature characterization method of the present invention, in step S2, splicing the HASH characterization value, the HIST characterization value and the CNN characterization value of the image to be searched to obtain the fusion characterization value of the image to be searched includes:
weighting the HASH characterization value, the HIST characterization value and the CNN characterization value of the image to be searched respectively;
and serially splicing the weighted CNN, HASH and HIST characterization values of the image to be searched, in that order, to obtain the fusion characterization value of the image to be searched.
Splicing the HASH characterization value, the HIST characterization value and the CNN characterization value of the original image to obtain the fusion characterization value of the original image includes:
weighting the HASH characterization value, the HIST characterization value and the CNN characterization value of the original image respectively;
and serially splicing the weighted CNN, HASH and HIST characterization values of the original image, in that order, to obtain the fusion characterization value of the original image.
Here, the weighting may be, for example: 0.5 × CNN, 0.3 × HASH, 0.2 × HIST, where × denotes multiplication, i.e., the weighting;
the splicing may then be: 0.5 × CNN + 0.3 × HASH + 0.2 × HIST, where + denotes serial concatenation, i.e., the 3 characterization values are spliced in the order CNN, HASH, HIST.
For example, if 0.5 × CNN is: [0.04, 0.41, 0.01, ..., 0.78], containing 512 elements;
0.3 × HASH is: [1, 0, 1, ..., 0], containing 64 elements;
and 0.2 × HIST is: [0.01, 0.41, 0.01, ..., 0.88], containing 768 elements;
then serial splicing yields the fusion characterization value of the image: [0.04, 0.41, 0.01, ..., 0.78, 1, 0, 1, ..., 0, 0.01, 0.41, 0.01, ..., 0.88], containing 512 + 64 + 768 = 1344 elements.
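
A minimal Python sketch of the weighted serial splicing, using the example weights from the text; the function name is illustrative:

import numpy as np

def fuse_characterizations(cnn: np.ndarray, hsh: np.ndarray, hist: np.ndarray) -> np.ndarray:
    """Concatenate in CNN, HASH, HIST order: 512 + 64 + 768 = 1344 elements."""
    return np.concatenate([0.5 * cnn, 0.3 * hsh, 0.2 * hist])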
According to another aspect of the present invention, there is also provided a computer readable medium having computer readable instructions stored thereon, the computer readable instructions being executable by a processor to implement the method of any one of the above.
According to another aspect of the present invention, there is also provided an apparatus for information processing at a network device, the apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to perform any of the methods described above.
For details of embodiments of each device and storage medium of the present invention, reference may be made to corresponding parts of each method embodiment, and details are not described herein again.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
It should be noted that the present invention may be implemented in software and/or in a combination of software and hardware, for example, as an Application Specific Integrated Circuit (ASIC), a general purpose computer or any other similar hardware device. In one embodiment, the software program of the present invention may be executed by a processor to implement the steps or functions described above. Also, the software programs (including associated data structures) of the present invention can be stored in a computer readable recording medium, such as RAM memory, magnetic or optical drive or diskette and the like. Further, some of the steps or functions of the present invention may be implemented in hardware, for example, as circuitry that cooperates with the processor to perform various steps or functions.
In addition, some of the present invention can be applied as a computer program product, such as computer program instructions, which when executed by a computer, can invoke or provide the method and/or technical solution according to the present invention through the operation of the computer. Program instructions which invoke the methods of the present invention may be stored on a fixed or removable recording medium and/or transmitted via a data stream on a broadcast or other signal-bearing medium and/or stored within a working memory of a computer device operating in accordance with the program instructions. An embodiment according to the invention herein comprises an apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to perform a method and/or solution according to embodiments of the invention as described above.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the apparatus claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.

Claims (14)

1. A method of image feature characterization, wherein the method comprises:
calculating a HASH characterization value, a HIST characterization value and a CNN characterization value corresponding to an image to be searched;
calculating a HASH characterization value, a HIST characterization value and a CNN characterization value corresponding to an original image in a gallery;
splicing the HASH characterization value, the HIST characterization value and the CNN characterization value of the image to be searched to obtain a fusion characterization value of the image to be searched;
and splicing the HASH characterization value, the HIST characterization value and the CNN characterization value of the original image to obtain a fusion characterization value of the original image.
2. The method of claim 1, wherein after splicing the HASH, HIST and CNN characterization values of the original image to obtain the fusion characterization value of the original image, the method further comprises:
displaying the original images similar to the image to be searched based on the fusion characterization value of the image to be searched and the fusion characterization values of the original images.
3. The method according to claim 1 or 2, wherein calculating the CNN representation value corresponding to the image to be searched comprises:
step S1111, inputting the image to be searched, and sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the image using convolution kernels of 3 × 3 pixel points to obtain a first-first feature map of 224 × 224 × 64 pixel points;
step S1112, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the first-first feature map using convolution kernels of 3 × 3 pixel points to obtain a first-second feature map of 224 × 224 × 64 pixel points;
step S1113, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the first-second feature map using convolution kernels of 3 × 3 pixel points to obtain a first-third feature map of 112 × 112 × 128 pixel points;
step S1114, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the first-third feature map using convolution kernels of 3 × 3 pixel points to obtain a first-fourth feature map of 112 × 112 × 128 pixel points;
step S1115, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the first-fourth feature map using convolution kernels of 3 × 3 pixel points to obtain a first-fifth feature map of 56 × 56 × 256 pixel points;
step S1116, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the first-fifth feature map using convolution kernels of 3 × 3 pixel points to obtain a first-sixth feature map of 56 × 56 × 256 pixel points;
step S1117, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the first-sixth feature map using convolution kernels of 3 × 3 pixel points to obtain a first-seventh feature map of 56 × 56 × 256 pixel points;
step S1118, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the first-seventh feature map using convolution kernels of 3 × 3 pixel points to obtain a first-eighth feature map of 28 × 28 × 512 pixel points;
step S1119, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the first-eighth feature map using convolution kernels of 3 × 3 pixel points to obtain a first-ninth feature map of 28 × 28 × 512 pixel points;
step S11110, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the first-ninth feature map using convolution kernels of 3 × 3 pixel points to obtain a first-tenth feature map of 28 × 28 × 512 pixel points;
step S11111, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the first-tenth feature map using convolution kernels of 3 × 3 pixel points to obtain a first-eleventh feature map of 14 × 14 × 512 pixel points;
step S11112, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the first-eleventh feature map using convolution kernels of 3 × 3 pixel points to obtain a first-twelfth feature map of 7 × 7 × 512 pixel points;
step S11113, performing a mean calculation over the first 2 dimensions (7 × 7) of the first-twelfth feature map of 7 × 7 × 512 pixel points to obtain 512 first-second feature values;
step S11114, calculating a standard deviation and a variance of the 512 first-second feature values, and performing a normalization operation on the 512 first-second feature values based on the calculated standard deviation and variance to obtain a first CNN representation value corresponding to the image to be searched.
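For illustration only: the twelve convolution-pooling-activation stages of claim 3 trace a VGG-style reduction of a 224 × 224 × 3 input to a 7 × 7 × 512 feature map. The hedged Python sketch below uses torchvision's off-the-shelf vgg16 backbone as a stand-in for steps S1111-S11112 (the claims do not name a specific network; only the stated map sizes match), then applies the spatial mean and the standard-deviation-based normalization of steps S11113-S11114. The ImageNet preprocessing constants are an assumption. The same sketch applies, with the obvious renaming, to the original-image pipeline of claim 5.

import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# VGG-16 convolutional stack as a stand-in for the twelve claimed stages.
backbone = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features.eval()

# Assumed preprocessing; the claim only fixes the 224 x 224 input size.
preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def cnn_representation(path):
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)  # 1 x 3 x 224 x 224
    fmap = backbone(x)                    # 1 x 512 x 7 x 7 feature map
    v = fmap.mean(dim=(2, 3)).squeeze(0)  # mean over the two 7 x 7 spatial dimensions -> 512 values
    # One plausible reading of the normalization "based on standard deviation and variance".
    return (v - v.mean()) / (v.std() + 1e-12)

Under these assumptions the function returns a 512-entry vector, matching the 512 first-second feature values of step S11113.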
4. The method of claim 3, wherein performing a mean calculation over the first 2 dimensions (7 × 7) of the first-twelfth feature map of 7 × 7 × 512 pixel points to fuse local information of the image and obtain 512 first-second feature values comprises:
averaging over the first dimension (of size 7) of the first-twelfth feature map of 7 × 7 × 512 pixel points to obtain first feature values of 1 × 7 × 512;
and averaging over the second dimension (of size 7) of the first feature values of 1 × 7 × 512 to obtain the 512 first-second feature values.
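A minimal numpy illustration of claim 4's two-stage averaging (claim 6 is symmetric), using a random array as a stand-in for the first-twelfth feature map: averaging the first dimension and then the second is equivalent to a single mean over both spatial axes.

import numpy as np

fmap = np.random.rand(7, 7, 512)            # stand-in for a first-twelfth feature map
step1 = fmap.mean(axis=0, keepdims=True)    # average over the first dimension: 1 x 7 x 512
step2 = step1.mean(axis=1).reshape(512)     # average over the second dimension: 512 values
assert np.allclose(step2, fmap.mean(axis=(0, 1)))  # same as one mean over both spatial axes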
5. The method of claim 1 or 2, wherein calculating the CNN representation value corresponding to an original image in the gallery comprises:
step S2111, inputting the original image, and sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the image using convolution kernels of 3 × 3 pixel points to obtain a second-first feature map of 224 × 224 × 64 pixel points;
step S2112, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the second-first feature map using convolution kernels of 3 × 3 pixel points to obtain a second-second feature map of 224 × 224 × 64 pixel points;
step S2113, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the second-second feature map using convolution kernels of 3 × 3 pixel points to obtain a second-third feature map of 112 × 112 × 128 pixel points;
step S2114, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the second-third feature map using convolution kernels of 3 × 3 pixel points to obtain a second-fourth feature map of 112 × 112 × 128 pixel points;
step S2115, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the second-fourth feature map using convolution kernels of 3 × 3 pixel points to obtain a second-fifth feature map of 56 × 56 × 256 pixel points;
step S2116, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the second-fifth feature map using convolution kernels of 3 × 3 pixel points to obtain a second-sixth feature map of 56 × 56 × 256 pixel points;
step S2117, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the second-sixth feature map using convolution kernels of 3 × 3 pixel points to obtain a second-seventh feature map of 56 × 56 × 256 pixel points;
step S2118, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the second-seventh feature map using convolution kernels of 3 × 3 pixel points to obtain a second-eighth feature map of 28 × 28 × 512 pixel points;
step S2119, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the second-eighth feature map using convolution kernels of 3 × 3 pixel points to obtain a second-ninth feature map of 28 × 28 × 512 pixel points;
step S21110, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the second-ninth feature map using convolution kernels of 3 × 3 pixel points to obtain a second-tenth feature map of 28 × 28 × 512 pixel points;
step S21111, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the second-tenth feature map using convolution kernels of 3 × 3 pixel points to obtain a second-eleventh feature map of 14 × 14 × 512 pixel points;
step S21112, sequentially performing a convolution operation, a pooling operation and activation by a nonlinear function on the second-eleventh feature map using convolution kernels of 3 × 3 pixel points to obtain a second-twelfth feature map of 7 × 7 × 512 pixel points;
step S21113, performing a mean calculation over the first 2 dimensions (7 × 7) of the second-twelfth feature map of 7 × 7 × 512 pixel points to obtain 512 second-second feature values;
step S21114, calculating a standard deviation and a variance of the 512 second-second feature values, and performing a normalization operation on the 512 second-second feature values based on the calculated standard deviation and variance to obtain a second CNN representation value corresponding to the original image.
6. The method according to claim 5, wherein performing a mean calculation over the first 2 dimensions (7 × 7) of the second-twelfth feature map of 7 × 7 × 512 pixel points to fuse local information of the image and obtain 512 second-second feature values comprises:
averaging over the first dimension (of size 7) of the second-twelfth feature map of 7 × 7 × 512 pixel points to obtain second feature values of 1 × 7 × 512;
and averaging over the second dimension (of size 7) of the second feature values of 1 × 7 × 512 to obtain the 512 second-second feature values.
7. The method according to claim 1 or 2, wherein calculating the HASH representation value corresponding to the image to be searched comprises:
step S1121, scaling the image to be searched into a first image of 8 × 8 pixel points;
step S1122, converting the first image into a first gray image;
step S1123, calculating the pixel mean value over the pixel points of the first gray image;
step S1124, calculating a hash value for each pixel point of the first gray image; if the hash value of a pixel point is greater than or equal to the pixel mean value, setting the position of the corresponding pixel point in the binary feature map corresponding to the first gray image to 1, and if the hash value of the pixel point is less than the pixel mean value, setting that position to 0, so as to obtain a 64-bit binary first feature map;
step S1125, performing a normalization operation on the 64-bit binary first feature map to obtain the HASH representation value corresponding to the image to be searched.
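For illustration, a hedged Python sketch of the average-hash-style procedure in claim 7 (claim 8 is symmetric). Two assumptions are made that the claim leaves open: the "hash value" of a pixel point is taken to be its gray level, and the final normalization is taken to be an L2 normalization.

import numpy as np
from PIL import Image

def hash_representation(path):
    # Steps S1121-S1123: scale to 8 x 8, convert to gray, take the pixel mean.
    gray = np.asarray(Image.open(path).resize((8, 8)).convert("L"), dtype=np.float64)
    # Step S1124: threshold each pixel against the mean to get a 64-bit binary feature map
    # (assumption: the "hash value" of a pixel point is its gray level).
    bits = (gray >= gray.mean()).astype(np.float64).reshape(64)
    # Step S1125: normalization (assumption: L2 normalization).
    norm = np.linalg.norm(bits)
    return bits / norm if norm > 0 else bits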
8. The method according to claim 1 or 2, wherein calculating the HASH representation value corresponding to the original image comprises:
step S2121, scaling the original image into a second image of 8 × 8 pixel points;
step S2122, converting the second image into a second gray image;
step S2123, calculating the pixel mean value over the pixel points of the second gray image;
step S2124, calculating a hash value for each pixel point of the second gray image; if the hash value of a pixel point is greater than or equal to the pixel mean value, setting the position of the corresponding pixel point in the binary feature map corresponding to the second gray image to 1, and if the hash value of the pixel point is less than the pixel mean value, setting that position to 0, so as to obtain a 64-bit binary second feature map;
step S2125, performing a normalization operation on the 64-bit binary second feature map to obtain the HASH representation value corresponding to the original image.
9. The method according to claim 1 or 2, wherein obtaining the HIST representation value corresponding to the image to be searched comprises:
step S1131, scaling the image to be searched into a first-second image of 224 × 224 × 3 pixel points, where the 3 in 224 × 224 × 3 denotes 3 channels, namely a red channel, a green channel and a blue channel;
step S1132, counting, for the red channel, the green channel and the blue channel of the first-second image respectively, the number of occurrences of pixel points taking each of the pixel values 0-255; the 3 channels yield 768 pixel-value categories in total, so that the numbers of occurrences of pixel points for the 768 categories are obtained as a set of 768 categories;
step S1133, performing a normalization operation on the set of 768 categories to obtain the HIST representation value corresponding to the image to be searched.
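A hedged Python sketch of the 768-bin color histogram in claim 9 (claim 10 is symmetric). The L1 normalization at the end is one plausible reading of the claimed normalization operation, not something the claim fixes.

import numpy as np
from PIL import Image

def hist_representation(path):
    # Step S1131: scale to 224 x 224 x 3 (red, green and blue channels).
    rgb = np.asarray(Image.open(path).convert("RGB").resize((224, 224)))
    # Step S1132: per channel, count pixel points for each of the 256 values; 3 x 256 = 768 categories.
    counts = np.concatenate([
        np.bincount(rgb[:, :, c].reshape(-1), minlength=256) for c in range(3)
    ]).astype(np.float64)
    # Step S1133: normalization (assumption: divide by the total so the bins sum to 1).
    return counts / counts.sum()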
10. The method according to claim 1 or 2, wherein obtaining the HIST representation value corresponding to the original image comprises:
step S2131, scaling the original image into a second-second image of 224 × 224 × 3 pixel points, where the 3 in 224 × 224 × 3 denotes 3 channels, namely a red channel, a green channel and a blue channel;
step S2132, counting, for the red channel, the green channel and the blue channel of the second-second image respectively, the number of occurrences of pixel points taking each of the pixel values 0-255; the 3 channels yield 768 pixel-value categories in total, so that the numbers of occurrences of pixel points for the 768 categories are obtained as a set of 768 categories;
step S2133, performing a normalization operation on the set of 768 categories to obtain the HIST representation value corresponding to the original image.
11. The method according to claim 1 or 2, wherein splicing the HASH representation value, the HIST representation value and the CNN representation value of the image to be searched to obtain the fused representation value of the image comprises:
weighting the HASH representation value, the HIST representation value and the CNN representation value of the image to be searched respectively;
and serially splicing the weighted representation values of the image to be searched in the order CNN, HASH, HIST to obtain the fused representation value of the image to be searched.
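For illustration, a minimal sketch of the weighted serial splicing in claim 11 (claim 12 is symmetric). The claim fixes the order CNN, HASH, HIST but not the weight values, so the defaults below are hypothetical.

import numpy as np

def fused_representation(cnn_v, hash_v, hist_v, w_cnn=1.0, w_hash=1.0, w_hist=1.0):
    # Weight each representation value, then splice serially in the order CNN, HASH, HIST.
    return np.concatenate([w_cnn * np.asarray(cnn_v),
                           w_hash * np.asarray(hash_v),
                           w_hist * np.asarray(hist_v)])

With the sizes given in the earlier claims, the fused vector would have 512 + 64 + 768 = 1344 entries.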
12. The method of claim 1 or 2, wherein splicing the HASH representation value, the HIST representation value and the CNN representation value of the original image to obtain the fused representation value of the image comprises:
weighting the HASH representation value, the HIST representation value and the CNN representation value of the original image respectively;
and serially splicing the weighted representation values of the original image in the order CNN, HASH, HIST to obtain the fused representation value of the original image.
13. A computer readable medium having computer readable instructions stored thereon which, when executed by a processor, implement the method of any one of claims 1 to 12.
14. An apparatus for information processing at a network device, the apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to perform the method of any one of claims 1 to 12.
CN202011419950.XA 2020-12-07 2020-12-07 Image feature characterization method and device Active CN112200204B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011419950.XA 2020-12-07 2020-12-07 Image feature characterization method and device


Publications (2)

Publication Number Publication Date
CN112200204A 2021-01-08
CN112200204B 2021-04-20

Family

ID=74033909


Country Status (1)

Country Link
CN (1) CN112200204B (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101804810B1 (en) * 2016-07-08 2017-12-05 금오공과대학교 산학협력단 Discriminating Apparatus Of Similar Documents And Discriminating Method Using The Same
CN107451604A (en) * 2017-07-12 2017-12-08 河海大学 A kind of image classification method based on K means

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
开心果汁: "[Python image search by image] Three image-similarity calculation and fusion algorithms", CSDN blog: https://laidefa.blog.csdn.net/article/details/87697794?utm_medium=distribute.pc_relevant.none-task-blog-opensearch-9.control&depth_1-utm_source=distribute.pc_relevant.none-task-blog-opensearch-9.control *
我是天边飘过一朵云: "Whole-image representation learning: unsupervised learning and supervised learning based on convolutional neural networks", Tencent Cloud: https://cloud.tencent.com/developer/news/378523 *


Similar Documents

Publication Publication Date Title
Logeshwaran et al. SVPA-the segmentation based visual processing algorithm (SVPA) for illustration enhancements in digital video processing (DVP)
Jähne Digital image processing
Ma et al. Evaluation and acceleration of high-throughput fixed-point object detection on FPGAs
Zhang et al. Multiple-level feature-based measure for retargeted image quality
Kim Rotation-discriminating template matching based on Fourier coefficients of radial projections with robustness to scaling and partial occlusion
Krig et al. Image pre-processing
CN102737243B (en) Method and device for acquiring descriptive information of multiple images and image matching method
CN108961183B (en) Image processing method, terminal device and computer-readable storage medium
CN111860398A (en) Remote sensing image target detection method and system and terminal equipment
US10134149B2 (en) Image processing
Yu et al. Accurate system for automatic pill recognition using imprint information
US11875486B2 (en) Image brightness statistical method and imaging device
US20230067934A1 (en) Action Recognition Method, Apparatus and Device, Storage Medium and Computer Program Product
CN106960211A (en) Key frame acquisition methods and device
CN110163095B (en) Loop detection method, loop detection device and terminal equipment
Tan et al. Distinctive accuracy measurement of binary descriptors in mobile augmented reality
Hosny et al. Feature extraction of color images using quaternion moments
Lou et al. Transalnet: Visual saliency prediction using transformers
CN110619349A (en) Plant image classification method and device
CN112200204B (en) Image feature characterization method and device
Kant Learning gaussian maps for dense object detection
CN114493988A (en) Image blurring method, image blurring device and terminal equipment
CN115546271A (en) Visual analysis method, device, equipment and medium based on depth joint characterization
Liang et al. Robust cross-Scene foreground segmentation in surveillance video
CN111860486B (en) Card identification method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Image Feature Characterization Method and Device

Effective date of registration: 20220824

Granted publication date: 20210420

Pledgee: China Minsheng Banking Corp Shanghai branch

Pledgor: SHANGHAI MDATA INFORMATION TECHNOLOGY Co.,Ltd.

Registration number: Y2022310000198

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20230901

Granted publication date: 20210420

Pledgee: China Minsheng Banking Corp Shanghai branch

Pledgor: SHANGHAI MDATA INFORMATION TECHNOLOGY Co.,Ltd.

Registration number: Y2022310000198

CP01 Change in the name or title of a patent holder

Address after: Room 301ab, No.10, Lane 198, zhangheng Road, Pudong New Area pilot Free Trade Zone, Shanghai 201204

Patentee after: Shanghai Mido Technology Co.,Ltd.

Address before: Room 301ab, No.10, Lane 198, zhangheng Road, Pudong New Area pilot Free Trade Zone, Shanghai 201204

Patentee before: SHANGHAI MDATA INFORMATION TECHNOLOGY Co.,Ltd.