CN114898104A - Hash method and device for image features and processing equipment


Info

Publication number
CN114898104A
CN114898104A (application CN202210813030.9A)
Authority
CN
China
Prior art keywords: hash, feature, image, network, loss function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210813030.9A
Other languages
Chinese (zh)
Inventor
李登实 (Li Dengshi)
王若溪 (Wang Ruoxi)
陈何玲 (Chen Heling)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jianghan University
Original Assignee
Jianghan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jianghan University filed Critical Jianghan University
Priority to CN202210813030.9A priority Critical patent/CN114898104A/en
Publication of CN114898104A publication Critical patent/CN114898104A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/048 - Activation functions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 - Proximity, similarity or dissimilarity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a hashing method and apparatus for image features, and a processing device, which strengthen the spatial correlation among the feature maps fed into the hash layer, so that the hash codes obtained by subsequent hash coding have significantly improved precision. The method comprises the following steps: the processing device acquires an image to be processed; the processing device inputs the image to be processed into a deep hash network, where the deep hash network comprises a feature extraction module, a long-term dependency module and a hash layer; during the operation of the deep hash network, the feature extraction module performs feature extraction on the input image to be processed to obtain feature maps as image features; the long-term dependency module treats the feature maps output by the feature extraction module as a time sequence so as to detect the spatial correlation among the feature maps and obtain enhanced feature maps; the hash layer hash-codes the enhanced feature maps output by the long-term dependency module to obtain a hash code; and the processing device extracts the hash code output by the deep hash network.

Description

Hash method and device for image features and processing equipment
Technical Field
The present application relates to the field of image processing, and in particular, to a hashing method and apparatus for image features, and a processing device.
Background
Hashing projects high-dimensional features into compact binary codes through a hash function, so that a database can store more data while retrieval efficiency is improved. Compared with traditional methods, the deep hash method can effectively learn a high-quality nonlinear hash function while extracting deep features, so that the learned hash function encodes the extracted deep features more effectively and the generated hash codes better represent the image features.
The deep hash method mainly comprises two steps: feature extraction and hash coding. Feature extraction extracts useful image features from the input image to form feature maps, and hash coding converts the extracted feature information into a hash code through a hash function.
In the research process of the related art, the inventors found that existing deep hash networks suffer from limited feature extraction precision, which results in low precision of the hash codes obtained by subsequent hash coding.
Disclosure of Invention
The application provides a hashing method and apparatus for image features, and a processing device, which strengthen the spatial correlation among the feature maps fed into the hash layer, so that the hash codes obtained by subsequent hash coding have significantly improved precision.
In a first aspect, the present application provides a hashing method for image features, including:
the processing device acquires an image to be processed;
the processing device inputs the image to be processed into a deep hash network, where the deep hash network comprises a feature extraction module, a long-term dependency module and a hash layer; during the operation of the deep hash network, the feature extraction module performs feature extraction on the input image to be processed to obtain feature maps as image features; the long-term dependency module treats the feature maps output by the feature extraction module as a time sequence so as to detect the spatial correlation among the feature maps and obtain enhanced feature maps; and the hash layer hash-codes the enhanced feature maps output by the long-term dependency module to obtain a hash code;
the processing device extracts the hash code output by the deep hash network.
With reference to the first aspect of the present application, in a first possible implementation manner of the first aspect of the present application, the long-term dependency module specifically adopts the network structure of a Gated Recurrent Unit (GRU).
With reference to the first aspect of the present application, in a second possible implementation manner of the first aspect of the present application, the hash layer specifically adopts a network structure of three fully-connected layers to implement hash coding; the first two layers use a ReLU activation function, the last layer uses a hyperbolic tangent activation function, and the output of the last layer is the hash code.
With reference to the first aspect of the present application, in a third possible implementation manner of the first aspect of the present application, the feature extraction module specifically adopts the network structure of a ResNet50 network and is configured with 50 two-dimensional convolutional layers comprising an initial convolution stage and four residual blocks; each part is followed by a batch normalization layer to accelerate training, a rectifier unit to avoid vanishing gradients, and a max pooling layer to realize downsampling, and the output of the last residual block is average-pooled.
With reference to the first aspect of the present application, in a fourth possible implementation manner of the first aspect of the present application, the method further includes:
the processing device performs network training on the deep hash network through sample images, and during training a central similarity loss function L_C, a pairwise similarity loss function L_P and a quantization loss function L_Q are adopted;
the central similarity loss function L_C quantifies the Hamming distance between a hash code h_i and its corresponding hash center c_i to maintain central similarity learning;
the pairwise similarity loss function L_P relates the Hamming distance between the hash codes of data that share only some labels in a multi-label dataset to the correlation between their labels, so as to keep the hash codes of data pairs with similar labels as close as possible;
the quantization loss function L_Q drives each output toward the binary values {-1, +1} with maximum probability, to control the quality of the generated hash codes.
With reference to the fourth possible implementation manner of the first aspect of the present application, in a fifth possible implementation manner of the first aspect of the present application, the central similarity loss function L_C is defined as follows:

L_C = -\frac{1}{K}\sum_{i=1}^{N}\sum_{k=1}^{K}\left[c_{i,k}\log h_{i,k}+(1-c_{i,k})\log\left(1-h_{i,k}\right)\right]

where N is the number of samples, K is the length of the binary hash code, h_i denotes a hash code, and c_i denotes the semantic hash center of the sample;

the pairwise similarity loss function L_P is defined as follows:

L_P = \frac{1}{N^{2}}\sum_{i=1}^{N}\sum_{j=1}^{N}\left[s_{ij}\,\frac{d_H(h_i,h_j)}{K}+\left(1-s_{ij}\right)\max\left(0,\;m-\frac{d_H(h_i,h_j)}{K}\right)\right]

where m is a hyper-parameter, K is the length of the hash code, the max(0, ·) term ensures that the value is not less than 0, and s_ij is the similarity computed from the label vector codes l_i and l_j of the samples;

the quantization loss function L_Q is defined as follows:

L_Q = \sum_{i=1}^{N}\sum_{k=1}^{K}\log\cosh\left(\left|h_{i,k}\right|-1\right)

where h_{i,k} is the k-th entry of the hash code that learning forces toward binary values.
With reference to the fourth possible implementation manner of the first aspect of the present application, in a sixth possible implementation manner of the first aspect of the present application, the central similarity loss function L_C, the pairwise similarity loss function L_P and the quantization loss function L_Q are specifically trained through a combined loss function, which is defined as follows:

\min_{\Theta} L = L_C + \lambda_1 L_P + \lambda_2 L_Q

where \Theta is the set of all parameters learned by the deep hash function, and \lambda_1 and \lambda_2 are hyper-parameters obtained by grid search.
In a second aspect, the present application provides an image feature hashing apparatus, including:
the acquisition unit is used for acquiring an image to be processed;
the system comprises a Hash unit, a processing unit and a processing unit, wherein the Hash unit is used for inputting an image to be processed into a deep Hash network, the deep Hash network comprises a feature extraction module, a long-time dependence module and a Hash layer, and in the working process of the deep Hash network, the feature extraction module carries out feature extraction on the input image to be processed to obtain a feature map as an image feature; the long-time dependence module takes each feature map output by the feature extraction module as a time sequence so as to detect the spatial correlation among the feature maps and obtain an enhanced feature map; the Hash layer carries out Hash coding on the enhanced feature graph output by the long-time dependency module to obtain a Hash code;
and the extraction unit is used for extracting the hash code output by the deep hash network.
With reference to the second aspect of the present application, in a first possible implementation manner of the second aspect of the present application, the long-term dependency module specifically adopts the network structure of a GRU.
With reference to the second aspect of the present application, in a second possible implementation manner of the second aspect of the present application, the hash layer specifically adopts a network structure of three fully-connected layers to implement hash coding; the first two layers use a ReLU activation function, the last layer uses a hyperbolic tangent activation function, and the output of the last layer is the hash code.
With reference to the second aspect of the present application, in a third possible implementation manner of the second aspect of the present application, the feature extraction module specifically adopts the network structure of a ResNet50 network and is configured with 50 two-dimensional convolutional layers comprising an initial convolution stage and four residual blocks; each part is followed by a batch normalization layer to accelerate training, a rectifier unit to avoid vanishing gradients, and a max pooling layer to realize downsampling, and the output of the last residual block is average-pooled.
With reference to the second aspect of the present application, in a fourth possible implementation manner of the second aspect of the present application, the apparatus further includes a training unit, configured to:
performing network training on the deep hash network through sample images, and during training adopting a central similarity loss function L_C, a pairwise similarity loss function L_P and a quantization loss function L_Q;
the central similarity loss function L_C quantifies the Hamming distance between a hash code h_i and its corresponding hash center c_i to maintain central similarity learning;
the pairwise similarity loss function L_P relates the Hamming distance between the hash codes of data that share only some labels in a multi-label dataset to the correlation between their labels, so as to keep the hash codes of data pairs with similar labels as close as possible;
the quantization loss function L_Q drives each output toward the binary values {-1, +1} with maximum probability, to control the quality of the generated hash codes.
With reference to the fourth possible implementation manner of the second aspect of the present application, in a fifth possible implementation manner of the second aspect of the present application, the central similarity loss function L_C is defined as follows:

L_C = -\frac{1}{K}\sum_{i=1}^{N}\sum_{k=1}^{K}\left[c_{i,k}\log h_{i,k}+(1-c_{i,k})\log\left(1-h_{i,k}\right)\right]

where N is the number of samples, K is the length of the binary hash code, h_i denotes a hash code, and c_i denotes the semantic hash center of the sample;

the pairwise similarity loss function L_P is defined as follows:

L_P = \frac{1}{N^{2}}\sum_{i=1}^{N}\sum_{j=1}^{N}\left[s_{ij}\,\frac{d_H(h_i,h_j)}{K}+\left(1-s_{ij}\right)\max\left(0,\;m-\frac{d_H(h_i,h_j)}{K}\right)\right]

where m is a hyper-parameter, K is the length of the hash code, the max(0, ·) term ensures that the value is not less than 0, and s_ij is the similarity computed from the label vector codes l_i and l_j of the samples;

the quantization loss function L_Q is defined as follows:

L_Q = \sum_{i=1}^{N}\sum_{k=1}^{K}\log\cosh\left(\left|h_{i,k}\right|-1\right)

where h_{i,k} is the k-th entry of the hash code that learning forces toward binary values.
With reference to the fourth possible implementation manner of the second aspect of the present application, in a sixth possible implementation manner of the second aspect of the present application, the central similarity loss function L_C, the pairwise similarity loss function L_P and the quantization loss function L_Q are specifically trained through a combined loss function, which is defined as follows:

\min_{\Theta} L = L_C + \lambda_1 L_P + \lambda_2 L_Q

where \Theta is the set of all parameters learned by the deep hash function, and \lambda_1 and \lambda_2 are hyper-parameters obtained by grid search.
In a third aspect, the present application provides a processing device, including a processor and a memory, where the memory stores a computer program, and the processor executes the method provided in the first aspect of the present application or any one of the possible implementation manners of the first aspect of the present application when calling the computer program in the memory.
In a fourth aspect, the present application provides a computer-readable storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the method provided in the first aspect of the present application or any one of the possible implementations of the first aspect of the present application.
From the above, the present application has the following advantageous effects:
for the hash processing, a long-time dependency module is embedded between a feature extraction module for performing the feature extraction processing and a hash layer for performing the hash coding, and the module takes each feature map output by the feature extraction module as a time sequence so as to detect the spatial correlation between each feature map, so that the obtained enhanced feature map can be coded to obtain a hash code with significantly improved precision after being input into the hash layer, thereby avoiding the situation that the spatial information between the input features is ignored by the hash coding in the prior art, and further realizing the image feature storage effect with higher precision.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a hashing method for image features according to the present application;
fig. 2 is a schematic view of an application scenario of the deep hash method of the present application;
fig. 3 is a schematic view of a scenario in which the GRU of the present application is applied to deep hashing;
fig. 4 is a schematic view of another application scenario of the deep hash method of the present application;
FIG. 5 is a schematic diagram of a hash apparatus for image features according to the present application;
FIG. 6 is a schematic diagram of a processing apparatus according to the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," and the like in the description and in the claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Moreover, the terms "comprises," "comprising," and any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or modules is not necessarily limited to those steps or modules explicitly listed, but may include other steps or modules not expressly listed or inherent to such process, method, article, or apparatus. The naming or numbering of the steps appearing in the present application does not mean that the steps in the method flow have to be executed in the chronological/logical order indicated by the naming or numbering, and the named or numbered process steps may be executed in a modified order depending on the technical purpose to be achieved, as long as the same or similar technical effects are achieved.
The division of the modules presented in this application is a logical division, and in practical applications, there may be another division, for example, multiple modules may be combined or integrated into another system, or some features may be omitted, or not executed, and in addition, the shown or discussed coupling or direct coupling or communication connection between each other may be through some interfaces, and the indirect coupling or communication connection between the modules may be in an electrical or other similar form, which is not limited in this application. The modules or sub-modules described as separate components may or may not be physically separated, may or may not be physical modules, or may be distributed in a plurality of circuit modules, and some or all of the modules may be selected according to actual needs to achieve the purpose of the present disclosure.
Before describing the hashing method for image features provided in the present application, the background related to the present application will be described first.
The image feature hashing method, the image feature hashing apparatus and the computer-readable storage medium provided in the present application can be applied to a processing device and are used to strengthen the spatial correlation among the feature maps fed into the hash layer, so that the hash codes obtained by subsequent hash coding have significantly improved precision.
In the image feature hashing method mentioned in the present application, the execution subject may be a hashing apparatus for image features, or a different type of processing device that integrates the hashing apparatus, such as a server, a physical host, or user equipment (UE). The hashing apparatus may be implemented in hardware or software, the UE may specifically be a terminal device such as a smartphone, a tablet computer, a notebook computer, a desktop computer, or a personal digital assistant (PDA), and the processing device may be deployed as a device cluster.
Next, the hash method of the image features provided in the present application is described.
First, referring to fig. 1, fig. 1 shows a schematic flow chart of the image feature hashing method according to the present application, and the image feature hashing method according to the present application may specifically include the following steps S101 to S103:
step S101, a processing device acquires an image to be processed;
it can be understood that the present application is specifically directed to an image storage scenario, in which an image feature is stored on a data level by specifically using a depth hash method, so as to complete the image storage work.
Correspondingly, the image related to the present application is denoted as a to-be-processed image, and may be an image that can be related in any application scene, and is specifically adjusted according to the actual application, which is not specifically limited herein.
For the acquisition of the image to be processed, in a specific application, the image may be acquired in real time or called.
Step S102, the processing device inputs the image to be processed into a deep hash network, where the deep hash network comprises a feature extraction module, a long-term dependency module and a hash layer; during the operation of the deep hash network, the feature extraction module performs feature extraction on the input image to be processed to obtain feature maps as image features; the long-term dependency module treats the feature maps output by the feature extraction module as a time sequence so as to detect the spatial correlation among the feature maps and obtain enhanced feature maps; and the hash layer hash-codes the enhanced feature maps output by the long-term dependency module to obtain a hash code;
it can be understood that the core work of the present application is to provide an improvement on the existing deep hash method, specifically, to perform network optimization configuration on a deep hash network.
According to the application, in a traditional deep hash network, a hash layer only combines feature maps extracted by features, however, spatial correlation exists between the feature maps extracted by the features in a deep layer, and the spatial correlation information between the feature maps is ignored in the hash coding process in the existing scheme, so that the quality of a generated hash code is influenced.
In this case, a long-time dependency module is embedded between a feature extraction module for performing feature extraction processing and a hash layer for performing hash coding, and in the previous training process, the module regards each input feature map as a time sequence (a plurality of feature maps are regarded as sequences with time precedence characteristics) to detect spatial correlation between the feature maps, so that after the training is completed, the module can regard each feature map output by the feature extraction module as a time sequence in practical application to detect the spatial correlation between each feature map.
At the moment, the long-time dependency module is used for establishing the long-time characteristic dependency to learn the spatial correlation information among the characteristic graphs, so that the spatial correlation among the characteristic graphs of the hash layer behind the input is strengthened, the spatial correlation information among the input characteristic graphs can be effectively avoided being ignored in the hash coding, and the improvement of the quality of the hash code is promoted.
Specifically, the deep hash network provided in the present application can be further understood in conjunction with the application scenario diagram of the deep hash method shown in fig. 2.
(1) As a practical implementation manner, the feature extraction module of the present application may specifically adopt the network structure of a ResNet50 network in practical applications; the feature extraction module is configured with 50 two-dimensional convolutional layers (conv2d) comprising an initial convolution stage and four residual blocks, each part is followed by a batch normalization (BN) layer to accelerate training and a rectifier unit (ReLU function) to avoid vanishing gradients, a max pooling layer realizes downsampling, and the output of the last residual block is average-pooled.
As an example, ResNet50 performs residual learning across groups of three convolutional layers with kernel sizes of 1x1, 3x3 and 1x1; an image of 112x112 pixels is input, the last convolutional layer outputs 2048 feature maps of size 7x7, and a global average pooling layer reduces them to 2048 feature maps of size 1x1.
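To make this structure concrete, the following is a minimal PyTorch sketch of such a feature extraction module, assuming torchvision's ResNet-50 as the backbone; the class name and the use of AdaptiveAvgPool2d for the global average pooling are illustrative choices rather than details prescribed by the application.

```python
# Sketch of the feature extraction module: a ResNet-50 backbone without its
# classification head, followed by global average pooling of the last
# residual block's output.
import torch
import torchvision

class FeatureExtractor(torch.nn.Module):
    def __init__(self):
        super().__init__()
        resnet = torchvision.models.resnet50(weights=None)
        # Keep everything up to and including the last residual block.
        self.backbone = torch.nn.Sequential(*list(resnet.children())[:-2])
        self.gap = torch.nn.AdaptiveAvgPool2d(1)  # global average pooling

    def forward(self, x):
        fmaps = self.backbone(x)   # (B, 2048, H', W') feature maps
        pooled = self.gap(fmaps)   # (B, 2048, 1, 1)
        return pooled.flatten(1)   # (B, 2048): one value per feature map

feats = FeatureExtractor()(torch.randn(2, 3, 112, 112))
print(feats.shape)  # torch.Size([2, 2048])
```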
(2) As a practical implementation manner, the long-term dependency module of the present application may specifically adopt the network structure of a GRU in practical applications.
Specifically, as an example, the GRU layer has an input dimension of 2048 and consists of a single layer containing 2048 hidden units. In the deep hashing process, taking ResNet50 as an example, after a series of convolution and pooling operations ResNet50 produces 2048 feature maps f_1, f_2, ..., f_2048.
Here, the fully-connected layer of the prior art ignores the spatial correlation information among the feature maps, which may cause the extracted image features to be combined into a target object that does not exist, deviating greatly from the actual situation and resulting in coding errors.
A recurrent neural network (RNN) can learn the temporal order of an input sequence, i.e., the current output is related to historical outputs; however, the history cannot be retained for long and the gradient tends to vanish. Long short-term memory (LSTM) solves this problem with a gate structure, mainly comprising a forget gate, an input gate and an output gate, which allows information to persist; the GRU further improves on the LSTM, making the network structure simpler and easier to compute.
Therefore, the present application introduces, for the first time, a GRU layer between ResNet50 and the hash layer, and regards each feature map obtained by ResNet50 as an element of a sequence, so that the GRU can learn the temporal information within the sequence, i.e., the spatial correlation information among the feature maps.
Referring to fig. 3, which shows a schematic diagram of the scenario in which the GRU of the present application is applied to deep hashing, the figure illustrates how the currently input feature maps f_1, f_2, ..., f_t yield the corresponding output feature maps o_1, o_2, ..., o_t.
When the first feature map f_1 is input, no historical information is associated with it; but since the hidden layers of successive GRU steps are connected, the GRU retains some information of f_1 in its internal output gate state h_1, so that the second feature map f_2 can establish a dependency on f_1. For the third feature map f_3, the GRU combines the information of f_2 with h_1 to obtain h_2, so that f_3 can establish dependencies on both f_1 and f_2. By analogy, for the currently input feature map f_t, the GRU obtains h_{t-1} from the information of the preceding t-1 feature maps, so that f_t can establish dependencies on all t-1 preceding feature maps.
Based on the above process, the GRU can establish a long-term dependency relationship between the currently input feature map and the previously input feature maps. The GRU establishes long-term dependencies to learn the relations within an input sequence: for a sequence with temporal order it learns the temporal correlation information between elements, while for the sequence of feature maps here, which are related only spatially, it learns the spatial correlation information among the feature maps. The GRU thus establishes long-term feature dependencies to learn the spatial correlation information among the feature maps, strengthening the spatial correlation of the features fed into the hash layer and preventing hash coding from ignoring the spatial correlation information among the input feature maps.
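The sketch below illustrates one plausible realization of the long-term dependency module. The text states an input dimension of 2048 and a single layer of 2048 hidden units but does not spell out how the 2048 feature maps are laid out as a sequence; reading each pooled feature map as one scalar timestep is an assumption made only for this sketch.

```python
# Sketch of the long-term dependency module: a single-layer GRU run over the
# 2048 pooled feature responses treated as a sequence; the final hidden
# state carries dependencies on all preceding feature maps.
import torch

gru = torch.nn.GRU(input_size=1, hidden_size=2048, num_layers=1,
                   batch_first=True)

pooled = torch.randn(2, 2048)   # (B, 2048) from the feature extractor
seq = pooled.unsqueeze(-1)      # (B, 2048, 1): 2048 timesteps of size 1
out, h_n = gru(seq)             # out: (B, 2048, 2048) per-step outputs
enhanced = h_n[-1]              # (B, 2048) final hidden state
print(enhanced.shape)
```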
(3) As a practical implementation manner, the hash layer of the present application may specifically adopt a network structure of three fully-connected layers to implement hash coding: the first two layers use a ReLU activation function, the last layer uses a hyperbolic tangent activation function (Tanh), and the output of the last layer is the hash code. The input dimensions may all be 2048, the length of the output hash code may be 16, 32 or 64, and the hash code is compressed into [-1, 1].
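A minimal sketch of this hash layer follows; only the 2048 input dimension and the activations are given above, so the 2048-unit width of the two hidden fully-connected layers is an assumption.

```python
# Sketch of the hash layer: three fully-connected layers with ReLU after
# the first two and Tanh on the last, compressing the code into [-1, 1].
import torch

def make_hash_layer(code_len: int = 64) -> torch.nn.Sequential:
    return torch.nn.Sequential(
        torch.nn.Linear(2048, 2048), torch.nn.ReLU(),
        torch.nn.Linear(2048, 2048), torch.nn.ReLU(),
        torch.nn.Linear(2048, code_len), torch.nn.Tanh(),
    )

codes = make_hash_layer(64)(torch.randn(2, 2048))
print(codes.shape, float(codes.min()) >= -1.0, float(codes.max()) <= 1.0)
```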
In step S103, the processing device extracts the hash code output by the deep hash network.
After the deep hash network has completed the feature extraction and hash coding of the image to be processed, the hash code obtained from the image to be processed can be extracted.
Once the hash code is obtained, it may be processed according to the application requirements of the specific scenario, for example stored in a corresponding database, which is not limited herein.
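As an illustrative usage example, which the application does not prescribe, the relaxed codes in [-1, 1] are commonly binarized with a sign function before storage and compared by Hamming distance during retrieval:

```python
# Hypothetical post-processing of the network output for storage/retrieval.
import torch

def binarize(h: torch.Tensor) -> torch.Tensor:
    return torch.sign(h)  # entries in {-1, +1} (sign(0) = 0 is ignored here)

def hamming(b1: torch.Tensor, b2: torch.Tensor) -> torch.Tensor:
    k = b1.shape[-1]
    return (k - (b1 * b2).sum(-1)) / 2  # d_H = (K - <b1, b2>) / 2

stored = binarize(torch.randn(4, 64))
print(hamming(stored[0], stored[1]))
```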
As can be seen from the embodiment shown in fig. 1, for hash processing, a long-term dependency module is embedded between the feature extraction module that performs feature extraction and the hash layer that performs hash coding; this module treats the feature maps output by the feature extraction module as a time sequence so as to detect the spatial correlation among the feature maps. After the resulting enhanced feature maps are fed into the hash layer, they can be encoded into hash codes of significantly improved precision, avoiding the situation in the prior art where hash coding ignores the spatial information among the input features, and thereby achieving a higher-precision image feature storage effect.
Further, it can be understood that the design of a deep hash algorithm in practical applications can be considered to comprise two parts: the design of the network structure and the design of the loss function. Accordingly, in the training stage of the deep hash algorithm, the present application also addresses, to a certain extent, problems in the existing training schemes of deep hash algorithms.
The starting point of loss function design is to preserve the similarity between images and their hash features. At present, deep hashing mostly learns the hash function through pairwise, triplet or central similarity. Central similarity learning encourages the hash codes generated by similar images to approach a common hash center while different images converge to different hash centers, which improves hash learning efficiency and retrieval accuracy.
In central similarity learning, each label has its corresponding hash center, and the semantic hash center is determined by the hash centers corresponding to the labels. For a single-label data point, its semantic hash center coincides with the hash center corresponding to its label, and the generated hash code converges toward that semantic hash center, i.e., toward its label's center. For a multi-label data point, the semantic hash center is the centroid of the hash centers corresponding to its labels, so the semantic hash center lies some distance away from each label's center; the hash code is then far from some of these centers, and data pairs sharing only some of their labels cannot generate hash codes with a short Hamming distance.
Thus, although central similarity learning can obtain a globally optimal hash code, for a multi-label data point the semantic hash center is determined jointly by the hash centers of several labels. As a result, data pairs whose labels are only partially similar and whose semantic hash centers differ cannot generate hash codes with a short Hamming distance; in the existing central similarity learning technique, the link between the Hamming distance of the generated hash codes and the labels is ignored for data that share only some labels in a multi-label dataset, leading to hash codes of low quality.
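A hedged sketch of the semantic hash center described above: a single-label point takes its label's hash center, while a multi-label point takes the centroid of its labels' centers. Rounding the centroid back to a binary vector with a per-bit sign (a majority vote) is an assumption, and how the per-label centers themselves are generated is not specified in this passage.

```python
# Sketch: semantic hash center of a sample from its multi-hot label vector.
import torch

def semantic_center(label_vec: torch.Tensor,
                    label_centers: torch.Tensor) -> torch.Tensor:
    # label_vec: (L,) multi-hot; label_centers: (L, K) with entries +-1
    centroid = label_centers[label_vec.bool()].mean(dim=0)
    return torch.sign(centroid)  # per-bit majority vote back to +-1

label_centers = torch.sign(torch.randn(10, 64))  # 10 labels, 64-bit centers
labels = torch.tensor([1, 0, 1, 0, 0, 0, 0, 0, 0, 1])
print(semantic_center(labels, label_centers).shape)  # torch.Size([64])
```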
In view of this problem, as another practical implementation manner of the present application, the present application may further include the following:
the processing device performs network training on the deep hash network through the sample image, and in the training process, referring to another application scenario diagram of the deep hash method shown in fig. 4, a central similarity loss function L may be specifically adopted C Pairwise similarity loss function L P And a quantization loss function L Q Carry out trainingWherein, for the three loss functions specifically configured in the present application, the following is briefly described:
(1) center similarity loss function L C For quantizing hash codes
Figure 972734DEST_PATH_IMAGE049
And corresponding hash center
Figure 288047DEST_PATH_IMAGE050
Hamming distance to maintain center similarity learning;
(2) pairwise similarity loss function L P The Hamming distance of the hash codes of the data with only partial similar labels in the multi-label data set is quantized with the associated rows between the labels so as to keep the hash codes of the data pairs with the similar labels as close as possible;
(3) quantization loss function L Q For assignment with maximum probability
Figure 870338DEST_PATH_IMAGE051
To control the quality of the generated hash code.
It can be understood that, with this configuration of the loss functions, since the pairwise similarity loss can link pairwise Hamming distances with pairwise similarity labels, the present application exploits this property and introduces pairwise similarity to remedy the shortcoming of central similarity learning: the generated hash codes converge to their corresponding semantic hash centers while the Hamming distance between the hash codes of data pairs that share only some labels in the multi-label dataset is reduced, thereby significantly safeguarding the quality of the hash codes.
For the above-mentioned central similarity loss function L_C, pairwise similarity loss function L_P and quantization loss function L_Q, as another practical implementation and for convenience of understanding, the following may be included:
(1) During training, the probability distribution predicted by the network is expected to approach the data distribution of the training set, and cross entropy can measure the distance between the two distributions. Since a hash center is a binary vector, binary cross entropy (BCE) can be used to measure the Hamming distance between a hash code and its center: the smaller the Hamming distance between a hash code h_i and its hash center c_i, the closer the hash code is to its corresponding center. The central similarity loss function L_C can therefore be defined as follows:

L_C = -\frac{1}{K}\sum_{i=1}^{N}\sum_{k=1}^{K}\left[c_{i,k}\log h_{i,k}+(1-c_{i,k})\log\left(1-h_{i,k}\right)\right]

where N is the number of samples, K is the length of the binary hash code, h_i denotes a hash code, and c_i denotes the semantic hash center of the sample;
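A minimal sketch of this central similarity loss as binary cross entropy between a relaxed code and its center. Mapping the Tanh outputs and the ±1 centers into [0, 1] before applying BCE, and using the mean rather than the 1/K normalization above, are assumptions the text leaves implicit.

```python
# Sketch of the central similarity loss L_C as BCE against the hash center.
import torch
import torch.nn.functional as F

def central_similarity_loss(h: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
    # h: (N, K) relaxed codes in [-1, 1]; c: (N, K) centers in {-1, +1}
    h01 = ((h + 1) / 2).clamp(1e-6, 1 - 1e-6)  # into (0, 1)
    c01 = (c + 1) / 2                          # into {0, 1}
    return F.binary_cross_entropy(h01, c01)    # mean over N and K

h = torch.tanh(torch.randn(4, 64))
c = torch.sign(torch.randn(4, 64))
print(central_similarity_loss(h, c))
```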
(2) The pairwise similarity loss function L_P links the Hamming distance between the hash codes of data that share only some labels in a multi-label dataset to their labels. For this purpose, the pairwise similarity loss function L_P can be defined as follows:

L_P = \frac{1}{N^{2}}\sum_{i=1}^{N}\sum_{j=1}^{N}\left[s_{ij}\,\frac{d_H(h_i,h_j)}{K}+\left(1-s_{ij}\right)\max\left(0,\;m-\frac{d_H(h_i,h_j)}{K}\right)\right]

where m is a hyper-parameter, K is the length of the hash code, the max(0, ·) term ensures that the value is not less than 0, and s_ij is the similarity computed from the label vector codes l_i and l_j of the samples;

the pairwise similarity loss function L_P thus reduces, during training on a multi-label dataset, the Hamming distance between the generated hash codes of data that share only some of their labels.
(3) The purpose of the quantization loss is to drive each entry of the output toward the binary values {-1, +1} with maximum probability. The quantization loss function L_Q can be defined as follows:

L_Q = \sum_{i=1}^{N}\left\lVert\,\left|h_i\right|-\mathbf{1}\,\right\rVert_{1}

where \mathbf{1} is an all-ones vector. Since this quantization loss function L_Q is non-smooth and its derivative is difficult to compute, the present application adopts the smooth function \log\cosh x instead, i.e., uses \left|x\right|\approx\log\cosh x; the quantization loss function L_Q can then be converted into:

L_Q = \sum_{i=1}^{N}\sum_{k=1}^{K}\log\cosh\left(\left|h_{i,k}\right|-1\right)

where h_{i,k} is the k-th entry of the hash code that learning forces toward binary values.
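A sketch of the smoothed quantization loss; averaging instead of summing over samples is an assumption made only to keep the value's scale independent of batch size.

```python
# Sketch of the quantization loss L_Q: log cosh(|h| - 1) pulls every relaxed
# code entry toward -1 or +1.
import torch

def quantization_loss(h: torch.Tensor) -> torch.Tensor:
    # h: (N, K) relaxed codes in [-1, 1]
    return torch.log(torch.cosh(h.abs() - 1.0)).mean()

print(quantization_loss(torch.tanh(torch.randn(4, 64))))
```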
Furthermore, as yet another practical implementation, the present application can configure the central similarity loss function L_C, the pairwise similarity loss function L_P and the quantization loss function L_Q more flexibly in practical applications, so as to enhance the corresponding network training effect.

Specifically, the three can be trained through a combined loss function, which can be defined as follows:

\min_{\Theta} L = L_C + \lambda_1 L_P + \lambda_2 L_Q

where \Theta is the set of all parameters learned by the deep hash function, and \lambda_1 and \lambda_2 are hyper-parameters obtained by grid search.
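Putting the three terms together, the following sketch shows one training objective that reuses the loss sketches above; the λ values are placeholders standing in for the grid-searched hyper-parameters, and the surrounding training-loop details are assumptions.

```python
# Sketch of the combined objective L = L_C + lambda_1 * L_P + lambda_2 * L_Q.
lambda_1, lambda_2 = 0.1, 0.01  # hypothetical grid-search results

def combined_loss(h, centers, labels):
    return (central_similarity_loss(h, centers)
            + lambda_1 * pairwise_similarity_loss(h, labels)
            + lambda_2 * quantization_loss(h))

# Typical use inside a training step:
#   h = hash_layer(gru_module(feature_extractor(images)))
#   loss = combined_loss(h, semantic_centers, labels)
#   loss.backward(); optimizer.step()
```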
The above is an introduction of the image feature hashing method provided by the present application, and in order to better implement the image feature hashing method provided by the present application, the present application further provides an image feature hashing device from the perspective of a functional module.
Referring to fig. 5, fig. 5 is a schematic structural diagram of an image feature hashing device according to the present application, in the present application, the image feature hashing device 500 may specifically include the following structure:
an obtaining unit 501, configured to obtain an image to be processed;
the hash unit 502 is configured to input an image to be processed into a deep hash network, where the deep hash network includes a feature extraction module, a long-term dependency module, and a hash layer, and in a working process of the deep hash network, the feature extraction module performs feature extraction on the input image to be processed to obtain a feature map as an image feature; the long-time dependence module takes each feature map output by the feature extraction module as a time sequence so as to detect the spatial correlation among the feature maps and obtain an enhanced feature map; the Hash layer carries out Hash coding on the enhanced feature graph output by the long-time dependency module to obtain a Hash code;
an extracting unit 503, configured to extract the hash code output by the deep hash network.
In yet another exemplary implementation, the long-term dependency module specifically adopts the network structure of a GRU.
In another exemplary implementation manner, the hash layer specifically adopts a network structure of three fully-connected layers to implement hash coding; the first two layers use a ReLU activation function, the last layer uses a hyperbolic tangent activation function, and the output of the last layer is the hash code.
In another exemplary implementation, the feature extraction module specifically adopts the network structure of a ResNet50 network and is configured with 50 two-dimensional convolutional layers comprising an initial convolution stage and four residual blocks; each part is followed by a batch normalization layer to accelerate training, a rectifier unit to avoid vanishing gradients, and a max pooling layer to realize downsampling, and the output of the last residual block is average-pooled.
In yet another exemplary implementation, the apparatus further includes a training unit 504 configured to:
performing network training on the deep hash network through sample images, and during training adopting a central similarity loss function L_C, a pairwise similarity loss function L_P and a quantization loss function L_Q;
the central similarity loss function L_C quantifies the Hamming distance between a hash code h_i and its corresponding hash center c_i to maintain central similarity learning;
the pairwise similarity loss function L_P relates the Hamming distance between the hash codes of data that share only some labels in a multi-label dataset to the correlation between their labels, so as to keep the hash codes of data pairs with similar labels as close as possible;
the quantization loss function L_Q drives each output toward the binary values {-1, +1} with maximum probability, to control the quality of the generated hash codes.
In yet another exemplary implementation, the central similarity loss function L_C is defined as follows:

L_C = -\frac{1}{K}\sum_{i=1}^{N}\sum_{k=1}^{K}\left[c_{i,k}\log h_{i,k}+(1-c_{i,k})\log\left(1-h_{i,k}\right)\right]

where N is the number of samples, K is the length of the binary hash code, h_i denotes a hash code, and c_i denotes the semantic hash center of the sample;

the pairwise similarity loss function L_P is defined as follows:

L_P = \frac{1}{N^{2}}\sum_{i=1}^{N}\sum_{j=1}^{N}\left[s_{ij}\,\frac{d_H(h_i,h_j)}{K}+\left(1-s_{ij}\right)\max\left(0,\;m-\frac{d_H(h_i,h_j)}{K}\right)\right]

where m is a hyper-parameter, K is the length of the hash code, the max(0, ·) term ensures that the value is not less than 0, and s_ij is the similarity computed from the label vector codes l_i and l_j of the samples;

the quantization loss function L_Q is defined as follows:

L_Q = \sum_{i=1}^{N}\sum_{k=1}^{K}\log\cosh\left(\left|h_{i,k}\right|-1\right)

where h_{i,k} is the k-th entry of the hash code that learning forces toward binary values.
In yet another exemplary implementation, the central similarity loss function L_C, the pairwise similarity loss function L_P and the quantization loss function L_Q are specifically trained through a combined loss function, which is defined as follows:

\min_{\Theta} L = L_C + \lambda_1 L_P + \lambda_2 L_Q

where \Theta is the set of all parameters learned by the deep hash function, and \lambda_1 and \lambda_2 are hyper-parameters obtained by grid search.
From a hardware structure perspective, the present application further provides a processing device. Referring to fig. 6, which shows a schematic structural diagram of the processing device of the present application, the processing device may include a processor 601, a memory 602 and an input/output device 603, where the processor 601 is configured to implement the steps of the image feature hashing method in the embodiment corresponding to fig. 1 when executing the computer program stored in the memory 602; alternatively, the processor 601 is configured to implement the functions of the units in the embodiment corresponding to fig. 5 when executing the computer program stored in the memory 602, and the memory 602 is configured to store the computer program required by the processor 601 to execute the image feature hashing method in the embodiment corresponding to fig. 1.
Illustratively, a computer program may be partitioned into one or more modules/units, which are stored in the memory 602 and executed by the processor 601 to accomplish the present application. One or more modules/units may be a series of computer program instruction segments capable of performing certain functions, the instruction segments being used to describe the execution of a computer program in a computer device.
The processing devices may include, but are not limited to, a processor 601, a memory 602, and input-output devices 603. It will be appreciated by a person skilled in the art that the illustration is merely an example of a processing device and does not constitute a limitation of the processing device and may comprise more or less components than those illustrated, or some components may be combined, or different components, e.g. the processing device may further comprise a network access device, a bus, etc. via which the processor 601, the memory 602, the input output device 603, etc. are connected.
The Processor 601 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, etc. The general purpose processor may be a microprocessor or the processor may be any conventional processor or the like, the processor being the control center for the processing device and the various interfaces and lines connecting the various parts of the overall device.
The memory 602 may be used to store computer programs and/or modules, and the processor 601 implements various functions of the computer apparatus by running or executing the computer programs and/or modules stored in the memory 602 and calling the data stored in the memory 602. The memory 602 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required for at least one function, and the like, and the data storage area may store data created according to the use of the processing device, and the like. In addition, the memory may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage devices.
The processor 601, when executing the computer program stored in the memory 602, may specifically implement the following functions:
acquiring an image to be processed;
inputting the image to be processed into a deep hash network, where the deep hash network comprises a feature extraction module, a long-term dependency module and a hash layer; during the operation of the deep hash network, the feature extraction module performs feature extraction on the input image to be processed to obtain feature maps as image features; the long-term dependency module treats the feature maps output by the feature extraction module as a time sequence so as to detect the spatial correlation among the feature maps and obtain enhanced feature maps; and the hash layer hash-codes the enhanced feature maps output by the long-term dependency module to obtain a hash code;
and extracting the hash code output by the deep hash network.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the hash apparatus, the processing device and the corresponding units of the image features described above may refer to the description of the hash method of the image features in the embodiment corresponding to fig. 1, and are not described herein again in detail.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by associated hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by a processor.
For this reason, the present application provides a computer-readable storage medium, where a plurality of instructions are stored, where the instructions can be loaded by a processor to execute the steps of the image feature hashing method in the embodiment corresponding to fig. 1 in the present application, and specific operations may refer to the description of the image feature hashing method in the embodiment corresponding to fig. 1, which is not described herein again.
Wherein the computer-readable storage medium may include: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Since the instructions stored in the computer-readable storage medium can execute the steps of the image feature hashing method in the embodiment corresponding to fig. 1, the beneficial effects that can be achieved by the image feature hashing method in the embodiment corresponding to fig. 1 can be achieved, which are described in detail in the foregoing description and are not repeated herein.
The foregoing describes in detail the hashing method, apparatus, processing device and computer-readable storage medium for image features provided in the present application, and specific examples are applied herein to explain the principles and embodiments of the present application, and the description of the foregoing embodiments is only used to help understand the method and its core idea of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. A method for hashing an image feature, the method comprising:
the processing equipment acquires an image to be processed;
the processing equipment inputs the image to be processed into a deep hash network, wherein the deep hash network comprises a feature extraction module, a long-time dependence module and a hash layer; in the working process of the deep hash network, the feature extraction module performs feature extraction on the input image to be processed to obtain feature maps as image features; the long-time dependence module takes the feature maps output by the feature extraction module as a time sequence so as to detect the spatial correlation among the feature maps and obtain enhanced feature maps; and the hash layer performs hash coding on the enhanced feature maps output by the long-time dependence module to obtain a hash code;
the processing device extracts the hash code output by the deep hash network.
2. The method according to claim 1, wherein the long-time dependence module specifically employs a gated recurrent unit (GRU) network structure.
3. The method according to claim 1, wherein the hash layer specifically employs a network structure of three fully-connected layers to implement hash coding; the first two layers employ a ReLU activation function, the last layer employs a hyperbolic tangent activation function, and the output of the last layer is the hash code.
4. The method of claim 1, wherein the feature extraction module specifically adopts the network structure of a ResNet50 network; the feature extraction module is configured with fifty two-dimensional convolutional layer operations, comprising an initial convolution process and four residual blocks; each part is followed by a batch normalization layer to accelerate training and a rectified linear unit to avoid gradient vanishing, a maximum pooling layer realizes down-sampling, and the output of the last residual block is average-pooled.
5. The method of claim 1, further comprising:
the processing equipment carries out network training on the deep hash network through sample images, and in the training process, a central similarity loss function L_C, a pairwise similarity loss function L_P and a quantization loss function L_Q are adopted;
the central similarity loss function L_C is used to quantize the Hamming distance between a hash code h_i and its corresponding hash center c_i, so as to maintain central similarity learning;
the pairwise similarity loss function L_P is used to quantize, together with the correlation between the labels, the Hamming distance between the hash codes of data that share only part of their labels in a multi-label data set, so as to keep the hash codes of similarly labeled data pairs as close as possible;
the quantization loss function L_Q is used to make the network assign the binarized code with maximum probability, so as to control the quality of the generated hash code.
6. The method of claim 5, wherein the central similarity loss function L_C is defined as follows:

$$L_C = -\frac{1}{K}\sum_{i=1}^{N}\sum_{k=1}^{K}\left[c_{i,k}\log h_{i,k} + (1-c_{i,k})\log(1-h_{i,k})\right]$$

wherein N is the number of samples, K is the length of the binary hash code, h_i represents a hash code, and c_i represents the semantic hash center of the sample;

the pairwise similarity loss function L_P is defined as follows:

$$L_P = \gamma\sum_{i,j}\left(\frac{1}{2}\left(q + h_i^{\top}h_j\right) - q\,s_{ij}\right)^2,\qquad s_{ij}=\frac{\langle l_i,\, l_j\rangle}{\|l_i\|\,\|l_j\|}$$

wherein γ is a hyper-parameter, q is the length of the hash code, the shift by q ensures that the value of (q + h_i^T h_j)/2 is not less than 0, and l_i is the label vector code of the sample;

the quantization loss function L_Q is defined as follows:

$$L_Q = \sum_{i=1}^{N}\left\|h_i - b_i\right\|_2^2$$

wherein b_i = sgn(h_i) is the hash code for forced learning.
7. The method of claim 5, wherein the central similarity loss function L_C, the pairwise similarity loss function L_P and the quantization loss function L_Q are specifically trained through a combined loss function, which is defined as follows:

$$\min_{\Theta}\; L = L_C + \lambda_1 L_P + \lambda_2 L_Q$$

wherein Θ is the set of all parameters learned by the deep hash function, and λ₁ and λ₂ are hyper-parameters obtained by grid search.
8. An apparatus for hashing a feature of an image, the apparatus comprising:
an acquisition unit, configured to acquire an image to be processed;
a hash unit, configured to input the image to be processed into a deep hash network, wherein the deep hash network comprises a feature extraction module, a long-time dependence module and a hash layer; in the working process of the deep hash network, the feature extraction module performs feature extraction on the input image to be processed to obtain feature maps as image features; the long-time dependence module takes the feature maps output by the feature extraction module as a time sequence so as to detect the spatial correlation among the feature maps and obtain enhanced feature maps; and the hash layer performs hash coding on the enhanced feature maps output by the long-time dependence module to obtain a hash code;
and an extraction unit, configured to extract the hash code output by the deep hash network.
9. A processing device comprising a processor and a memory, wherein a computer program is stored in the memory, and the processor, when calling the computer program in the memory, performs the method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the method of any one of claims 1 to 7.
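To make the architecture recited in claims 1 to 4 concrete, the sketch below shows one plausible PyTorch reading of it: a ResNet50 backbone as the feature extraction module, a GRU that consumes the flattened feature maps as a time sequence as the long-time dependence module, and a three-layer fully-connected hash head with ReLU and tanh activations as the hash layer. The class name, the choice of channels as time steps and all layer widths are assumptions made for illustration, not the patent's reference implementation.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class DeepHashNet(nn.Module):
    """Sketch of the claimed pipeline: ResNet50 features, a GRU over the
    feature maps treated as a sequence, and a three-layer hash head."""
    def __init__(self, hash_bits: int = 64, gru_hidden: int = 512):
        super().__init__()
        backbone = resnet50(weights=None)
        # keep everything up to (and including) the last residual block
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        # the GRU reads the 2048 feature maps as a sequence of flattened maps
        self.gru = nn.GRU(input_size=7 * 7, hidden_size=gru_hidden,
                          batch_first=True)
        self.hash_head = nn.Sequential(
            nn.Linear(gru_hidden, 1024), nn.ReLU(),
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, hash_bits), nn.Tanh(),   # hash bits in (-1, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        fmaps = self.features(x)      # (B, 2048, 7, 7) for a 224x224 input
        seq = fmaps.flatten(2)        # (B, 2048, 49): feature maps as time steps
        enhanced, _ = self.gru(seq)   # (B, 2048, gru_hidden) enhanced sequence
        pooled = enhanced[:, -1, :]   # last step summarizes the whole sequence
        return self.hash_head(pooled) # (B, hash_bits)
```

A 224x224 input yields 2048 feature maps of size 7x7 from the last residual block, so the GRU here reads a sequence of 2048 flattened 49-dimensional maps; other orderings of the feature maps into a sequence would be consistent with the claim wording as well.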
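The joint objective of claims 5 to 7 can likewise be sketched. The snippet below follows the formulas as reconstructed in claims 6 and 7: binary cross-entropy against the semantic hash centers for L_C, a shifted inner product matched against label cosine similarity for L_P (with the hyper-parameter γ absorbed into λ₁), and a penalty pulling each code toward its sign for L_Q. Treat it as an interpretation under those reconstructions; variable names and reductions are assumptions.

```python
import torch
import torch.nn.functional as F

def combined_hash_loss(h, centers, labels, lam1=1e-4, lam2=1e-4):
    """Joint loss L = L_C + lam1 * L_P + lam2 * L_Q over a batch.

    h:       (N, K) tanh outputs of the hash layer, values in (-1, 1)
    centers: (N, K) float semantic hash centers with entries in {0, 1}
    labels:  (N, D) float multi-hot label vectors
    """
    n, k = h.shape
    # L_C: binary cross-entropy between the (0,1)-rescaled code and its center
    h01 = ((h + 1) / 2).clamp(1e-6, 1 - 1e-6)
    l_c = F.binary_cross_entropy(h01, centers, reduction="sum") / k
    # L_P: shifted inner products (always >= 0) matched against the cosine
    # similarity of the label vectors, scaled by the code length
    l_norm = F.normalize(labels, dim=1)
    s = l_norm @ l_norm.t()            # (N, N) pairwise label similarity
    inner = h @ h.t()                  # (N, N) code inner products in [-K, K]
    l_p = (((k + inner) / 2 - k * s) ** 2).sum()
    # L_Q: push every bit toward its binarized value sgn(h)
    l_q = ((h - torch.sign(h)) ** 2).sum()
    return l_c + lam1 * l_p + lam2 * l_q
```

In training, lam1 and lam2 would be chosen by the grid search mentioned in claim 7.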
CN202210813030.9A 2022-07-12 2022-07-12 Hash method and device for image features and processing equipment Pending CN114898104A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210813030.9A CN114898104A (en) 2022-07-12 2022-07-12 Hash method and device for image features and processing equipment


Publications (1)

Publication Number Publication Date
CN114898104A (en) 2022-08-12

Family

ID=82729260

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210813030.9A Pending CN114898104A (en) 2022-07-12 2022-07-12 Hash method and device for image features and processing equipment

Country Status (1)

Country Link
CN (1) CN114898104A (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111241310A (en) * 2020-01-10 2020-06-05 济南浪潮高新科技投资发展有限公司 Deep cross-modal Hash retrieval method, equipment and medium
CN112989120A (en) * 2021-05-13 2021-06-18 广东众聚人工智能科技有限公司 Video clip query system and video clip query method

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
LI YUAN et al.: "Central Similarity Quantization for Efficient Image and Video Retrieval", 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) *
QIBING QIN et al.: "Unsupervised Deep Multi-Similarity Hashing With Semantic Structure for Image Retrieval", IEEE Transactions on Circuits and Systems for Video Technology *
ZHANGJIE CAO et al.: "HashNet: Deep Learning to Hash by Continuation", 2017 IEEE International Conference on Computer Vision (ICCV) *
ZHENG ZHANG et al.: "Improved Deep Hashing with Soft Pairwise Similarity for Multi-label Image Retrieval", arXiv:1803.02987v3 [cs.CV] *
DING Bin (丁斌): "Research on Image and Video Retrieval Algorithms Based on Nonlinear Hashing", China Excellent Master's and Doctoral Dissertations Full-text Database (Master), Information Science and Technology *

Similar Documents

Publication Publication Date Title
CN109885709B (en) Image retrieval method and device based on self-coding dimensionality reduction and storage medium
Boureau et al. A theoretical analysis of feature pooling in visual recognition
CN112437926B (en) Fast robust friction ridge patch detail extraction using feedforward convolutional neural network
CN111177438B (en) Image characteristic value searching method and device, electronic equipment and storage medium
US11714921B2 (en) Image processing method with ash code on local feature vectors, image processing device and storage medium
US20200175259A1 (en) Face recognition method and apparatus capable of face search using vector
CN113869282B (en) Face recognition method, hyper-resolution model training method and related equipment
CN111988614A (en) Hash coding optimization method and device and readable storage medium
JP2023520625A (en) IMAGE FEATURE MATCHING METHOD AND RELATED DEVICE, DEVICE AND STORAGE MEDIUM
JP2015036939A (en) Feature extraction program and information processing apparatus
CN111241550B (en) Vulnerability detection method based on binary mapping and deep learning
CN114332500A (en) Image processing model training method and device, computer equipment and storage medium
CN116978011A (en) Image semantic communication method and system for intelligent target recognition
Tsai et al. A single‐stage face detection and face recognition deep neural network based on feature pyramid and triplet loss
CN114299304A (en) Image processing method and related equipment
CN115631330B (en) Feature extraction method, model training method, image recognition method and application
CN114077685A (en) Image retrieval method and device, computer equipment and storage medium
CN113743593B (en) Neural network quantization method, system, storage medium and terminal
CN114898104A (en) Hash method and device for image features and processing equipment
CN108536769B (en) Image analysis method, search method and device, computer device and storage medium
CN112307243A (en) Method and apparatus for retrieving image
CN115880556A (en) Multi-mode data fusion processing method, device, equipment and storage medium
CN113963241B (en) FPGA hardware architecture, data processing method thereof and storage medium
Liu et al. Margin-based two-stage supervised hashing for image retrieval
Žižakić et al. Learning local image descriptors with autoencoders

Legal Events

Date Code Title Description
PB01: Publication
SE01: Entry into force of request for substantive examination
RJ01: Rejection of invention patent application after publication (application publication date: 20220812)