WO2021036028A1 - Image feature extraction and network training method, apparatus, and device - Google Patents
Image feature extraction and network training method, apparatus, and device
- Publication number
- WO2021036028A1 (PCT/CN2019/120028)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- feature
- neighbor
- node
- training
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/42—Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
- G06V10/422—Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation for representing the structure of the pattern or shape of an object therefor
- G06V10/426—Graphical representations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
Definitions
- the present disclosure relates to computer vision technology, in particular to an image feature extraction and network training method, device and equipment.
- Image retrieval can include text-based image retrieval and content-based image retrieval (CBIR, Content-Based Image Retrieval) according to different ways of describing image content.
- content-based image retrieval technology has broad application prospects in industrial fields such as e-commerce, textiles and leather goods, copyright protection, medical diagnosis, public safety, and street view maps.
- the present disclosure provides at least one image feature extraction and network training method, device and equipment.
- an image feature extraction method includes:
- the first association graph including a master node and at least one neighbor node, the node value of the master node represents the image feature of the target image, and the node value of the neighbor node represents the image feature of the neighbor image,
- the neighbor image is an image similar to the target image;
- the first association graph is input to a feature update network, and the feature update network updates the node value of the master node according to the node values of the neighbor nodes in the first association graph to obtain the updated image feature of the target image.
- before obtaining the first association graph, the method further includes: obtaining neighbor images similar to the target image from an image library according to the target image.
- obtaining neighbor images similar to the target image from an image library includes: separately obtaining, through a feature extraction network, the image features of the target image and of each library image in the image library; and determining, based on the feature similarity between the image feature of the target image and the image feature of each library image in the image library, the neighbor images similar to the target image from the image library.
- determining a neighbor image similar to the target image includes: sorting the feature similarities between the target image and each of the library images in descending order of their values; and selecting the library images corresponding to the top preset number of feature similarities as the neighbor images similar to the target image.
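The top-ranked selection described above can be sketched as follows; cosine similarity is an illustrative assumption, since the patent does not fix a particular similarity measure:

```python
import numpy as np

def top_k_neighbors(query_feat, library_feats, k):
    """Rank library images by cosine similarity to the query feature
    (descending) and return the indices of the top-k as neighbor images."""
    q = query_feat / np.linalg.norm(query_feat)
    lib = library_feats / np.linalg.norm(library_feats, axis=1, keepdims=True)
    sims = lib @ q             # feature similarity per library image
    order = np.argsort(-sims)  # descending similarity
    return order[:k]

# Toy library: image 0 is nearly identical to the query, image 2 is opposite.
library = np.array([[1.0, 0.1], [0.0, 1.0], [-1.0, 0.0]])
query = np.array([1.0, 0.0])
print(top_k_neighbors(query, library, 2))  # [0 1]
```

The preset number of positions (k) controls how many neighbor nodes the association graph will contain.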
- determining a neighbor image similar to the target image from the image library includes: obtaining, from the library images, a first image similar to the target image according to the feature similarity between the image features of the target image and those of each library image; obtaining, from the library images, a second image similar to the first image according to the feature similarity between the image features of the first image and those of each library image; and using both the first image and the second image as neighbor images of the target image.
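The two-hop expansion above (first images similar to the target, then images similar to those first images) might be sketched as follows; the `knn` helper and the cosine metric are illustrative assumptions, not the patent's prescribed implementation:

```python
import numpy as np

def knn(feat, feats, k):
    """Indices of the k rows of `feats` most similar to `feat` (cosine)."""
    f = feat / np.linalg.norm(feat)
    lib = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return np.argsort(-(lib @ f))[:k]

def two_hop_neighbors(target_feat, library_feats, k1, k2):
    """First images: the k1 library images most similar to the target.
    Second images: for each first image, the k2 library images most
    similar to it. Both sets together form the target's neighbor images."""
    first = set(knn(target_feat, library_feats, k1).tolist())
    second = set()
    for i in first:
        second.update(knn(library_feats[i], library_feats, k2).tolist())
    return sorted(first | second)

library = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
print(two_hop_neighbors(np.array([1.0, 0.0]), library, 1, 2))  # [0, 1]
```

Expanding to second-hop neighbors lets the association graph capture images that are related to the target indirectly, through its closest matches.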
- the number of feature update networks is one, or N stacked in sequence, where N is an integer greater than 1; when the number of feature update networks is N, the input of the i-th feature update network is the updated first association graph output by the (i-1)-th feature update network, where i is an integer greater than 1 and less than or equal to N.
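The stacking of N feature update networks can be illustrated with a minimal sketch; the `update_fns` list and the toy update rule are hypothetical stand-ins for trained modules:

```python
import numpy as np

def stacked_update(graph, update_fns):
    """Apply N feature-update networks in sequence: network i receives the
    updated association graph produced by network i-1."""
    for fn in update_fns:  # fn: graph -> updated graph
        graph = fn(graph)
    return graph

# Toy check: each "network" adds the mean neighbor feature to the master.
graph = {"master": np.array([1.0, 0.0]),
         "neighbors": [np.array([0.0, 1.0]), np.array([0.0, 3.0])]}

def toy_update(g):
    mean = np.mean(g["neighbors"], axis=0)
    return {"master": g["master"] + mean, "neighbors": g["neighbors"]}

out = stacked_update(graph, [toy_update, toy_update])  # N = 2
print(out["master"])  # [1. 4.]
```

Each stacked module sees the master node feature already refined by its predecessor, so deeper stacks can aggregate information from further away in the neighbor structure.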
- the feature update network updating the node value of the master node according to the node values of the neighbor nodes in the first association graph to obtain the updated image feature of the target image includes: determining the weights between the master node and each neighbor node in the first association graph; combining the image features of the neighbor nodes according to the weights to obtain the weighted feature of the master node; and obtaining the updated image feature of the target image according to the image feature of the master node and the weighted feature.
- combining the image features of the neighbor nodes according to the weights to obtain the weighted feature of the master node includes: performing a weighted summation of the image features of the neighbor nodes according to the weights to obtain the weighted feature of the master node.
- obtaining the updated image feature of the target image according to the image feature of the master node and the weighted feature includes: splicing the image feature of the master node with the weighted feature; and performing nonlinear mapping on the spliced feature to obtain the updated image feature of the target image.
- determining the weight between the master node and a neighbor node in the first association graph includes: performing linear mapping on the master node and the neighbor node; determining the inner product of the linearly mapped master node and neighbor node; and determining the weight between the master node and the neighbor node according to the inner product after nonlinear processing.
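A minimal sketch of the weight computation just described (linear mapping, inner product, nonlinear processing, then normalization); the softmax normalization and the matrices `W_u`, `W_i` are assumptions consistent with the later formula discussion, not a definitive reading of the claim:

```python
import numpy as np

def attention_weights(z_u, z_v, W_u, W_i):
    """Weight between master node u and each neighbor v_i: linearly map
    both nodes, take the inner product of the mapped features, apply a
    nonlinearity (ReLU), then normalize with softmax (assumed here)."""
    hu = W_u @ z_u                     # linearly mapped master feature
    hv = z_v @ W_i.T                   # linearly mapped neighbor features
    scores = np.maximum(hv @ hu, 0.0)  # inner product + nonlinear processing
    e = np.exp(scores - scores.max())  # numerically stable softmax
    return e / e.sum()

# Identity mappings: the neighbor aligned with the master gets more weight.
w = attention_weights(np.array([1.0, 0.0]),
                      np.array([[1.0, 0.0], [0.0, 1.0]]),
                      np.eye(2), np.eye(2))
print(w)  # weights sum to 1, first neighbor dominates
```

The normalization guarantees the weights form a proper distribution over the neighbor nodes, so the later weighted summation stays on the same scale as a single neighbor feature.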
- the target image includes: a query image to be retrieved and each library image in an image library; after obtaining the updated image feature of the target image, the method further includes: obtaining a retrieval result from the library images according to the feature similarity between the image feature of the target image and the image features of the respective library images.
- a method for training a feature update network, where the feature update network is used to update the image features of an image; the method includes:
- the second association graph including a training master node and at least one training neighbor node, the node value of the training master node represents the image feature of the sample image, and the node value of the training neighbor node represents the image feature of the training neighbor image; the training neighbor image is an image similar to the sample image;
- the second association graph is input to a feature update network, and the feature update network updates the node value of the training master node according to the node values of the training neighbor nodes in the second association graph to obtain the updated image feature of the sample image;
- before obtaining the second association graph, the method further includes: obtaining the training neighbor images similar to the sample image from a training image library according to the sample image.
- before acquiring the training neighbor image similar to the sample image from the training image library according to the sample image, the method further includes: extracting image features of the training image through a feature extraction network; obtaining the prediction information of the training image according to the image features of the training image; and adjusting the network parameters of the feature extraction network based on the prediction information and label information of the training image.
- obtaining the training neighbor image similar to the sample image from a training image library includes: obtaining, through the feature extraction network, the image features of the sample image and of each library image in the training image library; and determining, based on the feature similarity between the image feature of the sample image and the image feature of each library image, the training neighbor images similar to the sample image.
- an image feature extraction device includes:
- the graph acquisition module is configured to acquire a first association graph, the first association graph including a master node and at least one neighbor node; the node value of the master node represents the image feature of the target image, the node value of a neighbor node represents the image feature of a neighbor image, and a neighbor image is an image similar to the target image;
- the feature update module is configured to input the first association graph into a feature update network, and the feature update network updates the node value of the master node according to the node values of the neighbor nodes in the first association graph to obtain the updated image feature of the target image.
- the device further includes: a neighbor acquisition module, configured to acquire neighbor images similar to the target image from the image library according to the target image before the graph acquisition module acquires the first association graph.
- the neighbor acquisition module is configured to: separately acquire, through a feature extraction network, the image features of the target image and the image features of each library image in the image library; and determine, based on the feature similarity between the image feature of the target image and the image features of the library images, a neighbor image similar to the target image from the image library.
- the neighbor acquisition module is further configured to: sort the feature similarities between the target image and each of the library images in descending order of their values; and select the library images corresponding to the top preset number of feature similarities as the neighbor images similar to the target image.
- the neighbor acquisition module is further configured to: obtain, from the library images, a first image similar to the target image according to the feature similarity between the image features of the target image and those of each library image; obtain, from the library images, a second image similar to the first image according to the feature similarity between the image features of the first image and those of each library image; and use the first image and the second image as neighbor images of the target image.
- the number of feature update networks is one, or N stacked in sequence, where N is an integer greater than 1; when the number of feature update networks is N, the input of the i-th feature update network is the updated first association graph output by the (i-1)-th feature update network, where i is an integer greater than 1 and less than or equal to N.
- the feature update module is configured to: determine the weights between the master node and each neighbor node in the first association graph; combine the image features of the neighbor nodes according to the weights to obtain the weighted feature of the master node; and obtain the updated image feature of the target image according to the image feature of the master node and the weighted feature.
- the feature update module is further configured to: perform a weighted summation of the image features of each neighbor node according to the weight to obtain the weighted feature of the master node.
- the feature update module is further configured to: splice the image feature of the master node with the weighted feature, and perform nonlinear mapping on the spliced feature to obtain the updated image feature of the target image.
- the feature update module is further configured to: perform linear mapping on the master node and the neighbor node; determine the inner product of the linearly mapped master node and neighbor node; and determine the weight between the master node and the neighbor node according to the inner product after nonlinear processing.
- a training device for a feature update network includes:
- the association graph obtaining module is configured to obtain a second association graph, the second association graph including a training master node and at least one training neighbor node; the node value of the training master node represents the image feature of the sample image, the node value of the training neighbor node represents the image feature of the training neighbor image, and the training neighbor image is an image similar to the sample image;
- the update processing module is configured to input the second association graph into a feature update network, and the feature update network updates the node value of the training master node according to the node values of the training neighbor nodes in the second association graph to obtain the updated image feature of the sample image;
- the parameter adjustment module is configured to obtain prediction information of the sample image according to the image characteristics of the updated sample image; adjust the network parameters of the feature update network according to the prediction information.
- the device further includes: an image acquisition module, configured to acquire the training neighbor images similar to the sample images from the training image library according to the sample images before the association graph obtaining module obtains the second association graph.
- the device further includes: a pre-training module configured to: extract image features of the training image through a feature extraction network; obtain prediction information of the training image based on the image features of the training image; and adjust the network parameters of the feature extraction network based on the prediction information and label information of the training image. The training image is the image used to train the feature extraction network, and the sample image is the image used to train the feature update network after the feature extraction network training is completed.
- the image acquisition module is configured to: separately acquire, through the feature extraction network, the image features of the sample image and the image features of each library image in the training image library; and determine, based on the feature similarity between the image feature of the sample image and the image feature of each library image, the training neighbor images similar to the sample image.
- an electronic device in a fifth aspect, includes a memory and a processor.
- the memory is used to store computer instructions that can be run on the processor.
- the processor is configured to, when the computer instructions are executed, implement the image feature extraction method described in any embodiment of the present disclosure, or the method for training a feature update network described in any embodiment of the present disclosure.
- a computer-readable storage medium on which a computer program is stored.
- when the program is executed by a processor, the image feature extraction method according to any embodiment of the present disclosure, or the method for training a feature update network according to any embodiment of the present disclosure, is implemented.
- a computer program is provided, which is used to cause a processor to execute the image feature extraction method according to any embodiment of the present disclosure, or the method for training a feature update network according to any embodiment of the present disclosure.
- FIG. 1 is an image feature extraction method provided by at least one embodiment of the present disclosure.
- FIG. 2 is a processing flow of a feature update network provided by at least one embodiment of the present disclosure.
- FIG. 3 is a method for training a feature update network provided by at least one embodiment of the present disclosure.
- FIG. 4 is a method for training a feature update network provided by at least one embodiment of the present disclosure.
- FIG. 5 is a schematic diagram of acquired neighbor images provided by at least one embodiment of the present disclosure.
- FIG. 6 is a schematic diagram of an association graph provided by at least one embodiment of the present disclosure.
- FIG. 7 is an image retrieval method provided by at least one embodiment of the present disclosure.
- FIG. 8 is a schematic diagram of a sample image and a library image provided by at least one embodiment of the present disclosure.
- FIG. 9 is a schematic diagram of a neighbor image search provided by at least one embodiment of the present disclosure.
- FIG. 10 is a network structure of a feature update network provided by at least one embodiment of the present disclosure.
- FIG. 11 is an image feature extraction device provided by at least one embodiment of the present disclosure.
- FIG. 12 is an image feature extraction device provided by at least one embodiment of the present disclosure.
- FIG. 13 is a training device for a feature update network provided by at least one embodiment of the present disclosure.
- FIG. 14 is a training device for a feature update network provided by at least one embodiment of the present disclosure.
- image retrieval can include text-based image retrieval and content-based image retrieval.
- when performing content-based image retrieval, a computer may be used to extract image features, establish an image feature vector description, and store it in an image feature database.
- the same feature extraction method can then be used to extract the image features of the query image to obtain a query vector, calculate the similarity between the query vector and each image feature in the image feature library under a similarity measurement criterion, and finally sort by similarity and output the corresponding images in order.
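The retrieval flow just described (extract the query vector, compare against the feature library under a similarity measure, sort, output in order) can be sketched as follows, assuming cosine similarity as the measurement criterion:

```python
import numpy as np

def retrieve(query_feat, feature_library, names):
    """Content-based retrieval: cosine similarity between the query vector
    and every stored feature, then output image names by descending score."""
    q = query_feat / np.linalg.norm(query_feat)
    lib = feature_library / np.linalg.norm(feature_library, axis=1,
                                           keepdims=True)
    sims = lib @ q
    order = np.argsort(-sims)  # descending similarity
    return [names[i] for i in order]

# Hypothetical feature library with one vector per stored image.
names = ["cat.jpg", "dog.jpg", "car.jpg"]
feats = np.array([[0.9, 0.1], [0.2, 0.9], [-0.5, 0.5]])
print(retrieve(np.array([1.0, 0.0]), feats, names))
```

In a real system the feature vectors would come from the trained feature extraction network; the two-dimensional vectors here are only for illustration.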
- FIG. 1 is an image feature extraction method provided by at least one embodiment of the present disclosure. As shown in FIG. 1, the method can include the following processing:
- a first association graph is obtained, the first association graph includes a master node and at least one neighbor node, the node value of the master node represents the image feature of the target image, and the node value of the neighbor node represents the neighbor The image feature of the image, and the neighbor image is an image similar to the target image.
- the target image is an image whose image features are to be extracted.
- the image can be an image in different application scenarios. For example, it can be an image to be retrieved in an image retrieval application.
- the image library mentioned below may be the retrieval image library in an image retrieval application.
- the neighbor images may be obtained from the image library according to the target image before the first association graph is obtained.
- neighbor images can be determined according to an image feature similarity measurement criterion: for example, the image features of the target image and the image features of each library image in the image library are obtained through a feature extraction network, and neighbor images similar to the target image are determined from the image library based on the feature similarity between the image features of the target image and the image features of the library images.
- the feature similarities between the target image and each of the library images can be sorted in descending order of their values, and the library images corresponding to the top N feature similarities are used as the neighbor images similar to the target image.
- N is a preset count, for example the top 10.
- a first image similar to the target image may be acquired according to the similarity between image features, then a second image similar to the first image may be acquired, and both the first image and the second image are regarded as neighbor images of the target image.
- step 102: the first association graph is input to a feature update network, and the feature update network updates the node value of the master node according to the node values of the neighbor nodes in the first association graph to obtain the updated image feature of the target image.
- the feature update network may be an attention-based graph convolution module (AGCN for short), or it may be another module without limitation.
- the graph convolution module in this step can update the node value of the master node according to the node values of the neighbor nodes. For example, it can determine the weight between the master node and each neighbor node in the first association graph, combine the image features of the neighbor nodes according to the weights to obtain the weighted feature of the master node, and obtain the updated image feature of the target image according to the image feature of the master node and the weighted feature.
- the flow shown in FIG. 2 below exemplarily describes the specific process by which the graph convolution module updates the node value of the master node.
- the number of the graph convolution module may be one, or multiple stacked in sequence.
- the first association graph is input to the first graph convolution module, which updates the image feature of the master node according to the image features of the neighbor nodes and outputs the updated first association graph.
- the updated first association graph is then input to the second graph convolution module, which again updates the image feature of the master node according to the image features of the neighbor nodes and outputs the first association graph updated once more, in which the image feature of the master node has again been updated.
- the first association graph in this embodiment includes multiple nodes (for example, a master node, a neighbor node), and the node value of each node represents the image feature of the image represented by the node.
- each node in the first association graph can serve as the master node, and the image feature of the image corresponding to that node is updated by the method described in FIG. 1 of this embodiment: when a node serves as the master node, the first association graph with that node as the master node is obtained, and the first association graph is input into the feature update network to update the image feature of that node.
- the image feature extraction method of this embodiment uses the feature update network of the embodiment of the present disclosure to update and extract image features. Because the feature update network updates the image feature of the master node according to the image features of the master node's neighbor nodes, the updated image feature of the target image can express the target image more accurately, making it more robust and discriminative in the image recognition process.
- Fig. 2 illustrates the processing flow of the feature update network in an embodiment, which describes how the feature update network updates the image features of the image input to the network.
- the processing flow of the feature update network may include the following steps 200-204.
- step 200: the weights between the master node and each neighbor node are determined according to the image features of the master node and the neighbor nodes.
- the master node may be the target image in the network application stage, and the neighbor node may be the neighbor image of the target image.
- the weight between the master node and each neighbor node can be determined as shown in formula (1):

  a_i = softmax_i( ReLU( F(W_u z_u, W_i z_vi) ) )    (1)

- the image feature z_u of the master node and the image feature z_vi of each neighbor node are first linearly transformed, where vi denotes one of the neighbor nodes of the master node, k denotes the number of neighbor nodes, and W_u and W_i are the coefficients of the linear transformation.
- the inner product of the linearly transformed features is computed by the function F, the nonlinear transformation is realized through ReLU (Rectified Linear Unit), and the weight is finally obtained after a softmax operation.
- the weight a_i is the weight between the master node u and the neighbor node vi.
- the calculation of the weight between the master node and a neighbor node in this step is not limited to the above formula (1); for example, the similarity between the image features of the master node and the neighbor node can also be used as the weight between the two.
- step 202: a weighted summation of the image features of the neighbor nodes is performed according to the weights to obtain the weighted feature of the master node.
- the image features of each neighbor node of the master node can first be non-linearly mapped, and then the weights obtained in step 200 are used to perform a weighted summation of the non-linearly mapped image features of the neighbor nodes; the resulting feature can be called the weighted feature.
- as shown in formula (2):

  n_u = sum_{i=1..k} a_i * ReLU(Q z_vi + q)    (2)

- where n_u is the weighted feature, z_vi is the image feature of neighbor node vi, a_i is the weight calculated in step 200, and Q and q are the coefficients of the nonlinear mapping.
- step 204: the updated image feature of the target image is obtained according to the image feature of the master node and the weighted feature, as shown in formula (3):

  z_u' = ReLU( W [z_u; n_u] + w )    (3)

- where z_u is the image feature of the master node in the association graph, n_u is the weighted feature, and [z_u; n_u] denotes the splicing of the two; the nonlinear mapping is performed through ReLU, and W and w are the coefficients of the nonlinear mapping.
- the node value of the master node in the first association graph is thereby updated, and the updated image feature of the master node is obtained.
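Putting the weight computation, weighted summation, and splicing steps together, one possible NumPy sketch of a single feature-update step is shown below; all parameter matrices are illustrative placeholders for learned weights, and the exact forms of F and the nonlinear mappings are assumptions based on the surrounding description:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def update_master(z_u, z_v, params):
    """One feature-update step over an association graph:
    weights from a softmax over ReLU'd inner products of linearly mapped
    features, a weighted sum of nonlinearly mapped neighbor features, then
    a nonlinear mapping of the spliced master + weighted feature."""
    W_u, W_i, Q, q, W, w = (params[k] for k in
                            ("W_u", "W_i", "Q", "q", "W", "w"))
    scores = relu((z_v @ W_i.T) @ (W_u @ z_u))  # pre-softmax attention scores
    e = np.exp(scores - scores.max())
    a = e / e.sum()                             # attention weights a_i
    n_u = a @ relu(z_v @ Q.T + q)               # weighted feature n_u
    cat = np.concatenate([z_u, n_u])            # splice master + weighted
    return relu(W @ cat + w)                    # updated master feature

d = 2
rng = np.random.default_rng(0)
params = {"W_u": np.eye(d), "W_i": np.eye(d),
          "Q": np.eye(d), "q": np.zeros(d),
          "W": rng.standard_normal((d, 2 * d)), "w": np.zeros(d)}
z_u = np.array([1.0, 0.0])
z_v = np.array([[0.5, 0.5], [0.0, 1.0]])
out = update_master(z_u, z_v, params)
print(out.shape)  # (2,)
```

The updated feature keeps the master node's dimensionality, so the same update can be applied again by a stacked module.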
- the processing flow of the feature update network of this embodiment uses the graph convolution module to perform a weighted summation of the image features of the neighbor nodes of the master node to determine the weighted feature of the master node, so that the image features of the target image itself and the image features of its associated neighbor images can be considered together; the updated image features of the target image are therefore more robust and discriminative, and the accuracy of image retrieval is improved.
- Fig. 3 is a method for training a feature update network provided by at least one embodiment of the present disclosure. As shown in Fig. 3, the method describes the training process of a feature update network and may include the following processing:
- step 300: training neighbor images similar to the sample image are obtained from the training image library according to the sample image used to train the feature update network.
- the word "training" is used only to indicate that these concepts apply in the training phase of the network and to distinguish them from the neighbor images mentioned in the network application phase; the naming of the "training image library" likewise does not constitute any restriction.
- the "training master node" and "training neighbor node" mentioned in the following description are also distinguished from the same concepts in the network application stage only in name, and do not constitute any restriction.
- mini-batch training can be used: the training samples can be divided into multiple image subsets (batches), and each training iteration inputs one image subset to the feature update network; the loss values of the sample images included in the subset are combined, and the network parameters are adjusted by backpropagating the loss. After one training iteration is completed, the next image subset can be input to the feature update network for the next iteration.
- each image in an image subset (batch) can be referred to as a sample image; each sample image undergoes the processing of steps 300 to 306, and its loss value is obtained from the prediction information and the label information.
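The mini-batch training procedure described above can be illustrated with a toy sketch; a linear classifier with softmax cross-entropy stands in for the feature update network and its prediction head, which is an assumption for demonstration only:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def train_in_batches(features, labels, n_classes, batch_size=2,
                     lr=0.5, epochs=20):
    """Split samples into image subsets (batches), compute the loss
    gradient of each subset, and adjust parameters by gradient descent."""
    W = np.zeros((features.shape[1], n_classes))
    for _ in range(epochs):
        for start in range(0, len(features), batch_size):
            xb = features[start:start + batch_size]
            yb = labels[start:start + batch_size]
            probs = softmax(xb @ W)               # prediction information
            probs[np.arange(len(yb)), yb] -= 1.0  # dCE/dlogits
            W -= lr * xb.T @ probs / len(yb)      # backpropagate & adjust
    return W

# Two linearly separable classes as stand-in image features and labels.
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
y = np.array([0, 0, 1, 1])
W = train_in_batches(X, y, n_classes=2)
print((np.argmax(X @ W, axis=1) == y).all())  # True
```

In the patent's setting, the gradient would flow through the feature update network's parameters rather than a single weight matrix, but the batch/loss/adjust cycle is the same.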
- the training image database may be a retrieval image database, that is, an image similar to a sample image will be retrieved from the retrieval image database.
- The similarity may include containing the same object as the sample image, or belonging to the same category as the sample image.
- an image similar to the sample image can be called a "training neighbor image”.
- The method for obtaining the training neighbor image may be, for example, determining images with higher similarity as training neighbor images according to the feature similarity between the images.
- In step 302, a second association graph is obtained.
- the second association graph includes a training master node and at least one training neighbor node.
- the node value of the training master node represents the image feature of the sample image.
- The node value of the training neighbor node represents the image feature of the training neighbor image, and the training neighbor image is an image similar to the sample image.
- The association graph in the network training phase may be called the second association graph, and the association graph that appeared in the network application phase above may be called the first association graph.
- the second association graph may include multiple nodes.
- the nodes in the second association graph may include: a training master node and at least one training neighbor node.
- the training master node represents a sample image
- each training neighbor node represents a training neighbor image determined in step 300.
- the node value of each node is an image feature.
- the node value of the training master node is the image feature of the sample image
- the node value of the training neighbor node is the image feature of the training neighbor image.
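- The structure of the second association graph described above can be sketched in code; the function name and the star-shaped edge list are illustrative assumptions about one possible representation:

```python
def build_association_graph(sample_feature, neighbor_features):
    """Row 0 holds the training master node's value (the sample image's
    feature); the remaining rows hold the training neighbor nodes' values
    (the training neighbor images' features)."""
    nodes = [list(sample_feature)] + [list(f) for f in neighbor_features]
    # Connect every training neighbor node to the training master node.
    edges = [(0, j) for j in range(1, len(nodes))]
    return nodes, edges

nodes, edges = build_association_graph([1.0] * 4, [[0.0] * 4] * 3)
```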
- In step 304, the second association graph is input to a feature update network, and the feature update network updates the node value of the training master node according to the node values of the training neighbor nodes in the second association graph.
- the feature update network may be a graph convolution module or other types of modules, which is not limited here.
- For example, the graph convolution module is an attention-based graph convolution network (AGCN), which updates the image feature of the training master node according to the image features of the training neighbor nodes in the second association graph; for example, the image feature of the training master node can be updated after a weighted summation of the image features of each training neighbor node.
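- The weighted summation described above can be sketched as follows; identity projections are assumed for brevity, so the attention weights here come directly from inner products between raw features (an assumption, not the patent's exact AGCN form):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def update_master_feature(master, neighbors):
    """Attention-based update sketch: weights are a softmax over the
    inner product between the master feature and each neighbor feature;
    the master node's value becomes the weighted sum of neighbor features."""
    scores = [dot(master, n) for n in neighbors]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(master)
    return [sum(w * n[d] for w, n in zip(weights, neighbors))
            for d in range(dim)]

updated = update_master_feature([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

- The more similar a neighbor is to the master node, the larger its weight in the summation.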
- The number of graph convolution modules may be one, or multiple stacked in sequence.
- For example, when the number of graph convolution modules is two, the second association graph is input to the first graph convolution module, and the first graph convolution module updates the image feature of the training master node according to the image features of each training neighbor node; in the second association graph output by the first graph convolution module, the image feature of the training master node has been updated, and this is the updated second association graph.
- The updated second association graph is then input to the second graph convolution module, and the second graph convolution module continues to update the image feature of the training master node according to the image features of each training neighbor node, outputting the image feature of the training master node updated once more.
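- The sequential stacking described above can be sketched as follows; a simple mean aggregation stands in for the actual graph convolution, purely for illustration:

```python
def mean_conv(nodes):
    """One illustrative graph-convolution step: the master node (row 0)
    takes the mean of all node values; neighbor values stay unchanged."""
    dim = len(nodes[0])
    averaged = [sum(n[d] for n in nodes) / len(nodes) for d in range(dim)]
    return [averaged] + [list(n) for n in nodes[1:]]

def stacked_update(nodes, num_modules):
    """Stack modules in sequence: each module receives the updated
    association graph output by the previous module."""
    for _ in range(num_modules):
        nodes = mean_conv(nodes)
    return nodes

out = stacked_update([[0.0], [1.0]], 2)
```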
- In step 306, the prediction information of the sample image is obtained according to the image feature of the sample image extracted by the feature update network.
- For example, the prediction information of the sample image can be further determined according to the image features extracted by the graph convolution module.
- For example, the graph convolution module can be connected to a classifier, and the classifier obtains, according to the image feature, the probability that the sample image belongs to each preset category.
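- A classifier head of this kind can be sketched as a linear layer followed by softmax; `weights` (one row per preset category) and `biases` are assumed parameters, and this is only one common realization:

```python
import math

def classify(image_feature, weights, biases):
    """Return the probability that the image belongs to each preset
    category: a linear layer followed by softmax."""
    logits = [sum(w * x for w, x in zip(row, image_feature)) + b
              for row, b in zip(weights, biases)]
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = classify([2.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```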
- In step 308, the network parameters of the feature update network are adjusted according to the prediction information.
- For example, the loss value corresponding to the sample image can be determined according to the difference between the prediction information output by the feature update network and the label information.
- the network parameters of the graph convolution module can be adjusted by backpropagation according to the loss value of each sample image in a batch. This enables the graph convolution module to extract image features more accurately according to the adjusted network parameters.
- For example, the coefficients W_i, W_u, q, w and the other parameters of the graph convolution module mentioned in the description of Fig. 2 can be adjusted according to the loss value.
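- The per-batch loss that drives this adjustment can be sketched as follows; cross-entropy is a common choice, but the patent does not fix a specific loss form, so this is an assumption:

```python
import math

def cross_entropy(probs, label_index):
    """Loss for one sample image: negative log-probability of the
    labeled category."""
    return -math.log(probs[label_index] + 1e-12)

def batch_loss(batch_probs, labels):
    """Combine the loss values of every sample image in the image subset;
    this scalar is what backpropagation uses to adjust the parameters."""
    losses = [cross_entropy(p, y) for p, y in zip(batch_probs, labels)]
    return sum(losses) / len(losses)

loss = batch_loss([[0.9, 0.1], [0.2, 0.8]], [0, 1])
```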
- When training the network, the training method of the feature update network of this embodiment updates the image features of the sample image by combining similar images of the sample image, so that the image features of the sample image itself and the image features of the associated training neighbor images can be comprehensively considered. The image features of the sample images obtained by using the trained feature update network are therefore more robust and discriminative, which improves the accuracy of image retrieval. For example, even under changes in illumination, scale, and viewing angle, relatively accurate image features can still be obtained.
- Figure 4 illustrates another embodiment of the feature update network training method.
- Image features can be extracted through a pre-trained network for feature extraction (which can be called a feature extraction network), and similarity measurement can be performed based on the image features to obtain training neighbor images similar to the sample image from the training image library.
- the method may include:
- In step 400, a network for feature extraction is pre-trained using the training set.
- The pre-trained network used to extract features can be called a feature extraction network, including but not limited to: a Convolutional Neural Network (CNN), a BP (Back Propagation) neural network, a discrete Hopfield network, etc.
- the images in the training set can be called training images.
- The training process of the feature extraction network may include: extracting the image features of the training image through the feature extraction network; obtaining the prediction information of the training image according to the image features of the training image; and adjusting the network parameters of the feature extraction network based on the prediction information and label information of the training image.
- The above-mentioned training image refers to the image used to train the feature extraction network, while the aforementioned sample image refers to the image that will be used in the training process of the feature update network after the training of the feature extraction network is completed.
- For example, through the pre-trained feature extraction network, the image features of the sample images and the image features of each library image in the training image library are first extracted; after the association graph is generated, it is input to the feature update network to update the image features. The input image used in the training process of the feature update network is the sample image.
- the sample image and the training image can be the same or different.
- In step 402, the image features of the sample image and of each library image in the training image library are respectively obtained through the feature extraction network.
- In step 404, according to the feature similarity between the image features of the sample image and those of each library image, a first image similar to the sample image is obtained from the library images.
- the library image is the image in the search image library.
- For example, the feature similarity between the image feature of the sample image and the image feature of each library image can be calculated separately, and the library images can be sorted according to similarity, for example, in order of similarity from high to low. Then, the library images ranked in the top K are selected from the sorting result as the first images of the sample image.
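- The top-K selection described above can be sketched as follows; cosine similarity is assumed as the feature similarity measure (the patent does not fix one):

```python
import math

def top_k_similar(query, library, k):
    """Rank library images by cosine feature similarity (high to low)
    and keep the indices of the top-K as first images."""
    def cosine(a, b):
        num = sum(x * y for x, y in zip(a, b))
        den = (math.sqrt(sum(x * x for x in a))
               * math.sqrt(sum(y * y for y in b)))
        return num / den
    ranked = sorted(range(len(library)),
                    key=lambda i: -cosine(query, library[i]))
    return ranked[:k]

first_images = top_k_similar([1.0, 0.0],
                             [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]], 2)
```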
- node 31 represents a sample image
- the library images represented by node 32, node 33, and node 34 are all first images that are similar to the sample image.
- In step 406, a second image similar to the first image is obtained from the library images according to the feature similarity between the image features of the first image and those of the library images.
- the feature similarity between the image features of the first image and the library image can be calculated, and a library image similar to the first image is obtained from the library image as the second image.
- nodes 35 to 37 are library images similar to node 32, and nodes 35 to 37 are second images similar to node 31.
- nodes 38 to 40 similar to node 34 are also second images similar to node 31.
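- The second-layer search of steps 404 and 406 can be sketched as follows; the function name and the deduplication rule are illustrative assumptions:

```python
def expand_second_images(first_indices, library, k, similarity):
    """For each first image, take its own top-K most similar library
    images; any hit that is not itself a first image becomes a second
    image of the sample image."""
    second = set()
    for i in first_indices:
        ranked = sorted(range(len(library)),
                        key=lambda j: -similarity(library[i], library[j]))
        for j in ranked[:k]:
            if j not in first_indices:
                second.add(j)
    return sorted(second)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

lib = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
second_images = expand_second_images([0], lib, 2, dot)
```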
- FIG. 5 is an example situation.
- For example, only the first images similar to the master node corresponding to the sample image may be found, and the search for neighbor images is then stopped.
- Alternatively, a larger number of neighbor images, such as third images or fourth images, can also be found.
- How many layers of neighbor images to search for can be determined according to actual test results in different application scenarios.
- the above-mentioned first image, second image, etc. can all be called neighbor images.
- In the network training stage, they can be called training neighbor images; in the network application stage, they can be called neighbor images.
- the neighbor image can also be obtained in other ways than the example in this step.
- a similarity threshold can be set, and all or part of the library images whose feature similarity is higher than the threshold are directly used as neighbor images of the sample image.
- In step 408, a second association graph is generated based on the sample image and the neighbor images.
- The nodes in the second association graph include: a training master node representing the sample image and at least one training neighbor node representing a neighbor image; the node value of the training master node is the image feature of the sample image, and the node value of the training neighbor node is the image feature of the neighbor image.
- the neighbor image in this step includes the first image obtained in step 404 and the second image obtained in step 406.
- The second association graph generated in this step is a graph including multiple nodes; reference may be made to the example in Fig. 6.
- the node 31 in FIG. 6 is the training master node, and all other nodes are training neighbor nodes.
- the node value may be an image feature of the image represented by the node, and the image feature may be extracted in step 402, for example.
- In step 410, the second association graph is input to a feature update network; the feature update network updates the image feature of the training master node according to the image features of the training neighbor nodes in the second association graph to obtain the updated image feature of the sample image, and the prediction information of the sample image is obtained according to the updated image feature.
- In step 412, according to the prediction information of the sample image, the network parameters of the feature update network and the network parameters of the feature extraction network are adjusted.
- The network parameter adjustment in this step may or may not adjust the network parameters of the feature extraction network, which can be determined according to the actual training situation.
- When training the network, the training method of the feature update network of this embodiment updates the image features of the sample image by combining similar images of the sample image, so that the image features of the sample image itself and the image features of other associated images can be comprehensively considered. The image features of the sample images obtained by using the trained feature update network are therefore more robust and discriminative, which improves the accuracy of image retrieval. Moreover, extracting image features through the feature extraction network not only improves the efficiency of image feature extraction and thus the network training speed, but also allows the network parameters of the feature extraction network to be adjusted according to the loss value, making the extracted image features more accurate.
- the embodiment of the present disclosure also provides an image retrieval method, which is to retrieve an image similar to the target image from an image database.
- the method may include the following processing:
- In step 700, the target image to be retrieved is acquired.
- The image M may be referred to as a target image; that is, images that have a certain association with the target image are retrieved from the image library, where the association may include containing the same object or belonging to the same category.
- In step 702, the image features of the target image are extracted.
- the image feature extraction method described in any embodiment of the present disclosure can be used.
- In step 704, the image features of each library image in the image library are extracted.
- the image features of each library image in the image library can be extracted according to the image feature extraction method described in any embodiment of the present disclosure, for example, the extraction method shown in FIG. 1.
- In step 706, based on the feature similarity between the image features of the target image and the image features of the respective library images, similar images of the target image are obtained as the retrieval result.
- the feature similarity measurement can be performed between the image features of the target image and the image features of the respective library images, so that similar library images are used as the search result.
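- The retrieval step can be sketched as follows; the inner product is assumed as the similarity measure and the function name is illustrative:

```python
def retrieve(target_feature, library_features, top_n):
    """Feature-similarity measurement between the (updated) target image
    feature and each library image feature; the most similar library
    images are returned as the retrieval result."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    ranked = sorted(range(len(library_features)),
                    key=lambda i: -dot(target_feature, library_features[i]))
    return ranked[:top_n]

result = retrieve([0.0, 1.0], [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], 2)
```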
- Image retrieval can be applied to a variety of scenarios, such as medical diagnosis, street view maps, intelligent video analysis, security monitoring, etc.
- Taking the person search in security monitoring as an example, the following describes how to apply the method of the embodiments of the present disclosure to train the network used for retrieval and how to use the network to perform image retrieval. In the following description, network training and its application are explained separately.
- A group training method can be used. For example, the training samples can be divided into multiple image subsets (batches); in each iteration of training, the sample images in a batch are input one by one to the feature update network to be trained, and finally the network parameters of the feature update network are adjusted by combining the loss values of the sample images included in the image subset.
- the following uses one of the sample images as an example to describe how to obtain the loss value corresponding to the sample image.
- the sample image 81 includes a pedestrian 82.
- the goal of the pedestrian search in this embodiment is to search for library images that include the same pedestrian 82 from the search image library.
- For example, a CNN can be used; such a network can be called a feature extraction network.
- The image features of the sample image 81 and of each library image in the image library are respectively extracted through the feature extraction network. The feature similarity between the sample image 81 and each library image is then calculated, the library images are sorted according to similarity from high to low, and a preset number of top-ranked images (for example, the top 10) are selected as images similar to the sample image 81, which may be referred to as neighbor images of the sample image 81.
- the library image 83, the library image 84 and the library image 85 are all neighbor images.
- The pedestrians included in these neighbor images may indeed be the same as the pedestrian 82, or they may be different from but very similar to the pedestrian 82.
- Next, library images similar to each neighbor image are searched for in the image library.
- Taking the library image 83 as an example, according to the similarity measurement of image features, the top ten library images in the similarity ranking are selected from the library images as the ten neighbor images of the library image 83.
- the set 91 includes ten library images, and these images are the ten neighbor images of the library image 83.
- ten neighbor images similar to the library image 84 can be searched again, that is, the set 92 in FIG. 9.
- Similarly, ten neighbor images of the library image 85 can also be searched for in the same way, which will not be described in detail.
- the above library image 83, library image 84, etc. can be referred to as the first image similar to the sample image 81, and the library images in the set 91 and the set 92 can all be referred to as the second image similar to the sample image 81.
- This embodiment takes the first image and the second image as examples. In other application examples, it is also possible to continue to search for a third image similar to the second image.
- an association graph can be generated.
- the association graph is similar to that shown in Fig. 6, which includes a master node and multiple neighbor nodes.
- the master node represents the sample image 81
- each neighbor node represents a neighbor image
- these neighbor nodes include the first image and the second image.
- the node value of each node is the image feature of the image it represents.
- The image feature is the image feature extracted and used for feature similarity comparison when obtaining neighbor images; for example, it can be the image feature extracted through the above-mentioned feature extraction network.
- FIG. 10 illustrates the network structure of the feature update network for extracting image features.
- The network structure can include a feature extraction network 1001. Through the feature extraction network 1001, the image features 1002 of the sample image and of each library image in the image library are respectively extracted and processed by similarity comparison, finally producing the association graph 1003 (the figure shows only some neighbor nodes; in actual use the number of neighbor nodes can be larger).
- the correlation graph 1003 can be input to a graph convolutional network 1004.
- The graph convolutional network 1004 includes a stack of multiple graph convolution modules 1005, and each graph convolution module 1005 can update the image feature of the master node according to the process shown in Fig. 2.
- The graph convolutional network 1004 can output the final updated image feature of the master node as the updated image feature of the sample image; the prediction information corresponding to the sample image can then be determined based on the updated image feature, and the loss value corresponding to the sample image is calculated according to the prediction information and the label information of the sample image.
- the loss value of each sample image can be calculated according to the above processing procedure, and finally the network parameters of the feature update network can be adjusted according to the loss value of these sample images, for example, including the parameters in the graph convolution module and the parameters of the feature extraction network.
- the network structure shown in FIG. 10 may not include the feature extraction network, but the association graph can be obtained in other ways.
- the target image is a pedestrian image.
- The image features of the target image can be extracted through the feature update network in the following manner:
- Image features of the target image are also extracted through the feature extraction network 1001 in Fig. 10.
- the association graph may include a master node representing the target image and multiple neighbor nodes representing neighbor images.
- The association graph is input into the graph convolution network 1004 in Fig. 10, the image feature of the master node representing the target image is updated by the graph convolution modules 1005, and the finally obtained image feature of the master node is the updated image feature of the target image.
- the image retrieval method of this embodiment combines the image features of neighboring images associated with the target image when performing image feature extraction, so that the image features learned by the updated network using the trained features are more robust and discriminative.
- The graph convolution modules can be stacked in multiple layers, which gives good scalability; in group training, each sample image in a batch can be computed in parallel using a deep learning framework and hardware, making network training more efficient.
- Fig. 11 provides an image feature extraction device, which can be used to execute the image feature extraction method of any embodiment of the present disclosure.
- the device may include: a graph acquisition module 1101 and a feature update module 1102.
- The graph acquisition module 1101 is configured to acquire a first association graph, the first association graph including a master node and at least one neighbor node; the node value of the master node represents the image feature of the target image, the node value of the neighbor node represents the image feature of the neighbor image, and the neighbor image is an image similar to the target image.
- the feature update module 1102 is configured to input the first association graph into a feature update network, and the feature update network updates the node value of the master node according to the node value of the neighbor node in the first association graph, and obtains the updated The image characteristics of the target image.
- In one example, the device further includes: a neighbor acquisition module 1103, configured to obtain, before the graph acquisition module acquires the first association graph, a neighbor image similar to the target image from an image library according to the target image.
- In one example, the neighbor acquisition module 1103 is configured to: separately acquire the image features of the target image and the image features of each library image in the image library through a feature extraction network; and, based on the feature similarity between the image features of the target image and the image features of each library image, determine a neighbor image similar to the target image from the image library.
- In one example, the neighbor acquisition module 1103 is further configured to: sort the feature similarities between the target image and each of the library images in descending order of feature similarity; and set the library images corresponding to the feature similarities ranked within a preset number of top positions as neighbor images similar to the target image.
- In one example, the neighbor acquisition module 1103 is further configured to: obtain, from the library images, a first image similar to the target image according to the feature similarity between the image features of the target image and those of each library image; obtain, from the library images, a second image similar to the first image according to the feature similarity between the image features of the first image and those of each library image; and use the first image and the second image as neighbor images of the target image.
- In one example, the number of the feature update network is one, or N stacked in sequence, where N is an integer greater than 1. When the number of the feature update networks is N, the input of the i-th feature update network is the updated first association graph output by the (i-1)-th feature update network, where i is an integer greater than 1 and less than or equal to N.
- In one example, the feature update module 1102 is configured to: determine the weight between the master node and each of the neighbor nodes in the first association graph; merge the image features of each neighbor node according to the weights to obtain the weighted feature of the master node; and obtain the updated image feature of the target image according to the image feature of the master node and the weighted feature.
- the feature update module 1102 is further configured to: perform a weighted summation of the image features of each neighbor node according to the weight to obtain the weighted feature of the master node.
- the feature update module 1102 is further configured to: stitch the image features of the master node with the weighted features; perform nonlinear mapping on the stitched features to obtain the updated image features of the target image.
- In one example, the feature update module 1102 is further configured to: perform linear mapping on the master node and the neighbor nodes; determine the inner product of the linearly mapped master node and neighbor nodes; and determine the weight between the master node and each neighbor node according to the inner product.
- FIG. 13 provides a training device for a feature update network, which can be used to execute the training method for a feature update network according to any embodiment of the present disclosure.
- the apparatus may include: an association graph obtaining module 1301, an update processing module 1302, and a parameter adjustment module 1303.
- the association graph obtaining module 1301 is configured to obtain a second association graph, the second association graph including a training master node and at least one training neighbor node, the node value of the training master node represents the image feature of the sample image, and the training The node value of the neighbor node represents the image feature of the training neighbor image, and the training neighbor image is an image similar to the sample image;
- The update processing module 1302 is configured to input the second association graph into a feature update network, and the feature update network updates the node value of the training master node according to the node values of the training neighbor nodes in the second association graph to obtain the updated image feature of the sample image;
- the parameter adjustment module 1303 is configured to obtain prediction information of the sample image according to the image feature of the updated sample image; adjust the network parameter of the feature update network according to the prediction information.
- In one example, the device further includes: an image acquisition module 1304, configured to obtain, before the association graph obtaining module obtains the second association graph, the training neighbor image similar to the sample image from the training image library according to the sample image.
- the device further includes: a pre-training module 1305.
- The pre-training module 1305 is configured to: extract the image features of the training image through the feature extraction network; obtain the prediction information of the training image according to the image features of the training image; and adjust the network parameters of the feature extraction network based on the prediction information and label information of the training image. The training image is an image used to train the feature extraction network, and the sample image is an image used to train the feature update network after the training of the feature extraction network is completed.
- the image acquisition module 1304 is configured to: separately acquire the image features of the sample image and the image features of each library image in the training image library through the feature extraction network; and based on the image features of the sample image and each library The feature similarity between the image features of the image determines the training neighbor image similar to the sample image.
- the functions or modules contained in the device provided in the embodiments of the present disclosure can be used to execute the methods described in the above method embodiments.
- At least one embodiment of the present disclosure provides an electronic device.
- the device may include a memory and a processor.
- the memory is used to store computer instructions that can be run on the processor.
- The processor is configured to implement the method for extracting image features or the method for training a feature update network described in any embodiment of the present disclosure when executing the computer instructions.
- At least one embodiment of the present disclosure provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the method for extracting image features or the method for training a feature update network described in any embodiment of the present disclosure is implemented.
- At least one embodiment of the present disclosure provides a computer program for causing a processor to execute the steps of the method for extracting image features or the steps of the method for training a feature update network according to any embodiment of the present disclosure.
- One or more embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Therefore, one or more embodiments of the present disclosure may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, one or more embodiments of the present disclosure may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
- the embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program may be stored, and when the program is executed by a processor, the steps of the method for extracting image features described in any of the embodiments of the present disclosure are implemented, and /Or, implement the steps of the feature update network training method described in any embodiment of the present disclosure.
- The "and/or" herein means having at least one of the two; for example, "A and/or B" includes three schemes: A alone, B alone, and both A and B.
- Embodiments of the subject matter described in the present disclosure can be implemented as one or more computer programs, that is, one or more modules of computer program instructions encoded on a tangible non-transitory program carrier to be executed by a data processing device or to control the operation of the data processing device. Alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical, or electromagnetic signal, which is generated to encode information for transmission to a suitable receiver device for execution by a data processing device.
- the computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
- the processing and logic flows described in the present disclosure can be performed by one or more programmable computers executing one or more computer programs, which carry out the corresponding functions by operating on input data and generating output.
- the processing and logic flows can also be performed by dedicated logic circuitry, such as an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit), and the device can also be implemented as dedicated logic circuitry.
- Computers suitable for executing computer programs include, for example, general-purpose and/or special-purpose microprocessors, or any other type of central processing unit.
- the central processing unit will receive instructions and data from a read-only memory and/or a random access memory.
- the basic components of a computer include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data.
- a computer will also include, or be operatively coupled to, one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks, in order to receive data from them, transmit data to them, or both.
- however, the computer does not have to have such devices.
- the computer can be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a global positioning system (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name just a few.
- Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (such as EPROM, EEPROM, and flash memory devices), magnetic disks (such as internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks.
- the processor and the memory can be supplemented by, or incorporated in, dedicated logic circuitry.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Library & Information Science (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims (30)
- An image feature extraction method, comprising: obtaining a first association graph, the first association graph comprising a master node and at least one neighbor node, wherein the node value of the master node represents an image feature of a target image, the node value of a neighbor node represents an image feature of a neighbor image, and the neighbor image is an image similar to the target image; and inputting the first association graph into a feature update network, the feature update network updating the node value of the master node according to the node values of the neighbor nodes in the first association graph to obtain an updated image feature of the target image.
- The method according to claim 1, wherein before the first association graph is obtained, the method further comprises: obtaining, according to the target image, neighbor images similar to the target image from an image library.
- The method according to claim 2, wherein obtaining, according to the target image, neighbor images similar to the target image from the image library comprises: obtaining the image feature of the target image and the image feature of each library image in the image library respectively through a feature extraction network; and determining, from the image library, neighbor images similar to the target image based on the feature similarity between the image feature of the target image and the image feature of each library image in the image library.
- The method according to claim 3, wherein determining neighbor images similar to the target image based on the feature similarity between the image feature of the target image and the image feature of each library image in the image library comprises: sorting the feature similarities between the target image and the library images in descending order of their values; and selecting the library images corresponding to the top preset number of feature similarities as the neighbor images similar to the target image.
- The method according to claim 3, wherein determining, from the image library, neighbor images similar to the target image based on the feature similarity between the image feature of the target image and the image feature of each library image in the image library comprises: obtaining, from the library images, a first image similar to the target image according to the feature similarity between the image feature of the target image and the image feature of each library image; obtaining, from the library images, a second image similar to the first image according to the feature similarity between the image feature of the first image and the image feature of each library image; and taking the first image and the second image as the neighbor images of the target image.
- The method according to claim 1, wherein the number of feature update networks is one, or N stacked in sequence, where N is an integer greater than 1; and when the number of feature update networks is N, the input of the i-th feature update network is the updated first association graph output by the (i-1)-th feature update network, where i is an integer greater than 1 and less than or equal to N.
- The method according to claim 1, wherein the feature update network updating the node value of the master node according to the node values of the neighbor nodes in the first association graph to obtain the updated image feature of the target image comprises: determining weights between the master node and each of the neighbor nodes in the first association graph; merging the image features of the neighbor nodes according to the weights to obtain a weighted feature of the master node; and obtaining the updated image feature of the target image according to the image feature of the master node and the weighted feature.
- The method according to claim 7, wherein merging the image features of the neighbor nodes according to the weights to obtain the weighted feature of the master node comprises: performing a weighted summation of the image features of the neighbor nodes according to the weights to obtain the weighted feature of the master node.
- The method according to claim 7, wherein obtaining the updated image feature of the target image according to the image feature of the master node and the weighted feature comprises: concatenating the image feature of the master node with the weighted feature; and performing a nonlinear mapping on the concatenated feature to obtain the updated image feature of the target image.
- The method according to claim 7, wherein determining the weight between the master node and a neighbor node in the first association graph comprises: performing a linear mapping on the master node and the neighbor node; determining an inner product of the linearly mapped master node and neighbor node; and determining the weight between the master node and the neighbor node according to the inner product after nonlinear processing.
- The method according to any one of claims 1 to 10, wherein the target image comprises a query image to be retrieved and each library image in an image library; and after the updated image feature of the target image is obtained, the method further comprises: obtaining, from the library images, images similar to the target image as a retrieval result based on the feature similarity between the updated image feature of the target image and the image features of the library images.
- A training method for a feature update network, wherein the feature update network is used to update image features of images; the method comprising: obtaining a second association graph, the second association graph comprising a training master node and at least one training neighbor node, wherein the node value of the training master node represents an image feature of a sample image, the node value of a training neighbor node represents an image feature of a training neighbor image, and the training neighbor image is an image similar to the sample image; inputting the second association graph into the feature update network, the feature update network updating the node value of the master node according to the node values of the training neighbor nodes in the second association graph to obtain an updated image feature of the sample image; obtaining prediction information of the sample image according to the updated image feature of the sample image; and adjusting network parameters of the feature update network according to the prediction information.
- The method according to claim 12, wherein before the second association graph is obtained, the method further comprises: obtaining, according to the sample image, the training neighbor images similar to the sample image from a training image library.
- The method according to claim 13, wherein before obtaining, according to the sample image, the training neighbor images similar to the sample image from the training image library, the method further comprises: extracting image features of a training image through a feature extraction network; obtaining prediction information of the training image according to the image features of the training image; and adjusting network parameters of the feature extraction network based on the prediction information and label information of the training image; and obtaining, according to the sample image, the training neighbor images similar to the sample image from the training image library comprises: obtaining the image feature of the sample image and the image feature of each library image in the training image library respectively through the feature extraction network; and determining the training neighbor images similar to the sample image based on the feature similarity between the image feature of the sample image and the image feature of each library image.
- An image feature extraction apparatus, comprising: a graph obtaining module configured to obtain a first association graph, the first association graph comprising a master node and at least one neighbor node, wherein the node value of the master node represents an image feature of a target image, the node value of a neighbor node represents an image feature of a neighbor image, and the neighbor image is an image similar to the target image; and a feature update module configured to input the first association graph into a feature update network, the feature update network updating the node value of the master node according to the node values of the neighbor nodes in the first association graph to obtain an updated image feature of the target image.
- The apparatus according to claim 15, further comprising: a neighbor obtaining module configured to obtain, according to the target image, neighbor images similar to the target image from an image library before the graph obtaining module obtains the first association graph.
- The apparatus according to claim 16, wherein the neighbor obtaining module is configured to: obtain the image feature of the target image and the image feature of each library image in the image library respectively through a feature extraction network; and determine, from the image library, neighbor images similar to the target image based on the feature similarity between the image feature of the target image and the image feature of each library image in the image library.
- The apparatus according to claim 17, wherein the neighbor obtaining module is further configured to: sort the feature similarities between the target image and the library images in descending order of their values; and select the library images corresponding to the top preset number of feature similarities as the neighbor images similar to the target image.
- The apparatus according to claim 17, wherein the neighbor obtaining module is further configured to: obtain, from the library images, a first image similar to the target image according to the feature similarity between the image feature of the target image and the image feature of each library image; obtain, from the library images, a second image similar to the first image according to the feature similarity between the image feature of the first image and the image feature of each library image; and take the first image and the second image as the neighbor images of the target image.
- The apparatus according to claim 15, wherein the number of feature update networks is one, or N stacked in sequence, where N is an integer greater than 1; and when the number of feature update networks is N, the input of the i-th feature update network is the updated first association graph output by the (i-1)-th feature update network, where i is an integer greater than 1 and less than or equal to N.
- The apparatus according to claim 15, wherein the feature update module is configured to: determine weights between the master node and each of the neighbor nodes in the first association graph; merge the image features of the neighbor nodes according to the weights to obtain a weighted feature of the master node; and obtain the updated image feature of the target image according to the image feature of the master node and the weighted feature.
- The apparatus according to claim 21, wherein the feature update module is further configured to: perform a weighted summation of the image features of the neighbor nodes according to the weights to obtain the weighted feature of the master node.
- The apparatus according to claim 21, wherein the feature update module is further configured to: concatenate the image feature of the master node with the weighted feature; and perform a nonlinear mapping on the concatenated feature to obtain the updated image feature of the target image.
- The apparatus according to claim 21, wherein the feature update module is further configured to: perform a linear mapping on the master node and the neighbor node; determine an inner product of the linearly mapped master node and neighbor node; and determine the weight between the master node and the neighbor node according to the inner product after nonlinear processing.
- A training apparatus for a feature update network, comprising: an association graph obtaining module configured to obtain a second association graph, the second association graph comprising a training master node and at least one training neighbor node, wherein the node value of the training master node represents an image feature of a sample image, the node value of a training neighbor node represents an image feature of a training neighbor image, and the training neighbor image is an image similar to the sample image; an update processing module configured to input the second association graph into the feature update network, the feature update network updating the node value of the master node according to the node values of the training neighbor nodes in the second association graph to obtain an updated image feature of the sample image; and a parameter adjustment module configured to obtain prediction information of the sample image according to the updated image feature of the sample image, and to adjust network parameters of the feature update network according to the prediction information.
- The apparatus according to claim 25, further comprising: an image obtaining module configured to obtain, according to the sample image, the training neighbor images similar to the sample image from a training image library before the association graph obtaining module obtains the second association graph.
- The apparatus according to claim 25, further comprising: a pre-training module configured to: extract image features of a training image through a feature extraction network; obtain prediction information of the training image according to the image features of the training image; and adjust network parameters of the feature extraction network based on the prediction information and label information of the training image; wherein the training image is an image used for training the feature extraction network, and the sample image is an image used for training the feature update network after training of the feature extraction network is completed; and the image obtaining module is configured to: obtain the image feature of the sample image and the image feature of each library image in the training image library respectively through the feature extraction network; and determine the training neighbor images similar to the sample image based on the feature similarity between the image feature of the sample image and the image feature of each library image.
- An electronic device, comprising a memory and a processor, the memory being configured to store computer instructions executable on the processor, and the processor being configured to implement the method according to any one of claims 1 to 11, or the method according to any one of claims 12 to 14, when executing the computer instructions.
- A computer-readable storage medium on which a computer program is stored, wherein when the program is executed by a processor, the method according to any one of claims 1 to 11, or the method according to any one of claims 12 to 14, is implemented.
- A computer program for causing a processor to perform the steps of the method according to any one of claims 1 to 11, or the steps of the method according to any one of claims 12 to 14.
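To make the claimed update step concrete, the aggregation described in claims 7 to 10 can be sketched as a small attention-style computation: linear mapping and inner products produce weights, the neighbor features are merged by weighted summation, and the result is concatenated with the master feature and passed through a nonlinear mapping. The NumPy sketch below is illustrative only and is not the patented implementation; the function name, the shared mapping `W`, the projection `V`, and the choices of softmax and ReLU as the nonlinearities are assumptions made for this sketch.

```python
import numpy as np

def update_master_feature(master, neighbors, W, V):
    """One feature-update step over a first association graph.

    master:    (d,)    image feature of the master node
    neighbors: (k, d)  image features of the k neighbor nodes
    W:         (d, d)  linear mapping applied before the inner product
    V:         (2d, d) linear mapping used in the final nonlinear projection
    """
    # Linearly map the master node and the neighbor nodes, then take
    # inner products between the mapped features (claim 10).
    mapped_master = W @ master                      # (d,)
    mapped_neighbors = neighbors @ W.T              # (k, d)
    scores = mapped_neighbors @ mapped_master       # (k,)

    # Nonlinear processing of the inner products yields the weights;
    # a softmax is assumed here for illustration.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()

    # Merge the neighbor features by weighted summation (claim 8).
    weighted_feature = weights @ neighbors          # (d,)

    # Concatenate the master feature with the weighted feature and apply
    # a nonlinear mapping (claim 9); a ReLU is assumed here.
    concat = np.concatenate([master, weighted_feature])  # (2d,)
    return np.maximum(concat @ V, 0.0)                   # (d,)
```

Stacking N such updates, as in claim 6, would amount to feeding each call's output back in as the new master-node value (and likewise updating the neighbor nodes) before the next call.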
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2022500674A JP2022539423A (ja) | 2019-08-23 | 2019-11-21 | Image feature extraction and network training method, apparatus and device |
KR1020227000630A KR20220017497A (ko) | 2019-08-23 | 2019-11-21 | Image feature extraction and network training method, apparatus and device |
US17/566,740 US20220122343A1 (en) | 2019-08-23 | 2021-12-31 | Image feature extraction and network training method, apparatus, and device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910782629.9A CN110502659B (zh) | 2019-08-23 | 2019-08-23 | Image feature extraction and network training method, apparatus and device |
CN201910782629.9 | 2019-08-23 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/566,740 Continuation US20220122343A1 (en) | 2019-08-23 | 2021-12-31 | Image feature extraction and network training method, apparatus, and device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021036028A1 true WO2021036028A1 (zh) | 2021-03-04 |
Family
ID=68589288
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/120028 WO2021036028A1 (zh) | 2019-08-23 | 2019-11-21 | Image feature extraction and network training method, apparatus and device |
Country Status (6)
Country | Link |
---|---|
US (1) | US20220122343A1 (zh) |
JP (1) | JP2022539423A (zh) |
KR (1) | KR20220017497A (zh) |
CN (1) | CN110502659B (zh) |
TW (1) | TWI747114B (zh) |
WO (1) | WO2021036028A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115221976A (zh) * | 2022-08-18 | 2022-10-21 | Douyin Vision Co., Ltd. | Model training method and apparatus based on graph neural network |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102020111456B4 (de) | 2020-04-27 | 2023-11-16 | Ebner Industrieofenbau Gmbh | Device and method for heating a plurality of crucibles |
CN111985616B (zh) * | 2020-08-13 | 2023-08-08 | Shenyang Neusoft Intelligent Medical Technology Research Institute Co., Ltd. | Image feature extraction method, image retrieval method, apparatus and device |
CN113850179A (zh) * | 2020-10-27 | 2021-12-28 | Shenzhen SenseTime Technology Co., Ltd. | Image detection method, and training method, apparatus, device and medium for related models |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160196479A1 (en) * | 2015-01-05 | 2016-07-07 | Superfish Ltd. | Image similarity as a function of weighted descriptor similarities derived from neural networks |
CN108985190A (zh) * | 2018-06-28 | 2018-12-11 | Beijing SenseTime Technology Development Co., Ltd. | Target recognition method and apparatus, electronic device, storage medium, and program product |
CN109829433A (zh) * | 2019-01-31 | 2019-05-31 | Beijing SenseTime Technology Development Co., Ltd. | Face image recognition method and apparatus, electronic device and storage medium |
CN109934826A (zh) * | 2019-02-28 | 2019-06-25 | Southeast University | Image feature segmentation method based on graph convolutional network |
CN109934261A (zh) * | 2019-01-31 | 2019-06-25 | Sun Yat-sen University | Knowledge-driven parameter propagation model and few-shot learning method thereof |
CN110111325A (zh) * | 2019-05-14 | 2019-08-09 | Shenzhen University | Neuroimaging classification method, computer terminal and computer-readable storage medium |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104008165B (zh) * | 2014-05-29 | 2017-05-24 | East China Normal University | Community detection method based on network topology and node attributes |
US20180013658A1 (en) * | 2016-07-06 | 2018-01-11 | Agt International Gmbh | Method of communicating between nodes in a computerized network and system thereof |
CN113536019A (zh) * | 2017-09-27 | 2021-10-22 | Shenzhen SenseTime Technology Co., Ltd. | Image retrieval method and apparatus, and computer-readable storage medium |
CN109657533B (zh) * | 2018-10-27 | 2020-09-25 | Shenzhen Huazun Technology Co., Ltd. | Person re-identification method and related products |
- 2019
- 2019-08-23 CN CN201910782629.9A patent/CN110502659B/zh active Active
- 2019-11-21 KR KR1020227000630A patent/KR20220017497A/ko active Search and Examination
- 2019-11-21 WO PCT/CN2019/120028 patent/WO2021036028A1/zh active Application Filing
- 2019-11-21 JP JP2022500674A patent/JP2022539423A/ja not_active Withdrawn
- 2019-12-24 TW TW108147317A patent/TWI747114B/zh not_active IP Right Cessation
- 2021
- 2021-12-31 US US17/566,740 patent/US20220122343A1/en not_active Abandoned
Non-Patent Citations (2)
Title |
---|
2016XING: "Graph Neural Networks overview", 14 August 2019 (2019-08-14), pages 1 - 11, XP009526515, Retrieved from the Internet <URL:http://www.360doc.com/content/19/0814/02/37048517_854721596.shtml> * |
WEN WEN ,HUANG JIAMING ,CAI RUICHU ,HAO ZHIFENG ,WANG LIJUAN: "Graph Embedding by Incorporating Prior Knowledge on Vertex Information", JOURNAL OF SOFTWARE, vol. 29, no. 3, 31 March 2018 (2018-03-31), pages 786 - 798, XP055787278, ISSN: 1000-9825, DOI: 10.13328/j.cnki.jos.005437 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115221976A (zh) * | 2022-08-18 | 2022-10-21 | Douyin Vision Co., Ltd. | Model training method and apparatus based on graph neural network |
CN115221976B (zh) | 2022-08-18 | 2024-05-24 | Douyin Vision Co., Ltd. | Model training method and apparatus based on graph neural network |
Also Published As
Publication number | Publication date |
---|---|
JP2022539423A (ja) | 2022-09-08 |
CN110502659A (zh) | 2019-11-26 |
CN110502659B (zh) | 2022-07-15 |
TW202109312A (zh) | 2021-03-01 |
US20220122343A1 (en) | 2022-04-21 |
TWI747114B (zh) | 2021-11-21 |
KR20220017497A (ko) | 2022-02-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021036028A1 (zh) | Image feature extraction and network training method, apparatus and device | |
US11949964B2 (en) | Generating action tags for digital videos | |
CN111523621B (zh) | Image recognition method and apparatus, computer device and storage medium | |
CN107480261B (zh) | Fast fine-grained face image retrieval method based on deep learning | |
WO2024021394A1 (zh) | Person re-identification method and apparatus fusing global features and ladder-type local features | |
CN110851645B (zh) | Image retrieval method based on similarity preservation under deep metric learning | |
KR102305568B1 (ko) | Method for finding k extreme values within a constant processing time | |
CN108664526B (zh) | Retrieval method and device | |
CN111709311A (zh) | Person re-identification method based on multi-scale convolutional feature fusion | |
CN112463976B (zh) | Knowledge graph construction method centered on crowd-sensing tasks | |
CN113393474B (zh) | Classification and segmentation method for three-dimensional point clouds based on feature fusion | |
JP7430243B2 (ja) | Visual positioning method and related apparatus | |
CN113297369B (zh) | Intelligent question answering system based on knowledge graph subgraph retrieval | |
CN113033507B (zh) | Scene recognition method and apparatus, computer device and storage medium | |
CN105183746A (zh) | Method for image retrieval by mining salient features from multiple related pictures | |
Barman et al. | A graph-based approach for making consensus-based decisions in image search and person re-identification | |
Jung et al. | Few-shot metric learning: Online adaptation of embedding for retrieval | |
Negi et al. | End-to-end residual learning-based deep neural network model deployment for human activity recognition | |
CN106557533B (zh) | Method and apparatus for joint retrieval of a single target across multiple images | |
CN114579794A (zh) | Multi-scale fusion landmark image retrieval method and system based on feature-consistency proposals | |
CN113779287A (zh) | Cross-domain multi-view object retrieval method and apparatus based on multi-stage classifier networks | |
Wu et al. | Visual loop closure detection by matching binary visual features using locality sensitive hashing | |
CN110750672A (zh) | Image retrieval method based on deep metric learning and structural distribution learning loss | |
Barros et al. | TReR: A Lightweight Transformer Re-Ranking Approach for 3D LiDAR Place Recognition | |
Wang et al. | A G2G similarity guided pedestrian Re-identification algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19943375 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2022500674 Country of ref document: JP Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 20227000630 Country of ref document: KR Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19943375 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 09.08.2022) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19943375 Country of ref document: EP Kind code of ref document: A1 |