CN110895700A - Image recognition method and system - Google Patents

Image recognition method and system

Info

Publication number
CN110895700A
Authority
CN
China
Prior art keywords
image
recognition method
target model
image recognition
prediction function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811062833.5A
Other languages
Chinese (zh)
Inventor
祖辰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201811062833.5A
Publication of CN110895700A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464 Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour

Abstract

An embodiment of the invention provides an image recognition method and system, relating to the field of computer technology. The method comprises the following steps: extracting image features of a plurality of modalities from image training samples; constructing a plurality of hypergraphs according to the image features of the plurality of modalities; performing joint learning on the plurality of hypergraphs; and performing image classification according to the locally optimal solution obtained by the joint learning. According to the image recognition method provided by the embodiment of the invention, the plurality of hypergraphs are learned jointly during hypergraph construction, and image classification is performed according to the locally optimal solution obtained by the joint learning. Similarity between samples is no longer measured with the long feature vector obtained by concatenating all the modal feature vectors of a sample, which ensures the similarity of the sample nodes within the same hyperedge and preserves, to the greatest extent, the complementary and correlated information among the image features of the different modalities.

Description

Image recognition method and system
Technical Field
The invention relates to the field of computer technology, and in particular to an image recognition method and system.
Background
In everyday life, humans perceive the world through multiple senses such as vision, hearing, touch, and smell: objects are seen, sounds are heard, textures are felt, and odors are smelled. The distinct kinds of features that describe an object are called modal features. In computer vision and multimedia analysis, multiple modal features are often used to characterize the same object from different perspectives. For example, to characterize an image of a natural scene well, a set of visual features describing its color, texture, or shape is typically extracted. How to fully exploit the complementary and correlated information of these different modal features, so as to further improve image recognition performance, has become a hot and difficult problem in computer vision. Hypergraphs are widely used in image recognition because they can capture complex relationships among many different objects.
In the prior art, when a hypergraph is constructed, the modal features of each sample are concatenated into a single long feature vector, hyperedges are built from this long feature vector, and a hypergraph with n nodes is finally generated. This way of constructing the hypergraph ignores the differences between modal features, cannot guarantee the similarity of the sample nodes within the same hyperedge, and does not fully exploit the complementary and correlated information among the different modal features.
Disclosure of Invention
In view of this, embodiments of the present invention provide an image recognition method and system, which complete image recognition based on joint hypergraph learning.
According to an aspect of the present invention, there is provided an image recognition method including: extracting image features of a plurality of modalities from an image training sample; constructing a plurality of hypergraphs according to the image features of the plurality of modalities; performing joint learning on the plurality of hypergraphs; and carrying out image classification according to the local optimal solution obtained by the joint learning.
Preferably, the plurality of hypergraphs each include a respective set of vertices and a respective set of hyperedges.
Preferably, the step of jointly learning the plurality of hypergraphs comprises: constructing a target model, wherein the target model comprises hyperedge weights and a prediction function; performing alternating iterative optimization of the hyperedge weights and the prediction function in the target model; and obtaining a locally optimal solution of the prediction function when the alternating iterations converge.
Preferably, the alternating iterative optimization comprises: fixing the hyperedge weights and optimizing the prediction function, wherein the terms of the target model that do not depend on the prediction function variable are removed to obtain an analytic solution of the prediction function; and fixing the prediction function and optimizing the hyperedge weights, wherein the terms of the target model that do not depend on the hyperedge weight variable are removed to obtain an analytic solution of the hyperedge weights.
Preferably, the image recognition method further includes: collecting an image test sample and acquiring a category label vector corresponding to the image test sample; and verifying the target model by using the image test sample, and adjusting and optimizing parameters of the target model according to a verification result.
Preferably, the verifying the target model by using the image test sample and adjusting and optimizing the parameters of the target model according to the verification result includes: applying the image test sample to the prediction function to obtain a prediction label vector corresponding to the image test sample; and comparing the prediction label vector with the category label vector corresponding to the image test sample.
Preferably, the verifying the target model by using the image test sample and adjusting and optimizing the parameters of the target model according to the verification result further includes: according to the comparison result, if the alternating iterative optimization process has not reached a locally optimal solution of the prediction function, performing further alternating iterative optimization of the hyperedge weights and the prediction function in the target model; and, according to the comparison result, if the alternating iterative optimization process has converged to a locally optimal solution of the prediction function, obtaining the prediction function of the target model.
Preferably, the classifying the image according to the locally optimal solution obtained by the joint learning includes: acquiring an unknown image; and applying the unknown image to the prediction function to obtain a recognition result of the unknown image.
Preferably, the hyperedge sets are constructed using the k-nearest neighbor method respectively according to the image features of the plurality of modalities.
Preferably, the image features of the plurality of modalities include at least one of: color moment feature vectors, local two-dimensional histogram feature vectors, and histogram of oriented gradients feature vectors.
Preferably, the specific steps of extracting the color moment feature vector include: dividing each image of the image training samples into a plurality of non-overlapping grid cells; calculating, for each grid cell, the feature vectors of the color mean, the color variance and the color skewness of the image over a plurality of channels; and concatenating the color mean, color variance and color skewness feature vectors computed in each grid cell to form the color moment feature vector.
Preferably, the specific steps of extracting the local two-dimensional histogram feature vector include: dividing each image of the image training samples into a plurality of non-overlapping grid cells; and obtaining the local two-dimensional histogram feature vector by comparing the image pixels of the central cell with the image pixels of the surrounding cells of each image.
Preferably, the local two-dimensional histogram feature vector has good illumination invariance.
Preferably, the specific steps of extracting the histogram of oriented gradients feature vector include: dividing each image of the image training samples into a plurality of blocks; calculating the histogram of oriented gradients feature vector of each block; and normalizing the per-block feature vectors block by block to obtain the histogram of oriented gradients feature vector.
According to another aspect of the present invention, there is provided an image recognition system including: a modal feature extraction module for extracting image features of a plurality of modalities from an image training sample; a target model building module for constructing a plurality of hypergraphs from the image features of the plurality of modalities; a joint learning module for performing joint learning on the plurality of hypergraphs; and a prediction module for performing image classification according to the locally optimal solution obtained by the joint learning.
Preferably, the image recognition system further comprises: an image acquisition module for acquiring an image training sample and obtaining the class label vector corresponding to the image training sample, and for acquiring an image test sample and obtaining the category label vector corresponding to the image test sample; and a test module for verifying the target model by using the image test sample and adjusting and optimizing parameters of the target model according to the verification result.
According to yet another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions which, when executed, implement the image recognition method as described above.
According to still another aspect of the present invention, there is provided a control apparatus for image recognition, comprising: a memory for storing computer instructions; and a processor coupled to the memory, the processor being configured to perform the image recognition method described above based on the computer instructions stored in the memory.
One embodiment of the present invention has the following advantages or beneficial effects: the plurality of hypergraphs are learned jointly, and image classification is performed according to the locally optimal solution obtained by the joint learning. Similarity between samples is no longer measured with the long feature vector obtained by concatenating all the modal feature vectors of a sample, which ensures the similarity of the sample nodes within the same hyperedge and preserves, to the greatest extent, the complementary and correlated information among the image features of the different modalities.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent by describing embodiments of the present invention with reference to the following drawings, in which:
fig. 1 shows a flow chart of an image recognition method according to an embodiment of the present invention.
Fig. 2 shows a flow chart of an image recognition method according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of an image recognition system according to an embodiment of the present invention.
Fig. 4 is a block diagram of a control apparatus for image recognition according to an embodiment of the present invention.
Detailed Description
The present invention will be described below based on examples, but the present invention is not limited to these examples. In the following detailed description of the present invention, certain specific details are set forth; it will be apparent to one skilled in the art that the present invention may be practiced without these specific details. Well-known methods, procedures and processes have not been described in detail so as not to obscure the present invention. The figures are not necessarily drawn to scale.
In image recognition, a hypergraph $\mathcal{G} = (\mathcal{V}, \mathcal{E}, \mathbf{w})$ consists of a vertex set $\mathcal{V} = \{v_1, v_2, \dots, v_n\}$, a hyperedge set $\mathcal{E} = \{e_1, e_2, \dots, e_d\}$, and a hyperedge weight vector $\mathbf{w}$. The weight of a hyperedge $e \in \mathcal{E}$ is denoted $w(e)$ and is set to 1 by default. The hypergraph $\mathcal{G}$ can be represented by an incidence matrix $H$ of size $|\mathcal{V}| \times |\mathcal{E}|$:

$$H(v, e) = \begin{cases} 1, & v \in e \\ 0, & v \notin e \end{cases} \quad (1)$$

Based on the incidence matrix $H$, the degree of a node $v \in \mathcal{V}$ is defined as

$$d(v) = \sum_{e \in \mathcal{E}} w(e) H(v, e), \quad (2)$$

and, similarly, the degree of a hyperedge $e \in \mathcal{E}$ is defined as

$$\delta(e) = \sum_{v \in \mathcal{V}} H(v, e). \quad (3)$$

$D_v$ and $D_e$ denote the diagonal matrices whose diagonal elements are the node degrees and the hyperedge degrees, respectively. $D_w$ denotes the hyperedge weight matrix of size $|\mathcal{E}| \times |\mathcal{E}|$, whose diagonal elements are the hyperedge weights $w(e)$.
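As a purely illustrative aside (not part of the patent text; NumPy is assumed as the numerical library), the incidence matrix and the degree matrices defined above can be computed as follows:

```python
import numpy as np

def hypergraph_degrees(H, w):
    """Given the |V| x |E| incidence matrix H and the hyperedge weight vector w,
    return the node degrees d(v), the hyperedge degrees delta(e), and the
    diagonal matrices Dv, De and Dw."""
    H = np.asarray(H, dtype=float)
    w = np.asarray(w, dtype=float)
    d_v = H @ w                      # formula (2): d(v) = sum_e w(e) H(v, e)
    delta_e = H.sum(axis=0)          # formula (3): delta(e) = sum_v H(v, e)
    return d_v, delta_e, np.diag(d_v), np.diag(delta_e), np.diag(w)

# Toy hypergraph: 4 nodes, 2 hyperedges, default weight 1 for every hyperedge.
H = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [1, 1]])
d_v, delta_e, Dv, De, Dw = hypergraph_degrees(H, np.ones(2))
```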
Assume that there are n samples, each of which has features from m modalities. For each sample, the feature vectors of the m modalities are concatenated in series into one long feature vector. Hyperedges are constructed from these long feature vectors, and a hypergraph $\mathcal{G}$ with n nodes is finally generated. The objective function of hypergraph learning is defined as

$$\arg\min_{f} \; \Omega_f + \lambda R_{emp}(f), \quad (4)$$

where $\Omega_f$ is the hypergraph Laplacian regularization term and $R_{emp}(f)$ is the empirical loss term, for which either a mean square error loss or a hinge loss can be used. $\lambda > 0$ is a regularization parameter that balances the relative weights of the two terms. The hypergraph Laplacian regularization term $\Omega_f$ is defined as

$$\Omega_f = \frac{1}{2} \sum_{e \in \mathcal{E}} \sum_{u, v \in \mathcal{V}} \frac{w(e) H(u, e) H(v, e)}{\delta(e)} \left( \frac{f(u)}{\sqrt{d(u)}} - \frac{f(v)}{\sqrt{d(v)}} \right)^2. \quad (5)$$

Intuitively, formula (5) takes a small value when all nodes within the same hyperedge have similar class labels. Letting $\Lambda = D_v^{-1/2} H D_w D_e^{-1} H^T D_v^{-1/2}$ and $L = I - \Lambda$, the hypergraph regularization term can be written in the compact form

$$\Omega_f = f^T L f, \quad (6)$$

where $L$ is a positive semi-definite matrix called the hypergraph Laplacian and $f = [f_1, f_2, \dots, f_n]^T$ is the prediction function, defined on $(-1, 1)$. When the mean square error loss $R_{emp}(f) = \lVert f - y \rVert^2$ is used as the empirical loss term, the objective function has the closed-form solution

$$f = \left( I + \frac{1}{\lambda} L \right)^{-1} y. \quad (7)$$

In the above, the weights of the hyperedges are all assumed to be identical and equal to 1. Finally, the image is recognized via the prediction function $f$.
In summary, the inventors found that the image recognition method based on hypergraph learning described above has the following disadvantages:
in the process of constructing the hypergraph, the image identification method splices the modal features of each sample into a long feature vector. And constructing a superedge according to the long feature vector, and finally generating a hypergraph with n nodes. The method for constructing the hypergraph ignores the difference of different modal characteristics, cannot ensure the similarity of sample nodes in the same hyperedge, and does not fully utilize complementary information and related information among the different modal characteristics.
In the process of constructing the hypergraph, this image recognition method concatenates the modal features of each sample into one long feature vector. If the feature dimension of the $m$-th modal feature of a sample is $p_m$, the long feature vector obtained by concatenating all the modal features of a sample has dimension $\sum_{m} p_m$. Such high-dimensional feature vectors easily lead to the curse of dimensionality: the pairwise similarity distances between samples are difficult to measure accurately in a high-dimensional feature space, so the hypergraph cannot be constructed accurately.
In the process of constructing the hypergraph, this image recognition method sets the weight of every hyperedge to 1 by default. However, different hyperedges contain samples of different categories and are therefore of different importance: hyperedges containing more samples of the same class should be given larger weights, while hyperedges containing more samples of different classes should be given smaller weights. Using a single, uniform hyperedge weight reduces the discriminative ability of the hypergraph-learning-based image recognition method.
Fig. 1 is a schematic flow chart of an image recognition method according to an embodiment of the present invention, which specifically includes the following steps.
In step S101, image features of a plurality of modalities are extracted from an image training sample.
In step S102, a plurality of hypergraphs is constructed from image features of the plurality of modalities.
In step S103, joint learning is performed on the plurality of hypergraphs.
In step S104, image classification is performed based on the locally optimal solution obtained by the joint learning.
In one embodiment of the invention, image features of a plurality of modalities are first extracted from an image training sample. Then, a plurality of hypergraphs is constructed from the image features of the plurality of modalities. Next, joint learning is performed on the plurality of hypergraphs. Finally, image classification is performed according to the locally optimal solution obtained by the joint learning.
According to this embodiment of the invention, the plurality of hypergraphs are learned jointly during hypergraph construction, and image classification is performed according to the locally optimal solution obtained by the joint learning. Similarity between samples is no longer measured with the long feature vector obtained by concatenating all the modal feature vectors of a sample, which ensures the similarity of the sample nodes within the same hyperedge and preserves, to the greatest extent, the complementary and correlated information among the image features of the different modalities.
Fig. 2 is a schematic flow chart of an image recognition method according to an embodiment of the present invention, which specifically includes the following steps:
in step S201, an image training sample is acquired and a class label vector corresponding to the image training sample is acquired.
In step S202, image features of a plurality of modalities are extracted from an image training sample.
In step S203, a plurality of hypergraphs is constructed from the image features of the plurality of modalities.
In step S204, a target model is constructed, alternating iterative optimization is performed on the hyperedge weights and the prediction function in the target model, and a locally optimal solution of the prediction function is obtained when the alternating iterative optimization converges.
In step S205, an image test sample is acquired and a category label vector corresponding to the image test sample is acquired.
In step S206, the target model is verified by using the image test sample, and parameters of the target model are adjusted and optimized according to a verification result.
In one embodiment of the invention, an image training sample is acquired and the class label vector corresponding to the image training sample is obtained. Image features of a plurality of modalities are extracted from the image training sample, and a plurality of hypergraphs is constructed from these image features; the plurality of hypergraphs respectively include their own sets of vertices and hyperedges. A target model is constructed, alternating iterative optimization is performed on the hyperedge weights and the prediction function in the target model, and a locally optimal solution of the prediction function is obtained when the alternating iteration converges. An image test sample is acquired and the category label vector corresponding to the image test sample is obtained. The target model is verified by using the image test sample, and the parameters of the target model are adjusted and optimized according to the verification result.
According to this embodiment of the invention, during hypergraph construction a plurality of hypergraphs are built from the image features of the plurality of modalities, each hypergraph with its own sets of vertices and hyperedges. The hyperedges are constructed without concatenating the modal features of each sample into a long feature vector, which reduces the feature dimensionality involved and improves the accuracy of the constructed hypergraphs.
In one embodiment of the present invention, the objective of the target model is

$$\arg\min_{f,\,W} \; \Omega_f(W) + \lambda \lVert f - y \rVert^2 + \gamma \lVert W \rVert^2,$$

where $\Omega_f(W) = f^T L(W) f$ is the hypergraph Laplacian regularization term built jointly from the plurality of hypergraphs, $f = [f_1, f_2, \dots, f_n]^T$ is the prediction function defined on $(-1, 1)$, $y = [y_1, y_2, \dots, y_n]^T \in \mathbb{R}^n$ is the class label vector of the $n$ training samples, $W = [w_1, w_2, \dots, w_m] \in \mathbb{R}^{d \times m}$ is the weight matrix composed of the hyperedge weight vectors of the image features of the different modalities, and $\lambda$ and $\gamma$ are regularization parameters.

Because the variables $W$ and $f$ in the target model are coupled, the hyperedge weights and the prediction function are optimized by alternating iterations. When $W$ is fixed in order to optimize $f$, the terms of the target model that do not depend on $f$ are removed, which gives the optimization problem

$$\min_{f} \; f^T L(W) f + \lambda \lVert f - y \rVert^2.$$

Taking the derivative with respect to $f$ and setting it to zero yields the analytic solution of the prediction function:

$$f = \lambda \left( L(W) + \lambda I \right)^{-1} y.$$

Then $f$ is fixed and $W$ is optimized: the terms of the target model that do not depend on $W$ are removed, which gives the optimization problem

$$\min_{W} \; f^T L(W) f + \gamma \lVert W \rVert^2.$$

Taking the derivative with respect to $W$ and setting it to zero yields the analytic solution of the hyperedge weights.

The values of $f$ and $W$ obtained after each iteration are used to initialize the next iteration, and the iterative optimization is repeated in this way; a locally optimal solution of the prediction function is obtained when the alternating iterations of the prediction function and the hyperedge weights converge.
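As an illustrative aside, the alternating scheme above can be sketched as follows under simplifying assumptions that are not taken from the patent: the joint regularizer is approximated by a weighted sum of per-modality hypergraph Laplacians with one weight per modality, and the weights are constrained to sum to one so that the weight update has a closed form; all names are illustrative.

```python
import numpy as np

def joint_hypergraph_learning(L_list, y, lam=1.0, gamma=1.0, n_iter=20):
    """Alternately optimize the prediction function f and the weight vector w."""
    m = len(L_list)
    n = y.shape[0]
    w = np.full(m, 1.0 / m)                  # start from uniform weights

    f = np.zeros(n)
    for _ in range(n_iter):
        # Fix w, optimize f:  f = lam * (L(w) + lam * I)^{-1} y.
        L = sum(w_k * L_k for w_k, L_k in zip(w, L_list))
        f = np.linalg.solve(L + lam * np.eye(n), lam * y)

        # Fix f, optimize w: minimize sum_k w_k * (f^T L_k f) + gamma * ||w||^2
        # subject to sum(w) = 1 (an assumed constraint); the Lagrangian gives
        # the closed-form update below.
        r = np.array([f @ L_k @ f for L_k in L_list])
        w = (2.0 * gamma + r.sum()) / (2.0 * gamma * m) - r / (2.0 * gamma)
        w = np.clip(w, 0.0, None)
        w = w / w.sum()                      # renormalize after clipping
    return f, w
```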
According to this embodiment of the invention, during hypergraph construction the hyperedge weights and the prediction function in the target model are optimized by alternating iterations, and a locally optimal solution of the prediction function is obtained when the alternating iterations converge. The resulting hyperedge weights accurately reflect the importance of each hyperedge, which improves the discriminative ability of the image recognition method.
In one embodiment, the image features of the plurality of modalities extracted from the image training samples include at least one of: color moment feature vectors, local two-dimensional histogram feature vectors, and histogram of oriented gradients feature vectors.
The specific steps of extracting the color moment feature vector are as follows: each image of the image training samples is divided into a non-overlapping 3 × 3 grid of cells; for each grid cell, the feature vectors of the color mean, the color variance and the color skewness of the image are calculated over its three channels; and the color mean, color variance and color skewness feature vectors computed in all the grid cells are concatenated to form a color moment feature vector with a feature dimension of 81. In one embodiment, let the sample matrix of color moment feature vectors be $X_{cm} \in \mathbb{R}^{n \times d_{cm}}$, where $n$ is the number of samples and $d_{cm}$ is the feature dimension of the color moment feature vector. Taking a sample $x_i$ as a node, its $k$ nearest neighbor samples are computed, and the set of these $k + 1$ samples is taken as one hyperedge. A hyperedge is generated in this way for each sample, so the image features of the color moment modality generate $n$ hyperedges.
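As an illustrative aside (NumPy assumed; helper names are not from the patent), the color moment extraction and the k-nearest-neighbor hyperedge construction just described can be sketched as follows, with the skewness taken here as the signed cube root of the third central moment:

```python
import numpy as np

def color_moment_features(image):
    """image: H x W x 3 array -> 81-dimensional color moment feature vector
    (9 grid cells x 3 channels x 3 moments)."""
    h, w, _ = image.shape
    feats = []
    for i in range(3):
        for j in range(3):
            cell = image[i * h // 3:(i + 1) * h // 3,
                         j * w // 3:(j + 1) * w // 3, :].reshape(-1, 3).astype(float)
            mean = cell.mean(axis=0)
            var = cell.var(axis=0)
            skew = np.cbrt(((cell - mean) ** 3).mean(axis=0))  # signed cube root
            feats.extend([mean, var, skew])
    return np.concatenate(feats)

def knn_incidence_matrix(X, k):
    """Hyperedge i = sample i together with its k nearest neighbours in one
    modality's feature space; returns the n x n incidence matrix H."""
    n = X.shape[0]
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # O(n^2) sketch
    H = np.zeros((n, n))
    for i in range(n):
        neighbours = np.argsort(dists[i])[:k + 1]   # includes sample i itself
        H[neighbours, i] = 1.0
    return H
```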
The specific steps of extracting the local two-dimensional histogram feature vector are as follows: each image of the image training samples is divided into a non-overlapping 3 × 3 grid of cells, and a local two-dimensional histogram feature vector with a feature dimension of 58 is obtained by comparing the image pixels of the central cell with the image pixels of the surrounding cells of each image. The local two-dimensional histogram feature vector has good illumination invariance.
The specific steps of extracting the histogram of oriented gradients feature vector are as follows: each image of the image training samples is divided into a plurality of blocks; the histogram of oriented gradients feature vector of each block is calculated; and the per-block feature vectors are normalized block by block to obtain a histogram of oriented gradients feature vector with a feature dimension of 31.
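As an illustrative aside, one way to compute a block-normalized histogram of oriented gradients feature is sketched below using scikit-image, which is an assumption (the patent does not name a library); the cell and block sizes are illustrative, and the resulting dimensionality depends on the image size rather than being fixed at 31:

```python
import numpy as np
from skimage.feature import hog  # scikit-image is an assumed dependency

def hog_features(gray_image):
    """gray_image: 2-D array -> block-normalized HOG feature vector."""
    return hog(np.asarray(gray_image, dtype=float),
               orientations=9,            # number of gradient orientation bins
               pixels_per_cell=(8, 8),    # cell size (illustrative)
               cells_per_block=(2, 2),    # cells per normalization block
               block_norm='L2-Hys',
               feature_vector=True)
```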
Fig. 3 is a schematic structural diagram of an image recognition system according to an embodiment of the present invention. As shown in fig. 3, the system 30 includes: a modal feature extraction module 301, a target model building module 302, a joint learning module 303, a prediction module 304, an image acquisition module 305, and a test module 306.
The modal feature extraction module 301 is used for extracting image features of a plurality of modalities from an image training sample.
The target model building module 302 is used for constructing a plurality of hypergraphs from the image features of the plurality of modalities.
The joint learning module 303 is used for performing joint learning on the plurality of hypergraphs.
The prediction module 304 is used for performing image classification according to the locally optimal solution obtained by the joint learning.
The image acquisition module 305 is used for acquiring an image training sample and obtaining the class label vector corresponding to the image training sample, and for acquiring an image test sample and obtaining the category label vector corresponding to the image test sample.
The test module 306 is used for verifying the target model by using the image test sample and for adjusting and optimizing the parameters of the target model according to the verification result.
In one embodiment of the invention, the test module 306 applies the image test sample to the prediction function to obtain the prediction label vector corresponding to the image test sample, and compares the prediction label vector with the category label vector corresponding to the image test sample. According to the comparison result, if the alternating iterative optimization process has not reached a locally optimal solution of the prediction function, the hyperedge weights and the prediction function in the target model are further optimized by alternating iterations; if the alternating iterative optimization process has converged to a locally optimal solution of the prediction function, the prediction function of the target model is obtained.
In one embodiment, the prediction module 304 obtains an unknown image and applies the unknown image to the prediction function to obtain the recognition result of the unknown image.
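As an illustrative aside, the comparison performed by the test module can be sketched as follows, assuming binary +1/-1 category labels and known test-node positions (names are illustrative):

```python
import numpy as np

def evaluate(f, y_test, test_idx):
    """f: prediction function values for all nodes; y_test: +1/-1 labels of the
    test samples; test_idx: positions of the test samples among the nodes."""
    pred = np.sign(f[test_idx])           # predicted label vector
    accuracy = float((pred == y_test).mean())
    return pred, accuracy
```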
Fig. 4 is a structural diagram of a control apparatus of image recognition according to an embodiment of the present invention. The apparatus shown in fig. 4 is only an example and should not limit the functionality and scope of use of embodiments of the present invention in any way.
Referring to fig. 4, the apparatus includes a processor 401, a memory 402, and an input-output device 403 connected by a bus. The memory 402 includes read-only memory (ROM) and random access memory (RAM); the computer instructions and data required to perform the system functions are stored in the memory 402, and the processor 401 reads the computer instructions from the memory 402 to perform the appropriate actions and processing. The input-output device 403 includes an input section such as a keyboard and a mouse; an output section such as a cathode ray tube (CRT) or liquid crystal display (LCD) and a speaker; a storage section such as a hard disk; and a communication section such as a network interface card, for example a LAN card or a modem. The memory 402 also stores computer instructions for performing the operations specified by the image recognition method of the embodiment of the present invention: extracting image features of a plurality of modalities from an image training sample; constructing a plurality of hypergraphs according to the image features of the plurality of modalities; performing joint learning on the plurality of hypergraphs; and performing image classification according to the locally optimal solution obtained by the joint learning.
Accordingly, an embodiment of the present invention provides a computer-readable storage medium storing computer instructions which, when executed, implement the operations specified by the image recognition method described above.
The flowcharts and block diagrams in the figures illustrate possible architectures, functions and operations of the systems, methods and apparatuses according to the embodiments of the present invention. Each block may represent a module, a program segment, or simply a piece of code consisting of executable instructions for implementing a specified logical function. It should also be noted that the executable instructions implementing the specified logical functions may be recombined to create new modules and program segments. The blocks of the drawings, and their order, are therefore provided only to better illustrate the processes and steps of the embodiments and should not be taken as limiting the invention itself.
The above description is only a few embodiments of the present invention, and is not intended to limit the present invention, and various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (18)

1. An image recognition method, comprising:
extracting image features of a plurality of modalities from an image training sample;
constructing a plurality of hypergraphs according to the image features of the plurality of modalities;
performing joint learning on the plurality of hypergraphs; and
and carrying out image classification according to the local optimal solution obtained by the joint learning.
2. The image recognition method of claim 1, wherein the plurality of hypergraphs each include a respective set of vertices and a respective set of hyperedges.
3. The image recognition method of claim 1, wherein the step of jointly learning the plurality of hypergraphs comprises:
constructing a target model, wherein the target model comprises hyperedge weights and a prediction function;
performing alternating iterative optimization of the hyperedge weights and the prediction function in the target model; and obtaining a locally optimal solution of the prediction function when the alternating iterations converge.
4. The image recognition method of claim 3, wherein the alternating iterative optimization comprises:
fixing the hyperedge weights and optimizing the prediction function, wherein the terms of the target model that do not depend on the prediction function variable are removed to obtain an analytic solution of the prediction function; and
fixing the prediction function and optimizing the hyperedge weights, wherein the terms of the target model that do not depend on the hyperedge weight variable are removed to obtain an analytic solution of the hyperedge weights.
5. The image recognition method according to claim 4, further comprising: collecting an image test sample and acquiring a category label vector corresponding to the image test sample;
and verifying the target model by using the image test sample, and adjusting and optimizing parameters of the target model according to a verification result.
6. The image recognition method of claim 5, wherein the verifying the target model by using the image test sample, and adjusting and optimizing the parameters of the target model according to the verification result comprises: applying the image test sample to the prediction function to obtain a prediction label vector corresponding to the image test sample; and
the predictor label vector and the category label vector corresponding to the image test sample are compared.
7. The image recognition method of claim 6, wherein the verifying the target model by using the image test sample, and adjusting and optimizing the parameters of the target model according to the verification result further comprises: according to the comparison result, if the alternative iterative optimization process does not reach the local optimal solution of the prediction function, further alternative iterative optimization is carried out on the weight of the excess edge in the target model and the prediction function;
and according to the comparison result, if the alternative iterative optimization process converges to the local optimal solution of the prediction function, obtaining the prediction function of the target model.
8. The image recognition method of claim 7, wherein the classifying the image according to the locally optimal solution obtained by the joint learning comprises: acquiring an unknown image; and
and applying the unknown image to the prediction function to obtain the recognition result of the unknown image.
9. The image recognition method according to claim 2, wherein the hyper-edge sets are constructed using a k-nearest neighbor method respectively according to the image features of the plurality of modalities.
10. The image recognition method of claim 1, wherein the image features of the plurality of modalities include at least one of: color moment feature vectors, local two-dimensional histogram feature vectors, and histogram of oriented gradients feature vectors.
11. The image recognition method of claim 10, wherein the step of extracting the color moment feature vector comprises: dividing each image of the image training samples into a plurality of non-overlapping grid cells;
calculating, for each grid cell, the feature vectors of the color mean, the color variance and the color skewness of the image over a plurality of channels; and
concatenating the color mean, color variance and color skewness feature vectors computed in each grid cell to form the color moment feature vector.
12. The image recognition method of claim 10, wherein the step of extracting the local two-dimensional histogram feature vector comprises: dividing each image of the image training samples into a plurality of non-overlapping grid cells;
and obtaining the local two-dimensional histogram feature vector by comparing the image pixels of the central cell with the image pixels of the surrounding cells of each image.
13. The image recognition method of claim 12, wherein the local two-dimensional histogram feature vector has good illumination invariance.
14. The image recognition method of claim 10, wherein the step of extracting the histogram of oriented gradients feature vector comprises: segmenting each image of the image training sample into a plurality of blocks;
calculating the histogram of oriented gradients feature vector of each block; and
normalizing the per-block feature vectors block by block to obtain the histogram of oriented gradients feature vector.
15. An image recognition system, comprising:
a modal feature extraction module: for extracting image features of a plurality of modalities from an image training sample;
a target model building module: for constructing a plurality of hypergraphs from image features of the plurality of modalities;
a joint learning module: for joint learning of the plurality of hypergraphs;
a prediction module: for performing image classification according to the locally optimal solution obtained by the joint learning.
16. The image recognition system of claim 15, further comprising:
an image acquisition module: for acquiring an image training sample and obtaining the class label vector corresponding to the image training sample, and for acquiring an image test sample and obtaining the category label vector corresponding to the image test sample; and
a test module: for verifying the target model by using the image test sample and adjusting and optimizing parameters of the target model according to the verification result.
17. A computer-readable storage medium storing computer instructions which, when executed, implement the image recognition method of any one of claims 1 to 14.
18. An image recognition control device, comprising:
a memory for storing computer instructions;
a processor coupled to the memory, the processor being configured to perform the image recognition method of any one of claims 1 to 14 based on the computer instructions stored in the memory.
CN201811062833.5A 2018-09-12 2018-09-12 Image recognition method and system Pending CN110895700A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811062833.5A CN110895700A (en) 2018-09-12 2018-09-12 Image recognition method and system

Publications (1)

Publication Number Publication Date
CN110895700A (en) 2020-03-20

Family

ID=69784803

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811062833.5A Pending CN110895700A (en) 2018-09-12 2018-09-12 Image recognition method and system

Country Status (1)

Country Link
CN (1) CN110895700A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104268140A (en) * 2014-07-31 2015-01-07 浙江大学 Image retrieval method based on weight learning hypergraphs and multivariate information combination
CN106776554A (en) * 2016-12-09 2017-05-31 厦门大学 A kind of microblog emotional Forecasting Methodology based on the study of multi-modal hypergraph
CN107507195A (en) * 2017-08-14 2017-12-22 四川大学 The multi-modal nasopharyngeal carcinoma image partition methods of PET CT based on hypergraph model
CN107731283A (en) * 2017-10-23 2018-02-23 清华大学 A kind of image radio system based on more subspace modelings
EP3333771A1 (en) * 2016-12-09 2018-06-13 Fujitsu Limited Method, program, and apparatus for comparing data hypergraphs
CN108170729A (en) * 2017-12-13 2018-06-15 西安电子科技大学 Utilize the image search method of hypergraph fusion multi-modal information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination