CN108681746B - Image identification method and device, electronic equipment and computer readable medium


Info

Publication number
CN108681746B
Authority
CN
China
Prior art keywords
image
classifier
sub
feature
target
Prior art date
Legal status
Active
Application number
CN201810443324.0A
Other languages
Chinese (zh)
Other versions
CN108681746A (en)
Inventor
Wei Xiushen (魏秀参)
Current Assignee
Beijing Megvii Technology Co Ltd
Original Assignee
Beijing Megvii Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Megvii Technology Co Ltd filed Critical Beijing Megvii Technology Co Ltd
Priority to CN201810443324.0A priority Critical patent/CN108681746B/en
Publication of CN108681746A publication Critical patent/CN108681746A/en
Application granted granted Critical
Publication of CN108681746B publication Critical patent/CN108681746B/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/29 - Graphical models, e.g. Bayesian networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an image recognition method, an image recognition device, an electronic device and a computer-readable medium, relating to the technical field of image recognition. The method comprises the following steps: obtaining a test sample, wherein the test sample comprises test example images and an image to be identified, and the categories of the objects in the test example images include the category of the object in the image to be identified; performing feature extraction on the test example images to obtain a feature set, wherein the feature set comprises a plurality of sub-vectors, each sub-vector being a part feature vector of an object in an example image; and mapping each sub-vector to a sub-classifier of the corresponding type through a segmented classifier mapping model, determining a target classifier from the sub-classifiers, and performing image recognition on the image to be recognized through the target classifier, wherein the segmented classifier mapping model is a model obtained after small-sample learning. The method frees conventional fine-grained image recognition from its dependence on massive quantities of fine-grained images.

Description

Image identification method and device, electronic equipment and computer readable medium
Technical Field
The present invention relates to the field of image recognition technologies, and in particular, to an image recognition method, an image recognition apparatus, an electronic device, and a computer-readable medium.
Background
With the rapid development of artificial intelligence, image recognition and image detection technologies have also developed rapidly and are now widely applied in daily life. Image recognition techniques may be used to identify the type of object contained in an image, for example, whether the object in an image is a dog or a cat. With this rapid development, another image recognition technology has also emerged: fine-grained image recognition.
Fine-grained image recognition is an important research subject in the field of computer vision. Current research mainly focuses on how to find object parts with discriminative power in fine-grained images, or on constructing novel network models suited to fine-grained recognition tasks. However, whichever of the above deep learning methods is used, it depends on massive quantities of fine-grained images, which limits the development of fine-grained image recognition and its application in real scenarios.
Disclosure of Invention
In view of the above, the present invention provides an image recognition method, an image recognition apparatus, an electronic device, and a computer-readable medium, which free conventional fine-grained image recognition from its dependence on massive quantities of fine-grained images.
In a first aspect, an embodiment of the present invention provides an image recognition method, including: obtaining a test sample, wherein the test sample comprises test example images and an image to be identified, and the categories of the objects in the test example images include the category of the object in the image to be identified; performing feature extraction on the test example images to obtain a feature set, wherein the feature set comprises a plurality of sub-vectors, each sub-vector being a part feature vector of an object in an example image; and mapping each sub-vector to a sub-classifier of the corresponding type through a segmented classifier mapping model, and determining a target classifier from the sub-classifiers, so that image recognition is performed on the image to be identified through the target classifier, wherein the segmented classifier mapping model is a model obtained after small-sample learning.
Further, there is at least one category of object in the test example images. Performing feature extraction on the test sample images to obtain a feature set comprises: performing feature extraction on the test example images of category $A_i$ in the test sample to obtain a feature set $X_i$, where category $A_i$ is the i-th of the plurality of categories, i runs from 1 to k, and k is the number of categories of the test sample images. Mapping each sub-vector to a sub-classifier of the corresponding type through the segmented classifier mapping model to obtain a target classifier comprises: mapping each sub-vector in the feature set $X_i$ to a sub-classifier of the corresponding type through the segmented classifier mapping model, and determining a target classifier $F_i$ from the sub-classifiers.
Further, mapping each sub-vector in the feature set $X_i$ to a sub-classifier of the corresponding type through the segmented classifier mapping model and determining a target classifier $F_i$ from the sub-classifiers comprises: mapping each sub-vector in the feature set $X_i$ to a sub-classifier of the corresponding type through a segmented classification mapping function in the segmented classifier mapping model, to obtain a plurality of sub-classifiers; and cascading the plurality of sub-classifiers to obtain the target classifier $F_i$.
Further, mapping each sub-vector in the feature set $X_i$ to a sub-classifier of the corresponding type by a segmented classification mapping function in the segmented classifier mapping model comprises: mapping the t-th sub-vector of the feature set $X_i$ to a sub-classifier of the corresponding type by the formula

$$F_i^{(t)} = \mathcal{M}^{(t)}\big(x_i^{(t)}\big), \quad t = 1, \dots, n_B,$$

where $x_i^{(t)}$ denotes the t-th sub-vector of the feature set $X_i$, $F_i^{(t)}$ is the sub-classifier corresponding to it, $\mathcal{M}^{(t)}$ denotes the segmented classification mapping function, and $n_B$ is the number of sub-vectors in the feature set $X_i$.
Further, performing feature extraction on the test example images of category $A_i$ in the test sample to obtain the feature set $X_i$ comprises: performing feature extraction on each test example image of category $A_i$ in the test sample to obtain a feature set $\{x_j\}_{j=1}^{N_e}$, where $x_j$ denotes the j-th test example image of category $A_i$ and $N_e$ is the number of test example images of category $A_i$ in the test sample; and computing the feature set $X_i$ from $\{x_j\}_{j=1}^{N_e}$ according to the formula

$$X_i = \frac{1}{N_e} \sum_{j=1}^{N_e} x_j.$$
Further, performing feature extraction on the test sample image to obtain a feature set, including: and performing feature extraction on the test example image through a bilinear neural network to obtain a bilinear feature set of the test example image, wherein the bilinear feature set comprises a plurality of sub-vectors.
Further, performing feature extraction on the test example image through a bilinear neural network to obtain the bilinear feature set of the test example image includes: extracting a first bilinear feature set of the test example image through a first branch feature extraction network in the bilinear neural network; extracting a second bilinear feature set of the test example image through a second branch feature extraction network in the bilinear neural network; and performing an outer product operation on the first bilinear feature set and the second bilinear feature set to obtain the bilinear feature set of the test example image.
Further, performing image recognition on the image to be recognized through the target classifier further includes: performing feature extraction on the image to be recognized to obtain a target feature matrix, wherein the target feature matrix comprises feature information of the image to be recognized; and performing image recognition on the target feature matrix through the target classifier.
Further, the number of target classifiers is plural, and performing image recognition on the target feature matrix of the image to be recognized through the target classifiers includes: performing image recognition on the target feature matrix through each target classifier to obtain a plurality of category confidences; and determining the category of the image to be recognized based on the category of the target classifier corresponding to the target category confidence among the plurality of category confidences, wherein the target category confidence is a confidence among the plurality of category confidences that is greater than a preset threshold.
Further, performing image recognition on the feature vectors of the image to be recognized through each target classifier to obtain a plurality of category confidences comprises: performing an inner product operation between the feature matrix of the target classifier and the target feature matrix, and taking the result of the operation as the category confidence.
Further, the method further comprises: acquiring a training sample set, wherein the training sample set comprises training images and label information of the training images, the label information is used for characterizing the categories of the training images, and the training images comprise training example images and query set images; performing feature extraction on the training example images to obtain a feature set of the training example images; and performing small-sample training on the segmented classification mapping function in an original segmented classifier mapping model through the feature set of the training example images and the label information of the training example images, to obtain a trained original segmented classifier mapping model.
Further, the method further comprises: performing image recognition on the query set images through the trained original segmented classifier mapping model, and determining the value of a classification loss function based on the recognition result; and adjusting the parameters of the original segmented classification mapping function based on the value of the classification loss function.
In a second aspect, an embodiment of the present invention provides an image recognition apparatus, including: an acquisition unit, configured to acquire a test sample, wherein the test sample comprises test example images and an image to be recognized, and the categories of the objects in the test example images include the category of the object in the image to be recognized; a feature extraction unit, configured to perform feature extraction on the test example images to obtain a feature set, wherein the feature set comprises a plurality of sub-vectors, each being a part feature vector of an object in an example image; and a mapping identification unit, configured to map each sub-vector to a sub-classifier of the corresponding type through a segmented classifier mapping model and determine a target classifier from the sub-classifiers, so as to perform image recognition on the image to be recognized through the target classifier, wherein the segmented classifier mapping model is a model obtained after small-sample learning.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the image recognition method described in any one of the above when executing the computer program.
In a fourth aspect, an embodiment of the present invention provides a computer-readable medium having non-volatile program code executable by a processor, the program code causing the processor to perform the image recognition method of any one of the above claims.
In the embodiment of the invention, a test sample is first obtained, wherein the test sample comprises test example images and an image to be identified, and the categories of the objects in the test example images include the category of the object in the image to be identified; then, feature extraction is performed on the test example images to obtain a feature set, wherein the feature set comprises a plurality of sub-vectors, each being a part feature vector of an object in an example image; finally, each sub-vector is mapped to a sub-classifier of the corresponding type through a segmented classifier mapping model, a target classifier is determined from the sub-classifiers, and image recognition is performed on the image to be recognized through the target classifier, wherein the segmented classifier mapping model is a model obtained after small-sample learning.
In this embodiment, the segmented classifier mapping model may also be referred to as a fine-grained image recognition model. Through small-sample learning tasks, the model learns the learning paradigm of such tasks and is then used at test time to perform accurate image recognition on the image to be recognized. This solves the technical problem that conventional fine-grained image recognition depends too heavily on massive quantities of fine-grained images, achieving the technical effect that fine-grained image recognition no longer depends on massive quantities of fine-grained images.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of an electronic device;
FIG. 2 is a flow chart of an image recognition method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a fine-grained level image recognition model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an image recognition apparatus according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, not all, embodiments of the present invention. First, an electronic device 100 for implementing an embodiment of the present invention, which may be used to execute the image recognition method of embodiments of the present invention, is described with reference to fig. 1.
As shown in FIG. 1, electronic device 100 includes one or more processors 102, one or more memories 104, an input device 106, an output device 108, and an image collector 110, which are interconnected via a bus system 112 and/or other form of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in fig. 1 are exemplary only, and not limiting, and the electronic device may have other components and structures as desired.
The processor 102 may be implemented in at least one hardware form of a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), or an Application Specific Integrated Circuit (ASIC). The processor 102 may be a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or another form of processing unit having data processing capability and/or instruction execution capability, and may control other components in the electronic device 100 to perform desired functions.
The memory 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory (cache). The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 102 to implement the client-side functionality (implemented by the processor) and/or other desired functionality in the embodiments of the invention described below. Various applications and various data, such as data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 may output various information (e.g., images or sounds) to the outside (e.g., a user), and may include one or more of a display, a speaker, and the like.
The image collector 110 is configured to collect images, and the collected data is input to the image recognition method for processing. For example, the image collector may capture an image desired by the user (e.g., a photo or a video), which is then input to the image recognition method; the image collector may also store the captured image in the memory 104 for use by other components.
Exemplarily, an electronic device for implementing an image recognition method according to an embodiment of the present invention may be implemented as a smart terminal such as a video camera, a snapshot machine, a smart phone, a tablet computer, and the like.
As can be seen from the description of the background art, fine-grained image recognition technology depends on massive quantities of fine-grained images, which limits its development and its application in real scenarios. Humans, by contrast, can learn new concepts from very little supervision; for example, an average adult can learn to recognize a new bird species from only a few images.
In order to give a fine-grained image recognition model a human-like ability to learn from a small number of training samples, the invention is the first to propose and study the few-shot learning task for fine-grained image recognition. The fine-grained image recognition task based on a small number of training samples requires that, given only a few (generally one or five) labeled samples, the model be trained into an ideal fine-grained object classifier that completes the recognition task. These few labeled samples are often referred to as "example images", "examples" (exemplars), or "example samples". Because fine-grained image labels are hard to acquire and massive data is hard to collect, this task has enormous prospects in practical applications; but since the supervision provided by a small number of samples is extremely limited, the task difficulty increases greatly. The image recognition method is described below with reference to specific embodiments.
In accordance with an embodiment of the present invention, there is provided an embodiment of an image recognition method, it should be noted that the steps illustrated in the flowchart of the accompanying drawings may be performed in a computer system such as a set of computer-executable instructions, and that while a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different than here.
Fig. 2 is a flowchart of an image recognition method according to an embodiment of the present invention. As shown in fig. 2, the method includes the following steps:
step S202, a test sample is obtained, wherein the test sample comprises a test example image and an image to be identified, and the category of the object in the test example image comprises the category of the object in the image to be identified.
In an embodiment of the present invention, the test example image is the labeled sample described above, that is, a sample labeled with its category, where the category in this embodiment is determined by the category of the object contained in the image.
The test example image further carries label information, and the label information is used to characterize the category of the corresponding test example image.
Step S204, extracting the characteristics of the test sample image to obtain a characteristic set, wherein the characteristic set comprises a plurality of subvectors, and the subvectors are the component characteristic vectors of the object in the test sample image.
In the embodiment of the present invention, after the feature extraction is performed on the test sample image, the obtained feature set includes a plurality of sub-vectors, and each sub-vector is used for characterizing a feature of an implicit component of an object in the corresponding test sample image.
It should be noted that, in this embodiment, the component refers to a component obtained by implicitly dividing the image, that is, the component may be a local area of an object in the test sample image.
Implicit division means that an image block is implicitly divided into a plurality of parts, recording only the position information of those parts within the image block; the implicitly divided image block keeps its original representation. Explicit division, by contrast, means the image block is actually decomposed into a plurality of small part image blocks, so that the original image block no longer exists.
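As an illustrative sketch only (the 7×7 grid and 224×224 input size are assumptions for illustration, not values from this embodiment), each spatial cell of a convolutional feature map can play the role of one implicit part: its position within the image block is recorded, while the block itself keeps its original representation.

```python
def implicit_part_region(row: int, col: int, img_size: int = 224, grid: int = 7):
    """Illustrative only: map one cell of a grid x grid conv feature map back
    to the image region it roughly covers. The part's position is recorded
    without cropping, so the image block itself stays intact."""
    cell = img_size // grid          # 32 pixels per cell under these sizes
    top, left = row * cell, col * cell
    return (top, left, top + cell, left + cell)   # (top, left, bottom, right)

# e.g. the implicit part at cell (2, 3) covers roughly pixels (64, 96)..(96, 128)
print(implicit_part_region(2, 3))
```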
Step S206, mapping each sub-vector to a sub-classifier of the corresponding type through a segmented classifier mapping model, and determining a target classifier from the sub-classifiers, so as to perform image recognition on the image to be recognized through the target classifier, wherein the segmented classifier mapping model is a model obtained after small-sample learning.
In the embodiment of the present invention, the segmented Classifier Mapping model (PCM) may also be referred to as the fine-grained image recognition deep network model PCM.
In this embodiment, "segmented" means that each sub-vector is mapped to a sub-classifier of the corresponding type by its own mapping function, and the target classifier is then obtained from these sub-classifiers.
In this embodiment, the segmented classifier mapping model may also be referred to as a fine-grained image recognition model. Through small-sample learning tasks, the model learns the learning paradigm of such tasks, and is then used at test time to perform accurate image recognition on the image to be recognized. This solves the technical problem that conventional fine-grained image recognition depends too heavily on massive quantities of fine-grained images, achieving the technical effect that fine-grained image recognition no longer depends on massive quantities of fine-grained images.
In an alternative embodiment, if the test sample images have a plurality of categories, then in step S204, performing feature extraction on the test sample images to obtain the feature set comprises: performing feature extraction on the test example images of category $A_i$ in the test sample to obtain a feature set $X_i$, where category $A_i$ is the i-th of the plurality of categories, i runs from 1 to k, and k is the number of categories of the test sample images.
If the test sample images have a plurality of categories, then in step S206, mapping each sub-vector to a sub-classifier of the corresponding type through the segmented classifier mapping model to obtain the target classifier comprises: mapping each sub-vector in the feature set $X_i$ to a sub-classifier of the corresponding type through the segmented classifier mapping model, and determining the target classifier $F_i$ from the sub-classifiers.
Specifically, suppose the test sample images have several categories, say k. In that case, feature extraction can be performed on the test example images of each category to obtain the corresponding feature set; for example, feature extraction is performed in turn on the test example images of categories $A_1$ to $A_k$, obtaining the corresponding feature sets $X_1$ to $X_k$.
After the corresponding feature sets $X_1$ to $X_k$ are obtained, each sub-vector in each feature set can be mapped to a sub-classifier of the corresponding type through the segmented classifier mapping model, and a target classifier is then determined from the sub-classifiers.
Take one category $A_i$ among $A_1$ to $A_k$ as an example. First, feature extraction is performed on the test example images of category $A_i$ in the test sample to obtain the feature set $X_i$, which contains a plurality of sub-vectors. Then, each sub-vector in the feature set $X_i$ is mapped to a sub-classifier of the corresponding type through the segmented classifier mapping model, obtaining a plurality of sub-classifiers. Finally, a target classifier $F_i$ can be determined from the plurality of sub-classifiers.
As can be seen from the above description, this embodiment adopts a segmented mapping manner, that is, each sub-vector in the feature set $X_i$ is mapped to a sub-classifier of the corresponding type. Compared with the global mapping manner, segmented mapping greatly simplifies the mapping and can reduce the training difficulty of the network.
Optionally, in this embodiment, when the category in the test sample images is $A_i$, feature extraction can be performed on the test example images in the following implementation manner to obtain the feature set, which specifically comprises:
and performing feature extraction on the test example image through a bilinear neural network to obtain a bilinear feature set of the test example image, wherein the bilinear feature set comprises a plurality of sub-vectors.
In this embodiment, feature extraction may be performed on the example image through a bilinear neural network (bilinear network), so as to obtain a bilinear feature set.
For example, feature extraction is performed on the test example images of category $A_i$ in the test sample to obtain the feature set $X_i$, also called a bilinear feature set, which contains $n_B$ sub-vectors $x^{(t)}$; that is, the set is represented as

$$X_i = \big\{x^{(1)}, x^{(2)}, \dots, x^{(n_B)}\big\}.$$

In this embodiment, in the feature set $\{x_j\}_{j=1}^{N_e}$, the subscript identifies a test example image within the same category, and the superscript identifies a part within a sample.

For example, $x_j$ in the feature set $\{x_j\}_{j=1}^{N_e}$ denotes the j-th test example image belonging to category $A_i$, and can be expressed as $x_j = \{x_j^{(1)}, x_j^{(2)}, \dots, x_j^{(n_B)}\}$, where $x_j^{(t)}$ denotes the sub-vector of part t in the j-th test example image.
In this embodiment, the bilinear neural network makes it possible to obtain a set of implicit part feature vectors of the object in an image, which cannot be obtained with other models. Extracting this set of implicit part feature vectors can effectively aid fine-grained object recognition and thereby improve fine-grained recognition accuracy.
Further, performing feature extraction on the test example image through the bilinear neural network to obtain its bilinear feature set includes the following steps:
Step S11, extracting a first bilinear feature set of the test example image through the first branch feature extraction network in the bilinear neural network;
Step S12, extracting a second bilinear feature set of the test example image through the second branch feature extraction network in the bilinear neural network;
Step S13, performing an outer product operation on the first bilinear feature set and the second bilinear feature set to obtain the bilinear feature set of the test example image.
In this embodiment, the bilinear neural network comprises two network branches: a first branch feature extraction network $f_A$ and a second branch feature extraction network $f_B$. In the present invention, feature extraction is performed on the test example image by $f_A$ and $f_B$ respectively, yielding the first bilinear feature set and the second bilinear feature set; an outer product operation is then performed on the two sets.
For example, each $x^{(t)}$ in the set $X_i = \{x^{(1)}, \dots, x^{(n_B)}\}$ can be seen as the outer product of a feature obtained by the first branch feature extraction network $f_A$ and a feature obtained by the second branch feature extraction network $f_B$.
The specific outer product computation is as follows. Assume the first bilinear feature set contains 49 512-dimensional sub-vectors; by the nature of a CNN, each sub-vector actually corresponds to an image region of the original image. Likewise, assume the second bilinear feature set contains 49 512-dimensional sub-vectors. The outer product refers to an outer product operation between sub-vectors obtained by the two network branches of the bilinear neural network. Specifically, for the first sub-vector of the second bilinear feature set, its first dimension is multiplied by the first sub-vector of the first bilinear feature set, giving a 512-dimensional result; its second dimension is then multiplied by the same first sub-vector of the first bilinear feature set, giving another 512-dimensional result; these operations are repeated until every dimension of the first sub-vector in the second bilinear feature set has acted on the first sub-vector in the first bilinear feature set. At this point 512 results of 512 dimensions each are obtained, and they are concatenated directly into a 262144-dimensional vector, which is the outer product of the first sub-vector in the second bilinear feature set and the first sub-vector in the first bilinear feature set. The same operations are then performed on the second, third, ..., forty-ninth sub-vectors: all dimensions of the second sub-vector in the second bilinear feature set act on the second sub-vector in the first bilinear feature set, and so on, until all dimensions of the forty-ninth sub-vector in the second bilinear feature set have acted on the forty-ninth sub-vector in the first bilinear feature set. This yields 49 outer-product results of 262144 dimensions, which are finally averaged to obtain the final 262144-dimensional outer-product result vector.
Here, $x^{(t)}$ in the set $X_i$ is the t-th of the averaged 512-dimensional results.
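A minimal sketch of this outer-product pooling, assuming PyTorch and the 49×512 shapes used in the example above; the two branch networks $f_A$ and $f_B$ themselves are omitted:

```python
import torch

def bilinear_outer_product(feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
    """feat_a, feat_b: (L, D) local features from the two branches, e.g. L = 49
    sub-vectors of D = 512 dimensions each. Returns the (D, D) location-averaged
    outer-product matrix; flattened, it is the 262144-dimensional vector
    described above, and row t is the t-th 512-dimensional sub-vector x^(t)."""
    assert feat_a.shape == feat_b.shape
    # outer product at every spatial location, averaged over the L locations
    return torch.einsum('ld,le->de', feat_b, feat_a) / feat_a.shape[0]

feat_a = torch.randn(49, 512)    # local features from branch f_A
feat_b = torch.randn(49, 512)    # local features from branch f_B
parts = bilinear_outer_product(feat_a, feat_b)   # (512, 512): 512 implicit parts
```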
After feature extraction has been performed in the manner described above on the test example images of category $A_i$ in the test sample to obtain the feature set $X_i$, each sub-vector in the feature set $X_i$ is mapped to a sub-classifier of the corresponding type through the segmented classifier mapping model, obtaining the target classifier $F_i$.
If there are several test example images of category $A_i$, performing feature extraction on each of them in the manner described above yields the set $\{x_j\}_{j=1}^{N_e}$.

On this basis, when feature extraction is performed on the test example images of category $A_i$ in the test sample, feature extraction is performed on each test example image of category $A_i$ to obtain the feature set $\{x_j\}_{j=1}^{N_e}$, where $x_j$ denotes the j-th test example image of category $A_i$ and $N_e$ is the number of test example images of category $A_i$ in the test sample. Here $x_j$ can be expressed as

$$x_j = \big\{x_j^{(1)}, x_j^{(2)}, \dots, x_j^{(n_B)}\big\}.$$

In this embodiment, feature extraction can be performed on each test example image of category $A_i$ by the method described in steps S11 to S13 above, obtaining the feature set of each test example image; once the feature set of each test example image is obtained, the set $\{x_j\}_{j=1}^{N_e}$ is available.

After the feature set $\{x_j\}_{j=1}^{N_e}$ is obtained, the feature set $X_i$ can be computed according to the formula

$$X_i = \frac{1}{N_e} \sum_{j=1}^{N_e} x_j.$$

Here $X_i$ can be expressed as $X_i = \{x_i^{(1)}, x_i^{(2)}, \dots, x_i^{(n_B)}\}$, where the subscript is the sample (category) identifier and the superscript is the identifier of part t in the sample; that is, $x_i^{(t)}$ represents the feature of part t for category i.
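Continuing the sketch above with the same assumed shapes, the class-level feature set $X_i$ is simply the mean of the per-example part features over the $N_e$ test example images of the class:

```python
import torch

def class_feature_set(example_parts: list) -> torch.Tensor:
    """example_parts: N_e tensors of shape (n_B, D), one per test example image
    of class A_i. Returns X_i = (1 / N_e) * sum_j x_j, also of shape (n_B, D)."""
    return torch.stack(example_parts).mean(dim=0)

# N_e = 5 example images of one class, reusing bilinear_outer_product from above
x_i = class_feature_set([bilinear_outer_product(torch.randn(49, 512),
                                                torch.randn(49, 512))
                         for _ in range(5)])
```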
In an optional embodiment, mapping each sub-vector in the feature set $X_i$ to a sub-classifier of the corresponding type through the segmented classifier mapping model to obtain the target classifier $F_i$ comprises:
first, mapping each sub-vector in the feature set $X_i$ to a sub-classifier of the corresponding type through the segmented classification mapping function in the segmented classifier mapping model, obtaining a plurality of sub-classifiers;
then, cascading the plurality of sub-classifiers to obtain the target classifier $F_i$.
The main function of the segmented classifier mapping model is to map the feature set (or bilinear feature set) of the test example images to the classifier of the corresponding category. An intuitive solution is to map the bilinear representation directly from the feature space to the classifier space through a global mapping; taking a linear mapping as an example: $F_i = W_g X_i + b_g$, where $W_g \in \mathbb{R}^{D \times D}$ and $b_g \in \mathbb{R}^{D}$ are the parameters of the global mapping function.
Careful analysis shows that the above global mapping has two disadvantages. First, because the example feature $X_i$ contains category-level global information, its distribution is complex, and learning a global mapping on such features makes finding the corresponding category classifier a significant challenge. Second, the high dimensionality of the bilinear feature makes the number of parameters the mapping function must learn extremely large, which brings great difficulty to model training.
To address the above problems, the invention provides a novel mapping strategy, namely the segmented mapping method of the segmented Classifier Mapping model (PCM). As above, the bilinear feature set $X_i$ can be viewed as a set of sub-vectors $\{x_i^{(t)}\}_{t=1}^{n_B}$, and each sub-vector can be regarded as an implicit part feature set. Intuitively, in fine-grained recognition, if the similarity between a part and the corresponding part in the example image can be judged part by part, the fine-grained category of the image to be recognized can be determined. This motivates mapping the feature set of each part, at the part level, to the category classifier corresponding to that part using the segmented classification mapping function in the segmented classifier mapping model, and then cascading the classifiers corresponding to the respective parts into the whole object-level classifier. The whole learning process of the segmented mapping is shown in fig. 3, whose working process is described in the embodiment below.
In this embodiment, optionally, each sub-vector in the feature set $X_i$ is mapped to a sub-classifier of the corresponding type in the following manner: by the formula

$$F_i^{(t)} = \mathcal{M}^{(t)}\big(x_i^{(t)}\big), \quad t = 1, \dots, n_B,$$

the t-th sub-vector of the feature set $X_i$ is mapped to a sub-classifier of the corresponding type, where $x_i^{(t)}$ denotes the t-th sub-vector of the feature set $X_i$, $F_i^{(t)}$ is the sub-classifier corresponding to it, $\mathcal{M}^{(t)}$ denotes the segmented classification mapping function, and $n_B$ is the number of sub-vectors in the feature set $X_i$.

Specifically, the t-th sub-vector $x_i^{(t)}$ of the feature set $X_i$ is first mapped to a sub-classifier of the corresponding type by a multilayer perceptron $\mathcal{M}^{(t)}$; that is, the sub-vector $x_i^{(t)}$ is mapped to a sub-classifier of the corresponding type through the formula above. Here the segmented classification mapping function can be expressed concretely as the family

$$\big\{\mathcal{M}^{(1)}, \mathcal{M}^{(2)}, \dots, \mathcal{M}^{(n_B)}\big\},$$

in which $\mathcal{M}^{(1)}$ maps the sub-vector $x_i^{(1)}$ to a sub-classifier of the corresponding type, $\mathcal{M}^{(2)}$ maps the sub-vector $x_i^{(2)}$ to a sub-classifier of the corresponding type, and so on, up to $\mathcal{M}^{(n_B)}$, which maps the sub-vector $x_i^{(n_B)}$ to a sub-classifier of the corresponding type.

After the t-th sub-vector of the feature set $X_i$ has been mapped to a sub-classifier of the corresponding type in the manner described above, a plurality of sub-classifiers is obtained. Cascading the plurality of sub-classifiers then gives the target classifier $F_i$, expressed as $F_i = [F_i^{(1)}; F_i^{(2)}; \dots; F_i^{(n_B)}]$.
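A sketch of this segmented mapping and cascade under the shapes assumed earlier. The description above says each $\mathcal{M}^{(t)}$ is a multilayer perceptron; a single linear layer per part is used here for brevity, which also matches the $512^3$ parameter count discussed below:

```python
import torch
import torch.nn as nn

class SegmentedClassifierMapping(nn.Module):
    """Maps each part sub-vector x_i^(t) to its sub-classifier F_i^(t) with an
    independent mapping M^(t), then cascades the n_B sub-classifiers into the
    target classifier F_i."""
    def __init__(self, n_parts: int = 512, dim: int = 512):
        super().__init__()
        # one independent mapping function M^(t) per implicit part
        self.maps = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_parts))

    def forward(self, x_i: torch.Tensor) -> torch.Tensor:   # x_i: (n_parts, dim)
        sub_classifiers = [m(x_i[t]) for t, m in enumerate(self.maps)]
        return torch.stack(sub_classifiers)   # F_i = [F^(1); F^(2); ...; F^(n_B)]

pcm = SegmentedClassifierMapping()
f_i = pcm(x_i)   # target classifier for class A_i, with x_i from the sketch above
```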
It should be noted that if the test sample images have k categories, k target classifiers are obtained, namely: $F_1, F_2, \dots, F_i, \dots, F_k$.
As can be seen from the above description, the image recognition method provided in this embodiment uses a segmented function mapping to map each sub-vector in the bilinear feature set to a sub-classifier of the corresponding type, which greatly simplifies the aforementioned global mapping and further reduces the training difficulty of the network.
It should be noted that the segmented function mapping described in this embodiment also greatly reduces the network parameters in the classifier generation phase. Taking a single-layer mapping as an example, assume $n_A = n_B = 512$; this gives a $512^2$-dimensional bilinear feature set. In this case, the global mapping model requires $512^4$ parameters, while the segmented mapping method needs only $(512 \times 512) \times 512 = 512^3$ mapping parameters. Here $n_A$ is the dimension of the first bilinear feature set extracted by the first branch feature extraction network, and $n_B$ is the dimension of the second bilinear feature set extracted by the second branch feature extraction network; in general, $n_A = n_B$.
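The parameter-count comparison, spelled out under the same assumed dimensions:

```python
D = 512                                        # n_A = n_B = 512
bilinear_dim = D * D                           # 512^2-dimensional bilinear feature
global_params = bilinear_dim * bilinear_dim    # 512^4: W_g maps R^(D^2) to R^(D^2)
segmented_params = D * (D * D)                 # 512^3: 512 maps, each D -> D
print(global_params // segmented_params)       # 512x fewer parameters
```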
After the k target classifiers $F_1, F_2, \dots, F_i, \dots, F_k$ have been obtained in this embodiment by the method described above, image recognition can be performed on the image to be recognized through the k target classifiers.
In an optional embodiment, the image recognition of the image to be recognized by the target classifier further comprises: firstly, extracting the characteristics of the image to be recognized to obtain a target characteristic matrix, wherein the target characteristic matrix comprises the characteristic information of the image to be recognized; then, the target feature matrix is subjected to image recognition through the target classifier.
Specifically, in this embodiment, feature extraction can be performed on the image to be recognized through the bilinear neural network to obtain the target feature matrix of the image to be recognized, which can be represented as

$$N = \big\{N_1, N_2, \dots, N_{n_B}\big\},$$

where $N_t$ represents the feature vector of part t of the image to be recognized. After this bilinear feature matrix of the image to be recognized is obtained, image recognition can be performed on the target feature matrix through the target classifier.
Optionally, if there are several target classifiers, performing image recognition on the target feature matrix of the image to be recognized through the target classifiers includes the following steps:
first, performing image recognition on the target feature matrix through each target classifier to obtain a plurality of category confidences;
here, performing image recognition on the feature vectors of the image to be recognized through each target classifier to obtain the plurality of category confidences comprises: performing an inner product operation between the feature matrix of the target classifier and the target feature matrix, and taking the result of the operation as the category confidence;
then, determining the category of the image to be recognized based on the category of the target classifier corresponding to the target category confidence among the plurality of category confidences, where the target category confidence is a confidence among the plurality that is greater than a preset threshold.
If there are k (k > 1) target classifiers, the specific recognition process is as follows: image recognition is performed on the target feature matrix of the image to be recognized through each of the k target classifiers in turn, obtaining a corresponding recognition result. The recognition result can be a category confidence, which represents the likelihood that the object to be recognized belongs to the category corresponding to that target classifier, and can be a value in the range 0 to 1. After the k target classifiers have recognized the image to be recognized, k category confidences are obtained. The category of the image to be recognized can then be determined by selecting the category of the target classifier corresponding to the target category confidence among the k category confidences; alternatively, the category of the target classifier corresponding to the maximum confidence can be determined as the category of the image to be recognized.
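A sketch of this recognition step; the sigmoid squashing the inner-product score into a 0 to 1 confidence is an assumption, since the description only states that the confidence is a value in that range:

```python
import torch

def recognize(target_classifiers: torch.Tensor,
              query_parts: torch.Tensor,
              threshold: float = 0.5):
    """target_classifiers: (k, n_B, D), one cascaded F_i per class.
    query_parts: (n_B, D) target feature matrix of the image to be recognized."""
    # inner product of each classifier's feature matrix with the target matrix
    scores = torch.einsum('kpd,pd->k', target_classifiers, query_parts)
    conf = torch.sigmoid(scores)       # assumed mapping of scores into [0, 1]
    best = int(conf.argmax())
    # keep the top class only if its confidence exceeds the preset threshold
    return best if conf[best] > threshold else None
```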
It should be noted that, in the embodiment of the present invention, before the test sample is recognized, the initial model of the segmented classifier mapping model (i.e., the original segmented classifier mapping model) needs to be learned and trained. The specific training process is as follows:
step S21, acquiring a training sample set; the training sample set comprises training images and label information of the training images, the label information is used for characterizing the categories of the training images, and the training images comprise training example images and query set images;
step S22, extracting the features of the training example image to obtain a feature set of the training example image;
step S23, carrying out small sample training on the segmentation classification mapping function in the original segmentation classifier mapping model through the feature set of the training example image and the label information of the training example image to obtain the trained original segmentation classifier mapping model;
step S24, performing image recognition on the query set images through the trained original segmented classifier mapping model, and determining the value of a classification loss function based on the recognition result;
and step S25, adjusting the parameters of the segmented classification mapping function based on the values of the classification loss function.
Specifically, in this embodiment, an auxiliary data set $B$ containing $R$ labeled images is first acquired, $B = \{(I_1, y_1), (I_2, y_2), \dots, (I_i, y_i), \dots, (I_R, y_R)\}$, where $I_i$ is an image sample and $y_i \in \{1, 2, \dots, C_B\}$ is the category label information of the image.
Before training the original segmented classifier mapping model, few-shot recognition tasks similar to the test environment need to be constructed from the auxiliary data set B. Specifically, at least one meta-training set is first randomly sampled from the auxiliary data set B; each meta-training set contains $C_E < C_B$ randomly sampled categories and the image samples belonging to those categories. Each meta-training set is then divided into two parts, a training example set E and a query set Q, where the image samples in the training example set E play the role of the small number of training samples, and the query set Q is used to evaluate the recognition performance of the classifier after learning.
After each meta-training set is divided in this way, a training sample set is obtained; the training sample set comprises training images and their label information, the label information characterizes the categories of the training images, and the training images comprise training example images and query set images. The training example images are the images in the training example set, and the query set images are the images in the query set.
Specifically, the training example set E contains training samples of several categories, and each category contains $N_e$ (typically 1 or 5) training samples. The image samples in the query set Q are the remaining images in the meta-training set other than the example set.
After the training sample set is obtained, the original segmented classification mapping function can be trained with it. The main idea of the training in this embodiment is for the model to learn the learning paradigm of fine-grained object recognition under a small number of samples; specifically, through the training sample set, the original segmented classifier mapping model can learn the mapping from "examples to category classifier", thereby learning the learning paradigm of the fine-grained object recognition model and performing the image recognition task based on the learning result.
Before the original segmented classification mapping function is trained with the training sample set, feature extraction is first performed on the training example images in the training sample set to obtain their feature set; then, small-sample training is performed on the segmented classification mapping function in the original segmented classifier mapping model through the feature set of the training example images and their label information, obtaining the trained original segmented classifier mapping model. It should be noted that, in this embodiment, the training process of the original segmented classifier mapping model can be understood as the process of learning and training the segmented classification mapping function within it.
In this embodiment, after the original segmented classifier mapping model has been trained through the training example images, image recognition can be performed on the query set images through the trained model, generating a classification error (a classification loss, i.e., a value of the loss function). The resulting classification error is used to update the parameters of the segmented classification mapping function.
The process of evaluating the trained original segmented classifier mapping model on the query set images can also be described as

$$\min_{\lambda} \; \mathcal{L}\big(F_E(Q)\big),$$

where $\lambda$ denotes the parameters that generate the target classifier $F_E$ corresponding to the training example set E, $\mathcal{L}$ is the loss function, and $F_E(Q)$ denotes applying the target classifier $F_E$ generated from the training example set E to the query set Q.
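Putting the pieces together, one meta-training episode might look like the sketch below. The episode container (example_images, query_images, query_labels), the feature_net returning an image's two branch feature maps, and the optimizer wiring are all hypothetical names for illustration; bilinear_outer_product is reused from the earlier sketch, and only the example-to-classifier-to-query-loss flow comes from the description above.

```python
import torch
import torch.nn as nn

def meta_train_step(pcm, feature_net, episode, optimizer):
    """One few-shot episode: the example set E (N_e images per class) generates
    the classifiers; the query set Q supplies the classification loss used to
    update the parameters of the mapping functions M^(t)."""
    # generate one target classifier F_i per class from its example images
    classifiers = []
    for class_images in episode.example_images:           # one list per class A_i
        parts = torch.stack([bilinear_outer_product(*feature_net(img))
                             for img in class_images])    # (N_e, n_B, D)
        classifiers.append(pcm(parts.mean(dim=0)))        # X_i -> F_i
    f = torch.stack(classifiers)                          # (C_E, n_B, D)

    # score every query image against every generated classifier
    logits = torch.stack([torch.einsum('kpd,pd->k', f,
                                       bilinear_outer_product(*feature_net(q)))
                          for q in episode.query_images])
    loss = nn.functional.cross_entropy(logits, episode.query_labels)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```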
The above embodiment is further described with reference to fig. 3, which is a schematic structural diagram of the fine-grained image recognition model. As shown in fig. 3, the fine-grained image recognition model comprises a representation learning module and a classifier mapping module. The representation learning module comprises the bilinear neural network; the classifier mapping module comprises the segmented classifier mapping model, which in turn comprises a segmented mapping network. The segmented mapping network implements the mapping of the sub-vectors in the feature set through the segmented classification mapping functions $\mathcal{M}^{(1)}$ to $\mathcal{M}^{(n_B)}$.
As shown in fig. 3, in the present embodiment, first, an Input image (Input Images) is acquired, where the Input image includes an image in a training sample set or an image in a test sample in the above-described embodiment. After the Input Images are acquired, feature extraction may be performed on the Input Images through a bilinear neural network (bilinear network). Specifically, the feature extraction may be performed in the manner described in steps S11 to S13 in the above embodiment, and details are not repeated in this embodiment.
After feature extraction is performed on the input image through the bilinear neural network (bilinear network), the bilinear feature set (or segmented bilinear features) of the input image is obtained. As shown in fig. 3, after the bilinear feature set of the input image is obtained, the sub-vectors in the feature set can be mapped to sub-classifiers of the corresponding types through the segmented classification mapping functions $\mathcal{M}^{(t)}$ in the segmented classifier mapping model. As shown in fig. 3, the segmented mapping network comprises: a segmented bilinear feature layer, a hidden layer, and a segmented category classifier layer.
As can be seen from the above description, in the present embodiment, an image recognition method is proposed, which performs image recognition by using a fine-grained level image recognition deep network model PCM based on a small number of training samples, where PCM may also be referred to as a segmented classifier mapping model.
In this embodiment, the segmented classifier mapping model may also be referred to as a fine-grained image recognition model. The network can learn the learning paradigm of small-sample learning tasks through such tasks and then use it at test time to perform accurate image recognition on the image to be recognized, thereby solving the technical problem that conventional fine-grained image recognition depends too heavily on massive quantities of fine-grained images, and achieving the technical effect that fine-grained image recognition no longer depends on massive quantities of fine-grained images.
The embodiment of the present invention further provides an image recognition apparatus, which is mainly used for executing the image recognition method provided by the foregoing content of the embodiment of the present invention, and the image recognition apparatus provided by the embodiment of the present invention is specifically described below.
Fig. 4 is a schematic diagram of an image recognition apparatus according to an embodiment of the present invention, as shown in fig. 4, the image recognition apparatus mainly includes an acquisition unit 10, a feature extraction unit 20, and a mapping recognition unit 30, wherein:
the system comprises an acquisition unit 10, a recognition unit and a recognition unit, wherein the acquisition unit is used for acquiring a test sample, the test sample comprises a test example image and an image to be recognized, and the category of an object in the test example image comprises the category of the object in the image to be recognized;
a feature extraction unit 20, configured to perform feature extraction on the test sample image to obtain a feature set, where the feature set includes a plurality of sub-vectors, and the sub-vectors are feature vectors of components of an object in the sample image;
the mapping identification unit 30 is configured to map each sub-vector to a sub-classifier of a corresponding type through a segmented classifier mapping model, and determine a target classifier through the sub-classifier, so as to perform image identification on the image to be identified through the target classifier, where the segmented classifier mapping model is a model obtained after small sample learning.
In this embodiment, as noted above, the segmented classifier mapping model may also be referred to as a fine-grained image recognition model. Through small sample learning tasks, the network can learn a learning paradigm for such tasks and apply that paradigm at test time to recognize the image to be recognized accurately, thereby solving the technical problem that conventional fine-grained image recognition technology depends too heavily on massive fine-grained images and achieving the technical effect of making fine-grained image recognition independent of massive fine-grained images.
Optionally, the feature extraction unit includes: a first feature extraction module, configured to perform feature extraction on the test sample images of class A_i to obtain a feature set X_i, where class A_i is the i-th category among the plurality of categories, i takes 1 to k in sequence, and k is the number of categories of the test sample images. The mapping identification unit includes: a mapping module, configured to map each sub-vector in the feature set X_i to a sub-classifier of the corresponding type through the segmented classifier mapping model, and determine a target classifier F_i through the sub-classifiers.
Optionally, the mapping module is configured to: map each sub-vector in the feature set X_i to a sub-classifier of the corresponding type through the segmented classification mapping function in the segmented classifier mapping model to obtain a plurality of sub-classifiers; and cascade the plurality of sub-classifiers to obtain the target classifier F_i.
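The mapping module can be pictured with a short sketch. The following is a minimal, assumed implementation of the segmented classifier mapping M(·) of fig. 3 (piecewise linear feature layer, hidden layer, piecewise classifier layer): each sub-vector X_i^t is mapped to the weights of one sub-classifier F_i^t, and the sub-classifiers are cascaded (concatenated) into the target classifier F_i. All layer sizes are assumptions of the sketch.

```python
import torch
import torch.nn as nn

class SegmentedClassifierMapping(nn.Module):
    """Maps each sub-vector to one sub-classifier, then cascades them."""
    def __init__(self, dim=512, hidden=1024):
        super().__init__()
        self.mapping = nn.Sequential(
            nn.Linear(dim, hidden),   # piecewise linear feature layer
            nn.ReLU(),                # hidden layer non-linearity
            nn.Linear(hidden, dim),   # piecewise classifier layer
        )

    def forward(self, sub_vectors):
        # sub_vectors: (n_B, dim) -- the n_B sub-vectors of feature set X_i.
        sub_classifiers = self.mapping(sub_vectors)      # F_i^t = M(X_i^t)
        target_classifier = sub_classifiers.reshape(-1)  # cascade into F_i
        return target_classifier
```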
Optionally, the mapping module is further configured to: map the t-th sub-vector of the feature set X_i to a sub-classifier of the corresponding type by the formula

F_i^t = M(X_i^t)

where F_i^t represents the sub-classifier corresponding to the t-th sub-vector X_i^t of the feature set X_i, M(·) represents the segmented classification mapping function, t takes 1 to n_B in sequence, and n_B is the number of sub-vectors in the feature set X_i.
Optionally, the first feature extraction module is configured to: perform feature extraction on each test example image of class A_i in the test sample images to obtain a feature set {x_j}, j = 1, ..., n_e, where x_j represents the feature of the j-th test example image of class A_i, and n_e is the number of test example images of class A_i in the test sample images; and calculate the feature set X_i from the feature set {x_j} according to the formula

X_i = (1/n_e) Σ_{j=1}^{n_e} x_j.
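A one-function sketch of this averaging step, under the reconstruction above (X_i taken as the element-wise mean of the n_e example features), could look as follows:

```python
import torch

def class_feature_set(example_features: torch.Tensor) -> torch.Tensor:
    """X_i = (1/n_e) * sum_j x_j over the class's example features."""
    # example_features: (n_e, n_B, dim) -- one bilinear feature set per image.
    return example_features.mean(dim=0)
```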
Optionally, the feature extraction unit further includes: a second feature extraction module, configured to perform feature extraction on the test sample image through a bilinear neural network to obtain a bilinear feature set of the test sample image, where the bilinear feature set includes a plurality of sub-vectors.
Optionally, the second feature extraction module is configured to: extract a first bilinear feature set of the test example image through a first branch feature extraction network in the bilinear neural network; extract a second bilinear feature set of the test example image through a second branch feature extraction network in the bilinear neural network; and perform an outer product operation on the first bilinear feature set and the second bilinear feature set to obtain the bilinear feature set of the test example image.
Optionally, the mapping identification unit further includes: a feature extraction module, configured to perform feature extraction on the image to be recognized to obtain a target feature matrix, where the target feature matrix includes the feature information of the image to be recognized; and an identification module, configured to perform image recognition on the target feature matrix through the target classifier.
Optionally, the identification module is further configured to: when there are a plurality of target classifiers, perform image recognition on the target feature matrix through each target classifier to obtain a plurality of class confidences; and determine the category of the image to be recognized based on the category of the target classifier corresponding to a target class confidence among the plurality of class confidences, where the target class confidence is a confidence, among the plurality of class confidences, that is greater than a preset threshold.
Optionally, the identification module is further configured to: perform an inner product operation on the feature matrix of the target classifier and the target feature matrix, and take the operation result as the class confidence.
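For illustration, a hedged sketch of the recognition step just described: the class confidence is the inner product of each target classifier with the target feature matrix, and the predicted category is the one whose confidence exceeds the preset threshold. The dictionary layout and the threshold value of 0.5 are assumptions of the sketch.

```python
from typing import Dict, Optional
import torch

def classify(target_feature: torch.Tensor,
             target_classifiers: Dict[int, torch.Tensor],
             threshold: float = 0.5) -> Optional[int]:
    """Return the class whose confidence exceeds the preset threshold."""
    best_class, best_conf = None, threshold
    for class_id, classifier in target_classifiers.items():
        # Class confidence = inner product of the classifier's feature
        # matrix with the target feature matrix of the image.
        conf = torch.dot(classifier.flatten(), target_feature.flatten()).item()
        if conf > best_conf:
            best_class, best_conf = class_id, conf
    return best_class  # None when no confidence exceeds the threshold
```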
Optionally, the apparatus is further configured to: acquire a training sample set, where the training sample set includes training images and label information of the training images, the label information is used for representing the categories of the training images, and the training images include training example images and query set images; perform feature extraction on the training example images to obtain a feature set of the training example images; and perform small sample training on the segmented classification mapping function in the original segmented classifier mapping model through the feature set of the training example images and the label information of the training example images to obtain the trained original segmented classifier mapping model.
Optionally, the apparatus is further configured to: perform image recognition on the query set images through the trained original segmented classifier mapping model, and determine the value of a classification loss function based on the recognition result; and adjust the parameters of the original segmented classification mapping function based on the value of the classification loss function.
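The training procedure above can be summarized in a sketch of one small-sample (episodic) training step, assuming the extractor and mapper sketched earlier. `sample_episode` is a hypothetical helper that returns support (training example) and query images with labels 0..k-1, so that the label order matches the stacked classifiers; it is not part of the described embodiment.

```python
import torch
import torch.nn.functional as F

def train_episode(extractor, mapper, optimizer, sample_episode):
    support_images, support_labels, query_images, query_labels = sample_episode()
    # Build one target classifier per class from the support (example) images.
    classifiers = []
    for c in support_labels.unique():                           # sorted 0..k-1
        feats = extractor(support_images[support_labels == c])  # (n_e, C, C)
        x_i = feats.mean(dim=0)                                 # class feature set X_i
        classifiers.append(mapper(x_i))                         # target classifier F_i
    weights = torch.stack(classifiers)                          # (k, D)
    # Class confidences as inner products, then a standard classification loss.
    query_feats = extractor(query_images).flatten(1)            # (N, D)
    logits = query_feats @ weights.t()                          # (N, k)
    loss = F.cross_entropy(logits, query_labels)
    optimizer.zero_grad()
    loss.backward()   # gradients flow into the mapping function's parameters
    optimizer.step()
    return loss.item()
```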
In another embodiment, a computer-readable medium having non-volatile program code executable by a processor is also provided, where the program code causes the processor to perform the method of any of the above-described method embodiments.
In another embodiment, a computer program is also provided, which may be stored on a storage medium in the cloud or locally. When executed by a computer or a processor, the computer program performs the corresponding steps of the image recognition method of the embodiments of the present invention and implements the corresponding modules in the image recognition apparatus according to the embodiments of the present invention.
The apparatus provided by the embodiment of the present invention has the same implementation principle and technical effects as the foregoing method embodiments; for brevity, where the apparatus embodiments are not mentioned, reference may be made to the corresponding contents in the method embodiments.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Finally, it should be noted that the above-mentioned embodiments are only specific embodiments of the present invention, used to illustrate the technical solutions of the present invention rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person skilled in the art can still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features within the technical scope of the present disclosure; such modifications, changes or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present invention, and shall all be covered within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (15)

1. An image recognition method, comprising:
obtaining a test sample, wherein the test sample comprises a test example image and an image to be identified, and the category of an object in the test example image comprises the category of the object in the image to be identified;
extracting features of the test sample image to obtain a feature set, wherein the feature set comprises a plurality of sub-vectors, and the sub-vectors are feature vectors of parts of objects in the sample image;
mapping each sub-vector to a sub-classifier of a corresponding type through a segmented classifier mapping model, and determining a target classifier through the sub-classifiers, wherein the segmented classifier mapping model is a model obtained after small sample learning;
performing image recognition on the image to be recognized through the target classifier;
wherein the segmented classifier mapping model is a fine-grained level image recognition model.
2. The method of claim 1, wherein there is at least one category of objects in the test example images;
performing feature extraction on the test sample images to obtain a feature set comprises: performing feature extraction on the test sample images of class A_i to obtain a feature set X_i, wherein class A_i is the i-th category among the plurality of categories, i takes 1 to k in sequence, and k is the number of categories of the test sample images;
mapping each sub-vector to a sub-classifier of a corresponding type through the segmented classifier mapping model to obtain a target classifier comprises: mapping each sub-vector in the feature set X_i to a sub-classifier of the corresponding type through the segmented classifier mapping model, and determining a target classifier F_i through the sub-classifiers.
3. The method according to claim 2, wherein mapping each sub-vector in the feature set X_i to a sub-classifier of the corresponding type through the segmented classifier mapping model and determining a target classifier F_i through the sub-classifiers comprises:
mapping each sub-vector in the feature set X_i to a sub-classifier of the corresponding type through the segmented classification mapping function in the segmented classifier mapping model to obtain a plurality of sub-classifiers;
cascading the plurality of sub-classifiers to obtain the target classifier F_i.
4. The method of claim 3, wherein mapping each sub-vector in the feature set X_i to a sub-classifier of the corresponding type through the segmented classification mapping function in the segmented classifier mapping model comprises:
mapping the t-th sub-vector of the feature set X_i to a sub-classifier of the corresponding type by the formula

F_i^t = M(X_i^t)

wherein F_i^t represents the sub-classifier corresponding to the t-th sub-vector X_i^t of the feature set X_i, M(·) represents the segmented classification mapping function, t takes 1 to n_B in sequence, and n_B is the number of sub-vectors in the feature set X_i.
5. The method of claim 2, wherein performing feature extraction on the test sample images of class A_i to obtain the feature set X_i comprises:
performing feature extraction on each test example image of class A_i in the test sample images to obtain a feature set {x_j}, j = 1, ..., n_e, wherein x_j represents the feature of the j-th test example image of class A_i, and n_e is the number of test example images of class A_i in the test sample images;
calculating the feature set X_i from the feature set {x_j} according to the formula

X_i = (1/n_e) Σ_{j=1}^{n_e} x_j.
6. The method of any one of claims 1 to 5, wherein performing feature extraction on the test example image to obtain a feature set comprises:
performing feature extraction on the test example image through a bilinear neural network to obtain a bilinear feature set of the test example image, wherein the bilinear feature set comprises a plurality of sub-vectors.
7. The method of claim 6, wherein the feature extraction of the test example image through a bilinear neural network to obtain a bilinear feature set of the test example image comprises:
extracting a first bilinear feature set of the test example image through a first branch feature extraction network in the bilinear neural network;
extracting a second bilinear feature set of the test example image through a second branch feature extraction network in the bilinear neural network;
performing an outer product operation on the first bilinear feature set and the second bilinear feature set to obtain the bilinear feature set of the test example image.
8. The method of claim 1, wherein performing image recognition on the image to be recognized through the target classifier comprises:
extracting the features of the image to be recognized to obtain a target feature matrix, wherein the target feature matrix comprises feature information of the image to be recognized;
and carrying out image recognition on the target feature matrix through the target classifier.
9. The method according to claim 8, wherein there are a plurality of target classifiers, and performing image recognition on the target feature matrix of the image to be recognized through the target classifiers comprises:
performing image recognition on the target feature matrix through each target classifier to obtain a plurality of class confidences;
determining the category of the image to be recognized based on the category of the target classifier corresponding to a target class confidence among the plurality of class confidences, wherein the target class confidence is a confidence, among the plurality of class confidences, that is greater than a preset threshold.
10. The method of claim 9, wherein performing image recognition on the target feature matrix of the image to be recognized through each of the target classifiers to obtain a plurality of class confidences comprises:
performing an inner product operation on the feature matrix of the target classifier and the target feature matrix, and taking the operation result as the class confidence.
11. The method of claim 1, further comprising:
acquiring a training sample set; wherein the training sample set comprises training images and label information of the training images, the label information is used for representing the categories of the training images, and the training images comprise training example images and query set images;
extracting features of the training example images to obtain a feature set of the training example images;
and carrying out small sample training on a segmented classification mapping function in an original segmented classifier mapping model through the feature set of the training example images and the label information of the training example images to obtain the trained original segmented classifier mapping model.
12. The method of claim 11, further comprising:
carrying out image recognition on the query set images through the trained original segmented classifier mapping model, and determining the value of a classification loss function based on the recognition result;
and adjusting the parameters of the original segmented classification mapping function based on the value of the classification loss function.
13. An image recognition apparatus, comprising:
an acquisition unit, configured to acquire a test sample, wherein the test sample comprises a test example image and an image to be recognized, and the category of an object in the test example image comprises the category of the object in the image to be recognized;
a feature extraction unit, configured to perform feature extraction on the test sample image to obtain a feature set, wherein the feature set comprises a plurality of sub-vectors, and the sub-vectors are part feature vectors of an object in the sample image;
a mapping identification unit, configured to map each sub-vector to a sub-classifier of a corresponding type through a segmented classifier mapping model, determine a target classifier through the sub-classifiers, and perform image recognition on the image to be recognized through the target classifier, wherein the segmented classifier mapping model is a model obtained after small sample learning;
wherein the segmented classifier mapping model is a fine-grained level image recognition model.
14. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any of the preceding claims 1 to 12 when executing the computer program.
15. A computer-readable medium having non-volatile program code executable by a processor, the program code causing the processor to perform the method of any of the preceding claims 1 to 12.
CN201810443324.0A 2018-05-10 2018-05-10 Image identification method and device, electronic equipment and computer readable medium Active CN108681746B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810443324.0A CN108681746B (en) 2018-05-10 2018-05-10 Image identification method and device, electronic equipment and computer readable medium

Publications (2)

Publication Number Publication Date
CN108681746A CN108681746A (en) 2018-10-19
CN108681746B true CN108681746B (en) 2021-01-12

Family

ID=63805790

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522970B (en) * 2018-11-28 2021-05-04 南京旷云科技有限公司 Image classification method, device and system
CN111325225B (en) * 2018-12-13 2023-03-21 富泰华工业(深圳)有限公司 Image classification method, electronic device and storage medium
CN109740676B (en) * 2019-01-07 2022-11-22 电子科技大学 Object detection and migration method based on similar targets
CN111460880B (en) * 2019-02-28 2024-03-05 杭州芯影科技有限公司 Multimode biological feature fusion method and system
CN111797865A (en) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 Data processing method, data processing device, storage medium and electronic equipment
CN111061898A (en) * 2019-12-13 2020-04-24 Oppo(重庆)智能科技有限公司 Image processing method, image processing device, computer equipment and storage medium
CN111126396B (en) * 2019-12-25 2023-08-22 北京科技大学 Image recognition method, device, computer equipment and storage medium
CN111191587B (en) * 2019-12-30 2021-04-09 兰州交通大学 Pedestrian re-identification method and system
CN111242230A (en) * 2020-01-17 2020-06-05 腾讯科技(深圳)有限公司 Image processing method and image classification model training method based on artificial intelligence
CN111368893B (en) * 2020-02-27 2023-07-25 Oppo广东移动通信有限公司 Image recognition method, device, electronic equipment and storage medium
CN111310858B (en) * 2020-03-26 2023-06-30 北京百度网讯科技有限公司 Method and device for generating information
CN111783889B (en) * 2020-07-03 2022-03-01 北京字节跳动网络技术有限公司 Image recognition method and device, electronic equipment and computer readable medium
KR20220018467A (en) * 2020-08-01 2022-02-15 센스타임 인터내셔널 피티이. 리미티드. Target object recognition method, device and system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7451065B2 (en) * 2002-03-11 2008-11-11 International Business Machines Corporation Method for constructing segmentation-based predictive models
CN101373518A (en) * 2008-06-28 2009-02-25 合肥工业大学 Method for constructing prototype vector and reconstructing sequence parameter based on semantic information in image comprehension
CN105005794A (en) * 2015-07-21 2015-10-28 太原理工大学 Image pixel semantic annotation method with combination of multi-granularity context information
US20170287170A1 (en) * 2016-04-01 2017-10-05 California Institute Of Technology System and Method for Locating and Performing Fine Grained Classification from Multi-View Image Data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Bilinear CNN Models for Fine-grained Visual Recognition";Tsung-Yu Lin etc.;《2015 IEEE International Conference on Computer Vision》;20151231;第1449-1456页 *

Also Published As

Publication number Publication date
CN108681746A (en) 2018-10-19

Similar Documents

Publication Publication Date Title
CN108681746B (en) Image identification method and device, electronic equipment and computer readable medium
Albattah et al. A novel deep learning method for detection and classification of plant diseases
Tang et al. Deepchart: Combining deep convolutional networks and deep belief networks in chart classification
Quoc Bao et al. Plant species identification from leaf patterns using histogram of oriented gradients feature space and convolution neural networks
WO2017075939A1 (en) Method and device for recognizing image contents
WO2020164278A1 (en) Image processing method and device, electronic equipment and readable storage medium
CN108229347A Method and apparatus for the deep layer displacement of the plan gibbs structure sampling of people's identification
CN114549913B (en) Semantic segmentation method and device, computer equipment and storage medium
Kishorjit Singh et al. Image classification using SLIC superpixel and FAAGKFCM image segmentation
Babu Sam et al. Completely self-supervised crowd counting via distribution matching
Wu et al. A multi-level descriptor using ultra-deep feature for image retrieval
Najibi et al. Towards the success rate of one: Real-time unconstrained salient object detection
Nguyen Thanh et al. Depth learning with convolutional neural network for leaves classifier based on shape of leaf vein
Liu et al. Tread pattern image classification using convolutional neural network based on transfer learning
Jadhav et al. Comprehensive review on machine learning for plant disease identification and classification with image processing
Akusok et al. Image-based classification of websites
Juefei-Xu et al. DeepGender2: A generative approach toward occlusion and low-resolution robust facial gender classification via progressively trained attention shift convolutional neural networks (PTAS-CNN) and deep convolutional generative adversarial networks (DCGAN)
CN114155388B (en) Image recognition method and device, computer equipment and storage medium
Dalara et al. Entity Recognition in Indian Sculpture using CLAHE and machine learning
Masita et al. Refining the efficiency of R-CNN in Pedestrian Detection
Bajpai et al. Real Time Face Recognition with limited training data: Feature Transfer Learning integrating CNN and Sparse Approximation
Banerjee et al. Random Forest boosted CNN: An empirical technique for plant classification
Channayanamath et al. Dynamic hand gesture recognition using 3d-convolutional neural network
Jun et al. Two-view correspondence learning via complex information extraction
Quazi et al. Image Classification and Semantic Segmentation with Deep Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant