CN111652262A - Image object recognition method and device, computer equipment and storage medium

Info

Publication number
CN111652262A
Authority
CN
China
Prior art keywords
image
dimensional vectors
dimensional
convolutional
vectors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010198070.8A
Other languages
Chinese (zh)
Inventor
王国彬
胡鹏
侯兴兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Bincent Technology Co Ltd
Original Assignee
Shenzhen Bincent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Bincent Technology Co Ltd filed Critical Shenzhen Bincent Technology Co Ltd
Priority to CN202010198070.8A priority Critical patent/CN111652262A/en
Publication of CN111652262A publication Critical patent/CN111652262A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an image object recognition method, an image object recognition apparatus, a computer device, and a storage medium. The method comprises: building a convolutional neural network and inputting an image to be recognized into it to obtain m convolutional feature maps; compressing each convolutional feature map into an n-dimensional vector to obtain m n-dimensional vectors, flattening the m n-dimensional vectors into a one-dimensional vector, and learning through a fully connected layer to obtain an image representation; and classifying the image representation and outputting the classification probability of the image to be recognized. The method learns end to end, so features are extracted more conveniently than by manual extraction. It uses transfer learning, which reduces training cost. By computing correlation features of the feature maps and learning them with a shallow network, it obtains better combined features and improves object recognition performance.

Description

Image object recognition method and device, computer equipment and storage medium
Technical Field
The present invention relates to the field of image recognition technologies, and in particular, to an image object recognition method, an image object recognition apparatus, a computer device, and a storage medium.
Background
In the prior art, image objects are recognized either by extracting features manually or by extracting image features with a conventional CNN classification model. However, manually extracting image texture features as the style representation of an image and then building a mathematical or statistical model is too inefficient to be widely applied. The high-level features extracted by a CNN classification model, in turn, often mix abstract attributes such as color, texture, and style, and this high degree of abstraction loses many details, so image object recognition performance is low.
Disclosure of Invention
The invention aims to provide an image object recognition method, an image object recognition apparatus, a computer device, and a storage medium that improve the performance of image object recognition.
The invention is realized as follows. A first aspect of the invention provides an image object recognition method, comprising:
building a convolutional neural network, and inputting an image to be recognized into the convolutional neural network to obtain m convolutional feature maps, where m is greater than 1;
compressing each convolutional feature map into an n-dimensional vector to obtain m n-dimensional vectors, flattening the m n-dimensional vectors into a one-dimensional vector, and learning through a fully connected layer to obtain an image representation;
and classifying the image representation and outputting the classification probability of the image to be recognized.
A second aspect of the invention provides an image object recognition apparatus, comprising:
a feature map acquisition module, configured to build a convolutional neural network and input an image to be recognized into the convolutional neural network to obtain m convolutional feature maps, where m is greater than 1;
an image representation output module, configured to compress each convolutional feature map into an n-dimensional vector to obtain m n-dimensional vectors, flatten the m n-dimensional vectors into a one-dimensional vector, and learn through a fully connected layer to obtain an image representation;
and a classification module, configured to classify the image representation and output the classification probability of the image to be recognized.
A third aspect of the invention provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method according to the first aspect of the invention when executing the computer program.
A fourth aspect of the invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method according to the first aspect of the invention.
The invention provides an image object recognition method, an image object recognition apparatus, a computer device, and a storage medium. The method comprises: building a convolutional neural network and inputting an image to be recognized into it to obtain m convolutional feature maps; compressing each convolutional feature map into an n-dimensional vector to obtain m n-dimensional vectors, flattening the m n-dimensional vectors into a one-dimensional vector, and learning through a fully connected layer to obtain an image representation; and classifying the image representation and outputting the classification probability of the image to be recognized. The method learns end to end, so features are extracted more conveniently than by manual extraction. It uses transfer learning, which reduces training cost. By computing correlation features of the feature maps and learning them with a shallow network, it obtains better combined features and improves object recognition performance.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of an image object recognition method according to embodiment 1 of the present invention;
Fig. 2 is a flowchart of an implementation of step S11 in the image object recognition method provided in embodiment 1 of the present invention;
Fig. 3 is a schematic diagram of an implementation of step S12 in the image object recognition method according to embodiment 1 of the present invention;
Fig. 4 is a schematic diagram of the working process of the image object recognition method according to embodiment 1 of the present invention;
Fig. 5 is a schematic structural diagram of an image object recognition apparatus according to embodiment 2 of the present invention;
Fig. 6 is a schematic structural diagram of a computer device provided in embodiment 4 of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
Example 1
Embodiment 1 of the present invention provides an image object recognition method. As shown in Fig. 1, the image object recognition method includes:
S11: building a convolutional neural network, and inputting the image to be recognized into the convolutional neural network to obtain m convolutional feature maps, where m is greater than 1.
In step S11, building the convolutional neural network means building a VGG16 model with 5 convolutional layers; the image to be recognized is input into the VGG16 model to obtain the m convolutional feature maps.
Inputting the image to be recognized into the VGG16 model to obtain the m convolutional feature maps comprises:
inputting the image to be recognized into the VGG16 model to obtain 512 convolutional feature maps of size 7 × 7.
As shown in Fig. 2, a 224 × 224 × 3 picture is input. After two convolutions with 64 convolution kernels and one pooling, a 112 × 112 × 64 output is obtained; after two convolutions with 128 convolution kernels and one pooling, a 56 × 56 × 128 output is obtained; after three convolutions with 256 convolution kernels and one pooling, a 28 × 28 × 256 output is obtained; after three convolutions with 512 convolution kernels and one pooling, a 14 × 14 × 512 output is obtained; and after three further convolutions with 512 convolution kernels and one pooling, a 7 × 7 × 512 output is obtained, that is, 512 convolutional feature maps of size 7 × 7.
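The size bookkeeping in the preceding paragraph can be checked with a short sketch. It is only an illustration under stated assumptions, not the patented implementation: it assumes that each convolution preserves height and width and that each pooling halves them, and the convolution counts for the last two stages follow the standard VGG16 layout. The names `STAGES` and `trace_shapes` are hypothetical.

```python
# Sketch of the spatial-size bookkeeping for the five VGG16 stages.
# Assumptions (not stated in the text): every convolution keeps height and
# width unchanged, and every pooling step halves them.

# (number of convolutions, number of kernels) per stage; the counts for the
# last two stages follow the standard VGG16 layout.
STAGES = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def trace_shapes(height, width, channels):
    """Return the (H, W, C) shape after each convolution + pooling stage."""
    shapes = []
    for _num_convs, kernels in STAGES:
        # The convolutions of a stage only change the channel count.
        channels = kernels
        # The single pooling step of a stage halves height and width.
        height, width = height // 2, width // 2
        shapes.append((height, width, channels))
    return shapes


shapes = trace_shapes(224, 224, 3)
for stage, (h, w, c) in enumerate(shapes, start=1):
    print(f"stage {stage}: {h} x {w} x {c}")
```

The last entry is the 7 × 7 × 512 output that step S12 consumes.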
S12: compressing each convolutional feature map into an n-dimensional vector to obtain m n-dimensional vectors, flattening the m n-dimensional vectors into a one-dimensional vector, and learning through a fully connected layer to obtain an image representation.
Specifically, each convolutional feature map is compressed into a 49-dimensional vector, giving 512 49-dimensional vectors; these are flattened into a 1 × 25088 vector (512 × 49 = 25088), which is learned through a fully connected network to obtain the image representation.
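A minimal sketch of this compression and flattening step follows, using plain Python lists. The row-major ordering inside each map and the helper names `compress` and `flatten` are assumptions made for illustration; the text does not specify the ordering.

```python
# Sketch of step S12's reshaping: 512 maps of 7 x 7 -> 512 vectors of
# length 49 -> one vector of length 25088.
# Assumed layout: feature_maps[k][i][j] is row i, column j of the k-th map.
M, SIZE = 512, 7


def compress(feature_map):
    """Compress one 7 x 7 map into a 49-dimensional vector (row-major order)."""
    return [value for row in feature_map for value in row]


def flatten(vectors):
    """Concatenate the m n-dimensional vectors into one 1-D vector."""
    return [value for vector in vectors for value in vector]


# Dummy data: every entry of map k holds the value k, so positions are easy to check.
feature_maps = [[[float(k)] * SIZE for _ in range(SIZE)] for k in range(M)]

vectors = [compress(fm) for fm in feature_maps]
flat = flatten(vectors)
print(len(vectors), len(vectors[0]), len(flat))
```

With this ordering, entries 0 to 48 of the flattened vector come from the first feature map, entries 49 to 97 from the second, and so on.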
Specifically, as shown in Fig. 4, the fully connected network includes a fully connected (dense) layer, a batch normalization layer, a Dropout layer, and a softmax layer.
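The layer sequence named above can be sketched in plain Python at inference time. This is an illustration only: the text gives no layer sizes, dropout rate, or trained parameters, so the toy dimensions, random weights, identity batch normalization statistics, and the choice to let the dense layer map directly to the number of classes are all assumptions.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one weighted sum per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def batch_norm(x, mean, var, gamma, beta, eps=1e-3):
    """Batch normalization at inference, using stored statistics."""
    return [g * (v - m) / math.sqrt(s + eps) + b
            for v, m, s, g, b in zip(x, mean, var, gamma, beta)]


def dropout(x, rate, training, rng):
    """Dropout: random masking during training, identity at inference."""
    if not training:
        return list(x)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def softmax(x):
    """Numerically stable softmax producing class probabilities."""
    peak = max(x)
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
IN_DIM, NUM_CLASSES = 16, 3  # toy sizes; the text uses a 25088-dimensional input
x = [rng.uniform(-1, 1) for _ in range(IN_DIM)]
weights = [[rng.uniform(-0.1, 0.1) for _ in range(IN_DIM)] for _ in range(NUM_CLASSES)]
bias = [0.0] * NUM_CLASSES

h = dense(x, weights, bias)
h = batch_norm(h, mean=[0.0] * NUM_CLASSES, var=[1.0] * NUM_CLASSES,
               gamma=[1.0] * NUM_CLASSES, beta=[0.0] * NUM_CLASSES)
h = dropout(h, rate=0.5, training=False, rng=rng)
probs = softmax(h)
print(probs)
```

The output is a probability vector with one entry per class, which is the form step S13 expects.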
S13: classifying the image representation and outputting the classification probability of the image to be recognized.
The combined features are classified by a classifier, and the label corresponding to the maximum probability value is taken as the output label of the image to be recognized.
Specifically, as shown in Figs. 3 and 4, the working process of this embodiment is as follows:
First, a picture is input and passed through the VGG16 model to obtain the output of "block5_conv1", namely 512 feature maps of size 7 × 7.
Second, each feature map is compressed into a 49-dimensional vector to obtain 512 vectors, which are concatenated into a 1 × 25088 vector; this vector then passes through a fully connected (dense) layer, a batch normalization (BatchNormalization) layer, a Dropout layer, and a softmax layer.
Third, the object label of the picture is output using the mapping between features and objects learned by the classifier. For example, if the probability of tatami is 80% and the probability of every other class is less than 10%, the output indicates that the picture contains a tatami.
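The label selection in the third step amounts to taking the class with the largest probability. A minimal sketch, assuming the classifier output is available as a mapping from label to probability; every class name other than "tatami" and the exact values are hypothetical, chosen only to match the example above.

```python
# Hypothetical classifier output: tatami at 80%, every other class below 10%.
probabilities = {"tatami": 0.80, "sofa": 0.08, "bed": 0.07, "wardrobe": 0.05}


def predict_label(probabilities):
    """Return the label whose probability is largest."""
    return max(probabilities, key=probabilities.get)


label = predict_label(probabilities)
print(label, probabilities[label])
```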
This embodiment of the invention provides an image object recognition method, comprising: building a convolutional neural network and inputting an image to be recognized into it to obtain m convolutional feature maps; compressing each convolutional feature map into an n-dimensional vector to obtain m n-dimensional vectors, flattening the m n-dimensional vectors into a one-dimensional vector, and learning through a fully connected layer to obtain an image representation; and classifying the image representation and outputting the classification probability of the image to be recognized. The method learns end to end, so features are extracted more conveniently than by manual extraction. It uses transfer learning, which reduces training cost. By computing correlation features of the feature maps and learning them with a shallow network, it obtains better combined features and improves object recognition performance.
Example 2
Embodiment 2 of the present invention provides an image object recognition apparatus. As shown in Fig. 5, the image object recognition apparatus includes:
a feature map acquisition module 200, configured to build a convolutional neural network and input an image to be recognized into the convolutional neural network to obtain m convolutional feature maps, where m is greater than 1;
an image representation output module 201, configured to compress each convolutional feature map into an n-dimensional vector to obtain m n-dimensional vectors, flatten the m n-dimensional vectors into a one-dimensional vector, and learn through a fully connected layer to obtain an image representation;
and a classification module 202, configured to classify the image representation and output the classification probability of the image to be recognized.
Further, the feature map acquisition module 200 is specifically configured to build a VGG16 model with 5 convolutional layers and input the image to be recognized into the VGG16 model to obtain m convolutional feature maps.
Further, the image representation output module 201 is specifically configured to compress each convolutional feature map into a 49-dimensional vector to obtain 512 49-dimensional vectors, flatten the 512 49-dimensional vectors into a 1 × 25088 vector, and pass the 1 × 25088 vector through the fully connected layer to output the combined features.
The classification module 202 is configured to classify the image representation with a classifier and take the label corresponding to the maximum probability value as the output label of the image to be recognized.
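One way to picture how the three modules fit together is as three callables composed in sequence. The sketch below is a structural illustration only: the class name `ImageObjectRecognizer`, its field names, and the stand-in lambdas are hypothetical and do not come from the source, and the stand-ins return fixed toy values rather than running a real network.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

FeatureMaps = List[List[List[float]]]


@dataclass
class ImageObjectRecognizer:
    # Stand-ins for modules 200, 201 and 202 of the apparatus.
    acquire_feature_maps: Callable[[object], FeatureMaps]
    output_representation: Callable[[FeatureMaps], List[float]]
    classify: Callable[[List[float]], List[float]]
    labels: Sequence[str]

    def recognize(self, image):
        """Run the three modules in order and pick the most probable label."""
        maps = self.acquire_feature_maps(image)
        representation = self.output_representation(maps)
        probs = self.classify(representation)
        best = max(range(len(probs)), key=probs.__getitem__)
        return self.labels[best], probs


# Toy stand-ins with fixed outputs, used only to show the data flow.
recognizer = ImageObjectRecognizer(
    acquire_feature_maps=lambda image: [[[1.0, 2.0], [3.0, 4.0]],
                                        [[0.0, 0.0], [0.0, 1.0]]],
    output_representation=lambda maps: [v for fm in maps for row in fm for v in row],
    classify=lambda representation: [0.2, 0.8],
    labels=["other", "tatami"],
)

label, probs = recognizer.recognize(image=None)
print(label, probs)
```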
For the specific working process of each module, refer to the corresponding process in the foregoing method embodiment; it is not repeated here.
Example 3
Embodiment 3 of the present invention provides a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, the method in the foregoing embodiments is implemented; the details are not repeated here. Alternatively, when executed by the processor, the computer program implements the functions of the modules/units in the foregoing embodiments, which are likewise not repeated here.
Example 4
Fig. 6 is a schematic diagram of a computer device in embodiment 4 of the present invention. As shown in Fig. 6, the computer device 6 comprises a processor 63, a memory 61, and a computer program 62 stored in the memory 61 and executable on the processor 63. When executing the computer program 62, the processor 63 implements the steps in the above embodiments, such as steps S11, S12, and S13 shown in Fig. 1. Alternatively, the functions of the modules/units in the above embodiments are implemented when the processor 63 executes the computer program 62.
Illustratively, the computer program 62 may be divided into one or more modules/units, which are stored in the memory 61 and executed by the processor 63 to perform the data processing procedures of the present invention. The one or more modules/units may be a series of computer program segments capable of performing specific functions, and these segments describe the execution of the computer program 62 in the computer device 6. For example, the computer program 62 may be divided into modules as shown in Fig. 6, and the specific functions of each module correspond one-to-one to the steps of the method in embodiment 1; they are not repeated here.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (10)

1. An image object recognition method, characterized by comprising:
building a convolutional neural network, and inputting an image to be recognized into the convolutional neural network to obtain m convolutional feature maps, where m is greater than 1;
compressing each convolutional feature map into an n-dimensional vector to obtain m n-dimensional vectors, flattening the m n-dimensional vectors into a one-dimensional vector, and learning through a fully connected layer to obtain an image representation;
and classifying the image representation and outputting the classification probability of the image to be recognized.
2. The image object recognition method of claim 1, wherein building the convolutional neural network and inputting the image to be recognized into the convolutional neural network to obtain m convolutional feature maps comprises:
building a VGG16 model with 5 convolutional layers, and inputting the image to be recognized into the VGG16 model to obtain m convolutional feature maps.
3. The image object recognition method of claim 2, wherein inputting the image to be recognized into the VGG16 model to obtain m convolutional feature maps comprises:
inputting the image to be recognized into the VGG16 model to obtain 512 convolutional feature maps of size 7 × 7.
4. The image object recognition method of claim 3, wherein compressing each convolutional feature map into an n-dimensional vector to obtain m n-dimensional vectors, flattening the m n-dimensional vectors into a one-dimensional vector, and learning through a fully connected layer to obtain the image representation comprises:
compressing each convolutional feature map into a 49-dimensional vector to obtain 512 49-dimensional vectors, flattening the 512 49-dimensional vectors into a 1 × 25088 vector, and learning the 1 × 25088 vector through a fully connected layer to obtain the image representation.
5. The image object recognition method of claim 1, wherein classifying the image representation and outputting the classification probability of the image to be recognized comprises:
classifying the image representation with a classifier, and taking the label corresponding to the maximum probability value as the output label of the image to be recognized.
6. An image object recognition apparatus, characterized by comprising:
a feature map acquisition module, configured to build a convolutional neural network and input an image to be recognized into the convolutional neural network to obtain m convolutional feature maps, where m is greater than 1;
an image representation output module, configured to compress each convolutional feature map into an n-dimensional vector to obtain m n-dimensional vectors, flatten the m n-dimensional vectors into a one-dimensional vector, and learn through a fully connected layer to obtain an image representation;
and a classification module, configured to classify the image representation and output the classification probability of the image to be recognized.
7. The image object recognition apparatus of claim 6, wherein the feature map acquisition module is specifically configured to build a VGG16 model with 5 convolutional layers and input the image to be recognized into the VGG16 model to obtain m convolutional feature maps.
8. The image object recognition apparatus of claim 6, wherein the image representation output module is specifically configured to compress each convolutional feature map into a 49-dimensional vector to obtain 512 49-dimensional vectors, flatten the 512 49-dimensional vectors into a 1 × 25088 vector, and learn the 1 × 25088 vector through a fully connected layer to obtain the image representation.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 5 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010198070.8A CN111652262A (en) 2020-03-19 2020-03-19 Image object recognition method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010198070.8A CN111652262A (en) 2020-03-19 2020-03-19 Image object recognition method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111652262A true CN111652262A (en) 2020-09-11

Family

ID=72343900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010198070.8A Pending CN111652262A (en) 2020-03-19 2020-03-19 Image object recognition method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111652262A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109426858A (en) * 2017-08-29 2019-03-05 京东方科技集团股份有限公司 Neural network, training method, image processing method and image processing apparatus
WO2019223154A1 (en) * 2018-05-25 2019-11-28 平安科技(深圳)有限公司 Single-page high-load image recognition method, device, computer apparatus, and storage medium
CN110188776A (en) * 2019-05-30 2019-08-30 京东方科技集团股份有限公司 Image processing method and device, the training method of neural network, storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张朝阳: "深入浅出 工业机器学习算法详解与实战", 机械工业出版社, pages: 200 - 201 *

Similar Documents

Publication Publication Date Title
CN110532897B (en) Method and device for recognizing image of part
CN112906720B (en) Multi-label image identification method based on graph attention network
CN110826596A (en) Semantic segmentation method based on multi-scale deformable convolution
CN113326930B (en) Data processing method, neural network training method, related device and equipment
CN110222718B (en) Image processing method and device
CN113780412B (en) Fault diagnosis model training method and system and fault diagnosis method and system
CN111126481A (en) Training method and device of neural network model
CN112132145B (en) Image classification method and system based on model extended convolutional neural network
CN111523561A (en) Image style recognition method and device, computer equipment and storage medium
CN112926472A (en) Video classification method, device and equipment
CN114445651A (en) Training set construction method and device of semantic segmentation model and electronic equipment
CN112614110A (en) Method and device for evaluating image quality and terminal equipment
CN114612681A (en) GCN-based multi-label image classification method, model construction method and device
Hussain et al. Image denoising to enhance character recognition using deep learning
CN114049491A (en) Fingerprint segmentation model training method, fingerprint segmentation device, fingerprint segmentation equipment and fingerprint segmentation medium
CN117373064A (en) Human body posture estimation method based on self-adaptive cross-dimension weighting, computer equipment and storage medium
Yao A compressed deep convolutional neural networks for face recognition
CN116796248A (en) Forest health environment assessment system and method thereof
CN111652262A (en) Image object recognition method and device, computer equipment and storage medium
CN114677545B (en) Lightweight image classification method based on similarity pruning and efficient module
CN113011506A (en) Texture image classification method based on depth re-fractal spectrum network
CN113688783A (en) Face feature extraction method, low-resolution face recognition method and device
CN113139577A (en) Deep learning image classification method and system based on deformable convolution network
CN114077885A (en) Model compression method and device based on tensor decomposition and server
Kanabarkar et al. Performance Analysis of Convolutional Neural Network for Image Classification

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination