CN108596195B - Scene recognition method based on sparse coding feature extraction - Google Patents

Scene recognition method based on sparse coding feature extraction

Info

Publication number
CN108596195B
CN108596195B (application CN201810435125.5A)
Authority
CN
China
Prior art keywords
sample image
image set
expression vector
feature
scene
Prior art date
Legal status: Active
Application number
CN201810435125.5A
Other languages
Chinese (zh)
Other versions
CN108596195A (en
Inventor
曾伟波
苏江文
郑耀松
吕君玉
林吓强
陈铠
Current Assignee
State Grid Information and Telecommunication Co Ltd
Fujian Yirong Information Technology Co Ltd
Original Assignee
State Grid Information and Telecommunication Co Ltd
Fujian Yirong Information Technology Co Ltd
Priority date: 2018-05-09
Filing date: 2018-05-09
Publication date: 2022-08-19
Application filed by State Grid Information and Telecommunication Co Ltd and Fujian Yirong Information Technology Co Ltd
Priority to CN201810435125.5A
Publication of CN108596195A
Application granted
Publication of CN108596195B
Status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/513 Sparse representations

Abstract

The invention relates to the technical field of image recognition, in particular to a scene recognition method based on sparse coding feature extraction. The method comprises the following steps: preprocessing a sample image set acquired in advance for training; extracting feature expression vectors of the sample image set; feeding the feature expression vectors and their corresponding class labels into a linear classifier to construct a linear scene classifier; preprocessing a sample image set to be identified; extracting feature expression vectors of the sample image set to be identified; and feeding the feature expression vectors of the sample image set to be identified into the linear scene classifier for identification, obtaining the class labels of the scene classes to which the images belong. By adopting sparse coding, the dimensionality of the image can be reduced while its main information is retained, and the representation is strongly robust to noise and occlusion.

Description

Scene recognition method based on sparse coding feature extraction
Technical Field
The invention relates to the technical field of image recognition, in particular to a scene recognition method based on sparse coding feature extraction.
Background
Scene recognition refers to identifying the scene in a picture from content shared by scene images, such as common color features; its aim is to automatically determine the scene to which an image belongs by mining the scene features in the image, simulating human perceptual ability. In scene recognition, the entire image is judged as a whole and no specific object is involved, because a specific object can serve only as one cue for the scene category and is not necessarily fully correlated with it. Scene recognition is a fundamental preprocessing step in computer vision and robotics, and plays an important role in computer intelligence fields such as image content retrieval, pattern recognition and machine learning.
In recent years, scene recognition research has made great progress, and many methods for modeling scene categories have emerged. Existing scene recognition methods fall into four categories according to how the scene categories are modeled:
(1) scene recognition method based on global features
Scene recognition methods based on global features mostly describe a scene through global visual features of the image, such as color, texture and shape, and have been applied successfully to outdoor scene recognition. Color features give good recognition results under changes of scene scale, viewing angle and image rotation, while texture and shape features correspond to the structural and directional information of the image, to which the human visual system is also very sensitive, so they agree well with human visual perception. However, global-feature methods generally need to traverse all pixels of the image and do not consider the spatial relationships among pixels, so their real-time performance and generality are poor.
(2) Scene recognition method based on target
Based on the principle that a specific place can be located accurately through a series of highly representative objects around it, most target-based scene recognition methods identify the scene of an image from the recognition results for the objects in it. Such methods therefore go through stages of image segmentation, multi-feature combination and object recognition. When the object to be recognized is far from the viewpoint, it is likely to be hidden in background information that lacks analytical value and be discarded in the segmentation stage, so object recognition fails. In addition, to simplify a specific scene, a group of objects that can represent the scene must be selected, and choosing these reliable and stable representative objects becomes another bottleneck of target-based scene recognition.
(3) Region-based scene recognition method
In view of the limitations of target-based scene recognition, some researchers use segmented regions instead of representative objects and combine features according to the structural relationships of the regions to form a scene signature. The key to such methods is obtaining a reliable region segmentation algorithm. The region information can be characterized in many ways, for example: by combining local and global cues, i.e., extracting global statistical features within each region; by extracting local invariant features within the regions; or according to a bag-of-words model.
(4) Scene recognition method based on bionic features
Given the need for real-time, efficient scene recognition, a considerable gap remains between the best current computer vision systems and the visual systems of humans and other animals. Inspired by the superior scene recognition ability of humans and animals, scene recognition methods based on bionic features have emerged, realizing scene recognition by simulating the processing mechanisms of the biological visual cortex. The basic idea is to study a particular biological visual mechanism or class of biological visual characteristics and, through careful analysis, establish an effective computational model that yields satisfactory results. For example, methods based on the human visual attention selection mechanism treat image regions that readily attract human attention as priority processing objects; this selectivity can greatly improve the efficiency with which visual information is processed, analyzed and recognized.
Scene recognition also faces various inherent difficulties: scenes change dynamically, pictures of the same scene vary widely, images of different classes may share many similarities, images of different scenes may overlap, and classification performance depends on the accuracy of the class labels of the training images. All of these lower the accuracy of scene classification and recognition.
Disclosure of Invention
Therefore, a scene recognition method based on sparse coding feature extraction needs to be provided to solve the problem of low accuracy in scene classification and recognition.
To achieve the above object, the inventor provides a scene recognition method based on sparse coding feature extraction; the specific technical scheme is as follows:
A scene recognition method based on sparse coding feature extraction comprises the following steps: preprocessing a sample image set acquired in advance for training; extracting the feature expression vectors of the preprocessed sample image set; feeding the feature expression vectors of the sample image set, together with their corresponding class labels, into a linear classifier, learning the parameters of the linear classifier to obtain its optimal parameters, and constructing a linear scene classifier from those optimal parameters; preprocessing a sample image set to be identified; extracting the feature expression vectors of the preprocessed sample image set to be identified; and feeding those feature expression vectors into the linear scene classifier for identification, obtaining the class labels of the scene classes to which the sample images belong.
Further, the preprocessing operation includes: image contrast normalization and Gamma correction.
Further, the extracting the feature expression vector of the preprocessed sample image set includes: extracting the bottom-layer features of the preprocessed sample image set with a multi-scale SIFT feature fusion method, namely taking neighborhoods of several scales around each pixel point and extracting SIFT keypoints of the image within each neighborhood, solving the sparse expression of the SIFT keypoints, and forming the feature expression vectors of the preprocessed sample image set with a spatial pyramid strategy and max-pooling.
Further, the step of "solving the sparse expression of the SIFT keypoints" includes: solving the sparse expression of the SIFT keypoints with locality-constrained linear coding.
Further, the step "forming a feature expression vector of the sample image set after the preprocessing operation by using the spatial pyramid strategy and max-posing" includes: dividing the image into local areas of 1 × 1, 1 × 4 and 4 × 1, forming feature expressions of the local areas by adopting histograms of max-pooling statistical coding features in the local areas, and connecting the feature expressions of all the areas to form a feature expression vector of the sample image set after the preprocessing operation.
Further, the step of "learning parameters of the linear classifier to obtain optimal parameters of the linear classifier" includes: and calculating by adopting a least square method to obtain weight parameters of the linear classifier, and obtaining optimal parameters of the linear classifier by adopting a cross verification method.
Further, the "image contrast normalization" includes the steps of: converting the image from the RGB color space to the YUV color space, and carrying out global and local contrast normalization processing on the YUV color space; the global normalization is to normalize the pixel value of the image to be near the mean value of the pixel of the image, and the local normalization is to strengthen the edge.
Further, the extracting the feature expression vector of the preprocessed sample image set to be identified includes: extracting the bottom-layer features of the preprocessed sample image set to be identified with the same multi-scale SIFT feature fusion method, namely taking neighborhoods of several scales around each pixel point and extracting SIFT keypoints within each neighborhood, solving the sparse expression of the SIFT keypoints, and forming the feature expression vectors of the preprocessed image set with the spatial pyramid strategy and max-pooling.
The invention has the beneficial effects that:
1. The method performs scene recognition based on global features: the whole scene image is judged as a whole, without involving any specific target. When extracting the bottom-layer features of the sample image set, multi-scale SIFT feature fusion increases the number of SIFT keypoints and also captures more local detail of the image.
2. Sparse coding reduces the dimensionality of the image while retaining its main information, and is strongly robust to noise and occlusion. Combining the bottom-layer sparse coding feature expression with max-pooling reduces the complexity of the upper-layer classifier model and speeds up classifier training. Moreover, sparse coding is a nonlinear feature mapping, and adopting it can effectively improve subsequent classification performance.
3. The preprocessing operation combines contrast normalization with Gamma correction, which significantly reduces the influence of local shadows and illumination changes in the image.
4. The sparse coding adopts locality-constrained linear coding, which yields the sparse expression of a signal analytically, i.e., in closed form, without iterative solving, improving the efficiency of sparse coding.
5. Using a linear classifier as the scene classifier reduces model complexity, speeds up classifier training, and reduces the possibility of overfitting.
Drawings
Fig. 1 is a flowchart of a scene recognition method based on sparse coding feature extraction according to an embodiment;
FIG. 2 is a schematic diagram illustrating the extraction of feature expression vectors of the sample image set after the preprocessing operation according to one embodiment;
FIG. 3 is a diagram illustrating solving sparse expressions of SIFT key points using sparse coding techniques according to an embodiment;
FIG. 4 is a diagram illustrating a process of a sparse representation calculation method according to an embodiment;
fig. 5 is a schematic diagram of dividing an image into a plurality of local regions by using a spatial pyramid strategy according to an embodiment.
Detailed Description
To explain technical contents, structural features, and objects and effects of the technical solutions in detail, the following detailed description is given with reference to the accompanying drawings in conjunction with the embodiments.
First, some explanations will be made on terms related to this embodiment:
SIFT: scale-invariant feature transform (SIFT), is a description used in the field of image processing. The description has scale invariance, can detect key points in the image and is a local feature descriptor.
Sparse Coding: an artificial neural network method simulating the simple-cell receptive fields of area V1, the primary visual cortex of the mammalian visual system. It possesses spatial locality, orientation selectivity and frequency-domain band-pass properties, and is an adaptive method for image statistics.
Referring to fig. 1, in this embodiment the sample image set for training satisfies at least the following conditions: 1. the training sample images of each scene class should contain as many different modalities as possible; 2. the training sample image sets of different scene classes should be kept as balanced as possible. The purpose is to learn the parameters of the linear scene classifier better, which improves scene classification accuracy.
Step S101: perform a preprocessing operation on the pre-acquired sample image set for training. The following may be used: the preprocessing operation comprises image contrast normalization and Gamma correction. "Image contrast normalization" comprises converting the image from the RGB color space to the YUV color space and carrying out global and local contrast normalization in the YUV color space; global normalization normalizes the pixel values to lie near the image's mean pixel value, and local normalization strengthens edges. Combining contrast normalization with Gamma correction markedly reduces the influence of local shadows and illumination changes in the image.
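A minimal sketch of how this preprocessing could look, assuming OpenCV and NumPy; the gamma value, local window size, epsilon and the order of the two normalizations are illustrative choices, not values specified by this embodiment:

```python
import cv2
import numpy as np

def preprocess(img_bgr, gamma=0.5, local_ksize=9, eps=1e-6):
    # Convert to the YUV colour space (OpenCV loads images as BGR).
    yuv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2YUV).astype(np.float32)
    y = yuv[:, :, 0]

    # Global contrast normalization: centre pixel values around the image mean.
    y = (y - y.mean()) / (y.std() + eps)

    # Local contrast normalization: subtract a local mean and divide by a
    # local standard deviation, which strengthens edges.
    mu = cv2.blur(y, (local_ksize, local_ksize))
    sigma = np.sqrt(cv2.blur((y - mu) ** 2, (local_ksize, local_ksize)))
    y = (y - mu) / (sigma + eps)

    # Rescale to [0, 1] and apply Gamma correction to suppress local
    # shadows and illumination changes.
    y = (y - y.min()) / (y.max() - y.min() + eps)
    return np.power(y, gamma)
```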
Step S102: extract the feature expression vectors of the preprocessed sample image set. The following may be used: extract the bottom-layer features of the preprocessed sample image set with a multi-scale SIFT feature fusion method, with scale factors (4, 6, 8, 9, 10); that is, neighborhoods of several scales, such as 4 × 4 and 6 × 6, are taken around each pixel point of the sample image, and SIFT keypoints are extracted within each neighborhood. Using multi-scale SIFT feature fusion when extracting the bottom-layer features increases the number of SIFT keypoints, captures more image information, and adds local detail.
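A sketch of this multi-scale extraction under those scale factors, assuming OpenCV's SIFT implementation (cv2.SIFT_create, available in OpenCV 4.4 and later); the dense sampling stride is an illustrative choice:

```python
import cv2
import numpy as np

def dense_multiscale_sift(gray_u8, scales=(4, 6, 8, 9, 10), stride=4):
    sift = cv2.SIFT_create()
    h, w = gray_u8.shape
    descriptors = []
    for s in scales:
        # Place a keypoint of diameter s at every sampled grid position, so
        # each position is described over several neighbourhood sizes.
        kps = [cv2.KeyPoint(float(x), float(y), float(s))
               for y in range(s, h - s, stride)
               for x in range(s, w - s, stride)]
        _, desc = sift.compute(gray_u8, kps)
        descriptors.append(desc)
    # "Fusion" here simply pools the descriptors of all scales together,
    # increasing the number of keypoints and the local detail captured.
    return np.vstack(descriptors)
```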
Further, after the SIFT keypoints of the different regions are obtained, their sparse expression is solved; in this embodiment, locality-constrained linear coding is adopted. This coding yields the sparse expression of a signal analytically, i.e., in closed form, without iterative solving, which improves the efficiency of sparse coding. Constructing the feature expression of the image with sparse coding effectively reduces the complexity of the image while retaining its main information to the greatest extent, and is strongly robust to noise and occlusion. Sparse coding is also a nonlinear feature mapping, and adopting it can effectively improve subsequent classification performance.
After the sparse expression of the SIFT keypoints is solved, i.e., after sparse coding of the keypoints is complete, the image is divided into 1 × 1, 1 × 4 and 4 × 1 local regions; within each local region, max-pooling over the keypoint codes produces the histogram that forms the feature expression of that region, and the feature expressions of all regions are concatenated to form the feature expression vector of the preprocessed sample image set.
Step S103: feed the feature expression vectors of the sample image set, together with their corresponding class labels, into a linear classifier; learn the parameters of the linear classifier to obtain its optimal parameters, and construct the linear scene classifier from those optimal parameters. The following may be used: compute the weight parameters of the linear classifier by least squares, obtain the optimal parameters by cross-validation, and construct the linear scene classifier accordingly. Using a linear classifier as the scene classifier reduces model complexity, speeds up classifier training, and reduces the probability of overfitting.
Step S104: perform the preprocessing operation on the sample image set to be identified. The following may be used: the preprocessing operation comprises image contrast normalization and Gamma correction, performed exactly as in step S101: convert the image from the RGB color space to the YUV color space and carry out global and local contrast normalization; global normalization normalizes the pixel values to lie near the image's mean pixel value, and local normalization strengthens edges. Combining contrast normalization with Gamma correction markedly reduces the influence of local shadows and illumination changes.
Step S105: extract the feature expression vectors of the preprocessed sample image set to be identified. The following may be used: extract the bottom-layer features of the preprocessed image set with multi-scale SIFT feature fusion, with scale factors (4, 6, 8, 9, 10); that is, neighborhoods of several scales, such as 4 × 4 and 6 × 6, are taken around each pixel point, and SIFT keypoints are extracted within each neighborhood. This increases the number of SIFT keypoints, captures more image information, and adds local detail.
Further, after the SIFT keypoints of the different regions are obtained, their sparse expression is solved with locality-constrained linear coding, exactly as in step S102. After sparse coding of the keypoints is complete, the image is divided into 1 × 1, 1 × 4 and 4 × 1 local regions; max-pooling over the keypoint codes within each local region forms the feature expression of that region, and the feature expressions of all regions are concatenated to form the feature expression vector of the preprocessed sample image set to be identified.
Step S106: feed the feature expression vectors of the preprocessed sample image set to be identified into the linear scene classifier for identification, and obtain the class labels of the scene classes to which the images belong. That is: the feature expression vector of each sample image to be identified is input into the trained linear scene classifier model, and its class is determined from the classifier outputs as the class whose output has the highest value.
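A minimal sketch of this decision rule, assuming a trained weight matrix W and a class-name list produced by the training stage (both names are hypothetical):

```python
import numpy as np

def predict_scene(W, z, class_names):
    # One linear output per scene class; the predicted scene is the
    # class whose output is largest.
    scores = W @ z
    return class_names[int(np.argmax(scores))]
```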
The invention performs scene recognition based on global features: the whole scene image is judged as a whole, without involving any specific target. Multi-scale SIFT feature fusion during bottom-layer feature extraction increases the number of SIFT keypoints and the local detail captured. Sparse coding reduces the dimensionality of the image while retaining its main information, and is strongly robust to noise and occlusion. Combining the bottom-layer sparse coding feature expression with max-pooling reduces the complexity of the upper-layer classifier model and speeds up classifier training; and since sparse coding is a nonlinear feature mapping, it can effectively improve subsequent classification performance. The locality-constrained linear coding yields the sparse expression analytically, i.e., in closed form, without iterative solving, improving the efficiency of sparse coding. Finally, using a linear classifier as the scene classifier reduces model complexity, speeds up training, and reduces the probability of overfitting.
Referring to fig. 2 to 5, steps S102 and S105 are implemented as follows:
The sparse expression of the SIFT keypoints is solved by sparse coding. Let $x \in \mathbb{R}^n$ be the input signal (i.e., a SIFT keypoint descriptor) and $B = [b_1, b_2, \ldots, b_m] \in \mathbb{R}^{n \times m}$ be the dictionary. Sparse coding solves the following L1-norm problem:

$$\min_{c} \; \|x - Bc\|_2^2 + \lambda \|c\|_1,$$

thereby obtaining the sparse expression $c \in \mathbb{R}^m$ of the input signal.
Further, the calculation of the sparse representation is as follows. With the locality constraint, each input signal is projected onto its local coordinate system. For an input vector $x = [x_1, x_2, \ldots, x_n]^T$, the $K$ nearest neighbor vectors of $x$ are found in a local range and used to reconstruct $x$; weighting each dictionary atom achieves the selection of the $K$ nearest neighbors, giving the objective function

$$\min_{c} \; \|x - Bc\|_2^2 + \lambda \|d \odot c\|_2^2, \quad \text{s.t. } \mathbf{1}^T c = 1,$$

where $\lambda$ is the regularization coefficient, $\odot$ denotes element-wise multiplication, $d$ is the weight of each dictionary atom, and $\mathbf{1}$ is the all-ones vector. Let

$$d = \exp\!\left(\frac{\operatorname{dist}(x, B)}{\sigma}\right),$$

where $\operatorname{dist}(x, B) = [\operatorname{dist}(x, b_1), \ldots, \operatorname{dist}(x, b_m)]^T$, $\operatorname{dist}(x, b_j)$ is the Euclidean distance between $x$ and atom $b_j$, $j = 1, 2, \ldots, m$, and $\sigma$ controls the decay speed of the weights. Solving the objective function analytically yields the code

$$\tilde{c} = \left(C + \lambda \operatorname{diag}(d)\right)^{-1} \mathbf{1},$$

where $C = (B - x\mathbf{1}^T)^T (B - x\mathbf{1}^T)$ is the covariance matrix of the data; normalizing, $c = \tilde{c} / (\mathbf{1}^T \tilde{c})$, gives the final code $c$.
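A sketch of this coding step in NumPy, following the closed form above (as in Wang et al.'s locality-constrained linear coding algorithm); sigma, lambda and the rescaling of the adaptor are illustrative choices:

```python
import numpy as np

def llc_code(x, B, sigma=1.0, lam=1e-4):
    # x: (n,) input descriptor; B: (n, m) dictionary with atoms as columns.
    m = B.shape[1]
    # Locality adaptor: larger weight for atoms farther from x,
    # rescaled to (0, 1] for numerical stability.
    dist = np.linalg.norm(B - x[:, None], axis=0)
    d = np.exp((dist - dist.max()) / sigma)
    # Covariance of the dictionary shifted by the input signal.
    shifted = B - x[:, None]            # each column is b_j - x
    C = shifted.T @ shifted
    # Analytic solution, then enforce the constraint 1^T c = 1.
    c = np.linalg.solve(C + lam * np.diag(d), np.ones(m))
    return c / c.sum()
```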
The image is divided into several local regions with a spatial pyramid strategy, namely into 1 × 1, 1 × 4 and 4 × 1 grids (this region division aggregates all keypoints within each region, so as to obtain the local information of the image). Within each local region, max-pooling over the keypoint codes produces the histogram used as the feature expression of that region. For example, suppose region $i$ contains $N_i$ keypoints; the coding matrix of all its keypoints is

$$C = [c_1, c_2, \ldots, c_{N_i}] \in \mathbb{R}^{m \times N_i},$$

each column of which is the sparse representation of one keypoint. The feature expression of the region is $z_i \in \mathbb{R}^m$, with

$$z_{ij} = \max\{C_{j1}, C_{j2}, \ldots, C_{jN_i}\},$$

where $z_{ij}$ is the $j$-th element of $z_i$ and $C_{jk}$ is the element in row $j$, column $k$ of $C$. The feature expressions of all regions are concatenated to form the final feature expression vector of the image, $Z = [z_1, z_2, \ldots, z_9] \in \mathbb{R}^{9m}$.
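A sketch of this pyramid pooling in NumPy; the orientation assigned to the 1 × 4 and 4 × 1 strips and the row-per-keypoint layout of the code array are assumptions of this sketch:

```python
import numpy as np

def pyramid_pool(codes, xy, width, height):
    # codes: (N, m) array, one sparse code per keypoint (row layout);
    # xy: (N, 2) keypoint coordinates in pixels.
    regions = [(0.0, 0.0, 1.0, 1.0)]                                # 1 x 1
    regions += [(j / 4, 0.0, (j + 1) / 4, 1.0) for j in range(4)]   # 4 x 1
    regions += [(0.0, i / 4, 1.0, (i + 1) / 4) for i in range(4)]   # 1 x 4
    rx, ry = xy[:, 0] / width, xy[:, 1] / height
    pooled = []
    for x0, y0, x1, y1 in regions:
        inside = (rx >= x0) & (rx < x1) & (ry >= y0) & (ry < y1)
        # Max-pooling per code dimension over the keypoints in the region.
        z = (codes[inside].max(axis=0) if inside.any()
             else np.zeros(codes.shape[1]))
        pooled.append(z)
    return np.concatenate(pooled)    # final image feature in R^{9m}
```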
Further, step S103 is implemented as follows:
For a series of input training pairs $(z_i, t_i)$, $i = 1, 2, \ldots, N$ (where $t_i$ is the ground-truth label of training sample $i$), the objective function of the linear classifier is

$$\min_{W,\,\xi} \; \frac{1}{2}\|W\|^2 + \frac{C}{2}\sum_{i=1}^{N}\|\xi_i\|^2, \qquad \text{subject to: } W z_i = t_i - \xi_i, \; i = 1, \ldots, N.$$

Solving by the method of Lagrange multipliers gives the optimal model weights

$$W = T Z^T \left(Z Z^T + \frac{I}{C}\right)^{-1},$$

where $Z = [z_1, \ldots, z_N]$ and $T = [t_1, \ldots, t_N]$ stack the features and labels column-wise, $I$ is the identity matrix, and $C$ is the regularization coefficient, whose optimal value is obtained by tuning via cross-validation.
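A sketch of this training step in NumPy, using the closed form reconstructed above; the one-hot target encoding and the grid of C values are illustrative:

```python
import numpy as np

def train_linear_classifier(Z, T, c_reg):
    # Z: (d, N) feature vectors as columns; T: (K, N) one-hot class labels.
    d = Z.shape[0]
    # Closed form from the Lagrangian: W = T Z^T (Z Z^T + I / C)^{-1}.
    return T @ Z.T @ np.linalg.inv(Z @ Z.T + np.eye(d) / c_reg)

def cross_validate_c(Z, T, grid=(0.01, 0.1, 1.0, 10.0), k=5, seed=0):
    # k-fold cross-validation over the regularization constant C.
    rng = np.random.default_rng(seed)
    N = Z.shape[1]
    folds = np.array_split(rng.permutation(N), k)
    best_c, best_acc = None, -1.0
    for c_reg in grid:
        accs = []
        for fold in folds:
            train_idx = np.setdiff1d(np.arange(N), fold)
            W = train_linear_classifier(Z[:, train_idx], T[:, train_idx], c_reg)
            pred = (W @ Z[:, fold]).argmax(axis=0)
            accs.append((pred == T[:, fold].argmax(axis=0)).mean())
        if np.mean(accs) > best_acc:
            best_c, best_acc = c_reg, float(np.mean(accs))
    return best_c
```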
It should be noted that although the above embodiments have been described herein, the invention is not limited to them. Technical solutions obtained by changing or modifying the embodiments described herein based on the innovative concepts of the present invention, or by substituting equivalent structures or equivalent processes using the content of this specification and the attached drawings, whether applied directly or indirectly to other related technical fields, fall within the scope of the present invention.

Claims (6)

1. A scene recognition method based on sparse coding feature extraction is characterized by comprising the following steps:
carrying out preprocessing operation on a sample image set which is acquired in advance and used for training;
extracting a feature expression vector of the sample image set after the preprocessing operation;
adding the feature expression vector of the sample image set and the class label corresponding to the feature expression vector into a linear classifier, performing parameter learning on the linear classifier to obtain the optimal parameter of the linear classifier, and constructing a linear scene classifier according to the optimal parameter;
preprocessing a sample image set to be identified;
extracting a feature expression vector of the sample image set to be identified after the preprocessing operation;
sending the feature expression vector of the preprocessed sample image set to be identified into the linear scene classifier for identification, and obtaining the class label of the scene class to which the sample image set to be identified belongs;
the extracting the feature expression vector of the sample image set after the preprocessing operation includes: extracting bottom-layer features of the sample image set with a multi-scale SIFT feature fusion method, namely taking neighborhoods of several scales around each pixel point and extracting SIFT keypoints of the image within each neighborhood;
solving the sparse expression of the SIFT keypoints, and forming the feature expression vector of the preprocessed sample image set with a spatial pyramid strategy and max-pooling;
the calculation method of the sparse representation comprises the following steps:
with the locality constraint, each input signal is projected onto its local coordinate system; for an input vector $x = [x_1, x_2, \ldots, x_n]^T$, the $K$ nearest neighbor vectors of $x$ are found in a local range and used to reconstruct $x$, weighting each dictionary atom to achieve the selection of the $K$ nearest neighbors, which yields an objective function, and the objective function is solved analytically to obtain the code $c$;
the step of feeding the feature expression vector of the sample image set and its corresponding class label into a linear classifier and performing parameter learning on the linear classifier to obtain its optimal parameters further comprises:
for a series of input training pairs $(z_i, t_i)$, $i = 1, \ldots, N$, where $t_i$ is the ground-truth label of the training sample, the objective function of the linear classifier is

$$\min_{W,\,\xi} \; \frac{1}{2}\|W\|^2 + \frac{C}{2}\sum_{i=1}^{N}\|\xi_i\|^2, \qquad \text{subject to: } W z_i = t_i - \xi_i, \; i = 1, \ldots, N;$$

solving by the method of Lagrange multipliers gives the optimal model weights

$$W = T Z^T \left(Z Z^T + \frac{I}{C}\right)^{-1},$$

where $Z = [z_1, \ldots, z_N]$, $T = [t_1, \ldots, t_N]$, $C$ is the regularization coefficient, and the optimal parameter is obtained by tuning via cross-validation;
wherein $z_i \in \mathbb{R}^m$ is the feature expression of a local region;
the sample image set for training complies with the following conditions: the training sample image sets of the same type of scene need to contain different modalities, and the training sample image sets of different types of scenes need to be balanced.
2. The scene recognition method based on sparse coding feature extraction as claimed in claim 1,
the preprocessing operation comprises: image contrast normalization and Gamma correction.
3. The scene recognition method based on sparse coding feature extraction as claimed in claim 1,
the step of forming the feature expression vector of the preprocessed sample image set with a spatial pyramid strategy and max-pooling comprises the following steps:
dividing the image into 1 × 1, 1 × 4 and 4 × 1 local regions with a spatial pyramid strategy, forming the feature expression of each local region from the histogram of SIFT keypoint codes obtained by max-pooling within that region, and concatenating the feature expressions of all regions to form the feature expression vector of the preprocessed sample image set.
4. The scene recognition method based on sparse coding feature extraction as claimed in claim 1,
the step of learning parameters of the linear classifier to obtain the optimal parameters of the linear classifier includes:
and calculating to obtain weight parameters of the linear classifier by adopting a least square method, and obtaining optimal parameters of the linear classifier by adopting a cross verification method.
5. The scene recognition method based on sparse coding feature extraction as claimed in claim 2,
the "image contrast normalization" comprises the steps of: converting the image from the RGB color space to the YUV color space, and carrying out global and local contrast normalization processing on the YUV color space;
the global normalization is to normalize the pixel value of the image to be near the mean value of the pixel of the image, and the local normalization is to strengthen the edge.
6. The scene recognition method based on sparse coding feature extraction according to claim 1,
the extracting of the feature expression vector of the preprocessed sample image set to be identified includes: extracting bottom-layer features of the sample image set with a multi-scale SIFT feature fusion method, namely taking neighborhoods of several scales around each pixel point and extracting SIFT keypoints of the image within each neighborhood;
and solving the sparse expression of the SIFT keypoints, and forming the feature expression vector of the preprocessed sample image set with a spatial pyramid strategy and max-pooling.
CN201810435125.5A 2018-05-09 2018-05-09 Scene recognition method based on sparse coding feature extraction Active CN108596195B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810435125.5A CN108596195B (en) 2018-05-09 2018-05-09 Scene recognition method based on sparse coding feature extraction

Publications (2)

Publication Number Publication Date
CN108596195A CN108596195A (en) 2018-09-28
CN108596195B true CN108596195B (en) 2022-08-19

Family

ID=63635982

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810435125.5A Active CN108596195B (en) 2018-05-09 2018-05-09 Scene recognition method based on sparse coding feature extraction

Country Status (1)

Country Link
CN (1) CN108596195B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109657569A (en) * 2018-11-30 2019-04-19 贵州电网有限责任公司 More vegetation areas transmission of electricity corridor hidden danger point quick extraction method based on cloud analysis
CN109616104B (en) * 2019-01-31 2022-12-30 天津大学 Environment sound identification method based on key point coding and multi-pulse learning
CN110852206A (en) * 2019-10-28 2020-02-28 北京影谱科技股份有限公司 Scene recognition method and device combining global features and local features
CN111225231B (en) * 2020-02-25 2022-11-22 广州方硅信息技术有限公司 Virtual gift display method, device, equipment and storage medium
CN112086197B (en) * 2020-09-04 2022-05-10 厦门大学附属翔安医院 Breast nodule detection method and system based on ultrasonic medicine

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103793600A (en) * 2014-01-16 2014-05-14 西安电子科技大学 Isolated component analysis and linear discriminant analysis combined cancer forecasting method
CN107451596A (en) * 2016-05-30 2017-12-08 清华大学 A kind of classified nodes method and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778476B (en) * 2015-04-10 2018-02-09 电子科技大学 A kind of image classification method
CN105069481B (en) * 2015-08-19 2018-05-25 西安电子科技大学 Natural scene multiple labeling sorting technique based on spatial pyramid sparse coding
CN105678278A (en) * 2016-02-01 2016-06-15 国家电网公司 Scene recognition method based on single-hidden-layer neural network
CN106919920B (en) * 2017-03-06 2020-09-22 重庆邮电大学 Scene recognition method based on convolution characteristics and space vision bag-of-words model

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103793600A (en) * 2014-01-16 2014-05-14 西安电子科技大学 Isolated component analysis and linear discriminant analysis combined cancer forecasting method
CN107451596A (en) * 2016-05-30 2017-12-08 清华大学 A kind of classified nodes method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on sleep staging and sleep assessment methods based on EEG; Gao Qunxia; China Master's Theses Full-text Database (Medicine and Health Sciences), No. 12 (2015), 2015-12-15, E060-162 *

Also Published As

Publication number Publication date
CN108596195A (en) 2018-09-28

Similar Documents

Publication Publication Date Title
Karlekar et al. SoyNet: Soybean leaf diseases classification
CN108596195B (en) Scene recognition method based on sparse coding feature extraction
CN108304873B (en) Target detection method and system based on high-resolution optical satellite remote sensing image
Paisitkriangkrai et al. Effective semantic pixel labelling with convolutional networks and conditional random fields
CN105930815B (en) Underwater organism detection method and system
Abbass et al. A survey on online learning for visual tracking
CN111052126A (en) Pedestrian attribute identification and positioning method and convolutional neural network system
CN110633708A (en) Deep network significance detection method based on global model and local optimization
Khan et al. Image scene geometry recognition using low-level features fusion at multi-layer deep CNN
CN112733614B (en) Pest image detection method with similar size enhanced identification
CN108230330B (en) Method for quickly segmenting highway pavement and positioning camera
CN106874862B (en) Crowd counting method based on sub-model technology and semi-supervised learning
Feng et al. A color image segmentation method based on region salient color and fuzzy c-means algorithm
CN115170805A (en) Image segmentation method combining super-pixel and multi-scale hierarchical feature recognition
CN112464983A (en) Small sample learning method for apple tree leaf disease image classification
Sabri et al. Nutrient deficiency detection in maize (Zea mays L.) leaves using image processing
Rodrigues et al. Evaluating cluster detection algorithms and feature extraction techniques in automatic classification of fish species
Zhang et al. Contour detection via stacking random forest learning
Lone et al. Object detection in hyperspectral images
Zhang et al. Spatial contextual superpixel model for natural roadside vegetation classification
CN112784722B (en) Behavior identification method based on YOLOv3 and bag-of-words model
Nga et al. Combining binary particle swarm optimization with support vector machine for enhancing rice varieties classification accuracy
CN107423771B (en) Two-time-phase remote sensing image change detection method
Padmanabhuni et al. An extensive study on classification based plant disease detection system
Poostchi et al. Feature selection for appearance-based vehicle tracking in geospatial video

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant