CN116977679A - Image acquisition method and system based on image recognition

Image acquisition method and system based on image recognition

Info

Publication number
CN116977679A
Authority
CN
China
Prior art keywords
image
local
lof
features
clustering
Prior art date
Legal status
Pending
Application number
CN202310997304.9A
Other languages
Chinese (zh)
Inventor
王子静
邓伟宁
Current Assignee
Individual
Original Assignee
Individual
Priority date
Filing date
Publication date
Application filed by Individual
Priority to CN202310997304.9A
Publication of CN116977679A
Legal status: Pending


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G06V10/763 - Non-hierarchical techniques, e.g. based on statistics of modelling distributions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/54 - Extraction of image or video features relating to texture
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image acquisition method and system based on image recognition, relating to the technical field of image recognition and comprising the following steps: acquiring an original image, preprocessing it, and outputting a preprocessed image A; constructing co-occurrence matrices and acquiring a feature set F of image A; constructing a local-feature LOF model from the feature set F; performing cluster analysis on the LOF model with DBSCAN to obtain a final clustered image C; wherein the co-occurrence matrices are a GLCM matrix and a WCM matrix. To address the prior-art problem that image target recognition and extraction rely on global features with insufficient expressive power, the application captures local feature patterns by constructing co-occurrence matrices, establishes an LOF model to analyze the local feature structure, and applies the DBSCAN clustering algorithm to perform target recognition based solely on local features. This effectively extracts and expresses the local feature patterns of the image and uses them to automatically recognize and accurately locate the image target region, thereby markedly improving image target extraction.

Description

Image acquisition method and system based on image recognition
Technical Field
The application relates to the technical field of image recognition, in particular to an image acquisition method and system based on image recognition.
Background
With the development of computer vision and image processing technology, image target recognition and extraction are increasingly widely applied in intelligent monitoring, autonomous driving, image search and related fields. Realizing automatic identification and accurate extraction of image targets across different scenes remains a difficult open problem in computer vision.
Traditional image target extraction methods rely mainly on global features, such as overall color, texture and shape, and train a classifier on these global features to identify target categories. However, global features have weak expressive power for local targets in an image and are insufficient for accurately locating target regions.
In the related art, for example, Chinese patent document CN116168440A provides a target image recognition system including an image scanning acquisition module, an auxiliary image acquisition module, an image preprocessing module, an image feature extraction module, an image data storage module, an image intelligent recognition module, and a recognition abnormality management module; the image scanning acquisition module comprises a main contour scanning acquisition unit, a three-dimensional scanning acquisition unit and an acquisition integrity detection unit; the auxiliary image acquisition module comprises a local image scanning acquisition unit. However, that scheme has at least the following technical problems:
(1) The patent scheme still mainly relies on global features of images to carry out target recognition, and features extracted by an image feature extraction module are global features such as integral colors, textures and the like;
(2) Such global features lack the detailed descriptive power needed to represent local target areas of the image. Although that application includes a local image scanning acquisition unit, it serves only as an auxiliary acquisition means; local features are not fully exploited for recognition and expression.
Therefore, that method still belongs to traditional global-feature recognition: the local features of the image are not fully considered or utilized, so target recognition is easily affected by global environment changes and noise, recognition of smaller or local targets is poor, and local target regions cannot be accurately located and extracted.
Disclosure of Invention
1. Technical problem to be solved
Aiming at the prior-art problem that image target recognition and extraction rely on global features with insufficient expressive power, the application provides an image acquisition method and system based on image recognition that can effectively extract and express local feature patterns of an image and use them to automatically recognize and accurately locate image target regions, thereby markedly improving image target extraction. The application captures local feature patterns by constructing co-occurrence matrices, establishes an LOF model to analyze the local feature structure, and applies the DBSCAN clustering algorithm to perform target recognition based solely on local features, achieving effective extraction of image targets.
2. Technical scheme
The object of the present application is achieved by the following technical scheme.
An aspect of embodiments of the present disclosure provides an image acquisition method based on image recognition, including: acquiring an original image, preprocessing it, and outputting a preprocessed image A; constructing co-occurrence matrices and acquiring a feature set F of image A; constructing a local-feature LOF model from the feature set F; performing cluster analysis on the LOF model with DBSCAN to obtain a final clustered image C; wherein the co-occurrence matrices are a GLCM matrix and a WCM matrix.
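By way of illustration only, the following Python sketch shows how the four claimed steps could be wired together with common open-source libraries; it is not part of the claims. The smoothing window, neighbor count and DBSCAN parameters are assumptions, and the per-pixel mean/variance features are a simple stand-in for the GLCM/WCM feature set F detailed below.

```python
# Minimal end-to-end sketch of the claimed pipeline (S110-S140).
# All parameter values are illustrative assumptions.
import numpy as np
from scipy.ndimage import uniform_filter
from sklearn.cluster import DBSCAN
from sklearn.neighbors import LocalOutlierFactor

def acquire(image: np.ndarray) -> np.ndarray:
    """Preprocess, extract crude local features, LOF-score, then cluster."""
    a = uniform_filter(image.astype(float), size=3)        # S110: denoise
    # S120 (stand-in): local mean/variance as per-pixel local features
    mean = uniform_filter(a, size=5)
    var = uniform_filter(a * a, size=5) - mean * mean
    feats = np.stack([a.ravel(), mean.ravel(), var.ravel()], axis=1)
    # S130: local outlier factor of every pixel in the feature space
    lof = LocalOutlierFactor(n_neighbors=20)
    lof.fit(feats)
    scores = -lof.negative_outlier_factor_.reshape(-1, 1)  # larger = more outlying
    # S140: density clustering of the LOF scores -> clustered image C
    labels = DBSCAN(eps=0.05, min_samples=20).fit_predict(scores)
    return labels.reshape(image.shape)
```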
Further, acquiring the feature set of image A includes: constructing a gray level co-occurrence matrix GLCM of the image under a given offset distance and angle; extracting gray features of image A using the co-occurrence matrix GLCM, and outputting a GLCM feature set F1; constructing a co-occurrence matrix WCM from the wavelet transform and wavelet coefficients of the image; extracting frequency domain features of image A using the co-occurrence matrix WCM, and outputting a WCM feature set F2; wherein the gray features comprise contrast and correlation features, and the co-occurrence matrix WCM reflects the frequency domain features of the image.
Further, constructing the local-feature LOF model includes: combining image A and feature set F1 to construct a gray LOF model; and combining image A and feature set F2 to construct a wavelet LOF model. Constructing the gray LOF model comprises: converting image A to grayscale; obtaining the distance between each pixel in image A and feature set F1 to obtain a local reachability distance; calculating local outlier factors from the local reachability distances; and constructing a gray LOF model characterized by the local outlier factors. Constructing the wavelet LOF model comprises: extracting wavelet features of image A using the local binary pattern and the Haar wavelet transform; calculating the K-distance minimum of each pixel to extract local features; constructing an improved local outlier factor from density and distance; and constructing a wavelet LOF model characterized by the improved local outlier factor.
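For concreteness, a from-scratch sketch of the local outlier factor chain described above (k-distance, reachability distance, local reachability density, LOF) is given below. The neighbor count k is an assumption, and X stands for per-pixel feature vectors built from image A together with feature set F1 or F2.

```python
# Hedged sketch: textbook LOF computed directly from its definition.
import numpy as np

def lof(X: np.ndarray, k: int = 5) -> np.ndarray:
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]          # indices of the k nearest neighbors
    k_dist = np.sort(d, axis=1)[:, k - 1]      # k-distance of every point
    # reachability distance: reach(p, o) = max(k_dist(o), d(p, o))
    reach = np.maximum(k_dist[nn], np.take_along_axis(d, nn, axis=1))
    lrd = k / reach.sum(axis=1)                # local reachability density
    return lrd[nn].mean(axis=1) / lrd          # LOF: neighbor density vs own
```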
Further, acquiring the final clustered image C includes: performing a first clustering, clustering the gray LOF model with the DBSCAN algorithm to generate a first clustering result B1; performing a second clustering, clustering the wavelet LOF model with the DBSCAN algorithm to generate a second clustering result B2; performing majority-voting fusion, taking a majority vote over the first clustering result B1 and the second clustering result B2 for each pixel to obtain an intermediate clustering result; performing same-category matching, determining possibly matching categories across the two clustering results and fusing them into the same category in the intermediate clustering result; performing outlier determination, judging categories whose pixel-count proportion falls below a threshold to be outliers; and generating the final clustering result, fusing the results of majority voting, same-category matching and outlier determination to output the final clustered image C.
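A sketch of this two-pass clustering and fusion follows; the eps/min_samples values and the one-percent pixel-proportion threshold are assumptions, and with only two voters the "majority vote" reduces to keeping labels on which both passes agree.

```python
# Hedged sketch of the fusion stage: two DBSCAN passes, a per-pixel vote,
# and pixel-proportion outlier determination. Parameters are assumptions.
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_and_fuse(gray_lof: np.ndarray, wavelet_lof: np.ndarray,
                     min_frac: float = 0.01) -> np.ndarray:
    b1 = DBSCAN(eps=0.3, min_samples=10).fit_predict(gray_lof.reshape(-1, 1))
    b2 = DBSCAN(eps=0.3, min_samples=10).fit_predict(wavelet_lof.reshape(-1, 1))
    # Agreement keeps the label; disagreement is left to same-category
    # matching (-1 is DBSCAN's noise label).
    fused = np.where(b1 == b2, b1, -1)
    # Outlier determination: categories below the pixel-proportion threshold
    labels, counts = np.unique(fused[fused >= 0], return_counts=True)
    small = labels[counts < min_frac * fused.size]
    fused[np.isin(fused, small)] = -1
    return fused.reshape(gray_lof.shape)
```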
Further, the majority-voting fusion includes: acquiring the pixel proportion of each category in the first and second clustering results; assigning a weight to each category according to its pixel proportion; and performing weighted voting to determine the intermediate category of each pixel.
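A minimal sketch of the weighted vote, assuming (as one plausible reading) that the weight of a candidate label is simply its pixel proportion within its own clustering result:

```python
# Hedged sketch: weighted majority vote across several label maps.
import numpy as np

def weighted_vote(label_maps: list[np.ndarray]) -> np.ndarray:
    # Weight of each category = its pixel proportion within its own result.
    weights = [{l: c / lm.size for l, c in
                zip(*np.unique(lm, return_counts=True))} for lm in label_maps]
    h, w = label_maps[0].shape
    out = np.empty((h, w), dtype=int)
    for idx in np.ndindex(h, w):
        score: dict[int, float] = {}
        for lm, wt in zip(label_maps, weights):
            lbl = int(lm[idx])
            score[lbl] = score.get(lbl, 0.0) + wt[lbl]
        out[idx] = max(score, key=score.get)   # weighted majority label
    return out
```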
Further, same-category matching includes: extracting image texture features of each category in the first and second clustering results; computing the similarity between features as a matching measure; and determining categories whose similarity exceeds a threshold to be matching categories.
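One plausible matching measure (an assumption; the claim only requires "similarity between features") is cosine similarity over per-category texture descriptors:

```python
# Hedged sketch: same-category matching by cosine similarity of per-cluster
# texture feature vectors; the 0.9 threshold is an assumption.
import numpy as np

def match_categories(feats1: dict[int, np.ndarray],
                     feats2: dict[int, np.ndarray],
                     thresh: float = 0.9) -> list[tuple[int, int]]:
    pairs = []
    for a, fa in feats1.items():
        for b, fb in feats2.items():
            cos = fa @ fb / (np.linalg.norm(fa) * np.linalg.norm(fb) + 1e-12)
            if cos > thresh:
                pairs.append((a, b))   # fuse clusters a and b into one category
    return pairs
```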
Further, performing outlier determination includes: extracting color, texture and shape features of each category; calculating outlier scores using an isolation forest algorithm; and determining categories whose outlier scores exceed a threshold to be outliers.
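A sketch of this step with scikit-learn's IsolationForest follows; note that score_samples returns values where lower means more anomalous, so the sign convention and threshold here are assumptions.

```python
# Hedged sketch: isolation-forest outlier scoring of per-category features.
import numpy as np
from sklearn.ensemble import IsolationForest

def outlier_categories(cat_feats: dict[int, np.ndarray],
                       thresh: float = -0.2) -> set[int]:
    labels = list(cat_feats)
    X = np.stack([cat_feats[l] for l in labels])   # color/texture/shape per class
    scores = IsolationForest(random_state=0).fit(X).score_samples(X)
    return {l for l, s in zip(labels, scores) if s < thresh}  # lower = more outlying
```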
Further, generating the final clustering result includes: constructing a Gaussian mixture model with the intermediate clustering result as the training set to obtain a cluster-category model; assigning each pixel to the category with the maximum posterior probability according to Bayesian decision theory; optimizing the clustering boundary with the GrabCut algorithm; and outputting the optimized final clustered image C.
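A sketch of this final stage is given below, combining a Gaussian mixture posterior (maximum-posterior assignment corresponds to Bayesian decision with a 0-1 loss) with OpenCV's grabCut for boundary optimization. The component count, the choice of the dominant cluster as probable foreground, and the iteration count are all assumptions.

```python
# Hedged sketch: GMM posterior classification plus GrabCut boundary refinement.
import cv2
import numpy as np
from sklearn.mixture import GaussianMixture

def finalize(image_bgr: np.ndarray, pix_feats: np.ndarray,
             intermediate: np.ndarray) -> np.ndarray:
    """image_bgr: 8-bit 3-channel image; pix_feats: one feature row per pixel."""
    k = len(np.unique(intermediate))
    gmm = GaussianMixture(n_components=k, random_state=0).fit(pix_feats)
    hard = gmm.predict_proba(pix_feats).argmax(axis=1).reshape(intermediate.shape)
    # Boundary optimization: dominant cluster taken as probable foreground.
    fg = (hard == np.bincount(hard.ravel()).argmax())
    mask = np.where(fg, cv2.GC_PR_FGD, cv2.GC_PR_BGD).astype(np.uint8)
    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(image_bgr, mask, None, bgd, fgd, 5, cv2.GC_INIT_WITH_MASK)
    keep = (mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)
    return np.where(keep, hard, -1)    # -1 marks pixels refined away
```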
Another aspect of embodiments of the present disclosure further provides an image acquisition system based on image recognition, including: the image preprocessing module is used for preprocessing an original image and outputting a preprocessed image A; the feature extraction module is used for constructing a co-occurrence matrix to obtain a feature set F of the image A, wherein the co-occurrence matrix comprises a GLCM matrix and a WCM matrix; the local feature modeling module is used for combining the feature set F to construct an LOF model of the local feature; the image clustering module is used for carrying out cluster analysis on the LOF model by using a DBSCAN algorithm to obtain a final clustered image C; wherein, the feature extraction module includes: the GLCM feature extraction unit is used for constructing a gray level co-occurrence matrix GLCM of the image under a given offset distance and angle, extracting gray level features of the image A by using the GLCM and outputting a GLCM feature set F1; the WCM feature extraction unit is used for constructing a co-occurrence matrix WCM of wavelet transformation and wavelet coefficients of the image, extracting frequency domain features of the image A by utilizing the WCM and outputting a WCM feature set F2; wherein the local feature modeling module comprises: the gray LOF modeling unit is used for combining the image A and the GLCM feature set F1 to construct a gray LOF model; and the wavelet LOF modeling unit is used for combining the image A and the WCM feature set F2 to construct a wavelet LOF model.
Further, the gray LOF modeling unit converts image A to grayscale, computes the local reachability distance between each pixel and feature set F1, and calculates a local outlier factor from that distance to construct a gray LOF model characterized by the local outlier factor; the wavelet LOF modeling unit extracts wavelet features of image A using the local binary pattern and the Haar wavelet transform, calculates the K-distance minimum of each pixel to extract local features, constructs an improved local outlier factor from density and distance, and builds a wavelet LOF model characterized by the improved local outlier factor.
3. Advantageous effects
Compared with the prior art, the application has the advantages that:
The overall method acquires the gray level and frequency domain characteristics of the image by constructing co-occurrence matrices, analyzes the local patterns of the image with the local feature model LOF, and classifies and identifies the local patterns with the DBSCAN clustering algorithm, realizing automatic extraction of the target object in the image. The co-occurrence matrices capture the gray level and frequency domain feature patterns of local image areas and, compared with raw color or texture features, better express the intrinsic structure of those areas. The local feature model LOF analyzes the relative relationship between each pixel and its surrounding neighborhood, effectively describing the local feature patterns of the image. The DBSCAN clustering algorithm classifies the local feature patterns based solely on local sample density distribution, and so is unaffected by global distribution and noise.
Drawings
The present specification will be further described by way of exemplary embodiments, which are described in detail with reference to the accompanying drawings. These embodiments are not limiting; in the drawings, like numerals represent like structures, wherein:
FIG. 1 is an exemplary flow chart of an image acquisition method based on image recognition according to some embodiments of the present description;
FIG. 2 is a flowchart of exemplary sub-steps for acquiring a feature set of image A, shown in accordance with some embodiments of the present description;
FIG. 3 is a flowchart illustrating exemplary sub-steps for constructing a LOF model of a local feature, according to some embodiments of the present description;
FIG. 4 is a flowchart illustrating exemplary sub-steps for acquiring a final clustered image C, according to some embodiments of the present description;
fig. 5 is an exemplary block diagram of an image acquisition system based on image recognition, according to some embodiments of the present description.
Detailed Description
In order to more clearly illustrate the application of the embodiments of the present specification, the drawings that are used in the description of the embodiments will be briefly described below. It is apparent that the drawings in the following description are only some examples or embodiments of the present specification, and it is possible for those of ordinary skill in the art to apply the present specification to other similar situations according to the drawings without inventive effort. Unless otherwise apparent from the context of the language or otherwise specified, like reference numerals in the figures refer to like structures or operations.
It should be appreciated that as used in this specification, a "system," "apparatus," "unit" and/or "module" is one method for distinguishing between different components, elements, parts, portions or assemblies at different levels. However, if other words can achieve the same purpose, the words can be replaced by other expressions.
As used in the specification and the claims, the terms "a," "an," "the," and/or "the" are not specific to a singular, but may include a plurality, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the steps and elements are explicitly identified, and they do not constitute an exclusive list, as other steps or elements may be included in a method or apparatus.
A flowchart is used in this specification to describe the operations performed by the system according to embodiments of the present specification. It should be appreciated that the preceding or following operations are not necessarily performed in order precisely. Rather, the steps may be processed in reverse order or simultaneously. Also, other operations may be added to or removed from these processes.
Based on the technical problems, the specification provides an image acquisition method and system based on image recognition.
Explanation of Terms
GLCM is short for Gray Level Co-occurrence Matrix, a matrix that records the statistical gray-level distribution of locally adjacent pixels in an image. By constructing GLCM matrices of the image in different directions and at different distances, the application can effectively capture the gray structural patterns of local image areas and express their intrinsic characteristics. Compared with using image gray values directly, the GLCM matrix reflects the gray-level dependency between local pixels more comprehensively. The application therefore uses the GLCM matrix to extract local image features, giving stronger feature expression and a better depiction of the local detail patterns of complex images; the GLCM matrix enhances the expression of local image features so that local structural patterns can be better extracted and utilized.
WCM is short for Wavelet Co-occurrence Matrix, a matrix that combines the wavelet transform to count the frequency domain features of an image. In the application, performing a wavelet transform on the image and constructing the WCM matrix of the wavelet coefficients effectively captures the image's feature patterns in the frequency domain and expresses high-frequency information such as details and edges. Compared with using pixel gray values directly, the WCM matrix is based on frequency-domain analysis and can reflect characteristics of the image at different scales, such as the detail structures of edges and lines. The WCM matrix therefore extends the feature range into the frequency domain, making frequency-domain image details part of the local feature expression and strengthening the description of local structural patterns in complex images.
The feature set refers to a set of image local features extracted by means of a co-occurrence matrix or the like. The application comprises two types of feature sets, namely a GLCM feature set and a WCM feature set. Wherein the GLCM feature set expresses a gray pattern of a local region of the image; the WCM feature set expresses frequency domain features of a localized region of the image. The two feature sets together form the expression of the local features of the image, and the local feature information of the image in both space and frequency domains is contained. The feature set provides more information than a single pixel gray or color, and the local feature pattern of the image can be described from multiple angles. Therefore, the application can make the local feature expression of the image more comprehensive and rich by extracting the feature set, and provides important support for the subsequent local feature modeling and target recognition. Therefore, the feature set is a bridge for connecting image local feature extraction and subsequent recognition modeling, plays a key role in expressing image local features, and is a foundation for realizing the final aim of the scheme.
Gray features refer to the gray pattern features of local image areas extracted through the GLCM matrix. The application uses the GLCM matrix to count the local gray distribution and variation of the image, extracting gray features such as contrast and correlation. Gray features reflect information such as the degree and direction of local gray changes and can express the local texture structure of the image, providing richer gray structural information than a single gray value. Gray features thus strengthen the description of local gray patterns so that local structural characteristics can be expressed more fully. They are an important component of the feature set and provide support for subsequent target recognition. In conclusion, gray features supplement pixel gray information in the application, expand the descriptive dimensions of the local image structure, and are one of the key techniques for improving image target recognition in this scheme.
The frequency domain features refer to frequency domain information features of the local area of the image extracted through the WCM matrix. The application carries out wavelet transformation on the image, and analyzes wavelet coefficients through the WCM matrix statistics, and extracts high-frequency information such as edges, lines and the like of the image from the wavelet coefficients as frequency domain features. The frequency domain features reflect detail structure modes of the image under different scales, and can express detail information such as image edges, contours and the like. They are complementary to the gray scale features and together describe the image local features. Therefore, the frequency domain features expand the feature extraction to the frequency domain, so that the image local structure mode can be expressed in a richer angle. It enhances the descriptive ability of the image details, providing important supplementary information for subsequent object recognition.
Local features refer to features that express details of local areas of an image. The gray scale features and the frequency domain features extracted through the co-occurrence matrix belong to local features and are used for describing the structural mode of the image local block. Compared with the global feature, the local feature can express fine mode change in the image, and is more distinguishable for the local area representing the target key information. Thus, the local features highlight the analytical description of the local structure within the image, rather than relying on global features. This allows key local areas of the image object to be accurately extracted and expressed. The application fully utilizes the advantage of the local characteristic, realizes effective identification of the image target, and avoids the problem that the global characteristic is easily influenced by external environment. In conclusion, the local features play a key role in highlighting the internal details of the image in the application, and are the basis for realizing accurate positioning and extraction of the image target.
DBSCAN is a density-based spatial clustering algorithm. The application uses DBSCAN to cluster the local features of the image in order to identify target areas. DBSCAN classifies samples only according to their density in feature space, without requiring a predetermined number of clusters. Compared with traditional clustering methods such as K-Means, DBSCAN is not affected by the overall sample distribution or outliers, and clusters local image feature patterns more accurately. DBSCAN lets the clustering conform to the distribution of the local feature patterns and avoids the subjective influence of specifying the number of categories in advance. The application therefore adopts DBSCAN to identify target-area features effectively according to the density distribution of local image features, realizing image target extraction. In conclusion, DBSCAN is one of the key technical means for local feature clustering and target recognition.
Local outlier factor (Local Outlier Factor, LOF), which is an indicator of how far an individual sample is outlier relative to its neighborhood samples. The application builds LOF model based on each pixel, analyzes local outlier degree of the pixel relative to surrounding area, and expresses local mode of image. The LOF model highlights the inherent structure of the local feature from a relative relationship perspective, as compared to using the pixel feature directly. The LOF model enhances the capability of describing the local feature mode of the image, so that the local target in the complex scene can be accurately expressed. Therefore, the local feature modeling is performed by adopting the LOF model, so that the structural features of the image target area can be better extracted, and support is provided for subsequent identification. In conclusion, LOF plays a key role in enhancing the expression capability of local features in the application, and is one of important technical means for realizing accurate positioning of target areas.
The local binary pattern (Local Binary Pattern, LBP) is an operator that describes local image texture. It encodes the local texture pattern by comparing the gray relationship of a center pixel with its surrounding pixels. The application extracts LBP features of the image as a feature expression of local structural information. Compared with using gray values directly, LBP features are more sensitive to local texture information and provide a richer local structural description. LBP features thus enhance the description of local texture in complex images, so that fine texture structures can also be extracted and expressed, providing effective support for subsequent target recognition. In conclusion, LBP supplements the expression of local texture information in the application and is one of the key techniques for accurately locating and identifying target areas.
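A short sketch of LBP extraction with scikit-image, assuming the conventional 8-neighbor, radius-1 "uniform" variant (the patent does not fix these parameters):

```python
# Hedged sketch: LBP codes and a normalized histogram descriptor.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray: np.ndarray, P: int = 8, R: float = 1.0) -> np.ndarray:
    codes = local_binary_pattern(gray, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
    return hist   # texture descriptor of the local binary patterns
```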
The Haar wavelet transform is a wavelet transform method for multi-resolution analysis. It implements wavelet decomposition of an image through low-pass filtering and downsampling. The application applies the Haar wavelet transform to the image signal to obtain image representations at different scales. Compared with using the pixel representation directly, the Haar wavelet representation highlights the multi-scale information of the image, including its different frequency components. The Haar wavelet transform provides a frequency-domain form of feature expression so that the frequency-domain information of the image can be extracted and used. The application thus applies the Haar wavelet transform to obtain the image's frequency-domain characteristics and construct the frequency-domain feature set, supplying frequency-domain feature support for subsequent target recognition. In conclusion, the Haar wavelet transform plays a key role in obtaining image frequency-domain information, extending the expression of local features to the frequency domain and strengthening recognition of the image target area.
The K-distance is a method of measuring the distance between samples in the feature space. When the DBSCAN cluster and LOF model are constructed, the K distance is adopted to determine the neighborhood range of each sample. The K-distance reflects the similarity of the samples on the feature expression. By setting a proper K distance threshold, the clustering and the local model can more accurately describe local sample distribution. Compared with a preset distance threshold, the K distance adopts a neighbor distance sorting mode, and can be automatically adapted to sample distribution with different densities and scales. Therefore, the K distance enhances the adaptability of the clustering and the local model to the local feature distribution, so that the local structure can be more accurately characterized. In summary, the K distance is one of important parameters for realizing self-adaptive clustering and local modeling, plays roles of associating and normalizing local neighborhood of the sample, and is a key for accurately obtaining local feature expression.
The gray level LOF model refers to a Local Outlier Factor (LOF) model constructed based on the gray level of an image pixel. The present application uses a gray level LOF model to analyze the local outlier degree of each pixel with respect to the gray level distribution of the surrounding area. The gray LOF model can highlight the image local pattern by considering the gray relative relationship of the surrounding area and the center pixel, compared to directly using gray. The gray level LOF model enhances modeling and expression capabilities of local gray level structures of an image. Therefore, the gray level LOF model is adopted, so that the gray level characteristics of the local target of the image can be expressed more accurately, and effective support is provided for subsequent identification and extraction. In conclusion, the gray LOF model improves the expression of the local characteristics of the gray of the image by analyzing the local relative relation of gray, and is one of key technical means for realizing accurate positioning of a target area.
The wavelet LOF model refers to a Local Outlier Factor (LOF) model constructed based on features extracted by image wavelet transformation. The application carries out wavelet transformation on the image to obtain multi-scale frequency domain representation, and establishes an LOF model on the basis. The wavelet transform provides a richer frequency domain feature than pixel gray scale, and the wavelet LOF model can more fully analyze the pattern of local frequency domain features of the image. The wavelet LOF model expands the modeling of the local mode to the frequency domain, and enhances the expression and analysis capability of the local frequency domain structure of the image. Therefore, the application adopts the wavelet LOF model, can promote the characterization of the frequency domain characteristics of the image target area, and provides effective support for the subsequent identification and positioning. In conclusion, the wavelet LOF model enhances the expression of local features of an image through frequency domain information, and is one of key technical means for accurately identifying an image target area.
Majority vote fusion is a fusion method that integrates multiple classifier results. The method is characterized in that the same sample is independently classified by a plurality of classifiers, and the most classification labels are selected as the final classification of the sample through voting. The application establishes a plurality of image classifiers based on different characteristics. And in the prediction process, the same image is subjected to classification result fusion by adopting a majority voting mechanism. Compared with a single classifier, the majority voting mechanism can integrate the advantages of different classifiers, and classification robustness and accuracy are improved. Therefore, majority voting fusion enhances the generalization capability of the classifier, and utilizes the effect of mutual authentication of a plurality of expert classifiers. Therefore, the application adopts majority voting, can effectively fuse the classification results corresponding to different characteristics, and improves the overall performance of image target classification. In summary, majority voting fusion is one of the important technical means for improving classification performance, and plays a key role in classification result integration.
The intermediate clustering result refers to an intermediate result obtained after feature extraction and clustering in the image target recognition process. In the application, the extracted image local features are initially clustered to obtain an intermediate clustering result. These intermediate clustering results reflect the distribution and aggregation of local features in different areas of the image. The intermediate clustering results provide structural information of local features, such as boundaries of target regions, internal patterns, etc., for subsequent models of target recognition. Therefore, the intermediate clustering result retains the necessary local structure information for target positioning, and provides support for building the identification model. Compared with the method that original features are directly used, the method has the advantages that the intermediate clustering result is preprocessed, and the target recognition efficiency is improved. In conclusion, the intermediate clustering result plays a role in connecting and supporting the construction of the target recognition model in the application, and is a key intermediate output for realizing image target recognition.
Same-category matching refers to matching the input image with samples of the same category in the database during image recognition. After the local features of the input image are extracted, samples similar to the image category are obtained through same-category matching. Same-category matching finds similar reference samples for the input image and can be used to refine its category description, strengthening the category description of a single sample by exploiting the complementary information of same-category samples. Through same-category matching, the application can shorten the distance between the input image and its category's samples and improve classification accuracy. Compared with direct classification, same-category matching makes full use of mutual verification and constraints among samples within a category. In summary, same-category matching is one of the important intermediate processing steps for improving classification performance in the present application.
Outliers refer to individual sample points that do not belong to the same cluster as most sample points in the feature space. In the application, due to the complex distribution condition of the local features of the image, outliers may exist in the feature space. These outliers often do not belong to the main target class, causing interference with target recognition. To process outliers, the present application employs a Local Outlier Factor (LOF) model for identification. LOF measures the local outliers of a sample and can detect outliers. Meanwhile, the application also uses DBSCAN algorithm to perform preliminary clustering. DBSCAN clusters densely connected sample points, while outliers that are not connected are excluded from the primary clusters. Therefore, LOF and DBSCAN can effectively detect and isolate outliers, and the outliers are prevented from affecting the target identification effect. In summary, detection and isolation of outliers are important links for achieving stable target identification in the application.
The number of pixels refers to the total number of pixels the image contains. In the present application, to reduce the computational complexity of image processing, the total number of pixels needs to be controlled. If the number of pixels is too large, the amount of computation in the processing such as feature extraction and modeling increases sharply. In order to control the number of pixels, the application firstly performs image scaling, reduces resolution and reduces the total number of pixels. Meanwhile, when the features are extracted, the method adopts a sliding window mode, and only the pixels in the local window block are processed each time, but not all the pixels of the global image. Therefore, the blocking strategy reduces the number of pixels to be considered by each modeling unit, and reduces the computational complexity. In summary, control of the number of pixels is an important support condition for the present application to achieve efficient image processing.
A matching metric is a measure used to calculate and compare the degree of matching between different samples. In the application, the matching relationship between the input image and sample pictures in the database must be computed to realize classification. The application therefore adopts matching metrics based on local feature matching, such as Hamming distance and cosine similarity. These metrics quantitatively evaluate whether two images are similar in their local feature expression, reflecting their degree of matching. Compared with using global features directly, local feature matching metrics are more flexible and can accommodate image deformation. A reasonable matching metric is thus the basis of accurate classification. In summary, designing and selecting an appropriate matching metric is one of the key elements in the successful implementation of the present application.
The isolation forest algorithm is an ensemble anomaly detection algorithm: it constructs multiple random isolation trees, each grown on a different random subsample, and fuses their results; samples that are isolated with fewer random splits receive higher outlier scores. The application adopts the isolation forest to score image categories for outliers. Compared with a single tree, the isolation forest obtains more stable scores and avoids overfitting, since the ensemble of trees improves generalization. The application therefore applies the isolation forest algorithm to synthesize different local features and achieve stable identification of outlier categories. In summary, the isolation forest algorithm is one of the core technologies of the outlier determination step and plays a key role in integrating multiple trees.
The gaussian mixture model is a probability density modeling method that fits the overall sample distribution by linearly weighting a plurality of gaussian distributions. The application uses GMM to model the image local feature to obtain the probability density expression of the image local structure. The GMM may approximate complex local feature distributions more finely and accurately than a single gaussian distribution. Therefore, GMM enhances the expressive power of local feature distribution, supporting subsequent image recognition and classification. Therefore, the application uses the GMM to obtain the probability expression of the local characteristics and provides support for realizing stable classification. In conclusion, the GMM is one of key means for realizing accurate local feature modeling, and plays an important role in probability density approximation.
The Bayesian decision theory is a decision analysis method based on a Bayesian probability framework. It combines a priori distribution, likelihood functions and loss functions to make decisions from the perspective of desired risk minimization. The application adopts Bayesian decision theory to train and predict when constructing the image classification model. Bayesian decision theory provides an essentially better decision rule than empirical risk minimization. Therefore, bayesian decision theory enhances the generalization capability of the classifier, making it more suitable for new samples. Therefore, the application applies Bayesian decision theory in image classification, and can obtain more reliable and robust classification results. In conclusion, the Bayesian decision theory is one of theoretical bases for realizing high-performance image classification, and plays a key role in probabilistic decision analysis.
The GrabCut algorithm is an interactive image segmentation algorithm based on graph cuts. It extracts a foreground target from an outline drawn by the user. The application adopts the GrabCut algorithm, using the user-marked contour as a prior, to segment the image and extract the foreground target area. Compared with fully automatic segmentation, GrabCut makes full use of user input and obtains a more accurate target extraction. GrabCut represents the image pixels as a graph model and converts segmentation into a graph-cut problem, thereby realizing interactive segmentation. The application can thus effectively extract the target area of interest in the image, providing support for subsequent recognition and analysis. In summary, the GrabCut algorithm is one of the key technical modules for accurate image segmentation.
Boundary optimization refers to post-processing of the image segmentation result to optimize and refine the segmentation boundary. In the application, after a preliminary segmentation result is obtained through algorithms such as GrabCut, boundary optimization is needed. Boundary optimization improves the precision and consistency of the segmentation boundary through refinement, smoothing and screening operations. The optimized boundary marks the contour of the target area more accurately and continuously, giving better support for subsequent recognition and positioning. Boundary optimization thus enhances the quality of the segmentation result, lifting a rough initial boundary to a clear target region contour. In summary, boundary optimization is one of the key post-processing links for high-quality image segmentation and plays an important role in improving target region extraction by refining the segmentation boundary.
Examples
The method and system provided in the embodiments of the present specification are described in detail below with reference to the accompanying drawings.
Fig. 1 is an exemplary flowchart of an image acquisition method based on image recognition according to some embodiments of the present disclosure, as shown in fig. 1, the image acquisition method based on image recognition includes the steps of:
s110, acquiring an original image, preprocessing, S120, constructing a symbiotic matrix, acquiring image characteristics, S130, establishing a model based on LOF, and S140 DBSCAN clustering.
The image preprocessing in S110 mainly improves reliability for subsequent feature extraction and modeling. S120 constructs the co-occurrence matrices and obtains multidimensional local features of the image, including gray co-occurrence information and frequency domain information, strengthening the expression of local features. S130 builds the LOF model, making full use of the rich local features extracted in the previous step and modeling each local area probabilistically to enhance the description of local patterns. S140 performs DBSCAN density clustering on the image according to the LOF model, which effectively extracts the target area and filters background and noise interference. Through these steps, local feature information is fully obtained and exploited, and a framework combining a probabilistic model with clustering is established, improving the recognition and expression of the image target area. Compared with methods relying on global features, this better models and analyzes the local features and pattern information within the image, strengthens the recognition and extraction of the target area, accurately recognizes the image target, filters background and noise, and achieves effective image acquisition.
S110, acquiring an original image and preprocessing it: the original image is acquired with an image reading device, preprocessing such as scaling and filtering is applied, and the preprocessed image A is output. Preprocessing aims to reduce image noise and improve the reliability of subsequent processing.
S120, constructing co-occurrence matrices and acquiring image features: the gray level co-occurrence matrix GLCM and the wavelet co-occurrence matrix WCM of image A are calculated. The GLCM reflects the relative positional relationship between gray levels, and the WCM reflects the frequency domain information. The GLCM and WCM together form the feature set F of image A.
S130, building a LOF-based model, and building the LOF model for each local area of the image A by using a local outlier factor LOF algorithm based on the feature set F. LOF measures the degree of outliers of each sample relative to the neighborhood. The LOF model can enhance the characterization of the local pattern of the image.
S140 DBSCAN clustering, and performing cluster analysis on the obtained LOF model by using a density clustering algorithm DBSCAN. The DBSCAN can effectively cluster the points associated with the neighborhood density and detect the outliers. And finally outputting the clustered image C.
In summary, compared with methods based directly on pixel information, this technical scheme obtains richer image features through the co-occurrence matrices; the LOF model and DBSCAN clustering strengthen the modeling and analysis of local image patterns, effectively extract image target areas, and filter background and noise, realizing image recognition and acquisition. Multidimensional features are extracted through the matrices, a probabilistic model is established to describe local patterns, and density cluster analysis is performed; image processing and pattern recognition techniques are fully combined to achieve image acquisition from the perspectives of statistical modeling and decision recognition, realizing effective extraction of image targets.
FIG. 2 is a flowchart of exemplary sub-steps for acquiring a feature set of image A, as shown in FIG. 2, according to some embodiments of the present description, including: s121 builds a gray level co-occurrence matrix (GLCM) of the image, S122 extracts GLCM features, S123 builds a wavelet co-occurrence matrix (WCM) of the image, and S124 extracts WCM features.
S121 constructs the gray level co-occurrence matrix (GLCM) of the image: a matrix of the statistical relationships between pixel values in different directions and at different distances is computed, reflecting the overall gray distribution characteristics of the image. S122 extracts GLCM features: statistical features of the GLCM such as contrast, correlation and entropy are calculated to form the image global feature vector F1. S123 constructs the wavelet co-occurrence matrix (WCM) of the image: the image is wavelet-transformed and co-occurrence matrices of the wavelet coefficients in different directions are computed, reflecting the local texture features of the image. S124 extracts WCM features: statistical features of the WCM such as entropy are calculated to form the image local feature vector F2. The global feature F1 and the local feature F2 are concatenated to construct a mixed feature expression model; an image classifier is trained with the mixed features to improve recognition accuracy of the target area; the image is then extracted according to the coordinates of the target area. Through steps S121 to S124, the global features reflect the overall information of the image while the local features reflect detailed textures; the two complement each other, and the mixed feature model expresses the identifying information of the target area. Finally, positioning and extraction are performed, overcoming the poor target extraction caused by using global features alone and improving the accuracy of image target recognition and extraction.
Step S121 constructs the gray level co-occurrence matrix (GLCM) of the image: the input image A is converted to grayscale, and the gray level co-occurrence matrix GLCM is computed at a given offset distance d and angle θ. Each matrix element represents the probability that pixels with gray values i and j co-occur in the given direction.
Specifically, the detailed scheme for constructing the gray level co-occurrence matrix (GLCM) is as follows. Image preprocessing: the input color image A is converted to a gray image A' with matrix size M x N, and A' is smoothed to remove noise. Parameter setting: an offset distance d is set, typically 1, denoting adjacent pixels; an offset angle θ is set, with the four directions 0°, 45°, 90° and 135° commonly used. GLCM construction: image A' is traversed with offset (d, θ), counting, for each position (x, y), the pixel value i there and the pixel value j at the offset position determined by (d, θ); the number of occurrences of each pixel pair (i, j) is recorded to build a matrix P(i, j) of size L x L, where L is the number of gray levels of the image; P(i, j) is normalized so that its elements sum to 1, yielding the gray level co-occurrence matrix GLCM. The output is a GLCM with parameters (d, θ). The GLCM reflects the gray distribution and transition pattern of the image in a given direction and can be used subsequently to extract statistical features of the image.
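This construction matches what scikit-image's graycomatrix provides; a sketch follows, using d = 1 and the four standard angles from the parameter setting above (symmetric accumulation is an added assumption):

```python
# Hedged sketch of S121: normalized GLCMs at offset d=1 and four angles.
import numpy as np
from skimage.feature import graycomatrix

def build_glcm(gray: np.ndarray, levels: int = 256) -> np.ndarray:
    """gray must be uint8 with values below `levels`."""
    angles = [0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]   # 0/45/90/135 degrees
    return graycomatrix(gray, distances=[1], angles=angles,
                        levels=levels, symmetric=True, normed=True)
```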
Wherein, S122 extracts GLCM features, and based on GLCM matrix, various texture features such as contrast, correlation, etc. can be extracted. These features reflect the texture information of the image, and all the extracted GLCM features are combined to form the gray feature set F1 of the image a.
Specifically, the detailed scheme for extracting GLCM features is as follows. Input data: the gray level co-occurrence matrix GLCM of image A. Feature extraction:
contrast, reflecting the sharpness of the image and the depth of texture grooves, with the formula:

contrast = ∑∑ (i-j)² P(i,j)

correlation, reflecting the correlation of pixel values in the image, with the formula:

correlation = ∑∑ (i-μ)(j-μ) P(i,j) / σ²

entropy, reflecting the complexity of the image texture, with the formula:

entropy = -∑∑ P(i,j) log P(i,j)
The statistical features contrast, correlation and entropy are calculated in turn, and feature fusion concatenates all extracted GLCM features to form the gray feature set F1 of image A; F1 is output for subsequent use. The GLCM feature set comprehensively reflects the overall gray distribution and texture of the image and can effectively represent its global features.
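A sketch of this extraction: contrast and correlation come from scikit-image's graycoprops, while entropy is evaluated directly from the formula above; concatenating over all angles yields F1.

```python
# Hedged sketch of S122: build the gray feature set F1 from a GLCM.
import numpy as np
from skimage.feature import graycoprops

def glcm_features(glcm: np.ndarray) -> np.ndarray:
    contrast = graycoprops(glcm, "contrast").ravel()
    correlation = graycoprops(glcm, "correlation").ravel()
    p = glcm / glcm.sum(axis=(0, 1), keepdims=True)          # ensure normalization
    entropy = -(p * np.log(p + 1e-12)).sum(axis=(0, 1)).ravel()
    return np.concatenate([contrast, correlation, entropy])  # feature set F1
```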
Wherein, S123 constructs a wavelet co-occurrence matrix (WCM) of the image, performs wavelet transformation on the image a to obtain a wavelet coefficient matrix, and calculates the co-occurrence matrix WCM of the wavelet coefficient matrix at a given offset distance and angle.
Specifically, the detailed scheme for constructing the wavelet co-occurrence matrix (WCM) is as follows. Wavelet transform: a 2-D wavelet transform is applied to the input image A using, for example, the Haar wavelet; the image is decomposed into a low-frequency component LL and three high-frequency components LH, HL and HH, giving wavelet coefficient matrices at different directions and scales. Parameter setting: an offset distance d of 1 or 2 is set, and the offset angle θ is set to 0°, 45°, 90° or 135°. WCM construction: the co-occurrence matrix of each sub-band's wavelet coefficient matrix is computed under offset (d, θ) and normalized to obtain the wavelet co-occurrence matrix WCM. The output is a WCM with parameters (d, θ). The WCM reflects local characteristics and fine texture information of the image and facilitates further extraction of local image features.
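Since the WCM is the patent's own construct rather than a library primitive, the sketch below approximates it by quantizing each detail sub-band of a PyWavelets Haar decomposition and reusing a gray co-occurrence routine; the 16-level quantization is an assumption.

```python
# Hedged sketch of S123: co-occurrence matrices over quantized Haar sub-bands.
import numpy as np
import pywt
from skimage.feature import graycomatrix

def build_wcm(gray: np.ndarray, q_levels: int = 16) -> list[np.ndarray]:
    _, (ch, cv, cd) = pywt.dwt2(gray.astype(float), "haar")  # H/V/D detail bands
    wcms = []
    for band in (ch, cv, cd):
        lo, hi = band.min(), band.max()
        q = ((band - lo) / (hi - lo + 1e-12) * (q_levels - 1)).astype(np.uint8)
        wcms.append(graycomatrix(q, distances=[1], angles=[0, np.pi / 2],
                                 levels=q_levels, normed=True))
    return wcms
```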
Wherein, step S124 extracts the WCM features: the WCM matrix reflects the correlation of the image's frequency-domain information, and its features are extracted to form the frequency domain feature set F2 of the image A. Feature set output: the gray feature set F1 and the frequency domain feature set F2 are output together as the feature set of the image A.
Specifically, first, the average gray value μ and the gray variance σ² of the image A' are calculated as part of the gray feature set F1 of the image. Then, the discrete cosine transform is applied to image A' to obtain the DCT coefficient matrix, and the covariance matrix C of the DCT coefficient matrix, i.e. the amplitude spectrum correlation matrix (WCM matrix) of the image, is calculated; the matrix C reflects the correlation of the image's frequency-domain information.
Feature quantities of the WCM matrix are extracted, including the matrix trace tr(C), the determinant det(C), the maximum eigenvalue λ_max, the minimum eigenvalue λ_min, etc., constituting the frequency domain feature set F2 of the image. The global feature sets F1 and F2 obtained above are combined with the deep learning features into an overall feature set of image A, which serves as the input for image target recognition and extraction.
More specifically, the DCT coefficient matrix is the coefficient matrix obtained by applying the discrete cosine transform (Discrete Cosine Transform, DCT). Its role in this feature extraction scheme is as follows: the input gray image A' is discrete-cosine transformed to obtain a representation of the image in the frequency domain. After the DCT, the DCT coefficient matrix of each image block is available; it reflects the block's information in the frequency domain, and its element values describe the distribution of the block over the different frequency components. The covariance matrix of the DCT coefficient matrix, i.e. the amplitude spectrum correlation matrix (WCM) of the image, is then computed; the WCM matrix reflects the correlation between different frequency components of the image in the frequency domain. By analyzing the features of the WCM matrix, global feature information of the image is obtained from the frequency-domain perspective. Finally, combining the DCT coefficient features with the spatial-domain features expresses the image information more comprehensively and improves the recognition and extraction of image targets. The DCT coefficient matrix therefore acts as a bridge between spatial-domain and frequency-domain information in this scheme and provides the basis for extracting the frequency-domain global features: it reflects the spectral distribution of the input image block and is the input for the subsequent WCM matrix analysis.
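A hedged sketch of this frequency-domain step (the small ridge term and the sign-log form of the determinant are numerical-stability choices of the example, not mandated by the text):

```python
import numpy as np
from scipy.fft import dctn

def frequency_features(gray):
    """Frequency-domain feature set F2 from the DCT coefficient matrix of A'."""
    D = dctn(gray.astype(np.float64), norm='ortho')   # image in the frequency domain
    C = np.cov(D)                                     # amplitude spectrum correlation (WCM)
    C += 1e-9 * np.eye(C.shape[0])                    # small ridge keeps det/eigs finite
    eig = np.linalg.eigvalsh(C)                       # eigenvalues of the symmetric C
    sign, logdet = np.linalg.slogdet(C)               # det(C) in stable sign-log form
    return np.array([np.trace(C), sign * logdet, eig.max(), eig.min()])
```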
In summary, the application combines spatial-domain and frequency-domain information, uses co-occurrence matrices to extract the texture features and frequency-domain features of the image, can represent the image content more comprehensively, and helps improve the subsequent recognition and segmentation effects.
FIG. 3 is a flowchart of exemplary sub-steps for constructing a LOF model of a local feature, as shown in FIG. 3, according to some embodiments of the present description, including: s131 builds a gray level LOF model and S132 builds a wavelet LOF model.
The LOF features are used for enhancing local texture and structure information of the video frame, so that the identification capability of local targets of the image is improved; the wavelet transformation can keep image detail information, and the improved LOF model can strengthen the local feature expression of a video frame and enhance the feature extraction capability of a moving target; the gray level LOF model and the wavelet LOF model comprehensively use the gray level and frequency domain information of the video frame, and the expression of local features is enhanced through clustering and density analysis, so that the video image can be more accurately understood and analyzed, and the subsequent target recognition and tracking are facilitated.
Specifically, S131 builds a gray level LOF model: the input image A is converted to grayscale; the Euclidean distance between each pixel of the image and the feature vectors in the predefined feature set F1 is calculated to obtain the local reachability distance (Local Reachability Distance, LRD) of each pixel; a local outlier factor (Local Outlier Factor, LOF) is calculated for each pixel based on the local reachability distance; the LOF reflects the local density difference of the pixel relative to its k nearest neighbors. The LOF of each pixel is taken as a feature to construct the gray LOF model.
The detailed technical scheme for constructing the gray LOF model is as follows: the input image A is converted into a grayscale image A' by graying. A predefined gray feature set F1 is constructed, comprising m d-dimensional gray feature vectors {f₁, f₂, …, f_m}. For each pixel x in image A', the Euclidean distance to each feature vector f_i in F1 is calculated:

dist(x, f_i) = ‖x − f_i‖₂
For each pixel x, the nearest k feature vectors are selected as its k neighbors, where k is a preset neighbor count. The reachability distance of pixel x within its k-nearest neighbors is defined as:

RD_k(x) = max(dist(x, f_i)), f_i ∈ k-neighbor set

The local reachability distance of pixel x relative to feature vector f_i is defined as:

LRD(x, f_i) = max(dist(x, f_i), RD_k(x))

For each pixel x, its local outlier factor (LOF) is calculated as:

LOF(x) = min(LRD(x, f_i) / LRD(f_i, f_i)), f_i ∈ k-neighbor set
The LOF of each pixel in image A' is taken as its new feature value, forming a gray feature set of image A based on the LOF algorithm, which is input into the classification/recognition model; this constitutes the gray LOF model based on local outlier factors. By detecting the local density variation at each pixel, the model strengthens the feature expression of local image patterns and improves the subsequent image recognition effect.
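The definitions above can be implemented literally in a few lines of numpy; the following sketch is illustrative only (the function name, the default k and the small epsilon guarding division are assumptions of the example):

```python
import numpy as np

def gray_lof(pixels, F1, k=5):
    """Per-pixel LOF following the definitions above.

    pixels: (n, d) per-pixel gray feature vectors x; F1: (m, d) set {f_1, ..., f_m}.
    """
    D = np.linalg.norm(pixels[:, None, :] - F1[None, :, :], axis=2)  # dist(x, f_i), (n, m)
    idx = np.argsort(D, axis=1)[:, :k]               # indices of the k neighbors of each x
    knn = np.take_along_axis(D, idx, axis=1)         # dist(x, f_i) over those neighbors
    rd_k = knn.max(axis=1, keepdims=True)            # RD_k(x)
    lrd = np.maximum(knn, rd_k)                      # LRD(x, f_i)
    # LRD(f_i, f_i) = max(dist(f_i, f_i), RD_k(f_i)) = RD_k(f_i), since dist(f_i, f_i) = 0.
    Df = np.linalg.norm(F1[:, None, :] - F1[None, :, :], axis=2)
    rd_k_f = np.sort(Df, axis=1)[:, 1:k + 1].max(axis=1)   # RD_k over F1, skipping self
    return (lrd / (rd_k_f[idx] + 1e-12)).min(axis=1)       # LOF(x)
```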
Specifically, S132 builds a wavelet LOF model: wavelet transformation is performed on the input image A, and the micro-texture wavelet features of the image are extracted using the local binary pattern and the Haar wavelet; for each pixel, the distances to its K nearest points in the feature space are calculated, giving K distance features. The local density of each pixel is calculated from its K-distance features, and an improved local outlier factor is constructed in combination with the distances; the wavelet LOF model takes the improved LOF of each pixel as its feature.
The detailed technical scheme for constructing the wavelet LOF model is as follows: wavelet transformation is performed on the input image A, and the wavelet features of the image are extracted using the local binary pattern and the Haar wavelet; the image is represented as a feature space consisting of N pixels; for each pixel x, its Euclidean distance to the K nearest points in the feature space is calculated:
d(x, x_i), i = 1, 2, …, K
Using the KNN method, a hypersphere centered at x is constructed so that it just encloses the K neighbors of x, and its volume V_kx is calculated; the K-distance reachable density of each pixel x is defined as:
ρ_k(x) = K / V_kx
The local density lrd(x) of each pixel x is defined as the mean of the K-distance reachable densities ρ_k(x_i) of its K neighbors:
lrd(x) = average(ρ_k(x_i)), where the x_i are the K neighbors of x.
The improved LOF of each pixel x is defined as:

LOF′(x) = average(ρ_k(x_i)) / lrd(x)
The improved LOF′ of each pixel in image A is taken as its new feature, forming the wavelet features of image A based on the improved LOF algorithm; these wavelet features are input into the classification/recognition model, constructing the wavelet LOF model based on the improved local outlier factor. The model strengthens the representation of image micro-textures, improves the detection of small targets and hidden patterns, and enhances the robustness of image recognition.
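A sketch of the improved LOF′ under stated assumptions: the hypersphere volume uses the standard d-dimensional ball formula in log form, and — since the literal ratio of neighbor-mean density to lrd(x) would be identically 1 when lrd(x) is itself that neighbor mean — the score below compares neighbor densities to the density at x, the usual LOF convention; this reading is an interpretation, not the patent's wording:

```python
import numpy as np
from scipy.special import gammaln
from sklearn.neighbors import NearestNeighbors

def improved_lof(feats, K=5):
    """Improved LOF' over per-pixel wavelet feature vectors of shape (n, d)."""
    n, d = feats.shape
    nn = NearestNeighbors(n_neighbors=K + 1).fit(feats)
    dist, idx = nn.kneighbors(feats)              # column 0 is each point itself
    r = dist[:, -1] + 1e-12                       # radius just enclosing the K neighbors
    # log-volume of a d-dimensional ball: pi^(d/2) / Gamma(d/2 + 1) * r^d
    log_vol = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1) + d * np.log(r)
    rho = K * np.exp(-log_vol)                    # K-distance reachable density ρ_k(x)
    neigh = idx[:, 1:]                            # the K neighbors of each pixel
    return rho[neigh].mean(axis=1) / rho          # LOF'(x): neighbor density vs own density
```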
In conclusion, compared with the expression based on the global features, the LOF model fully combines the local features and the density difference, can better reflect the abnormal and detailed information in the image, and is beneficial to improving the effect of image target identification and extraction. The scheme not only comprises an LOF model of a gray level image, but also comprises an improved LOF model of wavelet characteristics, and the robustness can be improved by combined use.
FIG. 4 is a flowchart of exemplary sub-steps for acquiring a final clustered image C, as shown in FIG. 4, according to some embodiments of the present description, including: S141 performs first clustering, S142 performs second clustering, S143 majority vote fusion, S144 same-category matching, S145 outlier judgment and S146 generate a final clustering result.
Wherein, S141 first clustering: the images are preliminarily clustered with the constructed gray LOF model to obtain the first clustering result; S142 second clustering: the first clustering result is re-clustered with the wavelet LOF model, improving clustering precision; S143 majority vote fusion: the two clustering results are combined by a majority voting algorithm, improving clustering stability; S144 same-category matching: same-class connected regions are matched, improving clustering connectivity; S145 outlier judgment: outliers in the image are judged from the clustering results; S146 final cluster generation: the final image clustering result is obtained from the above processing. Multi-model fusion improves clustering accuracy, multi-level clustering improves stability, and local features are combined to analyze image details, finally producing a clustering result with strong recognition capability for the image target area. This achieves the technical aim of improving the extraction and recognition of the image target area by combining local and global features.
Specifically, S141 first clustering: a gray level co-occurrence matrix is constructed with GLCM and the global features of the image are extracted; the local outlier factor LOF of each pixel is calculated and the gray LOF model is built; on the gray LOF model, preliminary clustering with the DBSCAN algorithm generates the first clustering result B1. S142 second clustering: the local features of the image are extracted through wavelet transformation and the wavelet co-occurrence matrix WCM is constructed; the improved local outlier factor iLOF is calculated and the wavelet LOF model is built; on the wavelet LOF model, secondary clustering with the DBSCAN algorithm generates the second clustering result B2. S143 majority vote fusion: the proportion of each category's pixels in B1 and B2 is calculated and used as that category's weight; each pixel is then weighted for majority voting, yielding the intermediate clustering result. S144 same-category matching: texture features of each cluster category of B1 and B2 are extracted, feature similarities are calculated, matching categories are identified and merged into the same category. S145 outlier judgment: outlier scores of all categories are calculated with the Isolation Forest algorithm, and categories with high scores are judged to be outliers. S146 final cluster generation: a Gaussian mixture model is constructed to optimize the intermediate clustering result, the clustering boundary is refined with the Grab Cut algorithm, and the final clustered image is output.
More specifically, the detailed technical scheme for performing the first clustering with GLCM, LOF and DBSCAN is as follows: the input image A is grayed to obtain the gray image A'; the gray level co-occurrence matrix GLCM of A' is calculated, and statistical features such as contrast, entropy and cross-correlation are extracted to form the global image feature F_G; for each pixel in A', the local reachability distance and the local outlier factor LOF are calculated as its local feature F_L; the global feature F_G and the local feature F_L are concatenated to obtain the mixed feature representation F of image A;
on the feature space F, preliminary clustering is performed with the DBSCAN algorithm: a distance threshold Eps and a minimum sample count MinPts are set; for each unvisited sample point P, it is checked whether its Eps-neighborhood contains at least MinPts sample points; if so, P and the points in its Eps-neighborhood are marked as the same cluster; if not, P is marked as a noise point; the process repeats until all sample points have been visited; finally the first clustering result B1 is obtained, comprising several clusters and noise points.
B1 reflects the overall structure and distribution characteristics of the image in terms of gray scale, texture and local patterns; this first round of clustering retains relatively complete image information and provides the basis for subsequent clustering, after which a finer-grained round can be performed to obtain a more accurate classification result. The second clustering, built analogously on the wavelet features (WCM, improved LOF) with DBSCAN, follows the same scheme and is not repeated here.
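A minimal sketch of the preliminary clustering, assuming per-pixel global and local feature arrays and illustrative Eps/MinPts values:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def first_clustering(F_G, F_L, eps=0.5, min_pts=10):
    """Preliminary DBSCAN clustering on the mixed representation F = [F_G | F_L]."""
    F = np.hstack([F_G, F_L])     # (n_pixels, d): concatenated global + local features
    return DBSCAN(eps=eps, min_samples=min_pts).fit_predict(F)  # -1 marks noise points
```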
In summary, the scheme fully fuses multi-source clustering results and performs multiple rounds of optimization, so it can effectively extract the target area, filter background and noise, and accurately identify and acquire images. Through multi-view clustering, multi-step optimization and fusion, and the combination of probabilistic and graph models, it exploits the strengths of pattern recognition and image processing technologies and strengthens the understanding and analysis of complex images.
More specifically, S143 majority vote fusion includes: acquiring the pixel proportion of each category in the first clustering result and the second clustering result; giving weight to each category according to the pixel proportion; weighted voting is performed to determine the intermediate class of pixels.
The specific technical scheme comprises the following steps: all cluster categories C_1, C_2, C_3, …, C_n in the first clustering result B1 and the second clustering result B2 are collected. For each category C_i, its pixel count P_i1 in B1 and P_i2 in B2 are calculated, and its pixel proportion over the whole image is computed: W_i = (P_i1 + P_i2) / N, where N is the total number of pixels. W_i is used as the weight of category C_i for weighted majority voting: for each pixel point P in the image, note that P belongs to category C_1 in B1 and to category C_2 in B2; the corresponding weights W_1 and W_2 are computed; if W_1 > W_2, P belongs to category C_1 in the majority vote result; if W_2 > W_1, P belongs to category C_2; if W_1 = W_2, the category of P is assigned randomly. Applying this operation to all pixel points yields the majority voting result as the intermediate clustering result.
In statistical decision theory, majority voting improves the robustness of a decision; by fully combining the results of the two clustering rounds, the scheme improves decision reliability and effectively reduces errors.
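A sketch of the weighted vote, assuming B1 and B2 are per-pixel label maps with non-negative integer ids (DBSCAN noise relabeled beforehand); ties are resolved toward B1 here to keep the example deterministic, rather than randomly as in the text:

```python
import numpy as np

def majority_vote(B1, B2):
    """Weighted majority vote between two label maps of equal shape."""
    N = B1.size
    n_labels = int(max(B1.max(), B2.max())) + 1
    # W_i = (P_i1 + P_i2) / N, the pixel proportion of each category over both results.
    W = (np.bincount(B1.ravel(), minlength=n_labels)
         + np.bincount(B2.ravel(), minlength=n_labels)) / N
    w1, w2 = W[B1], W[B2]                   # per-pixel weights of the two candidate labels
    return np.where(w1 >= w2, B1, B2)       # intermediate clustering result
```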
More specifically, S144 same-category matching includes: extracting the image texture features of each category in the first and second clustering results; obtaining the similarity between features as a matching measure; and judging categories whose similarity exceeds the threshold to be matching categories.
For each category C1 in the first clustering result B1, its image texture features F1 are extracted, including statistical features such as contrast and homogeneity. For each category C2 in the second clustering result B2, the image texture features F2 are extracted, using the same features as F1. The feature similarity between each C1 and each C2 is calculated:
sim(C1, C2) = 1 − distance(F1, F2) / maximum distance
distance(·) is the distance function, which may be the Euclidean distance, the Mahalanobis distance, or similar. With sim(C1, C2) as the matching measure of C1 and C2 and a threshold τ, C1 and C2 are judged to be matching categories when sim(C1, C2) > τ. Applying this operation to all combinations of C1 and C2 yields the final same-category matching result.
Samples of the same category are highly similar in texture, which can be used to judge category correspondence; judging the matching relation of categories through texture features effectively fuses the two rounds of clustering results.
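A sketch of the matching step, assuming one texture feature vector per category and an illustrative threshold τ:

```python
import numpy as np

def match_categories(feats1, feats2, tau=0.8):
    """Pairs (i, j) of categories from B1 and B2 whose similarity exceeds tau."""
    D = np.linalg.norm(feats1[:, None, :] - feats2[None, :, :], axis=2)
    sim = 1.0 - D / (D.max() + 1e-12)       # sim = 1 - distance / maximum distance
    return [(i, j) for i in range(len(feats1))
            for j in range(len(feats2)) if sim[i, j] > tau]
```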
More specifically, S145 outlier judgment includes: feature extraction, in which features representing color, texture and shape are extracted for each category in the image; color features may use color histograms, color moments, etc.; texture features may use GLCM, LBP and similar algorithms; shape features may be described by area, perimeter, compactness, etc.; a multidimensional feature vector is constructed for each category. Outlier scoring, in which the feature vector of each category is processed with the Isolation Forest algorithm: the feature space is partitioned randomly and recursively, and the path length needed to isolate each point is recorded; the shorter the path, the more easily the point is isolated and the higher its outlier score; an outlier score is thus computed for each category. Outlier judgment, in which a category whose outlier score exceeds a preset threshold is judged to be an outlier; the threshold can be determined by cross-validation, and the final outlier categories are output.
In conclusion, compared with outlier judgment based on a global model, this approach expresses each category's local features multidimensionally and then uses the Isolation Forest algorithm to judge their relative density, so various types of outliers can be detected more accurately.
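A sketch using scikit-learn's Isolation Forest; score_samples returns higher values for inliers, so the sign is flipped and rescaled before thresholding (the rescaling and the threshold value are assumptions of the example):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def outlier_categories(cat_feats, threshold=0.6):
    """Indices of categories judged as outliers; cat_feats is (n_categories, d)."""
    forest = IsolationForest(n_estimators=100, random_state=0).fit(cat_feats)
    raw = -forest.score_samples(cat_feats)            # higher now means more anomalous
    score = (raw - raw.min()) / (raw.max() - raw.min() + 1e-12)
    return np.flatnonzero(score > threshold)
```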
More specifically, S146 generating the final clustering result includes: Gaussian mixture model construction, in which, with the intermediate clustering result as the training set, the feature mean vector μ_k and covariance matrix Σ_k of each cluster region are computed; each cluster is represented by a Gaussian distribution G_k(x | μ_k, Σ_k), and the Gaussians of all clusters form the mixture model G(x). Bayesian classification, in which the class-conditional probability p(x | G_k) under each Gaussian G_k is computed for every pixel x of the image; the prior probability p(G_k) of each cluster is estimated, and the posterior probability of each pixel belonging to each cluster is calculated as p(G_k | x) = p(x | G_k) p(G_k) / p(x); each pixel is assigned to the cluster with the highest posterior probability. Grab Cut optimization, in which the Grab Cut algorithm is initialized with the Bayesian classification result; clear foreground and background regions are marked; graph-cut segmentation is iterated, updating the classification of each pixel; small connected cluster components are merged and cluster boundaries are smoothed; after convergence, the optimized image clustering result is output. Probability modeling with a mixture model, combined with iterative Grab Cut optimization, produces a more accurate image clustering result with clearer boundaries.
In summary, compared with a clustering method based on global features, the method utilizes local pixel information to construct a hybrid model, combines Grab Cut to perform local optimization, and can generate a clustering result with clear boundaries and accurate classification.
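A sketch of this final step, assuming row-major per-pixel features aligned with an 8-bit BGR image; the GMM is seeded with the intermediate clusters' means, and GrabCut then refines the boundary of one chosen cluster (function and parameter names are illustrative):

```python
import cv2
import numpy as np
from sklearn.mixture import GaussianMixture

def refine_cluster(img_bgr, pixel_feats, inter_labels, target, n_clusters):
    """GMM posterior classification seeded by the intermediate result, then GrabCut."""
    means = np.array([pixel_feats[inter_labels == k].mean(axis=0)   # μ_k per cluster
                      for k in range(n_clusters)])
    gmm = GaussianMixture(n_components=n_clusters, means_init=means).fit(pixel_feats)
    post = gmm.predict(pixel_feats)               # maximum-posterior cluster per pixel
    h, w = img_bgr.shape[:2]
    mask = np.where(post.reshape(h, w) == target,
                    cv2.GC_PR_FGD, cv2.GC_PR_BGD).astype(np.uint8)
    bgd, fgd = np.zeros((1, 65), np.float64), np.zeros((1, 65), np.float64)
    cv2.grabCut(img_bgr, mask, None, bgd, fgd, 5, cv2.GC_INIT_WITH_MASK)
    return np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD))   # refined foreground of `target`
```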
Fig. 5 is an exemplary block diagram of an image recognition-based image acquisition system according to some embodiments of the present description, as shown in fig. 5, an image recognition-based image acquisition system 200 includes: an image preprocessing module 210, a feature extraction module 220, a local feature modeling module 230, and an image clustering module 240.
The image preprocessing module 210 performs preprocessing operations, including filtering, correction and the like, on the input original image and outputs the preprocessed image A. The feature extraction module 220 constructs the gray level co-occurrence matrix GLCM and the wavelet co-occurrence matrix WCM for image A; based on the GLCM and WCM matrices, the gray features and frequency-domain features of image A are extracted respectively, and the combined feature set F is output. The local feature modeling module 230 uses the feature set F to construct the gray LOF model of image A from the local reachability distance and the local outlier factor, and constructs the wavelet LOF model of image A using the K-nearest-neighbor features and the improved local outlier factor; the two models together enhance local feature expression. The image clustering module 240 takes the LOF models as input and clusters with the DBSCAN density clustering algorithm, which makes full use of local density information, and outputs the final clustering result image C.
The system combines the local feature modeling of the image and the local clustering method based on density, does not depend on global feature expression, can effectively identify and extract a target area in the image, and improves the local feature characterization capability.
Wherein, the feature extraction module 220 comprises: a GLCM feature extraction unit, used to construct the gray level co-occurrence matrix GLCM of the image under a given offset distance d and angle θ, compute statistical features such as contrast, correlation and entropy, and form the GLCM feature vector F1; and a WCM feature extraction unit, used to perform the wavelet transform, compute the wavelet coefficients at each scale, construct the co-occurrence matrix WCM of the wavelet coefficients, compute statistical features such as entropy, and form the WCM feature vector F2. F1 and F2 are concatenated into the final feature vector F.
In summary, the application fully combines the local and global feature information of the image, improves the recognition and expression capability of the target area, achieves the effect of accurately extracting the target of interest in the image, and solves the problem of insufficient feature expression capability caused by using only global features.
Specifically, the local feature modeling module 230 includes: a gray LOF modeling unit, which grays the input image A to obtain a gray image, calculates the local reachability distance between each pixel of the image and the feature vectors in the GLCM feature set F1, calculates the local outlier factor LOF of each pixel from these distances, and constructs the gray LOF model with each pixel's LOF as its feature; and a wavelet LOF modeling unit, which applies local binary pattern processing and the Haar wavelet transform to image A to extract wavelet features, calculates the K-distance minimum of each pixel as a local feature, calculates the improved local outlier factor iLOF from the density and distance of the local features, and constructs the wavelet LOF model with each pixel's iLOF as its feature.
In conclusion, the two LOF models can fully model local feature information in an image, and by analyzing the density and distance relation between each pixel point and a neighborhood point and calculating an outlier factor to perform feature expression, the recognition capability of the local target of the image can be improved, and support is provided for subsequent image extraction. Compared with the direct use of pixel information, LOF features can abstract local features with more expressive power, so that the extraction of a target region is more accurate.
The application and its embodiments have been described above schematically; the description is not limiting, and the application may be implemented in other specific forms without departing from its spirit or essential characteristics. What is shown in the drawings is only one embodiment of the application, the actual construction is not limited to it, and any reference sign in the claims does not limit the claims. Those skilled in the art will appreciate that other embodiments can be devised without departing from the spirit of the application, and such embodiments fall within the scope of protection of the appended claims. In addition, the word "comprising" does not exclude other elements or steps, and the word "a" or "an" preceding an element does not exclude a plurality of such elements. The elements recited in the product claims may also be implemented in software or hardware. The terms first, second, etc. are used to denote names and do not denote any particular order.

Claims (10)

1. An image acquisition method based on image recognition comprises the following steps:
acquiring an original image, preprocessing the original image, and outputting a preprocessed image A;
constructing a symbiotic matrix and acquiring a feature set F of the image A;
combining the feature set F to construct a LOF model of the local feature;
performing cluster analysis on the LOF model by using DBSCAN to obtain a final cluster image C;
wherein the symbiotic matrix is GLCM matrix and WCM matrix.
2. The image acquisition method according to claim 1, characterized in that:
acquiring the feature set of the image a includes:
constructing a gray level co-occurrence matrix GLCM of the image under a given offset distance and angle;
extracting gray features of the image A by utilizing a co-occurrence matrix GLCM, and outputting a GLCM feature set F1;
constructing a co-occurrence matrix WCM of wavelet transformation and wavelet coefficients of an image;
extracting frequency domain features of the image A by utilizing the co-occurrence matrix WCM, and outputting a WCM feature set F2;
wherein the gray scale features comprise contrast and correlation features; the co-occurrence matrix WCM is used to reflect the frequency domain features of the image.
3. The image acquisition method according to claim 1, characterized in that:
constructing the LOF model of the local feature includes:
combining the image A and the feature set F1 to construct a gray LOF model;
Combining the image A and the feature set F2 to construct a wavelet LOF model;
wherein, the construction of the gray LOF model comprises the following steps:
carrying out graying treatment on the image A;
obtaining the distance between each pixel in the image A and the feature set F1 to obtain a local reachability distance;
calculating local outlier factors according to the local reachability distances;
constructing a gray level LOF model characterized by local outlier factors;
wherein, constructing the wavelet LOF model comprises:
extracting wavelet features of the image A by adopting a local binary pattern and Haar wavelet transformation;
calculating a K distance minimum value of each pixel to extract local features;
constructing an improved local outlier factor according to the density and the distance;
a wavelet LOF model featuring an improved local outlier factor is constructed.
4. The image acquisition method according to claim 1, characterized in that:
acquiring the final clustered image C includes:
performing first clustering, and clustering the gray LOF model by using a DBSCAN algorithm to generate a first clustering result B1;
performing second clustering, and clustering the wavelet LOF model by using a DBSCAN algorithm to generate a second clustering result B2;
performing majority voting fusion, and performing majority voting on a first clustering result B1 and a second clustering result B2 of each pixel to obtain an intermediate clustering result;
matching the same category, determining possible matching categories across the two clustering results, and fusing them into the same category in the intermediate clustering result;
judging the outlier, and judging the class with the pixel number proportion lower than the threshold value as the outlier;
and generating a final clustering result, fusing the results of majority voting, same-category matching and outlier judgment to output the final clustered image C.
5. The image acquisition method according to claim 4, wherein:
the majority vote fusion includes:
acquiring the pixel proportion of each category in the first clustering result and the second clustering result;
giving weight to each category according to the pixel proportion;
weighted voting is performed to determine the intermediate class of pixels.
6. The image acquisition method according to claim 4, wherein:
the peer-to-peer matching includes:
extracting image texture features of each category in the first clustering result and the second clustering result;
obtaining similarity between features as a matching measure;
the category having the similarity greater than the threshold value is determined as the matching category.
7. The image acquisition method according to claim 4, wherein:
the outlier judgment comprises the following steps:
extracting color, texture and shape features of each category;
Calculating outlier scores using an isolated forest algorithm;
categories with outliers above the threshold are determined to be outliers.
8. The image acquisition method according to claim 4, wherein:
generating the final clustering result includes:
constructing a Gaussian mixture model, and taking the intermediate clustering result as a training set to obtain a clustering type model;
classifying each pixel into a category with the maximum posterior probability according to a Bayesian decision theory;
carrying out clustering boundary optimization by using a Grab Cut algorithm;
and outputting the optimized final clustering image C.
9. An image acquisition system based on image recognition, comprising:
the image preprocessing module is used for preprocessing an original image and outputting a preprocessed image A;
the feature extraction module is used for constructing a co-occurrence matrix to obtain a feature set F of the image A, wherein the co-occurrence matrix comprises a GLCM matrix and a WCM matrix;
the local feature modeling module is used for combining the feature set F to construct an LOF model of the local feature;
the image clustering module is used for carrying out cluster analysis on the LOF model by using a DBSCAN algorithm to obtain a final clustered image C;
wherein, the feature extraction module includes:
the GLCM feature extraction unit is used for constructing a gray level co-occurrence matrix GLCM of the image under a given offset distance and angle, extracting gray level features of the image A by using the GLCM and outputting a GLCM feature set F1;
The WCM feature extraction unit is used for constructing a co-occurrence matrix WCM of wavelet transformation and wavelet coefficients of the image, extracting frequency domain features of the image A by utilizing the WCM and outputting a WCM feature set F2;
wherein the local feature modeling module comprises:
the gray LOF modeling unit is used for combining the image A and the GLCM feature set F1 to construct a gray LOF model;
and the wavelet LOF modeling unit is used for combining the image A and the WCM feature set F2 to construct a wavelet LOF model.
10. The image acquisition system of claim 9, wherein:
the gray level LOF modeling unit performs graying processing on the image A, calculates the local reachability distance between each pixel and the feature set F1, and calculates local outlier factors according to these distances to construct a gray level LOF model characterized by the local outlier factors;
the wavelet LOF modeling unit extracts wavelet features of the image A by using the local binary pattern and the Haar wavelet transform, calculates the K-distance minimum value of each pixel to extract local features, constructs an improved local outlier factor according to density and distance, and constructs a wavelet LOF model characterized by the improved local outlier factor.
CN202310997304.9A 2023-08-09 2023-08-09 Image acquisition method and system based on image recognition Pending CN116977679A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310997304.9A 2023-08-09 2023-08-09 Image acquisition method and system based on image recognition

Publications (1)

Publication Number Publication Date
CN116977679A true CN116977679A (en) 2023-10-31

Family

ID=88471296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310997304.9A Pending CN116977679A (en) 2023-08-09 2023-08-09 Image acquisition method and system based on image recognition

Country Status (1)

Country Link
CN (1) CN116977679A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117611109A (en) * 2024-01-24 2024-02-27 福建亿安智能技术股份有限公司 Method and system for monitoring and managing illegal article delivery information
CN117611109B (en) * 2024-01-24 2024-04-26 福建亿安智能技术股份有限公司 Method and system for monitoring and managing illegal article delivery information
CN117808703A (en) * 2024-02-29 2024-04-02 南京航空航天大学 Multi-scale large-scale component assembly gap point cloud filtering method
CN117808703B (en) * 2024-02-29 2024-05-10 南京航空航天大学 Multi-scale large-scale component assembly gap point cloud filtering method
CN118072206A (en) * 2024-04-24 2024-05-24 中国科学院水生生物研究所 Unmanned aerial vehicle remote sensing image target recognition method based on whale habitat selection model
CN118570707A (en) * 2024-08-01 2024-08-30 东方电子股份有限公司 Isolation switch knife switch motion state analysis method and system based on image processing
CN118570707B (en) * 2024-08-01 2024-10-08 东方电子股份有限公司 Isolation switch knife switch motion state analysis method and system based on image processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20231031