CN113326790A - Capsule robot drain pipe disease detection method based on abnormal detection thinking - Google Patents


Info

Publication number
CN113326790A
Authority
CN
China
Prior art keywords
image
abnormal
detection
disease
gabor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110647069.3A
Other languages
Chinese (zh)
Inventor
李清泉
臧翀
王全
朱家松
刘志
方旭
朱松
王维康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Zhiyuan Space Innovation Technology Co ltd
Shenzhen Huanshui Pipe Network Technology Service Co ltd
Original Assignee
Shenzhen Zhiyuan Space Innovation Technology Co ltd
Shenzhen Huanshui Pipe Network Technology Service Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Zhiyuan Space Innovation Technology Co ltd, Shenzhen Huanshui Pipe Network Technology Service Co ltd filed Critical Shenzhen Zhiyuan Space Innovation Technology Co ltd
Priority to CN202110647069.3A
Publication of CN113326790A

Classifications

    • G06V 20/46 — Extracting features or characteristics from video content, e.g. video fingerprints, representative shots or key frames
    • G06F 18/2321 — Non-hierarchical clustering techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 — Non-hierarchical clustering with a fixed number of clusters, e.g. K-means clustering
    • G06F 18/2411 — Classification based on the proximity to a decision surface, e.g. support vector machines
    • G06F 18/24323 — Tree-organised classifiers
    • G06N 3/088 — Non-supervised learning, e.g. competitive learning
    • G06V 10/44 — Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections
    • G06V 20/41 — Higher-level, semantic clustering, classification or understanding of video scenes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for detecting diseases in drain pipes with a capsule robot, based on the idea of anomaly detection. The method comprises the following steps: S10, acquiring a captured video file; S20, inputting an image and dividing it into a 4 × 4 grid of image blocks; S30, computing the LBP, GLCM and HOG feature values of each image block, designing Gabor filters, and computing the Gabor feature values; S40, combining the feature values obtained in step S30 into a feature combination data set; S50, calculating the probability of each feature value; S60, selecting a cross-validation data set, carrying out steps S30–S50 in sequence to obtain the total probability value of each image, and then determining a threshold ε on the cross-validation set; S70, inputting a new sample image and calculating its total probability value p: the sample image is considered abnormal when p < ε and not abnormal when p > ε. The beneficial effects of the invention are: anomaly cluster detection is carried out on the extracted feature data set, realising pipeline disease detection and outputting a disease type label.

Description

Capsule robot drain pipe disease detection method based on abnormal detection thinking
Technical Field
The invention relates to the technical field of drainage pipeline inspection, and in particular to a capsule robot drain pipe disease detection method based on the idea of anomaly detection.
Background
The underground pipe network is an important piece of urban infrastructure and a lifeline for the safe operation of a city. However, during construction and use, ever more pipe network problems are exposed by factors such as rapid urban development, loads exceeding design limits, ageing facilities and insufficient maintenance. In recent years, urban disasters such as waterlogging, environmental pollution and even surface collapse caused by underground pipe network diseases have occurred frequently, causing great casualties and economic losses. According to Shenzhen ground subsidence statistics, nearly 90% of urban ground subsidence is caused by various underground pipe network diseases; carrying out large-scale, regular survey work on underground pipe network diseases is therefore important for effectively preventing pipe network accidents.
Detection by pipeline inspection instruments (such as the pipeline closed-circuit television inspection system, CCTV, and the pipeline periscope, QV) has become one of the main means of inspecting urban drainage networks. During CCTV or QV inspection, a video analyst reviews the recorded pipeline images to find diseases in the pipeline and classifies and labels them. However, analysing the diseases in the video manually is far from sufficient: the volume of CCTV or QV pipeline image data is extremely large, and manual screening is time-consuming and labour-intensive. In recent years, with the development of image recognition and artificial intelligence, research at home and abroad has proposed automatic pipeline disease recognition based on deep learning. In this approach, a large number of manually labelled drainage pipeline disease images are input into a deep learning model as training samples; the collected pipeline images are then input into the trained model for recognition, and disease type labels are output. This reduces the workload of manual screening and improves disease identification efficiency. However, the approach is a supervised learning process: the recognition effect depends on the input sample images, and good recognition can only be obtained when enough samples are provided, covering different pipes, different pipeline environments and different disease types. Drainage pipelines built in different Chinese cities and in different eras differ markedly, pipeline environments are complex, disease types are diverse, and entirely new disease types often appear.
In such cases, disease recognition based on supervised learning can hardly achieve full coverage of disease sample data, which leads to missed and false detections of pipeline diseases and degrades the recognition effect.
Therefore, drainage pipeline disease detection based on deep learning still needs to be improved and developed.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a capsule robot drain pipe disease detection method based on the idea of anomaly detection.
The technical scheme adopted by the invention to solve the above technical problem is a capsule robot drain pipe disease detection method based on the idea of anomaly detection, characterised by comprising the following steps:
s10, acquiring a shot video file, and framing the video file into a sequence image of the inner wall of the drainage pipe;
s20, inputting an image, and dividing the image into 4-by-4 image blocks;
s30, respectively calculating an LBP characteristic value, a GLMC characteristic value and a HOG characteristic value of the image block, designing a Gabor filter, and calculating a Gabor characteristic value;
s40, combining the characteristic values obtained in the step S30 to obtain a characteristic combination data set;
s50, calculating a probability formula of each characteristic value;
s60, selecting a cross validation data set, sequentially carrying out the steps S30-S50 to obtain the total probability value of each image, and then determining a threshold value epsilon through the cross validation set;
and S70, inputting a new sample image, calculating the total probability value p of the sample image, considering that the sample image is abnormal when p < epsilon, and judging that the sample image is not abnormal when p > epsilon.
Further, step S40 specifically includes the following steps:
combining the feature values obtained in step S30 to obtain a feature combination data set (x1, x2, x3, …, xn); given an m×n-dimensional training set, it is converted into an n-dimensional Gaussian distribution, and the distribution of the m training samples is analysed to obtain the probability density function of the training set, i.e. the mathematical expectation μ and variance σ² of the training set in each dimension; the mathematical expectation μj and variance σj² in the j-th dimension are calculated as:

μj = (1/m) · Σ_{i=1..m} xj^(i)

σj² = (1/m) · Σ_{i=1..m} (xj^(i) − μj)²

where xj^(i) represents the j-th dimension feature data of the i-th sample.
Further, in step S50, when a new point x is given, the probability p of the point under the Gaussian distribution is determined; the probability p is calculated as:

p(x) = Π_{j=1..n} p(xj; μj, σj²) = Π_{j=1..n} (1 / (√(2π)·σj)) · exp(−(xj − μj)² / (2σj²))
further, in step S10, the inside of the drain pipe is photographed by the fish-eye lens of the capsule robot, and a video file is obtained.
Further, in step S30, the LBP feature value is calculated as follows:

LBP = Σ_{p=0..7} s(I(p) − I(c)) · 2^p

where p denotes the p-th pixel (other than the centre pixel) in the 3 × 3 window; I(c) is the grey value of the centre pixel and I(p) the grey value of the p-th pixel in the neighbourhood; s(x) is defined as:

s(x) = 1 if x ≥ 0; s(x) = 0 otherwise.
further, in step S30, the image feature of the GLMC feature value is expressed as:
Figure BDA0003110338690000033
Figure BDA0003110338690000034
Figure BDA0003110338690000035
Figure BDA0003110338690000036
Figure BDA0003110338690000037
Figure BDA0003110338690000038
wherein P isi,jRepresenting the number or frequency of occurrences of two pixels with gray levels i and j respectively,
Figure BDA0003110338690000039
further, in step S30, Gabor filters are designed, and 24 Gabor filters are formed by selecting 4 sizes and 6 directions.
Further, the Gabor filters are convolved with the image to obtain the Gabor features; the two-dimensional Gabor function is expressed as follows:

g(x, y; λ, θ, ψ, σ, γ) = exp(−(x′² + γ²y′²) / (2σ²)) · exp(i·(2π·x′/λ + ψ))

real part: exp(−(x′² + γ²y′²) / (2σ²)) · cos(2π·x′/λ + ψ)

imaginary part: exp(−(x′² + γ²y′²) / (2σ²)) · sin(2π·x′/λ + ψ)

x′ = x·cosθ + y·sinθ;

y′ = −x·sinθ + y·cosθ;

where x and y denote the pixel coordinate position; λ is the wavelength of the filter, in pixels; θ is the orientation of the Gabor kernel, specifying the direction of the parallel stripes of the Gabor function; ψ is the phase offset, ranging from −180° to 180°, where 0° and 180° correspond to the centre-symmetric centre-on and centre-off functions respectively, and −90° and 90° to the anti-symmetric functions; σ is the standard deviation of the Gaussian envelope; γ is the aspect ratio: when γ = 1 the support is circular, and when γ < 1 it is elongated along the direction of the parallel stripes.
Further, λ = 3, σ = 0.56λ, γ = 0.5, θ = 60° and ψ = 90°.
Further, in step S30, the HOG feature is a feature descriptor used for object detection in computer vision and image processing; it is formed by computing and accumulating histograms of gradient directions over local regions of the image. The gradient magnitude and direction of an image pixel are calculated as follows:

Gx(x, y) = H(x + 1, y) − H(x − 1, y);

Gy(x, y) = H(x, y + 1) − H(x, y − 1);

G(x, y) = √(Gx(x, y)² + Gy(x, y)²);

α(x, y) = arctan(Gy(x, y) / Gx(x, y));

where Gx(x, y), Gy(x, y), G(x, y), α(x, y) and H(x, y) respectively denote the horizontal gradient, vertical gradient, gradient magnitude, gradient direction and pixel value at pixel (x, y) of the input image.
The beneficial effects of the invention are: on the one hand, the LBP, GLCM, Gabor and HOG features of the image are used together, and disease identification accuracy is improved by combining local texture changes, image brightness changes, edge information, object information and other characteristics of the image. No disease sample base needs to be established in advance, which reduces the miss rate of pipeline diseases caused by incomplete disease sample data, and greatly reduces the workload of the large-scale manual labelling of diseases required up front by supervised learning methods.
Drawings
Fig. 1 is a schematic flow chart of a method for detecting diseases of a drain pipe of a capsule robot based on abnormal detection thinking.
Fig. 2 is a diagram of an embodiment of a capsule robot drain disease detection method based on abnormal detection thinking according to the invention.
FIG. 3 is a partial sample diagram of experimental data of a method for detecting diseases in a drain pipe of a capsule robot based on abnormal detection thinking.
Fig. 4 to 9 are comparative diagrams of methods and feature combinations.
FIG. 10 is a graph showing the statistical results of various methods.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
The conception, specific structure and technical effects of the present invention are described clearly and completely below in conjunction with the embodiments and the accompanying drawings, so that its objects, features and effects can be fully understood. Obviously, the described embodiments are only some, not all, of the embodiments of the invention; all other embodiments obtained by those skilled in the art from these embodiments without inventive effort fall within the protection scope of the invention. In addition, the connection relations referred to in this patent do not require that components be directly connected; a better connection structure may be formed by adding or removing auxiliary connection components according to the specific implementation. All technical features of the invention can be combined with one another provided they do not conflict.
The aim of the invention is to provide a pipe network disease video detection method based on the idea of unsupervised anomaly detection. The underlying human visual recognition mechanism is this: when watching a segment of video, people can pick out pipeline diseases because a frame exhibits abnormal characteristics, i.e. it differs significantly from the frames before and after it. Building on this mechanism, the invention treats disease image data as an anomaly signal: from a continuous sequence of pipeline video images it extracts the salient abnormal features between successive frames and performs unsupervised anomaly cluster detection to realise pipeline disease detection.
The invention adopts unsupervised image recognition: it does not need to establish a disease sample base in advance, treats disease image data as an anomaly signal, extracts from a continuous sequence of pipeline video images the salient abnormal features between successive frames, performs anomaly cluster detection on the extracted feature data set, realises pipeline disease detection, and outputs a disease type label.
The method splits a video file, captured by a capsule robot in a drainage pipeline, into frames forming a sequence of images; a continuous segment of the sequence is selected, and the LBP, GLCM, Gabor and HOG features of the images are extracted to obtain the local texture changes, brightness changes, edge information and object information of the images. Combining these different image features improves the accuracy with which the salient abnormal features between successive frames are extracted. The method uses an anomaly detection algorithm based on Gaussian probability estimation: the total probability is taken as the product of the probabilities of the individual feature values, and good results are obtained whether or not the feature values are strictly mutually independent.
Referring to fig. 1, the present invention provides a method for detecting diseases in drain pipes of a capsule robot based on abnormal detection thinking, which comprises the following steps:
s10, acquiring a shot video file, and framing the video file into a sequence image of the inner wall of the drainage pipe;
in the embodiment, shooting is carried out inside the drainage pipe through the fish-eye lens of the capsule robot, so as to obtain a video file;
s20, inputting an image, dividing the image into 4 × 4 image blocks, and totaling 16 image blocks to form a sequence image;
s30, respectively calculating an LBP characteristic value, a GLMC characteristic value and a HOG characteristic value of the image block, designing a Gabor filter, and calculating a Gabor characteristic value;
the invention relates to a capsule robot drain pipe disease detection method based on abnormal detection thinking, which comprises the steps of converting video data into sequence images, extracting the characteristics of each image, and mainly extracting LBP (local binary pattern) characteristics, GLMC (global warming potential) characteristics, Gabor characteristics and HOG (histogram of oriented gradient) characteristics aiming at the characteristics of drain pipe diseases;
the LBP (Local Binary Pattern) is used for extracting texture features, and has significant advantages of rotation invariance, gray scale invariance and the like. The extracted features are local texture features of the image, an original LBP operator is defined in a 3 x 3 window, the central pixel of the window is used as a threshold value, the threshold value is compared with the gray values of 8 adjacent pixels, and if the surrounding pixel values are larger than the central pixel value, the position is marked as 1; otherwise, it is marked as 0. Thus, an 8-bit binary number can be obtained, which is usually converted into a 10-bit binary number, i.e. 256 kinds of LBP codes, and this value is used as the LBP value of the pixel point in the center of the window, so as to reflect the texture information of the 3 × 3 region.
The calculation formula of the LBP characteristic value is as follows:
LBP = Σ_{p=0..7} s(I(p) − I(c)) · 2^p

where p denotes the p-th pixel (other than the centre pixel) in the 3 × 3 window; I(c) is the grey value of the centre pixel and I(p) the grey value of the p-th pixel in the neighbourhood; s(x) is defined as:

s(x) = 1 if x ≥ 0; s(x) = 0 otherwise.
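A minimal numpy sketch of the basic 3 × 3 LBP operator above; the clockwise neighbour ordering is an arbitrary choice, since the patent does not fix one:

```python
import numpy as np

def lbp_image(gray):
    """Basic 3x3 LBP: threshold the 8 neighbours against the centre pixel
    and pack the results into an 8-bit code per pixel (borders excluded)."""
    c = gray[1:-1, 1:-1].astype(np.int32)
    # neighbour offsets (dy, dx), clockwise from the top-left corner
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c)
    for p, (dy, dx) in enumerate(offsets):
        nb = gray[1 + dy:gray.shape[0] - 1 + dy,
                  1 + dx:gray.shape[1] - 1 + dx].astype(np.int32)
        code += ((nb - c) >= 0).astype(np.int32) << p  # s(I(p) - I(c)) * 2^p
    return code

# a dark centre surrounded by brighter pixels sets all 8 bits -> code 255
gray = np.array([[10, 10, 10],
                 [10,  5, 10],
                 [10, 10, 10]], dtype=np.uint8)
codes = lbp_image(gray)
```

A histogram of the codes over an image block then serves as the block's LBP feature vector.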
GLCM features: the Grey-Level Co-occurrence Matrix (GLCM) is an L × L square matrix, where L is the number of grey levels of the source image. It describes the joint distribution of pairs of pixels in a given spatial relationship and can be seen as a joint histogram of pixel grey-level pairs. It reflects not only the distribution of brightness but also the positional distribution between pixels of the same or similar brightness, making it a second-order statistic of the image's brightness variation. The grey-level co-occurrence matrix of an image reflects comprehensive information about the image grey levels with respect to direction, adjacent interval and variation amplitude. The image features derived from the GLCM are expressed as:

Energy: ASM = Σ_{i,j} P(i,j)²

Contrast: CON = Σ_{i,j} (i − j)² · P(i,j)

Correlation: COR = Σ_{i,j} (i − μi)(j − μj) · P(i,j) / (σi · σj)

Entropy: ENT = −Σ_{i,j} P(i,j) · ln P(i,j)

Homogeneity: HOM = Σ_{i,j} P(i,j) / (1 + (i − j)²)

Dissimilarity: DIS = Σ_{i,j} |i − j| · P(i,j)

where P(i,j) represents the number (or frequency) of occurrences of pixel pairs with grey levels i and j, and μi, μj, σi, σj are the means and standard deviations of the marginal distributions of P(i,j).
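A minimal numpy sketch of building a normalised GLCM for the horizontal offset (1, 0) and deriving several of the texture statistics above; the quantisation to 8 grey levels and the single offset are assumptions made for compactness:

```python
import numpy as np

def glcm_features(gray, levels=8):
    """Normalised GLCM for horizontally adjacent pixel pairs, quantised to
    `levels` grey levels, plus a few of the texture statistics above."""
    q = (gray.astype(np.int64) * levels) // 256        # quantise grey values
    a, b = q[:, :-1].ravel(), q[:, 1:].ravel()         # adjacent pairs, offset (1, 0)
    P = np.zeros((levels, levels))
    np.add.at(P, (a, b), 1)                            # accumulate co-occurrences
    P /= P.sum()                                       # normalise to frequencies
    i, j = np.indices(P.shape)
    nz = P[P > 0]
    return {
        "energy": float(np.sum(P ** 2)),
        "contrast": float(np.sum((i - j) ** 2 * P)),
        "homogeneity": float(np.sum(P / (1.0 + (i - j) ** 2))),
        "entropy": float(-np.sum(nz * np.log(nz))),
    }

# a perfectly flat image: one co-occurrence cell, so energy = homogeneity = 1
feats = glcm_features(np.full((4, 4), 100, dtype=np.uint8))
```

In practice several offsets (directions and distances) are accumulated, as the text notes the GLCM depends on direction and adjacent interval.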
gabor characteristics: designing Gabor filters, and selecting 4 sizes and 6 directions to form 24 Gabor filters; convolving the Gabor filter with the image to obtain Gabor characteristics, wherein the two-dimensional Gabor function is expressed as follows:
Figure BDA0003110338690000079
Figure BDA0003110338690000081
Figure BDA0003110338690000082
x′=xcosθ+ysinθ;
y′=-xsinθ+ycosθ;
wherein X and Y respectively represent pixel coordinate positions, and λ represents the wavelength of filtering, and the pixel is taken as a unit; theta represents the inclination angle of the Gabor kernel function image and specifies the direction of parallel stripes of the Gabor function; psi represents the amount of phase offset, ranging from-180 degrees to 180 degrees, where 0 and 180 degrees correspond to the centrosymmetric center-on function and center-off function, respectively, and-90 degrees and 90 degrees correspond to the anti-symmetric function; σ represents a standard difference of the gaussian function; γ represents an aspect ratio, and when γ is 1, the shape is circular, and when γ is <1, the shape is elongated with the parallel stripe direction. In the above embodiment, λ is 3, σ is 0.56 λ, γ is 0.5, θ is 60, and ψ is 90.
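The real part of the two-dimensional Gabor function above can be sampled into a discrete kernel as follows; the kernel size and the particular set of 4 wavelengths are assumptions (the patent only fixes λ = 3 for one filter), while σ = 0.56λ and γ = 0.5 follow the embodiment:

```python
import numpy as np

def gabor_kernel(ksize=9, lam=3.0, theta=0.0, psi=np.pi / 2, sigma=None, gamma=0.5):
    """Sample the real part of the 2-D Gabor function above into a
    ksize x ksize kernel; sigma defaults to 0.56*lambda as in the embodiment."""
    if sigma is None:
        sigma = 0.56 * lam
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    xp = x * np.cos(theta) + y * np.sin(theta)      # x' = x cos(theta) + y sin(theta)
    yp = -x * np.sin(theta) + y * np.cos(theta)     # y' = -x sin(theta) + y cos(theta)
    return np.exp(-(xp ** 2 + (gamma * yp) ** 2) / (2 * sigma ** 2)) \
        * np.cos(2 * np.pi * xp / lam + psi)

# 4 assumed wavelengths x 6 orientations -> the bank of 24 filters
bank = [gabor_kernel(lam=l, theta=t)
        for l in (3.0, 5.0, 7.0, 9.0)
        for t in np.arange(6) * np.pi / 6]
```

Each kernel is convolved with the image block, and statistics of the 24 responses form the Gabor feature vector.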
HOG features: the HOG feature is a feature descriptor used for object detection in computer vision and image processing; it is formed by computing and accumulating histograms of gradient directions over local regions of the image. The gradient magnitude and direction of an image pixel are calculated as follows:

Gx(x, y) = H(x + 1, y) − H(x − 1, y);

Gy(x, y) = H(x, y + 1) − H(x, y − 1);

G(x, y) = √(Gx(x, y)² + Gy(x, y)²);

α(x, y) = arctan(Gy(x, y) / Gx(x, y));

where Gx(x, y), Gy(x, y), G(x, y), α(x, y) and H(x, y) respectively denote the horizontal gradient, vertical gradient, gradient magnitude, gradient direction and pixel value at pixel (x, y) of the input image.
The HOG descriptor is computed in the following steps:
1) Greyscale conversion (the image is regarded as a three-dimensional image in x, y and z (grey level));
2) Normalise the colour space of the input image using Gamma correction, in order to adjust the image contrast, reduce the influence of local shadows and illumination changes, and suppress noise interference;
3) Compute the gradient (magnitude and direction) of every pixel of the image, mainly to capture contour information while further attenuating illumination interference;
4) Divide the image into small cells (e.g. 6 × 6 pixels per cell);
5) Build the gradient histogram of each cell (counting the different gradient directions) to form each cell's descriptor;
6) Group cells into blocks (e.g. 3 × 3 cells per block) and concatenate the feature descriptors of all cells within a block to obtain the block's HOG feature descriptor;
7) Concatenate the HOG feature descriptors of all blocks in the image to obtain the image's HOG feature descriptor: this is the final feature vector available for classification.
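The centred-difference gradient formulas of step 3 above can be sketched as:

```python
import numpy as np

def gradient_magnitude_direction(img):
    """Centred-difference gradients Gx, Gy with magnitude and direction
    (in degrees), per the formulas above; border pixels are excluded."""
    H = img.astype(np.float64)
    gx = H[1:-1, 2:] - H[1:-1, :-2]   # Gx(x, y) = H(x+1, y) - H(x-1, y)
    gy = H[2:, 1:-1] - H[:-2, 1:-1]   # Gy(x, y) = H(x, y+1) - H(x, y-1)
    mag = np.hypot(gx, gy)            # G(x, y) = sqrt(Gx^2 + Gy^2)
    ang = np.degrees(np.arctan2(gy, gx))
    return mag, ang

# brightness increasing left to right -> a pure horizontal gradient
ramp = np.tile(np.arange(5, dtype=np.float64), (5, 1))
mag, ang = gradient_magnitude_direction(ramp)
```

The per-cell histograms of `ang`, weighted by `mag`, then form the descriptors of steps 5–7.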
S40, combining the feature values obtained in step S30 to obtain a feature combination data set (x1, x2, x3, …, xn); given an m×n-dimensional training set, it is converted into an n-dimensional Gaussian distribution, and the distribution of the m training samples is analysed to obtain the probability density function of the training set, i.e. the mathematical expectation μ and variance σ² of the training set in each dimension; the mathematical expectation μj and variance σj² in the j-th dimension are calculated as:

μj = (1/m) · Σ_{i=1..m} xj^(i)

σj² = (1/m) · Σ_{i=1..m} (xj^(i) − μj)²

where xj^(i) represents the j-th dimension feature data of the i-th sample.
S50, when a new point x is given, determining the probability p of the point under the Gaussian distribution; the probability p is calculated as:

p(x) = Π_{j=1..n} p(xj; μj, σj²) = Π_{j=1..n} (1 / (√(2π)·σj)) · exp(−(xj − μj)² / (2σj²))
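Steps S40–S70 amount to fitting a per-dimension Gaussian and thresholding the product of the densities; a minimal sketch on synthetic data (the threshold ε used here is illustrative — in the method it is chosen on the cross-validation set):

```python
import numpy as np

def fit_gaussian(X):
    """Per-dimension mean and variance of an m x n training set (step S40)."""
    return X.mean(axis=0), X.var(axis=0)   # var uses (1/m) * sum((x - mu)^2)

def total_probability(x, mu, var):
    """p(x): product over dimensions of univariate Gaussian densities (step S50)."""
    return float(np.prod(np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)))

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(1000, 4))   # synthetic "normal" feature vectors
mu, var = fit_gaussian(X)
eps = 1e-6                                  # illustrative; chosen on the validation set

p_typical = total_probability(np.zeros(4), mu, var)      # near the training mean
p_outlier = total_probability(np.full(4, 8.0), mu, var)  # far from the mean
```

A frame whose feature vector yields p < ε is flagged as abnormal (step S70), i.e. as a candidate disease image.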
s60, selecting a cross validation data set, sequentially carrying out the steps S30-S50 to obtain a total probability value p of each image, and then determining a threshold value epsilon through the cross validation set;
and S70, inputting a new sample image, calculating the total probability value p of the sample image, considering that the sample image is abnormal when p < epsilon, and judging that the sample image is not abnormal when p > epsilon.
In the video data acquired during drainage pipeline inspection, disease conditions occupy only a small fraction of the images and can be regarded as anomaly signals; the task of anomaly detection is to find objects that differ from most other objects. The invention performs anomaly detection with an algorithm based on Gaussian probability density estimation, whose advantage is that a good detection result can be obtained whether or not the selected feature combinations are mutually independent.
On the one hand, the invention uses the LBP, GLCM, Gabor and HOG features of the image together, improving disease identification accuracy by combining local texture changes, image brightness changes, edge information, object information and other characteristics of the image. On the other hand, the method is an unsupervised learning approach that needs no prior knowledge and no pre-built disease sample database; it reduces the miss rate of pipeline diseases caused by incomplete disease sample data, and greatly reduces the workload of the manual disease labelling required up front by supervised learning methods.
Referring to fig. 2, the present invention adopts and compares four typical anomaly detection algorithms and an anomaly detection method based on a clustering algorithm, and obtains a final evaluation result.
(1) iForest: the isolation forest is a fast, ensemble-based anomaly detection method with linear time complexity and high accuracy; it is a non-parametric, unsupervised method that requires neither a mathematical model nor labelled training data. It uses binary trees to partition the data, and the depth of a data point in a tree reflects how isolated that point is. An iForest consists of t isolation trees (iTrees), each a binary tree structure, and the algorithm proceeds as follows:
Step one: construct the isolation trees.
Step two: compute the average height h(x) of the sample point x across the trees. Each iTree is traversed to obtain the depth ht(x) at which the tested data point x finally falls in the t-th tree; ht(x) is smaller the closer the point lands to the root and larger the closer it lands to the bottom, the root node having height 0.
Step three: judge from h(x) whether x is an outlier; the anomaly probability score of x is generally calculated with the following equation:
s(x, m) = 2^{-\frac{h(x)}{c(m)}};
The value range of s(x, m) is [0, 1], and the closer the value is to 1, the higher the probability that x is an outlier. Here m is the number of samples, and c(m) is a normalizing constant, the average path length of an unsuccessful search in a binary search tree:
c(m) = 2H(m - 1) - \frac{2(m - 1)}{m}, \quad H(i) \approx \ln(i) + 0.5772156649;
As can be seen from the expression for s(x, m): if h(x) → 0, then s(x, m) → 1, i.e. x is almost certainly an outlier; if h(x) → m − 1, then s(x, m) → 0, i.e. x is almost certainly not an outlier; and if h(x) → c(m), then s(x, m) → 0.5, i.e. the probability of being an outlier is 50%. In practice, a threshold is set on s(x, m) and values greater than the threshold are treated as outliers.
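The score in the steps above can be sketched directly. The following is a minimal illustration of the published iForest formulas for s(x, m) and c(m), not the patent's own code; function names are ours:

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant used in H(i)

def c(m):
    """Average path length c(m) of an unsuccessful BST search over m samples."""
    if m <= 2:
        return 1.0 if m == 2 else 0.0
    return 2.0 * (math.log(m - 1) + EULER_GAMMA) - 2.0 * (m - 1) / m

def anomaly_score(h_x, m):
    """iForest score s(x, m) = 2^(-h(x)/c(m)); h_x is the average path depth."""
    return 2.0 ** (-h_x / c(m))

# A point isolated near the root (small h(x)) scores near 1 (likely outlier);
# a deep point scores near 0; h(x) = c(m) gives exactly 0.5.
print(anomaly_score(1.0, 256))
print(anomaly_score(c(256), 256))
```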
(2) One-Class SVM: the One-Class SVM also belongs to the support vector machine family, but unlike the traditional supervised classification and regression SVMs, it is an unsupervised learning method that needs no labeled training set. Since there are no class labels, the SVDD (Support Vector Data Description) approach is used to find the support vectors. For SVDD, all non-anomalous samples are expected to belong to the positive class, and the division is made with a hypersphere instead of a hyperplane: the algorithm finds a spherical boundary around the data in feature space and minimizes the volume of this hypersphere, thereby minimizing the influence of anomalous data points.
Assume the generated hypersphere has center o and radius r > 0, with the hypersphere volume V(r) minimized; the center o is a linear combination of the support vectors. As in the conventional SVM method, all training data points xi are required to lie at a distance strictly less than r from the center, but a slack variable ξi with penalty coefficient C is introduced at the same time, giving the optimization problem:
\min_{r, o} \; r^2 + C\sum_{i=1}^{m}\xi_i;
\text{s.t.}\ \|x_i - o\|^2 \le r^2 + \xi_i, \quad i = 1, 2, \ldots, m;
\xi_i \ge 0, \quad i = 1, 2, \ldots, m;
After solving the Lagrangian dual, one can judge whether a new data point z is in the class: if the distance from z to the center o is less than or equal to the radius r, z is not an anomaly; if z lies outside the hypersphere, it is an anomaly.
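The resulting decision rule is simple once the sphere is known. A minimal sketch, assuming the center o and radius r have already been obtained from the dual solution (the function name is our own, not from the patent):

```python
import math

def svdd_is_outlier(z, center, radius):
    """SVDD decision rule: a point z is an anomaly iff it lies outside the
    learned hypersphere, i.e. ||z - o|| > r."""
    return math.dist(z, center) > radius

# With center (0, 0) and radius 1, a point inside the sphere is normal
# and a point outside it is anomalous.
print(svdd_is_outlier((0.5, 0.5), (0.0, 0.0), 1.0))
print(svdd_is_outlier((3.0, 0.0), (0.0, 0.0), 1.0))
```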
(3) Local Outlier Factor (LOF): the method judges whether each point p is an outlier by comparing its local density with that of its neighboring points; the lower the density of p relative to its neighbors, the more likely p is an outlier. Concretely, the LOF of p is the ratio of the average local reachability density of its neighbors to its own: if the ratio is close to 1, the density of p is almost the same as that of its neighbors and p may belong to the same cluster; if the ratio is less than 1, the density of p is higher than that of its neighbors and p is a dense point; if the ratio is greater than 1, the density of p is lower than that of its neighbors and p is more likely to be an outlier.
The outlier factor of point p is expressed as:
LOF_k(p) = \frac{1}{|N_k(p)|}\sum_{o \in N_k(p)} \frac{lrd_k(o)}{lrd_k(p)};
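A compact, brute-force version of the LOF computation just described, suitable only for small point sets (illustrative only; function and variable names are our own):

```python
import math

def lof(points, k=2):
    """Local Outlier Factor: ratio of the average local reachability density
    of each point's k nearest neighbours to its own. LOF >> 1 -> outlier."""
    n = len(points)
    knn, kdist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(points[i], points[j]))
        knn.append(order[:k])                              # k nearest neighbours
        kdist.append(math.dist(points[i], points[order[k - 1]]))  # k-distance

    def reach(i, j):
        # reachability distance from i to neighbour j
        return max(kdist[j], math.dist(points[i], points[j]))

    # local reachability density lrd(p) = k / sum of reachability distances
    lrd = [k / sum(reach(i, j) for j in knn[i]) for i in range(n)]
    # LOF_k(p) = mean(lrd of neighbours) / lrd(p)
    return [sum(lrd[j] for j in knn[i]) / (k * lrd[i]) for i in range(n)]

pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
print(lof(pts, k=2))  # the isolated point gets a much larger factor
```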
(4) Anomaly detection algorithm based on Gaussian probability density estimation: anomaly detection based on the Gaussian distribution is widely used in many scenarios. The core idea of the algorithm is: given an m × n dimensional training set, model each of the n dimensions as a Gaussian distribution; by analyzing the distribution of the m training samples, obtain the probability density function of the training set, i.e. the mathematical expectation μ and variance σ² in each dimension. A threshold ε is then determined using a small cross-validation set. When a new point is given, its probability p is computed under the Gaussian distributions: if p < ε the point is judged anomalous, and if p ≥ ε it is judged not anomalous.
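The per-dimension fit and the density test can be sketched as follows. This is a minimal illustration of the idea, not the patent's implementation; names are ours:

```python
import math

def fit_gaussian(train):
    """Per-dimension mean mu_j and variance sigma_j^2 of an m x n training set."""
    m, n = len(train), len(train[0])
    mu = [sum(row[j] for row in train) / m for j in range(n)]
    var = [sum((row[j] - mu[j]) ** 2 for row in train) / m for j in range(n)]
    return mu, var

def p(x, mu, var):
    """p(x) = product over j of N(x_j; mu_j, sigma_j^2).
    The point x is flagged anomalous when p(x) < epsilon."""
    out = 1.0
    for xj, mj, vj in zip(x, mu, var):
        out *= math.exp(-(xj - mj) ** 2 / (2 * vj)) / math.sqrt(2 * math.pi * vj)
    return out

train = [[0.0, 0.0], [0.2, -0.2], [-0.2, 0.2], [0.0, 0.0]]
mu, var = fit_gaussian(train)
print(p([0.0, 0.0], mu, var))  # dense region: large density
print(p([1.0, 1.0], mu, var))  # far from the data: near-zero density
```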
(5) K-Means++: the K-Means++ algorithm optimizes the random centroid initialization of K-Means. Its steps are as follows:
1) randomly selecting a point from the input data point set as a first clustering center mu 1;
2) for each point xi in the data set, compute its distance D(xi) to the nearest already-selected cluster center:
D(x_i) = \min_{\mu_r \in C} \|x_i - \mu_r\|^2, where C is the set of already-selected cluster centers;
3) select a new data point as the next cluster center, with selection probability proportional to D(x): the larger D(x), the more likely the point is chosen as a cluster center;
4) repeat steps 2) and 3) until k cluster centroids have been selected;
5) the K centroids are used as initialization centroids to run the standard K-Means algorithm.
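The seeding steps above can be sketched as follows (the function name and the weighted-draw details are our own; the draw weights each point by its squared nearest-centre distance):

```python
import math
import random

def kmeanspp_init(points, k, seed=0):
    """K-Means++ seeding: the first centre is uniform at random; each further
    centre is drawn with probability proportional to D(x), the squared
    distance from x to its nearest already-chosen centre."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        d = [min(math.dist(p, c) ** 2 for c in centers) for p in points]
        r = rng.uniform(0, sum(d))
        acc = 0.0
        for p, w in zip(points, d):
            acc += w
            if acc > r:          # larger D(x) -> larger chance of selection
                centers.append(p)
                break
    return centers

# Two tight groups: the second seed always comes from the opposite group,
# because duplicates of the first seed have D(x) = 0.
pts = [(0.0, 0.0), (0.0, 0.0), (10.0, 10.0), (10.0, 10.0)]
print(kmeanspp_init(pts, 2))
```

The returned centres would then be passed to the standard K-Means loop as its initialization (step 5 above).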
(6) DBSCAN: DBSCAN is a density-based clustering method. It selects an unlabeled core object as a seed and finds the set of all samples density-reachable from it; this maximal density-connected sample set forms one cluster of the final result. It then selects another unlabeled core object and finds its density-reachable sample set, obtaining another cluster, and runs until every core object has been assigned a category.
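A minimal, brute-force sketch of the procedure just described (not the patent's code; here a label of -1 marks noise, and a core point has at least min_pts neighbours within eps, itself included):

```python
import math
from collections import deque

def dbscan(points, eps, min_pts):
    """Density clustering: returns labels[i] = cluster id, or -1 for noise."""
    n = len(points)
    labels = [None] * n

    def neighbours(i):
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbours(i)
        if len(nb) < min_pts:
            labels[i] = -1          # noise (may later join a cluster as border)
            continue
        labels[i] = cluster         # unlabelled core object: grow a new cluster
        queue = deque(nb)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # border point reachable from a core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbours(j)
            if len(nb_j) >= min_pts:
                queue.extend(nb_j)   # j is also a core object: keep expanding
        cluster += 1
    return labels

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(pts, eps=1.5, min_pts=2))  # two clusters plus one noise point
```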
(7) Hierarchical clustering algorithm: hierarchical methods first compute the distance between samples and merge the closest points into the same class each time; the distance between classes is then computed, and the closest classes are merged into a larger class, continuing until a single class remains.
Hierarchical clustering is a clustering algorithm that builds a hierarchically nested cluster tree by computing the similarity between data points of different classes. In the cluster tree, the original data points of the different classes form the lowest layer, and the top layer is the root node of one cluster. The merging algorithm of hierarchical clustering computes the similarity between every two data points, combines the two most similar ones, and repeats this iterative process. In brief, the similarity between data points or categories is determined by computing the distance between them: the smaller the distance, the higher the similarity. The two closest data points or categories are merged at each step, generating the cluster tree.
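The bottom-up merging can be illustrated with a small agglomerative sketch. The single-linkage choice (distance between the closest members of two clusters) is our assumption; the text does not specify a linkage:

```python
import math

def agglomerative(points, k):
    """Bottom-up merging: start with every point as its own cluster and
    repeatedly merge the two closest clusters until k clusters remain."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):  # single linkage: closest pair of members
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        a, b = min(((a, b) for a in range(len(clusters))
                    for b in range(a + 1, len(clusters))),
                   key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]))
        clusters[a].extend(clusters[b])   # merge the closest pair of clusters
        del clusters[b]
    return clusters

pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
print(agglomerative(pts, 2))  # the two nearby pairs end up together
```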
Referring to fig. 3, which shows part of the experimental data samples, the first two rows are data from inside the pipeline, and the last row shows pictures taken as the capsule robot enters the pipeline and is recovered from it. The roughly 40-second video captured between two manhole covers is first converted into a sequence of 899 images; feature extraction then converts the image sequence into a sequence of feature signals, the different features are combined, and an anomaly detection algorithm is applied for disease detection.
Five parameter indexes are adopted to evaluate the disease detection algorithm. Let there be P positive samples (non-diseased samples) and N negative samples (diseased samples). In the detection results, TP positive examples are judged positive and FN positive examples are judged negative (false negatives), with P = TP + FN; TN negative examples are judged negative and FP negative examples are judged positive (false positives), with N = TN + FP. The indexes are then: non-disease sample precision Precision = TP/(TP + FP); disease sample precision Precision_b1 = TN/(TN + FN); accuracy Accuracy = (TP + TN)/(P + N), the proportion of examples judged correctly; non-disease sample recall Recall = TP/P; and disease sample recall Recall_b1 = TN/N.
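The five indexes can be computed directly from the confusion-matrix counts (a small illustration using the definitions above; the function name is ours):

```python
def evaluate(tp, fn, tn, fp):
    """Five evaluation indexes; positive = non-diseased, negative = diseased."""
    p_tot, n_tot = tp + fn, tn + fp
    return {
        "precision":    tp / (tp + fp),        # non-disease detection precision
        "precision_b1": tn / (tn + fn),        # disease detection precision
        "accuracy":     (tp + tn) / (p_tot + n_tot),
        "recall":       tp / p_tot,            # non-disease sample recall
        "recall_b1":    tn / n_tot,            # disease sample recall
    }

# Hypothetical counts: 100 non-diseased and 20 diseased frames.
print(evaluate(tp=80, fn=20, tn=15, fp=5))
```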
The disease sample recall rate, disease sample detection precision and comprehensive detection accuracy are the most important parameters; the algorithm with the largest mean and smallest variance of these is considered to have the best detection effect. Comparisons of the various methods and feature combinations are shown in figs. 4-9.
First, in terms of feature combination, combinations that include the GLCM features have the best detection effect, showing that GLCM best extracts the texture features of the image; the original image features and the Gabor features also classify well, and the image data reflect the texture information well. Among the anomaly detection algorithms, iForest is the most stable. Overall, roughly 60% correct classification is achieved even across different feature combinations, and about 70% with the GLCM and raw data features, so the algorithm is relatively robust and may be preferred when no prior knowledge of the data is available. For the clustering-based anomaly detection methods, the basic parameters follow these principles: the initial K-Means++ cluster number is 10 and the final cluster number is 5; hierarchical clustering uses 10 classes; DBSCAN does not need a specified cluster number. After the final result is obtained, the categories are sorted by size from small to large, and given K clusters, the first J = Int(0.6·K) categories are taken as abnormal categories. For the disease detection scenario addressed by the method, the two parameters disease detection precision Precision_b1 and disease recall rate are the most important; essentially, these two evaluation indexes also determine the other indexes such as the false alarm rate, missed detection rate and overall accuracy.
In machine learning, to evaluate the performance of a classification algorithm, this project uses the mean and variance of the precision and recall of the negative samples: the larger the mean and the smaller the variance, the better the algorithm is considered. These two values also accord with the idea behind ROC curves and with intuition. The GLCM feature combination, which gave the best results, is selected, and the results of the different algorithms are compared as shown in the table:
Table 3.3-1. Anomaly detection results of the different algorithms.
Fig. 10 is a schematic diagram of statistical results of different methods.
As shown in the figure, iForest, Gaussian-D, KMeans++ and DBSCAN all perform well with the GLCM features, with KMeans++ showing the best results. However, the clustering-based methods require some prior knowledge: two important parameters, the number of clusters and the proportion of abnormal clusters, must be set in advance. When prior knowledge is lacking, the iForest and Gaussian-D anomaly detection algorithms are recommended, as they better fit the practical case.
In conclusion, the combination of the Gaussian probability density anomaly detection algorithm with the gray-level co-occurrence matrix features is superior. Therefore, in a real pipeline scene, disease detection based on pipe network video can be realized by combining an anomaly detection algorithm with texture feature extraction.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A capsule robot drain pipe disease detection method based on abnormal detection thinking is characterized by comprising the following steps:
s10, acquiring a shot video file, and framing the video file into a sequence image of the inner wall of the drainage pipe;
s20, inputting an image, and dividing the image into 4-by-4 image blocks;
S30, respectively calculating an LBP characteristic value, a GLCM characteristic value and a HOG characteristic value of the image block, designing a Gabor filter, and calculating a Gabor characteristic value;
s40, combining the characteristic values obtained in the step S30 to obtain a characteristic combination data set;
s50, calculating a probability formula of each characteristic value;
s60, selecting a cross validation data set, sequentially carrying out the steps S30-S50 to obtain the total probability value of each image, and then determining a threshold value epsilon through the cross validation set;
and S70, inputting a new sample image and calculating the total probability value p of the sample image: when p < ε the sample image is considered abnormal, and when p ≥ ε it is judged not abnormal.
2. The method for detecting drain pipe diseases of a capsule robot based on abnormal detection thinking as claimed in claim 1, wherein the step S40 comprises the following steps:
combining the feature values obtained in step S30 to obtain a feature combination data set (x_1, x_2, x_3, …, x_n); giving an m × n dimensional training set, converting the training set into an n-dimensional Gaussian distribution, and obtaining the probability density function of the training set by analyzing the distribution of the m training samples, i.e. the mathematical expectation μ and variance σ^2 of the training set in each dimension; the mathematical expectation μ_j and variance σ_j^2 in the j-th dimension are calculated as follows:

\mu_j = \frac{1}{m}\sum_{i=1}^{m} x_j^{(i)};

\sigma_j^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_j^{(i)} - \mu_j\right)^2;

wherein x_j^{(i)} represents the j-th dimension feature data of the i-th training sample.
3. The method for detecting drain pipe diseases of a capsule robot based on abnormal detection thinking as claimed in claim 2, wherein in step S50, when a new point is given, its probability p on the Gaussian distribution is calculated as follows:
p(x) = \prod_{j=1}^{n} p(x_j; \mu_j, \sigma_j^2) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j}\exp\left(-\frac{(x_j - \mu_j)^2}{2\sigma_j^2}\right);
4. The method for detecting drain pipe diseases of a capsule robot based on abnormal detection thinking as claimed in claim 1, wherein in step S10, the inside of the drain pipe is photographed by the fish-eye lens of the capsule robot to obtain the video file.
5. The method for detecting drain pipe diseases of a capsule robot based on abnormal detection thinking as claimed in claim 1, wherein in step S30 the LBP characteristic value is calculated as follows:

LBP = \sum_{p=0}^{7} s\left(I(p) - I(c)\right) \cdot 2^p;

wherein p represents the p-th pixel point other than the central pixel point in a 3 × 3 window; I(c) represents the gray value of the central pixel point, and I(p) represents the gray value of the p-th pixel point in the neighborhood; the function s(x) is defined as:

s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}
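For illustration only (code is not part of the claims), the LBP code of a single 3 × 3 window as defined in claim 5 might be computed as follows; the neighbour ordering is our own assumption:

```python
def lbp_code(window):
    """LBP of a 3x3 window: threshold the 8 neighbours against the centre
    pixel and read them as an 8-bit code, LBP = sum_p s(I(p) - I(c)) * 2^p."""
    c = window[1][1]
    # clockwise neighbour order starting at the top-left pixel (our choice)
    coords = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    return sum((1 if window[r][col] >= c else 0) << p   # s(x) = 1 iff x >= 0
               for p, (r, col) in enumerate(coords))

print(lbp_code([[5, 5, 5], [5, 5, 5], [5, 5, 5]]))  # all neighbours >= centre
print(lbp_code([[0, 0, 0], [0, 9, 0], [0, 0, 0]]))  # all neighbours < centre
```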
6. The method for detecting drain pipe diseases of a capsule robot based on abnormal detection thinking as claimed in claim 1, wherein in step S30 the GLCM characteristic values comprise the following image texture statistics:

energy: ASM = \sum_i \sum_j P_{i,j}^2;

entropy: ENT = -\sum_i \sum_j P_{i,j} \ln P_{i,j};

contrast: CON = \sum_i \sum_j (i - j)^2 P_{i,j};

inverse difference moment: IDM = \sum_i \sum_j \frac{P_{i,j}}{1 + (i - j)^2};

dissimilarity: DIS = \sum_i \sum_j |i - j|\, P_{i,j};

correlation: COR = \sum_i \sum_j \frac{(i - \mu_i)(j - \mu_j) P_{i,j}}{\sigma_i \sigma_j};

wherein P_{i,j} represents the frequency with which two pixels with gray levels i and j respectively co-occur at the given offset, and

\mu_i = \sum_i \sum_j i \cdot P_{i,j}, \quad \mu_j = \sum_i \sum_j j \cdot P_{i,j}, \quad \sigma_i^2 = \sum_i \sum_j (i - \mu_i)^2 P_{i,j}, \quad \sigma_j^2 = \sum_i \sum_j (j - \mu_j)^2 P_{i,j}.
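For illustration only (not part of the claims), a normalized gray-level co-occurrence matrix and two of the statistics above can be computed as follows; the 4-level example image and the (1, 0) offset are our own:

```python
def glcm(img, dx=1, dy=0, levels=4):
    """Normalised grey-level co-occurrence matrix P[i][j] for offset (dx, dy),
    plus the energy (ASM) and contrast statistics."""
    h, w = len(img), len(img[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(h - dy):
        for x in range(w - dx):
            counts[img[y][x]][img[y + dy][x + dx]] += 1  # co-occurring pair
            total += 1
    P = [[c / total for c in row] for row in counts]
    energy = sum(p * p for row in P for p in row)         # ASM
    contrast = sum((i - j) ** 2 * P[i][j]
                   for i in range(levels) for j in range(levels))
    return P, energy, contrast

img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [2, 2, 3, 3],
       [2, 2, 3, 3]]
P, energy, contrast = glcm(img)
print(energy, contrast)
```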
7. The method for detecting drain pipe diseases of a capsule robot based on abnormal detection thinking as claimed in claim 1, wherein in step S30 the Gabor filters are designed by selecting 4 scales and 6 orientations, forming 24 Gabor filters.
8. The method for detecting drain pipe diseases of a capsule robot based on abnormal detection thinking as claimed in claim 7, wherein the Gabor filters are convolved with the image to obtain the Gabor features; the two-dimensional Gabor function is expressed as follows:
g(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)\exp\left(i\left(2\pi\frac{x'}{\lambda} + \psi\right)\right);

g_{real}(x, y) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)\cos\left(2\pi\frac{x'}{\lambda} + \psi\right);

g_{imag}(x, y) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)\sin\left(2\pi\frac{x'}{\lambda} + \psi\right);

x' = x\cos\theta + y\sin\theta;

y' = -x\sin\theta + y\cos\theta;
wherein x and y represent the pixel coordinate position; λ represents the wavelength of the filter, in pixels; θ represents the inclination angle of the Gabor kernel, specifying the orientation of the parallel stripes of the Gabor function; ψ represents the phase offset, in the range −180° to 180°, where 0° and 180° correspond to the centrally symmetric center-on and center-off functions respectively, and −90° and 90° correspond to the antisymmetric functions; σ represents the standard deviation of the Gaussian function; γ represents the aspect ratio: when γ = 1 the support is circular, and when γ < 1 it is elongated in the direction of the parallel stripes.
9. The method for detecting drain pipe diseases of a capsule robot based on abnormal detection thinking as claimed in claim 8, wherein λ = 3, σ = 0.56λ, γ = 0.5, θ = 60°, and ψ = 90°.
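For illustration only (not part of the claims), the real part of the Gabor kernel of claim 8 can be sampled with the claim-9 parameters as follows; the 7 × 7 grid size is our own choice:

```python
import math

def gabor_kernel(size, lam, theta_deg, psi_deg, sigma, gamma):
    """Real part of the 2-D Gabor function, sampled on a size x size grid
    centred at the origin."""
    theta = math.radians(theta_deg)
    psi = math.radians(psi_deg)
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xp = x * math.cos(theta) + y * math.sin(theta)    # x'
            yp = -x * math.sin(theta) + y * math.cos(theta)   # y'
            g = math.exp(-(xp * xp + gamma ** 2 * yp * yp) / (2 * sigma ** 2))
            row.append(g * math.cos(2 * math.pi * xp / lam + psi))
        kernel.append(row)
    return kernel

# parameters from claim 9: lambda = 3, sigma = 0.56*lambda, gamma = 0.5,
# theta = 60 degrees, psi = 90 degrees
k = gabor_kernel(7, 3.0, 60.0, 90.0, 0.56 * 3.0, 0.5)
```

In a full pipeline each of the 24 kernels would be convolved with the image and the responses pooled into the Gabor feature vector.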
10. The method for detecting drain pipe diseases of a capsule robot based on abnormal detection thinking as claimed in claim 1, wherein in step S30 the HOG feature is a feature descriptor used for object detection in computer vision and image processing; it is formed by computing and accumulating histograms of gradient directions over local regions of the image. The gradient magnitude and direction of an image pixel are calculated as follows:
Gx(x,y)=H(x+1,y)-H(x-1,y);
Gy(x,y)=H(x,y+1)-H(x,y-1);
G(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2}; \quad \alpha(x, y) = \arctan\frac{G_y(x, y)}{G_x(x, y)};

wherein G_x(x, y), G_y(x, y), G(x, y), α(x, y) and H(x, y) respectively represent the horizontal gradient, the vertical gradient, the gradient magnitude, the gradient direction and the pixel value at pixel point (x, y) in the input image.
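For illustration only (not part of the claims), the pixel gradient of claim 10 can be computed as follows; atan2 is used in place of arctan(Gy/Gx) to avoid division by zero, and that substitution is ours:

```python
import math

def gradient(H, x, y):
    """Centred differences Gx, Gy, magnitude G and direction alpha (degrees)
    at interior pixel (x, y) of image H, per the formulas above."""
    gx = H[y][x + 1] - H[y][x - 1]   # horizontal gradient
    gy = H[y + 1][x] - H[y - 1][x]   # vertical gradient
    mag = math.hypot(gx, gy)         # sqrt(Gx^2 + Gy^2)
    alpha = math.degrees(math.atan2(gy, gx))
    return gx, gy, mag, alpha

H = [[0, 0, 0],
     [0, 5, 10],
     [0, 0, 0]]
print(gradient(H, 1, 1))  # purely horizontal edge at the centre pixel
```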
CN202110647069.3A 2021-06-10 2021-06-10 Capsule robot drain pipe disease detection method based on abnormal detection thinking Withdrawn CN113326790A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110647069.3A CN113326790A (en) 2021-06-10 2021-06-10 Capsule robot drain pipe disease detection method based on abnormal detection thinking


Publications (1)

Publication Number Publication Date
CN113326790A true CN113326790A (en) 2021-08-31

Family

ID=77420333



Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111695482A (en) * 2020-06-04 2020-09-22 华油钢管有限公司 Pipeline defect identification method
CN111986188A (en) * 2020-08-27 2020-11-24 深圳市智源空间创新科技有限公司 Capsule robot drainage pipe network defect identification method based on Resnet and LSTM


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
XU FANG ET AL.: "Sewer Pipeline Fault Identification Using Anomaly Detection Algorithms on Video Sequences", IEEE Access *
YU Bingjie et al.: "Anomaly detection algorithm based on Gaussian process model", Computer Engineering and Design *
LI Jianjun: "Research on Human Action Recognition Based on Image Depth Information", Yunnan University Press, 31 December 2018 *
XIONG Xin: "Face Recognition Technology and Applications", Yellow River Water Conservancy Press, 31 August 2018 *
WANG Lina et al.: "Information Hiding Technology and Applications", Wuhan University Press, 31 May 2012 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117388637A (en) * 2023-11-13 2024-01-12 国家电网有限公司技术学院分公司 AI-based converter station direct current system abnormal signal identification and auxiliary decision-making method
CN117388637B (en) * 2023-11-13 2024-05-14 国家电网有限公司技术学院分公司 AI-based converter station direct current system abnormal signal identification and auxiliary decision-making method


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication
Application publication date: 20210831