CN113762084A - Building night scene light abnormity detection method based on RetinaXNet - Google Patents

Building night scene light abnormity detection method based on RetinaXNet

Info

Publication number
CN113762084A
Authority
CN
China
Prior art keywords: image, formula, retinaxnet, follows, night scene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110909371.1A
Other languages
Chinese (zh)
Inventor
宋雪桦
王赟
王昌达
金华
杜聪
刘思雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Wanzhida Technology Transfer Center Co ltd
Original Assignee
Jiangsu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu University
Priority to CN202110909371.1A
Publication of CN113762084A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The invention relates to a building night scene light abnormity detection method based on RetinaXNet. During data collection, equalization processing is applied so that the texture information of each image is preserved while the complexity of the image is reduced. The input module of the RetinaXNet network scales each video frame down to a 224 x 224 image, the trunk module extracts the contour information of the image with an improved residual structure, the detection head module uses an XNet network to strengthen the fusion of information for classification and regression, and the output module restores the image to its original size according to the reduction ratio. The proposed RetinaXNet network can locate faulty lamps in an image and classify the faults, realizes automatic anomaly detection, improves detection accuracy, reduces false detections, and provides a reliable method for detecting abnormal night scene lighting on buildings.

Description

Building night scene light abnormity detection method based on RetinaXNet
Technical Field
The invention relates to the field of image processing and abnormity detection, in particular to a building night scene light abnormity detection method based on RetinaXNet.
Background
With the application of modern urban technology and the rapid growth of economic strength, urban lighting engineering plays a significant role in improving the urban environment, building livable cities, enhancing overall urban functions, stimulating domestic demand, promoting urban economic development, and improving the image of the enterprises concerned. However, building night scene lighting is exposed outdoors all year round and fails frequently because of lamp aging, installation environment, heat dissipation and similar problems. Existing detection relies mainly on manual inspection by visual check, which suffers from high cost, poor real-time performance and strong subjectivity. With the development of artificial intelligence, deep-learning-based detection can replace traditional manual methods in some image-related fields; using a pre-trained network to detect abnormal building night scene lighting improves detection accuracy, reduces human subjectivity and makes detection automatic.
Disclosure of Invention
Aiming at the defects of high cost, poor real-time performance and strong subjectivity of existing manual inspection, a building night scene light abnormity detection method based on RetinaXNet is provided; the automation of night scene light abnormity detection is realized through a camera and a network model, and the detection accuracy is improved.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows: a building night scene light abnormity detection method based on RetinaXNet comprises the following steps:
1) constructing an initial night scene light image set C and sending it to a GPU (graphics processing unit) computing server for storage;
2) processing the image set C to obtain a data set, and dividing the data set into a training set E and a test set T;
3) constructing a RetinaXNet network model; the RetinaXNet network model comprises an input module, a trunk module and a detection head module;
4) optimizing the weight of the RetinaXNet network model by using a UFL function;
5) training a RetinaXNet network model;
6) night scene light abnormity detection: acquiring a frame to be detected through a camera, sending it into the RetinaXNet network, mapping the network output back to the original image, and judging whether the night scene lights are abnormal.
Further, the step 1) comprises the following steps:
1.1) acquiring video data V of night scene light by using a camera, wherein the camera is fixedly arranged at a place where night scene light detection is needed in advance;
1.2) extracting one frame of image from the video data V at intervals of Δt, and constructing an initial night scene light image set C, recorded as C = {I_1, I_2, ..., I_n}, where I_i is the i-th frame image and n is the number of night scene light images;
1.3) sending the initial night scene light image set C to a GPU computing server for storage.
Further, the step 2) includes the following steps:
2.1) for each frame image in the image set C, calculate the probability p_i(j) that a pixel value is less than j; the calculation formula is as follows:
p_i(j) = n_t / n_I
where p_i(j) is the probability that a gray level greater than 0 and less than j occurs in the i-th frame image, n_t is the number of pixels with a gray level less than j, and n_I is the total number of pixels in each frame image;
2.2) calculate the histogram result G(i) of each frame image in the set C; the calculation formula is as follows:
G(i) = Σ_{j=0..i} p_i(j)
where G(i) is the gray-level histogram processing result of the i-th frame, 0 ≤ i < 256, and p_i(j) is the probability that a pixel value greater than 0 and less than j occurs in the i-th frame image;
2.3) calculate the pixel equalization result H(v) and equalize the image set C, where C = {I_1, I_2, ..., I_n}; the processed image set is recorded as C' = {I_1', I_2', ..., I_n'}; the calculation formula is as follows:
H(v) = round[(G(v) - G_min) / (G_max - G_min) × (L - 1)]
where v is a pixel value of a single image I in the image set C, H(v) is the equalization result for v, G(v) is the histogram processing result of the current v, G_min is the minimum histogram processing result, G_max is the maximum histogram processing result, L is the number of gray levels, and round denotes rounding the pixel value result; after all pixels are processed, a single image I' is obtained and the set is recorded as C';
2.4) calculate the average pixel value A of the images in the image set C', where C' = {I_1', I_2', ..., I_n'}; the calculation formula is as follows:
A = (1 / (M × N)) × Σ_{r=1..M} Σ_{c=1..N} I_t'(r, c)
where M is the length of the image in pixels, N is the width of the image in pixels, I_t'(r, c) is the pixel value at coordinate (r, c) of image t in the image set C', and t is the index of the selected image;
2.5) perform missing-value filling on the images in the image set C', with fill values g(i, j), to obtain a data set C''; the calculation formula is given as an image in the original publication, where g(i, j) is the missing or filled value, I'(i, j) is the pixel value at coordinate (i, j) of image I' in the image set C', and Th is a set threshold;
2.6) divide the data set C'' into a training set E and a test set T in the ratio m:n.
Further, the step 3) of constructing the RetinaXNet network model includes the following steps:
3.1) using the input module, uniformly reduce the images in the training set E to size r × l, where r is the reduced length in pixels and l is the reduced width in pixels; the transformed set is recorded as X = {x_1, x_2, ..., x_n};
3.2) using the trunk module, extract the contour features of each frame in the set X through a residual structure; the residual structure formula is as follows:
F(x) = f(x) + f(f(x)) + f(f(f(x)))
where f(x) = δ(W * x) + c
In the formula, each frame image x serves as the input of the convolution layer, W is the parameter to be learned by the convolution, δ is the activation function, c is a constant term, and F(x) is the output of the residual structure;
3.3) set the classification-fusion input parameter structure F_CLS in the detection head module; the formula is as follows:
F_CLS = δ[W_CLS * F(x)] + c
where F(x) is the output of the residual structure, W_CLS is a training parameter, δ is the activation function, and c is a constant term;
3.4) set the nonlinear regression connection (LCR) parameter structure in the detection head module; the formula is given as an image in the original publication; in the formula, F_CLS is the input part for parameter fusion, W_1 and W_2 are training parameters, δ is the activation function, c is a constant term, and the LCR connection is a nonlinear connection whose role is to strengthen the link between classification and regression;
3.5) calculate the enhanced regression parameter; the calculation formula is given as an image in the original publication and produces the enhanced regression parameter from the original regression parameter F_reg of the network;
3.6) set the nonlinear classification connection parameter structure in the detection head module; the formula is given as an image in the original publication; in the formula, W_3 is a training parameter and the enhanced regression parameter is the input, yielding the nonlinear classification connection parameter structure;
3.7) calculate the enhanced classification parameter; the calculation formula is given as an image in the original publication and produces the enhanced classification parameter from the original classification parameter F_cls of the network.
Further, in step 4) the weights of the RetinaXNet network model are optimized with the UFL function; the optimization formula is given as an image in the original publication, where y is the true label taking the value 0 or 1, a dynamic adjustment factor parameter of the UFL appears in the formula, γ is the rate for adjusting sample weights, and α is the weight parameter.
Further, the step 5) of training the RetinaXNet network model includes the following steps:
5.1) calculate the accuracy; the calculation formula is as follows:
accuracy = (TP + TN) / (TP + TN + FP + FN)
where TP is the number of samples in the test set T for which the network output is a positive sample and the reference standard is also a positive sample; TN is the number for which the network output is a negative sample and the reference standard is also a negative sample; FP is the number for which the network output is a positive sample but the reference standard is a negative sample; FN is the number for which the network output is a negative sample but the reference standard is a positive sample;
5.2) calculate the recall; the calculation formula is as follows:
recall = TP / (TP + FN)
where TP is the number of positive samples judged to be positive, and FN is the number of samples judged to be negative that are in fact positive;
5.3) calculate the F1 value; the calculation formula is as follows:
F1 = 2 × accuracy × recall / (accuracy + recall)
where F1 balances accuracy and recall, combining the two into a single measure;
5.4) if the F1 value is less than t, return to step 5.1) for retraining; otherwise go to step 6).
Further, the step 6) of determining whether the night scene light is abnormal includes the following steps:
6.1) map the network output back to the original image; the formulas for the coordinates, width and height mapped to the original image are given as images in the original publication and scale the detection frame back according to the reduction ratio, where x, y, W and H are the top-left coordinates, width and height of the detection frame, and x_N, y_N, W_N and H_N are the coordinates, width and height mapped to the original image;
6.2) anomaly detection; the detection formula is given as an image in the original publication and outputs a flag warner: when warner = 0, no abnormality is found; when warner = 1, an abnormality is found and an alarm is triggered.
Compared with the traditional method for detecting abnormal building night scene lighting, the method saves cost, reduces the subjectivity of manual judgment, improves detection efficiency, and achieves a good detection effect under illumination changes, viewing-angle changes, reduced image clarity and similar conditions.
Drawings
Fig. 1 is a flowchart of a building night scene light abnormality detection method according to the present invention.
Detailed Description
The present invention will be further described below with reference to the accompanying drawings and specific embodiments; the technical solutions and design principles of the invention are described in detail, but the invention is not limited thereto. Any obvious improvement, replacement or modification made by a person skilled in the art without departing from the spirit of the invention falls within its scope.
The equipment used in the building night scene light abnormity detection method based on RetinaXNet comprises an Internet of Things camera, a GPU computing server and an alarm.
The building night scene light abnormity detection method based on RetinaXNet is shown in figure 1, and comprises the following steps:
1) an initial night scene light image set C is constructed and sent to a GPU computing server for storage, and the method comprises the following steps:
1.1) acquiring video data V of night scene light by using a camera, wherein the camera is fixedly arranged in advance at a place where night scene light detection is needed;
1.2) extracting one frame of image from the video data V at intervals of Δt, and constructing an initial night scene light image set C, recorded as C = {I_1, I_2, ..., I_n}, where I_i is the i-th frame image and n is the number of night scene light images;
1.3) sending the initial night scene light image set C to a GPU computing server for storage;
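As an illustration of steps 1.1)-1.3), the following Python sketch samples one frame from the night scene video every Δt seconds with OpenCV and accumulates the initial image set C; the file name, the choice of Δt and the use of OpenCV are assumptions for illustration, not part of the patent.

```python
import cv2

def build_initial_image_set(video_path: str, delta_t: float):
    """Extract one frame every delta_t seconds from video V to form the set C."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0   # fall back if FPS metadata is missing
    step = max(1, int(round(fps * delta_t)))  # number of frames between samples
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(frame)              # I_i: the i-th sampled frame
        idx += 1
    cap.release()
    return frames                             # C = {I_1, I_2, ..., I_n}

# Example with a hypothetical file name:
# C = build_initial_image_set("night_scene.mp4", delta_t=5.0)
```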
2) process the image set C to obtain an image set C', perform missing-value filling on the images in C' to obtain a data set C'', and divide C'' into a training set E and a test set T; as a preferred embodiment of the invention, this comprises the following steps:
2.1) for each frame image in the image set C, calculate the probability p_i(j) that a pixel value is less than j; the calculation formula is as follows:
p_i(j) = n_t / n_I
where p_i(j) is the probability that a gray level greater than 0 and less than j occurs in the i-th frame image, n_t is the number of pixels with a gray level less than j, and n_I is the total number of pixels in each frame image;
2.2) calculate the histogram result G(i) of each frame image in the set C; the calculation formula is as follows:
G(i) = Σ_{j=0..i} p_i(j)
where G(i) is the gray-level histogram processing result of the i-th frame, 0 ≤ i < 256, and p_i(j) is the probability that a pixel value greater than 0 and less than j occurs in the i-th frame image;
2.3) calculate the pixel equalization result H(v) and equalize the image set C, where C = {I_1, I_2, ..., I_n}; the processed image set is recorded as C' = {I_1', I_2', ..., I_n'}; the calculation formula is as follows:
H(v) = round[(G(v) - G_min) / (G_max - G_min) × (L - 1)]
where v is a pixel value of a single image I in the image set C, H(v) is the equalization result for v, G(v) is the histogram processing result of the current v, G_min is the minimum histogram processing result, G_max is the maximum histogram processing result, L is the number of gray levels, and round denotes rounding the pixel value result; after all pixels are processed, a single image I' is obtained and the set is recorded as C';
2.4) calculate the average pixel value A of the images in the image set C', where C' = {I_1', I_2', ..., I_n'}; the calculation formula is as follows:
A = (1 / (M × N)) × Σ_{r=1..M} Σ_{c=1..N} I_t'(r, c)
where M is the length of the image in pixels, N is the width of the image in pixels, I_t'(r, c) is the pixel value at coordinate (r, c) of image t in the image set C', and t is the index of the selected image;
2.5) perform missing-value filling on the images in the image set C', with fill values g(i, j), to obtain a data set C''; the calculation formula is given as an image in the original publication, where g(i, j) is the missing or filled value, I'(i, j) is the pixel value at coordinate (i, j) of image I' in the image set C', and Th is a set threshold; in the embodiment of the present invention, Th = 180;
2.6) divide the data set C'' into a training set E and a test set T in the ratio m:n, where m:n = 9:1;
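A minimal Python sketch of steps 2.1)-2.6), assuming standard per-frame histogram equalization for steps 2.1)-2.3) and assuming, for step 2.5), that pixels above the threshold Th are replaced by the average value A (the exact piecewise rule is given only as an image in the original publication); grayscale input, Th = 180 and the 9:1 split follow the embodiment.

```python
import numpy as np

def equalize(img: np.ndarray, L: int = 256) -> np.ndarray:
    """Histogram equalization H(v) = round((G(v) - G_min) / (G_max - G_min) * (L - 1)) for a uint8 grayscale image."""
    hist = np.bincount(img.ravel(), minlength=L)
    cdf = np.cumsum(hist) / img.size                     # cumulative probabilities G(v)
    g_min, g_max = cdf[cdf > 0].min(), cdf.max()
    lut = np.round((cdf - g_min) / (g_max - g_min) * (L - 1)).clip(0, L - 1)
    return lut[img].astype(np.uint8)

def fill_missing(img: np.ndarray, th: int = 180) -> np.ndarray:
    """Assumed fill rule: pixels whose value exceeds Th are replaced by the image average A."""
    a = img.mean()                                       # average pixel value A (step 2.4)
    out = img.astype(np.float64)
    out[img > th] = a                                    # assumption about the direction of the comparison with Th
    return out.astype(np.uint8)

def prepare_dataset(frames, th: int = 180, ratio=(9, 1)):
    """C -> C' (equalized) -> C'' (filled), then split into training set E and test set T."""
    processed = [fill_missing(equalize(f), th) for f in frames]
    split = len(processed) * ratio[0] // sum(ratio)
    return processed[:split], processed[split:]
```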
3) constructing a RetinaXNet network model; the RetinaXNet network model comprises an input module, a trunk module and a detection head module; as a preferred embodiment of the invention, the method comprises the following steps:
3.1) using the input module, uniformly reduce the images in the training set E to size r × l, where r is the reduced length in pixels and l is the reduced width in pixels; the transformed set is recorded as X = {x_1, x_2, ..., x_n}; in a particular embodiment of the invention, r = 224 and l = 224;
3.2) using the trunk module, extract the contour features of each frame in the set X through a residual structure; the residual structure formula is as follows:
F(x) = f(x) + f(f(x)) + f(f(f(x)))
where f(x) = δ(W * x) + c
In the formula, each frame image x serves as the input of the convolution layer, W is the parameter to be learned by the convolution, δ is the activation function, c is a constant term, and F(x) is the output of the residual structure. The residual computation is repeated n times; in a specific embodiment of the invention, n is set to 50 (a code sketch of this structure is given after step 3.7).
3.3) set the classification-fusion input parameter structure F_CLS in the detection head module; the formula is as follows:
F_CLS = δ[W_CLS * F(x)] + c
where F(x) is the output of the residual structure, W_CLS is a training parameter, δ is the activation function, and c is a constant term.
3.4) set the nonlinear regression connection (LCR) parameter structure in the detection head module; the formula is given as an image in the original publication; in the formula, F_CLS is the input part for parameter fusion, W_1 and W_2 are training parameters, δ is the activation function, c is a constant term, and the LCR connection is a nonlinear connection whose role is to strengthen the link between classification and regression.
3.5) calculate the enhanced regression parameter; the calculation formula is given as an image in the original publication and produces the enhanced regression parameter from the original regression parameter F_reg of the network.
3.6) set the nonlinear classification connection parameter structure in the detection head module; the formula is given as an image in the original publication; in the formula, W_3 is a training parameter and the enhanced regression parameter is the input, yielding the nonlinear classification connection parameter structure.
3.7) calculate the enhanced classification parameter; the calculation formula is given as an image in the original publication and produces the enhanced classification parameter from the original classification parameter F_cls of the network.
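To make steps 3.1)-3.7) concrete, the PyTorch-style sketch below follows the stated residual formula F(x) = f(x) + f(f(x)) + f(f(f(x))) with f(x) = δ(W * x) + c and the classification-fusion input F_CLS = δ[W_CLS * F(x)] + c; because the LCR connection and the exact enhancement formulas are given only as images in the original publication, the small convolutional LCR block, the additive fusion and the choice of ReLU as δ are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualStructure(nn.Module):
    """Trunk block: F(x) = f(x) + f(f(x)) + f(f(f(x))), with f(x) = delta(W * x) + c."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)  # W
        self.c = nn.Parameter(torch.zeros(1))                                # constant term c

    def f(self, x):
        return F.relu(self.conv(x)) + self.c          # delta taken as ReLU (assumption)

    def forward(self, x):
        f1 = self.f(x)
        f2 = self.f(f1)
        f3 = self.f(f2)
        return f1 + f2 + f3                           # F(x)

class XNetHead(nn.Module):
    """Detection head sketch: classification fusion, an assumed LCR link and assumed
    additive enhancement of the regression and classification branches."""
    def __init__(self, channels: int, num_classes: int):
        super().__init__()
        self.w_cls = nn.Conv2d(channels, channels, kernel_size=3, padding=1)    # W_CLS
        self.lcr = nn.Sequential(                                               # assumed LCR connection (W_1, W_2)
            nn.Conv2d(channels, channels, kernel_size=1), nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=1), nn.ReLU())
        self.reg = nn.Conv2d(channels, 4, kernel_size=3, padding=1)             # original regression branch F_reg
        self.cls = nn.Conv2d(channels, num_classes, kernel_size=3, padding=1)   # original classification branch F_cls
        self.w3 = nn.Conv2d(4, num_classes, kernel_size=1)                      # W_3 (assumed shape)

    def forward(self, feat):
        f_cls_in = F.relu(self.w_cls(feat))           # F_CLS = delta[W_CLS * F(x)] + c (c folded into the conv bias)
        link = self.lcr(f_cls_in)                     # nonlinear LCR connection (assumed form)
        reg_hat = self.reg(feat) + self.reg(link)     # assumed enhancement of the regression parameters
        cls_hat = self.cls(feat) + self.w3(reg_hat)   # assumed enhancement of the classification parameters
        return cls_hat, reg_hat
```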
4) optimize the weights of the RetinaXNet network model with the UFL function; the formula is given as an image in the original publication, where y is the true label taking the value 0 or 1, a dynamic adjustment factor parameter of the UFL appears in the formula, γ is the rate for adjusting sample weights, and α is the weight parameter; in the embodiment of the present invention, γ = 2 and α = 0.25;
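The exact UFL expression is given only as an image in the original publication; the sketch below therefore uses a standard focal-loss form with the quantities named in the text (true label y in {0, 1}, a dynamic adjustment factor derived from the predicted probability, rate γ = 2 and weight α = 0.25 from the embodiment) and should be read as an assumption about the intended loss, not as the patent's exact formula.

```python
import torch

def ufl_loss(p: torch.Tensor, y: torch.Tensor, gamma: float = 2.0, alpha: float = 0.25) -> torch.Tensor:
    """Focal-loss-style sketch of the UFL objective.

    p: predicted probability of the positive class, values in (0, 1)
    y: ground-truth label, 0 or 1
    """
    p_t = torch.where(y == 1, p, 1.0 - p)             # dynamic adjustment factor (assumed definition)
    alpha_t = torch.where(y == 1, torch.full_like(p, alpha), torch.full_like(p, 1.0 - alpha))
    loss = -alpha_t * (1.0 - p_t) ** gamma * torch.log(p_t.clamp(min=1e-7))
    return loss.mean()
```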
5) train the network model; as a preferred embodiment of the present invention, the training comprises the following steps:
5.1) calculate the accuracy; the calculation formula is as follows:
accuracy = (TP + TN) / (TP + TN + FP + FN)
where TP is the number of samples in the test set T for which the network output is a positive sample and the reference standard is also a positive sample; TN is the number for which the network output is a negative sample and the reference standard is also a negative sample; FP is the number for which the network output is a positive sample but the reference standard is a negative sample; FN is the number for which the network output is a negative sample but the reference standard is a positive sample.
5.2) calculate the recall; the calculation formula is as follows:
recall = TP / (TP + FN)
where TP is the number of positive samples judged to be positive, and FN is the number of samples judged to be negative that are in fact positive;
5.3) calculate the F1 value; the calculation formula is as follows:
F1 = 2 × accuracy × recall / (accuracy + recall)
where F1 balances accuracy and recall, combining the two into a single measure.
5.4) if the F1 value is less than t, return to step 5.1) for retraining; otherwise go to step 6); in a specific embodiment of the present invention, t = 0.6;
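The evaluation in steps 5.1)-5.4) reduces to a few counts; a small Python sketch, combining accuracy and recall into F1 as described in step 5.3) and using the embodiment threshold t = 0.6:

```python
def evaluate(tp: int, tn: int, fp: int, fn: int, t: float = 0.6):
    """Compute accuracy, recall and F1 (steps 5.1)-5.3)) and the retraining decision (step 5.4))."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * accuracy * recall / (accuracy + recall) if (accuracy + recall) else 0.0
    retrain = f1 < t                                   # retrain when F1 falls below t
    return accuracy, recall, f1, retrain

# Example: evaluate(tp=80, tn=90, fp=10, fn=20) returns the metrics and whether another training round is needed.
```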
6) night scene light abnormity detection: acquire a frame to be detected through the camera, send it to the network for detection, map the network output back to the original image, and judge whether a faulty lamp is present; this comprises the following steps:
6.1) map the network output back to the original image; the formulas for the coordinates, width and height mapped to the original image are given as images in the original publication and scale the detection frame back according to the reduction ratio, where x, y, W and H are the top-left coordinates, width and height of the detection frame, and x_N, y_N, W_N and H_N are the coordinates, width and height mapped to the original image.
6.2) anomaly detection; the detection formula is given as an image in the original publication and outputs a flag warner: when warner = 0, no abnormality is found; when warner = 1, an abnormality is found and an alarm is triggered.
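Steps 6.1) and 6.2) map each detection back to the original resolution and raise an alarm when a fault is detected; the mapping and alarm formulas are given only as images in the original publication, so the simple ratio scaling by the reduction proportion and the "alarm if any fault box is present" rule below are assumptions, with the original frame size given as an example value.

```python
def map_box_to_original(box, reduced_size=(224, 224), original_size=(1920, 1080)):
    """Scale a detection box (x, y, W, H) from the reduced image back to the original image.

    Assumed rule: multiply by the width/height reduction ratios used by the input module (step 3.1).
    """
    x, y, w, h = box
    r, l = reduced_size                     # reduced size (224 x 224 in the embodiment)
    ow, oh = original_size                  # original frame size (example value, not from the patent)
    sx, sy = ow / r, oh / l
    return x * sx, y * sy, w * sx, h * sy   # x_N, y_N, W_N, H_N

def warner(fault_boxes) -> int:
    """Assumed alarm rule: 1 (trigger the alarm) if any faulty lamp is detected, otherwise 0."""
    return 1 if len(fault_boxes) > 0 else 0
```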

Claims (7)

1. A building night scene light abnormity detection method based on RetinaXNet is characterized by comprising the following steps:
1) constructing an initial night scene light image set C and sending it to a GPU (graphics processing unit) computing server for storage;
2) processing the image set C to obtain a data set, and dividing the data set into a training set E and a test set T;
3) constructing a RetinaXNet network model; the RetinaXNet network model comprises an input module, a trunk module and a detection head module;
4) optimizing the weight of the RetinaXNet network model by using a UFL function;
5) training a RetinaXNet network model;
6) night scene light abnormity detection: acquiring a frame to be detected through a camera, sending it into the RetinaXNet network, mapping the network output back to the original image, and judging whether the night scene lights are abnormal.
2. The method for detecting the light abnormality of the night scene of the building based on RetinaXNet as claimed in claim 1, wherein the step 1) comprises the steps of:
1.1) acquiring video data V of night scene light by using a camera, wherein the camera is fixedly arranged at a place where night scene light detection is needed in advance;
1.2) extracting one frame of image from the video data V at intervals of Δt, and constructing an initial night scene light image set C, recorded as C = {I_1, I_2, ..., I_n}, where I_i is the i-th frame image and n is the number of night scene light images;
1.3) sending the initial night scene light image set C to a GPU computing server for storage.
3. The method for detecting the light abnormality of the night scene of the building based on RetinaXNet as claimed in claim 1, wherein the step 2) comprises the steps of:
2.1) for each frame image in the image set C, calculate the probability p_i(j) that a pixel value is less than j; the calculation formula is as follows:
p_i(j) = n_t / n_I
where p_i(j) is the probability that a gray level greater than 0 and less than j occurs in the i-th frame image, n_t is the number of pixels with a gray level less than j, and n_I is the total number of pixels in each frame image;
2.2) calculate the histogram result G(i) of each frame image in the set C; the calculation formula is as follows:
G(i) = Σ_{j=0..i} p_i(j)
where G(i) is the gray-level histogram processing result of the i-th frame, 0 ≤ i < 256, and p_i(j) is the probability that a pixel value greater than 0 and less than j occurs in the i-th frame image;
2.3) calculate the pixel equalization result H(v) and equalize the image set C, where C = {I_1, I_2, ..., I_n}; the processed image set is recorded as C' = {I_1', I_2', ..., I_n'}; the calculation formula is as follows:
H(v) = round[(G(v) - G_min) / (G_max - G_min) × (L - 1)]
where v is a pixel value of a single image I in the image set C, H(v) is the equalization result for v, G(v) is the histogram processing result of the current v, G_min is the minimum histogram processing result, G_max is the maximum histogram processing result, L is the number of gray levels, and round denotes rounding the pixel value result; after all pixels are processed, a single image I' is obtained and the set is recorded as C';
2.4) calculate the average pixel value A of the images in the image set C', where C' = {I_1', I_2', ..., I_n'}; the calculation formula is as follows:
A = (1 / (M × N)) × Σ_{r=1..M} Σ_{c=1..N} I_t'(r, c)
where M is the length of the image in pixels, N is the width of the image in pixels, I_t'(r, c) is the pixel value at coordinate (r, c) of image t in the image set C', and t is the index of the selected image;
2.5) perform missing-value filling on the images in the image set C', with fill values g(i, j), to obtain a data set C''; the calculation formula is given as an image in the original publication, where g(i, j) is the missing or filled value, I'(i, j) is the pixel value at coordinate (i, j) of image I' in the image set C', and Th is a set threshold;
2.6) divide the data set C'' into a training set E and a test set T in the ratio m:n.
4. The method for detecting the abnormal light of the night scene of the building based on RetinaXNet as claimed in claim 1, wherein the step 3) of constructing the RetinaXNet network model comprises the following steps:
3.1) using the input module, uniformly reduce the images in the training set E to size r × l, where r is the reduced length in pixels and l is the reduced width in pixels; the transformed set is recorded as X = {x_1, x_2, ..., x_n};
3.2) using the trunk module, extract the contour features of each frame in the set X through a residual structure; the residual structure formula is as follows:
F(x) = f(x) + f(f(x)) + f(f(f(x)))
where f(x) = δ(W * x) + c
In the formula, each frame image x serves as the input of the convolution layer, W is the parameter to be learned by the convolution, δ is the activation function, c is a constant term, and F(x) is the output of the residual structure;
3.3) set the classification-fusion input parameter structure F_CLS in the detection head module; the formula is as follows:
F_CLS = δ[W_CLS * F(x)] + c
where F(x) is the output of the residual structure, W_CLS is a training parameter, δ is the activation function, and c is a constant term;
3.4) set the nonlinear regression connection (LCR) parameter structure in the detection head module; the formula is given as an image in the original publication; in the formula, F_CLS is the input part for parameter fusion, W_1 and W_2 are training parameters, δ is the activation function, c is a constant term, and the LCR connection is a nonlinear connection whose role is to strengthen the link between classification and regression;
3.5) calculate the enhanced regression parameter; the calculation formula is given as an image in the original publication and produces the enhanced regression parameter from the original regression parameter F_reg of the network;
3.6) set the nonlinear classification connection parameter structure in the detection head module; the formula is given as an image in the original publication; in the formula, W_3 is a training parameter and the enhanced regression parameter is the input, yielding the nonlinear classification connection parameter structure;
3.7) calculate the enhanced classification parameter; the calculation formula is given as an image in the original publication and produces the enhanced classification parameter from the original classification parameter F_cls of the network.
5. The method for detecting abnormal light in the night scene of a building based on RetinaXNet as claimed in claim 1, wherein in step 4) the UFL function is used to optimize the weights of the RetinaXNet network model; the optimization formula is given as an image in the original publication, where y is the true label taking the value 0 or 1, a dynamic adjustment factor parameter of the UFL appears in the formula, γ is the rate for adjusting sample weights, and α is the weight parameter.
6. The method for detecting abnormal light in night scenes of buildings based on RetinaXNet as claimed in claim 1, wherein the step 5) of training the RetinaXNet network model comprises the steps of:
5.1) calculate the accuracy; the calculation formula is as follows:
accuracy = (TP + TN) / (TP + TN + FP + FN)
where TP is the number of samples in the test set T for which the network output is a positive sample and the reference standard is also a positive sample; TN is the number for which the network output is a negative sample and the reference standard is also a negative sample; FP is the number for which the network output is a positive sample but the reference standard is a negative sample; FN is the number for which the network output is a negative sample but the reference standard is a positive sample;
5.2) calculate the recall; the calculation formula is as follows:
recall = TP / (TP + FN)
where TP is the number of positive samples judged to be positive, and FN is the number of samples judged to be negative that are in fact positive;
5.3) calculate the F1 value; the calculation formula is as follows:
F1 = 2 × accuracy × recall / (accuracy + recall)
where F1 balances accuracy and recall, combining the two into a single measure;
5.4) if the F1 value is less than t, return to step 5.1) for retraining; otherwise go to step 6).
7. The method for detecting light abnormality of night scene of building based on RetinaXNet as claimed in claim 1, wherein said step 6) of determining whether the night scene light is abnormal includes the steps of:
6.1) map the network output back to the original image; the formulas for the coordinates, width and height mapped to the original image are given as images in the original publication and scale the detection frame back according to the reduction ratio, where x, y, W and H are the top-left coordinates, width and height of the detection frame, and x_N, y_N, W_N and H_N are the coordinates, width and height mapped to the original image;
6.2) anomaly detection; the detection formula is given as an image in the original publication and outputs a flag warner: when warner = 0, no abnormality is found; when warner = 1, an abnormality is found and an alarm is triggered.
CN202110909371.1A 2021-08-09 2021-08-09 Building night scene light abnormity detection method based on RetinaXNet Pending CN113762084A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110909371.1A CN113762084A (en) 2021-08-09 2021-08-09 Building night scene light abnormity detection method based on RetinaXNet


Publications (1)

Publication Number Publication Date
CN113762084A true CN113762084A (en) 2021-12-07

Family

ID=78788759

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110909371.1A Pending CN113762084A (en) 2021-08-09 2021-08-09 Building night scene light abnormity detection method based on RetinaXNet

Country Status (1)

Country Link
CN (1) CN113762084A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115294456A (en) * 2022-08-23 2022-11-04 山东巍然智能科技有限公司 Building lightening project detection method, equipment and storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709511A (en) * 2016-12-08 2017-05-24 华中师范大学 Urban rail transit panoramic monitoring video fault detection method based on depth learning
WO2019169895A1 (en) * 2018-03-09 2019-09-12 华南理工大学 Fast side-face interference resistant face detection method
CN112200019A (en) * 2020-09-22 2021-01-08 江苏大学 Rapid building night scene lighting light fault detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20240418
Address after: 1003, Building A, Zhiyun Industrial Park, No. 13 Huaxing Road, Henglang Community, Dalang Street, Longhua District, Shenzhen City, Guangdong Province, 518000
Applicant after: Shenzhen Wanzhida Technology Transfer Center Co.,Ltd.
Country or region after: China
Address before: 212013 No. 301, Xuefu Road, Zhenjiang, Jiangsu
Applicant before: JIANGSU University
Country or region before: China