WO2022124479A1

WO2022124479A1 - Device for monitoring area prone to freezing and risk of slippage using deep learning model, and method therefor

Info

Publication number: WO2022124479A1
Application number: PCT/KR2021/002474
Authority: WO
Inventors: 지택수; 김진술; 김치훈; 장재혁
Original assignee: 전남대학교 산학협력단
Priority date: 2020-12-08
Filing date: 2021-02-26
Publication date: 2022-06-16
Also published as: KR102287823B1

Abstract

A method for monitoring an area prone to freezing and risk of slippage of the present invention comprises: a step of generating a multi-channel image vector by embedding a lidar image in which a data processing unit scans a monitoring area through a lidar sensor, an infrared filter image in which the monitoring area is photographed with an infrared filter camera, and a thermal infrared image photographed with a thermal infrared camera; a step in which a detection unit inputs the multi-channel image vector into a detection model; a step in which the detection model performs a plurality of operations to which weights learned between a plurality of layers are applied to the multi-channel image vector, and outputs a partition box for specifying an area in which the occurrence of freezing is estimated, and a probability that freezing is present in an area specified by the partition box; and a step in which, when the probability is greater than or equal to a preset threshold, the detection unit recognizes that freezing has occurred, and transmits a message notifying of the occurrence of freezing.

Description

Apparatus and method for monitoring habitual ice and slippery risk areas using deep learning model

The present invention relates to a monitoring technology, and more particularly, to an apparatus for monitoring a habitual ice and slippery risk area using a deep learning model and a method therefor.

Black ice (or clear ice) refers to a phenomenon in which a thin layer of ice is formed as if coated on the road surface. This includes the phenomenon in which snow and moisture get entangled with soot and dust in the air through cracks in the asphalt surface and then freeze to black. In cold winter, it occurs mainly in shady and low temperature places such as on bridges, tunnel entrances, shady roads, and in the shade of mountain corners.

An object of the present invention is to provide an apparatus for monitoring a habitual ice and slippery risk area using a deep learning model and a method therefor.

In the method for monitoring a habitual ice and slippery danger area according to a preferred embodiment of the present invention for achieving the above object, the data processing unit scans the monitoring area through the lidar sensor, the lidar image, and the monitoring region. Creating a multi-channel image vector by embedding an infrared filter image captured by an infrared filter camera and a thermal infrared image captured by a thermal infrared camera in the monitoring area; and inputting the multi-channel image vector into a detection model by a detector And, the detection model performs a plurality of calculations to which the weights learned between a plurality of layers are applied to the multi-channel image vector, and a partition box for specifying an area where the occurrence of icing is estimated, and an area specified by the partition box. outputting a probability of the existence of the ice crystals, and when the probability is greater than or equal to a preset threshold, the detection unit recognizes that ice has occurred, and transmits a message informing of the occurrence of ice.

In the generating of the multi-channel image vector, the data processing unit divides the lidar image, the infrared filter image, and the thermal infrared image into a plurality of unit regions having predetermined unit heights and unit widths in horizontal and vertical directions. Step, the data processing unit performing a convolution operation using a convolution filter of the same standard for each of the plurality of unit regions to extract a feature value expressing the characteristics of the unit region, the data processing unit performing the generating a lidar image vector, an infrared filter image vector, and a thermal infrared image vector using the feature values derived for each of the plurality of unit regions of the IDA image, the infrared filter image, and the thermal infrared image as elements; and generating, by a processing unit, the multi-channel image vector by merging the lidar image vector, the infrared filter image vector, and the thermal infrared image vector.

The convolution filter has the same standard as the unit area, has elements corresponding to the number of pixels in the unit area, all elements of the convolution filter have a value of 0 or 1, and elements adjacent to each other of the convolution filter is characterized by having different values.

In the method, before the step of dividing the plurality of unit regions, the data processing unit detects a region of interest through image processing for the infrared filter image, and erases pixel values of the remaining regions except for the detected region of interest, or 0 It further comprises the step of filling with

The method further includes, before dividing the plurality of unit regions, by the data processing unit erasing or filling the pixel values of pixels in the thermal infrared image with a temperature equal to or greater than a predetermined value by zero.

In the method, before the step of generating the multi-channel image vector, the model generation unit scans the learning area including at least a part of the area for which the freezing state is known through the data processing unit with the lidar sensor, and the learning area is subjected to infrared rays. Generating a multi-channel image vector for learning from the infrared filter image captured by the filter camera and the thermal infrared image captured by the thermal infrared camera of the learning area, and the model generating unit classifying the icy state and the non-icing state to learn multi-channel Setting a label for the image vector, the model generating unit setting hyperparameters including independent hyperparameters and dependent hyperparameters for the loss function, and the model generating unit detecting the multi-channel image vector for training as a detection model calculating an output value through a plurality of calculations in which a plurality of layer weights are applied to the multi-channel image vector for learning to which the detection model is input, and the model generator using the loss function to calculate the output value. and performing optimization of correcting the weight of the detection model so that the loss that is the difference between the label and the label is minimized, and verifying the detection model through an evaluation index until the detection model reaches a preset accuracy The method further includes repeating the steps of generating a channel image vector, setting the label, inputting the input to the detection model, calculating the output value, and performing the optimization.

The loss function is

where S is the number of cells, C is the confidence score, B is the number of compartments in one cell, and pi(c) is the probability that the object in the i-th cell belongs to class c, and i is a parameter indicating a cell in which the frozen state object exists, j is a parameter indicating a predicted compartment box, bx and by are the center coordinates of the compartment box, bw and bh are the width and height of the compartment box, respectively , remind

is an independent hyperparameter, wherein

is a dependent hyperparameter.

The step of setting the hyperparameter is the independent hyperparameter at each iteration.

By setting by increasing by a predetermined value for each repetition from 0.5 to 1, the equation

According to the dependent hyperparameter, the

It is characterized in that it is set by decreasing by a predetermined value from 0.5 to 0.

In the method for monitoring a habitual freezing and slippery danger area according to a preferred embodiment of the present invention for achieving the above object, the model generator selects a learning area including at least a part of an area known as whether or not ice is frozen through the data processing unit. Generating a multi-channel image vector for learning from a lidar image scanned with a lidar sensor, an infrared filter image obtained by photographing the learning area with an infrared filter camera, and a thermal infrared image photographed with the learning area with a thermal infrared camera; Setting a label for a multi-channel image vector for training by a model generating unit by distinguishing between an icy state and a non-icing state, and setting, by the model generating unit, a hyperparameter including independent hyperparameters and dependent hyperparameters for the loss function and inputting the multi-channel image vector for training into a detection model by the model generator, and calculating an output value through a plurality of calculations in which a plurality of layer weights are applied to the multi-channel image vector for training to which the detection model is input. performing optimization of correcting the weight of the detection model so that the loss, which is the difference between the output value and the label, is minimized by the model generator through the loss function, and verifying the detection model through the evaluation index until the detection model reaches a preset accuracy, generating the multi-channel image vector for training, setting the label, inputting to the detection model, and calculating the output value; and repeating the step of performing the optimization.

The method embeds a lidar image in which the data processing unit scans the monitoring area through a lidar sensor, an infrared filter image in which the monitoring area is photographed with an infrared filter camera, and a thermal infrared image in which the monitoring area is photographed by a thermal infrared camera. generating a multi-channel image vector; inputting the multi-channel image vector into the detection model by a detection unit; outputting a partition box specifying an area in which the occurrence of ice is estimated by performing an operation and a probability that ice exists in the area specified by the partition box; Recognizing and transmitting a message notifying the occurrence of freezing.

The loss function is

is an independent hyperparameter, wherein

is a dependent hyperparameter.

According to the dependent hyperparameter, the

The apparatus for monitoring a habitual ice and slippery danger area according to a preferred embodiment of the present invention for achieving the above object is a lidar image scanned through a lidar sensor in a monitoring area, and an infrared filter camera in the monitoring area A data processing unit for generating a multi-channel image vector by embedding an infrared filter image taken with a thermal infrared camera and a thermal infrared image photographed with a thermal infrared camera in the monitoring area, and inputting the multi-channel image vector into a detection model so that the detection model is When a partition box for specifying an area where the occurrence of ice is estimated by performing a plurality of operations to which a weight learned between a plurality of layers is applied on a multi-channel image vector and a probability that ice exists in the area specified by the partition box are output, and a detector for recognizing whether or not ice has occurred according to the probability.

The data processing unit divides the lidar image, the infrared filter image, and the thermal infrared image into a plurality of unit regions having predetermined unit heights and unit widths in horizontal and vertical directions, and the same is applied to each of the plurality of unit regions. A convolution operation is performed using a standard convolution filter to extract a feature value expressing a characteristic of a corresponding unit area, and derived for a plurality of unit areas of each of the LiDAR image, the infrared filter image, and the thermal infrared image A lidar image vector, an infrared filter image vector, and a thermal infrared image vector are generated using feature values as elements, and the multi-channel image vector is obtained by merging the lidar image vector, the infrared filter image vector and the thermal infrared image vector. It is characterized by creating

The data processing unit detects a region of interest through image processing for the infrared filter image before classifying the plurality of unit regions, and erases or fills in pixel values of the remaining regions except for the detected region of interest do it with

The data processing unit may erase or fill a pixel value of a pixel having a temperature equal to or greater than a predetermined value in the thermal infrared image in the thermal infrared image before dividing it into the plurality of unit regions.

The model generating unit includes a lidar image scanned by a lidar sensor on a learning region including at least a part of a region for which ice is known through the data processing unit, an infrared filter image captured by the learning region with an infrared filter camera, and the learning region Generates a multi-channel image vector for training from a thermal infrared image taken with a thermal infrared camera, sets a label for the multi-channel image vector for learning by classifying an icy state and a non-freezing state, and sets the multi-channel image vector for learning Set hyperparameters including independent hyperparameters and dependent hyperparameters for the loss function, input the multi-channel image vector for training into a detection model, and weights of a plurality of layers with respect to the multi-channel image vector for training to which the detection model is input When an output value is calculated through a plurality of operations to which α is applied, optimization is performed to correct the weight of the detection model so that a loss that is a difference between the output value and the label is minimized through the loss function.

The loss function is

is an independent hyperparameter, wherein

is a dependent hyperparameter.

The model generating unit is the independent hyperparameter.

According to the dependent hyperparameter, the

According to the present invention, by using a deep learning model, it is possible to detect in real time whether or not icing such as black ice has occurred in a habitual icing and slippery risk area, and notify it. Therefore, accidents caused by road icing such as black ice can be prevented in advance.

1 is a diagram for explaining the configuration of a system for monitoring a habitual ice and slippery danger area using a deep learning model according to an embodiment of the present invention.

2 is a diagram for explaining a monitoring area of a monitoring device in a system for monitoring a habitual ice and slippery danger area using a deep learning model according to an embodiment of the present invention.

3 is a block diagram for explaining the configuration of an apparatus for monitoring a habitual ice and slippery danger area using a deep learning model according to an embodiment of the present invention.

4, 5, 6, 7, 8, and 9 are diagrams for explaining a method of generating a multi-channel image vector of a data processing unit of a monitoring apparatus according to an embodiment of the present invention.

10 is a diagram for explaining the configuration of a detection model (DM) according to an embodiment of the present invention.

11 is a diagram for explaining an output value of a detection model DM according to an embodiment of the present invention.

12 is a flowchart illustrating a method of generating a detection model for monitoring a habitual ice and slippery danger area according to an embodiment of the present invention.

13 is a flowchart for explaining a method for monitoring a habitual ice and slippery risk area using a deep learning model according to an embodiment of the present invention.

14 is a diagram illustrating a computing device according to an embodiment of the present invention.

Prior to the detailed description of the present invention, the terms or words used in the present specification and claims described below should not be construed as being limited to their ordinary or dictionary meanings, and the inventors should develop their own inventions in the best way. It should be interpreted as meaning and concept consistent with the technical idea of the present invention based on the principle that it can be appropriately defined as a concept of a term for explanation. Therefore, the embodiments described in the present specification and the configurations shown in the drawings are only the most preferred embodiments of the present invention, and do not represent all the technical spirit of the present invention, so various equivalents that can be substituted for them at the time of the present application It should be understood that there may be water and variations.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. In this case, it should be noted that in the accompanying drawings, the same components are denoted by the same reference numerals as much as possible. In addition, detailed descriptions of well-known functions and configurations that may obscure the gist of the present invention will be omitted. For the same reason, some components are exaggerated, omitted, or schematically illustrated in the accompanying drawings, and the size of each component does not fully reflect the actual size.

First, a system for monitoring a habitual ice and slippery danger area using a deep learning model according to an embodiment of the present invention will be described. 1 is a diagram for explaining the configuration of a system for monitoring a habitual ice and slippery danger area using a deep learning model according to an embodiment of the present invention. 2 is a diagram for explaining a monitoring area of a monitoring device in a system for monitoring a habitual ice and slippery danger area using a deep learning model according to an embodiment of the present invention. Referring to FIG. 1 , the monitoring system according to an embodiment of the present invention includes a plurality of monitoring devices 10 , a plurality of edge devices 20 connected to the plurality of monitoring devices 10 , and a plurality of edge devices 20 . It includes a monitoring server 30 and a control server 40 to manage. The plurality of monitoring devices 10 , the plurality of edge devices 20 , the monitoring server 30 and the control server 40 may be connected to each other through communication.

As shown in FIG. 2 , the monitoring device 10 is disposed at a predetermined location and monitors whether ice is generated in the monitoring area MA allocated to the disposed location. If it is detected that icing such as black ice has occurred during such monitoring, the monitoring device 10 transmits a message notifying that icing has occurred to the monitoring server 30 through the edge device 20, and the monitoring server 30 sends this message back to the control server 40 . Here, the control server 40 may be a device used in a situation control room such as the Road Traffic Authority or a police station.

Then, the configuration of the monitoring device 10 according to the embodiment of the present invention will be described in more detail. 3 is a block diagram for explaining the configuration of an apparatus for monitoring a habitual ice and slippery danger area using a deep learning model according to an embodiment of the present invention. 4 to 9 are diagrams for explaining a method of generating a multi-channel image vector of a data processing unit of a monitoring apparatus according to an embodiment of the present invention. 10 is a diagram for explaining the configuration of a detection model (DM) according to an embodiment of the present invention. 11 is a diagram for explaining an output value of a detection model DM according to an embodiment of the present invention.

First, referring to FIG. 3 , the monitoring apparatus 10 according to an embodiment of the present invention includes a camera unit 11 , a lidar unit 12 , a communication unit 13 , and a control unit 14 .

The camera unit 11 is for capturing an image. The camera unit 11 includes an infrared filter camera 110 and a thermal infrared camera 120 . The infrared filter camera 110 adds an infrared cut filter (IR cut) to the video camera. The infrared filter camera 110 outputs an infrared filter image that is a color image in which the near infrared region is filtered by photographing a subject. The thermal infrared camera 120 outputs a thermal image by photographing a subject.

The lidar unit 12 includes a lidar sensor 200 . The lidar sensor 200 radiates radio waves in an omni-Directional direction, and coordinates for objects for each angle in a vertical or horizontal direction in a three-dimensional space or a horizontal direction in a two-dimensional space from the center of the lidar sensor 200 and outputting scan data including a plurality of scan information including a reflection intensity indicating a light reflected intensity. The scan information includes the coordinates of the object on the three-dimensional Cartesian coordinate system consisting of the X and Y axes of a plane parallel to the ground and the Z axis in the height direction, and the intensity at which the radio waves are reflected. That is, the scan information included in the scan data scanned and output by the lidar sensor 200 includes coordinates indicating the position of any one of a plurality of points constituting the object surface through a three-dimensional Cartesian coordinate system, and the radio waves from the point. Includes reflection intensity indicating the intensity that is reflected. In this way, the scan data output by the lidar sensor 200 is expressed by Equation 1 below.

In Equation 1, Sd represents scan data. In addition, through the scan data, N represents the number of output scan information. That is, the number of scan information indicates the number of the plurality of points when there are a plurality of points on the object surface where the radio waves are reflected. The number N of scan information may vary for each scan moment of the lidar sensor 200 . In addition, (xk, yk, zk) are coordinates according to the Cartesian coordinate system of each of a plurality of points on the surface of an object that reflects radio waves, and vk is a reflection intensity indicating the intensity of radio waves reflected from each of the points on the surface of the object. The lidar unit 12 generates and outputs a lidar image based on scan data including scan information.

The communication unit 15 is for communication with the edge device 20 . Also, the communication unit 15 may communicate with the monitoring server 30 through the edge device 20 . The communication unit 15 includes an RF transmitter for up-converting and amplifying a frequency of a transmitted signal, and an RF receiver for low-noise amplifying and down-converting a received signal. In addition, the communication unit 15 includes a modem that modulates a transmitted signal and demodulates a received signal.

The controller 14 may control the overall operation of the monitoring device 10 and the signal flow between internal blocks of the monitoring device 10 , and may perform a data processing function of processing data. Also, the control unit 14 basically serves to control various functions of the monitoring device 10 . The controller 14 may include a central processing unit (CPU), a graphics processing unit (GPU), an accelerated processing unit (APU), a digital signal processor (DSP), and the like. The control unit 14 includes a data processing unit 300 , a model generation unit 400 , and a detection unit 500 .

The data processing unit 300 includes an infrared filter image FI photographed by the infrared filter camera 110 of the camera unit 11 and a thermal infrared image TI photographed by the thermal infrared camera 120 and the lidar unit 12 . To generate a multi-channel image vector (M) that is data input to the detection model (DM) by embedding the lidar image (LI) generated based on the scan data scanned by the lidar sensor 200 in a predetermined vector space it is for The data processing unit 300 receives the lidar image LI scanned through the lidar sensor 200 of the monitoring area MA from the lidar unit 12 , and receives the monitoring area MA from the camera unit 11 . An infrared filter image FI photographed by the infrared filter camera 110 and a thermal infrared image TI photographed by the thermal infrared camera 120 in the monitoring area MA may be input.

In this way, when the data processing unit 300 receives the LIDAR image, the infrared filter image, and the thermal infrared image LI, FI, TI, the pixels of some of the infrared filter image and the thermal infrared image FI, TI are selectively received. You can erase the pixel value of , or fill it with 0. That is, the data processing unit 300 may detect the region of interest ROI as shown in FIG. 4 through image processing for the infrared filter image FI. In this case, the data processing unit 300 may detect the region of interest (ROI) using techniques such as histogram filtering, Canny edge detection, Hough transform, and Harris corner detection. Then, as shown in FIG. 5 , the data processing unit 300 may erase all pixel values of the remaining regions except for the region of interest (ROI) or fill them with zero. Also, when the temperature of a pixel in the thermal infrared image TI is greater than or equal to a predetermined value, the data processing unit 300 may erase all pixel values of the corresponding pixel or fill it with zero. As described above, the process of erasing pixel values of some pixels of the infrared filter image and the thermal infrared image or filling them with 0 may be optional and may be omitted.

Next, referring to FIG. 6 , when the three images LI, FI, and TI all have the same height H and width W, the data processing unit 300 generates the three images LI, FI, and TI. ), divided into x and y in the horizontal and vertical directions, and divided into a plurality of unit areas Ua having a predetermined unit height Uh and unit width Uw.

Then, the data processing unit 300 performs a convolution operation using a convolution filter Uf of the same standard as each of the plurality of unit areas Ua to express a feature of the corresponding unit area Ua (Uf) to extract At this time, as shown in FIG. 7 , the convolution filter Uf has the same standard (Uw*Uh) as the unit area Ua, and has elements corresponding to the number of pixels in the unit area Ua. In particular, all elements of the convolution filter Uf have a value of 0 or 1, and elements adjacent to each other have different values. That is, 0 is always adjacent to 1, and 1 is always arranged to be adjacent to 0.

Then, as shown in FIG. 8 , the data processing unit 300 uses a feature value Uf derived for a plurality of unit areas Ua of each of the three images LI, FI, and TI as an element. An image vector, which is a two-dimensional matrix, is generated, that is, a lidar image vector, an infrared filter image vector, and a thermal infrared image vector (DL, DF, DT). Next, the data processing unit 300 generates a multi-channel image vector M by merging all three two-dimensional image vectors DL, DF, and DT. This multi-channel image vector (M) is input to the detection model (DM), and the detection model (DM) estimates whether or not icing occurs through an operation on the multi-channel image vector (M).

The detection model (DM) includes one or more neural networks including one or more layers. Such a detection model (DM) includes one or more layers, and any one layer performs one or more operations. The calculation result of one layer is weighted and transmitted to the next layer. This means that the weight is applied to the operation result of the current layer and input to the operation of the next layer. In other words, the detection model DM performs a plurality of operations to which weights are applied. A plurality of layers is a convolution layer (CVL) that performs a convolution operation, a pooling layer that performs a down sampling operation or an up sampling operation (PLL: Pooling Layer), It may include a fully connected layer (FCL) that performs an operation by an activation function, and the like. Each of the convolution, downsampling, and upsampling operations uses a kernel composed of a predetermined matrix, and values of elements of the matrix constituting the kernel may be the weight w. Here, the activation function may be exemplified by Sigmoid, Hyperbolic tangent (tanh), Exponential Linear Unit (ELU), Rectified Linear Unit (ReLU), Leakly ReLU, Maxout, Minout, Softmax, and the like. . The detection model DM may basically include models such as You Only Look Once (YOLO), YOLOv2, YOLO9000, and YOLOv3. The detection model (DM) may further include additional layers or networks such as a Fully Connected Layer (FCL), a Neural Network (DN), and a Deep Neural Network (DNN).

According to an embodiment, the detection model DM includes a prediction network (PN) and a detection network (DN) corresponding to the prediction network (PN), as shown in FIG. 10 . When a multi-channel image vector is input, the prediction network EN performs a plurality of operations to which weights of a plurality of layers are applied and outputs a predicted value. That is, referring to FIG. 11 , the prediction network EN generates images LI, FI, TI or multi-channel image vectors M, for example, as (1,1) to (3,4) of FIG. 11 , After dividing into a plurality of cells, a plurality of partition boxes (B: Bounding Box) having center coordinates (x, y) in each of the plurality of cells are coordinates (x) defining the center, width, and height based on the cell to which each belongs , y, w, h), the confidence indicating the probability that the object exists in the area of the compartment B while the object is included in the compartment B, and the object in the compartment B is of multiple classes The probability of belonging to each object can be calculated and output as a predicted value.

The detection network DN selects a partition box B corresponding to one or more predicted values among a plurality of partition boxes B corresponding to the predicted value and outputs it as an output value. The detection network DN calculates an output value through a plurality of operations in which a weight is applied to the predicted value. In this case, the first detection network DN1 and the second detection network DN2 may calculate an output value using the prediction values of the first prediction network PN1 and the second prediction network PN2. The third detection network DN3 and the fourth detection network DN4 may calculate an output value using prediction values of all of the first to fourth prediction networks. For example, the detection network (DN: DN1, DN2, DN3, DN4) selects a partition box (B) in which the probability that the object in the plurality of partition boxes (B) is an object of a pre-learned class is greater than or equal to a preset threshold output value can be calculated. As shown in FIG. 10 , the detection network DN may display an output value on the images LI, FI, TI, preferably FI.

The model generator 400 is for learning the detection model DM. The model generator 400 trains the detection model DM to output a boundary box (B) that specifies an area where the occurrence of ice is estimated and the probability that ice exists in the area specified by the partition box. To this end, the model generator 400 generates a multi-channel image vector for learning (M) and then inputs it to the detection model (DM). In the multi-channel image vector (M) for learning, two types of labels are set, and a label indicating the area occupied by icing in the above-described multi-channel image vector (M) in the same format as the partition box (B) and an area without icing It includes a label that is displayed in the same format as the compartment box (B).

Then, the detection model DM calculates and outputs an output value through a plurality of operations in which a plurality of layer weights are applied to the multi-channel image vector M for learning. The output value is the coordinates (bx, by, bw, bh) defining the partition box (B), and the degree to which the area occupied by the partition box (B) matches the ideal box (ground-truth box) containing 100% of the freezing area It includes the confidence (confidence: 0~1) representing , and the probability that ice has occurred in the compartment box (B) (eg, 0.785).

Based on the output value of the detection model DM, the model generator 400 may derive a loss value according to the loss function. For example, the loss function is expressed by Equation 2 below.

S represents the number of cells, and C represents the confidence score. B represents the number of compartments in one cell. pi(c) represents the probability that the object of the i-th cell belongs to the corresponding class (c). For example, when the first class (c=1) is a frozen state object representing a frozen state, if pi(1)=0.789, it indicates that the probability that the frozen state object exists is 78.9%. Here, i is a parameter indicating a cell in which the frozen state object exists, and j is a parameter indicating a predicted partition box. In addition, bx and by represent the center coordinates of the partition box, and bw and bh represent the width and height of the partition box, respectively.

is to further reflect the values of the variables of the compartment box, and is an independent hyperparameter for balancing the loss for the coordinates (bx, by, bw, bh) of the compartment box (B) and other losses. to be.

is to reflect the values of the variables of the compartment box more and reflect the values less for the area where the frozen state object does not exist. in other words,

is a dependent hyperparameter for balancing between a compartment with an ice state object and a compartment without ice. According to an embodiment, the parameter including the independent hyperparameter and the dependent hyperparameter is preset, and the value of the dependent hyperparameter is dependent on the value of the independent hyperparameter. Accordingly, the independent hyperparameter and the dependent hyperparameter may be set in a relationship as shown in Equation 3 below.

The setting of these hyperparameters may be sequentially changed as learning proceeds.

is 1 when there is ice in cell i, and 0 when there is no ice.

is 1 if there is ice in compartment j in cell i, and 0 otherwise.

is 1 if there is no object in partition box j in cell i, and 0 if there is.

The first and second terms of the loss function of Equation 2 are as shown in Equation 4 below.

The first and second terms of this loss function calculate the coordinate loss representing the difference between the coordinates (x, y, w, h) of the compartment box and the coordinates of the label indicating the area occupied by ice. it is to do

In addition, the third and fourth terms of the loss function of Equation 2 are as shown in Equation 5 below.

The third and fourth terms of this loss function calculate a confidence loss representing the difference between the area occupied by the compartment box (B) and the ideal box (ground-truth box) containing 100% of the area occupied by ice. it is to do

Finally, the last term of the loss function of Equation 2 is as Equation 6 below.

Equation 6 is for calculating a classification loss representing a difference between an object output as existing in the partition box B and an object existing in the actual partition box B. For example, the probability that the frozen state object exists in any one compartment B is output as 0.765, but when it does not exist, for example, when the expected value is 0.000, this loss (-0.765) is calculated.

The model generator 400 calculates a loss value, that is, a coordinate loss, a reliability loss, and a classification loss through the loss function, and optimizes the weight of the detection model DM so that the coordinate loss, the reliability loss, and the classification loss are minimized. . According to an embodiment of the present invention, the model generator 400 may perform optimization by adjusting hyperparameters. This method will be described in more detail below.

The detection unit 500 specifies an area where ice has occurred through the detection model DM through a boundary box (B), and calculates a probability that ice exists in the specified area. Then, the existence of ice is finally determined according to the probability that there is ice in the specified area. To this end, the detection unit 500 generates a multi-channel image vector (Mt) from the data processing unit 300, the lidar image, the infrared filter image, and the thermal image input from the camera unit 11 and the lidar unit 12, When this is output, the multi-channel image vector Mt is input, and the multi-channel image vector Mt is input to the detection model DM. Then, the detection model DM calculates and outputs an output value through a plurality of operations to which the weights learned between the plurality of layers are applied. At this time, if the probability that the object in the partition box (B) having a reliability equal to or greater than a predetermined value in the output value of the detection model (DM) is greater than or equal to a preset threshold, the detection unit 500 is within the area occupied by the partition box (B). It is judged that icing has occurred. On the other hand, the detection unit 500 has a probability that the object in the partition box B having a reliability greater than or equal to a predetermined value in the output value of the detection model DM is less than a preset threshold or less than the probability of belonging to an object in a non-freezing state. , it is considered that no freezing has occurred.

Next, a method for generating a detection model (DM), which is a deep learning model for monitoring habitual ice and slippery risk areas, will be described. 12 is a flowchart illustrating a method of generating a detection model for monitoring a habitual ice and slippery danger area according to an embodiment of the present invention.

12 , in step S110 , the model generating unit 400 scans a learning area including at least a part of an area for which freezing or not, through the data processing unit 300 , is scanned with the lidar sensor 200 , A multi-channel image vector (Mt) for learning is generated from the infrared filter image photographed by the infrared filter camera 110 of the same learning area and the thermal infrared image obtained by photographing the same learning area with the thermal infrared camera 120 .

The model generator 400 sets the label for the multi-channel image vector Mt for learning by classifying the frozen state and the non-freezing state in step S120 . That is, the partition box B is added by dividing the area in the frozen state and the area in the non-freezing state.

Then, the model generator 400 sets the loss function hyperparameter in step S130. In this case, the dependent hyperparameter may be set by setting the independent hyperparameter. In this step S130, the model generator 400 may set the independent hyperparameter to 0.5 as an initial value. In addition, whenever step S130 is repeated, the value of the independent hyperparameter may be sequentially increased by a predetermined value. Then, according to Equation 2, the value of the dependent hyperparameter may be decreased by a predetermined number from 0.5 to 0.

Next, the model generator 400 inputs the multi-channel image vector Mt for training to the detection model DM in step S140. Then, the detection model DM will output an output value calculated through a plurality of operations in which a plurality of layer weights are applied to the multi-channel image vector Mt for learning input in step S150. The output value of the detection model (DM) is the coordinates (x, y, w, h) of the compartment box (B), the reliability of the compartment box (B), the probability that the object in the compartment box (B) is an frozen state object, and the non-freezing state It contains the probability of being an object. Accordingly, the loss function of the detection model (DM) is a coordinate loss indicating the difference between the coordinates of the partition box (B) output as an output value and the coordinates of the label indicating the area occupied by the actual freezing area, Confidence loss indicating the difference between the compartment box (B) and the ideal box (ground-truth box) and classification loss indicating the difference between the class of the object in the compartment box (B) output as an output value and the class of the real object (classification loss).

At this time, the model generating unit 400 calculates the loss that is the difference between the output value and the label, that is, the coordinate loss, the reliability loss, and the classification loss through the loss function in step S160, and the loss including the coordinate loss, the reliability loss and the classification loss. Optimization is performed to correct the weight of the detection model DM so that this is minimized. Steps S110 to S160 described above may be repeatedly performed using a plurality of different multi-channel image vectors for learning. In this repetition, as described above, the hyperparameter value may be changed and set in step S130. At this time, the model generating unit 400 is an independent hyperparameter according to Equation 2 at each iteration.

By increasing the value by a predetermined value for each repetition from 0.5 to 1, the dependent hyperparameter

can be set by decreasing by a predetermined value from 0.5 to 0. Because it is difficult to clearly distinguish between the frozen state and the non-freeze state before the learning level is increased, a term for compensation, that is, the fourth term, is required. However, after the learning level is increased, the independent hyperparameter value can be set to 1. Accordingly, the 4th term of the loss function is canceled because the dependent hyperparameter value becomes 0. Therefore, learning is performed to clearly distinguish between the frozen state and the non-freeze state without compensating for the fourth term of the loss function.

The above-described steps S110 to S160 may be repeated until the detection model reaches a preset accuracy by verifying the detection model through an evaluation index. Accordingly, the model generator 400 determines whether the learning completion condition is satisfied in step S170 . According to an embodiment, the model generator 400 may determine that the learning completion condition is satisfied when the output value of the detection model DM through a preset evaluation index is equal to or greater than a preset accuracy. In this way, if the learning completion condition is satisfied, the model generating unit 400 completes the learning in step S180.

When the learning of the detection model DM is completed as described above, it is possible to monitor whether icing occurs using the detection model DM. These methods will be described. 13 is a flowchart for explaining a method for monitoring a habitual ice and slippery risk area using a deep learning model according to an embodiment of the present invention.

Referring to FIG. 13 , the data processing unit 300 captures a lidar image scanned through the lidar through the lidar unit 12 through the lidar unit 12 in step S210 and the monitoring region through the camera unit 11 with an infrared filter camera. It receives an infrared filter image and a thermal infrared image captured by a thermal infrared camera of the monitoring area.

Then, the data processing unit 300 embeds the lidar image, the infrared filter image and the thermal infrared image in step S220 to generate and output a multi-channel image vector M. From the lidar image, the infrared filter image and the thermal infrared image, A method of generating the channel image vector M is the same as described above with reference to FIGS. 4 to 9 .

When the multi-channel image vector M is input from the data processing unit 300, the detection unit 400 inputs the multi-channel image vector M to the detection model DM. Then, the detection model DM is input. Output values calculated through a plurality of calculations to which the weights learned between a plurality of layers are applied to the channel image vector (M. These output values are the coordinates (x, y, w, h) of the partition box (B), It includes the reliability of the partition box (B) and the probability that the object in the partition box (B) is an icing object, and the probability that it is an unfreezing object.

Then, the detection unit 500 determines whether ice is generated in the monitoring area according to the output value of the detection model DM in step S220.

According to an embodiment, the detection unit 500 recognizes that ice has occurred when the probability that the freezing state object exists in the partition box B having a reliability equal to or greater than a preset threshold is greater than or equal to the threshold. On the other hand, the detection unit 500 may determine that the freezing does not occur when the probability that the freezing state object exists in the partition box B having the reliability equal to or greater than the preset threshold is less than the threshold.

According to another embodiment, the detection unit 500 is more than a preset threshold while the probability of the existence of the frozen state object in the compartment box (B) having the reliability equal to or greater than the preset threshold exceeds the probability of the existence of the non-frozen state object, it is diagnosed as intussusception. . For example, assuming that the threshold is 0.700 (70%), the probability that the frozen state object exists in the compartment B with a reliability greater than or equal to a predetermined value is 88% (blk = 0.877), and the probability that the non-frozen object exists is a normal class Assuming that the probability of belonging to is 12% (noblk=0.123), the detection unit 500 determines that the probability that the frozen state object exists (88%) exceeds the probability that the frozen state object exists (12%), and the threshold value (70%) ), it is recognized that ice has occurred in the area specified by the compartment box (B). On the other hand, the detection unit 500 determines that freezing has not occurred if the probability of the existence of the frozen state object in the partition box B having the reliability equal to or greater than the preset value is less than or equal to the probability of the non-icing state object being present or less than the preset threshold. . Similarly, assume that the threshold is 0.700 (70%). According to the output value of another example, it is assumed that the probability that the frozen state object exists in the compartment B is 69% (blk=0.691), and the probability that the non-freeze state object exists is 31% (noblk=0.309). Then, the detection unit 500 determines that the probability (69%) of the existence of the frozen state object exceeds the probability (31%) of the non-freezing state object, but the probability (69%) of the existence of the frozen state object is less than the threshold (70%). Therefore, it is recognized that freezing has not occurred.

After the above determination, the detection unit 500 may transmit a message notifying whether or not ice has occurred to the control server 40 through the communication unit 13 . Accordingly, the manager of the control server 40 may take follow-up actions according to the message.

14 is a diagram illustrating a computing device according to an embodiment of the present invention. The computing device TN100 may be a device described herein (eg, the monitoring device 10 , the edge device 20 , the monitoring server 30 , the control server 40 , etc.).

The computing device TN100 may include at least one processor TN110 , a transceiver device TN120 , and a memory TN130 . In addition, the computing device TN100 may further include a storage device TN140 , an input interface device TN150 , an output interface device TN160 , and the like. Components included in the computing device TN100 may be connected by a bus TN170 to communicate with each other.

The processor TN110 may execute a program command stored in at least one of the memory TN130 and the storage device TN140. The processor TN110 may mean a central processing unit (CPU), a graphics processing unit (GPU), or a dedicated processor on which methods according to an embodiment of the present invention are performed. The processor TN110 may be configured to implement procedures, functions, and methods described in connection with an embodiment of the present invention. The processor TN110 may control each component of the computing device TN100.

Each of the memory TN130 and the storage device TN140 may store various information related to the operation of the processor TN110. Each of the memory TN130 and the storage device TN140 may be configured as at least one of a volatile storage medium and a nonvolatile storage medium. For example, the memory TN130 may include at least one of a read only memory (ROM) and a random access memory (RAM).

The transceiver TN120 may transmit or receive a wired signal or a wireless signal. The transceiver TN120 may be connected to a network to perform communication.

On the other hand, the embodiment of the present invention is not implemented only through the apparatus and/or method described so far, and a program for realizing a function corresponding to the configuration of the embodiment of the present invention or a recording medium in which the program is recorded may be implemented. And, such an implementation can be easily implemented by those skilled in the art from the description of the above-described embodiment.

Meanwhile, the method according to the embodiment of the present invention described above may be implemented in the form of a program readable by various computer means and recorded in a computer readable recording medium. Here, the recording medium may include a program command, a data file, a data structure, etc. alone or in combination. The program instructions recorded on the recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the art of computer software. For example, the recording medium includes magnetic media such as hard disks, floppy disks and magnetic tapes, optical recording media such as CD-ROMs and DVDs, and magneto-optical media such as floppy disks ( magneto-optical media) and hardware devices specially configured to store and execute program instructions such as ROM, RAM, flash memory, and the like. Examples of program instructions may include not only machine language wires such as those generated by a compiler, but also high-level language wires that can be executed by a computer using an interpreter or the like. Such hardware devices may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

Although the present invention has been described above using several preferred embodiments, these examples are illustrative and not restrictive. As such, those of ordinary skill in the art to which the present invention pertains will understand that various changes and modifications can be made in accordance with the doctrine of equivalents without departing from the spirit of the present invention and the scope of rights set forth in the appended claims.

10: monitoring device 11: camera unit

12: lidar unit 13: communication unit

14: control unit 20: edge device

30: monitoring server 40: control server

110: infrared camera 120: thermal infrared camera

200: lidar sensor 300: data processing unit

400: model generation unit 500: detection unit

Claims

A method for monitoring a habitual ice and slippery risk area, the method comprising:

A multi-channel image by embedding a lidar image in which the data processing unit scans the monitoring area through the lidar sensor, an infrared filter image in which the monitoring area is photographed with an infrared filter camera, and a thermal infrared image captured in the monitoring area by a thermal infrared camera generating a vector;

inputting the multi-channel image vector into a detection model by a detection unit;

The detection model performs a plurality of calculations to which the weights learned between a plurality of layers are applied to the multi-channel image vector, and a partition box for specifying an area in which the occurrence of ice is estimated, and a partition box for specifying a region specified by the partition box. outputting a probability; and

when the probability is greater than or equal to a preset threshold, recognizing, by the detector, that ice has occurred, and transmitting a message informing of the occurrence of ice;

characterized in that it comprises

Methods for monitoring.
According to claim 1,

The step of generating the multi-channel image vector comprises:

dividing, by the data processing unit, the lidar image, the infrared filter image, and the thermal infrared image into a plurality of unit regions having predetermined unit heights and unit widths in horizontal and vertical directions;

extracting, by the data processing unit, a feature value expressing a characteristic of the unit region by performing a convolution operation on each of the plurality of unit regions using a convolution filter of the same standard;

The data processing unit generates a lidar image vector, an infrared filter image vector, and a thermal infrared image vector using feature values derived for a plurality of unit regions of each of the lidar image, the infrared filter image, and the thermal infrared image as elements. to do;

generating, by the data processing unit, the multi-channel image vector by merging the lidar image vector, the infrared filter image vector, and the thermal infrared image vector;

characterized in that it comprises

Methods for monitoring.
3. The method of claim 2,

The convolutional filter is

It is the same standard as the unit area,

having an element corresponding to the number of pixels in the unit area;

All elements of the convolution filter have a value of 0 or 1,

Neighboring elements of the convolution filter, characterized in that have different values

Methods for monitoring.
3. The method of claim 2,

Before the step of dividing into the plurality of unit areas,

detecting, by the data processing unit, a region of interest through image processing with respect to the infrared filter image, and erasing or filling pixel values of regions other than the detected region of interest with zero;

characterized in that it further comprises

Methods for monitoring.
3. The method of claim 2,

Before the step of dividing into the plurality of unit areas,

erasing or filling, by the data processing unit, a pixel value of a pixel having a temperature equal to or greater than a predetermined value in the thermal infrared image;

characterized in that it further comprises

Methods for monitoring.
According to claim 1,

Before generating the multi-channel image vector,

The model generator scans the learning area including at least a part of the area for which the freezing state is known through the data processing unit with the lidar sensor, and opens the learning area. generating a multi-channel image vector for learning from a thermal infrared image taken with an infrared camera;

setting, by the model generator, a label for a multi-channel image vector for training by classifying an icy state and a non-icing state;

setting, by the model generator, hyperparameters including independent hyperparameters and dependent hyperparameters for the loss function;

inputting, by the model generator, the multi-channel image vector for training into a detection model;

calculating an output value through a plurality of operations in which weights of a plurality of layers are applied to the multi-channel image vector for learning to which the detection model is input;

performing, by the model generator, an optimization of modifying the weight of the detection model so that a loss, which is a difference between the output value and the label, is minimized through the loss function;

The detection model is verified through the evaluation index until the detection model reaches a preset accuracy.

generating the multi-channel image vector for training, setting the label, inputting the label to the detection model, calculating the output value, and performing the optimization.

repeating;

characterized in that it further comprises

Methods for monitoring.
7. The method of claim 6,

The loss function is

ego,

where S is the number of cells,

Wherein C is the confidence score,

where B is the number of compartments in one cell,

The pi(c) is the probability that the object of the i-th cell belongs to class c,

wherein i is a parameter indicating a cell in which a frozen state object exists,

Where j is a parameter representing the predicted partition box,

The bx and by are the center coordinates of the compartment box,

Wherein bw and bh are the width and height of the compartment box, respectively,

remind
is an independent hyperparameter,

remind
is a dependent hyperparameter, characterized in that

Methods for monitoring.
8. The method of claim 7,

The step of setting the hyperparameter is

For each repetition,

said independent hyperparameter, said
By setting increments from 0.5 to 1 by a predetermined value each time it is repeated,

formula
Depending on the

said dependent hyperparameter, said
characterized in that it is set by decreasing by a predetermined number from 0.5 to 0

Methods for monitoring.
A device for monitoring a habitual ice and slippery risk area, the device comprising:

A multi-channel image vector is generated by embedding a lidar image scanned in the monitoring area through a lidar sensor, an infrared filter image captured in the monitoring area using an infrared filter camera, and a thermal infrared image captured in the monitoring area using a thermal infrared camera. data processing unit; and

A partition box that inputs the multi-channel image vector into a detection model, and the detection model performs a plurality of calculations to which the weights learned between a plurality of layers are applied to the multi-channel image vector to specify an area in which the occurrence of icing is estimated; and a detection unit for recognizing whether or not ice is generated according to the probability when the probability that ice exists in the area specified by the partition box is output;

characterized in that it comprises

device for monitoring.
10. The method of claim 9,

The data processing unit

dividing the lidar image, the infrared filter image, and the thermal infrared image into a plurality of unit regions having predetermined unit heights and unit widths in horizontal and vertical directions,

performing a convolution operation on each of the plurality of unit regions using a convolution filter of the same standard to extract a feature value expressing the characteristics of the unit region;

generating a lidar image vector, an infrared filter image vector and a thermal infrared image vector using the feature values derived for each of the plurality of unit regions of the lidar image, the infrared filter image and the thermal infrared image as elements;

The multi-channel image vector is generated by merging the lidar image vector, the infrared filter image vector, and the thermal infrared image vector.

device for monitoring.
11. The method of claim 10,

The convolutional filter is

It is the same standard as the unit area,

having an element corresponding to the number of pixels in the unit area;

All elements of the convolution filter have a value of 0 or 1,

Neighboring elements of the convolution filter, characterized in that have different values

device for monitoring.
11. The method of claim 10,

the data processing unit

Before dividing into the plurality of unit areas,

Detecting a region of interest through image processing with respect to the infrared filter image, and erasing pixel values of the remaining regions except for the detected region of interest or filling the image with 0

device for monitoring.
11. The method of claim 10,

the data processing unit

Before dividing into the plurality of unit areas,

In the thermal infrared image, a pixel value of a pixel having a temperature greater than or equal to a predetermined value is erased or filled with zero.

device for monitoring.
10. The method of claim 9,

The model generation unit

A lidar image scanned by a lidar sensor of a learning region including at least a part of a region for which freezing or not is known through the data processing unit, an infrared filter image obtained by photographing the learning region with an infrared filter camera, and a thermal infrared camera for the learning region A multi-channel image vector for learning is generated from the thermal infrared image taken with

Set a label for the multi-channel image vector for learning by classifying the frozen state and the non-freezing state,

Set hyperparameters including independent hyperparameters and dependent hyperparameters for the loss function with respect to the multi-channel image vector for training,

When the multi-channel image vector for training is input to the detection model and the detection model calculates an output value through a plurality of calculations in which weights of a plurality of layers are applied to the input multi-channel image vector for training,

Optimizing the weight of the detection model so that the loss that is the difference between the output value and the label is minimized through the loss function

device for monitoring.
15. The method of claim 14,

The loss function is

ego,

where S is the number of cells,

Wherein C is the confidence score,

where B is the number of compartments in one cell,

The pi(c) is the probability that the object of the i-th cell belongs to class c,

wherein i is a parameter indicating a cell in which a frozen state object exists,

Where j is a parameter representing the predicted partition box,

The bx and by are the center coordinates of the compartment box,

Wherein bw and bh are the width and height of the compartment box, respectively,

remind
is an independent hyperparameter,

remind
is the dependent hyperparameter,

The model generation unit

said independent hyperparameter, said
By setting increments from 0.5 to 1 by a predetermined value each time it is repeated,

formula
Depending on the

said dependent hyperparameter, said
characterized in that it is set by decreasing by a predetermined number from 0.5 to 0

device for monitoring.