CN116452850A - Road ponding area identification method based on data mining and deep learning - Google Patents

Road ponding area identification method based on data mining and deep learning

Info

Publication number
CN116452850A
CN116452850A (application CN202310238225.XA)
Authority
CN
China
Prior art keywords
image
ponding
data
training
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310238225.XA
Other languages
Chinese (zh)
Inventor
黄国如
廖宇鸿
郑嘉璇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202310238225.XA priority Critical patent/CN116452850A/en
Publication of CN116452850A publication Critical patent/CN116452850A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/25: Determination of region of interest [ROI] or a volume of interest [VOI]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/762: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 10/00: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A 10/40: Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a road ponding area identification method based on data mining and deep learning. The method comprises the following steps: acquiring urban road waterlogging and ponding images from Internet big data platforms using web crawlers and data mining technology; preprocessing the images and creating data labels for the ponding feature information; expanding the dataset with image processing algorithms and data enhancement techniques to generate a large amount of image data, and dividing it into a training set and a test set; clustering the training set label boxes with the K-means clustering algorithm and adjusting the model configuration parameters; learning and training on the visible ponding range in the dataset with the deep-learning Mask RCNN object detection model to obtain the final model training weight file; and extracting and visually predicting the image ponding features to detect and identify the ponding range. The invention widens the channels for image data acquisition and greatly reduces the difficulty of constructing image datasets.

Description

Road ponding area identification method based on data mining and deep learning
Technical Field
The invention relates to the field of urban waterlogging monitoring, in particular to a road ponding area identification method based on data mining and deep learning technologies.
Background
In recent years, against the background of climate change, urbanization has continued to accelerate. Influenced by urban construction and human activity, urban rainstorm waterlogging disasters occur frequently, seriously disrupting residents' daily life and work and causing heavy losses of public and private property. To help the relevant departments carry out emergency rescue and disaster management for urban waterlogging, and to reduce as much as possible the threats to life and property caused by urban rainstorm waterlogging, real-time monitoring of urban waterlogging is needed to improve the emergency management capability for urban rainstorm and flood disasters and reduce losses. Traditional manual or instrument-based monitoring methods not only consume large amounts of manpower, material and financial resources, but also struggle to achieve real-time, rapid monitoring; in addition, the instruments are expensive, easily affected by the external environment, and insufficiently stable. For these reasons, a new technical method for real-time waterlogging monitoring is needed.
Deep learning is an emerging research field that has attracted wide attention in recent years. The advent of the RCNN and YOLO models demonstrated the unique advantages of deep learning for feature extraction and optimization in computer vision, and He et al. proposed the Mask RCNN object detection network model in 2017 (He K, Gkioxari G, Dollár P, et al. Mask R-CNN [C]// Proceedings of the IEEE International Conference on Computer Vision. 2017: 2961-2969). Regarding the application of deep learning to urban waterlogging monitoring, scholars have carried out some research. Bai Ganggang et al. (Bai Ganggang, Hou Jingming, Han Hao, et al. An intelligent road ponding monitoring method based on deep learning [J]. Water Resources Protection, 2021, 37(05): 75-80) explored the use of the YOLOv2 network model for automatic recognition and extraction of ponding areas; Jafari et al. (Jafari N H, Li X, Chen Q, et al. Real-time water level monitoring using live cameras and computer vision techniques [J]. Computers & Geosciences, 2021, 147: 104642) successfully distinguished image regions of heavy rain water and localized floods by training a deep-learning image segmentation model on urban hydrologic images acquired by traffic cameras. However, the image sources in the above studies depend on fixed monitoring cameras, which brings limitations such as a single data source and poor diversity of ponding scenes, while deploying many cameras increases image acquisition difficulty and equipment cost.
The rapid development of the Internet has made extraction and mining of network big data a new trend in data acquisition. Domestic scholars have already used microblog text big data in waterlogging disaster research; for example, Wu Xianhua et al. (Wu Xianhua, Xiao Yang, Wang Guofu, et al. A technical method for judging the disaster grade of urban waterlogging disasters and studying public opinion based on microblog big data, taking Nanjing as an example [J]. Journal of Catastrophology, 2018, 33(03): 117-122) proposed such a method. However, related research applying network images is still relatively scarce, and urban waterlogging monitoring that combines network big data with deep learning remains a direction worth exploring in depth. On this basis, a road ponding area identification method based on data mining and deep learning is proposed; it is of great significance for the orderly development of waterlogging disaster emergency management, for improving the capability of relevant departments and the public to respond to sudden urban waterlogging events, and for reducing the property losses and casualties caused by urban waterlogging.
Disclosure of Invention
The invention aims to solve the problems that traditional urban waterlogging monitoring consumes large amounts of manpower, material and financial resources and suffers from poor real-time performance and stability, and that existing deep-learning waterlogging detection methods have a single image data source and difficult data acquisition.
The object of the invention is achieved by at least one of the following technical solutions.
A road ponding area identification method based on data mining and deep learning comprises the following steps:
s1, acquiring urban waterlogging ponding images from an Internet big data platform through a web crawler and a data mining technology, constructing a ponding image database, and establishing an initial ponding image data set through screening;
s2, preprocessing an image in the initial ponding image data set, and carrying out boundary drawing and label making on a ponding range visible in the data set;
s3, utilizing an image processing algorithm to perform data enhancement on the image and the label at the same time, expanding a data set, dividing the marked data set into a training set and a test set, and converting the training set and the test set into a data set format which can be read by the neural network model;
s4, clustering the target frames of the training set by using a K-means clustering algorithm, automatically generating a group of anchor frames more suitable for the user-defined data set, and adjusting the size and the length-width ratio of the anchors in the model configuration file;
s5, inputting the marked training set and verification set into a Mask RCNN instance segmentation model for learning, evaluating the ponding detection performance of the neural network model after training is completed, and repeatedly training by adjusting training parameters of the neural network model until the model training effect reaches the optimal value, namely completing model training, and obtaining a final model training weight file;
s6, inputting the test image or video into a detection project based on the trained weight file in the S5, and extracting the ponding characteristics of the image to obtain a ponding range detection and identification result.
Further, in step S1, the internet big data platform includes a network social media and crowd-sourced data platform;
Information retrieval keywords are formulated, for example broader keywords such as waterlogging, road ponding and heavy rain, or a specific rainfall event is selected as the retrieval scope; urban road ponding images and the associated location and time information are then retrieved from the Internet big data platform by keyword search using web crawlers and data mining technology, and downloaded to a local database for storage and organization. Waterlogging ponding images are screened and extracted from the image database to form the initial urban road ponding image dataset.
Further, in step S2, the following steps are included:
s2.1, performing image operation on an initial ponding image by using an Opencv computer vision image processing library, wherein the image operation comprises cutting, size expansion or scaling, and the initial ponding image is adjusted to be uniform in resolution so as to facilitate subsequent image processing and data label manufacturing;
S2.2, extract the features of the ponding area using the Labelme data annotation software: import the image into Labelme, choose to create a polygon to begin annotation, and outline the boundary of each ponding target visible in the image by placing points in sequence; the points connect end to end to form a closed polygon that just covers the ponding area. Classification label information is assigned to the generated closed polygon, and a json label file in the corresponding Labelme format is created for each image.
Further, in step S3, a new ponding image is generated by using an image processing algorithm in an Opencv library and an Augmentor semantic segmentation data enhancement method, so as to implement data enhancement of an image dataset, thereby improving diversity and generalization capability of the dataset;
the image processing method comprises a color transformation type method and a geometric transformation type method, wherein the color transformation type method comprises the steps of adjusting contrast, changing brightness, modifying RGB (red, green and blue) values, adding noise points and blackening or replacing background areas; the geometric transformation method comprises the steps of amplifying or shrinking the image, transforming the scale, translating the image, turning over and rotating;
and meanwhile, carrying out corresponding transformation enhancement processing on the image label generated in the step S2: if only the data enhancement of the color transformation class is performed and the geometric position transformation is not involved, the label position information is kept unchanged; if there is data enhancement for geometric transformation, the label also needs to change corresponding position along with image transformation, specifically as follows:
the position of a mark point in the tag file is transformed into a coordinate position corresponding to an image transformation method, or a corresponding transformation processing is carried out on a mask image transformed by the tag file by using a data enhancement method, so that a tag corresponding to a new image is generated, and the multiple expansion of the number of the data set images is realized;
after more image data are generated through data enhancement, all the images marked by information and corresponding labels are divided into a training set and a testing set according to proportion, and all the labels corresponding to the training set and the testing set images are converted in batches by utilizing a data set format conversion tool so as to be read by a target detection framework and further used.
Further, in step S4, a basic environment for deep learning image recognition is configured, a target detection frame is installed, and then a marked target frame in a training set is clustered by using a K-means clustering algorithm; and clustering the frame samples into K clusters by adjusting the K value, so as to find the optimal number and size of the anchor frames, and inputting the aspect ratio of the corresponding anchor into the target detection model configuration file.
Further, in step S4, the algorithm flow for performing K-means clustering calculation of the anchor frame is as follows:
(1) Selecting the number K of clusters;
(2) Randomly selecting K cluster boxes as initial anchor boxes;
(3) Using the IOU as the similarity metric, calculate the IOU between each anchor box and each target box; the IOU is the intersection-over-union of the anchor box and the target box, with values in the range [0,1]. When computing the IOU, the top-left vertices of all boxes are assumed to lie at the origin; if the anchor box has size (w_a, h_a) and the target box has size (w_b, h_b), then

IOU = [min(w_a, w_b) × min(h_a, h_b)] / [w_a × h_a + w_b × h_b - min(w_a, w_b) × min(h_a, h_b)]

Since a larger IOU means greater similarity, while the clustering metric should be smaller for more similar boxes, a distance parameter d is defined so that each target box is assigned to the anchor with the smallest distance:

d = 1 - IOU

where w_a and h_a are the width and height of the anchor box, w_b and h_b are the width and height of the target box, and IOU is the intersection-over-union of the anchor box and the target box.
(4) Calculate the median or mean width and height of all target boxes in each cluster and recompute the cluster center, using it as the new anchor box size to update the anchors;
(5) Repeat steps (3) and (4) until the anchors no longer change and the convergence requirement is met, or the maximum number of iterations is reached.
Further, in step S5, a neural network model and a configuration file thereof are selected according to self-training requirements and configuration conditions, and a model pre-training weight file to be used is downloaded from the network;
registering paths and types of a training set and a testing set in a training neural network model, defining main training parameter information including basic learning rate, learning rate attenuation, sample number of each batch, iteration number and training cycle number, and starting to train the neural network;
and after training, evaluating the ponding learning detection effect of the neural network model, further repeating training for a plurality of times by adjusting various neural network model parameters until the model training effect reaches the optimal value, completing training, and generating a final model training weight file.
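The dataset registration and the main training parameters named in step S5 (base learning rate, samples per batch, iteration count, learning rate decay) might be configured in Detectron2 roughly as follows. This is a sketch, not the patent's actual configuration: all paths, dataset names, and numeric values are illustrative assumptions.

```python
# Hypothetical Detectron2 training setup (configuration sketch; values are assumptions)
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultTrainer

# Register the COCO-format training and test sets (placeholder paths)
register_coco_instances("ponding_train", {}, "data/annotations/train.json", "data/train")
register_coco_instances("ponding_test", {}, "data/annotations/test.json", "data/test")

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("ponding_train",)
cfg.DATASETS.TEST = ("ponding_test",)
# Pre-trained weight file downloaded from the network, as described in S5
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1   # single "ponding" class
cfg.SOLVER.BASE_LR = 0.00025          # base learning rate
cfg.SOLVER.IMS_PER_BATCH = 2          # samples per batch
cfg.SOLVER.MAX_ITER = 5000            # number of iterations
cfg.SOLVER.STEPS = (3000, 4000)       # iterations at which the learning rate decays

DefaultTrainer(cfg).train()  # writes the final weight file to cfg.OUTPUT_DIR
```

Repeated training with adjusted parameters, as described above, would amount to editing these SOLVER values and retraining until the evaluation result no longer improves.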
Further, in step S6, a detection project file is created, the category information of the detection target and the model inference confidence threshold are written, and the weight file trained in step S5 is loaded as the model weights. Detection and identification of ponding areas is performed on various input data sources by entering commands in the terminal; the model infers and draws a prediction box and mask for each region whose inference score exceeds the confidence threshold, outputting an image or video with a ponding-range mask and visualizing the prediction result.
Further, the plurality of sources of input data include images, video, or webcams.
The invention has the beneficial effects that:
1. by acquiring waterlogging ponding images from network platforms such as network social media and the like, the acquisition difficulty of the images and video data is effectively reduced, massive image data is acquired at low cost, the channel for acquiring the image data is widened, and the generalization capability of a data set is improved.
2. The data set is expanded by utilizing a data enhancement mode, and the image and the label are simultaneously enhanced without respectively labeling the massive data sets one by one, so that the workload of labeling the image data is greatly reduced, and the working efficiency of manufacturing the data set is remarkably improved.
3. By adopting deep learning image recognition technology, large amounts of manpower, material and financial resources need not be spent on manual monitoring; waterlogging points can be monitored in real time through remote terminal operation, realizing automatic extraction of the features and boundary information of on-site waterlogging and rapid identification of ponding boundaries and ponding areas.
Drawings
FIG. 1 is a schematic diagram of a method for identifying road ponding areas based on data mining and deep learning;
FIG. 2 is a schematic diagram of a water accumulation range extraction flow based on a deep learning Mask RCNN algorithm in an embodiment of the invention;
FIG. 3 is a flowchart of an algorithm for performing K-means clustering calculation of an anchor frame in an embodiment of the present invention.
Detailed Description
The following describes the embodiments of the present invention further with reference to the detailed description and drawings. The drawings referred to below are merely illustrative in nature and embodiments of the present invention are not limited thereto.
Example 1:
a road ponding area identification method based on data mining and deep learning is shown in fig. 1, and comprises the following steps:
s1, acquiring urban waterlogging ponding images from an Internet big data platform through a web crawler and a data mining technology, constructing a ponding image database, and establishing an initial ponding image data set through screening;
the Internet big data platform comprises a network social media and crowdsourcing data platform;
In this embodiment, information retrieval keywords are formulated, such as "road ponding", "waterlogging" and similar terms describing urban waterlogging and ponding events; keywords for specific rainfall events, such as "Guangzhou 5.22 extraordinary rainstorm", may also be used. Using web crawlers and data mining technology, urban road ponding images and the associated location and time information are retrieved from the Internet big data platform by keyword search and downloaded to a local database for storage and organization. Waterlogging ponding images are then screened and extracted from the image database to form the initial urban road ponding image dataset.
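The keyword retrieval step above can be sketched as building parameterized search URLs for a crawler to fetch. The endpoint below is entirely hypothetical (real social media and crowdsourced platforms each have their own query formats and access rules); only the URL-encoding of Chinese or multi-word keywords is the point being illustrated.

```python
from urllib.parse import urlencode

def build_search_url(keyword, page=1, base="https://example-platform.invalid/search"):
    """Build a search URL for one retrieval keyword and result page.

    NOTE: `base` is a placeholder endpoint, not a real platform API.
    """
    return f"{base}?{urlencode({'q': keyword, 'page': page})}"

# A crawler would iterate over keywords and pages, download matching images,
# and store them with their location/time metadata in a local database.
urls = [build_search_url(kw) for kw in ("road ponding", "waterlogging")]
```

Keywords containing spaces or non-ASCII characters are percent-encoded by `urlencode`, so the same helper works for both English and Chinese retrieval terms.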
S2, preprocessing an image in an initial ponding image data set, and carrying out boundary drawing and label making on a ponding range visible in the data set, wherein the method comprises the following steps of:
s2.1, performing image operation on an initial ponding image by using an Opencv computer vision image processing library, wherein the image operation comprises cutting, size expansion or scaling, and the initial ponding image is adjusted to be uniform in resolution so as to facilitate subsequent image processing and data label manufacturing;
S2.2, extract the features of the ponding area using the Labelme data annotation software: import the image into Labelme, choose to create a polygon to begin annotation, and outline the boundary of each ponding target visible in the image by placing points in sequence; the points connect end to end to form a closed polygon that just covers the ponding area. Classification label information is assigned to the generated closed polygon, and a json label file in the corresponding Labelme format is created for each image.
In this embodiment, the method is implemented in a Detectron2 environment. The size specification of the dataset images is not strictly prescribed; however, to facilitate subsequent data enhancement and label processing during dataset construction, and to avoid out-of-memory errors during training caused by excessive memory usage, large-resolution images are downscaled and the image resolution is unified. The Opencv computer vision image processing library is used to crop and scale the initial ponding images, uniformly adjusting them to 512 x 512 resolution for subsequent image processing and data label creation.
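One simple way to realize the crop-then-scale preprocessing above is to take the largest centered square from each image and resize it to 512 x 512. The helper below only computes the crop box arithmetic; the actual pixel work would be done by OpenCV (e.g. `cv2.resize(img[y0:y1, x0:x1], (512, 512))`), and center-cropping is just one reasonable choice, not the patent's mandated one.

```python
def center_square_crop_box(width, height):
    """Largest centered square crop (x0, y0, x1, y1) inside a width x height image.

    The resulting square region can then be scaled to a uniform resolution
    such as the 512 x 512 used in this embodiment.
    """
    side = min(width, height)
    x0 = (width - side) // 2
    y0 = (height - side) // 2
    return x0, y0, x0 + side, y0 + side
```

Cropping before scaling avoids the aspect-ratio distortion that a direct resize of a wide road photo to a square would introduce.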
Then, in the Labelme data annotation software, the visible ponding boundary in each image is drawn as a polygon, and the generated closed polygon is given the ponding label information to represent standing water; each image produces a json file in the corresponding Labelme format. After an image is annotated, running the command labelme_json_to_dataset <filename>.json in the terminal outputs the image with its label mask and converts the json file into label information data.
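The Labelme json files produced above store each annotation under a "shapes" list with a "label" and its polygon "points". A minimal reader for the ponding polygons might look like this (the label name "ponding" is taken from this embodiment's description; adjust it to whatever label string was actually used during annotation):

```python
import json

def ponding_polygons(labelme_json_text, label="ponding"):
    """Extract polygon point lists for one label from the text of a Labelme JSON file."""
    data = json.loads(labelme_json_text)
    return [shape["points"] for shape in data.get("shapes", [])
            if shape["label"] == label and shape.get("shape_type", "polygon") == "polygon"]
```

Reading the polygons back like this is the first step for the label transformations applied during data enhancement, since every geometric augmentation must move these points along with the image.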
S3, utilizing an image processing algorithm to perform data enhancement on the image and the label at the same time, expanding a data set, dividing the marked data set into a training set and a test set, and converting the training set and the test set into a data set format which can be read by the neural network model;
for deep learning. In order to obtain a better model recognition effect, enough training data is needed for training, in general, the larger the data volume of a data set is, the better the effect of deep learning training is, but in many practical projects, it is often difficult to find a sufficient amount of high-quality data to complete a deep learning task. One approach to this problem is data enhancement, a technique that generates new training samples from existing training samples. For limited ponding image data volume, the limited image data can be used for generating more new images which are different and unique in a computer view angle in a data enhancement mode, so that the number and diversity of training samples are increased, the overfitting of a neural network is reduced, the network with stronger generalization capability is obtained, and the method can be better suitable for application scenes.
In the embodiment, a new ponding image is generated by using an image processing algorithm in an Opencv library and an Augmentor semantic segmentation data enhancement method, so that the data enhancement of an image data set is realized, and the diversity and generalization capability of the data set are improved;
the image processing method comprises a color transformation type method and a geometric transformation type method, wherein the color transformation type method comprises the steps of adjusting contrast, changing brightness, modifying RGB (red, green and blue) values, adding noise points and blackening or replacing background areas; the geometric transformation method comprises the steps of amplifying or shrinking the image, transforming the scale, translating the image, turning over and rotating;
and meanwhile, carrying out corresponding transformation enhancement processing on the image label generated in the step S2: if only the data enhancement of the color transformation class is performed and the geometric position transformation is not involved, the label position information is kept unchanged; if there is data enhancement for geometric transformation, the label also needs to change corresponding position along with image transformation, specifically as follows:
the position of a mark point in the tag file is transformed into a coordinate position corresponding to an image transformation method, or a corresponding transformation processing is carried out on a mask image transformed by the tag file by using a data enhancement method, so that a tag corresponding to a new image is generated, and the multiple expansion of the number of the data set images is realized;
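The coordinate transformation of label points under geometric augmentation, described above, reduces to a small mapping per transform. Two examples are sketched below for pixel-index coordinates in a `width` x `height` image; these are standard formulas for horizontal flip and 90-degree clockwise rotation, not code from the patent.

```python
def hflip_point(x, y, width):
    """Map a label point under a horizontal flip of a `width`-pixel-wide image."""
    return width - 1 - x, y

def rot90cw_point(x, y, width, height):
    """Map a label point under a 90-degree clockwise image rotation.

    The rotated image has dimensions height x width.
    """
    return height - 1 - y, x
```

Applying the same mapping to every vertex of an annotation polygon produces the label for the augmented image, so each generated image gets a correct label without re-annotating it by hand.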
after more image data are generated through data enhancement, all the images marked by information and corresponding labels are divided into a training set and a testing set according to proportion, and all the labels corresponding to the training set and the testing set images are converted in batches by utilizing a data set format conversion tool so as to be read by a target detection framework and further used.
In this embodiment, various image processing operations, such as flipping, pixel RGB value transformation, noise addition, brightness reduction, 45-degree rotation, 90-degree rotation and image scaling, are used to generate several times the original number of images, effectively expanding the dataset and increasing data diversity. Since the framework used in this embodiment is Detectron2, whose supported dataset format is the COCO format, dataset format conversion is required. The annotated images are divided 7:3 into training and test sets, and the Labelme-format labels are converted into the COCO dataset format by means of a dataset format conversion tool, to be read and used by the open-source object detection framework Detectron2.
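The 7:3 split described above can be done with a seeded shuffle so that the partition is reproducible across runs. A minimal sketch (the ratio and seed are the only parameters; filenames stand in for image/label pairs):

```python
import random

def split_dataset(items, train_ratio=0.7, seed=42):
    """Shuffle items reproducibly and split them into (train, test) by ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train_ratio)
    return items[:n_train], items[n_train:]
```

Keeping each image together with its label file during the split (e.g. splitting a list of basenames rather than separate image and json lists) guarantees no annotation ends up in the wrong subset before the Labelme-to-COCO conversion.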
S4, clustering the target frames of the training set by using a K-means clustering algorithm, automatically generating a group of anchor frames more suitable for the user-defined data set, and adjusting the size and the length-width ratio of the anchors in the model configuration file;
configuring a basic environment for deep learning image recognition, installing a target detection frame, and then clustering marked target frames in a training set by using a K-means clustering algorithm; and clustering the frame samples into K clusters by adjusting the K value, so as to find the optimal number and size of the anchor frames, and inputting the aspect ratio of the corresponding anchor into the target detection model configuration file.
In the Detectron2 framework, the parameters governing anchor size and aspect ratio are manually specified in configs/Base-RCNN-FPN.yaml, where the anchor aspect ratios are fixed at [0.5, 1.0, 2.0]. These preset anchor parameters suit common public data sets but are not necessarily suitable for the self-made urban road ponding data set, so in this embodiment the target boxes of the ponding image training set are clustered with the K-means algorithm.
As shown in fig. 3, the algorithm flow for performing K-means clustering calculation of an anchor frame is as follows:
(1) Selecting the number K of clusters;
(2) Randomly selecting K cluster boxes as initial anchor boxes;
(3) Using the IOU value as the metric, calculate the IOU between each anchor box and each target frame. The IOU is the intersection-over-union of the anchor box and the target frame, with values in [0, 1]. When computing the IOU it is assumed that the top-left vertices of all boxes coincide at the origin; if the anchor size is (w_a, h_a) and the target frame size is (w_b, h_b), then

IOU = min(w_a, w_b) × min(h_a, h_b) / (w_a × h_a + w_b × h_b − min(w_a, w_b) × min(h_a, h_b))
Since a larger IOU value indicates greater similarity and is therefore better, while the clustering metric should be smaller for closer pairs, a distance parameter d is defined so that each target frame is assigned to the anchor with the smallest distance d, where:
d=1-IOU
wherein w_a and h_a are the width and height of the anchor box, w_b and h_b are the width and height of the target frame, and IOU is the intersection-over-union of the anchor box and the target frame.
(4) Calculate the median or mean of the widths and heights of all target frames in each cluster and recompute the cluster centre, which becomes the new anchor box size for updating the anchors;
(5) Repeat the above steps until the anchors no longer change (the convergence requirement is met) or the maximum number of iterations is reached.
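The five clustering steps above can be sketched as follows (a minimal NumPy sketch run on synthetic boxes; the function names and demo data are illustrative, not taken from the patent):

```python
import numpy as np

def iou(boxes, anchors):
    """IOU between each target box and each anchor, with all top-left
    corners assumed to coincide at the origin (step (3) above).
    boxes: (N, 2) array of (w, h); anchors: (K, 2) array of (w, h)."""
    inter_w = np.minimum(boxes[:, None, 0], anchors[None, :, 0])
    inter_h = np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    inter = inter_w * inter_h
    union = (boxes[:, 0] * boxes[:, 1])[:, None] \
          + (anchors[:, 0] * anchors[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k=9, max_iter=300, seed=0):
    """Cluster (w, h) target boxes using the distance d = 1 - IOU and
    median cluster centres (steps (1)-(5) above)."""
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)]  # step (2)
    assignment = np.full(len(boxes), -1)
    for _ in range(max_iter):
        d = 1.0 - iou(boxes, anchors)       # distance parameter d
        nearest = d.argmin(axis=1)          # assign each box to nearest anchor
        if (nearest == assignment).all():   # step (5): anchors unchanged
            break
        assignment = nearest
        for j in range(k):                  # step (4): median recentering
            members = boxes[nearest == j]
            if len(members):
                anchors[j] = np.median(members, axis=0)
    return anchors

# Demo on synthetic boxes (hypothetical data, not the patent's data set)
rng = np.random.default_rng(1)
boxes = rng.uniform(20, 200, size=(500, 2))
anchors = kmeans_anchors(boxes, k=5)
ratios = sorted(anchors[:, 0] / anchors[:, 1])  # candidate anchor aspect ratios
```

The resulting `ratios` would then be entered into the anchor aspect-ratio field of the model configuration file, as described in step S4.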
In this embodiment the cluster number K is set to 9; the clustering operation of the K-means algorithm then automatically generates 9 anchor aspect ratios suited to the self-made urban road ponding data set, namely [0.6, 0.7, 0.9, 1.0, 1.2, 1.3, 1.5, 1.8, 2.4]. The anchor aspect-ratio parameters in the configuration file are adjusted according to this clustering result, while the anchor sizes are left unchanged.
S5, inputting the marked training set and verification set into a Mask RCNN neural network model for learning, evaluating the ponding detection performance of the neural network model after training, and repeating training with adjusted training parameters until the model training effect reaches its optimum, at which point model training is complete and the final model training weight file is obtained;
selecting a neural network model and a configuration file thereof according to self-training requirements and configuration conditions, and downloading a model pre-training weight file required to be used from the network;
registering paths and types of a training set and a testing set in a training neural network model, defining main training parameter information including basic learning rate, learning rate attenuation, sample number of each batch, iteration number and training cycle number, and starting to train the neural network;
and after training, evaluating the ponding detection effect of the neural network model, further repeating training for a plurality of times by adjusting various neural network model parameters until the model training effect reaches the optimal value, completing training, and generating a final model training weight file.
In this embodiment, the Mask RCNN neural network model is used to extract the ponding range; its operation structure is shown in fig. 2 and mainly comprises the following steps:
(1) An image data set is input; the network model first feeds each input picture into a pre-trained feature extraction backbone network to generate the corresponding feature map;
(2) The feature map output by feature extraction is fed into the region proposal network. This network traverses each pixel position of the feature map and places a fixed number of anchors; after a 3×3 convolution, the result passes into separate branches with different 1×1 convolutions: the first is a localization layer that outputs 4 coordinate offsets per anchor, and the second is a classification layer that performs object classification and outputs each anchor's foreground and background probabilities. Overlapping anchors are filtered by NMS according to their IOU values with the target frames, yielding refined candidate ROI regions;
(3) An ROI Align operation is performed on the candidate ROI regions: the coordinates of each unit's sampling points are computed by bilinear interpolation, and the corresponding region is max-pooled into a fixed-size feature map for the subsequent classification and candidate-frame regression operations;
(4) For each candidate region, classification of the target object and regression of the candidate frames are performed through fully connected layers, while a separate Mask branch applies an FCN fully convolutional operation to each candidate region to generate masks, completing the segmentation and identification task.
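The bilinear interpolation used in the ROI Align step (3) can be illustrated with a minimal sketch (illustrative only; the actual ROI Align operation pools several such sampling points per output cell):

```python
import numpy as np

def bilinear_sample(feature, y, x):
    """Sample a 2-D feature map at a fractional (y, x) location by
    bilinear interpolation, as ROI Align does for each sampling point."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1 = min(y0 + 1, feature.shape[0] - 1)
    x1 = min(x0 + 1, feature.shape[1] - 1)
    dy, dx = y - y0, x - x0
    # Interpolate along x on the top and bottom rows, then along y.
    top = feature[y0, x0] * (1 - dx) + feature[y0, x1] * dx
    bottom = feature[y1, x0] * (1 - dx) + feature[y1, x1] * dx
    return top * (1 - dy) + bottom * dy

# A 2x2 feature map; the exact centre interpolates to the mean of all four values.
f = np.array([[0.0, 1.0],
              [2.0, 3.0]])
print(bilinear_sample(f, 0.5, 0.5))  # 1.5
```

Because the sampling point need not lie on an integer grid position, ROI Align avoids the quantization error of the older ROI Pooling operation, which is why it is preferred for mask prediction.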
According to the self-training requirements and hardware configuration, a model and its configuration file are selected. In embodiment 1 the backbone network is a 50-layer ResNet residual network and the feature extraction network is an FPN feature pyramid network; the configuration file "configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml" is used, and the required pre-trained model weight file is downloaded from the official model_zoo. A training program train.py is created, in which the paths and categories of the training and testing sets are registered. The input data set contains 5040 training images and 2160 testing images. The configuration of the main Mask RCNN neural network parameters is shown in table 1: when the iteration count reaches 21000, the learning rate decays to 0.1 times its original value, i.e. 0.001, and according to the computer's configuration the batch size is set to 8, so the complete training process requires 5040 × 100 / 8 = 63000 batches of learning. Training yields, for embodiment 1, a localization accuracy AP_bbox of 83.662% and a segmentation accuracy AP_seg of 78.624%.
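The schedule arithmetic above (63000 total batches, learning-rate decay at iteration 21000) can be reproduced with a small helper (an illustrative sketch; the function name and signature are hypothetical and not part of Detectron2):

```python
def training_schedule(num_train, epochs, batch_size, base_lr, decay_steps, gamma=0.1):
    """Compute the total iteration count and the stepwise learning-rate
    schedule for a training run like embodiment 1's."""
    total_iters = num_train * epochs // batch_size
    lr = base_lr
    schedule = []
    for step in sorted(decay_steps):
        schedule.append((step, lr * gamma))  # LR after this decay step
        lr *= gamma
    return total_iters, schedule

# Embodiment 1: 5040 training images, 100 epochs, batch size 8,
# LR decaying from 0.01 to 0.001 at iteration 21000.
iters, sched = training_schedule(5040, 100, 8, base_lr=0.01, decay_steps=[21000])
print(iters)  # 63000
```

Embodiment 2 would instead pass `decay_steps=[31500, 50400]`, yielding the 0.001 and 0.0001 learning rates described below.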
TABLE 1 Mask RCNN neural network parameter configuration
To better illustrate the method described in this specification, 3 further embodiments are provided, which modify the neural network model parameters and network structure configuration.
In embodiment 2, based on embodiment 1, the learning-rate decay steps are changed: the learning rate is decayed once each when the training iteration count reaches 31500 and 50400 (i.e. when the number of training epochs reaches 50 and 80), with a decay coefficient of 0.1, so the learning rate becomes 0.001 and 0.0001 respectively;
In embodiment 3, the training backbone network is changed to a 101-layer ResNet residual network while the feature extraction network remains an FPN feature pyramid network; the configuration file "configs/COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml" is selected, and the remaining training parameters are the same as in embodiment 1;
In embodiment 4, the cluster number in the K-means clustering algorithm is changed to K = 5, automatically generating 5 anchor aspect ratios, namely [0.7, 1.0, 1.2, 1.5, 2.0]; the Mask RCNN neural network structure and its remaining parameter configuration are the same as in embodiment 1.
Training was performed with the configurations of the four embodiments, and the ponding detection performance of each neural network model was evaluated after training; the model performance comparison of the four embodiments is shown in table 2.
TABLE 2 comparison of training results for Mask RCNN neural networks of four examples
Comparing the four embodiments shows that changing the model parameters and configuration changes the training effect. Embodiment 3 obtains the best training result of the four merely by changing the backbone configuration, but the total training time is also markedly longer. To obtain a better recognition effect, training should subsequently be repeated several times with adjusted training parameters until the training effect reaches its optimum, after which the final model training weight file model_final.pth is generated.
S6, inputting the test image or video into a detection project based on the trained weight file in the S5, and extracting the ponding characteristics of the image to obtain a ponding range detection and identification result;
A detection project file is created, the category information of the detection targets and the model inference confidence threshold are written into it, and the weight file trained in step S5 is loaded as the model weight file. By entering commands in the terminal, ponding areas can be detected and identified from various input data sources; the model infers and draws the prediction boxes and masks for regions whose inference score exceeds the confidence threshold, outputting an image or video overlaid with the ponding-range mask and visualizing the prediction result.
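The confidence-threshold filtering described here can be illustrated with a minimal sketch (hypothetical scores; the actual model applies this to per-instance inference scores before drawing boxes and masks):

```python
def filter_predictions(scores, threshold=0.5):
    """Keep only the indices of instances whose inference score exceeds
    the confidence threshold; only these are drawn on the output."""
    return [i for i, s in enumerate(scores) if s > threshold]

# Hypothetical scores for one image: only instances 0 and 2 are drawn.
kept = filter_predictions([0.91, 0.32, 0.77, 0.49], threshold=0.5)
print(kept)  # [0, 2]
```

Raising the threshold suppresses low-confidence ponding detections (fewer false positives, more misses); lowering it does the opposite, so the threshold is a tuning knob for the deployment scenario.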
The plurality of sources of input data include images, video, or network monitoring cameras.
The four embodiments in this specification create a prediction result visualization program with reference to the demo/demo.py file in the Detectron2 source code, writing in the data set category information defined by train.py during model training together with the model threshold parameters; the model_final.pth weight file pre-trained in S5 is loaded as the model weight file. The Detectron2 framework allows target objects to be detected and identified from various input sources such as images, videos or network monitoring cameras, and the prediction results to be visualized and stored. Appending an image or folder path after "--input" detects and identifies the image(s); "--video-input" performs ponding prediction on a video; "--webcam" runs the program on a webcam.
In this embodiment, the images to be detected are placed in an img/test folder created under the Detectron2 root directory, and the command "python prediction.py --input img/test --output result" is entered in the terminal to detect and identify the images in the test folder in batches, outputting the recognized image results to the result folder. The recognition speed per image depends on the computer configuration but is generally under 1 second, or at most a few seconds, so the method detects very quickly, enables rapid identification of the ponding range, and can be further extended into related software and applications.
The same or similar symbols and labels in this specification represent the same or similar physical meaning or function. The illustrations in this specification serve only to better explain the invention and do not limit its applicability. Any equivalent alterations, modifications and variations of the embodiments described above made by those skilled in the art using this disclosure are intended to fall within the scope of this disclosure.
The invention relates to a road ponding area identification method based on data mining and deep learning which, following the examples and steps provided in the specification, can be used to monitor waterlogging and ponding in urban areas. By using web crawlers and data mining technology to take network images from social media and crowdsourced data platforms as data sources, the invention makes effective use of public data resources, reduces the difficulty of acquiring ponding image data, and achieves low-cost acquisition of massive image data based on big data. By learning the characteristic information of urban ponding with a deep-learning target detection algorithm, the ponding range can be detected and identified rapidly and accurately, improving the speed, efficiency and safety of the urban ponding monitoring process and ensuring the timeliness, effectiveness and accuracy of the information it gathers; this provides an effective approach for the orderly conduct of ponding disaster emergency management and for improving the ability of relevant departments and the public to respond to sudden urban ponding disaster events.

Claims (10)

1. The road ponding area identification method based on data mining and deep learning is characterized by comprising the following steps of:
s1, acquiring urban waterlogging ponding images from an Internet big data platform through a web crawler and a data mining technology, constructing a ponding image database, and establishing an initial ponding image data set through screening;
s2, preprocessing an image in the initial ponding image data set, and carrying out boundary drawing and label making on a ponding range visible in the data set;
s3, utilizing an image processing algorithm to perform data enhancement on the image and the label at the same time, expanding a data set, dividing the marked data set into a training set and a test set, and converting the training set and the test set into a data set format which can be read by the neural network model;
s4, clustering the target frames of the training set by using a K-means clustering algorithm, automatically generating a group of anchor frames more suitable for the user-defined data set, and adjusting the size and the length-width ratio of the anchors in the model configuration file;
s5, inputting the marked training set and verification set into a Mask RCNN instance segmentation model for learning, evaluating the ponding detection performance of the neural network model after training is completed, and repeatedly training by adjusting training parameters of the neural network model until the model training effect reaches the optimal value, namely completing model training, and obtaining a final model training weight file;
s6, inputting the test image or video into a detection project based on the trained weight file in the S5, and extracting the ponding characteristics of the image to obtain a ponding range detection and identification result.
2. The method for identifying road ponding area based on data mining and deep learning according to claim 1, wherein in step S1, the internet big data platform comprises a network social media and crowd-sourced data platform;
formulating information retrieval keywords, e.g. choosing broader keywords such as waterlogging, road ponding and heavy rain, or taking a specific rainfall event as the retrieval scope; then using web crawlers and data mining technology to retrieve urban road ponding images and related position and time information from the Internet big data platform by keyword search, and downloading them to a local database for storage and arrangement; and screening and extracting waterlogging ponding images from the image database to form the initial urban road ponding image data set.
3. The method for identifying the road ponding region based on data mining and deep learning according to claim 1, wherein in step S2, the method specifically comprises the following steps:
s2.1, performing image operation on an initial ponding image by using an Opencv computer vision image processing library, wherein the image operation comprises cutting, size expansion or scaling, and the initial ponding image is adjusted to be uniform in resolution so as to facilitate subsequent image processing and data label manufacturing;
s2.2, extracting the characteristics of the ponding area by using Labelme data labeling software, importing the image into Labelme, selecting to create a polygon to begin labeling, carrying out boundary contour sketching and connecting lines on a ponding target visible in the image by sequentially drawing points, finally generating a closed polygon just covering the ponding area in an end-to-end mode, and creating a json tag file in a corresponding Labelme format for each image by giving classification tag information to the generated closed polygon.
4. The method for identifying the road ponding region based on data mining and deep learning according to claim 1, wherein in the step S3, a new ponding image is generated by using an image processing algorithm in an Opencv library and an Augmentor semantic segmentation data enhancement method, so that the data enhancement of an image dataset is realized to improve the diversity and generalization capability of the dataset;
the image processing method comprises a color transformation type method and a geometric transformation type method, wherein the color transformation type method comprises the steps of adjusting contrast, changing brightness, modifying RGB (red, green and blue) values, adding noise points and blackening or replacing background areas; the geometric transformation method comprises the steps of amplifying or shrinking the image, transforming the scale, translating the image, turning over and rotating;
and meanwhile, carrying out corresponding transformation enhancement processing on the image label generated in the step S2: if only the data enhancement of the color transformation class is performed and the geometric position transformation is not involved, the label position information is kept unchanged; if there is data enhancement for geometric transformation, the label also needs to change corresponding position along with image transformation, specifically as follows:
the position of the mark point in the label file is transformed corresponding to the coordinate position of the image transformation method, or the mask image transformed by the label file is subjected to corresponding transformation processing by using a data enhancement method, so that a label corresponding to a new image is generated, and the multiple expansion of the number of the data set images is realized.
5. The method for identifying the road ponding region based on data mining and deep learning according to claim 1, wherein after more image data are generated through data enhancement, all the images marked by information and corresponding labels are required to be divided into a training set and a testing set according to proportion, and all the labels corresponding to the training set and the testing set images are converted in batches by utilizing a data set format conversion tool so as to be read by a target detection framework and further used.
6. The method for identifying the road ponding region based on data mining and deep learning according to claim 1, wherein in the step S4, a basic environment for deep learning image identification is configured, a target detection frame is installed, and then marked target frames in a training set are clustered by using a K-means clustering algorithm; and clustering the frame samples into K clusters by adjusting the K value, so as to find the optimal number and size of the anchor frames, and inputting the aspect ratio of the corresponding anchor into the target detection model configuration file.
7. The method for identifying a road ponding region based on data mining and deep learning according to claim 6, wherein in step S4, an algorithm flow for performing K-means clustering calculation of an anchor frame is as follows:
(1) Selecting the number K of clusters;
(2) Randomly selecting K cluster boxes as initial anchor boxes;
(3) Using the IOU value as the metric, calculate the IOU between each anchor box and each target frame; the IOU is the intersection-over-union of the anchor box and the target frame, with values in [0, 1]; when computing the IOU it is assumed that the top-left vertices of all boxes coincide at the origin; if the anchor size is (w_a, h_a) and the target frame size is (w_b, h_b), then

IOU = min(w_a, w_b) × min(h_a, h_b) / (w_a × h_a + w_b × h_b − min(w_a, w_b) × min(h_a, h_b))
Since a larger IOU value indicates greater similarity and is therefore better, while the clustering metric should be smaller for closer pairs, a distance parameter d is defined so that each target frame is assigned to the anchor with the smallest distance d, where:
d=1-IOU
wherein w_a and h_a are the width and height of the anchor box, w_b and h_b are the width and height of the target frame, and IOU is the intersection-over-union of the anchor box and the target frame;
(4) Calculating the bottom and high median or mean value of all target frames in each cluster, and recalculating a new cluster center to serve as a new anchor box size to update anchors;
(5) Repeating the steps until the anchor is not changed any more, meeting the convergence requirement, or reaching the maximum iteration times.
8. The method for identifying road ponding area based on data mining and deep learning according to claim 1, wherein in step S5, a neural network model and a configuration file thereof are selected according to self-training requirements and configuration conditions, and model pre-training weight files to be used are downloaded from the network;
registering paths and types of a training set and a testing set in a training neural network model, defining main training parameter information including basic learning rate, learning rate attenuation, sample number of each batch, iteration number and training cycle number, and starting to train the neural network;
and after training, evaluating the ponding detection effect of the neural network model, further repeating training for a plurality of times by adjusting various neural network model parameters until the model training effect reaches the optimal value, completing training, and generating a final model training weight file.
9. The method for identifying the road ponding region based on data mining and deep learning according to claim 1, wherein in the step S6, a detection project file is created, category information of a detection target and a model reasoning confidence threshold value are written, and the model weight file is loaded with the weight file trained in the step S5; through inputting commands in the terminal to detect and identify ponding areas with various input data sources, the model will infer and draw a prediction frame and a mask corresponding to the area with the inference score higher than the confidence threshold, so that an image or video with a ponding range mask is output, and a prediction result is visualized.
10. The method for identifying a road water area based on data mining and deep learning according to claim 9, wherein the plurality of input data sources include images, video or network monitoring cameras.
CN202310238225.XA 2023-03-10 2023-03-10 Road ponding area identification method based on data mining and deep learning Pending CN116452850A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310238225.XA CN116452850A (en) 2023-03-10 2023-03-10 Road ponding area identification method based on data mining and deep learning


Publications (1)

Publication Number Publication Date
CN116452850A true CN116452850A (en) 2023-07-18

Family

ID=87119176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310238225.XA Pending CN116452850A (en) 2023-03-10 2023-03-10 Road ponding area identification method based on data mining and deep learning

Country Status (1)

Country Link
CN (1) CN116452850A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117218801A (en) * 2023-10-23 2023-12-12 华北科技学院(中国煤矿安全技术培训中心) Urban flood disaster monitoring and early warning method and device
CN117746342A (en) * 2024-02-19 2024-03-22 广州市突发事件预警信息发布中心(广州市气象探测数据中心) Method for identifying road ponding by utilizing public video
CN117746342B (en) * 2024-02-19 2024-05-17 广州市突发事件预警信息发布中心(广州市气象探测数据中心) Method for identifying road ponding by utilizing public video

Similar Documents

Publication Publication Date Title
CN110619282B (en) Automatic extraction method for unmanned aerial vehicle orthoscopic image building
CN111986099B (en) Tillage monitoring method and system based on convolutional neural network with residual error correction fused
CN109446992B (en) Remote sensing image building extraction method and system based on deep learning, storage medium and electronic equipment
CN113449594B (en) Multilayer network combined remote sensing image ground semantic segmentation and area calculation method
CN116452850A (en) Road ponding area identification method based on data mining and deep learning
CN111191654B (en) Road data generation method and device, electronic equipment and storage medium
CN112508079B (en) Fine identification method, system, equipment, terminal and application of ocean frontal surface
CN114943876A (en) Cloud and cloud shadow detection method and device for multi-level semantic fusion and storage medium
CN112989995B (en) Text detection method and device and electronic equipment
US10685443B2 (en) Cloud detection using images
CN111291826A (en) Multi-source remote sensing image pixel-by-pixel classification method based on correlation fusion network
KR20220125719A (en) Method and equipment for training target detection model, method and equipment for detection of target object, electronic equipment, storage medium and computer program
CN110852327A (en) Image processing method, image processing device, electronic equipment and storage medium
CN112329559A (en) Method for detecting homestead target based on deep convolutional neural network
CN114120141A (en) All-weather remote sensing monitoring automatic analysis method and system thereof
CN112819837A (en) Semantic segmentation method based on multi-source heterogeneous remote sensing image
CN115984603A (en) Fine classification method and system for urban green land based on GF-2 and open map data
CN113536944A (en) Distribution line inspection data identification and analysis method based on image identification
CN115937492A (en) Transformer equipment infrared image identification method based on feature identification
CN111126187A (en) Fire detection method, system, electronic device and storage medium
CN114511862B (en) Form identification method and device and electronic equipment
CN112991398B (en) Optical flow filtering method based on motion boundary guidance of cooperative deep neural network
CN115457385A (en) Building change detection method based on lightweight network
CN109146893B (en) Oil light area segmentation method and device and mobile terminal
CN112883840B (en) Power transmission line extraction method based on key point detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination