CN118485882A - Well lid property right judging method and system based on environment feature fusion and computer equipment - Google Patents
- Publication number
- CN118485882A (application number CN202410948010.1A)
- Authority
- CN
- China
- Prior art keywords
- well lid
- feature
- property
- features
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Image Analysis (AREA)
Abstract
The embodiment of the invention discloses a well lid property right judging method, system and computer equipment based on environment feature fusion. The method comprises the following steps: acquiring a well lid picture whose property right is to be judged; inputting the picture into a property right classification model for classification detection so as to identify the unit property of each well lid; positioning the position of each well lid; and sending the unit property and position of each well lid to a monitoring background. The property right classification model comprises a feature extraction network, a feature fusion network and a classification network, and its training process is as follows: pictures and videos of well lids and their surrounding environments are collected as a training data set, features of the well lid and its surrounding environment are extracted from the training data set through the feature extraction network, feature fusion is carried out by the feature fusion network, and the fused feature map is input into the classification network for training. By implementing the method of the embodiment of the invention, various visual features can be fully utilized to classify property right information, realizing accurate statistics and management of well lid distribution positions.
Description
Technical Field
The invention relates to well lid property right judgment, and in particular to a well lid property right judging method, system and computer equipment based on environment feature fusion.
Background
With the acceleration of urbanization and the continuous construction of urban infrastructure, well lids play a key role as important components of road and pipeline systems. However, conventional well lid management methods suffer from various problems, such as incomplete information and uneven distribution of records, resulting in low management efficiency.
Traditional well lid management mainly relies on property right information logged in advance or read from background data through equipment. In actual operation, however, well lid information in many areas has not been digitized, and property right units lack records, so the distribution positions and number of well lids cannot be accurately counted, which affects the scientific rigor and efficiency of well lid management. In addition, most existing well lid management methods focus on extracting features of the well lid area itself, but many well lids no longer display clear property right features due to long-term abrasion or damage, so the property right cannot be accurately judged. Furthermore, well lid layout and construction information is under-utilized, and special markings are often ignored.
Therefore, it is necessary to design a new method that fully utilizes various visual features to classify property right information and achieves accurate statistics and management of well lid distribution positions.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a well lid property right judging method, a well lid property right judging system and computer equipment based on environment feature fusion.
In order to achieve the above purpose, the present invention adopts the following technical scheme: the well lid property right judging method based on the environment feature fusion comprises the following steps:
acquiring a well lid picture of property rights to be judged;
Inputting the well lid picture whose property right is to be judged into a property right classification model for classification detection so as to identify the unit property of each well lid; the property right classification model comprises a feature extraction network, a feature fusion network and a classification network, and its training process is as follows: collecting pictures and videos of well lids and their surrounding environments as a training data set, extracting features of the well lid and its surrounding environment from the training data set through the feature extraction network, carrying out feature fusion through the feature fusion network, and inputting the fused feature map into the classification network for training;
Positioning the position of each well cover;
and sending the unit property of each well lid and the well lid position to a monitoring background.
The further technical scheme is as follows: the feature extraction of the well lid and the surrounding environment is carried out on the training data set through the feature extraction network, and the method comprises the following steps:
Performing convolution operations and downsampling on the training data set through a convolution layer in the feature extraction network, followed by batch normalization and ReLU activation function processing, to obtain a first extraction result, wherein the first extraction result comprises shallow well lid shape features and surrounding environment features;
Extracting feature information from the training data set through a C2f layer in the feature extraction network, and transmitting the extracted feature information using residual links to obtain a second extraction result, wherein the second extraction result comprises deep well lid pattern features, well lid word features and detail features of the surrounding environment;
And carrying out pooling operation of different scales on the first extraction result and the second extraction result through an SPPF layer in a feature extraction network to generate a feature vector with fixed length.
The further technical scheme is as follows: the feature fusion is performed by a feature fusion network, which comprises the following steps:
and carrying out feature fusion on the feature vectors with fixed lengths by a multi-feature coding layer of the feature fusion network.
The further technical scheme is as follows: the feature fusion network carries out feature fusion on the feature vectors with fixed lengths by a multi-feature coding layer, and the method comprises the following steps:
the method comprises the steps that a multi-feature coding layer with a feature fusion network uses up-sampling and down-sampling technologies to adjust the space size of a feature vector with a fixed length so as to obtain an adjusted feature vector;
and carrying out fusion operation on the adjusted feature vectors to obtain a fused feature map.
The further technical scheme is as follows: the step of inputting the fused feature images into a classification network for training comprises the following steps:
the fused feature maps are input into a decoupling head of the classification network, convolved through convolution layers with different convolution kernels, normalized using a sigmoid function, and optimized using the BCDELoss loss function.
The further technical scheme is as follows: the method for inputting the well lid picture of the property to be judged into the property classification model for classification detection so as to identify the unit property and the well lid position of each well lid comprises the following steps:
inputting the well lid picture of the property to be judged into a property classification model, and extracting the well lid shape characteristics and the surrounding environment characteristics of the shallow layer by adopting the convolution layer;
extracting deep well cover pattern features, well cover word features and detail features of surrounding environment by adopting a C2f layer;
fusing the shape features, the surrounding environment features, the deep well cover pattern features, the well cover word features and the detail features of the surrounding environment of the shallow well cover by adopting multiple feature coding layers to obtain a fused feature map;
And inputting the fused characteristic diagram into a decoupling head for prediction so as to output the property category with the highest probability and determine the unit property of each well lid.
The further technical scheme is as follows: the well lid shape feature, the surrounding environment feature, the deep well lid pattern feature, the well lid word feature and the detail feature of the surrounding environment of the shallow layer are fused by adopting a multiple feature coding layer, so as to obtain a fused feature map, which comprises the following steps:
For the surrounding environment features and the detail features of the surrounding environment, adjusting the number of corresponding channels to 1 and downsampling with a mixed structure of maximum pooling and average pooling to obtain an adjusted environment feature map;
For the shallow well lid shape features, the deep well lid pattern features and the well lid word features, adjusting the number of channels using a convolution operation and upsampling with nearest-neighbor interpolation to obtain an adjusted well lid feature map;
and performing one convolution operation on the adjusted environment feature map and the adjusted well lid feature map, and splicing them in the channel dimension to obtain a fused feature map.
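As an illustrative sketch (not part of the claimed method), the mixed maximum/average pooling downsampling described above can be expressed as follows; the equal weighting of the two pooling branches is an assumption, since the scheme only states that the two structures are mixed:

```python
def mixed_pool_downsample(fmap, alpha=0.5):
    """Downsample a 2D feature map by 2x using a mix of max and average
    pooling. alpha weights the max branch; alpha=0.5 (equal weighting)
    is an assumption, the text only says the two structures are mixed."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(0, h - 1, 2):
        row = []
        for j in range(0, w - 1, 2):
            window = [fmap[i][j], fmap[i][j + 1],
                      fmap[i + 1][j], fmap[i + 1][j + 1]]
            row.append(alpha * max(window) + (1 - alpha) * sum(window) / 4)
        out.append(row)
    return out
```

For a 2x2 window [1, 2, 3, 4], the mixed result is 0.5·4 + 0.5·2.5 = 3.25; the max branch preserves salient responses while the average branch smooths noise.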
The further technical scheme is as follows: the surrounding environment features comprise features of motor vehicle lanes, non-motor vehicle lanes and sidewalks, wherein for motor vehicle lanes, road surface and lane line features are extracted; for non-motor vehicle lanes, road surface and curb features are extracted; and for sidewalks, ground tile pattern features and well lid shape and word features are extracted.
The invention also provides a well lid property right judging system based on the environment feature fusion, which comprises the following steps:
The picture acquisition unit is used for acquiring a well lid picture of the property right to be judged;
The prediction unit is used for inputting the well lid picture whose property right is to be judged into the property right classification model for classification detection so as to identify the unit property of each well lid; the property right classification model comprises a feature extraction network, a feature fusion network and a classification network, and its training process is as follows: collecting pictures and videos of well lids and their surrounding environments as a training data set, extracting features of the well lid and its surrounding environment from the training data set through the feature extraction network, carrying out feature fusion through the feature fusion network, and inputting the fused feature map into the classification network for training;
The positioning unit is used for positioning the position of each well lid;
And the sending unit is used for sending the unit property and position of each well lid to the monitoring background.
The further technical scheme is as follows: the prediction unit includes:
The first extraction subunit is used for inputting the well lid picture of the property to be judged into the property classification model, and extracting the well lid shape characteristics and the surrounding environment characteristics of the shallow layer by adopting the convolution layer;
The second extraction subunit is used for extracting deep well cover pattern features, well cover word features and detail features of surrounding environment by adopting a C2f layer;
The fusion subunit is used for fusing the shape features, the peripheral environment features, the deep well cover pattern features, the well cover word features and the detail features of the peripheral environment of the shallow well cover by adopting multiple feature coding layers so as to obtain a fused feature map;
And the prediction subunit is used for inputting the fused characteristic diagram into the decoupling head for prediction so as to output the property category with the highest probability and determine the unit property of each well lid.
Compared with the prior art, the invention has the following beneficial effects: the invention uses well lid image data to realize automatic judgment and positioning of well lid property rights through feature extraction, fusion and classification networks, and then sends the related information to a monitoring background. In the property right classification model formed by the feature extraction, fusion and classification networks, pictures and videos of well lids and their surrounding environments serve as the training data set; features of the well lid and its surrounding environment are extracted through the feature extraction network, fused through the feature fusion network, and finally input into the classification network for training. Various visual features are thus fully utilized for property right classification, realizing accurate statistics and management of well lid distribution positions.
The invention is further described below with reference to the drawings and specific embodiments.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flow chart of a well lid property right judging method based on environmental feature fusion provided by an embodiment of the invention;
Fig. 2 is a schematic diagram of a well lid picture of property rights to be judged according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a feature of a surrounding environment according to an embodiment of the present invention;
FIG. 4 is a schematic view of detailed features of the surrounding environment provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram I of a shallow manhole cover shape feature provided by an embodiment of the present invention;
FIG. 6 is a second schematic diagram of a shape feature of a shallow manhole cover according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a manhole cover word feature provided by an embodiment of the present invention;
FIG. 8 is a schematic diagram I of deep manhole cover pattern features provided by an embodiment of the invention;
FIG. 9 is a second schematic diagram of deep manhole cover pattern features provided by an embodiment of the present invention;
FIG. 10 is a schematic view of an adjusted manhole cover feature map provided by an embodiment of the present invention;
FIG. 11 is a schematic view of an adjusted environmental profile provided by an embodiment of the present invention;
FIG. 12 is a schematic diagram of a fused feature map provided by an embodiment of the present invention;
FIG. 13 is a schematic block diagram of a well lid property judgment system based on environmental feature fusion provided by an embodiment of the invention;
fig. 14 is a schematic block diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
Referring to fig. 1, fig. 1 is a schematic flowchart of the well lid property right judging method based on environmental feature fusion according to an embodiment of the present invention. The method is applied to a server. The server exchanges data with cameras and similar devices; the cameras directly collect pictures of well lids and their surroundings, and the server judges the unit property of each well lid through a visual algorithm and counts the distribution positions. Various visual features are fully utilized for property right classification, realizing accurate statistics and management of well lid distribution positions.
Fig. 1 is a flow chart of a well lid property right judging method based on environment feature fusion provided by the embodiment of the invention. As shown in fig. 1, the method includes the following steps S110 to S140.
S110, obtaining a well lid picture of the property right to be judged.
In this embodiment, a well lid image is required for property right judgment or attribution confirmation. In some cases, it may be necessary to determine the owner or responsible party of a particular well lid, especially in urban infrastructure management. Such images are typically used to identify and confirm a specific well lid for further maintenance, safety inspection, or division of legal liability.
Besides the well lid itself, the picture may also capture the surrounding environment.
S120, inputting the well lid pictures of the property to be judged into a property classification model for classification detection so as to identify the unit property of each well lid.
The property right classification model comprises a feature extraction network, a feature fusion network and a classification network, and its training process is as follows: collecting pictures and videos of well lids and their surrounding environments as a training data set, extracting features of the well lid and its surrounding environment from the training data set through the feature extraction network, carrying out feature fusion by the feature fusion network, and inputting the fused feature map into the classification network for training.
In an embodiment, the feature extraction of the well lid and the surrounding environment is performed on the training data set through the feature extraction network, and the method includes the following steps:
Firstly, convolution operations and downsampling are performed on the training data set through a convolution layer in the feature extraction network, followed by batch normalization and ReLU activation function processing, to obtain a first extraction result. In this embodiment, the first extraction result comprises shallow well lid shape features and surrounding environment features. In this process, the training data set is input into the convolution layer, and convolution operations are performed on the input data through filters (convolution kernels) to extract spatial features. The convolution layer downsamples, i.e., reduces the size of the feature map through the stride parameter while increasing the number of channels; this step reduces the spatial dimension of the data while enhancing the expressive power of the features. After the convolution operation, the feature map enters a batch normalization layer, which helps to accelerate the convergence of the network and enhance its stability. Next, a ReLU activation function is applied to each pixel of each feature map to increase the nonlinear capability of the network so that it can learn more complex data patterns.
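The convolution, normalization and activation pipeline described above can be sketched in a simplified one-dimensional, single-channel form (an illustration only; real networks operate on multi-channel 2D feature maps with learnable scale and shift parameters in the normalization layer):

```python
def conv1d(x, kernel, stride=2):
    """Strided 1D convolution (valid padding); a stride > 1 also
    downsamples, mirroring the strided convolution layer in the text."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(0, len(x) - k + 1, stride)]

def batch_norm(x, eps=1e-5):
    """Normalize to zero mean / unit variance: a single-channel stand-in
    for batch normalization (learnable gamma/beta omitted)."""
    m = sum(x) / len(x)
    var = sum((v - m) ** 2 for v in x) / len(x)
    return [(v - m) / (var + eps) ** 0.5 for v in x]

def relu(x):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in x]

# conv -> batch norm -> ReLU, as in the first extraction stage
features = relu(batch_norm(conv1d([1., 2., 3., 4., 5., 6.], [1., 1.])))
```

With stride 2, the six input values are reduced to three feature values; normalization centers them and ReLU zeroes out the negative responses.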
Then, feature information is extracted from the training data set through a C2f layer in the feature extraction network, and the extracted feature information is transmitted using residual links to obtain a second extraction result. In this embodiment, the second extraction result comprises deep well lid pattern features, well lid word features and detail features of the surrounding environment. Specifically, the training data set enters the C2f layer, a network structure specifically designed to extract higher-level features; the C2f layer further extracts feature information, including specific feature combinations or other advanced feature-engineering operations. Residual links transmit the original input information directly to subsequent levels for Concat splicing. This technique retains more of the original input information, helps to avoid information loss and alleviates the gradient vanishing problem, while enhancing the deep feature expressive power of the network.
Finally, pooling operations of different scales are performed on the first extraction result and the second extraction result through an SPPF layer in the feature extraction network to generate fixed-length feature vectors. Specifically, the SPPF layer pools the first and second extraction results at different scales and generates fixed-length feature vectors, so that input images of different sizes can be accommodated while output consistency and stability are maintained. The fixed-length feature vectors, denoted P3, P4 and P5, refer to feature vectors or feature maps after pooling at different scales. The SPPF (Spatial Pyramid Pooling - Fast) layer is a spatial pyramid pooling technique for processing images of different input sizes: it pools at different scales to ensure that inputs of any size generate output feature vectors of fixed length. Pooling reduces the spatial size of feature maps while preserving important feature information; in the SPPF layer, pooling operations at different scales capture spatial information at different levels, making the model more robust and adaptive.
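The key property, producing a fixed-length vector from inputs of varying size, can be illustrated with classic spatial pyramid pooling (a simplified sketch of the principle; the actual SPPF layer in YOLO-style networks is implemented with repeated 5×5 max pooling):

```python
def pyramid_pool(fmap, levels=(1, 2)):
    """Spatial-pyramid max pooling: for each level n, split the map into
    an n x n grid, max-pool each cell, and concatenate. Output length is
    sum(n*n for n in levels) regardless of the input's spatial size."""
    h, w = len(fmap), len(fmap[0])
    vec = []
    for n in levels:
        for bi in range(n):
            for bj in range(n):
                r0, r1 = bi * h // n, max((bi + 1) * h // n, bi * h // n + 1)
                c0, c1 = bj * w // n, max((bj + 1) * w // n, bj * w // n + 1)
                vec.append(max(fmap[r][c]
                               for r in range(r0, r1)
                               for c in range(c0, c1)))
    return vec
```

A 2x2 map and a 6x6 map both yield a length-5 vector (1 + 4 cells), which is what lets downstream layers accept a fixed-size input.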
In an embodiment, the feature fusion of the feature vectors with fixed length by the multiple feature encoding layer of the feature fusion network includes:
Firstly, the multiple feature encoding layer of the feature fusion network uses up-sampling and down-sampling technologies to adjust the spatial size of the fixed-length feature vectors so as to obtain adjusted feature vectors. Specifically, the feature fusion network comprises multiple feature encoding layers responsible for processing and fusing the input feature vectors; the multiple feature encoding layer may consist of several branches or layers, each processing feature maps of a different scale. Up-sampling and down-sampling adjust the spatial size of the feature maps, ensuring that the feature maps of different branches or layers have consistent spatial sizes. Downsampling is typically accomplished through a pooling operation, which reduces the size and computation cost of the feature map while increasing its receptive field; upsampling enlarges the feature map to a size matching other branches or layers through interpolation (e.g., deconvolution operations) or other magnification techniques. Through these operations, the fixed-length feature vectors obtain spatially adjusted versions; the adjusted feature vectors already contain information at multiple scales and are ready for the fusion operation.
Then, a fusion operation is performed on the adjusted feature vectors to obtain a fused feature map. Specifically, the adjusted feature vectors are combined through the fusion operation to form a comprehensive feature representation; the fusion operation may include stitching (concatenation), weighted addition, or other methods of combining multi-scale features. The fused feature maps, such as T1, T2 and T3, are obtained at the output stage of the feature fusion network; these feature maps combine feature information of different scales and levels, have richer and more complex representation capabilities, and are suitable for subsequent tasks such as object detection or semantic segmentation.
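A minimal sketch of the resize-then-concatenate fusion (nearest-neighbour upsampling followed by channel-wise splicing), under the simplifying assumption of single-channel feature maps stored as Python lists:

```python
def upsample_nearest(fmap, factor=2):
    """Nearest-neighbour upsampling: each value is repeated in a
    factor x factor block, as in the adjusted well lid feature map."""
    return [[v for v in row for _ in range(factor)]
            for row in fmap for _ in range(factor)]

def fuse_concat(maps):
    """Channel-dimension fusion by concatenation: all maps must share the
    same spatial size after resampling; each output cell stacks the
    per-map values as channels."""
    h, w = len(maps[0]), len(maps[0][0])
    assert all(len(m) == h and len(m[0]) == w for m in maps)
    return [[[m[i][j] for m in maps] for j in range(w)] for i in range(h)]

# bring a coarse 1x1 map up to 2x2, then splice with a 2x2 map
fused = fuse_concat([upsample_nearest([[1]]), [[2, 3], [4, 5]]])
```

Each spatial position of `fused` now holds one value per input map, which is exactly what concatenation in the channel dimension produces.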
The feature fusion network is capable of processing multi-scale feature maps from different branches or layers, and adjusting the spatial dimensions of the feature maps by upsampling and downsampling techniques. The method ensures that the characteristic information of different scales can be reasonably integrated in the fusion process, thereby improving the understanding capability of the model on the multi-scale information of the object or the scene.
The adjusted feature vectors form a comprehensive feature representation through fusion operations (such as splicing, weighted addition and the like), and the feature images have richer and more complex expression capability. Such rich feature representations are very beneficial for complex tasks such as object detection and semantic segmentation, as they can capture more semantic information and context.
The fused feature graphs T1, T2 and T3 are generated in the output stage of the feature fusion network, and represent comprehensive feature information of different scales and levels. The comprehensive information enables the subsequent task model to learn and infer more effectively, so that the performance and generalization capability of the model under a complex scene are improved.
The downsampling operation reduces the size and the calculation amount of the feature map through pooling and other technologies, and increases the receptive field of the feature map, which is particularly important for processing large-scale data and accelerating the reasoning speed.
In an embodiment, the inputting the fused feature map into the classification network for training includes:
the fused feature maps are input into a decoupling head of the classification network, convolved through convolution layers with different convolution kernels, normalized using a sigmoid function, and optimized using the BCDELoss loss function.
In this embodiment, the decoupling head is a network structure for multi-task learning or specific tasks, and is used for processing the fused feature graphs and outputting corresponding prediction results. In the decoupling head, the fused feature maps T1, T2, T3 pass through a series of convolution layers. These convolution layers include:
Four 3×3 convolutional layers: these convolution layers are typically used to extract features and add nonlinearity, helping to capture spatial information and abstract features in the feature map.
Two 1×1 convolutional layers: a 1×1 convolution layer is typically used to reduce the dimensionality and computation of the feature map while introducing a nonlinear transformation.
After passing through the convolution layers, normalization is typically performed using a Sigmoid function. The Sigmoid function scales each output value into the range (0, 1), which is particularly useful for binary classification problems, since the output can be interpreted as a probability. BCDELoss (Binary Cross-Entropy Dice Loss) is a loss function commonly used in image segmentation and classification problems, combining binary cross-entropy loss and Dice loss. Its binary cross-entropy component is defined as L = -[y·log p(x) + (1-y)·log(1-p(x))], where p(x) is the model output and y is the real label; the loss measures the difference between the model output and the real label and drives model learning to approach the real label.
The goal of the overall process is to optimize the network parameters in the decoupling head by minimizing the BCDELoss loss function. The optimization process uses the back-propagation algorithm to update the weights of the convolution layers through gradient descent or a variant thereof, bringing the model's output closer to the real labels.
The fused feature graphs are input into a decoupling head, and through a series of convolution operations and the normalization by using Sigmoid and the optimization process of BCDELoss loss functions, the model is ensured to be capable of effectively learning from complex features and generating a prediction result suitable for classification tasks. The method combines feature extraction, nonlinear transformation and loss function optimization, so that the model can be continuously approximated to a real label in the training process, and the accuracy and generalization capability of classification tasks are improved.
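A BCDELoss-style objective combining binary cross-entropy with a Dice term, as described above, can be sketched as follows. This is a NumPy illustration assuming sigmoid-normalised outputs and an unweighted sum of the two terms; the exact weighting used in the embodiment is not specified.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce_dice_loss(logits, targets, eps=1e-7):
    """Binary cross-entropy plus Dice loss on sigmoid-normalised outputs."""
    p = sigmoid(logits)
    bce = -np.mean(targets * np.log(p + eps)
                   + (1 - targets) * np.log(1 - p + eps))
    dice = 1 - (2 * np.sum(p * targets) + eps) / (np.sum(p) + np.sum(targets) + eps)
    return bce + dice

targets = np.array([1.0, 0.0, 1.0, 0.0])
good = bce_dice_loss(np.array([4.0, -4.0, 4.0, -4.0]), targets)  # confident, correct
bad = bce_dice_loss(np.array([-4.0, 4.0, -4.0, 4.0]), targets)   # confident, wrong
print(good < bad)  # True
```

The Dice term rewards overlap between the predicted probability mass and the positive labels, which compensates for class imbalance that plain cross-entropy handles poorly.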
The trained property classification model is applied to actual well lid property judgment, so that accurate statistics and management of well lid distribution positions can be realized.
Specifically, in one embodiment, the step S120 may include steps S121 to S124.
S121, inputting the well lid picture of the property to be judged into a property classification model, and extracting the well lid shape characteristics and the surrounding environment characteristics of the shallow layer by adopting the convolution layer.
In this embodiment, the surrounding environmental features include features of motor vehicle lanes, non-motor vehicle lanes and sidewalks, where for a motor vehicle lane, road surface and lane line features are extracted; for a non-motor vehicle lane, road surface and curb features are extracted; and for a sidewalk, floor tile pattern features and the shape and wording of the well lid are extracted.
Specifically, different types of input images may guide different feature extraction paths. For example, for a motor vehicle lane, road surface features are first extracted and then further analyzed; such images typically correspond to heating wells or gas wells, requiring recognition of words such as "heat" and "gas". For a non-motor vehicle lane, road surface features are first extracted and curb features are then examined; such images usually correspond to communication, rainwater or sewage well covers, whose shape features need to be identified, with words such as "rain" and "sewage" and the names of communication companies further extracted. For a sidewalk, pavement features are extracted first and floor tile pattern features are then analyzed; such images usually correspond to power or water supply well covers, whose shape features need to be identified and whose words, such as "electricity" and "water", need to be extracted.
In this embodiment, as shown in fig. 2,3 and 5 and 6, the convolution layer is effective to capture local features and structural information in the image, such as the shape features of the well lid and the features of the surrounding environment, such as the motor vehicle lane, non-motor vehicle lane or pavement. These features are the basis for subsequent deep feature extraction, helping the model understand well lid images in different environments.
S122, extracting deep well cover pattern features, well cover word features and detail features of surrounding environment by adopting a C2f layer.
In this embodiment, the C2f layer can learn more abstract and complex features through a deeper network structure, such as specific patterns, patterns of manhole covers, or fine features of the surrounding environment, as shown in fig. 4, 7 to 9, which are critical to accurately distinguish different types of manhole covers and environments.
In particular, deep manhole cover pattern features may encompass specific logos, such as the China Unicom logo or similar identifying patterns. Such features typically need to be extracted by models such as deep convolutional neural networks (CNNs), because they may involve complex textures and image details. For example, such logos may appear in or around the center of the manhole cover, which requires the model to accurately identify and classify these features.
The word features of the manhole cover refer to characters or marks printed on the manhole cover, such as "China Unicom". These characters are usually designed according to the specific use or management unit of the manhole cover, and it is important for an intelligent recognition system to recognize and understand them accurately. Extracting such features requires the model to possess a certain text recognition capability, which can be achieved by OCR (optical character recognition) or text detection algorithms.
The detail features of the surrounding environment typically include curb features around the manhole cover. The curb refers to the isolation belt or guardrail at the edge of the road surface, whose design and structure convey important information about the environment in which the manhole cover is located. For example, the shape or color of the curb, or markers on it (e.g., warning signs), may help the model better understand the specific location and environmental context of the manhole cover.
In summary, the deep well lid pattern features, well lid word features and detail features of the surrounding environment are important information sources in the intelligent well lid recognition system, and can help the system to accurately classify and understand different types of well lids, so that more efficient management and maintenance are realized.
S123, fusing the well lid shape features, the surrounding environment features, the deep well lid pattern features, the well lid word features and the detail features of the surrounding environment of the shallow layer by adopting multiple feature coding layers so as to obtain a fused feature map.
In the embodiment, multiple feature codes improve the understanding capability and classification accuracy of the model to the well lid image by combining features of different layers and types. The fused feature map not only contains more abundant and comprehensive information, but also can keep the effectiveness and diversity of various features, which is important for the subsequent classification and prediction steps.
In one embodiment, the step S123 may include steps S1231 to S1234.
S1231, adjusting the number of corresponding channels to 1 for the surrounding environment characteristics and the detail characteristics of the surrounding environment, and downsampling by adopting a mixed structure of a maximum pool and an average pool to obtain an adjusted environment characteristic diagram.
In this embodiment, as shown in figs. 10 and 11, the feature map of the surrounding environment generally includes a plurality of channels; adjusting the number of channels to 1 reduces the complexity of the data, making subsequent processing simpler and more efficient. The hybrid structure of max pooling and average pooling can preserve important information of the environmental features while reducing the feature map resolution. Max pooling helps capture salient features in the environment, while average pooling helps preserve more comprehensive environmental information; using the two together effectively reduces the dimensionality of the data while maintaining the diversity and effectiveness of the environmental features.
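The channel collapse and max/average pooling mix of step S1231 can be sketched as follows. This is a NumPy illustration: the channel collapse is shown as a plain mean, whereas the embodiment would use a learned operation, and the equal 0.5/0.5 weighting of the two pooling branches is an assumption for exposition.

```python
import numpy as np

def pool2x2(fmap, mode):
    # 2x2 pooling with stride 2 over a (C, H, W) feature map.
    c, h, w = fmap.shape
    win = fmap.reshape(c, h // 2, 2, w // 2, 2)
    return win.max(axis=(2, 4)) if mode == "max" else win.mean(axis=(2, 4))

def adjust_env_features(env):
    # Collapse the channels to 1 (shown as a plain mean; a learned 1x1
    # convolution would be used in practice), then downsample with an
    # equally weighted max/average pooling mix.
    single = env.mean(axis=0, keepdims=True)
    return 0.5 * pool2x2(single, "max") + 0.5 * pool2x2(single, "avg")

env = np.random.rand(16, 32, 32)   # hypothetical multi-channel environment map
out = adjust_env_features(env)
print(out.shape)  # (1, 16, 16)
```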
S1232, the number of channels is adjusted by using convolution operation on the shape features of the shallow well covers, the pattern features of the deep well covers and the pattern features of the well covers, and up-sampling is performed by using a nearest neighbor interpolation method, so that an adjusted well cover characteristic diagram is obtained.
In this embodiment, as shown in fig. 10, the well lid feature map may be from a plurality of different sources, and the number of channels is adjusted by convolution operation, so that the dimensions of all feature maps can be kept consistent for subsequent processing and fusion.
The nearest neighbor interpolation method is used for up-sampling, and the up-sampling is helpful for recovering the resolution of the well lid characteristic diagram and preserving the richness of the local characteristics of the well lid characteristic diagram. Nearest neighbor interpolation is a simple and effective up-sampling method, and can avoid excessive processing and information loss while maintaining the integrity of characteristic information.
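The channel adjustment and nearest-neighbour up-sampling of step S1232 can be sketched as follows. This is a NumPy illustration with hypothetical channel counts; the 1×1 convolution is written out as the per-pixel linear map it reduces to, with random weights standing in for learned parameters.

```python
import numpy as np

def conv1x1(fmap, weight):
    # A 1x1 convolution is a per-pixel linear map across channels:
    # fmap (C_in, H, W), weight (C_out, C_in) -> (C_out, H, W).
    return np.tensordot(weight, fmap, axes=([1], [0]))

def nearest_upsample(fmap, factor):
    # Nearest-neighbour interpolation: each value is copied into a
    # factor x factor block, so no new values are invented.
    return fmap.repeat(factor, axis=1).repeat(factor, axis=2)

lid = np.random.rand(64, 8, 8)        # hypothetical deep well lid feature map
w = np.random.rand(32, 64) * 0.1      # stand-in for learned 1x1 kernel weights
adjusted = nearest_upsample(conv1x1(lid, w), 2)
print(adjusted.shape)  # (32, 16, 16)
```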
S1233, performing a convolution operation on the adjusted environmental feature map and the adjusted well lid feature map, and splicing the environment feature map and the adjusted well lid feature map in the channel dimension to obtain a fused feature map.
In this embodiment, as shown in fig. 12, after the environmental features and the well lid features are fused, the model can comprehensively consider various information of the surrounding environment and the well lid, so as to improve accuracy of identification and analysis. The convolution operation is helpful for further extracting the advanced representation of the fused feature map, and the channel splicing organically combines the environment and the well lid features, so that the overall features are more comprehensive and rich.
Each step is designed to retain and enhance information about the environment and manhole cover characteristics, thereby enabling the final signature to have better characterization capabilities. The mixed use of pooling and convolution operations, as well as upsampling and downsampling techniques, not only improves processing efficiency, but also enhances the adaptability of the system to different scale and complexity characteristics.
Therefore, through orderly execution of the steps, the surrounding environment and the well lid characteristics can be effectively fused, and more accurate and comprehensive characteristic information is provided for the intelligent identification system, so that the efficiency and the reliability of well lid management and maintenance are improved.
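Putting steps S1231 to S1233 together, the splice-and-convolve fusion can be sketched as follows. This NumPy illustration uses hypothetical shapes and random weights in place of learned parameters; the 1×1 convolution plus ReLU after the channel splice is one plausible reading of the "single convolution operation" in S1233.

```python
import numpy as np

env = np.random.rand(1, 16, 16)    # adjusted environment map (1 channel, S1231)
lid = np.random.rand(32, 16, 16)   # adjusted well lid map (S1232)

# Splice in the channel dimension, then apply one 1x1 convolution + ReLU
stacked = np.concatenate([env, lid], axis=0)    # (33, 16, 16)
w = np.random.rand(16, 33) * 0.05               # stand-in for learned weights
fused = np.maximum(np.tensordot(w, stacked, axes=([1], [0])), 0)
print(fused.shape)  # (16, 16, 16)
```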
S124, inputting the fused characteristic diagram into a decoupling head for prediction so as to output the property category with the highest probability and determine the unit property of each well lid.
In this embodiment, the unit property of the manhole cover refers to the unit or property to which the manhole cover belongs, i.e., the management unit or owner to which the manhole cover belongs. In city management or infrastructure maintenance, each well lid will have a responsibility unit or owner in charge of its security, maintenance and management. This information is important for city planning, infrastructure management, and emergency response.
In the present embodiment, first, the feature map fused in step S1233 is taken as an input. The feature map already contains the combination information of the environmental features and the manhole cover features, and is a high-level abstract representation. The decoupling head is a specially designed module for processing complex feature maps and performing class prediction. The feature map is subjected to a series of convolution, pooling and full connection layer operations to learn and extract the probability distribution of the property categories corresponding to each location. The output is a classifier output tensor, where the value at each location represents the probability of the corresponding class. And (3) selecting the category with the highest probability as a prediction result by comparing the probabilities of different categories at each position. This represents the most likely property class for each manhole cover.
The decoupling head can fully utilize the rich information of the fused feature map, so that the accuracy and the robustness of the property category prediction are improved; by selecting the category with the highest probability, the property category of each well lid can be ensured to be accurately determined, and the accuracy of management and maintenance can be ensured; the design of the decoupling head optimizes the processing capacity, can complete complex class prediction tasks in a short time, and improves the response speed of the system; the fused feature map contains various information from the surrounding environment and the well lid, and the information can be comprehensively considered through the processing of the decoupling head, so that the performance and the intelligent degree of the whole recognition system are improved.
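The highest-probability selection performed at the output of the decoupling head amounts to an argmax over the per-class probabilities. In the sketch below, the class names are hypothetical, drawn from the lane examples given earlier; the real model's label set is defined by its training data.

```python
import numpy as np

# Hypothetical label set for illustration only
CLASSES = ["heating", "gas", "communication", "rainwater", "sewage", "power", "water"]

def predict_ownership(class_probs):
    # class_probs: per-class probabilities output by the decoupling head.
    idx = int(np.argmax(class_probs))
    return CLASSES[idx], float(class_probs[idx])

probs = np.array([0.03, 0.05, 0.08, 0.62, 0.12, 0.06, 0.04])
label, p = predict_ownership(probs)
print(label, p)  # rainwater 0.62
```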
S130, positioning the position of each well cover.
In this embodiment, the location information may also be output by using the location module. Such location information may be specific location coordinates of each manhole cover in an image or geographic space, facilitating further management and identification processes.
And S140, sending the unit property of each well lid and the well lid position to a monitoring background.
In this embodiment, the relevant unit property and the well lid position are sent to the monitoring background, so that the monitoring background can monitor in real time.
According to the well lid property right judging method based on environmental feature fusion, image data of the well lid is used to automatically judge and position well lid property rights through the feature extraction, fusion and classification networks, and the related information is then sent to the monitoring background. The property classification model formed by the feature extraction, fusion and classification networks is trained by collecting pictures and videos of well lids and their surrounding environment as a training data set: the feature extraction network extracts the features of the well lid and the surrounding environment, the feature fusion network fuses these features, and the fused features are finally input into the classification network for training. Various visual features are thus fully utilized for property information classification, realizing accurate statistics and management of well lid distribution positions.
Fig. 13 is a schematic block diagram of a well lid property right judging system 300 based on environmental feature fusion according to an embodiment of the present invention. As shown in fig. 13, the present invention further provides a well lid property right judging system 300 based on the environmental feature fusion, corresponding to the above well lid property right judging method based on the environmental feature fusion. The well lid property right judging system 300 based on the environmental feature fusion includes a unit for performing the well lid property right judging method based on the environmental feature fusion described above, and may be configured in a server. Specifically, referring to fig. 13, the manhole cover property right judging system 300 based on the environmental feature fusion includes a picture obtaining unit 301, a predicting unit 302, a positioning unit 303, and a transmitting unit 304.
A picture obtaining unit 301, configured to obtain a well lid picture of the property to be judged; the prediction unit 302 is configured to input the well lid picture of the property to be judged into the property classification model for classification detection, so as to identify the unit property of each well lid; a positioning unit 303, configured to position each manhole cover; a sending unit 304, configured to obtain and send the unit property and the well lid position of each well lid to a monitoring background; the title classification model comprises a feature extraction network, a feature fusion network and a classification network, wherein the training process of the title classification model is as follows: collecting pictures and videos of the well lid and the surrounding environment as a training data set, extracting the characteristics of the well lid and the surrounding environment from the training data set through a characteristic extraction network, carrying out characteristic fusion by a characteristic fusion network, and inputting the fused characteristic map into a classification network for training.
In an embodiment, the feature extraction of the well lid and the surrounding environment for the training data set through the feature extraction network includes: performing convolution operation and downsampling on the training data set through a convolution layer in a feature extraction network, and performing batch standardization and ReLU activation function processing to obtain a first extraction result, wherein the first extraction result comprises the shape features of a shallow well lid and the surrounding environment features; extracting feature information from the training data set through a C2f layer in a feature extraction network, and transmitting the extracted feature information by using residual error links to obtain a second extraction result, wherein the second extraction result comprises deep well lid pattern features, well lid word features and detail features of surrounding environments; and carrying out pooling operation of different scales on the first extraction result and the second extraction result through an SPPF layer in a feature extraction network to generate a feature vector with fixed length.
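The convolution-plus-batch-normalization-plus-ReLU extraction and the SPPF-style fixed-length pooling described above can be sketched as follows. This is a NumPy illustration; the number of pooling stages and the use of a per-stage global max pool are assumptions made so that the output is a fixed-length vector as the claim requires.

```python
import numpy as np

def batchnorm_relu(fmap, eps=1e-5):
    # Per-channel normalisation followed by a ReLU activation.
    mean = fmap.mean(axis=(1, 2), keepdims=True)
    std = fmap.std(axis=(1, 2), keepdims=True)
    return np.maximum((fmap - mean) / (std + eps), 0)

def sppf_vector(fmap, stages=3):
    # SPPF-style pooling: repeated 2x2 max pooling; each stage is
    # globally max-pooled per channel and the results concatenated,
    # giving a fixed-length vector independent of the input resolution.
    parts = []
    cur = fmap
    for _ in range(stages):
        c, h, w = cur.shape
        cur = cur.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))
        parts.append(cur.max(axis=(1, 2)))
    return np.concatenate(parts)  # length = stages * C

v = sppf_vector(batchnorm_relu(np.random.rand(8, 32, 32)))
print(v.shape)  # (24,)
```

The same call on a 64×64 input yields a vector of the same length, which is what lets downstream layers work with inputs of varying size.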
The feature fusion is performed by a feature fusion network, which comprises the following steps: and carrying out feature fusion on the feature vectors with fixed lengths by a multi-feature coding layer of the feature fusion network.
The step of inputting the fused feature maps into a classification network for training comprises the following: the fused feature maps are input into the decoupling head of the classification network, convolved through convolution layers with different convolution kernels, normalized using a Sigmoid function, and optimized using the BCDELoss loss function.
In an embodiment, the prediction unit 302 includes:
The first extraction subunit is used for inputting the well lid picture of the property to be judged into the property classification model, and extracting the well lid shape characteristics and the surrounding environment characteristics of the shallow layer by adopting the convolution layer; the second extraction subunit is used for extracting deep well cover pattern features, well cover word features and detail features of surrounding environment by adopting a C2f layer; the fusion subunit is used for fusing the shape features, the peripheral environment features, the deep well cover pattern features, the well cover word features and the detail features of the peripheral environment of the shallow well cover by adopting multiple feature coding layers so as to obtain a fused feature map; and the prediction subunit is used for inputting the fused characteristic diagram into the decoupling head for prediction so as to output the property category with the highest probability and determine the unit property of each well lid.
In one embodiment, the fusion subunit comprises:
The first adjusting module is used for adjusting the number of channels of the peripheral environment features and the detail features of the peripheral environment to 1, and downsampling with a mixed structure of max pooling and average pooling to obtain an adjusted environment feature map; the second adjusting module is used for adjusting the number of channels of the shallow well lid shape features, deep well lid pattern features and well lid word features using a convolution operation, and up-sampling using a nearest neighbor interpolation method to obtain an adjusted well lid feature map; and the convolution splicing module is used for performing a single convolution operation on the adjusted environmental feature map and the adjusted well lid feature map and splicing them in the channel dimension to obtain a fused feature map.
It should be noted that, as those skilled in the art can clearly understand, the detailed implementation process of the well lid property right judging system 300 and each unit based on the environmental feature fusion can refer to the corresponding description in the foregoing method embodiment, and for convenience and brevity of description, the detailed description is omitted here.
The manhole cover property judgment system 300 based on the environmental feature fusion described above may be implemented in the form of a computer program that can be run on a computer device as shown in fig. 14.
Referring to fig. 14, fig. 14 is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device 500 may be a server, where the server may be a stand-alone server or may be a server cluster formed by a plurality of servers.
With reference to FIG. 14, the computer device 500 includes a processor 502, memory, and a network interface 505 connected by a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions that, when executed, cause the processor 502 to perform a well lid property judgment method based on environmental feature fusion.
The processor 502 is used to provide computing and control capabilities to support the operation of the overall computer device 500.
The internal memory 504 provides an environment for the execution of a computer program 5032 in the non-volatile storage medium 503, which computer program 5032, when executed by the processor 502, causes the processor 502 to perform a well lid property judgment method based on fusion of environmental features.
The network interface 505 is used for network communication with other devices. It will be appreciated by those skilled in the art that the structure shown in FIG. 14 is merely a block diagram of some of the structures associated with the present inventive arrangements and does not constitute a limitation of the computer device 500 to which the present inventive arrangements may be applied, and that a particular computer device 500 may include more or fewer components than shown, or may combine certain components, or may have a different arrangement of components.
Wherein the processor 502 is configured to execute a computer program 5032 stored in a memory to implement the steps of:
Acquiring a well lid picture of property rights to be judged; inputting the well lid pictures of the property to be judged into a property classification model for classification detection so as to identify the unit property of each well lid; positioning the position of each well cover; transmitting the unit property of each well lid and the well lid position to a monitoring background;
The title classification model comprises a feature extraction network, a feature fusion network and a classification network, wherein the training process of the title classification model is as follows: collecting pictures and videos of the well lid and the surrounding environment as a training data set, extracting the characteristics of the well lid and the surrounding environment from the training data set through a characteristic extraction network, carrying out characteristic fusion by a characteristic fusion network, and inputting the fused characteristic map into a classification network for training.
In one embodiment, when the step of extracting the features of the well lid and the surrounding environment from the training data set through the feature extraction network is implemented by the processor 502, the following steps are specifically implemented:
Performing convolution operation and downsampling on the training data set through a convolution layer in a feature extraction network, and performing batch standardization and ReLU activation function processing to obtain a first extraction result, wherein the first extraction result comprises the shape features of a shallow well lid and the surrounding environment features; extracting feature information from the training data set through a C2f layer in a feature extraction network, and transmitting the extracted feature information by using residual error links to obtain a second extraction result, wherein the second extraction result comprises deep well lid pattern features, well lid word features and detail features of surrounding environments; and carrying out pooling operation of different scales on the first extraction result and the second extraction result through an SPPF layer in a feature extraction network to generate a feature vector with fixed length.
In one embodiment, when the step of feature fusion by the feature fusion network is implemented by the processor 502, the following steps are specifically implemented:
and carrying out feature fusion on the feature vectors with fixed lengths by a multi-feature coding layer of the feature fusion network.
In one embodiment, when the processor 502 performs the feature fusion step on the feature vector with a fixed length by the multi-feature encoding layer of the feature fusion network, the following steps are specifically implemented:
the method comprises the steps that a multi-feature coding layer with a feature fusion network uses up-sampling and down-sampling technologies to adjust the space size of a feature vector with a fixed length so as to obtain an adjusted feature vector; and carrying out fusion operation on the adjusted feature vectors to obtain a fused feature map.
In an embodiment, when the processor 502 performs the training step of inputting the fused feature map into the classification network, the following steps are specifically implemented:
the fused feature maps are input into the decoupling head of the classification network, convolved through convolution layers with different convolution kernels, normalized using a Sigmoid function, and optimized using the BCDELoss loss function.
In an embodiment, when the step of inputting the well lid picture of the property to be judged into the property classification model to perform classification detection to identify the unit property and the well lid position of each well lid is implemented by the processor 502, the following steps are specifically implemented:
Inputting the well lid picture of the property to be judged into a property classification model, and extracting the well lid shape characteristics and the surrounding environment characteristics of the shallow layer by adopting the convolution layer; extracting deep well cover pattern features, well cover word features and detail features of surrounding environment by adopting a C2f layer; fusing the shape features, the surrounding environment features, the deep well cover pattern features, the well cover word features and the detail features of the surrounding environment of the shallow well cover by adopting multiple feature coding layers to obtain a fused feature map; and inputting the fused characteristic diagram into a decoupling head for prediction so as to output the property category with the highest probability and determine the unit property of each well lid.
The surrounding environment features comprise the features of motor vehicle lanes, non-motor vehicle lanes and sidewalks, wherein for a motor vehicle lane, road surface and lane line features are extracted; for a non-motor vehicle lane, road surface and curb features are extracted; and for a sidewalk, floor tile pattern features and the shape and wording of the well lid are extracted.
In an embodiment, when the processor 502 performs the step of fusing the shape feature, the surrounding environment feature, the deep well lid pattern feature, the well lid word feature and the detail feature of the surrounding environment of the shallow well lid by using multiple feature coding layers to obtain the fused feature map, the following steps are specifically implemented:
Adjusting the number of channels of the surrounding environment features and the detail features of the surrounding environment to 1, and downsampling with a mixed structure of max pooling and average pooling to obtain an adjusted environment feature map; adjusting the number of channels of the shallow well lid shape features, deep well lid pattern features and well lid word features using a convolution operation, and up-sampling using a nearest neighbor interpolation method to obtain an adjusted well lid feature map; and performing a single convolution operation on the adjusted environmental feature map and the adjusted well lid feature map and splicing them in the channel dimension to obtain a fused feature map.
It should be appreciated that in embodiments of the present application, the processor 502 may be a central processing unit (Central Processing Unit, CPU); the processor 502 may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application-Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
Those skilled in the art will appreciate that all or part of the flow of the methods in the above embodiments may be accomplished by a computer program instructing the relevant hardware. The computer program comprises program instructions and may be stored in a storage medium, which is a computer-readable storage medium. The program instructions are executed by at least one processor in the computer system to implement the steps of the method embodiments described above.
Accordingly, the present invention also provides a storage medium. The storage medium may be a computer-readable storage medium. The storage medium stores a computer program which, when executed by a processor, causes the processor to perform the following steps:
Acquiring a well lid picture whose property right is to be judged; inputting the well lid picture of the property right to be judged into the property right classification model for classification detection so as to identify the property right unit of each well lid; locating the position of each well lid; and sending the property right unit and position of each well lid to the monitoring backend.
The property right classification model comprises a feature extraction network, a feature fusion network and a classification network, and its training process is as follows: pictures and videos of well lids and their surrounding environments are collected as a training data set; features of the well lid and the surrounding environment are extracted from the training data set through the feature extraction network; feature fusion is performed by the feature fusion network; and the fused feature map is input into the classification network for training.
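The end-to-end flow described above — acquire, classify, locate, report — can be sketched in plain Python. All names here (`MonitoringBackend`, `judge_well_lid_ownership`, the `classify`/`locate` callables) are hypothetical stand-ins; the patent specifies the steps, not an API.

```python
from dataclasses import dataclass, field

@dataclass
class MonitoringBackend:
    """Stub monitoring backend that records reported results (hypothetical)."""
    reports: list = field(default_factory=list)

    def report(self, owner, position):
        self.reports.append((owner, position))

def judge_well_lid_ownership(image, classify, locate, backend):
    """Run the pipeline: classify the property right unit of the well lid in
    `image`, locate the lid, and send both to the monitoring backend."""
    owner = classify(image)     # property right classification model
    position = locate(image)    # e.g. geolocation of the capture point
    backend.report(owner, position)
    return owner, position
```

In deployment the two callables would wrap the trained classification model and a positioning source; here any functions with the same shape will do.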
In one embodiment, when the processor executes the computer program to implement the step of extracting features of the well lid and the surrounding environment from the training data set through the feature extraction network, the processor specifically implements the following steps:
Performing a convolution operation and down-sampling on the training data set through the convolution layer in the feature extraction network, followed by batch normalization and ReLU activation, to obtain a first extraction result, wherein the first extraction result comprises shallow well lid shape features and surrounding environment features; extracting feature information from the training data set through the C2f layer in the feature extraction network and propagating the extracted feature information through residual connections to obtain a second extraction result, wherein the second extraction result comprises deep well lid pattern features, well lid lettering features and environment detail features; and performing pooling operations at different scales on the first and second extraction results through the SPPF layer in the feature extraction network to generate a fixed-length feature vector.
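The role of the SPPF step — turning variably sized feature maps into a fixed-length vector via pooling at several scales — can be illustrated with a classic spatial-pyramid-pooling sketch in NumPy. This is a simplification: the actual SPPF of YOLO-style backbones chains 5x5 max pools, but the fixed-length outcome is the same idea.

```python
import numpy as np

def spatial_pyramid_pool(feat, levels=(1, 2, 4)):
    """Max-pool a C x H x W feature map over 1x1, 2x2 and 4x4 grids and
    concatenate the results: the output length is always
    C * sum(l*l for l in levels), whatever H and W are."""
    c, h, w = feat.shape
    parts = []
    for l in levels:
        h_bins = np.array_split(np.arange(h), l)   # l roughly equal row bins
        w_bins = np.array_split(np.arange(w), l)   # l roughly equal column bins
        for hb in h_bins:
            for wb in w_bins:
                parts.append(feat[:, hb][:, :, wb].max(axis=(1, 2)))
    return np.concatenate(parts)
```

With 3 channels the vector length is 3 * (1 + 4 + 16) = 63, for a 16x16 map and a 13x9 map alike — which is what lets maps of different resolutions feed one classifier.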
In one embodiment, when the processor executes the computer program to implement the feature fusion step performed by the feature fusion network, the following steps are specifically implemented:
Performing feature fusion on the fixed-length feature vectors through the multi-feature coding layer of the feature fusion network.
In one embodiment, when the processor executes the computer program to implement the feature fusion step of the feature vector with fixed length by the multi-feature encoding layer of the feature fusion network, the method specifically includes the following steps:
The multi-feature coding layer of the feature fusion network uses up-sampling and down-sampling techniques to adjust the spatial size of the fixed-length feature vectors so as to obtain adjusted feature vectors; a fusion operation is then performed on the adjusted feature vectors to obtain a fused feature map.
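The spatial-size adjustment can be illustrated with nearest-neighbour up-sampling in NumPy; once two maps share a spatial size they can be concatenated along the channel dimension. A minimal sketch, not the patent's implementation:

```python
import numpy as np

def nearest_upsample(feat, scale=2):
    """Nearest-neighbour up-sampling of a C x H x W map: each pixel is
    repeated `scale` times along both spatial axes."""
    return feat.repeat(scale, axis=1).repeat(scale, axis=2)

# A 1x2x2 deep map is up-sampled to 1x4x4 so it can be concatenated
# with a 2x4x4 shallow map along the channel dimension.
deep = np.arange(4.0).reshape(1, 2, 2)
shallow = np.zeros((2, 4, 4))
fused = np.concatenate([nearest_upsample(deep), shallow], axis=0)
```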
In one embodiment, when the processor executes the computer program to implement the step of inputting the fused feature map to the classification network for training, the processor specifically implements the following steps:
The fused feature map is input into the decoupled head of the classification network, convolved through convolution layers with different kernel sizes, normalized using a sigmoid function, and optimized using a BCE (binary cross-entropy) loss function.
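A minimal NumPy sketch of the classification objective: raw logits are normalized with a sigmoid and scored with binary cross-entropy. This illustrates the loss only, not the decoupled head itself.

```python
import numpy as np

def sigmoid(x):
    """Map raw logits to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def bce_loss(logits, targets, eps=1e-7):
    """Binary cross-entropy on sigmoid-normalized scores; `eps` clipping
    avoids log(0)."""
    p = np.clip(sigmoid(logits), eps, 1.0 - eps)
    return float(-np.mean(targets * np.log(p) + (1 - targets) * np.log(1 - p)))
```

At zero logits the predicted probability is 0.5 and the loss equals ln 2 ≈ 0.693 regardless of the targets; confident correct predictions drive it toward 0.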
In an embodiment, when the processor executes the computer program to implement the step of inputting the well lid picture of the property right to be judged into the property right classification model for classification detection so as to identify the property right unit and position of each well lid, the following steps are specifically implemented:
Inputting the well lid picture of the property right to be judged into the property right classification model, and extracting shallow well lid shape features and surrounding environment features with the convolution layer; extracting deep well lid pattern features, well lid lettering features and environment detail features with the C2f layer; fusing the shallow well lid shape features, the surrounding environment features, the deep well lid pattern features, the well lid lettering features and the environment detail features with the multi-feature coding layer to obtain a fused feature map; and inputting the fused feature map into the decoupled head for prediction so as to output the property right category with the highest probability and determine the property right unit of each well lid.
The surrounding environment features comprise features of motor vehicle lanes, non-motor vehicle lanes and sidewalks: for a motor vehicle lane, road surface and lane line features are extracted; for a non-motor vehicle lane, road surface and curb features are extracted; for a sidewalk, floor tile pattern features and the shape and lettering features of the well lid are extracted.
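The road-type-to-cue mapping above can be summarized as a small lookup table. The names are hypothetical labels for this sketch; in the patent these cues are learned by the network end-to-end rather than hand-routed.

```python
# Which environment cues the paragraph above associates with each scene type.
ENV_CUES = {
    "motor_lane":     ["road_surface", "lane_lines"],
    "non_motor_lane": ["road_surface", "curb"],
    "sidewalk":       ["floor_tile_pattern", "lid_shape", "lid_lettering"],
}

def cues_for(scene):
    """Return the environment cues for a scene type, or [] if unknown."""
    return ENV_CUES.get(scene, [])
```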
In an embodiment, when the processor executes the computer program to implement the step of fusing the shallow well lid shape features, the surrounding environment features, the deep well lid pattern features, the well lid lettering features and the environment detail features with the multi-feature coding layer to obtain the fused feature map, the following steps are specifically implemented:
For the surrounding environment features and the environment detail features, the number of channels is adjusted to 1 and down-sampling is performed through a mixed structure of maximum pooling and average pooling to obtain an adjusted environment feature map; for the shallow well lid shape features, the deep well lid pattern features and the well lid lettering features, the number of channels is adjusted by a convolution operation and up-sampling is performed by nearest-neighbour interpolation to obtain an adjusted well lid feature map; a convolution operation is then performed once on the adjusted environment feature map and the adjusted well lid feature map, and the two are concatenated along the channel dimension to obtain the fused feature map.
The storage medium may be a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, an optical disk, or any other computer-readable storage medium that can store program code.
Those of ordinary skill in the art will appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein may be implemented in electronic hardware, in computer software, or in a combination of the two. To clearly illustrate the interchangeability of hardware and software, the foregoing description has generally described the composition and steps of the examples in terms of function. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints of the solution. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as departing from the scope of the present invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed system and method may be implemented in other ways. For example, the system embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; multiple units or components may be combined or integrated into another system, and some features may be omitted or not performed.
The steps of the method of the embodiments of the invention may be reordered, combined or deleted according to actual needs. The units of the system of the embodiments of the invention may be combined, divided or deleted according to actual needs. In addition, the functional units of the embodiments of the present invention may be integrated in one processing unit, may each exist physically alone, or two or more units may be integrated in one unit.
If implemented in the form of a software functional unit and sold or used as a stand-alone product, the integrated unit may be stored in a storage medium. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a terminal, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present invention.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.
Claims (10)
1. A well lid property right judging method based on environment feature fusion, characterized by comprising the following steps:
acquiring a well lid picture whose property right is to be judged;
inputting the well lid picture of the property right to be judged into a property right classification model for classification detection so as to identify the property right unit of each well lid; wherein the property right classification model comprises a feature extraction network, a feature fusion network and a classification network, and the training process of the property right classification model is as follows: collecting pictures and videos of well lids and their surrounding environments as a training data set, extracting features of the well lid and the surrounding environment from the training data set through the feature extraction network, performing feature fusion through the feature fusion network, and inputting the fused feature map into the classification network for training;
locating the position of each well lid; and
sending the property right unit and position of each well lid to a monitoring backend.
2. The well lid property right judging method based on environment feature fusion according to claim 1, wherein extracting features of the well lid and the surrounding environment from the training data set through the feature extraction network comprises:
performing a convolution operation and down-sampling on the training data set through the convolution layer in the feature extraction network, followed by batch normalization and ReLU activation, to obtain a first extraction result, wherein the first extraction result comprises shallow well lid shape features and surrounding environment features;
extracting feature information from the training data set through the C2f layer in the feature extraction network and propagating the extracted feature information through residual connections to obtain a second extraction result, wherein the second extraction result comprises deep well lid pattern features, well lid lettering features and environment detail features; and
performing pooling operations at different scales on the first extraction result and the second extraction result through the SPPF layer in the feature extraction network to generate a fixed-length feature vector.
3. The well lid property right judging method based on environment feature fusion according to claim 2, wherein performing feature fusion through the feature fusion network comprises:
performing feature fusion on the fixed-length feature vectors through the multi-feature coding layer of the feature fusion network.
4. The well lid property right judging method based on environment feature fusion according to claim 3, wherein performing feature fusion on the fixed-length feature vectors through the multi-feature coding layer of the feature fusion network comprises:
adjusting, by the multi-feature coding layer of the feature fusion network, the spatial size of the fixed-length feature vectors using up-sampling and down-sampling techniques so as to obtain adjusted feature vectors; and
performing a fusion operation on the adjusted feature vectors to obtain a fused feature map.
5. The well lid property right judging method based on environment feature fusion according to claim 4, wherein inputting the fused feature map into the classification network for training comprises:
inputting the fused feature map into the decoupled head of the classification network, convolving it through convolution layers with different kernel sizes, normalizing using a sigmoid function, and optimizing using a BCE (binary cross-entropy) loss function.
6. The well lid property right judging method based on environment feature fusion according to claim 5, wherein inputting the well lid picture of the property right to be judged into the property right classification model for classification detection so as to identify the property right unit and position of each well lid comprises:
inputting the well lid picture of the property right to be judged into the property right classification model, and extracting shallow well lid shape features and surrounding environment features with the convolution layer;
extracting deep well lid pattern features, well lid lettering features and environment detail features with the C2f layer;
fusing the shallow well lid shape features, the surrounding environment features, the deep well lid pattern features, the well lid lettering features and the environment detail features with the multi-feature coding layer to obtain a fused feature map; and
inputting the fused feature map into the decoupled head for prediction so as to output the property right category with the highest probability and determine the property right unit of each well lid.
7. The well lid property right judging method based on environment feature fusion according to claim 6, wherein fusing the shallow well lid shape features, the surrounding environment features, the deep well lid pattern features, the well lid lettering features and the environment detail features with the multi-feature coding layer to obtain the fused feature map comprises:
for the surrounding environment features and the environment detail features, adjusting the number of channels to 1 and down-sampling through a mixed structure of maximum pooling and average pooling to obtain an adjusted environment feature map;
for the shallow well lid shape features, the deep well lid pattern features and the well lid lettering features, adjusting the number of channels by a convolution operation and up-sampling by nearest-neighbour interpolation to obtain an adjusted well lid feature map; and
performing a convolution operation once on the adjusted environment feature map and the adjusted well lid feature map, and concatenating the two along the channel dimension to obtain the fused feature map.
8. The well lid property right judging method based on environment feature fusion according to claim 6, wherein the surrounding environment features comprise features of motor vehicle lanes, non-motor vehicle lanes and sidewalks: for a motor vehicle lane, road surface and lane line features are extracted; for a non-motor vehicle lane, road surface and curb features are extracted; for a sidewalk, floor tile pattern features and the shape and lettering features of the well lid are extracted.
9. A well lid property right judging system based on environment feature fusion, characterized by comprising:
a picture acquisition unit for acquiring a well lid picture whose property right is to be judged;
a prediction unit for inputting the well lid picture of the property right to be judged into a property right classification model for classification detection so as to identify the property right unit of each well lid, wherein the property right classification model comprises a feature extraction network, a feature fusion network and a classification network, and the training process of the property right classification model is as follows: collecting pictures and videos of well lids and their surrounding environments as a training data set, extracting features of the well lid and the surrounding environment from the training data set through the feature extraction network, performing feature fusion through the feature fusion network, and inputting the fused feature map into the classification network for training;
a positioning unit for locating the position of each well lid; and
a sending unit for sending the property right unit and position of each well lid to the monitoring backend.
10. The well lid property right judging system based on environment feature fusion according to claim 9, wherein the prediction unit comprises:
a first extraction subunit for inputting the well lid picture of the property right to be judged into the property right classification model and extracting shallow well lid shape features and surrounding environment features with the convolution layer;
a second extraction subunit for extracting deep well lid pattern features, well lid lettering features and environment detail features with the C2f layer;
a fusion subunit for fusing the shallow well lid shape features, the surrounding environment features, the deep well lid pattern features, the well lid lettering features and the environment detail features with the multi-feature coding layer to obtain a fused feature map; and
a prediction subunit for inputting the fused feature map into the decoupled head for prediction so as to output the property right category with the highest probability and determine the property right unit of each well lid.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410948010.1A CN118485882B (en) | 2024-07-16 | 2024-07-16 | Well lid property right judging method and system based on environment feature fusion and computer equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410948010.1A CN118485882B (en) | 2024-07-16 | 2024-07-16 | Well lid property right judging method and system based on environment feature fusion and computer equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118485882A true CN118485882A (en) | 2024-08-13 |
CN118485882B CN118485882B (en) | 2024-10-18 |
Family
ID=92198399
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410948010.1A Active CN118485882B (en) | 2024-07-16 | 2024-07-16 | Well lid property right judging method and system based on environment feature fusion and computer equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118485882B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108765404A (en) * | 2018-05-31 | 2018-11-06 | 南京行者易智能交通科技有限公司 | A kind of road damage testing method and device based on deep learning image classification |
CN111914634A (en) * | 2020-06-23 | 2020-11-10 | 华南理工大学 | Complex-scene-interference-resistant automatic manhole cover type detection method and system |
CN112991328A (en) * | 2021-04-15 | 2021-06-18 | 新疆爱华盈通信息技术有限公司 | Well lid detection method and well lid monitoring system based on intelligent lamp pole |
US20220309674A1 (en) * | 2021-03-26 | 2022-09-29 | Nanjing University Of Posts And Telecommunications | Medical image segmentation method based on u-net |
WO2022199143A1 (en) * | 2021-03-26 | 2022-09-29 | 南京邮电大学 | Medical image segmentation method based on u-shaped network |
CN115496665A (en) * | 2022-10-26 | 2022-12-20 | 中国矿业大学 | Mine fuzzy image super-resolution reconstruction method based on MLP improved model |
CN116310764A (en) * | 2023-05-18 | 2023-06-23 | 西南交通大学 | Intelligent detection method and system for road surface well lid |
WO2024139297A1 (en) * | 2022-12-30 | 2024-07-04 | 深圳云天励飞技术股份有限公司 | Road disease identification method and re-identification method, and related device |
Non-Patent Citations (3)
Title |
---|
YAO MINGHAI; LONG XUEBIN: "Research on road manhole cover defect detection based on an improved convolutional neural network", Computer Measurement & Control, no. 01, 25 January 2020 (2020-01-25) *
XIAO GUOXIAN: "Design of an intelligent manhole cover based on NB-IoT", Information Science and Technology, 15 February 2020 (2020-02-15) *
TAO ZHU; LIU ZHENGXI; XIONG YUNYU: "Manhole cover detection based on convolutional neural network", Modern Computer (Professional Edition), no. 02, 15 January 2018 (2018-01-15) *
Also Published As
Publication number | Publication date |
---|---|
CN118485882B (en) | 2024-10-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Chen et al. | Pothole detection using location-aware convolutional neural networks | |
CN111598174B (en) | Model training method based on semi-supervised antagonistic learning and image change analysis method | |
Liang et al. | Traffic sign detection via improved sparse R‐CNN for autonomous vehicles | |
Kong et al. | A federated learning-based license plate recognition scheme for 5G-enabled internet of vehicles | |
CN113780296B (en) | Remote sensing image semantic segmentation method and system based on multi-scale information fusion | |
Ma et al. | Capsule-based networks for road marking extraction and classification from mobile LiDAR point clouds | |
Xia et al. | ResNet15: weather recognition on traffic road with deep convolutional neural network | |
US20230358533A1 (en) | Instance segmentation imaging system | |
CN111191654B (en) | Road data generation method and device, electronic equipment and storage medium | |
CN112395951B (en) | Complex scene-oriented domain-adaptive traffic target detection and identification method | |
CN111160205A (en) | Embedded multi-class target end-to-end unified detection method for traffic scene | |
CN115273032A (en) | Traffic sign recognition method, apparatus, device and medium | |
CN117671647B (en) | Multitasking road scene perception method | |
CN115115915A (en) | Zebra crossing detection method and system based on intelligent intersection | |
Saravanarajan et al. | Improving semantic segmentation under hazy weather for autonomous vehicles using explainable artificial intelligence and adaptive dehazing approach | |
Yamaguchi et al. | Road crack detection interpreting background images by convolutional neural networks and a self‐organizing map | |
Pan et al. | A hybrid deep learning algorithm for the license plate detection and recognition in vehicle-to-vehicle communications | |
CN117351360A (en) | Remote sensing image road extraction method based on attention mechanism improvement | |
Pramanik et al. | Detection of Potholes using Convolutional Neural Network Models: A Transfer Learning Approach | |
CN118485882B (en) | Well lid property right judging method and system based on environment feature fusion and computer equipment | |
CN115689946B (en) | Image restoration method, electronic device and computer program product | |
Song et al. | A robust detection method for multilane lines in complex traffic scenes | |
Li et al. | Prediction model of urban street public space art design indicators based on deep convolutional neural network | |
Wu et al. | Pavement distress detection based on improved feature fusion network | |
Wang | Remote sensing image semantic segmentation network based on ENet |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |